The various ARM CPU chips are vastly different in their abilities. Sometimes a single digit difference in the model number can make a BIG difference in performance.
In particular, I have 2 SBCs with ARM Cortex-A7x cores where the x differs by one digit: A72 vs A73. The RockPro64 has 2 x A72 cores plus 4 x A53 cores. The Odroid N2 also has 6 cores, but it is 4 x A73 and 2 x A53.
https://www.hardkernel.com/blog-2/odroid-n2/
The main CPU of the N2 is based on big.Little architecture which integrates a quad-core ARM Cortex-A73 CPU cluster and a dual core Cortex-A53 cluster with a new generation Mali-G52 GPU.
Thanks to the modern 12nm silicon technology, the A73 cores runs at 1.8Ghz without thermal throttling using the stock metal-housing heatsink allowing a robust and quiet computer.
https://ameridroid.com/products/rockpro64
powered by Rockchip’s RK3399 processor. The RK3399 is a 6-core chip with two ARM Cortex-A72 CPU cores, four Cortex-A53 cores, and Mali-T860MP4 graphics.
I’m not going to get into the use of the graphics engine for parallel computes, nor how that can drastically change the ‘throughput’ of a given system for parallel codes. That’s a whole ‘nother topic. Here, I’m just going to look at what changes with A72 vs A73 and similar “single digit” differences.
Now the ‘first blush’ is that you get 4 big and 2 little in the N2, but 2 big and 4 little in the RockPro64. They both sell for about $70 with 2 GB of memory, while the 4 GB version of each is closer to $90. So are they really close to the same board in performance?
https://en.wikipedia.org/wiki/Comparison_of_ARMv8-A_cores
Has a column WAAAaaaay off to the right with “Dhrystone” benchmark scores. Note that it is integer focused, so it doesn’t measure the effectiveness of your floating point hardware, and it leans heavily on string manipulation. Some folks find those limitations unacceptable. I think it is quite reasonable as a general first yardstick. LOTS of things in a computer are done with INTs.
https://en.wikipedia.org/wiki/Dhrystone
Dhrystone is a synthetic computing benchmark program developed in 1984 by Reinhold P. Weicker intended to be representative of system (integer) programming. The Dhrystone grew to become representative of general processor (CPU) performance. The name “Dhrystone” is a pun on a different benchmark algorithm called Whetstone.
It also doesn’t pander to the “modern” language tribe that uses OO and interpreters and such, so they are not too fond of it either, but whatever. It’s still a good raw power measure IMHO.
So what’s that wiki on ARM cores say about those two?
Core         Year  Decode  DMIPS/MHz
Cortex-A72   2015  3-wide  4.72
Cortex-A73   2016  2-wide  ~6.35
The A72 has 3-wide decode and 5-wide dispatch, so it has more parallel paths it can execute speculatively at once (then throw away the path that wasn’t taken at the ‘if’ branch), while the A73 cuts each of those back by one. Yet it is 6.35 / 4.72 = 1.35 x as fast per clock. Easily 1/3 faster. AND it runs cooler.
How about those A53 cores?
Cortex-A53 2014 2-wide 2.24
While it has 2-wide decode of instructions, it is an in-order design with no speculative dispatch at all. The performance is just 2.24 DMIPS/MHz. So per core the A72 is about 2.1 x faster and the A73 is about 2.8 x faster.
In particular boards / SOCs (System On Chip) you would need to adjust those for any added CPU Clock speed the maker got compared to the Arm Holdings baseline. But as a first approximation it gives a pretty good idea what you are buying.
So figure an A53 as the baseline of “one”; then the RockPro64 is 4 + 2(2.1) = 8.2 total, and the Odroid N2 is 4(2.8) + 2 = 13.2. So in total, you could expect (modulo exact clock rates used, and any thermal throttling on the RockPro64, which has no heatsink unless you pay more to get one…) the N2 to deliver roughly 1.6 times as much computes as the RockPro64.
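That back-of-envelope can be sketched in a few lines of Python. The DMIPS/MHz figures are the ones from the Wikipedia table; the helper name is just mine, and clock differences between boards are ignored here, same as in the text:

```python
# Per-core Dhrystone ratings (DMIPS/MHz), normalized to an A53 = 1 baseline.
DMIPS_PER_MHZ = {"A53": 2.24, "A72": 4.72, "A73": 6.35}

def a53_equivalents(cores):
    """cores: dict of core name -> count; returns the total in 'A53 units'."""
    return sum(n * DMIPS_PER_MHZ[c] / DMIPS_PER_MHZ["A53"]
               for c, n in cores.items())

rockpro64 = a53_equivalents({"A72": 2, "A53": 4})   # about 8.2
odroid_n2 = a53_equivalents({"A73": 4, "A53": 2})   # about 13.3
print(round(odroid_n2 / rockpro64, 2))              # about 1.6x
```

Using the unrounded ratios gives 13.3 rather than 13.2 for the N2, but either way the bottom line is the same: roughly 1.6 x the computes.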
That’s how much that one digit on the A7x and the other swap of 2 vs 4 gets you.
Bottom Line?
You really really must look at the relative CPU speeds of the different chip / Core numbers to know what you are getting.
My experience with both boards has been roughly that, too. The N2 is largely “no waiting”, while the RockPro64 sometimes lags just a bit, and I notice it isn’t quite as fast. Also note that often it is a SINGLE core that’s pegged, as the job isn’t multithreaded and doesn’t use all the cores. In that case, I’ve still noticed the 1/3 faster. 3 vs 4 seconds isn’t that much, but you will notice it sometimes. 30 minutes vs 40 minutes, you know…
Some others to note? An A57 core is 4.6 so almost the same as an A72 on that benchmark.
When they get down to “Small systems” SBCs the other A7x cores will make a big dent:
Core         Year  Decode  DMIPS/MHz
Cortex-A75   2017  3-wide  8.2–9.5
Cortex-A76   2018  4-wide  10.7–12.4
Yeah, about 9 and about 11 for those two. So a single A75 core does roughly the work of all 4 A53 cores in a Raspberry Pi Model 3, and a single A76 about 5 of them. So a 4 x A76 core SOC will do about the same as 5 Raspberry Pi boards… Just sayin’…
Though note that the Pi M4 has A72 cores. It IS a fairly fast board due to that, but the A72 cores run hot compared to the A73, so you need a LOT of heat extraction to use it without thermal limiting. Unfortunately, it does not come with a heat sink. All sorts of folks have done all sorts of exotic stuff trying to keep them cool enough, from powered fans to heat pipes and more. Compare the Odroid N2 with its built-on passive heat sink. No fan needed.
Oh, and the Odroid N2+ ups the clock speed to 2.4 GHz (compare the Pi at 1.5 GHz), so there’s that…
In Conclusion
There’s a WHOLE LOT more to benchmarks than just one number, Dhrystone or otherwise. For many kinds of use, the memory quantity and speed matter more, or the speed to disk (USB 3.0 is about 10 times faster than USB 2.0). The Raspberry Pi, unfortunately, regularly cheaps out on both: on heat extraction, so your board thermally limits to very low performance, sometimes under half; and on I/O structure, so any I/O bound jobs are just slugs, and jobs that are not I/O bound on other gear become I/O bound on the Pi.
Overall, the Odroid family has done a much better job of the hardware design. Real barrel connectors for power (micro-USB power was OK on the original Pi, but later, bigger Pi’s are at or sometimes over the connector’s limit, and really you need a different power connector), heat sinks big and included, good I/O designs. Other makers grade out between those two.
BUT.
Odroid mostly just ships a variation of Ubuntu as their operating system. Their boot process is a bit different and arcane in some ways, so a lot of developers just don’t port to their stuff. Raspberry Pi has just about every OS on it (despite having their own arcane boot process… but ‘size matters’ and there’s millions of Pi boards sold). Other boards again grade out between them.
So I must say that if you are going to buy an SBC, do check out the available operating system choices and what age / quality of ports exist. Make sure the one you want is available on the hardware you like.
For completeness, the ARMv7 instruction set 32 bit ARM cores:
https://en.wikipedia.org/wiki/Comparison_of_ARMv7-A_cores
Core            DMIPS/MHz
ARM Cortex-A7   1.9
ARM Cortex-A15  3.5–4.01
ARM Cortex-A17  4.0
The A7 cores are what is in the earlier Raspberry Pi Model 2 boards (they later got an upgrade) and many other systems. They are also the “little” cores in my Odroid XU4, while the A15 are the BIG cores in the XU4. It has 4 of each. I just note in passing that the A15 and A17 both have some parallel execution, and both score about 4 on the Dhrystone rating.
So my XU4 has roughly 4 x 4 + 4 x 1.9 or 23.6 total Dhrystones of go juice. Divide that by 2.24 to compare it in terms of A53 cores, and you get 10.5, or roughly 1/2 way between the RockPro64 and the Odroid N2 in total computes. But when limited to a one-core job, that one A15 core is 4/4.72 = 85% of an A72 core. Damn Fast still, but not as fast. It is 4/6.35 = 0.63, or a little under 2/3 the speed of the A73 core in my Odroid N2. Then, for some things, that 64 bit int matters a lot compared to the 32 bit int, and you can end up at 1/4 or less the speed doing double precision math.
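A quick sketch of those ratios in Python. The DMIPS/MHz figures are the ones quoted in this post; the A15 is taken as 4.0, roughly the middle of its 3.5 to 4.01 range:

```python
# Same yardstick as before, extended with the 32-bit ARMv7 cores.
DMIPS_PER_MHZ = {"A7": 1.9, "A15": 4.0, "A53": 2.24, "A72": 4.72, "A73": 6.35}

xu4_total = 4 * DMIPS_PER_MHZ["A15"] + 4 * DMIPS_PER_MHZ["A7"]   # 23.6
xu4_in_a53_units = xu4_total / DMIPS_PER_MHZ["A53"]              # ~10.5

a15_vs_a72 = DMIPS_PER_MHZ["A15"] / DMIPS_PER_MHZ["A72"]  # ~0.85
a15_vs_a73 = DMIPS_PER_MHZ["A15"] / DMIPS_PER_MHZ["A73"]  # ~0.63
print(round(xu4_in_a53_units, 1), round(a15_vs_a72, 2), round(a15_vs_a73, 2))
```

Again, clock differences between boards are ignored; this is just the per-clock picture.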
So while I love the little XU4 for “general computing” like running a browser, where the 32 bit word size means a whole lot less memory wasted; on hard core high precision math it just isn’t that fast.
The point?
How fast a given job runs depends a lot on the particulars. Single thread or multi-threaded? Double precision 64 bit math, or 32 bit? Different boards have different “best uses”. Be careful when you buy to match the problem you are solving to the equipment, and watch those single digit differences. They matter.
I wonder if anyone has managed to get an Apple M1 loaded w/ Linux? Too pricey to put with the Raspberry Pi crowd, though it has screamer performance (esp. per Watt). Much of that performance may be lost loading Linux though… the architecture is tailored to Apple’s OS.
Yes:
https://www.theverge.com/2021/1/21/22242107/linux-apple-m1-mac-port-corellium-ubuntu-details
FWIW, MacOS is a “Mach Kernel” *nix. Under the skins, in the CLI, most of the Unix / Linux commands are the same / still there. (I’ve done that on the MacBook).
So no, you won’t lose much performance. The Arch is ARM and it is just the same to Linux or Mach.
What you will lose is tweaking and tuning: the thousands of hours of polish on the programming base to make things work more smoothly. Eventually Linux will smooth out, but for now it will be a young, fresh, rough port. One example? Early ports usually drive the video with a limited generic driver, often using the CPU. Eventually someone gets around to writing / porting a driver for the GPU on a new chipset, but it takes time. So video runs slow, and things like dragging a window get jitters and such. Until they don’t… The Mac will have had loads of programmer hours put into that already.
In short, it is more that the software is tailored to the Arch rather than the Arch being tailored to the software.
In some ways, the M1 SOC is just an 8 core Big/little ARM v8 package like any other. That part of the chipset is where Linux will originally run as that’s just like all the other V8 arm ISA world. BUT….
Apple put a lot of OTHER compute stuff in the SOC. That’s the bits that initially MacOS will use (as programmers will have been porting stuff to it for a while now) while Linux will leave it idle. “Whenever” someone gets around to doing the programming for Linux to use those bits too, it will be “just as fast”.
But wait, there’s more…
Most of the Apple code today is x86 code. Apple runs that via an instruction emulator / translator of some kind, so it takes a performance hit. Over time they will finish the conversion, but not just yet… So a native ARM v8 Linux will have bits that take no translation and run at full native speed.
So it is a horse race to the bottom between interpreter sloth and Linux not using available specialized hardware, as to which is the slowest horse…
https://en.wikipedia.org/wiki/Apple_M1
Now 4 x 2 GHz and 4 x 3 GHz cores is nothing to sneeze at. But a code translator can suck that dry and more, depending on what you do… and how well it was written.
But it’s these hardware bits that initially will just be ignored by a first Linux port in the Time To Market race:
So just like a “sysbench” benchmark can score 10x faster on ARM with SIMD or similar GPU compute stack available, but slow if that’s not active in the Linux port; those various specialized processors need coding into the OS stack to be used and present their power. That will take time.
So the question really comes down to: How much has Apple taken advantage of those added CPU “variety special purpose cores” vs. how much has the x86 Translator sucked up?
The core OS for either Mac OS or Linux running on Arm v8 cores will have very similar performance. Linux v8 compiled applications may well, initially, beat Apple Applications if the latter are x86 translated and not using special cores for things like video processing much.
Actually, much of the Apple code is Objective-C or Swift. As such it compiles (via Xcode / LLVM) to whichever target is needed. Big Sur on M1 is not x86.
Nice to know. When I was at Apple, we did the 68000 to PowerPC swap, and did the same thing with an interpreter “for a while”. They profiled the code, and the parts that were used a lot (mostly 68000 assembler) were recoded for the new CPU. The rarely used bits were left to the interpreter until they had time (after reaching time to market goals).
Something similar happened at the swap to Intel CPUs. Same general modus, but less in assembly as the CPU speed had gotten fast enough that less assembler was needed.
So now they have a similar interpreter process. But you’re asserting none of the current OS is written in assembler anymore. Well, it’s about time. Glad to hear it.
OK, so that makes it a “race condition” between the MacOS guys bothering to recode things to use the added interesting-but-unusual hardware bits, and the Linux guys doing the same. That ought to leave only the applications folks who have x86 code needing the interpreter / translator. So that sloth ought to only show up in applications (Linux applications are already compiled to the ARM ISA). The two (essentially Unix-oid) OSs ought to have very similar performance envelopes then.
One hopes the Apps guys get to the recompiling too… then again, that would mean two sets of Apps in the Apps Stores… or whatever Macs use these days for distribution.
Using GPUs for general computes is a current hot topic across the compute world, so I’d not be surprised if Mac OS had already done some of that. No idea how much though, and it isn’t easy to do / figure out. Then that Neural processor is just ‘out there’. I doubt much of anything uses it yet.
I’d also suspect that Apple didn’t bother re-writing all of the Mach stuff from C to Objective-C, so a good bit of it ought to still be straight C (as is common in OS lower levels of code). Ought to still run through the same compiler though. (I’d guess Clang, but who knows…)
Interesting… Looks like they have Big Sur running on both x86 and the new ARM systems:
https://www.cnet.com/how-to/macos-big-sur-compatibility-will-your-laptop-work-with-the-new-os/
So it runs on the older Macs, but not so well… They have optimized it for the newer ARM chipset. OK…
I’d guess that means that where they had a coding choice, they opted for an ARM-friendly algorithm rather than an x86-friendly one, then just compile the same source code for both… Makes sense, as ARM chips are often at a gross speed disadvantage vs x86 so need the help more.
Plus Apple has a bit of history of not giving a damn about making older hardware a bit hobbled with an update so you need to buy a new one. (Yes, I love the Apple gear I’ve had over the years, but really, they are not that much a warm fuzzy marketing strategy folks company… But they are more subtle and friendly about it than Micro$oft…)
But nice to know I likely don’t want to pick up cheap old hardware running it… unless I want to take the pain of installing Linux on an “outsider hostile” closed-system Apple Mac device…
Well, I think I’m now officially not that interested in Macs anymore. Oh Well. I was an original purchaser of one of the very first about 1984? or so. It is still in its Mac Sack in the garage… The love affair with Mac lasted until about 2000? Somewhere in there. Maybe as late as 2005. Then I’ve slowly drifted away.
Now I just don’t see the point of spending Order Of Magnitude $1000+ when I can get all the computes I want or need for under $100. And in an open system where I can fiddle with any of it that I think needs a fiddle. And where I’m not tied to a vendor deciding when my equipment is too out of date for their interest. (I’m still using my very first Raspberry Pi, with its ARMv6 chipset single core at 700 MHz, as DNS server / router ;-)
Well, it was a nice peek over at where Apple is going next. But don’t see any home for me there anymore. Oh Well.
I still have nostalgia for the old-time small systems. LPC1115 – 64K ROM, 8K RAM, runs FORTH in 16K of the ROM – and by ROM I mean Flash. It is a 32 bit machine, but the whole works fits on a 1.5″ x 3″ board with mounting holes. Most instructions are single cycle vs the 3 or 4 cycles of a Z80, and it runs at 50 MHz vs 4 MHz for the Z80. So roughly 40 times faster.
Just for grins, I swapped the RockPro64 into the TV Media Server role and swapped the Odroid N2 out for a while.
I noticed a few things:
1) The pauses to bring up a web page are noticeably longer. Not quite enough to irritate, but enough to experience and note.
2) Running a video at 1080p is slightly beyond its “always fine” ability, where the N2 didn’t notice or care. Some run fine; others glitch when lots of the background changes and it needs to shove a lot of bits down the wire.
3) Video at 720p runs the two BIG cores at about 75-80%, and some of the small cores get action too.
4) It gets HOT when running that hard. I knew the A72 cores ran hotter than A73 cores, but thought the big heat sink for just 2 of them would be fine. It doesn’t heat limit, but the heat sink is hot to the touch.
5) Overall, it is acceptable performance in this role, on a medium sized 720p monitor, but for a bigger screen at 1080p, I’d want the N2 or better.
6) Both machines running Armbian for this test to make it equal. It is quite possible that an OS Release dedicated to video (like KODI) and using the graphics engines better could do a lot more with lesser hardware and less heat. Software matters a Great Deal to video performance.
7) When a running video is using the 2 ‘big’ cores, any action in another tab takes place at the A53 core performance level, and you notice it a lot more. The N2 with 4 BIG cores doesn’t have that issue at all.
8) The 2 GB of memory on the RockPro64 fills up and rolls to swap a lot sooner than the 4 GB on the N2. Having 4 or more is better.
@M. Simon:
One of my favorite systems is a handheld computer from Radio Shack. Has a one line of text display, you program in BASIC, and it is dual 4 bit processors. One dedicated to I/O stuff… You can hook up a cassette tape and ‘chain’ programs from it… Doesn’t re-wind though, just read forward to next block.
Sooo, having read our gracious host’s extensive comparison/review of selected SBCs several days ago [*], it seems that a reader’s practical choices are between the Odroid-N2 (currently maxed out at 4 GB RAM and temporarily sold out) from South Korea [☯], vs. the Raspberry Pi 4 (currently available with 4 GB, but maxed out at 8 GB RAM) from the U.K. [♕]. I suppose that most or all of the other SBC brands with which he has experience are disqualified by being products of Red China; that’s an attitude with which I newly agree.
It seems fair to infer that he boils it down [*] to a kind of decision with which avocational computerists hate being confronted:
• Odroid-N2 is the superior hardware design, featuring a robust integral heat sink [*]. Its manufacturer Hardkernel (Co., Ltd.) claims it’s adequate for heat dissipation in 95°F (“35°C”) ambient air [🌴], and offers a cooling fan for situations in which it isn’t [🌵]. Its support for industrial hardware prototyping is exemplified by numerous hardware accessories. But its support for software development is overstated even if diplomatically dubbed “limited”. As our host has written in previous blog entries, its manufacturer is typical of many in the business of high-volume sales to manufacturers who intend to build products containing the SBC(s), and who are thus expected to have their own teams of software developers for evaluating & resolving software issues with the SBC(s).
• Pi 4 has disappointing hardware because of its unresolved heat dissipation issues, thus speed down-throttling issues, because the Raspberry Pi Foundation seems not to recognize a need for heat sinks [*], and I’ve yet to see R.P.F. even offer approved designs for add-ons as hardware options. This being the 21st Century, surely there are no homes or schools in tropical climates where RasPis would be operated without air-conditioning, right? The R.P.F. offers strong support for education, which was its original declared mission, and thus also for software for its SBCs.
Our host’s comparison/review disrupts my previous confidence that the Pi 4 was the obvious choice for me and others among Chiefio denizens who lack experience with frequent building of Linux systems.
Meanwhile, denizens inclined toward Odroid must hope that Red China doesn’t decide that the “Bejing Biden” & Commie-la Harris Administration provides an unusually favorable opportunity to send its client state North Korea flooding southward over the Korean DMZ, aided by Red-Chinese “advisors” and geopolitical extortion (presumably including nukes).
——–
Note * : E.M.Smith: “ARM Chips – What’s A Digit Mean?”. Posted on 21 January 2021 at 3:49 am GMT: https://chiefio.wordpress.com/2021/01/21/arm-chips-whats-a-digit-mean/.
Note ☯ : 🇰🇷 Hardkernel (Co., Ltd.) is the manufacturer of the Odroid SBCs: https://www.hardkernel.com/shop/odroid-n2-with-4gbyte-ram-2/.
Note ♕ : 🇬🇧 https://www.raspberrypi.org/products/raspberry-pi-4-model-b/.
Note 🌴 : 95°F can routinely be encountered when outdoors, and when relying upon ambient-temperature ventilation, during summery stretches of weather in Florida; such days might seem interminable by August.
Note 🌵 : 95°F is relatively cooler than some highs in the U.S. desert Southwest, where 100s°F can reputedly be encountered, altho’ I’ve deliberately avoided experiencing it.
(Rendering test: 🐧)
@CompuGator:
Pretty much sums it up. Note that the present product is the Odroid N2+
https://ameridroid.com/products/odroid-n2-plus
You may still find the (only slightly) slower N2 board on Amazon or other places.
My experience has been that running a 64 bit (Arm64 / Aarch64) operating system takes about 2 x as much memory. So my 4 GB N2 uses about as much memory percentage to run a browser as my Odroid XU4 (32 bit) with 2 GB of memory.
I would NOT buy any more 2 GB boards where the intent is to run big software like a browser. 4 GB and up only. For example, I’m typing this on my (pre-china HQ move) RockPro64 with 2 GB of memory. I’m running ONLY Chromium and top in a terminal window at the moment. I have 15 tabs “open” but about 1/2 of them have not loaded anything since I launched this browser with the saved tabs and it doesn’t re-load them until you go to that tab. So figure about 7 have actual content loaded. Present memory use:
So 33.8 MB rolled to swap already… Now a fair amount of that is buffer cache and subject to re-use, BUT, IF I run a video or two, like a simple music video of a few minutes, it rapidly ends up in swap land…
IMHO, most browsers today do a HORRIBLE job of memory management. Even Firefox with its parts re-written in Rust (which is supposed to auto-magically handle memory deallocation for you). As long as the tab exists, it looks to me, in my use, like it keeps the memory used too, Rust only releasing it when the tab is closed.
So ‘whatever’…
Per “other boards”:
The Pi M3 is almost enough for an uncomfortably slow desktop. I’ve used it “in a pinch” and got by OK. It does work well enough for a KODI server as that’s using the GPU to good effect.
I’ve not gone through the canonical list of all board makers to find out who is NOT made in China. Perhaps I ought to. There’s a LOT of them. I am not too worried about resistors and capacitors made in China (though they have stuck backdoor chips on full PC boards made there for some PCs using x86 type CPUs). I’d be worried about any ROMs and CPUs (i.e. SOC SBCs) where the System On Chip was made in China. My other boards, bought only due to incredible cheapness at $12 to $15, are Orange Pi, also from China. Not used for anything important, as with 512 MB memory and an H3 chip without heatsink, they heat up fast and can’t do much… but they make an OK headless appliance for things that don’t matter…
FWIW, I also own several other Odroids. C1, C2, XU4-quiet (no fan big heatsink). I like the XU4 a lot and running Armbian works very well without software issues other than being SystemD based. On sale cheap right now at $58:
https://ameridroid.com/products/odroid-xu4q-with-passive-heatsink
You can get it with a fan and smaller heat sink, but in my testing of the quiet model, it would very rarely heat limit and then only when running all cores hard for a good while (like a minute or three). Not a normal use profile really.
https://ameridroid.com/products/odroid-xu4
4 x A15 cores at 2 GHz… plus 4 x A7 cores for low power on easy tasks.
I used it as my Daily Driver desktop for a year or more and was happy enough.
Yes, I like the N2 (no plus…) better and it IS a significant speed up, especially on high res music videos, but for everything else, the XU4q has been fine. Setting the video res down to 720p, or into a smaller window (‘regular’ or ‘theatre’ mode), works OK on the XU4 even with a lot of action / pixel changes.
I’d buy another one of them before I’d touch a Pi M4:
https://www.martinrowan.co.uk/2019/07/cooling-options-for-the-hot-raspberry-pi-4/
ExplainingComputers.com has some nice videos of extreme active cooling to get it to run at full speed… So unless you want a “heat management project”, I don’t see the point of buying one. By the time you add the heat extraction costs (and work and shipping) it’s just as cheap to get an XU4.
There’s a LOT of makers now:
https://en.wikipedia.org/wiki/Category:Single-board_computers
Title in search is “63 Best SBCs”…
https://www.slant.co/topics/1629/~best-single-board-computers
So a bit of work to look at ALL of them…
Were I “in the market” right now, I’d likely get an NVIDIA Jetson of some sort. The Nano is $115 if you can find one (supposedly $99 direct from Nvidia website):
https://ameridroid.com/products/nvidia-jetson-nano
With basic SBC performance of ‘pretty good’ but then you get to play with the Cuda Cores for video OMGness… ;-)
LOTS of reviews of boards here: https://www.explainingcomputers.com/sbc.html
Head to head comparison of Raspberry Pi 4 vs Jetson Nano
Ahhh, yes: “PCB production issues” [**].
I last looked at the manufacturer’s Web site within the last couple of days, and all they revealed was that their N2+ is out-of-stock, but they expected it to be in-stock again at the “end of February” [*].
Suddenly, I remembered experiences at a computer start-up that farmed out production of 6-layer boards, which at their grand size, were quite a challenge in the early 1980s, and I thought: I wonder how many layers are in those (“Board Dimensions: 90mm x 90mm x 17mm”) (approx.) 3.6 × 3.6 × 0.6-in. N2 boards? To reduce a PCB footprint, or add components but keep the PCB size unchanged, you’ve gotta either:
• squeeze traces more tightly together, which might not be possible given voltage & current demands of the chips & connectors, or
• add board layers and reroute the traces.
I would not be surprised if Hardkernel has farmed out that aspect of production to companies providing specialized skills for it, and productive at it, placing PCB production issues outside Hardkernel’s direct control.
So I now wonder if the “N2+” model exists to provide fixes to some problem that was discovered or confirmed only after the “N2” entered volume production. At this time, I’d be wary of buying an “N2”.
I’ve never been a board-layout guy, so I won’t be surprised by any errors I’ve made above in trying to remember the correct terminology from nearly 4 decades ago.
——–
Note * : Hardkernel: https://www.hardkernel.com/shop/odroid-n2-with-4gbyte-ram-2/.
Note ** : Should I assume that this is the primary or most reputable importer? https://ameridroid.com/products/odroid-n2-plus.
I’ve had exactly zero problems with the N2. The N2 Plus was a ‘speed up’ build with some other changes that look, to me, like a bit of ‘make it cheaper’ too. Smaller heat sink, for one thing.
I’d be happy to buy another N2 any day. It’s a solid little machine. I’ve had it in daily use on one TV / workstation area for as long as I’ve owned it. Only took it out of service lately as I needed to clean the layer of dust off of it… Yeah, it sat there “circuit side up” long enough to get enough dust to be a concern to me (but it never complained).
The N2 PCB is a different size and shape from the N2 Plus board too. So, IMHO, it’s a complete “do over” for no good reason. They did add a Real Time Clock battery socket too. The PCB revision history has one change from “sample” to production, then the only next change was for making it the N2+ with added battery and
https://wiki.odroid.com/odroid-n2/hardware
My suspicion is that the re-routing needed to accommodate the coin battery holder, plus the speed bump from 1.8 GHz to 2.4 GHz, likely had some subtle intermittent issue they didn’t catch in an early release prototype like they did with the original N2. Note they went directly to a production rev.0.5 for the N2+. Personally, I’d never do that.
Per Ameridroid:
I’m a 100% satisfied customer of theirs. I’ve bought all my Odroids from them (4 for sure…) and my Pine64 boards ( 3 of them – Pine64, Rock64, RockPro64), and some other stuff that isn’t at top of stack at the moment… PSUs, heat sinks and such.
ZERO issues. Everything arrived fast and correct and working.
Photos of the two board layouts here:
https://www.hardkernel.com/blog-2/odroid-n2/
and here:
https://wiki.odroid.com/odroid-n2/hardware
look very similar, other than the added coin cell holder.
But, IMHO, you can’t just add that much metal, take out that much board-surface conductor, and up the speed by that much without some risk of capacitance issues, at a minimum.
It might also just be that due to all the economic crap going on around Covid that the problem is not the board design, but just getting the materials to make it or getting fab time at the factory.
Oh, and I’m not sure if I mentioned this anywhere else…
I’ve got Devuan 2.0 ASCII running on the Odroid XU4. Once I figured out it was built to expect the eMMC boot device and I was running from a uSD, it was a lot easier to assemble a working form.
FWIW, I was just watching a video in 720p on it, and it was using about 60-80% (variable) of the 4 fast cores. At 1080p it will sporadically saturate the cores and glitch / drop frames. OTOH, one of my TVs is only 720p so I can’t use faster / more bits on it anyway ;-)
That was using Firefox. The Chromium browser does a “launch and die”, so something’s not right with it on this setup.
Also, dragging a window around has lag and some jitter, so clearly this build is not using the GPU but rather the CPU for Xorg / windows. Sloppy and lazy… OTOH, I can’t program a GPU so who am I to complain…
Though I would expect the GPU to be used in a proper port, especially when the hardware has been out a few years. But it DOES work and isn’t a pain, so there’s that.
IF I cared enough, I’d do the compile / build myself and “make it all go right”… but I haven’t, so clearly my sloth is enough that I can only manage to complain that others are only modestly less slothful…
I did a little calculation to rank my boards. DMIPS/MHz is how many Dhrystone MIPS you get per MHz of CPU clock speed. So the total integer DMIPS for a board would be:
DMIPS/MHz x MHz x # cores, summed over each core cluster
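As a sketch, that formula in Python. The DMIPS/MHz figures are from the tables in this post, but the clock speeds are my assumptions for each board (e.g. RockPro64 A72 cluster at 1.8 GHz) and may not match your particular units:

```python
# Total integer DMIPS estimate: DMIPS/MHz x MHz x cores, summed per cluster.
def board_dmips(clusters):
    """clusters: list of (dmips_per_mhz, mhz, n_cores) per core cluster."""
    return sum(d * mhz * n for d, mhz, n in clusters)

xu4       = board_dmips([(4.0, 2000, 4), (1.9, 1400, 4)])    # A15 + A7
odroid_n2 = board_dmips([(6.35, 1800, 4), (2.24, 1900, 2)])  # A73 + A53
rockpro64 = board_dmips([(4.72, 1800, 2), (2.24, 1400, 4)])  # A72 + A53
pi_m3     = board_dmips([(2.24, 1200, 4)])                   # 4 x A53

for name, total in [("XU4", xu4), ("N2", odroid_n2),
                    ("RockPro64", rockpro64), ("Pi M3", pi_m3)]:
    print(f"{name:10s} {total:8.0f} DMIPS")
```

With these assumed clocks, the XU4 actually edges out the RockPro64 on total integer throughput, and the N2 tops them all.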
The results are:
Which makes the Odroid XU4 equivalent to about 42 of the original Raspberry Pi boards, and about 5 1/2 of the original Pi M2 boards. The Odroid N2 is about the same as 6 Raspberry Pi Model 3 boards.
I’d noticed in use that the RockPro64 didn’t seem any faster than the XU4 on most things, but had used about 2 x as much memory. That agrees with these numbers. What little need there is for 64 bit math is just not enough to show up in perceived wait time for things like web pages. But you do notice the longer time to haul all those 64 bit pointers and data through memory of about the same speed…
Overall, the N2 just blows the doors off of all the other boards. Which probably explains why I keep using it the most by far ;-) Followed closely by the XU4 (that had been my favorite for a couple of years before the N2 showed up…)
The RockPro64 is nice, and fast enough, mostly. But really, I’d take either of the Odroids over it.
The surprising thing is that the Pi M3, despite being about 1/3 the speed of the RockPro64, is actually still usable, though obviously slower in doing things. It is “tolerable” in an emergency, or in a “too lazy to boot the other box for just this one web lookup” kind of way.
The other surprising thing is that the XU4 with 2 GB of memory has about the same memory fullness as the N2 with 4 GB of memory. It ought not be surprising, as 64 bit programs use 64 bit pointers and data instead of 32 bit. But I’d expected data to be better packed. The RockPro64 with 2 GB of memory runs out of it with just a few tabs open in the browser, and things start to hit the swap device.
IMHO, for 32 bit machines, 2 GB is fine, but for 64 bit machines, you really want 4 GB or more especially if you like lots of open tabs in a browser.
Also, for significant compute tasks like a Build Monster where you share out C compilation over a cluster with “distcc”, having a couple of N2 boards beats having a stack of a dozen Raspberry Pi Model 3 boards.
Basically, A Big Mother computer CPU still beats a gaggle of little ones. So for compute intensive tasks, go with the bigger faster boards even if more expensive.
The Odroid XU4, selling for $58, is the equivalent of buying Raspberry Pi Model 2 boards at $10.50 each, while the N2 is the equivalent of Raspberry Pi M3 boards at $16 each. BUT, you also need to buy a big dogbone case to stack your Pi boards, plus a half dozen PSUs, uSD cards, and a network appliance (switch / router) to connect them, etc. etc.
Essentially the Bigger Iron is a heck of a lot cheaper per MIP usable.
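That per-MIP cost claim is easy to sanity check. A rough sketch, where the prices are the post’s ballpark figures and the board equivalences come from the chart:

```python
# Rough cost-per-performance check. Prices are the post's ballpark figures
# ($58 for the XU4, ~$90 for the 4 GB N2), and the equivalences come from
# the DMIPS chart (XU4 ~ 5.5 Pi M2 boards, N2 ~ 6 Pi M3 boards).
xu4_price, n2_price = 58, 90
xu4_in_pi2, n2_in_pi3 = 5.5, 6

print(round(xu4_price / xu4_in_pi2, 2))  # dollars per Pi-M2-equivalent
print(round(n2_price / n2_in_pi3, 2))    # dollars per Pi-M3-equivalent
```

That lands near the ~$10.50 and ~$16 figures above, before you even count the case, PSUs, uSD cards, and switch that a real Pi stack needs.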
One Exception: Note that the 6 Pi boards come with 6 GB of memory, so if you have a bunch of jobs that need more memory, you have 2 more GB spread around the cluster.
OTOH, the N2 will knock out any given job much faster and so can turn over the memory faster too. So ‘how long it runs’ matters to total memory occupied per hour.
The Odroid N2 fast cores come in at about 11,430 DMIPS each, about 4 times the speed of the Pi M3 individual cores, so any given task that saturates One Core can complete in about 1/4 of the time, freeing the memory for the next task…
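That 4x single-core figure checks out roughly, assuming the Pi M3’s A53 cores run at 1.2 GHz and using the approximate Wikipedia DMIPS/MHz numbers:

```python
# Single-core comparison: N2 Cortex-A73 @ 1.8 GHz vs Pi M3 Cortex-A53
# @ 1.2 GHz (clock assumed), with approximate Wikipedia DMIPS/MHz figures.
n2_core = 6.35 * 1800   # ~11,430 DMIPS
pi3_core = 2.3 * 1200   # ~2,760 DMIPS
print(round(n2_core / pi3_core, 1))
```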
Bottom Line?
I’m only buying things as fast as the Odroid XU4 or N2 or faster going forward.
The other boards have uses, but I have more boards than uses in that class. (File Server, Squid Proxy Server, PiHole ad filtering DNS, Time Server…). It turns out that when run “headless” without Xorg and browsers killing them, even the Pi Model 2 boards are ‘lightly loaded’ and doing mostly nothing with those traditional infrastructure services running on them. (For a very long time the Pi 1 original board was my Squid Proxy Server, DNS, and time server and even it was mostly idle…)
I’m figuring I’ll be adding some test bed servers for things like a Web Server / Blog hosting, distributed P2P “tweet” server of some kind aka ‘micro-blogging’, a Faceplant analog, etc. BUT since all those Social Media things end up rate limited on the network interface anyway, even a Pi Model 2 is overkill.
It ends up a strange world. Need fastest biggest for highly compute heavy tasks or memory hogs (like browsers, Xorg, climate model devo, distcc) but even the cheapest little thing is way more than you need for anything that is a network server / service box. Oh Well…
FWIW, still working on the “Which are Chinese and which are not” list / posting… Just for now, realize Allwinner chips are made in China, so lots of boards use them: the Banana Pi series, my Pine64 board, much of the Orange Pi family, etc. Also, Rockchip is Chinese and used in several makers’ boards.
UPDATE:
Due to a discussion / comment by Hubersn here:
I’ve added the Raspberry Pi Model 4 speed calculations to my chart, even though I don’t own one.
Key point to note is that the RockPro64 still beats it, despite only 2 x A72 cores while the Pi has 4 x A72 cores. How? Well, first off the RockPro64 runs at 2 GHz while the Pi M4 is only 1.5 GHz. That bites. Then the 4 x A53 cores add another 13,800 DMIPS to the 18,880 of the 2 x A72 cores. Also note that single core performance will be 9,440 for the RockPro64 but only 7080 for the Pi M4. Things that are single threaded will go faster on the RockPro64.
Pingback: China In Chips & Boards, Or Not | Musings from the Chiefio
E.M.Smith commented on 29 January 2021 at 8:05 pm GMT [*] [**]:
I substantially agree. Opening many tabs, and being able to leave them open, is a great browser feature! For my manner of browser usage, I leave many tabs open as a reminder that they contain content that I want to get back to “real soon now”.
My suspicion, of at least several months standing, is that the Rustaceans and their Firefox-U.I. predecessors failed to recognize the need for managing memory for tabs by adopting least-recently-used (LRU) strategies, e.g.:
• For tabs that have not been accessed within some period of neglect extending backward from the present (defaulted but user-overridable), keep only the tab-table entry plus the previously loaded HTML/XML, but free all other storage associated with it. To keep LRU strategies as user-friendly as practical, don’t free the previously loaded HTML/XML until the period of neglect has lengthened further, but do free it then. Any user access to a neglected tab after that time will require reloading and rerendering. So the computing overhead required to reload & rerender neglected tabs when a user actually does get back to them ranks as b.f.d., esp. compared to enduring bloated working sets containing all the data underlying tabs that haven’t been accessed in few or several months. A user can always “save as” tabbed Web pages if he|she considers it important to retain exact instantaneous content.
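The two-stage policy in that bullet could be sketched like this. All names here (`Tab`, `sweep`, `PERIOD_OF_NEGLECT`) are hypothetical, invented for illustration; nothing like this is claimed to exist in Firefox.

```python
import time

# Sketch of the two-stage LRU tab policy described above.
# All names are hypothetical, not anything Firefox implements.
PERIOD_OF_NEGLECT = 30 * 24 * 3600   # defaulted but user-overridable
LONGER_NEGLECT = 90 * 24 * 3600      # after this, free the HTML/XML too

class Tab:
    def __init__(self, url, html):
        self.url = url
        self.html = html              # previously loaded HTML/XML
        self.rendered = object()      # stand-in for render tree, images, etc.
        self.last_access = time.time()

def sweep(tabs, now):
    """Free storage for neglected tabs, keeping each tab-table entry."""
    for tab in tabs:
        idle = now - tab.last_access
        if idle > LONGER_NEGLECT:
            tab.html = None           # returning requires reload & rerender
            tab.rendered = None
        elif idle > PERIOD_OF_NEGLECT:
            tab.rendered = None       # keep the HTML, free everything else
```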
I’m mystified that Rustaceans can believe that compile-time verification of pointer/object usage is adequate for what memory management must be done at run-time. LRU strategies are dependent on time-stamps, and the technical challenges are matters for run-time, not compile-time.
The challenges raised by LRU strategies were pretty much solved many decades ago, on mainframes for which 1 MB of magnetic core was often a generous amount of main memory, and programmers required good judgment about which data needed to be kept in main memory. So I’m mystified by programmers in this new decade of the 21st Century, in which 1 GB of DRAM for main memory just ain’t enough. I infer that the extravagance of spreading all their data thro’out main memory is the only way they know how to program anymore.
Have the youngish computing herds who’re obsessed with novelty decided that explicit storage-management code is “sooo 1970s!”[⎈], and that people who have the skills to do it properly are obsolete (“redundant” as the Brits would say), now that any programming language worth being used for serious projects–except of course, C, as handed down from God to Bell Labs–features garbage collection [✪]?
——–
Note * : https://chiefio.wordpress.com/2021/01/21/arm-chips-whats-a-digit-mean/#comment-138944.
Note ** : But also see Rustaceans leaping into the fray here in the Chiefio blog 2 years ago, beginning more-or-less with this apparent Rustaceans comment: https://chiefio.wordpress.com/2018/12/10/this-looks-like-why-firefox-is-a-pig/#comment-105103. And the responses without the guts to comment directly in the Chiefio blog: https://www.reddit.com/r/rust/comments/a5g3ca/this_looks_like_why_firefox_is_a_pig/. I know times have changed, but wasn’t such cowardice once considered a breach of netiquette?
Note ⎈ : Texts covering LRU strategies probably included in Donald E. Knuth ca. 1970: The Art of Computer Programming, Vol. I. And other sources.
Note ✪ : It seems to me that Algol 68 and Ada were exceptions. In the case of the eventual original Ada, 1st publicized by a DoD Standard, 1 explicit DoD requirement was to provide for explicit memory management (MIL-STD-1853, perhaps? It was 4 decades ago, and I plead unavailability of my relevant documents, them being, ummm, in storage).
Compu Gator commented on 2 February 2021 at 6:00 am GMT:
Well, no, actually.
• Time-stamps can be avoided by a browser that instead uses a queue to identify the freshest of still-open tabs. But its length, which is configurable, is deliberately too short to hold all the open tabs of tab-enthusiasts like me. I suppose that the tab entries in the queue would be most efficiently implemented as pointers to the tab-table entries. The concept, in sketchy form [†], is that the most recently accessed tab is placed into the back of the queue. As long as a tab is in that queue, all memory related to a particular tab, including its rendering, is retained. But when a still-open tab exits the queue, that causes freeing of most of its memory, except for its tab-table entry.
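That queue idea could be sketched with an ordered map. `TabQueue`, `QUEUE_LEN`, and `touch()` are my hypothetical names; in a real browser the entries would be pointers into the tab table, as the comment above suggests.

```python
from collections import OrderedDict

# Sketch of the queue variant: no timestamps, just a fixed-length queue
# of the freshest tabs. All names here are hypothetical.
QUEUE_LEN = 8    # configurable; deliberately shorter than a tab-enthusiast's
                 # open-tab count

class TabQueue:
    def __init__(self):
        self.fresh = OrderedDict()    # front = stalest, back = freshest

    def touch(self, tab_id, tab):
        """Record an access: move (or add) the tab to the back of the queue."""
        self.fresh[tab_id] = tab
        self.fresh.move_to_end(tab_id)
        while len(self.fresh) > QUEUE_LEN:
            _, evicted = self.fresh.popitem(last=False)   # falls off the front
            evicted.rendered = None   # free most memory; the tab-table entry
                                      # itself survives elsewhere
```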
——–
Note † : I’m already thinking ineffectively tonight because of something stressful that came up yesterday, and provided only 24-hour notice. My nearly dozen “ooops!” foul-ups on text formatting in my cited comments are a result of that. So here I sit, drinking coffee in the wee dark hours.
@CompuGator:
Yup, exactly.
Per some (stealing your term) Rustacean tossing rocks at me somewhere: “Frankly my dear, I don’t give a damn.” Gave up caring what random people think about me long long long long ago. About 9th grade I think it was… Nerds who are “often right” get rejected early in High School (and before) and many of us ‘get over it’ rather quickly.
My mantra has generally been “Consider the source…” and move on.
FWIW, some of it may be an interaction of the browser with the OS. (Or the folks who made BRAVE Browser [based on Chromium] did a bit of tweak in the porting…) I say this because on my (now approaching ancient) Samsung Tablet, the Brave Browser instance has some unGodly number of tabs open. At first I was hesitant, then OK with it, and now it’s become a challenge: find out what the limit might be. I’d guess it is well over a hundred at this point. It doesn’t seem to care at all.
https://en.wikipedia.org/wiki/Samsung_Galaxy_Note_10.1
Says the tablet has something like 2 GB of memory. Arm processor 4 core @ 1.4 GHz and Android. Samsung Note 10.1 from about 8 years ago ( I got it a few months after first release).
I don’t have Brave on anything else to be able to compare, but I do have Chromium (using it now on the Odroid N2), and using Chromium on machines of “lesser memory” has resulted in swapping after “not too many tabs open”. On a 1 GB 64 bit box I’ve had it ‘crash tabs’, losing their contents, at about 6 to 12 high page weight tabs open (it puts up a black tab with an X in it, but will reload if you hit the circle arrow… so it still has the pointer). So there is some difference between either Brave / Chromium or Android / Linux, or the interaction of Browser / OS. The Linux In Question (Armbian / Debian) seems to ‘hit a wall’ of Swap Thrash Lock at about 1.1 GB of swap in use. Slows down painfully… and at some point one or the other effect dominates. Even my 4 GB board hits swap thrash at about 1.1 GB swap used (though it is much harder to get to that point ;-)
The behaviour of native Chromium seems to be to NOT load a tab at first launch of the Browser, but just the pointer to the cache entry. Only if you go to / open the tab does it load and render. BUT, then, seemingly dependent on the particular instance of the browser (or I just haven’t figured it out fully… yet…) it holds onto that tab data in memory (and eventually rolling to swap) even as you change to other tabs as the active tab.
Given that all the tab data is sitting in cache anyway, I don’t see what the big need is to hold all of it in active memory. Rolling to swap is just as expensive as a cache reload (maybe more so as you must ‘write then read’ when cache may know it’s unchanged so need not write…)
But what do I know, I’m just some guy who writes FORTRAN and C and other ancient languages, thinks FORTH in 16 Meg of memory is Boss, and can’t figure out how on earth folks manage to use Gigs of memory even if they tried…
Per Time Stamps: LRU can be time stamped, or just a queue like you pointed out, or I think you can just have a table with ‘use hit counts’ in it. Though technically that isn’t how recent but how much / often. IIRC some schedulers do that. Bits used a lot get a high count and kept in memory, rarely used has a low count and rolls to swap (and some bits get a ‘do not swap’ bit as you need them to operate swap ;-)
Having run Unix on a Cray Supercomputer with 8 M WORDS of memory (in the XMP-48 the 4 was cores / CPUs and the 8 was Megawords of 64 bit memory, so a 64 Megabyte memory…) and having it be an astounding “no waiting” experience on anything but long intense model runs, I really do struggle to understand needing such huge chunks of memory. All I can figure is that bad I/O channels and lousy slow storage devices make a ‘read from storage’ extremely slow compared to the very fast channels and incredibly fast disks we had then… maybe…
But that system often had dozens of folks all logged on at once, all doing compiles and active sessions and some running “Big Jobs” (queue of Moldflow and similar simulations that would run for hours modeling plastic flow and more). All in that dinky memory space. But Cray put a LOT of work into the compiler code and the OS tweaking… And it mostly ran C and FORTRAN and not much else…
So I just boggle at how BAD the “modern” languages and programmers must be to get about the same performance, now, with Gigs of memory…
“CRustacean” is not “my term”. I found it on the language’s Wikipedia page [♋]. Now having visited the Web site of the folks who use it affectionately, I’d rather avoid its fans: Their overtly woke attitude exudes the odor of required conformity of thought [♋♋]: Because, golly! A mascot bearing a traditionally male name might secretly “identify” as a she-crab, despite possessing the narrow abdominal flap that conclusively distinguishes crabs as male for the stereotypical crab genus Cancer [♋♋♋].
So perhaps it should be no surprise that they have a “Code of Conduct” page, from which I quote [†]:
Ahhh, yes: Feelings! The latter bullet encourages indulgence in the tyranny of a minority of one! It looks like the work of people who reject the fundamental principle that “words have meanings”, but instead insist that “words mean whatever we say they mean, whenever we say it, and regardless of whatever we have said before”.
——–
Note ♋ : Cartoon image of “Ferris” the “Rustacean” was flowed into a right-side sidebar within the cited section, but other readers may see it rendered somewhere else on this page: https://en.wikipedia.org/wiki/Rust_(programming_language)#History.
Note ♋♋ : “Ferris” the “Rustacean”: https://www.rust-lang.org/learn/get-started#ferris.
Note ♋♋♋ : “Biological Info: Anatomy of a Blue Crab”, depicting diagnostic undersides of each sex (the same diagnostic is applied to Dungeness crabs of the Left Coast): https://www.seagrantfish.lsu.edu/biological/anatomy_crab.htm.  Perhaps more detailed, or more usefully laid-out, version: https://www.seagrantfish.lsu.edu/pdfs/anatomy_crab.pdf.
Note † : “Code of conduct”: https://www.rust-lang.org/policies/code-of-conduct.
OK, not “your” term, but you were where I first heard it…
OK, ANY language that has PC Enforcement and a required code of conduct does not need me to use it. Or support it in any way.
A language is just a tool for making a computer do something. It needs nothing other than an Engineering Manual.
Well! Maybe it’s time to start applying a new category to the names of programming languages: “snowflake languages (tm?)”. For marking such languages, instead of using “(R)” or “(tm)”, Unicode defines 3 styles of snowflake: U+2744–2746 (❄–❆), e.g.: “Rust❄”.
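For reference, those three code points really are defined in Unicode’s Dingbats block, and a one-liner shows them:

```python
# The three snowflake code points mentioned above, U+2744 through U+2746.
for cp in range(0x2744, 0x2747):
    print(f"U+{cp:04X} {chr(cp)}")
```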