The various ARM CPU chips are vastly different in their abilities. Sometimes, a single digit difference can make a BIG difference in performance.
In particular, I have 2 SBCs with ARM chips of type A7x where the x differs by one: A72 vs A73 cores. The RockPro64 has 2 x A72 cores and 4 x A53 cores. The Odroid N2 also has 6 cores, but it is 4 x A73 and 2 x A53 cores.
The main CPU of the N2 is based on big.Little architecture which integrates a quad-core ARM Cortex-A73 CPU cluster and a dual core Cortex-A53 cluster with a new generation Mali-G52 GPU.
Thanks to the modern 12nm silicon technology, the A73 cores runs at 1.8Ghz without thermal throttling using the stock metal-housing heatsink allowing a robust and quiet computer.
powered by Rockchip’s RK3399 processor. The RK3399 is a 6-core chip with two ARM Cortex-A72 CPU cores, four Cortex-A53 cores, and Mali-T860MP4 graphics.
I’m not going to get into the use of the graphics engine for parallel computes, nor how that can drastically change the ‘throughput’ of a given system for parallel codes. That’s a whole ‘nother topic. Here, I’m just going to look at what changes with A72 vs A73 and similar “single digit” differences.
Now the ‘first blush’ is that you get 4 big and 2 little in the N2, but 2 big and 4 little in the RockPro64. They both sell for about $70 with 2 GB of memory, and the 4 GB version of each is closer to $90. So are they really close to the same board in performance?
The wiki on ARM cores has a column WAAAaaaay off to the right with “Dhrystone” benchmark scores. Note that it is integer focused, so it doesn’t measure the effectiveness of your floating point hardware or string manipulation. Some folks find that limitation unacceptable. I think it is quite reasonable as a general first yardstick. LOTS of things in a computer are done with INTs.
Dhrystone is a synthetic computing benchmark program developed in 1984 by Reinhold P. Weicker intended to be representative of system (integer) programming. The Dhrystone grew to become representative of general processor (CPU) performance. The name “Dhrystone” is a pun on a different benchmark algorithm called Whetstone.
It also doesn’t pander to the “modern” language tribe that uses OO and interpreters and such, so they are not too fond of it either, but whatever. It’s still a good raw power measure IMHO.
So what’s that wiki on ARM cores say about those two?
Cortex-A72 2015 3-wide 4.72
Cortex-A73 2016 2-wide ~6.35
The A72 has a 3-wide decode ability and 5-wide dispatch, so has more parallel paths it can execute at once in a predictive way (then throw away the one that wasn’t taken in the ‘if’ branch) while the A73 has cut that back by one each. Yet it is 1.35 x as fast. Easily 1/3 faster. AND it runs cooler.
How about those A53 cores?
Cortex-A53 2014 2-wide 2.24
While it has 2 wide decode of instructions, it has no parallel dispatch at all. Then the performance is just 2.24 Dhry. So the A72 is about 2.1 x faster and the A73 is about 2.8 x faster.
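Those ratios are just one table score divided by another. A minimal sketch of the arithmetic, using only the three figures quoted above (the dictionary and function names are made up for illustration):

```python
# Dhrystone figures as quoted above from the wiki table.
DHRYSTONE = {"A53": 2.24, "A72": 4.72, "A73": 6.35}

def relative_speed(core, baseline="A53"):
    """How many times faster `core` scores than `baseline`."""
    return DHRYSTONE[core] / DHRYSTONE[baseline]

for core in ("A72", "A73"):
    print(f"{core}: {relative_speed(core):.1f} x an A53")

# The "one digit" comparison: A73 against A72.
print(f"A73 vs A72: {relative_speed('A73', 'A72'):.2f} x")
```

It only compares table numbers, so it says nothing about clock speed or heat on a particular board.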
In particular boards / SOCs (System On Chip) you would need to adjust those for any added CPU Clock speed the maker got compared to the Arm Holdings baseline. But as a first approximation it gives a pretty good idea what you are buying.
So figure an A53 as the baseline of “one”. Then the RockPro64 is 4 + 2(2.1) = 8.2 total, and the Odroid N2 is 4(2.8) + 2 = 13.2. So in total you could expect the N2 to do roughly 1.6 times as much computes as the RockPro64 (modulo exact clock rates used, and any thermal throttling on the RockPro64, which has no heatsink unless you pay more to get one…).
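The same tally as a rough sketch in code. It assumes every core is kept busy and that the rounded per-core ratios simply add, which is the same first approximation used above. The names are invented for the example:

```python
# Per-core speed in "A53 equivalents", using the rounded ratios above.
A53_EQUIV = {"A53": 1.0, "A72": 2.1, "A73": 2.8}

# Core counts for each board as described in the text.
BOARDS = {
    "RockPro64": {"A72": 2, "A53": 4},
    "Odroid N2": {"A73": 4, "A53": 2},
}

def board_total(cores):
    """Sum of A53 equivalents over all cores on a board."""
    return sum(count * A53_EQUIV[core] for core, count in cores.items())

totals = {name: board_total(cores) for name, cores in BOARDS.items()}
for name, total in totals.items():
    print(f"{name}: {total:.1f} A53 equivalents")

ratio = totals["Odroid N2"] / totals["RockPro64"]
print(f"N2 vs RockPro64: {ratio:.1f} x")
```

This is an upper bound for a job that really does use all six cores. A single-threaded job only sees the per-core ratio.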
That’s how much that one digit on the A7x and the other swap of 2 vs 4 gets you.
You really really must look at the relative CPU speeds of the different chip / Core numbers to know what you are getting.
My experience with both boards has been roughly that, too. The N2 is largely “no waiting” and the RockPro64 sometimes lags just a bit, and I notice it isn’t quite as fast. Also note that often it is a SINGLE core that’s pegged, as the job isn’t multithreaded and does not run all the cores. In that case, I’ve still noticed it being about 1/3 faster. 3 vs 4 seconds isn’t that much, but you will notice it sometimes. 30 minutes vs 40 minutes, you know…
Some others to note? An A57 core is 4.6 so almost the same as an A72 on that benchmark.
When they get down to “Small systems” SBCs the other A7x cores will make a big dent:
Cortex-A75 2017 3-wide 8.2-9.5
Cortex-A76 2018 4-wide 10.7-12.4
Yeah, about 9 and about 11 for those two. So ONE A75 core is about the same as all 4 x A53 cores in a Raspberry Pi Model 3 (4 x 2.24 = about 9), and one A76 core is worth about 5 A53 cores. So a 4 x A76 core SOC will do about the same as 5 Raspberry Pi Model 3 boards… Just sayin’…
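A quick check of that arithmetic, as a sketch. It uses the rounded “about 9” and “about 11” figures rather than the ranges in the table, and the helper names are hypothetical:

```python
A53_DHRY = 2.24                # per-core A53 score from the table
PI3_BOARD = 4 * A53_DHRY       # Pi Model 3: 4 x A53 cores

# Rounded per-core figures used in the text ("about 9", "about 11").
ROUGH = {"A75": 9.0, "A76": 11.0}

def a53_equivalents(score):
    """How many A53 cores one core with this score is worth."""
    return score / A53_DHRY

def pi3_boards(core, count):
    """How many Pi Model 3 boards `count` cores of `core` add up to."""
    return count * ROUGH[core] / PI3_BOARD

for core, score in ROUGH.items():
    print(f"one {core} core: {a53_equivalents(score):.1f} A53 cores")
print(f"4 x A76: {pi3_boards('A76', 4):.1f} Pi Model 3 boards")
```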
Though note that the Pi M4 has A72 cores. Now it IS a fairly fast board due to that, but… the A72 cores run hot compared to the A73, so you need a LOT of heat extraction to use it without thermal limiting. Unfortunately, it does not come with a heat sink. All sorts of folks have done all sorts of exotic stuff trying to keep them cool enough, from powered fans to heat pipes and more. Compare the Odroid N2 with its built-in passive heat sink. No fan needed.
Oh, and the Odroid N2+ ups the clock speed to 2 GHz (compare the Pi at 1.5 GHz), so there’s that…
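To fold clock speed into the comparison, here is a rough sketch. It assumes the wiki figures are per-MHz scores that scale linearly with clock, which is the adjustment described earlier, and it ignores thermal throttling entirely. The clocks are the 2 GHz and 1.5 GHz quoted above; the function name is invented for the example:

```python
# Per-MHz Dhrystone figures from the wiki table quoted earlier.
# Assumption: they scale linearly with clock and nothing throttles.
DHRY_PER_MHZ = {"A72": 4.72, "A73": 6.35}

def core_score(core, clock_mhz):
    """Estimated single-core Dhrystone score at a given clock."""
    return DHRY_PER_MHZ[core] * clock_mhz

n2_plus = core_score("A73", 2000)   # Odroid N2+ big core at 2 GHz
pi_4 = core_score("A72", 1500)      # Pi M4 core at 1.5 GHz

print(f"N2+ A73 core: {n2_plus:.0f}")
print(f"Pi M4 A72 core: {pi_4:.0f}")
print(f"Ratio: {n2_plus / pi_4:.2f}")
```

On those assumptions, the per-core gap grows from about 1.35 x on the table numbers alone to roughly 1.8 x once the clocks are included.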
There’s a WHOLE LOT more to benchmarks than just one number, Dhrystone or otherwise. For many kinds of use, the memory quantity and speed matter more, or the speed to disk (USB 2.0 vs USB 3.0, which is about 10 times faster). The Raspberry Pi, unfortunately, regularly cheaps out on both heat extraction, so your board heat limits to very low performance, sometimes under half; and on I/O structure, so any I/O bound jobs are just slugs, and jobs that are not I/O bound on other gear sometimes become I/O bound on the Pi.
Overall, the Odroid family has done a much better job of the hardware design. Real barrel connectors for power (where the micro-USB for power was OK on the original Pi, USB power is at or sometimes over limit on later bigger Pis, and really you need a different power connector), heat sinks big and included, good I/O designs. Other makers grade out between those two.
Odroid mostly just ships a variation of Ubuntu as their operating system. Their boot process is a bit different and arcane in some ways, so a lot of developers just don’t port to their stuff. Raspberry Pi has just about every OS on it (despite having their own arcane boot process… but ‘size matters’ and there’s millions of Pi boards sold). Other boards again grade out between them.
So I must say that if you are going to buy an SBC, do check out the available operating system choices and what age / quality of ports exist. Make sure the one you want is available on the hardware you like.
For completeness, the V7 instruction set 32 bit ARM cores:
ARM Cortex-A7 1.9
ARM Cortex-A15 3.5 to 4.01
ARM Cortex-A17 4.0
The A7 cores are what is in the earlier Raspberry Pi Model 2 boards (they later got an upgrade) and many other systems. They are also the “little” cores in my Odroid XU4, while the A15 are the BIG cores in the XU4. It has 4 of each. I just note in passing that the A15 and A17 both have some parallel execution and both score about a 4 Dhrystone rating.
So my XU4 has roughly 4 x 4 + 4 x 1.9 or 23.6 total Dhrystones of go juice. Divide that by 2.24 to compare it in terms of A53 cores and you get 10.5, or roughly 1/2 way between the RockPro64 and the Odroid N2 in total computes. But when limited to a one-core job, that one A15 core is 4/4.72 = 85% of an A72 core. Damn Fast still, but not as fast. It is 4 / 6.35 or 0.63, a little under 2/3 the speed of the A73 core in my Odroid N2. Then, for some things, that 64 bit int matters a lot compared to the 32 bit int, and you can end up at 1/4 the speed or less as you do double precision math.
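The XU4 numbers can be reproduced the same way. This is a sketch that takes the A15 at a round 4.0, as above, and adds cores up as if they were all busy. The variable names are made up for illustration:

```python
# Dhrystone figures quoted in the text (A15 rounded to 4.0).
DHRY = {"A7": 1.9, "A15": 4.0, "A53": 2.24, "A72": 4.72, "A73": 6.35}

# XU4: 4 big A15 cores plus 4 little A7 cores.
xu4_total = 4 * DHRY["A15"] + 4 * DHRY["A7"]
xu4_in_a53 = xu4_total / DHRY["A53"]

# Single-core comparison of the A15 against the two A7x big cores.
a15_vs_a72 = DHRY["A15"] / DHRY["A72"]
a15_vs_a73 = DHRY["A15"] / DHRY["A73"]

print(f"XU4 total: {xu4_total:.1f} ({xu4_in_a53:.1f} A53 equivalents)")
print(f"A15 vs A72: {a15_vs_a72:.0%}")
print(f"A15 vs A73: {a15_vs_a73:.0%}")
```

None of this captures the 32 bit vs 64 bit word size effect, which does not show up in the Dhrystone figure at all.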
So while I love the little XU4 for “general computing” like running a browser, where the 32 bit word size means a whole lot less memory wasted; on hard core high precision math it just isn’t that fast.
How fast a given job runs depends a lot on the particulars. Single thread or multi-threaded? Double precision 64 bit math, or 32 bit? Different boards have different “best uses”. Be careful when you buy to match the problem you are solving to the equipment, and watch those single digit differences. They matter.