1 Dry Number Look at SBC Speeds

Every computer has things it does better, and things it does worse: handling integer math, floating point math, moving bytes (I/O speed), or doing character manipulation. So it goes. Old IBM 360 machines actually had slow processors, but several high speed data channels to disk. IBM realized that the typical business transaction moved a large record, but only did math on one or a few fields. The old Cray did a block of 64 double precision math operations with one instruction, but was not so good at “scalar” problems or moving long records of bits. So trying to compare computers based on just one performance number is a bit daft.

You really MUST characterize the problem you are solving and then match that to the kind of computer prior to doing comparisons.

But I’m going to ignore that and look at Just One Number. The Dhrystone benchmark performance.

https://en.wikipedia.org/wiki/Dhrystone

Dhrystone is a synthetic computing benchmark program developed in 1984 by Reinhold P. Weicker intended to be representative of system (integer) programming. The Dhrystone grew to become representative of general processor (CPU) performance. The name “Dhrystone” is a pun on a different benchmark algorithm called Whetstone.

With Dhrystone, Weicker gathered meta-data from a broad range of software, including programs written in FORTRAN, PL/1, SAL, ALGOL 68, and Pascal. He then characterized these programs in terms of various common constructs: procedure calls, pointer indirections, assignments, etc. From this he wrote the Dhrystone benchmark to correspond to a representative mix. Dhrystone was published in Ada, with the C version for Unix developed by Rick Richardson (“version 1.1”) greatly contributing to its popularity.

It isn’t ideal, but it’s good enough for most things and widely used. For a more complete set of benchmarks of many sorts, you can go wandering in the open benchmark site here:

http://openbenchmarking.org/

Basically, what I’m going to do is look almost entirely at just those boards which have a tested production release of Devuan available. If folks don’t care about SystemD, then many more boards have a native Debian or Ubuntu port and the same exercise can be done for them. For now, for this look, I’m starting with the Devuan native boards. (IFF I don’t find anything I like, I can expand the search later).

These are all ARM CPU boards, and mostly all v7 or v8 instruction sets ( 32 bit armhf or 64bit arm64 ). But it isn’t just a 2 way split. Not all cores are created equal per MHz of clock.

A couple of decades back, many of the performance “tricks” that distinguish these cores were reserved for high end CISC machines (Complex Instruction Set Computer). Now they are showing up in what is nominally a RISC (Reduced Instruction Set Computer) design like the ARM. How many instructions can be decoded in parallel? How are the instructions “pipelined” so the next one is started while the last one isn’t done yet? Is hardware floating point fast, faster, 64 bit, 32 bit, or missing? Can a branch be speculatively executed, then thrown away if a later test shows it wasn’t taken? So all cores, even of nominally the same instruction set and architecture, are not created equal. This page gives a rough Dhrystone factor for the various ARM chips along with some details about things like decode width, pipeline depth, Floating Point Unit (FPU) and out of order execution. At the far right is a “DMIPS/MHz” factor.

https://en.wikipedia.org/wiki/Comparison_of_ARM_cores

What I’m going to do is look up the chipset for each board, the CPU type, and the MHz, then “do the math”, for each of a bunch of boards that have a native Devuan available for download. I’m getting the chip set and MHz information from here:

https://en.wikipedia.org/wiki/Comparison_of_single-board_computers

The list of supported boards is from the Devuan site readme here:

https://files.devuan.org/devuan_jessie/embedded/README.txt

I’m leaving out the Chromebooks as I’m not looking for a laptop right now; similarly the Allwinner tablet and the Nokia phone were skipped. I’ve also skipped the “Lamobo R1” just because I’ve never heard of it… and the Cubietruck Plus using the A83T was not in the comparison wiki. Note that many of these boards use the Allwinner A10 or A20 chips, so there are not that many distinct comparisons to make, really. Though other boards use the same H3 or Samsung chips at different MHz, so some variations creep back in.

Currently supported images:

* Acer Chromebook (chromeacer)
* Veyron/Rockchip Chromebook (chromeveyron)
* Nokia N900 (n900)
* Odroid XU (odroidxu)
* Raspberry Pi 0 and 1 (raspi1)
* Raspberry Pi 2 and 3 (raspi2)
* Raspberry Pi 3 64bit (raspi3)

Allwinner boards with mainline U-Boot and mainline Linux can be booted
using the sunxi image, and flashing the according u-boot blob found in
the u-boot directory here. The filenames are board-specific, but this
file is commonly known as “u-boot-sunxi-with-spl.bin”.

Currently supported Allwinner boards:

* Olimex OLinuXino Lime (A10)
* Olimex OLinuXino Lime (A20)
* Olimex OLinuXino Lime2 (A20)
* Olimex OLinuXino MICRO (A20)
* Banana Pi (A20)
* Banana Pro (A20)
* CHIP (R8)
* CHIP Pro (GR8)
* Cubieboard (A10)
* Cubieboard2 (A20)
* Cubietruck (A20)
* Cubieboard4 (A80)
* Cubietruck Plus (A83t)
* Lamobo R1 (A20)
* OrangePi2 (H3)
* OrangePi Lite (H3)
* OrangePi Plus (H3)
* OrangePi Zero (H2+)
* OrangePi (A20)
* OrangePi Mini (A20)
* Allwinner-based q8 touchscreen tablet (A33)

All the A10 use the same MHz, as do all the A20, so only one line of data for each. The general layout of the comparison is “chip set”, core count and architecture type, scaling factor (DMIPS/MHz), then a relative performance number for that core type at that MHz, and a total for all the cores on the chip set. Finally, a list of the boards using that chip set to make figuring out who’s an A20 easier…

The Rel. Perf. is good for comparing how fast a monolithic task completes in one core (like some browser tasks). It tends to indicate how the board feels in terms of response in use. The Total is better for indicating how much gets done on long, fully loaded tasks like running models or doing BOINC. The A15 cores have a range of DMIPS/MHz scale factors from 3.5 to 4; I’ve just used 4 in the posting to keep the chart manageable.

Chip # x Type MHz   Scale  Rel   Total  Boards
Set                 Factor Perf  Rel Perf
             
A10  1   A8   1000  2      2000   2000   Olimex Lime, CubieBoard    
A20  2   A7   1000  1.9    1900   3800   Olimex Lime2, Olimex Micro, Banana Pi & Pro, CubieBoard 2, CubieTruck, Orange Pi & Mini
A80  4   A15  1300  4      5200  20800   CubieBoard4
     4   A7   1300  1.9    2470   9880
                                 30680 total for A80
H3   4   A7   1536  1.9    2918  11672   Orange Pi 2 & Plus
H3   4   A7   1200  1.9    2280   9120   Orange Pi One & Lite
H2+  4   A7   1200  1.9    2280   9120   Orange Pi Zero
R8   1   A8   1000  2      2000   2000   C.H.I.P.
Broadcom - Raspberry Pi
2835 1 ARM11   700  1.25    875    875   R.Pi B+
2836 4   A7    900  1.9    1710   6840   R.Pi M2
2837 4   A53  1200  2.3    2760  11040   R.Pi M3
Samsung 
5410 4   A15  1700  4      6800  27200   Odroid XU
     4   A7   1200  1.9    2280   9120
                                 36320 total for Samsung 5410

Not Running Devuan 1.0 native, but via Armbian "Uplift":
Amlogic
905  4   A53  1500  2.3    3450  13800   Odroid C2
805  4   A5   1500  1.57   2355   9420   Odroid C1+
Samsung
5422 4   A15  2000  4      8000  32000   Odroid XU4
     4   A7   1400  1.9    2660  10640
                                 42640 total for Samsung 5422
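
For anyone who wants to redo or extend the table, the arithmetic is simple enough to script. Here is a minimal sketch in Python; the scale factors and clocks are the same rough estimates used above (from the ARM cores comparison page), and the per-core “Rel Perf” number is really just an estimated DMIPS, since DMIPS/MHz times MHz gives DMIPS (a DMIPS being Dhrystones per second divided by 1757, the VAX 11/780 score).

# Rough sketch: reproduce a few rows of the table above.
# Per-core Rel Perf = scale factor (DMIPS/MHz) x clock (MHz)
# Total = sum over clusters of (core count x per-core Rel Perf)
chips = {
    # name: list of clusters as (core count, DMIPS/MHz, MHz)
    "A20 (Banana Pi, CubieBoard2, ...)": [(2, 1.9, 1000)],
    "BCM2837 (R.Pi M3)":                 [(4, 2.3, 1200)],
    "Exynos 5422 (Odroid XU4)":          [(4, 4.0, 2000), (4, 1.9, 1400)],
}
for name, clusters in chips.items():
    per_core = [round(scale * mhz) for _, scale, mhz in clusters]
    total = sum(cores * scale * mhz for cores, scale, mhz in clusters)
    print(f"{name:36s} per-core {per_core}  total {total:.0f}")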

Now there you can see in one number just why that Odroid XU4 spent so much time doing nothing and was very crisp on web pages and such. The octo core chips are just monsters. More than 3 x the speed of the C2, and single core performance at about 3 x a Pi M3 core. To find some other board like it, running fairly nicely, well, that’s going to be hard, or expensive, or both.

Still, all I need is “fast enough” really. Given how the Pi M3 does on editing WordPress pages (lots of bytes sent back and forth and a bit of typeahead), I’m likely to need something closer to 3000 or 4000 single core speed. That’s mostly CubieBoards and Odroids. (Or I give up on Devuan, or accept an x86, or make some other compromise with my principles).

This is the same reason I bought the XU4 and the C2 in the first place. Oh Well.

In Conclusion

I suspect I mostly just need to do my “postmortem” as planned and then work on getting some code working rather than worry about building more “stuff”. I’ve got enough rig for the early stages of work and I can easily use something else as my “daily driver”. But “when the time comes”, knowing how many DMIPS per $$$ you get for any given board is going to be a key number. Those Dhrystones get divided by Dollars and you have yet another very interesting number. All of which needs leavening with things like “Float vs Integer” performance and “can I use the GPU?” along with “Can I get out all the heat at full load 24 x 7?” (so “effective Dhrystones” over hours…) and “Is the I/O fast enough to keep the CPUs fed?”…

But that is how you evaluate the “buy” decision when adding more computes: how much do the computes cost, how many can you get done before the machine breaks or becomes obsolete, and how much work is it to keep it up and running right? This is just a first rough cut number, but it lets you rapidly dump some options as “uninteresting”.


13 Responses to 1 Dry Number Look at SBC Speeds

  1. p.g.sharrow says:

    The creation of the modern computer industry has required the input of vast numbers of people, the best of the best of billions. A valid point of consideration is the base of contributors that might work on the programming…pg.

  2. E.M.Smith says:

    @P.G:

    So true! Microsoft is a great example of how poor software can consume computes like crazy, and Unix an example of clean code running fast. The modern PC is faster than the supercomputers of the 1980’s yet delivers little more to the MS Excel or Word user than it did then (mostly more “eye candy” and application integration).

    Microsoft assumes all added computes are for them to use in making software more cheaply (less programmer time and skill). Unix is still as efficient as ever. (So is most of Linux, but recently the GUI writers and some others have been more wasteful of resources and systemd is imposing a MS Registry like mode of function).

    Cray wrote some incredibly efficient compilers and you were nearly running at bare metal level with their FORTRAN in terms of speed. My biggest complaint about modern software is the rise of Object Oriented programming that tends to produce fat, slow code; followed closely by the growth of a zillion languages. You just can’t make that many good compilers. Then folks use 6 languages to write one application… so 6 libraries are loaded into memory…

    Part of why I like Old School languages like FORTRAN and C and use Linux and Unix whenever possible often from a command line. It means a $20 SBC can work wonders… The Orange Pi is roughly doing what I bought a $500,000 VAX to do in the 80s. (Though not as many folks asking for login accounts as my “staff” is smaller :-) I deliberately used a Pentium class machine to port / run GIStemp just to make the point…

    My Brother-in-law showed me a graph from his work at NASA (aeronautics) that showed improvements in computes from Moore’s Law and from software & algorithm improvement. The software curve (line on log graph IIRC) was steeper than the Moore’s Law line… so good software matters more than Moore’s Law. The necessary corollary to that is that bad software can consume all hardware improvement (and then some…).

    IMHO similar issues show up in the climate models. Lots of computes thrown at bad ideas and with mediocre programming skills. Oh Well. First I need to get one to run, then I can work on fixing it :-)

  3. Lionell Griffith says:

    “First I need to get one to run, then I can work on fixing it”

    Good luck with that. Some code is simply so bad it cannot be fixed. Badly conceived, poorly structured, non-existent internal documentation, nonfunctional variable and procedure names, spiderweb coupling, functions with multiple responsibilities, uncontrolled use of globals, massively numerous randomly nested gotos, and external documentation so obsolete that it refers to nonexistent code are some of its worst flaws. Any one of these flaws is bad enough, but get three or more of them at once and the code is hopeless.

    Such code can never be made to work. What works was never defined. It was built by accretion to satisfy a fuzzy idea that constantly morphed. This kind of code cannot be tested. To test presumes knowledge of correct behavior. This did not exist. The best that could be said is that the code had/has behavior. The meaning and validity of that behavior cannot be known.

  4. E.M.Smith says:

    @Lionell:

    So true! Thus my sloth on making progress.

    Oh, and add to your list that these often reflect decades of grad students adding layers, so something of an archaeological dig… then season with these codes NOT being designed with one reality in mind, but rather built as a kit of processes and parameters that you glue together on the fly to conduct “experiments”. There is just no way to properly QA test that.

    My idea is to get a simple one to run, then hand it and some typical settings over to the public to play with (learn from, i.e. see the problems with them). THEN use it as a kit of parts to make a more proper reality oriented model (water emphasis, solar variable, spectral distribution included, convective non-radiative troposphere, CO2 radiation in the stratosphere to space, etc.)

    I expect it to take me years. (Maybe I need a go-fund-me to rent a few programmers more… a $1/2 Million could get it done in a year or less, I estimate. I’m pretty good at those estimates having done it for a living for decades.).

    But the task for today is to take the raw performance ranks above and turn them into PiM3 ratios and $/computes… then estimate computes needed, number of boards x time, and from that ultimate $ (or longer month for my budget). It will be a rough estimate but enough for a sanity check. I can cross check with known model run times on Big Iron… but the model I have in mind is already running on PCs. IIRC, with sub-month run times. Ought to be able to get down to a day with about 30 boards of ARM (or fewer if like the XU4 :-)

    Figure about $1500 of hardware for a credible system, and a few weeks making bits of code parallel (once it is built and running at all…) possibly as low as $500 if the GPUs can be brought to the party…

    Unfortunately, I need to be Plumber and Carpenter for a few weeks instead of Program Manager and Lead Programmer… something about cash flow when not employed for money inhibiting buying services from others… or doing what you must instead of what you want…

  5. Lionell Griffith says:

    “doing what you must instead of what you want…”

    Ah yes. Life happens and you must service its needs or it doesn’t last as long or as well.

    All solvable problems have solutions that come down to the correct application of time, money, and brilliance. Brilliance can often save a lot of time and money but some solutions are more demanding. Correctly modeling the world’s climate is a very demanding thing. I suspect that a really bad model is as good as it can get.

    The challenge is more than I am willing to accept. I think you are entering a dark forest of issues where many fire breathing dragons, trolls, goblins, and evil witches live. Don’t feed the trolls coal, they really don’t like it. ZYZZX should take you back to the starting point should you find yourself lost in a cave.

    I wish you the best in your efforts.

  6. E.M.Smith says:

    Wait! I’ve played that game!! ;-)

    Yes, it’s a black hole of time suck death. That’s why I ration my time on it… but “someone” needs to give it a go and I don’t see anyone else “going there”… so it must be me… Skeptics really need their own model to toss back in the face of the “all the models agree!” argument. (That really means “all the model WRITERS agreed”…)

  7. Steven Fraser says:

    As the first ‘tech’ resource at a software company I co-founded in 1991, one of my early tasks was to profile vendor hardware for performance of our main application, so that recommendations could be made to handle 3 years of load growth for clients. After running my test suite on several models of RS/6000, HP-UX, Alpha, Solaris and Linux systems, I was able to find a benchmark test that was predictive of the test suite, +/- 10%… the SPECint95.

    This made my recommendation process much simpler. It was also apparent, year by year, how much faster the integer performance of the processors was improving. The king-of-the-hill, at least for a while, was the Alpha, with up to 16 processors. These days, even a modest vmware frame will smoke that, running Linux.

  8. E.M.Smith says:

    Well, I finally got around to looking up prices and making the $/kDhrystones figure. Some prices are a bit variable (like Amazon having a bit of ‘lift’ over others), plus I could not find a price for the A20 based Orange Pi boards (Orange Pi, Orange Pi Mini; and the Orange Pi Plus that I found was a “Plus 2”), so the price might be for a different product, maybe. It looks like Orange is largely moving to H3 and H5 chipsets, so those are likely the only numbers that matter anyway.

    Essentially, I just toted up the total kilo-Dhrystones a board would do and divided it into whatever price I could find. This ignores things like how good the I/O is, how many ports, and whether you can actually run it at 100% given the heat load (remember that the Orange Pi One topped out at 2 cores before heat started slowing the clock, so a large heat sink really needs to be added).

    Not surprisingly (since I’d done a mental estimate of $/bang when I bought it) the Odroid XU4 is the BIG winner at $1.64 /kDhrys and it has a big enough heat sink (plus fan for the model priced) that you can run it 100% without heat throttling.

    Here’s the table, for those wondering how others rank (minus those I could not price or where availability was dodgy. The Olimex boards are priced in € and I used $1.40 / € for the conversion. Why that number? Because it’s about the middle of what I’ve seen over the last decade or so as a rough guess.)

    $/kDhrys  Board
    28.57     Pi B+
     5.11     Pi M2
     3.17     Pi M3
     1.64     Odroid XU4
     3.40     Odroid C2
     3.50     Odroid C1+
     4.50     C.H.I.P.
     8.00     C.H.I.P. Pro
    22.50     Cubieboard
    17.11     Cubieboard 2
    25.00     Cubietruck
    16-28     Cubieboard4 ($130 to $215 prices)
     5.16     Orange Plus (2 model price)
     2.88     Orange Lite
     2.52     Orange zero
     1.60 to
     1.97     Orange One (prices from $10 to $18 at Amazon)
              Closer to $3.20 to $4 if you don't put a heat
              sink of quality on it (or any of the H3 boards)
    21.00     Olimex Lime ( A10 single core, but open source HW & SW)
    12.15     Olimex Lime  (A20)
    16.58     Olimex Lime 2 
    20.26     Olimex mini
     9.21     Banana Pi
    11.84     Banana Pro
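
    For anyone who wants to redo that division with their own prices, here is a minimal sketch of the arithmetic. The prices in it are just back-figured examples consistent with the numbers above (and the €1 = $1.40 guess), not a current price list:

        # $/kDhrys = price in US$ / (Total Rel Perf from the main table / 1000)
        EUR_TO_USD = 1.40    # rough conversion used for the Olimex boards
        boards = {
            # board: (price, currency, Total Rel Perf)
            "Pi M3":              (35.00, "USD", 11040),
            "Odroid XU4":         (70.00, "USD", 42640),
            "Olimex Lime2 (A20)": (45.00, "EUR",  3800),
        }
        for name, (price, currency, dhry) in boards.items():
            usd = price * EUR_TO_USD if currency == "EUR" else price
            print(f"{name:22s} ${usd / (dhry / 1000):6.2f} / kDhrys")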
    

    So there you have it. For most Bang / Buck, the Odroids and Oranges have the winners, but the Pi M3 still beats most of the other boards. It is only behind the Odroid XU4, with its octo-core complexity, and the smaller “no heat sink” Oranges with the H3 chipset. For those, one needs to figure out how to get the heat out of them effectively, otherwise the actual usable $/kDhrys is more expensive than the Pi M3 with even a cheap heat sink glued onto it.

    I am pleased to see that “doing the math” ended up with the same leader boards as I got from “eyeballing the price and size” ;-)

    NOTE that the Odroids are NOT the ones with an official Devuan 1.0 release (at least not one I could get running…) nor does the cheapest Orange have such a release (but it’s likely that the official release tested on another board with the same chipset would work).

    With that:

    Adding to the cluster with least fuss at reasonable cost:
    Pi M3 running Devuan armhf.

    Adding for best $/compute:
    Odroid XU4 running either Debian or Armbian with Devuan conversion.

    Adding to the cluster for lowest $/board:
    Orange Pi family (with added heatsink and Armbian)

    Note that the Orange Pi One at $10 is the absolute lowest $/kDhrys, BUT that price was only direct from a China vendor, I didn’t explore shipping costs, and it ignores that you MUST buy a GOOD heatsink that’s likely to run $5 all by itself or you are really getting 1 to 2 cores + 2 second bursts… so as an all up and balanced board, the XU4 beats those cost issues. Only IFF you can buy a bunch of known sufficient heat sinks in bulk for low cost would the Orange Pi One really be the $/kDhrys winner. Worth exploring if ever building a 10,000 core cluster, but pointless for 4 to 8 boards IMHO.

    Sidebar:

    I got the shower re-done. Nice new faucets and nozzles (shower / tub spout). Now I’m on to the toilet and flooring… I’m remembering why I’m not fond of plumbing. Percussive Disassembly required, and you don’t know how far into the wall (or to the street) you will need to go to be done… OTOH, I’m making about 1/2 a plumber’s wages / hour, but paying no tax on my own work, so uplifted by the roughly 50% tax take (combined everything in California) it nets out at about a plumber’s wages saved, even given that I’m about 1/2 as fast / good at it. Last I looked plumbers were asking about $50 to $100 / hour, but who knows what it is now. I’ll take those wages (as cost avoided).

    Gee, plumbing and building a computer cluster in one posting, where else but here! ;-)

  9. p.g.sharrow says:

    Damn, sounds like you expect to live in that house for a few more years :-)
    Sometimes “House work” is a good change up from other labors…pg

  10. E.M.Smith says:

    @P.G.:

    More like I expected to move out just prior to all this stuff reaching the End Of Service, but that transition to Florida fell through. So now it’s reached the point where there is no choice but to fix it. Roof actively leaked into the garage, can’t ignore that. Tub faucet (hot) leaking enough to show up in the bills, can’t ignore that. Toilet wobbling on the base, so going to leak into the sub-floor. Can’t ignore that… Neighbor’s tree removal removed part of the fence, so had to build a new one or else share yards and dogs… can’t ignore that…

    But yeah, it is kind of nice to find out you still remember how to do some things. First toilet replacement I ever participated in was “helping” Dad at about 9 y.o…. Fence building at 10. Roofing at 11 or 12. etc. etc. (Plumbing was at age 7 1/2 when I helped put plumbing into the crawl spaces under our ‘restaurant to be’ where my Dad could not fit ;-) I also got to put up paneling behind the booths and build the kitchen sink from the kit of parts. Giant 3 sink aluminum thing with legs and shelves out each side for incoming and outgoing… electrical was about the same age.)

  11. Simon says:

    EM – if running the Pi3 at 1.2 GHz rather than 600 MHz doubles the ratings, then maybe the Pi3 would actually end up as best value. May need to find a better way to get the power in, though…. Quality of manufacture and design is likely higher than on the Orange boards, anyway, and less likely to get odd failures from soldering faults or tiny switches.

  12. E.M.Smith says:

    @Simon:

    Unfortunately, the “ratings” are based on the specs, not the 600 MHz limit of the as-shipped settings, so in fact it is more that the Pi M3 drops to 1/2 the above rating UNTIL you change the CPU governor from powersave to ondemand. For folks reading this months from now, here’s the link we’re talking about:
    https://chiefio.wordpress.com/2017/11/22/make-your-raspberry-pi-m3-run-2-x-as-fast-at-the-advertized-rate/

    It means the R.Pi M3 is now to be treated more like the Orange Pi: a machine that thermally limits below its specified rating if you don’t put a really big aftermarket heat sink on it (and not the dinky little aluminum one sold in the $5 kits…). So there is both a performance reduction until fixed, and a price bump of significant size to fix it. A $10 heat sink on a $35 board is a big cost percentage (that could have been strongly reduced had they built it on at the factory…)

    Essentially, this pushes more toward the Odroids as they already come with big effective heat sinks. (AND non micro-USB power connectors that are not being driven over spec to get enough power into the board… )
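
    For folks who want to check what their own board is actually doing, here is a quick sketch using the standard Linux cpufreq files (run it on the board itself; the 2.3 DMIPS/MHz is just the Cortex-A53 scale factor from the table in the post, so the result is a rough estimate, not a measurement):

        import pathlib

        # Standard Linux cpufreq sysfs entries for CPU 0.
        cpufreq = pathlib.Path("/sys/devices/system/cpu/cpu0/cpufreq")
        governor = (cpufreq / "scaling_governor").read_text().strip()
        cur_mhz = int((cpufreq / "scaling_cur_freq").read_text()) / 1000   # kHz -> MHz
        print(f"governor = {governor}, clock = {cur_mhz:.0f} MHz")
        # Rough DMIPS estimate using the 2.3 DMIPS/MHz A53 factor from the table.
        print(f"approx {2.3 * cur_mhz:.0f} DMIPS per core, {4 * 2.3 * cur_mhz:.0f} across all four")
        # Switching from powersave to ondemand needs root: write "ondemand"
        # into scaling_governor (see the linked post above for the details).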

  13. E.M.Smith says:

    Interesting micro sized cluster company… One product puts 5 x Pine 64 boards in a stack with what looks like integrated PSU and Ethernet Switch. About $46 / board all up. $230 sale price ATM, but $250 list so $50 / board “normally”.

    https://www.picocluster.com/collections/rock-64/products/pico-5-rock64

    This one says:

    https://www.picocluster.com/collections/pico-20

    Pico 20 is our primary business class cluster. It comes with an integrated power supply, 4 Ethernet switches, and 4 stacks of 5 SBCs. It can be outfitted with Raspberry PIs, ODroid C2s, or Pine64 2GB boards.

    You can use these clusters to run almost any kind of distributed or parallel software. Run your own LAMP cluster, Docker, Kubernetes, Hadoop, ElasticSearch, Cassandra and many others. Also learn languages like Javascript, Java, Python, R, and so on. Use for Development, QA, DevOps, or Education.

    Pico 20 is available as a DIY Starter or Advanced Kit, or as a complete Cube.

    Looks like for $1000 one can get a 20 board Odroid XU4 unit. That would be 160 cores of general purpose computing, plus the GPUs if one can use them for any given application. IIRC, there are 6 per Mali … (he searches…)

    http://mechanicalforex.com/2016/10/using-the-gpu-and-cpu-for-opencl-programs-in-an-odroid-xu4.html


    The Mali-T628 that comes with the ODROID-XU4 is a 6 core GPU that gives you a lot of additional punch for your OpenCL computations. In previous versions it was difficult to use these GPUs because we had to build the Mali SDK from scratch but the latest ODROID-XU4 images now ship with the Mali SDK that we can use from the get-go. However there is no OpenCL ICD installed which means that the device is inaccessible after installing POCL until we create the appropriate entry in the /etc/OpenCL/vendors folder. The “sudo nano” command above opens up an editor to a “mali.icd” file where you can simply write the line “/usr/lib/arm-linux-gnueabihf/mali-egl/libOpenCL.so” (without the quotes). This creates an ICD entry that tells the computer that the OpenCL SDK for the GPU is located within a specifically defined path.

    So that’s 120 cores of GPU available.

    Total of 280 cores of mixed types available. 80 A7, 80 A15, and 120 Mali GPU.

    For “over 1000 cores” we’re talking just $4000. ( 280 x 4 = 1120 )

    A 10,000 core massive compute engine at $40,000 that would be reasonable for many R&D / Education purposes (and massive overkill for things like Linux system compiles…)

    A 100,000 core machine that starts to be in Real SuperComputer Land at $400,000 is a surprisingly low price for the total MFLOPs you get. (Though in reality, anything above about 1000 cores and you will start running into network fabric speed limits for anything but highly parallel problems… At the 100,000 core level you need roughly 400 ports of interconnect between them. That’s going to put you at 3 layers of switching with typical network gear. 1 switch in the device, 48 devices / switch at level 2, and then you need to connect basically 9 of those together “higher up”. So some kinds of problems needing lots of interprocessor communications will bog down on the switching…)

    For the 1000 core / 4 box solution, it looks like there are ports left over on the internal switches so one could likely directly connect them (with some care in the switch configuration…). That would make it just 2 switch jumps from one board to another and relatively low latency / communication.
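
    For anyone wanting to play with the same back-of-envelope sizing, here is a rough sketch of the arithmetic. The 280 cores and $1000 per cube come from the figures above; the 48 port switch is just an assumed typical size, and the round numbers above were rounded up a bit more generously than this does:

        import math

        CORES_PER_CUBE = 280      # 20 x XU4: 80 A7 + 80 A15 + 120 Mali GPU cores
        COST_PER_CUBE = 1000.0    # rough $ per 20-board cube from above
        SWITCH_PORTS = 48         # assumed port count for an aggregation switch

        def size_cluster(target_cores):
            cubes = math.ceil(target_cores / CORES_PER_CUBE)
            cost = cubes * COST_PER_CUBE
            # One uplink port per cube; how many switches to tie the cubes together?
            agg_switches = math.ceil(cubes / SWITCH_PORTS)
            return cubes, cost, agg_switches

        for target in (1000, 10000, 100000):
            cubes, cost, agg = size_cluster(target)
            print(f"{target:7d} cores: {cubes:4d} cubes, about ${cost:,.0f}, {agg} aggregation switch(es)")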

    While it would also take some care in programming to use all those cores effectively, AND there’s that constant question of “Is the heat extraction good enough to use it 100% 24 x 7?”, it is likely quite enough computer to run even some fancy climate models.

    Well, a fellow can dream can’t he ? :-)
