Rise of Big Compute
I've been harping on the importance of GPUs since my October 2012 blog post, Supercomputing for $500, and more recently in my reviews here of the SC13 conference. Two news stories this month indicate broader recognition of the growing importance of "Big Compute".
First is the November 9, 2013 TEDx Virginia Tech talk (released to YouTube Dec. 5), Big compute vs big data, by Wu Feng, who created the GPU-based HokieSpeed. In the talk, Wu Feng shares his observation that the U.S. seems to be focused more on Big Data, while China, home to the fastest supercomputer in the world, seems to be focused more on Big Compute, with Big Data being just a special application, or subset, of the realm of Big Compute.
Wu Feng concludes that both Big Compute and Big Data are needed. We've been hearing about "Big Data" for years now (a little starting with the 2004 Google papers, and a lot over the past one to two years), but "Big Compute" hasn't yet reached the same buzzword status. It needs to in order for real Data Science to progress. A lot of today's data science is either simple statistics over large data sets or advanced machine learning over small data sets (ones that fit in the RAM of a single machine for processing in R or IPython Notebook). There is a top-tier Big Data vendor out there that can't sell any Hadoop nodes with more than 384 GB of RAM. In this new era of distributed in-memory processing systems such as Spark and Druid, vendors need to catch up, so that we data scientists can catch up.
The second news story this month is ExtremeTech's Massive surge in Litecoin mining leads to graphics card shortage. While Litecoin, a competitor to Bitcoin, is a niche application that is all-compute-no-data, the fact that such a niche application can cause a run on GPU cards illustrates how thin the GPU market is. If U.S. IT were embracing Big Compute more fully, the run would instead have been caused by business and scientific applications. We all felt it a couple of years ago when the Thai monsoon floods doubled the price of hard drives, and that story made all the major newspapers and media outlets. But there's been nothing in the non-tech media about the GPU shortage.
Recognition of the importance of Big Compute -- GPUs, high core counts, SIMD width, and large RAM -- is rising, at least at TED and in tech news outlets. But it's not mainstream yet. It needs to be for Data Science to progress.