SC15: Spark poor at HPC's specialty: physics simulation
Two years ago, when I attended SC13, presenters hadn't even tried Spark yet. Now, according to a Reddit comment, presenters at SC15 are saying they've tried Spark and found it no good for that classic HPC application: physics simulation.
In particular, one talk reportedly cited a 60x slowdown from moving MPI code to Spark. We'll have to wait for the paper to be published to see the details of the benchmark. My immediate comments were:
Out of the box, Spark has no API for spatiotemporal data. RDDs are unordered. Spark is geared toward key/value pairs and matrices (MLlib links against a BLAS implementation).
There are add-ons at spark-packages.org for time series and image processing, but none I'm aware of for physics simulation. MPI seems to me really well suited to that. Maybe someone will someday write a Spark package for it, but I doubt it would ever offer gating barriers or asynchronous message passing.
Without having read the benchmark, a few questions do pop into my mind:
- Which version of Spark was used? 1.5 or later, which has Tungsten (memory managed off-heap, outside the JVM garbage collector)?
- Was any attempt made to use mapPartitions() instead of map()? map() is fine if your problem is easily expressed in terms of MapReduce, but mapPartitions() essentially lets you write arbitrary code that executes once per partition (albeit still without gating barriers or message passing).
- Was Kryo serialization used instead of the default Java serialization?
- Were checkpoints used to truncate the RDD lineages between iterations?
- Was Tachyon used to store the checkpoints?
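For the configuration-level items above (Kryo, Tungsten, Tachyon), here's a hedged sketch of what a spark-defaults.conf might look like. Property names are from the Spark 1.5-era documentation and have changed across releases, so check the configuration guide for your version; the Tachyon master URL is a placeholder.

```
# Kryo instead of default Java serialization
spark.serializer                 org.apache.spark.serializer.KryoSerializer

# Tungsten code paths (on by default in 1.5; shown here for emphasis)
spark.sql.tungsten.enabled       true

# Tachyon as the external block store (earlier releases used
# spark.tachyonStore.url instead; "master:19998" is a placeholder)
spark.externalBlockStore.url     tachyon://master:19998
```

Checkpointing itself is set per job rather than in the conf file: call sc.setCheckpointDir(...) once, then rdd.checkpoint() inside the iteration loop to truncate the lineage.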
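To illustrate the map() vs. mapPartitions() point without requiring a Spark installation, here's a plain-Python analogy. The names (expensive_setup, per_element, per_partition) are my own, not Spark APIs: the point is that per-element application pays any setup cost (buffers, native handles, serializer instances) once per element, while iterator-at-a-time application pays it once per partition.

```python
setup_calls = 0

def expensive_setup():
    """Stand-in for costly per-task setup (e.g. allocating a buffer)."""
    global setup_calls
    setup_calls += 1
    return 2  # pretend this is a handle/constant the work needs

def per_element(x):
    k = expensive_setup()   # runs once per ELEMENT, like a naive map()
    return x * k

def per_partition(it):
    k = expensive_setup()   # runs once per PARTITION, like mapPartitions()
    for x in it:
        yield x * k

partition = [1, 2, 3, 4]

a = [per_element(x) for x in partition]
calls_map = setup_calls     # 4 setup calls: one per element

setup_calls = 0
b = list(per_partition(iter(partition)))
calls_mp = setup_calls      # 1 setup call for the whole partition
```

Same results, very different setup cost; in a tight physics-simulation inner loop that difference compounds across every iteration.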
I would have expected MPI to be 5x, maybe 10x, as fast as Spark at physics simulation. 60x is alarming and surprising.