I am often asked how to select the right database. The answer is that it depends: no single database fits every situation. So how do you choose?
Spark Summit (West) 2016 took place this past week in San Francisco. The big news, of course, was Spark 2.0, which among other things ushers in yet another 10x performance improvement through whole-stage code generation.
Flink is an open-source platform for distributed stream and batch data processing. It is a fast, general-purpose distributed data processing system that combines batch and stream processing. Up to 100x faster than
Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Key features:
Recognizing the difficulties inherent in managing big data projects, Hydrosphere is pleased to announce its flagship product. This open-source platform plugs data scientists and data engineers into the continuous-release mindset and machinery of traditional development teams. It takes the automated DevOps approach already adopted by Java projects and applies it to data science.