This week Databricks announced GraphFrames, a library posted to spark-packages.org that is based on Spark SQL DataFrames rather than on RDDs (as GraphX is). GraphFrames is still a work in progress (it is currently at version 0.1), so it provides interoperability with GraphX: graphs can be converted back and forth between the two.
Data Science Skills Survey 2015
- Do you have domain knowledge? If yes, construct a better set of “ad hoc” features.
- Are your features commensurate? If no, consider normalizing them.
- Do you suspect interdependence of features? If yes, expand your feature set by constructing conjunctive features or products of features, as far as your computing resources allow (see example of use in Section 4.4).
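The last two checklist items can be illustrated with a small sketch. It is only one possible reading of the advice: it assumes "normalizing" means standardizing each feature to zero mean and unit variance, and that "products of features" means pairwise products. The function names `standardize` and `add_pairwise_products` and the toy data are hypothetical, introduced here for illustration.

```python
from itertools import combinations
from statistics import mean, pstdev


def standardize(rows):
    """Rescale every column to zero mean and unit (population) variance.

    Assumption: z-score scaling is used as the normalization; other
    schemes (min-max scaling, for instance) would also fit the advice.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def add_pairwise_products(rows):
    """Append the product of every pair of distinct features to each row."""
    n_features = len(rows[0])
    pairs = list(combinations(range(n_features), 2))
    return [list(row) + [row[i] * row[j] for i, j in pairs] for row in rows]


# Hypothetical toy data: two features on very different scales.
data = [[150.0, 20000.0], [160.0, 40000.0], [170.0, 60000.0]]

scaled = standardize(data)                # both columns now on a comparable scale
expanded = add_pairwise_products(scaled)  # 2 original features + 1 product feature
```

The number of pairwise products grows quadratically with the number of features, which is why the checklist ties this expansion to the available computing resources.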
Arrow is a columnar in-memory analytics framework that provides the performance benefits of modern techniques while also providing the flexibility of complex data and dynamic schemas. Arrow grew out of three prevailing trends and business requirements:
CALL FOR PAPERS
Paper submission deadline: February 28, 2016
Nello Cristianini, Bristol University, UK
Stephen H. Muggleton, Imperial College London, UK
Other Keynote Speakers will be announced soon
Watch out, folks - there is a new breed of health analytics firms sprouting like weeds, mining "big data" about you and making broad and bold predictions about your state of health to inform workplace decisions.