Michael.Walker's blog

Apache Flink - Distributed Stream and Batch Data Processing

Flink is an open source platform for distributed stream and batch data processing: a fast, general-purpose system that combines both models in a single engine. It is described as up to 100x faster than Hadoop MapReduce.

Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Key features:

The Goal of Ubiquitous Computing

Ubiquitous computing may be defined broadly as "machines that fit the human environment instead of forcing humans to enter theirs." Mark Weiser coined the phrase "ubiquitous computing" around 1988, during his tenure as Chief Technologist of the Xerox Palo Alto Research Center (PARC).

Good Tables: Free Service for Validating Tabular Data - Alpha Release

The Good Tables web service is an API and UI for processing tabular data. It is currently an alpha release, and we invite the community to start using it and contributing to it to help us move towards a v1.0 release.

In the current release, the Good Tables web service will validate CSV and Excel files (the first sheet therein) for well-formedness, and, if a JSON Table Schema is supplied, for conformity to the given schema.
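
To make the two kinds of check concrete, here is a rough sketch in plain Python of what "well-formedness" and "conformity to a schema" could mean for a CSV file. This is not the Good Tables API or its implementation: the function name validate_csv is hypothetical, the well-formedness rule (every row has as many cells as the header) is an assumption, and only the "fields", "name" and "type" keys of a JSON Table Schema descriptor are read, with just three types handled.

```python
import csv
import io


def validate_csv(text, schema=None):
    """Hypothetical helper (not part of Good Tables).

    Returns a list of error strings; an empty list means the data passed.
    """
    errors = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]

    # Well-formedness (assumed rule): each row has as many cells as the header.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            errors.append(
                f"row {line_no}: expected {len(header)} cells, got {len(row)}"
            )

    if schema is None:
        return errors

    # Schema conformity: header names must match the schema's field names.
    fields = schema["fields"]
    names = [field["name"] for field in fields]
    if header != names:
        errors.append(f"header {header} does not match schema fields {names}")
        return errors

    # Only a small subset of schema types is checked in this sketch;
    # any other type is treated as a string and always accepted.
    casters = {"integer": int, "number": float, "string": str}
    for line_no, row in enumerate(rows[1:], start=2):
        for field, cell in zip(fields, row):
            caster = casters.get(field.get("type", "string"), str)
            try:
                caster(cell)
            except ValueError:
                errors.append(
                    f"row {line_no}, column {field['name']}: "
                    f"{cell!r} is not a valid {field['type']}"
                )
    return errors
```

Calling validate_csv(text) alone runs only the structural check; passing a schema dictionary as the second argument adds the header and type checks. A real validator would cover far more (encodings, Excel input, constraints, the full set of schema types), so treat this only as an illustration of the idea.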