Smart Data Integration Systems

Al Roth won the Nobel Prize for "matching" and the design of new types of markets. The Gale-Shapley algorithm is a cornerstone of the matching methods Al Roth pioneered. The algorithm has been extended by Roth and others to apply "Matching Theory" to design matching markets solving real world problems like matching students to the right schools.
Tabular Validator is a Python package for validating tabular data through a processing pipeline. Applications range from simple validation checks on CSV files, to integration with a larger ETL pipeline.
A recent breakthrough for solving graph isomorphism has solved one of the mysteries in complexity theory and computer science. This innovative algorithm appears to be significantly more efficient than previous algorithms.
Key-value store optimized for convenience in Go. It wraps leveldb to provide persistant key-value storage of gob-compatible types.
Streaming SQL for Spark is a project based on Catalyst and Spark Streaming to support SQL-style queries on data streams. It bridges the gap between structured data queries and stream processing. It provides: