In a previous blog post, "Real-time data science," I described in words an algorithm that can be used, for example, in real-time data streaming applications to estimate the size (cardinality) of a set.
Takashi Ozaki has a nice blog post comparing machine learning classifiers based on their hyperplanes or decision boundaries.
Compared: Decision Tree, Logistic Regression, SVM, Neural Net, and Random Forest.
In his comparison, Random Forests appear to work best when cluster boundaries are unknown.
RESEARCH COLLOQUIUM: CALL FOR PAPERS
LEGAL AND ETHICAL ISSUES IN PREDICTIVE DATA ANALYTICS
June 19 & 20, 2014
Abstract Submission Deadline: March 3, 2014
NFL 2013 Team Expected Points Added (EPA) per game - Defense by Offense