Table of XX2Vec Algorithms

XX2Vec     Embed             In        Sup/Unsup     Algorithms used
Char2Vec   Character         Sentence  Unsupervised  CNN -> LSTM
Word2Vec   Word              Sentence  Unsupervised  ANN
GloVe      Word              Sentence  Unsupervised  SGD
Doc2Vec    Paragraph Vector  Document  Supervised    ANN -> Logistic Regression
Image2Vec  Image Elements    Image     Unsupervised  DNN
Video2Vec  Video Elements    Video     Supervised    CNN -> MLP

The powerful word2vec algorithm has inspired a host of other algorithms listed in the table above. (For a description of word2vec, see my Spark Summit 2015 presentation.) word2vec is a convenient way to assign vectors to words, and of course vectors are the currency of machine learning. Once you've vectorized your data, you are then free to apply any number of machine learning algorithms.
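
As a rough illustration of why vectors are so useful, here is a minimal sketch that compares words by the cosine similarity of their vectors. The three-dimensional vectors and the helper names (`toy_vectors`, `cosine_similarity`, `most_similar`) are made up for this example; real word2vec embeddings are learned from a corpus and usually have a hundred or more dimensions.

```python
import math

# Hypothetical toy vectors, invented for illustration only.
# Real word2vec vectors are learned from text, not written by hand.
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word` by cosine similarity."""
    return max(
        (w for w in vectors if w != word),
        key=lambda w: cosine_similarity(vectors[word], vectors[w]),
    )


if __name__ == "__main__":
    print(most_similar("king", toy_vectors))
```

Once words are represented this way, the same vectors can be passed to a clustering or classification algorithm as ordinary numeric features.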

Introduction to Data Quality

How many times have you heard managers and colleagues complain about the quality of the data in a particular report, system or database? People often describe poor-quality data as unreliable or untrustworthy. Defining exactly what makes data high or low quality, why it sits at a given quality level, and how to manage and improve it is often a trickier task.

Datification 2016

Big Data, to be effective, must recognize the following voices (in order):

  1. VOC = Voice of the Customer
  2. VOB = Voice of the Business
  3. VOP = Voice of the Process

Datification is the link between the three voices. It also captures and displays relevant metrics from all your improvement projects.

DATIFICATION!! What's the big deal? And why your business needs it?

What is Datification?