Bit by bit they gathered over the years
They bit, they spread, and they flew everywhere!
On land, in air – they left no empty space
They sucked everyone into a pretty mad race!
Megabytes! Gigabytes! Terabytes! Their sizes grew bigger
Petabytes and Zettabytes are now ready to trigger!
They sped through the wires, they rode the air waves
In dots and lines, they came in all shapes!
The ‘likes’, the ‘dislikes’ and even the very ‘neutral’
You are forced to pay attention and cannot be too casual!
Find out how Data to Value’s Graph Data software partners Neo4j and Linkurious have been used in the Panama Papers investigation.
|Char2Vec||Character||Sentence||Unsupervised||CNN -> LSTM|
|Doc2Vec||Paragraph Vector||Document||Supervised||ANN -> Logistic Regression|
|Video2Vec||Video Elements||Video||Supervised||CNN -> MLP|
The powerful word2vec algorithm has inspired a host of other algorithms listed in the table above. (For a description of word2vec, see my Spark Summit 2015 presentation.) word2vec is a convenient way to assign vectors to words, and of course vectors are the currency of machine learning. Once you've vectorized your data, you are then free to apply any number of machine learning algorithms.