Prediction Based Word Embedding Techniques

After the frequency-based word embedding techniques came a revolutionary idea: Word2Vec, introduced in 2013 by Tomas Mikolov and colleagues at Google. It really changed the existing NLP approach, and applications such as smarter chatbots became much easier to build after its release. Its key strength is that it captures context while creating embeddings. Word2Vec is a prediction-based word embedding technique. Strictly speaking, Word2Vec is a family of training algorithms, but Google also released pretrained Word2Vec vectors trained on the Google News corpus, so it is often used as a ready-made pretrained model. I put these interesting facts in the opening paragraph to spark your interest. This article covers prediction-based word embedding techniques.

Prediction Based Word Embedding Techniques –

Word2Vec is built on two principal approaches. Let's understand them first –

1. Continuous Bag of words (CBOW) –

Under this approach, we try to predict the target word from its context. Here the context is nothing but the neighbouring words. We create a window of some fixed size; all the words within that many positions to the left and right of the target word form the context. We then train a shallow neural network (only one hidden layer) on these examples.
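To make the "shallow network" idea concrete, here is a minimal sketch of a CBOW forward pass in plain Python. The toy vocabulary, the embedding size of 4, the random weights and the function name cbow_forward are all assumptions for illustration. Real Word2Vec implementations train these weights and usually replace the full softmax used here with faster approximations such as negative sampling or hierarchical softmax.

```python
import math
import random


def cbow_forward(context_ids, w_in, w_out):
    """Return a probability for every vocabulary word being the target."""
    dim = len(w_in[0])
    # Hidden layer: the average of the context words' input embeddings.
    hidden = [
        sum(w_in[i][d] for i in context_ids) / len(context_ids)
        for d in range(dim)
    ]
    # Output layer: one score per vocabulary word (dot product with hidden).
    scores = [sum(h * w for h, w in zip(hidden, row)) for row in w_out]
    # Softmax turns the scores into probabilities (shifted by max for stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy setup (assumed values, untrained random weights).
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_id = {word: idx for idx, word in enumerate(vocab)}
rng = random.Random(0)
dim = 4
w_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]

# Predict the middle word "sat" from its neighbours.
context = ["the", "cat", "on", "mat"]
probs = cbow_forward([word_to_id[w] for w in context], w_in, w_out)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
```

Because the weights are random, the printed probabilities are meaningless until the network is trained. Training would adjust w_in and w_out so that the true target word gets a high probability, and the rows of w_in are what we keep as the word embeddings.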
Example – Let's understand with the sentence below –
Embedding is essential for all NLP stuff.
Suppose the window size is 2. If we take “essential” as the target word, the context will be [“Embedding”, “is”, “for”, “all”].
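The window logic from this example can be sketched in a few lines of Python. The function name context_window and the whitespace tokenization are assumptions made for illustration, not part of any library.

```python
def context_window(tokens, target_index, window=2):
    """Return the words within `window` positions of the target word."""
    start = max(0, target_index - window)
    left = tokens[start:target_index]
    right = tokens[target_index + 1:target_index + 1 + window]
    return left + right


tokens = "Embedding is essential for all NLP stuff".split()
target_index = tokens.index("essential")
context = context_window(tokens, target_index, window=2)
print(context)  # ['Embedding', 'is', 'for', 'all']
```

Words near the start or end of the sentence simply get a shorter context, because the window is clipped at the sentence boundaries.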

2. Skip gram –

This is just the opposite of Continuous Bag of Words (CBOW). Here we predict the context words from the given word, whereas in CBOW, as already explained, we predict the word from its context.
Example – Let's understand with an example. For ease, let's take the same sentence –
Embedding is essential for all NLP stuff.
If the window size is 2 and the given word is “essential”, there will be 4 pairs of given word and context word:

1. essential -> Embedding
2. essential -> is
3. essential -> for
4. essential -> all
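Here is a small sketch that generates these pairs for every word in the sentence, not only for “essential”. The function name skipgram_pairs and the whitespace tokenization are assumptions for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Build (given word, context word) training pairs for Skip-gram."""
    pairs = []
    for i, given in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((given, tokens[j]))
    return pairs


tokens = "Embedding is essential for all NLP stuff".split()
pairs = skipgram_pairs(tokens, window=2)

essential_pairs = [p for p in pairs if p[0] == "essential"]
for given, ctx in essential_pairs:
    print(f"{given} -> {ctx}")
```

For “essential” this prints exactly the four pairs listed above. Over the whole seven-word sentence the same window yields 22 pairs, since words near the edges have fewer neighbours.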

How to choose between Skip-gram and CBOW? –

It is really a million-dollar question. Several factors can affect this decision, but we will cover only the ones with the highest impact. Let's understand –
1. For fast training, go with CBOW: in general, CBOW is faster to train than Skip-gram. If the dataset is not very big, Skip-gram is usually the better choice, since it works well with small amounts of training data.
2. Skip-gram performs well on less frequent words. CBOW is good with highly frequent words.
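One rough way to see the speed difference is to count training examples. In this sketch (the function name count_training_examples is an assumption for illustration), CBOW makes one prediction per word position, while Skip-gram makes one prediction per word and context word pair. This is a simplified view that ignores implementation details such as subsampling and negative sampling.

```python
def count_training_examples(tokens, window=2):
    """Count predictions per pass: CBOW (per position) vs Skip-gram (per pair)."""
    cbow_examples = 0
    skipgram_examples = 0
    for i in range(len(tokens)):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        context_size = (end - start) - 1  # exclude the word itself
        if context_size > 0:
            cbow_examples += 1                  # one prediction from the whole context
            skipgram_examples += context_size   # one prediction per context word
    return cbow_examples, skipgram_examples


tokens = "Embedding is essential for all NLP stuff".split()
cbow_count, skipgram_count = count_training_examples(tokens, window=2)
print(cbow_count, skipgram_count)  # 7 22
```

The flip side is that Skip-gram gives every word, including a rare one, several separate updates per occurrence, which fits the point that it handles less frequent words well.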

Word2Vec, FastText and GloVe –

All three are available as word embedding models pretrained on big corpora. Mostly, these pretrained models perform well on general data. Still, if you have domain-specific data, just train your own word embeddings with the same algorithm (Word2Vec, FastText or GloVe) on your own data.

Conclusion –

Word embedding is really important when it comes to handling the context and co-occurrence of words, which is a general requirement for most NLP tasks. So are the frequency-based models outdated? Please think and answer. No, they are still important in some scenarios. Prediction-based models generally solve more problems, but they may fail in some scenarios. We will discuss that in a different article.


Data Science Learner Team

Meet Abhishek (Chief Editor), a data scientist with major expertise in NLP and text analytics. He has worked on various projects involving text data and has achieved great results. He currently manages the site, where he and his team share knowledge and help others learn more about data science.