Start your LLM journey with Embeddings

Today we are going to look at a simple yet important topic in Machine Learning, especially in the world of Natural Language Processing. In my last post I said I would talk about this, so here we go.

After this post, the next time you see the word "Embedding", you will be like 'oh, that is a simple concept, man'.

Embedding

🎈 So what is it, and why do we need it?
Have you ever wondered how search works when you use browsers like Chrome or Edge, or music platforms like Spotify? Let's take both examples.
-> In Chrome/Google, when we search for something we get lots of results from different websites, but they all have one thing in common: "CONTEXT". Yes, a similar context.

-> Then in the case of Spotify, when you search for a song, let's say a sad song by name, you get the song you want. But the next time you open Spotify, you will see recommendations (more sad songs) from different movies/genres that are similar to the song you searched for or listened to.

Embedding is the concept doing a lot of the magic behind the scenes, and it is a simple concept.

🎊 What is Embedding: Embedding is the process of representing data as a series of numbers, in the form of a vector in a multi-dimensional space, so that similar items end up near each other, forming something like a cluster. Items can then be grouped based on the distance between their vectors.
Here the data can be anything: videos, audio, text, paragraphs, images, etc.
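
To make the idea of "nearby vectors" concrete, here is a minimal sketch in Python. The three-dimensional vectors are made-up values for illustration, not the output of any real model, and cosine similarity is just one common way to measure closeness (this post does not commit to a specific distance measure).

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings (hand-picked, not from a real model).
toy_embeddings = {
    "sad song A": [0.9, 0.1, 0.2],
    "sad song B": [0.8, 0.2, 0.1],
    "party song": [0.1, 0.9, 0.7],
}

# Compare every item against "sad song A"; values closer to 1 mean "more similar".
query = toy_embeddings["sad song A"]
for name, vector in toy_embeddings.items():
    print(f"{name}: {cosine_similarity(query, vector):.3f}")
```

With these toy numbers, "sad song B" scores much closer to 1 than "party song" does, which is the kind of signal a recommender could use to group similar songs. Real embeddings usually have hundreds of dimensions, but the comparison works the same way.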

🎯 🎣 Here you can see the true power of Embedding. The two sentences "Hello, how are you?" and "Hi, How it's going" share almost no words (only "how"). But a good embedding model still gives the two very similar vector representations. That's the magic of Embedding.

So how do we implement embeddings? There are simple count-based methods like Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF), and pre-trained word embeddings like Word2Vec, GloVe, or FastText. I will soon try to share a link to a Jupyter notebook with a practical implementation.
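
Until that notebook is ready, here is a rough sketch of the two simplest methods, Bag of Words and TF-IDF, in plain Python with no external libraries. The tokenizer, the third example sentence, and the exact TF-IDF formula (term frequency times log(N / document frequency), one common variant among several) are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

# The first two sentences come from this post; the third is an added example.
docs = ["Hello, how are you?", "Hi, How it's going", "I love sad songs"]
tokenized = [tokenize(d) for d in docs]

# The vocabulary fixes the order of the dimensions in every vector.
vocab = sorted({token for doc in tokenized for token in doc})

def bag_of_words(tokens):
    """Count how often each vocabulary word appears in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]

def tf_idf(tokens):
    """Weight each word by its frequency here and its rarity across docs."""
    counts = Counter(tokens)
    vector = []
    for word in vocab:
        tf = counts[word] / len(tokens)
        df = sum(1 for doc in tokenized if word in doc)
        idf = math.log(len(tokenized) / df)
        vector.append(tf * idf)
    return vector

print(vocab)
print(bag_of_words(tokenized[0]))
print([round(v, 3) for v in tf_idf(tokenized[0])])
```

Notice that with these count-based vectors, "Hello, how are you?" and "Hi, How it's going" overlap only on the word "how", so they look quite different. Pre-trained embeddings like Word2Vec, GloVe, or FastText are learned from large text collections, which is what lets them place words with similar meanings close together.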

Get in touch here : https://www.linkedin.com/in/hariprasad-alluru-9bb6a9183/