A quick start to RAG with a local setup
Imagine heading over to https://chatgpt.com/ and asking ChatGPT a bunch of questions. A pretty good way to pass a hot and humid Sunday afternoon, if you ask me. But what if you had a bunch of documents you wanted to decipher? Perhaps they are your lecture notes from CS2040. Now ask the LLM a question: “What did the professor highlight about linked lists in Lecture 4?”
The model spits out a bunch of random information, which is no surprise since it has never seen your lecture notes. Let’s say someone magically types in some information (read: context) for the model to help with this.
You have to admit it’s impractical to hand-feed the model context every time. Is someone always going to have to type this out? Well, you are in luck! Retrieval-Augmented Generation (RAG) supplies that context automatically. The idea is that a query is vectorised and used to search a database of pre-vectorised information, retrieving the top few matches according to a similarity measure. These matches are passed to the LLM as context for it to answer…
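To make the retrieval step concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not how a real setup would do it: `embed` is a hypothetical stand-in that just counts words (a real pipeline would use an embedding model), the "database" is an in-memory list, the similarity measure is assumed to be cosine similarity, and the lecture-note chunks are made-up placeholders.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, index: list, top_k: int = 2) -> list:
    """Vectorise the query and return the top_k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(query: str, context_chunks: list) -> str:
    """Put the retrieved chunks in front of the question as context."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up placeholder notes, vectorised ahead of time (the "pre-vectorised" part).
chunks = [
    "Lecture 4: linked lists give O(1) insertion at the head but O(n) lookup.",
    "Lecture 5: binary search trees keep keys in sorted order.",
    "Lecture 2: arrays store elements in contiguous memory.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

query = "What did the professor highlight about linked lists in Lecture 4?"
top = retrieve(query, index, top_k=1)
print(build_prompt(query, top))
```

The printed prompt is what you would hand to the LLM instead of the bare question. Swapping the word counts for real embeddings and the list for a vector database changes the quality of the matches, but the shape of the flow stays the same.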