Retrieval-augmented generation (RAG) is a technique that lets a large language model (LLM) interpret and transform information from an external database without being retrained whenever that information changes. This makes LLM-based applications more useful, because they can draw on data that was not part of the model's training set. RAG uses an embedding model to convert the user prompt into a numeric vector, which is then matched against information stored in a vector database. If a match is found, the prompt and the matching information are passed together to the LLM to generate a response.