LLM, RAG, Vector DB & Fine-tuning

LLM, RAG & Vector Databases

To build AI-native applications, you must understand the underlying components.

Core Components

1. Large Language Model (LLM)

The engine. A probabilistic model trained on internet-scale text to predict the next token.

Examples: GPT-5.2, Claude Opus 4.5 / Sonnet 4.5, Gemini 3 Pro / Flash, Llama 4.
Role: Reasoning, language understanding, code generation.
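
To make "predict the next token" concrete, here is a minimal sketch of the final step of generation: turning raw scores (logits) into probabilities and picking a token. The vocabulary and the scores are invented for illustration; a real model scores tens of thousands of tokens and usually samples rather than always taking the top one.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign after "The cat sat on the"
vocab = ["mat", "moon", "banana"]
logits = [3.0, 1.0, 0.1]

probs = softmax(logits)

# Greedy decoding: pick the single most likely token
next_token = vocab[probs.index(max(probs))]
print(next_token, [round(p, 3) for p in probs])
```

Generation repeats this step in a loop, appending each chosen token to the input before predicting the next one.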

2. Embeddings & Vector Databases

The memory.

Embeddings: Text converted into a list of numbers (a vector), such that texts with similar meanings end up mathematically close to each other.
Vector DB: A database optimized to store these vectors and run fast similarity (nearest-neighbor) searches over them.
Examples: Pinecone, Weaviate, pgvector (PostgreSQL).
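
"Mathematically close" is commonly measured with cosine similarity. The sketch below uses hand-made three-dimensional vectors and a plain dictionary as the store, both purely illustrative assumptions; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and a vector database replaces the full sort with an approximate index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made "embeddings" (illustrative values, not from a real model)
store = {
    "refund policy": [0.9, 0.1, 0.0],
    "return an item": [0.8, 0.2, 0.1],
    "office holiday party": [0.0, 0.2, 0.9],
}

def search(query_vector, top_k=2):
    """Brute-force nearest-neighbor search over the whole store."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]

results = search([1.0, 0.1, 0.0])
print(results)
```

The two refund-related entries rank above the unrelated one even though they share no words, which is the property keyword search lacks.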

3. Retrieval-Augmented Generation (RAG)

The workflow. A technique to give the LLM “open book” access to your private data.

sequenceDiagram
    participant User
    participant App
    participant VectorDB
    participant LLM

    User->>App: "What is our refund policy?"
    App->>App: Convert query to vector
    App->>VectorDB: Search similar documents
    VectorDB-->>App: Returns "Refund Policy.pdf" chunks
    App->>LLM: Prompt + Query + Context Chunks
    LLM-->>App: Generates answer based on Context
    App-->>User: "Our refund policy is 30 days..."
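
The sequence above can be sketched end to end in a few functions. Everything here is a stand-in: `embed` is a toy word-count embedding instead of a real embedding model, the list `index` plays the role of the vector database, and `call_llm` is a hypothetical placeholder for a model API call. The sample chunks are invented for illustration.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Illustrative document chunks; "index" stands in for the vector database
chunks = [
    "Refund policy: customers may request a refund within 30 days of purchase.",
    "Shipping: orders are dispatched within two business days.",
    "Careers: we are hiring engineers in several offices.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query, top_k=1):
    """Steps 2-4 of the diagram: embed the query, return the closest chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

def build_prompt(query, context_chunks):
    """Step 5: combine instructions, retrieved context, and the question."""
    context = "\n".join(context_chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

def call_llm(prompt):
    """Hypothetical stand-in for a real model call."""
    return f"[model answer based on a {len(prompt)}-character prompt]"

query = "What is our refund policy?"
context = retrieve(query)
prompt = build_prompt(query, context)
answer = call_llm(prompt)
print(prompt)
```

The model never sees the whole document set, only the chunks that retrieval selected, so answer quality depends heavily on the retrieval step.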

Fine-tuning vs. RAG vs. Prompt Engineering

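The three approaches change different things (the prompt, the context, or the model weights) and differ sharply in cost, so choosing the right one matters for both budget and performance.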

Approach	Description	Best For	Cost
Prompt Engineering	Optimizing the instructions sent to the model.	Quick prototyping, behavior adjustment.	Low
RAG	Injecting knowledge into the prompt context dynamically.	Knowledge retrieval, searching documentation, “Chat with PDF”.	Medium
Fine-tuning	Further training the model's weights on your own data.	Teaching a specific style, vocabulary, or highly specialized task (e.g., medical diagnosis).	High

Enterprise Examples

Enterprise Knowledge Assistant

Tech: RAG + Vector DB.
Use Case: HR Policy bot.
Why: Policies change often. With RAG, an update takes effect as soon as the database is updated; no retraining is needed.

Document Search

Tech: Hybrid Search (Keyword + Vector).
Use Case: Legal contract review.
Why: Lawyers need to find specific clauses (keywords) and conceptual matches (vector).
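
One simple way to combine the two signals is a weighted sum of a keyword score and a vector similarity score. The sketch below is an assumption-laden illustration: the clauses, their two-dimensional vectors, and the `alpha` weight are all invented, and the keyword score is a plain term-overlap fraction. Production systems often use BM25 for the keyword side and a rank-fusion method instead of a weighted sum.

```python
import math

def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(term in words for term in terms) / len(terms)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical contract clauses with hand-made 2-d embeddings
clauses = [
    {"text": "indemnification obligations of the supplier", "vector": [0.9, 0.1]},
    {"text": "supplier shall compensate buyer for losses", "vector": [0.8, 0.3]},
    {"text": "governing law and jurisdiction", "vector": [0.1, 0.9]},
]

def hybrid_search(query, query_vector, alpha=0.5):
    """Rank clauses by alpha * keyword score + (1 - alpha) * vector score."""
    scored = []
    for clause in clauses:
        score = (alpha * keyword_score(query, clause["text"])
                 + (1 - alpha) * cosine(query_vector, clause["vector"]))
        scored.append((score, clause["text"]))
    return [text for _, text in sorted(scored, reverse=True)]

ranking = hybrid_search("indemnification", [0.85, 0.2])
print(ranking)
```

The exact-term clause ranks first, the conceptually similar "compensate for losses" clause second, and the unrelated clause last. Raising `alpha` favors exact wording; lowering it favors conceptual matches.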

Customer Support

Tech: RAG + Function Calling.
Use Case: Order status bot.
Why: The bot needs access to a live database (via tools) and a static knowledge base (via RAG).
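
A rough sketch of how the two sources might be combined in one handler follows. All names and data are hypothetical: `ORDERS` stands in for a live database, `KNOWLEDGE_BASE` for the retrieved documents, and `choose_action` is a rule-based placeholder for the step where a real model would decide which tool to call and with what arguments.

```python
import re

# Hypothetical live data source; a real bot would query an orders database or API
ORDERS = {"A1001": "shipped", "A1002": "processing"}

# Hypothetical static knowledge base that RAG would retrieve from
KNOWLEDGE_BASE = {"refund": "Refunds are accepted within 30 days of purchase."}

def get_order_status(order_id):
    """Tool: look up the live state of an order."""
    return ORDERS.get(order_id, "unknown")

# Registry of callable tools, keyed by the name the model would emit
TOOLS = {"get_order_status": get_order_status}

def choose_action(message):
    """Stand-in for the model's tool-selection step (rule-based here)."""
    match = re.search(r"\b[A-Z]\d{4}\b", message)
    if match:
        return {"tool": "get_order_status", "args": {"order_id": match.group()}}
    return {"tool": None}

def handle(message):
    action = choose_action(message)
    if action["tool"]:
        # Function-calling path: run the tool and phrase its result
        result = TOOLS[action["tool"]](**action["args"])
        return f"Order {action['args']['order_id']} is {result}."
    # RAG path: fall back to the static knowledge base
    for topic, passage in KNOWLEDGE_BASE.items():
        if topic in message.lower():
            return passage
    return "Sorry, I could not find an answer."

print(handle("Where is order A1001?"))
print(handle("What is your refund policy?"))
```

Keeping tools in a registry means the application, not the model, executes the call, which is where access checks and input validation belong.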