LLM, RAG, Vector DB & Fine-tuning

To build AI-native applications, you must understand the underlying components.

LLM (Large Language Model): the engine. A probabilistic model trained on internet-scale text to predict the next token.

  • Examples: GPT-5.2, Claude Opus 4.5 / Sonnet 4.5, Gemini 3 Pro / Flash, Llama 4.
  • Role: Reasoning, language understanding, code generation.
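
To make "predict the next token" concrete, here is a minimal sketch of the final step only: turning scores into probabilities and picking a token. The scores in `toy_logits` are made up for illustration; a real LLM computes them with a neural network over a vocabulary of many thousands of tokens, and often samples from the distribution instead of always taking the top choice.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

def next_token(logits):
    """Greedy decoding: return the single most probable token."""
    probs = softmax(logits)
    return max(probs, key=probs.get)

# Hypothetical scores a model might assign after the prompt "The sky is".
toy_logits = {"blue": 4.0, "clear": 2.5, "falling": 0.5, "banana": -3.0}

probabilities = softmax(toy_logits)
chosen = next_token(toy_logits)
print(chosen)
```

Generating a full answer repeats this step: the chosen token is appended to the prompt and the model scores the next position.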

Embeddings and Vector DB: the memory.

  • Embeddings: Converting text into a list of numbers (vectors) where similar meanings are mathematically close.
  • Vector DB: A database optimized to store and search these vectors fast.
  • Examples: Pinecone, Weaviate, pgvector (PostgreSQL).
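
The two ideas above can be sketched in a few lines of plain Python. The three-number vectors below are hand-written stand-ins for real embeddings (which typically have hundreds or thousands of dimensions), and `TinyVectorStore` is a hypothetical in-memory class, not the API of Pinecone, Weaviate, or pgvector. Cosine similarity is used as the closeness measure, which is one common choice.

```python
import math

def cosine_similarity(a, b):
    """1.0 means same direction (similar meaning), 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class TinyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs, scans them all."""

    def __init__(self):
        self._items = []

    def add(self, text, vector):
        self._items.append((text, vector))

    def search(self, query_vector, k=2):
        scored = [
            (cosine_similarity(query_vector, vector), text)
            for text, vector in self._items
        ]
        scored.sort(reverse=True)  # highest similarity first
        return [text for _, text in scored[:k]]

store = TinyVectorStore()
# Made-up vectors: the first two texts point in a similar direction.
store.add("Refunds are accepted within 30 days.", [0.9, 0.1, 0.0])
store.add("Returns require the original receipt.", [0.8, 0.3, 0.1])
store.add("Our office is closed on public holidays.", [0.0, 0.2, 0.9])

# Pretend this is the embedding of "How do I get my money back?"
query_vector = [1.0, 0.1, 0.0]
results = store.search(query_vector, k=2)
print(results)
```

This sketch compares the query against every stored vector. A real vector database avoids that full scan with approximate nearest-neighbor indexes, which is what "optimized to search fast" refers to.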

RAG (Retrieval-Augmented Generation): the workflow. A technique that gives the LLM “open book” access to your private data.

```mermaid
sequenceDiagram
    participant User
    participant App
    participant VectorDB
    participant LLM

    User->>App: "What is our refund policy?"
    App->>App: Convert query to vector
    App->>VectorDB: Search similar documents
    VectorDB-->>App: Returns "Refund Policy.pdf" chunks
    App->>LLM: Prompt + Query + Context Chunks
    LLM-->>App: Generates answer based on Context
    App-->>User: "Our refund policy is 30 days..."
```
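
The sequence above can be sketched end to end in Python. Everything here is a stand-in chosen for illustration: `embed` is a toy word counter over a tiny hypothetical vocabulary rather than a real embedding model, the chunk list replaces a vector database, and `fake_llm` simply echoes the prompt where a real model call would go.

```python
import math
import re

# Toy vocabulary (an assumption for this sketch, not a real model's).
VOCAB = ["refund", "days", "shipping", "password", "reset", "policy"]

def embed(text):
    """Stand-in for an embedding model: count vocabulary words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return [words.count(term) for term in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=1):
    """Steps 2-4 of the diagram: embed the query, rank chunks by similarity."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query, context_chunks):
    """Step 5: combine instructions, retrieved context, and the user query."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

def answer(query, chunks, llm):
    """Full flow: retrieve, build the prompt, hand it to the model."""
    return llm(build_prompt(query, retrieve(query, chunks)))

CHUNKS = [
    "Refund policy: refunds are accepted within 30 days of purchase.",
    "Shipping takes 3 to 5 business days.",
    "To reset your password, use the account settings page.",
]

def fake_llm(prompt):
    # Placeholder for a real model call; it just returns the prompt it received.
    return prompt

result = answer("What is our refund policy?", CHUNKS, fake_llm)
print(result)
```

Passing the model in as a plain callable keeps retrieval and prompt construction testable without any network access, and lets the echo function be swapped for a real client later.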

Fine-tuning vs. RAG vs. Prompt Engineering

Choosing the right approach is key to cost and performance.

| Approach | Description | Best For | Cost |
| --- | --- | --- | --- |
| Prompt Engineering | Optimizing the instructions sent to the model. | Quick prototyping, behavior adjustment. | Low |
| RAG | Injecting knowledge into the prompt context dynamically. | Knowledge retrieval, searching documentation, “Chat with PDF”. | Medium |
| Fine-tuning | Retraining the model weights on your specific data. | Teaching specific style, vocabulary, or highly specialized tasks (e.g., medical diagnosis). | High |

  • Tech: RAG + Vector DB.
  • Use Case: HR Policy bot.
  • Why: Policies change often. With RAG, updating the database updates the bot's answers immediately; no retraining is needed.

  • Tech: Hybrid Search (Keyword + Vector).
  • Use Case: Legal contract review.
  • Why: Lawyers need to find both specific clauses (keyword matches) and conceptually similar passages (vector matches).

  • Tech: RAG + Function Calling.
  • Use Case: Order status bot.
  • Why: Needs access to a live database (via tools) and a static knowledge base (RAG).
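
As an illustration of the hybrid search scenario above, the sketch below blends a keyword score with a vector score using a weight `alpha`. The clause texts, the two-number vectors, and the simple term-overlap score are all assumptions made for this example; production systems commonly use BM25 for the keyword side and may merge the two rankings with a method such as reciprocal rank fusion instead of a weighted sum.

```python
import math
import re

def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    terms = set(re.findall(r"[a-z]+", query.lower()))
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(terms & words) / len(terms) if terms else 0.0

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_search(query, query_vec, docs, alpha=0.5, k=2):
    """docs is a list of (text, vector); alpha weights vector vs. keyword score."""
    scored = []
    for text, vec in docs:
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    scored.sort(reverse=True)  # best combined score first
    return [text for _, text in scored[:k]]

# Hand-written vectors standing in for real embeddings.
DOCS = [
    ("Indemnification: the supplier shall indemnify the buyer against claims.", [0.9, 0.1]),
    ("The vendor covers losses the customer suffers from lawsuits.", [0.85, 0.2]),
    ("Payment is due within 30 days of invoice.", [0.1, 0.9]),
]

hits = hybrid_search("indemnification clause", [0.9, 0.15], DOCS)
print(hits)
```

The first clause ranks highest because it matches on both the exact keyword and the meaning. The second contains none of the query words but still outranks the payment clause because its vector is close to the query, which is the conceptual match that keyword search alone would miss.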