[AI] Exploring RAGs. Creating a Chat over custom Data

November 30, 2024
ℹ️
The Data-ChatBot 💻

How to use RAGs

Earlier this year, I was lucky enough to find these open source projects:

They both use LangChain as their RAG framework.

We can build very interesting Q&A apps over a knowledge base: https://github.com/langchain-ai/chat-langchain

General RAG Architecture

This is the general idea of a RAG architecture:

flowchart TD
    A[User Query] -->|Send Query| B[Retrieval Module]

    subgraph RAG_Framework[ ]
        B -->|Retrieve Documents| C[Knowledge Base]
        C -->|Relevant Documents| B
        B -->|Provide Context| D[Generation Module]
        D -->|Generate Response| E[Final Response]
    end
    
    E -->|Send Back| F[User Response]

    style RAG_Framework fill:#f9f9f9,stroke:#333,stroke-width:2px
    style A fill:#FFDDC1
    style E fill:#85E3FF
    style F fill:#85E3FF

    %% Annotations
    classDef query fill:#FFDDC1,stroke:#333,stroke-width:1px;
    classDef retrieval fill:#FFABAB,stroke:#333,stroke-width:1px;
    classDef knowledgeBase fill:#FFC3A0,stroke:#333,stroke-width:1px;
    classDef generation fill:#D5AAFF,stroke:#333,stroke-width:1px;
    classDef response fill:#85E3FF,stroke:#333,stroke-width:1px;

And as you can imagine, there are already a few frameworks out there.

ℹ️
Normally, you will see that RAG frameworks work together with: vector DBs, embedding models and LLMs
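
To make the flowchart more concrete, here is a minimal sketch of the retrieve-then-generate loop using only the Python standard library. Everything in it is an assumption made for illustration: the `KNOWLEDGE_BASE` documents are made up, `embed` is a bag-of-words stand-in for a real embedding model, the list scan stands in for a vector DB, and `generate` is a placeholder where a real LLM call would go.

```python
import math
import re
from collections import Counter
from typing import List

# Toy knowledge base (made-up documents, stands in for a vector DB).
KNOWLEDGE_BASE = [
    "LangChain is a framework for building LLM applications.",
    "LlamaIndex helps to build context-augmented apps over your own data.",
    "Haystack can be installed with pip install haystack-ai.",
]


def embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 1) -> List[str]:
    """Retrieval module: return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: cosine(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: List[str]) -> str:
    """Combine the retrieved documents and the user query into one prompt."""
    joined = "\n".join(f"- {doc}" for doc in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Generation module: placeholder for a real LLM call (hypothetical)."""
    return f"[an LLM would answer here, given a prompt of {len(prompt)} characters]"


def answer(query: str) -> str:
    """Full loop: user query -> retrieval -> context -> generation."""
    return generate(build_prompt(query, retrieve(query)))
```

Calling `answer("How do I install Haystack?")` retrieves the Haystack document and passes it as context to the placeholder generator. The frameworks below replace each of these toy pieces with real components.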

RAG Frameworks

ChatBot for Real Estate - LlamaIndex

LlamaIndex is awesome.

And for a real estate agent bot, LlamaIndex + Mem0 does the trick.

How Exactly?

See this repo folder.

You will need OpenAI and Anthropic API keys.

ℹ️
For the Real Estate Web Project mentioned in this post, I was asked to provide a Q&A bot

Exploring LangChain

The LangChain framework is amazing.

It can be helpful to:

  1. Chat with PDFs
  2. Even with CSV’s…
  3. …or a Database!
ℹ️
You might also be interested in LangGraph

LangChain PandasDF Chat

ℹ️
Kind of like PandasAI, but with LangChain
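
As a rough idea of what "chatting with a dataframe" involves, the sketch below turns a small CSV into a text prompt that describes the table and carries the user question. This is an illustration of the general pattern only, not the actual code of LangChain or PandasAI; the sample data and the `table_prompt` helper are hypothetical, and sending the prompt to an LLM is left out.

```python
import csv
import io

# Made-up sample data for illustration.
SAMPLE_CSV = """city,price,bedrooms
Madrid,250000,2
Warsaw,180000,3
Lisbon,310000,1
"""


def table_prompt(csv_text: str, question: str, sample_rows: int = 2) -> str:
    """Describe a CSV table (columns plus a few rows) and append the question."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    # Only a preview of the data goes into the prompt, not the whole table.
    preview = "\n".join(
        ", ".join(row[col] for col in columns) for row in rows[:sample_rows]
    )
    return (
        f"The table has {len(rows)} rows and these columns: {', '.join(columns)}\n"
        f"First rows:\n{preview}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(table_prompt(SAMPLE_CSV, "Which city is the cheapest?"))
```

Only the column names and a short preview are included here, on the assumption that a full table would not fit in the prompt.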

Exploring PandasAI

I was previously using the PandasAI project to talk with dataframes, as covered in this post

HayStack as RAG Framework

The Haystack framework is completely new to me.

pip install haystack-ai

EmbedChain - Mem0

It seems that the embedchain project got absorbed into a bigger one. I'm talking about the mem0 framework.

Llama-Index

You might know Llama-Index because of its RAG capabilities.

LlamaIndex is a framework for building context-augmented generative AI applications with LLMs including agents and workflows.

PydanticAI

I was using Pydantic this year.

Agent Framework / shim to use Pydantic with LLMs. MIT Licensed!

What is Pydantic? 📌

Pydantic is a data validation and settings management library in Python.

It’s widely used for validating data and ensuring that inputs conform to the expected types and formats.
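
A small example of that validation, assuming `pydantic` is installed. The `Listing` model, its fields and the `try_parse` helper are hypothetical names chosen for illustration.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Listing(BaseModel):
    """Hypothetical real estate listing used only for this example."""

    address: str
    price: int
    bedrooms: int = 1  # default used when the field is missing


def try_parse(data: dict) -> Optional[Listing]:
    """Return a validated Listing, or None when the input does not conform."""
    try:
        return Listing(**data)
    except ValidationError:
        return None


# A numeric string is coerced to int by the default (non-strict) validation.
ok = Listing(address="Main St 1", price="250000")

# A value that cannot be read as an integer is rejected.
bad = try_parse({"address": "Main St 2", "price": "not a number"})
```

Here `ok.price` ends up as an integer and `bad` is `None`, which is the kind of guarantee that is useful when parsing structured output from an LLM.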


Summing Up

We have seen some interesting RAG Frameworks working in Python

Star History Chart

What's next from here?

Why not build something cool?

Interesting API keys for LLMs

Other LLMs that I have not covered yet in posts

You can always use Ollama!

LLMs that have already appeared:

Running LLMs Locally

Interesting RAG Resources

VectorDBs


FAQ

More GitHub Actions CI/CD

  1. https://fossengineer.com/docker-github-actions-cicd/
  2. https://jalcocert.github.io/JAlcocerT/create-streamlit-chatgpt/#conclusion---and-what-i-learnt
# https://github.com/nektos/act/releases/tag/v0.2.70

wget https://github.com/nektos/act/releases/download/v0.2.70/act_Linux_x86_64.tar.gz

tar -xzf act_Linux_x86_64.tar.gz
sudo mv act /usr/local/bin/
sudo chmod +x /usr/local/bin/act

act --version

Then go to the repo folder (where .github/workflows is) and run:

act