How to use LangChain to chat with your PDFs
LangChain is a very hot topic right now in the AI space.
I found an amazing project and video from Alejandro AO and decided to make a few tweaks so that more people can deploy it and try the project.
The project's UI part is done in Python with Streamlit ✅
The vector store is Open Source (FAISS) ✅
The RAG Framework used is LangChain ✅
This part, the LLMs (via the OpenAI API), is not Free & Open Source ❎
You can always replace them with open LLMs by running local models through Ollama 🦙
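For instance, here is a minimal sketch with LangChain's Ollama wrapper (available in the langchain version pinned later in this post), assuming a local Ollama server with a model already pulled; the model name is just an example:

from langchain.llms import Ollama

# Assumes Ollama is running locally (default: http://localhost:11434)
llm = Ollama(model="llama2")  # example model name; use any model you have pulled
print(llm("Summarize what RAG means in one sentence."))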
The initial project is available on GitHub, but you can use your own Gitea repository as well.
Chat with PDF Streamlit
If you want, you can try the project first:
- Install Python 🐍
- Clone the repository
- And install Python dependencies 👇
- (Optional) Use GH Actions to build a production-ready environment
git clone https://github.com/JAlcocerT/ask-multiple-pdfs/
cd ask-multiple-pdfs
python -m venv chatwithpdf #create it
chatwithpdf\Scripts\activate #activate venv (windows)
source chatwithpdf/bin/activate #(linux)
#deactivate #when you are done
Once active, you can just install packages as usual and that will affect only that venv:
pip install -r requirements.txt #all at once
#pip list
#pip show streamlit #check the installed version
streamlit run app.py #launch the app (add your OPENAI_API_KEY to a .env file first)
SelfHosting a PDF Chat Bot
As always, we are going to use containers to simplify the deployment process.
Really, Just Get Docker 🐋👇
You can install Docker on any PC, Mac, or Linux machine at home, or in any cloud provider you wish. It will just take a few moments. If you are on Linux, just:
sudo apt-get update && sudo apt-get upgrade -y
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
And also install Docker Compose with:
sudo apt install docker-compose -y
When the process finishes, you can use Docker to self-host other services as well. Check the installed versions with:
docker --version
docker-compose --version
#sudo systemctl status docker #and the status
Now, these are the things you need:
- The Dockerfile
- The requirements.txt file
- The App (Python+Streamlit, of course)
- The Docker image with the app bundled, plus the Docker configuration
The Dockerfile
FROM python:3.11-slim
# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./
RUN apt-get update && apt-get install -y \
build-essential \
curl \
software-properties-common \
git \
&& rm -rf /var/lib/apt/lists/*
# Install production dependencies.
RUN pip install -r requirements.txt
EXPOSE 8501
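# No CMD/ENTRYPOINT here: the start command (streamlit run app.py) is passed at runtime, e.g. via the docker-compose below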
We will need the libraries for:
- OpenAI API
- LangChain for the RAG
- FAISS as the vector store.
The Python packages required (requirements.txt)
streamlit==1.28.0
pypdf2==3.0.1
langchain==0.0.325
python-dotenv==1.0.0
faiss-cpu==1.7.4
openai==0.28.1
tiktoken==0.5.1
And if you want to build your own image… ⏬
docker build -t chat_multiple_pdf .
#export DOCKER_BUILDKIT=1
#docker build --no-cache -t chat_multiple_pdf .
#docker build --no-cache -t chat_multiple_pdf . > build_log.txt 2>&1
#docker run -p 8501:8501 chat_multiple_pdf:latest
#docker exec -it chat_multiple_pdf /bin/bash
#sudo docker run -it -p 8502:8501 chat_multiple_pdf:latest /bin/bash
You can run this build manually, use GitHub Actions, or you can even combine Gitea and Jenkins to do it for you.
PDF Chat Bot - Docker Compose
With the Docker Compose below, you will be using the x86 Docker image created by the GitHub Actions CI/CD:
version: '3'

services:
  streamlit-embeddings-pdfs:
    image: ghcr.io/jalcocert/ask-multiple-pdfs:v1.0 #chat_multiple_pdf / whatever name you built
    container_name: chat_multiple_pdf
    volumes:
      - ai_chat_multiple_pdf:/app
    working_dir: /app # Set the working directory to /app
    command: /bin/sh -c "export OPENAI_API_KEY='your_openai_api_key_here' && export HUGGINGFACE_API_KEY='your_huggingface_api_key_here' && streamlit run app.py" #tip: an env_file keeps keys out of the compose file
    #command: tail -f /dev/null
    ports:
      - "8501:8501"

volumes:
  ai_chat_multiple_pdf:
If you followed along, spin it up with docker-compose up -d and the PDF chat will be available at localhost:8501. It looks like this:
FAQ
How to use GitHub Actions to Build a Streamlit Image
The project has a GH Actions workflow that pushes a new image to the GitHub Container Registry whenever new code is pushed.
The Workflow Configuration File 👇
You just need this file at .github/workflows:
name: CI/CD Pipeline

on:
  push:
    branches:
      - main

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1 #enables multi-arch (x86/ARM64) builds

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.CICD_TOKEN_AskPDF }} #a PAT from Settings -> Developer settings -> Tokens, added under Repo Settings -> Secrets and variables -> Actions
      - name: Build and push Docker image
        uses: docker/build-push-action@v2
        with:
          context: .
          push: true
          tags: ghcr.io/yourghuser/ask-multiple-pdfs:v1.0
Copy it to your project and get the most out of GH Actions.
Other FREE AI Tools to Chat with Docs
Service | Description |
---|---|
PrivateGPT | The PrivateGPT project works with Docker and with local and open models out of the box. |

This project uses FAISS as its vector database, but there are other F/OSS alternatives 👇

Alternative | Description |
---|---|
ChromaDB | Another F/OSS vector store option. |
Vector Admin | A F/OSS web GUI for managing vector databases (not a vector store itself). |
RAG Frameworks
RAG is a technique for enhancing the capabilities of large language models (LLMs) by allowing them to access and process information from external sources.
It follows a three-step approach (a code sketch follows the list):
- Retrieve: Search for relevant information based on the user query using external sources like search engines or knowledge bases.
- Ask: Process the retrieved information and potentially formulate additional questions based on the context.
- Generate: Use the retrieved and processed information to guide the LLM in generating a more comprehensive and informative response.
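Here is a minimal sketch of those three steps with LangChain + FAISS (the versions pinned above), assuming OPENAI_API_KEY is set and with toy strings standing in for PDF text:

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Retrieve: embed a couple of toy chunks and index them in FAISS
texts = ["LangChain orchestrates RAG pipelines.", "FAISS stores and searches embeddings."]
db = FAISS.from_texts(texts, OpenAIEmbeddings())  # needs OPENAI_API_KEY

# Ask + Generate: the chain retrieves the relevant chunks and passes them to the LLM
qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=db.as_retriever())
print(qa.run("What does FAISS do?"))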
Some time ago, I covered another RAG framework: EmbedChain.
Interesting Concepts for RAGs 📌
Concept | Description |
---|---|
Embedding Models | Machine learning models that convert text into dense vector representations carrying semantic information. |
Vector Store | A data structure that stores vector representations of text/content together with associated metadata. |
Chroma | A F/OSS embedding database for storing and querying vectors, with built-in similarity search. |
FAISS | A library by Facebook AI Research (Meta) for scalable, high-performance vector similarity search. |
Role in Semantic Search | Embedding models generate the vectors, vector stores hold them, and tools like Chroma/FAISS make searching them efficient. |
Applications | Semantic search, recommendation systems, content retrieval, personalized content delivery, question answering. |
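To make "dense vector representation" concrete, here is a tiny sketch with the OpenAI embeddings wrapper from the pinned langchain version (needs OPENAI_API_KEY):

from langchain.embeddings.openai import OpenAIEmbeddings

emb = OpenAIEmbeddings()  # reads OPENAI_API_KEY from the environment
vector = emb.embed_query("Chat with your PDFs")
print(len(vector), vector[:3])  # one dense vector, e.g. 1536 floats with ada-002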
Examples of Embedding Models:
- NLP (Natural Language Processing): Models like BERT or GPT-3 used for tasks like text summarization, sentiment analysis, chatbots.
- Computer Vision: CNNs for object recognition, facial recognition, or medical image analysis.
- Speech Recognition: Used in transcribing speech for voice-controlled systems.
How Embedding Models Enhance Semantic Search:
- Natural Language Understanding: Grasp the semantics and relationships in user queries.
- Intent Recognition: Identify different user intents (informational, transactional, etc.).
- Query Expansion: Suggest related terms or synonyms to improve results.
- Contextual Search: Take into account the query and content context for relevant results.
- Entity Recognition: Identify entities (people, places, etc.) in queries.
- Question Answering: Answer questions using context from the query and data.
- Relevance Ranking: Rank search results by semantic similarity.
- Personalization: Tailor results to individual user preferences.
How Components Work Together:
- Embedding Models: Convert text into dense vector representations (embeddings).
- Vector Store: Stores embeddings with metadata for efficient retrieval.
- Chroma: A F/OSS embedding database with built-in similarity search.
- FAISS: High-performance similarity search library (developed by Facebook AI).
Typical Workflow for Semantic Search (a toy FAISS sketch follows the list):
- Text is processed by embedded models into vectors.
- These vectors are stored in the vector store.
- User queries are converted into vectors.
- Libraries like Chroma/FAISS search for the most similar vectors.
- Results are ranked and presented as recommendations or search results.
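A toy version of that workflow with faiss-cpu (pinned in requirements.txt), using random vectors in place of real embeddings:

import numpy as np
import faiss  # faiss-cpu

dim = 4  # real embeddings are much wider, e.g. 1536 dimensions
doc_vectors = np.random.rand(10, dim).astype("float32")  # steps 1-2: embed + store

index = faiss.IndexFlatL2(dim)
index.add(doc_vectors)

query_vector = np.random.rand(1, dim).astype("float32")  # step 3: embed the query
distances, ids = index.search(query_vector, 3)  # step 4: nearest-neighbour search
print(ids[0], distances[0])  # step 5: ranked results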
Applications
- Semantic search
- Recommendation systems
- Content retrieval
- Personalized content delivery
- Question-answering systems
LangChain
This Streamlit project uses LangChain as its RAG framework, with the core focus on the retrieval part of the pipeline:
A Python Library to Build context-aware reasoning applications
Why or Why not LangChain as RAG? ⏬
- Provides a high-level interface for building RAG systems
- Supports various retrieval methods, including vector databases and search engines
- Offers a wide range of pre-built components and utilities for text processing and generation
- Integrates well with popular language models like OpenAI’s GPT series
LangChain supports several Vector Stores: https://python.langchain.com/v0.1/docs/modules/data_connection/vectorstores/
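For instance, FAISS and Chroma expose the same vector store interface, so swapping stores is a one-line change (a sketch assuming pip install chromadb and an OPENAI_API_KEY):

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS, Chroma

emb = OpenAIEmbeddings()
texts = ["Vector stores are interchangeable in LangChain."]

db = FAISS.from_texts(texts, emb)      # this project's choice
#db = Chroma.from_texts(texts, emb)    # drop-in alternative (needs chromadb)

retriever = db.as_retriever()  # same retriever interface either way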
But LangChain is not the only F/OSS option…
…we also have LlamaIndex and LangFlow.
LlamaIndex
Why LlamaIndex as RAG? ⏬
- Designed specifically for building index-based retrieval systems
- Provides a simple and intuitive API for indexing and querying documents
- Supports various indexing techniques, including vector-based and keyword-based methods
- Offers built-in support for common document formats (e.g., PDF, HTML)
- Lightweight and easy to integrate into existing projects
On the other hand, it is primarily focused on indexing and retrieval, so it may lack more advanced generation capabilities.
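A minimal sketch with the classic llama-index API (this assumes a pre-0.10 llama-index; ./docs is a hypothetical folder of PDFs and OPENAI_API_KEY must be set):

from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Load local files (PDF readers are built in) and build a vector index
documents = SimpleDirectoryReader("./docs").load_data()  # hypothetical folder
index = VectorStoreIndex.from_documents(documents)

# Query it; llama-index defaults to OpenAI for embeddings and generation
response = index.as_query_engine().query("What are these documents about?")
print(response)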
LangFlow
Langflow is a visual framework for building multi-agent and RAG applications. It’s open-source, Python-powered, fully customizable, model and vector store agnostic.
Why (or Why not) LangFlow as RAG? ⏬
- Pros:
- Offers a visual programming interface for building RAG pipelines
- Allows for easy experimentation and prototyping without extensive coding
- Provides a library of pre-built nodes for various tasks (e.g., retrieval, generation)
- Supports integration with popular language models and libraries
- Enables rapid development and iteration of RAG systems
- Cons:
- May have limitations in terms of customization and fine-grained control compared to code-based approaches
- Visual interface may not be suitable for complex or large-scale projects
- Dependency on the LangFlow platform and its ecosystem