How to use LangChain to chat with your PDFs

March 7, 2024

LangChain is one of the hottest topics in the AI space right now.

I found an amazing project and video from Alejandro AO and decided to make a few tweaks so that more people can deploy and try it.

The app relies on the OpenAI and Hugging Face APIs by default, but you can always replace those with open LLMs served via Ollama 🦙
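
For reference, here is a hedged sketch of what that swap could look like with LangChain's Ollama integration. It assumes a local Ollama server with a pulled model; the model name is just an example, not what this project ships with:

# Hypothetical swap: a local Ollama model instead of the OpenAI API.
# Assumes Ollama is running on localhost and `ollama pull llama2` was done.
from langchain.llms import Ollama

llm = Ollama(model="llama2")  # example model name
print(llm("Summarize what RAG does in one sentence."))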

The initial project is available on GitHub, but you can host it in your own Gitea repository as well.

Chat with PDF Streamlit

If you want, you can try the project first:

  1. Install Python 🐍
  2. Clone the repository
  3. And install Python dependencies 👇
  4. (Optional) - Use GH Actions to build a production-ready image
git clone https://github.com/JAlcocerT/ask-multiple-pdfs/

python -m venv chatwithpdf #create it

chatwithpdf\Scripts\activate #activate venv (windows)
source chatwithpdf/bin/activate #(linux)

#deactivate #when you are done

Once active, you can just install packages as usual and that will affect only that venv:

pip install -r requirements.txt #all at once

#pip list
#pip show streamlit #check the installed version
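
With the dependencies in place, you can already launch the app locally (the entry point is app.py, the same file the Docker setup runs later):

streamlit run app.py #the UI will be available at localhost:8501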

SelfHosting a PDF Chat Bot

As always, we are going to use containers to simplify the deployment process.

Really, Just Get Docker 🐋👇

You can install Docker on any PC, Mac, or Linux machine at home, or on any cloud provider you wish. It will just take a few moments. If you are on Linux, just:

sudo apt-get update && sudo apt-get upgrade -y
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

And install Docker Compose as well with:

sudo apt install docker-compose -y

When the process finishes, you can use it to self-host other services as well. You should see the versions with:

docker --version
docker-compose --version
#sudo systemctl status docker #and the status

Now, these are the things you need:

  1. The Dockerfile
  2. The requirements.txt file
  3. The App (Python+Streamlit, of course)
  4. The Docker image, ready with the app bundled, plus the Docker configuration
The Dockerfile
FROM python:3.11-slim

# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME

COPY . ./

RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    software-properties-common \
    git \
    && rm -rf /var/lib/apt/lists/*

# Install production dependencies.
RUN pip install -r requirements.txt

EXPOSE 8501
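
Note that this Dockerfile has no CMD: the Docker Compose file further down supplies the streamlit run command. If you prefer an image that is runnable on its own, one option (a sketch, not part of the original project) is to append:

# Optional: make the image self-contained; Compose can still override this.
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]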

We will also need the Python libraries the app depends on:

The Packages required for Python
streamlit==1.28.0
pypdf2==3.0.1

langchain==0.0.325
python-dotenv==1.0.0

faiss-cpu==1.7.4

openai==0.28.1
tiktoken==0.5.1
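
To see how those pins fit together, here is a minimal sketch of the pipeline the app implements (PDF text → chunks → embeddings → FAISS). It is an illustration, not the exact app code: it assumes an OPENAI_API_KEY in the environment and a sample.pdf on disk.

# Minimal sketch of the core pipeline, using the packages pinned above.
# Assumes OPENAI_API_KEY is set and "sample.pdf" exists.
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

reader = PdfReader("sample.pdf")
raw_text = "".join(page.extract_text() or "" for page in reader.pages)

# Split into overlapping chunks so each embedding keeps local context.
splitter = CharacterTextSplitter(separator="\n", chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(raw_text)

# Embed the chunks and index them in FAISS for similarity search.
store = FAISS.from_texts(chunks, OpenAIEmbeddings())
print(store.similarity_search("What is this document about?", k=3)[0].page_content)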
And if you want to build your own image… ⏬
docker build -t chat_multiple_pdf .

#export DOCKER_BUILDKIT=1
#docker build --no-cache -t chat_multiple_pdf .

#docker build --no-cache -t chat_multiple_pdf . > build_log.txt 2>&1

#docker run -p 8501:8501 chat_multiple_pdf:latest
#docker exec -it chat_multiple_pdf /bin/bash

#sudo docker run -it -p 8502:8501 chat_multiple_pdf:latest /bin/bash

You can run this build manually, use GitHub Actions, or even combine Gitea and Jenkins to do it for you.

PDF Chat Bot - Docker Compose

With the Docker Compose below, you will be using the x86 Docker image built by the GitHub Actions CI/CD pipeline:

version: '3'

services:
  streamlit-embeddings-pdfs:
    image: ghcr.io/jalcocert/ask-multiple-pdfs:v1.0 #chat_multiple_pdf / whatever name you built
    container_name: chat_multiple_pdf
    volumes:
      - ai_chat_multiple_pdf:/app
    working_dir: /app  # Set the working directory to /app
    command: /bin/sh -c "export OPENAI_API_KEY='your_openai_api_key_here' && export HUGGINGFACE_API_KEY='your_huggingface_api_key_here' && streamlit run app.py"
    #command: tail -f /dev/null    
    ports:
      - "8501:8501"    

volumes:
  ai_chat_multiple_pdf:  
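
Exporting the API keys inside command works, but it makes them easy to leak into version control. A hedged alternative is to keep them in an untracked .env file next to docker-compose.yml and let Compose substitute them:

# .env (kept out of git):
#   OPENAI_API_KEY=sk-...
#   HUGGINGFACE_API_KEY=hf_...
version: '3'

services:
  streamlit-embeddings-pdfs:
    image: ghcr.io/jalcocert/ask-multiple-pdfs:v1.0
    container_name: chat_multiple_pdf
    working_dir: /app
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}       # substituted from .env
      - HUGGINGFACE_API_KEY=${HUGGINGFACE_API_KEY}
    command: streamlit run app.py
    ports:
      - "8501:8501"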

If you followed along, the PDF chat is now available at localhost:8501 and looks like:

Streamlit Chat with PDF with OpenAI


FAQ

How to use GitHub Actions to Build a Streamlit Image

The project includes a GitHub Actions workflow that pushes a new image to the GitHub Container Registry whenever new code is pushed to the repository.

The Workflow Configuration File 👇

You just need this file under .github/workflows:

name: CI/CD Pipeline

on:
  push:
    branches:
      - main

jobs:
  build-and-push:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout repository
      uses: actions/checkout@v2

    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v1 #enables Buildx (a prerequisite for multi-arch builds)

    - name: Login to GitHub Container Registry
      uses: docker/login-action@v1
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.CICD_TOKEN_AskPDF }} #Settings -> Dev Settings -> PAT's -> Tokens +++ Repo Settings -> Secrets & variables -> Actions -> New repo secret 

    - name: Build and push Docker image
      uses: docker/build-push-action@v2
      with:
        context: .
        push: true
        tags: ghcr.io/yourghuser/ask-multiple-pdfs:v1.0

Copy it to your project and get the most out of GH Actions.
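
One caveat: setting up Buildx by itself does not produce ARM64 images. If you actually want multi-arch builds, a sketch of the extra pieces (QEMU emulation plus an explicit platforms list; both are additions, not part of the workflow above) would look like:

    - name: Set up QEMU
      uses: docker/setup-qemu-action@v1

    - name: Build and push Docker image
      uses: docker/build-push-action@v2
      with:
        context: .
        push: true
        platforms: linux/amd64,linux/arm64   # emit both architectures
        tags: ghcr.io/yourghuser/ask-multiple-pdfs:v1.0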

Other FREE AI Tools to Chat with Docs

| Service | Description |
|---|---|
| PrivateGPT | The PrivateGPT Project works with Docker and with local and open models out of the box. |
| Other FREE Vector Stores 👇 | This project uses FAISS as its vector database, but there are other F/OSS alternatives: |
| ChromaDB | Another F/OSS vector store option. |
| Vector Admin Project | An additional F/OSS vector store alternative. |

RAG Frameworks

RAG is a technique for enhancing the capabilities of large language models (LLMs) by allowing them to access and process information from external sources.

  • It follows a three-step approach (sketched in code below):
    • Retrieve: Search external sources such as vector stores, search engines, or knowledge bases for information relevant to the user query.
    • Augment: Add the retrieved information to the prompt, so the model has the context it needs.
    • Generate: Let the LLM use that retrieved context to produce a more comprehensive and informative response.
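
Here is a framework-free sketch of that loop in plain Python. The embed() and generate() callables are hypothetical stand-ins for a real embedding model and LLM; the retrieval math (cosine similarity) is the real part:

# Minimal retrieve/augment/generate loop; embed() and generate() are stubs
# you would back with a real embedding model and LLM.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, store, k=3):
    # store: list of (vector, text) pairs
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def answer(question, store, embed, generate):
    context = "\n".join(retrieve(embed(question), store))       # Retrieve
    prompt = f"Context:\n{context}\n\nQuestion: {question}"     # Augment
    return generate(prompt)                                     # Generate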

Some time ago I covered another RAG framework - the EmbedChain RAG Framework.

Interesting Concepts for RAGs 📌
| Concept | Description |
|---|---|
| Embedding Models | Machine learning models that convert text into dense vector representations carrying semantic information. |
| Vector Store | A data structure that stores vector representations of text/content together with associated metadata. |
| Chroma | An F/OSS embedding database for efficient vector similarity search. |
| FAISS | A library by Facebook AI Research for scalable, high-performance vector similarity search. |
| Role in Semantic Search | Embedding models generate the vectors, vector stores hold them, and libraries like Chroma/FAISS make searching them efficient. |
| Applications | Semantic search, recommendation systems, content retrieval, personalized content delivery, question answering. |

Examples of Embedding Models:

  • NLP (Natural Language Processing): Models like BERT or GPT-3 used for tasks like text summarization, sentiment analysis, chatbots.
  • Computer Vision: CNNs for object recognition, facial recognition, or medical image analysis.
  • Speech Recognition: Used in transcribing speech for voice-controlled systems.

How Embedding Models Enhance Semantic Search:

  1. Natural Language Understanding: Grasp the semantics and relationships in user queries.
  2. Intent Recognition: Identify different user intents (informational, transactional, etc.).
  3. Query Expansion: Suggest related terms or synonyms to improve results.
  4. Contextual Search: Take into account the query and content context for relevant results.
  5. Entity Recognition: Identify entities (people, places, etc.) in queries.
  6. Question Answering: Answer questions using context from the query and data.
  7. Relevance Ranking: Rank search results by semantic similarity.
  8. Personalization: Tailor results to individual user preferences.

How Components Work Together:

  1. Embedding Models: Convert text into dense vector representations (embeddings).
  2. Vector Store: Stores embeddings with metadata for efficient retrieval.
  3. Chroma: An F/OSS embedding database built for vector similarity search.
  4. FAISS: A high-performance similarity search library developed by Facebook AI Research.

Typical Workflow for Semantic Search (illustrated in code after this list):

  • Text is processed by embedding models into vectors.
  • These vectors are stored in the vector store.
  • User queries are converted into vectors.
  • Libraries like Chroma/FAISS search for the most similar vectors.
  • Results are ranked and presented as recommendations or search results.
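
To make that workflow concrete, here is a tiny runnable example with the pinned faiss-cpu package. Random vectors stand in for real embeddings; the dimensions and data are arbitrary:

# Index some vectors in FAISS, then search for the nearest neighbours.
import faiss
import numpy as np

dim = 64                                                # embedding size (arbitrary)
docs = np.random.random((100, dim)).astype("float32")   # "document" embeddings
query = np.random.random((1, dim)).astype("float32")    # "query" embedding

index = faiss.IndexFlatL2(dim)             # exact L2-distance index
index.add(docs)                            # store the vectors
distances, ids = index.search(query, 3)    # top-3 most similar vectors
print(ids[0], distances[0])                # result ids and their distances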

Applications

  • Semantic search
  • Recommendation systems
  • Content retrieval
  • Personalized content delivery
  • Question-answering systems

LangChain

This Streamlit project uses LangChain for its RAG pipeline - with LangChain's core strength being the retrieval side:

A Python Library to Build context-aware reasoning applications

Why or Why not LangChain as RAG? ⏬
  • Provides a high-level interface for building RAG systems
  • Supports various retrieval methods, including vector databases and search engines
  • Offers a wide range of pre-built components and utilities for text processing and generation
  • Integrates well with popular language models like OpenAI’s GPT series

LangChain supports several Vector Stores: https://python.langchain.com/v0.1/docs/modules/data_connection/vectorstores/
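
Swapping the vector store is usually a one-line change. As a hedged sketch, here is the same kind of pipeline with Chroma instead of FAISS (assumes chromadb is installed alongside the pinned langchain, plus an OPENAI_API_KEY):

# Chroma standing in for FAISS; the LangChain interface stays the same.
# Assumes `pip install chromadb` and OPENAI_API_KEY in the environment.
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

chunks = ["LangChain wires retrieval into LLM apps.",
          "FAISS and Chroma both store embeddings."]
store = Chroma.from_texts(chunks, OpenAIEmbeddings())
print(store.similarity_search("what stores embeddings?", k=1)[0].page_content)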

But LangChain is not the only F/OSS option…

…we also have LlamaIndex and LangFlow.

LlamaIndex

Why LlamaIndex as RAG? ⏬
  • Designed specifically for building index-based retrieval systems
  • Provides a simple and intuitive API for indexing and querying documents
  • Supports various indexing techniques, including vector-based and keyword-based methods
  • Offers built-in support for common document formats (e.g., PDF, HTML)
  • Lightweight and easy to integrate into existing projects

It is primarily focused on indexing and retrieval, so it may lack LangChain's more advanced generation and chaining capabilities (see the sketch below).
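
For comparison, a minimal LlamaIndex sketch (API as of the llama-index 0.9.x line; the data/ folder and question are illustrative, and an OPENAI_API_KEY is assumed):

# Index a folder of documents with LlamaIndex and query it.
# Assumes `pip install llama-index` (0.9.x) and OPENAI_API_KEY set.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # loads PDFs, .txt, etc.
index = VectorStoreIndex.from_documents(documents)      # embeds and indexes them
engine = index.as_query_engine()
print(engine.query("What are these documents about?"))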

LangFlow

Langflow is a visual framework for building multi-agent and RAG applications. It’s open-source, Python-powered, fully customizable, model and vector store agnostic.

Why (or Why not) LangFlow as RAG? ⏬
  • LangFlow - Pros:
    • Offers a visual programming interface for building RAG pipelines
    • Allows for easy experimentation and prototyping without extensive coding
    • Provides a library of pre-built nodes for various tasks (e.g., retrieval, generation)
    • Supports integration with popular language models and libraries
    • Enables rapid development and iteration of RAG systems
  • Cons:
    • May have limitations in terms of customization and fine-grained control compared to code-based approaches
    • Visual interface may not be suitable for complex or large-scale projects
    • Dependency on the LangFlow platform and its ecosystem