7. Ollama
This script will:
✅ Generate a Streamlit app that integrates RAG retrieval with a local LLM (Ollama) via a subprocess call.
✅ Save the app to /mnt/data/rag_streamlit_with_ollama/app.py.
✅ Zip the app into /mnt/data/rag_streamlit_with_ollama.zip for easy transfer/deployment.
To Run Everything:
1. Install requirements:
pip install streamlit sentence-transformers chromadb
2. Start the app:
streamlit run /mnt/data/rag_streamlit_with_ollama/app.py
3. Ensure Ollama is running:
Make sure Ollama is installed and a model (e.g., llama3) has been downloaded, then start it (a quick reachability check is shown after this list):
ollama run llama3
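If you want to confirm the server is reachable before opening the app, a small check like the one below works. This is a minimal sketch, assuming Ollama's default local port 11434 and the requests package (not part of the install list above):

import requests

try:
    r = requests.get("http://localhost:11434/api/tags", timeout=5)
    r.raise_for_status()
    # List the models Ollama has available locally
    models = [m["name"] for m in r.json().get("models", [])]
    print("Ollama is serving. Local models:", models)
except requests.RequestException as err:
    print("Ollama does not appear to be running:", err)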
📦 Output:
You’ll get a web app that:
- Provides an input box for the user’s question
- Embeds the query using MiniLM
- Fetches the most relevant chunks from ChromaDB (see the note on persistence after this list)
- Builds a prompt from the retrieved context
- Sends the prompt to ollama run llama3
- Displays the result in the browser
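Note on ChromaDB persistence: the app below creates an in-memory client with chromadb.Client(), which only sees the rag_chunks collection if it lives in that same store. If your main.py pipeline writes ChromaDB data to disk, point the app at the same location instead. A minimal sketch, assuming a recent chromadb version and a hypothetical shared ./chroma_db directory (use whatever path main.py actually uses):

import chromadb

# Must match the persistence path used by the ingestion pipeline (main.py)
chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_collection("rag_chunks")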
# Generate Streamlit app with direct integration to Ollama (via subprocess call)
from pathlib import Path
import zipfile

# Define the app path
app_path = Path("/mnt/data/rag_streamlit_with_ollama")
app_path.mkdir(parents=True, exist_ok=True)

# Updated Streamlit app with Ollama integration
streamlit_ollama_code = """
import streamlit as st
from sentence_transformers import SentenceTransformer
import chromadb
import subprocess

st.set_page_config(page_title="RAG with Ollama", layout="wide")
st.title("🧠 RAG + Ollama: Ask Questions from Your PDFs")

query = st.text_input("Ask a question from your documents:")

if query:
    model = SentenceTransformer('all-MiniLM-L6-v2')
    chroma_client = chromadb.Client()
    collection = chroma_client.get_collection("rag_chunks")

    # Embed query
    query_vector = model.encode(query).tolist()
    results = collection.query(query_embeddings=[query_vector], n_results=3)
    contexts = results['documents'][0]

    st.subheader("📄 Retrieved Contexts")
    for i, chunk in enumerate(contexts):
        st.markdown(f"**Chunk {i+1}:**")
        st.code(chunk, language="markdown")

    # Construct prompt
    context_str = "\\n---\\n".join(contexts)
    full_prompt = f\"\"\"Context:
{context_str}
Question: {query}
Answer:\"\"\"

    st.subheader("🧠 Prompt Sent to LLM")
    st.code(full_prompt, language="markdown")

    # Run with Ollama
    st.subheader("🤖 Response from Ollama")
    try:
        result = subprocess.run(
            ["ollama", "run", "llama3"],
            input=full_prompt,
            capture_output=True,
            text=True,
            check=True
        )
        st.success(result.stdout.strip())
    except subprocess.CalledProcessError as e:
        st.error("❌ Error running Ollama.")
        st.text(e.stderr)
else:
    st.info("Enter a question to get started.")
"""

# Save the app file
streamlit_ollama_path = app_path / "app.py"
streamlit_ollama_path.write_text(streamlit_ollama_code.strip())

# Zip it
zip_path = "/mnt/data/rag_streamlit_with_ollama.zip"
with zipfile.ZipFile(zip_path, 'w') as zipf:
    zipf.write(streamlit_ollama_path, "rag_streamlit_with_ollama/app.py")

zip_path
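If you'd rather not spawn a subprocess for every question, Ollama also exposes a local REST API that the app could call instead. A minimal sketch, assuming the default endpoint http://localhost:11434, a non-streaming request, and the requests package:

import requests

def ask_ollama(prompt: str, model: str = "llama3") -> str:
    # Send a single non-streaming generation request to the local Ollama server
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]

In the Streamlit app, st.success(ask_ollama(full_prompt)) would then replace the subprocess.run block.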
After activating the virtual environment:
pip install streamlit sentence-transformers chromadb
streamlit run app.py
ollama run llama3
Folder Structure
project/
├── main.py # Your PDF -> ChromaDB pipeline
├── /pdfs/ # Drop PDFs here
├── /extracted_text/ # Auto-generated .txt files
├── /chunks/ # Chunked JSON
├── /ocr_output_pdfs/ # OCR'd PDF copies
├── /venv/ # Optional: Python virtualenv
└── /rag_streamlit_with_ollama/
    └── app.py # Streamlit UI