7. Ollama
This script will:
✅ Generate a Streamlit app that integrates RAG retrieval with a local LLM (Ollama) via a subprocess call.
✅ Save the app to /mnt/data/rag_streamlit_with_ollama/app.py.
✅ Zip the app into /mnt/data/rag_streamlit_with_ollama.zip for easy transfer/deployment.
To Run Everything:
1. Install requirements:
pip install streamlit sentence-transformers chromadb
2. Start the app:
streamlit run /mnt/data/rag_streamlit_with_ollama/app.py
3. Ensure Ollama is running:
Make sure Ollama is installed and a model (e.g., llama3) has been downloaded, then start it (a quick reachability check is shown after this list):
ollama run llama3
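If you want to confirm the server is reachable before opening the app, a small check like the one below works. This is a minimal sketch, assuming Ollama's default local port 11434 and the requests package (not part of the install list above):

import requests

try:
    r = requests.get("http://localhost:11434/api/tags", timeout=5)
    r.raise_for_status()
    # List the models Ollama has available locally
    models = [m["name"] for m in r.json().get("models", [])]
    print("Ollama is serving. Local models:", models)
except requests.RequestException as err:
    print("Ollama does not appear to be running:", err)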
📦 Output:
You’ll get a web app that:
- Provides an input box for the user’s question
- Embeds the query using MiniLM
- Fetches the most relevant chunks from ChromaDB (see the note on persistence after this list)
- Builds a prompt from the retrieved context
- Sends the prompt to ollama run llama3
- Displays the result in the browser
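Note on ChromaDB persistence: the app below creates an in-memory client with chromadb.Client(), which only sees the rag_chunks collection if it lives in that same store. If your main.py pipeline writes ChromaDB data to disk, point the app at the same location instead. A minimal sketch, assuming a recent chromadb version and a hypothetical shared ./chroma_db directory (use whatever path main.py actually uses):

import chromadb

# Must match the persistence path used by the ingestion pipeline (main.py)
chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_collection("rag_chunks")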
# Generate Streamlit app with direct integration to Ollama (via subprocess call)
from pathlib import Path
import zipfile

# Define the app path
app_path = Path("/mnt/data/rag_streamlit_with_ollama")
app_path.mkdir(parents=True, exist_ok=True)

# Updated Streamlit app with Ollama integration
streamlit_ollama_code = """
import streamlit as st
from sentence_transformers import SentenceTransformer
import chromadb
import subprocess

st.set_page_config(page_title="RAG with Ollama", layout="wide")
st.title("🧠 RAG + Ollama: Ask Questions from Your PDFs")

query = st.text_input("Ask a question from your documents:")

if query:
    model = SentenceTransformer('all-MiniLM-L6-v2')
    chroma_client = chromadb.Client()
    collection = chroma_client.get_collection("rag_chunks")

    # Embed query
    query_vector = model.encode(query).tolist()
    results = collection.query(query_embeddings=[query_vector], n_results=3)
    contexts = results['documents'][0]

    st.subheader("📄 Retrieved Contexts")
    for i, chunk in enumerate(contexts):
        st.markdown(f"**Chunk {i+1}:**")
        st.code(chunk, language="markdown")

    # Construct prompt
    context_str = "\\n---\\n".join(contexts)
    full_prompt = f\"\"\"Context:
{context_str}
Question: {query}
Answer:\"\"\"

    st.subheader("🧠 Prompt Sent to LLM")
    st.code(full_prompt, language="markdown")

    # Run with Ollama
    st.subheader("🤖 Response from Ollama")
    try:
        result = subprocess.run(
            ["ollama", "run", "llama3"],
            input=full_prompt,
            capture_output=True,
            text=True,
            check=True
        )
        st.success(result.stdout.strip())
    except subprocess.CalledProcessError as e:
        st.error("❌ Error running Ollama.")
        st.text(e.stderr)
else:
    st.info("Enter a question to get started.")
"""

# Save the app file
streamlit_ollama_path = app_path / "app.py"
streamlit_ollama_path.write_text(streamlit_ollama_code.strip())

# Zip it
zip_path = "/mnt/data/rag_streamlit_with_ollama.zip"
with zipfile.ZipFile(zip_path, 'w') as zipf:
    zipf.write(streamlit_ollama_path, "rag_streamlit_with_ollama/app.py")

zip_path
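If you'd rather not spawn a subprocess for every question, Ollama also exposes a local REST API that the app could call instead. A minimal sketch, assuming the default endpoint http://localhost:11434, a non-streaming request, and the requests package:

import requests

def ask_ollama(prompt: str, model: str = "llama3") -> str:
    # Send a single non-streaming generation request to the local Ollama server
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]

In the Streamlit app, st.success(ask_ollama(full_prompt)) would then replace the subprocess.run block.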
After activating the virtual environment:
pip install streamlit sentence-transformers chromadb
streamlit run app.py
ollama run llama3
Folder Structure
project/
├── main.py # Your PDF -> ChromaDB pipeline
├── /pdfs/ # Drop PDFs here
├── /extracted_text/ # Auto-generated .txt files
├── /chunks/ # Chunked JSON
├── /ocr_output_pdfs/ # OCR'd PDF copies
├── /venv/ # Optional: Python virtualenv
└── /rag_streamlit_with_ollama/
    └── app.py # Streamlit UI