Capstone: Building the MVP
What You'll Build Today
Welcome to Day 77! This is it. You have spent weeks learning Python syntax, understanding how Large Language Models (LLMs) work, mastering vector databases, and experimenting with prompt engineering. Yesterday, you planned your Capstone. Today, we break ground.
You are going to build the Minimum Viable Product (MVP) of your final application.
It is tempting to want to build the "perfect" version right away—a sleek interface, support for 50 file types, and complex multi-agent reasoning. But if you try to do that all at once, you will get stuck. Today is about establishing the "Walking Skeleton": a thin implementation of your system that connects the front end to the back end and actually runs.
Here is what we are focusing on:
* Modular Architecture: Why you must separate your visual interface (Streamlit) from your logic (RAG pipeline).
* Session State Management: Why your chatbot "forgets" conversation history and how to fix it.
* Iterative Integration: How to build one piece, test it, and then attach the next piece, rather than writing 500 lines of code and hoping it works.
Let's turn that design document into working software.
---
The Problem
Imagine you are excited to build your "Financial Report Analyzer." You sit down, open a blank Python file, and start coding. You want it to do everything: upload a PDF, split the text, embed it, store it, run a chat interface, and handle errors.
You end up writing a single, massive script like this:
```python
# monolithic_nightmare.py
import streamlit as st
from langchain_community.document_loaders import PyPDFLoader
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI
import os

# Trying to do everything in the global scope
st.title("My App")

uploaded_file = st.file_uploader("Upload PDF")

if uploaded_file:
    # PROBLEM 1: This runs every single time you click a button.
    # It re-processes the PDF constantly, wasting money and time.
    with open("temp.pdf", "wb") as f:
        f.write(uploaded_file.getbuffer())
    loader = PyPDFLoader("temp.pdf")
    pages = loader.load_and_split()

    # PROBLEM 2: Hard to debug. If embedding fails, the whole app crashes.
    embeddings = OpenAIEmbeddings()
    vectorstore = FAISS.from_documents(pages, embeddings)
    retriever = vectorstore.as_retriever()

    user_input = st.text_input("Ask a question")

    if user_input:
        # PROBLEM 3: Logic is mixed with UI.
        # You can't test the logic without running the web server.
        llm = ChatOpenAI()
        docs = retriever.invoke(user_input)
        context = "\n".join([d.page_content for d in docs])
        response = llm.invoke(f"Context: {context} Question: {user_input}")
        st.write(response.content)
```
The Pain: Every widget interaction reruns the entire script, so the PDF is re-loaded and re-embedded constantly, wasting time and money. If any single step fails, the whole app crashes. And because the logic is tangled into the UI, you cannot test it without running the web server.

There is a better way. We separate concerns. We build the engine first, then the car body.
---
Let's Build It
We are going to build a clean, modular RAG (Retrieval-Augmented Generation) system. We will split our project into two distinct files:
* backend.py: Handles the logic (PDF loading, Vector Store, LLM).
* app.py: Handles the user interface (Streamlit).

Step 1: Build the Backend Class
First, we create a class to manage our data. This allows us to keep the vector store in memory without reloading it constantly.
Create a file named backend.py.
```python
# backend.py
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


class RAGPipeline:
    def __init__(self):
        # Initialize core components.
        # We don't load data yet, just set up the tools.
        self.embeddings = OpenAIEmbeddings()
        self.llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
        self.vectorstore = None
        self.retriever = None
        self.chain = None

    def process_pdf(self, file_path):
        """
        Loads a PDF, splits it, and creates a vector store.
        Returns True if successful.
        """
        try:
            # 1. Load
            loader = PyPDFLoader(file_path)
            docs = loader.load()

            # 2. Split
            text_splitter = RecursiveCharacterTextSplitter(
                chunk_size=1000,
                chunk_overlap=200
            )
            splits = text_splitter.split_documents(docs)

            # 3. Store
            self.vectorstore = FAISS.from_documents(splits, self.embeddings)
            self.retriever = self.vectorstore.as_retriever()

            # 4. Build Chain
            self._build_chain()
            return True
        except Exception as e:
            print(f"Error processing PDF: {e}")
            return False

    def _build_chain(self):
        """Internal method to construct the RAG chain."""
        template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
        prompt = ChatPromptTemplate.from_template(template)

        def format_docs(docs):
            return "\n\n".join(doc.page_content for doc in docs)

        self.chain = (
            {"context": self.retriever | format_docs, "question": RunnablePassthrough()}
            | prompt
            | self.llm
            | StrOutputParser()
        )

    def ask(self, query):
        """
        Public method to ask a question.
        """
        if not self.chain:
            return "Please upload a document first."
        return self.chain.invoke(query)


# Simple test block to verify the backend works WITHOUT the UI
if __name__ == "__main__":
    # Create a dummy PDF for testing if needed, or point to a real one
    print("Backend test initialized. Point to a real PDF path to test.")
    # rag = RAGPipeline()
    # rag.process_pdf("sample.pdf")
    # print(rag.ask("Summarize this document."))
```
Why this matters: We have encapsulated all the complex logic. We can test this file independently. If this breaks, we know the issue is in the AI logic, not the web button.
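To prove that point, here is a minimal smoke test you could run with plain `python` and no Streamlit at all. It is a sketch: it assumes your `OPENAI_API_KEY` is set and that a small `sample.pdf` (a hypothetical filename) sits in the project folder.

```python
# test_backend.py -- quick smoke test of the backend, no UI involved.
# Assumes OPENAI_API_KEY is available and a small sample.pdf exists locally.
from backend import RAGPipeline

rag = RAGPipeline()

# Asking before any document is processed should fail gracefully, not crash.
assert rag.ask("Anything?") == "Please upload a document first."

# Happy path: ingest a real PDF, then query it.
assert rag.process_pdf("sample.pdf"), "PDF processing failed"
print(rag.ask("Summarize this document in one sentence."))
```

If this script misbehaves, you know the problem lives in the AI logic, long before any buttons or uploads enter the picture.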
Step 2: Set Up the Streamlit Interface
Now we build the UI. We will use st.session_state to keep our RAGPipeline alive between user interactions so we don't reload the PDF every time.
Create a file named app.py in the same folder.
```python
# app.py
import streamlit as st
import os
import tempfile
from backend import RAGPipeline  # Import our class

st.set_page_config(page_title="Capstone MVP", layout="wide")
st.title("📄 Document AI Assistant")

# --- SESSION STATE SETUP ---
# This ensures the backend object persists across reruns
if "rag_pipeline" not in st.session_state:
    st.session_state.rag_pipeline = RAGPipeline()

if "messages" not in st.session_state:
    st.session_state.messages = []

# --- SIDEBAR: CONFIGURATION ---
with st.sidebar:
    st.header("Setup")
    uploaded_file = st.file_uploader("Upload a PDF", type="pdf")

    if uploaded_file:
        # Save uploaded file to a temporary file.
        # Streamlit uploads are bytes; LangChain needs a file path.
        with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp_file:
            tmp_file.write(uploaded_file.getvalue())
            tmp_path = tmp_file.name

        if st.button("Process Document"):
            with st.spinner("Processing..."):
                success = st.session_state.rag_pipeline.process_pdf(tmp_path)

            if success:
                st.success("Document processed! You can now chat.")
                # Clean up temp file
                os.remove(tmp_path)
            else:
                st.error("Failed to process document.")

# --- MAIN CHAT INTERFACE ---
# Display chat history
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Handle user input
if prompt := st.chat_input("Ask a question about your document..."):
    # 1. Display user message immediately
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # 2. Get response from backend
    with st.chat_message("assistant"):
        message_placeholder = st.empty()

        # Call the backend
        with st.spinner("Thinking..."):
            response = st.session_state.rag_pipeline.ask(prompt)

        message_placeholder.markdown(response)

    # 3. Save assistant response to history
    st.session_state.messages.append({"role": "assistant", "content": response})
```
Step 3: Run and Verify
Open your terminal and run:
```bash
streamlit run app.py
```
What you should see:
* A clean interface with a sidebar.
* When you upload a PDF and click "Process," it takes a moment (loading/embedding), then says "Success."
* When you ask a question, the answer appears.
* Crucially: if you ask a second question, it answers instantly without re-processing the PDF. That is the power of Session State (the short demo below shows why).
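If the rerun behavior still feels abstract, this tiny standalone script (a hypothetical `rerun_demo.py`, not part of the capstone app) makes it visible: a plain variable resets on every interaction, while a value stored in `st.session_state` survives.

```python
# rerun_demo.py -- illustrates why st.session_state matters.
import streamlit as st

if "clicks" not in st.session_state:
    st.session_state.clicks = 0  # initialized once, survives every rerun

plain_counter = 0  # re-created (and reset to 0) on every rerun

if st.button("Click me"):
    st.session_state.clicks += 1
    plain_counter += 1

st.write(f"session_state counter: {st.session_state.clicks}")
st.write(f"plain variable counter: {plain_counter}")  # never exceeds 1
```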
Step 4: Iterative Improvement (Adding Source Citations)
Right now, the bot answers, but we don't know where the answer came from. Let's modify `backend.py` to return sources.

Update the `_build_chain` and `ask` methods in `backend.py`:
```python
# In backend.py -- replace the existing methods with these.
def _build_chain(self):
    template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
    prompt = ChatPromptTemplate.from_template(template)

    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)

    # The answer sub-chain formats the retrieved documents into the prompt...
    answer_chain = (
        RunnablePassthrough.assign(context=lambda x: format_docs(x["context"]))
        | prompt
        | self.llm
        | StrOutputParser()
    )

    # ...while the outer chain keeps the raw documents so we can cite them.
    self.chain = (
        {"context": self.retriever, "question": RunnablePassthrough()}
        | RunnablePassthrough.assign(answer=answer_chain)
    )

def ask(self, query):
    if not self.chain:
        return {"answer": "Please upload a document first.", "sources": []}
    result = self.chain.invoke(query)
    return {
        "answer": result["answer"],
        "sources": [doc.page_content[:100] + "..." for doc in result["context"]]
    }
```
Now, update `app.py` to handle this dictionary response:
```python
# Inside the chat handling block in app.py (replaces the old "Thinking..." section)
with st.spinner("Thinking..."):
    result_dict = st.session_state.rag_pipeline.ask(prompt)
    response_text = result_dict["answer"]
    sources = result_dict["sources"]

# Display answer
message_placeholder.markdown(response_text)

# Display sources in an expander
with st.expander("View Sources"):
    for s in sources:
        st.info(s)

# Update history with just the text answer
st.session_state.messages.append({"role": "assistant", "content": response_text})
```
By making small changes to the backend and then the frontend, we keep control of the code.
---
Now You Try
You have a working RAG skeleton. Now apply this to your specific capstone idea.
Customize the System Prompt:
Go into `backend.py`. Change the template string. Give your AI a persona. Is it a "Strict Legal Assistant"? A "Friendly Tutor"? Add specific instructions like "If you don't know, say 'I am not sure'."
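For instance, a tutor-flavored template might look like this (a sketch; adjust the wording to your own project):

```python
# In _build_chain() -- one possible persona-flavored template (illustrative only)
template = """You are a friendly, patient tutor.
Answer the question based only on the following context.
If the context does not contain the answer, say "I am not sure."

Context:
{context}

Question: {question}
"""
```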
Add a "Clear Chat" Button:
In the sidebar of
app.py, add a button that clears st.session_state.messages. This helps when you want to start a fresh topic without refreshing the whole page (which would lose the processed PDF).
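A minimal version might look like this (assuming you keep the button inside the existing sidebar block):

```python
# In the sidebar of app.py -- a minimal "Clear Chat" button (sketch)
if st.button("Clear Chat"):
    st.session_state.messages = []  # wipe chat history, keep the processed PDF
    st.rerun()                      # redraw the page with an empty chat
```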
Handle Different File Types:
If your capstone involves text files or CSVs, modify the `process_pdf` method in `backend.py` (rename it to `process_file`) to check the file extension and use the appropriate LangChain loader (e.g., TextLoader or CSVLoader).
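One way that check could look (a sketch; swap in whichever loaders your data actually needs):

```python
# In backend.py -- a sketch of process_file with a simple extension check.
from langchain_community.document_loaders import PyPDFLoader, TextLoader, CSVLoader

def process_file(self, file_path):
    if file_path.lower().endswith(".pdf"):
        loader = PyPDFLoader(file_path)
    elif file_path.lower().endswith(".csv"):
        loader = CSVLoader(file_path)
    else:
        loader = TextLoader(file_path, encoding="utf-8")

    docs = loader.load()
    # From here, splitting / storing / chain-building are identical to process_pdf.
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    splits = text_splitter.split_documents(docs)
    self.vectorstore = FAISS.from_documents(splits, self.embeddings)
    self.retriever = self.vectorstore.as_retriever()
    self._build_chain()
    return True
```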
---
Challenge Project: The End-to-End Capstone Alpha
Your goal today is not to finish your capstone, but to finish the Alpha version.
Requirements:
* Data Ingestion: The app must accept the specific data type relevant to your project (PDF, URL, Text, JSON).
* Storage: It must successfully create a vector store from that data.
* Retrieval: It must retrieve relevant context based on a user query.
* Generation: It must produce a coherent answer using an LLM.
* Persistence: The chat history must persist as long as the browser tab is open (using Session State).
* Safety: It must not crash if the user asks a question before uploading a file (see the sketch after this list).
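The backend's `ask` method already returns a polite fallback instead of crashing. If you also want to block input on the frontend, one possible approach is to disable the chat box until a document is ready, using a hypothetical `doc_ready` flag you set right after `process_pdf` succeeds:

```python
# In app.py -- optional frontend guard (sketch).
# Set st.session_state.doc_ready = True in the sidebar after processing succeeds.
if "doc_ready" not in st.session_state:
    st.session_state.doc_ready = False

prompt = st.chat_input(
    "Ask a question about your document...",
    disabled=not st.session_state.doc_ready,  # greyed out until a PDF is processed
)
if prompt:
    ...  # existing chat handling block
```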
Example Scenario:
If you are building a "Recipe Helper":
* Input: User uploads a PDF cookbook.
* Action: User asks, "How do I make lasagna?"
* Output: The bot replies with the recipe from *that specific book*, not general internet knowledge.
Hint: If you get stuck on specific errors (like API connection issues), check your .env file loading. Ensure load_dotenv() is called at the very top of backend.py.
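In practice, that means something like this (assuming the `python-dotenv` package is installed and a `.env` file sits next to your code):

```python
# Very top of backend.py -- load OPENAI_API_KEY (and friends) from .env
# Example .env line:  OPENAI_API_KEY=your-key-here
from dotenv import load_dotenv

load_dotenv()
```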
---
What You Learned
Today you moved from planning to execution. You learned:
* Separation of Concerns: Keeping logic (`backend.py`) separate from presentation (`app.py`) makes debugging infinitely easier.
* State Management: Using `st.session_state` to prevent your application from "forgetting" data every time the user interacts with it.
* The MVP Mindset: Building the simplest thing that works end-to-end before trying to make it perfect.
Why This Matters: In the real world, AI applications are rarely single scripts. They are systems composed of APIs, databases, and frontends. The architecture you built today—separating the "Brain" (RAG Pipeline) from the "Face" (Streamlit)—is the exact same architecture used in enterprise applications, just on a smaller scale.
Tomorrow: We focus on Polish and Presentation. You have a working skeleton; tomorrow we add error handling, better formatting, and prepare the narrative for your final demo.