Day 78 of 80

Capstone: Polish & Documentation

Phase 9: Capstone & Career

What You'll Build Today

You have spent the last few weeks building an incredible Capstone project. The logic is sound, the AI is clever, and the results are impressive. But today, we are going to build the most important part of your project: The Packaging.

Today isn't about writing new Python logic. It is about turning a "coding experiment" into a "professional product." You will take your raw code and polish it until it shines, ensuring that anyone—especially hiring managers—can understand and run it without frustration.

Here is what you will master today:

* README Architecture: You will write documentation that sells your project before anyone even looks at the code. This is needed because recruiters often spend less than 30 seconds looking at a repository.

* Defensive Coding: You will refactor your code to handle errors gracefully. This is needed because a crash during a demo is the fastest way to lose credibility.

* Dependency Management: You will create a precise requirements.txt. This is needed so your project runs on other people's computers, not just yours.

* Visual Documentation: You will learn to create flowcharts using code (Mermaid.js). This is needed to prove you understand the system architecture, not just the syntax.

The Problem

Imagine a hiring manager finds your GitHub profile. They see a repository named final-capstone-v2. They are interested, so they download it to test it out.

They open their terminal and run your main script. Here is what happens:

# The Hiring Manager runs this:
python main.py

# And sees this output:
Traceback (most recent call last):
  File "main.py", line 4, in <module>
    df = pd.read_csv("C:/Users/YourName/Downloads/data_final.csv")
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/YourName/Downloads/data_final.csv'

They frown. They realize you hardcoded a file path that only exists on your laptop. They try to fix it, but then they hit another error:

ModuleNotFoundError: No module named 'langchain_community'

They don't know which version of LangChain you used. They don't know what other libraries are missing.

Finally, they open your code to see if they can figure it out manually. They see this:

# main.py
import os
# x is the api key
x = "sk-12345..." 
def do_stuff(t):
    # prints result
    print(t)

do_stuff("hello")

The Pain:

It crashes immediately. The code is fragile and assumes it is running on your specific computer.

It leaks secrets. Hardcoding API keys is a major security red flag.

It is vague. Function names like do_stuff and variables like x show a lack of care.

There are no instructions. The hiring manager has to guess how to run it.

After 2 minutes of struggle, the hiring manager closes the tab and moves to the next candidate. Your code works perfectly on your machine, but because it wasn't "packaged," your effort was wasted.

We need to fix this. We need to make your project bulletproof.

Let's Build It

We are going to take a "messy" script and step-by-step transform it into a professional artifact. While we will use a simple example script here, you will apply these exact steps to your actual Capstone project.

Step 1: The Messy Starting Point

Let's imagine this is a piece of your Capstone. It is a simple script that summarizes text.

Create a file named messy_script.py and paste this in:

import openai
import os

# BAD: Hardcoded key (never do this!)
os.environ["OPENAI_API_KEY"] = "sk-placeholder-key"

def get_summary(text):
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Summarize this: {text}"}]
    )
    return response.choices[0].message.content

# BAD: Hardcoded absolute path
with open("C:/Users/Student/Documents/input.txt", "r") as f:
    data = f.read()

print(get_summary(data))

If you try to run this, it will likely fail because that file path doesn't exist on your computer, or you don't have the library installed.

Step 2: Dependency Management

The first step to professionalism is ensuring others can install your tools. We use a requirements.txt file for this.

Instead of asking users to guess which libraries to install, we provide a list.

Create a file named requirements.txt.

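Add the libraries your project needs, with version constraints. A lower bound such as >= records the oldest version you tested against; pinning exact versions with == makes installs reproducible and protects against future breaking changes.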

File: requirements.txt

openai>=1.0.0
python-dotenv>=1.0.0

Now, anyone can set up your project by running:

pip install -r requirements.txt
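Before sharing the project, it can help to confirm that everything listed in requirements.txt is actually installed in the environment you are testing in. The following is a minimal sketch using only the standard library. The helper names `requirement_names` and `find_missing` are illustrative, and the parsing is deliberately naive: it reads only the package name before any version specifier and skips pip options such as `-r`.

```python
import re
from importlib import metadata


def requirement_names(requirements_text):
    """Return package names from requirements.txt content (naive parsing)."""
    names = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is everything before the first version operator or extra.
        match = re.match(r"[A-Za-z0-9._-]+", line)
        if match:
            names.append(match.group(0))
    return names


def find_missing(names):
    """Return the names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    sample = "openai>=1.0.0\npython-dotenv>=1.0.0\n"
    print("Missing packages:", find_missing(requirement_names(sample)))
```

This only checks that a package is present, not that its version satisfies the constraint. Running it inside a fresh virtual environment is the closest you can get to what a stranger will experience.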

Step 3: Removing Hardcoded Paths and Secrets

Never hardcode paths (like C:/Users...) or secrets. Use relative paths and environment variables.

Create a .env file for your secrets.

Use the os and pathlib modules to handle file paths dynamically.

File: .env

OPENAI_API_KEY=your-actual-api-key-here

File: clean_script.py (Part 1 - Setup)

import os
from pathlib import Path
from dotenv import load_dotenv
import openai

# 1. Load secrets safely
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    raise ValueError("API Key not found! Please check your .env file.")

# 2. Define paths relative to the script location, not your hard drive
# This gets the folder where this script lives
BASE_DIR = Path(__file__).parent 
# This looks for input.txt in that same folder
INPUT_FILE = BASE_DIR / "input.txt"
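To see why anchoring paths to the script's location matters, the sketch below resolves a file relative to an anchor file rather than the current working directory. `path_next_to` is a hypothetical helper name, not part of pathlib; in your own script you would pass `__file__` as the anchor.

```python
import os
import tempfile
from pathlib import Path


def path_next_to(anchor_file, *parts):
    """Build an absolute path relative to the folder containing anchor_file."""
    return Path(anchor_file).resolve().parent.joinpath(*parts)


if __name__ == "__main__":
    # Simulate a project folder with a script and its input file.
    with tempfile.TemporaryDirectory() as project:
        script = Path(project) / "clean_script.py"
        script.touch()
        (Path(project) / "input.txt").write_text("hello", encoding="utf-8")

        original_cwd = Path.cwd()
        os.chdir(tempfile.gettempdir())  # pretend the user ran it from elsewhere
        try:
            # A bare relative path depends on where the user happens to be.
            print("cwd-relative exists:", Path("input.txt").exists())
            # An anchored path finds the file regardless of working directory.
            print("anchored exists:", path_next_to(script, "input.txt").exists())
        finally:
            os.chdir(original_cwd)
```

The design choice is the same one Part 1 makes with BASE_DIR: the script's own location is the one thing guaranteed to travel with the repository.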

Step 4: Adding Defensive Error Handling

Professional code anticipates failure. What if the input file is missing? What if the API is down? We wrap dangerous operations in try/except blocks.

File: clean_script.py (Part 2 - Logic)

def get_summary_safe(text):
    try:
        client = openai.OpenAI()
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": f"Summarize this: {text}"}]
        )
        return response.choices[0].message.content
    except openai.APIConnectionError:
        return "Error: Could not connect to OpenAI. Check your internet."
    except openai.AuthenticationError:
        return "Error: Your API key is invalid."
    except Exception as e:
        return f"An unexpected error occurred: {e}"

def main():
    print("--- AI Summarizer Starting ---")
    
    # Defensive file reading
    if not INPUT_FILE.exists():
        print(f"Error: The file '{INPUT_FILE.name}' was not found.")
        print(f"Please create a file named 'input.txt' in this folder: {BASE_DIR}")
        return

    try:
        with open(INPUT_FILE, "r", encoding="utf-8") as f:
            data = f.read()
            
        if not data.strip():
            print("Error: The input file is empty.")
            return

        print("Analyzing text...")
        summary = get_summary_safe(data)
        print("\n--- Summary ---")
        print(summary)

    except Exception as e:
        print(f"File Error: {e}")

if __name__ == "__main__":
    main()

Create a dummy input.txt file in the same folder with some text, then run this script. Notice how it guides you if something is wrong, rather than crashing with a scary red traceback.
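One way to make this kind of defensive logic easier to verify is to move it into a small function that returns either the text or a user-facing message, so it can be exercised without calling the API. The sketch below is an assumption about how you might structure it; `read_input` and its return shape are illustrative and not part of the script above.

```python
import tempfile
from pathlib import Path


def read_input(path):
    """Return (text, error). Exactly one of the two is None."""
    path = Path(path)
    if not path.exists():
        return None, f"Error: The file '{path.name}' was not found."
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        # Covers unreadable files, directories, and non-UTF-8 content.
        return None, f"File Error: {exc}"
    if not text.strip():
        return None, "Error: The input file is empty."
    return text, None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = Path(folder) / "input.txt"
        print(read_input(target))  # file does not exist yet

        target.write_text("   \n", encoding="utf-8")
        print(read_input(target))  # whitespace only

        target.write_text("Some text to summarize.", encoding="utf-8")
        print(read_input(target))  # valid content
```

Because the function never prints or exits on its own, main() stays in charge of what the user sees, and each failure case can be checked with a temporary file.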

Step 5: The README.md

This is the face of your project. If you don't have a README, you don't have a project. We use Markdown for this.

Create a file named README.md. Here is a template you should adapt for your Capstone:

# [Project Name]: AI-Powered PDF Assistant

## 🚀 What It Does
This application allows users to upload PDF documents and chat with them using natural language. It uses OpenAI's GPT-4 for reasoning and FAISS for vector storage, ensuring accurate answers based solely on the document content.

## 🛠️ Technologies Used

* Python 3.10
* Streamlit (Frontend)
* LangChain (Orchestration)
* OpenAI API (LLM)

## 🏗️ Architecture

## 💻 How to Run It Locally

1. Clone the repository

       git clone https://github.com/yourusername/project-name.git
       cd project-name

2. Install dependencies

       pip install -r requirements.txt

3. Set up environment variables

   * Create a .env file in the root directory.
   * Add your API key: OPENAI_API_KEY=sk-...

4. Run the app

       streamlit run main.py

## 📸 Screenshots
![App Interface](https://via.placeholder.com/800x400?text=Screenshot+of+Your+App)
(Note: Replace the link above with an actual screenshot of your app)


Step 6: Architecture as Code (Mermaid.js)

You can generate diagrams directly in your README using Mermaid syntax. This looks incredibly professional on GitHub.

Add this section to your README.md:

## 🧠 Logic Flow

```mermaid
graph TD;
    A[User Uploads PDF] --> B[PyPDFLoader Extracts Text];
    B --> C[Text Splitter Chunks Data];
    C --> D[Embeddings Generator];
    D --> E[(Vector Database)];

    F[User Asks Question] --> E;
    E --> G[Retrieve Relevant Chunks];
    G --> H[LLM Generates Answer];
    H --> I[Display to User];
```

When you view this on GitHub, it will render as a flow chart!
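If your pipeline changes often, you might prefer to generate the diagram text from a list of steps instead of editing it by hand. The sketch below is one possible approach, not a Mermaid feature: `to_mermaid` is a hypothetical helper that turns (source, target) label pairs into flowchart text. It assumes labels contain no brackets or other characters that Mermaid treats specially.

```python
def to_mermaid(edges, direction="TD"):
    """Render (source_label, target_label) pairs as Mermaid flowchart text."""
    ids = {}  # label -> node id, assigned in order of first appearance

    def node_id(label):
        if label not in ids:
            ids[label] = f"N{len(ids)}"
        return ids[label]

    links = [(node_id(src), node_id(dst)) for src, dst in edges]

    lines = [f"graph {direction};"]
    # Declare every node once, then list the arrows between node ids.
    lines += [f"    {nid}[{label}];" for label, nid in ids.items()]
    lines += [f"    {src} --> {dst};" for src, dst in links]
    return "\n".join(lines)


if __name__ == "__main__":
    flow = [
        ("User Asks Question", "Vector Database"),
        ("Vector Database", "LLM Generates Answer"),
    ]
    print(to_mermaid(flow))
```

The printed text still has to be pasted into a mermaid code block in README.md for GitHub to render it.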

Now You Try

Now that you have practiced on the dummy script, it is time to apply this to your Main Capstone Project.

1. The "Clean Sweep" Refactor

Go through your Capstone code (specifically app.py or main.py).

* Action: Ensure every open() call uses pathlib or relative paths.
* Action: Ensure there are ZERO hardcoded API keys.
* Action: Add comments to complex logic blocks explaining *why* you did something, not just *what* you did.
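A quick scan can back up the manual review for hardcoded keys. The sketch below is a heuristic under stated assumptions: `find_hardcoded_keys` is an illustrative helper, and the pattern only looks for quoted strings beginning with "sk-", the prefix used in this lesson's examples. Keys from other providers, or keys split across lines, will not be caught.

```python
import re

# Heuristic only: a quoted string that starts with "sk-" followed by at least
# 8 key-like characters. This pattern is an assumption; adjust it per provider.
KEY_PATTERN = re.compile(r"""["']sk-[A-Za-z0-9_-]{8,}["']""")


def find_hardcoded_keys(source_text):
    """Return (line_number, line) pairs that look like hardcoded API keys."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if KEY_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits
```

To scan a project, you could loop over `Path(".").rglob("*.py")`, read each file, and print only the file name and line number for each hit, so the secret itself is not echoed to the terminal. A clean result does not prove the repository is safe; it only rules out the most obvious mistake.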

2. The Visual Upgrade

Take a screenshot of your application running.

* Action: If it's a CLI tool, take a screenshot of the terminal output. If it's Streamlit, screenshot the browser.
* Action: Upload this image to your repository (create a folder named assets/).
* Action: Link this image in your README.md so it appears at the very top. A screenshot shows visitors what the app does before they read a single line of text.

3. Record the Demo Video
Recruiters might not run the code, but they will watch a video.
*   Action: Record a screen capture (using Loom, OBS, or Zoom).
*   Constraint: Keep it to 2 minutes or less.
*   Script:
    1.  Introduce yourself and the problem you are solving (20s).
    2.  Show the app working (upload a file, ask a question) (60s).
    3.  Briefly show the code structure or architecture diagram (30s).
    4.  Conclusion (10s).
*   Action: Add the link to this video at the top of your README.

Challenge Project: The "Grandma Test"

Your challenge today is not code—it is user testing your documentation.

The Goal:

Verify that your README.md is so clear that a stranger can run your project without asking you a single question.



Instructions:

1.  Find a friend, family member, or peer who has not seen your project code before.
2.  Send them the link to your GitHub repository.
3.  Ask them to clone it and run it on their computer.
4.  The Hard Part: You are NOT allowed to speak, type for them, or explain anything. You must sit on your hands.
5.  Watch them. Note exactly where they get stuck.
    *   Did they fail to install Python?
    *   Did they forget the .env file because the instructions were buried?
    *   Did they get a version conflict?

Deliverable:
*   A list of "Friction Points" observed during the test.
*   Updates to your README.md that solve these specific friction points (e.g., adding a "Troubleshooting" section).


Hints:
*   If they don't know how to use terminal/git, your README should link to a "Prerequisites" guide.
*   If your app requires specific API keys (like Pinecone or Serper), make sure you explain where to get them in the README.
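Before handing the repository to your tester, a small automated check can catch the most obvious gaps in the README. The sketch below assumes your README marks sections with Markdown `#` headings and uses the section titles from the Step 5 template. `missing_sections` and `REQUIRED_SECTIONS` are illustrative names; adapt the list to your own project.

```python
import re

# Section titles taken from the README template in Step 5; treat this list
# as an assumption and adapt it to your own README.
REQUIRED_SECTIONS = [
    "What It Does",
    "Technologies Used",
    "How to Run It Locally",
    "Screenshots",
]


def missing_sections(readme_text, required=REQUIRED_SECTIONS):
    """Return required section titles with no matching Markdown heading."""
    headings = [
        match.group(1).lower()
        for match in re.finditer(r"^#{1,6}\s+(.*)$", readme_text, re.MULTILINE)
    ]
    return [
        title for title in required
        if not any(title.lower() in heading for heading in headings)
    ]


if __name__ == "__main__":
    sample = "# My App\n\n## 🚀 What It Does\nChat with PDFs.\n"
    print("Missing sections:", missing_sections(sample))
```

A check like this confirms that the sections exist, not that they are clear. Only the live test with a real person can tell you that.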

What You Learned

Today you shifted from being a "coder" to a "software engineer." You learned that code is only valuable if it is usable.

* Reproducibility: You used requirements.txt and relative paths to ensure your code works everywhere.

* Documentation: You treated your README as a product landing page.

* Defensive Coding: You handled errors so the user doesn't see ugly tracebacks.

Why This Matters:

In a professional AI engineering role, you will often hand off your code to DevOps engineers or other developers. If they can't run your code in 5 minutes, they will assume the code is broken. Good documentation is a career superpower.

Tomorrow:

Now that your project is polished and documented, we need to prepare you. Tomorrow, we dive into Interview Preparation, specifically focusing on how to explain the technical decisions you made in this Capstone.
