Capstone: Polish & Documentation
Curriculum for Day 78.
What You'll Build Today
You have spent the last few weeks building an incredible Capstone project. The logic is sound, the AI is clever, and the results are impressive. But today, we are going to build the most important part of your project: The Packaging.
Today isn't about writing new Python logic. It is about turning a "coding experiment" into a "professional product." You will take your raw code and polish it until it shines, ensuring that anyone—especially hiring managers—can understand and run it without frustration.
Here is what you will master today:
* README Architecture: You will write documentation that sells your project before anyone even looks at the code. This is needed because recruiters often spend less than 30 seconds looking at a repository.
* Defensive Coding: You will refactor your code to handle errors gracefully. This is needed because a crash during a demo is the fastest way to lose credibility.
* Dependency Management: You will create a precise requirements.txt. This is needed so your project runs on other people's computers, not just yours.
* Visual Documentation: You will learn to create flowcharts using code (Mermaid.js). This is needed to prove you understand the system architecture, not just the syntax.
The Problem
Imagine a hiring manager finds your GitHub profile. They see a repository named final-capstone-v2. They are interested, so they download it to test it out.
They open their terminal and run your main script. Here is what happens:
# The Hiring Manager runs this:
python main.py
# And sees this output:
Traceback (most recent call last):
  File "main.py", line 4, in <module>
    df = pd.read_csv("C:/Users/YourName/Downloads/data_final.csv")
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/YourName/Downloads/data_final.csv'
They frown. They realize you hardcoded a file path that only exists on your laptop. They try to fix it, but then they hit another error:
ModuleNotFoundError: No module named 'langchain_community'
They don't know which version of LangChain you used. They don't know what other libraries are missing.
Finally, they open your code to see if they can figure it out manually. They see this:
# main.py
import os
# x is the api key
x = "sk-12345..."
def do_stuff(t):
    # prints result
    print(t)
do_stuff("hello")
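The Pain:
* Hardcoded paths: The script looks for a file that only exists on your laptop.
* Missing dependencies: Nobody can tell which libraries or versions to install.
* Exposed secrets: The API key sits in plain text in the source code.
* Poor naming: Functions like do_stuff and variables like x show a lack of care.
After 2 minutes of struggle, the hiring manager closes the tab and moves to the next candidate. Your code works perfectly on your machine, but because it wasn't "packaged," your effort was wasted.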
We need to fix this. We need to make your project bulletproof.
Let's Build It
We are going to take a "messy" script and step-by-step transform it into a professional artifact. While we will use a simple example script here, you will apply these exact steps to your actual Capstone project.
Step 1: The Messy Starting Point
Let's imagine this is a piece of your Capstone. It is a simple script that summarizes text.
Create a file named messy_script.py and paste this in:
import openai
import os

# BAD: Hardcoded key (never do this!)
os.environ["OPENAI_API_KEY"] = "sk-placeholder-key"

def get_summary(text):
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Summarize this: {text}"}]
    )
    return response.choices[0].message.content

# BAD: Hardcoded absolute path
with open("C:/Users/Student/Documents/input.txt", "r") as f:
    data = f.read()
print(get_summary(data))
If you try to run this, it will likely fail because that file path doesn't exist on your computer, or you don't have the library installed.
Step 2: Dependency Management
The first step to professionalism is ensuring others can install your tools. We use a requirements.txt file for this.
Instead of asking users to guess which libraries to install, we provide a list.
Create a file named requirements.txt.
File: requirements.txt
openai>=1.0.0
python-dotenv>=1.0.0
Now, anyone can set up your project by running:
pip install -r requirements.txt
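If you want a quick sanity check that the install worked, a small script can compare the names in requirements.txt against what is installed. The following is a minimal sketch, not part of the original project: the function name missing_packages is hypothetical, it only checks that each package is present (it does not verify version constraints), and it assumes simple "name>=version" style lines like the ones above.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def missing_packages(requirement_lines):
    """Return the package names from requirement lines that are not installed."""
    missing = []
    for line in requirement_lines:
        # Drop inline comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Assumption: the name is everything before the first version
        # operator, extras bracket, environment marker, or space.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0]
        try:
            version(name)
        except PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The two requirement lines from this lesson, passed in directly.
    print(missing_packages(["openai>=1.0.0", "python-dotenv>=1.0.0"]))
```

An empty list means every listed package was found. Anything in the list is a package the user still needs to install.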
Step 3: Removing Hardcoded Paths and Secrets
Never hardcode paths (like C:/Users...) or secrets. Use relative paths and environment variables.
1. Create a .env file for your secrets.
2. Use the os and pathlib modules to handle file paths dynamically.
File: .env
OPENAI_API_KEY=your-actual-api-key-here
File: clean_script.py (Part 1 - Setup)
import os
from pathlib import Path
from dotenv import load_dotenv
import openai
# 1. Load secrets safely
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("API Key not found! Please check your .env file.")
# 2. Define paths relative to the script location, not your hard drive
# This gets the folder where this script lives
BASE_DIR = Path(__file__).parent
# This looks for input.txt in that same folder
INPUT_FILE = BASE_DIR / "input.txt"
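If your Capstone needs more than one secret, the same "check and explain" pattern can be wrapped in a small helper. This is a sketch under assumptions: require_env is a hypothetical name, and DEMO_SERVICE_KEY is a made-up variable used only to demonstrate the behavior.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a helpful error."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Missing environment variable {name}. "
            f"Add a line starting with {name}= to your .env file."
        )
    return value


if __name__ == "__main__":
    # Hypothetical variable, set here only so the demo runs anywhere.
    os.environ["DEMO_SERVICE_KEY"] = "demo-value"
    print(require_env("DEMO_SERVICE_KEY"))
```

In a real script you would call load_dotenv() first and then call require_env("OPENAI_API_KEY"), so the error message names the exact variable that is missing.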
Step 4: Adding Defensive Error Handling
Professional code anticipates failure. What if the input file is missing? What if the API is down? We wrap dangerous operations in try/except blocks.
File: clean_script.py (Part 2 - Logic)
def get_summary_safe(text):
    try:
        client = openai.OpenAI()
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": f"Summarize this: {text}"}]
        )
        return response.choices[0].message.content
    except openai.APIConnectionError:
        return "Error: Could not connect to OpenAI. Check your internet."
    except openai.AuthenticationError:
        return "Error: Your API key is invalid."
    except Exception as e:
        return f"An unexpected error occurred: {e}"

def main():
    print("--- AI Summarizer Starting ---")
    # Defensive file reading
    if not INPUT_FILE.exists():
        print(f"Error: The file '{INPUT_FILE.name}' was not found.")
        print(f"Please create a file named 'input.txt' in this folder: {BASE_DIR}")
        return
    try:
        with open(INPUT_FILE, "r") as f:
            data = f.read()
        if not data.strip():
            print("Error: The input file is empty.")
            return
        print("Analyzing text...")
        summary = get_summary_safe(data)
        print("\n--- Summary ---")
        print(summary)
    except Exception as e:
        print(f"File Error: {e}")

if __name__ == "__main__":
    main()
Create a dummy input.txt file in the same folder with some text, then run this script. Notice how it guides you if something is wrong, rather than crashing with a scary red traceback.
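The file-reading checks in main() can also be pulled into a function that you can test without calling the API. The following is one possible sketch, not the only way to structure it: read_text_or_explain is a hypothetical helper, and it assumes your input files are UTF-8 text.

```python
from pathlib import Path


def read_text_or_explain(path):
    """Return (text, None) on success, or (None, message) if the file is unusable."""
    path = Path(path)
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return None, f"'{path.name}' was not found in {path.parent}."
    except PermissionError:
        return None, f"'{path.name}' exists but cannot be read (permission denied)."
    except UnicodeDecodeError:
        return None, f"'{path.name}' is not valid UTF-8 text."
    except OSError as exc:
        # Covers remaining cases, such as the path pointing at a folder.
        return None, f"'{path.name}' could not be read: {exc}"
    if not text.strip():
        return None, f"'{path.name}' is empty."
    return text, None


if __name__ == "__main__":
    text, error = read_text_or_explain(Path(__file__).parent / "input.txt")
    print(error if error else text)
```

Returning a message instead of printing inside the helper keeps the decision about how to show the error (terminal, Streamlit, log file) in one place.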
Step 5: The README.md
This is the face of your project. If you don't have a README, you don't have a project. We use Markdown for this.
Create a file named README.md. Here is a template you should adapt for your Capstone:
## 🚀 What It Does
This application allows users to upload PDF documents and chat with them using natural language. It uses OpenAI's GPT-4 for reasoning and FAISS for vector storage, ensuring accurate answers based solely on the document content.
## 🛠️ Technologies Used
- Python 3.10
- Streamlit (Frontend)
- LangChain (Orchestration)
- OpenAI API (LLM)
## 🏗️ Architecture
## 💻 How to Run It Locally
1. Clone the repository
```bash
git clone https://github.com/yourusername/project-name.git
cd project-name
```
2. Install dependencies
```bash
pip install -r requirements.txt
```
3. Set up environment variables
- Create a .env file in the root directory.
- Add your API key: OPENAI_API_KEY=sk-...
4. Run the app
```bash
streamlit run main.py
```
## 📸 Screenshots
(Note: Add a link to an actual screenshot of your app here)
Step 6: Architecture as Code (Mermaid.js)
You can generate diagrams directly in your README using Mermaid syntax. This looks incredibly professional on GitHub.
Add this section to your README.md:
## 🧠 Logic Flow
```mermaid
graph TD;
    A[User Uploads PDF] --> B[PyPDFLoader Extracts Text];
    B --> C[Text Splitter Chunks Data];
    C --> D[Embeddings Generator];
    D --> E[(Vector Database)];
    F[User Asks Question] --> E;
    E --> G[Retrieve Relevant Chunks];
    G --> H[LLM Generates Answer];
    H --> I[Display to User];
```
When you view this on GitHub, it will render as a flow chart!
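If your pipeline changes often, you may prefer to generate the diagram text from a list of steps instead of editing it by hand. Here is a minimal sketch of that idea: to_mermaid is a hypothetical helper, the node ids (N0, N1, ...) are generated automatically, and it assumes your labels contain no double quotes.

```python
def to_mermaid(edges, direction="TD"):
    """Build Mermaid flowchart text from (source_label, target_label) pairs."""
    ids = {}
    lines = [f"graph {direction};"]

    def node(label):
        # Give each distinct label a short id the first time it appears.
        if label not in ids:
            ids[label] = f"N{len(ids)}"
        return f'{ids[label]}["{label}"]'

    for source, target in edges:
        lines.append(f"    {node(source)} --> {node(target)};")
    return "\n".join(lines)


if __name__ == "__main__":
    steps = [
        ("User Uploads PDF", "Extract Text"),
        ("Extract Text", "Chunk Text"),
    ]
    print(to_mermaid(steps))
```

Paste the printed text into the mermaid block of your README. This sketch only produces rectangular nodes, so special shapes like the database cylinder still need a manual edit.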
Now You Try
Now that you have practiced on the dummy script, it is time to apply this to your Main Capstone Project.
1. The "Clean Sweep" Refactor
Go through your Capstone code (specifically app.py or main.py).
* Action: Ensure every open() call uses pathlib or relative paths.
* Action: Ensure there are ZERO hardcoded API keys.
* Action: Add comments to complex logic blocks explaining why you did something, not just what you did.
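To double-check the "ZERO hardcoded API keys" rule, you can scan your source text for strings that look like keys. This is a rough sketch under a stated assumption: the pattern below treats "sk-" followed by at least 20 letters, digits, underscores, or hyphens as a likely key. Real key formats vary by provider, so treat a clean result as a hint, not proof. The name find_hardcoded_keys is hypothetical.

```python
import re

# Assumption: a likely key is "sk-" followed by 20 or more key characters.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]{20,}")


def find_hardcoded_keys(source_text):
    """Return (line_number, stripped_line) pairs that match KEY_PATTERN."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if KEY_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Build a fake key at runtime so this file never contains a key-like literal.
    fake_key = "sk-" + "A" * 24
    sample = f'import os\nx = "{fake_key}"\nkey = os.getenv("OPENAI_API_KEY")\n'
    for number, line in find_hardcoded_keys(sample):
        print(f"line {number}: {line}")
```

To scan your project, read each .py file's text and pass it to the function, then move anything it flags into your .env file.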
2. The Visual Upgrade
Take a screenshot of your application running.
* Action: If it's a CLI tool, take a screenshot of the terminal output. If it's Streamlit, screenshot the browser.
* Action: Upload this image to your repository (create a folder named assets/).
* Action: Link this image in your README.md so it appears at the very top. A screenshot shows what your app does much faster than a paragraph of text can.
3. Record the Demo Video
Recruiters might not run the code, but they will watch a video.
* Action: Record a screen capture (using Loom, OBS, or Zoom).
* Constraint: Keep it under 2 minutes.
* Script:
1. Introduce yourself and the problem you are solving (20s).
2. Show the app working (upload a file, ask a question) (60s).
3. Briefly show the code structure or architecture diagram (30s).
4. Conclusion (10s).
* Action: Add the link to this video at the top of your README.
Challenge Project: The "Grandma Test"
Your challenge today is not code—it is user testing your documentation.
The Goal:
Verify that your README.md is so clear that a stranger can run your project without asking you a single question.
Instructions:
1. Find a friend, family member, or peer who has not seen your project code before.
2. Send them the link to your GitHub repository.
3. Ask them to clone it and run it on their computer.
4. The Hard Part: You are NOT allowed to speak, type for them, or explain anything. You must sit on your hands.
5. Watch them. Note exactly where they get stuck.
* Did they fail to install Python?
* Did they forget the .env file because the instructions were buried?
* Did they get a version conflict?
Deliverable:
* A list of "Friction Points" observed during the test.
* Updates to your README.md that solve these specific friction points (e.g., adding a "Troubleshooting" section).
Hints:
* If they don't know how to use terminal/git, your README should link to a "Prerequisites" guide.
* If your app requires specific API keys (like Pinecone or Serper), make sure you explain where to get them in the README.
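Some friction points can be caught by the project itself before your tester ever gets stuck. The sketch below is one hypothetical way to do that: preflight is an assumed helper name, the default minimum version (3.10) mirrors the README template above, and the default file list (.env and requirements.txt) should be adjusted to whatever your project actually needs.

```python
import sys
from pathlib import Path


def preflight(project_dir, min_python=(3, 10), required_files=(".env", "requirements.txt")):
    """Return a list of human-readable setup problems (an empty list means ready)."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        running = sys.version.split()[0]
        problems.append(f"Python {wanted} or newer is required; you are running {running}.")
    for name in required_files:
        if not (Path(project_dir) / name).is_file():
            problems.append(f"Missing file: {name}. See the setup steps in README.md.")
    return problems


if __name__ == "__main__":
    for problem in preflight(Path.cwd()):
        print(f"Setup problem: {problem}")
```

Running a check like this at the start of your main script turns a confusing traceback into a message that points the user back to the README.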
What You Learned
Today you shifted from being a "coder" to a "software engineer." You learned that code is only valuable if it is usable.
* Reproducibility: You used requirements.txt and relative paths to ensure your code works everywhere.
* Documentation: You treated your README as a product landing page.
* Defensive Coding: You handled errors so the user doesn't see ugly tracebacks.
Why This Matters: In a professional AI engineering role, you will often hand off your code to DevOps engineers or other developers. If they can't run your code in 5 minutes, they will assume the code is broken. Good documentation is a career superpower.
Tomorrow: Now that your project is polished and documented, we need to prepare you. Tomorrow, we dive into Interview Preparation, specifically focusing on how to explain the technical decisions you made in this Capstone.