Day 76 of 80

Capstone: Planning & Architecture

Phase 9: Capstone & Career

What You'll Build Today

Welcome to Day 76. You have spent the last 75 days learning Python, APIs, databases, vector stores, and prompt engineering. You are now a builder.

Today, you start your Capstone Project. This is the centerpiece of your portfolio. It is the proof that you can not only write code but also architect a solution to a real-world problem.

We are not writing the application logic today. Instead, we are building the Blueprint.

You will learn:

* Problem Scoping: Why you must define exactly what you are solving before you type import.

* User Persona Definition: Why knowing *who* uses your app dictates *how* you build it.

* MVP (Minimum Viable Product) Strategy: Why building fewer features makes for a better project.

* System Architecture: How to draw the map of your data flow so you don't get lost.

* Tech Stack Selection: How to choose the right tools (and justify those choices).

This is the difference between a "tutorial follower" and a "software engineer." Let's get to work.

---

The Problem

You might be tempted to skip this day. You might think, "I know what I want to build, I'll just start coding."

This is the "Spaghetti Trap."

When you start coding without a plan, your code usually ends up looking like a single, massive file where logic, user interface, and database connections are hopelessly tangled.

Here is what it looks like when you don't plan your architecture. This snippet comes from a hypothetical project where a student tried to build a "Chat with PDF" app on the fly.

The "Spaghetti Code" Nightmare

```python
# spaghetti_bot.py
# The result of "coding before planning"

import streamlit as st
import openai
import sqlite3
import os

# PROBLEM 1: Hardcoding configurations everywhere
client = openai.OpenAI(api_key="sk-...")

st.title("My PDF Chat")

# PROBLEM 2: Database logic mixed with UI code
conn = sqlite3.connect('my_db.db')
c = conn.cursor()
c.execute('''CREATE TABLE IF NOT EXISTS chats (msg text)''')

uploaded_file = st.file_uploader("Upload PDF")

if uploaded_file:
    # PROBLEM 3: Heavy processing logic blocking the UI thread
    # Imagine 100 lines of PDF parsing code right here...
    text = "extracted text..."

user_input = st.text_input("Ask a question")

if user_input:
    # PROBLEM 4: Direct API calls mixed with presentation logic
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": user_input + text}]
    )

    # PROBLEM 5: No separation of concerns.
    # If you want to change the database later, you have to rewrite the UI.
    c.execute(f"INSERT INTO chats VALUES ('{user_input}')")
    conn.commit()
    st.write(response.choices[0].message.content)
```

Why this hurts

  • Unmaintainable: If you want to switch from SQLite to Pinecone, you have to dig through UI code to find the database lines.
  • Unscalable: As you add features (like user login or history), this file will grow to 2,000 lines. You will be afraid to touch it.
  • Expensive: Without architectural planning, you might accidentally re-embed the PDF every time the user clicks a button, costing you API credits (the caching sketch below shows the usual fix).
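
To make the "Expensive" point concrete: Streamlit reruns your whole script on every interaction, so unplanned parsing code runs again on every click. A minimal sketch of the usual fix, using Streamlit's st.cache_data (parse_pdf is a hypothetical helper):

```python
import streamlit as st

@st.cache_data(show_spinner=False)
def parse_pdf(file_bytes: bytes) -> str:
    # Imagine the expensive PDF parsing / embedding work here.
    # Streamlit reruns the whole script on every interaction, but a
    # cached function only re-executes when its inputs change.
    return "extracted text..."

uploaded = st.file_uploader("Upload PDF")
if uploaded:
    text = parse_pdf(uploaded.getvalue())
```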

There is a better way. We separate the Plan from the Code.

---

Let's Build It

We are going to define a Capstone Project together. For this tutorial, we will plan a project called "JobHunter AI": an assistant that analyzes your resume against a job description and suggests improvements.

You will follow this process for your own unique idea in the Challenge section.

Step 1: The Problem Statement & User Persona

Before we choose a database, we must define the "Who" and the "Why."

The Problem: Job seekers apply to dozens of jobs but get rejected by Applicant Tracking Systems (ATS) because their resumes don't match specific keywords in the job description. Customizing resumes manually takes too long.

The User: "Alex," a mid-level marketing manager looking for a new role. Alex is non-technical, stressed, and needs quick, actionable advice, not complex charts.

Why this step matters:

If we didn't define Alex as "non-technical," we might have built a complex dashboard with JSON outputs. Since Alex is non-technical, we know we need a simple Chat Interface.

Step 2: The MVP (Minimum Viable Product)

We cannot build everything in one week. We must ruthlessly cut features. We will categorize features into "Must Have" (MVP) and "Nice to Have" (V2).

Must Have (MVP):

  • Paste text of a Resume.
  • Paste text of a Job Description.
  • Generate a "Match Score" (0-100).
  • List missing keywords.
  • Generate 3 bullet points to improve the resume.

Nice to Have (Cut for now):

  • Upload PDF (Parsing PDFs is hard, text paste is easier for MVP).
  • Save user history (Requires user authentication/login system).
  • Directly apply to LinkedIn (Too complex).

Why this step matters:

By cutting "PDF Upload" and "User Accounts," we just saved ourselves roughly 15 hours of coding. We can add them later if the core AI logic works.

Step 3: System Architecture

Now we map how data moves. We will not write the app code yet, but we will write a Python script that generates our project structure based on a clean architecture.

We will use a standard pattern (sketched as a flow right after this list):

  • Frontend (UI): Streamlit.
  • Orchestrator (Logic): Python backend functions.
  • LLM Service: OpenAI.
  • Data: In-memory (variables) for the MVP to keep it simple.
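
Traced as a flow, a single user request moves through these layers (module names match the scaffold we are about to generate):

```text
User (browser)
  -> app/main.py (Streamlit UI)
  -> core/resume_analyzer.py (orchestration)
  -> core/llm_interface.py -> OpenAI API
  -> result back up to the Streamlit display
```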

Let's write a script to set up this architecture physically.

```python
import os


def create_project_structure(project_name):
    """
    Creates a clean, architectural folder structure for the Capstone.
    """
    base_path = f"./{project_name}"

    # Define the folder structure
    structure = {
        "app": ["__init__.py", "main.py", "ui_components.py"],
        "core": ["__init__.py", "resume_analyzer.py", "llm_interface.py"],
        "data": ["__init__.py", "sample_resume.txt", "sample_job_desc.txt"],
        "utils": ["__init__.py", "config.py", "prompts.py"],
        ".": ["requirements.txt", "README.md", ".env"]
    }

    print(f"🏗️ Scaffolding project: {project_name}...")

    # Create directories and files
    for folder, files in structure.items():
        # Create folder path
        if folder == ".":
            folder_path = base_path
        else:
            folder_path = os.path.join(base_path, folder)
        os.makedirs(folder_path, exist_ok=True)

        # Create files inside
        for file in files:
            file_path = os.path.join(folder_path, file)
            with open(file_path, 'w') as f:
                # Add a docstring to explain the file's purpose
                if file.endswith(".py"):
                    f.write(f'"""\nModule: {file}\nPurpose: [Write purpose here]\n"""\n\n')
                elif file == "README.md":
                    f.write(f"# {project_name}\n\n## Problem Statement\n\n## MVP Features\n")
                elif file == ".env":
                    f.write("OPENAI_API_KEY=sk-...\n")
            print(f" Created: {file_path}")

    print("\n✅ Project structure ready. No spaghetti code allowed!")


# Run the scaffolder
create_project_structure("JobHunter_AI")
```

Run this code. You will see a new folder JobHunter_AI appear.
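
The console output will look roughly like this (a truncated sketch; the real run prints one Created line per file):

```text
🏗️ Scaffolding project: JobHunter_AI...
 Created: ./JobHunter_AI/app/__init__.py
 Created: ./JobHunter_AI/app/main.py
 Created: ./JobHunter_AI/app/ui_components.py
 ...
✅ Project structure ready. No spaghetti code allowed!
```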

Why this step matters:

We have physically separated the ui (Streamlit) from the core (Logic).

* app/main.py will handle buttons and text inputs.

* core/resume_analyzer.py will handle the thinking.

* utils/prompts.py will store our prompt templates so they don't clutter the code (a sketch follows).
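
To see why utils/prompts.py earns its own file, here is a minimal sketch of what it might contain (the template name and wording are placeholders, not something we write until the backend day):

```python
# utils/prompts.py -- a sketch; ANALYZE_MATCH_PROMPT is a placeholder name
ANALYZE_MATCH_PROMPT = """\
You are a resume coach. Compare the RESUME to the JOB DESCRIPTION.
Return JSON with keys: score (0-100), missing_keywords (list of strings),
and advice (a short string).

RESUME:
{resume}

JOB DESCRIPTION:
{job_desc}
"""
```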

Step 4: Tech Stack Rationale

You must justify your tools. Here is the rationale for JobHunter AI (a matching requirements.txt sketch follows the list):

  • Language: Python (Great for AI libraries).
  • Frontend: Streamlit (Fastest way to build data UIs, no HTML/CSS needed).
  • LLM: OpenAI GPT-4o-mini (Cost-effective, smart enough for text analysis).
  • Orchestration: LangChain (Optional, but good for managing prompt templates).
  • Deployment: Streamlit Cloud (Free, easy GitHub integration).
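
That rationale translates almost line-for-line into a first requirements.txt. A sketch (python-dotenv is my addition here, assumed for loading the .env file the scaffolder creates; pin versions when you install):

```text
streamlit
openai
langchain        # optional, per the rationale above
python-dotenv    # assumed: loads OPENAI_API_KEY from .env
```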
Step 5: Defining the Interface (Pseudo-code)

Before we write the real backend tomorrow, let's sketch the app/main.py to see if our architecture makes sense.

We will write this into JobHunter_AI/app/main.py.

    """
    

    Module: main.py

    Purpose: The entry point for the Streamlit application.

    ONLY handles UI logic. No business logic here.

    """

    import streamlit as st

    # Notice we import from our 'core' module (even though we haven't written it yet) # from core.resume_analyzer import analyze_match

    def render_header():

    st.header("JobHunter AI 🎯")

    st.markdown("Optimize your resume for any job description.")

    def render_inputs():

    col1, col2 = st.columns(2)

    with col1:

    resume = st.text_area("Paste Resume Text", height=300)

    with col2:

    job_desc = st.text_area("Paste Job Description", height=300)

    return resume, job_desc

    def main():

    render_header()

    resume, job_desc = render_inputs()

    if st.button("Analyze Match"):

    if not resume or not job_desc:

    st.warning("Please provide both texts.")

    return

    with st.spinner("Analyzing keywords..."):

    # This function doesn't exist yet, but we know we NEED it. # result = analyze_match(resume, job_desc) # Placeholder for today

    st.success("Analysis Complete (Mock)")

    st.json({

    "score": 85,

    "missing_keywords": ["Python", "SQL"],

    "advice": "Add more metrics to your experience."

    })

    if __name__ == "__main__":

    main()

Why this step matters:

This is "Interface-Driven Development." We defined how the app looks and what data it needs from the backend. Now, we have a clear specification for tomorrow: we need to write a function analyze_match that takes two strings and returns a JSON object.

---

Now You Try

You have the structure for "JobHunter AI." Now, practice adapting the plan.

  • Pivot the Persona: Change the user from "Job Seeker" to "HR Recruiter."

    *Task:* Update the MVP feature list. Does a recruiter paste one resume? Or do they need to upload 50 resumes and rank them? (Hint: This changes the architecture significantly.)

  • Swap the Tech Stack: Imagine the resume data is highly confidential and cannot be sent to OpenAI.

    *Task:* Update the Tech Stack Rationale to use a local model (like Ollama/Llama3) instead of GPT-4.

  • Add a "Phase 2" Feature:

    *Task:* Design the data flow for a "History" feature. If you wanted to save every analysis, where would you store it? (SQL? JSON file?) Add a database.py file to your folder structure script to accommodate this (see the sketch below).
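
For that third task, one minimal approach is to grow the structure dict in create_project_structure (a sketch; placing database.py in core is my assumption):

```python
# The same dict from create_project_structure, with one new module.
structure = {
    "app": ["__init__.py", "main.py", "ui_components.py"],
    "core": ["__init__.py", "resume_analyzer.py", "llm_interface.py",
             "database.py"],  # new: wraps all history reads/writes (e.g., SQLite)
    "data": ["__init__.py", "sample_resume.txt", "sample_job_desc.txt"],
    "utils": ["__init__.py", "config.py", "prompts.py"],
    ".": ["requirements.txt", "README.md", ".env"],
}
```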

---

Challenge Project: Your Capstone Plan

It is time. You are going to plan your project.

The Goal: Create a comprehensive PLAN.md file for the project you will build over the next few days.

Requirements:

  • Project Name & Tagline: Catchy and descriptive.
  • Problem Statement: 2-3 sentences describing the pain point.
  • Target Audience: Who is this for?
  • MVP Feature List: Max 3-4 core features. Be strict.
  • Architecture Diagram (Text): Describe the flow (e.g., User -> Streamlit -> Python Logic -> ChromaDB -> OpenAI -> UI).
  • Tech Stack: List libraries and why you chose them.

Example PLAN.md format:

```markdown
# Project: ChefGPT

## Problem

Home cooks often have random ingredients in the fridge but don't know what to make, leading to food waste and ordering takeout.

## User

Busy parent who wants quick recipes based on available inventory.

## MVP Features

- Input list of ingredients (text).
- Select dietary restrictions (checkboxes).
- Generate 3 recipe names with brief descriptions.
- Click a recipe to get full instructions.

## Architecture

User Input -> Prompt Template -> OpenAI GPT-4 -> JSON Parser -> Streamlit Display

## Tech Stack

- Streamlit: UI
- LangChain: Prompt management
- OpenAI: Recipe generation
```

Guidance:

* Keep it simple. A finished simple project is infinitely better than a half-finished complex one.

* Do not start coding the logic today. Just set up the folder structure and write the PLAN.md.

* Show this plan to a friend (or rubber duck) and explain it. If you can't explain the flow, you can't code it.

---

What You Learned

Today you learned that coding is the last step of software engineering.

* Problem Scoping: You defined boundaries to prevent "feature creep."

* Separation of Concerns: You learned to separate UI code from Logic code using a directory structure.

* MVP Thinking: You learned to focus on the "Must Haves" to ensure you actually ship a product.

* Interface Design: You sketched the UI to define what the backend needs to do.

Why This Matters:

In a real job interview, they will ask, "Tell me about a time you designed a system." They don't want to hear about syntax. They want to hear about trade-offs, user needs, and architecture. This plan gives you that story.

Tomorrow: We take our plan and breathe life into it. We will build the Backend Logic for your Capstone.