Day 7 of 80

Functions & Reusability

Phase 1: Python Foundation

What You'll Build Today

Welcome to Day 7. Today marks a massive shift in how you will write code. Up until now, you have been writing scripts that run from top to bottom, line by line. If you needed to do the same thing three times, you probably copy-pasted the code three times.

Today, we stop doing that.

We are going to build a Text Processing Pipeline. In the world of Generative AI, data is rarely clean. It comes in messy, full of extra spaces, weird capitalization, and unnecessary noise. Before an AI model can use it, that data needs to travel through a specific set of steps to get cleaned and formatted.

You will build a system that takes a messy raw sentence, cleans it, extracts the important keywords, and formats it for a report—all automatically.

Here is what you will learn and why:

* The def keyword: This allows you to write logic once and give it a name so you can use it anytime.

* Parameters and Arguments: This makes your code flexible. Instead of hard-coding values, you pass data into your logic.

* Return Values: This is critical. It allows a function to do work and hand the result back to you, rather than just printing it to the screen.

* Pipelines (Chaining): You will learn the secret architecture of software: the output of one function becomes the input of the next.

The Problem

Let's look at a scenario that should feel frustrating. Imagine you are processing user queries for a chatbot. You have three different messages coming in. You need to strip off whitespace, convert them to lowercase for consistency, and check if the word "urgent" is inside.

Here is how you might write this based on what we have learned so far:

# Raw data coming in
message_1 = " My system is BROKEN "
message_2 = "I need help with my account"
message_3 = " URGENT: Login failure "

# Process Message 1
clean_1 = message_1.strip()
clean_1 = clean_1.lower()
if "urgent" in clean_1:
    print("Message 1 is High Priority: " + clean_1)
else:
    print("Message 1 is Normal Priority: " + clean_1)

# Process Message 2
clean_2 = message_2.strip()
clean_2 = clean_2.lower()
if "urgent" in clean_2:
    print("Message 2 is High Priority: " + clean_2)
else:
    print("Message 2 is Normal Priority: " + clean_2)

# Process Message 3
clean_3 = message_3.strip()
clean_3 = clean_3.lower()
if "urgent" in clean_3:
    print("Message 3 is High Priority: " + clean_3)
else:
    print("Message 3 is Normal Priority: " + clean_3)

Look at that code. It works, but it is painful to look at.

The Pain Points:
  • Repetition: We wrote the exact same logic three times.
  • Maintainability Nightmare: What if your boss says, "Actually, we need to remove punctuation too"? You now have to find every place you wrote that logic and update it manually. If you miss one, your program becomes inconsistent.
  • Readability: It is hard to see what is actually happening because the logic is cluttered with repeated commands.

There has to be a better way. We need a way to wrap that logic in a box and just say, "Hey Python, process this message," regardless of which message it is.

    Let's Build It

    We are going to solve this using Functions. A function is a reusable block of code that performs a specific task. You define it once, and you can "call" it as many times as you want.

    Step 1: Defining a Simple Function

    First, let's learn the syntax. We use the keyword def (define), followed by a name we choose, parentheses (), and a colon :. Everything indented under this line belongs to the function.

    Let's create a function that just prints a welcome message.

    # 1. Define the function
    def welcome_user():
        print("---------------------------")
        print("System initializing...")
        print("Ready to process text.")
        print("---------------------------")

    # 2. The code above does nothing yet! We have to "call" it.
    print("Starting program...")
    welcome_user()
    print("Program finished.")

    Why this matters: We grouped four lines of printing code into one command called welcome_user(). If we want to change the welcome message later, we only change it in one place.

    Step 2: Parameters (Passing Data In)

    A function that does the exact same thing every time isn't very useful. We need it to act on different data. We do this using parameters. Parameters are temporary variables listed inside the parentheses.

    Let's build our text cleaner function.

    def clean_text(raw_text):
        # 'raw_text' is a variable that only exists inside this function
        cleaned = raw_text.strip()
        cleaned = cleaned.lower()
        print(f"Cleaned result: {cleaned}")

    # Now we can use it on different inputs
    clean_text(" My system is BROKEN ")
    clean_text(" URGENT: Login failure ")

    Output:
    Cleaned result: my system is broken
    Cleaned result: urgent: login failure

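    Why this matters: raw_text acts like a placeholder. When we call clean_text("..."), Python takes our string and assigns it to raw_text inside the function. The name in the definition (raw_text) is the parameter; the value supplied in the call is the argument.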
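A function can take more than one parameter. Here is a minimal sketch of how the values in a call line up with the names in the definition. The function label_text and its label parameter are made up for this illustration and are not part of the pipeline we build later.

```python
def label_text(label, raw_text):
    # Two parameters: both names exist only inside this function
    cleaned = raw_text.strip().lower()
    print(f"[{label}] {cleaned}")

# The values in each call are matched to the parameters by position:
# the first value goes to 'label', the second to 'raw_text'.
label_text("support", "  My system is BROKEN  ")
label_text("billing", "I need help with my account")
```

The first call prints [support] my system is broken and the second prints [billing] i need help with my account. Swapping the order of the values in a call would swap what ends up in each parameter.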

    Step 3: Return Values (Getting Data Out)

    This is the most common point of confusion. Currently, our function prints the result. But what if we want to use that cleaned text in the next step of our program?

    If a function just prints, the data is lost to the rest of the program. It's just pixels on the screen. To pass the data back to the main program, we use the return keyword.

    Think of print as shouting the answer to the room. Think of return as writing the answer on a slip of paper and handing it to the person who asked.

    def clean_text(raw_text):
        cleaned = raw_text.strip()
        cleaned = cleaned.lower()
        # We are NOT printing here. We are sending the value back.
        return cleaned

    # We capture the returned value in a variable
    message_1 = " My system is BROKEN "
    processed_1 = clean_text(message_1)

    # Now we can do things with 'processed_1' outside the function
    print(f"Original: '{message_1}'")
    print(f"Processed: '{processed_1}'")

    Why this matters: This is essential for pipelines. If clean_text didn't return anything, we couldn't pass the result to the next step.
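To see what "lost" means in practice, the sketch below compares a function that only prints with one that returns. The names print_cleaned and return_cleaned are invented for this comparison. In Python, a function with no return statement hands back the special value None.

```python
def print_cleaned(raw_text):
    # Prints the cleaned text but has no return statement
    print(raw_text.strip().lower())

def return_cleaned(raw_text):
    # Hands the cleaned text back to the caller
    return raw_text.strip().lower()

printed = print_cleaned("  URGENT: Login failure  ")    # shows the text on screen
returned = return_cleaned("  URGENT: Login failure  ")  # shows nothing on screen

print(printed)   # None: there is nothing to pass to the next step
print(returned)  # urgent: login failure
```

Only the second variable holds text that another function could work with.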

    Step 4: Building the Pipeline (Chaining)

    Now we will build the "real system." We will create three distinct functions:

  • clean_text: Prepares the string.
  • extract_keywords: Finds important words.
  • format_report: Makes it look nice.

    Then, we will chain them together.

    # Function 1: Cleaner
    def clean_text(raw_text):
        # Remove whitespace and make lowercase
        return raw_text.strip().lower()

    # Function 2: Logic
    def extract_keywords(text):
        # Simple logic: check for specific words
        if "urgent" in text or "broken" in text:
            return "High Priority"
        elif "help" in text:
            return "Medium Priority"
        else:
            return "Low Priority"

    # Function 3: Formatter
    def format_report(original, priority):
        return f"REPORT: [{priority}] - Content: {original.strip()}"

    # --- THE PIPELINE ---
    incoming_data = " URGENT: Login failure "

    # Step 1: Clean it
    step_1_output = clean_text(incoming_data)

    # Step 2: Analyze it (using the output from step 1)
    step_2_output = extract_keywords(step_1_output)

    # Step 3: Format it (using original data and step 2 output)
    final_result = format_report(incoming_data, step_2_output)

    print(final_result)

    Output:
    REPORT: [High Priority] - Content: URGENT: Login failure
    
    Why this matters: This is exactly how data flows in AI applications. Data goes into a cleaning function, the result goes into the AI model, and the AI's result goes into a formatting function.
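With the pipeline in place, we can revisit the three messages from The Problem. The sketch below repeats the three functions so it runs on its own, then wraps the steps in one more function. The wrapper name process_message is an assumption made for this example, not something defined earlier.

```python
def clean_text(raw_text):
    return raw_text.strip().lower()

def extract_keywords(text):
    if "urgent" in text or "broken" in text:
        return "High Priority"
    elif "help" in text:
        return "Medium Priority"
    else:
        return "Low Priority"

def format_report(original, priority):
    return f"REPORT: [{priority}] - Content: {original.strip()}"

def process_message(raw_message):
    # Hypothetical wrapper: runs the three pipeline steps in order
    cleaned = clean_text(raw_message)
    priority = extract_keywords(cleaned)
    return format_report(raw_message, priority)

report_1 = process_message(" My system is BROKEN ")
report_2 = process_message("I need help with my account")
report_3 = process_message(" URGENT: Login failure ")

print(report_1)  # REPORT: [High Priority] - Content: My system is BROKEN
print(report_2)  # REPORT: [Medium Priority] - Content: I need help with my account
print(report_3)  # REPORT: [High Priority] - Content: URGENT: Login failure
```

Each message now takes one line to process. If the cleaning rules change, only clean_text needs to be edited.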

    Step 5: Default Parameters

    Sometimes you want a function to have a standard behavior, but allow for overrides. We can set default values for parameters.

    Let's modify clean_text to optionally remove a specific word.

    def clean_text(raw_text, remove_word=""):
        # First, standard cleaning
        cleaned = raw_text.strip().lower()

        # If the user provided a word to remove, do it
        if remove_word != "":
            # Strip again so removing a word at the edge leaves no stray space
            cleaned = cleaned.replace(remove_word, "").strip()

        return cleaned

    # Usage 1: Default behavior (remove_word is "")
    msg1 = clean_text(" Hello World ")
    print(msg1)

    # Usage 2: Override behavior
    msg2 = clean_text(" Hello World ", remove_word="world")
    print(msg2)

    Output:
    hello world
    hello

    Why this matters: Defaults keep your function calls simple (clean_text(msg)) most of the time, while retaining the power to be complex (clean_text(msg, remove_word="bad")) when needed.
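The same idea works for any function in the pipeline. As a small sketch, here is format_report with an extra parameter that has a default. The parameter name label and the value "ALERT" are assumptions chosen for illustration.

```python
def format_report(original, priority, label="REPORT"):
    # 'label' falls back to "REPORT" when the caller does not supply it
    return f"{label}: [{priority}] - Content: {original.strip()}"

# Existing two-argument calls keep working unchanged
default_report = format_report(" URGENT: Login failure ", "High Priority")

# The default can be overridden by name...
named_report = format_report(" URGENT: Login failure ", "High Priority", label="ALERT")

# ...or by position, as the third value in the call
positional_report = format_report(" URGENT: Login failure ", "High Priority", "ALERT")

print(default_report)     # REPORT: [High Priority] - Content: URGENT: Login failure
print(named_report)       # ALERT: [High Priority] - Content: URGENT: Login failure
print(positional_report)  # ALERT: [High Priority] - Content: URGENT: Login failure
```

Passing the override by name is usually easier to read, because the call itself says which setting is being changed.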

    Now You Try

    You have the pipeline code above. Now, expand it.

  • Add a Length Checker: Create a new function called check_length(text).
    * If the text is shorter than 5 characters, return "Too Short".
    * Otherwise, return "Valid".
    * Integrate this into your pipeline before keyword extraction.

  • Enhance the Keyword Extractor: Modify the extract_keywords function to accept a second parameter called urgent_word.
    * Default it to "urgent".
    * Update the logic to check for whatever word is in that variable, rather than hard-coding "urgent".

  • The Summarizer: Create a function called mock_summarize(text).
    * Since we don't have a real AI yet, just make this function return the first 10 characters of the text followed by "...".
    * Example Input: "This is a very long sentence about Python."
    * Example Output: "This is a ..."

    Challenge Project: The AI Chatbot Simulator

    Your challenge is to build a simulated interaction loop using three functions chained together.

    Requirements:
  • get_user_input(): This function takes no arguments. It uses input() to ask the user "Enter your query: ". It returns the string the user typed.
  • process_query(query): This function takes the string.
    * It cleans it (strip/lower).
    * If the cleaned query is "hello", return "Greeting detected".
    * If the cleaned query is "bye", return "Exit command detected".
    * Otherwise, return "Processing request...".
  • display_response(status): This function takes the status string from the previous step. It prints a formatted box around the status (e.g., AI SAYS: [status]).
  • The Execution Logic: Write code at the bottom that calls these three functions in order:
    * Get input -> pass to processor -> pass result to display.

    Example Run:
    Enter your query:    HELLO   
     AI SAYS: Greeting detected 
    
    Hints:

    * Remember that get_user_input doesn't need parameters, but it needs to return the result of the input() function.

    * Your "main" code section should look like: variable = function_1(), then variable2 = function_2(variable).

    What You Learned

    Today you moved from writing scripts to writing systems.

    * Functions (def) allow you to encapsulate logic and reuse it.

    * Parameters make your logic flexible and dynamic.

    * Return allows data to flow out of a function and into the next one.

    * Pipelines are the standard way to process data: Input -> Function A -> Function B -> Output.

    Why This Matters for AI:

    When you use a Large Language Model (LLM) like GPT-4, you are essentially calling a massive function:

    response = generate_text(prompt).

    But before you call that function, you need to clean your data. After you get the response, you need to format it. As a rough rule of thumb, real AI engineering is 10% modeling and 90% building these pipelines of functions to manage the data flow.
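The shape of such a pipeline can be sketched with today's tools. In the code below, generate_text is only a stand-in that echoes its prompt. It is not a real model call or a real library function, and clean_prompt and format_response are likewise hypothetical names.

```python
def clean_prompt(raw_prompt):
    # Step before the model: tidy the input
    return raw_prompt.strip()

def generate_text(prompt):
    # Stand-in for a real LLM call (assumption): it just echoes the prompt
    return f"(model reply to: {prompt})"

def format_response(response):
    # Step after the model: present the result
    return f"AI SAYS: {response}"

raw_prompt = "   Summarize my ticket   "
cleaned_prompt = clean_prompt(raw_prompt)
response = generate_text(cleaned_prompt)
final_output = format_response(response)

print(final_output)  # AI SAYS: (model reply to: Summarize my ticket)
```

Swapping the stub for a real model later would change only the body of generate_text. The cleaning and formatting steps around it would stay the same.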

    Tomorrow:

    You have built a nice pipeline, but what happens if a function receives a number instead of text, or if the input is empty? The program can crash or quietly produce a meaningless result. Tomorrow, we tackle Error Handling: how to make your code bulletproof so it doesn't break when things go wrong.