Day 9 of 80

Files & Data Persistence

Phase 1: Python Foundation

What You'll Build Today

Up until now, your programs have suffered from a severe case of amnesia. You run the code, you create variables, you build lists, and the moment the program finishes... poof. Everything is gone. It’s like writing a novel on a blackboard; the moment you walk away, someone erases it.

Today, we are going to give your programs long-term memory.

We will build a Persistent Note-Taking App. This tool will remember your thoughts, to-do lists, or startup ideas even after you shut down your computer and come back a week later.

Here is what you will master today:

* File I/O (Input/Output): Why your program needs to talk to the hard drive, not just the RAM.

* The with Statement: Why manually opening and closing files is dangerous, and how to do it safely so you don't corrupt your data.

* File Modes: How to tell Python whether you want to read, overwrite, or add to a file.

* JSON Handling: Why saving data as "text" is messy, and how to save complex Python lists and dictionaries so they reload perfectly.

Let's cure your code's amnesia.

---

The Problem

Let's look at a standard Python program like the ones we have been writing. We want to create a simple "Guest Book" where users can sign their names.

Type this code and run it:

# A simple guest book program
guest_list = []

while True:
    name = input("Sign the guest book (or type 'quit'): ")
    if name == 'quit':
        break
    guest_list.append(name)
    print(f"Current guests: {guest_list}")

print("Goodbye!")

The Pain Points:
  • Run the program. Add three names: "Alice", "Bob", "Charlie".
  • The program prints the list. Everything looks great.
  • Type 'quit' to end the program.
  • Run the program again.
  • Where are Alice, Bob, and Charlie? They are gone.

    Every time you run a Python script, it starts with a blank slate. Variables live in RAM (Random Access Memory), which is volatile: their contents only exist while the program is running. To keep data, we need to write it to the Disk (Hard Drive).

    If you were building a chatbot, this means it would forget the user's name every time they closed the window. If you were analyzing data, you would have to manually re-enter the data every time you tweaked your code.

    We need a way to dump the contents of our variables into a file before the program ends, and load them back up when the program starts.

    ---

    Let's Build It

    We are going to learn how to interact with files step-by-step, eventually building up to our JSON note-taker.

    Step 1: The Wrong Way to Open Files

    In older programming tutorials, you might see code like this. Do not run it; just read it:

    file = open("data.txt", "w")  # Open file
    file.write("Hello World")     # Write data
    # ... imagine an error happens here ...
    file.close()                  # Close file

    The open() function asks the operating system for access to a file. The close() function releases it.

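    The Danger: If an error is raised at the line where the comment says "imagine an error happens here," the file.close() line never runs and the file stays open. Text that Python is still holding in its write buffer may never reach the disk, on some systems (notably Windows) other programs may be blocked from deleting or renaming the file, and a long-running program that keeps doing this gradually leaks file handles.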

    Step 2: The Right Way (Context Managers)

    Python solves this with the with statement, which works with what is called a context manager. It guarantees the file is closed as soon as the indented block ends, even if an error is raised halfway through writing.
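You will write your own file in a moment. First, here is a minimal sketch of that guarantee in action. The filename with_demo.txt and the deliberately raised ValueError are made up for illustration; the only point is that the file ends up closed even though the block was interrupted.

```python
# Hypothetical demo: the filename and the deliberate error are illustrative only.
try:
    with open("with_demo.txt", "w") as file:
        file.write("first line\n")
        raise ValueError("something went wrong mid-write")
except ValueError as error:
    print(f"Caught: {error}")

# The with block closed the file on the way out, despite the error.
closed_after_error = file.closed
print(f"File closed? {closed_after_error}")

# Text written before the error was flushed to disk when the file closed.
with open("with_demo.txt", "r") as file:
    saved_text = file.read()
print(saved_text)
```

With the manual open()/close() style from Step 1, you would need your own try...finally block to get the same behavior.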

    Let's write our first file.

    # The 'w' mode means "Write".
    # WARNING: If the file exists, 'w' wipes it clean and starts over.
    # If it doesn't exist, it creates it.

    print("Writing to file...")

    with open("my_notes.txt", "w") as file:
        file.write("This is my first persistent note.\n")
        file.write("Here is another line.")

    print("Done! Check your folder for my_notes.txt")

    Run this code. Then, look in the folder you ran the script from (usually the folder where your Python script is saved). You will see a new file called my_notes.txt. Open it with Notepad or TextEdit. You just created permanent data!
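If you can't find the file, keep in mind that a bare filename like my_notes.txt is resolved relative to the current working directory, which is the folder Python was launched from and is not always the folder the script lives in. A small sketch using the standard os module prints the full path so you know where to look:

```python
import os

# A bare filename is resolved against the current working directory.
filename = "my_notes.txt"
full_path = os.path.abspath(filename)

print(f"Python is running in: {os.getcwd()}")
print(f"{filename} is read and written at: {full_path}")
```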

    Step 3: Reading Data Back

    Now that the file exists, let's write a separate program to read it.

    print("Reading from file...")

    # The 'r' mode means "Read". This is the default if you don't specify a mode.
    with open("my_notes.txt", "r") as file:
        content = file.read()

    print("--- File Contents ---")
    print(content)
    print("---------------------")

    Why this matters: We successfully moved data from the hard drive into a variable (content) inside our program.
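file.read() pulls the whole file into one string. When you want to handle one line at a time, you can loop over the file object instead. Here is a minimal sketch; it first writes its own small file (the name lines_demo.txt is made up) so that it runs on its own:

```python
# Create a small file so this example runs on its own (hypothetical filename).
with open("lines_demo.txt", "w") as file:
    file.write("first note\nsecond note\nthird note\n")

lines = []
with open("lines_demo.txt", "r") as file:
    for line in file:
        # Each line still ends with "\n"; strip() removes it.
        lines.append(line.strip())

for number, text in enumerate(lines, start=1):
    print(f"{number}: {text}")
```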

    Step 4: The Append Mode

    Remember how "w" mode wipes the file clean? If you want to keep adding to a log without deleting history, you need "a" (Append) mode.

    new_thought = "I must remember to buy milk."

    # 'a' mode adds to the end of the file
    with open("my_notes.txt", "a") as file:
        # We add \n to ensure it starts on a new line
        file.write("\n" + new_thought)

    print("Added new thought.")

    Run this, then run your Step 3 reader code again. You will see the new line added to the bottom.
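To see the difference between the two writing modes side by side, here is a short sketch. It uses a throwaway file (modes_demo.txt is a made-up name) so that it does not touch your my_notes.txt:

```python
# Hypothetical filename, used only for this comparison.
filename = "modes_demo.txt"

with open(filename, "w") as file:   # creates the file (or wipes it)
    file.write("one\n")

with open(filename, "a") as file:   # keeps "one" and adds to the end
    file.write("two\n")

with open(filename, "r") as file:
    after_append = file.read()

with open(filename, "w") as file:   # wipes "one" and "two"
    file.write("three\n")

with open(filename, "r") as file:
    after_overwrite = file.read()

print(repr(after_append))      # both lines are still there
print(repr(after_overwrite))   # only the last write survives
```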

    Step 5: The Challenge of Complex Data (Enter JSON)

    Text files are great for simple strings. But what if you have a list of dictionaries?

    users = [
        {"name": "Alice", "score": 10},
        {"name": "Bob", "score": 25}
    ]

    If you write this to a text file using .write(str(users)), it saves it as a string. When you read it back, it's just a long string of characters. You can't easily say users[0]['score'] anymore because Python thinks it's just text.
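The problem is easy to reproduce. This sketch saves the list with str() and reads it back; the filename users_demo.txt is made up for the demonstration:

```python
users = [
    {"name": "Alice", "score": 10},
    {"name": "Bob", "score": 25},
]

# Hypothetical filename for this demonstration.
with open("users_demo.txt", "w") as file:
    file.write(str(users))

with open("users_demo.txt", "r") as file:
    loaded = file.read()

print(type(loaded))   # a str, not a list
print(loaded[0])      # the first character, "[", not Alice's dictionary
```

The structure is all still visible in the text, but Python no longer treats it as a list of dictionaries.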

    To solve this, we use JSON (JavaScript Object Notation). Don't let the name scare you: it is one of the most widely used formats for saving and exchanging data, and it looks almost exactly like Python dictionaries and lists.

    We need the json library (built into Python).

    import json

    # Data we want to save
    game_data = {
        "player": "Hero123",
        "level": 5,
        "inventory": ["sword", "shield", "potion"]
    }

    # SAVE (Serialize)
    # 'w' mode because we want to update the save file
    with open("savegame.json", "w") as f:
        json.dump(game_data, f)

    print("Game saved!")

    Now, let's load it back into a real Python dictionary:

    import json

    # LOAD (Deserialize)
    with open("savegame.json", "r") as f:
        loaded_data = json.load(f)

    print(f"Welcome back, {loaded_data['player']}!")
    print(f"You have {len(loaded_data['inventory'])} items.")
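Strings, numbers, lists, and dictionaries with string keys survive the round trip unchanged, but a few Python types come back slightly different. The sketch below uses made-up data to show two common cases. It uses json.dumps and json.loads, which behave like dump and load but work with a string instead of a file, and the optional indent argument, which makes the saved text easier to read.

```python
import json

# Illustrative data only: a tuple value and a dictionary with integer keys.
original = {
    "position": (3, 4),
    "scores": {1: "low", 2: "high"},
}

text = json.dumps(original, indent=2)   # indent=2 spreads the output over lines
restored = json.loads(text)

print(text)
print(restored["position"])       # the tuple comes back as a list
print(list(restored["scores"]))   # the integer keys come back as strings
```

If your program depends on tuples or integer keys, convert them back after loading.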

    Step 6: The Final Project - Persistent Note Taker

    Now we will combine everything into a robust application. This app will:

  • Load existing notes from a file (if it exists).
  • Display them.
  • Ask for a new note.
  • Save everything back to the file.
    import json
    import os  # We need this to check if a file exists

    filename = "my_brain.json"
    notes = []

    # 1. LOAD EXISTING NOTES
    # We check if the file exists first to avoid a crash on the very first run
    if os.path.exists(filename):
        with open(filename, "r") as f:
            notes = json.load(f)
        print(f"Welcome back! Loaded {len(notes)} notes.")
    else:
        print("No save file found. Starting fresh.")

    # 2. DISPLAY NOTES
    print("\n--- YOUR NOTES ---")
    for i, note in enumerate(notes):
        print(f"{i + 1}. {note}")
    print("------------------\n")

    # 3. GET NEW INPUT
    new_note = input("Enter a new note (or press Enter to skip): ")

    if new_note:
        notes.append(new_note)

        # 4. SAVE EVERYTHING
        with open(filename, "w") as f:
            json.dump(notes, f)
        print("Note saved successfully!")
    else:
        print("No changes made.")

    Run this code multiple times.

    * Run 1: It says "Starting fresh." Enter a note like "Learn Python."

    * Run 2: It says "Loaded 1 notes." It shows "Learn Python." Enter "Eat Lunch."

    * Run 3: It shows both notes.

    You have now created persistent memory.

    ---

    Now You Try

    Extend the Persistent Note Taker with these features. Do them in order:

  • Timestamping: Modify the code so that instead of just saving a string "Note text", you save a dictionary: {"text": "Note text", "timestamp": "2023-10-27 10:00:00"}.

    * Hint: You will need import datetime and datetime.datetime.now().

    * Hint: You will need to change how the loop prints the notes, since note is now a dictionary, not a string.

  • Delete Functionality: Add an option at the start: "Type 'add' to add a note or 'del' to delete one." If they choose delete, ask for the number of the note and remove it with notes.pop(index). The displayed numbers start at 1 while list indexes start at 0, so the index is the number minus 1 (convert the input with int() first). Then save the file.

  • Safe Loading: Currently, if you manually open my_brain.json and delete the contents so it is empty, the program crashes with a json.JSONDecodeError, because an empty file is not valid JSON. Wrap the json.load line in a try...except block. If loading fails, just set notes = [].
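One snag you may hit in the timestamping exercise: json.dump cannot save a datetime object directly and raises a TypeError. A common way around this is to turn the timestamp into a string first. The sketch below is one possible shape, not the required solution; make_note is a hypothetical helper name.

```python
import datetime
import json

def make_note(text):
    """Build a note dictionary with a human-readable timestamp string."""
    now = datetime.datetime.now()
    # strftime turns the datetime into text that json.dump can save.
    return {"text": text, "timestamp": now.strftime("%Y-%m-%d %H:%M:%S")}

note = make_note("Learn Python")
print(json.dumps(note))
```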

    ---

    Challenge Project: The Log Analyzer

    In the real world, systems generate massive "log files" tracking errors. Your job is to build a tool that reads a messy text file and generates a clean report.

    Setup

    First, run this small script once to generate the dummy data you need to analyze.

    # RUN THIS ONCE TO GENERATE THE LOG FILE
    log_content = """INFO: System started
    INFO: User logged in
    ERROR: Database connection failed
    INFO: User clicked button
    ERROR: Timeout waiting for response
    WARNING: Disk space low
    ERROR: variable 'x' is undefined
    INFO: System shutdown"""

    with open("server.log", "w") as f:
        f.write(log_content)

    print("server.log created.")

    The Challenge

    Create a new script called analyzer.py.

  • Open server.log in read mode.
  • Read the file line by line (Hint: use file.readlines() or iterate over the file object).
  • Count how many times the word "ERROR" appears.
  • Create a list containing only the error messages (strip out the "ERROR: " part if you can).
  • Write a new file called error_report.txt containing:

    * The total count of errors.

    * The list of specific errors.

    Example Output (error_report.txt)

    Analysis Report
    

    Total Errors Found: 3

    Details:

    • Database connection failed
    • Timeout waiting for response
    • variable 'x' is undefined

    Hints

    * You can loop through lines like this: for line in file:

    * Strings have a .startswith() method or you can use if "ERROR" in line:.

    * Remember to strip the newline character \n from lines when you read them using .strip().
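As a small illustration of the last two hints, here is how they behave on a single string. The sample line is written by hand rather than read from server.log, so the file handling and the counting are still yours to do.

```python
# A hand-written sample line, newline included, as it would come from the file.
line = "ERROR: Database connection failed\n"
prefix = "ERROR: "

clean = line.strip()                  # drop the trailing "\n"
is_error = clean.startswith(prefix)   # True only for lines that begin with the prefix
message = clean[len(prefix):]         # keep everything after the prefix

print(is_error)
print(message)
```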

    ---

    What You Learned

    Today you bridged the gap between temporary calculation and permanent storage.

    * open(filename, mode): Accessing the hard drive.

    * with statement: The safety net that ensures files close properly.

    * "w", "r", "a": The modes for Writing (overwrite), Reading, and Appending.

    * json: The standard way to save Python lists and dictionaries so they can be reloaded later.

    Why This Matters for AI:

    When you start building AI agents, they need context. An AI that can't read files can't summarize documents. An AI that can't write files can't generate reports or save the code it writes for you. This is the foundation of RAG (Retrieval Augmented Generation), where we retrieve relevant external data (such as files) and feed it into an LLM's prompt so its answers can draw on that data.

    Tomorrow: We tackle the most powerful concept in modern programming: Classes. We will move beyond simple variables and start modeling real-world objects in code.