Day 9 of 80

Files & Data Persistence

Phase 1: Python Foundation

What You'll Build Today

Up until now, your programs have suffered from a severe case of amnesia. You run the code, you create variables, you build lists, and the moment the program finishes... poof. Everything is gone. It’s like writing a novel on a blackboard; the moment you walk away, someone erases it.

Today, we are going to give your programs long-term memory.

We will build a Persistent Note-Taking App. This tool will remember your thoughts, to-do lists, or startup ideas even after you shut down your computer and come back a week later.

Here is what you will master today:

* File I/O (Input/Output): Why your program needs to talk to the hard drive, not just the RAM.

* The with Statement: Why manually opening and closing files is dangerous, and how to do it safely so you don't corrupt your data.

* File Modes: How to tell Python whether you want to read, overwrite, or add to a file.

* JSON Handling: Why saving data as "text" is messy, and how to save complex Python lists and dictionaries so they reload perfectly.

Let's cure your code's amnesia.

---

The Problem

Let's look at a standard Python program like the ones we have been writing. We want to create a simple "Guest Book" where users can sign their names.

Type this code and run it:

# A simple guest book program
guest_list = []

while True:
    name = input("Sign the guest book (or type 'quit'): ")
    
    if name == 'quit':
        break
        
    guest_list.append(name)
    print(f"Current guests: {guest_list}")

print("Goodbye!")

The Pain Points:

Run the program. Add three names: "Alice", "Bob", "Charlie".

The program prints the list. Everything looks great.

Type 'quit' to end the program.

Run the program again.

Where are Alice, Bob, and Charlie? They are gone.

Every time you run a Python script, it starts with a blank slate. Variables live in RAM (Random Access Memory), which is volatile. It only exists while the program is running. To keep data, we need to write it to the Disk (Hard Drive).

If you were building a chatbot, this means it would forget the user's name every time they closed the window. If you were analyzing data, you would have to manually re-enter the data every time you tweaked your code.

We need a way to dump the contents of our variables into a file before the program ends, and load them back up when the program starts.

---

Let's Build It

We are going to learn how to interact with files step-by-step, eventually building up to our JSON note-taker.

Step 1: The Wrong Way to Open Files

In older programming tutorials, you might see code like this. Do not run this, just read it:

file = open("data.txt", "w") # Open file
file.write("Hello World")    # Write data
# ... imagine an error happens here ...
file.close()                 # Close file

The open() function asks the operating system for access to a file. The close() function releases it.

The Danger: If your program crashes at the line where the comment says "imagine an error happens here," the file.close() line never runs. The file stays "locked" by your program. This can corrupt the file, prevent other programs from opening it, or cause memory leaks.

Step 2: The Right Way (Context Managers)

Python solves this with the with keyword. This is called a Context Manager. It guarantees the file closes properly, even if your program crashes halfway through writing.

Let's write our first file.

# The 'w' mode means "Write". 
# WARNING: If the file exists, 'w' wipes it clean and starts over.
# If it doesn't exist, it creates it.

print("Writing to file...")

with open("my_notes.txt", "w") as file:
    file.write("This is my first persistent note.\n")
    file.write("Here is another line.")

print("Done! Check your folder for my_notes.txt")

Run this code. Then, look in the folder where your python script is saved. You will see a new file called my_notes.txt. Open it with Notepad or TextEdit. You just created permanent data!

Step 3: Reading Data Back

Now that the file exists, let's write a separate program to read it.

print("Reading from file...")

# The 'r' mode means "Read". This is the default if you don't specify a mode.
with open("my_notes.txt", "r") as file:
    content = file.read()

print("--- File Contents ---")
print(content)
print("---------------------")

Why this matters: We successfully moved data from the hard drive into a variable (content) inside our program.

Step 4: The Append Mode

Remember how w mode wipes the file clean? If you want to keep adding to a log without deleting history, you need a (Append) mode.

new_thought = "I must remember to buy milk."

# 'a' mode adds to the end of the file
with open("my_notes.txt", "a") as file:
    # We add \n to ensure it starts on a new line
    file.write("\n" + new_thought) 

print("Added new thought.")

Run this, then run your Step 3 reader code again. You will see the new line added to the bottom.

Step 5: The Challenge of Complex Data (Enter JSON)

Text files are great for simple strings. But what if you have a list of dictionaries?

users = [
    {"name": "Alice", "score": 10},
    {"name": "Bob", "score": 25}
]

If you write this to a text file using .write(str(users)), it saves it as a string. When you read it back, it's just a long string of characters. You can't easily say users[0]['score'] anymore because Python thinks it's just text.

To solve this, we use JSON (JavaScript Object Notation). Don't let the name scare you—it is the universal standard for saving data, and it looks almost exactly like Python dictionaries and lists.

We need the json library (built into Python).

import json

# Data we want to save
game_data = {
    "player": "Hero123",
    "level": 5,
    "inventory": ["sword", "shield", "potion"]
}

# SAVE (Serialize)
# 'w' mode because we want to update the save file
with open("savegame.json", "w") as f:
    json.dump(game_data, f)
    
print("Game saved!")

Now, let's load it back into a real Python dictionary:

import json

# LOAD (Deserialize)
with open("savegame.json", "r") as f:
    loaded_data = json.load(f)

print(f"Welcome back, {loaded_data['player']}!")
print(f"You have {len(loaded_data['inventory'])} items.")

Step 6: The Final Project - Persistent Note Taker

Now we will combine everything into a robust application. This app will:

Load existing notes from a file (if it exists).

Display them.

Ask for a new note.

Save everything back to the file.

import json
import os # We need this to check if a file exists

filename = "my_brain.json"
notes = []

# 1. LOAD EXISTING NOTES
# We check if the file exists first to avoid a crash on the very first run
if os.path.exists(filename):
    with open(filename, "r") as f:
        notes = json.load(f)
    print(f"Welcome back! Loaded {len(notes)} notes.")
else:
    print("No save file found. Starting fresh.")

# 2. DISPLAY NOTES
print("\n--- YOUR NOTES ---")
for i, note in enumerate(notes):
    print(f"{i + 1}. {note}")
print("------------------\n")

# 3. GET NEW INPUT
new_note = input("Enter a new note (or press Enter to skip): ")

if new_note:
    notes.append(new_note)
    
    # 4. SAVE EVERYTHING
    with open(filename, "w") as f:
        json.dump(notes, f)
    print("Note saved successfully!")
else:
    print("No changes made.")

Run this code multiple times.

* Run 1: It says "Starting fresh." Enter a note like "Learn Python."

* Run 2: It says "Loaded 1 notes." It shows "Learn Python." Enter "Eat Lunch."

* Run 3: It shows both notes.

You have now created persistent memory.

---

Now You Try

Extend the Persistent Note Taker with these features. Do them in order:

Timestamping:

Modify the code so that instead of just saving a string "Note text", you save a dictionary: {"text": "Note text", "timestamp": "2023-10-27 10:00:00"}.

Hint:* You will need import datetime and use datetime.datetime.now(). Hint:* You will need to change how the loop prints the notes since note is now a dictionary, not a string.

Delete Functionality:

Add an option at the start: "Type 'add' to add a note or 'del' to delete one."

If they choose delete, ask for the number of the note (the index) and use notes.pop(index) to remove it. Then save the file.

Safe Loading:

Currently, if you manually open my_brain.json and delete the contents so it is empty, the program might crash because valid JSON cannot be empty. Wrap the json.load line in a try...except block. If loading fails, just set notes = [].

---

Challenge Project: The Log Analyzer

In the real world, systems generate massive "log files" tracking errors. Your job is to build a tool that reads a messy text file and generates a clean report.

Setup

First, run this small script once to generate the dummy data you need to analyze.

# RUN THIS ONCE TO GENERATE THE LOG FILE
log_content = """INFO: System started
INFO: User logged in
ERROR: Database connection failed
INFO: User clicked button
ERROR: Timeout waiting for response
WARNING: Disk space low
ERROR: variable 'x' is undefined
INFO: System shutdown"""

with open("server.log", "w") as f:
    f.write(log_content)
print("server.log created.")

The Challenge

Create a new script called analyzer.py.

Open server.log in read mode.

Read the file line by line (Hint: use file.readlines() or iterate over the file object).

Count how many times the word "ERROR" appears.

Create a list containing only the error messages (strip out the "ERROR: " part if you can).

Write a new file called error_report.txt containing:

* The total count of errors.

* The list of specific errors.

Example Output (`error_report.txt`)

Analysis Report
Total Errors Found: 3

Details:

Database connection failed
Timeout waiting for response
variable 'x' is undefined

Hints

* You can loop through lines like this: for line in file:

* Strings have a .startswith() method or you can use if "ERROR" in line:.

* Remember to strip the newline character \n from lines when you read them using .strip().

---

What You Learned

Today you bridged the gap between temporary calculation and permanent storage.

* open(filename, mode): Accessing the hard drive.

* with statement: The safety net that ensures files close properly.

* "w", "r", "a": The modes for Writing (overwrite), Reading, and Appending.

* json: The standard way to save Python lists and dictionaries so they can be reloaded later.

Why This Matters for AI:

When you start building AI agents, they need context. An AI that can't read files can't summarize documents. An AI that can't write files can't generate reports or save the code it writes for you. This is the foundation of RAG (Retrieval Augmented Generation), where we feed external data (files) into an LLM to make it smarter.

Tomorrow: We tackle the most powerful concept in modern programming: Classes. We will move beyond simple variables and start modeling real-world objects in code.

← Day 8 Day 10 →