Files & Data Persistence
What You'll Build Today
Up until now, your programs have suffered from a severe case of amnesia. You run the code, you create variables, you build lists, and the moment the program finishes... poof. Everything is gone. It’s like writing a novel on a blackboard; the moment you walk away, someone erases it.
Today, we are going to give your programs long-term memory.
We will build a Persistent Note-Taking App. This tool will remember your thoughts, to-do lists, or startup ideas even after you shut down your computer and come back a week later.
Here is what you will master today:
* File I/O (Input/Output): Why your program needs to talk to the hard drive, not just the RAM.
* The with Statement: Why manually opening and closing files is dangerous, and how to do it safely so you don't corrupt your data.
* File Modes: How to tell Python whether you want to read, overwrite, or add to a file.
* JSON Handling: Why saving data as "text" is messy, and how to save complex Python lists and dictionaries so they reload perfectly.
Let's cure your code's amnesia.
---
The Problem
Let's look at a standard Python program like the ones we have been writing. We want to create a simple "Guest Book" where users can sign their names.
Type this code and run it:
# A simple guest book program
guest_list = []
while True:
name = input("Sign the guest book (or type 'quit'): ")
if name == 'quit':
break
guest_list.append(name)
print(f"Current guests: {guest_list}")
print("Goodbye!")
The Pain Points:
Where are Alice, Bob, and Charlie? They are gone.
Every time you run a Python script, it starts with a blank slate. Variables live in RAM (Random Access Memory), which is volatile. It only exists while the program is running. To keep data, we need to write it to the Disk (Hard Drive).
If you were building a chatbot, this means it would forget the user's name every time they closed the window. If you were analyzing data, you would have to manually re-enter the data every time you tweaked your code.
We need a way to dump the contents of our variables into a file before the program ends, and load them back up when the program starts.
---
Let's Build It
We are going to learn how to interact with files step-by-step, eventually building up to our JSON note-taker.
Step 1: The Wrong Way to Open Files
In older programming tutorials, you might see code like this. Do not run this, just read it:
file = open("data.txt", "w") # Open file
file.write("Hello World") # Write data
# ... imagine an error happens here ...
file.close() # Close file
The open() function asks the operating system for access to a file. The close() function releases it.
file.close() line never runs. The file stays "locked" by your program. This can corrupt the file, prevent other programs from opening it, or cause memory leaks.
Step 2: The Right Way (Context Managers)
Python solves this with the with keyword. This is called a Context Manager. It guarantees the file closes properly, even if your program crashes halfway through writing.
Let's write our first file.
# The 'w' mode means "Write".
# WARNING: If the file exists, 'w' wipes it clean and starts over.
# If it doesn't exist, it creates it.
print("Writing to file...")
with open("my_notes.txt", "w") as file:
file.write("This is my first persistent note.\n")
file.write("Here is another line.")
print("Done! Check your folder for my_notes.txt")
Run this code. Then, look in the folder where your python script is saved. You will see a new file called my_notes.txt. Open it with Notepad or TextEdit. You just created permanent data!
Step 3: Reading Data Back
Now that the file exists, let's write a separate program to read it.
print("Reading from file...")
# The 'r' mode means "Read". This is the default if you don't specify a mode.
with open("my_notes.txt", "r") as file:
content = file.read()
print("--- File Contents ---")
print(content)
print("---------------------")
Why this matters: We successfully moved data from the hard drive into a variable (content) inside our program.
Step 4: The Append Mode
Remember how w mode wipes the file clean? If you want to keep adding to a log without deleting history, you need a (Append) mode.
new_thought = "I must remember to buy milk."
# 'a' mode adds to the end of the file
with open("my_notes.txt", "a") as file:
# We add \n to ensure it starts on a new line
file.write("\n" + new_thought)
print("Added new thought.")
Run this, then run your Step 3 reader code again. You will see the new line added to the bottom.
Step 5: The Challenge of Complex Data (Enter JSON)
Text files are great for simple strings. But what if you have a list of dictionaries?
users = [
{"name": "Alice", "score": 10},
{"name": "Bob", "score": 25}
]
If you write this to a text file using .write(str(users)), it saves it as a string. When you read it back, it's just a long string of characters. You can't easily say users[0]['score'] anymore because Python thinks it's just text.
To solve this, we use JSON (JavaScript Object Notation). Don't let the name scare you—it is the universal standard for saving data, and it looks almost exactly like Python dictionaries and lists.
We need the json library (built into Python).
import json
# Data we want to save
game_data = {
"player": "Hero123",
"level": 5,
"inventory": ["sword", "shield", "potion"]
}
# SAVE (Serialize)
# 'w' mode because we want to update the save file
with open("savegame.json", "w") as f:
json.dump(game_data, f)
print("Game saved!")
Now, let's load it back into a real Python dictionary:
import json
# LOAD (Deserialize)
with open("savegame.json", "r") as f:
loaded_data = json.load(f)
print(f"Welcome back, {loaded_data['player']}!")
print(f"You have {len(loaded_data['inventory'])} items.")
Step 6: The Final Project - Persistent Note Taker
Now we will combine everything into a robust application. This app will:
import json
import os # We need this to check if a file exists
filename = "my_brain.json"
notes = []
# 1. LOAD EXISTING NOTES
# We check if the file exists first to avoid a crash on the very first run
if os.path.exists(filename):
with open(filename, "r") as f:
notes = json.load(f)
print(f"Welcome back! Loaded {len(notes)} notes.")
else:
print("No save file found. Starting fresh.")
# 2. DISPLAY NOTES
print("\n--- YOUR NOTES ---")
for i, note in enumerate(notes):
print(f"{i + 1}. {note}")
print("------------------\n")
# 3. GET NEW INPUT
new_note = input("Enter a new note (or press Enter to skip): ")
if new_note:
notes.append(new_note)
# 4. SAVE EVERYTHING
with open(filename, "w") as f:
json.dump(notes, f)
print("Note saved successfully!")
else:
print("No changes made.")
Run this code multiple times.
* Run 1: It says "Starting fresh." Enter a note like "Learn Python."
* Run 2: It says "Loaded 1 notes." It shows "Learn Python." Enter "Eat Lunch."
* Run 3: It shows both notes.
You have now created persistent memory.
---
Now You Try
Extend the Persistent Note Taker with these features. Do them in order:
Modify the code so that instead of just saving a string "Note text", you save a dictionary: {"text": "Note text", "timestamp": "2023-10-27 10:00:00"}.
import datetime and use datetime.datetime.now().
Hint:* You will need to change how the loop prints the notes since note is now a dictionary, not a string.
Add an option at the start: "Type 'add' to add a note or 'del' to delete one."
If they choose delete, ask for the number of the note (the index) and use notes.pop(index) to remove it. Then save the file.
Currently, if you manually open my_brain.json and delete the contents so it is empty, the program might crash because valid JSON cannot be empty. Wrap the json.load line in a try...except block. If loading fails, just set notes = [].
---
Challenge Project: The Log Analyzer
In the real world, systems generate massive "log files" tracking errors. Your job is to build a tool that reads a messy text file and generates a clean report.
Setup
First, run this small script once to generate the dummy data you need to analyze.
# RUN THIS ONCE TO GENERATE THE LOG FILE
log_content = """INFO: System started
INFO: User logged in
ERROR: Database connection failed
INFO: User clicked button
ERROR: Timeout waiting for response
WARNING: Disk space low
ERROR: variable 'x' is undefined
INFO: System shutdown"""
with open("server.log", "w") as f:
f.write(log_content)
print("server.log created.")
The Challenge
Create a new script called analyzer.py.
server.log in read mode.file.readlines() or iterate over the file object).error_report.txt containing:* The total count of errors.
* The list of specific errors.
Example Output (error_report.txt)
Analysis Report
Total Errors Found: 3
Details:
- Database connection failed
- Timeout waiting for response
- variable 'x' is undefined
Hints
* You can loop through lines like this: for line in file:
* Strings have a .startswith() method or you can use if "ERROR" in line:.
* Remember to strip the newline character \n from lines when you read them using .strip().
---
What You Learned
Today you bridged the gap between temporary calculation and permanent storage.
* open(filename, mode): Accessing the hard drive.
* with statement: The safety net that ensures files close properly.
* "w", "r", "a": The modes for Writing (overwrite), Reading, and Appending.
* json: The standard way to save Python lists and dictionaries so they can be reloaded later.
When you start building AI agents, they need context. An AI that can't read files can't summarize documents. An AI that can't write files can't generate reports or save the code it writes for you. This is the foundation of RAG (Retrieval Augmented Generation), where we feed external data (files) into an LLM to make it smarter.
Tomorrow: We tackle the most powerful concept in modern programming: Classes. We will move beyond simple variables and start modeling real-world objects in code.