Day 59 of 80

Framework Best Practices

Phase 6: Frameworks & Agents

What You'll Build Today

Welcome to Day 59. We are approaching the end of our journey, and today is about acquiring wisdom, not just syntax.

For the last few weeks, we have been using powerful frameworks like LangChain and LlamaIndex. They are incredible. They are also dangerous. It is very easy to fall into the trap of "over-engineering"—using a sledgehammer to crack a nut.

Today, we are going to build The Great Refactor. We will take a piece of code that is unnecessarily complex and heavily reliant on framework abstractions and "black box" logic, and we will rewrite it into clean, readable, raw Python. Then, we will do the opposite: take a complex raw Python script and show where a framework would actually save us time.

Here is what you will learn and why:

* Raw Code vs. Frameworks: You will learn to identify when to use a library and when to just write the code yourself. This prevents "dependency hell."

* The Cost of Abstraction: You will see how frameworks hide the actual prompt sent to the LLM, making debugging difficult, and how to fix that.

* Hybrid Development: You will learn that you don't have to choose sides. You can use LlamaIndex for data loading and raw Python for logic.

* Debugging "Magic": You will learn how to inspect what is actually happening inside those chain objects.

Let's strip away the magic and take back control.

The Problem

To understand why frameworks can sometimes hurt us, let's look at a classic case of "Hello World" over-engineering.

Imagine you just want to ask an LLM to tell you a joke about Python.

If you strictly follow framework documentation without thinking, you might end up writing code like this. Do not run it yet; just look at it.

# The "Over-Engineered" Approach

from langchain.prompts import PromptTemplate

from langchain.chat_models import ChatOpenAI

from langchain.schema import StrOutputParser

from langchain_core.runnables import RunnablePassthrough

# Initialize Model

model = ChatOpenAI(model="gpt-4o")

# Create a complex chain for a simple task

prompt = PromptTemplate.from_template("Tell me a joke about {topic}")

output_parser = StrOutputParser()

# The "Chain"

chain = (

{"topic": RunnablePassthrough()}

| prompt

| model

| output_parser

)

# Execution

result = chain.invoke("Python")

print(result)

The Pain Points:

• Import Bloat: Look at those imports. RunnablePassthrough? StrOutputParser? We just want a joke.
• Hidden Logic: Where is the API call? It's hidden inside invoke. If you get an error, the "stack trace" (the error message) will be massive and point to files deep inside the library, not to your code.
• Abstraction Cost: If OpenAI changes their API tomorrow, you have to wait for LangChain to update. If you want to change a parameter that LangChain doesn't support yet, you are stuck.

You might be thinking: "Is this really necessary? Can't I just send a string to the API?"

Yes, you can. And often, you should.

Let's Build It

We are going to go through a process of "De-engineering" and then "Strategic Engineering."

Step 1: The Raw Python Solution

Let's rewrite the joke generator using raw Python and the standard OpenAI library. This is the baseline.

Why this matters: This gives you full visibility. You know exactly what data is being sent and received.
```python
import os
from openai import OpenAI

# Make sure you have your API key set in your environment:
# os.environ["OPENAI_API_KEY"] = "sk-..."

client = OpenAI()

def get_joke_raw(topic):
    print(f"--- Requesting joke about {topic} ---")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": f"Tell me a joke about {topic}"}
        ]
    )
    # We explicitly extract the content. No magic parsers.
    return response.choices[0].message.content

# Run it
joke = get_joke_raw("Python")
print(joke)
```

The Result: You should see a joke printed.

The Insight: This code is longer vertically, but it is simpler conceptually. There are no "chains" or "runnables." If it breaks, you know exactly where: the client.chat.completions.create line.
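
To see that visibility in action, you can wrap the one line that talks to the network. Here is a minimal sketch, assuming the same openai v1 client as above; the helper name get_joke_safe is introduced here purely for illustration.

```python
import openai
from openai import OpenAI

client = OpenAI()

def get_joke_safe(topic):
    try:
        # The only network call in the whole function is this one.
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": f"Tell me a joke about {topic}"}],
        )
    except openai.OpenAIError as e:
        # Any API failure surfaces right here, in our own code, not deep inside a chain.
        print(f"API call failed: {e}")
        return None
    return response.choices[0].message.content
```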

Step 2: When Raw Code Gets Painful (The Data Problem)

Now, let's look at a scenario where raw code starts to hurt. Imagine you need to load a text file, split it into chunks, and embed it.

Doing this in raw Python requires writing a file loader, a text splitter (handling overlap and boundaries), and an embedding loop.

```python
# PAINFUL RAW PYTHON EXAMPLE

def load_and_split_manually(text, chunk_size=100):
    # This is a terrible splitter because it cuts words in half
    chunks = []
    for i in range(0, len(text), chunk_size):
        chunks.append(text[i:i+chunk_size])
    return chunks

text = "Generative AI is transforming the world. " * 50
chunks = load_and_split_manually(text)

print(f"Created {len(chunks)} chunks.")
print(f"Sample chunk: {chunks[0]}")
```

The Problem: Writing a good text splitter that respects sentence boundaries is hard. Writing a PDF parser is even harder. This is where we should use a framework.
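
To get a feel for why, here is a rough sketch of what even a slightly smarter splitter starts to look like: it keeps sentences whole and carries a one-sentence overlap between chunks, yet it still handles none of the hard cases (abbreviations, token limits, PDF layout). Everything here, including the name split_by_sentences, is illustrative.

```python
def split_by_sentences(text, chunk_size=100, overlap=1):
    # Naive sentence splitting on "." -- already wrong for "e.g." or "3.14".
    sentences = [s.strip() + "." for s in text.split(".") if s.strip()]
    chunks, current = [], []
    for sentence in sentences:
        if current and sum(len(s) for s in current) + len(sentence) > chunk_size:
            chunks.append(" ".join(current))
            current = current[-overlap:]  # carry the last sentence(s) over as overlap
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks

print(split_by_sentences("Generative AI is transforming the world. " * 5))
```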

Step 3: The Hybrid Approach (Best of Both Worlds)

This is the "Golden Path." We will use LlamaIndex or LangChain for the heavy lifting (data loading/splitting) but keep the actual LLM interaction in raw Python or a simplified chain so we can debug it.

Let's build a simple RAG (Retrieval Augmented Generation) system using this hybrid mindset.

Requirements:
• Create some dummy text data.
• Use LlamaIndex to load and chunk it (because it is great at that).
• Use raw Python to send the context to the LLM (because we want control).
```python
import os
from openai import OpenAI
from llama_index.core import Document
from llama_index.core.node_parser import SimpleNodeParser

# 1. Setup Data
text_data = """
LlamaIndex is a data framework for your LLM applications.
LangChain is a framework for developing applications powered by language models.
Raw Python provides maximum control but requires more boilerplate code.
"""

# 2. Use the framework for what it's good at: data handling.
# We wrap our text in a Document object.
documents = [Document(text=text_data)]

# We use their parser to split text smartly.
parser = SimpleNodeParser.from_defaults(chunk_size=1024, chunk_overlap=20)
nodes = parser.get_nodes_from_documents(documents)

print(f"Framework helped us create {len(nodes)} clean nodes.")

# 3. Use raw Python for the logic.
# We will manually simulate the "Retrieval" step for simplicity here.
context_str = "\n".join([n.get_content() for n in nodes])

client = OpenAI()

def query_with_context(query, context):
    print("--- Constructing Prompt with Context ---")

    # We build the prompt manually so we see EXACTLY what the LLM sees.
    final_prompt = f"""
    You are an assistant. Answer the question based ONLY on the context below.

    Context:
    {context}

    Question: {query}
    """

    # Debugging print - this is hard to do in deep framework chains!
    print(f"DEBUG PROMPT PREVIEW:\n{final_prompt[:100]}...\n")

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": final_prompt}]
    )
    return response.choices[0].message.content

# Run it
answer = query_with_context("What is Raw Python good for?", context_str)
print("Answer:", answer)
```

Why this is better:

* We didn't write a text splitter (the framework handled it).

* We didn't use a RetrievalQAChain (which hides the prompt). We built the prompt string ourselves.

* If the model hallucinates, we can print final_prompt and see exactly why.
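
If you later want the retrieval step itself to stay visible too, it can remain plain Python. Below is a deliberately naive sketch that scores the nodes from the code above by keyword overlap and passes only the best ones to query_with_context; the name retrieve_top_nodes and the top_k parameter are illustrative, and a real system would score with embeddings instead.

```python
def retrieve_top_nodes(query, nodes, top_k=2):
    # Count how many query words appear in each node; keep the highest-scoring ones.
    query_words = set(query.lower().split())
    scored = []
    for node in nodes:
        content = node.get_content()
        score = sum(1 for word in query_words if word in content.lower())
        scored.append((score, content))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [content for _, content in scored[:top_k]]

# Send a smaller, more relevant context instead of dumping every node in.
top_chunks = retrieve_top_nodes("What is Raw Python good for?", nodes)
print(query_with_context("What is Raw Python good for?", "\n".join(top_chunks)))
```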

Step 4: Debugging Frameworks (When you MUST use them)

Sometimes you have to use the full framework (e.g., for a complex agent). When you do, you need to know how to see inside.

In LangChain, the set_debug(True) and set_verbose(True) flags are your best friends.

```python
from langchain.globals import set_debug
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

# Turn on global debugging - this prints EVERYTHING
set_debug(True)

print("\n--- Running Framework with Debug Mode ---\n")

model = ChatOpenAI(model="gpt-4o")
prompt = PromptTemplate.from_template("Summarize this in one word: {text}")
chain = prompt | model | StrOutputParser()

# Watch the console output closely when you run this
chain.invoke({"text": "The quick brown fox jumps over the lazy dog."})
```

The Output: You will see a massive amount of text in the console. It shows the "Prompts" going in and the "Generations" coming out.

The Lesson: Never run a complex chain blindly. If it fails, turn on debug mode immediately.
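
If the full debug output is too noisy, langchain.globals also exposes set_verbose, which prints a lighter trace of prompts and outputs. A quick sketch of toggling both, reusing the chain defined above:

```python
from langchain.globals import set_debug, set_verbose

set_debug(False)   # turn the firehose back off
set_verbose(True)  # keep a lighter, human-readable trace

chain.invoke({"text": "The quick brown fox jumps over the lazy dog."})

set_verbose(False)  # back to quiet once you have seen what you need
```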

Now You Try

Refactor and extend the code above.

• The "Trace" Extension: In the Step 3 code (Hybrid Approach), modify the query_with_context function to save the exact prompt sent to the LLM into a text file called last_prompt.txt. This is a common logging technique for production apps.

• The Model Swap: Take the raw Python code from Step 1 and modify it to use a different model (e.g., gpt-3.5-turbo). Then, try to modify the LangChain code in Step 4 to use a different model. Note which one felt more intuitive to you.

• The Tool Integration: Write a raw Python function that calculates multiply(a, b). Then, write a prompt that says: "If the user asks for math, output the function name and arguments in JSON format." Send this to the LLM using raw Python (a sketch of the idea follows after this list). Goal: Understand that "Tool Calling" isn't magic; it's just the LLM outputting structured text that you parse.
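
Here is a minimal sketch of that third exercise, assuming the same raw openai client as in Step 1. The prompt wording and the JSON shape are just one possible choice, and you may need to nudge the model to return bare JSON with no markdown fences.

```python
import json
from openai import OpenAI

client = OpenAI()

def multiply(a, b):
    return a * b

system_prompt = (
    "You have one tool: multiply(a, b). "
    "If the user asks for math, respond ONLY with a JSON object like "
    '{"function": "multiply", "args": {"a": 2, "b": 3}} and nothing else.'
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What is 7 times 6?"},
    ],
)

# "Tool calling" demystified: the model only produced text; we parse it and run the function.
call = json.loads(response.choices[0].message.content)
if call.get("function") == "multiply":
    print("Result:", multiply(**call["args"]))
```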

Challenge Project: The "Roast My Code" Comparison

Your challenge is to build the same simple application three times and write a short comparison.

The Application:

A "Sentiment Analyzer." It takes a user sentence (e.g., "I hate rainy days but I love reading inside") and returns: {"sentiment": "mixed", "reasoning": "..."}.

Requirements:
• Version A (Raw Python): Use the openai client. Define the JSON structure in the system prompt. Parse the string response manually (or use the JSON mode feature if you recall it).
• Version B (LangChain): Use PromptTemplate and JsonOutputParser (or PydanticOutputParser).
• Version C (LlamaIndex): Use LlamaIndex's program module, or a simple query engine if you prefer.

Comparison Report:

For each version, note:

* Lines of code (approximate).

* Readability (1-10).

* "Magic" factor (how much is happening that you can't see?).

Example Input:

"The service was slow, but the food was amazing."

Example Output (for all versions):

```json
{
  "sentiment": "positive",
  "reasoning": "While service was criticized, the core product (food) was praised highly."
}
```

Hint: For the raw Python version, telling the model "Return a valid JSON object" in the system prompt is usually enough.
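
As a starting point for Version A, here is a hedged sketch of that hint combined with the OpenAI JSON mode (response_format={"type": "json_object"}), which makes the reply parse cleanly; the helper name analyze_sentiment is illustrative.

```python
import json
from openai import OpenAI

client = OpenAI()

def analyze_sentiment(sentence):
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # ask the API to return valid JSON
        messages=[
            {
                "role": "system",
                "content": 'Return a valid JSON object with keys "sentiment" and "reasoning".',
            },
            {"role": "user", "content": sentence},
        ],
    )
    return json.loads(response.choices[0].message.content)

print(analyze_sentiment("The service was slow, but the food was amazing."))
```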

What You Learned

Today was about maturity in coding. You learned that:

* Frameworks are tools, not rules: You don't have to use every feature a framework offers.

* Abstraction has a price: Every layer of "magic" makes debugging harder.

* Hybrid is healthy: Using LlamaIndex for data ingestion and raw Python for execution is a powerful, professional pattern.

* Visibility is key: If you can't print the prompt, you can't fix the bug.

Why This Matters:

As you move into building Agents (tomorrow!), complexity will skyrocket. If you build an Agent entirely with "magic black boxes," you will never be able to fix it when it starts looping endlessly or hallucinating. Understanding the raw logic underneath gives you the power to intervene.

Tomorrow: We put it all together. You will build an Autonomous Research Agent that can browse the web, read pages, and write a report, applying the debugging and structure lessons you learned today.