Day 13: Temperature Control

Balance consistency and creativity in AI responses

What You'll Build Today

You're building a tool that demonstrates how temperature affects AI responses, from rigid and consistent to wild and creative. You'll create a comparison app that runs the same prompt at different temperatures, helping you choose the right setting for your use case.

Today's Project: A temperature explorer that shows how this single parameter transforms AI behavior.

The Problem

Sometimes your AI is too boring and predictable. Other times it's too random and unreliable. You need to know when to use which setting.

The Pain:
// You need consistent data extraction
"Extract the price from: 'The item costs $49.99'"
AI: "$49.99" ✓

// But sometimes it gets creative
"Extract the price from: 'The item costs $49.99'"
AI: "The cost of this product is forty-nine dollars and
     ninety-nine cents, which is less than $50!" ✗

// You just wanted "$49.99"!

Different tasks need different levels of creativity vs consistency.

Let's Build It

Step 1: Understanding Temperature

Temperature controls randomness in AI responses. It's a number from 0 to 2 (for OpenAI):

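  • 0.0: Near-deterministic; the model almost always picks the most likely next token
  • 0.7: Balanced; a common choice for conversational use (the OpenAI API itself defaults to 1.0)
  • 1.5+: Creative, unpredictable, sometimes chaotic; output can turn incoherent as you approach 2.0

import OpenAI from 'openai';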

const openai = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY
});

async function askWithTemperature(prompt, temperature) {
    const response = await openai.chat.completions.create({
        model: "gpt-4",
        messages: [{ role: "user", content: prompt }],
        temperature: temperature  // The magic parameter!
    });

    return response.choices[0].message.content;
}

// Try the same prompt at different temperatures
const prompt = "Write a tagline for a coffee shop";

// Sample outputs below are illustrative; yours will differ
console.log("Temperature 0.0:");
console.log(await askWithTemperature(prompt, 0.0));
// e.g. "Your daily dose of freshness and flavor."

console.log("\nTemperature 1.0:");
console.log(await askWithTemperature(prompt, 1.0));
// e.g. "Where every cup tells a story."

console.log("\nTemperature 2.0:");
console.log(await askWithTemperature(prompt, 2.0));
// e.g. "Brewing dreams, one cosmic sip at a time!"
// (at 2.0 the output can also degrade into garbled text)
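
To see why a single number changes the output so much, it helps to look at the sampling math. The sketch below is a toy illustration, not how the OpenAI API is implemented internally: it divides raw token scores (logits) by the temperature before applying softmax. The function name softmaxWithTemperature, the three tokens, and their scores are all invented for the example, and treating temperature 0 as "always pick the top token" is an assumption.

```javascript
// Toy illustration only: real models score tens of thousands of tokens,
// and the logits below are invented for this example.
function softmaxWithTemperature(logits, temperature) {
    if (temperature <= 0) {
        // Assumption: temperature 0 behaves like greedy decoding,
        // so all probability goes to the highest-scoring token.
        const maxIndex = logits.indexOf(Math.max(...logits));
        return logits.map((_, i) => (i === maxIndex ? 1 : 0));
    }
    // Dividing by a small temperature exaggerates score differences;
    // dividing by a large one flattens them.
    const scaled = logits.map((logit) => logit / temperature);
    const maxScaled = Math.max(...scaled); // subtract the max for numerical stability
    const exps = scaled.map((s) => Math.exp(s - maxScaled));
    const total = exps.reduce((sum, e) => sum + e, 0);
    return exps.map((e) => e / total);
}

const tokens = ["coffee", "espresso", "stardust"];
const logits = [2.0, 1.0, 0.1]; // hypothetical raw scores

for (const t of [0, 0.5, 1.0, 2.0]) {
    const probs = softmaxWithTemperature(logits, t);
    const line = tokens.map((tok, i) => `${tok}=${probs[i].toFixed(2)}`).join(", ");
    console.log(`T=${t}: ${line}`);
}
```

Running it shows the top token's share shrinking as the temperature rises, which is why unlikely words start to appear at high settings.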

Step 2: Temperature Comparison Tool

Build a tool that shows how temperature affects the same prompt:

async function compareTemperatures(prompt, temperatures = [0, 0.5, 1.0, 1.5, 2.0]) {
    console.log(`Prompt: "${prompt}"\n`);

    for (const temp of temperatures) {
        console.log(`Temperature ${temp}:`);

        // Get 3 responses to show consistency
        const responses = await Promise.all([
            askWithTemperature(prompt, temp),
            askWithTemperature(prompt, temp),
            askWithTemperature(prompt, temp)
        ]);

        responses.forEach((response, i) => {
            console.log(`  Response ${i + 1}: ${response}`);
        });

        console.log(); // blank line
    }
}

// Test it
await compareTemperatures("What is 2+2?");

// At temp 0.0, the three responses are usually identical (small differences can still occur)
// At temp 2.0, the responses typically differ and may drift into nonsense
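
Eyeballing three responses works for a demo, but a number makes the comparison easier. The sketch below is one possible way to quantify it: exactMatchRate is a hypothetical helper that reports the share of responses matching the most common one. The sample arrays are hard-coded so it runs without an API call; in practice you would pass in the responses array from compareTemperatures.

```javascript
// Hypothetical helper: share of responses that match the most common one.
function exactMatchRate(responses) {
    if (responses.length === 0) return 0;
    const counts = new Map();
    for (const response of responses) {
        const key = response.trim().toLowerCase(); // ignore case and edge whitespace
        counts.set(key, (counts.get(key) ?? 0) + 1);
    }
    return Math.max(...counts.values()) / responses.length;
}

// Invented sample data standing in for real API responses
const lowTempResponses = ["4", "4", "4"];
const highTempResponses = ["4", "Four!", "2 + 2 equals 4"];

console.log(exactMatchRate(lowTempResponses));  // 1
console.log(exactMatchRate(highTempResponses)); // 0.3333333333333333
```

Exact matching is strict: two answers that mean the same thing but differ by one word count as different, so treat the score as a lower bound on real agreement.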

Step 3: Choosing the Right Temperature

Create a guide for selecting temperature based on task type:

const temperatureGuide = {
    dataExtraction: {
        temp: 0.0,
        reason: "Need consistent, predictable extraction",
        example: "Extract the email from this text"
    },

    classification: {
        temp: 0.0,
        reason: "Categories should be consistent",
        example: "Is this review positive or negative?"
    },

    summarization: {
        temp: 0.3,
        reason: "Mostly consistent, slight variation ok",
        example: "Summarize this article"
    },

    chatbot: {
        temp: 0.7,
        reason: "Natural variation, not robotic",
        example: "General conversation"
    },

    brainstorming: {
        temp: 1.2,
        reason: "Want diverse, creative ideas",
        example: "Generate 10 startup ideas"
    },

    creative: {
        temp: 1.5,
        reason: "Maximum creativity and uniqueness",
        example: "Write a surreal poem"
    }
};

// Function to get recommended temperature
function recommendTemperature(taskDescription) {
    const lower = taskDescription.toLowerCase();

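    if (lower.includes('extract') || lower.includes('parse')) {
        return { temp: 0.0, task: 'dataExtraction' };
    }
    // Word stems also match "classification", "categorize", and "categories"
    if (lower.includes('classif') || lower.includes('categor')) {
        return { temp: 0.0, task: 'classification' };
    }
    // Covers "summarize", "summary", and "summarization" from the guide above
    if (lower.includes('summar')) {
        return { temp: 0.3, task: 'summarization' };
    }
    if (lower.includes('creative') || lower.includes('story')) {
        return { temp: 1.5, task: 'creative' };
    }
    if (lower.includes('brainstorm') || lower.includes('ideas')) {
        return { temp: 1.2, task: 'brainstorming' };
    }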

    return { temp: 0.7, task: 'chatbot' }; // safe default
}

// Use it
const rec = recommendTemperature("Extract dates from emails");
console.log(`Recommended: ${rec.temp}`);
console.log(`Reason: ${temperatureGuide[rec.task].reason}`);
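
If the temperature ever comes from user input or a config file rather than the guide, it may be worth checking it against the 0 to 2 range from Step 1 before sending a request. The helper below is a minimal sketch; clampTemperature is a hypothetical name, and silently clamping (instead of rejecting) out-of-range values is a design assumption you may not want in every app.

```javascript
// Hypothetical guard: keeps a value inside the 0 to 2 range described in Step 1.
function clampTemperature(value, min = 0, max = 2) {
    const number = Number(value);
    if (!Number.isFinite(number)) {
        throw new TypeError(`Temperature must be a finite number, got: ${value}`);
    }
    return Math.min(max, Math.max(min, number));
}

console.log(clampTemperature(0.7));  // 0.7
console.log(clampTemperature(3));    // 2
console.log(clampTemperature("-1")); // 0
```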

Step 4: Real-World Use Cases

Apply temperature to practical scenarios:

// USE CASE 1: Data extraction (temp = 0.0)
async function extractStructuredData(text) {
    const response = await openai.chat.completions.create({
        model: "gpt-4",
        messages: [{
            role: "user",
            content: `Extract name, email, and phone from this text.
            Return JSON only:\n\n${text}`
        }],
        temperature: 0.0  // Maximum consistency
    });

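    // Even at temperature 0 the model may return non-JSON text,
    // so fail with a clear error instead of a bare SyntaxError
    const raw = response.choices[0].message.content;
    try {
        return JSON.parse(raw);
    } catch (error) {
        throw new Error(`Model did not return valid JSON: ${raw}`);
    }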
}

// USE CASE 2: Creative product descriptions (temp = 1.2)
async function generateProductDescription(product) {
    const response = await openai.chat.completions.create({
        model: "gpt-4",
        messages: [{
            role: "user",
            content: `Write a creative, engaging product description
            for: ${product}`
        }],
        temperature: 1.2  // More creativity
    });

    return response.choices[0].message.content;
}

// USE CASE 3: Customer support (temp = 0.5)
async function customerSupportResponse(issue) {
    const response = await openai.chat.completions.create({
        model: "gpt-4",
        messages: [{
            role: "system",
            content: "You are a helpful customer support agent."
        }, {
            role: "user",
            content: issue
        }],
        temperature: 0.5  // Helpful but consistent
    });

    return response.choices[0].message.content;
}

// Test them
const contact = "John Doe, john@example.com, 555-1234";
console.log(await extractStructuredData(contact));

console.log(await generateProductDescription("wireless earbuds"));

console.log(await customerSupportResponse("My order hasn't arrived"));
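
One practical wrinkle with the extraction use case: even when asked for "JSON only", a model sometimes wraps its reply in a Markdown code fence or adds a friendly sentence, and JSON.parse then throws. The sketch below shows one defensive approach. parseJsonReply and FENCE are hypothetical names, and the fence-stripping pattern is an assumption about what such replies look like, not a guarantee.

```javascript
// Three backticks, built at runtime so this example stays easy to embed in docs
const FENCE = "`".repeat(3);
const fencePattern = new RegExp(`^${FENCE}(?:json)?\\s*([\\s\\S]*?)\\s*${FENCE}$`, "i");

// Hypothetical helper: parse a model reply, tolerating a code-fence wrapper.
function parseJsonReply(reply) {
    const trimmed = reply.trim();
    const fenced = trimmed.match(fencePattern);
    const candidate = fenced ? fenced[1] : trimmed; // unwrap the fence if present
    try {
        return { ok: true, data: JSON.parse(candidate) };
    } catch (error) {
        // Return the raw text so the caller can log it or retry
        return { ok: false, error: error.message, raw: reply };
    }
}

// Invented replies standing in for real model output
const fencedReply = FENCE + "json\n" + '{"name": "John Doe", "email": "john@example.com"}' + "\n" + FENCE;
console.log(parseJsonReply(fencedReply).data.email);       // john@example.com
console.log(parseJsonReply("Sure! The name is John.").ok); // false
```

Returning an ok flag instead of throwing lets the caller decide whether to retry the request or surface the raw text.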

Step 5: Temperature + Top P (Advanced)

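There's another parameter that works alongside temperature: top_p (nucleus sampling). It limits sampling to the smallest set of most likely tokens whose combined probability reaches top_p:

// Temperature: how random the choice among candidate tokens is
// Top P: how much of the probability mass stays eligible (0.1 = top 10%)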

async function advancedSampling(prompt, temp, topP) {
    const response = await openai.chat.completions.create({
        model: "gpt-4",
        messages: [{ role: "user", content: prompt }],
        temperature: temp,
        top_p: topP  // 0.1 = very focused, 1.0 = consider all options
    });

    return response.choices[0].message.content;
}

// High temp + low top_p = creative but focused
console.log(await advancedSampling(
    "Name a fruit",
    1.5,   // high creativity
    0.1    // but only common fruits
));
// Likely a common fruit such as "apple"; a low top_p cuts off the unlikely options

// High temp + high top_p = creative and diverse
console.log(await advancedSampling(
    "Name a fruit",
    1.5,   // high creativity
    1.0    // all fruits considered
));
// Might give: "dragonfruit", "rambutan", "starfruit" (exotic)

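// RULE OF THUMB: OpenAI generally recommends adjusting temperature OR top_p, not both.
// The combined calls above are for demonstration only; in most cases, just use temperature.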
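
To make "how many options to consider" concrete, here is a toy sketch of nucleus filtering over an invented four-fruit distribution. It is not the API's internal code: nucleusFilter is a hypothetical function and the probabilities are made up. It keeps the most likely tokens until their combined probability reaches top_p, then renormalizes what is left.

```javascript
// Toy sketch of nucleus (top_p) filtering; the distribution is invented.
function nucleusFilter(distribution, topP) {
    // Sort candidates from most to least likely without mutating the input
    const sorted = [...distribution].sort((a, b) => b.prob - a.prob);
    const kept = [];
    let cumulative = 0;
    for (const candidate of sorted) {
        kept.push(candidate);
        cumulative += candidate.prob;
        if (cumulative >= topP) break; // stop once the kept mass reaches top_p
    }
    // Renormalize so the surviving probabilities sum to 1
    return kept.map(({ token, prob }) => ({ token, prob: prob / cumulative }));
}

const fruits = [
    { token: "apple", prob: 0.5 },
    { token: "banana", prob: 0.3 },
    { token: "orange", prob: 0.15 },
    { token: "rambutan", prob: 0.05 },
];

console.log(nucleusFilter(fruits, 0.1).map((c) => c.token)); // only "apple" survives
console.log(nucleusFilter(fruits, 0.9).map((c) => c.token)); // apple, banana, orange
console.log(nucleusFilter(fruits, 1.0).map((c) => c.token)); // all four fruits
```

With top_p at 0.1 only the single most likely token remains, so raising the temperature has little left to randomize. That interaction is one reason to tune just one of the two parameters.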

Now You Try

Exercise 1: Find the Breaking Point

Test a factual question ("What is the capital of France?") at temperatures from 0.0 to 2.0 in increments of 0.2. At what temperature does the answer become wrong or weird?
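
A small trap when building the sweep: repeatedly adding 0.2 in a loop accumulates floating-point error (0.2 + 0.2 + 0.2 is 0.6000000000000001 in JavaScript). The helper below is one way around it, offered as a starting point rather than the solution to the exercise. temperatureSteps is a hypothetical name; it multiplies by the step index and rounds to two decimals.

```javascript
// Hypothetical helper: evenly spaced temperatures without floating-point drift.
function temperatureSteps(start = 0, end = 2, step = 0.2) {
    const count = Math.round((end - start) / step);
    // Multiplying by the index and rounding avoids values like 0.6000000000000001
    return Array.from({ length: count + 1 }, (_, i) =>
        Number((start + i * step).toFixed(2))
    );
}

console.log(temperatureSteps().join(", "));
// 0, 0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2
```

You could then loop over the returned values and call askWithTemperature for each one.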

Exercise 2: Consistency Test

Ask the same question 10 times at temperature 0.0 and 10 times at temperature 1.5. Calculate what percentage of responses are identical in each case.

Exercise 3: Task Matcher

Build a function that takes a task description and automatically sets the appropriate temperature. Test it with various tasks.

Challenge Project

Build: Temperature Playground

Create an interactive web app that helps users understand temperature:

  • Input field for prompt
  • Slider for temperature (0.0 to 2.0)
  • Button to generate 5 responses at once
  • Display all responses side-by-side
  • Show consistency score (how similar the responses are)
  • Recommendations for optimal temperature based on prompt
  • Bonus: Visual heat map showing word choice probability
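
For the consistency score in the list above, one simple option is average word overlap between every pair of responses. The sketch below uses Jaccard similarity (shared words divided by total distinct words). The function names are hypothetical, and word overlap is only a rough proxy for similarity: it ignores word order and meaning.

```javascript
// Split text into a set of lowercase words (letters, digits, apostrophes)
function wordSet(text) {
    return new Set(text.toLowerCase().match(/[a-z0-9']+/g) ?? []);
}

// Jaccard similarity: shared words / all distinct words across both texts
function jaccard(a, b) {
    const setA = wordSet(a);
    const setB = wordSet(b);
    if (setA.size === 0 && setB.size === 0) return 1; // two empty texts count as identical
    let shared = 0;
    for (const word of setA) {
        if (setB.has(word)) shared += 1;
    }
    return shared / (setA.size + setB.size - shared);
}

// Hypothetical score: mean pairwise similarity, from 0 (no overlap) to 1 (same words)
function consistencyScore(responses) {
    if (responses.length < 2) return 1;
    let total = 0;
    let pairs = 0;
    for (let i = 0; i < responses.length; i++) {
        for (let j = i + 1; j < responses.length; j++) {
            total += jaccard(responses[i], responses[j]);
            pairs += 1;
        }
    }
    return total / pairs;
}

// Invented sample responses
console.log(consistencyScore(["The answer is 4", "The answer is 4"])); // 1
console.log(consistencyScore(["The answer is 4", "Four, obviously"])); // 0
```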

What You Learned

Key Insight: Temperature is like the difference between a script reader (low) and an improv actor (high). Choose based on whether you need reliability or creativity!