What You'll Build Today
You're building a ChatGPT-style interface where responses appear word by word in real time instead of all at once after a long wait. This makes your app feel fast, responsive, and professional.
The Problem
Without streaming, users stare at a loading spinner for 10-30 seconds while the AI generates a response. This feels slow and broken, even though the AI is working.
// Non-streaming (bad UX)
User: "Explain quantum computing"
[... waits 15 seconds seeing nothing ...]
AI: [entire 500-word response appears at once]
// User thinks: "Is this broken? Should I refresh?"
// Streaming (good UX)
User: "Explain quantum computing"
AI: "Quantum computing is a..." [keeps typing]
"...revolutionary technology that..." [keeps typing]
// User thinks: "It's working! I can start reading!"
Streaming makes your app feel far more responsive even though the total generation time is unchanged.
Let's Build It
Step 1: Basic Streaming with OpenAI
Enable streaming by setting stream: true and handling chunks:
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY
});
async function streamResponse(prompt) {
const stream = await openai.chat.completions.create({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
stream: true // Enable streaming!
});
// Process chunks as they arrive
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
process.stdout.write(content); // Print immediately
}
console.log(); // New line at end
}
// Try it
await streamResponse("Write a short story about a robot");
// You'll see: "Once upon a time..." appear word by word!
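What's inside each chunk? A small JSON object: the new text lives under choices[0].delta.content, and the final chunk carries a finish_reason instead of content. Roughly (fields abbreviated, values illustrative):

// Example shape of one streaming chunk (abbreviated; values are illustrative)
const exampleChunk = {
  id: "chatcmpl-abc123",
  object: "chat.completion.chunk",
  created: 1700000000,
  model: "gpt-4",
  choices: [{
    index: 0,
    delta: { content: "Once" },  // the new text; an empty {} on the final chunk
    finish_reason: null          // null until the last chunk, then "stop" or "length"
  }]
};

This is why the code reads chunk.choices[0]?.delta?.content with optional chaining: some chunks (like the final one) carry no content.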
Step 2: Streaming with Event Handlers
Create a cleaner interface with callbacks:
async function streamWithCallbacks(prompt, onChunk, onComplete) {
let fullResponse = '';
const stream = await openai.chat.completions.create({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
stream: true
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
if (content) {
fullResponse += content;
onChunk(content); // Call this for each chunk
}
}
onComplete(fullResponse); // Call this when done
}
// Use it
await streamWithCallbacks(
"Explain JavaScript promises",
(chunk) => {
process.stdout.write(chunk); // Stream to console
},
(full) => {
console.log('\n\nDone! Total length:', full.length);
}
);
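If callbacks feel clunky, the same logic wraps naturally in an async generator, letting callers consume chunks with for await instead. A sketch using the openai client from Step 1:

// Yields each text chunk as it arrives
async function* streamChunks(prompt) {
  const stream = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [{ role: "user", content: prompt }],
    stream: true
  });
  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || '';
    if (content) yield content;
  }
}

// Use it
let total = '';
for await (const text of streamChunks("Explain JavaScript closures")) {
  total += text;
  process.stdout.write(text);
}
console.log('\n\nDone! Total length:', total.length);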
Step 3: Streaming to a Web Interface
Create an Express endpoint that streams to the browser:
import express from 'express';
const app = express();
app.use(express.json());
app.post('/api/chat', async (req, res) => {
const { message } = req.body;
// Set headers for Server-Sent Events (SSE)
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
const stream = await openai.chat.completions.create({
model: "gpt-4",
messages: [{ role: "user", content: message }],
stream: true
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
if (content) {
// Send chunk to browser
res.write(`data: ${JSON.stringify({ content })}\n\n`);
}
}
res.write('data: [DONE]\n\n');
res.end();
});
app.listen(3000, () => {
console.log('Server running on http://localhost:3000');
});
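One gap worth closing: if the user closes the tab mid-stream, the loop above keeps pulling chunks, and you keep paying for tokens. Below is a sketch of the same endpoint that stops on disconnect. It assumes the v4 openai SDK, which exposes an AbortController on the stream as stream.controller; if your version differs, simply breaking out of the for await loop has the same effect.

app.post('/api/chat', async (req, res) => {
  const { message } = req.body;
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  const stream = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [{ role: "user", content: message }],
    stream: true
  });

  // Stop generating (and billing) when the browser disconnects
  req.on('close', () => stream.controller.abort());

  try {
    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content || '';
      if (content) res.write(`data: ${JSON.stringify({ content })}\n\n`);
    }
    res.write('data: [DONE]\n\n');
  } catch (err) {
    // An aborted stream throws; there's nothing left to send
  } finally {
    res.end();
  }
});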
Frontend code to receive the stream:
// client.js
async function sendMessage(message) {
const response = await fetch('/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ message })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = ''; // holds any partial SSE line between reads
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(value, { stream: true });
  // Network chunks don't align with SSE messages, so split on newlines
  // and carry any incomplete trailing line over to the next read
  const lines = buffer.split('\n');
  buffer = lines.pop();
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') return;
      const parsed = JSON.parse(data);
      // Append to UI
      document.getElementById('output').textContent += parsed.content;
    }
  }
}
}
// Use it
sendMessage("Tell me a joke");
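Why fetch and a reader instead of the browser's built-in EventSource, which was designed for SSE? EventSource can only make GET requests, and our endpoint expects a POST body. If you added a GET variant of the endpoint (a hypothetical /api/chat-sse that reads the message from the query string), the built-in API is noticeably simpler:

// Sketch: assumes a GET endpoint /api/chat-sse streaming the same SSE format
const source = new EventSource('/api/chat-sse?message=' + encodeURIComponent('Tell me a joke'));
source.onmessage = (event) => {
  if (event.data === '[DONE]') {
    source.close(); // otherwise the browser auto-reconnects
    return;
  }
  const { content } = JSON.parse(event.data);
  document.getElementById('output').textContent += content;
};
source.onerror = () => source.close();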
Step 4: Building a Complete Chat UI
Create a full streaming chat interface:
// chat.html
<!DOCTYPE html>
<html>
<head>
<style>
#messages {
height: 400px;
overflow-y: auto;
border: 1px solid #ccc;
padding: 10px;
margin-bottom: 10px;
}
.message {
margin: 10px 0;
padding: 8px;
border-radius: 4px;
}
.user { background: #e3f2fd; }
.assistant { background: #f5f5f5; }
.streaming { opacity: 0.8; }
</style>
</head>
<body>
<div id="messages"></div>
<input id="input" type="text" placeholder="Type a message...">
<button onclick="sendMessage()">Send</button>
<script>
async function sendMessage() {
const input = document.getElementById('input');
const message = input.value;
if (!message) return;
// Show user message
addMessage('user', message);
input.value = '';
// Create assistant message for streaming
const assistantMsg = addMessage('assistant', '', true);
// Stream response
const response = await fetch('/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ message })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let fullText = '';
let buffer = ''; // partial SSE line carried between reads
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep any incomplete line for the next read
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') {
        assistantMsg.classList.remove('streaming');
        return;
      }
      const parsed = JSON.parse(data);
      fullText += parsed.content;
      assistantMsg.textContent = fullText;
      // Auto-scroll
      assistantMsg.scrollIntoView({ behavior: 'smooth' });
    }
  }
}
// Stream ended without [DONE] (e.g. dropped connection) - stop the indicator
assistantMsg.classList.remove('streaming');
}
function addMessage(role, content, streaming = false) {
const messagesDiv = document.getElementById('messages');
const msgDiv = document.createElement('div');
msgDiv.className = `message ${role} ${streaming ? 'streaming' : ''}`;
msgDiv.textContent = content;
messagesDiv.appendChild(msgDiv);
return msgDiv;
}
// Send on Enter key
document.getElementById('input').addEventListener('keydown', (e) => {
if (e.key === 'Enter') sendMessage();
});
</script>
</body>
</html>
Step 5: Error Handling in Streams
Handle errors gracefully during streaming:
async function streamWithErrorHandling(prompt, onChunk, onComplete, onError) {
try {
const stream = await openai.chat.completions.create({
model: "gpt-4",
messages: [{ role: "user", content: prompt }],
stream: true
});
let fullResponse = '';
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
if (content) {
fullResponse += content;
onChunk(content);
}
// Check for finish reason
if (chunk.choices[0]?.finish_reason === 'length') {
onError(new Error('Response truncated - max tokens reached'));
return;
}
}
onComplete(fullResponse);
} catch (error) {
onError(error);
}
}
// Use it
await streamWithErrorHandling(
"Tell me about AI",
(chunk) => process.stdout.write(chunk),
(full) => console.log('\n✓ Complete'),
(error) => console.error('\n✗ Error:', error.message)
);
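One pattern worth layering on top: transient errors (rate limits, dropped connections) usually surface before the first chunk arrives, and those are safe to retry. Once text has been shown to the user, a retry would duplicate it. A sketch, again using the openai client from Step 1:

// Retry failures that happen before any text was emitted
async function streamWithRetry(prompt, onChunk, maxRetries = 2) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    let started = false;
    try {
      const stream = await openai.chat.completions.create({
        model: "gpt-4",
        messages: [{ role: "user", content: prompt }],
        stream: true
      });
      for await (const chunk of stream) {
        const content = chunk.choices[0]?.delta?.content || '';
        if (content) {
          started = true;
          onChunk(content);
        }
      }
      return; // finished cleanly
    } catch (error) {
      // If text already reached the user, retrying would duplicate it
      if (started || attempt === maxRetries) throw error;
      await new Promise(r => setTimeout(r, 1000 * (attempt + 1))); // simple backoff
    }
  }
}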
Now You Try
Exercise 1: Typing Speed Control
Add an artificial delay between chunks to simulate human typing speed. Make the speed configurable (slow, medium, fast).
Exercise 2: Streaming Stats
Display real-time statistics while streaming: tokens per second, total tokens, estimated time remaining.
Exercise 3: Cancel Streaming
Add a "Stop generating" button that cancels the stream mid-response. Clean up properly.
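A hint for this one: fetch accepts an AbortController signal, and calling abort() rejects the pending reader.read() with an AbortError. The client-side wiring looks roughly like this (stopButton is a hypothetical button element):

const controller = new AbortController();
stopButton.onclick = () => controller.abort(); // "Stop generating"

const response = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message }),
  signal: controller.signal // aborting rejects the in-flight read
});
// Wrap the read loop in try/catch and treat AbortError as a normal stop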
Challenge Project
Build: Multi-Chat Streaming Dashboard
Create an advanced chat interface with these features:
- Multiple chat threads (like ChatGPT sidebar)
- Streaming responses with typing indicators
- Save/load conversation history
- Copy code blocks with syntax highlighting
- Regenerate responses
- Export chat as markdown or PDF
- Show token usage per message
- Bonus: Stream to multiple models simultaneously and compare
What You Learned
- Streaming provides real-time feedback as responses generate
- Server-Sent Events (SSE) let the server push data to the browser over a single long-lived HTTP response
- Delta content is the chunk of new text in each stream update
- UX improvement - streaming makes apps feel faster even though total generation time is unchanged
- Error handling is critical for graceful failures mid-stream
- ReadableStream API handles streaming data in browsers