1. What We're Building
Every SaaS application needs customer support, and in 2026, users expect instant, intelligent answers — not a "we'll get back to you in 24 hours" email. An AI chatbot can resolve 60–80% of support tickets automatically, save you thousands in support costs, and keep your users happy at 3 AM.
In this guide, we'll build a production-ready AI chatbot for a Laravel 12 SaaS application. Not a toy demo — a real chatbot with:
- Real-time streaming — Tokens appear as they're generated, just like ChatGPT.
- Conversation memory — The bot remembers what you said earlier in the conversation.
- Function calling (tools) — The bot can look up orders, check subscription status, or query your database.
- File uploads & vision — Users can upload screenshots and the bot can analyze them.
- Human escalation — When the bot can't help, it hands off to a human agent smoothly.
- Usage limits — Prevent abuse and control your AI spend per user.
We'll use the Laravel AI SDK (which supports OpenAI, Anthropic Claude, and other providers) so you can swap models without changing your code.
2. Architecture Overview
Here's how the pieces fit together:
User (Browser)
│
├── Sends message via POST /chat/{conversation}/message
│
├── Receives streamed reply from POST /chat/{conversation}/stream (SSE over fetch)
│
└── Chat UI (Livewire component + Alpine.js)
Laravel Backend
│
├── ChatController → Handles messages, creates conversations
├── ChatStreamController → Streams AI responses via SSE
├── AiChatService → Orchestrates AI calls, tools, memory
├── Conversation model → Stores chat sessions
├── Message model → Stores individual messages
└── Tools (Functions) → OrderLookup, SubscriptionStatus, etc.
AI Provider (OpenAI / Anthropic)
└── GPT-4o / Claude 4 Sonnet via Laravel AI SDK
The key design decisions:
- SSE over WebSockets — Simpler to deploy (no Pusher/Soketi needed), works behind load balancers, and the chatbot only needs server-to-client streaming.
- Database-backed memory — Conversations persist across page refreshes and devices. Users can return to old conversations.
- Service class pattern — All AI logic lives in AiChatService, making it testable and swappable.
3. Database Schema & Models
Migrations
// database/migrations/create_conversations_table.php
Schema::create('conversations', function (Blueprint $table) {
$table->ulid('id')->primary();
$table->foreignId('user_id')->constrained()->cascadeOnDelete();
$table->foreignId('team_id')->nullable()->constrained()->nullOnDelete();
$table->string('title')->nullable(); // Auto-generated from first message
$table->string('status')->default('active'); // active, escalated, closed
$table->string('model')->default('gpt-4o'); // AI model used
$table->json('metadata')->nullable(); // Extra context, tags, etc.
$table->integer('total_tokens')->default(0); // Track usage
$table->timestamp('escalated_at')->nullable();
$table->timestamps();
$table->index(['user_id', 'status']);
$table->index(['team_id', 'created_at']);
});
// database/migrations/create_messages_table.php
Schema::create('messages', function (Blueprint $table) {
$table->ulid('id')->primary();
$table->foreignUlid('conversation_id')->constrained()->cascadeOnDelete();
$table->string('role'); // user, assistant, system, tool
$table->text('content');
$table->json('tool_calls')->nullable(); // Function calls made by AI
$table->json('tool_result')->nullable(); // Results returned to AI
$table->json('attachments')->nullable(); // File uploads
$table->integer('input_tokens')->default(0);
$table->integer('output_tokens')->default(0);
$table->timestamps();
$table->index(['conversation_id', 'created_at']);
});
Eloquent Models
// app/Models/Conversation.php
class Conversation extends Model
{
use HasUlids;
protected $fillable = [
'user_id', 'team_id', 'title', 'status',
'model', 'metadata', 'total_tokens', 'escalated_at',
];
protected function casts(): array
{
return [
'metadata' => 'array',
'escalated_at' => 'datetime',
];
}
public function user(): BelongsTo
{
return $this->belongsTo(User::class);
}
public function messages(): HasMany
{
return $this->hasMany(Message::class)->orderBy('created_at');
}
public function isEscalated(): bool
{
return $this->status === 'escalated';
}
}
// app/Models/Message.php
class Message extends Model
{
use HasUlids;
protected $fillable = [
'conversation_id', 'role', 'content',
'tool_calls', 'tool_result', 'attachments',
'input_tokens', 'output_tokens',
];
protected function casts(): array
{
return [
'tool_calls' => 'array',
'tool_result' => 'array',
'attachments' => 'array',
];
}
public function conversation(): BelongsTo
{
return $this->belongsTo(Conversation::class);
}
}
4. The AI Chat Service
The AiChatService is the brain of your chatbot. It takes a conversation, builds the message history, calls the AI provider, and handles tool execution.
// app/Services/AiChatService.php
namespace App\Services;
use App\Models\Conversation;
use App\Models\Message;
use Illuminate\Support\Facades\AI;
class AiChatService
{
protected array $tools = [];
public function __construct(
private readonly ChatToolRegistry $toolRegistry,
) {
$this->tools = $toolRegistry->all();
}
/**
* Send a message and get a complete (non-streamed) response.
*/
public function send(Conversation $conversation, string $userMessage): Message
{
// 1. Store the user's message
$conversation->messages()->create([
'role' => 'user',
'content' => $userMessage,
]);
// 2. Build the message array for the AI
$messages = $this->buildMessages($conversation);
// 3. Call the AI
$response = AI::using($conversation->model)
->withTools($this->tools)
->chat($messages);
// 4. Handle tool calls if any
if ($response->hasToolCalls()) {
return $this->handleToolCalls($conversation, $response);
}
// 5. Store and return the assistant's response
$assistantMessage = $conversation->messages()->create([
'role' => 'assistant',
'content' => $response->text,
'input_tokens' => $response->usage->inputTokens,
'output_tokens' => $response->usage->outputTokens,
]);
$conversation->increment('total_tokens',
$response->usage->inputTokens + $response->usage->outputTokens
);
return $assistantMessage;
}
/**
* Build the message array with system prompt + conversation history.
* Public so ChatStreamController can reuse it.
*/
public function buildMessages(Conversation $conversation): array
{
$messages = [
['role' => 'system', 'content' => $this->systemPrompt($conversation)],
];
foreach ($conversation->messages()->latest()->limit(50)->get()->reverse() as $msg) {
$messages[] = [
'role' => $msg->role,
'content' => $msg->content,
];
}
return $messages;
}
/**
* Generate a context-aware system prompt.
*/
protected function systemPrompt(Conversation $conversation): string
{
$user = $conversation->user;
$appName = config('app.name');
$plan = $user->subscription?->plan ?? 'Free';
return <<<PROMPT
You are a friendly, professional customer support assistant for {$appName}.
Customer context:
- Name: {$user->name}
- Email: {$user->email}
- Plan: {$plan}
- Member since: {$user->created_at->format('F Y')}
Guidelines:
- Be concise, friendly, and professional.
- If you can solve the problem using the available tools, do so.
- If you cannot help or the user is frustrated, offer to escalate to a human agent.
- Never make up information. If you don't know, say so.
- Do not share other users' data or internal system details.
PROMPT;
}
}
5. Real-Time Streaming with Server-Sent Events
Streaming makes your chatbot feel alive. Instead of waiting 3–5 seconds for a complete response, users see tokens appearing in real time — just like ChatGPT. We'll use Server-Sent Events (SSE), which is simpler than WebSockets and perfect for one-way streaming.
Backend: Stream Controller
// app/Http/Controllers/ChatStreamController.php
namespace App\Http\Controllers;
use App\Models\Conversation;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\AI;
use Symfony\Component\HttpFoundation\StreamedResponse;
class ChatStreamController extends Controller
{
public function __invoke(Request $request, Conversation $conversation): StreamedResponse
{
$this->authorize('view', $conversation);
$request->validate([
'message' => 'required|string|max:4000',
]);
// Store user message
$conversation->messages()->create([
'role' => 'user',
'content' => $request->message,
]);
$messages = app(AiChatService::class)->buildMessages($conversation);
return response()->stream(function () use ($conversation, $messages) {
// Disable any nested output buffering so chunks flush immediately
while (ob_get_level()) {
ob_end_clean();
}
$fullResponse = '';
$inputTokens = 0;
$outputTokens = 0;
$stream = AI::using($conversation->model)
->stream($messages);
foreach ($stream as $chunk) {
if ($chunk->text) {
$fullResponse .= $chunk->text;
echo "data: " . json_encode([
'type' => 'chunk',
'text' => $chunk->text,
]) . "\n\n";
if (ob_get_level()) ob_flush();
flush();
}
if ($chunk->usage) {
$inputTokens = $chunk->usage->inputTokens;
$outputTokens = $chunk->usage->outputTokens;
}
}
// Store the complete response
$conversation->messages()->create([
'role' => 'assistant',
'content' => $fullResponse,
'input_tokens' => $inputTokens,
'output_tokens' => $outputTokens,
]);
$conversation->increment('total_tokens', $inputTokens + $outputTokens);
// Send the done event
echo "data: " . json_encode(['type' => 'done']) . "\n\n";
if (ob_get_level()) ob_flush();
flush();
}, 200, [
'Content-Type' => 'text/event-stream',
'Cache-Control' => 'no-cache',
'Connection' => 'keep-alive',
'X-Accel-Buffering' => 'no', // Disable Nginx buffering
]);
}
}
Routes
// routes/web.php
Route::middleware('auth')->group(function () {
Route::get('/chat', [ChatController::class, 'index']); // List conversations
Route::post('/chat', [ChatController::class, 'store']); // Create conversation
Route::get('/chat/{conversation}', [ChatController::class, 'show']); // View conversation
Route::post('/chat/{conversation}/message', [ChatController::class, 'message']); // Non-streamed reply
Route::post('/chat/{conversation}/stream', ChatStreamController::class); // Stream response (SSE)
});
Frontend: Consuming the Stream
// resources/js/chat-stream.js
async function sendMessage(conversationId, message) {
const response = await fetch(`/chat/${conversationId}/stream`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'X-CSRF-TOKEN': document.querySelector('meta[name="csrf-token"]').content,
},
body: JSON.stringify({ message }),
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let assistantMessage = '';
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
// A network chunk can split an SSE frame mid-line, so buffer partial lines
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop(); // keep the (possibly incomplete) last line for the next read
for (const line of lines) {
if (!line.startsWith('data: ')) continue;
const data = JSON.parse(line.slice(6));
if (data.type === 'chunk') {
assistantMessage += data.text;
updateChatUI(assistantMessage); // Render with markdown
}
if (data.type === 'done') {
finalizeMessage(assistantMessage);
}
}
}
}
Important: Add X-Accel-Buffering: no to disable Nginx's response buffering, otherwise your stream will arrive in large chunks instead of token by token. Note that this header is Nginx-specific; Apache doesn't honor it. If you serve through Apache, buffering usually comes from mod_deflate or mod_proxy settings, so exclude text/event-stream responses from compression and proxy buffering instead.
6. Conversation Memory & Context Window Management
AI models have a limited context window (the maximum number of tokens they can process at once). GPT-4o supports 128K tokens, Claude supports 200K. In a long conversation, you'll eventually hit this limit. Here's how to handle it gracefully.
Strategy 1: Sliding Window (Simple)
// Only send the last N messages to the AI
protected function buildMessages(Conversation $conversation, int $limit = 30): array
{
$messages = [
['role' => 'system', 'content' => $this->systemPrompt($conversation)],
];
// Get the most recent messages, then reverse to chronological order
$recentMessages = $conversation->messages()
->whereIn('role', ['user', 'assistant'])
->latest()
->limit($limit)
->get()
->reverse();
foreach ($recentMessages as $msg) {
$messages[] = [
'role' => $msg->role,
'content' => $msg->content,
];
}
return $messages;
}
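A fixed message count is a blunt instrument: thirty short messages and thirty long ones consume very different token amounts. A token-budget variant is sketched below in plain JavaScript (the same logic ports directly to the PHP service); it uses the rough heuristic of about 4 characters per token, where a real tokenizer would give exact counts:

```javascript
// Rough token estimate: ~4 characters per token (an approximation, not a tokenizer)
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep the newest messages that fit in the budget, returned in chronological order
function trimToBudget(messages, budgetTokens) {
  const kept = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (used + cost > budgetTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```

Reserve part of the model's context window for the system prompt and the expected reply, and hand the remainder to trimToBudget.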
Strategy 2: Summarization (Smart)
For long conversations, summarize older messages and keep only the summary plus recent messages:
protected function buildMessagesWithSummary(Conversation $conversation): array
{
$allMessages = $conversation->messages()
->whereIn('role', ['user', 'assistant'])
->orderBy('created_at')
->get();
// If conversation is short, send everything
if ($allMessages->count() <= 20) {
return $this->formatMessages($allMessages, $conversation);
}
// Split: older messages get summarized, recent ones stay intact
$olderMessages = $allMessages->slice(0, -10);
$recentMessages = $allMessages->slice(-10);
// Generate or retrieve cached summary
$summary = $this->getOrCreateSummary($conversation, $olderMessages);
$messages = [
['role' => 'system', 'content' => $this->systemPrompt($conversation)],
['role' => 'system', 'content' => "Summary of earlier conversation:\n{$summary}"],
];
foreach ($recentMessages as $msg) {
$messages[] = ['role' => $msg->role, 'content' => $msg->content];
}
return $messages;
}
protected function getOrCreateSummary(Conversation $conversation, $messages): string
{
$cacheKey = "conversation:{$conversation->id}:summary:{$messages->last()->id}";
return Cache::remember($cacheKey, now()->addDay(), function () use ($messages) {
$transcript = $messages->map(fn ($m) => "{$m->role}: {$m->content}")->join("\n");
$response = AI::using('gpt-4o-mini') // Use a cheaper model for summarization
->chat([
['role' => 'system', 'content' => 'Summarize this conversation in 2-3 paragraphs. Preserve key details, decisions, and any unresolved issues.'],
['role' => 'user', 'content' => $transcript],
]);
return $response->text;
});
}
The summarization approach costs a few extra cents per long conversation but prevents context window errors and keeps the bot's responses relevant even in 100+ message threads.
7. Function Calling: Give Your Bot Superpowers
Without tools, your chatbot can only answer from its training data. With function calling (also called "tools"), it can query your database, look up orders, check subscription status, create tickets, and more — all within the conversation flow.
Define Your Tools
// app/Services/ChatTools/OrderLookupTool.php
namespace App\Services\ChatTools;
use App\Models\Order;
class OrderLookupTool
{
public static function definition(): array
{
return [
'name' => 'lookup_order',
'description' => 'Look up one of the current user\'s orders by order number. Returns order status, items, and tracking information. If no order number is given, the most recent order is returned.',
'parameters' => [
'type' => 'object',
'properties' => [
'order_number' => [
'type' => 'string',
'description' => 'The order number (e.g., ORD-12345)',
],
],
'required' => [],
],
];
}
public static function execute(array $params, int $userId): array
{
// Always scope to the current user so the tool can never leak other users' orders
$query = Order::where('user_id', $userId);
if (isset($params['order_number'])) {
$query->where('order_number', $params['order_number']);
}
$order = $query->latest()->first();
if (! $order) {
return ['error' => 'Order not found. Please check the order number and try again.'];
}
return [
'order_number' => $order->order_number,
'status' => $order->status,
'total' => '$' . number_format($order->total / 100, 2),
'items' => $order->items->map(fn ($i) => $i->name)->toArray(),
'tracking_url' => $order->tracking_url,
'created_at' => $order->created_at->format('M j, Y'),
];
}
}
// app/Services/ChatTools/SubscriptionStatusTool.php
class SubscriptionStatusTool
{
public static function definition(): array
{
return [
'name' => 'check_subscription',
'description' => 'Check the current user\'s subscription plan, billing date, and usage.',
'parameters' => [
'type' => 'object',
'properties' => [],
],
];
}
public static function execute(array $params, int $userId): array
{
$user = User::find($userId);
$subscription = $user->subscription;
if (! $subscription) {
return ['plan' => 'Free', 'message' => 'No active subscription.'];
}
return [
'plan' => $subscription->plan,
'status' => $subscription->active() ? 'Active' : 'Inactive',
'next_billing_date' => $subscription->next_billing_at?->format('M j, Y'),
'cancel_at_period_end' => $subscription->cancel_at_period_end,
];
}
}
Tool Registry & Execution
// app/Services/ChatToolRegistry.php
class ChatToolRegistry
{
protected array $tools = [
'lookup_order' => OrderLookupTool::class,
'check_subscription' => SubscriptionStatusTool::class,
'create_support_ticket' => CreateTicketTool::class,
'search_help_articles' => SearchHelpArticlesTool::class,
];
public function all(): array
{
return array_map(fn ($class) => $class::definition(), $this->tools);
}
public function execute(string $name, array $params, int $userId): array
{
if (! isset($this->tools[$name])) {
return ['error' => "Unknown tool: {$name}"];
}
return $this->tools[$name]::execute($params, $userId);
}
}
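One gap in the registry above: it trusts whatever arguments the model supplies, and models occasionally invent or mistype parameters. Checking the arguments against the tool's declared parameters block before executing is cheap insurance. A hand-rolled sketch in JavaScript (a real JSON Schema validator is the sturdier choice; validateArgs is a hypothetical helper, not part of any SDK):

```javascript
// Check model-supplied arguments against a tool definition's `parameters`
// block. Returns a list of human-readable problems (empty means valid).
function validateArgs(definition, args) {
  const props = definition.parameters.properties || {};
  const required = definition.parameters.required || [];
  const errors = [];

  for (const name of required) {
    if (!(name in args)) errors.push(`missing required parameter: ${name}`);
  }
  for (const [name, value] of Object.entries(args)) {
    if (!(name in props)) {
      errors.push(`unknown parameter: ${name}`);
    } else if (props[name].type === 'string' && typeof value !== 'string') {
      errors.push(`parameter ${name} must be a string`);
    }
  }
  return errors;
}
```

When validation fails, return the errors as the tool result instead of throwing; the model usually corrects itself on the next round.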
Handle Tool Calls in the AI Service
// In AiChatService.php
protected function handleToolCalls(Conversation $conversation, $response): Message
{
// Store the assistant's tool call message
$conversation->messages()->create([
'role' => 'assistant',
'content' => $response->text ?? '',
'tool_calls' => $response->toolCalls,
]);
// Execute each tool and store results
foreach ($response->toolCalls as $toolCall) {
$result = $this->toolRegistry->execute(
$toolCall['name'],
$toolCall['arguments'],
$conversation->user_id
);
$conversation->messages()->create([
'role' => 'tool',
'content' => json_encode($result),
'tool_result' => [
'tool_call_id' => $toolCall['id'],
'name' => $toolCall['name'],
'result' => $result,
],
]);
}
// Send the tool results back to the AI for a natural language response
$messages = $this->buildMessages($conversation);
$followUp = AI::using($conversation->model)->chat($messages);
return $conversation->messages()->create([
'role' => 'assistant',
'content' => $followUp->text,
'input_tokens' => $followUp->usage->inputTokens,
'output_tokens' => $followUp->usage->outputTokens,
]);
}
Now when a user asks "What's the status of my order ORD-12345?", the AI will automatically call the lookup_order tool, get the real data from your database, and reply with a natural language answer containing the actual order status.
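Note that handleToolCalls makes exactly one follow-up call, which covers the common case but not chained tools (look up the order, then check the subscription, then answer). The general shape is a loop that keeps feeding results back until the model replies in plain text, with a round cap as a safety valve. A sketch in JavaScript, where callModel and runTool are hypothetical stand-ins for the SDK call and the tool registry:

```javascript
// Generic tool-call loop: execute tools and feed results back until the
// model answers in plain text, or the round cap is hit.
async function chatWithTools(callModel, runTool, messages, maxRounds = 5) {
  for (let round = 0; round < maxRounds; round++) {
    const response = await callModel(messages);

    // No tool calls means the model produced its final answer
    if (!response.toolCalls || response.toolCalls.length === 0) {
      return response.text;
    }

    // Record the assistant turn, then append one tool-result message per call
    messages.push({ role: 'assistant', content: response.text ?? '', toolCalls: response.toolCalls });
    for (const call of response.toolCalls) {
      const result = await runTool(call.name, call.arguments);
      messages.push({ role: 'tool', content: JSON.stringify(result) });
    }
  }
  throw new Error('tool-call loop exceeded maxRounds');
}
```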
8. File Uploads & Vision
Letting users upload screenshots, documents, or images makes your chatbot dramatically more helpful. "Here's a screenshot of the error" is worth a thousand words.
// app/Http/Controllers/ChatAttachmentController.php
class ChatAttachmentController extends Controller
{
public function store(Request $request, Conversation $conversation)
{
$this->authorize('update', $conversation);
$request->validate([
'file' => ['required', 'file', 'max:10240', 'mimes:jpg,jpeg,png,webp,gif,pdf'],
'message' => ['nullable', 'string', 'max:4000'],
]);
$path = $request->file('file')->store(
"chat-attachments/{$conversation->id}",
's3'
);
$userMessage = $request->message ?? 'Please analyze this file.';
// For images, use the vision capability
if (str_starts_with($request->file('file')->getMimeType(), 'image/')) {
$imageUrl = Storage::disk('s3')->temporaryUrl($path, now()->addHour());
$conversation->messages()->create([
'role' => 'user',
'content' => $userMessage,
'attachments' => [['type' => 'image', 'path' => $path, 'url' => $imageUrl]],
]);
// Call AI with vision
$response = AI::using('gpt-4o')
->chat([
...app(AiChatService::class)->buildMessages($conversation),
[
'role' => 'user',
'content' => [
['type' => 'text', 'text' => $userMessage],
['type' => 'image_url', 'image_url' => ['url' => $imageUrl]],
],
],
]);
$conversation->messages()->create([
'role' => 'assistant',
'content' => $response->text,
]);
return response()->json(['message' => $response->text]);
}
// For PDFs/documents, extract text first
// ... handle document processing
}
}
9. Human Escalation & Handoff
No chatbot can handle everything. When the bot is stuck, the user is frustrated, or the issue requires human judgement, you need a smooth escalation path. Here's how to build it.
Add an Escalation Tool
// app/Services/ChatTools/EscalateTool.php
class EscalateTool
{
public static function definition(): array
{
return [
'name' => 'escalate_to_human',
'description' => 'Escalate the conversation to a human support agent. Use this when: (1) you cannot resolve the issue, (2) the user explicitly asks for a human, (3) the issue involves billing disputes or account deletion, or (4) the user seems frustrated after 2+ failed attempts.',
'parameters' => [
'type' => 'object',
'properties' => [
'reason' => [
'type' => 'string',
'description' => 'Brief summary of why this needs human attention',
],
'priority' => [
'type' => 'string',
'enum' => ['low', 'medium', 'high'],
'description' => 'Priority level based on urgency',
],
],
'required' => ['reason', 'priority'],
],
];
}
public static function execute(array $params, int $userId): array
{
$conversation = Conversation::where('user_id', $userId)
->where('status', 'active')
->latest()
->first();
if (! $conversation) {
return ['error' => 'No active conversation found to escalate.'];
}
$conversation->update([
'status' => 'escalated',
'escalated_at' => now(),
'metadata' => array_merge($conversation->metadata ?? [], [
'escalation_reason' => $params['reason'],
'escalation_priority' => $params['priority'],
]),
]);
// Notify support team
$admins = User::role('support-agent')->get();
Notification::send($admins, new ConversationEscalated($conversation, $params));
return [
'success' => true,
'message' => 'A human agent has been notified and will join this conversation shortly.',
'estimated_wait' => '< 2 hours during business hours',
];
}
}
Auto-Detect Frustration
You can also detect frustration automatically and proactively offer escalation:
// Add to system prompt
$frustrationRules = <<<'RULES'
ESCALATION RULES:
- If the user uses ALL CAPS, profanity, or says phrases like "this is ridiculous",
"nothing works", "let me talk to a person" — immediately offer to escalate.
- If you've given 3+ responses and the user's issue isn't resolved, proactively say:
"I want to make sure you get the help you need. Would you like me to connect you
with a human agent?"
- For billing disputes, refund requests, or account deletions, always escalate.
Do not process these yourself.
RULES;
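Prompt rules depend on the model noticing; a cheap server-side check can back them up by flagging the conversation or injecting an escalation hint before the model is even called. A naive heuristic sketched in JavaScript; the phrase list and the 80% upper-case threshold are arbitrary starting points, not tuned values:

```javascript
// Naive frustration detector: phrase matches or mostly upper-case messages
const FRUSTRATION_PHRASES = [
  'this is ridiculous', 'nothing works', 'talk to a person',
  'real person', 'human agent', 'useless',
];

function looksFrustrated(message) {
  const lower = message.toLowerCase();
  if (FRUSTRATION_PHRASES.some((p) => lower.includes(p))) return true;

  // Treat mostly upper-case messages of reasonable length as shouting
  const letters = message.replace(/[^a-zA-Z]/g, '');
  if (letters.length >= 10) {
    const upper = letters.replace(/[^A-Z]/g, '').length;
    if (upper / letters.length > 0.8) return true;
  }
  return false;
}
```

Heuristics like this generate false positives, so use them to offer escalation, never to force it.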
10. The Chat UI: Livewire + Alpine.js
A great chatbot needs a great UI. Here's a clean, responsive chat interface built with Livewire for the component structure and Alpine.js for the real-time streaming behavior.
<!-- resources/views/livewire/chat-widget.blade.php -->
<div
x-data="chatWidget('{{ $conversation->id }}')"
class="flex flex-col h-[600px] bg-gray-900 rounded-xl border border-gray-700 overflow-hidden"
>
<!-- Header -->
<div class="flex items-center justify-between px-4 py-3 bg-gray-800 border-b border-gray-700">
<div class="flex items-center gap-2">
<div class="w-2 h-2 bg-green-400 rounded-full animate-pulse"></div>
<span class="text-sm font-medium text-white">AI Support</span>
</div>
<button @click="$dispatch('close-chat')" class="text-gray-400 hover:text-white">
×
</button>
</div>
<!-- Messages -->
<div
x-ref="messages"
class="flex-1 overflow-y-auto p-4 space-y-4"
>
<template x-for="msg in messages" :key="msg.id">
<div :class="msg.role === 'user' ? 'flex justify-end' : 'flex justify-start'">
<div
:class="msg.role === 'user'
? 'bg-green-600 text-white rounded-2xl rounded-br-md'
: 'bg-gray-800 text-gray-200 rounded-2xl rounded-bl-md'"
class="max-w-[80%] px-4 py-2.5 text-sm leading-relaxed"
x-html="renderMarkdown(msg.content)"
></div>
</div>
</template>
<!-- Typing indicator -->
<div x-show="isStreaming" class="flex justify-start">
<div class="bg-gray-800 rounded-2xl rounded-bl-md px-4 py-2.5">
<div class="flex gap-1">
<span class="w-2 h-2 bg-gray-500 rounded-full animate-bounce"></span>
<span class="w-2 h-2 bg-gray-500 rounded-full animate-bounce" style="animation-delay:0.1s"></span>
<span class="w-2 h-2 bg-gray-500 rounded-full animate-bounce" style="animation-delay:0.2s"></span>
</div>
</div>
</div>
</div>
<!-- Input -->
<form @submit.prevent="send" class="p-4 border-t border-gray-700">
<div class="flex gap-2">
<input
x-model="input"
type="text"
placeholder="Type your message..."
:disabled="isStreaming"
class="flex-1 px-4 py-2.5 bg-gray-800 border border-gray-600 rounded-lg text-white
placeholder-gray-500 focus:outline-none focus:border-green-500 text-sm"
>
<button
type="submit"
:disabled="isStreaming || !input.trim()"
class="px-4 py-2.5 bg-green-500 hover:bg-green-400 disabled:opacity-50
text-black font-medium rounded-lg text-sm transition-colors"
>
Send
</button>
</div>
</form>
</div>
<script>
function chatWidget(conversationId) {
return {
messages: @json($messages),
input: '',
isStreaming: false,
async send() {
if (!this.input.trim() || this.isStreaming) return;
const userMessage = this.input.trim();
this.input = '';
// Add user message to UI immediately
this.messages.push({
id: Date.now(),
role: 'user',
content: userMessage,
});
this.scrollToBottom();
this.isStreaming = true;
// Add placeholder for assistant response
const assistantMsg = {
id: Date.now() + 1,
role: 'assistant',
content: '',
};
this.messages.push(assistantMsg);
try {
const response = await fetch(`/chat/${conversationId}/stream`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'X-CSRF-TOKEN': document.querySelector('meta[name="csrf-token"]').content,
},
body: JSON.stringify({ message: userMessage }),
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
// Buffer partial lines: a network chunk can split an SSE frame mid-line
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop();
for (const line of lines) {
if (!line.startsWith('data: ')) continue;
const data = JSON.parse(line.slice(6));
if (data.type === 'chunk') {
// Mutate through the reactive array (not the raw object) so Alpine re-renders
this.messages[this.messages.length - 1].content += data.text;
this.scrollToBottom();
}
}
}
} catch (error) {
assistantMsg.content = 'Sorry, something went wrong. Please try again.';
}
this.isStreaming = false;
},
scrollToBottom() {
this.$nextTick(() => {
this.$refs.messages.scrollTop = this.$refs.messages.scrollHeight;
});
},
renderMarkdown(text) {
// Basic markdown: bold, code, links
return text
.replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>')
.replace(/`(.*?)`/g, '<code class="bg-gray-700 px-1 rounded text-xs">$1</code>')
.replace(/\n/g, '<br>');
},
};
}
</script>
This gives you a clean chat bubble UI with real-time token streaming, a typing indicator, markdown rendering, and automatic scrolling. For production, swap the basic renderMarkdown with a proper library like marked or markdown-it.
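One caution before shipping that snippet: x-html renders raw HTML, so piping user or model text through those regexes is an XSS vector. At minimum, escape HTML entities before applying the markdown replacements. A sketch of the safer ordering (same regexes as above, escaping added first):

```javascript
// Escape HTML entities so message text can never execute as markup
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Escape first, then apply the markdown regexes on the safe string
function renderMarkdownSafe(text) {
  return escapeHtml(text)
    .replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>')
    .replace(/`(.*?)`/g, '<code>$1</code>')
    .replace(/\n/g, '<br>');
}
```

For production, marked or markdown-it plus a sanitizer like DOMPurify remains the sturdier combination.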
11. Usage Limits & Cost Control
AI API calls cost real money. Without usage limits, one abusive user can rack up hundreds of dollars in API costs. Here's how to prevent that.
// app/Http/Middleware/EnforceChatLimits.php
class EnforceChatLimits
{
public function handle(Request $request, Closure $next): Response
{
$user = $request->user();
$plan = $user->subscription?->plan ?? 'free';
$limits = [
'free' => ['messages_per_day' => 10, 'max_tokens' => 50_000],
'starter' => ['messages_per_day' => 100, 'max_tokens' => 500_000],
'pro' => ['messages_per_day' => 500, 'max_tokens' => 2_000_000],
'team' => ['messages_per_day' => 2000, 'max_tokens' => 10_000_000],
];
$planLimits = $limits[$plan] ?? $limits['free'];
// Check daily message count
$todayMessages = Message::whereHas('conversation', fn ($q) =>
$q->where('user_id', $user->id)
)->where('role', 'user')
->whereDate('created_at', today())
->count();
if ($todayMessages >= $planLimits['messages_per_day']) {
return response()->json([
'error' => 'Daily message limit reached.',
'limit' => $planLimits['messages_per_day'],
'upgrade_url' => route('billing'),
], 429);
}
// Check monthly token usage
$monthTokens = Conversation::where('user_id', $user->id)
->whereMonth('created_at', now()->month)
->sum('total_tokens');
if ($monthTokens >= $planLimits['max_tokens']) {
return response()->json([
'error' => 'Monthly token limit reached.',
'upgrade_url' => route('billing'),
], 429);
}
return $next($request);
}
}
// Apply to chat routes
Route::middleware(['auth', EnforceChatLimits::class])->group(function () {
Route::post('/chat/{conversation}/stream', ChatStreamController::class);
});
Pro tip: Show users their remaining message count in the chat UI. This reduces frustration when they hit the limit and encourages upgrades.
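To implement that tip, the backend can expose used/limit counts (for example, on the conversation payload) and the widget turns them into a footer label. A small sketch; the exact wording and thresholds are placeholders:

```javascript
// Turn used/limit counts into a friendly quota label for the chat footer
function quotaLabel(used, limit) {
  const remaining = Math.max(0, limit - used);
  if (remaining === 0) return 'Daily limit reached. Upgrade for more messages.';
  if (remaining <= 3) return `${remaining} message${remaining === 1 ? '' : 's'} left today`;
  return `${remaining} of ${limit} messages left today`;
}
```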
12. Testing Your Chatbot
You don't want to call the real OpenAI API in your test suite. The Laravel AI SDK provides fakes for exactly this purpose:
use Illuminate\Support\Facades\AI;
// tests/Feature/ChatbotTest.php
it('responds to a user message', function () {
AI::fake([
'Hello! I can help you with your subscription. What would you like to know?',
]);
$user = User::factory()->create();
$conversation = Conversation::factory()->for($user)->create();
$this->actingAs($user)
->post("/chat/{$conversation->id}/message", [
'message' => 'How do I upgrade my plan?',
])
->assertOk();
expect($conversation->messages)->toHaveCount(2); // user + assistant
expect($conversation->messages->last()->role)->toBe('assistant');
AI::assertSent(function (array $messages) {
return str_contains($messages[1]['content'], 'upgrade my plan');
});
});
it('enforces daily message limits', function () {
$user = User::factory()->create(); // Free plan
$conversation = Conversation::factory()->for($user)->create();
// Create 10 messages (daily limit for free plan)
Message::factory(10)->for($conversation)->create([
'role' => 'user',
'created_at' => today(),
]);
$this->actingAs($user)
->postJson("/chat/{$conversation->id}/stream", [
'message' => 'One more message',
])
->assertStatus(429)
->assertJson(['error' => 'Daily message limit reached.']);
});
it('escalates when user requests human agent', function () {
AI::fake([
AI::response(toolCalls: [
['name' => 'escalate_to_human', 'arguments' => ['reason' => 'User requested', 'priority' => 'medium']],
]),
'I\'ve connected you with a human agent. They\'ll be with you shortly.',
]);
Notification::fake();
$user = User::factory()->create();
$conversation = Conversation::factory()->for($user)->create();
$this->actingAs($user)->post("/chat/{$conversation->id}/message", [
'message' => 'Let me talk to a real person',
]);
expect($conversation->fresh()->status)->toBe('escalated');
Notification::assertSentTo(User::role('support-agent')->get(), ConversationEscalated::class);
});
13. How Much Does It Cost to Run?
Here's a realistic cost breakdown for running an AI chatbot on your SaaS:
- GPT-4o — $2.50/1M input tokens, $10/1M output tokens
- GPT-4o-mini — $0.15/1M input tokens, $0.60/1M output tokens (about 17x cheaper)
- Claude 4 Sonnet — $3/1M input tokens, $15/1M output tokens
A typical customer support conversation is about 800 input tokens (system prompt + history + user message) and 300 output tokens (bot reply). That means:
- Per conversation (GPT-4o): ~$0.005 (half a cent)
- 1,000 conversations/day: ~$150/month
- Per conversation (GPT-4o-mini): ~$0.0003 (a fraction of a cent)
- 1,000 conversations/day (mini): ~$9/month
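Those numbers fall out of straightforward arithmetic you can rerun for your own traffic and whatever the prices are when you read this:

```javascript
// Monthly cost from per-million-token prices and rough traffic estimates
function monthlyCost({ inputTokens, outputTokens, conversationsPerDay, inputPricePerM, outputPricePerM }) {
  const perConversation =
    (inputTokens / 1_000_000) * inputPricePerM +
    (outputTokens / 1_000_000) * outputPricePerM;
  return perConversation * conversationsPerDay * 30;
}

// GPT-4o at the prices listed above: 800 in + 300 out, 1,000 conversations/day
const gpt4o = monthlyCost({
  inputTokens: 800, outputTokens: 300, conversationsPerDay: 1000,
  inputPricePerM: 2.5, outputPricePerM: 10,
});
// → 150 (dollars per month)
```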
Cost Optimization Strategies
- Route simple queries to cheaper models. Use GPT-4o-mini for FAQ-style questions and GPT-4o only for complex issues. You can add a classifier that picks the model based on query complexity.
- Keep system prompts short. Every token in your system prompt is sent with every message. A 500-token system prompt across 1,000 daily conversations burns 500K input tokens a day, about $1.25/day ($37.50/month) on GPT-4o. Trim the fat.
- Summarize old conversations instead of sending the full history. Use GPT-4o-mini for summarization (see section 6).
- Cache common responses. If 20% of your queries are "How do I reset my password?", cache the response and skip the API call entirely.
- Set hard limits per plan (see section 11). Free users get 10 messages/day, paid users get more.
14. Conclusion & Next Steps
You now have the complete blueprint for building a production-ready AI chatbot in Laravel 12. Let's recap what we covered:
- Database schema for conversations and messages with ULIDs and token tracking.
- AI Chat Service with context-aware system prompts and tool execution.
- Real-time streaming via Server-Sent Events for a ChatGPT-like experience.
- Conversation memory with sliding window and summarization strategies.
- Function calling to let the bot query orders, subscriptions, and more.
- File uploads & vision for screenshot analysis.
- Human escalation with auto-detection and notification.
- Chat UI with Livewire and Alpine.js.
- Usage limits per plan to control costs.
- Testing with AI fakes.
To take it further, consider adding:
- RAG (Retrieval-Augmented Generation) — Let the bot search your docs and knowledge base. See our RAG in Laravel guide.
- Multi-language support — GPT-4o and Claude handle 50+ languages natively. Just set the system prompt's language preference based on the user's locale.
- Analytics dashboard — Track resolution rate, average conversation length, escalation rate, and common topics.
- Feedback collection — Add thumbs up/down to bot responses to improve your system prompt over time.
LaraSpeed includes a fully working AI chatbot out of the box. Streaming, memory, tools, escalation, usage limits, and the chat UI are all pre-built and ready to customize. Connect your API key, set your system prompt, and you have a customer support bot in minutes, not weeks.
Skip the boilerplate. Ship your AI chatbot today.
LaraSpeed gives you a production-ready Laravel SaaS with AI chatbot, authentication, billing, teams, admin panel, and deployment configs — everything wired together so you can focus on your product.
Get LaraSpeed — Starting at $49