Agent Configuration

Overview

An Agent is the core entity in Vaani: your AI-powered virtual phone assistant. Each agent is configured with:

A system prompt that defines personality and behavior
LLM, STT, and TTS providers for language processing and voice
Custom functions (webhook-based tools the agent can invoke)
Call behavior settings (max duration, silence handling, transfer config)

Creating an Agent

Navigate to Agents

In the dashboard sidebar, click Agents → Create Agent.

Configure Identity

Setting	Description
Agent Name	Display name shown in dashboard and logs
Phone Number	Unique number assigned to this agent (auto-assigned or selected from provisioned numbers)
Agent Type	Category label for organizational purposes

Choose AI Providers

Select from the available provider combinations:

Language Model (LLM):

Provider	Models	Best For
OpenAI	`gpt-4o`, `gpt-4o-mini`, `gpt-4o-realtime`	Accuracy, tool calling, realtime voice
Groq	`llama-3`, `mixtral`	Speed and low latency

Speech-to-Text (STT):

Provider	Models	Best For
Deepgram	`nova-3`	English, multilingual, keyword boosting
Sarvam	Various	Indian languages (Hindi, Tamil, etc.)
Cartesia	Various	General purpose

Text-to-Speech (TTS):

Provider	Best For
OpenAI	Cost-effective, good quality
ElevenLabs	Most natural, premium voices
Deepgram	Lowest latency
Rime	Customizable voices
Cartesia	General purpose
Sarvam	Indian languages
Murf	Studio-quality
Inworld	Gaming/virtual character voices

Write Your System Prompt

The system prompt defines your agent’s personality, knowledge, and rules:

You are Sarah, a friendly appointments coordinator for Sunrise Dental Clinic.

Your responsibilities:
- Answer questions about our services (cleanings, fillings, root canals)
- Schedule, reschedule, and cancel appointments
- Collect patient information (name, phone, insurance)

Rules:
- Always be warm and professional
- If you can't answer a question, offer to transfer to a human agent
- Never provide medical advice
- Always confirm the appointment details before ending the call

Available appointment slots: Monday-Friday, 9 AM - 5 PM

Use double curly braces for dynamic variables that change per call: {{customer_name}}, {{appointment_date}}
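
To picture what per-call substitution does, here is a minimal Python sketch. Vaani performs the substitution itself; the function name `render_prompt` and the choice to leave unknown variables untouched are assumptions made for illustration, not documented platform behavior.

```python
import re

# Matches {{name}}, allowing optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, variables: dict) -> str:
    """Replace each {{name}} with its per-call value.

    Unknown names are left as-is (an assumption for this sketch).
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return PLACEHOLDER.sub(substitute, template)


template = "You are speaking with {{customer_name}} about their visit on {{appointment_date}}."
rendered = render_prompt(
    template,
    {"customer_name": "Priya", "appointment_date": "2025-03-14"},
)
print(rendered)
```

Leaving unknown placeholders visible makes a missing variable easy to spot during a test call, which is why the sketch does not silently replace them with an empty string.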

Set the First Message

The first message is what the agent says when a call connects:

Hello! Thank you for calling Sunrise Dental. My name is Sarah. 
How can I help you today?

This is automatically delivered via TTS when the caller connects.

Configure Call Behavior

Setting	Default	Description
Max Call Duration	300s (5 min)	Hard time limit before auto-disconnect
Silence Timeout	30s	Disconnect after this much silence
Ring Timeout	30s	How long to ring on outbound calls
Agent Inactive Reminder	—	Prompt if agent is idle too long
Noise Cancellation	BVE	Background noise filtering model

Add Custom Functions (Optional)

Custom functions let your agent take actions during calls by calling external webhooks:

{
  "name": "check_availability",
  "description": "Check appointment availability for a given date",
  "url": "https://api.your-clinic.com/availability",
  "method": "POST",
  "parameters": {
    "date": {
      "type": "string",
      "description": "The date to check (YYYY-MM-DD format)"
    }
  },
  "speak_during": "Let me check our availability for that date...",
  "speak_after": "I've found the available slots."
}

The agent will invoke this function when the conversation naturally requires it (determined by the LLM).
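
On the receiving side, the webhook needs an HTTP endpoint that accepts the POST. The sketch below uses only the Python standard library. The request body shape (a JSON object with a `date` field), the response shape, and the `SLOTS` data are all assumptions for illustration; check the payload your agent actually sends before relying on them.

```python
import json
from datetime import date
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical slot data; a real service would query a scheduling system.
SLOTS = {"2025-03-14": ["09:00", "11:30", "15:00"]}


def check_availability(payload) -> tuple:
    """Return (HTTP status, response body) for an availability request."""
    raw = payload.get("date") if isinstance(payload, dict) else None
    try:
        day = date.fromisoformat(raw)
    except (TypeError, ValueError):
        return 400, {"error": "date must be in YYYY-MM-DD format"}
    key = day.isoformat()
    return 200, {"date": key, "slots": SLOTS.get(key, [])}


class AvailabilityHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read and decode the JSON body; treat malformed JSON as an empty request.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = {}
        status, body = check_availability(payload)
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Start the webhook server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), AvailabilityHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally. Keeping the date check in a plain function makes it testable without opening a socket, and returning a 400 with a clear message gives the caller something specific to act on if the LLM passes a malformed date.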

Agent Settings Reference

LLM Settings

Setting	Type	Description
`LLM_provider`	string	“open ai” or “groq”
`LLM_model`	string	Model identifier (e.g., “gpt-4o”)
`temperature`	float	Response creativity (0.0 = deterministic, 2.0 = creative)
`max_tokens`	integer	Maximum response length in tokens
`system_prompt`	text	Agent instructions and personality
`first_message`	text	Greeting when call connects

Voice Settings

Setting	Type	Description
`STT_provider`	string	Speech recognition provider
`STT_model`	string	STT model identifier
`STT_language`	string	Language code (e.g., “en”)
`stt_keywords`	JSON	Words to boost recognition accuracy
`TTS_provider`	string	Voice synthesis provider
`TTS_model`	string	TTS model identifier
`TTS_voice`	string	Voice ID or name
`TTS_speed`	float	Speech rate multiplier

Transfer Settings

Setting	Type	Description
`transfer_phone_number`	string	Target number for transfers
`warm_transfer_text`	text	Briefing for the receiving party
`warm_transfer_retries`	integer	Retry attempts for warm transfer

Session Settings

Setting	Type	Description
`max_call_duration`	integer	Maximum call length (seconds)
`ring_timeout`	integer	Outbound ring timeout (seconds)
`silence_timeout`	integer	Auto-disconnect on silence (seconds)
`noise_cancellation_model`	string	“none” or “BVE”
`realtime_turn_detection_type`	string	VAD algorithm
`realtime_turn_detection_threshold`	float	Sensitivity (0.0 - 1.0)
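
If you keep agent settings in version control or generate them from a script, a small pre-flight check can catch out-of-range values before they reach the dashboard. The sketch below is hypothetical: the setting names and ranges come from the tables above, but `validate_settings` is not a Vaani function, and treating 0.0 and 2.0 as inclusive bounds for `temperature` is an assumption.

```python
def validate_settings(settings: dict) -> list:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []

    temperature = settings.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be between 0.0 and 2.0")

    threshold = settings.get("realtime_turn_detection_threshold")
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        problems.append("realtime_turn_detection_threshold must be between 0.0 and 1.0")

    if settings.get("noise_cancellation_model", "BVE") not in ("none", "BVE"):
        problems.append('noise_cancellation_model must be "none" or "BVE"')

    # All three timeouts are integer seconds according to the Session Settings table.
    for key in ("max_call_duration", "ring_timeout", "silence_timeout"):
        value = settings.get(key)
        if value is not None and (not isinstance(value, int) or value <= 0):
            problems.append(f"{key} must be a positive integer (seconds)")

    return problems


settings = {
    "LLM_provider": "groq",
    "LLM_model": "llama-3",
    "temperature": 0.4,
    "max_call_duration": 300,
    "silence_timeout": 30,
    "ring_timeout": 30,
    "noise_cancellation_model": "BVE",
    "realtime_turn_detection_threshold": 0.5,
}
print(validate_settings(settings))
```

Settings that are absent from the dictionary are skipped rather than flagged, on the assumption that the platform falls back to its defaults for them.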

Knowledge Base

Upload documents to your agent for Retrieval-Augmented Generation (RAG):

Go to your agent’s page → Knowledge Base tab
Upload files (PDF, DOCX, TXT)
Enable the RAG toggle
The agent will now search uploaded documents during conversations and use relevant content in responses

Documents are indexed using LlamaIndex with OpenAI embeddings. The index is rebuilt when files are added or removed.
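
The retrieval step can be pictured with a toy example. This sketch is not how Vaani or LlamaIndex implement it: it swaps OpenAI embeddings for plain word counts and cosine similarity so that it runs with the standard library alone, and the document chunks are made up for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words vector (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list, top_k: int = 1) -> list:
    """Return the top_k chunks most similar to the question."""
    query = vectorize(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(query, vectorize(chunk)), reverse=True)
    return ranked[:top_k]


# Hypothetical chunks, as if split from an uploaded clinic FAQ.
chunks = [
    "Teeth cleanings take about 45 minutes and are recommended every six months.",
    "We accept most major dental insurance plans.",
    "The clinic is open Monday to Friday, 9 AM to 5 PM.",
]
best = retrieve("Which insurance plans do you accept?", chunks)
print(best)
```

In a RAG setup, the retrieved chunk is then passed to the LLM alongside the caller's question, so the answer can draw on the uploaded documents rather than on the system prompt alone.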

Best Practices

Keep system prompts focused — agents perform better with clear, specific instructions
Use dynamic variables for personalization: {{customer_name}} and {{account_number}}
Test with web calls first — use the dashboard’s Test Call feature before going live with phone calls
Set reasonable timeouts — a 5-minute max duration prevents runaway calls
Monitor cost per call — different providers have significantly different pricing
