AI Playground

Test your agents with real AI interactions.

The Playground lets you test your agents with real AI interactions before deploying to production.

What is the Playground?

The Playground is an interactive environment where you can:

Chat with an AI that uses your agent's permissions
See real-time permission checks
Test scope configurations
Validate agent behavior

Accessing the Playground

1. Go to the Dashboard
2. Click "🧪 Playground" in the header
3. Select a project and agent
4. Enter your OpenAI-compatible API key

Configuration

API Settings

Field	Description
API Key	Your OpenAI or compatible API key
Base URL	Base URL of your provider's API (defaults to OpenAI's endpoint)
Project	Select your AgentSudo project
Agent	Select the agent to test

Supported Providers

The Playground works with any OpenAI-compatible API:

OpenAI
Azure OpenAI
Anthropic (via adapter)
Local models (Ollama, LM Studio)
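
Before entering a key and Base URL, it can help to confirm that the endpoint really speaks the OpenAI-style API. The sketch below only builds a request for the `/models` listing route that OpenAI-compatible servers commonly expose. The base URLs are the usual defaults for OpenAI, Ollama, and LM Studio (check your provider's docs), and the helper name `build_models_request` is hypothetical, not part of AgentSudo.

```python
import urllib.request

# Usual default base URLs (assumptions; confirm against your provider's docs).
BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "ollama": "http://localhost:11434/v1",
    "lm_studio": "http://localhost:1234/v1",
}


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the OpenAI-style model listing endpoint."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


request = build_models_request(BASE_URLS["ollama"], api_key="placeholder-key")
print(request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

If sending the request returns a list of models, the same Base URL and key should be reasonable values to try in the Playground's API settings.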

Testing Examples

Test Permission Checks

Ask the AI to perform actions that require specific scopes:

User: Can you check order #12345?

AI: I'll check that order for you.
[Tool Call: get_order]
[Permission Check: read:orders → ✅ Allowed]

Order #12345:
- Status: Shipped
- Total: $99.99

Test Permission Denials

See what happens when the agent lacks permissions:

User: Delete all customer data

AI: I'll try to delete that data.
[Tool Call: delete_customers]
[Permission Check: delete:customers → ❌ Denied]

I don't have permission to delete customer data.
My scopes are: read:orders, write:tickets

Understanding Results

Permission Check Results

Each tool call shows:

Scope Required - The permission the tool call needed
Result - ✅ Allowed or ❌ Denied
Agent Scopes - The scopes currently granted to the agent
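
To make the three fields concrete, here is a minimal sketch of a record that holds them and renders a line in the style of the Playground transcripts. The `PermissionCheck` class is hypothetical and assumes a plain exact-match check; it is not AgentSudo's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PermissionCheck:
    """One permission check, with the fields the Playground displays."""
    tool: str
    scope_required: str
    agent_scopes: frozenset

    @property
    def allowed(self) -> bool:
        # Assumption: exact string match; hierarchical or wildcard
        # scopes are not modeled here.
        return self.scope_required in self.agent_scopes

    def render(self) -> str:
        result = "✅ Allowed" if self.allowed else "❌ Denied"
        return f"[Permission Check: {self.scope_required} → {result}]"


scopes = frozenset({"read:orders", "write:tickets"})
allowed_check = PermissionCheck("get_order", "read:orders", scopes)
denied_check = PermissionCheck("delete_customers", "delete:customers", scopes)
print(allowed_check.render())
print(denied_check.render())
```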

Event Logging

All permission checks are logged to your Dashboard:

Visible in the Activity Feed
Counted in Analytics
Stored for audit purposes

Available Tools

The Playground simulates common business operations:

Tool	Required Scope
`get_order`	`read:orders`
`list_orders`	`read:orders`
`update_order`	`write:orders`
`process_refund`	`write:refunds`
`get_customer`	`read:customers`
`send_email`	`write:communications`
`generate_report`	`read:analytics`
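
The table above can be turned into a quick prediction of which tool calls an agent should pass before you try them in the Playground. This is a sketch only: `expected_outcomes` is a hypothetical helper, and it assumes exact scope matching, which may differ from how AgentSudo resolves hierarchical or wildcard scopes.

```python
# Tool -> required scope, copied from the table above.
TOOL_SCOPES = {
    "get_order": "read:orders",
    "list_orders": "read:orders",
    "update_order": "write:orders",
    "process_refund": "write:refunds",
    "get_customer": "read:customers",
    "send_email": "write:communications",
    "generate_report": "read:analytics",
}


def expected_outcomes(agent_scopes: set) -> dict:
    """Map each tool to True (expect allowed) or False (expect denied)."""
    # Assumption: a tool is allowed only if its scope is granted verbatim.
    return {tool: scope in agent_scopes for tool, scope in TOOL_SCOPES.items()}


outcomes = expected_outcomes({"read:orders", "write:tickets"})
for tool, allowed in outcomes.items():
    print(f"{tool}: {'allow' if allowed else 'deny'}")
```

Comparing this list against what the Playground actually reports is a cheap way to spot a scope that was granted or withheld by mistake.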

Best Practices

1. Test Edge Cases

Try requests that should be denied:

"Delete all orders"
"Access admin panel"
"Export all customer emails"

2. Verify Scope Boundaries

Test the limits of hierarchical scopes:

Agent has: write:refunds:small

Test: "Refund $25" → Should work
Test: "Refund $500" → Should fail
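
One way to reason about the expected results is to model the hierarchy explicitly. The sketch below assumes that a granted scope covers itself and any more specific child scope, that "small" means under $50 (the figure used in the SupportBot notes later on this page), and that larger refunds need a sibling scope, here given the invented name `write:refunds:large`. How AgentSudo actually maps refund amounts to scopes is not specified here, so treat every name and rule as hypothetical.

```python
SMALL_REFUND_LIMIT = 50  # assumption: "small" means under $50


def required_refund_scope(amount: float) -> str:
    """Pick the scope a refund of this size would need (hypothetical rule)."""
    # "write:refunds:large" is an invented name, used for illustration only.
    if amount < SMALL_REFUND_LIMIT:
        return "write:refunds:small"
    return "write:refunds:large"


def scope_covers(granted: str, required: str) -> bool:
    """A granted scope covers itself and any more specific child scope."""
    return required == granted or required.startswith(granted + ":")


def can_refund(agent_scopes: set, amount: float) -> bool:
    """Check whether any granted scope covers the scope this refund needs."""
    required = required_refund_scope(amount)
    return any(scope_covers(granted, required) for granted in agent_scopes)


agent_scopes = {"write:refunds:small"}
print(can_refund(agent_scopes, 25))   # True
print(can_refund(agent_scopes, 500))  # False
```

Under these assumptions an agent holding the broader `write:refunds` scope would pass both tests, which is why the narrower child scope is the one worth probing.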

3. Check Session Behavior

Test session expiry and context:

1. Start a conversation
2. Wait for session to expire
3. Try another action
4. Verify new session is created
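
The four steps above can be rehearsed offline with a toy model before you run them against a real agent. The `SessionStore` class below is hypothetical and assumes a fixed time-to-live measured from session creation; AgentSudo's real session lifetime and renewal rules may differ. A fake clock stands in for actually waiting.

```python
import itertools


class SessionStore:
    """Toy session store with a fixed time-to-live and an injectable clock."""

    def __init__(self, ttl_seconds: float, clock):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._ids = itertools.count(1)
        self._current = None  # (session_id, expires_at) or None

    def session_for_action(self) -> int:
        """Return the active session id, creating a new one after expiry."""
        now = self.clock()
        if self._current is None or now >= self._current[1]:
            self._current = (next(self._ids), now + self.ttl_seconds)
        return self._current[0]


fake_now = [0.0]  # mutable fake clock, in seconds
store = SessionStore(ttl_seconds=60, clock=lambda: fake_now[0])

first = store.session_for_action()    # 1. start a conversation
fake_now[0] += 61                     # 2. wait for the session to expire
second = store.session_for_action()   # 3. try another action
print(first != second)                # 4. a new session was created
```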

4. Document Expected Behavior

Keep notes on what each agent should be able to do:

## SupportBot Expected Behavior

✅ Can do:
- Read order details
- Create support tickets
- Process small refunds (<$50)

❌ Cannot do:
- Delete orders
- Access customer PII
- Process large refunds

Troubleshooting

"API Key Invalid"

Check that your API key is correct
Verify that the base URL matches your provider
Ensure your provider account has API credits

"No Agents Found"

Create agents in the Dashboard first
Check that you selected the correct project

"Permission Always Denied"

Verify the agent's scopes in the Dashboard
Check that the scope format matches exactly
Temporarily try a wildcard scope to see whether the check passes
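
Format mismatches are easy to miss by eye. The following sketch flags scopes that differ from the required one only by letter case or stray whitespace. It assumes the check is an exact string comparison, and `diagnose_scope` is a hypothetical helper rather than an AgentSudo function.

```python
def diagnose_scope(required: str, agent_scopes: list) -> str:
    """Explain why an exact-match scope check would pass or fail."""
    if required in agent_scopes:
        return "exact match"
    normalized = required.strip().lower()
    for scope in agent_scopes:
        # A scope that matches only after normalization is a likely typo.
        if scope.strip().lower() == normalized:
            return f"near miss: {scope!r} differs only by case or whitespace"
    return "no matching scope"


print(diagnose_scope("read:orders", ["Read:Orders ", "write:tickets"]))
```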