A ready-to-run example is available at the end of this page.
Overview
When an LLM requests multiple tool calls in a single response, the SDK can execute them concurrently rather than sequentially. This is controlled by the `tool_concurrency_limit` parameter on the `Agent` class.
Benefits:
- Faster execution when tools are independent (e.g., reading multiple files)
- Better utilization of I/O-bound operations
- Enables parallel sub-agent delegation

Typical scenarios include:
- Running multiple read-only operations simultaneously
- Delegating to multiple sub-agents at once
- Executing independent API calls or file operations
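To make the benefit concrete, here is a self-contained sketch (not the SDK's API; the tool functions and timings are invented) showing that three independent I/O-bound calls finish faster when run concurrently than one at a time:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Three independent, I/O-bound "tool calls" (stand-ins for real tools).
def read_config():
    time.sleep(0.1)  # simulate file I/O
    return "config"

def read_readme():
    time.sleep(0.1)
    return "readme"

def read_changelog():
    time.sleep(0.1)
    return "changelog"

tools = [read_config, read_readme, read_changelog]

# Sequential: total time is roughly the sum of the individual calls.
start = time.perf_counter()
sequential = [tool() for tool in tools]
seq_elapsed = time.perf_counter() - start

# Concurrent, capped at 3 workers: total time is roughly one call's duration.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    concurrent = list(pool.map(lambda t: t(), tools))
conc_elapsed = time.perf_counter() - start

print(sequential == concurrent)  # same results either way
print(conc_elapsed < seq_elapsed)
```

Because the calls only wait on I/O, the concurrent run completes in roughly the time of the slowest single call.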
Configuration
Setting the Concurrency Limit
Configure `tool_concurrency_limit` when creating an `Agent`:
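As a hedged sketch of what the parameter controls (the `ToyAgent` class and tool functions below are illustrative, not the SDK's API), a semaphore sized by `tool_concurrency_limit` can cap how many tool calls run at once:

```python
import asyncio

class ToyAgent:
    """Illustrative stand-in for the SDK's Agent; not its real signature."""

    def __init__(self, tool_concurrency_limit: int = 1):
        self.tool_concurrency_limit = tool_concurrency_limit

    async def run_tools(self, calls):
        # A semaphore caps how many tool coroutines execute at once.
        gate = asyncio.Semaphore(self.tool_concurrency_limit)

        async def guarded(call):
            async with gate:
                return await call()

        # gather preserves input order in its results.
        return await asyncio.gather(*(guarded(c) for c in calls))

async def fake_tool(name):
    await asyncio.sleep(0.01)  # simulate I/O
    return name

agent = ToyAgent(tool_concurrency_limit=4)
results = asyncio.run(agent.run_tools([lambda n=n: fake_tool(n) for n in range(6)]))
print(results)  # [0, 1, 2, 3, 4, 5]
```

Consult the SDK reference for the actual `Agent` constructor; only the `tool_concurrency_limit` parameter name is taken from this page.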
Concurrency Limit Values
| Value | Behavior |
|---|---|
| 1 (default) | Sequential execution—tools run one at a time |
| 2-8 | Moderate parallelism—good for most use cases |
| >8 | High parallelism—only for I/O-heavy workloads with independent tools. Risk of resource exhaustion. |
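The table's semantics can be checked with a small standalone sketch (plain `asyncio`, not the SDK) that records peak concurrency under different limits:

```python
import asyncio

async def run_with_limit(n_tasks, limit):
    """Run n_tasks dummy tools under a concurrency cap; return peak overlap."""
    gate = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def task():
        nonlocal active, peak
        async with gate:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # simulate I/O
            active -= 1

    await asyncio.gather(*(task() for _ in range(n_tasks)))
    return peak

peak_seq = asyncio.run(run_with_limit(6, limit=1))
peak_par = asyncio.run(run_with_limit(6, limit=4))
print(peak_seq)  # 1: the default is strictly sequential
print(peak_par)  # never exceeds the limit of 4
```

With `limit=1`, tools never overlap; with a higher limit, overlap rises but stays at or below the cap.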
The optimal value depends on your workload. Start with a lower value (e.g., 4) and increase if needed.
Use Cases
Parallel File Operations
Parallel execution shines when reading multiple independent files.
Parallel Sub-Agent Delegation
Combine parallel execution with sub-agent delegation for parallel task processing.
Sub-Agents with Their Own Parallelism
Each sub-agent can have its own concurrency limit.
Considerations
Thread Safety
Tools that may run concurrently must be thread-safe: avoid unsynchronized shared mutable state, and guard any shared resources with locks.
When NOT to Use
- Tools that must execute in a specific order
- Operations that modify the same files
- Workflows where one tool’s output feeds into another
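For example, a write-then-read pair is inherently sequential: running the two calls concurrently could read a file before it exists. A minimal sketch with hypothetical tools:

```python
import json
import os
import tempfile

# Hypothetical dependent "tools": the second consumes the first's output,
# so they cannot safely run concurrently.
def write_report(path):
    with open(path, "w") as f:
        json.dump({"status": "ok"}, f)
    return path

def summarize_report(path):
    with open(path) as f:
        return json.load(f)["status"]

path = os.path.join(tempfile.mkdtemp(), "report.json")
# Must run in order: summarize_report fails if report.json doesn't exist yet.
status = summarize_report(write_report(path))
print(status)  # ok
```

For chains like this, keep `tool_concurrency_limit=1` or split the workflow so only the independent parts run in parallel.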
Ready-to-run Example
This example demonstrates parallel tool execution with an orchestrator agent that delegates to multiple sub-agents, each running their own tools concurrently.
This example is available on GitHub: examples/01_standalone_sdk/45_parallel_tool_execution.py
The model name should follow the LiteLLM convention: `provider/model_name` (e.g., `anthropic/claude-sonnet-4-5-20250929`, `openai/gpt-4o`).
The `LLM_API_KEY` should be the API key for your chosen provider.
Understanding the Example
The example demonstrates a two-level parallel execution pattern:
- Orchestrator Level: The main agent has `tool_concurrency_limit=8`, allowing it to delegate to all three sub-agents simultaneously
- Sub-Agent Level: Each sub-agent has `tool_concurrency_limit=4`, allowing them to run their own tools (terminal commands, file reads) in parallel
- Verification: The example includes a parallelism report that analyzes persisted events to confirm tools actually ran concurrently
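The verification step can be sketched as a sweep over recorded tool start/end timestamps; the event tuples and schema below are illustrative, not the example's actual persisted event format:

```python
# Each tuple is a hypothetical persisted (tool_name, start, end) record.
events = [
    ("read_a", 0.00, 0.12),
    ("read_b", 0.01, 0.11),
    ("read_c", 0.02, 0.13),
]

def max_concurrency(events):
    """Peak number of tools whose [start, end) intervals overlap."""
    points = [(s, 1) for _, s, _ in events] + [(e, -1) for _, _, e in events]
    active = peak = 0
    # Sorting puts an end (-1) before a start (+1) at the same timestamp,
    # so back-to-back calls don't count as overlapping.
    for _, delta in sorted(points):
        active += delta
        peak = max(peak, active)
    return peak

print(max_concurrency(events))  # 3
```

A peak greater than 1 confirms that tools actually overlapped in time rather than merely being requested together.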
Next Steps
- Sub-Agent Delegation - Delegate work to specialized sub-agents
- Custom Tools - Create thread-safe custom tools
- Agent Architecture - Understand the agent execution model

