Configuration Guide

This guide explains all configuration options available in flujo.

Settings Overview

flujo uses a Settings class (powered by Pydantic-settings) to manage its configuration. Settings are primarily loaded from environment variables, with support for .env files for local development. This provides a flexible and robust way to configure your flujo applications.

How Settings are Loaded

Environment Variables: flujo will automatically read environment variables. For example, OPENAI_API_KEY.
.env files: For local development, you can create a .env file in your project root. Variables defined in this file will be loaded and take precedence over system environment variables.

`Settings` Class Properties

Below is a comprehensive list of all available settings, their types, default values, and a brief description.

API Keys

These settings manage API keys for various language model providers. They support AliasChoices for backward compatibility with older environment variable names.

openai_api_key: Optional[SecretStr]
- Environment Variables: OPENAI_API_KEY, ORCH_OPENAI_API_KEY, orch_openai_api_key
- Description: API key for OpenAI models.
google_api_key: Optional[SecretStr]
- Environment Variables: GOOGLE_API_KEY, ORCH_GOOGLE_API_KEY, orch_google_api_key
- Description: API key for Google models (e.g., Gemini).
anthropic_api_key: Optional[SecretStr]
- Environment Variables: ANTHROPIC_API_KEY, ORCH_ANTHROPIC_API_KEY, orch_anthropic_api_key
- Description: API key for Anthropic models.
logfire_api_key: Optional[SecretStr]
- Environment Variables: LOGFIRE_API_KEY, ORCH_LOGFIRE_API_KEY, orch_logfire_api_key
- Description: API key for Logfire telemetry integration.
provider_api_keys: Dict[str, SecretStr]
- Description: Dynamically loaded dictionary for any other _API_KEY environment variables not explicitly listed above (e.g., MYPROVIDER_API_KEY).

Feature Toggles

These boolean settings enable or disable specific flujo features.

reflection_enabled: bool = True
- Description: Enables or disables the reflection agent in multi-agent pipelines.
reward_enabled: bool = True
- Description: Enables or disables reward model scoring.
telemetry_export_enabled: bool = False
- Description: Enables or disables the export of telemetry data.
otlp_export_enabled: bool = False
- Description: Enables or disables OpenTelemetry Protocol (OTLP) export for distributed tracing.

Default Models

These settings define the default language models used by various agents within flujo.

default_solution_model: str = "openai:gpt-4o"
- Description: Default model for the Solution agent.
default_review_model: str = "openai:gpt-4o"
- Description: Default model for the Review agent.
default_validator_model: str = "openai:gpt-4o"
- Description: Default model for the Validator agent.
default_reflection_model: str = "openai:gpt-4o"
- Description: Default model for the Reflection agent.
default_self_improvement_model: str = "openai:gpt-4o"
- Description: Default model for the SelfImprovementAgent.
default_repair_model: str = "openai:gpt-4o"
- Description: Default model for the internal JSON repair agent.

Orchestrator Tuning

These settings control the behavior and performance of the flujo orchestrator.

max_iters: int = 5
- Description: Maximum number of iterations for multi-agent loops.
k_variants: int = 3
- Description: Number of solution variants to generate per iteration.
reflection_limit: int = 3
- Description: Maximum number of reflection steps allowed.
scorer: Literal["ratio", "weighted", "reward"] = "ratio"
- Description: The default scoring strategy to use.
t_schedule: list[float] = [1.0, 0.8, 0.5, 0.2]
- Description: A list of floating-point numbers representing the temperature for each iteration round. The last value is used for any rounds beyond the schedule's length. This setting is validated to ensure it's not empty.
otlp_endpoint: Optional[str] = None
- Description: The endpoint URL for OpenTelemetry Protocol (OTLP) export.
agent_timeout: int = 60
- Description: Timeout in seconds for individual agent calls.

Python Configuration

You can also configure the orchestrator programmatically by importing the settings object and modifying its attributes directly. This is useful for dynamic configuration or testing scenarios.

from flujo.infra.settings import settings

# Override a setting programmatically
settings.max_iters = 10
settings.reflection_enabled = False

# Access a setting
print(f"Current solution model: {settings.default_solution_model}")

Model Configuration

Model Selection

from flujo.infra.agents import make_agent_async

# Use different models for different agents
review_agent = make_agent_async(
    "openai:gpt-4",  # More capable model for review
    "You are a critical reviewer...",
    Checklist
)

solution_agent = make_agent_async(
    "openai:gpt-3.5-turbo",  # Faster model for generation
    "You are a creative writer...",
    str
)

Model Parameters

# Configure model parameters
agent = make_agent_async(
    "openai:gpt-4",
    "You are a helpful assistant...",
    str,
    temperature=0.7,  # Control randomness
    max_tokens=1000,  # Limit response length
    top_p=0.9,       # Nucleus sampling
    frequency_penalty=0.5,  # Reduce repetition
    presence_penalty=0.5    # Encourage diversity
)

Cost Tracking Configuration

Flujo provides integrated cost and token usage tracking for LLM steps. This feature allows you to monitor and control spending across your AI pipelines.

Configuring Provider Pricing

To enable cost tracking, you need to configure pricing for your LLM providers in your flujo.toml file. The cost tracking system automatically calculates costs based on token usage and configured pricing.

Basic Cost Configuration

Add a [cost] section to your flujo.toml:

# flujo.toml
[cost]
[cost.providers]
[cost.providers.openai]
# OpenAI Models (Pricing: https://openai.com/pricing)
[cost.providers.openai.gpt-4o]
prompt_tokens_per_1k = 0.005
completion_tokens_per_1k = 0.015

[cost.providers.openai.gpt-4o-mini]
prompt_tokens_per_1k = 0.00015
completion_tokens_per_1k = 0.0006

[cost.providers.openai.gpt-3.5-turbo]
prompt_tokens_per_1k = 0.0005
completion_tokens_per_1k = 0.0015

[cost.providers.anthropic]
# Anthropic Models (Pricing: https://www.anthropic.com/pricing)
[cost.providers.anthropic.claude-3-sonnet]
prompt_tokens_per_1k = 0.003
completion_tokens_per_1k = 0.015

[cost.providers.anthropic.claude-3-haiku]
prompt_tokens_per_1k = 0.00025
completion_tokens_per_1k = 0.00125

[cost.providers.google]
# Google Models (Pricing: https://ai.google.dev/pricing)
[cost.providers.google.gemini-1.5-pro]
prompt_tokens_per_1k = 0.0035
completion_tokens_per_1k = 0.0105

[cost.providers.google.gemini-1.5-flash]
prompt_tokens_per_1k = 0.000075
completion_tokens_per_1k = 0.0003

Pricing Structure

The pricing configuration follows this hierarchy: - [cost.providers.{provider_name}] - Provider section (e.g., openai, anthropic, google) - [cost.providers.{provider_name}.{model_name}] - Model-specific pricing - prompt_tokens_per_1k - Cost per 1,000 prompt tokens (input) - completion_tokens_per_1k - Cost per 1,000 completion tokens (output)

Note: The configuration uses [cost.providers] rather than [providers] at the top level to group all cost-related configuration under one section. This provides better organization and avoids potential naming conflicts.

Supported Providers

Flujo supports cost tracking for these providers: - OpenAI: gpt-4o, gpt-4o-mini, gpt-3.5-turbo, etc. - Anthropic: claude-3-sonnet, claude-3-haiku, etc. - Google: gemini-1.5-pro, gemini-1.5-flash, etc.

Using Cost Tracking in Pipelines

Once configured, cost tracking is automatically enabled for all pipeline steps that use LLM agents.

Accessing Cost Information

Cost and token information is available in pipeline results:

from flujo import Step, Flujo

# Create a pipeline with cost tracking
pipeline = Step.solution(my_agent) >> Step.validate(validator_agent)
runner = Flujo(pipeline)

# Run the pipeline
result = runner.run("Your prompt")

# Access cost information
for step_result in result.step_history:
    print(f"Step: {step_result.name}")
    print(f"  Cost: ${step_result.cost_usd:.4f}")
    print(f"  Tokens: {step_result.token_counts}")
    print(f"  Success: {step_result.success}")

Setting Usage Limits

You can set cost and token limits to prevent excessive spending:

from flujo import Flujo, Step, UsageLimits

# Define usage limits
limits = UsageLimits(
    total_cost_usd_limit=1.0,    # Maximum $1.00 total cost
    total_tokens_limit=5000       # Maximum 5,000 tokens
)

# Apply limits to pipeline
runner = Flujo(pipeline, usage_limits=limits)

try:
    result = runner.run("Your prompt")
except UsageLimitExceededError as e:
    print(f"Pipeline stopped due to usage limits: {e}")
    # Access partial results
    partial_result = e.partial_result

Step-Level Limits

You can also set limits on individual steps:

from flujo import Step, UsageLimits

# Set limits for a specific step
step_limits = UsageLimits(
    total_cost_usd_limit=0.10,   # Maximum $0.10 for this step
    total_tokens_limit=1000       # Maximum 1,000 tokens for this step
)

pipeline = (
    Step.solution(my_agent, usage_limits=step_limits)
    >> Step.validate(validator_agent)
)

Cost Tracking Features

Automatic Token Counting

Flujo automatically extracts token usage from LLM responses: - Prompt tokens: Input tokens sent to the model - Completion tokens: Output tokens generated by the model - Total tokens: Sum of prompt and completion tokens

Cost Calculation

Costs are calculated using the formula:

cost = (prompt_tokens / 1000) * prompt_tokens_per_1k +
       (completion_tokens / 1000) * completion_tokens_per_1k

Pricing Units: The configuration uses _per_1k (per 1,000 tokens) rather than _per_million_tokens to align with common provider pricing pages and provide more intuitive configuration values. For example, GPT-4o costs $0.005 per 1K prompt tokens, which is configured as prompt_tokens_per_1k = 0.005.

Fallback Pricing

If a model is not explicitly configured in flujo.toml, Flujo will check against a list of hardcoded default prices for popular models. A critical warning will be logged if a default is used. If no default exists, the cost will be 0.0.

⚠️ CRITICAL WARNING: Hardcoded defaults are for development/testing only and may be outdated. Always configure explicit pricing in flujo.toml for production use.

Parallel Execution Limits

When using parallel steps, Flujo can proactively cancel sibling branches when limits are exceeded:

from flujo import Step, Pipeline, UsageLimits

# Create parallel branches
fast_expensive = Pipeline.from_step(Step("expensive", costly_agent))
slow_cheap = Pipeline.from_step(Step("cheap", cheap_agent))

parallel = Step.parallel_branch(fast_expensive, slow_cheap)

# If fast_expensive breaches the limit, slow_cheap will be cancelled immediately
limits = UsageLimits(total_cost_usd_limit=0.10)
runner = Flujo(parallel, usage_limits=limits)

Best Practices

1. Regular Price Updates

Keep your pricing configuration up to date: - Monitor provider pricing changes - Update flujo.toml when prices change - Use provider-specific pricing for accuracy

2. Appropriate Limits

Set reasonable usage limits: - Start with conservative limits - Monitor actual usage patterns - Adjust limits based on your budget

3. Cost Monitoring

Monitor costs in production: - Log cost information for analysis - Set up alerts for high-cost runs - Track cost trends over time

4. Model Selection

Choose cost-effective models: - Use cheaper models for simple tasks - Reserve expensive models for complex work - Consider token efficiency

Troubleshooting Cost Tracking

Common Issues

No cost calculated: Check that pricing is configured for your model
Incorrect costs: Verify pricing values in flujo.toml
Missing token counts: Ensure your agent returns usage information

Debugging

Enable debug logging to troubleshoot cost tracking:

import logging
logging.getLogger("flujo.cost").setLevel(logging.DEBUG)

This will show detailed information about cost calculations and token extraction.

Pipeline Configuration

Step Configuration

from flujo import Step, Flujo

# Configure individual steps
pipeline = (
    Step.review(review_agent, timeout=30)  # 30-second timeout
    >> Step.solution(
        solution_agent,
        retries=3,            # Number of retries
        temperature=0.7,      # Control randomness
    )
    >> Step.validate(validator_agent)
)

Runner Configuration

# Configure the pipeline runner
runner = Flujo(
    pipeline,
    retry_on_error=True
)

Scoring Configuration

Custom Scoring

from flujo.domain.scoring import weighted_score

# Define custom weights
weights = {
    "correctness": 0.4,
    "readability": 0.3,
    "efficiency": 0.2,
    "documentation": 0.1
}

# Use in pipeline
pipeline = (
    Step.review(review_agent)
    >> Step.solution(solution_agent)
    >> Step.validate(
        validator_agent,
        scorer=lambda c: weighted_score(c, weights)
    )
)

Tool Configuration

Tool Settings

from pydantic_ai import Tool

def my_tool(param: str) -> str:
    """Tool description."""
    return f"Processed: {param}"

# Configure tool
tool = Tool(
    my_tool,
    timeout=10,  # Tool timeout
    retries=2,   # Number of retries
    backoff_factor=1.5,  # Backoff between retries
)

Best Practices

Environment Variables
Use .env for development
Use secure environment variables in production
Never commit API keys to version control
Model Selection
Choose models based on task requirements
Consider cost and performance trade-offs
Use appropriate model parameters
Pipeline Design
Set appropriate timeouts
Configure retries for reliability
Use parallel execution when possible
Telemetry
Enable in production
Configure appropriate sampling
Use secure endpoints
Cost Management
Configure accurate pricing
Set appropriate usage limits
Monitor costs regularly

Troubleshooting

Common Issues

API Key Issues
Verify keys are set correctly
Check key permissions
Ensure keys are valid
Timeout Issues
Increase timeouts for complex tasks
Check network latency
Monitor model response times
Memory Issues
Reduce batch sizes
Use appropriate model sizes
Monitor memory usage
Cost Tracking Issues
Verify pricing configuration
Check model name matching
Ensure usage information is available

Getting Help

Check the Troubleshooting Guide
Search existing issues
Create a new issue if needed

Next Steps

Read the Usage Guide for examples
Explore Advanced Topics
Check out Use Cases

Robust Path Handling for SQLite State Backends

Flujo supports robust, standards-compliant path handling for all SQLite state backends, ensuring correct behavior in both CLI and programmatic usage.

How `state_uri` is Resolved

Absolute paths (e.g., sqlite:////abs/path/to/flujo_ops.db) are used as-is.
Relative paths (e.g., sqlite:///./flujo_ops.db or sqlite:///flujo_ops.db) are always resolved relative to the current working directory of the process (not the config file location), following RFC 3986.
The path normalization logic ensures that URIs like sqlite:///./foo.db are interpreted as ./foo.db (relative), while sqlite:////foo.db is /foo.db (absolute).

How the CLI and Scripts Find the Config File

The CLI and all Flujo scripts search for flujo.toml in the current directory and parent directories, unless the FLUJO_CONFIG_PATH environment variable is set.
The state_uri in the config is then resolved as described above.
This guarantees that both CLI and scripts use the same database file, regardless of where they are run from, as long as the working directory and config are consistent.

Best Practices for Multi-Directory Projects and CI

Always use relative paths in state_uri for portable, environment-agnostic workflows.
In CI or multi-directory setups, set the working directory to the location where you want the database file to be created/accessed, or use an absolute path if you need a fixed location.
Use the FLUJO_CONFIG_PATH environment variable to explicitly specify the config file if running from outside the project root.

Example URIs and Their Effects

sqlite:///./flujo_ops.db → ./flujo_ops.db (relative to CWD)
sqlite:///flujo_ops.db → flujo_ops.db (relative to CWD)
sqlite:////tmp/flujo_ops.db → /tmp/flujo_ops.db (absolute)
sqlite:///../data/ops.db → ../data/ops.db (relative to CWD)

Design Principles

Separation of concerns: Path normalization and config loading are handled in dedicated, testable functions.
No hardcoded paths: All logic is parameterized and standards-compliant; no magic values are used.
Single responsibility: Each function does one thing—parsing, normalization, or backend instantiation.

This approach guarantees robust, predictable, and portable state management for all Flujo workflows.

Configuration Guide

Settings Overview

How Settings are Loaded

Settings Class Properties

API Keys

Feature Toggles

Default Models

Orchestrator Tuning

Python Configuration

Model Configuration

Model Selection

Model Parameters

Cost Tracking Configuration

Configuring Provider Pricing

Basic Cost Configuration

Pricing Structure

Supported Providers

Using Cost Tracking in Pipelines

Accessing Cost Information

Setting Usage Limits

Step-Level Limits

Cost Tracking Features

Automatic Token Counting

Cost Calculation

Fallback Pricing

Parallel Execution Limits

Best Practices

1. Regular Price Updates

2. Appropriate Limits

3. Cost Monitoring

4. Model Selection

Troubleshooting Cost Tracking

Common Issues

Debugging

Pipeline Configuration

Step Configuration

Runner Configuration

Scoring Configuration

Custom Scoring

Tool Configuration

Tool Settings

Best Practices

Troubleshooting

Common Issues

Getting Help

Next Steps

Robust Path Handling for SQLite State Backends

How state_uri is Resolved

How the CLI and Scripts Find the Config File

Best Practices for Multi-Directory Projects and CI

Example URIs and Their Effects

Design Principles

`Settings` Class Properties

How `state_uri` is Resolved