GPT‑5 Architect Pipeline and model_settings
This guide explains how Flujo supports GPT‑5 provider controls via model_settings in YAML and showcases a reference Architect pipeline. The CLI’s Architect is now implemented programmatically (as a state machine) in flujo/architect/builder.py; the YAML example remains useful for studying agent settings and prompts.
Overview
With GPT‑5, you can pass fine‑grained controls (e.g., reasoning effort and text verbosity) directly to agents in your YAML blueprints. Flujo forwards these settings to the underlying pydantic-ai Agent during compilation.
Key benefits:
- A single powerful architect agent replaces multiple planning steps.
- In‑memory YAML validation with a self‑correction loop.
- An optional repair agent leverages GPT‑5 reasoning to fix issues.
Using model_settings in YAML
Add model_settings under any declarative agent:
agents:
  architect_agent:
    model: "openai:gpt-5"
    model_settings:
      reasoning: { effort: "high" }
      text: { verbosity: "low" }
    system_prompt: |
      You are the Flujo AI Architect...
    output_schema:
      type: object
      properties: { yaml_text: { type: string } }
      required: [yaml_text]
These settings are passed as‑is to pydantic_ai.Agent (via Flujo’s compiler) so providers can interpret them natively.
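As a rough illustration of the forwarding described above, the sketch below takes a parsed agent entry and hands its model_settings unchanged to a constructor. The helper `agent_kwargs_from_spec` and the `FakeAgent` class are hypothetical stand-ins written for this example; Flujo's actual compiler code may be organized differently.

```python
from typing import Any, Dict


class FakeAgent:
    """Stand-in for the real agent class; it only records what it receives."""

    def __init__(self, model: str, **kwargs: Any) -> None:
        self.model = model
        self.kwargs = kwargs


def agent_kwargs_from_spec(spec: Dict[str, Any]) -> Dict[str, Any]:
    """Collect constructor kwargs from a parsed agent entry (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"system_prompt": spec.get("system_prompt", "")}
    # Forward model_settings untouched so provider-specific keys survive.
    if "model_settings" in spec:
        kwargs["model_settings"] = spec["model_settings"]
    return kwargs


# The same agent entry as in the YAML above, already parsed into a dict.
spec = {
    "model": "openai:gpt-5",
    "model_settings": {
        "reasoning": {"effort": "high"},
        "text": {"verbosity": "low"},
    },
    "system_prompt": "You are the Flujo AI Architect...",
}

agent = FakeAgent(spec["model"], **agent_kwargs_from_spec(spec))
```

The point of the sketch is that nothing inspects or rewrites the nested keys: whatever appears under model_settings in the blueprint is what the constructor sees.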
GPT‑5 Architect Pipeline (Example)
See examples/architect_pipeline.yaml for a full reference example. Highlights:
- agents.architect_agent designs and emits the YAML (YamlWriter(yaml_text: str)).
- The validation loop uses flujo.builtins.validate_yaml and branches on flujo.utils.context:predicate_is_valid_report.
- The valid branch is a passthrough; the invalid branch uses agents.repair_agent (also with model_settings).
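The validate-then-repair flow in the highlights can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the pipeline's real code: the `validate` and `repair` callables stand in for flujo.builtins.validate_yaml and agents.repair_agent, and the report key `is_valid`, the `max_loops` parameter, and the toy "version" rule are all invented for illustration.

```python
from typing import Callable, Dict, List


def self_correct(
    draft: str,
    validate: Callable[[str], Dict[str, object]],
    repair: Callable[[str, Dict[str, object]], str],
    max_loops: int = 3,
) -> str:
    """Validate a YAML draft and repair it until it passes or loops run out."""
    for _ in range(max_loops):
        report = validate(draft)
        if report["is_valid"]:
            return draft  # valid branch: passthrough
        draft = repair(draft, report)  # invalid branch: repair, then re-validate
    return draft  # loops exhausted; the caller decides how to handle this


def toy_validate(text: str) -> Dict[str, object]:
    """Toy rule (assumption): a draft is valid if it starts with 'version:'."""
    ok = text.startswith("version:")
    errors: List[str] = [] if ok else ["missing version"]
    return {"is_valid": ok, "errors": errors}


def toy_repair(text: str, report: Dict[str, object]) -> str:
    """Toy repair: prepend the missing key reported by the validator."""
    return 'version: "0.1"\n' + text


fixed = self_correct("steps: []\n", toy_validate, toy_repair)
```

Bounding the loop keeps a persistently invalid draft from triggering unlimited repair calls.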
If you prefer a declarative state machine that handles interactive clarification (HITL) and validation phases, see examples/architect_pipeline_state_machine.yaml. It demonstrates a transitions: block with an on: pause self‑transition for the Clarification state so that, upon resume, the state re‑enters to process new user input.
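The pause self‑transition can be pictured as a lookup table keyed by state and event. The sketch below is only a mental model: apart from the Clarification state and the pause event, every state and event name is hypothetical, and the actual transitions: syntax is the one shown in the example file.

```python
from typing import Dict, Tuple

# (state, event) -> next state. Only "Clarification" and "pause" come from
# the text above; the other names are hypothetical.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("Clarification", "pause"): "Clarification",  # self-transition: re-enter on resume
    ("Clarification", "success"): "Validation",
    ("Validation", "success"): "Done",
}


def next_state(state: str, event: str) -> str:
    """Return the next state, staying in place when no rule matches."""
    return TRANSITIONS.get((state, event), state)


# A pause keeps the machine in Clarification, so the state runs again
# after resume and can read the new user input.
after_pause = next_state("Clarification", "pause")
```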
Timeouts & Retries
Complex GPT‑5 calls can take longer than typical LLM requests. You can tune timeouts and retries in two places:
- Agent-level (applies to the LLM call itself; enforced by the Agent wrapper):
agents:
  architect_agent:
    model: "openai:gpt-5"
    timeout: 180  # seconds
    max_retries: 1
    model_settings:
      reasoning: { effort: "high" }
      text: { verbosity: "low" }
- Step-level (alias for plugin/validator phases; normalized to timeout_s):
steps:
  - name: DesignAndBuildBlueprint
    uses: agents.architect_agent
    config:
      timeout: 180  # used for plugin/validator stages; agent call uses agent.timeout above
      max_retries: 1
Notes:
- Agent timeout and max_retries are forwarded to make_agent_async and enforced by the Agent wrapper.
- Step config.timeout is normalized to timeout_s and used by plugin/validator phases. The agent call timeout is governed by the agent-level timeout.
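To make the alias behavior in the notes concrete, here is a minimal sketch of how a step-level timeout could be normalized to timeout_s. The function name `normalize_step_config`, the conversion to float, and the rule that an explicit timeout_s takes precedence are assumptions for illustration; they are not taken from Flujo's source.

```python
from typing import Any, Dict


def normalize_step_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a step config with the `timeout` alias folded into `timeout_s`."""
    out = dict(config)  # do not mutate the caller's dict
    if "timeout" in out:
        alias_value = out.pop("timeout")
        # Assumption: an explicit timeout_s takes precedence over the alias.
        out.setdefault("timeout_s", float(alias_value))
    return out


step_config = normalize_step_config({"timeout": 180, "max_retries": 1})
```

The agent-level timeout is unaffected by this normalization; as noted above, it travels separately through make_agent_async.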
CLI Support
flujo create passes a list of available skills to the architect via the initial context. In the programmatic builder, this is set during the GatheringContext state. To enable the full conversational state machine for the CLI, set FLUJO_ARCHITECT_STATE_MACHINE=1.
Testing Notes
- The compiler now accepts model_settings in the agent schema and forwards them through make_agent_async.
- E2E tests can assert that model_settings are received by the agent constructor by monkeypatching flujo.domain.blueprint.compiler.make_agent_async.
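The monkeypatching approach can be sketched without importing Flujo at all. In the example below, `compiler` is a stand-in namespace playing the role of flujo.domain.blueprint.compiler, and `compile_agent` is a hypothetical compile step. A real test would patch the actual module attribute (for example with pytest's monkeypatch fixture) and run the real blueprint loader.

```python
from types import SimpleNamespace
from typing import Any, Dict
from unittest import mock

# Stand-in for flujo.domain.blueprint.compiler (assumption for this sketch).
compiler = SimpleNamespace(make_agent_async=lambda **kwargs: object())


def compile_agent(spec: Dict[str, Any]) -> Any:
    """Hypothetical compile step; looks up make_agent_async at call time."""
    return compiler.make_agent_async(
        model=spec["model"],
        model_settings=spec.get("model_settings"),
    )


def received_model_settings() -> Any:
    """Patch the factory, compile one agent, and return what the factory saw."""
    spec = {
        "model": "openai:gpt-5",
        "model_settings": {"reasoning": {"effort": "high"}},
    }
    with mock.patch.object(compiler, "make_agent_async") as fake:
        compile_agent(spec)
    fake.assert_called_once()
    return fake.call_args.kwargs["model_settings"]
```

Patching the name where the compiler looks it up, rather than where it is defined, is what lets the fake intercept the call.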