Error Handling

What happens when a step fails? By default, the pipeline stops immediately and returns an error. That's the safe default: better to fail loudly than to silently produce wrong results.

But sometimes you want something smarter. You might want to retry a flaky network call, use a fallback value, or just keep going despite a failure. That's what JigSpec's error handling features are for.

Default behavior

If a step fails and you haven't configured error handling, the pipeline stops:

PipelineError: Step "fetch_data" failed: Connection timeout after 30s
Pipeline halted at step 2 of 5

This is intentional. Partial results are often worse than no results.

Retry with `max_attempts`

Implemented

For `ai` steps, you can configure how many times JigSpec attempts the model call before the step fails:

yaml

- name: analyze
  action: ai
  prompt: "Analyze this data: {{ input.text }}"
  config:
    max_attempts: 3   # try up to 3 times before failing

Retries use exponential backoff — each retry waits a bit longer than the last. This handles transient API errors and rate limits gracefully.

Default retry behavior

The default value for `max_attempts` is 1 (no retry). Set it to 3 for production pipelines that make external API calls.

Pipeline-level error policy

Partially implemented

You can set a pipeline-wide error policy that applies to all steps:

yaml

pipeline:
  name: my-pipeline
  on_error: stop   # stop (default) | continue | rollback

The three policies:

| Policy | Behavior | Status |
| --- | --- | --- |
| `stop` | Halt the pipeline at the first failure (default) | Implemented |
| `continue` | Mark the failed step, keep running steps that don't depend on it | Implemented |
| `rollback` | Attempt to undo side effects before stopping | Not implemented |

`continue` is useful when your pipeline has multiple independent branches and you want a failure in one branch to skip the rest of that branch without killing the others. Steps that transitively depend on the failed step are skipped (not retried); unrelated steps keep going.
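As a rough illustration of the skip rule, the sketch below computes which steps would be skipped after a failure. It is not JigSpec's implementation: the `dependsOn` map, the step names, and the `stepsToSkip` helper are all hypothetical, since this page does not show how dependencies are declared.

```javascript
// Hypothetical dependency map: step name -> names of the steps it depends on.
// Returns the set of steps that transitively depend on the failed step.
function stepsToSkip(dependsOn, failedStep) {
  const skipped = new Set();
  let changed = true;
  // Repeat until a full pass adds nothing, so indirect dependents are caught.
  while (changed) {
    changed = false;
    for (const [step, deps] of Object.entries(dependsOn)) {
      if (step === failedStep || skipped.has(step)) continue;
      if (deps.some((dep) => dep === failedStep || skipped.has(dep))) {
        skipped.add(step);
        changed = true;
      }
    }
  }
  return skipped;
}

// Two independent branches (illustrative names only).
const examplePipeline = {
  fetch_a: [],
  parse_a: ["fetch_a"],
  report_a: ["parse_a"],
  fetch_b: [],
  report_b: ["fetch_b"],
};

const skippedSteps = stepsToSkip(examplePipeline, "fetch_a");
console.log([...skippedSteps]); // parse_a and report_a; the "b" branch is untouched
```

In this example a failure in `fetch_a` skips `parse_a` and `report_a`, while `fetch_b` and `report_b` still run.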

`rollback` is not implemented

Declaring `on_error: rollback` parses without error, but the runtime silently falls back to `stop` behavior. If you need rollback semantics, implement the undo step explicitly in your pipeline logic.

Step-level error handling

Implemented

For fine-grained control, configure error handling per step:

yaml

- name: risky_call
  action: code
  run: |
    const resp = await fetch(input.url)
    if (!resp.ok) throw new Error(`HTTP ${resp.status}`)
    return { body: await resp.text() }
  outputs:
    - body
  on_error:
    retry: 3                   # retry up to 3 times (4 total attempts)
    backoff: exponential       # exponential | linear | fixed
    fallback: default_value    # use this step's output if all retries fail

- name: default_value
  action: ai
  prompt: "Generate a default response since the fetch failed"
  outputs:
    - body

How it behaves:

- `retry: N` means up to N+1 total dispatches (the original attempt plus N retries).
- `backoff` controls the delay between retries: `fixed` (constant), `linear` (delay × attempt number), or `exponential` (doubles each attempt). The backoff delay is capped to keep failures from stalling the pipeline.
- `fallback` names another step in the same pipeline. When retries are exhausted, JigSpec dispatches that step and substitutes its outputs for the failed step's. Downstream steps referencing the failed step name transparently read the fallback's values; they don't need to know a fallback happened.

Combining `retry` and `backoff` with `fallback` lets you build pipelines that degrade gracefully instead of aborting on transient failures.
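To make the retry arithmetic concrete, here is a minimal simulation of the behavior described above. It is a sketch under stated assumptions, not the runtime's code: `BASE_MS` and `CAP_MS` are made-up values (the actual base delay and cap are not given here), and `backoffDelay` and `runWithPolicy` are hypothetical helpers. Delays are recorded rather than slept so the example stays synchronous.

```javascript
// Assumed values for illustration only; the real base delay and cap may differ.
const BASE_MS = 1000;
const CAP_MS = 30000;

// Delay before retry number `retryNumber` (1 = first retry), capped at capMs.
function backoffDelay(strategy, retryNumber, baseMs = BASE_MS, capMs = CAP_MS) {
  let delay;
  switch (strategy) {
    case "fixed":
      delay = baseMs; // constant
      break;
    case "linear":
      delay = baseMs * retryNumber; // delay × attempt number
      break;
    case "exponential":
      delay = baseMs * 2 ** (retryNumber - 1); // doubles each attempt
      break;
    default:
      throw new Error(`unknown backoff strategy: ${strategy}`);
  }
  return Math.min(delay, capMs);
}

// Dispatch `step` up to retry + 1 times; if every dispatch throws, run the
// fallback and return its outputs in place of the failed step's outputs.
function runWithPolicy(step, fallbackStep, { retry, backoff }) {
  const delays = [];
  let lastError;
  for (let attempt = 0; attempt <= retry; attempt++) {
    if (attempt > 0) delays.push(backoffDelay(backoff, attempt));
    try {
      return { outputs: step(), usedFallback: false, dispatches: attempt + 1, delays };
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  if (!fallbackStep) throw lastError; // no fallback configured: surface the failure
  return { outputs: fallbackStep(), usedFallback: true, dispatches: retry + 1, delays };
}

// A step that always fails, with retry: 3 and exponential backoff.
const alwaysFails = () => { throw new Error("HTTP 503"); };
const defaultValue = () => ({ body: "default response" });
const policyResult = runWithPolicy(alwaysFails, defaultValue, { retry: 3, backoff: "exponential" });
console.log(policyResult.dispatches, policyResult.delays, policyResult.outputs.body);
```

With `retry: 3` the failing step is dispatched four times, and under the assumed 1-second base the waits between dispatches are 1 s, 2 s, and 4 s before the fallback's `body` is returned.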

max_attempts vs on_error.retry

`max_attempts` on an `ai` step is the model's own self-correction loop (the model retries with validation/tool-use feedback inside a single dispatch). `on_error.retry` is the outer wrapper that re-dispatches the whole step. They compose: an `ai` step can have both.
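Because the two settings nest, the worst-case number of model attempts multiplies. The small calculation below assumes each re-dispatch starts with a fresh `max_attempts` budget, which this page does not state explicitly; `worstCaseModelAttempts` is a hypothetical helper for illustration.

```javascript
// Assumption: every re-dispatch gets its own full max_attempts budget.
function worstCaseModelAttempts(maxAttempts, onErrorRetry) {
  const dispatches = onErrorRetry + 1; // original dispatch plus N retries
  return dispatches * maxAttempts;
}

console.log(worstCaseModelAttempts(3, 2)); // 9
```

Under that assumption, `max_attempts: 3` combined with `on_error.retry: 2` allows up to nine model attempts, which is worth keeping in mind when budgeting cost.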

Cost safety: `max_cost`

Implemented

For agent pipelines that might make many model calls, you can set a cost limit:

yaml

pipeline:
  name: research-agent
  config:
    max_cost: 1.00   # halt if cumulative cost crosses $1.00

JigSpec tracks token usage and pricing for every model call and keeps a running cost total. As soon as the cumulative total would cross `max_cost`, the pipeline halts with a clear `max_cost_exceeded` error. This stops pipelines from quietly running up unexpected bills on rate-limit retries or runaway agent loops.
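The running-total check can be pictured roughly as follows. This is only a sketch: the `CostGuard` class, the usage field names, and the per-million-token prices are invented for illustration and do not reflect JigSpec's pricing table or internals.

```javascript
// Hypothetical guard that mirrors the "halt when the total would cross max_cost" rule.
class CostGuard {
  constructor(maxCost) {
    this.maxCost = maxCost; // budget in dollars
    this.total = 0;         // cumulative spend so far
  }

  // usage: { inputTokens, outputTokens }
  // price: { input, output } in dollars per million tokens (made-up figures below)
  record(usage, price) {
    const cost =
      (usage.inputTokens * price.input + usage.outputTokens * price.output) / 1e6;
    if (this.total + cost > this.maxCost) {
      const err = new Error(
        `max_cost_exceeded: ${(this.total + cost).toFixed(2)} would cross ${this.maxCost.toFixed(2)}`
      );
      err.code = "max_cost_exceeded";
      throw err;
    }
    this.total += cost;
    return this.total;
  }
}

const guard = new CostGuard(1.0);
const examplePrice = { input: 3, output: 15 }; // illustrative prices only
const exampleUsage = { inputTokens: 100000, outputTokens: 20000 }; // $0.60 per call here

guard.record(exampleUsage, examplePrice); // total is now about $0.60
let haltCode = null;
try {
  guard.record(exampleUsage, examplePrice); // would bring the total to about $1.20
} catch (err) {
  haltCode = err.code;
}
console.log(haltCode); // max_cost_exceeded
```

The second call is rejected before it is added to the total, so the recorded spend never exceeds the budget.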

You can also set a per-step cap:

yaml

- name: researcher
  action: ai
  prompt: "Research {{ input.company }}"
  tools: [WebSearch, WebFetch]
  max_attempts: 20
  max_cost: 0.25   # this single step can't spend more than $0.25

Pricing data

When a pipeline declares max_cost, JigSpec loads a model-pricing table at startup (required mode — the run will fail with a clear error if the table can't be fetched). Pipelines without max_cost skip this fetch entirely so normal runs don't pay a network round-trip on startup.
