Interruptible Advancement
The expand transition fans out over a collection, creating a new step instance for every element. For modest collections this works fine. But when the collection holds hundreds of thousands (or millions) of elements, materializing every step in one shot means one enormous database transaction and unbounded memory growth, with all of that progress lost if the process goes down partway through.
Ductwork Pro replaces this all-at-once expansion with interruptible advancement. The advancer materializes expanded steps in bounded batches, each committed in its own transaction. An uninterrupted advance works through every batch in a single pass, but because each batch is durable the moment it commits, an expansion that’s interrupted partway keeps the steps it already created and resumes from there. This lets a single expand fan out to millions of steps without long transactions or memory spikes.
Why interruptible advancement?
Naive fan-out over a very large collection runs into hard limits:
- Long transactions - inserting millions of rows in one transaction holds locks, bloats your WAL/undo logs, and risks timeouts
- Memory pressure - building the full set of step records in memory at once can exhaust the advancer process
- Unsafe shutdown - a crash, kill, or deploy arriving mid-expansion rolls back the whole transaction, discarding every step created so far
Interruptible advancement removes these ceilings, making large-scale fan-out (batch enrichment, per-record processing, large backfills) a first-class pattern rather than something to work around.
How it works
When a pipeline reaches an expand transition, the Pro advancer materializes the expanded steps in batches rather than all at once:
- Batch creation - a bounded number of step instances are created and committed in a single transaction
- Wire and commit - each new step’s input is marked as wired in that same transaction, so once a batch commits it is never recreated
- Continue - the advance moves on to the next not-yet-wired batch, repeating until the collection is exhausted
A single uninterrupted advance completes the whole expansion in one pass. Splitting the work into independently committed batches is what keeps any one transaction small and bounds how many records are held in memory at once — and, because each batch is durable the moment it commits, it is also what makes an interrupted expansion recoverable (see Safe interruption).
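The batch loop described above can be sketched in plain Ruby. This is an illustration only, not Ductwork’s implementation: `FakeStore`, `advance_expansion`, and the `batch_size:` argument are hypothetical names invented for this example (the real advancer manages batch sizing internally), and an in-memory array stands in for the database.

```ruby
# Hypothetical sketch: FakeStore and advance_expansion are NOT Ductwork APIs.
# The store stands in for the database so the example runs on its own.
class FakeStore
  attr_reader :steps, :commits

  def initialize
    @steps = []    # committed step records
    @wired = 0     # number of collection elements that already have a step
    @commits = 0   # number of transactions that committed
  end

  def wired_count
    @wired
  end

  # Stand-in for a database transaction: staged rows become visible only
  # if the block finishes without raising.
  def transaction
    staged = []
    yield staged
    @steps.concat(staged)
    @wired += staged.size
    @commits += 1
  end
end

# Create steps for the collection in bounded batches, one transaction each.
# batch_size is an assumption made for illustration.
def advance_expansion(store, collection, batch_size:)
  while store.wired_count < collection.size
    # Continue: pick up at the first element that is not yet wired.
    batch = collection[store.wired_count, batch_size]

    # Batch creation plus wire-and-commit happen in the same transaction.
    store.transaction do |staged|
      batch.each { |element| staged << { input: element, wired: true } }
    end
  end
end

store = FakeStore.new
advance_expansion(store, (1..10).to_a, batch_size: 4)
puts "steps=#{store.steps.size} commits=#{store.commits}"
```

With ten elements and a batch size of four, the loop commits three transactions (4 + 4 + 2 steps), and at no point are more than four staged records held in memory.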
No changes to your pipeline definition are required. The same expand transition you already use benefits automatically:
    class EnrichAllUsersDataPipeline < Ductwork::Pipeline
      define do |pipeline|
        # Fan out: one LoadUserData step per element of the collection.
        pipeline.start(QueryUsersRequiringEnrichment)
                .expand(to: LoadUserData)
      end
    end

Whether QueryUsersRequiringEnrichment returns ten elements or ten million, the advancer fans out the same way; the only difference is how many batches it takes to finish.
Safe interruption
Because every batch commits independently, an expansion that’s interrupted partway doesn’t lose the steps it already created:
- Interrupted mid-expansion - if the process crashes, is killed, or is forced down during a deploy (see Signal Handling), every batch that committed before the interruption survives. An in-flight batch that hadn’t committed yet rolls back cleanly, leaving no partial rows behind.
- Resume on re-advance - when the branch is picked up again, expansion continues from the first batch that wasn’t yet wired. Already-created steps are not duplicated.
This makes large fan-outs safe to run across deploys and restarts. The work that committed survives the interruption, and the expansion picks up where it left off instead of starting over.
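The interrupt-and-resume behavior can be simulated in a few lines of Ruby. This is a hedged sketch under stated assumptions, not the advancer’s actual code: `expand_in_batches`, `Interrupted`, `batch_size:`, and `fail_on_batch:` are hypothetical names, and the `committed` array stands in for step rows already in the database.

```ruby
# Hypothetical sketch: none of these names are part of Ductwork.
class Interrupted < StandardError; end

# committed holds the "durable" steps. fail_on_batch simulates a crash that
# arrives after a batch is staged but before it commits.
def expand_in_batches(committed, collection, batch_size:, fail_on_batch: nil)
  batch_number = 0
  while committed.size < collection.size
    batch_number += 1

    # Resume point: the first element without a committed step.
    staged = collection[committed.size, batch_size].map { |e| { input: e } }

    # An interruption here discards the staged rows, like a rollback.
    raise Interrupted, "killed during batch #{batch_number}" if batch_number == fail_on_batch

    committed.concat(staged) # "commit" the batch
  end
end

collection = (1..10).to_a
committed = []

begin
  expand_in_batches(committed, collection, batch_size: 4, fail_on_batch: 2)
rescue Interrupted
  # The first batch committed; the in-flight second batch left nothing behind.
end
after_interrupt = committed.size

# Re-advance: continues from the first element that was not yet wired.
expand_in_batches(committed, collection, batch_size: 4)
puts "after interrupt: #{after_interrupt}, after resume: #{committed.size}"
```

In this run the first call leaves four committed steps, and the second call creates only the remaining six, so no element ends up with a duplicate step.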
Configuration
There’s nothing to configure. Interruptible advancement is always on for expand transitions in Ductwork Pro. The advancer manages batching internally, with no tuning required.
Interaction with max_depth
Interruptible advancement governs how expanded steps are created; it does not change the limit on how many may be created. The pipeline_advancer.steps.max_depth ceiling still applies across the full expansion. If a collection would push a branch past max_depth, the pipeline is halted just as it would be without batching. The limit is evaluated against the cumulative total, not per batch.
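The difference between a cumulative check and a per-batch check can be shown with a small Ruby sketch. It assumes that max_depth counts steps on a branch and that -1 means unlimited; `halt_expansion?` and the numbers below are hypothetical, and the real advancer may evaluate the limit at a different point in the advance.

```ruby
# Hypothetical sketch of a cumulative max_depth check; not Ductwork's code.
UNLIMITED = -1

# True when the branch's existing steps plus the whole expansion would
# exceed max_depth. A max_depth of -1 disables the check.
def halt_expansion?(steps_on_branch:, collection_size:, max_depth:)
  return false if max_depth == UNLIMITED

  steps_on_branch + collection_size > max_depth
end

batch_size = 1_000       # assumed batch size, for illustration only
collection_size = 3_000
max_depth = 2_500

# Each individual batch fits under the limit...
per_batch_ok = batch_size <= max_depth

# ...but the expansion as a whole does not, so the pipeline would halt.
halted = halt_expansion?(steps_on_branch: 0,
                         collection_size: collection_size,
                         max_depth: max_depth)

unlimited_halted = halt_expansion?(steps_on_branch: 0,
                                   collection_size: collection_size,
                                   max_depth: UNLIMITED)

puts "per-batch fits: #{per_batch_ok}, cumulative halts: #{halted}, unlimited halts: #{unlimited_halted}"
```

Here every batch of 1,000 fits under a limit of 2,500, yet the 3,000-step total does not, which is why splitting an expansion into batches cannot be used to slip past the ceiling.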
To allow truly large fan-outs, raise (or unset) max_depth accordingly:
    default: &default
      pipeline_advancer:
        steps:
          max_depth: -1 # unlimited

Monitoring
If you’ve configured Metrics, a runaway fan-out still surfaces:
- pipeline.halted - incremented (tagged with the pipeline class) if an expansion exceeds the configured max_depth
Note that the batched fan-out bulk-inserts its steps, so it does not emit the per-step job.enqueued metric that ordinary transitions do. Instead, each committed batch is recorded in the logs as a Job batch enqueued entry. For a high-cardinality expand, watching the step count climb in the dashboard as batches commit is the clearest signal that a fan-out is progressing rather than stalled.