Deployment

Ductwork is designed to run reliably in production environments, from traditional servers to containerized deployments. By default, bin/ductwork starts a supervisor that forks child processes: one pipeline advancer and one job worker per configured pipeline. This works well for traditional deployments but conflicts with container best practices.

The container problem: Containers expect a single foreground process. Forking introduces complications with signal handling, logging, and process management; container orchestrators like Kubernetes work best when they can directly manage each process.

Ductwork provides a role configuration that allows you to run a single process type without forking. This aligns with the container philosophy of one process per container.

Set the role in your YAML configuration:

config/ductwork.pipeline_advancer.yml:

default: &default
  role: advancer
  pipelines: "*"

production:
  <<: *default

config/ductwork.job_worker.yml:

default: &default
  role: worker
  pipelines:
    - EnrichUserDataPipeline
    - ProcessOrdersPipeline

production:
  <<: *default

Or override via environment variable:

DUCTWORK_ROLE=advancer bin/ductwork
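
The interaction between the two settings can be pictured with a short Ruby sketch. The `resolve_role` helper is hypothetical and not part of Ductwork. The precedence it encodes (environment variable first, then the YAML `role` key, then the default `supervisor` mode) is an assumption based on the description above, not a confirmed implementation detail.

```ruby
require "yaml"

# Hypothetical helper, not a Ductwork API.
# Assumption: DUCTWORK_ROLE overrides the YAML `role` key, and a missing
# role falls back to the default forking supervisor.
def resolve_role(yaml_text, env_name, env = ENV)
  config = YAML.safe_load(yaml_text, aliases: true) || {}
  section = config.fetch(env_name, {}) || {}
  env["DUCTWORK_ROLE"] || section["role"] || "supervisor"
end

WORKER_CONFIG = <<~YAML
  default: &default
    role: worker
    pipelines:
      - EnrichUserDataPipeline
      - ProcessOrdersPipeline

  production:
    <<: *default
YAML

# Role taken from the YAML file when the variable is unset.
puts resolve_role(WORKER_CONFIG, "production", {})
# Environment variable wins when both are present.
puts resolve_role(WORKER_CONFIG, "production", { "DUCTWORK_ROLE" => "advancer" })
# No role anywhere: default supervisor mode.
puts resolve_role("production: {}", "production", {})
```

Passing the environment as an argument keeps the sketch easy to exercise without mutating the real `ENV`.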

Run separate containers for each role. This gives you independent scaling, cleaner logs, and better resource isolation.

Dockerfile:

FROM ruby:3.3-slim
WORKDIR /app
COPY Gemfile Gemfile.lock ./
RUN bundle config set --local without "development test" && bundle install
COPY . .
CMD ["bin/ductwork"]

docker-compose.yml:

version: "3.8"

services:
  pipeline_advancer:
    build: .
    command: bin/ductwork -c config/ductwork.yml
    environment:
      DUCTWORK_ROLE: advancer
      RAILS_ENV: production
      DATABASE_URL: postgres://db:5432/myapp
    depends_on:
      - db
    restart: unless-stopped

  job_worker_enrichment:
    build: .
    command: bin/ductwork -c config/ductwork.enrichment.yml
    environment:
      DUCTWORK_ROLE: worker
      RAILS_ENV: production
      DATABASE_URL: postgres://db:5432/myapp
    depends_on:
      - db
      - pipeline_advancer
    deploy:
      replicas: 3
    restart: unless-stopped

  job_worker_orders:
    build: .
    command: bin/ductwork -c config/ductwork.orders.yml
    environment:
      DUCTWORK_ROLE: worker
      RAILS_ENV: production
      DATABASE_URL: postgres://db:5432/myapp
    depends_on:
      - db
      - pipeline_advancer
    deploy:
      replicas: 2
    restart: unless-stopped

  db:
    image: postgres:16
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Scaling: Scale job workers independently of the advancer. If your bottleneck is step execution, add more worker containers without spinning up redundant advancers.

Resource allocation: Assign different CPU and memory limits based on workload characteristics. Advancers are typically lightweight; workers may need more resources depending on your step logic.

Logging: Each container’s stdout is a single, coherent log stream. No interleaved output from multiple processes.

Failure isolation: A crash in a job worker doesn’t affect the advancer or other workers. The orchestrator restarts only what failed.


Kubernetes deployments follow the same container-per-role pattern.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ductwork-advancer
  labels:
    app: ductwork
    component: advancer
spec:
  replicas: 1 # Usually only need one
  selector:
    matchLabels:
      app: ductwork
      component: advancer
  template:
    metadata:
      labels:
        app: ductwork
        component: advancer
    spec:
      containers:
        - name: advancer
          image: myapp:latest
          command: ["bin/ductwork"]
          args: ["-c", "config/ductwork.yml"]
          env:
            - name: DUCTWORK_ROLE
              value: advancer
            - name: RAILS_ENV
              value: production
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: database-url
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            exec:
              command:
                - pgrep
                - -f
                - ductwork
            initialDelaySeconds: 10
            periodSeconds: 30
      terminationGracePeriodSeconds: 45
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ductwork-worker
  labels:
    app: ductwork
    component: worker
spec:
  replicas: 5
  selector:
    matchLabels:
      app: ductwork
      component: worker
  template:
    metadata:
      labels:
        app: ductwork
        component: worker
    spec:
      containers:
        - name: worker
          image: myapp:latest
          command: ["bin/ductwork"]
          args: ["-c", "config/ductwork.yml"]
          env:
            - name: DUCTWORK_ROLE
              value: worker
            - name: RAILS_ENV
              value: production
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: database-url
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            exec:
              command:
                - pgrep
                - -f
                - ductwork
            initialDelaySeconds: 10
            periodSeconds: 30
      terminationGracePeriodSeconds: 45

Scale workers based on CPU utilization or custom metrics:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ductwork-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ductwork-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
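
To get a feel for how this autoscaler would react, the arithmetic can be worked through in Ruby. The sketch uses the replica formula from the Kubernetes HPA documentation, `ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the `minReplicas` and `maxReplicas` bounds above. The `desired_replicas` helper is illustrative only, and it ignores the controller's tolerance band and stabilization behavior.

```ruby
# Illustrative helper (not a Kubernetes or Ductwork API).
# desired = ceil(current_replicas * current_utilization / target_utilization),
# then clamped to the configured min/max replica bounds.
def desired_replicas(current_replicas:, current_utilization:,
                     target_utilization: 70, min_replicas: 2, max_replicas: 20)
  raw = (current_replicas * current_utilization.to_f / target_utilization).ceil
  raw.clamp(min_replicas, max_replicas)
end

# 5 workers at 95% average CPU: 5 * 95 / 70 = 6.79, rounded up to 7.
puts desired_replicas(current_replicas: 5, current_utilization: 95)
# 5 workers at 20% average CPU: 1.43 rounds up to 2, which is also the floor.
puts desired_replicas(current_replicas: 5, current_utilization: 20)
# 15 workers at 140% of requested CPU: 30, capped at maxReplicas (20).
puts desired_replicas(current_replicas: 15, current_utilization: 140)
```

Utilization is measured against each container's CPU request (`250m` for the workers above), so values over 100% are possible when a pod bursts toward its limit.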

For orchestrators that support health checks, you can verify the Ductwork process is running:

Simple process check:

pgrep -f ductwork

Heartbeat-based check: Query the ductwork_processes table for recent heartbeats from the current container. A process that hasn’t reported a heartbeat in 5 minutes is considered unhealthy.

# Example query for a health check endpoint:
# true if any process has reported a heartbeat in the last 5 minutes
Ductwork::Process
  .where('last_heartbeat_at > ?', 5.minutes.ago)
  .exists?
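
The staleness rule can also be expressed as a small script suitable for an exec-style probe, where exit status 0 means healthy and any other status means unhealthy. This is a sketch in plain Ruby: `heartbeat_fresh?` and `health_exit_code` are hypothetical helpers, and in a real application the timestamp would come from the `Ductwork::Process` records rather than being passed in by hand.

```ruby
# Hypothetical helpers, not part of Ductwork.
HEARTBEAT_TIMEOUT = 5 * 60 # seconds, matching the 5-minute window

# True when the last heartbeat is newer than the timeout window.
def heartbeat_fresh?(last_heartbeat_at, now: Time.now, timeout: HEARTBEAT_TIMEOUT)
  return false if last_heartbeat_at.nil? # never reported: treat as unhealthy
  (now - last_heartbeat_at) < timeout
end

# Maps the check to a process exit status for exec-style probes.
def health_exit_code(last_heartbeat_at, now: Time.now)
  heartbeat_fresh?(last_heartbeat_at, now: now) ? 0 : 1
end

now = Time.at(1_700_000_000)
puts health_exit_code(now - 60, now: now)  # heartbeat 1 minute ago
puts health_exit_code(now - 600, now: now) # heartbeat 10 minutes ago
puts health_exit_code(nil, now: now)       # no heartbeat recorded
```

A probe script would end with `exit health_exit_code(timestamp)` so the orchestrator can act on the result.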

Traditional (Non-Containerized) Deployment

For VMs or bare-metal servers, use the default supervisor mode with a process manager like systemd:

/etc/systemd/system/ductwork.service
[Unit]
Description=Ductwork Pipeline Processor
After=network.target postgresql.service
[Service]
Type=simple
User=deploy
WorkingDirectory=/var/www/myapp/current
ExecStart=/var/www/myapp/current/bin/ductwork -c config/ductwork.yml
Restart=always
RestartSec=5
Environment=RAILS_ENV=production
[Install]
WantedBy=multi-user.target

In this mode, the supervisor handles process management internally, restarting crashed children automatically.
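
As a rough illustration of what a forking supervisor does, the following Ruby sketch forks one child per role and replaces any child that exits abnormally. It is not Ductwork's implementation: the `supervise` method, the restart cap, and the placeholder child bodies are all assumptions made for the example.

```ruby
# Minimal supervisor sketch (illustrative, not Ductwork internals).
# children: a hash of name => callable run inside the forked child.
# Assumptions: clean exits are not restarted, and restarts are capped.
def supervise(children, max_restarts: 3)
  pids = {}
  restarts = Hash.new(0)
  children.each { |name, work| pids[fork(&work)] = name }

  until pids.empty?
    pid, status = Process.wait2 # block until any child exits
    name = pids.delete(pid)
    next if name.nil?           # not one of ours
    next if status.success?     # clean exit: leave it stopped
    next if restarts[name] >= max_restarts
    restarts[name] += 1
    pids[fork(&children.fetch(name))] = name # crashed: fork a replacement
  end
  restarts
end

children = {
  "advancer" => -> { exit!(0) }, # placeholder that exits cleanly
  "worker"   => -> { exit!(1) }  # placeholder that always crashes
}

restarts = supervise(children, max_restarts: 2)
puts restarts["worker"]   # number of times the worker was restarted
puts restarts["advancer"] # the advancer never crashed
```

The placeholder children call `exit!` so they skip the parent's `at_exit` hooks and buffered output. A real supervisor would also forward signals such as `TERM` to its children, which is part of the complexity that the single-role container mode avoids.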

| Deployment Type | Role Config | Process Management |
| --- | --- | --- |
| Traditional (VM/bare-metal) | supervisor (default) | systemd, upstart, etc. |
| Docker/Kubernetes | advancer or worker | Orchestrator handles restarts |

The role configuration lets you choose the model that fits your infrastructure. For containers, run separate services for each role. For traditional deployments, let the supervisor manage child processes.