Funnel TES Architecture
Understanding how Funnel Task Execution Service works
This guide explains the core architecture of Funnel, how it manages task execution on Kubernetes, and the design patterns used to ensure reliability and scalability.
Component Overview
Funnel Server
The central control plane running on a master or system node:
- REST API (port 8000): Workflow engines submit tasks here
- gRPC API (port 8001): Efficient task status updates
- Task Manager: Monitors pod lifecycle
- Database: Persists task state and metadata
DaemonSet Pattern
A Funnel Disk Setup DaemonSet runs on every worker node:
- setup-nfs-host initContainer: Mounts shared storage on the host
- holder container: Keeps the mount alive with a periodic touch of a keepalive file
Task Executor
Each Funnel task becomes a Kubernetes Pod:
- initContainer: wait-for-nfs (waits for mount readiness)
- container: funnel-worker-<taskid> (creates the actual task container via nerdctl)
- Runs an unprivileged userspace container for isolation
Storage Architecture (DaemonSet Pattern)
DaemonSet funnel-disk-setup (1 per node)
│
├── initContainer: setup-nfs-host
│   ├── Command: nsenter --mount=/proc/1/ns/mnt -- mount -t nfs ...
│   └── Mounts NFS on host filesystem at /mnt/shared
│
├── container: holder
│   ├── Command: while true; do sleep 30; touch /mnt/shared/.keepalive; done
│   └── Keeps NFS connection alive (prevents timeout)
│
└── volumes: hostPath /mnt/shared (DirectoryOrCreate)
Task Pod (per WDL task)
│
├── initContainer: wait-for-nfs
│   ├── Command: until [ -f /mnt/shared/.keepalive ]; do sleep 5; done
│   └── Waits for DaemonSet to mount NFS
│
├── container: funnel-worker-<id> (privileged)
│   ├── volumeMount: /mnt/shared (hostPath, HostToContainer propagation)
│   ├── nerdctl RunCommand: --volume /mnt/shared:/mnt/shared:rw
│   └── Bind-mounts shared storage into task container
│
└── nerdctl task container (unprivileged)
    └── Sees /mnt/shared, can read/write normally
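The worker-to-nerdctl hand-off above can be sketched as argv construction. This is a minimal sketch, not Funnel's actual code: `build_nerdctl_args`, the `--user` flag, and the image/command values are illustrative assumptions; only the `--volume /mnt/shared:/mnt/shared:rw` bind mount comes from this document.

```python
# Sketch of how a funnel-worker might assemble its nerdctl command line.
# The helper name and flags other than --volume are illustrative, not
# Funnel's actual implementation.
from typing import List

def build_nerdctl_args(task_id: str, image: str, command: List[str]) -> List[str]:
    """Build an argv list running the task container unprivileged,
    with the host's shared NFS path bind-mounted read/write."""
    return [
        "nerdctl", "run", "--rm",
        "--name", f"task-{task_id}",
        # Bind-mount the host's shared storage into the task container
        "--volume", "/mnt/shared:/mnt/shared:rw",
        # Run the task as an unprivileged user inside the container
        "--user", "1000:1000",
        image,
        *command,
    ]

args = build_nerdctl_args("abc123", "ubuntu:22.04", ["bash", "-c", "echo hello"])
print(" ".join(args))
```

Because the pod's volumeMount uses HostToContainer propagation, the path the worker passes to `--volume` is the same host mount the DaemonSet created.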
Why This Design?
Problem: Soft NFS mounts time out after an idle period (~1-2 minutes)
Old Solution (Broken):
- Task Pod mounts NFS when created
- Task completes, NFS remains mounted but idle
- Next task checks mount: it looks OK (VFS entry exists)
- Next task accesses it: TCP timeout → I/O error
New Solution (DaemonSet Keepalive):
- Single owner (DaemonSet) mounts NFS on host
- Keepalive loop (touch every 30s) prevents idle timeout
- Task pods only consume the mount (no mounting/unmounting)
- Safe for parallel tasks (no race conditions)
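The consumer side of this design, the wait-for-nfs initContainer, reduces to polling for the keepalive file. This Python sketch mirrors the shell loop shown in the diagram above; the default timeout value is an assumption.

```python
# Python sketch of the wait-for-nfs initContainer logic: poll until the
# DaemonSet's keepalive marker appears. The 300 s timeout is illustrative.
import os
import time

def wait_for_nfs(marker: str = "/mnt/shared/.keepalive",
                 interval: float = 5.0, timeout: float = 300.0) -> bool:
    """Return True once the keepalive marker exists, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(marker):
            return True
        time.sleep(interval)
    return False
```

This is the Python equivalent of the shell loop `until [ -f /mnt/shared/.keepalive ]; do sleep 5; done`, with an explicit deadline so a missing mount fails fast instead of hanging forever.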
Task Lifecycle
1. Workflow Engine (Cromwell)
   │
   └─> Submit task via REST API
       curl -X POST http://funnel:8000/v1/tasks
       {
         "name": "task-1",
         "commandLine": ["bash", "-c", "echo hello"],
         "outputs": [...]
       }
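The same submission can be sketched in Python using only the standard library. The endpoint and payload shape are taken from this document; `build_task` and `submit_task` are hypothetical helper names, and the request has not been verified against a live Funnel server.

```python
# Sketch of submitting a task to Funnel's REST API with the stdlib.
# Endpoint and payload shape follow this document; helpers are hypothetical.
import json
import urllib.request

def build_task(name, command_line, outputs=None):
    """Assemble the JSON body for POST /v1/tasks."""
    return {
        "name": name,
        "commandLine": command_line,
        "outputs": outputs or [],
    }

def submit_task(base_url, task):
    """POST the task and return the decoded response (untested sketch)."""
    req = urllib.request.Request(
        f"{base_url}/v1/tasks",
        data=json.dumps(task).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

task = build_task("task-1", ["bash", "-c", "echo hello"])
```

A workflow engine would call `submit_task("http://funnel:8000", task)` and keep the returned task ID for later status queries.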
2. Funnel Server
   │
   ├─> Create Pod manifest
   │   ├── initContainer: wait-for-nfs
   │   ├── container: funnel-worker-task-1
   │   └── volumeMount: /mnt/shared (HostToContainer)
   │
   └─> Submit to Kubernetes API
       (equivalent to kubectl create -f task-1-pod.yaml -n funnel)
3. Kubernetes Scheduler
   │
   └─> Assign pod to node
       Select a worker node with sufficient resources
4. Node's kubelet
   │
   ├─> Pull container image
   ├─> Start initContainers
   │   └── wait-for-nfs waits for DaemonSet mount
   ├─> Start main container
   │   └── nerdctl launches the task container
   └─> Monitor until completion
5. Task Execution
   │
   ├─> runCommand: ["bash", "-c", "echo hello"]
   ├─> Mount propagation: HostToContainer picks up host's /mnt/shared
   ├─> Task reads/writes to /mnt/shared
   └─> Task exits (exit code captured)
6. Funnel Server
   │
   ├─> Poll pod status every N seconds
   ├─> Capture exit code and logs
   ├─> Mark task COMPLETE/FAILED
   └─> Store result in database
7. Workflow Engine
   │
   └─> Query task status via REST API
       curl http://funnel:8000/v1/tasks/task-id
       Returns: { state: "COMPLETE", outputs: [...] }
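The polling in steps 6-7 can be sketched as a loop that re-fetches the task until it reaches a terminal state. Here `fetch` is an injected callable standing in for a real `GET /v1/tasks/{id}` call; the terminal states follow this document's state enumeration, and the helper itself is hypothetical.

```python
# Sketch of a status-polling loop (steps 6-7). `fetch` stands in for a
# real GET /v1/tasks/{id}; terminal states follow this document's model.
import time

TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR"}

def poll_until_done(fetch, task_id, interval=0.0, max_polls=100):
    """Call fetch(task_id) until the task reaches a terminal state."""
    for _ in range(max_polls):
        task = fetch(task_id)
        if task["state"] in TERMINAL_STATES:
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} still running after {max_polls} polls")
```

In production a workflow engine would use a nonzero `interval` (or the gRPC `WatchTask` stream) rather than a tight loop.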
API Endpoints
REST API (Port 8000)
# List tasks
GET /v1/tasks
# Create task
POST /v1/tasks
Body: { "name": "task-1", "commandLine": [...], ... }
# Get task details
GET /v1/tasks/{taskId}
# Get task logs
GET /v1/tasks/{taskId}/logs
# Cancel task
POST /v1/tasks/{taskId}:cancel
# Health check
GET /healthz
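A thin client over these endpoints might look like the sketch below. Only the URL shapes come from the list above; the `FunnelClient` class and its method names are a hypothetical convenience, not part of Funnel.

```python
# Hypothetical thin wrapper over the REST endpoints listed above.
# Only the URL shapes are taken from this document.
class FunnelClient:
    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")

    def list_tasks_url(self) -> str:
        return f"{self.base_url}/v1/tasks"          # GET (list), POST (create)

    def task_url(self, task_id: str) -> str:
        return f"{self.base_url}/v1/tasks/{task_id}"  # GET details

    def logs_url(self, task_id: str) -> str:
        return f"{self.base_url}/v1/tasks/{task_id}/logs"  # GET logs

    def cancel_url(self, task_id: str) -> str:
        return f"{self.base_url}/v1/tasks/{task_id}:cancel"  # POST cancel

client = FunnelClient("http://funnel:8000")
```

Centralizing URL construction this way keeps the `:cancel` custom-verb suffix in one place.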
gRPC API (Port 8001)
service TaskService {
rpc CreateTask(Task) returns (CreateTaskResponse);
rpc GetTask(GetTaskRequest) returns (Task);
rpc ListTasks(ListTasksRequest) returns (ListTasksResponse);
rpc CancelTask(CancelTaskRequest) returns (Empty);
rpc WatchTask(WatchTaskRequest) returns (stream Task);
}
Data Model
Task Object
{
"id": "task-abc123",
"state": "RUNNING", // QUEUED, INITIALIZING, RUNNING, PAUSED, COMPLETE, EXECUTOR_ERROR, SYSTEM_ERROR
"name": "alignment-task",
"commandLine": ["bwa", "mem", "ref.fa", "reads.fq"],
"inputs": [
{
"url": "s3://bucket/ref.fa",
"path": "/task/ref.fa"
}
],
"outputs": [
{
"url": "s3://bucket/results/out.bam",
"path": "/task/out.bam"
}
],
"resources": {
"cpuCores": 2,
"ramGb": 8,
"diskGb": 100
},
"logs": [
{
"taskId": "task-abc123",
"stdLog": "Task started...",
"endTime": "2026-03-13T10:00:00Z"
}
],
"createdTime": "2026-03-13T09:50:00Z",
"startTime": "2026-03-13T09:51:00Z",
"endTime": "2026-03-13T10:00:00Z"
}
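The three timestamps in the Task Object make queue wait and run duration easy to derive. This sketch uses the example's own values; the `parse` helper is illustrative.

```python
# Derive queue wait and run duration from the ISO 8601 timestamps in the
# Task Object example above. The `parse` helper is illustrative.
from datetime import datetime

def parse(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' (UTC)."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

task = {
    "createdTime": "2026-03-13T09:50:00Z",
    "startTime": "2026-03-13T09:51:00Z",
    "endTime": "2026-03-13T10:00:00Z",
}
queue_wait = parse(task["startTime"]) - parse(task["createdTime"])
run_time = parse(task["endTime"]) - parse(task["startTime"])
print(queue_wait, run_time)  # 1 minute queued, 9 minutes running
```

Tracking these two intervals separately distinguishes scheduling pressure (long queue wait) from slow workloads (long run time).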
Security Model
Container Isolation
- Funnel Worker: Runs as root (creates containers)
- Task Container: Runs as unprivileged user (can't escalate)
- Storage Access: Bind-mounted NFS is read/write but isolated
Volume Access
- Task cannot mount new volumes (unprivileged)
- Task cannot access other pods' volumes
- Cross-pod communication via network only
Scalability Considerations
Horizontal Scaling
- Add more worker nodes → Karpenter provisions them
- DaemonSet automatically runs on new nodes
- Task Pod can be scheduled to any available node
Vertical Scaling
- Increase resource requests/limits on tasks
- Funnel scales to node capabilities
- Karpenter provisions larger instances if needed
Storage Bottlenecks
- DaemonSet keepalive ensures NFS stability
- All nodes use same shared mount
- Consider storage I/O patterns for large workflows
Performance Tuning
Task Startup Time
Total Time = Pod creation (2-5s) + Container pull (5-30s) + Task init (0-10s)
Optimize by:
- Pre-pulling container images on nodes
- Using smaller container images
- Parallel task submission
NFS Performance
Tuned mount options:
mount -t nfs -o vers=4,soft,timeo=30,retrans=3,_netdev ...
- vers=4: NFSv4 (modern, efficient)
- soft: Soft timeout (fail with an error rather than hang; better for cloud)
- timeo=30: 3-second timeout per attempt (timeo is in tenths of a second), with retrans=3
- _netdev: Network device (mount only after the network is ready)
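Mount option strings like the one above are comma-separated key[=value] pairs; a small parser makes that concrete. The parser itself is an illustrative sketch, not a tool Funnel ships.

```python
# Illustrative parser for NFS mount option strings like
# "vers=4,soft,timeo=30,retrans=3,_netdev".
def parse_mount_opts(opts: str) -> dict:
    """Split comma-separated options into {key: value-or-True}."""
    parsed = {}
    for item in opts.split(","):
        key, _, value = item.partition("=")
        # Flag-style options (e.g. "soft", "_netdev") have no value
        parsed[key] = value if value else True
    return parsed

opts = parse_mount_opts("vers=4,soft,timeo=30,retrans=3,_netdev")
# timeo is in tenths of a second, so timeo=30 means 3.0 s per attempt
per_attempt_seconds = int(opts["timeo"]) / 10
```

This is handy when auditing live mounts: compare `parse_mount_opts(...)` output from `/proc/mounts` against the intended tuning.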
Failure Handling
Soft Mount Timeout
When NFS is idle > 2 minutes (cloud timeout):
- Soft mount returns an I/O error
- retrans=3 retries the request 3 times
- If it still fails, the task gets an I/O error
Prevention: DaemonSet keepalive prevents idle timeout.
Pod Eviction
If node becomes unavailable:
- Kubernetes marks the node NotReady and the pod is evicted
- Funnel detects status change
- Marks task as SYSTEM_ERROR
- Workflow engine can retry on another node
Container Crash
If task container crashes:
- Funnel detects exit code
- Marks task as EXECUTOR_ERROR
- Logs error message
- Workflow engine decides to retry or fail
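The two failure paths above can be sketched as a simple classification of the pod outcome. The function name and the exact mapping are assumptions based on this document's state model, not Funnel's code.

```python
# Sketch of mapping a pod outcome to the task states used in this
# document. The exact mapping is an assumption, not Funnel's code.
def classify_exit(exit_code, pod_evicted=False):
    """Map a pod outcome to a task state."""
    if pod_evicted:
        return "SYSTEM_ERROR"   # node lost / pod evicted (infrastructure fault)
    if exit_code == 0:
        return "COMPLETE"
    return "EXECUTOR_ERROR"     # the task's own container failed
```

The SYSTEM_ERROR vs EXECUTOR_ERROR distinction matters to the workflow engine: infrastructure faults are usually safe to retry on another node, while executor failures often indicate a bug in the task itself.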
See Also
- Container Images β Build & manage images
- Configuration β Runtime options
- Troubleshooting β Debug issues
- Cromwell Integration β How Cromwell uses TES
Last Updated: March 13, 2026