Cromwell + TES on Kubernetes
A comprehensive platform for genomic workflow orchestration on any* cloud
This page describes how to deploy Cromwell (workflow engine) and Funnel TES (task execution) on OVHcloud, AWS, or another managed Kubernetes service. It covers setup, the configuration points that deserve special attention, and rough cost estimates for running genomics analyses on this platform.
"Any cloud" might be a bit optimistic: this page focuses on AWS EKS and OVHcloud MKS.
🎯 Project Goal
Run high-throughput genomic workflows (WDL/CWL) with:
- Cromwell: Workflow language execution engine (WDL, CWL) — can run on-prem or in-cloud
- Funnel TES: Task Execution Service (GA4GH standard) — runs in Kubernetes cluster
- Kubernetes: Any managed or self-hosted cluster (OVH, AWS, GCP, etc.)
- Cloud Storage: S3, NFS, EFS, or other cloud-native options
- Auto-scaling: Karpenter-managed worker pools
Architecture
Prior to this setup, I had limited hands-on Kubernetes experience. The setup described here grew out of fail-forward, pragmatic decisions, and I'm well aware there may be significant room for improvement. I invite you to suggest improvements through Pull Requests 😊.
┌──────────────────────────┐
│    Cromwell Workflow     │  (on-prem, cloud, or local)
│          Engine          │  (submits tasks to TES)
└────────────┬─────────────┘
             │ WDL/CWL tasks
             ↓
┌────────────────────────────────┐
│       Kubernetes Cluster       │  (OVH, AWS, GCP, etc.)
│  ┌──────────────────────────┐  │
│  │        Funnel TES        │  │
│  │     (task executor)      │  │
│  └──────────────────────────┘  │
│  ┌──────────────────────────┐  │
│  │   Task Pods (workers)    │  │
│  │ (auto-scaled (karpenter))│  │
│  └──────────────────────────┘  │
│  ┌──────────────────────────┐  │
│  │         Storage          │  │
│  │     (S3/NFS/EFS/...)     │  │
│  └──────────────────────────┘  │
└────────────────────────────────┘
             │
             ↓
┌────────────────────┐
│  Results/Outputs   │
│  (Cloud Storage)   │
└────────────────────┘
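To make the hand-off from Cromwell to TES in the diagram more concrete, here is a minimal sketch of the kind of JSON task document the GA4GH TES API accepts on `POST /v1/tasks`. The image, command, resource numbers, and the `http://localhost:8000` URL are placeholders chosen for illustration, not values from this deployment. In practice Cromwell generates these payloads itself.

```python
import json
import urllib.request


def build_tes_task(name, image, command, cpu_cores=1, ram_gb=2.0, disk_gb=10.0):
    """Return a minimal GA4GH TES task document as a plain dict."""
    return {
        "name": name,
        "executors": [{"image": image, "command": command}],
        "resources": {"cpu_cores": cpu_cores, "ram_gb": ram_gb, "disk_gb": disk_gb},
    }


def submit_tes_task(base_url, task):
    """POST the task to a TES server and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{base_url}/v1/tasks",
        data=json.dumps(task).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder task: a single container that echoes a message.
    task = build_tes_task("hello", "ubuntu:22.04", ["echo", "hello from TES"])
    print(json.dumps(task, indent=2))
    # To submit against a reachable TES endpoint (placeholder URL):
    # print(submit_tes_task("http://localhost:8000", task))
```

Building the payload is kept separate from the HTTP call so the document can be inspected or logged without a running TES server.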
1. Cloud-Agnostic Core
- TES, Cromwell, Karpenter work on any Kubernetes
- Documentation tries to separate platform-agnostic from platform-specific
2. Modular Components
- Swap storage (NFS ↔ S3 ↔ EFS)
- Swap compute (OVH ↔ AWS (↔ GCP))
- Swap autoscaling (Karpenter ↔ Cloud-Managed)
3. Storage Strategy (DaemonSet Pattern)
DaemonSet (on every node)
├─ Mounts shared storage (NFS/EFS/S3)
└─ Keeps connection alive to prevent timeouts on node re-usage during consecutive tasks
Task Pods
├─ Wait for mount to be ready
├─ Consume via hostPath + propagation
└─ No unmounting on exit (DaemonSet owns lifecycle)
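The "wait for mount to be ready" step on the task-pod side could look roughly like the sketch below. It assumes that readiness can be detected by checking whether a path is a mount point. The function name, the timeout values, and the `/mnt/shared` path in the usage note are hypothetical, and the actual check used in this deployment may differ.

```python
import os
import time


def wait_for_mount(path, timeout_s=300.0, poll_s=2.0, is_ready=os.path.ismount):
    """Block until `path` is reported ready; return False if the timeout expires.

    `is_ready` defaults to os.path.ismount and can be swapped for another
    check (for example, looking for a sentinel file written by the DaemonSet).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A task entrypoint would call something like `wait_for_mount("/mnt/shared")` before starting work and exit with an error if it returns `False`. The function never unmounts anything, in line with the DaemonSet owning the mount lifecycle.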
4. Auto-scaling (Karpenter)
Karpenter can be configured to provision nodes from a preselected list of instance types.
Monitor pod queue → Insufficient resources → Scale up nodes
Pods completed → Idle timeout → Scale down
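The two flows above can be modelled as a toy decision function. This is only an illustration of the idea and not Karpenter's actual scheduling logic, which is considerably more involved. The `InstanceType` fields, the function names, and any instance names or prices passed in are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpu: int
    memory_gb: int
    hourly_price: float


def pick_instance(allowed, cpu_needed, memory_gb_needed):
    """Return the cheapest allowed instance type that fits the request, or None."""
    fitting = [
        i for i in allowed
        if i.cpu >= cpu_needed and i.memory_gb >= memory_gb_needed
    ]
    return min(fitting, key=lambda i: i.hourly_price, default=None)


def scaling_action(pending_pods, idle_seconds, idle_timeout_s):
    """Map the queue state to 'scale_up', 'scale_down', or 'hold'."""
    if pending_pods > 0:
        return "scale_up"      # pods are waiting on resources
    if idle_seconds >= idle_timeout_s:
        return "scale_down"    # nodes have been idle for long enough
    return "hold"
```

Restricting `allowed` to a preselected list mirrors the instance-type selection described above: a pending pod is only ever matched against types that were explicitly permitted.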
5. Cost Optimization
This deployment was built with routine genomics pipelines in mind. In this setting, data arrives in spikes (when sequencing machines finish a run) and analysis is time-critical. Therefore, we aimed for:
- Nodes scaling to (near) zero when idle, with Cromwell kept outside the Kubernetes cluster
- Spot instances with robust retries where available
- Avoiding localization of static data (such as reference data) where possible
- Access to a wide range of instance types
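As a back-of-the-envelope illustration of why scale-to-zero and spot instances matter for spiky workloads, the sketch below compares billed node-hours under a few simple assumptions. All parameters (prices, busy hours, discount, rerun fraction) are hypothetical inputs that you would supply yourself, and the model ignores storage, egress, and control-plane costs.

```python
def monthly_compute_cost(hourly_price, nodes, busy_hours, scale_to_zero, hours_in_month=730):
    """Cost of `nodes` workers billed either only while busy or for the whole month."""
    billed_hours = busy_hours if scale_to_zero else hours_in_month
    return hourly_price * nodes * billed_hours


def spot_adjusted_cost(on_demand_cost, spot_discount, rerun_fraction):
    """Apply a spot discount, then add the share of work redone after interruptions."""
    return on_demand_cost * (1.0 - spot_discount) * (1.0 + rerun_fraction)


if __name__ == "__main__":
    # Hypothetical inputs: 2 nodes at 1.00 per hour, busy 100 hours per month.
    always_on = monthly_compute_cost(1.0, 2, 100, scale_to_zero=False)
    on_demand = monthly_compute_cost(1.0, 2, 100, scale_to_zero=True)
    spot = spot_adjusted_cost(on_demand, spot_discount=0.5, rerun_fraction=0.1)
    print(f"always-on: {always_on:.2f}, scale-to-zero: {on_demand:.2f}, spot: {spot:.2f}")
```

The `rerun_fraction` term is a crude stand-in for the retry overhead of interrupted spot tasks. It is one way to check whether a given discount still pays off once retries are taken into account.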
Development & Contributing
Development: Pending Pull Requests
Documentation of work-in-progress improvements to upstream repositories:
- karpenter-provider-ovhcloud: Node labeling, drift detection, pool creation fixes
- funnel: Kubernetes backend improvements (optional S3, ConfigMap templates, template rendering)
- cromwell: S3 endpoint support, TES memory-retry, local filesystem support
Contribute
- Issues: Document in GitHub Issues with a [TES], [Cromwell], [OVH], or [AWS] prefix
- Updates: Submit PRs with documentation improvements
- Questions: Check relevant section or file an issue
- Contact: geert.vandeweyer@uza.be
📖 How to Use This Documentation
The documentation is organized around two main deployment examples: OVHcloud and AWS. Additional pages describe the individual components in more detail.
I want to deploy on OVHcloud
Start here: OVHcloud Installation Guide
Then reference:
- TES Architecture — Understand the task execution layer
- Cromwell Configuration — Set up workflow engine
- Karpenter Configuration — Auto-scaling (optional)
I want to deploy on AWS
Start here: AWS Installation Guide
Then reference:
- TES Architecture — Understand the task execution layer
- Cromwell Configuration — Set up workflow engine
- Karpenter Configuration — Auto-scaling (optional)
The Task Execution Layer: TES (Funnel)
Start here: Funnel TES Overview
Then dive into:
- Architecture — Design & patterns
- Container Images — Custom builds
- Configuration — Runtime options
- Troubleshooting — Common issues
The Workflow Execution Layer: Cromwell
Start here: Cromwell Overview
Then dive into:
- Configuration — Backends & runtime settings
- Workflows — Submitting & monitoring
- Troubleshooting — Common issues
The Cluster Autoscaling Layer: Karpenter
Start here: Karpenter Overview
Then dive into:
I need quick command references
See: Quick Reference
Covers kubectl, openstack, and aws CLI commands, common tasks, and troubleshooting.
📊 Status & Maintenance
| Component | Status | Tested |
|---|---|---|
| OVHcloud Deployment | ✅ Production | OVH MKS 1.34 |
| AWS Deployment | 📋 Template | Not yet |
| Funnel TES | ✅ Functional (PR#1357) | Yes |
| Cromwell | ✅ Functional (PR#7858) | Yes |
| Karpenter OVH | ✅ Functional (PR#1) | Yes |
| Karpenter AWS | 📋 Template | Not yet |
🔐 Security Considerations
- Encryption: LUKS on persistent volumes (OVH), EBS encryption (AWS)
- Networking: Private subnets, security groups configured
- Access Control: RBAC configured per component
- Credentials: Environment variables, no hardcoding
See platform-specific guides for detailed security setup.
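As an illustration of the "environment variables, no hardcoding" rule, here is a minimal sketch of a helper that reads S3 credentials from the environment and fails early when they are absent. The variable names are the standard AWS SDK ones. Whether a given component of this deployment reads exactly these names is an assumption, and the helper itself is hypothetical.

```python
import os


def load_s3_credentials(environ=os.environ):
    """Read S3 credentials from the environment; raise if any are missing."""
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        # Report only the variable names, never the values.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in required}
```

Failing at startup with the names of the missing variables tends to be easier to debug than an authentication error deep inside a running task, and it keeps secret values out of both the code and the logs.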
📋 Helpful Resources
- Cromwell Documentation
- Funnel TES Documentation
- Kubernetes Documentation
- Karpenter Documentation
- GA4GH TES Specification
Last Updated: March 28, 2026
Version: 1.0 (OVH testing finished)