Cromwell + TES on Kubernetes

A comprehensive platform for genomic workflow orchestration on any* cloud

This page describes the steps taken to deploy Cromwell (workflow engine) + Funnel TES (task execution) on OVHcloud, AWS, or another managed Kubernetes service. We cover setup, configuration points that need special attention, and rough cost estimates for running genomics analyses on this platform.

*"Any" might be a bit optimistic: this page focuses on AWS EKS and OVHcloud MKS.


🎯 Project Goal

Run high-throughput genomic workflows (WDL/CWL) with:

  • Cromwell: Workflow language execution engine (WDL, CWL) — can run on-prem or in-cloud
  • Funnel TES: Task Execution Service (GA4GH standard) — runs in Kubernetes cluster
  • Kubernetes: Any managed or self-hosted cluster (OVH, AWS, GCP, etc.)
  • Cloud Storage: S3, NFS, EFS, or other cloud-native options
  • Auto-scaling: Karpenter-managed worker pools
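To illustrate how the pieces connect, here is a minimal sketch of a Cromwell TES backend stanza pointing at a Funnel endpoint (the endpoint URL and root path are placeholders, not values from this deployment):

```hocon
# Sketch: Cromwell backend configuration targeting a TES endpoint.
# Placeholder values — adapt endpoint and root to your deployment.
backend {
  default = "TES"
  providers {
    TES {
      actor-factory = "cromwell.backend.impl.tes.TesBackendLifecycleActorFactory"
      config {
        # Funnel's TES API, reachable from wherever Cromwell runs
        endpoint = "http://funnel.example.com:8000/v1/tasks"
        # Shared storage root visible to both Cromwell and the task pods
        root = "/mnt/shared/cromwell-executions"
      }
    }
  }
}
```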

Architecture

Prior to this setup, I had limited hands-on K8s experience. The setup described here was achieved through fail-forward iteration and pragmatic decisions, and I’m well aware there is significant room for improvement. I invite you to suggest improvements through Pull Requests 😊.

┌──────────────────────────┐
│  Cromwell Workflow       │  (on-prem, cloud, or local)
│  Engine                  │  (submits tasks to TES)
└────────────┬─────────────┘
             │ WDL/CWL tasks
             ↓
    ┌────────────────────────────────┐
    │  Kubernetes Cluster            │  (OVH, AWS, GCP, etc.)
    │  ┌──────────────────────────┐  │
    │  │ Funnel TES               │  │
    │  │ (task executor)          │  │
    │  └──────────────────────────┘  │
    │  ┌──────────────────────────┐  │
    │  │ Task Pods (workers)      │  │
    │  │ (auto-scaled (karpenter))│  │
    │  └──────────────────────────┘  │
    │  ┌──────────────────────────┐  │
    │  │ Storage                  │  │
    │  │ (S3/NFS/EFS/...)         │  │
    │  └──────────────────────────┘  │
    └────────────────────────────────┘
             │
             ↓
    ┌────────────────────┐
    │ Results/Outputs    │
    │ (Cloud Storage)    │
    └────────────────────┘
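The arrow from Cromwell into the cluster is plain GA4GH TES over HTTP: any client can build a task document and POST it to Funnel's `/v1/tasks` endpoint. A minimal sketch in Python (the Funnel URL is a placeholder):

```python
"""Sketch: build and submit a minimal GA4GH TES v1 task to a Funnel endpoint.

The endpoint URL below is a placeholder; Funnel serves the TES API
under /v1/tasks by default.
"""
import json
import urllib.request

FUNNEL_URL = "http://funnel.example.com:8000/v1/tasks"  # placeholder


def make_task(name: str, image: str, command: list[str]) -> dict:
    """Build a minimal TES task document (GA4GH TES v1 schema)."""
    return {
        "name": name,
        "executors": [{"image": image, "command": command}],
        "resources": {"cpu_cores": 1, "ram_gb": 1.0},
    }


def submit(task: dict) -> str:
    """POST the task to the TES endpoint; returns the assigned task id."""
    req = urllib.request.Request(
        FUNNEL_URL,
        data=json.dumps(task).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]


if __name__ == "__main__":
    # Print the task document instead of submitting, so this runs offline.
    task = make_task("hello", "alpine:3.20", ["echo", "hello"])
    print(json.dumps(task, indent=2))
```

Cromwell does exactly this on your behalf for every WDL/CWL task, adding inputs, outputs, and resource hints from the workflow definition.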

1. Cloud-Agnostic Core

  • TES, Cromwell, Karpenter work on any Kubernetes
  • Documentation tries to separate platform-agnostic from platform-specific

2. Modular Components

  • Swap storage (NFS ↔ S3 ↔ EFS)
  • Swap compute (OVH ↔ AWS (↔ GCP))
  • Swap autoscaling (Karpenter ↔ Cloud-Managed)
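Swapping storage, for instance, is mostly a Funnel configuration change. A sketch of a Funnel config enabling S3 alongside a local (NFS/EFS) path — all values are placeholders, and in practice credentials should come from the environment or instance roles rather than the config file:

```yaml
# Sketch: Funnel storage configuration (placeholder values).
LocalStorage:
  AllowedDirs:
    - /mnt/shared          # shared NFS/EFS mount visible to task pods
AmazonS3:
  Disabled: false
  Key: "ACCESS_KEY"        # placeholder — prefer env/instance credentials
  Secret: "SECRET_KEY"
```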

3. Storage Strategy (DaemonSet Pattern)

DaemonSet (on every node)
  ├─ Mounts shared storage (NFS/EFS/S3)
  └─ Keeps the mount alive to prevent timeouts when a node is reused for consecutive tasks

Task Pods
  ├─ Wait for mount to be ready
  ├─ Consume via hostPath + propagation
  └─ No unmounting on exit (DaemonSet owns lifecycle)
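A trimmed-down sketch of the DaemonSet half of this pattern (names, image, and paths are illustrative):

```yaml
# Illustrative DaemonSet that holds a shared NFS mount on every node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: shared-storage-mounter
spec:
  selector:
    matchLabels:
      app: shared-storage-mounter
  template:
    metadata:
      labels:
        app: shared-storage-mounter
    spec:
      containers:
        - name: mounter
          image: alpine:3.20
          securityContext:
            privileged: true        # needed to mount inside the container
          command: ["sh", "-c"]
          args:
            - apk add --no-cache nfs-utils &&
              mount -t nfs nfs.example.com:/export /mnt/shared &&
              sleep infinity
          volumeMounts:
            - name: shared
              mountPath: /mnt/shared
              mountPropagation: Bidirectional   # propagate the mount to the host
      volumes:
        - name: shared
          hostPath:
            path: /mnt/shared
            type: DirectoryOrCreate
```

Task pods then consume the same `hostPath` with `mountPropagation: HostToContainer`, so they see the mount without ever owning it.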

4. Auto-scaling (Karpenter)

Karpenter can be configured to pick nodes from a preselected list of instance types.

Monitor pod queue → Insufficient resources → Scale up nodes
                    Pods completed → Idle timeout → Scale down
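A sketch of what that instance-type selection could look like in a Karpenter NodePool (instance types, capacity types, and limits are placeholders):

```yaml
# Sketch: Karpenter NodePool restricted to a preselected set of instance types.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: genomics-workers
spec:
  template:
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m5.2xlarge", "m5.4xlarge", "r5.2xlarge"]  # placeholder list
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  limits:
    cpu: "512"              # hard ceiling on total provisioned vCPUs
```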

5. Cost Optimization

This deployment was built with routine genomics pipelines in mind. In that setting, data arrives in spikes (whenever sequencing machines finish a run) and is time-critical. Therefore, we aimed to:

  • Scale nodes to (near) zero when idle, and keep Cromwell outside the K8s ecosystem
  • Use spot instances with robust retries where available
  • Avoid localization of static data (e.g. reference data) where possible
  • Provide access to a wide range of instance types
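The scale-to-(near-)zero behaviour maps onto Karpenter's disruption settings. A fragment that might sit inside a NodePool spec (the timeout value is illustrative):

```yaml
# Sketch: disruption settings inside a Karpenter NodePool spec,
# so idle nodes are reclaimed shortly after their tasks finish.
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m    # illustrative idle timeout
```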

🛠️ Development & Contributing

Development: Pending Pull Requests

Documentation of work-in-progress improvements to upstream repositories:

  • karpenter-provider-ovhcloud: Node labeling, drift detection, pool creation fixes
  • funnel: Kubernetes backend improvements (optional S3, ConfigMap templates, template rendering)
  • cromwell: S3 endpoint support, TES memory-retry, local filesystem support

Contribute

  • Issues: Document in GitHub Issues with [TES], [Cromwell], [OVH], or [AWS] prefix
  • Updates: Submit PRs with documentation improvements
  • Questions: Check relevant section or file an issue
  • Contact: geert.vandeweyer@uza.be

📖 How to Use This Documentation

The documentation is organized around two main deployment examples: OVHcloud and AWS. A number of additional pages describe individual components in more detail.

I want to deploy on OVHcloud

Start here: OVHcloud Installation Guide

Then reference:

I want to deploy on AWS

Start here: AWS Installation Guide

Then reference:

The Task Execution Layer: TES (Funnel)

Start here: Funnel TES Overview

Then dive into:

The Workflow Execution Layer: Cromwell

Start here: Cromwell Overview

Then dive into:

The Cluster Autoscaling Layer: Karpenter

Start here: Karpenter Overview

Then dive into:

I need quick command references

See: Quick Reference

kubectl, openstack, aws CLI commands, common tasks, troubleshooting.


📊 Status & Maintenance

Component           | Status                  | Tested
--------------------|-------------------------|-------------
OVHcloud Deployment | ✅ Production           | OVH MKS 1.34
AWS Deployment      | 📋 Template             | Not yet
Funnel TES          | ✅ Functional (PR#1357) | Yes
Cromwell            | ✅ Functional (PR#7858) | Yes
Karpenter OVH       | ✅ Functional (PR#1)    | Yes
Karpenter AWS       | 📋 Template             | Not yet

🔐 Security Considerations

  • Encryption: LUKS on persistent volumes (OVH), EBS encryption (AWS)
  • Networking: Private subnets, security groups configured
  • Access Control: RBAC configured per component
  • Credentials: Environment variables, no hardcoding

See platform-specific guides for detailed security setup.
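As an example of the per-component RBAC, the task executor only needs namespace-scoped rights over the workloads it creates. A minimal sketch, assuming Funnel runs in a hypothetical `funnel` namespace and launches tasks as pods/jobs:

```yaml
# Illustrative namespace-scoped Role for a TES executor that creates task workloads.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: funnel-task-runner
  namespace: funnel        # hypothetical namespace
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "create", "delete"]
```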


📋 Helpful Resources


Last Updated: March 28, 2026
Version: 1.0 (OVH testing finished)