Core Algorithms · Active
EIL Training Stack — Metaflow on AWS Batch
Our shared ML platform: Metaflow flows, AWS Batch compute, MLflow tracking, and reproducible PyPI dependency management across local and cloud runs.
Metaflow · AWS Batch · MLflow · Infra
A pragmatic training platform optimized for a small lab. The goal is simple: one command spins up a tracked, reproducible flow that runs identically on a laptop and on AWS Batch.
Components
ml-models: model definitions, training flows, evaluation utilities, and a utils.batch_deps module that keeps local and Batch dependencies in lockstep via Metaflow's @pypi decorator.
eil-infra: CDK-managed AWS infrastructure, including ECS-hosted MLflow, SSM tunnels for safe access, and Batch compute environments sized for g4dn.xlarge.
infra-bootstrap: shared submodule that initializes the Metaflow config and SSM tunnel on first run.
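To make the lockstep idea concrete, here is a minimal sketch of how a batch_deps-style module could work. The actual contents of utils.batch_deps are not shown here, so the names `PYTHON_VERSION`, `BASE_PACKAGES`, and `pypi_kwargs`, as well as the pinned versions, are hypothetical. The one thing taken from Metaflow itself is that its @pypi decorator accepts `python` and `packages` arguments. The sketch keeps every pin in one place and refuses conflicting per-step overrides, so local and Batch runs resolve the same versions.

```python
"""Hypothetical sketch of a batch_deps-style module; all names are illustrative."""
from typing import Dict, Optional

# Single source of truth for pinned versions.
# These pins are examples only, not the lab's actual dependency set.
PYTHON_VERSION = "3.11"
BASE_PACKAGES: Dict[str, str] = {
    "numpy": "1.26.4",
    "mlflow": "2.14.1",
}


def pypi_kwargs(extra: Optional[Dict[str, str]] = None) -> dict:
    """Build keyword arguments for Metaflow's @pypi decorator.

    Step-specific packages may be added through `extra`, but a step may not
    re-pin a shared package to a different version.
    """
    packages = dict(BASE_PACKAGES)  # copy so the shared pins are never mutated
    for name, version in (extra or {}).items():
        pinned = packages.get(name)
        if pinned is not None and pinned != version:
            raise ValueError(
                f"{name} is pinned to {pinned}; refusing override to {version}"
            )
        packages[name] = version
    return {"python": PYTHON_VERSION, "packages": packages}


# Intended use inside a flow (requires Metaflow, so shown as a comment):
#
#   from metaflow import FlowSpec, batch, pypi, step
#
#   class TrainFlow(FlowSpec):
#       @pypi(**pypi_kwargs({"torch": "2.3.1"}))
#       @batch(gpu=1)
#       @step
#       def train(self):
#           ...
```

Because the decorator arguments come from one function, a version bump is a one-line change that applies to every step, whether the flow runs locally or on Batch.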
Why we built it ourselves
Off-the-shelf platforms tend to optimize for either local development or cloud scale, but rarely for both through the same code path. Our stack is small and opinionated, and it lets a four-person lab run robust, reproducible training without an MLOps team.