DCN-V2 — Deep & Cross Networks
Cross-network architectures with low-rank decomposition and mixture-of-experts gating — explicit feature-interaction layers for deep models.
Following Wang et al. (2021), we study explicit feature-crossing architectures and the trade-offs of low-rank versus full-rank gating. The core building block is the cross layer:
$$x_{l+1} = x_0 \odot (W_l x_l + b_l) + x_l$$
with a low-rank decomposition of $W_l$ and a mixture-of-experts extension that lets the model allocate capacity adaptively across feature regions.
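To make the three variants concrete, here is a minimal NumPy sketch of a cross layer in full-rank, low-rank, and mixture-of-experts form. It is an illustration under stated assumptions, not the reference implementation: the function names (`cross_full`, `cross_low_rank`, `cross_moe`) are hypothetical, the gate is assumed to be a linear softmax over the layer input, and the experts here omit the nonlinearities and the small projection matrix that Wang et al. (2021) place inside the low-rank subspace.

```python
import numpy as np


def cross_full(x0, xl, W, b):
    """Full-rank cross layer: x_{l+1} = x0 * (W x_l + b) + x_l.

    x0, xl: (batch, d); W: (d, d); b: (d,).
    """
    return x0 * (xl @ W.T + b) + xl


def cross_low_rank(x0, xl, U, V, b):
    """Low-rank cross layer: W is replaced by U V^T, with U, V of shape (d, r)."""
    projected = xl @ V  # (batch, r): V^T x_l, project into the rank-r subspace
    return x0 * (projected @ U.T + b) + xl


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cross_moe(x0, xl, Us, Vs, bs, G):
    """Mixture of K low-rank experts with a softmax gate (assumed gate form).

    Us, Vs: lists of K arrays of shape (d, r); bs: list of K arrays (d,);
    G: (d, K) gating weights (hypothetical parameterization).
    """
    gates = softmax(xl @ G)  # (batch, K), rows sum to 1
    experts = np.stack(
        [x0 * ((xl @ V) @ U.T + b) for U, V, b in zip(Us, Vs, bs)],
        axis=1,
    )  # (batch, K, d): each expert's crossed output, without the residual
    # Gate-weighted sum of experts, plus the residual connection.
    return (gates[..., None] * experts).sum(axis=1) + xl


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch, d, r, K = 4, 8, 2, 3
    x0 = rng.normal(size=(batch, d))
    Us = [rng.normal(size=(d, r)) for _ in range(K)]
    Vs = [rng.normal(size=(d, r)) for _ in range(K)]
    bs = [np.zeros(d) for _ in range(K)]
    G = rng.normal(size=(d, K))
    out = cross_moe(x0, x0, Us, Vs, bs, G)
    print(out.shape)
    print("weights per layer, full rank:", d * d, "low rank:", 2 * d * r)
```

In this sketch the low-rank layer stores 2·d·r weights instead of d², so the rank r is the knob that trades capacity for compute. With a single expert the softmax gate is identically 1 and `cross_moe` reduces to `cross_low_rank`.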
What we’re building
- A reproducible reference implementation of DCN-V2 with batched training on AWS Batch via Metaflow.
- An agentic research workflow (the dcn_v2_agents repo) that scaffolds paper structure, runs consistency checks, and assembles drafts across multiple owners (positioning, method, experiment, lead).
Why it matters
Cross networks are among the most compute-efficient ways to add explicit feature interactions to deep models. We’re interested in pushing the low-rank / MoE frontier — how aggressively can we compress the cross layers before the interaction signal degrades, and where do experts route when given the freedom?