Why We Moved an AudioLLM to Megatron
Scaling a 10B multimodal model beyond FSDP with tensor, pipeline, and context parallelism
Mar 20, 2026 · 11 min read