The MDS Shim — Zero-Conversion Data Loading for 800+ Datasets
How we trained on 800+ MDS datasets in Megatron without converting a single file
Mar 27, 2026 · 12 min read


