Megatron-LM was introduced in the paper "Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism" (Shoeybi et al., 2019).
Predicate
introduced_in_paper
Object
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism (Shoeybi et al., 2019)
Primary source · preprint · 2019-09-17
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism — arXiv (Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, Bryan Catanzaro)