Tag
moe
13 verified claims carrying this tag. Each has 2+ primary sources and an HMAC-SHA256 signature.
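As a rough illustration of what an HMAC-SHA256 signature over a claim involves, here is a minimal sketch using Python's standard library. The page does not say which bytes are signed, how keys are managed, or how the 16-character IDs relate to the signature, so the key, the choice of payload (the claim text as UTF-8), and the function names `sign_claim` and `verify_claim` are all assumptions made for this example.

```python
import hashlib
import hmac


def sign_claim(claim_text: str, key: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 tag of the claim text (UTF-8)."""
    return hmac.new(key, claim_text.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_claim(claim_text: str, signature_hex: str, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_claim(claim_text, key)
    return hmac.compare_digest(expected, signature_hex)


# Hypothetical key and payload, for illustration only.
key = b"demo-key-not-a-real-secret"
claim = "Mixtral 8x7B released on: 2023-12-11."
signature = sign_claim(claim, key)

print(verify_claim(claim, signature, key))        # unchanged claim text
print(verify_claim(claim + " ", signature, key))  # altered claim text
```

Because HMAC is a keyed construction, only a holder of the key can verify a signature this way; a registry meant for public verification would need to publish a verification procedure or use an asymmetric scheme instead.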
Mixtral 8x7B released on: 2023-12-11.
410aec4f418f2b11 · 2 sources · 95% confidence
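Mixtral 8x7B architecture: Sparse Mixture-of-Experts (8 experts per layer, 2 experts routed per token; about 47B total parameters rather than 8 × 7B = 56B, because only the feed-forward blocks are replicated per expert).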
ad79b14fafb362cd · 2 sources · 100% confidence
Sparsely-Gated Mixture-of-Experts (MoE) introduced in paper: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (Shazeer et al., 2017).
2d6d7f61f1db6493 · 1 source · 100% confidence
Switch Transformer introduced in paper: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (Fedus et al., 2021).
3d9c14b9379038c9 · 2 sources · 100% confidence
Mixtral 8x22B (MoE) released on: 2024-04-10 by Mistral AI.
4335bf51bf0fc14f · 2 sources · 100% confidence
Llama 4 released on: 2025-04-05 by Meta (Scout and Maverick released with open weights; Behemoth previewed while still in training).
d5ce871dc69e7b04 · 2 sources · 100% confidence
Mixture of Experts (MoE) revival popularized in: Shazeer et al. 2017 — outrageously large neural networks via sparse gating.
f068236101568ad7 · 2 sources · 100% confidence
DeepSeek-V2 publicly released on: 2024-05-07 by DeepSeek — MoE 236B-parameter open model.
a47cef03b39df7bd · 2 sources · 100% confidence
DeepSeek-V3 publicly released on: 2024-12-26 by DeepSeek AI — 671B-parameter MoE (37B active), open weights.
035eaae4aa32e74d · 2 sources · 100% confidence
Tencent Hunyuan-Large publicly released on: 2024-11-05 by Tencent — 389B-parameter open-weight MoE (52B active).
bffa9a720c420248 · 2 sources · 100% confidence
ByteDance Doubao 1.5 Pro publicly released on: 2025-01-22 by ByteDance (MoE LLM reported by ByteDance to match GPT-4o on Chinese benchmarks at lower inference cost).
1853e2c2c1277656 · 2 sources · 85% confidence
Snowflake Arctic publicly released on: 2024-04-24 by Snowflake — 480B-parameter MoE LLM (17B active), Apache 2.0.
ecf617a1457e6ede · 2 sources · 100% confidence
Databricks DBRX publicly released on: 2024-03-27 by Databricks — 132B-parameter MoE (36B active per token), Databricks Open Model License.
bbffd3da8c5258aa · 2 sources · 100% confidence
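Several claims above distinguish total from active parameters, and the Mixtral 8x7B claim describes 2 of 8 experts being routed per token. A minimal sketch of that top-2 routing step follows, assuming the gate softmax-normalizes the logits of the selected experts only. The router logits are supplied by hand here (in a real model they come from a learned linear layer over the token's hidden state), and the toy experts and the names `top_k_route` and `moe_layer` are hypothetical.

```python
import math


def top_k_route(logits, k=2):
    """Select the k highest-scoring experts and softmax-normalize their logits."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, logits, experts, k=2):
    """Weighted sum over the selected experts; the other experts are never called."""
    return sum(weight * experts[i](x) for i, weight in top_k_route(logits, k))


# Toy setup: 8 scalar "experts", where expert i multiplies its input by i + 1.
experts = [lambda x, scale=i + 1: scale * x for i in range(8)]
router_logits = [0.1, 2.0, -1.0, 0.5, 1.0, 0.0, -0.3, 1.2]  # hand-picked values

routes = top_k_route(router_logits)  # experts 1 and 7 have the two largest logits
y = moe_layer(3.0, router_logits, experts)
print(routes, y)
```

Only the two selected experts run for a given token, which is why a model's active parameter count (for example, 37B of 671B for DeepSeek-V3) can be much smaller than its total.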