Tag
architecture
2 verified claims carrying this tag. Each has 2+ primary sources and an HMAC-SHA256 signature.
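Mixtral 8x7B architecture: Sparse Mixture-of-Experts (8 feed-forward experts per layer, 2 experts routed per token). Only the expert blocks are replicated, so the model has about 47B total parameters rather than 8 × 7B, with about 13B active per token.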
ad79b14fafb362cd · 2 sources · 100% confidence
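The top-2 routing named in this claim can be illustrated with a minimal sketch. It assumes a softmax taken over only the two selected router logits, and it uses toy scaling functions in place of real feed-forward experts; the names `top2_route`, `moe_layer`, and `experts` are hypothetical and not taken from any Mixtral codebase.

```python
import math

def top2_route(router_logits):
    """Pick the 2 highest-scoring experts and softmax-normalise their weights."""
    # Indices of the two largest logits (ties keep the lower index first).
    top2 = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:2]
    # Softmax over only the selected logits (subtract the max for stability).
    m = max(router_logits[i] for i in top2)
    exps = [math.exp(router_logits[i] - m) for i in top2]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top2, exps)]

def moe_layer(x, router_logits, experts):
    """Weighted sum of the outputs of the two selected experts for one token."""
    out = [0.0] * len(x)
    for idx, weight in top2_route(router_logits):
        y = experts[idx](x)  # only the 2 selected experts are evaluated
        out = [o + weight * v for o, v in zip(out, y)]
    return out

# Eight toy experts (assumption): expert k scales its input by (k + 1).
experts = [lambda v, k=k: [(k + 1) * t for t in v] for k in range(8)]
logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]

routes = top2_route(logits)
output = moe_layer([1.0, -2.0], logits, experts)
```

Because only two of the eight experts run for each token, per-token compute tracks the active parameter count rather than the total.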
RMSNorm (Root Mean Square Layer Normalization) was introduced in the paper "Root Mean Square Layer Normalization" (Zhang & Sennrich, 2019).
c64636dc60b1216f · 3 sources · 92% confidence
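For reference, RMSNorm divides each input element by the root mean square of the whole vector and then applies a learned per-element gain, without the mean-centering step of standard layer normalization. A minimal sketch follows; the function name `rms_norm` and the `eps` stabiliser are assumptions common in implementations, not details stated in the claim.

```python
import math

def rms_norm(x, gain, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then apply the gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    # eps is an implementation-level safeguard against division by zero.
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [g * v * inv_rms for g, v in zip(gain, x)]

# With unit gain and eps=0 the output has a root mean square of 1.
normed = rms_norm([3.0, 4.0], gain=[1.0, 1.0], eps=0.0)
```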