• Home
  • Explore

BASE Layers: Simplifying Training of Large, Sparse Models

proceedings.mlr.press/v139/lewis21a.html

1 Users

0 Comments

2 Highlights

0 Notes

Tags

Top Highlights

  • Sparse layers can dramatically improve the efficiency of training and inference by routing each token to specialized expert modules that contain only a small fraction of the model parameters.

  • Sparse layers can dramatically improve the efficiency of training and inference by routing each token to specialized expert modules that contain only a small fraction of the model parameters.

Ready to highlight and find good content?

Glasp is a social web highlighter that people can highlight and organize quotes and thoughts from the web, and access other like-minded people’s learning.

AboutPrivacyTerms

© 2023 Glasp Inc. All rights reserved.