Microsoft has announced a new language model, Phi-2, trained using the "textbooks are all you need" approach. By curating high-quality training data, they obtain relatively strong performance from a small model that can easily run on mobile devices such as phones:
https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/
This highlights that improvements in AI are not driven by hardware alone: this model was trained for two weeks on 96 GPUs, or approximately 4 GPU-years of computing time.
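For scale, the GPU-year figure follows directly from the reported training run. A minimal back-of-the-envelope check (the 96-GPU and two-week figures are from the post above; the rest is simple arithmetic):

    gpus = 96            # GPUs reported for the training run
    days = 14            # reported training duration (two weeks)
    gpu_days = gpus * days
    gpu_years = gpu_days / 365
    print(f"{gpu_days} GPU-days ~= {gpu_years:.1f} GPU-years")  # 1344 GPU-days ~= 3.7 GPU-years

So the total compute is roughly 3.7 GPU-years, which rounds to the "approximately 4 GPU-years" cited above.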