Language models at smaller scale can still make impressively accurate predictions in a number of fields.
These smaller LMs are called Foundation Models. There are some advantages to consider in using a smaller Foundation Model instead of taking the typical path of deploying a Large Language Model like GPT-4 from OpenAI, Bard from Google, or Claude from Anthropic.
All of the aforementioned large language models are trained on massive datasets gathered from the internet.
Once fine-tuned, an LLM's ability to accurately predict the next token gives it significantly higher performance scores than a base-trained Foundation Model achieves before it is trained with reinforcement learning feedback in domain-specific areas of knowledge.
Ironically enough, the benefit of a Foundation Model is also its limitation: its size.
Because it has been trained on a substantially smaller dataset, its performance lags somewhat behind the absolute top-level abilities demonstrated by GPT-4.
However, once a Foundation Model is fine-tuned in a specific field of expertise, its overall performance metrics rise to within roughly 8% of GPT-4-level transformer models on average, while still displaying emergent forms of commonsense reasoning and, once again, delighting us all with its processes.
While emergent behaviors are more common in LLMs at scales estimated near 100 billion parameters, they eventually appear to arise in domain-specifically trained assistants built on small-scale LMs less than roughly one-twentieth the estimated size of GPT-3.5, somewhere in the 5-10 billion parameter range.
Chain-of-thought prompting at scale during the RL phase of the overall training process has been demonstrated to improve a model's in-field performance by an additional ~10% or more on average across a sparse range of domains.
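To make the idea concrete, here is a minimal sketch of how a chain-of-thought prompt is typically constructed: a worked example with explicit intermediate reasoning is prepended to the new question so the model is nudged to reason step by step before answering. The `complete()` call at the end is a hypothetical placeholder for whatever completion function your model exposes, not a reference to any specific API.

```python
# Minimal chain-of-thought prompting sketch.
# Assumes a generic text-completion model exposed through a hypothetical
# complete(prompt: str) -> str function (placeholder, not a real API).

COT_EXEMPLAR = (
    "Q: A warehouse holds 120 boxes. 45 are shipped out and 30 more arrive. "
    "How many boxes are in the warehouse now?\n"
    "A: Start with 120 boxes. Shipping 45 leaves 120 - 45 = 75. "
    "Receiving 30 more gives 75 + 30 = 105. The answer is 105.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend a worked example with explicit reasoning, then ask the new question."""
    return f"{COT_EXEMPLAR}\nQ: {question}\nA: Let's think step by step."

if __name__ == "__main__":
    prompt = build_cot_prompt(
        "A clinic sees 16 patients per day, 5 days a week. "
        "How many patients does it see in 4 weeks?"
    )
    print(prompt)
    # answer = complete(prompt)  # hypothetical call to the fine-tuned model
```

The key design choice is that the exemplar's answer shows its arithmetic rather than just the final number; during RL-style feedback, the same format rewards the model for producing correct intermediate steps, not just correct outputs.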
At DataGenn.ai we’re doing our best to cover as many bases as possible. We’ve taken a vertical approach to building language models from a business perspective.
Our API endpoints empower you to generate. What… is up to you.