Domain-Specific LLMs

Introduction

Domain-specific Language Models are tailored for specific industries or use cases. Unlike generalized language models, which attempt to comprehend a wide array of topics, domain-specific LLMs are finely tuned to understand a particular domain's unique terminology, context, and intricacies. In this lesson, we’ll see when a domain-specific LLM makes sense, what it takes to create one, and survey some popular examples.

When Do Domain-Specific LLMs Make Sense?

Domain-specific LLMs offer distinct advantages over their generalized counterparts in scenarios where precision, accuracy, and context are critical. They excel in industries where specialized knowledge is essential for generating relevant and accurate outputs. They can also improve safety in some scenarios by constraining what the model knows about, and a smaller domain-specific model has lower latency and is cheaper to host and run inference with than a large general-purpose LLM, especially when it serves a single task.

So, what are some specific industries where domain-specific LLMs could work well? Two notable examples are:

  1. Finance: Domain-specific LLMs can provide personalized investment recommendations based on an individual's financial goals, optimizing investment strategies.
  2. Healthcare: A domain-specific LLM trained in medical data can comprehend complex medical queries and offer accurate advice, enhancing patient care and medical consultation.

Why don’t we use general-purpose LLMs in these fields, too?

General-purpose LLMs gather their knowledge in the pre-training phase and are “steered” into useful assistants in the finetuning phase. GPT-4 may know the correct answer to a medical question, since it was likely trained on medical papers too, but it may not be steered into being a good “medical assistant” that also asks meaningful follow-up questions. That said, we’re still in the infancy of LLM research, so it’s hard to reason precisely about how these models work.

In principle, as long as the required knowledge was in the pre-training data, an LLM should be able to behave the “correct” way once appropriately finetuned. The practical decision therefore comes down to this: if we think the required knowledge wasn’t in the pre-training data, we have to pre-train a new LLM from scratch; if we think it was, we can focus on finetuning. The sketch below shows one quick way to probe which case we’re in.
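To make that judgment concrete, it helps to probe a general-purpose model with domain questions before committing to either path. Here is a minimal sketch using the Hugging Face transformers library; the model checkpoint and the probe questions are illustrative placeholders, not a prescribed benchmark.

```python
from transformers import pipeline

# Load a general-purpose instruction-tuned model to probe for domain
# knowledge (the checkpoint name is just an example; use any open model).
generator = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")

# Hypothetical probe questions: if the base model already answers these
# competently, finetuning is likely enough; if it consistently fails, the
# knowledge may be missing from pre-training, pointing toward pre-training
# a new model (or augmenting the data) instead.
probes = [
    "What does EBITDA measure, and how is it computed?",
    "Explain the difference between a 10-K and a 10-Q filing.",
]

for question in probes:
    output = generator(question, max_new_tokens=128, do_sample=False)
    print(question, "->", output[0]["generated_text"])
```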

In finance, Bloomberg recently took the from-scratch route, training a proprietary LLM called BloombergGPT on a mix of general-purpose and financial data.

BloombergGPT

BloombergGPT is a proprietary 50B-parameter LLM trained specifically for the financial domain.

Its training dataset, called “FinPile,” consists of English financial documents drawn from diverse sources in the Bloomberg archives, encompassing financial news, corporate filings, press releases, and even social media (which is what makes it a proprietary dataset). The data, ranging from company filings to market-relevant news, spans March 2007 to July 2022.

The dataset is further augmented with publicly available general-purpose text, striking a balance between domain specificity and broader linguistic coverage. In the end, the final corpus is approximately half domain-specific (51.27%) and half general-purpose (48.73%).
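FinPile itself is proprietary, but the mixing strategy is easy to picture. Below is a minimal sketch using the Hugging Face datasets library that interleaves a domain corpus with a general-purpose one; the financial dataset name is a placeholder, and only the roughly 51/49 split comes from the paper.

```python
from datasets import load_dataset, interleave_datasets

# Stand-in corpora: a hypothetical financial dataset plus a public
# general-purpose web corpus (C4), both streamed to avoid full downloads.
domain_ds = load_dataset("your-org/financial-corpus", split="train", streaming=True)
general_ds = load_dataset("allenai/c4", "en", split="train", streaming=True)

# Sample from the two corpora at approximately the BloombergGPT ratio:
# ~51% domain-specific text, ~49% general-purpose text.
mixed = interleave_datasets(
    [domain_ds, general_ds],
    probabilities=[0.5127, 0.4873],
    seed=42,
)

for example in mixed.take(5):
    print(example["text"][:80])
```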

The model architecture is based on BLOOM: a decoder-only transformer with 70 decoder blocks, each combining multi-head self-attention, layer normalization, and a feed-forward network with the GELU non-linearity. The model was sized following the Chinchilla scaling laws, which relate a training compute budget to the optimal number of parameters and training tokens.
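To make the architecture and the scaling-law reasoning concrete, the sketch below builds a BLOOM-style configuration with the transformers library and applies Chinchilla’s common rule of thumb of roughly 20 training tokens per parameter. The hyperparameters follow those reported in the BloombergGPT paper, but this is an illustrative configuration, not the actual model.

```python
from transformers import BloomConfig

# BLOOM-style decoder-only configuration sized like BloombergGPT,
# using the hyperparameters reported in the paper.
config = BloomConfig(
    n_layer=70,        # 70 decoder blocks
    n_head=40,         # multi-head self-attention
    hidden_size=7680,
    vocab_size=131072,
)

# Chinchilla rule of thumb: compute-optimal training uses roughly
# 20 tokens per model parameter.
params = 50e9
optimal_tokens = 20 * params
print(f"Compute-optimal training tokens for a 50B model: ~{optimal_tokens / 1e12:.0f}T")
```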

BloombergGPT outperforms models like GPT-NeoX, OPT-66B, and BLOOM-176B on financial tasks. On general-purpose tasks, however, GPT-3 still achieves better results.

The FinGPT Project

The FinGPT project aims to bring the power of LLMs into the world of finance in two ways:

  1. Providing open finance datasets.
  2. Finetuning open-source LLMs on finance datasets for several use cases.

Many datasets collected by FinGPT are specifically for financial sentiment analysis. What do we mean by “financial sentiment”?

For example, the sentence “Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the corresponding period in 2007 representing 7.7 % of net sales” merely states facts; therefore, its “normal” sentiment would be neutral. However, it states that the company's operating profit rose, which is good news for someone who wants to invest in that company, and therefore the financial sentiment is “positive.” Similarly, the sentence “The international electronic industry company Elcoteq has laid off tens of employees from its Tallinn facility” has “negative” financial sentiment.
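Classifying financial sentiment like this takes only a few lines with an off-the-shelf model. The sketch below uses ProsusAI/finbert, a publicly available model finetuned for financial sentiment; a FinGPT-finetuned checkpoint could be swapped in the same way.

```python
from transformers import pipeline

# FinBERT is a public model finetuned for financial sentiment
# classification (labels: positive, negative, neutral).
classifier = pipeline("text-classification", model="ProsusAI/finbert")

sentences = [
    "Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the "
    "corresponding period in 2007 representing 7.7 % of net sales",
    "The international electronic industry company Elcoteq has laid off "
    "tens of employees from its Tallinn facility",
]

for sentence, result in zip(sentences, classifier(sentences)):
    print(f"{result['label']} ({result['score']:.2f}): {sentence[:50]}...")
```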

The FinGPT project collects several datasets suited to financial sentiment classification.

Here, you can find a notebook showing how to finetune a model on these datasets and how to use the final model for predictions.
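To give a feel for what such a finetuning run looks like, here is a minimal sketch using the peft library for parameter-efficient LoRA finetuning of a causal LM. The base model, the dataset name, and its "sentence"/"label" columns are placeholders; the notebook above covers the full details.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Placeholder base model and dataset: substitute the checkpoint and the
# financial sentiment dataset you actually want to finetune on.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach small trainable LoRA adapters instead of updating all weights,
# which keeps finetuning cheap enough for modest hardware.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

dataset = load_dataset("your-org/financial-sentiment", split="train")

def tokenize(batch):
    # Format each example as an instruction-style prompt plus its label,
    # so the causal LM learns to emit the sentiment after the prompt.
    texts = [
        f"Sentence: {s}\nSentiment: {l}"
        for s, l in zip(batch["sentence"], batch["label"])
    ]
    return tokenizer(texts, truncation=True, max_length=256)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-sentiment", per_device_train_batch_size=4),
    train_dataset=dataset.map(tokenize, batched=True, remove_columns=dataset.column_names),
    # The collator pads batches and copies input_ids to labels for LM loss.
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```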

Med-PaLM for the Medical Domain

Med-PaLM is a finetuned version of PaLM (by Google) specifically for the medical domain.

The first iteration of Med-PaLM, introduced in late 2022 and subsequently published in Nature in July 2023, marked a milestone by surpassing the pass mark on US Medical Licensing Examination (USMLE) style questions.

Building upon this success, Google Health unveiled the latest iteration, Med-PaLM 2, during its annual health event, The Check Up, in March 2023. Med-PaLM 2 represents a substantial leap forward, achieving an accuracy rate of 86.5% on USMLE-style questions.

Conclusion

Domain-specific LLMs are specialized tools finely tuned for domain expertise. They are best suited to fields like finance and healthcare, where nuanced understanding is critical. Examples include BloombergGPT for finance, FinGPT for financial sentiment analysis, and Med-PaLM for medical questions.