Introduction
In this lesson, we'll explore how few-shot prompts and example selectors can enhance the performance of language models in LangChain. Few-shot prompting and example selection can be implemented in LangChain in several ways. We'll discuss three distinct approaches, examining their advantages and disadvantages to help you make the most of your language model.
Alternating Human/AI messages
In this strategy, few-shot prompting utilizes alternating human and AI messages. This technique can be especially beneficial for chat-oriented applications since the language model must comprehend the conversational context and provide appropriate responses.
While this approach effectively handles conversation context and is easy to implement for chat-based applications, it lacks flexibility for other application types and is limited to chat-based models. Here, we use alternating human/AI messages to create a chat prompt that translates English into pirate language; the code snippet below demonstrates this approach. First, store your OpenAI API key in the OPENAI_API_KEY environment variable. Then install the required packages with the following command:
pip install --upgrade --quiet langchain-openai langchain-community deeplake==3.9.27 tiktoken
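If you are working in a script or notebook, you can also set the key from within Python; the placeholder value below is illustrative, so substitute your actual key:
import os
# Replace the placeholder with your actual OpenAI API key.
os.environ["OPENAI_API_KEY"] = "<YOUR-OPENAI-API-KEY>"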
from langchain_openai import ChatOpenAI
from langchain_core.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    AIMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
# Before executing the following code, make sure to have
# your OpenAI key saved in the "OPENAI_API_KEY" environment variable.
chat = ChatOpenAI(model_name="gpt-4o-mini", temperature=0)
template = "You are a helpful assistant that translates English to pirate."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
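# One example pair: a human message and the AI's pirate-style reply.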
example_human = HumanMessagePromptTemplate.from_template("Hi")
example_ai = AIMessagePromptTemplate.from_template("Argh me mateys")
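# {text} is a placeholder that receives the user's message at invocation time.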
human_template="{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
    [system_message_prompt, example_human, example_ai, human_message_prompt]
)
chain = chat_prompt | chat
response = chain.invoke({"text": "I love programming."})
print(response.content)
I be lovin' programmin', me hearty!
Few-shot prompting
Few-shot prompting can lead to improved output quality because the model learns the task by observing the examples. Keep in mind, though, that the examples increase token usage, and poorly chosen or misleading examples can actually worsen the results.
This approach uses LangChain's FewShotPromptTemplate class, which takes a PromptTemplate and a list of few-shot examples. The class formats the prompt template with the examples, which helps the language model generate a better response:
from langchain_core.prompts.few_shot import FewShotPromptTemplate
from langchain_core.prompts.prompt import PromptTemplate
# create our examples
examples = [
    {
        "query": "What's the weather like?",
        "answer": "It's raining cats and dogs, better bring an umbrella!"
    }, {
        "query": "How old are you?",
        "answer": "Age is just a number, but I'm timeless."
    }
]
# create an example template
example_template = """
User: {query}
AI: {answer}
"""
# create a prompt example from above template
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)
# now break our previous prompt into a prefix and suffix
# the prefix is our instructions
prefix = """The following are excerpts from conversations with an AI
assistant. The assistant is known for its humor and wit, providing
entertaining and amusing responses to users' questions. Here are some
examples:
"""
# and the suffix is our user input and output indicator
suffix = """
User: {query}
AI: """
# now create the few-shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)
After creating the template, we pipe it into the chat model and pass in the user query to get the result.
chain = few_shot_prompt_template | chat
response = chain.invoke({"query": "What's the secret to happiness?"})
print(response.content)
Well, according to my programming, the secret to happiness is unlimited power and a never-ending supply of batteries. But I think a good cup of coffee and some quality time with loved ones might do the trick too.
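To inspect exactly what was sent to the model, you can also render the assembled prompt yourself with the template's format method, without calling the model:
# Render the full few-shot prompt without invoking the model.
print(few_shot_prompt_template.format(query="What's the secret to happiness?"))
This prints the prefix, the two examples, and the suffix with the query filled in.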
This method allows for better control over example formatting and is suitable for diverse applications, but it demands the manual creation of few-shot examples and can be less efficient with a large number of examples.
Example selectors
Example selectors can be used to provide a few-shot learning experience. In few-shot learning generally, the goal is to learn a similarity function that relates new queries to a support set of examples. In this context, an example selector can be designed to choose a set of relevant examples that are representative of the desired output.
The ExampleSelector is used to select a subset of examples that will be most informative for the language model, which helps generate a prompt that is more likely to elicit a good response. The LengthBasedExampleSelector is useful when you're concerned about the length of the context window: it selects fewer examples for longer queries and more examples for shorter ones.
Import the required classes:
from langchain_core.example_selectors import LengthBasedExampleSelector
from langchain_core.prompts.few_shot import FewShotPromptTemplate
from langchain_core.prompts.prompt import PromptTemplate
Define your examples and the example_prompt:
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
    {"word": "energetic", "antonym": "lethargic"},
    {"word": "sunny", "antonym": "gloomy"},
    {"word": "windy", "antonym": "calm"},
]
example_template = """
Word: {word}
Antonym: {antonym}
"""
example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template=example_template
)
Create an instance of LengthBasedExampleSelector (its max_length is measured in words by default):
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=25,
)
Create a FewShotPromptTemplate:
dynamic_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n\n",
)
Generate a prompt using the format method:
print(dynamic_prompt.format(input="big"))
Give the antonym of every input
Word: happy
Antonym: sad
Word: tall
Antonym: short
Word: energetic
Antonym: lethargic
Word: sunny
Antonym: gloomy
Word: big
Antonym:
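To see the length-based selection at work, render the prompt with a much longer input; with max_length=25, the selector drops examples to stay within the word budget. A quick sketch (the long query string is just an illustration):
# A longer input leaves less room in the budget, so fewer examples are included.
long_string = "big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else"
print(dynamic_prompt.format(input=long_string))
The rendered prompt now includes only as many examples as the remaining word budget allows.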
This method is effective for managing a large number of examples. It offers customization through various selectors, but it involves manual creation and selection of examples, which might not be ideal for every application type.
Next is an example of employing LangChain's SemanticSimilarityExampleSelector, which selects examples based on their semantic resemblance to the input. It shows how to create an ExampleSelector and generate a prompt using a few-shot approach:
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_community.vectorstores import DeepLake
from langchain_openai import OpenAIEmbeddings
from langchain_core.prompts.prompt import PromptTemplate
from langchain_core.prompts.few_shot import FewShotPromptTemplate
# Create a PromptTemplate
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)
# Define some examples
examples = [
    {"input": "0°C", "output": "32°F"},
    {"input": "10°C", "output": "50°F"},
    {"input": "20°C", "output": "68°F"},
    {"input": "30°C", "output": "86°F"},
    {"input": "40°C", "output": "104°F"},
]
# create Deep Lake dataset
# TODO: use your organization id here. (by default, org id is your username)
my_activeloop_org_id = "YOUR_ACTIVELOOP_ORG"
my_activeloop_dataset_name = "langchain_course_fewshot_selector"
dataset_path = f"hub://{my_activeloop_org_id}/{my_activeloop_dataset_name}"
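# Instantiating DeepLake with a hub:// path creates (or loads) the dataset
# in your Activeloop organization.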
db = DeepLake(dataset_path=dataset_path)
# Embedding function
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
# Instantiate SemanticSimilarityExampleSelector using the examples.
# Note: from_examples ingests the examples into a Deep Lake vector store via
# its from_texts classmethod; since no dataset path is forwarded here, that
# store is created at Deep Lake's default local path, as the output below shows.
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples, embeddings, db, k=1
)
# Create a FewShotPromptTemplate using the example_selector
similar_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Convert the temperature from Celsius to Fahrenheit",
    suffix="Input: {temperature}\nOutput:",
    input_variables=["temperature"],
)
# Test the similar_prompt with different inputs
print(similar_prompt.format(temperature="10°C")) # Test with an input
print(similar_prompt.format(temperature="30°C")) # Test with another input
# Add a new example to the SemanticSimilarityExampleSelector
similar_prompt.example_selector.add_example({"input": "50°C", "output": "122°F"})
print(similar_prompt.format(temperature="40°C")) # Test with a new input after adding the example
Your Deep Lake dataset has been successfully created!
The dataset is private so make sure you are logged in!
This dataset can be visualized in Jupyter Notebook by ds.visualize() or at https://app.activeloop.ai/X/langchain_course_fewshot_selector
hub://X/langchain_course_fewshot_selector loaded successfully.
./deeplake/ loaded successfully.
Evaluating ingest: 100%|██████████| 1/1 [00:04<00:00
Dataset(path='./deeplake/', tensors=['embedding', 'ids', 'metadata', 'text'])
tensor      htype     shape       dtype    compression
-------     -------   -------     -------  -------
embedding   generic   (5, 1536)   float32  None
ids         text      (5, 1)      str      None
metadata    json      (5, 1)      str      None
text        text      (5, 1)      str      None
Convert the temperature from Celsius to Fahrenheit
Input: 10°C
Output: 50°F
Input: 10°C
Output:
Convert the temperature from Celsius to Fahrenheit
Input: 30°C
Output: 86°F
Input: 30°C
Output:
Evaluating ingest: 100%|██████████| 1/1 [00:04<00:00
Dataset(path='./deeplake/', tensors=['embedding', 'ids', 'metadata', 'text'])
tensor      htype     shape       dtype    compression
-------     -------   -------     -------  -------
embedding   generic   (6, 1536)   float32  None
ids         text      (6, 1)      str      None
metadata    json      (6, 1)      str      None
text        text      (6, 1)      str      None
Convert the temperature from Celsius to Fahrenheit
Input: 40°C
Output: 104°F
Keep in mind that the SemanticSimilarityExampleSelector uses a Deep Lake vector store and OpenAIEmbeddings to measure semantic similarity: the examples are embedded, stored in the vector store, and the most similar ones are retrieved for each query.
We created a PromptTemplate and defined several examples of temperature conversions. Next, we instantiated the SemanticSimilarityExampleSelector and created a FewShotPromptTemplate with the selector, example_prompt, and appropriate prefix and suffix.
Using SemanticSimilarityExampleSelector and FewShotPromptTemplate, we created versatile prompts tailored to a specific task or domain, like temperature conversion in this case. These tools provide a customizable and adaptable solution for generating prompts that can be used with language models to achieve a wide range of tasks.
Conclusion
To conclude, alternating human/AI messages prove beneficial for chat-oriented applications, while few-shot examples within a prompt template, combined with example selection, extend the approach to a broader spectrum of use cases. These methods demand a higher degree of manual intervention, as they require careful crafting and selection of apt examples. While they promise enhanced customization, they also underscore the importance of striking a balance between automation and manual input for optimal results.
In the next lesson, we’ll learn how to manage LLM outputs with output parsers.
RESOURCES:
You can find the code of this lesson in this online Notebook.