What are Agents: Agents as Content Generators and Reasoning Engines

Introduction

In artificial intelligence, LangChain and LLMs have opened up new possibilities in data analysis, information synthesis, and content generation. Central to their functionality is the concept of Agents: intelligent systems that use LLMs to determine actions and carry out complex tasks. In an agent, the LLM serves more as a reasoning engine or planner and less as a content generator per se. We discuss two primary ways to harness the capabilities of LLMs: as content generators and as reasoning engines.

As content generators, LLMs draw on their internal knowledge to create engaging and creative content from scratch. As reasoning engines, they act as synthesizers of information, extracting and summarizing relevant data from multiple sources and planning the next actions to take. Both approaches have distinct advantages and challenges, and the choice depends largely on the specific requirements of the task.

Agents

In the context of language models, agents decide which actions to take and in what order. An action can be using a tool and observing its output, or returning a response to the user. The real potential of agents unfolds when they are used appropriately. This explanation aims to simplify the usage of agents via the highest-level API.

Before diving into the practical usage, it's crucial to understand the following terms:

Tool: A function that performs a specific task. It can be a Google Search, a Database lookup, a Python REPL, or other chains. A tool's interface is typically a function that takes a string as an input and returns a string as an output.
Large Language Model (LLM): The language model that powers the agent.
Agent: The agent to use, identified by a string that references a supported agent class. It’s what orchestrates the LLM and the tools. This explanation focuses on using the standard supported agents via the highest-level API. For custom agent implementation, refer to the appropriate documentation.
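
As a rough illustration of the tool interface described above, here is a minimal sketch in plain Python. The SimpleTool dataclass and the word_count function are hypothetical names made up for this example; they are not LangChain classes and only mirror the idea of a named, described function that maps a string to a string.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: a named, described str -> str function."""
    name: str
    description: str
    func: Callable[[str], str]

    def run(self, tool_input: str) -> str:
        # Tools take a string in and hand a string back to the agent.
        return self.func(tool_input)


def word_count(text: str) -> str:
    """Toy tool body: count whitespace-separated words."""
    return str(len(text.split()))


counter = SimpleTool(
    name="Word Counter",
    description="Counts the words in the input text.",
    func=word_count,
)

print(counter.run("agents pick tools by reading descriptions"))
```

The description matters as much as the function: it is the text an agent would read when deciding whether the tool fits the task.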

Agents in LangChain play a crucial role in the decision-making and execution of tasks based on user input. They evaluate the situation and decide on the appropriate tools to use, if necessary.

Presently, most of the agents in LangChain fall into one of these two categories:

"Action Agents": These agents decide on one action at a time, execute it, and use the result to choose the next step. They are typically used for straightforward tasks.
"Plan-and-Execute Agents": These agents first devise a plan comprising multiple actions and then execute each action sequentially. They are better suited to complex or long-running tasks because the plan helps maintain focus on long-term objectives.

Action Agents are the more traditional option and suit smaller tasks, while Plan-and-Execute Agents help maintain long-term objectives and focus. However, Plan-and-Execute Agents can require more LLM calls and therefore add latency. Often, it's beneficial to let an Action Agent handle the execution of each step for the Plan-and-Execute Agent, combining the strengths of both.
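
To make the plan-then-execute idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: toy_planner returns a hard-coded plan where a real planner would ask an LLM, and the "lookup" and "summarize" tools are toy lambdas. It is not how LangChain implements these agents.

```python
from typing import Callable, Dict, List, Tuple

# A plan is an ordered list of (tool name, tool input) steps.
Plan = List[Tuple[str, str]]


def toy_planner(objective: str) -> Plan:
    """Hypothetical planner. A real one would ask an LLM to draft the steps."""
    return [
        ("lookup", objective),
        ("summarize", "previous result"),
    ]


def execute_plan(plan: Plan, tools: Dict[str, Callable[[str], str]]) -> List[str]:
    """Run each planned step in order and collect the results."""
    results: List[str] = []
    for tool_name, tool_input in plan:
        # Let a step refer to the output of the step before it.
        if tool_input == "previous result" and results:
            tool_input = results[-1]
        results.append(tools[tool_name](tool_input))
    return results


toy_tools = {
    "lookup": lambda query: f"notes about {query}",
    "summarize": lambda text: text.upper(),
}

outputs = execute_plan(toy_planner("solar sails"), toy_tools)
print(outputs)
```

The whole plan exists before any tool runs, which is what keeps the long-term objective in view; the cost is the extra planning call up front.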

For example, a high-level workflow of Action Agents would look something like this:

The agent receives user input.
It decides which tool to use (if any) and determines its input.
The chosen tool is called with the provided input, and an observation (the output of the tool) is recorded.
The tool name, tool input, and observation are added to the history and passed back to the agent, which then decides the next step.
This process is repeated until the agent no longer needs to use a tool, at which point it directly responds to the user.
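
The steps above can be sketched as a small loop in plain Python. This is only an illustration under stated assumptions: scripted_policy is a hypothetical stand-in for the LLM's decision, the "lookup" and "add" tools are toy functions, and the max_steps guard is an addition of this sketch rather than something described in the text.

```python
from typing import Callable, Dict, List, Tuple

# One history entry: (tool name, tool input, observation).
Step = Tuple[str, str, str]
# A decision is ("tool", tool name, tool input) or ("final", answer, "").
Decision = Tuple[str, str, str]


def scripted_policy(user_input: str, history: List[Step]) -> Decision:
    """Hypothetical stand-in for the LLM: chooses the next action from the history."""
    if not history:
        return ("tool", "lookup", "number of planets in the solar system")
    if len(history) == 1:
        return ("tool", "add", f"1000 {history[0][2]}")
    return ("final", f"The result is {history[-1][2]}.", "")


def run_action_agent(
    user_input: str,
    policy: Callable[[str, List[Step]], Decision],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    history: List[Step] = []
    for _ in range(max_steps):
        kind, first, second = policy(user_input, history)
        if kind == "final":
            # No tool needed any more: respond to the user directly.
            return first
        # Call the chosen tool and record the observation.
        observation = tools[first](second)
        history.append((first, second, observation))
    return "Stopped: step limit reached."


toy_tools = {
    "lookup": lambda query: "8",  # canned answer instead of a real search
    "add": lambda numbers: str(sum(int(n) for n in numbers.split())),
}

answer = run_action_agent(
    "What is 1000 plus the number of planets in the solar system?",
    scripted_policy,
    toy_tools,
)
print(answer)
```

Swapping scripted_policy for a function that prompts a language model with the user input and the history would turn this toy loop into the workflow the list describes.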

The most critical abstraction to understand is the agent itself. In the context of LangChain, the term "agents" pertains to the concept of employing a language model as a reasoning mechanism and linking it with the key element - a tool.

Tools are instrumental in connecting the language model with other sources of data or computation, including search engines, APIs, and other data repositories. Language models can only access the knowledge they've been trained on, which can quickly become obsolete. Therefore, tools are essential as they allow the agent to retrieve and incorporate current data into the prompt as context. Tools can also execute actions (like running code or modifying files) and observe the results, subsequently informing the language model's decision-making process.
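
One way to picture "incorporating current data into the prompt as context" is the following sketch. The build_prompt and fake_search functions are hypothetical names for this example, and the prompt wording is an assumption; a real agent framework would use its own template.

```python
from typing import List


def build_prompt(question: str, observations: List[str]) -> str:
    """Place tool outputs ahead of the question so the model can use them."""
    if observations:
        context = "\n".join(f"- {obs}" for obs in observations)
    else:
        context = "(no tool output available)"
    return (
        "Use the context below to answer the question.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def fake_search(query: str) -> str:
    """Hypothetical tool: returns a canned snippet instead of calling a real API."""
    return f"Canned search snippet for '{query}'"


question = "What changed in the latest release?"
prompt = build_prompt(question, [fake_search(question)])
print(prompt)
```

The model never gains new knowledge here; it simply sees fresher text in its input, which is why the quality of the tool output bounds the quality of the answer.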

As we said before, we can abstract two primary modes of operation to consider when employing an LLM: as a content generator and as a reasoning engine.

When used as a "content generator," the language model is asked to create content entirely from its internal knowledge base. This approach can lead to highly creative outputs but can also result in unverified information or 'hallucinations' due to the model's reliance on pre-trained knowledge.
On the other hand, when functioning as a "reasoning engine," the Agent acts more as an information manager than as a creator. In this mode, it is tasked with gathering relevant, accurate information, often aided by external tools. This involves the LLM drawing from several resources on a given topic and constructing new content by extracting and summarizing the relevant details.

Answering Questions using an LLM as a reasoning engine

Let’s see a code example of it. As always, we first set the required API keys as environment variables.

import os
os.environ["OPENAI_API_KEY"] = "<YOUR-OPENAI-API-KEY>"
os.environ["GOOGLE_API_KEY"] = "<YOUR-GOOGLE-SEARCH-API-KEY>"
os.environ["GOOGLE_CSE_ID"] = "<YOUR-CUSTOM-SEARCH-ENGINE-ID>"
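
Since a missing or placeholder key is a common reason for the agent to fail at startup, a small check like the following sketch can help. The missing_keys helper is a hypothetical name for this example; it only inspects the three variables set above and treats values wrapped in angle brackets as unfilled placeholders.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ["OPENAI_API_KEY", "GOOGLE_API_KEY", "GOOGLE_CSE_ID"]


def missing_keys(required: List[str], env: Mapping[str, str]) -> List[str]:
    """Return the names that are unset, empty, or still a <PLACEHOLDER>."""
    missing = []
    for name in required:
        value = env.get(name, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(name)
    return missing


problems = missing_keys(REQUIRED_KEYS, os.environ)
if problems:
    print("Set these variables before running the agent:", ", ".join(problems))
```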

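Here’s the code example. Remember to install the required packages with the following command: pip install langchain==0.1.4 deeplake==3.9.27 openai==1.10.0 tiktoken. The google-search and llm-math tools also rely on the google-api-python-client and numexpr packages, so install those as well if they are missing.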

# Importing necessary modules
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain.llms import OpenAI

# Loading the language model to control the agent
llm = OpenAI(model="gpt-3.5-turbo-instruct", temperature=0)

# Loading some tools to use. The llm-math tool uses an LLM, so we pass that in.
tools = load_tools(["google-search", "llm-math"], llm=llm)

# Initializing an agent with the tools, the language model, and the type of agent we want to use.
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

# Testing the agent
query = "What's the result of 1000 plus the number of goals scored in the soccer world cup in 2018?"
response = agent.run(query)
print(response)

You should see something like the following printed output.

> Entering new AgentExecutor chain...
 I need to find out the number of goals scored in the 2018 soccer world cup
Action: Google Search
Action Input: "number of goals scored in 2018 soccer world cup"
Observation: Jan 13, 2023 ... A total of 172 goals were scored during the 2022 World Cup in Qatar, marking a new record for the tournament. Jan 31, 2020 ... A total of 169 goals were scored at the group and knockout stages of the FIFA World Cup held in Russia from the 14th of June to the 15th of July ... Jan 13, 2023 ... Average number of goals scored per match at the FIFA World Cup from 1930 to 2022 ; Russia 2018, 2.64 ; Brazil 2014, 2.67 ; South Africa 2010, 2.27. Number of goals scored in the matches played between the teams in question;; Fair play points in all group matches (only one deduction could be applied to a ... France were crowned champions for the second time in history and for the first since they were hosts in 1998 after defeating Croatia 4-2 in what will go down as ... Check out the top scorers list of World Cup 2018 with Golden Boot prediction. Get highest or most goal scorer player in 2018 FIFA World Cup. 2018 FIFA World Cup Russia™: France. ... Top Scorers. Previous. Antoine Griezmann ... #WorldCupAtHome: Electric Mbappe helps France win seven-goal thriller. Jun 30, 2018 ... Kylian Mbappe scored twice as France dumped Lionel Messi and Argentina out of the World Cup with a 4-3 win in an outstanding round-of-16 tie ... 0 · Luka MODRIC · Players · Top Scorers. Dec 18, 2022 ... Antoine Griezmann finished second in goals scored at the 2018 World Cup. Mbappe is also just the fifth man to score in multiple World Cup finals ...
Thought: I now know the number of goals scored in the 2018 soccer world cup
Action: Calculator
Action Input: 1000 + 169
Observation: Answer: 1169
Thought: I now know the final answer
Final Answer: The result of 1000 plus the number of goals scored in the soccer world cup in 2018 is 1169.

> Finished chain.

The result of 1000 plus the number of goals scored in the soccer world cup in 2018 is 1169.

There were 169 goals scored in the soccer world cup in 2018, so the final answer is correct.

In the example, the agent leverages its "reasoning engine" capabilities to generate responses. Instead of creating new content (acting as a content generator), the agent uses the tools at its disposal to gather, process, and synthesize information. The output shown above is truncated. Note that the agent used the llm-math tool (named Calculator in the trace) for the addition instead of computing the sum itself.

Let's break down the steps to see how the agent functions as a "reasoning engine":

Query Processing: The agent receives a query: "What's the result of 1000 plus the number of goals scored in the soccer world cup in 2018?" The agent identifies two distinct tasks within this query: finding the number of goals scored in the 2018 soccer world cup, and adding 1000 to that number.
Tool Utilization: The agent uses the "google-search" tool to answer the first part of the query. This is an example of the agent using external tools to gather accurate and relevant information. The agent isn't creating this information; it's pulling the data from an external source.
Information Processing: For the second part of the query, the agent uses the "llm-math" tool to perform a sum reliably. Again, the agent isn't creating new information. Instead, it's processing the data it has gathered.
Synthesis and Response: After gathering and processing the information, the agent synthesizes it into a coherent response that answers the original query.
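
The printed trace follows a Thought / Action / Action Input / Final Answer text format, and something has to turn that text into a tool call. The sketch below shows one simplified way to do so with regular expressions. The parse_agent_output function is a hypothetical name; LangChain ships its own output parsers, and this is not that code.

```python
import re
from typing import Tuple


def parse_agent_output(text: str) -> Tuple[str, str, str]:
    """Return ("final", answer, "") or ("action", tool name, tool input)."""
    final = re.search(r"Final Answer:\s*(.*)", text, re.DOTALL)
    if final:
        return ("final", final.group(1).strip(), "")
    action = re.search(r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", text, re.DOTALL)
    if action:
        tool_name = action.group(1).strip()
        # Drop surrounding quotes the model may have added around the input.
        tool_input = action.group(2).strip().strip('"')
        return ("action", tool_name, tool_input)
    raise ValueError("Could not find an action or a final answer in the text")


first_step = (
    " I need to find out the number of goals scored in the 2018 soccer world cup\n"
    "Action: Google Search\n"
    'Action Input: "number of goals scored in 2018 soccer world cup"'
)
last_step = "I now know the final answer\nFinal Answer: The result is 1169."

print(parse_agent_output(first_step))
print(parse_agent_output(last_step))
```

Checking for a final answer first means the loop stops as soon as the model declares it is done, even if earlier text in the same reply mentions an action.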

In this way, the agent acts as a "reasoning engine." It's not generating content from scratch but rather gathering, processing, and synthesizing existing information to generate a response. This approach allows the agent to provide accurate and relevant responses, making it a powerful tool for tasks that involve data retrieval and processing.

As a content generator, by contrast, the agent creates new content rather than just pulling and processing existing information. Let's imagine a scenario where we want the agent to write a short science fiction story based on a given prompt.

We could initialize the agent with a language model and set its temperature parameter to a higher value to encourage more creativity in its outputs. Note that the listing below keeps temperature=0, so raise that value if you want more varied stories. External tools are not required, as the agent generates content rather than retrieving or processing it.
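
To see why a higher temperature tends to produce more varied text, consider this small sketch. It assumes the common description of temperature as a divisor applied to the model's scores before a softmax; the scores in toy_logits are made up, and how a given provider handles the parameter internally (including a value of exactly 0) is not something this sketch claims to reproduce.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
cool = softmax_with_temperature(toy_logits, 0.5)
warm = softmax_with_temperature(toy_logits, 1.5)

print([round(p, 3) for p in cool])
print([round(p, 3) for p in warm])
```

At the lower temperature most of the probability sits on the top-scoring token, while at the higher one the alternatives get a larger share, so sampling picks them more often.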

The language model will generate a long science fiction story about interstellar explorers based on the patterns it learned during training.

# Importing necessary modules
from langchain.agents import initialize_agent, AgentType
from langchain.llms import OpenAI
from langchain.agents import Tool
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["query"],
    template="You're a renowned science fiction writer. {query}"
)

# Initialize the language model
llm = OpenAI(model="gpt-3.5-turbo-instruct", temperature=0)
llm_chain = LLMChain(llm=llm, prompt=prompt)

tools = [
    Tool(
        name='Science Fiction Writer',
        func=llm_chain.run,
        description='Use this tool for generating science fiction stories. Input should be a command about generating specific types of stories.'
    )
]

# Initializing an agent with the tools, the language model, and the type of agent we want to use.
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

# Testing the agent with the new prompt
response = agent.run("Compose an epic science fiction saga about interstellar explorers")
print(response)

You should see something like the following printed output.

> Entering new AgentExecutor chain...
 I need a way to generate this kind of story
Action: Science Fiction Writer
Action Input: Generate interstellar exploration story
Observation: .

The crew of the interstellar exploration vessel, the U.S.S. Discovery, had been traveling through the depths of space for months, searching for something that no one had ever seen before. They were searching for a planet, an anomaly, something out of the ordinary.

The ship had been equipped with the most advanced technology available, but nothing could have prepared them for what they encountered on their journey. As they entered an uncharted sector of the galaxy, they encountered an alien species unlike anything they had ever seen before.

The aliens were primitive, yet their technology was far more advanced than anything known to humanity. The crew of the U.S.S. Discovery found themselves in awe of the alien species and its technology.

The crew immediately set to work exploring the planet and its myriad of secrets. They uncovered evidence of an ancient civilization, as well as evidence of a mysterious energy source that could potentially power their ship and enable them to travel faster than the speed of light.

Eventually, the crew was able to unlock the secrets of the alien technology and use it to power their ship. With the newfound energy source, they were able to travel to the far reaches of the universe and explore places that no human had ever seen
Thought: I now know the final answer
Final Answer: The crew of the U.S.S. Discovery set out to explore the unknown reaches of the universe, unlocking the secrets of alien technology and discovering an ancient civilization with the power to travel faster than the speed of light.

> Finished chain.

… along with the content of the response variable.

The crew of the U.S.S. Discovery set out to explore the unknown reaches of the universe, unlocking the secrets of alien technology and discovering an ancient civilization with the power to travel faster than the speed of light.

Here, the Agent is primarily using its internal knowledge to generate the output. Here's a brief explanation of how that works:

The agent receives a prompt to "Compose an epic science fiction saga about interstellar explorers."
The agent then uses its understanding of language, narrative structure, and the specific themes mentioned in the prompt (science fiction, interstellar exploration, etc.) to generate a story.

The LLM's understanding comes from its training data. It was trained on a diverse range of internet text, so it has a broad base of information to draw from. When asked to generate a science fiction story, it uses patterns it learned during training about how such stories are typically structured and what elements they usually contain.

Remember, even though the language model has vast training data to draw from, it doesn't "know" specific facts or have access to real-time information. Its responses are generated based on patterns learned during training, not from a specific knowledge database.

Conclusion

In our agent examples, we've observed the strengths and limitations of using LLMs as a "content generator" and a "reasoning engine."

In the first scenario, where the agent served as a "reasoning engine," it leveraged tools like Google Search to gather, process, and synthesize information, producing an accurate and well-grounded output. However, while the agent's output was factual and informative, it lacked the creative flair that can be observed when an LLM is used as a "content generator."

In contrast, when the agent functioned as a "content generator," it created a vivid and imaginative science fiction story, showcasing its potential for creativity and narrative invention. Nevertheless, this approach is limited by the training data of the LLM and can sometimes result in "hallucinations" or inaccuracies.

In the next lesson, we’ll learn more about AutoGPT and BabyAGI, two popular LLM-based agents.