Introduction
In this lesson, we’ll dive into the most popular proprietary LLMs, such as GPT-4, Claude, and Cohere’s models. The debate between open-source and proprietary models is multifaceted, touching on aspects like customization, development speed, control, regulation, and quality.
Proprietary LLMs
Proprietary models like GPT-4 and PaLM are developed and controlled by specific organizations, in contrast to open-source LLMs such as BigScience’s BLOOM and various community-driven projects, which are freely available for developers to use, modify, and distribute. Because a single organization directs their development, proprietary models may offer advanced features and customization options tailored to specific use cases. Organizations can fine-tune these models to meet their exact requirements, providing a competitive edge in the market.
Since proprietary models are developed and controlled by specific organizations, those organizations retain complete control over the model's development, deployment, and updates. This level of control allows them to protect their intellectual property and maintain a competitive advantage.
Organizations offering proprietary models often provide commercial support and service level agreements (SLAs) to ensure reliability and performance guarantees. This level of professional support can be crucial for some use cases.
Using proprietary LLMs can be cost-effective in many use cases. LLMs are large, and serving them at low latency requires several GPUs as well as the right engineering competencies; providers can spread these costs across many customers and thus benefit from economies of scale.
Let’s now look at a list of popular proprietary models (as of July 2023).
Cohere LLMs
Cohere’s models are categorized into two main types: Generative and Embeddings. The Generative models, also known as command models, are trained on a large corpus of data from the Internet. They are continually developed and updated, with improvements released weekly.
You can register for a Cohere account and get a free trial API key. There is no credit or time limit associated with a trial key; calls are rate-limited to 100 calls per minute, which is typically enough for an experimental project.
Save your key in your .env file as follows.
COHERE_API_KEY="<YOUR-COHERE-API-KEY>"
Then, install the cohere package with this command.
pip install cohere
You can now generate text with Cohere as follows.
from dotenv import load_dotenv
load_dotenv()

import cohere
import os

co = cohere.Client(os.environ["COHERE_API_KEY"])

response = co.generate(
    prompt='Please briefly explain to me how Deep Learning works using at most 100 words.',
    max_tokens=200,
)

print(response.generations[0].text)
Deep Learning is a subfield of artificial intelligence and machine learning that is based on artificial neural networks with many layers, inspired by the structure and function of the human brain. These networks are trained on large amounts of data and algorithms to identify patterns and learn from experience, enabling them to perform complex tasks such as image and speech recognition, language translation, and decision-making. The key components of Deep Learning are neural networks with many layers, large amounts of data, and algorithms for training and optimization. Some of the applications of Deep Learning include autonomous vehicles, natural language processing, and speech recognition.
On the other hand, the embedding models are multilingual, supporting more than 100 languages. These models are designed for large enterprises whose end users are spread worldwide.
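As a quick sketch of how the multilingual embedding model can be used, the snippet below embeds an English sentence and its French translation and compares them with cosine similarity (the model name embed-multilingual-v2.0 reflects the Cohere API as of mid-2023; the helper function is our own illustrative code). The API call runs only if a COHERE_API_KEY is set.

```python
import os
from math import sqrt

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

if os.getenv("COHERE_API_KEY"):
    import cohere
    co = cohere.Client(os.environ["COHERE_API_KEY"])
    # The multilingual model maps text in 100+ languages into one vector space,
    # so translations of the same sentence land close together.
    response = co.embed(
        texts=["I love machine learning", "J'adore l'apprentissage automatique"],
        model="embed-multilingual-v2.0",
    )
    en_vec, fr_vec = response.embeddings
    print(cosine_similarity(en_vec, fr_vec))
```

A high similarity score for the two sentences above is what makes these embeddings useful for cross-lingual search: queries and documents in different languages can be matched in the same vector space.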
Developers can also build classifiers on top of Cohere's language models to automate language-based tasks and workflows. The Cohere service provides a variety of models, such as Command (command) for dialogue-like interactions, Generation (base) for generative tasks, Summarize (summarize-xlarge) for generating summaries, and more.
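To illustrate building a classifier on top of Cohere's models, here is a minimal sketch of few-shot sentiment classification: a handful of labeled examples is passed alongside the input to classify. The labels and example texts are our own illustrative data, and the exact import path of the Example wrapper varies across Cohere SDK versions, so treat the guarded API call as a sketch. It runs only if a COHERE_API_KEY is set.

```python
import os

# Illustrative labeled examples used to bootstrap the classifier.
labeled_examples = [
    ("The order arrived broken", "negative"),
    ("Great service, thank you!", "positive"),
    ("I want a refund immediately", "negative"),
    ("The support team was wonderful", "positive"),
]

if os.getenv("COHERE_API_KEY"):
    import cohere
    # Note: depending on your SDK version, the example wrapper may live at
    # cohere.ClassifyExample or cohere.responses.classify.Example.
    co = cohere.Client(os.environ["COHERE_API_KEY"])
    response = co.classify(
        inputs=["This product exceeded my expectations"],
        examples=[
            cohere.ClassifyExample(text=text, label=label)
            for text, label in labeled_examples
        ],
    )
    print(response.classifications[0].prediction)
```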
OpenAI's GPT-3.5
GPT-3.5 is a language model developed by OpenAI. Its turbo version, GPT-3.5-turbo (recommended by OpenAI over other variants), offers a more affordable option for generating human-like text through an API accessible via OpenAI endpoints. The model is optimized for chat applications while remaining powerful on other generative tasks and can process 96 languages. GPT-3.5-turbo comes in two variants: one with a 4k tokens context length and the other with 16k tokens.
The Azure Chat Solution Accelerator, powered by Azure OpenAI Service, offers enterprises a robust platform to host OpenAI chat models with enhanced moderation and safety features. This solution enables organizations to establish a dedicated chat environment within their Azure subscription, ensuring a secure and tailored user experience. One of its key advantages is privacy: because it's deployed within your Azure tenancy, you get complete isolation and control over your chat services.
In the first lesson, “What are Large Language Models,” we saw how to use GPT-3.5-turbo via API, so refer to that lesson for a code snippet on how to use it with Python and get an API key. Additionally, we have recently introduced the "LangChain & Vector Databases in Production" free course, aimed at assisting you in getting the most out of LLMs and enhancing their functionality. The course encompasses fundamental topics such as initiating prompts and addressing hallucination, as well as delving into advanced areas like using LangChain to give memory to LLMs and developing agents for interaction with the real world.
OpenAI's GPT-4
OpenAI's GPT-4 is a multimodal model whose parameter count and training procedure are undisclosed. It is the latest and most powerful model published by OpenAI, and its multimodality enables it to process both text and images as input. It can be accessed by submitting an early access request through the OpenAI platform (as of July 2023). The two variants of the model are gpt-4 and gpt-4-32k, with context lengths of 8k and 32k tokens, respectively.
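The two context lengths matter in practice because a request (prompt plus expected completion) must fit inside the model's window. As a small illustration of that constraint, the helper below (our own illustrative code, not an OpenAI API) picks the smallest GPT-4 variant that can fit a request, using the 8,192- and 32,768-token windows mentioned above.

```python
# Context windows (in tokens) for the two GPT-4 variants, as of July 2023.
GPT4_CONTEXT = {"gpt-4": 8192, "gpt-4-32k": 32768}

def pick_gpt4_variant(prompt_tokens: int, completion_tokens: int) -> str:
    """Return the smallest GPT-4 variant whose context window fits the request."""
    needed = prompt_tokens + completion_tokens
    # Dict insertion order is preserved, so the 8k variant is checked first.
    for model, window in GPT4_CONTEXT.items():
        if needed <= window:
            return model
    raise ValueError(f"Request of {needed} tokens exceeds every GPT-4 context window")

print(pick_gpt4_variant(3000, 1000))   # fits in the 8k window
print(pick_gpt4_variant(20000, 4000))  # needs the 32k window
```

Since the larger-context variant is typically priced higher per token, defaulting to the smallest window that fits is a simple cost-control measure.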
Anthropic’s Claude
Anthropic, an AI safety and research company, is a significant player in the AI landscape. It has secured substantial funding and partnered with Google for cloud computing access, mirroring OpenAI's trajectory in recent years.
Anthropic's flagship product, Claude 2, is an LLM with a context size of 100k tokens. Anthropic has ambitious growth plans and aims to compete with top players like OpenAI and DeepMind by building similarly advanced models.
Claude 2 is trained to be a helpful assistant in a conversational tone, similar to ChatGPT. Its beta, unfortunately, is open only to people in the US or UK (as of July 2023).
If you're in the US or UK, you can sign up for free on Anthropic's website. Just click "Talk to Claude," and you'll be prompted to provide an email address. You'll be ready to go after you confirm the email address.
The API is made available via the web Console. First, read the "Getting Access to Claude" guide to learn how to apply for access.
Once you have access to the Console, you can generate API keys via your Account Settings.
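Once you have an API key, calling Claude is straightforward. A sketch under the mid-2023 Anthropic Python SDK is shown below: the completions endpoint of that era expected the conversation wrapped in "\n\nHuman:" / "\n\nAssistant:" turn markers, which the small helper constructs. The question text is illustrative, and the API call runs only if an ANTHROPIC_API_KEY is set.

```python
import os

def build_claude_prompt(question: str) -> str:
    # Claude's completions endpoint (mid-2023) expects Human/Assistant turn markers.
    return f"\n\nHuman: {question}\n\nAssistant:"

prompt = build_claude_prompt("Briefly explain what a context window is.")

if os.getenv("ANTHROPIC_API_KEY"):
    import anthropic  # pip install anthropic
    client = anthropic.Anthropic()
    response = client.completions.create(
        model="claude-2",
        max_tokens_to_sample=200,
        prompt=prompt,
    )
    print(response.completion)
```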
Google’s PaLM
Google's Pathways Language Model, or PaLM, is a next-generation artificial intelligence model optimized for various developer use cases, particularly in the realm of NLP. Its primary applications include the development of chatbots, text summarization, question-answering systems, and document search through its text embedding service.
PaLM 2, the upgraded version of the model, is renowned for its ease of use and precision in following instructions. It features variants that are specifically trained for text and chat generation, as well as text embeddings, allowing for a broad range of use cases.
Access to PaLM is exclusively through the PaLM API. Read the Setup process, and please note that as of July 2023, the PaLM API is only available after being selected from a waitlist.
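Once you're off the waitlist, the PaLM API is exposed through the google-generativeai Python package. The sketch below maps the three variant families mentioned above to the model names the API used as of mid-2023 and makes a guarded text-generation call; the prompt is illustrative, and the call runs only if a PALM_API_KEY is set.

```python
import os

# PaLM API model names for the three variant families (as of July 2023).
PALM_MODELS = {
    "text": "models/text-bison-001",
    "chat": "models/chat-bison-001",
    "embedding": "models/embedding-gecko-001",
}

if os.getenv("PALM_API_KEY"):
    import google.generativeai as palm  # pip install google-generativeai
    palm.configure(api_key=os.environ["PALM_API_KEY"])
    completion = palm.generate_text(
        model=PALM_MODELS["text"],
        prompt="Briefly explain how text embeddings enable document search.",
        max_output_tokens=200,
    )
    print(completion.result)
```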
Google's PaLM 2 showcases significant advancements, including multilingual training for superior foreign language performance, enhanced logical reasoning, and the ability to generate and debug code. It integrates seamlessly into services like Gmail. PaLM 2 can be fine-tuned for specific domains such as cybersecurity vulnerability detection (Sec-PaLM) and medical query responses (Med-PaLM).
Conclusion
The choice between proprietary and open-source AI models depends on the specific needs and resources of the user or organization, and the decision should be based on a careful evaluation of all factors.
The next lesson will cover the most popular open-source LLMs.