OpenAI

Chroma provides a convenient wrapper around OpenAI's embedding API. This embedding function runs remotely on OpenAI's servers and requires an API key, which you can obtain by signing up for an account at OpenAI.

Models    Input     Dimensionality  Context Size
ada-002   English   1536            2048
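
The examples below pass a placeholder key inline; in practice you may prefer to read the key from an environment variable and pass it in explicitly. A minimal sketch, assuming the key is stored under the conventional OPENAI_API_KEY name:

import os
from chromadb.utils import embedding_functions

# Reading the key from the environment keeps it out of source control.
# OPENAI_API_KEY is a common convention, not a name Chroma itself requires.
embedder = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"]
)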

Basic Usage

Python

pip install openai

from chromadb.utils import embedding_functions

embedder = embedding_functions.OpenAIEmbeddingFunction(
    api_key="YOUR_API_KEY"
)

# client here is an existing Chroma client, e.g. chromadb.Client()
collection = client.create_collection(
    name="oai_ef",
    embedding_function=embedder
)
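
With the embedding function attached to the collection, Chroma calls it automatically whenever documents are added or queried. A minimal sketch; the ids and documents below are illustrative placeholders:

# Chroma embeds these documents with the OpenAI function attached above.
collection.add(
    ids=["id1", "id2"],
    documents=["A document about llamas", "A document about pigs"]
)

# The query text is embedded with the same function before the search runs.
results = collection.query(
    query_texts=["llamas"],
    n_results=1
)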

Javascript

yarn add openai

import { ChromaClient, OpenAIEmbeddingFunction } from 'chromadb'

const embedder = new OpenAIEmbeddingFunction({
  openai_api_key: "YOUR_API_KEY"
})

// client here is an existing ChromaClient instance, e.g. new ChromaClient()
const collection = await client.createCollection({
  name: "oai_ef",
  embeddingFunction: embedder
})

Advanced Usage

Call directly

By passing the embedding function to a Collection, Chroma handles the embedding of documents and queries for you. However, in some cases you may want to generate the embeddings yourself, outside of Chroma, and handle them directly.

Python

embeddings = embedder(["document1", "document2"])
# [[0.04565250128507614, 0.01611952856183052...], [0.030171213671565056, 0.007690359838306904...]]

Javascript

const embeddings = await embedder.generate(["document1", "document2"])
// [[0.04565250128507614, 0.01611952856183052...], [0.030171213671565056, 0.007690359838306904...]]
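
If you take this route, you can hand the precomputed vectors straight back to Chroma rather than letting the collection embed for you. A minimal Python sketch, where the ids and documents are illustrative placeholders:

# Store documents together with vectors you computed yourself;
# Chroma skips its own embedding step when embeddings are supplied.
embeddings = embedder(["document1", "document2"])

collection.add(
    ids=["doc1", "doc2"],
    documents=["document1", "document2"],
    embeddings=embeddings
)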

Using a different model

You can pass in an optional model_name argument, which lets you choose which OpenAI embeddings model to use. By default, Chroma uses text-embedding-ada-002. You can see a list of all available models here.
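
For example, to select the model explicitly (the value shown is the default Chroma uses when model_name is omitted):

embedder = embedding_functions.OpenAIEmbeddingFunction(
    api_key="YOUR_API_KEY",
    model_name="text-embedding-ada-002"
)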

Run with Azure

To use the OpenAI embedding models on other platforms such as Azure, you can use the api_base, api_type, and api_version parameters:

openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="YOUR_API_KEY",
    api_base="YOUR_API_BASE_PATH",
    api_type="azure",
    api_version="YOUR_API_VERSION",
    model_name="text-embedding-ada-002"
)