An embedding is a way to represent data for vector databases, machine learning models, and algorithms. Its format is similar to spatial data: rather than a latitude and a longitude, we have a vector of many floating-point numbers representing the data. Just like spatial data, it is not really readable by users. Here is an example of an embedding. The following numbers represent the text "Hello World!"
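1,536 floating-point numbers represent "Hello World!". Now you may be wondering whether the number of floating-point numbers increases with the size of the text. How many would I need for "This is not my first rodeo. I have season tickets."? With the text-embedding-ada-002 model used in this post, the answer is still 1,536: the model returns a fixed-length vector regardless of how long the input is. What does grow with the text is its token count, which must stay within the model's input limit. These rules of thumb help estimate it: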
1 token ~= 4 characters in English | 1-2 sentences ~= 30 tokens
1 token ~= 3/4 of a word | 1 paragraph ~= 100 tokens
100 tokens ~= 75 words | 1,500 words ~= 2,048 tokens
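The first rule of thumb above can be turned into a quick estimate. This is a minimal sketch that assumes the "1 token ~= 4 characters" approximation for English text; EstimateTokens is a hypothetical helper written for this illustration, not part of any SDK, and the real count depends on the model's tokenizer.

```csharp
using System;

// Rough token estimate based on the rule of thumb "1 token ~= 4 characters in English".
// Hypothetical helper: an approximation only, not the model's actual tokenizer.
static int EstimateTokens(string text) =>
    (int)Math.Ceiling(text.Length / 4.0);

string sample = "This is not my first rodeo. I have season tickets.";
Console.WriteLine($"{sample.Length} characters, roughly {EstimateTokens(sample)} tokens");
```

An estimate like this is only useful for checking that a text is comfortably below the model's input limit before sending it.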
If you want to run a vector search in a database like Azure Cosmos DB for MongoDB vCore, you must first convert your search parameter into an embedding/vector. To generate embeddings, you need an Azure OpenAI resource. Then, you need to deploy the text-embedding-ada-002 model. The following screenshot shows you my deployed models. You must wait 5 to 10 minutes after the deployment before you can use a new model.
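Next, you need the Azure.AI.OpenAI NuGet package, version 1.0.0-beta.9 or a later 1.0.0-beta release. The 2.x releases of the package changed the client API, so the code in this post targets the beta versions. Check the Include prerelease checkbox to find it in Visual Studio.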
You will need the endpoint URL and the key from the Azure Portal, or from the Sample Code link in Azure AI Studio.
- You will need to use the endpoint and the key when you declare the OpenAIClient.
- DeploymentName is the name you gave to your text-embedding-ada-002 model deployment.
- Input is the text you want to convert into an embedding/vector.
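using System;
using Azure;
using Azure.AI.OpenAI;

// The endpoint and the key come from your Azure OpenAI resource.
var client = new OpenAIClient(new Uri("endpoint goes here"),
    new AzureKeyCredential("key goes here"));

var options = new EmbeddingsOptions()
{
    // Name of the text-embedding-ada-002 deployment, not the model name.
    DeploymentName = "embedding",
    Input = { "Generate this text into embedding" }
};

// Data[0] holds the embedding for the first (and here only) input string.
var vector = await client.GetEmbeddingsAsync(options);
foreach (var item in vector.Value.Data[0].Embedding.ToArray())
{
    Console.WriteLine(item);
}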
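Once both the stored text and the search parameter are vectors, a vector search compares them with a similarity measure. Cosine similarity is one common choice. The sketch below is only an illustration of that idea: CosineSimilarity is a hypothetical helper, not part of the Azure SDK, and the tiny hand-made vectors stand in for real 1,536-number embeddings. In practice the database computes the similarity for you.

```csharp
using System;

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Close to 1 means the vectors point the same way; 0 means they are orthogonal.
// Hypothetical helper for illustration; not part of Azure.AI.OpenAI.
static double CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Made-up 3-number vectors standing in for real embeddings.
float[] query = { 1f, 0f, 1f };
float[] similar = { 1f, 0f, 1f };
float[] unrelated = { 0f, 1f, 0f };

Console.WriteLine(CosineSimilarity(query, similar));
Console.WriteLine(CosineSimilarity(query, unrelated));
```

With real embeddings, the float array returned by Embedding.ToArray() in the code above could be passed to the same helper.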