Tuesday, November 17, 2020

How to track Azure Cosmos DB Re-Indexing Progress

 

     Indexes let your queries run faster. When you adjust your indexing policy, the database engine re-indexes your data according to your changes. In Cosmos DB, when you change your indexing policy, the database engine truncates your indexes and starts to rebuild them from scratch. You do not want to change your indexing policy while your application is busy: because your queries cannot use the dropped indexes, they will take longer and cost more Request Units. Your queries also might not return all the data they are supposed to. You can read my older post about indexes in Cosmos DB.

     You may want to monitor the re-indexing progress, for example to disable your application until indexing is completed or to warn your team about it. You can check the re-indexing progress from the SDK, which means you need to write your own code to accomplish this. The following code checks the progress every second. If the progress is at 100%, it quits; otherwise it keeps checking every second until it receives 100 as the result.

     To get the re-indexing progress, set PopulateQuotaInfo to true in the request options when you read the container's information. The result comes back in the x-ms-documentdb-collection-index-transformation-progress response header.

using Microsoft.Azure.Cosmos;
using System;
using System.Configuration;
using System.Threading.Tasks;

namespace IndexTracker
{
    class Program
    {
        // The connection string is read from the CosmosConn app setting.
        private static readonly string connectionString =
            ConfigurationManager.AppSettings["CosmosConn"];

        // Create the client once and reuse it for every check.
        private static readonly CosmosClient cosmosClient =
            new CosmosClient(connectionString);

        static async Task Main(string[] args)
        {
            var current = await CheckIndex();
            Console.WriteLine("Current Index Transformation : " + current);
            if (current < 100)
            {
                var start = DateTime.Now;
                while (current < 100)
                {
                    // Wait one second between checks without blocking the thread.
                    await Task.Delay(1000);
                    current = await CheckIndex();
                    Console.WriteLine("Current Index Transformation : " + current);
                }
                var end = DateTime.Now;
                Console.WriteLine("It took " + (end - start).TotalSeconds +
                    " seconds to complete the task.");
            }
        }

        // Reads the container with PopulateQuotaInfo enabled and returns
        // the index transformation progress (0 to 100).
        static async Task<int> CheckIndex()
        {
            try
            {
                ContainerResponse containerResponse = await cosmosClient
                    .GetContainer("Stackoverflow0", "Posts")
                    .ReadContainerAsync(new ContainerRequestOptions
                    { PopulateQuotaInfo = true });
                return int.Parse(containerResponse
                    .Headers["x-ms-documentdb-collection-index-transformation-progress"]);
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error occurred: " + ex.Message);
            }
            // Report 0 on failure so the caller keeps polling.
            return 0;
        }
    }
}
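
     One thing to keep in mind with the loop above: if the check keeps failing, it reports 0 and the loop never ends. Below is a minimal sketch of a polling helper with a timeout. The ReindexWaiter class and WaitForCompletionAsync method are hypothetical names of my own, not part of the Cosmos DB SDK; the helper takes any function that returns the progress, so a method such as CheckIndex could be passed in.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Cosmos DB SDK.
static class ReindexWaiter
{
    // Polls getProgress until it reports 100 or the timeout elapses.
    // Returns true when re-indexing finished, false when the timeout hit first.
    public static async Task<bool> WaitForCompletionAsync(
        Func<Task<int>> getProgress, TimeSpan pollInterval, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                while (true)
                {
                    int progress = await getProgress();
                    Console.WriteLine("Current Index Transformation : " + progress);
                    if (progress >= 100)
                    {
                        return true;
                    }
                    // Throws once the timeout cancels the token.
                    await Task.Delay(pollInterval, cts.Token);
                }
            }
            catch (OperationCanceledException)
            {
                return false;
            }
        }
    }
}
```

     With this in place, the loop in Main could become a single call such as await ReindexWaiter.WaitForCompletionAsync(CheckIndex, TimeSpan.FromSeconds(1), TimeSpan.FromMinutes(30)), where the 30-minute limit is only an example value.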
   

     Before I ran this, I changed my indexing policy. I have a container with 10,000 documents, and its throughput is 400 Request Units per second. Re-indexing time depends on these numbers. Here is the output of this code.

      You can accomplish this with Cosmos DB's REST API too. In the following example, I am using Postman to demonstrate the same result. I updated my indexing policy and sent a request to Cosmos DB to get information about the container. As you can see, I added x-ms-documentdb-populatequotainfo to my request headers. The result comes back in the x-ms-documentdb-collection-index-transformation-progress response header.
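
      For readers who prefer code over Postman, here is a rough sketch of the same REST call made with HttpClient. It assumes master-key authentication, and it uses "2018-12-31" as the x-ms-version value, which is my assumption and may need to be adjusted for your account. The CosmosRestSketch class and its method names are hypothetical; the endpoint and key are values you supply yourself.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper class; the names are for illustration only.
static class CosmosRestSketch
{
    // Builds the master-key authorization token for a Cosmos DB REST request.
    public static string BuildAuthToken(string verb, string resourceType,
        string resourceLink, string date, string masterKey)
    {
        // The signed payload is verb, resource type, resource link and date,
        // each followed by a newline, plus one trailing empty line.
        string payload = verb.ToLowerInvariant() + "\n" +
            resourceType.ToLowerInvariant() + "\n" +
            resourceLink + "\n" +
            date.ToLowerInvariant() + "\n\n";
        using (var hmac = new HMACSHA256(Convert.FromBase64String(masterKey)))
        {
            string signature = Convert.ToBase64String(
                hmac.ComputeHash(Encoding.UTF8.GetBytes(payload)));
            return Uri.EscapeDataString("type=master&ver=1.0&sig=" + signature);
        }
    }

    // Reads the container and returns the raw progress header value,
    // or null if the header is not present in the response.
    public static async Task<string> GetProgressAsync(HttpClient http,
        string endpoint, string masterKey, string database, string container)
    {
        string resourceLink = "dbs/" + database + "/colls/" + container;
        string date = DateTime.UtcNow.ToString("r"); // RFC 1123 date

        var request = new HttpRequestMessage(HttpMethod.Get,
            endpoint.TrimEnd('/') + "/" + resourceLink);
        request.Headers.TryAddWithoutValidation("authorization",
            BuildAuthToken("GET", "colls", resourceLink, date, masterKey));
        request.Headers.TryAddWithoutValidation("x-ms-date", date);
        // Assumed API version; adjust if your account requires another one.
        request.Headers.TryAddWithoutValidation("x-ms-version", "2018-12-31");
        request.Headers.TryAddWithoutValidation(
            "x-ms-documentdb-populatequotainfo", "true");

        using (HttpResponseMessage response = await http.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return response.Headers.TryGetValues(
                "x-ms-documentdb-collection-index-transformation-progress",
                out var values) ? values.FirstOrDefault() : null;
        }
    }
}
```

      A call would look like await CosmosRestSketch.GetProgressAsync(http, endpoint, key, "Stackoverflow0", "Posts"), where endpoint is your account URI and key is your primary key. The returned string can then be parsed with int.Parse just like in the SDK version.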



