Wednesday, July 31, 2019

Exploring Azure Cosmos DB Indexing Policy changes


     Earlier, I wrote about Indexing Policies of Cosmos DB. I was revisiting the topic and noticed a couple of changes in the Cosmos DB documentation. I want to share with you what I found. Let's start with the obsolete attributes. You can find the following default indexing policy example on many websites, including my posts. There are nine different attributes here and three of them are already obsolete.

 {
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths":
   [
    {
      "path": "/*",
      "indexes": [
        {
          "kind": "Range",
          "dataType": "Number",
          "precision": -1
        },
        {
          "kind": "Range",
          "dataType": "String",
          "precision": -1
        },
        {
          "kind": "Spatial",
          "dataType": "Point"
        }
      ]
    }
   ],
  "excludedPaths": []
 }

"automatic"

     The value of this attribute is ignored. You might need to keep it to make some tools work. If not, you can remove it from your policies.

"precision"

     The value of this attribute is ignored too. Keep its value at -1 if you have a tool that requires it.

"hash"

     Hash is an index kind that used to be used for equality indexes. It has been replaced by the Range kind.

Think twice before you change your current indexing policy!

     Removing any of these properties from your current indexing policies will cause re-indexing! My suggestion: don't use these obsolete properties in your new projects, and keep them in your current projects until you need to re-index your containers.
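
     If you want to see where these attributes still live in an existing policy without touching it, a small read-only audit can help. The following is a minimal sketch in Python. The helper name find_obsolete and the path notation are my own, hypothetical choices, and the list of obsolete names is taken only from this post. It reports attributes and never removes them, so it cannot trigger re-indexing.

```python
# Names flagged as obsolete in this post. The helper itself is hypothetical.
OBSOLETE_KEYS = {"automatic", "precision"}
OBSOLETE_KINDS = {"hash"}


def find_obsolete(node, path="$"):
    """Return the locations of obsolete attributes inside a policy dict.

    The policy is only read, never modified.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}"
            if key in OBSOLETE_KEYS:
                found.append(here)
            elif key == "kind" and str(value).lower() in OBSOLETE_KINDS:
                found.append(here)
            # Keep walking into nested objects and arrays.
            found.extend(find_obsolete(value, here))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            found.extend(find_obsolete(item, f"{path}[{i}]"))
    return found


# The default policy from the beginning of this post, as a Python dict.
policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {"kind": "Range", "dataType": "Number", "precision": -1},
                {"kind": "Range", "dataType": "String", "precision": -1},
                {"kind": "Spatial", "dataType": "Point"},
            ],
        }
    ],
    "excludedPaths": [],
}

obsolete = find_obsolete(policy)
for location in obsolete:
    print(location)
```

     You could run something like this against every container's policy and use the output to plan the cleanup for the next time you need to re-index anyway.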

     Re-indexing is the next topic. When re-indexing is in progress, queries may not return all the matching data. There is no easy way to track re-indexing. You need to write your own code by using one of the Cosmos DB SDKs. You cannot change the indexing policy while re-indexing is running if its mode is set to Consistent. You can always drop the indexes that are being rebuilt by setting the indexing policy's mode to None.
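
     To give an idea of what "your own code" might look like, here is a minimal sketch in Python. It assumes the service reports progress as a percentage in the response header x-ms-documentdb-collection-index-transformation-progress when a container is read with quota information requested. The header name and the SDK calls mentioned in the comments are assumptions on my side, so verify them against the SDK version you use. The function below only parses a headers dictionary and does not call Cosmos DB itself.

```python
# Assumed header name, please verify it against your SDK and API version.
PROGRESS_HEADER = "x-ms-documentdb-collection-index-transformation-progress"


def reindex_progress(headers):
    """Return the re-indexing progress (0 to 100), or None if not reported."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(PROGRESS_HEADER)
    if raw is None:
        return None
    return int(float(raw))


# With the azure-cosmos Python package, the headers could be obtained
# roughly like this (assumption, not verified here):
#
#   container.read(populate_quota_info=True)
#   headers = container.client_connection.last_response_headers
#
# A hand-written dictionary stands in for a real response below.
sample_headers = {PROGRESS_HEADER: "42"}
print(reindex_progress(sample_headers))
```

     Polling this value until it reaches 100 is one way to know when queries can be trusted to return all the matching data again.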

Time To Live requires Indexing

     Indexing must be active on TTL containers. If a container's indexing mode is None, you cannot activate TTL functionality on that container. Also, if TTL is active on a container, you cannot change its indexing mode to None. This does not mean that you need to index everything. You can exclude all paths if you don't need any indexing on a TTL container.
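
     As an illustration, the sketch below shows one way such a policy could look as a Python dict, together with a tiny check for the rule described above. It assumes that excluding the "/*" path with an empty includedPaths list is how "exclude all paths" is expressed, and that a missing indexingMode means consistent. The names TTL_FRIENDLY_POLICY and ttl_compatible are hypothetical.

```python
# Indexing stays active (mode is not "none"), but no property path is indexed.
# Assumption: "/*" in excludedPaths is how all paths are excluded.
TTL_FRIENDLY_POLICY = {
    "indexingMode": "consistent",
    "includedPaths": [],
    "excludedPaths": [{"path": "/*"}],
}


def ttl_compatible(policy):
    """Return True if TTL could be enabled with this policy.

    Only the rule from this post is checked: the indexing mode must not be None.
    """
    mode = str(policy.get("indexingMode", "consistent")).lower()
    return mode != "none"


print(ttl_compatible(TTL_FRIENDLY_POLICY))
print(ttl_compatible({"indexingMode": "none"}))
```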

Lazy Indexing is dangerous

     Lazy indexing used to be an option. It's not in any Cosmos DB documentation anymore. By using Lazy indexing, you could save 20 to 30 percent on Request Units. Just like anything else in life, you get what you pay for when it comes to Lazy indexing. By selecting Lazy indexing, you are saying that indexes will be updated eventually. Until the indexes are updated, your queries might not return all the data, since some of it might not be indexed yet. Lazy indexing is still an option, but nobody talks about it, and for a good reason. In my opinion, it should be listed as an obsolete feature, or it should have better documentation about how it works and why it might not be a good option for your solutions.
     If you use Lazy indexing to reduce Request Units in your solution, change it to Consistent now unless you have a really good reason!


Composite Indexes

     Composite Indexes can be used only for queries that order data by two or more properties. To define a composite index, you need to specify two or more paths and the type of order (ascending or descending) for each path. The sequence of paths must match your query for the index to be used. To understand this better, let's try to create a composite index for the following query.


SELECT * FROM Users ORDER BY Users.State ASC, Users.HiredOn ASC

// I add the following code to my Index Policy


"compositeIndexes": [
  [
    {
      "path": "/State",
      "order": "ascending"
    },
    {
      "path": "/HiredOn",
      "order": "ascending"
    }
  ]
]

The following query can use this composite index too, because it reverses the order of every path.

SELECT * FROM Users ORDER BY Users.State DESC, Users.HiredOn DESC

The following queries cannot use this composite index! You need to create more composite indexes for them.

SELECT * FROM Users ORDER BY Users.State ASC, Users.HiredOn DESC

SELECT * FROM Users ORDER BY Users.HiredOn DESC, Users.State DESC

SELECT * FROM Users ORDER BY Users.State ASC, Users.HiredOn ASC, Users.Department ASC
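
The rules behind these examples can be written down as a small check. The following Python sketch encodes only what the examples above show: same number of paths, same sequence of paths, and orders that either all match or are all reversed. The function name can_use_composite is hypothetical, and this is not how the Cosmos DB query engine is implemented, just a way to reason about which composite indexes you need.

```python
def flip(order):
    """Return the opposite sort order."""
    return "descending" if order == "ascending" else "ascending"


def can_use_composite(order_by, composite):
    """Check whether an ORDER BY clause could use a composite index.

    Both arguments are lists of (path, order) tuples.
    """
    # The query must order by exactly the paths in the index.
    if len(order_by) != len(composite):
        return False
    # The sequence of paths must be the same.
    if [path for path, _ in order_by] != [path for path, _ in composite]:
        return False
    query_orders = [order for _, order in order_by]
    index_orders = [order for _, order in composite]
    # Either every order matches, or every order is reversed.
    return query_orders == index_orders or query_orders == [
        flip(order) for order in index_orders
    ]


# The composite index defined earlier in this post.
index = [("/State", "ascending"), ("/HiredOn", "ascending")]

queries = {
    "State asc, HiredOn asc": [("/State", "ascending"), ("/HiredOn", "ascending")],
    "State desc, HiredOn desc": [("/State", "descending"), ("/HiredOn", "descending")],
    "State asc, HiredOn desc": [("/State", "ascending"), ("/HiredOn", "descending")],
    "HiredOn desc, State desc": [("/HiredOn", "descending"), ("/State", "descending")],
    "State asc, HiredOn asc, Department asc": [
        ("/State", "ascending"),
        ("/HiredOn", "ascending"),
        ("/Department", "ascending"),
    ],
}

results = {name: can_use_composite(q, index) for name, q in queries.items()}
for name, usable in results.items():
    print(f"{name}: {usable}")
```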
