DateTime data type can be trouble. In SQL Server, there are
many datetime datatypes. Some of them are in UTC format, some of them has time
zone information, some of them is just basic date and time or just date.
Depending on what you need, you can pick one of them and hopefully you pick the right one for what you need.
What if you want to store datetime in a database and there
is no datetime data type available? If you use NoSQL Database like Cosmos DB,
you will have this problem. Data is saved as Json in Cosmos DB and other
NoSQL Databases. JSON does not have a data type. Json supports
only strings, numbers, Boolean, null, custom objects and arrays.
Things can get tricky if you have queries that need to
retrieve data by date. Since there is no date type, how can database engine compare
date to date? I don’t work at Cosmos DB team, so I am not sure how Cosmos DB handles
this problem internally but my guess the answer for this question is just like any database answer…. It depends 😊
I believe it depends on how you save the datetime information in the database.
There are many ways to save date and time, the default way is; It will be saved as string but you can also save it as a number. Date format depends on
how you serialize your data into JSON.
Let’s say the date we want to
save is 11/15/2019 01:58 PM and we are in Eastern Time Zone. Here are the most common serializer outputs.
.NET
JavaScriptSerializer. "\"\\/Date(1573844334736)\\/\""
.NET DataContractJsonSerializer. "\"\\/Date(1573844334736-0500)\\/\""
Built-in JSON object of JavaScript. "2019-11-15T13:58:00.511Z"
ISO 8601 Format "2019-11-15T13:58:00-05:00"
One of the most common and recommended versions of datetime in many systems is ISO 8601 format. This is the format Cosmos DB recommends too. Also; It is
recommended to store all dates as UTC. It is recommended because it is
readable, data sorts correctly, includes fractional seconds and it is endorsed by many
groups including W3C.
As you can see it does not matter which serializer you pick your datetime will be a string. In some way database
engine needs to know that this string is datetime and use it as datetime to retrieve data.
If this happens in SQL Server, you are looking at table scan because you need
to convert all the data into the datetime data type then try to compare data. We are not even
touching to potential Time Zone or elapsed seconds problems.
You can save datetime as a number too. As you can see in the
first two serializers, there is a number in the Date constructor. This number is known as Unix timestamp. It represents the total number of seconds that have
elapsed since January 1, 1970. It does not count leap seconds! Cosmos DB’s
Timestamp (_ts) follows this way.
It’s easy to convert datetime into Unix timestamp in JavaScript.
Since Cosmos DB stored procedures are in JavaScript too, you can convert the
datetime coming from user into Unix timestamp then search CosmosDB by using a
number rather than string. If you need to use SDK to find data, you can use UnixDateTimeConverter class to convert datetime to Unix timestamp in your code.
To search by Unix timestamp from stored procedure or SDK; First you need to save datetime in Unix timestamp
as number. I recommend saving the dates in string format and Unix timestamp
format in the database. You can use Unix timestamp to search in database and you
can use string format to display date in User Interface so you don’t need to do any
conversion in client side or server side. You might need to pay little bit more for storage, but I think it's worth it. You can customize the Indexing policy and index only Unix timestamp.
Here is an example, I have the date in ISO 8601 format and Unix Timestamp format in the following document.
I have a container named Posts, It has 100.000 Json
documents in it. Let’s say I need to find all the posts of a specific user
after a specific date. Since I have the datetime in Unix timestamp. I will use
CreationUnixDt property in my query, It should be faster to find a number than string in
any database. Let’s test if this applies to Cosmos DB. First let’s search
by CreationDate property which is a string.Here is an example, I have the date in ISO 8601 format and Unix Timestamp format in the following document.
Now, Let’s use the same query but this time search by Unix Timestamp which is a number. First, we need to convert date (2008-07-01) to Unix Timestamp. There are many ways to do that. For example, In SQL Server you can use the following T-SQL
SELECT DATEDIFF(SECOND,{d '1970-01-01'}, '2008-07-1')
There are interesting results here, first of all it looks like It costs more to search by a number. Index lookup time is significantly low when you search by a number and execution time is higher when we search by a number. I am surprised about these numbers. If this was done in SQL Server, there will be huge differences between these two queries. Results of these queries tell me that there is not that much difference between these two queries.
You have one more option to save datetime in JSON object. You can create a custom object for it. This might cost you more for storage since you will have more properties to define a date and time. But you can search database however you like since it will give you most flexible solution, you store what it matters for you in this way. If date and time is crucial for your solution, this might help you to store datetime the way you want. You can store the location of user in this custom object too. Sometimes, it's not enough to know what time zone user was in when transaction occured. There are a lot of countries in the same time zone and current datetime data types do not include any information about the location of user.
great information, the only article i found for cosmos datetime
ReplyDeletevery less documentation for cosmos, and seems not much are using it....
to find a small thing need to put lots of efforts
thanks!!
ReplyDelete"Here is an example, I have the date in ISO 8601 format and Unix Timestamp format"
ReplyDeleteI assume both are stored in local time, since date is not ending in Z?
I'm exploring the option of storing both utc and local time to allow flexibility:
Delete"time":
{
"utc": "yyyy-MM-ddTHH:mm:ss.fffffffZ"
"local": "yyyy-MM-ddTHH:mm:ss.fffffff"
}
Time in cosmos db is probably what is most problematic to get right, I wish Microsoft had enforced use of the reccomended format "yyyy-MM-ddTHH:mm:ss.fffffffZ". To much flexibility for such a critical part IMO.