Converting Strings to datetime in Python

[stackabuse.com] 1 month ago

Introduction

Data can be represented in various forms - and a convenient way to represent dates and times is as strings. However, to work with these dates and times arithmetically (calculating time differences, adding or subtracting time, etc.) - we need to convert them to datetime objects.

One of the most common sources of string-formatted datetimes is REST APIs, which return dates as language-agnostic strings that we can then convert to other formats.

Additionally - timezones are a common headache when it comes to working with datetime objects, so we'll need to think about that while converting too.

In this guide - we'll take a look at how to convert a string date/time into a datetime object in Python, using the built-in datetime module, but also third-party modules such as dateutil, arrow and Maya, accounting for timezones.

Converting Strings Using datetime

The datetime module provides several object types (alongside helpers such as timedelta and timezone); the three we'll use here are date, time, and datetime. The date object holds the date, time holds the time, and datetime holds both date and time!

import datetime
print(f'Current date/time: {datetime.datetime.now()}')

Running this code would result in:

Current date/time: 2022-12-01 10:27:03.929149

When no custom formatting is given, the default string format is used. The output "2022-12-01 10:27:03.929149" follows ISO 8601 (YYYY-MM-DDTHH:MM:SS.mmmmmm), except that str() separates the date and time with a space instead of the letter T. If our input string is in this format, or if we know the format we'll be receiving upfront, we can easily parse it to a datetime object:

import datetime
date_time_str = '2022-12-01 10:27:03.929149'
# strptime(input_string, input_format)
date_time_obj = datetime.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S.%f')
print('Date:', date_time_obj.date())
print('Time:', date_time_obj.time())
print('Date-time:', date_time_obj)

Running it will print the date, time, and date-time:

Date: 2022-12-01
Time: 10:27:03.929149
Date-time: 2022-12-01 10:27:03.929149

Here, we use the strptime() method, which accepts two arguments:

  • The string-formatted date
  • The format of the first argument

Note that strptime() never guesses: the format is required, and a string that doesn't match it raises a ValueError. Because nothing has to be inferred, parsing with an explicit format is generally much cheaper than using libraries that work out the format on their own, as we'll see later. The return value is of the type datetime.

In our example, "2022-12-01 10:27:03.929149" is the input string and "%Y-%m-%d %H:%M:%S.%f" is the format of our date string. The returned datetime value is stored as date_time_obj.

Since this is a datetime object, we can call the date() and time() methods directly on it. As you can see from the output, it prints the 'date' and 'time' part of the input string!
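Once the strings are datetime objects, the arithmetic mentioned in the introduction becomes straightforward. The following is a minimal sketch using only the standard library; the second timestamp is made up for illustration:

```python
from datetime import datetime, timedelta

FORMAT = '%Y-%m-%d %H:%M:%S.%f'

start = datetime.strptime('2022-12-01 10:27:03.929149', FORMAT)
end = datetime.strptime('2022-12-03 12:00:00.000000', FORMAT)  # illustrative value

# Subtracting two datetime objects yields a timedelta
elapsed = end - start
print('Elapsed:', elapsed)  # Elapsed: 2 days, 1:32:56.070851
print('Elapsed seconds:', elapsed.total_seconds())

# Adding a timedelta to a datetime yields a new datetime
one_week_later = start + timedelta(weeks=1)
print('One week later:', one_week_later)
```

None of this would work on the raw strings, which is why parsing comes first.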

Format Tokens

It's worth taking a moment to understand format tokens - the "%Y-%m-%d %H:%M:%S.%f" from before.

Each token represents a different part of the date-time, like day, month, year, day of month or week, etc. The list of supported tokens is extensive enough to enable various formatting. Some of the commonly used ones, that we've also used earlier are:

  • %Y: Year (4 digits)
  • %m: Month
  • %d: Day of month
  • %H: Hour (24 hour)
  • %M: Minutes
  • %S: Seconds
  • %f: Microseconds

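Note: When formatting with strftime(), all of these tokens except the year produce zero-padded output (i.e. August is the 8th month, and is written as 08). When parsing with strptime(), the leading zero is optional for %m, %d, %H, %M and %S, while %f accepts one to six digits.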
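A short sketch of how padding behaves in practice: strftime() writes zero-padded fields, while strptime() also accepts month, day, hour, minute and second values without a leading zero. The input strings are made up for illustration:

```python
from datetime import datetime

FORMAT = '%Y-%m-%d %H:%M:%S'

# Parsing: leading zeros are optional for %m, %d, %H, %M and %S
unpadded = datetime.strptime('2022-8-1 9:5:3', FORMAT)
padded = datetime.strptime('2022-08-01 09:05:03', FORMAT)
print(unpadded == padded)  # True

# Formatting: strftime() writes the zero-padded form
formatted = unpadded.strftime(FORMAT)
print(formatted)  # 2022-08-01 09:05:03

# %f accepts one to six digits and pads on the right
fraction = datetime.strptime('10:27:03.5', '%H:%M:%S.%f')
print(fraction.microsecond)  # 500000
```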

Using strptime() Format Tokens to Convert String to Different Datetime Format

If the format of a string is known, it can be easily parsed to a datetime object using strptime(). Let's take a look at a non-trivial example that translates from one format to another:

import datetime
date_time_str = 'Jul 17 2022 9:20AM'
date_time_obj = datetime.datetime.strptime(date_time_str, '%b %d %Y %I:%M%p')
print('Date:', date_time_obj.date())
print('Time:', date_time_obj.time())
print('Date-time:', date_time_obj)

The input string was in one format - "Jul 17 2022 9:20AM". Knowing this format, we mapped each of its elements to a format token and parsed it into a datetime object, which prints in the ISO-style format by default:

Date: 2022-07-17
Time: 09:20:00
Date-time: 2022-07-17 09:20:00

Here's a short list of common string-formatted datetimes and their corresponding formats for strptime():

"Jun 28 2018 at 7:40AM" -> "%b %d %Y at %I:%M%p"
"September 18, 2017, 22:19:55" -> "%B %d, %Y, %H:%M:%S"
"Sun,05/12/99,12:30PM" -> "%a,%d/%m/%y,%I:%M%p"
"Mon, 21 March, 2015" -> "%a, %d %B, %Y"
"2018-03-12T10:12:45Z" -> "%Y-%m-%dT%H:%M:%SZ"

You can parse a date-time string of any format - as long as you use the correct string of format tokens for the input you're receiving.
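When the input can arrive in one of several known formats, a common pattern is to try each format in turn and rely on the ValueError that strptime() raises on a mismatch. This is only a sketch: parse_known and KNOWN_FORMATS are hypothetical names, and the list would need to reflect the formats you actually receive. Month-name tokens such as %b are also locale-dependent; the example assumes an English locale.

```python
from datetime import datetime

# Hypothetical list of formats we expect to receive
KNOWN_FORMATS = [
    '%Y-%m-%d %H:%M:%S.%f',
    '%Y-%m-%dT%H:%M:%SZ',
    '%b %d %Y at %I:%M%p',
    '%B %d, %Y, %H:%M:%S',
]

def parse_known(date_time_str, formats=KNOWN_FORMATS):
    """Try each format in turn; raise ValueError if none of them match."""
    for fmt in formats:
        try:
            return datetime.strptime(date_time_str, fmt)
        except ValueError:
            continue  # this format didn't match, try the next one
    raise ValueError(f'No known format matches {date_time_str!r}')

print(parse_known('Jun 28 2018 at 7:40AM'))
print(parse_known('2018-03-12T10:12:45Z'))
```

Ordering the list from most to least likely format keeps the number of failed attempts low.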

Convert String to Datetime with Timezones

Handling date-times becomes more complex when dealing with timezones. None of the examples so far carry any timezone information. Such objects are known as naive datetime objects.

However, the datetime objects contain a field exactly for storing timezone-related data - tzinfo:

import datetime as dt
dtime = dt.datetime.now()
print(dtime) # 2022-12-01 11:02:25.219318
print(dtime.tzinfo) # None

The tzinfo field is meant to hold an instance of a datetime.tzinfo subclass (such as the built-in datetime.timezone), denoting the timezone information. It's None by default, which means the datetime object is timezone-naive. A very common external library for handling timezones is pytz. You can set PyTz objects as the tzinfo field too.

If you don't have it already - install it via:

$ pip install pytz

Using PyTz, we can create an anchor for time-zone aware datetimes, such as UTC:

import datetime as dt
import pytz
dtime = dt.datetime.now(pytz.utc)
print(dtime)
print(dtime.tzinfo)

Output:

2022-12-01 02:07:41.960920+00:00
UTC

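It's no longer 11AM, but 2AM! Passing a timezone to now() returns the current time as seen from that timezone, and the local clock of the machine used for the earlier example runs several hours ahead of UTC. The point in time is the same, but it's now expressed in UTC.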

+00:00 is the difference between the displayed time and the UTC time as the global coordination anchor. We've set the time to be in UTC, so the offset is 00:00. This is a timezone-aware object.
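If the string itself carries a UTC offset, strptime() can produce a timezone-aware object directly through the %z token, without any third-party package. A minimal sketch follows; is_aware is a hypothetical helper that applies the rule from the datetime documentation, and the input strings are illustrative:

```python
from datetime import datetime, timezone

def is_aware(value):
    """Aware means tzinfo is set and it actually yields a UTC offset."""
    return value.tzinfo is not None and value.utcoffset() is not None

naive = datetime.strptime('2022-06-29 17:08:00', '%Y-%m-%d %H:%M:%S')
# %z parses a UTC offset such as -0400 into a fixed-offset tzinfo
aware = datetime.strptime('2022-06-29 17:08:00-0400', '%Y-%m-%d %H:%M:%S%z')

print(is_aware(naive))  # False
print(is_aware(aware))  # True

# The same point in time, expressed in UTC
as_utc = aware.astimezone(timezone.utc)
print(as_utc)  # 2022-06-29 21:08:00+00:00
```

A fixed offset like this records how far the value is from UTC, but not a region's daylight-saving rules, which is where named timezones come in.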

Similarly, we can switch the same datetime's interpretation between timezones. Let's convert a string, such as "2022-06-29 17:08:00" to a datetime and then localize it to the "America/New_York" timezone:

import datetime as dt
import pytz
date_time_str = '2022-06-29 17:08:00'
date_time_obj = dt.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S')
timezone = pytz.timezone('America/New_York')
timezone_date_time_obj = timezone.localize(date_time_obj)
print(timezone_date_time_obj)
print(timezone_date_time_obj.tzinfo)

Note: Localization turns a timezone-naive datetime into a timezone-aware datetime by treating its value as local time in the given timezone. The date and time fields stay the same, but the object now refers to one specific point in time, and localizing the same naive value to a different timezone would yield a different point in time.

We get the same datetime value, offset by -04:00 compared to the UTC time:

2022-06-29 17:08:00-04:00
America/New_York

17:08 in Tokyo isn't the same point in time as 17:08 in New York. On this date, 17:08 in Tokyo is 4:08 in New York, since Tokyo (UTC+09:00) is 13 hours ahead of New York's summer offset (UTC-04:00).
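The effect of attaching different timezones to the same naive value can be sketched with the standard library alone. The fixed offsets below are stand-ins for the two regions on this particular date and are an assumption of the example; they carry no daylight-saving rules. With pytz zones, localize() remains the right call; replace(tzinfo=...) is used here only because fixed-offset timezone objects have no such rules to apply.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the two regions on this date (assumption)
NEW_YORK_SUMMER = timezone(timedelta(hours=-4))
TOKYO = timezone(timedelta(hours=9))

naive = datetime.strptime('2022-06-29 17:08:00', '%Y-%m-%d %H:%M:%S')

# The same wall-clock reading attached to two different offsets
in_new_york = naive.replace(tzinfo=NEW_YORK_SUMMER)
in_tokyo = naive.replace(tzinfo=TOKYO)

# Expressed in UTC, the two values are clearly different points in time
print(in_new_york.astimezone(timezone.utc))  # 2022-06-29 21:08:00+00:00
print(in_tokyo.astimezone(timezone.utc))     # 2022-06-29 08:08:00+00:00
print(in_new_york - in_tokyo)                # 13:00:00
```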

How to find all of the timezone codes/aliases?

To find all of the available timezones, inspect the all_timezones field, which is a list of all of the available timezones:

import pytz
print(f'There are {len(pytz.all_timezones)} timezones in PyTz\n')
for time_zone in pytz.all_timezones:
    print(time_zone)

There are 594 timezones in PyTz
Africa/Abidjan
Africa/Accra
Africa/Addis_Ababa
Africa/Algiers
Africa/Asmara
Africa/Asmera
...

Change Datetime's Timezone

We can also convert a timezone-aware datetime object from one timezone to another, instead of localizing a timezone-naive datetime through the lens of some timezone.

This is different from localization: localizing the same naive value to different timezones gives different points in time, while converting an aware object keeps the same point in time and views it through a different lens:

import datetime as dt
import pytz
# Current time as seen from New York
timezone_ny = pytz.timezone('America/New_York')
ny_datetime_obj = dt.datetime.now(timezone_ny)
# The same point in time as seen from London
timezone_london = pytz.timezone('Europe/London')
london_datetime_obj = ny_datetime_obj.astimezone(timezone_london)
print('America/New_York:', ny_datetime_obj)
print('Europe/London:', london_datetime_obj)

First, we created one datetime object with the current time and set it as the "America/New_York" timezone. Then using the astimezone() method, we have converted this datetime to "Europe/London" timezone. Both datetimes will print different values, using UTC offset as a reference link between them:

America/New_York: 2022-11-30 21:24:30.123400-05:00
Europe/London: 2022-12-01 02:24:30.123400+00:00

02:24 on December 1st in London is the same point in time as 21:24 on November 30th in New York, since London is 5 hours ahead on these dates. The printed values differ only because each is shown with its own UTC offset.
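That both objects describe the same instant can be checked directly, since aware datetimes compare by point in time rather than by their displayed fields. A minimal sketch with standard-library fixed offsets standing in for the two regions' winter offsets (an assumption of this example):

```python
from datetime import datetime, timedelta, timezone

new_york = timezone(timedelta(hours=-5))  # fixed winter offset (assumption)
london = timezone.utc                     # London is at +00:00 in winter

ny_time = datetime(2022, 11, 30, 21, 24, 30, tzinfo=new_york)
london_time = ny_time.astimezone(london)

print(london_time)             # 2022-12-01 02:24:30+00:00
print(ny_time == london_time)  # True: same instant, different display
print(ny_time.timestamp() == london_time.timestamp())  # True
```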

Convert String to Datetime Using Third Party Libraries

Python's datetime module can convert many different types of strings to a datetime object. The main drawback is that you need to write a format string that strptime() can understand for each one. Creating these strings takes time and can make the code harder to read.

Instead, we can use other third-party libraries to make it easier.

In some cases these third-party libraries also have better built-in support for manipulating and comparing date-times, and some even have timezones built-in, so you don't need to include an extra PyTz package.

Let's take a look at a few of these libraries in the following sections.

Convert String to Datetime with dateutil

The dateutil module (installed as python-dateutil) is an extension to the datetime module. One advantage is that we don't need to pass any format string to parse a date!

To automatically convert a string to datetime without a format token using Python's dateutil:

from dateutil.parser import parse
# Avoid naming the variable "datetime", which would shadow the standard module
date_time_obj = parse('2018-06-29 22:21:41')
print(date_time_obj)

This parse function will parse the string automatically! You don't have to include any format string. Let's try to parse different types of strings using dateutil:

from dateutil.parser import parse
date_array = [
 '2018-06-29 08:15:27.243860',
 'Jun 28 2018 7:40AM',
 'Jun 28 2018 at 7:40AM',
 'September 18, 2017, 22:19:55',
 'Sun, 05/12/1999, 12:30PM',
 'Mon, 21 March, 2015',
 '2018-03-12T10:12:45Z',
 '2018-06-29 17:08:00.586525+00:00',
 '2018-06-29 17:08:00.586525+05:00',
 'Tuesday , 6th September, 2017 at 4:30pm'
]
for date in date_array:
    print('Parsing: ' + date)
    dt = parse(date)
    print(dt.date())
    print(dt.time())
    print(dt.tzinfo)
    print('\n')

Output:

Parsing: 2018-06-29 08:15:27.243860
2018-06-29
08:15:27.243860
None
Parsing: Jun 28 2018 7:40AM
2018-06-28
07:40:00
None
Parsing: Jun 28 2018 at 7:40AM
2018-06-28
07:40:00
None
Parsing: September 18, 2017, 22:19:55
2017-09-18
22:19:55
None
Parsing: Sun, 05/12/1999, 12:30PM
1999-05-12
12:30:00
None
Parsing: Mon, 21 March, 2015
2015-03-21
00:00:00
None
Parsing: 2018-03-12T10:12:45Z
2018-03-12
10:12:45
tzutc()
Parsing: 2018-06-29 17:08:00.586525+00:00
2018-06-29
17:08:00.586525
tzutc()
Parsing: 2018-06-29 17:08:00.586525+05:00
2018-06-29
17:08:00.586525
tzoffset(None, 18000)
Parsing: Tuesday , 6th September, 2017 at 4:30pm
2017-09-06
16:30:00
None

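You can see that almost any type of string can be parsed easily using the dateutil module. Keep in mind that ambiguous inputs are resolved by convention: "05/12/1999" above was read month-first as May 12th, and parse() accepts a dayfirst=True argument if your data is day-first.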

While this is convenient, recall from earlier that having to infer the format makes parsing much slower, so if your code requires high performance then this might not be the right approach for your application.
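Whether parsing speed matters for a given workload is best measured rather than assumed. The sketch below times two standard-library parsers with timeit; seconds_per_call is a hypothetical helper, and a third-party parser such as dateutil's parse could be passed to it in the same way. The absolute numbers depend on the machine and Python version, so none are shown here.

```python
import timeit
from datetime import datetime

SAMPLE = '2018-06-29 08:15:27.243860'

def with_strptime(text):
    return datetime.strptime(text, '%Y-%m-%d %H:%M:%S.%f')

def with_fromisoformat(text):
    return datetime.fromisoformat(text)

def seconds_per_call(parser, text, number=10_000):
    """Average wall-clock seconds for one call of parser(text)."""
    return timeit.timeit(lambda: parser(text), number=number) / number

for parser in (with_strptime, with_fromisoformat):
    print(f'{parser.__name__}: {seconds_per_call(parser, SAMPLE):.2e} s per call')
```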

Convert String to Datetime with Maya

Maya also makes it very easy to parse a string and change timezones. To easily convert a string with Python's Maya:

import maya
dt = maya.parse('2018-04-29T17:45:25Z').datetime()
print(dt.date())
print(dt.time())
print(dt.tzinfo)

Output:

2018-04-29
17:45:25
UTC

For converting the time to a different timezone:

import maya
dt = maya.parse('2018-04-29T17:45:25Z').datetime(to_timezone='America/New_York', naive=False)
print(dt.date())
print(dt.time())
print(dt.tzinfo)

Output:

2018-04-29
13:45:25
America/New_York

Now isn't that easy to use? Let's try out maya with the same set of strings we have used with dateutil:

import maya
date_array = [
 '2018-06-29 08:15:27.243860',
 'Jun 28 2018 7:40AM',
 'Jun 28 2018 at 7:40AM',
 'September 18, 2017, 22:19:55',
 'Sun, 05/12/1999, 12:30PM',
 'Mon, 21 March, 2015',
 '2018-03-12T10:12:45Z',
 '2018-06-29 17:08:00.586525+00:00',
 '2018-06-29 17:08:00.586525+05:00',
 'Tuesday , 6th September, 2017 at 4:30pm'
]
for date in date_array:
    print('Parsing: ' + date)
    dt = maya.parse(date).datetime()
    print(dt)
    # Truncated for readability
    #print(dt.date())
    #print(dt.time())
    #print(dt.tzinfo)

Output:

Parsing: 2018-06-29 08:15:27.243860
2018-06-29 08:15:27.243860+00:00
Parsing: Jun 28 2018 7:40AM
2018-06-28 07:40:00+00:00
Parsing: Jun 28 2018 at 7:40AM
2018-06-28 07:40:00+00:00
Parsing: September 18, 2017, 22:19:55
2017-09-18 22:19:55+00:00
Parsing: Sun, 05/12/1999, 12:30PM
1999-05-12 12:30:00+00:00
Parsing: Mon, 21 March, 2015
2015-03-21 00:00:00+00:00
Parsing: 2018-03-12T10:12:45Z
2018-03-12 10:12:45+00:00
Parsing: 2018-06-29 17:08:00.586525+00:00
2018-06-29 17:08:00.586525+00:00
Parsing: 2018-06-29 17:08:00.586525+05:00
2018-06-29 12:08:00.586525+00:00
Parsing: Tuesday , 6th September, 2017 at 4:30pm
2017-09-06 16:30:00+00:00

As you can see, all of the date formats were successfully parsed!

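If the input string doesn't carry timezone information, Maya assumes it is in UTC, and strings that do carry an offset are converted to UTC (note how the +05:00 input above became 12:08 UTC). So, if you want the result in any other timezone, you need to pass the to_timezone parameter (and naive, as in the earlier example) to datetime().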

Convert String to Datetime with Arrow

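Arrow is another library for dealing with datetime in Python. Like Maya, it can parse some strings without an explicit format, although, as we'll see below, only ISO 8601-style input. It returns an Arrow object, which offers the familiar datetime methods such as date() and time(); a plain Python datetime is available through its datetime attribute.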

To easily convert a string to datetime using Python's arrow:

import arrow
dt = arrow.get('2018-04-29T17:45:25Z')
print(dt.date())
print(dt.time())
print(dt.tzinfo)

Output:

2018-04-29
17:45:25
tzutc()

And here is how you can use arrow to convert timezones using the to() method:

import arrow
dt = arrow.get('2018-04-29T17:45:25Z').to('America/New_York')
print(dt)
print(dt.date())
print(dt.time())

Output:

2018-04-29T13:45:25-04:00
2018-04-29
13:45:25

As you can see the date-time string is converted to the "America/New_York" region.

Now, let's again use the same set of strings we have used above:

import arrow
date_array = [
 '2018-06-29 08:15:27.243860',
 #'Jun 28 2018 7:40AM',
 #'Jun 28 2018 at 7:40AM',
 #'September 18, 2017, 22:19:55',
 #'Sun, 05/12/1999, 12:30PM',
 #'Mon, 21 March, 2015',
 '2018-03-12T10:12:45Z',
 '2018-06-29 17:08:00.586525+00:00',
 '2018-06-29 17:08:00.586525+05:00',
 #'Tuesday , 6th September, 2017 at 4:30pm'
]
for date in date_array:
    dt = arrow.get(date)
    print('Parsing: ' + date)
    print(dt)
    # Truncated for readability
    #print(dt.date())
    #print(dt.time())
    #print(dt.tzinfo)

This code will fail for the date-time strings that have been commented out, which is over half of our examples. The output for other strings will be:

Parsing: 2018-06-29 08:15:27.243860
2018-06-29T08:15:27.243860+00:00
Parsing: 2018-03-12T10:12:45Z
2018-03-12T10:12:45+00:00
Parsing: 2018-06-29 17:08:00.586525+00:00
2018-06-29T17:08:00.586525+00:00
Parsing: 2018-06-29 17:08:00.586525+05:00
2018-06-29T17:08:00.586525+05:00

In order to correctly parse the date-time strings that are commented out, you'll need to pass a format string as the second argument to arrow.get(). Note that Arrow uses its own token syntax (such as YYYY, MM and DD) rather than the % tokens used by strptime().

Conclusion

In this article we have shown different ways to parse a string to a datetime object in Python. You can either opt for the default Python datetime library or any of the third-party libraries mentioned in this article, among many others.

The main problem with the default datetime package is that we need to specify the format string manually for almost every date-time string format. So, if your string format changes in the future, you will likely have to change your code as well. Many third-party libraries, like the ones mentioned here, can infer common formats automatically, at the cost of slower parsing and occasional ambiguity.

One more problem we face is dealing with timezones. A widely used approach is to store times in your database in UTC and convert them to the user's local timezone only when needed.
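The store-in-UTC approach can be sketched with the standard library. The incoming string, the storage step (simply a string here) and the user's offset are all assumptions made for illustration; a real application would typically use a named timezone for the user rather than a fixed offset.

```python
from datetime import datetime, timedelta, timezone

# A timestamp received together with its own UTC offset (illustrative input)
received = datetime.strptime('2022-06-29 17:08:00-0400', '%Y-%m-%d %H:%M:%S%z')

# On write: normalize to UTC and serialize for storage
stored = received.astimezone(timezone.utc).isoformat()
print(stored)  # 2022-06-29T21:08:00+00:00

# On read: parse the stored value and convert it only for display
loaded = datetime.fromisoformat(stored)
user_offset = timezone(timedelta(hours=9))  # hypothetical user preference
shown = loaded.astimezone(user_offset)
print(shown)  # 2022-06-30 06:08:00+09:00
```

Because every stored value shares the same reference, values from different sources can be compared and sorted without further conversion.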

These libraries are not only good for parsing strings, but they can be used for a lot of different types of date-time related operations. I'd encourage you to go through their documentation to learn the functionality in detail.