Working with Date and Time in Pandas

Working with Date and Time in Pandas

5 mins read32.1K Views Comment
clickHere
Updated on Oct 3, 2023 12:06 IST

When you are analyzing your data using Python, its Pandas library naturally comes in handy. It is a powerful and feature-rich library that is designed for data pre-processing purposes such as data cleaning, manipulation, and transformation. The Pandas objects store various data types such as int, float, string, boot, DateTime, etc. So, let’s get started working with date and time in pandas

2022_03_Untitled-design-5.jpg

Using the correct data type is important to produce accurate outputs during analysis. Pandas can correctly infer data types in most cases. However, there are certain scenarios where you will likely need to explicitly convert your data from one type to another. 

In this article, we will aim at how Pandas handles dates and times in your data through the DateTime data type.

The date and time features find importance in Data Science problems, spanning domains such as marketing, sales, e-commerce, retail, and many more.

Must Check: Pandas Online Courses and Certifications

Must Check: Pandas Interview Questions for Data Scientists

Today, we will cover the following sections:

DateTime Objects in Pandas

Convert string to DateTime

Pandas has a built-in function called to_datetime()that converts date and time in string format to a DateTime object. 

To understand this through example, let’s create a sample DataFrame as shown below:

 
#Importing required libraries
import pandas as pd
import numpy as np
from datetime import datetime
#Creating a sample DataFrame
df = pd.DataFrame({'date': ['04/03/2022', '05/03/2022', '06/03/2022'],
'patients': [16, 19, 11]})
df
Copy code
df.dtypes
Copy code

Let’s check the data types of the DataFrame:

output2

As you can see, the ‘date’ column in the DataFrame is currently a string-type object. 

Let’s parse this to datetime, as shown:

 
df.dtypesdf['date'] = pd.to_datetime(df['date'])
df
Copy code
output3
 
print(type(df['date']))
Copy code
output4

Let’s check the converted data type:

 
df.dtypes
Copy code
output5

Thus, to_datetime() converts the column to a series of the appropriate datetime64 dtype.

Programming Online Courses and Certification Python Online Courses and Certifications
Data Science Online Courses and Certifications Machine Learning Online Courses and Certifications

Convert integer to DateTime

The to_datetime()function will convert integer format to a DateTime object. It takes two mandatory arguments:

  • unit: defines if your integer is in terms of the number of days, seconds, years, etc. 
  • origin: defines the reference point you want to start counting your ‘units’ from

For example, 

 
pd.to_datetime(14667, unit='D', origin='unix')
Copy code
output6

Here, the unit parameter is set to the number of days, and the origin parameter is set to start at the Unix, that is, 1970-01-01. So, the timestamp displays the date and time 14667 days from the Unix.

Let’s take another example,

 
pd.to_datetime(996209999, unit='s', origin='2000-01-01')
Copy code
output7

The timestamp displays the date and time 996209999 seconds from 2000-01-01. Phew, that’s a lot of seconds!

Timedelta Objects in Pandas

In case you need to find the duration between two dates, the result is returned as a timedelta object. Pandas has a built-in function called to_timedelta()that gives the difference in dates and times. 

Let’s understand this through example:

 
#Timedelta - duration between dates
date = datetime.now()
#Present Date
print(date)
#Date Tomorrow
print(date+pd.to_timedelta(1, unit='D'))
#Date After a Month
print(date+pd.to_timedelta(1, unit='m'))
Copy code

Text

Description automatically generated

Here, too, the unit argument defines whether that’s days, months, hours, seconds, etc. 

DateTime Format

By default, the to_datetime() function parses string with month-first format (MM-DD-YYYY). You can check your datetime format using the argument format, as shown:

 
pd.to_datetime(df['date'], format="Your_datetime_format")
Copy code
A picture containing text

Description automatically generated

However, if you’re outside of the United States, you probably write the day-first (DD-MM-YYYY).

You can make Pandas display day first instead of month by setting the argument dayfirst=True, as shown:

 
df = pd.DataFrame({'date': ['4/03/2022', '5/03/2022', '6/03/2022'],
'patients': [16, 19, 11]})
df['date'] = pd.to_datetime(df['date'], dayfirst=True)
df
Copy code
output10

Custom format

The format argument of the to_datetime() function allows you to pass a custom format. For example, let’s say you want to parse your string with the following format – YYYY-MM-DD HH: MM: SS. Let’s see how we can do this:

 
#Creating another Sample DataFrame
df0 = pd.DataFrame({'time_of_death': ['2020-7-13 21:15:0',
'2020-7-9 17:30:30',
'2020-12-10 4:45:12'],
'patientID': [101, 23, 48]})
df0['time_of_death'] = pd.to_datetime(df0['time_of_death'],
format="%Y-%m-%d %H:%M:%S")
df0
Copy code
output11

Handling Parsing Errors

If your date string does not meet the timestamp format, you will get a TypeError, as shown:

 
df = pd.DataFrame({'date': ['4/03/2022', 'x/03/2022', '6/03/2022'],
'patients': [16, 19, 11]})
df['date'] = pd.to_datetime(df['date'], dayfirst=True)
df
Copy code
2022_03_image-187.jpg

You can set the argument errors to ‘ignore’ or ‘coerce’ to avoid this.

 
df['date'] = pd.to_datetime(df['date'], dayfirst=True, errors='ignore')
df
Copy code
output13
/
 
df['date'] = pd.to_datetime(df['date'], dayfirst=True, errors='coerce')
df
Copy code
output14

Extract DateTime into Different Columns

Pandas DateTime object has the following in-built attributes:

  • dt.year – to get the year
  • dt.month – to get the month
  • dt.day – to get the day
  • dt.week – to get the week of the year
  • dt.dayofweek – to get the day of the week
  • dt.is_leap_year – to get leap year

Let’s see how we can get these separately:

 
df = pd.DataFrame({'date': ['4/03/2022', '5/03/2022', '6/03/2022'],
'patients': [16, 19, 11]})
df['date'] = pd.to_datetime(df['date'], dayfirst=True)
df['year']= df['date'].dt.year
df['month']= df['date'].dt.month
df['day']= df['date'].dt.day
df
Copy code
2022_03_image-188.jpg
 
df['week_of_year'] = df['date'].dt.week
df['day_of_week'] = df['date'].dt.dayofweek
df['is_leap_year'] = df['date'].dt.is_leap_year
df
Copy code
output16

 

Assemble DateTime from Different Columns

You can use the to_datetime()function to assemble a datetime from multiple columns as well. The keys (columns labels) can be common abbreviations: 

 
#Creating another Sample DataFrame
df1 = pd.DataFrame({'year': [2000, 2010, 2020],
'month': [1, 2, 3],
'day': [21, 4, 13]})
df1['date'] = pd.to_datetime(df1)
df1
Copy code
output17

Date Range in Pandas 

There will be instances where you would need to create date sequences. To make this convenient, Pandas provides the date_range() method that accepts a start date, an end date, and an optional frequency code (default code is ‘D’) :

 
pd.date_range(start='13/4/2022', end='5/5/2022')
Copy code
output18

You could also use the datetime.today() function to specify today’s date and time. And define the time period you want to generate:

 
pd.date_range(start=datetime.today(), periods=10, freq='T')
Copy code
About the Author

This is a collection of insightful articles from domain experts in the fields of Cloud Computing, DevOps, AWS, Data Science, Machine Learning, AI, and Natural Language Processing. The range of topics caters to upski... Read Full Bio