
Pandas DataFrame | tz_localize method

Last updated: Aug 10, 2023
Tags: Python, Pandas

Pandas DataFrame.tz_localize(~) makes the DataFrame's index timezone-aware.

NOTE

To localize a column and not an index, use Series.dt.tz_localize() instead.
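As a minimal sketch of the note above, the following localizes a datetime column rather than the index. The DataFrame and its column names (when, value) are made up for illustration.

```python
import pandas as pd

# Hypothetical DataFrame with a timezone-naive datetime column
df = pd.DataFrame({
    "when": pd.to_datetime(["2020-12-22 15:30:00", "2020-12-23 16:00:00"]),
    "value": [0, 1],
})

# Series.dt.tz_localize attaches a timezone to the column values
df["when"] = df["when"].dt.tz_localize("Asia/Tokyo")
print(df.dtypes)
```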

Parameters

1. tz | string or tzinfo

The timezone to use.

2. axis | int or string | optional

Whether to localize the row index or the column index:

Axis | Description
0 or "index" | Localize the row index.
1 or "columns" | Localize the column index.

By default axis=0.
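A small sketch of axis=1, assuming a hypothetical DataFrame whose column labels (not row labels) are timezone-naive datetimes:

```python
import pandas as pd

# Hypothetical DataFrame whose column labels are timezone-naive datetimes
cols = pd.DatetimeIndex(["2020-12-22 15:30:00", "2020-12-23 16:00:00"])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Localize the column labels; axis="columns" is equivalent
df_cols = df.tz_localize("Asia/Tokyo", axis=1)
print(df_cols.columns)
```

The source DataFrame's columns stay timezone-naive, since tz_localize returns a new object.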

3. level | int or string | optional

The level to target. This is only relevant when your DataFrame has a MultiIndex.
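To illustrate level, here is a sketch with a hypothetical two-level MultiIndex in which only the level named time holds datetimes. The level names and values are assumptions for this example.

```python
import pandas as pd

# Hypothetical MultiIndex: a datetime level ("time") and a string level ("key")
times = pd.to_datetime(["2020-12-22 15:30:00", "2020-12-23 16:00:00"])
index = pd.MultiIndex.from_product([times, ["a", "b"]], names=["time", "key"])
df = pd.DataFrame({"value": range(4)}, index=index)

# Only the "time" level is localized; the "key" level is left untouched
df_local = df.tz_localize("Asia/Tokyo", level="time")
print(df_local.index.get_level_values("time"))
```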

4. copy | boolean | optional

  • If True, then a new DataFrame is returned. Modifying this DataFrame will not mutate the source DataFrame, and vice versa.

  • If False, then no new DataFrame is created - modifying the returned DataFrame will mutate the source DataFrame, and vice versa.

By default, copy=True.

5. ambiguous | string or NumPy array of booleans | optional

Due to time adjustments caused by Daylight Saving Time (DST), ambiguity in the time can arise. For instance, consider the following case:

Local time:
01:59:58
01:59:59   
01:00:00   # DST ends and so we set the wall clock back 1 hour
01:00:01
...
01:59:58   # This local time occurred for the second time
...

If you try to localize a time that occurred twice (e.g. 01:59:58), Pandas cannot tell which occurrence you are referring to: the first one (DST) or the second one (non-DST).
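The sketch below makes the ambiguity concrete. It borrows the CET transition of 2019-10-27 from the examples later on this page (an assumption, since the clock times above are generic) and shows that one wall-clock time maps to two different UTC instants.

```python
import pandas as pd

wall = pd.Timestamp("2019-10-27 02:30:00")          # occurred twice in CET
first = wall.tz_localize("CET", ambiguous=True)     # treat as DST (+02:00)
second = wall.tz_localize("CET", ambiguous=False)   # treat as non-DST (+01:00)

# Same wall-clock time, two different instants in UTC
print(first.tz_convert("UTC"), second.tz_convert("UTC"))
print(second - first)
```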

Pandas can deal with such ambiguity in one of the following ways:

Value | Description
"infer" | Infer the DST transition from the sequence of times provided.
array of booleans | An array-like (e.g. a list or NumPy array) of booleans, where True indicates DST time and False indicates non-DST time.
"NaT" | Ambiguous times are converted into NaT (not-a-time).
"raise" | Ambiguous times will raise an error.

By default, ambiguous="raise".

6. nonexistent | string or timedelta | optional

Again, due to Daylight Saving Time (DST), some local times do not exist. For instance:

Local time:
00:59:58
00:59:59   # DST starts so the wall clock is turned forwards by 1 hour
02:00:00
02:00:01

Notice how local times like 01:30:30 do not exist due to DST causing the wall clock to shift forwards by an hour.
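As a rough illustration of the gap, the sketch below uses the CET transition of 2019-03-31 from the examples later on this page (an assumption; the clock times above follow a different zone's rule). The last second before the jump and the first second after it are only one second apart in real time.

```python
import pandas as pd

# Last existing second before the jump and first existing time after it
before = pd.Timestamp("2019-03-31 01:59:59").tz_localize("CET")
after = pd.Timestamp("2019-03-31 03:00:00").tz_localize("CET")

# The wall clock skips an hour, but only one second of real time passes
print(after - before)
```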

Pandas can deal with non-existent times in the following ways:

Value | Description
"shift_forward" | Shift any non-existent times forwards to the nearest existing time.
"shift_backward" | Shift any non-existent times backwards to the nearest existing time.
"NaT" | Return NaT for non-existent times.
timedelta object | Shift non-existent times by the provided timedelta.
"raise" | Throw an error for non-existent times.

By default, nonexistent="raise".

Return Value

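A DataFrame whose index (or column index, if axis=1) is timezone-aware. The wall-clock times themselves are not changed; only the timezone information is attached.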
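A short sketch of what the return value looks like on a DataFrame. The column name A is made up for illustration.

```python
import pandas as pd

idx = pd.DatetimeIndex(["2020-12-22 15:30:00", "2020-12-23 16:00:00"])
df = pd.DataFrame({"A": [0, 1]}, index=idx)
df_tokyo = df.tz_localize("Asia/Tokyo")

print(df.index.tz)         # the source index stays naive
print(df_tokyo.index.tz)   # the returned index carries the timezone

# Stripping the timezone again gives back the original wall-clock values
same_wall_clock = (df_tokyo.index.tz_localize(None) == df.index).all()
print(same_wall_clock)
```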

Examples

Basic usage

Consider the following Series with a timezone-naive DatetimeIndex:

import pandas as pd
idx = pd.DatetimeIndex(['2020-12-22 15:30:00',
                        '2020-12-23 16:00:00'])
s = pd.Series(range(2), index=idx)
s
2020-12-22 15:30:00 0
2020-12-23 16:00:00 1
dtype: int64

Here, naive simply means that our DatetimeIndex has no notion of timezones.

To make the DatetimeIndex timezone-aware:

s.tz_localize(tz="Asia/Tokyo")
2020-12-22 15:30:00+09:00 0
2020-12-23 16:00:00+09:00 1
dtype: int64

Here, the appended +09:00 means that the standard time in Tokyo is 9 hours ahead of UTC.
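One way to check the +09:00 offset is to convert the localized index to UTC with tz_convert, a separate method that is not covered on this page. This is a sketch only:

```python
import pandas as pd

idx = pd.DatetimeIndex(["2020-12-22 15:30:00", "2020-12-23 16:00:00"])
s = pd.Series(range(2), index=idx)

s_tokyo = s.tz_localize(tz="Asia/Tokyo")
# Converting to UTC moves each timestamp back by the 9-hour offset
s_utc = s_tokyo.tz_convert("UTC")
print(s_utc)
```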

Dealing with ambiguous times

Consider the following time series with ambiguous times:

idx = pd.DatetimeIndex(['2019-10-27 02:30:00',
'2019-10-27 02:00:00',
'2019-10-27 02:30:00',
'2019-10-27 03:00:00',
'2019-10-27 03:30:00'])
s = pd.Series(range(5), index=idx)
s
2019-10-27 02:30:00 0
2019-10-27 02:00:00 1
2019-10-27 02:30:00 2
2019-10-27 03:00:00 3
2019-10-27 03:30:00 4
dtype: int64

On 2019-10-27 at 3AM local time, DST ended in the Central European Time zone, so the wall clock was turned back one hour to 2AM. As a result, local times between 02:00:00 and 02:59:59, such as 2019-10-27 02:30:00, occurred twice.

raise

By default, ambiguous="raise", which means that an error is thrown whenever an ambiguous time is encountered:

s.tz_localize("CET")   # ambiguous="raise"
AmbiguousTimeError: Cannot infer dst time from 2019-10-27 02:30:00, try using the 'ambiguous' argument

infer

In this specific case, it can be inferred from the data that the first 02:30:00 refers to DST time, and the latter 02:30:00 refers to the non-DST time:

s.tz_localize("CET", ambiguous="infer")
2019-10-27 02:30:00+02:00 0
2019-10-27 02:00:00+01:00 1
2019-10-27 02:30:00+01:00 2
2019-10-27 03:00:00+01:00 3
2019-10-27 03:30:00+01:00 4
dtype: int64

Observe how Pandas assigned the DST offset +02:00 to the first 02:30:00 and the standard offset +01:00 to the remaining times.

Boolean array

Sometimes it is impossible to differentiate between DST and non-DST time:

idx = pd.DatetimeIndex(['2019-10-27 02:30:00'])
s = pd.Series("a", index=idx)
s.tz_localize("CET", ambiguous="infer")
AmbiguousTimeError: Cannot infer dst time from 2019-10-27 02:30:00 as there are no repeated times

Here, we get an error because without a sequence of datetimes (as we had above), Pandas cannot know whether the ambiguous time of 02:30:00 is DST or non-DST.

In such cases, we can directly tell Pandas whether a certain datetime is DST or not by passing in an array of booleans:

idx = pd.DatetimeIndex(['2019-10-27 02:30:00', '2019-10-27 02:35:00'])
s = pd.Series("a", index=idx)
s.tz_localize("CET", ambiguous=[True,False])
2019-10-27 02:30:00+02:00 a
2019-10-27 02:35:00+01:00 a
dtype: object

Here, True indicates that the corresponding time is DST.
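The same idea can be sketched on the five-element Series from the infer example. The mask name is_dst is made up here, and the resulting offsets match those shown for ambiguous="infer" above.

```python
import pandas as pd

idx = pd.DatetimeIndex(["2019-10-27 02:30:00", "2019-10-27 02:00:00",
                        "2019-10-27 02:30:00", "2019-10-27 03:00:00",
                        "2019-10-27 03:30:00"])
s = pd.Series(range(5), index=idx)

# One flag per row: only the first 02:30:00 is treated as DST
is_dst = [True, False, False, False, False]
localized = s.tz_localize("CET", ambiguous=is_dst)

# UTC offset of each localized timestamp, in hours
offsets = [ts.utcoffset().total_seconds() / 3600 for ts in localized.index]
print(offsets)
```

The array must have one entry per row of the index, including rows whose times are not ambiguous.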

NaT

To map all ambiguous datetimes to NaT:

idx = pd.DatetimeIndex(['2019-10-27 02:30:00', '2019-10-27 03:30:00'])
s = pd.Series("a", index=idx)
s.tz_localize("CET", ambiguous="NaT")
NaT a
2019-10-27 03:30:00+01:00 a
dtype: object
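After mapping ambiguous times to NaT, a common follow-up is to drop those rows. A minimal sketch, using the same Series as above:

```python
import pandas as pd

idx = pd.DatetimeIndex(["2019-10-27 02:30:00", "2019-10-27 03:30:00"])
s = pd.Series("a", index=idx)

localized = s.tz_localize("CET", ambiguous="NaT")
# Keep only the rows whose index label is a valid timestamp
cleaned = localized[localized.index.notna()]
print(cleaned)
```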

Dealing with non-existent times

Consider the following Series:

idx = pd.DatetimeIndex(['2019-03-31 01:30:00', '2019-03-31 02:30:00', '2019-03-31 03:30:00'])
s = pd.Series("a", index=idx)
s
2019-03-31 01:30:00 a
2019-03-31 02:30:00 a
2019-03-31 03:30:00 a
dtype: object

On 2019-03-31 at 2AM (Central European Time), DST started, so the wall clock was turned forwards by one hour to 3AM. As a result, local times like 2:30AM do not exist.

raise

By default, nonexistent="raise", which means that non-existent times like this will raise an error:

s.tz_localize("CET")   # nonexistent="raise"
NonExistentTimeError: 2019-03-31 02:30:00
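If you would rather fall back than fail, one possible pattern is to try the strict default first and retry with a lenient option. This is only a sketch: the broad except clause is an assumption made because the exact exception class can depend on the installed Pandas and timezone library.

```python
import pandas as pd

idx = pd.DatetimeIndex(["2019-03-31 01:30:00", "2019-03-31 02:30:00"])
s = pd.Series("a", index=idx)

try:
    result = s.tz_localize("CET")   # nonexistent="raise"
except Exception as err:            # broad on purpose (see the note above)
    print(type(err).__name__, err)
    # Fall back to marking non-existent times as NaT
    result = s.tz_localize("CET", nonexistent="NaT")

print(result)
```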

shift_forward

To shift non-existent times forwards to the nearest existing time:

s.tz_localize("CET", nonexistent="shift_forward")
2019-03-31 01:30:00+01:00 a
2019-03-31 03:00:00+02:00 a
2019-03-31 03:30:00+02:00 a
dtype: object

shift_backward

To shift non-existent times backwards to the nearest existing time:

s.tz_localize("CET", nonexistent="shift_backward")
2019-03-31 01:30:00+01:00 a
2019-03-31 01:59:59.999999999+01:00 a
2019-03-31 03:30:00+02:00 a
dtype: object
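The two shift options can be compared side by side. The sketch below rebuilds the same Series and checks where the non-existent 02:30:00 lands in each case.

```python
import pandas as pd

idx = pd.DatetimeIndex(["2019-03-31 01:30:00", "2019-03-31 02:30:00",
                        "2019-03-31 03:30:00"])
s = pd.Series("a", index=idx)

fwd = s.tz_localize("CET", nonexistent="shift_forward")
bwd = s.tz_localize("CET", nonexistent="shift_backward")

# shift_forward lands on the first valid time after the gap (DST offset),
# shift_backward on the last valid time before it (standard offset)
print(fwd.index[1], bwd.index[1])
```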

Here's the same Series s for your reference:

s
2019-03-31 01:30:00 a
2019-03-31 02:30:00 a
2019-03-31 03:30:00 a
dtype: object

timedelta object

To shift non-existent times forwards by one hour:

s.tz_localize("CET", nonexistent=pd.Timedelta("1 hour"))
2019-03-31 01:30:00+01:00 a
2019-03-31 03:30:00+02:00 a
2019-03-31 03:30:00+02:00 a
dtype: object

To shift non-existent times backwards by one hour:

s.tz_localize("CET", nonexistent=-pd.Timedelta("1 hour"))   # notice the "-" there
2019-03-31 01:30:00+01:00 a
2019-03-31 01:30:00+01:00 a
2019-03-31 03:30:00+02:00 a
dtype: object

Note that if the shifted non-existent time is still non-existent, then an error will be thrown:

s.tz_localize("CET", nonexistent=pd.Timedelta("5 minutes"))
ValueError: The nonexistent argument must be one of 'raise', 'NaT', 'shift_forward', 'shift_backward' or a timedelta object

NaT

To convert non-existent times to NaT (not-a-time):

s.tz_localize("CET", nonexistent="NaT")
2019-03-31 01:30:00+01:00 a
NaT a
2019-03-31 03:30:00+02:00 a
dtype: object
Published by Isshin Inada