Pandas DataFrame | set_index method

Last updated: Aug 11, 2023

Pandas's DataFrame.set_index(~) sets the index of the DataFrame using one or more of its columns.

Parameters

1. keys | string or array-like or list<string>

The name(s) of the column(s) to use for the index. An array-like of the same length as the DataFrame is also accepted.

2. drop | boolean | optional

  • If True, then the column used for the index will be removed from the columns.

  • If False, then the column will be retained.

By default, drop=True.

3. append | boolean | optional

  • If True, then the columns will be appended to the current index.

  • If False, then the columns will replace the current index.

By default, append=False.

4. inplace | boolean | optional

  • If True, then the source DataFrame will be modified directly and None is returned.

  • If False, then a new DataFrame will be returned.

By default, inplace=False.

5. verify_integrity | boolean | optional

  • If True, then an error is raised if the new index has duplicates.

  • If False, then duplicate indexes are allowed.

By default, verify_integrity=False.
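
As a rough illustration of the keys parameter accepting more than column names, the sketch below passes an array-like and then mixes a column name with an array-like. The labels variable and its values are made up for this example and are not part of the original guide.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Hypothetical labels that are not a column of df (same length as df)
labels = pd.Index(["x", "y"], name="label")

# An array-like can be used directly as the new index;
# no column is removed because none was used
by_array = df.set_index(labels)
print(by_array)

# Column names and array-likes can be mixed in a list
mixed = df.set_index(["A", labels])
print(mixed.index.names)
```

In the second call only column A is removed from the columns, since labels never belonged to df.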

Return Value

A DataFrame with a new index. If inplace=True, then None is returned.

Examples

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({"A":[1,2], "B":[3,4], "C":[5,6]})
df
   A  B  C
0  1  3  5
1  2  4  6

Setting a single column as the index

To set column A as the index of df:

df.set_index("A")      # Returns a DataFrame
   B  C
A
1  3  5
2  4  6

Here, the name assigned to the index is the column label, that is, "A".
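
A small sketch of how the index name can be checked, and how the new index then supports label-based lookups. The variable name indexed is chosen just for this example.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
indexed = df.set_index("A")

# The index takes its name from the column it was built from
print(indexed.index.name)    # A

# Values of the former column A now act as row labels
print(indexed.loc[2, "B"])   # 4

# The source DataFrame is left untouched
print(df.columns.tolist())   # ['A', 'B', 'C']
```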

Setting multiple columns as the index

To set columns A and B as the index of df:

df.set_index(["A","B"])
      C
A  B
1  3  5
2  4  6

Here, the DataFrame ends up with a MultiIndex that has two levels, A and B.
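
To see what the resulting index looks like, the following sketch inspects it and looks up a row by passing one value per level as a tuple. The variable name multi is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
multi = df.set_index(["A", "B"])

# The index is a MultiIndex with one level per column used
print(type(multi.index).__name__)   # MultiIndex
print(multi.index.nlevels)          # 2
print(list(multi.index.names))      # ['A', 'B']

# Rows are addressed with a tuple holding one label per level
print(multi.loc[(1, 3), "C"])       # 5
```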

Keeping the column used for the index

To keep the column that will be used as the index, set drop=False:

df.set_index("A", drop=False)
   A  B  C
A
1  1  3  5
2  2  4  6

Notice how the column A is still there.
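
One consequence worth knowing, sketched below: with drop=False the label A exists both as the index name and as a column, so a plain reset_index() would try to insert a second column A and is expected to fail. The variable names here are chosen for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
kept = df.set_index("A", drop=False)

print(kept.columns.tolist())   # ['A', 'B', 'C']
print(kept.index.name)         # A

# Moving the index back into the columns would duplicate the label "A"
try:
    kept.reset_index()
    reset_failed = False
except ValueError:
    reset_failed = True
print(reset_failed)

# Discarding the index instead avoids the clash
restored = kept.reset_index(drop=True)
print(restored)
```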

For reference, here's df again:

df
   A  B  C
0  1  3  5
1  2  4  6

Appending to the current index

To append a column to the existing index, set append=True:

df.set_index("A", append=True)
      B  C
   A
0  1  3  5
1  2  4  6

Notice how column A has been appended to the original index [0,1], so the result is a two-level index.
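
As a quick check of what append=True produces, this sketch prints the level names and the index entries. The original index has no name, so its level is shown as None.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
appended = df.set_index("A", append=True)

# First level: the original (unnamed) index; second level: column A
print(list(appended.index.names))   # [None, 'A']
print(appended.index.tolist())      # [(0, 1), (1, 2)]
```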

Setting an index in-place

To set an index in-place, supply inplace=True:

df.set_index("A", inplace=True)
df
   B  C
A
1  3  5
2  4  6

As shown in the output above, setting inplace=True modifies the source DataFrame directly, and the method returns None instead of a DataFrame. Note that inplace=True is not guaranteed to save memory, since pandas may still create a copy internally, so reassigning the result (df = df.set_index("A")) is an equally valid choice.
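
A common pitfall with inplace=True is assigning the return value back to a variable. The sketch below shows that the call returns None, and compares it with the reassignment style. The names result and df2 are only used for this example.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# With inplace=True nothing is returned; df itself is changed
result = df.set_index("A", inplace=True)
print(result)                 # None
print(df.columns.tolist())    # ['B', 'C']

# The same outcome using reassignment instead of inplace=True
df2 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df2 = df2.set_index("A")
print(df.equals(df2))         # True
```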

Verifying integrity

Consider the following DataFrame:

df = pd.DataFrame({"A":[1,1],"B":[3,4]})
df
   A  B
0  1  3
1  1  4

By default, verify_integrity=False, which means that no error will be thrown if the resulting index contains duplicates:

df.set_index("A")   # verify_integrity=False
   B
A
1  3
1  4

Notice how the new index contains duplicate values (two 1s), but no error was thrown.
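
Duplicate labels can be detected after the fact as well. This sketch checks the index for uniqueness and shows that a lookup on a duplicated label returns every matching row. The variable name dup is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1], "B": [3, 4]})
dup = df.set_index("A")

# is_unique reports whether any label is repeated
print(dup.index.is_unique)   # False

# Looking up a duplicated label returns all rows that carry it
rows = dup.loc[1]
print(rows)
print(len(rows))             # 2
```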

To throw an error in such cases, pass verify_integrity=True like so:

df.set_index("A", verify_integrity=True)
ValueError: Index has duplicate keys: Int64Index([1], dtype='int64', name='A')
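
In a script, the error can be caught so that duplicates are handled explicitly. The following is a minimal sketch; the error_message variable is just for this example, and the exact wording of the message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1], "B": [3, 4]})

error_message = None
try:
    df.set_index("A", verify_integrity=True)
except ValueError as err:
    # Raised because column A contains the value 1 twice
    error_message = str(err)

print(error_message)
```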
Published by Isshin Inada