JIYIK CN >

Current Location:Home > Learning > PROGRAM > Python >

Pandas DataFrame DataFrame.set_index() function

Author:JIYIK Last Updated:2025/05/01 Views:

The pandas.DataFrame.set_index() method can be used to set an array or column of appropriate length as the index of a DataFrame, even after the DataFrame is created. The newly set index can replace the existing index or extend the existing index.


pandas.DataFrame.set_index()Method Syntax

DataFrame.set_index(
    keys, drop=True, append=False, inplace=False, verify_integrity=False
)

parameter

keys Column or list of columns to set as index
drop Boolean. The default value Trueis to delete the column to be set as the index.
append Boolean. The default value is False, which specifies whether to append the column to the existing index.
inplace Boolean. If Truetrue, the caller's DataFrame is modified in-place.
verify_integrity Boolean. If true True, raises when creating an index with duplicates ValueError. The default value isFalse

Return Value

Returns a object with the modified index columns if inplace; otherwise .TrueDataFrameNone


Example Code: DataFrame.set_index()Setting the Pandas DataFrame Index Using Pandas Methods

import pandas as pd
fruit_list = [ ('Orange', 34, 'Yes' ,'ABC') ,
             ('Mango', 24, 'No','ABC' ) ,
             ('banana', 14, 'No','ABC' ) ,
             ('Apple', 44, 'Yes',"XYZ" ) ]

df = pd.DataFrame(fruit_list, 
                  columns = ['Name',
                             'Price',
                             'In_Stock',
                             'Supplier']) 
print(df)
df_modified=df.set_index("Name")
print(df_modified)

Output:

        Name  Price In_Stock Supplier
0     Orange     34      Yes      ABC
1      Mango     24       No      ABC
2     banana     14       No      ABC
3      Apple     44      Yes      XYZ
4  Pineapple     64       No      XYZ
5       Kiwi     84      Yes      XYZ
           Price In_Stock Supplier
Name                              
Orange        34      Yes      ABC
Mango         24       No      ABC
banana        14       No      ABC
Apple         44      Yes      XYZ
Pineapple     64       No      XYZ
Kiwi          84      Yes      XYZ

The original Dataframeuses the numeric range as the default index column, while in modified_df, we use set_index()the method to set Namethe column as the index.


Example Code: DataFrame.set_index()Setting in Pandas Methoddrop=False

import pandas as pd
fruit_list = [ ('Orange', 34, 'Yes' ,'ABC') ,
             ('Mango', 24, 'No','ABC' ) ,
             ('banana', 14, 'No','ABC' ) ,
             ('Apple', 44, 'Yes',"XYZ" )  ]

df = pd.DataFrame(fruit_list, 
                  columns = ['Name',
                             'Price',
                             'In_Stock',
                             'Supplier']) 
print(df)

df_modified=df.set_index("Name",drop=False)

print(df_modified)

Output:

     Name  Price In_Stock Supplier
0  Orange     34      Yes      ABC
1   Mango     24       No      ABC
2  banana     14       No      ABC
3   Apple     44      Yes      XYZ
          Name  Price In_Stock Supplier
Name                                   
Orange  Orange     34      Yes      ABC
Mango    Mango     24       No      ABC
banana  banana     14       No      ABC
Apple    Apple     44      Yes      XYZ

If we set_indexset it in the DataFrame's method drop=False, even if Namethe column is set to indexthe column, it is still Dataframea column in .


Example code DataFrame.set_indexis set in Pandas methodinplace=True

import pandas as pd

fruit_list = [ ('Orange', 34, 'Yes' ,'ABC') ,
             ('Mango', 24, 'No','ABC' ) ,
             ('banana', 14, 'No','ABC' ) ,
             ('Apple', 44, 'Yes',"XYZ" )  ]

df = pd.DataFrame(fruit_list, columns = ['Name' , 'Price', 'In_Stock',"Supplier"]) 
print("Before Setting Index:")
print(df)
df.set_index("Name",inplace=True)
print("After Setting Index:")
print(df)

Output:

Before Setting Index:
     Name  Price In_Stock Supplier
0  Orange     34      Yes      ABC
1   Mango     24       No      ABC
2  banana     14       No      ABC
3   Apple     44      Yes      XYZ
After Setting Index:
        Price In_Stock Supplier
Name                           
Orange     34      Yes      ABC
Mango      24       No      ABC
banana     14       No      ABC
Apple      44      Yes      XYZ

If we set_index()set it in the method inplace=True, the caller DataFramewill be modified in place.


Example Code: DataFrame.set_index()Setting a Multi-Index Column Using Pandas Methods

import pandas as pd

fruit_list = [ ('Orange', 34, 'Yes' ,'ABC') ,
             ('Mango', 24, 'No','ABC' ) ,
             ('banana', 14, 'No','ABC' ) ,
             ('Apple', 44, 'Yes',"XYZ" )  ]

df = pd.DataFrame(fruit_list, columns = ['Name' , 'Price', 'In_Stock',"Supplier"]) 
print("Before Setting Index:")
print(df)
df.set_index("Name",append=True,inplace=True,drop=False)
print("After Setting Index:")
print(df)

Output:

Before Setting Index:
     Name  Price In_Stock Supplier
0  Orange     34      Yes      ABC
1   Mango     24       No      ABC
2  banana     14       No      ABC
3   Apple     44      Yes      XYZ
After Setting Index:
            Name  Price In_Stock Supplier
  Name                                   
0 Orange  Orange     34      Yes      ABC
1 Mango    Mango     24       No      ABC
2 banana  banana     14       No      ABC
3 Apple    Apple     44      Yes      XYZ

If we set_index()set it in the method append=True, the newly set index column will be appended to the existing index, and a single DataFramehas multiple index columns.


Example code: Pandas behavior Dataframe.set_index()when verify_integrityisTrue

import pandas as pd

fruit_list = [
    ("Orange", 34, "Yes", "ABC"),
    ("Mango", 24, "No", "ABC"),
    ("Apple", 14, "No", "ABC"),
    ("Apple", 44, "Yes", "XYZ"),
]

df = pd.DataFrame(fruit_list, columns=["Name", "Price", "In_Stock", "Supplier"])

df_modified = df.set_index("Name", verify_integrity=True)
print(df_modified)

Output:

Traceback (most recent call last):
  .....line 3920, in set_index
    dup=duplicates))
ValueError: Index has duplicate keys: Index(['Apple'], dtype='object', name='Name')

AppleIt is raised because the index has a duplicate index - ValueError. It has two Appleset as indexes in the column; therefore, it raises an error if set_index()in the method verify_integrityis set to .True

For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.

Article URL:

Related Articles

Convert Pandas to CSV without index

Publish Date:2025/05/01 Views:159 Category:Python

As you know, an index can be thought of as a reference point used to store and access records in a DataFrame. They are unique for each row and usually range from 0 to the last row of the DataFrame, but we can also have serial numbers, dates

Convert Pandas DataFrame to Dictionary

Publish Date:2025/05/01 Views:197 Category:Python

This tutorial will show you how to convert a Pandas DataFrame into a dictionary with the index column elements as keys and the corresponding elements of other columns as values. We will use the following DataFrame in the article. import pan

Convert Pandas DataFrame columns to lists

Publish Date:2025/05/01 Views:191 Category:Python

When working with Pandas DataFrames in Python, you often need to convert the columns of the DataFrame into Python lists. This process is very important for various data manipulation and analysis tasks. Fortunately, Pandas provides several m

Subtracting Two Columns in Pandas DataFrame

Publish Date:2025/05/01 Views:120 Category:Python

Pandas can handle very large data sets and has a variety of functions and operations that can be applied to the data. One of the simple operations is to subtract two columns and store the result in a new column, which we will discuss in thi

Dropping columns by index in Pandas DataFrame

Publish Date:2025/05/01 Views:99 Category:Python

DataFrames can be very large and can contain hundreds of rows and columns. It is necessary to master the basic maintenance operations of DataFrames, such as deleting multiple columns. We can use dataframe.drop() the method to delete columns

Pandas Copy DataFrame

Publish Date:2025/05/01 Views:53 Category:Python

This tutorial will show you how to DataFrame.copy() copy a DataFrame object using the copy method. import pandas as pd items_df = pd . DataFrame( { "Id" : [ 302 , 504 , 708 ], "Cost" : [ "300" , "400" , "350" ], } ) print (items_df) Output:

Pandas DataFrame.ix[] Function

Publish Date:2025/05/01 Views:168 Category:Python

Python Pandas DataFrame.ix[] function slices rows or columns based on the value of the argument. pandas.DataFrame.ix[] grammar DataFrame . ix[index = None , label = None ] parameter index Integer or list of integers used to slice row indice

Pandas DataFrame.describe() Function

Publish Date:2025/05/01 Views:120 Category:Python

Python Pandas DataFrame.describe() function returns the statistics of a DataFrame. pandas.DataFrame.describe() grammar DataFrame . describe( percentiles = None , include = None , exclude = None , datetime_is_numeric = False ) parameter perc

Pandas DataFrame.astype() Function

Publish Date:2025/05/01 Views:160 Category:Python

Python Pandas DataFrame.astype() function changes the data type of an object to the specified data type. pandas.DataFrame.astype() grammar DataFrame . astype(dtype, copy = True , errors = "raise" ) parameter dtype The data type we want to a

Scan to Read All Tech Tutorials

Social Media
  • https://www.github.com/onmpw
  • qq:1244347461

Recommended

Tags

Scan the Code
Easier Access Tutorial