Pandas DataFrame.describe() Function
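The Pandas DataFrame.describe() function returns descriptive statistics of a DataFrame, such as the count, mean, standard deviation, minimum, maximum, and percentiles of its columns.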
pandas.DataFrame.describe() Syntax
DataFrame.describe(
percentiles=None, include=None, exclude=None, datetime_is_numeric=False
)
Parameters
percentiles | The percentiles to include in the output. All values should be between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
include | The data types to include in the output. It has three options. 'all': all columns of the input are included in the output. A list-like of data types: limits the results to the provided data types. None (default): the results include all numeric columns.
exclude | The data types to exclude from the output. It has two options. A list-like of data types: excludes the provided data types from the result. None (default): nothing is excluded.
datetime_is_numeric | Boolean parameter. It specifies whether datetime columns should be treated as numeric. This parameter was added in pandas 1.1 and removed in pandas 2.0, where datetime columns are always summarized as numeric data.
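As a minimal sketch of the percentiles parameter, the snippet below asks for the 10th, 50th, and 90th percentiles instead of the default quartiles. The DataFrame reuses the numeric columns from this article's examples; the variable names df and stats are only illustrative.

```python
import pandas as pd

# Numeric columns taken from the examples in this article
df = pd.DataFrame({
    "Attendance": [60, 100, 80, 78, 95],
    "Obtained Marks": [90, 75, 82, 64, 45],
})

# Each requested percentile (a value between 0 and 1) becomes one row
stats = df.describe(percentiles=[0.1, 0.5, 0.9])
print(stats)
```

The percentile rows are labeled 10%, 50%, and 90%, and they sit between the min and max rows.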
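Return Value
It returns a statistical summary of the Series or DataFrame it is called on. The result has the same type as the caller: a Series for a Series, and a DataFrame for a DataFrame.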
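Because the summary is itself a pandas object, individual statistics can be read from it by label. The following is a small sketch under that assumption; the names summary, column_summary, and the three result variables are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Attendance": [60, 100, 80, 78, 95],
    "Obtained Marks": [90, 75, 82, 64, 45],
})

# Calling describe() on a DataFrame gives a DataFrame of statistics
summary = df.describe()
# Calling describe() on a single column (a Series) gives a Series
column_summary = df["Attendance"].describe()

# Rows are labeled by statistic name, columns by the original column name
mean_attendance = summary.loc["mean", "Attendance"]
max_marks = summary.loc["max", "Obtained Marks"]
median_attendance = column_summary["50%"]

print(type(summary).__name__, type(column_summary).__name__)
print(mean_attendance, max_marks, median_attendance)
```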
Example Code: DataFrame.describe() Method to Find DataFrame Statistics
import pandas as pd

# Build a small DataFrame with two numeric columns and one string column
dataframe = pd.DataFrame({
    'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
    'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
    'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
})
print("The Original Data frame is: \n")
print(dataframe)

# With no arguments, describe() summarizes only the numeric columns
dataframe1 = dataframe.describe()
print("Statistics are: \n")
print(dataframe1)
Output:
The Original Data frame is:
   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
Statistics are:
       Attendance  Obtained Marks
count    5.000000        5.000000
mean    82.600000       71.200000
std     15.773395       17.484279
min     60.000000       45.000000
25%     78.000000       64.000000
50%     80.000000       75.000000
75%     95.000000       82.000000
max    100.000000       90.000000
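This function returns a statistical summary of the DataFrame. Because we did not pass any arguments, the function used its default values: only the numeric columns (Attendance and Obtained Marks) are summarized, and the non-numeric Name column is left out.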
Example Code: DataFrame.describe() Method to Find Statistics for Each Column
We will use the include parameter to find statistics for all columns.
import pandas as pd

# Same DataFrame as in the previous example
dataframe = pd.DataFrame({
    'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
    'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
    'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
})
print("The Original Data frame is: \n")
print(dataframe)

# include='all' summarizes every column, numeric or not
dataframe1 = dataframe.describe(include='all')
print("Statistics are: \n")
print(dataframe1)
Output:
The Original Data frame is:
   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
Statistics are:
        Attendance   Name  Obtained Marks
count     5.000000      5        5.000000
unique         NaN      5             NaN
top            NaN  Kevin             NaN
freq           NaN      1             NaN
mean     82.600000    NaN       71.200000
std      15.773395    NaN       17.484279
min      60.000000    NaN       45.000000
25%      78.000000    NaN       64.000000
50%      80.000000    NaN       75.000000
75%      95.000000    NaN       82.000000
max     100.000000    NaN       90.000000
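This function returns summary statistics for all columns in the DataFrame. Statistics that do not apply to a column's data type are shown as NaN: the string column Name has no mean or std, and the numeric columns have no unique, top, or freq. Because every name appears exactly once, the top value is chosen arbitrarily from the tied names and may differ between pandas versions.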
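The include parameter also accepts a list of data types instead of 'all'. The sketch below is one way this might look. It sets dtype=object on the Name column explicitly, which is an assumption made here so the example does not depend on how the installed pandas version infers string columns.

```python
import pandas as pd

df = pd.DataFrame({
    "Attendance": [60, 100, 80, 78, 95],
    # Explicit object dtype (an assumption of this sketch, see the text above)
    "Name": pd.Series(["Olivia", "John", "Laura", "Ben", "Kevin"], dtype=object),
    "Obtained Marks": [90, 75, 82, 64, 45],
})

# Summarize only the object (string) columns
text_stats = df.describe(include=[object])
print(text_stats)

# Summarize only the numeric columns, named explicitly this time
numeric_stats = df.describe(include=["number"])
print(numeric_stats)
```

For the object column the summary contains count, unique, top, and freq rather than the numeric statistics.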
Example Code: DataFrame.describe() Method to Find Numeric Column Statistics
Now we will use the exclude parameter to find statistics only for numeric columns.
import pandas as pd

# Same DataFrame as in the previous examples
dataframe = pd.DataFrame({
    'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
    'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
    'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
})
print("The Original Data frame is: \n")
print(dataframe)

# Leave object (string) columns out of the summary
dataframe1 = dataframe.describe(exclude=[object])
print("Statistics are: \n")
print(dataframe1)
Output:
The Original Data frame is:
   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
Statistics are:
       Attendance  Obtained Marks
count    5.000000        5.000000
mean    82.600000       71.200000
std     15.773395       17.484279
min     60.000000       45.000000
25%     78.000000       64.000000
50%     80.000000       75.000000
75%     95.000000       82.000000
max    100.000000       90.000000
We have excluded the object data type, so the string column Name is left out and only the numeric columns are summarized.
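The exclude parameter can also be used the other way around. As a rough sketch, excluding the numeric types leaves only the non-numeric columns in the summary. The second call shows what appears to be an equivalent two-step form using select_dtypes(); the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Attendance": [60, 100, 80, 78, 95],
    "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
    "Obtained Marks": [90, 75, 82, 64, 45],
})

# Exclude every numeric dtype; only the Name column remains
non_numeric_stats = df.describe(exclude=["number"])
print(non_numeric_stats)

# Two-step form: filter the columns by dtype first, then describe what is left
filtered_stats = df.select_dtypes(exclude=["number"]).describe()
print(filtered_stats)
```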