groupby in pandas dataframe with examples | groupby min max mean sum
In this pandas tutorial, we will discuss:
- groupby in a pandas dataframe
- a groupby example in pandas
- groupby sum in a pandas dataframe
- pandas groupby max value
- pandas groupby minimum value
- pandas groupby mean
Before we look at examples such as pandas groupby min, max, mean, and sum, let's create a dataframe.
A pandas DataFrame is a two-dimensional data structure that stores data in rows and columns: one dimension refers to the rows and the other to the columns.
We can create a DataFrame by calling the DataFrame() constructor. It is part of the pandas module, so we have to import pandas first.
Syntax:
pandas.DataFrame(data)
where data is the input data. It can be a dictionary that maps each column name (the key) to a list of values.
Example: Create Dataframe
In this example, we will create a dataframe with 8 rows and 3 columns with student data.
import pandas as pd
# create student data
data = {'student_names':['sravan kumar','pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
'subject1_marks':[78,87,68,67,90,76,76,55],
'subject2_marks':[98,87,68,57,90,26,76,53],}
# create dataframe from the student data
data = pd.DataFrame(data)
#display the dataframe
print(data)
Output: the created dataframe is shown below
   student_names  subject1_marks  subject2_marks
0   sravan kumar              78              98
1         pandit              87              87
2   sravan kumar              68              68
3           sonu              67              57
4           sonu              90              90
5         pandit              76              26
6   sravan kumar              76              76
7   sravan kumar              55              53
Now let's learn to use groupby in pandas with the above dataframe.
groupby in pandas
groupby() in pandas groups the rows of a dataframe according to the values in one or more columns.
Rows that share the same value in the grouping column end up in the same group.
We can then perform aggregate operations such as sum, min, max, and mean on each group.
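To get a feel for the idea, here is a minimal sketch in plain Python of grouping followed by aggregation. It only illustrates the concept and is not how pandas implements groupby() internally; the names marks_by_student and totals are made up for this illustration.

```python
from collections import defaultdict

# the same names and subject1 marks used in the dataframe above
names = ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
         'sonu', 'pandit', 'sravan kumar', 'sravan kumar']
marks = [78, 87, 68, 67, 90, 76, 76, 55]

# step 1: split the marks into groups keyed by student name
marks_by_student = defaultdict(list)  # hypothetical helper name
for name, mark in zip(names, marks):
    marks_by_student[name].append(mark)

# step 2: apply an aggregate (here, sum) to every group
totals = {name: sum(values) for name, values in marks_by_student.items()}
print(totals)
```

pandas performs both steps for us when we chain an aggregate function after groupby().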
Syntax:
dataframe.groupby(['column_name'])
where,
-
dataframe is the input dataframe
-
column_name is the column whose values are used to group the data.
If we want to see which row indices belong to each group, we can use the groups attribute with the above syntax.
dataframe.groupby(['column_name']).groups
Let's see a groupby example in pandas to make this clearer.
Example: groupby in pandas dataframe example
In this groupby in pandas example, we will get the groups of the grouped data in the dataframe.
import pandas as pd
# create student data
data = {'student_names':['sravan kumar','pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
'subject1_marks':[78,87,68,67,90,76,76,55],
'subject2_marks':[98,87,68,57,90,26,76,53],}
# create dataframe from the student data
data = pd.DataFrame(data)
#get the groups
print(data.groupby(['student_names']).groups)
Output: the result of the groupby example is given below
In this code, we are grouping the data based on the student_names column. pandit appears at indices 1 and 5, sonu at indices 3 and 4, and sravan kumar at indices 0, 2, 6 and 7.
{'pandit': [1, 5], 'sonu': [3, 4], 'sravan kumar': [0, 2, 6, 7]}
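Besides looking at the indices, we may want the actual rows of a group. The following is a small sketch, assuming the same student data, that uses get_group() to fetch one group and a loop to visit every group. The variable names df, grouped, pandit_rows and group_sizes are chosen only for this illustration.

```python
import pandas as pd

# same student data as in the examples above
df = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# group by a single column name
grouped = df.groupby('student_names')

# get_group() returns the rows that belong to one group label
pandit_rows = grouped.get_group('pandit')
print(pandit_rows)

# iterating over the grouped object yields (label, sub-dataframe) pairs
group_sizes = {name: len(rows) for name, rows in grouped}
print(group_sizes)
```

Here pandit_rows holds the rows at indices 1 and 5, which matches the groups output above.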
Now let's see an example of groupby sum in a pandas dataframe.
groupby() with sum() function: pandas groupby sum
This function returns the sum of each column within each group.
Syntax:
dataframe.groupby(['column_name']).sum()
Example: groupby sum in pandas dataframe example
In this example, we will group by student_names and get the sum of the two subject marks.
import pandas as pd
# create student data
data = {'student_names':['sravan kumar','pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
'subject1_marks':[78,87,68,67,90,76,76,55],
'subject2_marks':[98,87,68,57,90,26,76,53],}
# create dataframe from the student data
data = pd.DataFrame(data)
#groupby student_names to get sum
print(data.groupby(['student_names']).sum())
Output:
In this output, we get the sum of subject1_marks and subject2_marks for each group in the 'student_names' column.
               subject1_marks  subject2_marks
student_names
pandit                    163             113
sonu                      157             147
sravan kumar              277             295
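The call above aggregates every remaining column. If we only need one column, a common pattern is to select it after groupby() and before the aggregate. The sketch below assumes the same student data; df and subject1_totals are names chosen for this illustration.

```python
import pandas as pd

# same student data as in the examples above
df = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# select a single column from the grouped data, then sum it per group
subject1_totals = df.groupby('student_names')['subject1_marks'].sum()
print(subject1_totals)
```

The result is a Series indexed by student name. The same column selection works with max(), min() and mean().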
We can apply groupby sum in the same way to other dataframes. Now let's see a pandas groupby max example.
groupby() with max() function: pandas groupby max
This function returns the maximum value of each column within each group.
Syntax:
dataframe.groupby(['column_name']).max()
Example: pandas groupby max example
In this pandas groupby max value example, we will use groupby() on student_names and get the maximum value of the two subject marks.
import pandas as pd
# create student data
data = {'student_names':['sravan kumar','pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
'subject1_marks':[78,87,68,67,90,76,76,55],
'subject2_marks':[98,87,68,57,90,26,76,53],}
# create dataframe from the student data
data = pd.DataFrame(data)
#groupby student_names to get max
print(data.groupby(['student_names']).max())
Output: pandas groupby max example result
In this output, we get the maximum of subject1_marks and subject2_marks for each group in the 'student_names' column.
               subject1_marks  subject2_marks
student_names
pandit                     87              87
sonu                       90              90
sravan kumar               78              98
Now that we have seen how to get the pandas groupby max value, let's move on to the pandas groupby minimum value.
groupby() with min() function: pandas groupby min
The pandas groupby min function returns the minimum value of each column within each group.
Syntax:
dataframe.groupby(['column_name']).min()
Example: pandas groupby minimum value example
In this example, we will group by student_names and get the minimum value of the two subject marks.
import pandas as pd
# create student data
data = {'student_names':['sravan kumar','pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
'subject1_marks':[78,87,68,67,90,76,76,55],
'subject2_marks':[98,87,68,57,90,26,76,53],}
# create dataframe from the student data
data = pd.DataFrame(data)
#groupby student_names to get minimum
print(data.groupby(['student_names']).min())
Output: pandas groupby minimum value result is given below
In this output, we get the minimum of subject1_marks and subject2_marks for each group in the 'student_names' column.
               subject1_marks  subject2_marks
student_names
pandit                     76              26
sonu                       67              57
sravan kumar               55              53
We have learned how to get the pandas groupby minimum value, so let's look at pandas groupby mean.
groupby() with mean() function: pandas groupby mean
The pandas groupby mean function returns the average value of each column within each group.
Syntax:
dataframe.groupby(['column_name']).mean()
Example: pandas groupby mean example
In this pandas groupby mean example, we will apply groupby() on student_names and get the average value of the two subject marks.
import pandas as pd
# create student data
data = {'student_names':['sravan kumar','pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
'subject1_marks':[78,87,68,67,90,76,76,55],
'subject2_marks':[98,87,68,57,90,26,76,53],}
# create dataframe from the student data
data = pd.DataFrame(data)
#groupby student_names to get average
print(data.groupby(['student_names']).mean())
Output: pandas groupby mean result
In this output, we get the average of subject1_marks and subject2_marks for each group in the 'student_names' column.
               subject1_marks  subject2_marks
student_names
pandit                  81.50           56.50
sonu                    78.50           73.50
sravan kumar            69.25           73.75
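If we need several of these aggregates at once, one option is the agg() method, which accepts a list of function names. The sketch below assumes the same student data and computes min, max, mean and sum of subject1_marks in a single call; df and summary are names chosen for this illustration.

```python
import pandas as pd

# same student data as in the examples above
df = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# compute several aggregates of one column in a single call;
# each function name becomes a column of the result
summary = df.groupby('student_names')['subject1_marks'].agg(['min', 'max', 'mean', 'sum'])
print(summary)
```

Each row of summary corresponds to one student, and its values agree with the separate min, max, mean and sum outputs shown earlier for subject1_marks.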
This wraps up our session on groupby in a pandas dataframe, covering a groupby example, groupby sum, groupby max value, groupby minimum value, and groupby mean.
About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University.
Published 1200+ Technical Articles on Python, R, Swift, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date :
Jun 16,2022