
groupby in pandas dataframe with examples | groupby min max mean sum



In this pandas tutorial, we will discuss:

  • groupby in pandas dataframe,
  • groupby in pandas example,
  • groupby sum in pandas dataframe,
  • pandas groupby max value,
  • pandas groupby minimum value,
  • pandas groupby mean

Before we look at examples such as pandas groupby min, max, mean and sum, let's create a dataframe.

 

A pandas DataFrame is a two-dimensional data structure that stores data in rows and columns: one dimension refers to the rows and the other refers to the columns.

 

We can create a DataFrame using the DataFrame() constructor. It is available in the pandas module, so we have to import pandas first.

Syntax:

pandas.DataFrame(data)

Where data is the input data. It can be a dictionary that maps each column name (the key) to a list of values.

 

Example: Create Dataframe

In this example, we will create a dataframe with 8 rows and 3 columns with student data.

import pandas as pd

# create student data
data = {'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
        'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
        'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53]}

# create dataframe from the student data
data = pd.DataFrame(data)

#display the dataframe
print(data)

Output: Dataframe is created below

  student_names  subject1_marks  subject2_marks
0  sravan kumar              78              98
1        pandit              87              87
2  sravan kumar              68              68
3          sonu              67              57
4          sonu              90              90
5        pandit              76              26
6  sravan kumar              76              76
7  sravan kumar              55              53
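
As a quick sanity check on the "8 rows and 3 columns" description, the following minimal sketch rebuilds the same dataframe and inspects its shape and the number of distinct students. The variable name students is only illustrative and is not used elsewhere in this tutorial.

```python
import pandas as pd

# Same student data as above; 'students' is an illustrative name
students = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# shape is a (rows, columns) tuple
print(students.shape)

# number of distinct values in the column we will group by
print(students['student_names'].nunique())
```

The number of distinct names is the number of groups we should expect once we group by student_names.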

Now let's learn to use groupby in pandas using the above dataframe.


groupby in pandas

groupby in pandas, i.e. the groupby() function, is used to group the rows of a dataframe by the values in one or more columns.

We can group the data and perform different aggregate operations such as sum, min, max and mean on the grouped data.

The groupby() function puts all rows that share the same value in the grouping column into the same group.

Syntax:

dataframe.groupby(['column_name'])

where, 

  1. dataframe is the input dataframe
  2. column_name is the column by whose values the data is grouped.

If we want to get each group together with the row indices that belong to it, we use the groups attribute with the above syntax.

dataframe.groupby(['column_name']).groups

Let's see a groupby in pandas example to get some clarity.

Example: groupby in pandas dataframe example

In this groupby in pandas example, we will get the groups of the grouped data in the dataframe.

import pandas as pd

# create student data
data = {'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
        'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
        'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53]}

# create dataframe from the student data
data = pd.DataFrame(data)

#get the groups
print(data.groupby(['student_names']).groups)

Output: groupby in pandas dataframe example result is given below

In this code, we are grouping the data based on the student_names column. pandit appears in the rows with index 1 and 5, sonu in rows 3 and 4, and sravan kumar in rows 0, 2, 6 and 7.

{'pandit': [1, 5], 'sonu': [3, 4], 'sravan kumar': [0, 2, 6, 7]}
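
The index lists above can be cross-checked with two other attributes of the grouped object. This is a minimal sketch, assuming we group by the column name passed as a plain string; size() counts the rows in each group and get_group() returns the rows of one group. The names students, grouped, group_sizes and sonu_rows are illustrative only.

```python
import pandas as pd

# Same student data as above
students = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# group by a single column, given as a string
grouped = students.groupby('student_names')

# number of rows in each group
group_sizes = grouped.size()
print(group_sizes)

# only the rows that belong to the 'sonu' group
sonu_rows = grouped.get_group('sonu')
print(sonu_rows)
```

The rows returned for sonu keep their original index labels, so they should match the positions listed in the groups output.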

Now let's see another example, this time for groupby sum in pandas dataframe.


groupby() with sum() function: pandas groupby sum

This function returns the sum of each remaining column within each group.

Syntax:

dataframe.groupby(['column_name']).sum()

Example: groupby sum in pandas dataframe example

Group by student_names and get the sum of the two subject marks.

import pandas as pd

# create student data
data = {'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
        'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
        'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53]}

# create dataframe from the student data
data = pd.DataFrame(data)

#groupby student_names to get sum
print(data.groupby(['student_names']).sum())

Output

In this groupby sum in pandas dataframe output, we get the sum of subject1_marks and subject2_marks for each student by grouping on the 'student_names' column.

               subject1_marks  subject2_marks
student_names                                
pandit                    163             113
sonu                      157             147
sravan kumar              277             295
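
If only one subject is of interest, the column can be selected on the grouped object before calling sum(). The sketch below is one possible way to do this; the names students and subject1_totals are illustrative and not part of the example above.

```python
import pandas as pd

# Same student data as above
students = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# select one column after grouping, then sum it per student
subject1_totals = students.groupby('student_names')['subject1_marks'].sum()
print(subject1_totals)
```

The result is a Series indexed by student name, and its values should agree with the subject1_marks column of the output above.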

We can apply groupby sum in pandas dataframe in the same way to other data. Now let's see a pandas groupby max example.


groupby() with max() function: pandas groupby max

This function returns the maximum value of each remaining column within each group.

Syntax:

dataframe.groupby(['column_name']).max()

Example: pandas groupby max example

In this pandas groupby max value example, we will use groupby() on student_names and get the maximum value of the two subject marks.

import pandas as pd

# create student data
data = {'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
        'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
        'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53]}

# create dataframe from the student data
data = pd.DataFrame(data)

#groupby student_names to get max
print(data.groupby(['student_names']).max())

Output: pandas groupby max example result

In this output, we get the maximum of subject1_marks and subject2_marks for each student by grouping on the 'student_names' column.

               subject1_marks  subject2_marks
student_names                                
pandit                     87              87
sonu                       90              90
sravan kumar               78              98
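
To see what the grouped maximum means for a single student, one can filter that student's rows by hand and compare. This is only an illustrative cross-check, not a replacement for groupby(); the names students, group_max, sravan_rows and manual_max are assumptions made for this sketch.

```python
import pandas as pd

# Same student data as above
students = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# maximum of every column per student
group_max = students.groupby('student_names').max()

# manual route: keep only one student's rows, then take the max
sravan_rows = students[students['student_names'] == 'sravan kumar']
manual_max = sravan_rows['subject1_marks'].max()

# both routes should give the same value
print(manual_max == group_max.loc['sravan kumar', 'subject1_marks'])
```

groupby() effectively performs this filter-and-aggregate step for every distinct student at once.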

Now that we have seen how to get the pandas groupby max value, let's move ahead to the pandas groupby minimum value.


groupby() with min() function: pandas groupby min

This pandas groupby min function will return the minimum value in the grouped data.

Syntax:

dataframe.groupby(['column_name']).min()

Example: pandas groupby minimum value example

Group by student_names and get the minimum value of the two subject marks.

import pandas as pd

# create student data
data = {'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
        'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
        'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53]}

# create dataframe from the student data
data = pd.DataFrame(data)

#groupby student_names to get minimum
print(data.groupby(['student_names']).min())

Output: pandas groupby minimum value result is given below

In this pandas groupby minimum value output, we get the minimum of subject1_marks and subject2_marks for each student by grouping on the 'student_names' column.

               subject1_marks  subject2_marks
student_names                                
pandit                     76              26
sonu                       67              57
sravan kumar               55              53
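
In the output above, student_names becomes the index of the result. If it is more convenient to keep it as an ordinary column, groupby() accepts an as_index=False argument. The following is a small sketch of that option; the names students and min_marks are illustrative.

```python
import pandas as pd

# Same student data as above
students = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# as_index=False keeps the grouping column as a regular column
min_marks = students.groupby('student_names', as_index=False).min()
print(min_marks)
```

The minimum values are the same as before; only the layout of the result changes, with a default integer index instead of the student names.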

We have learned how to get the pandas groupby minimum value, so let's see pandas groupby mean.


groupby() with mean() function: pandas groupby mean

This pandas groupby mean function will return the average value in the grouped data.

Syntax:

dataframe.groupby(['column_name']).mean()

Example: pandas groupby mean example

In this pandas groupby mean example, we will apply groupby() on student_names and get the average value of the two subject marks.

import pandas as pd

# create student data
data = {'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu', 'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
        'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
        'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53]}

# create dataframe from the student data
data = pd.DataFrame(data)

#groupby student_names to get average
print(data.groupby(['student_names']).mean())

Output: pandas groupby mean result

In this output, we get the average of subject1_marks and subject2_marks for each student by grouping on the 'student_names' column.

               subject1_marks  subject2_marks
student_names                                
pandit                  81.50           56.50
sonu                    78.50           73.50
sravan kumar            69.25           73.75
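
The four aggregations covered in this tutorial can also be computed in one call with agg(), which accepts a list of function names. This is a minimal sketch rather than the only way to combine them; the names students, summary and pandit_mean are illustrative.

```python
import pandas as pd

# Same student data as above
students = pd.DataFrame({
    'student_names': ['sravan kumar', 'pandit', 'sravan kumar', 'sonu',
                      'sonu', 'pandit', 'sravan kumar', 'sravan kumar'],
    'subject1_marks': [78, 87, 68, 67, 90, 76, 76, 55],
    'subject2_marks': [98, 87, 68, 57, 90, 26, 76, 53],
})

# apply sum, min, max and mean to every subject column per student
summary = students.groupby('student_names').agg(['sum', 'min', 'max', 'mean'])
print(summary)

# columns are (column name, function name) pairs
pandit_mean = summary.loc['pandit', ('subject2_marks', 'mean')]
print(pandit_mean)
```

The result has one row per student and one column for each combination of subject and aggregate function.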

This wraps up our session on groupby in pandas dataframe, groupby in pandas example, groupby sum in pandas dataframe, pandas groupby max value, pandas groupby minimum value, pandas groupby mean.



About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University. Published 1000+ Technical Articles on Python, R, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date: Jun 16, 2022