cumulative sum in pandas dataframe | pandas cumsum method


In this pandas tutorial, we will discuss:

  • the cumulative sum in a pandas DataFrame,
  • the cumsum() method of a pandas DataFrame

Before learning how to get a cumulative sum with the cumsum() method of a pandas DataFrame, let's look at what a DataFrame is.

A DataFrame in pandas is a two-dimensional data structure. One dimension refers to the rows and the other refers to the columns, so it stores data in rows and columns.

We can create a DataFrame using the DataFrame() constructor. It is provided by the pandas module, so we have to import pandas first.

Syntax:

pandas.DataFrame(data)

where data is the input data. It can be a dictionary that maps each column name (key) to a list of values.

Example: Create dataframe

In this example, we will create a DataFrame with 4 rows and 3 columns holding building data, and assign row labels through the index parameter.

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({
    'length': [5.6, 7.8, 4.5, 5.3],
    'breadth': [12.9, 4.5, 21.5, 6.0],
    'area': [20, 56, 43, 45]
}, index=['one', 'two', 'three', 'four'])

#display the dataframe
print(data)

Output:

       length  breadth  area
one       5.6     12.9    20
two       7.8      4.5    56
three     4.5     21.5    43
four      5.3      6.0    45

Let's use this DataFrame to compute cumulative sums with the pandas cumsum() method.


cumsum in pandas dataframe

The cumsum() method of a pandas DataFrame returns the cumulative sum (running total) of the values in the DataFrame.

Syntax:

dataframe.cumsum(axis,skipna)

The cumsum() method returns a DataFrame of the same shape as the input, in which each value is the running total up to that position.

Parameters:

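  1. axis selects the direction of accumulation. axis=0 (the default) computes the running total down the rows of each column; axis=1 computes it across the columns of each row.
  2. skipna takes a boolean value, True (the default) or False. If True, NaN values are skipped: a NaN stays NaN in the result, but the running total continues with the next valid value. If False, the first NaN makes every later value in that column (or row) NaN.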
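
To make the axis parameter concrete, here is a minimal sketch on a small made-up frame. The column names a and b and their values are assumptions chosen for illustration; they are not part of the building data used in this tutorial.

```python
import pandas as pd

# Hypothetical frame with easy-to-follow integer values
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 (default): running total down the rows of each column
down = df.cumsum(axis=0)

# axis=1: running total across the columns of each row
across = df.cumsum(axis=1)

print(down)    # column b becomes 10, 30, 60
print()
print(across)  # column b becomes a + b: 11, 22, 33
```

With axis=1 the first column is unchanged, because nothing precedes it in the row.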

To get the cumulative sum of a single column, select that column first.

Syntax:

dataframe['column'].cumsum()

where dataframe is the input DataFrame and column is the column name.

Used this way, cumsum() returns a Series holding the cumulative sum of the selected column.

Let's look at some examples of how to get a cumulative sum in pandas.

Example 1: get cumulative sum in pandas

Here, we will get the cumulative sum for the entire DataFrame and for two specific columns.

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({
    'length': [5.6, 7.8, 4.5, 5.3],
    'breadth': [12.9, 4.5, 21.5, 6.0],
    'area': [20, 56, 43, 45]
}, index=['one', 'two', 'three', 'four'])

# get cumulative sum in pandas
print(data.cumsum())

print()

# get cumulative sum in pandas from length column
print(data['length'].cumsum())

print()

# get cumulative sum in pandas from area column
print(data['area'].cumsum())

Output:

       length  breadth  area
one       5.6     12.9    20
two      13.4     17.4    76
three    17.9     38.9   119
four     23.2     44.9   164

one       5.6
two      13.4
three    17.9
four     23.2
Name: length, dtype: float64

one       20
two       76
three    119
four     164
Name: area, dtype: int64
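
As a sanity check on the area column above, the sketch below rebuilds the running total with a plain loop and compares it with cumsum(). The variable names running and total are only illustrative. It also shows that the last cumulative value equals the ordinary column sum.

```python
import pandas as pd

area = pd.Series([20, 56, 43, 45],
                 index=["one", "two", "three", "four"], name="area")

# Rebuild the running total by hand
running = []
total = 0
for value in area:
    total += value
    running.append(total)

print(running)                               # [20, 76, 119, 164]
print(running == area.cumsum().tolist())     # True
print(area.cumsum().iloc[-1] == area.sum())  # True
```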

Example 2: get cumulative sum in pandas

Let's get the cumulative sum while controlling how NaN values are handled with the skipna parameter.

import pandas as pd
import numpy as np

# create a dataframe from the building data, with some missing values
data = pd.DataFrame({
    'length': [np.nan, 7.8, 4.5, np.nan],
    'breadth': [12.9, 4.5, 21.5, np.nan],
    'area': [2, np.nan, 56, 43]
}, index=['one', 'two', 'three', 'four'])

# cumulative sum skipping NaN values (skipna=True is the default)
print(data.cumsum(skipna=True))

print()

# cumulative sum where a NaN propagates to all later values
print(data.cumsum(skipna=False))

Output:

       length  breadth   area
one       NaN     12.9    2.0
two       7.8     17.4    NaN
three    12.3     38.9   58.0
four      NaN      NaN  101.0

       length  breadth  area
one       NaN     12.9   2.0
two       NaN     17.4   NaN
three     NaN     38.9   NaN
four      NaN      NaN   NaN
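
The pattern in the two outputs is easier to see on a tiny Series. The sketch below uses made-up values and also shows one possible way, assuming you want missing values counted as zero, to avoid NaN in the result by calling fillna(0) before cumsum(). That is a choice about your data, not something cumsum() does for you.

```python
import pandas as pd
import numpy as np

s = pd.Series([1.0, np.nan, 2.0])

# skipna=True: the NaN position stays NaN, the total carries on afterwards
skipped = s.cumsum(skipna=True)

# skipna=False: everything from the first NaN onwards becomes NaN
propagated = s.cumsum(skipna=False)

# Assumption: missing values should count as 0
filled = s.fillna(0).cumsum()

print(skipped.tolist())     # [1.0, nan, 3.0]
print(propagated.tolist())  # [1.0, nan, nan]
print(filled.tolist())      # [1.0, 1.0, 3.0]
```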

We have seen two examples of how to get a cumulative sum in pandas using the cumsum() method.


About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University. Published 1200+ Technical Articles on Python, R, Swift, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date : Apr 03, 2022