How to get variance and standard deviation in pandas DataFrame? | var() & std() in pandas

In this pandas tutorial, we will discuss:

  • how to get the variance in pandas
  • how to get the standard deviation in pandas
  • the var method in pandas
  • the std method in pandas

We will use the var() and std() methods of a pandas DataFrame to calculate the variance and the standard deviation.

 

A DataFrame in pandas is a two-dimensional data structure. One dimension refers to the rows and the other to the columns, so the data is stored in rows and columns.

 

We can create a DataFrame using the DataFrame() constructor. It is part of the pandas module, so we have to import pandas first.

Syntax:

pandas.DataFrame(data)

where data is the input data. It can be a dictionary that maps each column name (key) to a list of values.

 

Example: Create DataFrame

In this example, we will create a dataframe with 4 rows and 4 columns of building data and assign row labels through the index parameter.

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({'building-id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'length': [5.6, 7.8, 4.5, 5.3],
                     'breadth': [12.9, 4.5, 21.5, 6.0],
                     'area': [20, 56, 43, 45]},
                    index=['one', 'two', 'three', 'four'])

# display the dataframe
print(data)

Output: dataframe is created

      building-id  length  breadth  area
one         c-001     5.6     12.9    20
two         c-021     7.8      4.5    56
three       c-002     4.5     21.5    43
four        c-004     5.3      6.0    45

Now let's see how to get the variance and the standard deviation in pandas.


var in pandas

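We can get the variance by using the var() method in pandas.

Syntax: var method in pandas

dataframe.var(axis=0, numeric_only=False)

where dataframe is the input dataframe.

  1. axis=0 (the default) computes the variance down the rows and returns one value per column.
  2. axis=1 computes the variance across the columns and returns one value per row.

Set numeric_only=True to skip non-numeric columns such as building-id. In pandas 2.0 and later, a text column otherwise raises a TypeError. By default, var() returns the sample variance (ddof=1).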
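A minimal sketch of how the axis argument behaves, using a small hypothetical numeric frame (the names df, a, b and the values are illustrative and are not part of the building data). axis=0 collapses the rows and gives one variance per column, while axis=1 collapses the columns and gives one variance per row.

```python
import pandas as pd

# Hypothetical numeric-only frame with values chosen for easy arithmetic.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 6.0]},
                  index=["r1", "r2", "r3"])

# axis=0: collapse the rows, one sample variance per column.
per_column = df.var(axis=0)   # a -> 1.0, b -> 4.0

# axis=1: collapse the columns, one sample variance per row.
per_row = df.var(axis=1)      # r1 -> 0.5, r2 -> 2.0, r3 -> 4.5

print(per_column)
print(per_row)
```

The result of axis=0 is indexed by column name and the result of axis=1 by row label, which is a quick way to confirm which direction was reduced.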

Example 1: calculate variance in pandas

Let's calculate the variance across the columns, one value per row (axis=1).

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({'building-id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'length': [5.6, 7.8, 4.5, 5.3],
                     'breadth': [12.9, 4.5, 21.5, 6.0],
                     'area': [20, 56, 43, 45]},
                    index=['one', 'two', 'three', 'four'])

# variance of each row; numeric_only=True skips the text column building-id
print(data.var(axis=1, numeric_only=True))

Output: the variance of each row of the above dataframe is given below

one       51.843333
two      831.063333
three    372.250000
four     516.263333
dtype: float64

Example 2: calculate variance in pandas

Let's calculate the variance down the rows, one value per column (axis=0).

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({'building-id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'length': [5.6, 7.8, 4.5, 5.3],
                     'breadth': [12.9, 4.5, 21.5, 6.0],
                     'area': [20, 56, 43, 45]},
                    index=['one', 'two', 'three', 'four'])

# variance of each column; numeric_only=True skips the text column building-id
print(data.var(axis=0, numeric_only=True))

Output: the variance of each numeric column is given below

length       1.993333
breadth     60.302500
area       228.666667
dtype: float64

Note: to get the variance of one particular column, select it by name. The selection is a Series, which has only one axis, so the axis argument can be left out.

Example 3: calculate variance in pandas

Let's calculate the variance of a single column selected by name.

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({'building-id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'length': [5.6, 7.8, 4.5, 5.3],
                     'breadth': [12.9, 4.5, 21.5, 6.0],
                     'area': [20, 56, 43, 45]},
                    index=['one', 'two', 'three', 'four'])

# get the variance of the length column only
print(data['length'].var())

Output: the variance of the length column is given below

1.9933333333333332
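The value above can be cross-checked by hand. The sketch below assumes the pandas default of ddof=1 (sample variance, dividing by n - 1) and compares it with the population variance (ddof=0, dividing by n). The variable names are illustrative.

```python
import pandas as pd

# The length column from the building data.
length = pd.Series([5.6, 7.8, 4.5, 5.3])

# Sum of squared deviations from the mean.
mean = length.mean()
squared_dev = ((length - mean) ** 2).sum()

# Sample variance divides by n - 1 (ddof=1, the pandas default).
sample_var = squared_dev / (len(length) - 1)

# Population variance divides by n (ddof=0).
population_var = squared_dev / len(length)

print(sample_var, length.var())
print(population_var, length.var(ddof=0))
```

Each printed pair should agree up to floating-point rounding, which shows that var() divides by n - 1 unless ddof is changed.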

Thus we have seen three examples of how to get the variance in pandas using var().


std method in pandas

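We can get the standard deviation by using the std() method in pandas.

Syntax: std method in pandas

dataframe.std(axis=0, numeric_only=False)

where dataframe is the input dataframe.

  1. axis=0 (the default) computes the standard deviation down the rows and returns one value per column.
  2. axis=1 computes the standard deviation across the columns and returns one value per row.

Set numeric_only=True to skip non-numeric columns such as building-id. In pandas 2.0 and later, a text column otherwise raises a TypeError. By default, std() returns the sample standard deviation (ddof=1).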
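The standard deviation is the square root of the variance. The sketch below checks that relationship on the numeric columns of the building data; using NumPy's sqrt here is an illustrative choice, not something the method requires.

```python
import numpy as np
import pandas as pd

# Numeric columns of the building data.
data = pd.DataFrame({"length": [5.6, 7.8, 4.5, 5.3],
                     "breadth": [12.9, 4.5, 21.5, 6.0],
                     "area": [20, 56, 43, 45]})

variance = data.var()   # one sample variance per column
std_dev = data.std()    # one sample standard deviation per column

# Taking the square root of each variance should reproduce std().
print(np.sqrt(variance))
print(std_dev)
```

Both printed Series should match, since var() and std() use the same ddof=1 default.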

Example 1: calculate standard deviation in pandas

Let's calculate the standard deviation across the columns, one value per row (axis=1).

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({'building-id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'length': [5.6, 7.8, 4.5, 5.3],
                     'breadth': [12.9, 4.5, 21.5, 6.0],
                     'area': [20, 56, 43, 45]},
                    index=['one', 'two', 'three', 'four'])

# standard deviation of each row; numeric_only=True skips the text column building-id
print(data.std(axis=1, numeric_only=True))

Output: the standard deviation of each row is given below

one       7.200231
two      28.828169
three    19.293781
four     22.721429
dtype: float64

Example 2: calculate standard deviation in pandas

Let's calculate the standard deviation down the rows, one value per column (axis=0).

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({'building-id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'length': [5.6, 7.8, 4.5, 5.3],
                     'breadth': [12.9, 4.5, 21.5, 6.0],
                     'area': [20, 56, 43, 45]},
                    index=['one', 'two', 'three', 'four'])

# standard deviation of each column; numeric_only=True skips the text column building-id
print(data.std(axis=0, numeric_only=True))

Output: the standard deviation of each numeric column is given below

length      1.411855
breadth     7.765468
area       15.121728
dtype: float64

Note: to get the standard deviation of one particular column, select it by name. The selection is a Series, which has only one axis, so the axis argument can be left out.

Example 3: calculate standard deviation in pandas

Let's calculate the standard deviation of a single column selected by name.

import pandas as pd

# create a dataframe from the building data
data = pd.DataFrame({'building-id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'length': [5.6, 7.8, 4.5, 5.3],
                     'breadth': [12.9, 4.5, 21.5, 6.0],
                     'area': [20, 56, 43, 45]},
                    index=['one', 'two', 'three', 'four'])

# get the standard deviation of the length column only
print(data['length'].std())

Output: the standard deviation of the length column is given below

1.411854572303158

Thus we have learned how to get the standard deviation in pandas using std().
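As a closing sketch, both statistics can be computed together. This assumes select_dtypes is used to drop the text column and agg is used to apply var and std in one call. It is one possible approach among several, and the name summary is illustrative.

```python
import pandas as pd

# The building data used throughout this tutorial.
data = pd.DataFrame({"building-id": ["c-001", "c-021", "c-002", "c-004"],
                     "length": [5.6, 7.8, 4.5, 5.3],
                     "breadth": [12.9, 4.5, 21.5, 6.0],
                     "area": [20, 56, 43, 45]},
                    index=["one", "two", "three", "four"])

# Keep only the numeric columns, then compute variance and standard
# deviation for each column in one call.
summary = data.select_dtypes(include="number").agg(["var", "std"])

print(summary)
```

The result is a small dataframe with one row for var and one for std, and one column per numeric column of the input.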


About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University. Published 1400+ Technical Articles on Python, R, Swift, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date: Apr 02, 2022