Replace NaN values with 0 in pandas DataFrame
In this pandas tutorial, we will discuss how to replace NaN values with 0 in a pandas DataFrame.
Introduction
A DataFrame in pandas is a two-dimensional data structure. One dimension refers to rows and the other to columns, so the data is stored in rows and columns.
We can create a DataFrame using the DataFrame() constructor. It is part of the pandas module, so we have to import pandas first.
Syntax:
pandas.DataFrame(data)
Where data is the input data. It can be, for example, a dictionary that maps each column label (key) to a list of values, or a list of rows.
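As a small illustration of the dictionary form described above, the sketch below builds a DataFrame from a dictionary of lists. The variable names and the sample values are made up for this example.

```python
import pandas

# Hypothetical sample data: each key becomes a column label,
# and each list holds the values of that column.
college_data = {
    "college": ["vvit", "RVR - JC", "Andhra University"],
    "strength": [1200, 3422, 5644],
}

frame = pandas.DataFrame(college_data)

print(frame.shape)          # (3, 2): 3 rows, 2 columns
print(list(frame.columns))  # ['college', 'strength']
```

The dictionary keys are used as column labels, so no separate step is needed to name the columns.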
NaN stands for "Not a Number" and is the value pandas uses to mark missing data. We can create NaN by using numpy.nan, which comes from the numpy module.
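Before replacing NaN, it can help to see how it behaves and how to detect it. The following is a minimal sketch; the one-column frame and its "marks" label are hypothetical and only serve the illustration.

```python
import numpy
import pandas

missing = numpy.nan

# NaN is a float, and it never compares equal to anything, itself included.
print(type(missing))         # <class 'float'>
print(missing == missing)    # False

# Use pandas.isna() (or DataFrame.isna()) to detect NaN instead of ==.
print(pandas.isna(missing))  # True

# Hypothetical one-column frame used to count the missing entries.
frame = pandas.DataFrame({"marks": [90.0, numpy.nan, 75.0, numpy.nan]})
nan_count = int(frame["marks"].isna().sum())
print(nan_count)             # 2
```

Because NaN is not equal to itself, a comparison such as value == numpy.nan cannot be used to find missing entries.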
First, let's create a DataFrame that contains NaN values. After that, we will see how to replace those NaN values with 0.
#importing pandas and numpy modules
import pandas
import numpy
#create dataframe from the college data
data= pandas.DataFrame([[numpy.nan,'c-021','c-002',numpy.nan],
[numpy.nan,"vvit","RVR - JC","Andhra University"],
["guntur",numpy.nan,numpy.nan,"guntur"],
[1200,3422,5644,670]
])
#add columns
data=data.set_axis(['col1','col2','col3','col4'],axis=1)
#display
print(data)
In this example, we created a DataFrame with 4 rows and 4 columns and assigned the column labels col1 to col4.
Output:
     col1   col2      col3               col4
0     NaN  c-021     c-002                NaN
1     NaN   vvit  RVR - JC  Andhra University
2  guntur    NaN       NaN             guntur
3    1200   3422      5644                670
Method 1 : Using fillna()
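fillna() replaces NaN values with the given value. Here the value is 0. By default it returns a new DataFrame and leaves the original unchanged, so the result has to be assigned back to a variable.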
Syntax:
dataframe_input.fillna(0)
If we want to replace NaN in a particular column only, we have to select that column by its label.
Syntax:
dataframe_input['column_label'].fillna(0)
Example 1: In this example, we will fill all NaN with 0.
#importing pandas and numpy modules
import pandas
import numpy
#create dataframe from the college data
data= pandas.DataFrame([[numpy.nan,'c-021','c-002',numpy.nan],
[numpy.nan,"vvit","RVR - JC","Andhra University"],
["guntur",numpy.nan,numpy.nan,"guntur"],
[1200,3422,5644,670]
])
#add columns
data=data.set_axis(['col1','col2','col3','col4'],axis=1)
#fill NaN with 0's
data=data.fillna(0)
#display
print(data)
Output:
     col1   col2      col3               col4
0       0  c-021     c-002                  0
1       0   vvit  RVR - JC  Andhra University
2  guntur      0         0             guntur
3    1200   3422      5644                670
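The college data above mixes text and numbers, so its columns hold generic objects. For a purely numeric column the behaviour is slightly different, as the sketch below shows. The "strength" column is a hypothetical example, not part of the tutorial data.

```python
import numpy
import pandas

# Hypothetical numeric data; the NaN forces the column to float64.
frame = pandas.DataFrame({"strength": [1200, numpy.nan, 5644]})
print(frame["strength"].dtype)   # float64

filled = frame.fillna(0)
print(filled["strength"].dtype)  # still float64, with 0.0 in place of NaN

# Convert explicitly if whole numbers are wanted.
as_int = filled["strength"].astype("int64")
print(as_int.tolist())           # [1200, 0, 5644]

# fillna() returned a new object; the original still has its NaN.
print(int(frame["strength"].isna().sum()))  # 1
```

The explicit astype() call is only needed when integer values are required after the NaN entries are gone.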
Example 2: In this example, we will replace NaN with 0 only in particular columns (col1 and col2).
#importing pandas and numpy modules
import pandas
import numpy
#create dataframe from the college data
data= pandas.DataFrame([[numpy.nan,'c-021','c-002',numpy.nan],
[numpy.nan,"vvit","RVR - JC","Andhra University"],
["guntur",numpy.nan,numpy.nan,"guntur"],
[1200,3422,5644,670]
])
#add columns
data=data.set_axis(['col1','col2','col3','col4'],axis=1)
#fill NaN with 0's in col2
data['col2']=data['col2'].fillna(0)
#fill NaN with 0's in col1
data['col1']=data['col1'].fillna(0)
#display
print(data)
In the above code, we replaced NaN with 0 in col1 and col2 only, so col3 and col4 still contain NaN.
Output:
     col1   col2      col3               col4
0       0  c-021     c-002                NaN
1       0   vvit  RVR - JC  Andhra University
2  guntur      0       NaN             guntur
3    1200   3422      5644                670
Method 2 : Using replace()
replace() substitutes one value with another. Here we use it to replace numpy.nan with 0.
Syntax:
dataframe_input.replace(numpy.nan,0)
If we want to replace NaN in a particular column only, we have to select that column by its label.
Syntax:
dataframe_input['column_label'].replace(numpy.nan,0)
Example 1: In this example, we will replace all NaN values with 0.
#importing pandas and numpy modules
import pandas
import numpy
#create dataframe from the college data
data= pandas.DataFrame([[numpy.nan,'c-021','c-002',numpy.nan],
[numpy.nan,"vvit","RVR - JC","Andhra University"],
["guntur",numpy.nan,numpy.nan,"guntur"],
[1200,3422,5644,670]
])
#add columns
data=data.set_axis(['col1','col2','col3','col4'],axis=1)
#fill NaN with 0's
data=data.replace(numpy.nan,0)
#display
print(data)
Output:
     col1   col2      col3               col4
0       0  c-021     c-002                  0
1       0   vvit  RVR - JC  Andhra University
2  guntur      0         0             guntur
3    1200   3422      5644                670
Example 2: In this example, we will replace NaN with 0 only in particular columns (col1 and col2).
#importing pandas and numpy modules
import pandas
import numpy
#create dataframe from the college data
data= pandas.DataFrame([[numpy.nan,'c-021','c-002',numpy.nan],
[numpy.nan,"vvit","RVR - JC","Andhra University"],
["guntur",numpy.nan,numpy.nan,"guntur"],
[1200,3422,5644,670]
])
#add columns
data=data.set_axis(['col1','col2','col3','col4'],axis=1)
#fill NaN with 0's in col2
data['col2']=data['col2'].replace(numpy.nan,0)
#fill NaN with 0's in col1
data['col1']=data['col1'].replace(numpy.nan,0)
#display
print(data)
In the above code, we replaced NaN with 0 in col1 and col2 only, so col3 and col4 still contain NaN.
Output:
     col1   col2      col3               col4
0       0  c-021     c-002                NaN
1       0   vvit  RVR - JC  Andhra University
2  guntur      0       NaN             guntur
3    1200   3422      5644                670
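Both methods produced the same output in the examples above. A quick way to confirm this is to compare the two results with DataFrame.equals(). The sketch below does so on a small numeric frame whose labels "a" and "b" are made up for illustration.

```python
import numpy
import pandas

# Hypothetical numeric frame with NaN in both columns.
frame = pandas.DataFrame({
    "a": [1.0, numpy.nan, 3.0],
    "b": [numpy.nan, 5.0, numpy.nan],
})

with_fillna = frame.fillna(0)
with_replace = frame.replace(numpy.nan, 0)

# equals() checks that shape, values and column types all match.
same = with_fillna.equals(with_replace)
print(same)  # True
```

For the single task of turning NaN into 0, either method can be used. fillna() is dedicated to missing values, while replace() can also substitute values that are not NaN.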
About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University.
Published 1400+ Technical Articles on Python, R, Swift, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date: Jun 14, 2024