Convert pandas DataFrame to NumPy Array
In this tutorial, we will discuss how to convert a pandas DataFrame to a NumPy array.
Introduction
A DataFrame in pandas is a two-dimensional data structure. One dimension refers to the rows and the other to the columns, so the data is stored in rows and columns.
We create a DataFrame using the DataFrame() constructor. It is available in the pandas module, so we have to import pandas first.
Syntax:
pandas.DataFrame(data)
where data is the input data. It can be, for example, a dictionary in which each key is a column name and each value is a list of that column's values.
NumPy stands for Numerical Python and is used to perform mathematical operations on arrays.
It is a separate module that we have to import before we can use it.
Syntax to import:
import numpy
We can also use an alias for the module.
For example,
import numpy as np
We can then use np to refer to the numpy module.
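As a small illustration of the alias in use, the sketch below builds an array and applies element-wise arithmetic to it. The values and the names marks, doubled and total are made up for this example and are not part of the tutorial's data.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
marks = np.array([10, 20, 30, 40])

# Arithmetic on an array is applied element by element
doubled = marks * 2

# Aggregate functions work on the whole array
total = marks.sum()

print(doubled)  # [20 40 60 80]
print(total)    # 100
```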
Let's create a dataframe with 4 rows and 4 columns.
import pandas as pd
# create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])
# display the dataframe
print(data)
Output:
      college_id       college_name college_address Total Staff
one        c-001  vignan university          guntur        1200
two        c-021               vvit          guntur        3422
three      c-002           RVR - JC          guntur        5644
four       c-004  Andhra University          guntur         670
Method 1: Using to_numpy()
The to_numpy() method converts the entire dataframe to a NumPy array.
Syntax:
dataframe.to_numpy()
where dataframe is the input pandas dataframe.
Example:
In this example, we will convert the above dataframe to a NumPy array.
import pandas as pd
# create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])
# convert to the numpy array
print(data.to_numpy())
Output:
[['c-001' 'vignan university' 'guntur' '1200']
 ['c-021' 'vvit' 'guntur' '3422']
 ['c-002' 'RVR - JC' 'guntur' '5644']
 ['c-004' 'Andhra University' 'guntur' '670']]
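Every column in this dataframe holds strings, including Total Staff, so the converted array is a generic object array and not a numeric one. A minimal sketch, assuming we want the staff counts as integers: cast the column with astype() before calling to_numpy(). The variable name staff is introduced here only for illustration.

```python
import pandas as pd

# Same college data as in the tutorial
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])

# Whole dataframe of strings: the array dtype is object
print(data.to_numpy().dtype)

# Cast the string column to integers first, then convert it
staff = data["Total Staff"].astype(int).to_numpy()

# Numeric operations now work on the result
print(staff.sum())  # 10936
```

Casting before the conversion keeps the original dataframe unchanged, since astype() returns a new object.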
To check the type of the result, we can pass it to type().
import pandas as pd
# create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])
# convert to the numpy array and get the type
print(type(data.to_numpy()))
Output:
<class 'numpy.ndarray'>
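Besides type(), the array's own attributes tell us how the rows and columns were carried over. The sketch below checks shape and ndim and reads one element by position; the variable name arr is an assumption made for this example.

```python
import pandas as pd

# Same college data as in the tutorial
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])

arr = data.to_numpy()

# 4 rows and 4 columns become a 4 x 4 two-dimensional array
print(arr.shape)  # (4, 4)
print(arr.ndim)   # 2

# Index and column labels are not kept, so elements are read by position
print(arr[0, 1])  # vignan university
```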
If we want to convert only particular columns to a NumPy array, we select those columns by name before calling to_numpy(). The double brackets select a one-column dataframe, so the result is a two-dimensional array with a single column.
Syntax:
dataframe[['column_name']].to_numpy()
Example:
In this example, we convert the college_id and college_name columns to NumPy arrays, one column at a time.
import pandas as pd
# create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])
# convert college_id column to the numpy array
print(data[['college_id']].to_numpy())
print()
# convert college_name column to the numpy array
print(data[['college_name']].to_numpy())
Output:
[['c-001']
 ['c-021']
 ['c-002']
 ['c-004']]

[['vignan university']
 ['vvit']
 ['RVR - JC']
 ['Andhra University']]
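The same selection syntax accepts several column names at once, and single brackets give a flat array. The sketch below contrasts the two; the names two_cols and one_col are introduced only for this illustration.

```python
import pandas as pd

# Same college data as in the tutorial
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])

# Double brackets with two names: a two-dimensional array, 4 rows x 2 columns
two_cols = data[['college_id', 'college_name']].to_numpy()
print(two_cols.shape)  # (4, 2)

# Single brackets select a Series: a one-dimensional array of 4 values
one_col = data['college_id'].to_numpy()
print(one_col.shape)   # (4,)
print(one_col)         # ['c-001' 'c-021' 'c-002' 'c-004']
```

Single brackets are usually the more convenient choice when the values of one column are needed as a plain sequence.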
Method 2: Using values
The values attribute also returns the entire dataframe as a NumPy array. It is an attribute, not a method, so it is written without parentheses.
Syntax:
dataframe.values
where dataframe is the input pandas dataframe.
Example:
In this example, we will convert the above dataframe to a NumPy array.
import pandas as pd
# create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])
# convert to the numpy array
print(data.values)
print()
# convert to the numpy array and get the type
print(type(data.values))
Output:
[['c-001' 'vignan university' 'guntur' '1200']
 ['c-021' 'vvit' 'guntur' '3422']
 ['c-002' 'RVR - JC' 'guntur' '5644']
 ['c-004' 'Andhra University' 'guntur' '670']]

<class 'numpy.ndarray'>
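To confirm that both approaches give the same result for this dataframe, we can compare the two arrays with np.array_equal(). This is only a quick check, and the names from_values and from_method are assumptions made for the example. The pandas documentation recommends to_numpy() over values for new code.

```python
import numpy as np
import pandas as pd

# Same college data as in the tutorial
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     "college_address": ["guntur", "guntur", "guntur", "guntur"],
                     "Total Staff": ['1200', '3422', '5644', '670']
                     }, index=['one', 'two', 'three', 'four'])

from_values = data.values      # attribute, no parentheses
from_method = data.to_numpy()  # method call

# Same shape and the same elements in the same positions
print(from_values.shape == from_method.shape)      # True
print(np.array_equal(from_values, from_method))    # True
```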
About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University.
Published 1400+ Technical Articles on Python, R, Swift, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date: Jun 14, 2024