Get Pandas DataFrame information | using pandas info, describe, value_counts, column functions


In this pandas tutorial, we will learn about:

  • printing pandas dataframe information using different methods,
  • the pandas dataframe info method,
  • the pandas describe function,
  • the pandas value_counts function,
  • the pandas columns attribute

A DataFrame is a two-dimensional data structure that stores data in rows and columns: one dimension refers to the rows and the other refers to the columns.

We can create a DataFrame using the DataFrame() constructor. It is provided by the pandas module, so we have to import pandas first.

Syntax:

pandas.DataFrame(data)

Where data is the input data. It can be, for example, a dictionary in which each key is a column name and each value is the list of values for that column.

 

Example: Create DataFrame

In this example, we will create a dataframe with 4 rows and 4 columns from college data and assign index labels using the index parameter.

import pandas as pd

#create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     'college_address': ["guntur", "guntur", "guntur", "guntur"],
                     'Total Staff': [1200, 3422, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

#display the dataframe
print(data)

Output: Given below is the dataframe created

  

 

We will use this dataframe in the rest of this chapter.


Method 1 : In pandas get dataframe info using info()

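The pandas dataframe info() method prints a summary of the dataframe: each column name with its data type and count of non-null values, along with the index range and memory usage. It prints this summary directly and returns None.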

Syntax:

dataframe.info()

Where, dataframe is the input dataframe.

 

Example:

This example uses the pandas dataframe info() method to get the column information of a dataframe:

import pandas as pd

#create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     'college_address': ["guntur", "guntur", "guntur", "guntur"],
                     'Total Staff': [1200, 3422, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

#info() prints the summary itself and returns None, so print() is not needed
data.info()

Output: The column information printed by the pandas dataframe info() method is shown below

<class 'pandas.core.frame.DataFrame'>
Index: 4 entries, college1 to college4
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   college_id       4 non-null      object
 1   college_name     4 non-null      object
 2   college_address  4 non-null      object
 3   Total Staff      4 non-null      int64
dtypes: int64(1), object(3)
memory usage: 160.0+ bytes

The output shows each column name and data type along with its Non-Null Count.

As there are no empty/null/NaN values in the dataframe, the non-null count is 4 for every column.

 

Thus we learnt about pandas dataframe info method.
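To see how the Non-Null Count changes when data is missing, here is a minimal sketch. The smaller dataframe with one missing staff count is a hypothetical variation of the college data, not part of the original example. It also shows one way to capture the summary as text through the buf parameter of info().

```python
import io
import pandas as pd

# Hypothetical variation of the college data with one missing staff count
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'Total Staff': [1200, None, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

# info() prints its summary; the Non-Null Count for Total Staff drops to 3
data.info()

# Capture the same summary as a string by passing a text buffer to buf
buffer = io.StringIO()
data.info(buf=buffer)
summary = buffer.getvalue()

# Cross-check: number of missing values per column
missing = data.isna().sum()
print(missing)
```

Here the missing value turns Total Staff into a float column, because NaN is stored as a floating point value.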


Method 2 : In pandas get dataframe info using describe()

The pandas describe() function returns summary statistics such as count, mean, standard deviation, min, max and quartiles for the columns of the pandas DataFrame.

Syntax:

dataframe.describe()

where, dataframe is the input dataframe.

 

Note - by default, the pandas describe() function summarizes only the numeric columns (integer/float). It does not include string or object columns unless asked to through its include parameter.

 

Example:

This example demonstrates the pandas describe() function, which prints the statistics of the only numeric column, Total Staff.

import pandas as pd

#create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     'college_address': ["guntur", "guntur", "guntur", "guntur"],
                     'Total Staff': [1200, 3422, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

#display the summary statistics of the numeric columns
print(data.describe())

Output: Total Staff is the only numeric column in the above dataframe, so it is the only column in the result of the pandas describe() function

       Total Staff
count     4.000000
mean   2734.000000
std    2277.037256
min     670.000000
25%    1067.500000
50%    2311.000000
75%    3977.500000
max    5644.000000

In the above dataframe, only the Total Staff column is numeric (integer type, int64). Hence, the pandas describe function returns statistics only for this column, and reports them as floating point values.
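The default behaviour described in the note above can be changed with the include parameter of describe(). The following is a minimal sketch on the same college data, assuming we want a summary of the text columns as well. With include='all', text columns get count, unique, top and freq, and the statistics that do not apply to a column are shown as NaN.

```python
import pandas as pd

# Same college data as in the examples above
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     'college_address': ["guntur", "guntur", "guntur", "guntur"],
                     'Total Staff': [1200, 3422, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

# Default: only the numeric column is summarized
numeric_summary = data.describe()

# include='all' summarizes every column, numeric or not
summary = data.describe(include='all')
print(summary)

# Most frequent address and how often it occurs
print(summary.loc['top', 'college_address'], summary.loc['freq', 'college_address'])
```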


Method 3 : In pandas get dataframe info using value_counts()

The pandas value_counts() function returns the number of occurrences of each unique value in a single column, or of each unique row in the entire dataframe.

Syntax:

dataframe.value_counts()

Where, dataframe is the input dataframe.

 

The above syntax will return the count of each unique row.

If we want to get the count of every value in a particular column, we have to select that column before calling value_counts().

Syntax:

dataframe['column'].value_counts()

Where, dataframe is the input dataframe and column is the column name

 

Example:

This example demonstrates the pandas value_counts() method on the entire dataframe and on two individual columns.

import pandas as pd

#create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     'college_address': ["guntur", "guntur", "guntur", "guntur"],
                     'Total Staff': [1200, 3422, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

#count of each unique row
print(data.value_counts())

#count of each value in the college_name column
print(data['college_name'].value_counts())

#count of each value in the college_address column
print(data['college_address'].value_counts())

 

Output: using pandas value_counts we get below results

       

 

In the above output, each row occurs exactly once, so every row count is 1. Each value in the college_name column also occurs once, so its value count is 1. The value guntur occurs 4 times in the college_address column, so its value count is 4.
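The counts described above can also be read programmatically, since value_counts() returns a Series indexed by the counted values. The sketch below repeats the college data and additionally uses the normalize parameter, an optional extra not covered above, to get proportions instead of raw counts.

```python
import pandas as pd

# Same college data as in the examples above
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     'college_address': ["guntur", "guntur", "guntur", "guntur"],
                     'Total Staff': [1200, 3422, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

# Count of each unique row in the whole dataframe
row_counts = data.value_counts()

# Count of each value in a single column
name_counts = data['college_name'].value_counts()
address_counts = data['college_address'].value_counts()

# normalize=True returns the share of each value instead of its count
address_share = data['college_address'].value_counts(normalize=True)

# Look up the count of one specific value by its label
print(address_counts['guntur'])
print(address_share['guntur'])
```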


Method 4 : In pandas get dataframe info using columns

The pandas columns attribute returns the column names of the pandas dataframe. It is an attribute rather than a function, so it is used without parentheses.

Syntax:

dataframe.columns

Where, dataframe is the input dataframe.

 

Example:

This example demonstrates the pandas columns attribute.

import pandas as pd

#create dataframe from the college data
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     'college_address': ["guntur", "guntur", "guntur", "guntur"],
                     'Total Staff': [1200, 3422, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

#display the column names
print(data.columns)

Output: We get the below result from the pandas columns attribute of the given dataframe.

       
Index(['college_id', 'college_name', 'college_address', 'Total Staff'], dtype='object')

In the above output, the column names are returned as a pandas Index object, not as a plain Python list.
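If a plain Python list of the column names is needed, one option is the tolist() method of the Index object. This is a minimal sketch on the same college data, and the variable names are only illustrative.

```python
import pandas as pd

# Same college data as in the examples above
data = pd.DataFrame({'college_id': ['c-001', 'c-021', 'c-002', 'c-004'],
                     'college_name': ["vignan university", "vvit", "RVR - JC", "Andhra University"],
                     'college_address': ["guntur", "guntur", "guntur", "guntur"],
                     'Total Staff': [1200, 3422, 5644, 670]},
                    index=['college1', 'college2', 'college3', 'college4'])

# columns is a pandas Index object
column_index = data.columns

# Convert the Index to a plain Python list of column names
column_list = data.columns.tolist()
print(column_list)

# Membership test and column count work directly on the Index
print('Total Staff' in data.columns)
print(len(data.columns))
```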


Conclusion:

In this article, we learned how to get information from a DataFrame using the pandas dataframe info method, the pandas describe function, the pandas value_counts function and the pandas columns attribute.


About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University. Published 1200+ Technical Articles on Python, R, Swift, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date: Feb 26, 2022