PySpark - Row class

In this PySpark tutorial, we will discuss how to use the Row class to create a PySpark dataframe.

Introduction:

A DataFrame in PySpark is a two-dimensional data structure that stores data in a two-dimensional format. One dimension refers to the rows and the other to the columns, so it stores the data in rows and columns.

Let's install the pyspark module before getting started. The command to install any module in Python is pip.

Syntax:

pip install module_name

Installing PySpark:

pip install pyspark

Steps to create dataframe in PySpark:

1. Import the below modules

      import pyspark
      from pyspark.sql import SparkSession

2. Create a Spark app named tutorialsinhand using the getOrCreate() method

     Syntax:

     spark = SparkSession.builder.appName('tutorialsinhand').getOrCreate()

3. Create Rows using the Row class

4. Pass the list of Rows to the createDataFrame() method to create the PySpark dataframe

    Syntax:
    spark.createDataFrame(rows)

    where rows is a list of Row objects.

The Row class is used to create rows for the dataframe.

Syntax:

[Row(column=value),........]

where column represents the column name in the PySpark dataframe and value represents the row value.

We have to import it from the pyspark.sql module.

Syntax:

from pyspark.sql import Row

Example:

In this example, we will create PySpark dataframe with 5 rows and 3 columns.

# import the below modules
import pyspark
from pyspark.sql import SparkSession
from pyspark.sql import Row

# create an app
spark = SparkSession.builder.appName('tutorialsinhand').getOrCreate()

# create rows using the Row class
rows = [Row(rollno=1, name='Gottumukkala Sravan', marks=98),
        Row(rollno=2, name='Gottumukkala Bobby', marks=89),
        Row(rollno=3, name='Lavu Ojaswi', marks=90),
        Row(rollno=4, name='Lavu Gnanesh', marks=78),
        Row(rollno=5, name='Chennupati Rohith', marks=100)]


# create the dataframe from the rows
data = spark.createDataFrame(rows)

# display the dataframe
data.show()

Output:

So the column names are - rollno, name and marks.

+------+-------------------+-----+
|rollno|               name|marks|
+------+-------------------+-----+
|     1|Gottumukkala Sravan|   98|
|     2| Gottumukkala Bobby|   89|
|     3|        Lavu Ojaswi|   90|
|     4|       Lavu Gnanesh|   78|
|     5|  Chennupati Rohith|  100|
+------+-------------------+-----+

We can also define the column names first and then pass the row values.

Syntax:

column_names = Row(column, ...)
[column_names(value1, ...), ...]

Example:

In this example, we will create PySpark dataframe with 5 rows and 3 columns.

 


# import the below modules
import pyspark
from pyspark.sql import SparkSession
from pyspark.sql import Row

# create an app
spark = SparkSession.builder.appName('tutorialsinhand').getOrCreate()

# create columns using the Row class
col = Row("rollno", "name", "marks")
rows = [col(1, 'Gottumukkala Sravan', 98),
        col(2, 'Gottumukkala Bobby', 89),
        col(3, 'Lavu Ojaswi', 90),
        col(4, 'Lavu Gnanesh', 78),
        col(5, 'Chennupati Rohith', 100)]


# create the dataframe from the rows
data = spark.createDataFrame(rows)

# display the dataframe
data.show()

Output:

So the column names are - rollno, name and marks.

+------+-------------------+-----+
|rollno|               name|marks|
+------+-------------------+-----+
|     1|Gottumukkala Sravan|   98|
|     2| Gottumukkala Bobby|   89|
|     3|        Lavu Ojaswi|   90|
|     4|       Lavu Gnanesh|   78|
|     5|  Chennupati Rohith|  100|
+------+-------------------+-----+

 




About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University. Published 1400+ Technical Articles on Python, R, Swift, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date : Jun 14, 2024