Pyspark rename column : Implementation tricks

Pyspark rename column

Renaming a column in PySpark is easy with the withColumnRenamed() function. All we need to pass is the existing column name and the new one. In this article, we will explore this with an example: first we will create a dummy PySpark dataframe, then choose a column and rename it. Renaming is especially important in the mapping layer, when we map two or more fields that hold similar data.

Let’s start coding.

Pyspark rename column : ( Syntax ) –

Let’s create a dummy dataframe first. Here is the code for it:

Step 1 – ( Prerequisites ) –

Copy the code below and run it in a Python interpreter where PySpark is installed.

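from pyspark.sql import SparkSession

# Create (or reuse) a Spark session; the pyspark shell already provides one as `spark`
spark = SparkSession.builder.appName("rename-column-example").getOrCreate()

# Two sample employee records
records = [
    (4, "Charlee", "2005", "60", 35000),
    (5, "Guo", "2010", "40", 38000),
]
record_Columns = ["seq", "Name", "joining_year", "specialization_id", "salary"]

sampleDF = spark.createDataFrame(data=records, schema=record_Columns)
sampleDF.show(truncate=False)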
pyspark dataframe dummy

Step 2 –

In this step, we will use the withColumnRenamed() function to rename the “salary” column to “Income”.

sampleDF.withColumnRenamed("salary","Income").show(truncate=False)
Rename Pyspark dataframe

As you can see in the output, the “salary” column has been renamed to “Income”.

Renaming a column is a very common operation in data engineering and data science tasks. There are other ways to achieve the same result, such as select() with alias() or toDF(), but they are not as simple as withColumnRenamed(), so I recommend using it.

I hope you liked this article. Please share your suggestions on how we can improve it. You may also request an article on any topic of your choice, either in the comments below or by writing to us by email. Please subscribe for more articles on PySpark and data science.

Similar Articles :

We have started a series on PySpark and data engineering. In it we try to keep the syntax as user-friendly as possible, so it is a good place to start, especially for beginners. It will cover most of the basics related to this topic.

Pyspark drop column : How to performs ?

 

Thanks 

Data Science Learner Team

Meet Abhishek (Chief Editor), a data scientist with major expertise in NLP and Text Analytics. He has worked on various projects involving text data and has been able to achieve great results. He currently manages Datasciencelearner.com, where he and his team share knowledge and help others learn more about data science.
 