How to Merge Two Columns in Pandas? 3 Steps Only

Merging two columns in Pandas can be a tedious task if you don’t know the Pandas merging concept. You can easily merge two different dataframes. But merging two or more columns of the same dataframe is a different concept. In this post, you will learn how to merge two columns in Pandas using different approaches.

Step 1: Import the Necessary Packages

Only the NumPy and Pandas packages are required for this tutorial, so I am importing them.

import pandas as pd
import numpy as np

Step 2: Create a Dataframe

For demonstration purposes, I am creating a dataframe manually. You can apply the same concept to your own dataframe.

missing = np.nan
actors_name = ["Tom Cruise", "Hugh Jackman", "Brad Pitt", "Johnny Depp", "Leonardo DiCaprio"]
actor_age = [57, missing, 51, missing, 44]
actor_age_revised = [missing, 55, missing, 56, missing]
df = pd.DataFrame({"name": actors_name, "age1": actor_age, "revised_age": actor_age_revised})

Here the dataframe contains the “name”, “age1” and “revised_age” columns, and every row is missing a value in exactly one of the two age columns. I created it this way to show the merge process on the columns.
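Before merging, it can help to confirm where the gaps are. The snippet below is a minimal sketch that rebuilds the same dataframe and counts the missing values; the variable names missing_counts and exactly_one are my own and are only used for illustration.

```python
import numpy as np
import pandas as pd

missing = np.nan
df = pd.DataFrame({
    "name": ["Tom Cruise", "Hugh Jackman", "Brad Pitt", "Johnny Depp", "Leonardo DiCaprio"],
    "age1": [57, missing, 51, missing, 44],
    "revised_age": [missing, 55, missing, 56, missing],
})

# Number of missing values in each column
missing_counts = df.isna().sum()
print(missing_counts)

# True for a row when exactly one of the two age columns holds a value
exactly_one = df[["age1", "revised_age"]].notna().sum(axis=1).eq(1)
print(exactly_one.all())
```

If the last line prints True, the two columns never overlap and never leave a row empty, which is the situation all the approaches in this post assume.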

Step 3: Apply the Approaches

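In this step, apply any one of these methods to complete the merging task. Each approach modifies or replaces df, so recreate the dataframe from Step 2 before trying the next approach.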

Approach 1: Using the “+” Operator

You can use simple addition on the two columns if they are numeric and each row has a value in only one of them. Fill the missing values with 0 first, like this.

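# Replace every missing value with 0 so the addition does not produce NaN
df = df.fillna(0)
# Each row has a value in only one column, so the sum equals that value
df["age"] = (df["age1"] + df["revised_age"]).astype("int")
# Keep only the name and the merged age column
df = df[["name","age"]]
df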

First, you fill the missing values with 0, then add the values of the two columns and store the result in the age column.

Output

[Figure: merging columns with approach 1]
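The addition approach is only safe when the two columns never overlap. As a minimal sketch, using made-up data (the names A, B, C and their ages are my assumption and not part of the tutorial dataset), here is what happens when one row has a value in both columns:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second row has a value in BOTH age columns
df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "age1": [57, 40, np.nan],
    "revised_age": [np.nan, 41, 56],
})

# Addition after filling with 0 sums the two ages in the overlapping row
summed = (df["age1"].fillna(0) + df["revised_age"].fillna(0)).astype(int)
print(summed.tolist())  # [57, 81, 56]

# fillna() keeps the "age1" value and only uses "revised_age" for the gaps
filled = df["age1"].fillna(df["revised_age"]).astype(int)
print(filled.tolist())  # [57, 40, 56]
```

The 81 in the second row is the sum of both ages, which is not a valid merge. If your columns can overlap, one of the fill-based approaches is probably the safer choice.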

Approach 2: Using the pop() method

df["age"] = df.pop("age1").fillna(df.pop("revised_age")).astype(int)
df

You can also merge the columns using the pop() method. Here, pop() removes the “age1” column from the dataframe and returns it, and fillna() fills its missing values with the popped “revised_age” column. You will get the output below.

[Figure: merging columns with approach 2 pop method]
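To make the effect of pop() easier to see, the following sketch splits the one-liner into separate steps on the same data. The intermediate names age1 and revised are mine and are used only for illustration.

```python
import numpy as np
import pandas as pd

missing = np.nan
df = pd.DataFrame({
    "name": ["Tom Cruise", "Hugh Jackman", "Brad Pitt", "Johnny Depp", "Leonardo DiCaprio"],
    "age1": [57, missing, 51, missing, 44],
    "revised_age": [missing, 55, missing, 56, missing],
})

# pop() returns the column as a Series and removes it from df in place
age1 = df.pop("age1")
print(list(df.columns))  # ['name', 'revised_age']

revised = df.pop("revised_age")

# Fill the gaps in age1 with the values from revised_age
df["age"] = age1.fillna(revised).astype(int)
print(list(df.columns))  # ['name', 'age']
print(df["age"].tolist())  # [57, 55, 51, 56, 44]
```

Because pop() already removes both source columns, there is no need for a separate column selection or drop() call afterwards.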

Approach 3: Using the combine_first() method

Another method for merging the columns is combine_first(), called here on the “age1” column. Use the following code.

df["age"] = df["age1"].combine_first(df["revised_age"]).astype(int)
df = df[["name","age"]]
df

The above code combines the “age1” column with the “revised_age” column and assigns the result to the df[“age”] column.

Output:

[Figure: merging columns with approach 3 combine_first method]
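One thing to watch for: astype(int) raises an error if a row is missing in both columns, because NaN cannot be converted to a plain integer. A possible workaround, sketched here with made-up data (the names A, B, C are my assumption), is the nullable "Int64" dtype from Pandas, which keeps the gap as a missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row is missing in BOTH age columns
df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "age1": [57, np.nan, np.nan],
    "revised_age": [np.nan, 55, np.nan],
})

merged = df["age1"].combine_first(df["revised_age"])

# merged.astype(int) would fail here because one value is still NaN;
# the nullable "Int64" dtype stores it as a missing value instead
df["age"] = merged.astype("Int64")
print(df["age"].dtype)         # Int64
print(df["age"].isna().sum())  # 1
```

Note the capital "I" in "Int64"; the lowercase "int64" is the regular NumPy integer type and does not accept missing values.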

Approach 4: Using Numpy

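You can also merge into the “age1” column directly using the numpy.where() method. It replaces every NaN value in “age1” with the corresponding “revised_age” value, and the “revised_age” column is then dropped. Note that the merged values stay in “age1” here rather than in a new “age” column. Use the code below.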

df["age1"] = np.where(df["age1"].isna(), df["revised_age"], df["age1"]).astype("int")
df = df.drop("revised_age", axis=1)
df

[Figure: merging columns with approach 4 numpy method]
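Keep in mind that numpy.where() returns a plain NumPy array, not a Pandas Series. The sketch below shows this on the same data and checks that the result matches the combine_first() approach; the names result, via_where and via_combine are mine and are used only for illustration.

```python
import numpy as np
import pandas as pd

missing = np.nan
df = pd.DataFrame({
    "name": ["Tom Cruise", "Hugh Jackman", "Brad Pitt", "Johnny Depp", "Leonardo DiCaprio"],
    "age1": [57, missing, 51, missing, 44],
    "revised_age": [missing, 55, missing, 56, missing],
})

# Where age1 is NaN take revised_age, otherwise keep age1
result = np.where(df["age1"].isna(), df["revised_age"], df["age1"])
print(type(result).__name__)  # ndarray

# Wrap the array in a Series that reuses the dataframe's index
via_where = pd.Series(result, index=df.index).astype(int)
via_combine = df["age1"].combine_first(df["revised_age"]).astype(int)

# Both approaches give the same merged ages on this data
print(via_where.tolist() == via_combine.tolist())  # True
```

Assigning the array straight back to a column, as the tutorial code does, works because its length matches the number of rows in the dataframe.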

 

These are some approaches to merge two columns in a dataframe. You can apply the simple addition approach if the data is numeric and the two columns never both hold a value in the same row. Otherwise, use the other approaches. I hope you found this easy to follow. If you have any queries, please contact us for more help.

 
