Remove spaces from all column names in Spark | Scala | Pyspark

Remove whitespace from a dataframe column name in Spark

Parmanand
Feb 24, 2023

In this article, we will remove spaces from the column names of a data frame. Apart from this, we will also add a suffix to every column name.

Let's get started!

Step 1: Collect all column names into an array

val cols: Array[String] = dfWithSchema.columns

Step 2: Perform the required operation on each column name using map

val finalcol: Array[String] = cols.map(p => p.replace(' ', '_').toLowerCase() + "_post")

Here, I have replaced each space with ‘_’, converted the column name to lowercase, and appended the suffix "_post".
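
To see what the mapping in Step 2 produces, it can be tried on plain strings without Spark. This is only a small sketch: the object name ColumnNameSketch, the helper rename, and the sample column names are made up for illustration and are not part of the data frame used in this article.

```scala
object ColumnNameSketch {
  // Hypothetical helper: applies the same transformation as Step 2 to one name.
  def rename(name: String): String =
    name.replace(' ', '_').toLowerCase + "_post"

  def main(args: Array[String]): Unit = {
    // Sample column names, made up for illustration.
    val sampleCols: Array[String] = Array("First Name", "Last Name", "Age")

    sampleCols.map(rename).foreach(println)
    // prints:
    // first_name_post
    // last_name_post
    // age_post
  }
}
```

Note that replace(' ', '_') only touches the space character, and every space becomes its own underscore. If your column names may contain tabs or runs of spaces, name.trim.replaceAll("\\s+", "_") is one possible variant.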

Step 3: Pass this modified column array to the toDF function

dfWithSchema.toDF(finalcol: _*).show()

The above code displays the data frame with the new column names. Note that toDF returns a new data frame; dfWithSchema itself is not changed.
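
Putting the three steps together, a minimal end-to-end sketch could look like the following. It assumes Spark SQL is on the classpath and runs in local mode. The sample rows, the object name RenameColumnsExample, and the helper withCleanNames are my own assumptions for illustration; in your case dfWithSchema would be whatever data frame you already have.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RenameColumnsExample {
  // Hypothetical helper: Steps 1 to 3 wrapped in one function.
  def withCleanNames(df: DataFrame): DataFrame = {
    val cols: Array[String] = df.columns                      // Step 1
    val finalcol: Array[String] =
      cols.map(p => p.replace(' ', '_').toLowerCase() + "_post") // Step 2
    df.toDF(finalcol: _*)                                     // Step 3
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimenting; spark-shell already provides one named spark.
    val spark = SparkSession.builder()
      .appName("rename-columns")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with spaces in the column names.
    val dfWithSchema = Seq(("Asha", "Rao", 31), ("Ben", "Lee", 27))
      .toDF("First Name", "Last Name", "Age")

    val renamedDf = withCleanNames(dfWithSchema)
    renamedDf.printSchema() // lists first_name_post, last_name_post, age_post
    renamedDf.show()

    spark.stop()
  }
}
```

Because toDF replaces all names by position, the array passed to it must have exactly as many entries as the data frame has columns, which is always true here since finalcol is derived from df.columns.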

Thanks for reading. Please follow me for more articles like this.
