Remove spaces from all column names in Spark | Scala | Pyspark
In this article we will remove spaces from the column names of a data frame. We will also add a suffix to every column name.
Let's get started!
Step 1: Collect all column names into an array
val cols: Array[String] = dfWithSchema.columns
Step 2: Apply the required operation to each column name using map
val finalcol: Array[String] = cols.map(p => p.replace(' ', '_').toLowerCase() + "_post")
Here, I have replaced each space with '_', converted the column name to lowercase, and appended the suffix "_post".
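To see what the map in Step 2 produces, here is a minimal sketch that runs the same expression on a few sample names without needing a Spark session. The object name ColumnNameDemo, the helper rename, and the sample column names are made up for illustration; only the expression inside rename comes from the step above.

```scala
object ColumnNameDemo {
  // Same expression as in Step 2: spaces become underscores,
  // the name is lowercased, and "_post" is appended.
  def rename(name: String): String =
    name.replace(' ', '_').toLowerCase() + "_post"

  // Hypothetical column names, standing in for dfWithSchema.columns.
  val cols: Array[String] = Array("Emp Id", "First Name", "Salary")

  // One renamed entry per original column, in the same order.
  val finalcol: Array[String] = cols.map(rename)

  def main(args: Array[String]): Unit =
    finalcol.foreach(println)
}
```

Keep in mind that replace(' ', '_') only touches the plain space character and replaces each one individually, so a name with two consecutive spaces ends up with two underscores. If your column names may contain tabs or runs of whitespace, name.trim.replaceAll("\\s+", "_") is one alternative.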
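Step 3: Pass the modified array of column names to the toDF function
dfWithSchema.toDF(finalcol: _*).show()
The code above displays the data frame with the new column names. Note that toDF returns a new data frame and leaves dfWithSchema unchanged, so assign the result to a val if you need it later.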
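One thing worth checking before calling toDF: after lowercasing and replacing spaces, two different original names can end up identical, and references to a duplicated column name become ambiguous in later queries. The sketch below is one possible check in plain Scala. The object RenameCheck, the helper duplicates, and the sample names are hypothetical and not part of the original code.

```scala
object RenameCheck {
  // Same transformation as in Step 2.
  def rename(name: String): String =
    name.replace(' ', '_').toLowerCase() + "_post"

  // Returns the renamed values that occur more than once, sorted.
  def duplicates(names: Array[String]): Seq[String] =
    names.map(rename)
      .groupBy(identity)
      .collect { case (n, hits) if hits.length > 1 => n }
      .toSeq
      .sorted

  def main(args: Array[String]): Unit = {
    // Hypothetical input: "Emp Id" and "emp_id" collide after renaming.
    val cols = Array("Emp Id", "emp_id", "Salary")
    val clashes = duplicates(cols)
    if (clashes.nonEmpty)
      println("Duplicate names after renaming: " + clashes.mkString(", "))
    else
      println("No duplicates, safe to call toDF")
  }
}
```

In a real job you would pass dfWithSchema.columns to duplicates and only call toDF when the result is empty.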
Thanks for reading. Please follow me for more articles like this.