If-else conditions in a Spark Scala DataFrame

Case When statement in SQL

Parmanand
Nov 17, 2020 · 2 min read

In the SQL world, we very often write a CASE WHEN statement to deal with conditions. Spark provides a similar “when function” to handle multiple conditions.
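As a quick illustration, both forms express the same logic. The sketch below is a minimal example assuming a DataFrame df with a Status column (an assumption, not the article’s dataset); Spark’s expr function lets you embed the SQL form directly:

import org.apache.spark.sql.functions.{col, expr, when}

// SQL form, embedded as a string expression
val sqlStyle = df.withColumn("desc",
  expr("CASE WHEN Status = 200 THEN 'Success' ELSE 'Unknown' END"))

// Equivalent DataFrame form using the when function
val dfStyle = df.withColumn("desc",
  when(col("Status") === 200, "Success").otherwise("Unknown"))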

In this article, we will talk about the following:

  1. when (see the note just below this list)
  2. when otherwise
  3. when with multiple conditions (demonstrated after the example’s output)
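A quick note on the first form: when used on its own, without an otherwise, Spark fills null into rows that match none of the conditions. A minimal sketch, again assuming a DataFrame df with a Status column:

// Without otherwise, unmatched rows get null in "desc"
val nullableDF = df.withColumn("desc",
  when(col("Status") === 200, "Success"))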

Let’s get started!

Let’s consider an example. Below is a Spark DataFrame that contains four columns. The task is to create a “desc” (description) column based on the Status column.

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

object CaseStatement {
  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder
      .appName("TestAPP")
      .master("local[2]")
      .getOrCreate()

    // Read the web log CSV, treating the first row as a header
    val tempDF: DataFrame = sparkSession.read
      .option("header", "true")
      .option("delimiter", ",")
      .csv("Data/weblog.csv")

    // Case statement: map each Status value to a description,
    // falling back to "Unknown" for anything else
    val finalDF = tempDF.withColumn("desc",
      when(col("Status") === 200, "Success")
        .when(col("Status") === 404, "Not found")
        .otherwise(lit("Unknown"))
    )

    finalDF.show(5)
  }
}

Output:

Notice that one extra “desc” column has been added, and rows with status 200 have “Success” as the description.
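The when function also accepts compound conditions built with && (and) and || (or), which covers the third item on the list above. Here is a minimal sketch against the same tempDF; the Method column is a hypothetical field of the web log, used only for illustration:

// Multiple conditions combined with && and ||;
// the Method column is hypothetical, for illustration only
val multiDF = tempDF.withColumn("desc",
  when(col("Status") === 200 && col("Method") === "GET", "Successful read")
    .when(col("Status") === 200 && col("Method") === "POST", "Successful write")
    .when(col("Status") === 301 || col("Status") === 302, "Redirect")
    .otherwise(lit("Unknown"))
)
multiDF.show(5)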

Thanks for reading. Please follow me for more articles like this.

Please share the article if you liked it. Any comments or suggestions are welcome.
