Home > Enterprise >  Map monthname to monthnumber in Azure databricks using scala
Map monthname to monthnumber in Azure databricks using scala

Time:11-24

Need to map monthname to monthnumber in Azure databricks using scala. I have column name PERIOD which have data like months name(like Jan,Feb,Mar,....,Nov,Dec), I want to replace this monthname with monthnumber (like 01,02,03,...,11,12). Result should be like Jan,Feb,Mar,..,Nov,Dec replaced with 01,02,03,...,11,12.

      "Jan"  -> 01,
      "Feb"  -> 02,
      "Mar"  -> 03,
      "Apr"  -> 04,
      "May"  -> 05,
      "Jun"  -> 06,
      "Jul"  -> 07,
      "Aug"  -> 08,
      "Sep"  -> 09,
      "Oct"  -> 10,
      "Nov"  -> 11,
      "Dec"  -> 12

I'm new to scala and azure databricks. I tried mapping approach but not getting desired solution. enter image description here

CodePudding user response:

I have a dataframe with column called month_in_words as in the following image:

enter image description here

  • Now to convert this month in words to month number, first I have created a map to where key is the month name and value is month number as string (Since we need month number as 01, but not 1).
val monthNumber = Map(
      "Jan" -> "01",
      "Feb" -> "02",
      "Mar" -> "03",
      "Apr" -> "04",
      "May" -> "05",
      "Jun" -> "06",
      "Jul" -> "07",
      "Aug" -> "08",
      "Sep" -> "09",
      "Oct" -> "10",
      "Nov" -> "11",
      "Dec" -> "12"
  )
  • Now you can use the following code to convert month name to month number:
var final_df = df.rdd.map(f => {
    val number = monthNumber(f.getString(0))
    (f.getString(0),number)
}).toDF("month_in_words","month_in_number")
//display(df)

enter image description here

  • To directly convert, you can use the following code:
var final_df = df.rdd.map(f => monthNumber(f.getString(0))).toDF("month_in_number")
//display(final_df)

enter image description here

  • Related