Home > Mobile >  Unable to get the result from the window function
Unable to get the result from the window function

Time:05-22

 --------------- -------- 
|YearsExperience|  Salary|
 --------------- -------- 
|            1.1| 39343.0|
|            1.3| 46205.0|
|            1.5| 37731.0|
|            2.0| 43525.0|
|            2.2| 39891.0|
|            2.9| 56642.0|
|            3.0| 60150.0|
|            3.2| 54445.0|
|            3.2| 64445.0|
|            3.7| 57189.0|
|            3.9| 63218.0|
|            4.0| 55794.0|
|            4.0| 56957.0|
|            4.1| 57081.0|
|            4.5| 61111.0|
|            4.9| 67938.0|
|            5.1| 66029.0|
|            5.3| 83088.0|
|            5.9| 81363.0|
|            6.0| 93940.0|
|            6.8| 91738.0|
|            7.1| 98273.0|
|            7.9|101302.0|
|            8.2|113812.0|
|            8.7|109431.0|
|            9.0|105582.0|
|            9.5|116969.0|
|            9.6|112635.0|
|           10.3|122391.0|
|           10.5|121872.0|
 --------------- -------- 

I want to find the top highest salary from the above data which is 122391.0
My Code

val top= Window.partitionBy("id").orderBy(col("Salary").desc)

val res= df1.withColumn("top", rank().over(top))


Result
 --------------- -------- --- --- 
|YearsExperience|  Salary| id|top|
 --------------- -------- --- --- 
|            1.1| 39343.0|  0|  1|
|            1.3| 46205.0|  1|  1|
|            1.5| 37731.0|  2|  1|
|            2.0| 43525.0|  3|  1|
|            2.2| 39891.0|  4|  1|
|            2.9| 56642.0|  5|  1|
|            3.0| 60150.0|  6|  1|
|            3.2| 54445.0|  7|  1|
|            3.2| 64445.0|  8|  1|
|            3.7| 57189.0|  9|  1|
|            3.9| 63218.0| 10|  1|
|            4.0| 55794.0| 11|  1|
|            4.0| 56957.0| 12|  1|
|            4.1| 57081.0| 13|  1|
|            4.5| 61111.0| 14|  1|
|            4.9| 67938.0| 15|  1|
|            5.1| 66029.0| 16|  1|
|            5.3| 83088.0| 17|  1|
|            5.9| 81363.0| 18|  1|
|            6.0| 93940.0| 19|  1|
|            6.8| 91738.0| 20|  1|
|            7.1| 98273.0| 21|  1|
|            7.9|101302.0| 22|  1|
|            8.2|113812.0| 23|  1|
|            8.7|109431.0| 24|  1|
|            9.0|105582.0| 25|  1|
|            9.5|116969.0| 26|  1|
|            9.6|112635.0| 27|  1|
|           10.3|122391.0| 28|  1|
|           10.5|121872.0| 29|  1|
 --------------- -------- --- --- 
Also I have choosed partioned by salary and orderby id.
<br>
But the result was same.

As you can see 122391 is coming just below the above but it should come in first position as i have done ascending.

Please help anybody find any things

CodePudding user response:

Are you sure you need a window function here? The window you defined partitions the data by id, which I assume is unique, so each group produced by the window will only have one row. It looks like you want a window over the entire dataframe, which means you don't actually need one. If you just want to add a column with the max, you can get the max using an aggregation on your original dataframe and cross join with it:

val maxDF =  df1.agg(max("salary").as("top"))
val res = df1.crossJoin(maxDF)
  • Related