Can someone help me understand this behavior:
scala> val l1 = List(84.99F, 9.99F).toDF("dec")
l1: org.apache.spark.sql.DataFrame = [dec: float]
scala> val l2 = List(84.99, 9.99).toDF("dec")
l2: org.apache.spark.sql.DataFrame = [dec: double]
scala> l1.show
+-----+
|  dec|
+-----+
|84.99|
| 9.99|
+-----+
scala> l2.show
+-----+
|  dec|
+-----+
|84.99|
| 9.99|
+-----+
scala> l1.exceptAll(l2).show(false)
+-----------------+
|dec              |
+-----------------+
|9.989999771118164|
|84.98999786376953|
+-----------------+
scala> l1.select('dec.cast("double")).exceptAll(l2).show(false)
+-----------------+
|dec              |
+-----------------+
|9.989999771118164|
|84.98999786376953|
+-----------------+
I understand it's due to the float vs. double column comparison in exceptAll, but how and where does the weird diff come from?
CodePudding user response:
exceptAll requires Spark to widen (cast) the type of l1 to double as well, and such a cast is not necessarily precise, which causes the result you are seeing:
List(84.99F, 9.99F).toDF("dec")
.select('dec.cast("double"))
.show()
+-----------------+
|              dec|
+-----------------+
|84.98999786376953|
|9.989999771118164|
+-----------------+
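The widening itself can be reproduced in plain Scala, since the Float literals do not store 84.99 and 9.99 exactly. As a possible workaround (a sketch, not part of the original answer; it assumes two decimal places are enough for your data), casting both sides to the same DecimalType before exceptAll makes the comparison exact:

// Plain Scala: widening a Float to Double exposes the stored approximation.
84.99F.toDouble  // 84.98999786376953
9.99F.toDouble   // 9.989999771118164

// Sketch of a workaround: cast both columns to decimal(10,2) so the
// float- and double-backed values round to the same exact decimal.
val l1d = l1.select('dec.cast("decimal(10,2)").as("dec"))
val l2d = l2.select('dec.cast("decimal(10,2)").as("dec"))
l1d.exceptAll(l2d).show(false)
// expected to produce an empty result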