I have the following DataFrame:
+--------------------+-------------+
|           entity_id|        state|
+--------------------+-------------+
|             ha_tdeg|         39.9|
|         memory_free|       1459.4|
|            srv_tdeg|         39.0|
|       as_tempera...|          9.5|
|         as_humidity|        81.71|
|         as_pressure|      1003.35|
|      as_am_humidity|        22.16|
|      as_pm_humidity|         4.64|
|         memory_free|       1460.0|
|             ha_tdeg|         38.0|
|         memory_free|       1459.3|
+--------------------+-------------+
I'm trying to add a percentage sign to every "state" value where "entity_id" contains 'humidity'. As the code below shows, I cast the "state" column to String before working with it. But whenever I run it and try to concatenate '%' (or any other string), all the values become null. Interestingly, if I try to concatenate a number wrapped as a String ("10"), it performs arithmetic addition instead.
What is the way to overcome this issue?
Here is the code I'm using:
var humidityDF = df.filter(forExport("entity_id").contains("humidity") && df("state").isNotNull)
humidityDF = humidityDF.withColumn("state", humidityDF("state").cast("String"))
humidityDF = humidityDF.withColumn("state", col("state") + "%")
I also tried:
humidityDF = humidityDF.withColumn("state", col("state").toString + "%")
But this doesn't work, since 'withColumn' only accepts a Column as its second argument.
Answer:
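import org.apache.spark.sql.functions.{col, concat, lit}
// On a Column, + is numeric addition; concat is the string-joining function.
// lit("%") wraps the literal in a Column so it can be passed to concat.
humidityDF = humidityDF.withColumn("state", concat(col("state"), lit("%")))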
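Some background on why the original attempt returns null: on a Spark Column, + means arithmetic addition, so Spark tries to cast both sides to a number. "%" is not a number, which (at least with the non-ANSI settings that Spark 3.x uses by default) gives null, while "10" casts cleanly and gets added. If the goal is to keep all rows and only suffix the humidity ones, a conditional with when/otherwise may be more convenient than filtering first. The following is only a minimal sketch: the object name HumidityPercent and the helper addPercent are made up for illustration, and the local SparkSession settings are assumptions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, lit, when}

object HumidityPercent {
  // Hypothetical helper: appends "%" to state only where entity_id contains "humidity".
  def addPercent(df: DataFrame): DataFrame = {
    // Cast once so both branches of the conditional are strings.
    val stateStr = col("state").cast("string")
    df.withColumn(
      "state",
      when(col("entity_id").contains("humidity"), concat(stateStr, lit("%")))
        .otherwise(stateStr)
    )
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("humidity-percent")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // A few rows taken from the table in the question.
    val df = Seq(
      ("ha_tdeg", 39.9),
      ("as_humidity", 81.71),
      ("as_am_humidity", 22.16)
    ).toDF("entity_id", "state")

    addPercent(df).show()
    spark.stop()
  }
}
```

Note that the resulting "state" column is a string for every row, not only the humidity ones, since a single column cannot mix types.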