I have been struggling to get this sorted. I have Dataset #1, personInfo:
-----------------------------
|field_1 |field_2|field_3|...
-----------------------------
|personID| DoB |intAge |...
-----------------------------
Dataset #2, ageCodes:
-----------------
|field_1|field_2|
-----------------
| age |ageCode|
-----------------
| 35 | 6 |
-----------------
| 36 | 6 |
-----------------
| 37 | 6 |
-----------------
| 38 | 7 |
-----------------
| 39 | 7 |
-----------------
| 40 | 7 |
-----------------
I am trying to update each personInfo row with its ageCode:
personInfo = personInfo.withColumn("ageCode",
ageCodes.filter(col("age").equalTo(personInfo.col("intAge"))).col("ageCode")
);
I have tried several variations of the above and can't seem to get it quite right.
Any help gratefully received.
CodePudding user response:
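You want to use a join. withColumn can only reference columns of the Dataset it is called on, so a column taken from ageCodes cannot be attached to personInfo directly. A left join on age = intAge keeps every personInfo row and adds the matching ageCode: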
personInfo = personInfo.join(ageCodes, ageCodes.col("age").equalTo(personInfo.col("intAge")), "left").drop("age");
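If it helps to see what the left join does to each row, here is a minimal plain-Java sketch of the same lookup, without Spark. The class name LeftJoinSketch, the helper attachAgeCodes, and the sample person IDs are all made up for illustration; only the age/ageCode pairs come from the question.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class LeftJoinSketch {

    // Hypothetical helper: for every (personID, intAge) pair, look up the ageCode.
    // An age with no entry in ageCodes yields null, which mirrors how a "left"
    // join keeps the personInfo row and leaves ageCode null.
    static Map<String, Integer> attachAgeCodes(Map<String, Integer> personAges,
                                               Map<Integer, Integer> ageCodes) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> person : personAges.entrySet()) {
            result.put(person.getKey(), ageCodes.get(person.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // age -> ageCode, taken from the ageCodes table in the question
        Map<Integer, Integer> ageCodes = new HashMap<>();
        ageCodes.put(35, 6);
        ageCodes.put(36, 6);
        ageCodes.put(37, 6);
        ageCodes.put(38, 7);
        ageCodes.put(39, 7);
        ageCodes.put(40, 7);

        // personID -> intAge; these sample rows are invented for the sketch
        Map<String, Integer> personAges = new LinkedHashMap<>();
        personAges.put("p1", 36);
        personAges.put("p2", 39);
        personAges.put("p3", 99); // no matching age in ageCodes

        System.out.println(attachAgeCodes(personAges, ageCodes));
        // prints {p1=6, p2=7, p3=null}
    }
}
```

The row for p3 survives with a null ageCode, which is what the "left" join type gives you. If you would rather drop people whose age has no code, an "inner" join would do that instead.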