I have some data in dataframe which looks like this:
----------- -------- ----------- --------------------------------
| Noun| Pronoun| Adjective|Metadata |
----------- -------- ----------- --------------------------------
| Homer| Simpson|Engineer |{"Age": "50", "Country": "USA"} |
| Elon | Musk |King |{"Age": "45", "Country": "RSA"} |
| Bart | Lee |Cricketer |{"Age": "35", "Country": "AUS"} |
| Lisa | Jobs |Daughter |{"Age": "35", "Country": "IND"} |
| Joe | Root |Player |{"Age": "31", "Country": "ENG"} |
----------- -------- ----------- --------------------------------
I want to append values from another column(say Adjective
) into the Metadata
column. So that the final dataframe would look like this:
----------- -------- ----------- ------------------------------------------------------------
| Noun| Pronoun| Adjective|Metadata |
----------- -------- ----------- ------------------------------------------------------------
| Homer| Simpson|Engineer |{"Age": "50", "Country": "USA", "Adjective": "Engineer"} |
| Elon | Musk |King |{"Age": "45", "Country": "RSA", "Adjective": "King"} |
| Bart | Lee |Cricketer |{"Age": "35", "Country": "AUS", "Adjective": "Cricketer"} |
| Lisa | Jobs |Daughter |{"Age": "35", "Country": "IND", "Adjective": "Daughter"} |
| Joe | Root |Player |{"Age": "31", "Country": "ENG", "Adjective": "Player"} |
----------- -------- ----------- ------------------------------------------------------------
Please suggest how this can be implemented.
CodePudding user response:
Assuming your column Metadata
contains JSON strings, you can first convert it to MapType
with from_json
function, then add the columns you want using map_concat
and finally convert again to JSON string using to_json
:
val df2 = df.withColumn(
"Metadata",
from_json(col("Metadata"), lit("map<string,string>"))
).withColumn(
"Metadata",
to_json(map_concat(col("Metadata"), map(lit("Adjective"), col("Adjective"))))
)
df2.show(false)
// ----- ------- --------- ----------------------------------------------------
//|Noun |Pronoun|Adjective|Metadata |
// ----- ------- --------- ----------------------------------------------------
//|Homer|Simpson|Engineer |{"Age":"50","Country":"USA","Adjective":"Engineer"} |
//|Elon |Musk |King |{"Age":"45","Country":"RSA","Adjective":"King"} |
//|Bart |Lee |Cricketer|{"Age":"35","Country":"AUS","Adjective":"Cricketer"}|
//|Lisa |Jobs |Daughter |{"Age":"35","Country":"IND","Adjective":"Daughter"} |
//|Joe |Root |Player |{"Age":"31","Country":"ENG","Adjective":"Player"} |
// ----- ------- --------- ----------------------------------------------------
This also can be done using a conversion to StructType instead of MapType but map is more generic in this case.