Here is my df:
text | date | channel | sentiment | 27cf2f | |
---|---|---|---|---|---|
0 | I like the new layout | 2021-08-30T18:15:22Z | Snowflake | predict | 1,586,020 |
I need to convert this to JSON output that matches the following:
[
{
"text": "I like the new layout",
"date": "2021-08-30T18:15:22Z",
"channel": "Snowflake",
"sentiment": "predict",
"fields": [
{
"field": "27cf2f",
"value": "1,586,020"
}
]
}
]
I'm getting stuck with mapping the keys of the columns to the values in the first dict and mapping the column and row to new keys in the final dict. I've tried various options using df.groupby
with .apply()
but am coming up short.
Samples of what I've tried:
df.groupby(['text', 'date','channel','sentiment','product','segment']).apply(
lambda r: r[['27cf2f]].to_dict(orient='records')).unstack('text').apply(lambda s: [
{s.index.name: idx, 'fields': value}
for idx, value in s.items()]
).to_json(orient='records')
Any and all help is appreciated!
CodePudding user response:
This solution is hardcoded to look for a column named 27cf2f
:
# Start with your example data
d = {'text': ['I like the new layout'],
'date': ['2021-08-30T18:15:22Z'],
'channel': ['Snowflake'],
'sentiment': ['predict'],
'27cf2f': ['1,586,020']}
df = pd.DataFrame(d)
# Build a dict for each group as a Series named `fields`
res = (df.groupby(['text', 'date','channel','sentiment'])
.apply(lambda s: [{'field': '27cf2f',
'value': value}
for value in s['27cf2f'].values])
).rename('fields')
# Convert Series to DataFrame and then to_json
res = res.reset_index().to_json(orient='records')
# Print result
import json
print(json.dumps(json.loads(res), indent=2))
[
{
"text": "I like the new layout",
"date": "2021-08-30T18:15:22Z",
"channel": "Snowflake",
"sentiment": "predict",
"fields": [
{
"field": "27cf2f",
"value": "1,586,020"
}
]
}
]