Example of data frame I have is as follows:
Period | ticker | rank | compositesore | C20 | C11 | C25 |
----------------------------------------------------------
2010-1 | AAPL | 2 | 100 | 121 | 44 |56 |
-----------------------------------------------------------
The json output I want is as follows:
[
{
"period": "2010-1",
"ticker": "AAPL",
"rank": 2,
"compositescore": 100,
"factorValues": {
"C20": 121,
"C11": 44,
"C25": 56
}
]
I want is collect all C20, C11, and C25 columns under the "factorValues".
CodePudding user response:
Not terribly pretty, but it works. For the example, you'll need to import json and io, as well as pandas, of course.
data = '''Period|ticker|rank|compositesore|C20|C11|C25
2010-1|AAPL|2|100|121|6|156
2010-3|NFLX|3|100|4121|464|596
2010-4|QQQQ|4|100|1281|441|856
'''
df = pd.read_csv(io.StringIO(data), sep='|', engine='python')
cola = ['Period', 'ticker', 'rank', 'compositesore']
colb = ['C20', 'C11', 'C25']
def create_json(x):
# print(x[cola].to_json())
dd = json.loads(x[cola].to_json())
ee = json.loads(x[colb].to_json())
dd['factorValues'] = ee
# print(dd)
return dd
df['jsons'] = df.apply(lambda x: create_json(x), axis=1)
df.jsons[0]
Output
{'Period': '2010-1',
'ticker': 'AAPL',
'rank': 2,
'compositesore': 100,
'factorValues': {'C20': 121, 'C11': 6, 'C25': 156}}
Finally
df.jsons.tolist()
[{'Period': '2010-1',
'ticker': 'AAPL',
'rank': 2,
'compositesore': 100,
'factorValues': {'C20': 121, 'C11': 6, 'C25': 156}},
{'Period': '2010-3',
'ticker': 'NFLX',
'rank': 3,
'compositesore': 100,
'factorValues': {'C20': 4121, 'C11': 464, 'C25': 596}},
{'Period': '2010-4',
'ticker': 'QQQQ',
'rank': 4,
'compositesore': 100,
'factorValues': {'C20': 1281, 'C11': 441, 'C25': 856}}]
Side note: If you do
return json.dumps(dd)
you'll get double quotes instead of single quotes