I need to convert this DataFrame:
meterid timestamp value
123 2018-04-09T21:34:55.335Z 0
123 2018-05-10T21:34:55.335Z 10
456 2018-04-09T21:34:55.335Z 200
to this format:
{
  "data": {
    "123": [
      ["2018-04-09T21:34:55.335Z", 0],
      ["2018-05-10T21:34:55.335Z", 10]
    ],
    "456": [
      ["2018-04-09T21:34:55.335Z", 200]
    ]
  }
}
What is the best way to do this transformation? I tried to_dict() with different values of orient, but it didn't give me what I wanted. Any help would be appreciated. Thanks in advance.
CodePudding user response:
Kind of hacky, but you could use iterrows and collect the rows in a dictionary:
data_store = dict()
for i, row in df.iterrows():
    # Start an empty list the first time a meterid is seen
    if row["meterid"] not in data_store:
        data_store[row["meterid"]] = []
    # Append [timestamp, value] for this row
    data_store[row["meterid"]].append(row["timestamp":].tolist())
This appends one [timestamp, value] pair per row, so a meterid that appears more than once (like 123 in your data) keeps all of its readings.
CodePudding user response:
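You could do:
dict((df.index[i],df.iloc[i,1:].to_list()) for i in range(len(df)))
as a quick and dirty option. Note that this keys each row by its index label (0, 1, 2 with a default index) and keeps one entry per key, so it does not group rows that share a meterid.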
CodePudding user response:
You can use to_dict('split') and dict.setdefault:
temp = {}
for data in df.to_dict('split')['data']:
    # data is [meterid, timestamp, value]; group the last two under the meterid
    temp.setdefault(data[0], []).append(data[1:])
out = {'data': temp}
Output:
{'data': {123: [['2018-04-09T21:34:55.335Z', 0], ['2018-05-10T21:34:55.335Z', 10]], 456: [['2018-04-09T21:34:55.335Z', 200]]}}
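For reference, here is a self-contained sketch of the same approach that rebuilds the sample frame from the question and serializes the result. The frame construction, the str() conversion of the keys, and the json.dumps call are additions of mine, assuming the goal is a JSON string with quoted ids as in the question.

```python
import json

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "meterid": [123, 123, 456],
        "timestamp": [
            "2018-04-09T21:34:55.335Z",
            "2018-05-10T21:34:55.335Z",
            "2018-04-09T21:34:55.335Z",
        ],
        "value": [0, 10, 200],
    }
)

temp = {}
for meterid, *rest in df.to_dict("split")["data"]:
    # str() makes the keys match the quoted ids in the target format
    temp.setdefault(str(meterid), []).append(rest)

out = {"data": temp}

# Serialize to JSON text; indent is only for readability
json_text = json.dumps(out, indent=2)
print(json_text)
```

json.dumps would turn integer keys into strings on its own, so the str() call mainly matters if you keep working with the dictionary in Python before serializing it.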
CodePudding user response:
I would combine all the data columns into 1 column with the code below:
df["combined"] = df.drop("meterid", axis=1).values.tolist()
Then I would use the code below to produce the output you specified:
df[["meterid", "combined"]].groupby("meterid")["combined"].apply(list).to_dict()
Since meterid is not unique in the question, I've written this code on the assumption that meterid is a regular column and not the index of your DataFrame.
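A variant of this idea that skips the helper column could look like the sketch below. It assumes the frame has exactly the three columns from the question; the sample frame, the str() keys, and the outer "data" wrapper are added here only to reproduce the requested shape.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "meterid": [123, 123, 456],
        "timestamp": [
            "2018-04-09T21:34:55.335Z",
            "2018-05-10T21:34:55.335Z",
            "2018-04-09T21:34:55.335Z",
        ],
        "value": [0, 10, 200],
    }
)

# For each meterid, turn the remaining columns into [timestamp, value] pairs
grouped = {
    str(meterid): group[["timestamp", "value"]].values.tolist()
    for meterid, group in df.groupby("meterid")
}

result = {"data": grouped}
print(result)
```

Iterating over the groupby object leaves the original frame untouched, which may be preferable if df is reused later.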