Python: convert pandas DataFrame to dict of lists

Time:12-24

I need to convert this DataFrame:

meterid    timestamp                  value
123       2018-04-09T21:34:55.335Z     0
123       2018-05-10T21:34:55.335Z     10
456       2018-04-09T21:34:55.335Z     200

to this format:

{
   "data":{
      "123":[
         [
            "2018-04-09T21:34:55.335Z",
            0
         ],
         [
            "2018-05-10T21:34:55.335Z",
            10
         ]
      ],
      "456":[
         [
            "2018-04-09T21:34:55.335Z",
            200
         ]
      ]
   }
}

What is the best way of transforming this? I tried to_dict() with different values of orient, but it didn't give me what I wanted. Any help would be appreciated. Thanks in advance.

CodePudding user response:

Kind of hacky, but you could loop over the rows with iterrows and collect them in a dictionary:

data_store = dict()
for i, row in df.iterrows():
    # Create an empty list the first time a meterid is seen, then append
    # this row's [timestamp, value] pair to that meter's list.
    data_store.setdefault(row["meterid"], []).append(row["timestamp":].tolist())

Appending instead of assigning keeps every row when a meterid repeats, as 123 does in the question. Plain assignment would keep only the last row per meterid.
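
If meterid values repeat, as 123 does in the question, a row-by-row loop has to append to a per-key list rather than assign. Below is a minimal sketch of that idea using itertuples, which the pandas documentation describes as generally faster than iterrows and which preserves column dtypes. The DataFrame construction is an assumption reconstructed from the question's table.

```python
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "meterid": [123, 123, 456],
    "timestamp": [
        "2018-04-09T21:34:55.335Z",
        "2018-05-10T21:34:55.335Z",
        "2018-04-09T21:34:55.335Z",
    ],
    "value": [0, 10, 200],
})

data_store = {}
for row in df.itertuples(index=False):
    # setdefault creates the list on first sight of a meterid,
    # so repeated meterids accumulate instead of overwriting.
    data_store.setdefault(row.meterid, []).append([row.timestamp, row.value])
```

Here data_store holds two entries for 123 and one for 456; wrapping it as {"data": data_store} gives the outer layer the question asks for.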

CodePudding user response:

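You could do:

dict((df.iloc[i, 0], df.iloc[i, 1:].to_list()) for i in range(len(df)))

as a quick and dirty option. It keys each row on the meterid column (position 0), so when a meterid repeats, only its last row is kept.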

CodePudding user response:

You can use to_dict('split') together with dict.setdefault:

temp = {}
for data in df.to_dict('split')['data']:
    temp.setdefault(data[0], []).append(data[1:])
out = {'data': temp}

Output:

{'data': {123: [['2018-04-09T21:34:55.335Z', 0], ['2018-05-10T21:34:55.335Z', 10]], 456: [['2018-04-09T21:34:55.335Z', 200]]}}
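
Note that the keys above are integers, while the requested format shows them as the strings "123" and "456". The following is a minimal sketch of the same setdefault loop with string keys, serialized with json.dumps. The DataFrame construction is an assumption reconstructed from the question's table, and the int() call is a precaution in case the values come back as NumPy integers, which json cannot serialize.

```python
import json

import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "meterid": [123, 123, 456],
    "timestamp": [
        "2018-04-09T21:34:55.335Z",
        "2018-05-10T21:34:55.335Z",
        "2018-04-09T21:34:55.335Z",
    ],
    "value": [0, 10, 200],
})

temp = {}
for meterid, timestamp, value in df.to_dict("split")["data"]:
    # str() gives the string keys from the target format;
    # int() guarantees a plain Python integer for JSON serialization.
    temp.setdefault(str(meterid), []).append([timestamp, int(value)])

out = {"data": temp}
payload = json.dumps(out, indent=3)
print(payload)
```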

CodePudding user response:

I would combine all the data columns into 1 column with the code below:

df["combined"] = df.drop("meterid", axis=1).values.tolist()

Then I would group by meterid to produce the mapping you specified (wrap the result as {"data": ...} to get the outer layer):

df[["meterid", "combined"]].groupby("meterid")["combined"].apply(list).to_dict()

Since the meterid in the question is not unique, this code assumes meterid is a regular column rather than the index of your DataFrame.
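
The grouping can also be done without adding a helper column to df. This is a minimal sketch that iterates over groupby directly and wraps the result under "data". The DataFrame construction is an assumption reconstructed from the question's table, and the string keys are chosen to match the requested format.

```python
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "meterid": [123, 123, 456],
    "timestamp": [
        "2018-04-09T21:34:55.335Z",
        "2018-05-10T21:34:55.335Z",
        "2018-04-09T21:34:55.335Z",
    ],
    "value": [0, 10, 200],
})

out = {
    "data": {
        # Each group becomes a list of [timestamp, value] pairs.
        str(meterid): group[["timestamp", "value"]].values.tolist()
        for meterid, group in df.groupby("meterid")
    }
}
```

This leaves df unchanged, which matters if the frame is reused afterwards.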
