Combine Pandas DataFrames into a labelled JSON file

Time:11-21

I have two separate DataFrames for training and validation, train_df and val_df, that both look like this (with varying values in the lists, ints and strings):

F1      F2 F3
[0,0,0] 1  'string1'
[1,2,1] 2  'string2'
...     ... ...

And I'd like to save them into a single JSON file with the following format:

{
    "training":[
        {
            "F1": [0,0,0],
            "F2": 1,
            "F3": "string1"
        },
        {
            "F1": [1,2,1],
            "F2": 2,
            "F3": "string2"
        }
    ],
    "validation":[
        {...},
        {...}
    ]
}
    

where the validation section has the same structure as the training section (F1, F2, F3) but with different values.

The closest I've gotten is train_df.to_json(orient="records"), which produces the right substructure, but I can't figure out how to insert the top-level "training" or "validation" identifier. One option I've thought of is to save both DataFrames separately, read them back in, store them as strings using json.dumps, and then insert the text in the required location. That seems overly convoluted, and I'm sure there must be a more Pythonic way of doing this.

CodePudding user response:

Here's a way you can do it.

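import json

# Nest each DataFrame's row records under its label
final_json = {'training': train_df.to_dict('records'), 'validation': val_df.to_dict('records')}

# Write the combined structure to one file ('combined.json' is a placeholder name)
with open('combined.json', 'w') as f:
    json.dump(final_json, f, indent=4)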
'''
{
    "training": [
        {
            "F1": "[0,0,0]",
            "F2": 1,
            "F3": "'string1'"
        },
        {
            "F1": "[1,2,1]",
            "F2": 2,
            "F3": "'string2'"
        }
    ],
    "validation": [
        {
            "F1": "[0,0,0]",
            "F2": 1,
            "F3": "'string1'"
        },
        {
            "F1": "[1,2,1]",
            "F2": 2,
            "F3": "'string2'"
        }
    ]
}
'''
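For completeness, here is a minimal end-to-end sketch of the same idea that also writes the file. It is one possible approach, not the only one. The sample values, the helper name frames_to_labelled_json and the output path combined.json are all made up for illustration. It goes through to_json(orient="records") plus json.loads rather than to_dict, on the assumption that you want pandas to handle any NumPy scalar types before the standard json module sees them.

```python
import json

import pandas as pd

# Hypothetical sample data mirroring the question's F1/F2/F3 columns.
train_df = pd.DataFrame({
    "F1": [[0, 0, 0], [1, 2, 1]],
    "F2": [1, 2],
    "F3": ["string1", "string2"],
})
val_df = pd.DataFrame({
    "F1": [[3, 3, 3], [4, 5, 4]],
    "F2": [3, 4],
    "F3": ["string3", "string4"],
})


def frames_to_labelled_json(frames):
    """Map each label to a list of row records built from plain Python types."""
    # to_json serialises each frame as a JSON array of records; json.loads
    # parses that text back into lists/dicts so it can be nested under a label.
    return {label: json.loads(df.to_json(orient="records"))
            for label, df in frames.items()}


combined = frames_to_labelled_json({"training": train_df, "validation": val_df})

# "combined.json" is a placeholder output path.
with open("combined.json", "w") as f:
    json.dump(combined, f, indent=4)
```

If F1 holds real Python lists rather than their string form, each F1 value comes out as a JSON array such as [0, 0, 0] instead of a quoted string.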