Home > other >  How can I properly format a pandas dataframe into JSON?
How can I properly format a pandas dataframe into JSON?

Time:03-16

I have this function that takes in a JSON, transforms it to a pandas dataframe, does a calculation and attempts to return it in proper json form.

Here's what the function looks like:

def run(data):
    try:
        start_time = datetime.datetime.now()
        ret_columns = ["operat_flight_nbr", "schd_leg_dep_dt", \
        "schd_leg_dep_tm", "dep_airprt_cd", "predictions"]
        df = pd.DataFrame.from_records(json.loads(data)["data"])
        predictions = predict(df)
        df["predictions"] = predictions

        elapsed_time_ms = (datetime.datetime.now() - start_time).total_seconds() * 1000

        return {"data" : json.loads(df[ret_columns].to_json(date_format="iso")), "predictions" : predictions, "elapsed_time_ms" : elapsed_time_ms }

Here's the current undesirable output:

{
    "data": {
        "operat_flight_nbr": {
            "0": 2825,
            "1": 3701
        },
        "schd_leg_dep_dt": {
            "0": "2021-06-04",
            "1": "2021-08-09"
        },
        "schd_leg_dep_tm": {
            "0": "09:41:00",
            "1": "13:03:00"
        },
        "dep_airprt_cd": {
            "0": "CLT",
            "1": "MYT"
        },
        "predictions": {
            "0": 18.1041783139,
            "1": 2.947184921
        }
    },
    "predictions": [
        18.104178313869596,
        2.947184920966057
    ],
    "elapsed_time_ms": 59.61000000000001
}

and here's the desired output:

{
    "data": [
              {
                "operat_flight_nbr": 2825
                "schd_leg_dep_dt": "2021-06-04"
                "schd_leg_dep_tm": "09:41:00"
                "dep_airprt_cd": "CLT",
                "predictions": 18.1041783139
              },
              {
                "operat_flight_nbr": 3701
                "schd_leg_dep_dt": "2021-08-09"
                "schd_leg_dep_tm": "13:03:00"
                "dep_airprt_cd": "MYT",
                "predictions" : 2.947184921
              }
            ],
    "predictions": [
        18.104178313869596,
        2.947184920966057
    ],
    "elapsed_time_ms": 59.61000000000001
}

I think I need to use json.dumps() somewhere, but when I remove the .to_json() function, it does dates in the wrong format.

CodePudding user response:

Add orient='records' to your to_json() call:

    return {"data" : json.loads(df[ret_columns].to_json(date_format="iso", orient="records")), "predictions" : predictions, "elapsed_time_ms" : elapsed_time_ms }
  • Related