Home > Software engineering >  pipdeptree json result to csv
pipdeptree json result to csv

Time:10-31

The json result of pipdeptree has form:

[
    {
        "package": {
            "key": "ansible-base",
            "package_name": "ansible-base",
            "installed_version": "2.10.8",
            "required_version": "==2.10.8"
        },
        "dependencies": []
    },
    {
        "package": {
            "key": "botocore",
            "package_name": "botocore",
            "installed_version": "1.27.10",
            "required_version": ">=1.12.36,<2.0a.0"
        },
        "dependencies": [
            {
                "key": "boto3",
                "package_name": "boto3",
                "installed_version": "1.24.10"
            },
            {
                "key": "s3transfer",
                "package_name": "s3transfer",
                "installed_version": "0.6.0"
            }
        ]
    }
    
]

I would like to save it as csv with columns: package.key,package.package_name,package.installed_version,package.required_version,dependencies

and in dependencies column to list the keys only (so for the second package, the value for dependencies should be boto3; s3transfer.

I've tried to simply flatten the json with pandas.normalize_json but it doesn't handle the dependencies array correctly. What is the easiest way to do this?

CodePudding user response:

Try:

import pandas as pd


lst = [
    {
        "package": {
            "key": "ansible-base",
            "package_name": "ansible-base",
            "installed_version": "2.10.8",
            "required_version": "==2.10.8",
        },
        "dependencies": [],
    },
    {
        "package": {
            "key": "botocore",
            "package_name": "botocore",
            "installed_version": "1.27.10",
            "required_version": ">=1.12.36,<2.0a.0",
        },
        "dependencies": [
            {
                "key": "boto3",
                "package_name": "boto3",
                "installed_version": "1.24.10",
            },
            {
                "key": "s3transfer",
                "package_name": "s3transfer",
                "installed_version": "0.6.0",
            },
        ],
    },
]

out = []
for d in lst:
    out.append({})
    for k, v in d["package"].items():
        out[-1][f"package.{k}"] = v
    out[-1]["dependencies"] = "; ".join(v["key"] for v in d["dependencies"])

df = pd.DataFrame(out)
print(df)
df.to_csv("data.csv", index=False)

Prints:

    package.key package.package_name package.installed_version package.required_version       dependencies
0  ansible-base         ansible-base                    2.10.8                 ==2.10.8                   
1      botocore             botocore                   1.27.10        >=1.12.36,<2.0a.0  boto3; s3transfer

and saves data.csv:

package.key,package.package_name,package.installed_version,package.required_version,dependencies
ansible-base,ansible-base,2.10.8,==2.10.8,
botocore,botocore,1.27.10,">=1.12.36,<2.0a.0",boto3; s3transfer
  • Related