Home > Blockchain >  How to write the content of a json file to an excel file
How to write the content of a json file to an excel file

Time:12-15

I have a bit complicated looking json file that stores a dictionary. Please refer to this image.

Here is the dictionary if you want to code on your local machine.

{'https://www.linkedin.com/in/manashi-sherawat-mathur-phd-3a97b69': 
{'showallexperiences': 
[{'company_url': 'https://www.linkedin.com/company/1612/'}, 
{'company_url': 'https://www.linkedin.com/search/results/all/?keywords=Independent Pharma/Biotech Professional'], 'showalleducation': 
[{'university_url': 'https://www.linkedin.com/company/3555/'}, {'university_url': None}]}, 
'https://www.linkedin.com/in/baneshwar-singh-6143082b/': 
{'showallexperiences': 
[{'company_url': 'https://www.linkedin.com/company/166810/'}, 
{'company_url': 'https://www.linkedin.com/company/166810/'}], 
'showalleducation': 
[{'university_url': 'https://www.linkedin.com/company/6737/'}, {'university_url': 'https://www.linkedin.com/company/5826549/'}]}}}

The main keys of the dictionary are different Linkedin URLs and their values are also dictionaries each storing company and university URLs belonging to that link.

I need to write the content of that dictionary to an excel file.

Particularly, each different Linkedin URL (the main key)'s info should be written on a separate row. I also hope to add some numbering for the columns of different company's and university's urls. So that the final output looks like this.

Can somebody please suggest any solution?

CodePudding user response:

Try:

import pandas as pd

data = {
    "https://www.linkedin.com/in/manashi-sherawat-mathur-phd-3a97b69": {
        "showallexperiences": [
            {"company_url": "https://www.linkedin.com/company/1612/"},
            {
                "company_url": "https://www.linkedin.com/search/results/all/?keywords=Independent Pharma/Biotech Professional"
            },
        ],
        "showalleducation": [
            {"university_url": "https://www.linkedin.com/company/3555/"},
            {"university_url": None},
        ],
    },
    "https://www.linkedin.com/in/baneshwar-singh-6143082b/": {
        "showallexperiences": [
            {"company_url": "https://www.linkedin.com/company/166810/"},
            {"company_url": "https://www.linkedin.com/company/166810/"},
        ],
        "showalleducation": [
            {"university_url": "https://www.linkedin.com/company/6737/"},
            {"university_url": "https://www.linkedin.com/company/5826549/"},
        ],
    },
}

all_data = []
for k, v in data.items():
    company_urls = {
        f"exp{i}_company_url": e["company_url"]
        for i, e in enumerate(v["showallexperiences"], 1)
    }

    edu_urls = {
        f"edu{i}_uni_url": e["university_url"]
        for i, e in enumerate(v["showalleducation"], 1)
    }

    all_data.append({"linkedin_link": k, **company_urls, **edu_urls})

df = pd.DataFrame(all_data)
print(df.to_markdown(index=False))
df.to_csv("data.csv", index=False)

Prints:

linkedin_link exp1_company_url exp2_company_url edu1_uni_url edu2_uni_url
enter image description here

  • Related