I have an Excel file (data.xlsx) with a column that contains JSON data. How can I read every value of that column in Python?
CodePudding user response:
from ast import literal_eval
import pandas as pd
df_raw = pd.read_excel("data.xlsx")
data_raw = df_raw.loc[:, "data"]
data_raw = data_raw.apply(literal_eval)
data_raw = data_raw.tolist()
df = pd.DataFrame(data_raw)
Basically, I use literal_eval to convert all the JSON strings in the "data" column to Python dicts, then I convert the list of Python dicts to a DataFrame.
Let me know if it works for you!
You can also use the loads function from the json package instead of literal_eval: json.loads is specific to the JSON format (it understands true, false, and null, which literal_eval rejects), whereas literal_eval is better if you have a generic dictionary or other Python data structures stored in the Excel file.
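To make the difference concrete, here is a small sketch that parses the same kind of text with both functions. The sample strings are made up for illustration and are not taken from the asker's file.

```python
import json
from ast import literal_eval

# Valid JSON: double quotes, lowercase true, and null (sample values are invented).
json_text = '{"name": "Bob", "active": true, "manager": null}'

# json.loads maps true -> True and null -> None.
parsed = json.loads(json_text)

# literal_eval treats the text as a Python literal, so the bare names
# true and null make it fail.
try:
    literal_eval(json_text)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False

# A Python literal: single quotes and a tuple, neither of which is valid JSON.
python_text = "{'name': 'Bob', 'langs': ('English',)}"
from_python = literal_eval(python_text)

try:
    json.loads(python_text)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

print(parsed, literal_eval_ok)
print(from_python, json_ok)
```

So if the column was written by something like json.dumps, json.loads is the safer choice; if it was written with str() on a Python dict, literal_eval is the one that will parse it.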
CodePudding user response:
import json
import pandas as pd
df = pd.read_excel("data.xlsx")
# Convert strings to dictionaries
df["data"] = df["data"].apply(json.loads)
# Explode dictionaries column into multiple columns
df2 = pd.DataFrame(df["data"].tolist())
# Join ID column with data columns
df = df[["id"]].join(df2)
print(df)
# Output:
#    id name languages
# 0   1  Bob   English
# 1   2  Bob   English
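One thing that may trip this up: if some cells in the "data" column are empty, pandas reads them as NaN and json.loads raises a TypeError on them. Below is a minimal sketch of one way to handle that, assuming you want empty cells to become rows of missing values. The in-memory DataFrame stands in for data.xlsx so the snippet runs on its own, and parse_cell is a hypothetical helper name, not part of pandas or json.

```python
import json
import pandas as pd

# Stand-in for pd.read_excel("data.xlsx"); the values are invented for illustration.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "data": [
        '{"name": "Bob", "languages": "English"}',
        None,  # an empty Excel cell would show up as a missing value
        '{"name": "Ann", "languages": "French"}',
    ],
})

def parse_cell(cell):
    """Hypothetical helper: parse a JSON string, or return {} for an empty cell."""
    if pd.isna(cell):
        return {}
    return json.loads(cell)

# Parse every cell, then expand the dicts into columns aligned on the original index.
parsed = df["data"].apply(parse_cell)
expanded = pd.DataFrame(parsed.tolist(), index=df.index)

# Join the ID column with the expanded data columns.
result = df[["id"]].join(expanded)
print(result)
```

Passing index=df.index keeps the rows aligned for the join even if the original DataFrame was filtered and no longer has a plain 0..n-1 index.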