I would like to transform a JSON with the following input:
{
"039.png" : [
"finding1"
],
"006.png" : [
"finding1",
"finding2"
],
"012.png" : [
"nofinding"
]}
into a data frame with the following layout:
| image | finding1 | finding2 | nofinding |
| ------- | -------- | -------- | --------- |
| 039.png | true | false | false |
| 006.png | true | true | false |
| 012.png | false | false | true |
By now I have tried:
import pandas as pd
import json
from pandas.io.json import json_normalize
with open('file.json') as data_file:
data = json.load(data_file)
df = pd.json_normalize(data)
df = df.transpose()
df.head()
However, this results in:
039.png [finding1]
006.png [finding1, finding2]
012.png [nofinding]
In addition, I would also like to program the return path to be able to create the given JSON format from the desired table format
many thanks in advance
CodePudding user response:
It's straightforward with get_dummies
:
data = {
"039.png" : [
"finding1"
],
"006.png" : [
"finding1",
"finding2"
],
"012.png" : [
"nofinding"
]}
df = pd.DataFrame.from_dict(data, orient='index')
print(pd.get_dummies(df, prefix='', prefix_sep='', dtype=bool))
Output:
finding1 nofinding finding2
039.png True False False
006.png True False True
012.png False True False
CodePudding user response:
Based on your pd.json_normalize
, you can explode
then pd.crosstab
df = pd.json_normalize(data).T.explode(0)
out = pd.crosstab(df.index, df[0]).eq(1).rename_axis('image').reset_index().rename_axis(None,axis=1)
image finding1 finding2 nofinding
0 006.png True True False
1 012.png False False True
2 039.png True False False