not used pandas explode before. I got the gist of the pd.explode but for value lists where selective cols have nested lists I heard that pd.Series.explode is useful. However, i keep getting : "KeyError: "None of ['city'] are in the columns". Yet 'city' is defined in the keys:
keys = ["city", "temp"]
values = [["chicago","london","berlin"], [[32,30,28],[39,40,25],[33,34,35]]]
df = pd.DataFrame({"keys":keys,"values":values})
df2 = df.set_index(['city']).apply(pd.Series.explode).reset_index()
desired output is:
city / temp
chicago / 32
chicago / 30
chicago / 28
etc.
I would appreciate an expert weighing in as to why this throws an error, and a fix, thank you.
CodePudding user response:
The problem comes from how you define df:
df = pd.DataFrame({"keys":keys,"values":values})
This actually gives you the following dataframe:
keys values
0 city [chicago, london, berlin]
1 temp [[32, 30, 28], [39, 40, 25], [33, 34, 35]]
You probably meant:
df = pd.DataFrame(dict(zip(keys, values)))
Which gives you:
city temp
0 chicago [32, 30, 28]
1 london [39, 40, 25]
2 berlin [33, 34, 35]
You can then use explode:
print(df.explode('temp'))
Output:
city temp
0 chicago 32
0 chicago 30
0 chicago 28
1 london 39
1 london 40
1 london 25
2 berlin 33
2 berlin 34
2 berlin 35