Here's the problematic code:
import pandas as pd
Here I create a sample src dictionary containing 3 dataframes:
src = {}
for i in range(1,4):
    src[i] = pd.DataFrame({'a':[i, 2*i, 3*i], 'b':[10*i, 20*i, 30*i], 'c':[100*i, 200*i, 300*i]})
    display(src[i])
Here are the 3 dataframes created in the src dictionary:
   a   b    c
0  1  10  100
1  2  20  200
2  3  30  300

   a   b    c
0  2  20  200
1  4  40  400
2  6  60  600

   a   b    c
0  3  30  300
1  6  60  600
2  9  90  900
Here I want to append the a column from each dataframe in the src dictionary to the a dataframe in the output dictionary, and the b column to the b dataframe.
output = {}
for i in src:
    output['a'] = output['a'].concat([output[i]['a'], src[i][a]], axis = 1)
    output['b'] = output['b'].concat([output[i]['b'], src[i][b]], axis = 1)
I got this error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Input In [30], in <cell line: 11>()
10 # retrieve a column from all source dataframes, put them in a new dataframe. and these new dataframes are in a new dictionary.
11 for i in src:
---> 12 output['a'] = output['a'].concat([output[i]['a'], src[i][a]], axis = 1)
13 output['b'] = output['b'].concat([output[i]['b'], src[i][b]], axis = 1)
KeyError: 'a'
How can I fix it?
CodePudding user response:
The problem is that on the first iteration of the loop there is no key called 'a' (at this point output is an empty dictionary), so looking up output['a'] raises the KeyError. Define the keys when you create output instead of starting with an empty dictionary.
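A minimal sketch of that fix, under a few assumptions of mine: output starts with an empty DataFrame under each key, and each appended column is renamed to its src key so the labels stay unique. It also works around three other things in the question's loop that would fail once the KeyError is gone: a DataFrame has no .concat method (pd.concat is a module-level function), output[i] is never created, and a and b need to be quoted as 'a' and 'b'.

```python
import pandas as pd

# Sample data, same as in the question
src = {}
for i in range(1, 4):
    src[i] = pd.DataFrame({'a': [i, 2*i, 3*i],
                           'b': [10*i, 20*i, 30*i],
                           'c': [100*i, 200*i, 300*i]})

# Define the keys up front so output['a'] and output['b'] exist
# on the first iteration (assumption: start from empty DataFrames)
output = {'a': pd.DataFrame(), 'b': pd.DataFrame()}

for i in src:
    for col in output:
        # Take column `col` from src[i] and rename it to the src key
        # (my choice, so the result has columns 1, 2, 3 instead of
        # three columns that are all called 'a')
        new_col = src[i][col].rename(i)
        # pd.concat with axis=1 places the column next to the existing ones
        output[col] = pd.concat([output[col], new_col], axis=1)

print(output['a'])
print(output['b'])
```

After the loop, output['a'] holds one column per src key, each containing that dataframe's a values, and output['b'] does the same for b.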