The problem in that: I have two Excel sheets. Need to replace values in [something, like, some, false, values] in first sheet by values from second. The tables can contain not equal dimensions
For example, need that dc_result from dc1 and dc2, by merging on='n':
import pandas as pd
import numpy as np
dc1 = {
'n':[1, 2, 3, 4],
'a':[0, 5, 0.0, 'A'],
'b':[0,'', False, 'B', 'C'],
}
dc2 = {
'n':[1, 2, 3],
'a':[1, 0, 3],
'b':[6, 5, 8],
'c':[6, 5],
}
# df1 = pd.DataFrame.from_dict(dc1)
# df2 = pd.DataFrame.from_dict(dc2)
# df = df1.merge(df2, how = 'outer', on='n')
false_list = [False, 0, 0.0, 'Nan', np.nan, None, '','Null']
dc_result = {
'n':[1, 2, 3, 4],
'a':[1, 5, 3, 'A'],
'b':[6, 5, 8, 'B', 'C'],
'c':[6, 5],
}
# or
dc_result_2 = {
'n':[1, 2, 3, 4, ''],
'a':[1, 5, 3, 'A', ''],
'b':[6, 5, 8, 'B', 'C'],
'c':[6, 5, '', '', ''],
}
print(dc_result)
any panda's way to do so?
CodePudding user response:
Since pandas doesn't like columns with unequal length, this can instead be done using a python dictionary/list comprehension
dc_result = {k:[dc1[k][i] if dc1[k][i] not in false_list else dc2[k][i]
for i in range(len(dc1[k]))] for k in dc1}