Instead of doing this:
df['A'] = df['A'] if 'A' in df else None
df['B'] = df['B'] if 'B' in df else None
df['C'] = df['C'] if 'C' in df else None
df['D'] = df['D'] if 'D' in df else None
...
I want to do this in one line or function. Below is what I tried:
def populate_columns(df):
    col_names = ['A', 'B', 'C', 'D', 'E', 'F', ...]
    def populate_column(df, col_name):
        df[col_name] = df[col_name] if col_name in df else None
        return df[col_name]
    df[col_name] = df.apply(lambda x: populate_column(x) for x in col_names)
    return df
But I just get "Exception has occurred: ValueError". What can I do here?
CodePudding user response:
Looks like you can replace your whole code with a reindex:
ensure_cols = ['A', 'B', 'C', 'D']
df = df.reindex(columns=df.columns.union(ensure_cols))
NB. By default the fill value is NaN. If you really want None, try fill_value=None and check the result, since pandas may treat None there as "use the default" and still give you NaN.
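One thing to watch with the union approach: Index.union sorts the result when it can, so the column order may change. If that matters, here is a minimal sketch that keeps the existing order and appends only the missing names. The helper name ensure_columns and the sample frame are made up for illustration, not part of the question.

```python
import pandas as pd

def ensure_columns(df, ensure_cols):
    """Return a new frame that has every column in ensure_cols (hypothetical helper)."""
    # Keep the existing column order and append only the names that are missing.
    missing = [c for c in ensure_cols if c not in df.columns]
    # reindex creates the missing columns and fills them with NaN by default.
    return df.reindex(columns=[*df.columns, *missing])

# Sample data, assumed for the example
df = pd.DataFrame({'B': [1, 2], 'Z': [3, 4]})
out = ensure_columns(df, ['A', 'B', 'C', 'D'])
print(list(out.columns))
```

Columns that already exist keep their values; only the new ones are filled with NaN.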
If you want to fix your code, just use a single loop:
col_names = ['A', 'B', 'C', 'D']
for c in col_names:
    if c not in df:
        df[c] = None
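Since you asked for a function, the same loop can be wrapped up like this. This is only a sketch: it reuses the name populate_columns from the question, takes the column list as a parameter (my assumption), and modifies the frame in place.

```python
import pandas as pd

def populate_columns(df, col_names):
    # Add each missing column in place, filled with None.
    # Columns that already exist are left untouched.
    for c in col_names:
        if c not in df:
            df[c] = None
    return df

# Sample data, assumed for the example
df = pd.DataFrame({'A': [1, 2]})
populate_columns(df, ['A', 'B', 'C', 'D'])
print(list(df.columns))
```

Unlike reindex, this changes df itself, so there is no need to reassign the result.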