While assigning a new column to my DataFrame I get this error. Here is my code:
def check_header(header, df):
    print("Header : ",header)
    for item in header:
        if not item in df.columns:
            df = df.assign(item) #here I'm getting error
    return df[header]
I have checked this post, but it didn't work for me because my pandas version already satisfies its requirement:
>>> import pandas as pd
>>> pd.__version__
'1.1.5'
What is the problem in my code? Please help me.
CodePudding user response:
If you need to add new columns from the header list, with the names that do not match existing columns becoming new columns filled with NaN, use DataFrame.reindex:
import pandas as pd

df = pd.DataFrame(data={"test": ["mkt1", "mkt2", "mkt3"],
                        "test2": ["cty1", "cty2", "cty3"]})

def check_header(header, df):
    # keep only the columns named in header; missing ones are created as NaN
    return df.reindex(header, axis=1)

a = ['test','test1','test3']
print (check_header(a, df))
   test  test1  test3
0  mkt1    NaN    NaN
1  mkt2    NaN    NaN
2  mkt3    NaN    NaN
If you need the same value in every new column, use the fill_value parameter:
def check_header(header, df):
    return df.reindex(header, axis=1, fill_value=0)

a = ['test','test1','test3']
print (check_header(a, df))
   test  test1  test3
0  mkt1      0      0
1  mkt2      0      0
2  mkt3      0      0
If you need different values per new column, use DataFrame.assign with a dictionary whose keys are the new column names:
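import numpy as np

def check_header(header, df):
    # names that are in header but not yet columns of df
    diff = np.setdiff1d(header, df.columns)
    # map each new column name to its value (here, the name itself)
    d = dict(zip(diff, diff))
    print (d)
    # {'test1': 'test1', 'test3': 'test3'}
    return df.assign(**d).reindex(header, axis=1)

a = ['test','test1','test3']
print (check_header(a, df))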
   test  test1  test3
0  mkt1  test1  test3
1  mkt2  test1  test3
2  mkt3  test1  test3
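
As for why the original loop fails: the question does not show the traceback, but DataFrame.assign accepts only keyword arguments (column_name=value), so passing the name positionally as df.assign(item) most likely raises a TypeError. If you would rather keep the loop, a minimal sketch of a fix is to unpack a one-item dictionary. The NaN placeholder is an assumption here, since the question does not say what the new columns should contain:

```python
import numpy as np
import pandas as pd

def check_header(header, df):
    # Add each missing header column, then return the columns in header order.
    for item in header:
        if item not in df.columns:
            # assign() takes keyword arguments only, so unpack a one-item dict.
            # np.nan is an assumed placeholder value for the new column.
            df = df.assign(**{item: np.nan})
    return df[header]

df = pd.DataFrame({"test": ["mkt1", "mkt2", "mkt3"],
                   "test2": ["cty1", "cty2", "cty3"]})

result = check_header(["test", "test1", "test3"], df)
print(result)
```

This gives the same result as the reindex version above, which is shorter and avoids copying the DataFrame on every iteration, so the loop is mainly useful when each new column needs its own logic.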