I have a DataFrame (df) and a list (l) of column names. df:
Col_A | Col_B | Col_D | Col_G |
---|---|---|---|
AA | 12 | Q | no |
BB | 23 | W | yes |
WW | 44 | | yes |
l = ['Col_A', 'Col_B', 'Col_C', 'Col_D', 'Col_E', 'Col_F', 'Col_G']
I would like to print the column names that are not present in the df.
Desired output:
['Col_C', 'Col_E', 'Col_F']
What I tried so far:
if l not in df.columns:
    print(l)
I get the error: TypeError: unhashable type: 'list'
CodePudding user response:
You can use a list comprehension for this:
[i for i in l if i not in df.columns]
This goes through every element in l (i) and, if it is not in the columns of df, adds it to a new list. Output:
['Col_C', 'Col_E', 'Col_F']
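For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question. The row values are copied from the table above; the empty Col_D cell in the last row is an assumption, and the variable name missing is only illustrative.

```python
import pandas as pd

# Sample data mirroring the question's table.
# Assumption: the last row's Col_D value is missing, so None is used here.
df = pd.DataFrame({
    "Col_A": ["AA", "BB", "WW"],
    "Col_B": [12, 23, 44],
    "Col_D": ["Q", "W", None],
    "Col_G": ["no", "yes", "yes"],
})
l = ["Col_A", "Col_B", "Col_C", "Col_D", "Col_E", "Col_F", "Col_G"]

# `name not in df.columns` checks one label at a time against the column Index.
missing = [name for name in l if name not in df.columns]
print(missing)  # ['Col_C', 'Col_E', 'Col_F']
```

The result keeps the order in which the names appear in l.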
CodePudding user response:
Use numpy.setdiff1d:
import numpy as np
L = np.setdiff1d(l, df.columns).tolist()
Or Index.difference:
L = pd.Index(l).difference(df.columns).tolist()
Or a list comprehension with not in:
L = [x for x in l if x not in df.columns]
print(L)
['Col_C', 'Col_E', 'Col_F']
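The three options give the same result for the list in the question, but they can differ in ordering. A small sketch to illustrate, using a hypothetical empty frame and a deliberately unsorted list of names (neither is from the question): setdiff1d and Index.difference with its default sort setting return sorted results, while the list comprehension keeps the original order.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs chosen only to make the ordering difference visible.
df = pd.DataFrame(columns=["Col_A", "Col_G"])
names = ["Col_F", "Col_A", "Col_C"]

# Set-style difference: the result comes back sorted and de-duplicated.
by_setdiff = np.setdiff1d(names, df.columns).tolist()

# Index.difference sorts by default; pass sort=False to keep the input order.
by_index = pd.Index(names).difference(df.columns).tolist()

# The list comprehension keeps the order of `names`.
by_listcomp = [x for x in names if x not in df.columns]

print(by_setdiff)   # ['Col_C', 'Col_F']
print(by_index)     # ['Col_C', 'Col_F']
print(by_listcomp)  # ['Col_F', 'Col_C']
```

If the order of the input list matters, the list comprehension is probably the safest choice.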
CodePudding user response:
You have to loop over the list l, like this:
for item in l:
    if item not in df.columns:
        print(item)
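As a rough sketch of why the loop is needed: a membership test on df.columns looks up a single label, and a list cannot be hashed as one label, which is where the TypeError in the question comes from. The frame below only reuses the column names from the question (no rows), and the names error and missing are illustrative. The loop also collects the results into a list to match the desired output instead of printing each name.

```python
import pandas as pd

# Column names taken from the question; the rows are omitted for brevity.
df = pd.DataFrame(columns=["Col_A", "Col_B", "Col_D", "Col_G"])
l = ["Col_A", "Col_B", "Col_C", "Col_D", "Col_E", "Col_F", "Col_G"]

# Testing the whole list at once fails because a list is not hashable.
try:
    l not in df.columns
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Testing one label at a time works; collect the names that are absent.
missing = []
for item in l:
    if item not in df.columns:
        missing.append(item)
print(missing)  # ['Col_C', 'Col_E', 'Col_F']
```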