Efficiently get a list of booleans that correlates to whether an item exists in a list of dictionaries

I want to add a boolean/binary 'CONDITION' column to my existing data frame, based on whether a predefined value exists among the keys of the dictionary stored in each row's 'DICTIONARY' column. Each dictionary contains simple string-string pairs. I tried to avoid writing my own loop with:

df.loc['KEYWORD' in df['DICTIONARY'], 'CONDITION'] = 1
df.loc['KEYWORD' not in df['DICTIONARY'], 'CONDITION'] = 0

But it gives the error:

KeyError: 'cannot use a single bool to index into setitem'

The same error showed up when I tried:

condition = ('KEYWORD' in (i for i in df['DICTIONARY']))
df.loc[condition, 'CONDITION'] = 1

I also tried this, but it produces a generator that I was unable to use:

condition = ('KEYWORD' in i for i in df['DICTIONARY'].tolist())

CodePudding user response:

If a substring match on the string representation is acceptable, convert the values to strings and then test for substrings with either of:

df['CONDITION'] = df['DICTIONARY'].astype(str).str.contains('KEYWORD').astype(int)

df['CONDITION'] = np.where(df['DICTIONARY'].astype(str).str.contains('KEYWORD'), 1, 0)
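
One thing to keep in mind with the string-based approach: it looks at the whole string representation of each dictionary, so it also matches when the keyword appears only in a value or as part of a longer key. The plain-Python sketch below illustrates the difference; the sample dictionaries are made up for illustration and stand in for the 'DICTIONARY' column.

```python
# Hypothetical rows standing in for the 'DICTIONARY' column
rows = [
    {'KEYWORD': 'a'},        # keyword is a key
    {'other': 'KEYWORD'},    # keyword appears only as a value
    {'KEYWORD_LONG': 'b'},   # keyword is only part of a longer key
    {'other': 'c'},          # keyword absent
]

# Roughly what the string-based test checks, row by row
substring_hits = ['KEYWORD' in str(d) for d in rows]

# What a key-membership test returns for the same rows
key_hits = ['KEYWORD' in d for d in rows]
```

Also note that str.contains interprets its pattern as a regular expression by default; passing regex=False treats the keyword as a literal string.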

Or, depending on the data, test the dictionary keys directly:

df['CONDITION'] = df['DICTIONARY'].map(lambda x: 'KEYWORD' in x).astype(int)

df['CONDITION'] = np.where(df['DICTIONARY'].map(lambda x: 'KEYWORD' in x), 1, 0)
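
For reference, here is a small self-contained sketch of the key-membership approach, which also shows why the attempts in the question collapse to a single bool. The sample frame is hypothetical, and the extra 'CONDITION_LC' column name is introduced only for this illustration.

```python
import pandas as pd

# Hypothetical sample frame; real data would already have the 'DICTIONARY' column
df = pd.DataFrame({'DICTIONARY': [{'KEYWORD': 'a'}, {'other': 'b'}, {}]})

# `in` applied to the Series itself inspects the index labels (0, 1, 2),
# so it yields one bool rather than a per-row mask
single = 'KEYWORD' in df['DICTIONARY']

# Per-row key membership, then cast to 0/1
mask = df['DICTIONARY'].map(lambda d: 'KEYWORD' in d)
df['CONDITION'] = mask.astype(int)

# Equivalent list-comprehension form (the generator from the question, materialized)
df['CONDITION_LC'] = [int('KEYWORD' in d) for d in df['DICTIONARY']]
```

Both forms still loop over the rows in Python, since the cells hold arbitrary dict objects, so neither is vectorized in the NumPy sense. This sketch assumes every cell holds a dict; missing values would need to be handled before the membership test.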