How to remove substring from column row if substring isn't part of dictionary key value?

Time:10-25

I have this simplified DataFrame:

A B
foo A, B, C, D

I create a dictionary d:

d = {'foo': 'A, B, C'}

The dictionary keys are in column A and their values are in column B. How can I remove from column B any substrings that aren't part of the dictionary value for that row's key?

Desired DataFrame:

A B
foo A, B, C

CodePudding user response:

If you need to compare the values after splitting them on ', ', use:

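import pandas as pd

df = pd.DataFrame({'A': ['foo'], 'B': ['A, B, C, D']})
d = {'foo': 'A, B, C'}

# keep only the items of B that also appear in the mapped dictionary value
f = lambda x: ', '.join(y for y in x.B.split(', ') if y in x.A.split(', '))
# temporarily replace A by its dictionary value, then filter B row by row
df['B'] = df.assign(A=df['A'].map(d)).apply(f, axis=1)
print(df)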
     A        B
0  foo  A, B, C
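
The lambda above raises an error for any row whose key is missing from d, because map then yields NaN and NaN has no split method. A slightly more defensive sketch of the same filtering idea follows. It assumes the separator is always ', ' and that rows without a dictionary entry should be left unchanged (the question does not say what should happen to them). The helper name keep_allowed and the extra 'bar' row are made up for illustration.

```python
import pandas as pd

d = {'foo': 'A, B, C'}
df = pd.DataFrame({'A': ['foo', 'bar'], 'B': ['A, B, C, D', 'X, Y, Z']})

def keep_allowed(key, value, sep=', '):
    # Hypothetical helper: keep only the items of `value` listed under d[key].
    if key not in d:
        return value  # assumption: rows without a dictionary entry stay as-is
    allowed = set(d[key].split(sep))
    return sep.join(item for item in value.split(sep) if item in allowed)

# Filter column B row by row, using column A as the dictionary key.
df['B'] = [keep_allowed(k, v) for k, v in zip(df['A'], df['B'])]
print(df)
```

Using a set for the allowed items keeps the membership test cheap when the lists are long, and the order of the surviving items follows column B.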

CodePudding user response:

I may have misunderstood, but if you want to remove:

any substrings that aren't part of my dictionary key value?

This probably means you only want to keep the values from your dictionary.

Suppose you have the DataFrame below:

>>> df
     A           B
0  foo  A, B, C, D
1  bar     X, Y, Z

Update your values from your dict:

df.update(pd.DataFrame(d.items(), columns=df.columns))

Output result:

>>> df
     A        B
0  foo  A, B, C
1  bar  X, Y, Z
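
Note that DataFrame.update aligns on the index, not on the values in column A, so the snippet above relies on 'foo' being the first row of df. A sketch that looks up each row's key instead is shown below. It assumes rows with no dictionary entry should keep their current B, and the row order is deliberately swapped to show that position no longer matters.

```python
import pandas as pd

d = {'foo': 'A, B, C'}
df = pd.DataFrame({'A': ['bar', 'foo'], 'B': ['X, Y, Z', 'A, B, C, D']})

# Look up each key in d; rows without an entry become NaN after map
# and are filled back in with their original B value.
df['B'] = df['A'].map(d).fillna(df['B'])
print(df)
```

Like the update approach, this overwrites B with the dictionary value rather than filtering it, so it only fits if every item in the dictionary value should end up in B.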