Find all occurrences and put them into a pandas column

I have a dataframe in which I want to analyze a string column word by word. E.g., I have the string:

Hose clip 8-12 mm W4 9 mm edge SST left right

To this I wanted to apply the following matcher, to extract only a few values from the description:

def matcher(x):
    for i in attributes:
        if i.lower() in x.lower():
            return i
    else:
        return np.nan

Hence I created

attributes =['left','right']

and called it like this:

df['Colours'] = df['pre_descr'].apply(matcher)

I thought it would give me all occurrences, but it stops after the first match, so I get only

'left'

Then I thought I would split the string on ' ' and store the resulting list in a pandas column, like this:

a = 0
for i in df['pre_descr']:
    df.at[a, 'pre_descr_list']= i.split(' ')
    a += 1

and then iterate over the values and store them if they are in the attributes list. But:

This gives me the error ValueError: Must have equal len keys and value when setting with an iterable
even though I can see the list:

['Hose', 'clip', '8-12', 'mm', 'W4', '9', 'mm', 'edge', 'SST', 'left', 'right']

How would you solve this? I think I have overcomplicated it and it should be easier, but I don't even know how to phrase the problem. Maybe the first step, storing the values in the column as a list, is not even needed? Thanks!

CodePudding user response：

First, it would help to know why you want this list and what it will be used for; that may clarify the intended solution.

But, let's answer your question.

The following code should solve your problem:

attributes = ['left', 'right']

df['pre_descr'].apply(lambda x: [word for word in x.lower().split(" ") if word in attributes])
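For completeness, here is a minimal self-contained sketch of the same idea with the result assigned back to a column. The sample dataframe is made up from the string in the question (the second row is invented to show the no-match case), and the helper name match_all is hypothetical. It also shows that the per-row word list can probably be obtained with Series.str.split() instead of the manual df.at loop, if that list is still wanted.

```python
import pandas as pd

attributes = ['left', 'right']

# Sample data modelled on the question; the second row is invented for illustration.
df = pd.DataFrame({
    'pre_descr': [
        'Hose clip 8-12 mm W4 9 mm edge SST left right',
        'Hose clip 8-12 mm W4 9 mm edge SST',
    ]
})

def match_all(text, wanted=attributes):
    """Return every word of `text` that appears in `wanted` (case-insensitive)."""
    wanted_lower = {w.lower() for w in wanted}
    return [word for word in text.lower().split() if word in wanted_lower]

# apply() stores one list per row, so no manual loop with df.at is needed.
df['Colours'] = df['pre_descr'].apply(match_all)

# Series.str.split() gives the per-row word lists directly, if they are still wanted.
df['pre_descr_list'] = df['pre_descr'].str.split()

print(df[['Colours', 'pre_descr_list']])
```

Note that this matches whole words only, whereas the original matcher used a substring test (`in x.lower()`), so an attribute embedded inside a longer word would no longer match. Rows without any match get an empty list rather than NaN.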