AttributeError: 'list' object has no attribute 'map' when using .apply() on a DataFrame

This is what my DataFrame, called "emails", looks like (only one row, with columns 'text' and 'POS_Tag'):

print(emails)

I am trying to use apply on my DataFrame, having first defined the function as:

from collections import Counter

import pandas as pd

def extractGrammar(email):
    # Count each part-of-speech tag (second element of every (word, tag) tuple)
    tag_count_data = pd.DataFrame(email['POS_Tag'].map(lambda x: Counter(tag[1] for tag in x)).to_list())

    # Attach the tag counts needed for adjectives, adverbs, nouns and verbs
    email = pd.concat([email, tag_count_data], axis=1).fillna(0)

    pos_columns = ['PRP', 'MD', 'JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS', 'NN', 'NNS', 'VB', 'VBS', 'VBG', 'VBN', 'VBP', 'VBZ']
    # Make sure every expected tag column exists, even if the tag never occurred
    for pos in pos_columns:
        if pos not in email.columns:
            email[pos] = 0

    email = email[['text'] + pos_columns]

    email['Adjectives'] = email['JJ'] + email['JJR'] + email['JJS']
    email['Adverbs'] = email['RB'] + email['RBR'] + email['RBS']
    email['Nouns'] = email['NN'] + email['NNS']
    email['Verbs'] = email['VB'] + email['VBS'] + email['VBG'] + email['VBN'] + email['VBP'] + email['VBZ']

    return email

I then tried to run it on emails with the apply() function, like this:

emails = emails.apply(extractGrammar, axis=1)

I have just been getting this error: AttributeError: 'list' object has no attribute 'map'

I have previously used the exact same block of code from the body of 'extractGrammar' on CSV files with multiple rows of emails. There it ran step by step at the top level of a script, outside any function and without apply. I cannot figure out what has gone wrong.

CodePudding user response：

You get that error because when you apply() the extractGrammar() function to your DataFrame with axis=1, pandas passes each row of the DataFrame to the function as a Series. When you then access email['POS_Tag'], you do not get the entire column back, only the contents of the POS_Tag cell for that row, which is a list. Lists do not have a map method. If you are trying to count the occurrences of the second element of each tuple in the POS_Tag column, you could try the following:

tag_count_data = Counter([x[1] for x in email['POS_Tag']])

This will give you a Counter of the second elements of the tags for that individual row.
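To make the row-wise behaviour concrete, here is a minimal sketch. It assumes a one-row frame whose contents are made up for illustration, and the helper name count_tags is hypothetical rather than taken from the question. It first confirms that the POS_Tag cell arrives as a plain list, then returns the counts as a Series so that apply() can assemble them into columns:

```python
from collections import Counter

import pandas as pd

# Hypothetical one-row frame standing in for the asker's `emails`;
# each POS_Tag cell holds a list of (word, tag) tuples.
emails = pd.DataFrame({
    "text": ["She runs quickly"],
    "POS_Tag": [[("She", "PRP"), ("runs", "VBZ"), ("quickly", "RB")]],
})

# With axis=1 the function receives one row as a Series,
# so row["POS_Tag"] is the cell value: a plain Python list.
cell_types = emails.apply(lambda row: type(row["POS_Tag"]).__name__, axis=1)


def count_tags(row):
    """Return this row's tag counts as a Series (one entry per tag)."""
    return pd.Series(Counter(tag for _, tag in row["POS_Tag"]), dtype="int64")


# When the function returns a Series, apply() builds a DataFrame of counts;
# tags missing from a given row become NaN, hence the fillna(0).
tag_counts = emails.apply(count_tags, axis=1).fillna(0).astype(int)
result = pd.concat([emails[["text"]], tag_counts], axis=1)
print(result)
```

Alternatively, since the body of extractGrammar() was written against a whole DataFrame, calling it directly as emails = extractGrammar(emails), without apply, may be all that is needed here.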