This is what my DataFrame, called "emails", looks like (only one row, with columns 'text' and 'POS_Tag'):
print(emails)
I am trying to use apply on my DataFrame, first defining the function as:
def extractGrammar(email):
    tag_count_data = pd.DataFrame(email['POS_Tag'].map(lambda x: Counter(tag[1] for tag in x)).to_list())
    # Print count Part of speech tag needed for Adjective, Adverbs, Nouns and Verbs
    email = pd.concat([email, tag_count_data], axis=1).fillna(0)
    pos_columns = ['PRP','MD','JJ','JJR','JJS','RB','RBR','RBS', 'NN', 'NNS','VB', 'VBS', 'VBG','VBN','VBP','VBZ']
    for pos in pos_columns:
        if pos not in email.columns:
            email[pos] = 0
    email = email[['text'] + pos_columns]
    email['Adjectives'] = email['JJ'] + email['JJR'] + email['JJS']
    email['Adverbs'] = email['RB'] + email['RBR'] + email['RBS']
    email['Nouns'] = email['NN'] + email['NNS']
    email['Verbs'] = email['VB'] + email['VBS'] + email['VBG'] + email['VBN'] + email['VBP'] + email['VBZ']
    return email
I then tried to run the function on my emails with apply(), like this:
emails=emails.apply(extractGrammar, axis = 1)
I have just been getting this error:
AttributeError: 'list' object has no attribute 'map'
I have previously used the exact same block of code from inside 'extractGrammar' on CSV files with multiple rows of emails, but there it ran step by step outside of a function, with no apply involved. I cannot figure out what has gone wrong.
CodePudding user response:
You get that error because when you apply() the extractGrammar() function to your DataFrame with axis = 1, it passes each row of the DataFrame to the function. So when you access the ['POS_Tag'] column inside the function, you do not get the entire Series, but rather the contents of the POS_Tag cell for that row, which is a list. Lists do not have a map method. If you are trying to count the occurrences of the second element of each tuple in the POS_Tag column, you could try the following:
tag_count_data = Counter([x[1] for x in email['POS_Tag']])
This will give you a Counter of the second elements of the tags for that individual row.
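To show how that Counter could fit into a row-wise function, here is a minimal sketch rather than a drop-in replacement. The sample row, the GROUPS dictionary, and the name extract_grammar are made up for illustration; the tag lists are copied from the pos_columns in the question, and 'POS_Tag' is assumed to hold a list of (word, tag) tuples.

```python
from collections import Counter

import pandas as pd

# Hypothetical one-row frame shaped like the question's `emails`
emails = pd.DataFrame({
    "text": ["She quickly reads good books"],
    "POS_Tag": [[("She", "PRP"), ("quickly", "RB"), ("reads", "VBZ"),
                 ("good", "JJ"), ("books", "NNS")]],
})

# Tag groups copied from the question's pos_columns list
GROUPS = {
    "Adjectives": ["JJ", "JJR", "JJS"],
    "Adverbs": ["RB", "RBR", "RBS"],
    "Nouns": ["NN", "NNS"],
    "Verbs": ["VB", "VBS", "VBG", "VBN", "VBP", "VBZ"],
}


def extract_grammar(row):
    # With axis=1, `row` is a Series, so row["POS_Tag"] is a plain list
    counts = Counter(tag for _, tag in row["POS_Tag"])
    # A Counter returns 0 for missing keys, so no fillna step is needed
    return pd.Series({name: sum(counts[t] for t in tags)
                      for name, tags in GROUPS.items()})


# apply returns one Series per row, which pandas assembles into a DataFrame
result = pd.concat([emails[["text"]],
                    emails.apply(extract_grammar, axis=1)], axis=1)
print(result)
```

Returning a Series from the applied function lets apply build the summary columns directly, so the per-tag columns and the loop that fills in missing ones are not needed in this version.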