I am a beginner. I am trying to build a recommendation system based on the skills that a particular candidate has. I want to iterate through each of the skills and store 1 if the candidate has a particular skill and 0 if not.
candidate_skillsDF['primary_skills'] = candidate_skillsDF['primary_skills'].str.split(',')
for index, row in candidate_skillsDF.iterrows():
    for skill in row['primary_skills']:
        candidate_skillsDF.at[index, 'primary_skills'] = 1
# Filling in the NaN values with 0 to show that a candidate doesn't have that column's skill
# candidates_skillsDF = candidate_skillsDF.fillna(0)
candidate_skillsDF.head()
I keep getting this error:
TypeError                                 Traceback (most recent call last)
Input In [40], in <cell line: 5>()
      3 candidate_skillsDF['primary_skills']
      5 for index, row in candidate_skillsDF.iterrows():
----> 6     for skill in row['primary_skills']:
      7         candidate_skillsDF.at[index, 'primary_skills'] = 1
      9 #Filling in the NaN values with 0 to show that a candidate doesn't have that column's skill
     10 # candidates_skillsDF = candidate_skillsDF.fillna(0)
TypeError: 'float' object is not iterable
I have tried using range() to iterate through the skills, and I have tried using len(). Can someone tell me what I'm doing wrong?
CodePudding user response:
row['primary_skills'] is always the contents of a single cell, because you're iterating through the rows (0, 1, 2, ..., n) and looking only at the 'primary_skills' column. The error shows that at least one cell in that column holds a float rather than a list. A common cause is a missing value: .str.split(',') leaves NaN cells as NaN, and NaN is a float. Are you sure every row of this column has a value? The code you wrote assumes that the contents of each cell can be iterated over, and a single number cannot be.
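To make the point concrete, here is a minimal sketch of how a float can end up in a column that was just split. The sample data is made up, and I am assuming the column is a plain object column with one missing value; your real data may differ. It shows that .str.split(',') produces lists for strings but leaves the missing cell as NaN, and that an isinstance check lets the inner loop skip such cells.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the NaN stands in for a candidate with no skills listed.
# dtype=object is forced so the example behaves the same across pandas versions.
candidate_skillsDF = pd.DataFrame(
    {"primary_skills": pd.Series(["python,sql", np.nan, "java"], dtype=object)}
)

# Strings become lists; the missing value stays NaN, which is a float.
split_skills = candidate_skillsDF["primary_skills"].str.split(",")
cell_types = [type(value).__name__ for value in split_skills]
print(cell_types)

# Guard the inner loop so that only list cells are iterated.
seen_skills = []
for index, skills in split_skills.items():
    if isinstance(skills, list):
        for skill in skills:
            seen_skills.append((index, skill))
print(seen_skills)
```

If the printed types include anything other than list, those are the rows that trigger the TypeError in the nested loop.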
CodePudding user response:
After
for index, row in candidate_skillsDF.iterrows():
row['primary_skills'] is the single value from column candidate_skillsDF['primary_skills'] at position index. It is not guaranteed to be an array or list that a nested for loop can run over. I would iterate through the row itself:
for skill in row:
and then add an if statement to check whether the value equals the particular skill you are looking for (particular_skill is a placeholder here):
for skill in row:
    if skill == particular_skill:
        ...  # record the match here, e.g. store 1 for this candidate
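If the end goal is one 0/1 column per skill, a loop may not be needed at all. The sketch below is one possible approach, not necessarily what the asker's tutorial intends: it uses pandas' Series.str.get_dummies on the original comma-separated strings (before any .str.split call). The sample data and the "candidate" column are made up, and I am assuming the skills are separated by commas with no surrounding spaces.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "candidate" is an illustrative column, not from the question.
candidate_skillsDF = pd.DataFrame(
    {
        "candidate": ["A", "B", "C"],
        "primary_skills": pd.Series(
            ["python,sql", np.nan, "java,python"], dtype=object
        ),
    }
)

# One indicator column per distinct skill; a missing value yields a row of zeros.
skill_flags = candidate_skillsDF["primary_skills"].str.get_dummies(sep=",")

# Attach the indicator columns to the candidate identifiers.
result = candidate_skillsDF[["candidate"]].join(skill_flags)
print(result)
```

Because missing values already come out as all-zero rows, no separate fillna(0) step is needed with this approach.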