I am trying to convert one column (really two) of string categories into integers: 1 for stars, 2 for QSOs, 3 for galaxies that are not AGN (a second column defines that), and 4 for galaxies that are AGN, all saved in a new column of the DataFrame.
for n, i, l in zip(data_clean['class'], data_clean['subClass'], data_clean['nClass']):
    if n == 'STAR':
        l = 1
    elif n == 'QSO':
        l = 2
    elif n == 'GALAXY' and i != 'AGN':
        l = 3
    elif n == 'GALAXY' and i == 'AGN':
        l = 4
Here class is the major category, subClass is where I get the AGN classification, and nClass is the new column where I put the new integer classification. But I get all zeros. What am I doing wrong?
CodePudding user response:
Does this do what you want?
def type_to_value(n, i):
    if n == 'STAR':
        return 1
    elif n == 'QSO':
        return 2
    elif n == 'GALAXY' and i != 'AGN':
        return 3
    elif n == 'GALAXY' and i == 'AGN':
        return 4

data_clean['nClass'] = [type_to_value(n, i) for n, i in zip(data_clean['class'], data_clean['subClass'])]
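If the list comprehension turns out to be slow on a large table, a vectorized variant might be worth trying. This is only a sketch: it assumes NumPy is available alongside pandas, and the small data_clean frame below is made-up sample data standing in for yours. Rows that match none of the conditions get 0 here, which is my choice and not something from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real data_clean
data_clean = pd.DataFrame({
    "class": ["STAR", "QSO", "GALAXY", "GALAXY"],
    "subClass": ["", "", "STARFORMING", "AGN"],
})

is_galaxy = data_clean["class"] == "GALAXY"
is_agn = data_clean["subClass"] == "AGN"

# One boolean mask per category, in the same order as the codes below
conditions = [
    data_clean["class"] == "STAR",
    data_clean["class"] == "QSO",
    is_galaxy & ~is_agn,
    is_galaxy & is_agn,
]
codes = [1, 2, 3, 4]

# Rows matching no condition fall back to 0 (an assumed default)
data_clean["nClass"] = np.select(conditions, codes, default=0)
```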
CodePudding user response:
Is the body of your for loop indented?
Try:
for n, i, l in zip(data_clean['class'], data_clean['subClass'], data_clean['nClass']):
    if n == 'STAR':
        l = 1
    elif n == 'QSO':
        l = 2
    elif n == 'GALAXY' and i != 'AGN':
        l = 3
    elif n == 'GALAXY' and i == 'AGN':
        l = 4
I can't use tab here, so indent the loop body the same way in your own code. Don't copy and paste.
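Note that indentation on its own may not explain the zeros. Inside the loop, l is just a name bound to each value from the column, so assigning to l rebinds that name and never writes back to the DataFrame. A rough sketch of the difference, using a made-up four-row data_clean and assuming nClass starts out as zeros as in the question:

```python
import pandas as pd

# Hypothetical sample data; nClass starts as all zeros
data_clean = pd.DataFrame({
    "class": ["STAR", "QSO", "GALAXY", "GALAXY"],
    "subClass": ["", "", "STARFORMING", "AGN"],
    "nClass": [0, 0, 0, 0],
})

# Rebinding the loop variable l does not touch the column
for n, i, l in zip(data_clean["class"], data_clean["subClass"], data_clean["nClass"]):
    if n == "STAR":
        l = 1
unchanged = data_clean["nClass"].tolist()

# Writing through .loc with the row label does update the DataFrame
for idx, n, i in zip(data_clean.index, data_clean["class"], data_clean["subClass"]):
    if n == "STAR":
        data_clean.loc[idx, "nClass"] = 1
    elif n == "QSO":
        data_clean.loc[idx, "nClass"] = 2
    elif n == "GALAXY" and i != "AGN":
        data_clean.loc[idx, "nClass"] = 3
    elif n == "GALAXY" and i == "AGN":
        data_clean.loc[idx, "nClass"] = 4
```

After the first loop the column is still all zeros; only the second loop changes it.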