factorizing on a slice of a df

Time:08-29

I'm trying to give strings numerical representations, so I'm using pandas' factorize.

For example: Toyota = 1, Safeway = 2, Starbucks = 3.

Currently it looks like (and this works):

# Create easy unique IDs for subscription names, i.e. 1, 2, 3, 4, 5, etc.
df['SUBS_GROUP_ID'] = pd.factorize(df['SUBSCRIPTION_NAME'])[0] + 1

However, I only want to factorize subscription names where SUBS_GROUP_ID is null. So my thought was to grab all the null rows, then run the factorize function on them.

mask_to_grab_nulls = df['SUBS_GROUP_ID'].isnull()

df[mask_to_grab_nulls]['SUBS_GROUP_ID'] = pd.factorize(df[mask_to_grab_nulls]['SUBSCRIPTION_NAME'])[0] + 1

This runs, but does not change any values. Any ideas on how to solve this?

CodePudding user response:

You can use [image of the suggested code; not preserved]
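A minimal sketch of one common fix, not necessarily what the answer originally showed: the assignment in the question goes through chained indexing (`df[mask][col] = ...`), which writes to a temporary copy of the selected rows, so `df` itself never changes. Selecting rows and column in a single `df.loc[mask, col]` call writes to the original frame. The sample data below is made up for illustration; only the column names and `mask_to_grab_nulls` come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two rows already have an ID, three do not.
df = pd.DataFrame({
    "SUBSCRIPTION_NAME": ["Toyota", "Safeway", "Starbucks", "Safeway", "Toyota"],
    "SUBS_GROUP_ID": [10.0, np.nan, np.nan, np.nan, 10.0],
})

mask_to_grab_nulls = df["SUBS_GROUP_ID"].isnull()

# Factorize only the names in the null rows; codes are 0-based, so add 1.
new_ids = pd.factorize(df.loc[mask_to_grab_nulls, "SUBSCRIPTION_NAME"])[0] + 1

# A single .loc call with the mask and the column label assigns in place.
df.loc[mask_to_grab_nulls, "SUBS_GROUP_ID"] = new_ids

print(df)
```

Note that the new IDs start again at 1, so they can collide with IDs that already exist in the column. If that matters, one option is to add an offset such as the current maximum of `SUBS_GROUP_ID` before assigning.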
