Home > other >  Separating pandas dataframe column data into new new columns
Separating pandas dataframe column data into new new columns

Time:10-05

my data column combines information on two axes: wealth and life stage. I need to break up the two-digit codes by their 'tens'-place and 'ones'-place digits into two new ordinal variables. How would I go about separating and re-encoding this data?

This is the data dictionary for the column:

  • -1: unknown
  • 11: Wealthy Households - Pre-Family Couples & Singles
  • 12: Wealthy Households - Young Couples With Children
  • 13: Wealthy Households - Families With School Age Children
  • 14: Wealthy Households - Older Families & Mature Couples
  • 15: Wealthy Households - Elders In Retirement
  • 21: Prosperous Households - Pre-Family Couples & Singles
  • 22: Prosperous Households - Young Couples With Children
  • 23: Prosperous Households - Families With School Age Children
  • 24: Prosperous Households - Older Families & Mature Couples
  • 25: Prosperous Households - Elders In Retirement
  • 31: Comfortable Households - Pre-Family Couples & Singles
  • 32: Comfortable Households - Young Couples With Children
  • 33: Comfortable Households - Families With School Age Children
  • 34: Comfortable Households - Older Families & Mature Couples
  • 35: Comfortable Households - Elders In Retirement
  • 41: Less Affluent Households - Pre-Family Couples & Singles
  • 42: Less Affluent Households - Young Couples With Children
  • 43: Less Affluent Households - Families With School Age Children
  • 44: Less Affluent Households - Older Families & Mature Couples
  • 45: Less Affluent Households - Elders In Retirement
  • 51: Poorer Households - Pre-Family Couples & Singles
  • 52: Poorer Households - Young Couples With Children
  • 53: Poorer Households - Families With School Age Children
  • 54: Poorer Households - Older Families & Mature Couples
  • 55: Poorer Households - Elders In Retirement

I have tried np.where() and created conditions:

I have used .map():

data2.map({11:'Wealthy', 12:'Wealthy', 13:'Wealthy', 14:'Wealthy', 15:'Wealthy',
#          21:'Prosperous', 22:'Prosperous', 23:'Prosperous', 24:'Prosperous', 25:'Prosperous',
#          31:'Comfortable', 32:'Comfortable', 33:'Comfortable', 34:'Comfortable', 35:'Comfortable',
#          41:'Less_Afflutent', 42:'Less_Afflutent', 43:'Less_Afflutent', 44:'Less_Afflutent', 45:'Less_Afflutent',
#          51:'Poorer', 52:'Poorer', 53:'Poorer', 54:'Poorer', 55:'Poorer'})

I've tried a couple other things as well that didn't seem worth even posting. I'm new and still learning but I've never come across a a problem like this before and I've done SO much googling

CodePudding user response:

If I understand the question correctly, you want to split your current status column into two new columns and then make mappings for each of those.

# Assuming your current column is called 'status'
df['tens_place'] = df['status']//10 # whole number of division, no remainder, gives -1 for -1
df['ones_place'] = df['status'] # just remainder, gives 9 for -1
tens_map = tens_map = {1:'Wealthy Households',2:'Prosperous households',
    3:'Comfortable Households',4:'Less Affluent households' ,5:'Poorer households',
    -1:'unknown'}
ones_map = ... # same sorta thing
df['wealth_status'] = df['tens_place'].map(tens_map)
df['family_status'] = df['ones_place'].map(ones_map)
  • Related