Group values and ordinally label groups Pandas

Time:03-22

I have a column, house_id, that contains repeated house IDs. I want to group identical house IDs and give each group an ordinal label in a new column, family_label.

My original data looks like this

df_original = pd.DataFrame({'house_id':['112', '119', '913', '514', '112', '119', '119']})

house_id
112
119
913
514
112
119
119

My target result looks like the below dataframe

df_result = pd.DataFrame({'house_id':['112', '119', '913', '514', '112', '119', '119'], 'family_label':['family1', 'family2', 'family3', 'family4', 'family1', 'family2', 'family2']})

house_id     family_label
112          family1
119          family2
913          family3
514          family4
112          family1
119          family2
119          family2

So far this is what I have achieved.

I used this code

df_original['label'] = df_original.groupby(df_original.house_id).grouper.group_info[0] + 1

it generates the below output

house_id  label
112        1
119        2
913        3
514        4
112        1
119        2
119        2

I want to know if my approach is correct and I want to add the word 'family' before each number.

CodePudding user response:

You can use a list comprehension to prepend the string "family" to each group number, for example:

df_original['label'] = ["family" + str(x) for x in (df_original.groupby(df_original.house_id).grouper.group_info[0] + 1)]

Outputting:

  house_id    label
0      112  family1
1      119  family2
2      913  family4
3      514  family3
4      112  family1
5      119  family2
6      119  family2

CodePudding user response:

Use GroupBy.ngroup:

df_original['label'] = "family" + (df_original.groupby('house_id').ngroup() + 1).astype(str)
print(df_original)
  house_id    label
0      112  family1
1      119  family2
2      913  family4
3      514  family3
4      112  family1
5      119  family2
6      119  family2
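
Note that the outputs above number the groups in sorted key order, so 913 becomes family4 and 514 becomes family3, while the target in the question numbers them in order of first appearance (913 as family3, 514 as family4). A minimal sketch of two ways to get first-appearance order, assuming that is the intended behaviour: pass sort=False to groupby, or use pd.factorize. The column name family_label is taken from the question; the variable names codes and alt are only illustrative.

```python
import pandas as pd

df_original = pd.DataFrame(
    {'house_id': ['112', '119', '913', '514', '112', '119', '119']}
)

# Option 1: sort=False makes ngroup() number groups in order of first appearance
codes = df_original.groupby('house_id', sort=False).ngroup() + 1
df_original['family_label'] = "family" + codes.astype(str)

# Option 2: pd.factorize also assigns integer codes in order of first appearance
alt = pd.Series(pd.factorize(df_original['house_id'])[0] + 1,
                index=df_original.index)
df_original['family_label_alt'] = "family" + alt.astype(str)

print(df_original)
```

Both options avoid the grouper.group_info attribute, which is an internal detail of the GroupBy object rather than a documented way to get group numbers.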