Home > OS >  Add column to DataFrame and assign number to each row
Add column to DataFrame and assign number to each row

Time:11-11

I have the following table

Father Son Year
James Harry 1999
James Alfi 2001
Corey Kyle 2003

I would like to add a fourth column that makes the table look like below. It's supposed to show which child of each father was born first, second, third, and so on. How can I do that?

Father Son Year Child
James Harry 1999 1
James Alfi 2001 2
Corey Kyle 2003 1

CodePudding user response:

here is one way to do it. using cumcount

# groupby Father and take a cumcount, offsetted by 1
df['Child']=df.groupby(['Father'])['Son'].cumcount() 1
df

    Father  Son     Year    Child
0   James   Harry   1999    1
1   James   Alfi    2001    2
2   Corey   Kyle    2003    1

it assumes that DF is sorted by Father and Year. if not, then

df['Child']=df.sort_values(['Father','Year']).groupby(['Father'] )['Son'].cumcount() 1
df

CodePudding user response:

Here is an idea of solving this using groupby and cumsum functions. This assumes that the rows are ordered so that the younger sibling is always below their elder brother and all children of the same father are in a continuous pack of rows.

Assume we have the following setup

import pandas as pd
df = pd.DataFrame({'Father': ['James', 'James', 'Corey'], 
                   'Son': ['Harry', 'Alfi', 'Kyle'], 
                   'Year': [1999, 2001, 2003]})

then here is the trick we group the siblings with the same father into a groupby object and then compute the cumulative sum of ones to assign a sequential number to each row.

df['temp_column'] = 1
df['Child'] = df.groupby('Father')['temp_column'].cumsum()
df.drop(columns='temp_column')

The result would look like this

  Father    Son  Year  Child
0  James  Harry  1999      1
1  James   Alfi  2001      2
2  Corey   Kyle  2003      1

Now to make the solution more general consider reordering the rows to satisfy the preconditions before applying the solution and then if necessary restore the dataframe to the original order.

  • Related