Filling data in one column if there are matching values in another column

Time:11-29

I have a DataFrame with parent/child items, and I need to assign each parent's time to all of its child items. The time is only listed on the row where the parent matches the child, and I need that time to populate every row for that parent.

This is a simple example.

import pandas as pd

data = {
     'Parent' : ['a123', 'a123', 'a123', 'a123', 'a234', 'a234', 'a234', 'a234'],
     'Child' : ['a123', 'a1231', 'a1232', 'a1233', 'a2341', 'a234', 'a2342', 'a2343'],
     'Time' : [51, 0, 0, 0, 0, 39, 0, 0],
}
df = pd.DataFrame(data)

The expected results are:

results = {
     'Parent' : ['a123', 'a123', 'a123', 'a123', 'a234', 'a234', 'a234', 'a234'],
     'Child' : ['a123', 'a1231', 'a1232', 'a1233', 'a2341', 'a234', 'a2342', 'a2343'],
     'Time' : [51, 51, 51, 51, 39, 39, 39, 39],
}

Seems like it should be easy, but I can't wrap my head around where to start.

CodePudding user response:

If the parent's time is always the largest value in its group (for example, a positive number while the children hold 0 or null), a simple groupby.transform('max') is enough:

df['Time'] = df.groupby('Parent')['Time'].transform('max')
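As a rough, self-contained sketch of the line above, the snippet below rebuilds the question's data and applies transform('max'). The second frame (`neg`, with made-up ids `b1`, `b11`, `b12`) is a hypothetical counterexample added only for illustration: when the parent's time is negative, the group maximum is a child's 0 rather than the parent's value.

```python
import pandas as pd

df = pd.DataFrame({
    'Parent': ['a123', 'a123', 'a123', 'a123', 'a234', 'a234', 'a234', 'a234'],
    'Child':  ['a123', 'a1231', 'a1232', 'a1233', 'a2341', 'a234', 'a2342', 'a2343'],
    'Time':   [51, 0, 0, 0, 0, 39, 0, 0],
})

# Broadcast each group's maximum Time to every row of that group.
filled = df.groupby('Parent')['Time'].transform('max')
print(filled.tolist())      # [51, 51, 51, 51, 39, 39, 39, 39]

# Hypothetical counterexample (not from the question): the parent's time
# is negative, so 'max' picks a child's 0 instead of the parent's -4.
neg = pd.DataFrame({
    'Parent': ['b1', 'b1', 'b1'],
    'Child':  ['b1', 'b11', 'b12'],
    'Time':   [-4, 0, 0],
})
neg_filled = neg.groupby('Parent')['Time'].transform('max')
print(neg_filled.tolist())  # [0, 0, 0]
```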

Otherwise, keep Time only on the row where Parent equals Child and broadcast that value to the rest of the group:

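df['Time'] = (df['Time']
 # keep Time only where Parent == Child; every other row becomes NaN
 .where(df['Parent'].eq(df['Child']))
 # take the first non-null value per parent and repeat it on every row
 .groupby(df['Parent']).transform('first')
 # the NaN masking produced floats; convert back to a nullable integer dtype
 .convert_dtypes()
)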

Output:

  Parent  Child  Time
0   a123   a123    51
1   a123  a1231    51
2   a123  a1232    51
3   a123  a1233    51
4   a234  a2341    39
5   a234   a234    39
6   a234  a2342    39
7   a234  a2343    39
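To see how the where/groupby approach behaves outside the question's data, here is a small sketch on a hypothetical frame (the ids `p1`, `p1a`, `p2a` and so on are invented for illustration). It assumes at most one row per group has Parent equal to Child. Group `p1` has a negative parent time and a child with a larger value; group `p2` has no parent row at all.

```python
import pandas as pd

# Hypothetical data, not from the question.
df = pd.DataFrame({
    'Parent': ['p1', 'p1', 'p1', 'p2', 'p2'],
    'Child':  ['p1', 'p1a', 'p1b', 'p2a', 'p2b'],
    'Time':   [-5, 0, 7, 0, 3],
})

out = (df['Time']
       # mask every row that is not the parent's own row
       .where(df['Parent'].eq(df['Child']))
       # first non-null value per parent, repeated on each row of the group
       .groupby(df['Parent']).transform('first')
       # restore an integer dtype that can also hold missing values
       .convert_dtypes())

print(out.tolist())
```

In this sketch the `p1` rows all receive -5, even though a child holds 7, and the `p2` rows come back as missing because there is no parent row to take a time from. Whether missing is the right result for such groups depends on the data, so treat that as something to check rather than a guarantee.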