Home > OS >  Pandas add a row having particular data after a set of common elements
Pandas add a row having particular data after a set of common elements

Time:03-08

I have a dataframe in python pandas having this structure

A    B      C      D
cat   Jack  Tail  short
cat   Jack  Tail  Medium
cat   Jack  Tail  Long
cat   Na    eye  blue
cat   Mik  Tail  short
cat   Mik  Tail  Medium
cat   Moon  Tail  short
dog   Min   Tail  short
dog   Spoon   Tail  short
dog   Spoon   Tail  medium
dog   Spoon   Tail  Long

How should I do it in python pandas to obtain the mentioned new and enriched dataframe? Now I would like to punt a new row after each element having column "C" as discriminant (so if Column C is equal to Tail) with a "D" value equal to "Other". Then my desired output should be:

A    B      C      D
cat   Jack  Tail  short
cat   Jack  Tail  Medium
cat   Jack  Tail  Long
cat   Jack  Tail  Other
cat   Na    eye  blue
cat   Mik  Tail  short
cat   Mik  Tail  Medium
cat   Mik  Tail  Other
cat   Moon  Tail  short
cat   Moon  Tail  other
dog   Min   Tail  short
dog   Min   Tail  Other
dog   Spoon   Tail  short
dog   Spoon   Tail  medium
dog   Spoon   Tail  Long
dog   Spoon   Tail  Long

CodePudding user response:

You can generate a new dataframe with only the new rows and only when column C is "Tail", then append, then sort:

df_other = df[["A", "B", "C"]].query("C == 'Tail'").drop_duplicates().assign(D="Other")

pd.concat([df, df_other]).sort_values(["A", "B"])

CodePudding user response:

You could use groupby concat:

(df.groupby(['A', 'B', 'C'], as_index=False, group_keys=False)
   .apply(lambda g: pd.concat([g, g.iloc[-1:].assign(D='Other')]))
   #.reset_index(drop=True) # uncomment to have a fresh index
)

output:

     A      B     C       D
0  cat   Jack  Tail   short
1  cat   Jack  Tail  Medium
2  cat   Jack  Tail    Long
2  cat   Jack  Tail   Other
3  cat    Mik  Tail   short
4  cat    Mik  Tail  Medium
4  cat    Mik  Tail   Other
5  cat   Moon  Tail   short
5  cat   Moon  Tail   Other
6  dog    Min  Tail   short
6  dog    Min  Tail   Other
7  dog  Spoon  Tail   short
8  dog  Spoon  Tail  medium
9  dog  Spoon  Tail    Long
9  dog  Spoon  Tail   Other

Only add "Other" if the group only has "Tail" in C:

(df.groupby(['A', 'B'], as_index=False, group_keys=False, sort=False)
   .apply(lambda g: pd.concat([g, g.iloc[-1:].assign(D='Other')])
                    if set(g['C']) == {'Tail'} else g
         )
)

output:

      A      B     C       D
0   cat   Jack  Tail   short
1   cat   Jack  Tail  Medium
2   cat   Jack  Tail    Long
2   cat   Jack  Tail   Other
3   cat     Na   eye    blue
4   cat    Mik  Tail   short
5   cat    Mik  Tail  Medium
5   cat    Mik  Tail   Other
6   cat   Moon  Tail   short
6   cat   Moon  Tail   Other
7   dog    Min  Tail   short
7   dog    Min  Tail   Other
8   dog  Spoon  Tail   short
9   dog  Spoon  Tail  medium
10  dog  Spoon  Tail    Long
10  dog  Spoon  Tail   Other
  • Related