Replace columns in a pandas DataFrame with the longest string in each row

Time:10-09

I had a DataFrame with a column containing the names of players. With the following code I split it into the DataFrame you see in the picture:

df = df.name.str.split(expand=True)
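To illustrate what this split produces, here is a minimal sketch with made-up player names (the names and the variables `raw` and `parts` are assumptions, not the asker's data). Rows with fewer words than the longest name are padded with missing values:

```python
import pandas as pd

# Hypothetical input; the real data is not shown in the question.
raw = pd.DataFrame({"name": ["Lionel Messi",
                             "Cristiano Ronaldo dos Santos",
                             "Pele"]})

# Split on whitespace; expand=True spreads the pieces over columns 0..n-1.
parts = raw["name"].str.split(expand=True)

print(parts.shape)               # one column per word of the longest name
print(parts.isna().sum(axis=1))  # number of missing cells in each row
```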

[Image: DataFrame df]

Now I want to find the longest string in each row and put it into a new column. I hope I have explained my problem clearly. Thanks for any help :)

CodePudding user response:

You can apply a function along the row axis (axis=1) that drops the NaN values and then takes the maximum, passing len as the key parameter:

df['new_column'] = df.apply(lambda x: max(x.dropna(), key=len), axis=1)
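As a rough, self-contained check of this approach, the sketch below applies it to hypothetical names (the data and the column name `longest` are assumptions). `max` returns the first of several equally long strings, and the `default=""` argument is added here only to guard against rows that are entirely missing:

```python
import pandas as pd

# Hypothetical names, split into one word per column (shorter rows get padded).
names = pd.Series(["Lionel Messi", "Cristiano Ronaldo dos Santos", "Pele"])
parts = names.str.split(expand=True)

# Per row: drop the missing cells, then pick the string with the greatest len().
longest = parts.apply(lambda row: max(row.dropna(), key=len, default=""), axis=1)

parts["longest"] = longest
print(parts)
```

If all rows are known to contain at least one string, the `default` argument can be left out, as in the answer's one-liner.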

CodePudding user response:

You can stack the DataFrame into a Series and, for each original row (index level 0), select the entry with the greatest string length:

s = df.stack()
df['new'] = s.loc[s.str.len().groupby(level=0).idxmax()].droplevel(1)

example:

     0    1     2   3   new
0  ABC    D  EFGH      EFGH
1    A  BCD   EFG   H   BCD
2    A   BC   DEF  GH   DEF

used input:

import pandas as pd

df = pd.DataFrame([['ABC', 'D', 'EFGH', ''],
                   ['A', 'BCD', 'EFG', 'H'],
                   ['A', 'BC', 'DEF', 'GH'],
                   ])
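The stacking answer can be unpacked into intermediate steps. The sketch below reuses the same input; the intermediate names `lengths` and `idx` are introduced here for illustration only:

```python
import pandas as pd

df = pd.DataFrame([['ABC', 'D', 'EFGH', ''],
                   ['A', 'BCD', 'EFG', 'H'],
                   ['A', 'BC', 'DEF', 'GH']])

# Long format: one value per (row, column) pair in a MultiIndex Series.
s = df.stack()

# Length of every string, still indexed by (row, column).
lengths = s.str.len()

# For each original row (level 0), the (row, column) label of the longest string.
idx = lengths.groupby(level=0).idxmax()

# Look those labels up, then drop the column level so the index matches df.
df['new'] = s.loc[idx.tolist()].droplevel(1)
print(df)
```

When two strings in a row have the same length, `idxmax` keeps the first one, which is why row 1 yields `BCD` rather than `EFG`.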