Split string and assign to two columns using Pandas assign method

Time:10-16

Using the following DataFrame, I can split the "ccy" string and create two new columns:

import pandas as pd

df_so = pd.DataFrame.from_dict({0: 'gbp_usd',
                                1: 'eur_usd',
                                2: 'usd_cad',
                                3: 'usd_jpy',
                                4: 'eur_usd',
                                5: 'eur_usd'}, orient='index', columns=["ccy"])

# Pass n by keyword: pandas 2.0+ no longer accepts it positionally.
df_so[['base_ccy', 'quote_ccy']] = df_so['ccy'].str.split('_', n=1, expand=True)

This gives the following DataFrame:

       ccy base_ccy quote_ccy
0  gbp_usd      gbp       usd
1  eur_usd      eur       usd
2  usd_cad      usd       cad
3  usd_jpy      usd       jpy
4  eur_usd      eur       usd
5  eur_usd      eur       usd

How do I do the same str.split using DataFrame.assign within my tweak function below?

I can get the same result with a list comprehension, but is there a simpler, cleaner way using assign?

def tweak_df(df_):
    # Each lambda splits "ccy" once on "_" and keeps one piece per row.
    return df_.assign(
        base_currency=lambda df_: [i[0] for i in df_['ccy'].str.split('_', n=1)],
        quote_currency=lambda df_: [i[1] for i in df_['ccy'].str.split('_', n=1)],
    )

tweak_df(df_so)

This yields the same values as the table above, but the code is not very intuitive, and simple is better than complex.

CodePudding user response:

A possible solution:

df_so.assign(**tweak_df(df_so))

Output:

       ccy base_ccy quote_ccy base_currency quote_currency
0  gbp_usd      gbp       usd           gbp            usd
1  eur_usd      eur       usd           eur            usd
2  usd_cad      usd       cad           usd            cad
3  usd_jpy      usd       jpy           usd            jpy
4  eur_usd      eur       usd           eur            usd
5  eur_usd      eur       usd           eur            usd
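
As a side note on why unpacking a DataFrame with ** works at all, here is a minimal sketch. The small frames and the names parts, unpacked and out are made up for illustration and are not from the question. As far as I understand it, ** only needs keys() and item lookup, and a DataFrame provides both, so each column is passed to assign as a keyword argument. The column labels have to be strings for this to work, so integer labels such as 0 and 1 would need renaming first.

```python
import pandas as pd

# Hypothetical two-row frame holding the already-split pieces.
parts = pd.DataFrame({"base_ccy": ["gbp", "eur"], "quote_ccy": ["usd", "usd"]})

# ** calls keys() and then looks up each key, so a DataFrame
# unpacks into {column name: column Series}.
unpacked = dict(**parts)

# The same unpacking feeds each column to assign as a keyword argument.
df = pd.DataFrame({"ccy": ["gbp_usd", "eur_usd"]})
out = df.assign(**parts)
```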

CodePudding user response:

I actually think the first version you suggested is the best.

df_so[['base_ccy', 'quote_ccy']] = df_so['ccy'].str.split('_', n=1, expand=True)

If you want to do it using assign, you can do it like this, using rename to give the split columns string names:

df_so.assign(**df_so['ccy'].str.split('_', n=1, expand=True)
             .rename(columns={0: "base_ccy", 1: "quote_ccy"}))
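
To tie this back to the tweak function in the question, here is a minimal sketch of two ways it might look. The first keeps one lambda per column but swaps the list comprehension for the .str[i] accessor; the second moves the rename-and-unpack idea inside the function. The name tweak_df_unpack is hypothetical, and the DataFrame is rebuilt in a shorter form only so the snippet runs on its own.

```python
import pandas as pd

df_so = pd.DataFrame(
    {"ccy": ["gbp_usd", "eur_usd", "usd_cad", "usd_jpy", "eur_usd", "eur_usd"]}
)

def tweak_df(df_):
    # Split once on "_", then pick each piece with the .str[i] accessor.
    return df_.assign(
        base_currency=lambda d: d["ccy"].str.split("_", n=1).str[0],
        quote_currency=lambda d: d["ccy"].str.split("_", n=1).str[1],
    )

def tweak_df_unpack(df_):
    # Split into two columns, rename the integer labels 0 and 1 to
    # strings, then unpack the columns into assign as keyword arguments.
    parts = (df_["ccy"].str.split("_", n=1, expand=True)
             .rename(columns={0: "base_currency", 1: "quote_currency"}))
    return df_.assign(**parts)

result = tweak_df(df_so)
result_unpack = tweak_df_unpack(df_so)
```

The second version splits the string only once, which may matter on a large frame; the first is arguably easier to read inside a longer assign chain.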