Cast the two first words of a column to another column in the same dataframe

Time:09-30

I have a data frame like this:

import pandas as pd
df1 = pd.DataFrame({'ID': ['T1002, T5006, T5007, Stay home', 'Go for walk, T5007, T5007, Stay home']})

                ID
0   T1002, T5006, T5007, Stay home
1   Go for walk, T5007, T5007, Stay home

I want to take the first two comma-separated items from each row, join them with an underscore, and store the result in a new column

Expected outcome:

    New_id               ID
0   T1002_T5006,         T1002, T5006, T5007, Stay home
1   Go for walk_T5007,   Go for walk, T5007, T5007, Stay home

I tried this but it did not work:

df1['New_id']= df1["ID"].str.split(',').str.join(sep=" ")

Any ideas?

CodePudding user response:

Considering that the dataframe df looks like this

df = pd.DataFrame({'ID': ['T1002, T5006, T5007, Stay home', 'Go for walk, T5007, T5007, Stay home']})

[Out]:
                                     ID
0        T1002, T5006, T5007, Stay home
1  Go for walk, T5007, T5007, Stay home

Then the following will do the job:

df['New_id'] = df['ID'].str.split(',').str[:2].str.join('_')

[Out]:
                                     ID              New_id
0        T1002, T5006, T5007, Stay home        T1002_ T5006
1  Go for walk, T5007, T5007, Stay home  Go for walk_ T5007

Notes:

  • df['ID'] selects the column ID from the dataframe df

  • .str.split(',') splits the string by the comma

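  • .str[:2] takes the first two items of each split list

  • .str.join('_') joins those items with an underscore between them. Because the split is on ',' alone, the second item keeps its leading space, which is why the result reads 'T1002_ T5006'. One could instead use .str.join('') and, with that, the output would be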

                                         ID             New_id
    0        T1002, T5006, T5007, Stay home        T1002 T5006
    1  Go for walk, T5007, T5007, Stay home  Go for walk T5007
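
The steps in the notes above can be traced one at a time. The following is only a sketch: the intermediate names parts and first_two and the extra column New_id_clean are made up for illustration. It also shows one possible way to avoid the space after the underscore, assuming the items are always separated by a comma followed by exactly one space.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['T1002, T5006, T5007, Stay home',
                          'Go for walk, T5007, T5007, Stay home']})

# Step 1: split each string on the comma; every row becomes a list of strings.
parts = df['ID'].str.split(',')

# Step 2: keep the first two items of each list.
# The second item still carries the space that followed the comma.
first_two = parts.str[:2]

# Step 3: join the two items with an underscore.
df['New_id'] = first_two.str.join('_')

# Variant (assumption: separator is always a comma plus one space):
# splitting on ', ' leaves no leading space in the joined result.
df['New_id_clean'] = df['ID'].str.split(', ').str[:2].str.join('_')

print(df[['New_id', 'New_id_clean']])
```

If the spacing around the commas is irregular, stripping each item before joining is the safer choice.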
    

CodePudding user response:

Try stripping the whitespace from each item before joining:

df["New_id"] = df['ID'].map(lambda x: '_'.join([i.strip() for i in x.split(',')[:2]]))

    ID                                      New_id
0   T1002, T5006, T5007, Stay home          T1002_T5006
1   Go for walk, T5007, T5007, Stay home    Go for walk_T5007
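
For readability, the same idea can be written with a named function instead of a lambda. This is a sketch only: the helper name first_two_items and its sep and joiner parameters are hypothetical, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['T1002, T5006, T5007, Stay home',
                          'Go for walk, T5007, T5007, Stay home']})

def first_two_items(text, sep=',', joiner='_'):
    """Hypothetical helper: join the first two sep-separated items of text."""
    # Split, keep at most two items, and strip surrounding whitespace from each.
    items = [item.strip() for item in text.split(sep)[:2]]
    return joiner.join(items)

# Apply the helper to every value in the ID column.
df['New_id'] = df['ID'].map(first_two_items)

print(df)
```

Note that this approach calls .split on each value directly, so it assumes every entry in ID is a string; a missing value (NaN) would raise an AttributeError.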