Home > Software design >  How to split some text to another column from pandas dataframe
How to split some text to another column from pandas dataframe

Time:11-07

Let's say I have below dataframe

          ID                      EQ Type
0      P-456  BSD500,BSD300,GHF20e,GHF20g
1      P-460         BSD500,BSD300,GHF20e
2      P-462         GHF20e,GHF20g,GHF20h

I want to take some text of EQ Type's row then insert it to another column, the expected result is as below

          ID             EQ Type               EQ Type_1
0      P-456       BSD500,BSD300           GHF20e,GHF20g
1      P-460       BSD500,BSD300                  GHF20e
2      P-462                None    GHF20e,GHF20g,GHF20h

Is it possible to do that? Because each row has a different amount of EQ

I have tried this

df[["EQ Type", "EQ Type 1"]] = df["EQ Type"].str.split("GHF20",expand=True)

but it doesn't work.

Thanks.

CodePudding user response:

I hope I've understood you correctly, you want to extract every GHF20* value into a separate column:

df = df.assign(tmp=df["EQ Type"].str.split(",")).explode("tmp")
df[""] = np.where(df["tmp"].str.startswith("GHF20"), "EQ Type_1", "EQ Type")

print(
    df.pivot_table(
        index=[df.index, "ID"],
        columns="",
        values="tmp",
        aggfunc=",".join,
    )
    .droplevel(0)
    .reset_index()
)

Prints:

      ID        EQ Type             EQ Type_1
0  P-456  BSD500,BSD300         GHF20e,GHF20g
1  P-460  BSD500,BSD300                GHF20e
2  P-462            NaN  GHF20e,GHF20g,GHF20h
  • Related