Let's say I have below dataframe
ID EQ Type
0 P-456 BSD500,BSD300,GHF20e,GHF20g
1 P-460 BSD500,BSD300,GHF20e
2 P-462 GHF20e,GHF20g,GHF20h
I want to take some text of EQ Type's row then insert it to another column, the expected result is as below
ID EQ Type EQ Type_1
0 P-456 BSD500,BSD300 GHF20e,GHF20g
1 P-460 BSD500,BSD300 GHF20e
2 P-462 None GHF20e,GHF20g,GHF20h
Is it possible to do that? Because each row has a different amount of EQ
I have tried this
df[["EQ Type", "EQ Type 1"]] = df["EQ Type"].str.split("GHF20",expand=True)
but it doesn't work.
Thanks.
CodePudding user response:
I hope I've understood you correctly, you want to extract every GHF20*
value into a separate column:
df = df.assign(tmp=df["EQ Type"].str.split(",")).explode("tmp")
df[""] = np.where(df["tmp"].str.startswith("GHF20"), "EQ Type_1", "EQ Type")
print(
df.pivot_table(
index=[df.index, "ID"],
columns="",
values="tmp",
aggfunc=",".join,
)
.droplevel(0)
.reset_index()
)
Prints:
ID EQ Type EQ Type_1
0 P-456 BSD500,BSD300 GHF20e,GHF20g
1 P-460 BSD500,BSD300 GHF20e
2 P-462 NaN GHF20e,GHF20g,GHF20h