I have one dataframe that I need to split into two dataframes.
Example:
Project_Number Indication
S100 X
S100 Y
S200 Z
S300 P
S300 Q
S300 R
S400 S
Now I have to split it in two based on Project_Number. If a Project_Number appears in more than one row, those rows should go into the first dataframe; if it appears only once, that row should go into the second dataframe.
Output:
df1-
Project_Number Indication
S100 X
S100 Y
S300 P
S300 Q
S300 R
df2-
Project_Number Indication
S200 Z
S400 S
CodePudding user response:
Use Series.duplicated with keep=False to flag every row whose value is duplicated:
m = df['Project_Number'].duplicated(keep=False)
df1 = df[m]
df2 = df[~m]
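For reference, here is a minimal self-contained sketch of the same approach run against the sample data from the question. The way the frame is constructed is my own assumption; only the mask and the two selections come from the answer above.

```python
import pandas as pd

# Sample data copied from the question (the construction itself is an assumption).
df = pd.DataFrame({
    "Project_Number": ["S100", "S100", "S200", "S300", "S300", "S300", "S400"],
    "Indication": ["X", "Y", "Z", "P", "Q", "R", "S"],
})

# True for every row whose Project_Number occurs more than once.
m = df["Project_Number"].duplicated(keep=False)

df1 = df[m]   # projects that appear in several rows
df2 = df[~m]  # projects that appear in exactly one row

print(df1)
print(df2)
```

With keep=False every occurrence of a repeated value is marked, so the first S100 and S300 rows land in df1 as well, not only the later repeats.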
CodePudding user response:
You can also do this by combining groupby() with duplicated(keep=False):
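import pandas as pd

# Build the sample frame from the question.
df = pd.DataFrame([x.split(" ") for x in ("""S100 X
S100 Y
S200 Z
S300 P
S300 Q
S300 R
S400 S""").split("\n")], columns="Project_Number,Indication".split(","))

# groupby sorts its keys, so the False group (single rows) comes first
# and the True group (repeated rows) comes second.
(_, df2), (_, df1) = list(df.groupby(df['Project_Number'].duplicated(keep=False)))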
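Note that the tuple unpacking above assumes both groups exist; if every Project_Number is unique (or every one is repeated), groupby yields a single group and the unpacking raises a ValueError. A possible way around that is sketched below. The helper name split_by_duplicates is hypothetical, not something from pandas or from the answer.

```python
import pandas as pd

def split_by_duplicates(df, column):
    """Hypothetical helper: return (rows with repeated values, rows with unique values)."""
    mask = df[column].duplicated(keep=False)
    # Map each group key (True / False) to its sub-frame.
    groups = dict(iter(df.groupby(mask)))
    # Empty frame with the same columns, used when a group is missing.
    empty = df.iloc[0:0]
    return groups.get(True, empty), groups.get(False, empty)

df = pd.DataFrame({
    "Project_Number": ["S100", "S100", "S200", "S300", "S300", "S300", "S400"],
    "Indication": ["X", "Y", "Z", "P", "Q", "R", "S"],
})

df1, df2 = split_by_duplicates(df, "Project_Number")

# Edge case: a frame in which no project repeats.
unique_only = df.drop_duplicates("Project_Number")
multi, single = split_by_duplicates(unique_only, "Project_Number")
```

If you do not need the groupby step at all, the boolean mask from the first answer avoids this edge case entirely.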