How to split a dataframe into 2 based on a particular column value?

Time:10-05

I have one dataframe that I need to split into 2 dataframes.

Example:

Project_Number      Indication    
S100                 X
S100                 Y
S200                 Z
S300                 P
S300                 Q
S300                 R
S400                 S

Now I have to split it into 2 based on Project_Number. If a Project_Number appears in more than one row, those rows should go into the 1st dataframe; if it appears in only one row, that row should go into the 2nd dataframe.

Output:

df1-

Project_Number     Indication
S100                 X
S100                 Y
S300                 P
S300                 Q
S300                 R

df2-

Project_Number     Indication
S200                 Z
S400                 S

CodePudding user response:

Use Series.duplicated with keep=False, which marks every row whose Project_Number is repeated (not just the later occurrences):

m = df['Project_Number'].duplicated(keep=False)

df1 = df[m]
df2 = df[~m]
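
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question has been loaded into a DataFrame named df with the two columns shown there; the construction of df is my own and not part of the original answer.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration
df = pd.DataFrame({
    "Project_Number": ["S100", "S100", "S200", "S300", "S300", "S300", "S400"],
    "Indication": ["X", "Y", "Z", "P", "Q", "R", "S"],
})

# True for every row whose Project_Number occurs more than once
m = df["Project_Number"].duplicated(keep=False)

df1 = df[m]    # rows of projects that appear several times
df2 = df[~m]   # rows of projects that appear exactly once

print(df1)
print(df2)
```

Both results keep the original index labels; call reset_index(drop=True) on each if you want them renumbered from 0.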

CodePudding user response:

You can also do this by combining groupby() with duplicated():

import pandas as pd

df = pd.DataFrame([x.split("                 ") for x in ("""S100                 X
S100                 Y
S200                 Z
S300                 P
S300                 Q
S300                 R
S400                 S""").split("\n")], columns="Project_Number,Indication".split(","))

# groupby sorts the boolean keys, so the False group (single rows) comes first
(_, df2), (_, df1) = df.groupby(df['Project_Number'].duplicated(keep=False))
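
One caveat: unpacking the groupby result into two pairs only works when both groups exist. If every Project_Number is unique (or every one is repeated), groupby yields a single group and the unpacking raises a ValueError. A possible way around that, sketched below, is to count the rows per project and filter on the count. The helper name split_by_repeats is hypothetical and only used for illustration.

```python
import pandas as pd


def split_by_repeats(df, column):
    """Return (repeated, single): rows whose `column` value occurs
    more than once, and rows whose value occurs exactly once."""
    # Number of rows sharing each row's value, aligned with df's index
    counts = df.groupby(column)[column].transform("size")
    return df[counts > 1], df[counts == 1]


# Sample data from the question
df = pd.DataFrame({
    "Project_Number": ["S100", "S100", "S200", "S300", "S300", "S300", "S400"],
    "Indication": ["X", "Y", "Z", "P", "Q", "R", "S"],
})

df1, df2 = split_by_repeats(df, "Project_Number")

# Also works when one side is empty: here no value is repeated
only_unique = pd.DataFrame({"Project_Number": ["A", "B"], "Indication": ["X", "Y"]})
empty_part, all_rows = split_by_repeats(only_unique, "Project_Number")
```

Filtering on the count always returns two dataframes, one of which may simply be empty.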