Home > other >  Remove duplicates rows but change value on certain column if condition is met
Remove duplicates rows but change value on certain column if condition is met

Time:06-30

So I have a .csv file with thousands of rows that have duplicates area names in column A and "Completed" values on column B (which can be "Completed" or "In Progress" in the same area).

Area Completed
Chicago In Progress
Chicago Completed
Chicago In Progress
Chicago In Progress
San Francisco Completed
San Francisco Completed
San Francisco Completed
San Francisco Completed
Los Angeles In Progress
Los Angeles In Progress
Los Angeles In Progress
Los Angeles In Progress

I need to make it so that the end product is the following

Area Completed
Chicago Particularly Completed
San Francisco Completed
Los Angeles In Progress

The idea is to remove the duplicate area values and have the column B be determined by the original values with the following methodology:

  • if all of the values in an area are "Completed" then column B is Completed
  • if all of the values in an area are "In Progress" then column B is "In Progress"
  • if one area contains values "In Progress" and "Completed" then the column B is Particularly Completed

So far I've thought about using a python script for this, but want to know if doing this would be possible with just excel as well?

CodePudding user response:

Formula I have used in D2 cell

=UNIQUE(A2:A13)

Then in E2 cell

=IF(COUNTA(UNIQUE(FILTER($B$2:$B$13,$A$2:$A$13=D2)))>1,"Particularly Complete",UNIQUE(FILTER($B$2:$B$13,$A$2:$A$13=D2)))

and drag down till need.

enter image description here

CodePudding user response:

A correct answer has already been indicated. Anyway, if one doesn't have the UNIQUE or the FILTER function, here is another solution:

A B C D E
Area Completed
Chicago In Progress =IF(COUNTIF($A$2:A2,A2)>1,"",A2) =IFERROR(INDEX(C:C,AGGREGATE(15,6,ROW(C:C)/(C:C<>""),ROW(D2)-ROW($D$2) 1)),"") =IF(D2="","",IF(COUNTIF(A:A,D2)=COUNTIFS(A:A,D2,B:B,"Completed"),"Completed",IF(COUNTIF(A:A,D2)=COUNTIFS(A:A,D2,B:B,"In Progress"),"In Progress",IF(AND(COUNTIFS(A:A,D2,B:B,"Completed")>0,COUNTIFS(A:A,D2,B:B,"In Progress")>0),"Particularly Completed","Unpredicted Outcome"))))
Chicago Completed
Chicago In Progress
Chicago In Progress
San Francisco Completed
San Francisco Completed
San Francisco Completed
San Francisco Completed
Los Angeles In Progress
Los Angeles In Progress
Los Angeles In Progress
Los Angeles In Progress

Formulas in column C, D and E has to be dragged down. Column C filters the Area list leaving only unique areas. Column D sort the list of unique areas of column C. Column E gives the actual response. Once could easily obtain a list of unique areas with little work (which could be better if the list is really long), so the "real deal" is just the formula in column E which work on said list.

  • Related