Remove duplicates rows but change value on certain column if condition is met-CodePudding

So I have a .csv file with thousands of rows that have duplicates area names in column A and "Completed" values on column B (which can be "Completed" or "In Progress" in the same area).

Area	Completed
Chicago	In Progress
Chicago	Completed
Chicago	In Progress
Chicago	In Progress
San Francisco	Completed
San Francisco	Completed
San Francisco	Completed
San Francisco	Completed
Los Angeles	In Progress
Los Angeles	In Progress
Los Angeles	In Progress
Los Angeles	In Progress

I need to make it so that the end product is the following

Area	Completed
Chicago	Particularly Completed
San Francisco	Completed
Los Angeles	In Progress

The idea is to remove the duplicate area values and have the column B be determined by the original values with the following methodology:

if all of the values in an area are "Completed" then column B is Completed
if all of the values in an area are "In Progress" then column B is "In Progress"
if one area contains values "In Progress" and "Completed" then the column B is Particularly Completed

So far I've thought about using a python script for this, but want to know if doing this would be possible with just excel as well?

CodePudding user response：

Formula I have used in D2 cell

=UNIQUE(A2:A13)

Then in E2 cell

=IF(COUNTA(UNIQUE(FILTER($B$2:$B$13,$A$2:$A$13=D2)))>1,"Particularly Complete",UNIQUE(FILTER($B$2:$B$13,$A$2:$A$13=D2)))

and drag down till need.

CodePudding user response：

A correct answer has already been indicated. Anyway, if one doesn't have the UNIQUE or the FILTER function, here is another solution:

A	B	C	D	E
Area	Completed
Chicago	In Progress	=IF(COUNTIF($A$2:A2,A2)>1,"",A2)	=IFERROR(INDEX(C:C,AGGREGATE(15,6,ROW(C:C)/(C:C<>""),ROW(D2)-ROW($D$2) 1)),"")	=IF(D2="","",IF(COUNTIF(A:A,D2)=COUNTIFS(A:A,D2,B:B,"Completed"),"Completed",IF(COUNTIF(A:A,D2)=COUNTIFS(A:A,D2,B:B,"In Progress"),"In Progress",IF(AND(COUNTIFS(A:A,D2,B:B,"Completed")>0,COUNTIFS(A:A,D2,B:B,"In Progress")>0),"Particularly Completed","Unpredicted Outcome"))))
Chicago	Completed
Chicago	In Progress
Chicago	In Progress
San Francisco	Completed
San Francisco	Completed
San Francisco	Completed
San Francisco	Completed
Los Angeles	In Progress
Los Angeles	In Progress
Los Angeles	In Progress
Los Angeles	In Progress

Formulas in column C, D and E has to be dragged down. Column C filters the Area list leaving only unique areas. Column D sort the list of unique areas of column C. Column E gives the actual response. Once could easily obtain a list of unique areas with little work (which could be better if the list is really long), so the "real deal" is just the formula in column E which work on said list.