Home > Enterprise >  Google Sheets - Identify Duplicate Cells and Replace Cell Value App Script
Google Sheets - Identify Duplicate Cells and Replace Cell Value App Script

Time:08-12

Is there a way to identify duplicate cells and replace the duplicate's cell value to blank or clear it but not the entire row, only one column value. In my example below. I want to identify duplicates if Fruits and Item has the same value. In this case. Lime with item A and Apple with Item B are duplicates, and I want to replace Item duplicate value to blank.

Fruits Price Item
Lime 1 A
Apple 2 B
Lime 3 A
Apple 2 C
Apple 4 B

I've got this code but it deletes the entire row. I only want to compare column A and C and should remove cell value in Column C and keep the first instance and not delete the row.

function removeDuplicates() {
var sheet = SpreadsheetApp.getActiveSheet();
var rng = sheet.getRange("A2:C")
var data = rng.getValues();
var newData = new Array();
for(i in data){
var row = data[i];
var duplicate = false;
for(j in newData){
  if(row.join() == newData[j].join()){
    duplicate = true;
  }
}
if(!duplicate){
  newData.push(row);
}
}
rng.clearContent();
sheet.getRange(2, 1, newData.length, 
newData[0].length).setValues(newData);
}

Here's the result that I want.

Fruits Price Item
Lime 1 A
Apple 2 B
Lime 3
Apple 2 C
Apple 4

CodePudding user response:

you can use conditional formatting where you set font color to white:

=INDEX(COUNTIFS(A$2:A&C$2:C, A2&C2, ROW(A$2:A), "<="&ROW(A2)))>1

enter image description here

CodePudding user response:

I believe your goal is as follows.

  • You want to achieve your showing sample input and output situations using Google Apps Script.

In this case, how about the following modification?

Modification points:

  • In your script, by row.join() == newData[j].join(), all values of columns "A", "B" and "C" are used. By this, the duplicated values of only columns "A" and "C" cannot be checked.
  • And, in order to search the duplicated values, a loop is included in a loop. By this, I'm worried that the process cost will become high.
  • In the case of var rng = sheet.getRange("A2:C"), all rows of the sheet are retrieved. In this case, the process cost will become high. So, you can use var rng = sheet.getRange("A2:C" sheet.getLastRow()). By this, the data range is retrieved.

When these points are reflected in your script, it becomes as follows.

Modified script 1:

In this pattern, your showing script is modified.

function removeDuplicates() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var rng = sheet.getRange("A2:C"   sheet.getLastRow()); // Modified
  var data = rng.getValues();
  var newData = new Array();
  for (i in data) {
    var row = data[i];
    var duplicate = false;
    for (j in newData) {
      if ([row[0]   row[2]].join() == [newData[j][0]   newData[j][2]].join()) { // Modified
        duplicate = true;
      }
    }
    if (!duplicate) {
      newData.push(row);
    } else {
      newData.push([row[0], row[1], null]); // Added
    }
  }
  rng.clearContent().setValues(newData); // Modified
}

Modified script 2:

In this pattern, only the duplicated cell values of column "C" are removed.

function removeDuplicates() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getRange("A2:C"   sheet.getLastRow()).getValues();
  var { ranges } = data.reduce((o, [a, , c], i) => {
    var key = a   c;
    if (o.obj[key]) {
      o.ranges.push(`C${i   2}`);
    } else {
      o.obj[key] = true;
    }
    return o;
  }, { ranges: [], obj: {} });
  sheet.getRangeList(ranges).clearContent();
}
  • When this script is run for your sample input situation, your sample output situation can be retrieved.

References:

  • Related