Home > OS >  How to group data in rows regardless of the order of the data in the rows and add group names to eac
How to group data in rows regardless of the order of the data in the rows and add group names to eac

Time:09-29

Column A contains texts. Each text has values in the neighboring cells. Now I want to check for each row whether the other rows also contain these values. Restriction: At least three values must match and regardless of the order in the row. The rows, which belong together because of at least 3 common values, should be grouped. A group name should be assigned to them dynamically. An example table might look like this: enter image description here INPUT

Word Value01 Value02 Value03 Value04
banana A B C D
sweet banana L N O J
cheap banana B D A F
best banana D C A H
banana banana N L I O
how to make banana bread O A I K
I love banana B C A D
banana bread I O A C

EXPECTED OUTPUT

Word Value01 Value02 Value03 Value04 group name
banana A B C D banana group 1
sweet banana L N O J banana group 2
cheap banana B D A F banana group 1
best banana D C A H banana group 1
banana banana N L I O banana group 2
how to make banana bread O A I K banana group 3
I love banana B C A D banana group 1
banana bread I O A C banana group 3

I tried the following formula:

=ArrayFormula(IF(B1:E1=B1:E8,"banana group 1","banana group 2"))

But the problems are, only the exact match is recognized and I am limited to two group names. Here is a link to the example table: Example Table

CodePudding user response:

I thought I would take a look at all similarities. But you could get only three if you change just one line. I will point it out below:

function sortgroups() {
  const ss = SpreadsheetApp.getActive();
  const sh = ss.getSheetByName("Sheet0");
  const vs = sh.getRange("A2:E9").getValues();
  let obj = { pA: [] };
  vs.forEach((r, i) => {
    let a = [r[1], r[2], r[3], r[4]];
    obj[r[0]] = { chars: a, sub: [] };
    obj.pA.push(r[0]);
    vs.forEach((s, j) => {
      if (j != i) {
        let b = [s[1], s[2], s[3], s[4]];
        let m = [];//matches
        b.forEach((c, k) => {
          let idx = a.indexOf(c);
          if (~idx) {
            m.push(c);
          }
        })
        if (m.length == 3) {//this is the line to change to limit similarites
          obj[r[0]].sub.push({ p: s[0], chars: m.join(',') });
        } 
      }
    });

  });
  let r = [];
  obj.pA.forEach((p, i) => {
    r.push([p, obj[p].chars[0], obj[p].chars[1], obj[p].chars[2], obj[p].chars[3], '', '']);
    obj[p].sub.forEach((o, j) => {
      r.push(['', '', '', '', '', o.p, o.chars]);
    })

  })
  r.unshift(["Words","Value1","Value2","Value3","Value4","Similar Words","Similarities"]);
  sh.getRange(sh.getLastRow()   2, 1, r.length, r[0].length).setValues(r);
}

Output:

Words Value1 Value2 Value3 Value4 Similar Words Similarities
banana A B C D
cheap banana B,D,A
best banana D,C,A
how to make banana bread A
I love banana B,C,A,D
banana bread A,C
sweet banana L N O J
banana banana N,L,O
how to make banana bread O
banana bread O
cheap banana B D A F
banana A,B,D
best banana D,A
how to make banana bread A
I love banana B,A,D
banana bread A
best banana D C A H
banana A,C,D
cheap banana D,A
how to make banana bread A
I love banana C,A,D
banana bread A,C
banana banana N L I O
sweet banana L,N,O
how to make banana bread O,I
banana bread I,O
how to make banana bread O A I K
banana A
sweet banana O
cheap banana A
best banana A
banana banana I,O
I love banana A
banana bread I,O,A
I love banana B C A D
banana A,B,C,D
cheap banana B,D,A
best banana D,C,A
how to make banana bread A
banana bread A,C
banana bread I O A C
banana A,C
sweet banana O
cheap banana A
best banana C,A
banana banana I,O
how to make banana bread O,A,I
I love banana C,A

Only Three:

Words Value1 Value2 Value3 Value4 Similar Words Similarities
banana A B C D
cheap banana B,D,A
best banana D,C,A
sweet banana L N O J
banana banana N,L,O
cheap banana B D A F
banana A,B,D
I love banana B,A,D
best banana D C A H
banana A,C,D
I love banana C,A,D
banana banana N L I O
sweet banana L,N,O
how to make banana bread O A I K
banana bread I,O,A
I love banana B C A D
cheap banana B,D,A
best banana D,C,A
banana bread I O A C
how to make banana bread O,A,I

CodePudding user response:

There must be a easier way to do this, but this is one (expensive) way to do this. SORT each of row in ascending order, then get all the permutations of each row(for 4 items, that is 24 permutations per row). Check if any permutation match any of the previous row. If it matches, take the group it belongs to(F:F). If not, create a new group based on row number.

F2:

=ARRAYFORMULA(IFNA(INDEX(F$1:F1, MATCH( true,BYROW(B$1:E1,LAMBDA(r, OR(BYROW(REDUCE(TRANSPOSE(SORT(TRANSPOSE(B2:E2))),SEQUENCE(PERMUT(4,4)-1),LAMBDA(a,c,{a;GET_NEXT_LEX(INDEX(a,c,0))})),LAMBDA(perm, SUMPRODUCT(perm=r)>=3 ))))),0)),"group#"&ROW()))

Drag fill down. It gets more expensive based on number of rows.

For this to work, you need to install GET_NEXT_LEX function and it's dependencies from here.

Word Value01 Value02 Value03 Value04 output
banana A B C D group#2
sweet banana L N O J group#3
cheap banana B D A F group#2
best banana D C A H group#2
banana banana N L I O group#3
how to make banana bread O A I K group#7
I love banana B C A D group#2
banana bread I O A C group#7
  • Related