I would like to convert data collected in Sheet1 so it look like Sheet2.
Sheet1 - Data populated from Google Form.
This sheet contains the attendance of which employees participated in a specific class.
This sheet contains over 50,000 rows.
Class ID are unique for each row.
The same Employee ID can be found in multiple rows
- Notice Employee ID "123456" is found in class X123456, and ZZ974547
A | B | C | |
---|---|---|---|
1 | Date | Class ID | Employee ID's |
2 | 4/26/2021 6:47:13 | X123456 | 123456 896779 835906 TMP880997 908613 882853 |
3 | 4/26/2021 17:18:57 | Y123456 | 227583 233482 218680 226955 225310 227569 227582 |
4 | 4/26/2021 18:01:30 | XYZ123456 | 201032 232863 232848 TMP232845 |
5 | 4/27/2021 12:24:29 | X123457 | 188809 224046 232861 232846 |
6 | 4/28/2021 10:56:28 | X123458 | 210975 |
7 | 5/26/2021 10:29:31 | ZZ974547 | 123456 955725 961714 956114 955986 959287 955748 |
Sheet2 - Expected outcome using a formula
- Results sorted by timestamp.
- Count the number of Employee ID's within a Class ID.
- Then duplicate the Class ID the same number of times.
- Class ID X123456 contains 6 Employee ID's, so X123456 is repeated 6 times (1/row)
- Class ID Y123456 contains 7 Employee ID's, so Y123456 is repeated 7 times (1/row)
A | B | |
---|---|---|
1 | Class ID | Employee ID |
2 | X123456 | 123456 |
3 | X123456 | 896779 |
4 | X123456 | 835906 |
5 | X123456 | TMP880997 |
6 | X123456 | 908613 |
7 | X123456 | 882853 |
8 | Y123456 | 227583 |
9 | Y123456 | 233482 |
10 | Y123456 | 218680 |
11 | Y123456 | 226955 |
12 | Y123456 | 225310 |
13 | Y123456 | 227569 |
14 | Y123456 | 227582 |
15 | XYZ123456 | 201032 |
16 | XYZ123456 | 232863 |
17 | XYZ123456 | 232848 |
18 | XYZ123456 | TMP232845 |
Here are the current formulas I have tried...
Sheet2!A2 =TRANSPOSE(SPLIT(REPT(B2:B &" ",COUNTA(TRANSPOSE(SPLIT(C2:C," "))))," "))
Sheet2!B2 =TRANSPOSE(SPLIT(C2:C," "))
These formulas work for the first Class ID, but not for the remaining Class ID's. I tried wrapping them with ARRAYFORMULA()
but that did not work.
CodePudding user response:
try:
=INDEX(QUERY(SPLIT(FLATTEN(IF(IFERROR(SPLIT(D2:D, " "))="",,
C2:C&"×"&SPLIT(D2:D, " "))), "×"), "where Col2 is not null"))
update:
=INDEX(SUBSTITUTE(QUERY(SPLIT(FLATTEN(IF(IFERROR(
SPLIT(D2:D, " "))="",,C2:C&"×"&SPLIT(D2:D, " ")&"♦")), "×"),
"where Col2 is not null"), "♦", ))