I have some sort of data that I can't change anymore and it looked like this in gsheet
Date | Type | Setting | Unit | Firstname | Lastname | Tags |
---|---|---|---|---|---|---|
12/1/2021 | Event 1 | 1 | 2 | John | Doe | Tag1,Tag2,Tag3 |
12/1/2021 | Event 2 | 1 | 2 | Jane | Doe | Tag4,Tag5,Tag6 |
12/1/2021 | Event 2 | 1 | 2 | Karen | Doe | Tag5,Tag8 |
Can anyone please help so I can transpose it into like this and use it in a pivot table on another sheet.
Date | Type | Setting | Unit | Firstname | Lastname | Tags |
---|---|---|---|---|---|---|
12/1/2021 | Event 1 | 1 | 2 | John | Doe | Tag1 |
12/1/2021 | Event 1 | 1 | 2 | John | Doe | Tag2 |
12/1/2021 | Event 1 | 1 | 2 | John | Doe | Tag3 |
12/1/2021 | Event 2 | 1 | 2 | Jane | Doe | Tag4 |
12/1/2021 | Event 2 | 1 | 2 | Jane | Doe | Tag5 |
12/1/2021 | Event 2 | 1 | 2 | Jane | Doe | Tag6 |
12/1/2021 | Event 2 | 1 | 2 | Karen | Doe | Tag5 |
12/1/2021 | Event 2 | 1 | 2 | Karen | Doe | Tag8 |
CodePudding user response:
Suppose that the original data is in a sheet named Sheet1
and that the headers shown in your posted data run A1:G1
. In a new sheet, place this formula in cell A1:
=ArrayFormula({REGEXREPLACE(Sheet1!A1:G1, "s$", ""); SPLIT(SUBSTITUTE(QUERY(FLATTEN(TRANSPOSE(QUERY(TRANSPOSE(FILTER(Sheet1!A2:F&"~", Sheet1!A2:A<>"")),,COLUMNS(Sheet1!A:F))) & SPLIT(FILTER(Sheet1!G2:G, Sheet1!A2:A<>""), ",")), "Select * WHERE Col1 Is Not Null"), "~ ", "~"), "~", 1, 0)})
In short, this formula smashes everything in the first six columns into one long string separated by an arbitrary delineator (I chose the tilde symbol: ~
). It then concatenates these strings with each SPLIT
element of the seventh column, FLATTEN
s all of that into one column, QUERY
s out blank rows and then SPLIT
s again at the tilde symbols into separate cells.
The first part — REGEXREPLACE(Sheet1!A1:G1, "s$", "");
— uses the original headers, but uses REGEXREPLACE
to get rid of the final 's' on the original header 'Tags'.
The framework =ArrayFormula({ ; })
means this (in plain English): "We'll be processing many cells, not just one. We'll be forming a temporary virtual array in memory. And we want the part before the semicolon to go on top of the part after the semicolon."
In the new sheet, you'll need to then select Col A entirely and format it in the date format you prefer, since the dates will originally show up in their raw format (i.e., the count of days since the origin date of December 30, 1899).
If this formula does not work, it is most likely because you are in a locale that uses semicolons in place of commas. In that case, try this version:
=ArrayFormula({REGEXREPLACE(Sheet1!A1:G1; "s$"; ""); SPLIT(SUBSTITUTE(QUERY(FLATTEN(TRANSPOSE(QUERY(TRANSPOSE(FILTER(Sheet1!A2:F&"~"; Sheet1!A2:A<>""));;COLUMNS(Sheet1!A:F))) & SPLIT(FILTER(Sheet1!G2:G; Sheet1!A2:A<>""); ",")); "Select * WHERE Col1 Is Not Null"); "~ "; "~"); "~"; 1; 0)})