Home > Blockchain >  Tranpose Columns then repeat rows
Tranpose Columns then repeat rows

Time:01-02

I have some sort of data that I can't change anymore and it looked like this in gsheet

Date Type Setting Unit Firstname Lastname Tags
12/1/2021 Event 1 1 2 John Doe Tag1,Tag2,Tag3
12/1/2021 Event 2 1 2 Jane Doe Tag4,Tag5,Tag6
12/1/2021 Event 2 1 2 Karen Doe Tag5,Tag8

Can anyone please help so I can transpose it into like this and use it in a pivot table on another sheet.

Date Type Setting Unit Firstname Lastname Tags
12/1/2021 Event 1 1 2 John Doe Tag1
12/1/2021 Event 1 1 2 John Doe Tag2
12/1/2021 Event 1 1 2 John Doe Tag3
12/1/2021 Event 2 1 2 Jane Doe Tag4
12/1/2021 Event 2 1 2 Jane Doe Tag5
12/1/2021 Event 2 1 2 Jane Doe Tag6
12/1/2021 Event 2 1 2 Karen Doe Tag5
12/1/2021 Event 2 1 2 Karen Doe Tag8

CodePudding user response:

Suppose that the original data is in a sheet named Sheet1 and that the headers shown in your posted data run A1:G1. In a new sheet, place this formula in cell A1:

=ArrayFormula({REGEXREPLACE(Sheet1!A1:G1, "s$", ""); SPLIT(SUBSTITUTE(QUERY(FLATTEN(TRANSPOSE(QUERY(TRANSPOSE(FILTER(Sheet1!A2:F&"~", Sheet1!A2:A<>"")),,COLUMNS(Sheet1!A:F))) & SPLIT(FILTER(Sheet1!G2:G, Sheet1!A2:A<>""), ",")), "Select * WHERE Col1 Is Not Null"), "~ ", "~"), "~", 1, 0)})

In short, this formula smashes everything in the first six columns into one long string separated by an arbitrary delineator (I chose the tilde symbol: ~). It then concatenates these strings with each SPLIT element of the seventh column, FLATTENs all of that into one column, QUERYs out blank rows and then SPLITs again at the tilde symbols into separate cells.

The first part — REGEXREPLACE(Sheet1!A1:G1, "s$", ""); — uses the original headers, but uses REGEXREPLACE to get rid of the final 's' on the original header 'Tags'.

The framework =ArrayFormula({ ; }) means this (in plain English): "We'll be processing many cells, not just one. We'll be forming a temporary virtual array in memory. And we want the part before the semicolon to go on top of the part after the semicolon."

In the new sheet, you'll need to then select Col A entirely and format it in the date format you prefer, since the dates will originally show up in their raw format (i.e., the count of days since the origin date of December 30, 1899).

If this formula does not work, it is most likely because you are in a locale that uses semicolons in place of commas. In that case, try this version:

=ArrayFormula({REGEXREPLACE(Sheet1!A1:G1; "s$"; ""); SPLIT(SUBSTITUTE(QUERY(FLATTEN(TRANSPOSE(QUERY(TRANSPOSE(FILTER(Sheet1!A2:F&"~"; Sheet1!A2:A<>""));;COLUMNS(Sheet1!A:F))) & SPLIT(FILTER(Sheet1!G2:G; Sheet1!A2:A<>""); ",")); "Select * WHERE Col1 Is Not Null"); "~ "; "~"); "~"; 1; 0)})

  • Related