Home > database >  How to make n rows from 1 row with multiple data points in Excel
How to make n rows from 1 row with multiple data points in Excel

Time:04-11

I extract data automatically from printed tables into Excel, and generally the solution I use is pretty good. However, because the tables are very complex, it sometimes merges several rows into one due to failed layout recognition. So, it might look like this (| indicates new cell):

Label1|A1|B1|C1
Label2|A2|B2|C2
Label3^Label4|A3^A4|B3^B4|C3^C4
Label5|A5|B5|C5

of course, I need 5 rows:

Label1|A1|B1|C1
Label2|A2|B2|C2
Label3|A3|B3|C3
Label4|A4|B4|C4
Label5|A5|B5|C5

Now, I could do it manually, but: multiply that with tens of thousands of cases... I'm in dire need of an automatable solution - any suggestions?

CodePudding user response:

Some assumptions from your data:

  • No duplicates in column 1
  • When rows are grouped by this data extractor, each column has the same number of grouped rows, and they are always separated by a ^
  • There are no ^'s other than those used to separate these grouped rows.

You can obtain your desired output using Power Query, available in Windows Excel 2010 and Office 365 Excel

  • Select some cell in your original table
  • Data => Get&Transform => From Table/Range or From within sheet
  • When the PQ UI opens, navigate to Home => Advanced Editor
  • Make note of the Table Name in Line 2 of the code.
  • Replace the existing code with the M-Code below
  • Change the table name in line 2 of the pasted code to your "real" table name
  • Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps

M Code

let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}}),
    
//group by "Column 1"
// This assumes there are no duplicates in Column 1, as shown in your example
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Column1"}, {
        {"splitRows",(t)=>
            let 
                colNames = Table.ColumnNames(t),
                
            //generate lists of new rows, splitting on the ^
                newRows = List.Generate(
                    ()=>[nr=Text.Split(Table.Column(t, colNames{0}){0}, "^"), idx = 0],
                    each [idx]<List.Count(colNames),
                    each [nr=Text.Split(Table.Column(t, colNames{[idx] 1}){0}, "^"), idx = [idx] 1],
                    each [nr]),
                newTable = Table.FromColumns(newRows,colNames)
            in  
                newTable}
    }),

    #"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Column1"}),
    #"Expanded splitRows" = Table.ExpandTableColumn(#"Removed Columns", "splitRows", {"Column1", "Column2", "Column3", "Column4"}, {"Column1", "Column2", "Column3", "Column4"})
in
    #"Expanded splitRows"

enter image description here

  • Related