Home > Back-end >  How to convert categorical values into columns in Excel?
How to convert categorical values into columns in Excel?

Time:05-08

I am working with a dataset that is structured like the one below. As you can see, the indicator column contains binary categorical data.

country_code    indicator   cumulative_count
AFG             cases       52909
AFG             deaths      2230
...             ...         ...

I would like to turn the indicator column into two separate columns (corresponding with the values of indicator: cases and deaths). I.e. I'm expecting the final result to be like this:

country_code    cases   deaths
AFG             52909   2230
...             ...     ...

Notes:

  • The original dataset is publically accessible from enter image description here

    CodePudding user response:

    This can also be accomplished using Power Query, available in Windows Excel 2010 and Excel 365 (Windows or Mac)

    To use Power Query

    • Load your data table into Excel
    • Select some cell in your Data Table
    • Data => Get&Transform => from Table/Range or from within sheet
    • When the PQ Editor opens: Home => Advanced Editor
    • Make note of the Table Name in Line 2
    • Paste the M Code below in place of what you see
    • Change the Table name in line 2 back to what was generated originally.
    • Read the comments and explore the Applied Steps to understand the algorithm
    let
    
    //Read in the table
    //Change table name in next line to your actual table name
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        //Remove the unneeded columns
        #"Removed Other Columns" = Table.SelectColumns(Source,{"country_code", "indicator", "year_week", "cumulative_count"}),
    
    //Set the data types for those columns
        #"Set Data Type" = Table.TransformColumnTypes(#"Removed Other Columns",{
            {"country_code", type text}, {"indicator", type text},{"year_week", type text},{"cumulative_count", Int64.Type}
        }),
    
    //Pivot the Indicator column and aggregate by Sum
        #"Pivoted Column" = Table.Pivot(#"Set Data Type", 
            List.Distinct(#"Removed Other Columns"[indicator]), "indicator", "cumulative_count", List.Sum),
    
    //Filter to show only the relevant year-week for rows where thiere is a country_code
    //  (the others refer to continents)
        #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([country_code] <> null) and ([year_week] = "2020-53"))
    in
        #"Filtered Rows"
    

    filtered to show just 2020-53
    enter image description here

    CodePudding user response:

    If I'm understanding your question correctly. one way: Add new column F Formula in $F$2: sumifs($D2:$D$9999, $B2:$B$9999, $B2, $E2:$E$9999, "deaths")

    copy formula down through end record filter column E for "cases"

    if you then insert rows above the header row, you can use Subtotal(109, ...) to view cumulative counts for a specific year, or alternatively add another column with Sumif as shown above

  • Related