I am working with a dataset that is structured like the one below. As you can see, the indicator column contains binary categorical data.
country_code indicator cumulative_count
AFG cases 52909
AFG deaths 2230
... ... ...
I would like to turn the indicator column into two separate columns (corresponding with the values of indicator: cases and deaths). I.e. I'm expecting the final result to be like this:
country_code cases deaths
AFG 52909 2230
... ... ...
Notes:
- The original dataset is publically accessible from
CodePudding user response:
This can also be accomplished using Power Query, available in Windows Excel 2010 and Excel 365 (Windows or Mac)
To use Power Query
- Load your data table into Excel
- Select some cell in your Data Table
Data => Get&Transform => from Table/Range
orfrom within sheet
- When the PQ Editor opens:
Home => Advanced Editor
- Make note of the Table Name in Line 2
- Paste the M Code below in place of what you see
- Change the Table name in line 2 back to what was generated originally.
- Read the comments and explore the
Applied Steps
to understand the algorithm
let //Read in the table //Change table name in next line to your actual table name Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content], //Remove the unneeded columns #"Removed Other Columns" = Table.SelectColumns(Source,{"country_code", "indicator", "year_week", "cumulative_count"}), //Set the data types for those columns #"Set Data Type" = Table.TransformColumnTypes(#"Removed Other Columns",{ {"country_code", type text}, {"indicator", type text},{"year_week", type text},{"cumulative_count", Int64.Type} }), //Pivot the Indicator column and aggregate by Sum #"Pivoted Column" = Table.Pivot(#"Set Data Type", List.Distinct(#"Removed Other Columns"[indicator]), "indicator", "cumulative_count", List.Sum), //Filter to show only the relevant year-week for rows where thiere is a country_code // (the others refer to continents) #"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([country_code] <> null) and ([year_week] = "2020-53")) in #"Filtered Rows"
CodePudding user response:
If I'm understanding your question correctly. one way: Add new column F Formula in $F$2: sumifs($D2:$D$9999, $B2:$B$9999, $B2, $E2:$E$9999, "deaths")
copy formula down through end record filter column E for "cases"
if you then insert rows above the header row, you can use Subtotal(109, ...) to view cumulative counts for a specific year, or alternatively add another column with Sumif as shown above