Home > Mobile >  Summaries values from column, bases if they contain certain characters
Summaries values from column, bases if they contain certain characters

Time:12-19

I have a dataset:

Gemeinde <- c("Adliswil", "Adliswil", "Adliswil", "Adliswil", "Adliswil","Adlikon", "Adlikon", "Adlikon", "Adlikon", "Adlikon")
Country <- c("Schweiz", "Deutschland", "Frankreich", "Türkei", "China","Schweiz", "Deutschland", "Frankreich", "Türkei", "China")
Count <- c(23, 41, 32, 58, 26,23, 41, 32, 58, 26)
df <- data.frame(Gemeinde, Country, Count)
print (df)

I need to summarize the data set based if the Country where the people come from is Swiss or not. So group-by Gemeinde, differ if the country is Swiss or not, and then get the sum of the Count for Gemeinde -Swiss or not.

What I tried:

country_count <- country_count %>%
  group_by(Gemeinde) %>%
  summarize(sumSwiss = sum(Anzahl[Staatsangehörigkeit== "Frankreich & Deutschland & Schweiz"]),
            sumForeigners = sum(Anzahl))

All Countries that are

"Frankreich & Deutschland & Schweiz"

Should become Swiss, all the rest should be "Foreign".

What I need

Gemeinde <- c("Adliswil", "Adlikon")
sumSwiss <- 96
sumForeign <- 84

df <- data.frame(Gemeinde, sumSwiss, sumForeign)

print (df)

CodePudding user response:

library(tidyverse) 

df %>%
  mutate(Country = case_when(
    Country %in% c("Frankreich", "Deutschland", "Schweiz") ~ "Swiss",
    TRUE ~ "Foreign"
  )) %>%
  group_by(Gemeinde, Country) %>%
  summarise(sum = sum(Count)) %>%
  pivot_wider(names_from = "Country",
              values_from = "sum",
              names_prefix = "sum")

# A tibble: 2 × 3
# Groups:   Gemeinde [2]
  Gemeinde sumForeign sumSwiss
  <chr>         <dbl>    <dbl>
1 Adlikon          84       96
2 Adliswil         84       96
  • Related