I've been trying to make a graph using either barplot or ggplot, but first I need to combine different observations from the same variable.
My variable has different observations depending on how relevant a subject is for each user, like this:
Count Activity
10 Bikes for fitness reasons
22 Runs for fitness reasons
12 Bikes to commute to work
10 Walks to commute to work
5 Walks to stay healthy
My idea is to merge the observations from the "Activity" variable so it looks like this:
Count Activity
22 Bikes
22 Runs
15 Walks
So, I don't care about the reason they do the activity; I just want to merge the rows so I can put that info into a bar graph.
CodePudding user response:
Here is a tidyverse solution:
library(tidyverse)
df %>%
mutate(Activity = word(Activity, 1)) %>%
group_by(Activity) %>%
summarize(Count = sum(Count))
This gives us:
# A tibble: 3 x 2
Activity Count
<chr> <dbl>
1 Bikes 22
2 Runs 22
3 Walks 15
Data:
# The dput() output included a data.table pointer that cannot be parsed,
# so it is dropped here and the data is kept as a plain data.frame.
df <- structure(list(Count = c(10, 22, 12, 10, 5), Activity = c("Bikes for fitness reasons",
"Runs for fitness reasons", "Bikes to commute to work", "Walks to commute to work",
"Walks to stay healthy")), row.names = c(NA, -5L), class = "data.frame")
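Since the end goal is a bar graph, the summarised table can be passed straight to ggplot. The following is a minimal sketch, assuming the dplyr, stringr and ggplot2 packages are installed; the names `totals` and `p` are only illustrative and do not come from the question.

```r
library(dplyr)
library(stringr)
library(ggplot2)

# Same sample data as in the question
df <- data.frame(
  Count = c(10, 22, 12, 10, 5),
  Activity = c("Bikes for fitness reasons",
               "Runs for fitness reasons",
               "Bikes to commute to work",
               "Walks to commute to work",
               "Walks to stay healthy")
)

# Keep only the first word of each activity, then sum the counts per activity
totals <- df %>%
  mutate(Activity = word(Activity, 1)) %>%
  group_by(Activity) %>%
  summarize(Count = sum(Count))

# geom_col() draws bars whose heights are the values in Count
p <- ggplot(totals, aes(x = Activity, y = Count)) +
  geom_col()
print(p)
```

geom_col() is used rather than geom_bar() because the counts are already computed; geom_bar() would count rows by default.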
CodePudding user response:
You could use grep() to find each term you are looking for, like this:
df <- data.frame(
Count = c(10,22,12,10,5),
Activity = c("Bikes for fitness reasons",
"Runs for fitness reasons",
"Bikes to commute to work",
"Walks to commute to work",
"Walks to stay healthy"))
# Look for this string
var <- "Bikes"
# Get the row where "Bikes" appears
grep(pattern = var, x = df$Activity)
#> [1] 1 3
# Get Count values from each row where "Bikes" appears
df[grep(pattern = var, x = df$Activity), "Count"]
#> [1] 10 12
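To get from the matching rows to the merged totals, the same grep() lookup can be repeated for each activity and the counts summed. This is only a sketch in base R: it assumes the activity names are known in advance and listed by hand in a hypothetical `terms` vector.

```r
df <- data.frame(
  Count = c(10, 22, 12, 10, 5),
  Activity = c("Bikes for fitness reasons",
               "Runs for fitness reasons",
               "Bikes to commute to work",
               "Walks to commute to work",
               "Walks to stay healthy")
)

# Assumed list of activities to look for (not derived from the data)
terms <- c("Bikes", "Runs", "Walks")

# For each term, sum the Count values of the rows whose Activity contains it
totals <- sapply(terms, function(term) {
  sum(df$Count[grep(pattern = term, x = df$Activity, fixed = TRUE)])
})
totals

# A named numeric vector can be passed directly to base barplot()
barplot(totals, xlab = "Activity", ylab = "Count")
```

Note that grep() matches the term anywhere in the string, so this approach relies on no activity name appearing inside another row's description.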
CodePudding user response:
Using trimws (the whitespace argument requires R 3.6.0 or later):
library(dplyr)
df %>%
group_by(Activity = trimws(Activity, whitespace = "\\s+.*")) %>%
summarise(Count = sum(Count))
-output
# A tibble: 3 x 2
Activity Count
<chr> <dbl>
1 Bikes 22
2 Runs 22
3 Walks 15