I have a question :
I have this DF
#> # A tibble: 4 × 3
#> family name items
#> <chr> <chr> <list>
#> 1 Kelly Mark book, ring, necklace
#> 2 Kelly Scott axe, camera, watch
#> 3 Quin Tegan book, camera, watch
#> 4 Quin Sara sword, fork, book
How to count each items in the list inside that dataframe into Total like this : count(book) = 3 count(camera) = 2 etc
should I pivot wider the entire item into new column? I'm really sorry if my question is too basic, since I really new into Data Processing
Thank You
#My Approach I tried to using pivot longer, but the columns become too many. The list contains hundreds of value, and it seems bother me to handle such a big data. I haven't yet tried another solution.
CodePudding user response:
From your question you mention that you do not want to pivot wider because of the amount of columns it would create. One alternative is to put the count into a list:
count <- as.list(table(unlist(df$items)))
count$book
[1] 3
Note: this is a count across all rows, which is what your post suggests you are looking for.
CodePudding user response:
Maybe something like this:
df %>%
unnest(items) %>%
unnest(items) %>%
count(items, name="count")
items count
<chr> <int>
1 axe 1
2 book 3
3 camera 2
4 fork 1
5 necklace 1
6 ring 1
7 sword 1
8 watch 2
CodePudding user response:
library(dplyr)
library(tidyr)
df <- tibble(
family = c("Kelly", "Kelly", "Quin", "Quin"),
name = c("Mark", "Scott", "Tegan", "Sara"),
items = c("book, ring, necklace",
"axe, camera, watch",
"book, camera, watch",
"sword, fork, book"))
df %>% separate(items, into = c("i1", 'i2', 'i3')) %>%
pivot_longer(cols = i1:i3, names_to = "item_order", values_to = "item") %>%
count(item, sort = TRUE)
CodePudding user response:
The other answers thus far use a character "items" column whereas the poster specified a list-column. The list-column may be unnested with tidyverse functions then counted as follows:
library(dplyr)
df <- tibble::tribble(
~family, ~name, ~items,
"Kelly", "Mark", list("book", "ring", "necklace"),
"Kelly", "Scott", list("axe", "camera", "watch"),
"Quin", "Tegan", list("book", "camera", "watch"),
"Quin", "Sara", list("sword", "fork", "book")
)
df %>%
tidyr::unnest_longer(items) %>%
count(items)
#> # A tibble: 8 × 2
#> items n
#> <chr> <int>
#> 1 axe 1
#> 2 book 3
#> 3 camera 2
#> 4 fork 1
#> 5 necklace 1
#> 6 ring 1
#> 7 sword 1
#> 8 watch 2
Created on 2022-10-26 with reprex v2.0.2