Identifier | Value |
---|---|
511016 | 75.72911 |
511016 | 79.01783 |
511016 | 74.87570 |
511029 | 72.75873 |
511029 | 74.41798 |
511029 | 78.56112 |
The dataset consists of two columns (as above) but 77,000 rows. The first column is the 'name' vector and the second the value. I need to transform the data so that the first column has just one value for the identifier and after this, the columns take all the values that the respective identifier has. Like this:
Identifier | Value 1 | Value 2 | etc... |
---|---|---|---|
511016 | 75.72911 | 79.01783 | |
511029 | 72.75873 | 74.41798 |
I have been able to use group_keys to produce a single column with the identifiers but cannot get the values to match in this way. Any help is appreciated.
CodePudding user response:
library(dplyr)
library(tidyr)
df <- tribble(~Identifier, ~Value,
511016, 75.72911,
511016, 79.01783,
511016, 74.87570,
511029, 72.75873,
511029, 74.41798,
511029, 78.56112
)
df %>%
group_by(Identifier) %>%
mutate(row_id = row_number()) %>%
pivot_wider(id_cols = "Identifier", names_from = row_id,
values_from = Value,
names_glue = "{.value}{row_id}") %>%
ungroup()
CodePudding user response:
This should do it:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
dat <- structure(list(Identifier = c(511016L, 511016L, 511016L, 511029L,
511029L, 511029L),
Value = c(75.72911, 79.01783, 74.8757, 72.75873,
74.41798, 78.56112)), row.names = c(NA, 6L), class = "data.frame")
dat %>%
group_by(Identifier) %>%
mutate(obs = row_number()) %>%
pivot_wider(names_from = "obs",
values_from = "Value",
names_prefix="value")
#> # A tibble: 2 × 4
#> # Groups: Identifier [2]
#> Identifier value1 value2 value3
#> <int> <dbl> <dbl> <dbl>
#> 1 511016 75.7 79.0 74.9
#> 2 511029 72.8 74.4 78.6
Created on 2022-04-29 by the reprex package (v2.0.1)
CodePudding user response:
Assuming your df has two columns, One similar Option is:
library(dplyr)
library(tidyr)
df %>%
group_by(Identifier) %>%
mutate(name = paste(colnames(df[2]), row_number())) %>%
pivot_wider(names_from = name, values_from = Value)
Identifier `Value 1` `Value 2` `Value 3`
<int> <dbl> <dbl> <dbl>
1 511016 75.7 79.0 74.9
2 511029 72.8 74.4 78.6