I'd like to preserve the proper yearly index names as I recast my data from wide to long.
dt = data.table(country = c(1,2,3,4,5), gdp_1990 = rnorm(5), gdp_1991 = rnorm(5), gdp_1992 = rnorm(5),
unemp_1990 = rnorm(5), unemp_1991 = rnorm(5), unemp_1992 = rnorm(5))
melt(dt, id = 'country', measure = patterns(gdp = '^gdp_', unemp = '^unemp_'), variable.name = 'year')
Desired Output:
country year gdp unemp
1: 1 1990 0.856957066 -1.42947033
2: 2 1990 -1.765995901 1.38170009
3: 3 1990 -0.298302521 -0.54070574
4: 4 1990 -0.919421829 -0.17552704
5: 5 1990 -0.189133135 1.18923546
6: 1 1991 -1.248963381 -0.10467153
7: 2 1991 -0.800931881 0.03589986
Actual Output:
country year gdp unemp
1: 1 1 0.856957066 -1.42947033
2: 2 1 -1.765995901 1.38170009
3: 3 1 -0.298302521 -0.54070574
4: 4 1 -0.919421829 -0.17552704
5: 5 1 -0.189133135 1.18923546
6: 1 2 -1.248963381 -0.10467153
7: 2 2 -0.800931881 0.03589986
CodePudding user response:
With data.table
(dev
version - 1.14.3
) we can use measure
with sep
as documented in ?measure
measure(..., sep, pattern, cols, multiple.keyword="value.name")
library(data.table)
melt(dt, measure.vars = measure(value.name, year, sep = "_"))
-output
country year gdp unemp
<num> <char> <num> <num>
1: 1 1990 -1.275041172 -0.75524345
2: 2 1990 1.979629503 -1.14636877
3: 3 1990 0.062272176 1.16928396
4: 4 1990 -0.210106506 -0.66517069
5: 5 1990 -1.089511759 -1.79322014
6: 1 1991 0.460566878 0.61720109
7: 2 1991 0.183378182 -0.01628616
8: 3 1991 -0.647174381 1.14346303
9: 4 1991 0.008846161 0.05223651
10: 5 1991 -0.039701540 1.40848433
11: 1 1992 0.328204416 1.44638191
12: 2 1992 -1.359373393 1.33391755
13: 3 1992 -0.538430362 -0.26828537
14: 4 1992 0.424461192 -0.32107074
15: 5 1992 -0.338010393 -0.19920506
CodePudding user response:
Using tidyr::pivot_longer
we can use names_sep = "_"
to split the names into the variable and year. In names_to
, use the special string ".value"
to specify that you want multiple columns created from the gdp
and unemp
columns:
tidyr::pivot_longer(dt, -1, names_sep = "_", names_to = c(".value", "year"))
#> # A tibble: 15 x 4
#> country year gdp unemp
#> <dbl> <chr> <dbl> <dbl>
#> 1 1 1990 -0.324 -1.12
#> 2 1 1991 0.307 -1.64
#> 3 1 1992 -0.0569 -1.49
#> 4 2 1990 0.0602 -0.751
#> 5 2 1991 -1.54 0.450
#> 6 2 1992 -1.91 -1.08
#> 7 3 1990 -0.589 2.09
#> 8 3 1991 -0.301 -0.0186
#> 9 3 1992 1.18 1.00
#> 10 4 1990 0.531 0.0174
#> 11 4 1991 -0.528 -0.318
#> 12 4 1992 -1.66 -0.621
#> 13 5 1990 -1.52 -1.29
#> 14 5 1991 -0.652 -0.929
#> 15 5 1992 -0.464 -1.38