Say I have a data frame, df, and I want to reshape it into long format.
I found a similar question (Long pivot for multiple variables using Pivot long), but I get an error when I run the code below, and I don't know why.
The result I expect is shown in df_expected.
library(tidyverse)
df = data.frame(
dis = 'cvd',
pollution = 'pm2.5',
lag_day = '2',
b1.x = 1,
b1_ci.x = 2,
PC.x = 3,
pc_ci.x = 4,
b1.y = 5,
b1_ci.y = 6,
PC.y = 7,
pc_ci.y = 8
)
# df
# dis pollution lag_day b1.x b1_ci.x PC.x pc_ci.x b1.y b1_ci.y PC.y pc_ci.y
# 1 cvd pm2.5 2 1 2 3 4 5 6 7 8
df %>%
pivot_longer(
cols = -c(dis:lag_day),
names_to = c('.value', 'from'),
names_sep = '.'
) # this call raises the error below
# Error: Input must be a vector, not NULL.
# Run `rlang::last_error()` to see where the error occurred.
# In addition: Warning message:
# Expected 2 pieces. Additional pieces discarded in 8 rows [1, 2, 3, 4, 5, 6, 7, 8].
df_expected = data.frame(
dis = 'cvd',
pollution = 'pm2.5',
lag_day = '2',
from = c('x', 'y'),
b1 = c(1,5),
b1_ci = c(2,6),
PC = c(3, 7),
pc_ci = c(4, 8)
)
# df_expected
# dis pollution lag_day from b1 b1_ci PC pc_ci
# 1 cvd pm2.5 2 x 1 2 3 4
# 2 cvd pm2.5 2 y 5 6 7 8
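Answer:
names_sep is interpreted as a regular expression, so an unescaped '.' matches any character. Match a literal dot with '[.]' instead: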
df %>%
pivot_longer(-c(dis, pollution, lag_day),
names_to = c('.value', 'from'), names_sep='[.]')
# A tibble: 2 x 8
#   dis   pollution lag_day from     b1 b1_ci    PC pc_ci
#   <chr> <chr>     <chr>   <chr> <dbl> <dbl> <dbl> <dbl>
# 1 cvd   pm2.5     2       x         1     2     3     4
# 2 cvd   pm2.5     2       y         5     6     7     8
In base R:
reshape(df, -(1:3), direction = 'long')
Or, with the id columns given explicitly:
reshape(df, -(1:3), idvar = 1:3, direction = 'long')
#               dis pollution lag_day time b1 b1_ci PC pc_ci
# cvd.pm2.5.2.x cvd     pm2.5       2    x  1     2  3     4
# cvd.pm2.5.2.y cvd     pm2.5       2    y  5     6  7     8
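If you want the base R result to look more like df_expected, a possible variant is sketched below. It relies on documented reshape() arguments (varying, sep, timevar, idvar); naming the time column "from" and dropping the row names are choices made here to match the question, not part of the original answer, and the object name long is made up for this example.

```r
# Same data as in the question
df <- data.frame(
  dis = 'cvd', pollution = 'pm2.5', lag_day = '2',
  b1.x = 1, b1_ci.x = 2, PC.x = 3, pc_ci.x = 4,
  b1.y = 5, b1_ci.y = 6, PC.y = 7, pc_ci.y = 8
)

long <- reshape(
  df,
  varying   = 4:11,                           # the wide measurement columns
  sep       = ".",                            # split names into variable and suffix
  timevar   = "from",                         # suffix column, named as in df_expected
  idvar     = c("dis", "pollution", "lag_day"),
  direction = "long"
)
rownames(long) <- NULL                        # drop the cvd.pm2.5.2.x style row names
print(long)
```

Unlike names_sep in pivot_longer, a non-empty sep in reshape() is matched as a fixed string, so the dot needs no escaping here.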