I would like to perform a following task more efficiently.
n_rows <- 50
n_cols <- 100
data_wide <- matrix(1:(n_rows*n_cols), nrow = n_rows, ncol = n_cols)
count <- 1
data_long <- data.frame()
for (run_row in 1:n_rows) {
for (run_col in 1:n_cols) {
data_long[count, 1:3] <- c(run_row, run_col, data_wide[run_row, run_col])
count <- count 1
}
}
Actually, the number of rows = 980 and columns = 1150 and for loop is slow.
CodePudding user response:
I'm not aware of a built-in solution, but you can construct the columns like so:
f <- function(n_rows, n_cols){
data_wide <- matrix(1:(n_rows*n_cols), nrow = n_rows, ncol = n_cols)
data.frame(
row = c(t(row(data_wide))),
col = c(t(col(data_wide))),
value = c(t(data_wide))
)
}
system.time(f(980, 1150))
user system elapsed
0.04 0.00 0.05
CodePudding user response:
library(tidyverse)
Pivot <- function(rows = 50,cols = 100){
n_rows <- rows
n_cols <- cols
data_wide <- matrix(1:(n_rows*n_cols), nrow = n_rows, ncol = n_cols)
# The actual code
data_wide %>%
# tibble is an easier format to work with, for me atleast
as_tibble() %>%
# Since you want each row to have its own unique number
dplyr::mutate(V1_copy = V1) %>%
# The actual conversion from wide to long
pivot_longer(cols = !matches("V1_copy"),names_to = "V2",values_to = "V3") %>%
# Editing to make the output look like yours
dplyr::mutate(V2 = str_replace(V2,"V","")) %>%
dplyr::rename(V1 = V1_copy)}
Output:
> Pivot(50,100)
# A tibble: 5,000 × 3
V1 V2 V3
<int> <chr> <int>
1 1 1 1
2 1 2 51
3 1 3 101
4 1 4 151
5 1 5 201
6 1 6 251
7 1 7 301
8 1 8 351
9 1 9 401
10 1 10 451
# … with 4,990 more rows
Speed:
> system.time(Pivot(50,100))
user system elapsed
0.054 0.033 0.183
> system.time(Pivot(50,1000))
user system elapsed
0.122 0.041 0.391
> system.time(Pivot(50,10000))
user system elapsed
0.583 0.078 0.748
Speed for the dimensions of your dataset: 980 x 1150
> system.time(Pivot(980,1150))
user system elapsed
0.661 0.054 0.724