I have a tibble and a data frame (df). The tibble consists of n columns that are numbered in order; n can change.
The df has a constant size and has row names (needed for a downstream package).
I need to iterate over the tibble and add one of its columns at a time to the df (the new column will always be named "output"). My issue is that when I attempt this, the new column is not added in the expected manner.
Here is an example of the issue:
Setup:
The Tibble:
library(dplyr)  # also provides tibble(), select(), mutate() and %>%
set.seed(42)
df_tibble = tibble(
  `1` = rnorm(5),
  `2` = rnorm(5),
  `3` = rnorm(5)
)
df_tibble
# # A tibble: 5 x 3
# `1` `2` `3`
# <dbl> <dbl> <dbl>
# 1 1.37 -0.106 1.30
# 2 -0.565 1.51 2.29
# 3 0.363 -0.0947 -1.39
# 4 0.633 2.02 -0.279
# 5 0.404 -0.0627 -0.133
The DataFrame:
# A dataframe which has row names. Row names are required for
# other process
df = data.frame(
a = rnorm(5),
b = rnorm(5))
row.names(df) = c("A", "B", "C", "D", "E")
df
# a b
# A -0.4304691 0.4554501
# B -0.2572694 0.7048373
# C -1.7631631 1.0351035
# D 0.4600974 -0.6089264
# E -0.6399949 0.5049551
Issue:
Need to iterate through the columns of df_tibble, select the next column from the tibble, and add it to df:
n = ncol(df_tibble)
for (i in 1:n){
  # Change df to have three columns, with "output" reassigned each time
  df$output = df_tibble[,i]
  print(df)
  #
  # Other processes
  #
}
In short, I need a better way to select columns from a tibble by position rather than by name. select() gives me the same result as base R:
# Equivalent, none works as expected:
df$output = df_tibble[,i]
df$output = select(df_tibble, i)
df %>%
mutate(output = df_tibble[,i])
Output I get and Output I want:
Output I get (shown here for the iteration that adds column `3` from the tibble):
a b output
A -0.4304691 0.4554501 \033[38;5;246m# A tibble: 5 x 1\033[39m
B -0.2572694 0.7048373 `3`
C -1.7631631 1.0351035 \033[3m\033[38;5;246m<dbl>\033[39m\033[23m
D 0.4600974 -0.6089264 \033[38;5;250m1\033[39m 1.30
E -0.6399949 0.5049551 \033[38;5;250m2\033[39m 2.29
Warning message:
In format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x, :
corrupt data frame: columns will be truncated or padded with NAs
Output I Want:
a b output
A -0.4304691 0.4554501 1.3709584
B -0.2572694 0.7048373 -0.5646982
C -1.7631631 1.0351035 0.3631284
D 0.4600974 -0.6089264 0.6328626
E -0.6399949 0.5049551 0.4042683
Thanks
CodePudding user response:
Tibble square-bracket subsetting doesn't drop dimensions by default, so df_tibble[, i] stays a one-column tibble instead of becoming a plain vector.
Just change
df$output = df_tibble[, i]
to
df$output = df_tibble[, i, drop = TRUE]
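As a minimal sketch of the fix inside the loop from the question (the setup is copied from the question; the seq_along() loop bound and the stopifnot() check are my additions for illustration):

```r
library(tibble)

set.seed(42)
df_tibble <- tibble(`1` = rnorm(5), `2` = rnorm(5), `3` = rnorm(5))
df <- data.frame(a = rnorm(5), b = rnorm(5),
                 row.names = c("A", "B", "C", "D", "E"))

for (i in seq_along(df_tibble)) {
  # drop = TRUE returns the i-th column as a plain numeric vector
  df$output <- df_tibble[, i, drop = TRUE]
  # The new column is an ordinary vector, not a nested tibble
  stopifnot(is.numeric(df$output), is.null(dim(df$output)))
  print(df)
}
```

The row names of df are untouched because only a single column is assigned on each pass.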
CodePudding user response:
If we need to add just one column, extract it as a vector with [[ ]]:
df$output <- df_tibble[["1"]]
Output:
> df
a b output
A 0.6359504 -0.3066386 1.3709584
B -0.2842529 -1.7813084 -0.5646982
C -2.6564554 -0.1719174 0.3631284
D -2.4404669 1.2146747 0.6328626
E 1.3201133 1.8951935 0.4042683
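Since the question asks about selecting by position rather than by name, here is a small sketch showing that [[ ]] accepts either an integer position or a character name (the variable names by_position and by_name are only illustrative):

```r
library(tibble)

set.seed(42)
df_tibble <- tibble(`1` = rnorm(5), `2` = rnorm(5), `3` = rnorm(5))

i <- 2
by_position <- df_tibble[[i]]                # integer index: the second column
by_name <- df_tibble[[as.character(i)]]      # character index: the column named "2"

# In this tibble the names happen to match the positions, so both agree
identical(by_position, by_name)
```

Inside the loop, df$output <- df_tibble[[i]] would therefore pick the i-th column regardless of what the columns are called.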
Or, to add multiple columns at once, why not use bind_cols:
library(dplyr)
df <- bind_cols(df, df_tibble)
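Because the question needs the row names downstream, a cautious sketch might store them before binding and reassign them afterwards. I have not checked whether bind_cols keeps row names in every dplyr version, so the reassignment here is a defensive assumption rather than a documented requirement:

```r
library(dplyr)

set.seed(42)
df_tibble <- tibble(`1` = rnorm(5), `2` = rnorm(5), `3` = rnorm(5))
df <- data.frame(a = rnorm(5), b = rnorm(5),
                 row.names = c("A", "B", "C", "D", "E"))

rn <- rownames(df)                  # remember the row names
combined <- bind_cols(df, df_tibble)
rownames(combined) <- rn            # reassign them in case they were lost

print(combined)
```

This adds every tibble column under its own name, so it fits the case where all columns are wanted at once rather than a single "output" column per iteration.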