Better way to add tibble column to df in iterative fashion

Time:10-01

I have a tibble and a data frame (df). The tibble has n columns whose names are consecutive numbers; n can change.

The df has a fixed size and has row names (needed for a downstream package).

I need to loop over the tibble's columns and, on each iteration, add the current column to the df (the new column is always named "output"). My issue is that when I do this in a loop, the new column is not added as a plain column of values.

Here is an example of the issue:

Setup:

The Tibble:

library(tibble)

set.seed(42)

df_tibble = tibble(
  `1` = rnorm(5),
  `2` = rnorm(5),
  `3` = rnorm(5)
)
df_tibble
# # A tibble: 5 x 3
#      `1`     `2`    `3`
#    <dbl>   <dbl>  <dbl>
# 1  1.37  -0.106   1.30 
# 2 -0.565  1.51    2.29 
# 3  0.363 -0.0947 -1.39 
# 4  0.633  2.02   -0.279
# 5  0.404 -0.0627 -0.133

The DataFrame:

# A dataframe which has row names. Row names are required for 
# other process
df = data.frame(
  a = rnorm(5),
  b = rnorm(5))
row.names(df) = c("A", "B", "C", "D", "E")
df
#            a          b
# A -0.4304691  0.4554501
# B -0.2572694  0.7048373
# C -1.7631631  1.0351035
# D  0.4600974 -0.6089264
# E -0.6399949  0.5049551

Issue:

I need to iterate over the columns of df_tibble, taking the next column each time and adding it to df:

n = ncol(df_tibble)
for (i in 1:n){
  # Change df to have three columns, with "output" reassigned each time
  df$output = df_tibble[,i]
  
  print(df)
  
  #
  # Other processes
  #
  
}

In short, I need a better way to select columns from a tibble by position rather than by name. select() gives me the same result as base R:

# Equivalent; none of these works as expected:
df$output = df_tibble[,i]
df$output = select(df_tibble, i)
df %>%
  mutate(output = df_tibble[,i])

Output I get and Output I want:

Output I get (when adding first column from tibble):

          a          b                                        output
A -0.4304691  0.4554501       \033[38;5;246m# A tibble: 5 x 1\033[39m
B -0.2572694  0.7048373                                           `3`
C -1.7631631  1.0351035    \033[3m\033[38;5;246m<dbl>\033[39m\033[23m
D  0.4600974 -0.6089264                \033[38;5;250m1\033[39m  1.30 
E -0.6399949  0.5049551                \033[38;5;250m2\033[39m  2.29 
Warning message:
In format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,  :
  corrupt data frame: columns will be truncated or padded with NAs

Output I Want:

           a          b     output
A -0.4304691  0.4554501  1.3709584
B -0.2572694  0.7048373 -0.5646982
C -1.7631631  1.0351035  0.3631284
D  0.4600974 -0.6089264  0.6328626
E -0.6399949  0.5049551  0.4042683

Thanks

CodePudding user response:

Square-bracket subsetting on a tibble does not drop dimensions by default, so df_tibble[, i] always returns a tibble, even when only one column is selected.

Just change

df$output = df_tibble[, i]

to

df$output = df_tibble[, i, drop = TRUE]
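
To see the difference, here is a minimal sketch of the loop from the question with that change applied. It rebuilds the example data from the question so it runs on its own; the class() checks and the stopifnot() line are illustrative additions, not part of the original code.

```r
library(tibble)

set.seed(42)
df_tibble <- tibble(`1` = rnorm(5), `2` = rnorm(5), `3` = rnorm(5))

df <- data.frame(a = rnorm(5), b = rnorm(5))
row.names(df) <- c("A", "B", "C", "D", "E")

# Default tibble subsetting keeps the tibble structure, even for one column
class(df_tibble[, 1])               # includes "tbl_df"
# With drop = TRUE a single column comes back as a plain vector
class(df_tibble[, 1, drop = TRUE])  # "numeric"

for (i in seq_len(ncol(df_tibble))) {
  # "output" is overwritten with the i-th tibble column on every pass
  df$output <- df_tibble[, i, drop = TRUE]
  stopifnot(is.numeric(df$output))
  # ... other processing that relies on the row names of df ...
}
```

After the loop, df has columns a, b and output, and its row names are untouched.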

CodePudding user response:

If we need to add just one column, extract it as a vector with [[:

df$output <- df_tibble[["1"]]

-output

> df
           a          b     output
A  0.6359504 -0.3066386  1.3709584
B -0.2842529 -1.7813084 -0.5646982
C -2.6564554 -0.1719174  0.3631284
D -2.4404669  1.2146747  0.6328626
E  1.3201133  1.8951935  0.4042683
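
Since the loop in the question indexes columns by position, the same [[ extraction also works with a number instead of a name. A short sketch, assuming the df_tibble and df layout from the question (the data is rebuilt here so the block runs on its own):

```r
library(tibble)

set.seed(42)
df_tibble <- tibble(`1` = rnorm(5), `2` = rnorm(5), `3` = rnorm(5))
df <- data.frame(a = rnorm(5), b = rnorm(5))
row.names(df) <- c("A", "B", "C", "D", "E")

# [[ with a number selects by position; with a string it selects by name
by_position <- df_tibble[[1]]
by_name     <- df_tibble[["1"]]
identical(by_position, by_name)  # TRUE

for (i in seq_along(df_tibble)) {
  df$output <- df_tibble[[i]]  # always a plain vector, never a tibble
  # ... other processing ...
}
```

Note that df_tibble[[1]] and df_tibble[["1"]] coincide here only because the columns are named after their positions.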

Or, to add multiple columns at once, use bind_cols:

library(dplyr)
df <- bind_cols(df, df_tibble)
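
Because the question depends on the row names of df, it may be worth checking them after binding. The sketch below does not assume that bind_cols keeps row names; it converts the result back to a base data frame and restores them explicitly. The name combined is hypothetical, used here only for illustration.

```r
library(dplyr)
library(tibble)

set.seed(42)
df_tibble <- tibble(`1` = rnorm(5), `2` = rnorm(5), `3` = rnorm(5))
df <- data.frame(a = rnorm(5), b = rnorm(5))
row.names(df) <- c("A", "B", "C", "D", "E")

# Bind all tibble columns at once, then make sure the result is a base
# data frame carrying the original row names
combined <- as.data.frame(bind_cols(df, df_tibble))
row.names(combined) <- row.names(df)

print(combined)
```

This adds every tibble column in one step, which differs from the one-"output"-column-per-iteration pattern in the question.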