New df with columns from different df of unequal length

Time:11-07

I am trying to create a new df new_df with columns from different data frames.

The columns are of unequal length, which I presume can be solved by replacing empty 'cells' with NA? However, this is above my current skill level, so any help will be much appreciated!

Packages:

library(tidyverse)
library(ggplot2)
library(here)
library(readxl)
library(gt)

I want to create new_df with columns from the following subsets:

Kube_liten$Unit_cm
Kube_Stor$Unit_cm

CodePudding user response:

Novice, we appreciate that you are new to R, but please study a few basics, in particular how vectors are recycled.

Your problem:

vec1 <- c(1,2,3)
vec2 <- c("A","B","C","D","E")

df <- data.frame(var1 = vec1, var2 = vec2)
Error in data.frame(var1 = vec1, var2 = vec2) : 
  arguments imply differing number of rows: 3, 5

You can "glue" the vectors together with cbind(), but check out the warning: the problem of differing vector lengths has not gone away.

df <- cbind(vec1, vec2)
Warning message:
In cbind(vec1, vec2) :
  number of rows of result is not a multiple of vector length (arg 1)

What you get: vec1 is "recycled". R assumes you want to fill the missing places by repeating the values (which might not be what you want). Note also that cbind() returns a character matrix here, so the numbers in vec1 become strings.

df
     vec1 vec2
[1,] "1"  "A" 
[2,] "2"  "B" 
[3,] "3"  "C" 
[4,] "1"  "D" 
[5,] "2"  "E" 

## you can convert this to a data frame, e.g. with as.data.frame(), if you prefer that object structure
df <- as.data.frame(cbind(vec1, vec2))
Warning message:
In cbind(vec1, vec2) :
  number of rows of result is not a multiple of vector length (arg 1)
df
  vec1 vec2
1    1    A
2    2    B
3    3    C
4    1    D
5    2    E
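
One more thing worth knowing: the warning only appears when the longer length is not a multiple of the shorter one. The sketch below uses made-up vectors (short and long are illustrative names, not from the question) to show recycling happening silently:

```r
# Illustrative vectors; the names are made up for this sketch
short <- c(1, 2)
long  <- c("A", "B", "C", "D")

# 4 is a multiple of 2, so cbind() recycles 'short' without any warning
m <- cbind(short, long)

# Mixing numbers and strings in one matrix coerces everything to character
typeof(m)      # "character"
m[, "short"]   # "1" "2" "1" "2"
```

So a missing warning does not mean the lengths matched; comparing length() on both vectors first is the safer check.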

So your idea of padding the shorter vector with NA is valid (and possibly what you want). You are on the right track.

  1. determine the length of your longest vector
  2. inject NAs where needed (mind you, you may not always want them at the end)
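
The two steps can be sketched with the example vectors from above. This version assumes the NAs belong at the end, which may not fit your data; n is just a helper name for this sketch:

```r
vec1 <- c(1, 2, 3)
vec2 <- c("A", "B", "C", "D", "E")

# Step 1: length of the longest vector
n <- max(length(vec1), length(vec2))

# Step 2: assigning a larger length pads the end of a vector with NA
length(vec1) <- n
length(vec2) <- n   # already length 5, so nothing changes here

# Both vectors now have 5 elements, so data.frame() no longer complains
df <- data.frame(var1 = vec1, var2 = vec2)
```

Unlike the cbind() route, var1 stays numeric here because data.frame() keeps each column's own type.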

This question has already been asked on Stack Overflow. Check out "How to cbind or rbind different lengths vectors without repeating the elements of the shorter vectors?"

CodePudding user response:

You can try the following, which extends the "short" vector with NA values:

col1 <- 1:9
col2 <- 1:12

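# indexing past the end of a vector returns NA, so this pads col1 to the length of col2
# (setdiff(col2, col1) would only work here because the values happen to equal their positions)
col1 <- col1[seq_along(col2)]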

data_comb <- data.frame(col1, col2)
# or
# data_comb <- cbind(col1, col2)

Output:

   col1 col2
1     1    1
2     2    2
3     3    3
4     4    4
5     5    5
6     6    6
7     7    7
8     8    8
9     9    9
10   NA   10
11   NA   11
12   NA   12

Since you didn't provide sample data or a desired output, I don't know if this will be the exact approach for your data.
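
For the columns named in the question, a sketch could look like the following. The two data frames and their values are invented stand-ins, and the column names liten and stor are made up for illustration; only Kube_liten$Unit_cm and Kube_Stor$Unit_cm come from the question:

```r
# Invented stand-ins for the asker's data frames; the values are made up
Kube_liten <- data.frame(Unit_cm = c(2.1, 2.4, 2.2))
Kube_Stor  <- data.frame(Unit_cm = c(5.0, 5.3, 4.9, 5.1, 5.2))

# Collect the columns in a named list; the names become the new column names
cols <- list(liten = Kube_liten$Unit_cm, stor = Kube_Stor$Unit_cm)

# The longest column decides the number of rows
n <- max(lengths(cols))

# Indexing past the end of a vector returns NA, so this pads each column
padded <- lapply(cols, function(x) x[seq_len(n)])

new_df <- as.data.frame(padded)
```

Working on a list means the same lines should carry over to three or more columns without changes.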
