Home > other >  How to join a dataset by row index?
How to join a dataset by row index?

Time:04-19

I have one dataset that looks like this

Rownumber Name
5 John
7 Ken
9 Marvin

I have another that looks like this

LastName
Ming
Roe
Martin
Hugo
Sawyer
Wallace
Thomas
Wang
Abdul

Note that I want to join second dataset to first. In the first I have a column called row number and the second I don't, but I have a backend index.

I would like my dataset to look like this

Rownumber Name Last name
5 John Sawyer
7 Ken Thomas
9 Marvin Abdul

You see that in the second dataset Sawyer is the 5th row, Thomas is the 7th, and Abdul is the 9th that is way it merged.

CodePudding user response:

Suppose your data are stored in two data.frames named df and df2, you could use a dplyr join:

library(dplyr)
df2 %>% 
  mutate(Rownumber = row_number()) %>% 
  right_join(df, by = "Rownumber") %>% 
  select(Rownumber, Name, LastName)

This returns

  Rownumber   Name LastName
1         5   John   Sawyer
2         7    Ken   Thomas
3         9 Marvin    Abdul
  • First create row numbers for your data.frame containing the last names.
  • Next right_join the first data.frame by Rownumber.

Data

df <- structure(list(Rownumber = c(5, 7, 9), Name = c("John", "Ken", 
"Marvin")), row.names = c(NA, -3L), class = "data.frame")

df2 <- structure(list(LastName = c("Ming", "Roe", "Martin", "Hugo", 
"Sawyer", "Wallace", "Thomas", "Wang", "Abdul")), row.names = c(NA, 
-9L), class = "data.frame")

CodePudding user response:

I guess you can try this without left_join

> transform(df,LastName = df2$LastName[Rownumber])
  Rownumber   Name LastName
1         5   John   Sawyer
2         7    Ken   Thomas
3         9 Marvin    Abdul

CodePudding user response:

Another option is to use merge, where we use Rownumber from df1 for by.x, then use 0 from df2 for by.y, which refers to the row names.

merge(df1, df2, by.x = "Rownumber", by.y = 0, all.x = TRUE, all.y = FALSE)

Output

  Rownumber   Name LastName
1         5   John   Sawyer
2         7    Ken   Thomas
3         9 Marvin    Abdul

Data

df1 <- structure(list(Rownumber = c(5, 7, 9), Name = c("John", "Ken", 
"Marvin")), row.names = c(NA, -3L), class = "data.frame")

df2 <- structure(list(LastName = c("Ming", "Roe", "Martin", "Hugo", 
"Sawyer", "Wallace", "Thomas", "Wang", "Abdul")), row.names = c(NA, 
-9L), class = "data.frame")
  • Related