Hi, I have two datasets that represent two different groups:
student_details <- c("John", "Henrick", "Maria", "Lucas", "Ali")
student_class <- c("High School", "College", "Preschool", "High School", "college")
df1 <- data.frame(student_details, student_class)
#another dataframe
Student_details <- c("Bracy", "Evin")
Student_class <- c("High school", "College")
Student_rank <- c("A", "A ")
df2 <- data.frame(Student_class, Student_details, Student_rank)
df2
I need to rbind df1 and df2 even though they have a different number of columns, and add a column called "dataset" to the result that indicates which dataset each row came from.
CodePudding user response:
I am assuming the column names student_details and student_class are the same across both data frames. You can use bind_rows() from dplyr, which is more flexible than rbind(): columns missing from one data frame are filled with NA.
student_details <- c("John", "Henrick", "Maria", "Lucas", "Ali")
student_class <- c("High School", "College", "Preschool", "High School", "college")
df1 <- data.frame(student_details, student_class)
student_details <- c("Bracy", "Evin")
student_class <- c("High school", "College")
student_rank <- c("A", "A ")
df2 <- data.frame(student_details, student_class, student_rank)
library(dplyr)
df_full <- bind_rows(df1, df2)
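The question also asks for a "dataset" column. A minimal sketch of one way to get it, assuming dplyr is installed: pass a named list to bind_rows() together with its .id argument. The labels "df1" and "df2" are my own choice, not something from the question.

```r
library(dplyr)

df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class   = c("High School", "College", "Preschool", "High School", "college")
)
df2 <- data.frame(
  student_details = c("Bracy", "Evin"),
  student_class   = c("High school", "College"),
  student_rank    = c("A", "A ")
)

# The list names ("df1", "df2") are illustrative labels; .id stores them
# in a new first column called "dataset".
df_full <- bind_rows(list(df1 = df1, df2 = df2), .id = "dataset")
df_full
```

Rows from df1 get NA in student_rank, because that column only exists in df2.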
CodePudding user response:
You can use the rbindlist() function from the data.table package to accomplish this.
It is important that the column names are the same in both dataframes, as you want to bind by column name.
#convert uppercase letters in column names to lower case.
names(df2) <- tolower(names(df2))
Next, bind them together:
library(data.table)
final_df <- rbindlist(list(df1, df2), use.names = TRUE, fill = TRUE, idcol = "dataset")
final_df
Output:
   dataset student_details student_class student_rank
1:       1            John   High School         <NA>
2:       1         Henrick       College         <NA>
3:       1           Maria     Preschool         <NA>
4:       1           Lucas   High School         <NA>
5:       1             Ali       college         <NA>
6:       2           Bracy   High school            A
7:       2            Evin       College            A
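If you would rather see labels than 1 and 2 in the dataset column, one option is to pass a named list, since rbindlist() takes the id values from the list names when they are present. A small sketch, assuming data.table is installed; the labels "df1" and "df2" are just my choice:

```r
library(data.table)

df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class   = c("High School", "College", "Preschool", "High School", "college")
)
df2 <- data.frame(
  Student_class   = c("High school", "College"),
  Student_details = c("Bracy", "Evin"),
  Student_rank    = c("A", "A ")
)

# Make the column names match before binding by name
names(df2) <- tolower(names(df2))

# With a named list, idcol is filled with the list names (illustrative labels)
final_df <- rbindlist(list(df1 = df1, df2 = df2),
                      use.names = TRUE, fill = TRUE, idcol = "dataset")
final_df
```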
CodePudding user response:
With your specific df1 and df2 (once their column names match), we can try merge from base R
> merge(df1, df2, all = TRUE, sort = FALSE)
  student_details student_class student_rank
1            John   High School         <NA>
2         Henrick       College         <NA>
3           Maria     Preschool         <NA>
4           Lucas   High School         <NA>
5             Ali       college         <NA>
6           Bracy   High school            A
7            Evin       College            A
but the data.table option using rbindlist should work in a more general sense (see the answer by @Flap), since merge would combine any rows whose shared columns match in both data frames.
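If you want to stay in base R and still get the "dataset" column, here is a rough sketch: pad each data frame with the columns it lacks, tag it, then rbind. The helper add_missing() and the labels "df1"/"df2" are hypothetical names I made up for illustration, and this assumes the column names have already been made consistent.

```r
df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class   = c("High School", "College", "Preschool", "High School", "college")
)
df2 <- data.frame(
  student_details = c("Bracy", "Evin"),
  student_class   = c("High school", "College"),
  student_rank    = c("A", "A ")
)

# Hypothetical helper: add every column in `cols` that `df` lacks (filled
# with NA) and return the columns in the order given by `cols`.
add_missing <- function(df, cols) {
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- NA
  }
  df[cols]
}

all_cols <- union(names(df1), names(df2))

# Tag each data frame with an illustrative label, then stack them
final_df <- rbind(
  cbind(dataset = "df1", add_missing(df1, all_cols)),
  cbind(dataset = "df2", add_missing(df2, all_cols))
)
final_df
```

Unlike merge, this never collapses rows, so duplicates across the two data frames are kept.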