How to combine two columns to make one long column when there is no ID var?

Time:10-16

I have two data frames, each containing a single column of occupation info with 2000 rows. I just want to stack the two columns into one 4000-row column so I can compute percentages and make a bar chart of them.

Essentially, I want to take two data frames shaped like this:

Occupation1

Lobbyist
Government Employee
Government Employee
Lobbyist
Teacher
Teacher

Occupation2

Lawyer
Government Employee
Lobbyist
Teacher
Teacher

I want this outcome:

Occupation

Lobbyist
Government Employee
Government Employee
Lobbyist
Teacher
Teacher
Lawyer
Government Employee
Lobbyist
Teacher
Teacher

CodePudding user response:

You can use rbind() to bind the rows, with setNames() to give both data frames the same column name (rbind() requires the column names to match):

rbind(
  setNames(df1, "Occupation"),
  setNames(df2, "Occupation")
)
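
For reference, here is a minimal self-contained sketch of this approach using the sample values from the question. The object names df1, df2, and stacked are assumptions for illustration, not something the question specifies.

```r
# Sample data taken from the question; df1 and df2 are assumed names.
df1 <- data.frame(Occupation1 = c("Lobbyist", "Government Employee",
                                  "Government Employee", "Lobbyist",
                                  "Teacher", "Teacher"))
df2 <- data.frame(Occupation2 = c("Lawyer", "Government Employee",
                                  "Lobbyist", "Teacher", "Teacher"))

# rbind() on data frames matches columns by name, so both inputs are
# renamed to a common column name before stacking.
stacked <- rbind(
  setNames(df1, "Occupation"),
  setNames(df2, "Occupation")
)

nrow(stacked)   # number of rows in df1 plus number of rows in df2
names(stacked)  # the single shared column name
```

Because setNames() returns a renamed copy, df1 and df2 themselves are left unchanged.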

CodePudding user response:


data.frame(Occupation = c(df1$Occupation1, df2$Occupation2))
#>             Occupation
#> 1             Lobbyist
#> 2  Government Employee
#> 3  Government Employee
#> 4             Lobbyist
#> 5              Teacher
#> 6              Teacher
#> 7               Lawyer
#> 8  Government Employee
#> 9             Lobbyist
#> 10             Teacher
#> 11             Teacher

Created on 2022-10-15 with reprex v2.0.2

data

df1 <- structure(list(Occupation1 = c("Lobbyist", "Government Employee", "Government Employee", "Lobbyist", "Teacher", "Teacher")), 
                 class = "data.frame", row.names = c(NA, -6L))

df2 <- structure(list(Occupation2 = c("Lawyer", "Government Employee", "Lobbyist", "Teacher", "Teacher")), 
                 class = "data.frame", row.names = c(NA, -5L))

CodePudding user response:

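In base R, combine the two columns with c(). Each column is already a vector, so wrapping the call in unlist() is unnecessary. Note that the result is a character vector, not a data frame:

c(df1$Occupation1, df2$Occupation2)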
 [1] "Lobbyist"            "Government Employee" "Government Employee" "Lobbyist"            "Teacher"             "Teacher"             "Lawyer"             
 [8] "Government Employee" "Lobbyist"            "Teacher"             "Teacher"

In dplyr, use bind_rows(). It matches columns by name, so rename both columns to a common name first:

library(dplyr)

bind_rows(df1 %>% rename(Occupation = Occupation1), 
          df2 %>% rename(Occupation = Occupation2))
            Occupation
1             Lobbyist
2  Government Employee
3  Government Employee
4             Lobbyist
5              Teacher
6              Teacher
7               Lawyer
8  Government Employee
9             Lobbyist
10             Teacher
11             Teacher
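
The question's end goal is percentages and a bar chart. Once the column is stacked, one possible way to get there in base R is sketched below. This is only an illustration: the object names (stacked, counts, pct) are assumptions, and the data are the sample values from the question rather than the real 4000 rows.

```r
# Stacked sample data from the question (assumed object name: stacked).
stacked <- data.frame(
  Occupation = c("Lobbyist", "Government Employee", "Government Employee",
                 "Lobbyist", "Teacher", "Teacher",
                 "Lawyer", "Government Employee", "Lobbyist",
                 "Teacher", "Teacher")
)

# Count each occupation, then convert the counts to percentages.
counts <- table(stacked$Occupation)
pct <- 100 * prop.table(counts)
round(pct, 1)

# Bar chart of the percentages; las = 2 rotates the axis labels.
barplot(pct, ylab = "Percent", las = 2)
```

prop.table() divides each count by the total, so the percentages sum to 100 regardless of how many rows are stacked.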