Home > Mobile >  Match two columns in dataframe to values in table with multiple columns
Match two columns in dataframe to values in table with multiple columns

Time:10-02

I have a dataframe like so:

team_points <- c(1,2,2,2,1)
values_rem <- c(15.5,15.3,15.1,15.3,15.2)
df1 <- data.frame(team_points, values_rem)

Output:

  team_points values_rem
1           1       15.5
2           2       15.3
3           2       15.1
4           2       15.3
5           1       15.2

I also have a table like so:

values_rem_tbl <- c(15.1, 15.2, 15.3, 15.4, 15.5)
team_points_0 <- c(44.2,44.4,44.6,44.8,45)
team_points_1 <- c(46.2,46.4,46.6,46.8,47)
team_points_2 <- c(48.2,48.4,48.6,48.8,49)
tbl <- data.frame(values_rem_tbl, team_points_0, team_points_1, team_points_2)

Output:

  values_rem_tbl team_points_0 team_points_1 team_points_2
1           15.1          44.2          46.2          48.2
2           15.2          44.4          46.4          48.4
3           15.3          44.6          46.6          48.6
4           15.4          44.8          46.8          48.8
5           15.5          45.0          47.0          49.0

What I am trying to do is match the values in each row of df1 (team_points and values_rem) to the values that appear in the table.

So, for example, row 1 in df1 has team_points as 1, values_rem as 15.5. I would like to check the values tbl by matching the 15.5 to values_rem_tbl (row 5) and then team_points 1 to team_points_1 column, which would return value of 47.0.

Desired output:

  team_points values_rem   Result
1           1       15.5   47.0
2           2       15.3   48.6
3           2       15.1   48.2
4           2       15.3   48.6
5           1       15.2   46.4

The tbl has values_rem_tbl from 0 to 50 (increment of 0.1) and team_points from team_points_0 to team_points_9 across.

The df1 has many other columns, but these are two required so only showing these two in the example.

I have looked at dplyr and pivot_longer but can't seem to get the desired outcome.

CodePudding user response:

So, if I understand you correctly you want to match combinations of (team_points x values_rem) from df1 to the appropriate entries in tbl.

It's seems this process would be a lot clearer if tbl was transformed to have separate columns for team_points and values_rem, similar to df1.

This code does just that:

pivoted_tbl <- pivot_longer(
    tbl,
    cols = team_points_0:team_points_9, # Select/Choose which columns to pivot
    names_to = "team_points", # This is the name for the new column that labels which column a row came from
    names_pattern = ".*(\\d)$", # This is a regular expression pattern that determines the values for the new column ('team_points' above). This pattern gets the last digit ("\\d") in a column name.,
    names_transform = list(team_points = as.double), # ensures the 'team_points' column is the same type (double) as the sister column in `df1`
    values_to = "Result"
)

Now you can simply join the two tables together using dplyr::inner_join().

inner_join(df1, pivoted_tbl, by = c("team_points" = "team_points", "values_rem" = "values_rem_tbl"))
  • Related