Creating a difference matrix comparing differences between all rows in R dataframe

Time:12-09

I have a data frame with two columns: the first holds the names of US politicians (bioname), and the second holds their D-W ideological scores (dw1). I want to create a network in which each bioname is a node/vertex and each edge/tie is weighted by the difference between the two politicians' dw1 scores. For example, the edge weight between Trump and Biden should be .3615 (.7015 - .34), the weight between Trump and Rogers should be .022 (.7015 - .6795), and so on for EVERY POSSIBLE PAIR of subjects in the dataset.

How can I reformat my data to compute these differences for all politicians in the dataset?

bioname dw1
Trump 0.7015
Biden 0.3400
Rogers 0.6795
Sewell 0.3035
Brooks 0.8255

CodePudding user response:

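dist should also work. On a single numeric vector it returns the absolute difference for every pair, and as.matrix expands that into a full symmetric matrix: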

out <- with(df, as.matrix(dist(setNames(dw1, bioname))))

-output

> out
        Trump  Biden Rogers Sewell Brooks
Trump  0.0000 0.3615 0.0220 0.3980 0.1240
Biden  0.3615 0.0000 0.3395 0.0365 0.4855
Rogers 0.0220 0.3395 0.0000 0.3760 0.1460
Sewell 0.3980 0.0365 0.3760 0.0000 0.5220
Brooks 0.1240 0.4855 0.1460 0.5220 0.0000
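
Since the goal is a network, the matrix above can also be reshaped into one row per pair. The following is a minimal sketch using only base R; the object names idx and edges are illustrative and not part of the original answer, and it assumes each unordered pair should appear only once (an undirected network).

```r
# Example data from the question
df <- data.frame(
  bioname = c("Trump", "Biden", "Rogers", "Sewell", "Brooks"),
  dw1     = c(0.7015, 0.3400, 0.6795, 0.3035, 0.8255)
)

# Absolute pairwise differences as a full symmetric matrix
out <- with(df, as.matrix(dist(setNames(dw1, bioname))))

# Row/column positions of the upper triangle: each unordered pair once,
# with the zero diagonal excluded
idx <- which(upper.tri(out), arr.ind = TRUE)

# One row per pair: the two names and the difference as the edge weight
edges <- data.frame(
  from   = rownames(out)[idx[, "row"]],
  to     = colnames(out)[idx[, "col"]],
  weight = out[idx]
)
edges
```

With five politicians this gives choose(5, 2) = 10 rows. A data frame in this from/to/weight shape is the kind of input that graph packages usually accept, for example igraph::graph_from_data_frame(edges, directed = FALSE), assuming igraph is installed.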

CodePudding user response:

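I'd create a named vector and use outer to build a matrix of differences. Note that the result is signed (row value minus column value), so wrap it in abs() if you need non-negative edge weights. Calling your data frame df: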

named_vec = setNames(df$dw1, nm = df$bioname)
outer(named_vec, named_vec, FUN = "-")
#          Trump   Biden  Rogers Sewell  Brooks
# Trump   0.0000  0.3615  0.0220 0.3980 -0.1240
# Biden  -0.3615  0.0000 -0.3395 0.0365 -0.4855
# Rogers -0.0220  0.3395  0.0000 0.3760 -0.1460
# Sewell -0.3980 -0.0365 -0.3760 0.0000 -0.5220
# Brooks  0.1240  0.4855  0.1460 0.5220  0.0000

Using this data

df = read.table(text = 'bioname dw1
Trump   0.7015
Biden   0.3400
Rogers  0.6795
Sewell  0.3035
Brooks  0.8255', header = TRUE)
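
The question's examples (.3615, .022) are unsigned, while outer with "-" returns signed differences. As a small sketch of one way to reconcile the two, the matrix can be wrapped in abs(); the names signed, weights and same_as_dist are illustrative only.

```r
df <- read.table(text = "bioname dw1
Trump   0.7015
Biden   0.3400
Rogers  0.6795
Sewell  0.3035
Brooks  0.8255", header = TRUE)

named_vec <- setNames(df$dw1, nm = df$bioname)

# Signed differences: element [i, j] is named_vec[i] - named_vec[j]
signed <- outer(named_vec, named_vec, FUN = "-")

# Unsigned differences, usable as symmetric edge weights
weights <- abs(signed)

# Check agreement with the dist() approach, up to floating-point tolerance
same_as_dist <- isTRUE(all.equal(weights, as.matrix(dist(named_vec)),
                                 check.attributes = FALSE))
same_as_dist
```

Keeping the signed matrix can still be useful if the direction of the difference matters, for example to tell which politician in a pair has the higher score.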