R: How to construct an index which states for each country pair the correlation between the yes vote

Time:12-21

I have a dataset which states for each UN resolution the country and the vote:

ResolutionID: 1, 2, 3, ...
Country: US, CA, MX, ...
Vote: yes, no, abstain

I want to create a variable that gives, for each country pair (e.g. US-CA, US-MX, MX-CA, ...), the correlation of their voting records, thus providing an index of each pair's friendship or strategic alliance.

What R code do I have to use?

Citation of the dataset: Erik Voeten, "Data and Analyses of Voting in the UN General Assembly", Routledge Handbook of International Organization, edited by Bob Reinalda (published May 27, 2013)


CodePudding user response:

Using the algorithm for calculating the 'index of agreement' as proposed by Lijphart (1963), the below might be what you're after.
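
To make the scoring rule concrete before the full code, here is a minimal sketch for a single country pair. The helper name `pair_agreement` and the two vote vectors are made up for illustration; the scoring mirrors the rule used in this answer (1 for the same vote, 0.5 if exactly one country abstains, 0 for opposite votes, and resolutions where either country has no recorded vote are skipped).

```r
# Hypothetical helper: mean agreement score for two vote vectors,
# aligned so that element i of each vector refers to the same resolution
pair_agreement <- function(x, y) {
  score <- ifelse(is.na(x) | is.na(y), NA_real_,          # skip if either vote is missing
           ifelse(x == y, 1,                               # same vote
           ifelse(x == "abstain" | y == "abstain", 0.5,    # exactly one side abstains
                  0)))                                     # opposite votes
  mean(score, na.rm = TRUE)
}

# Made-up votes on five resolutions
us <- c("yes", "no",  "abstain", "yes",     NA)
ca <- c("yes", "yes", "no",      "abstain", "no")

pair_agreement(us, ca)  # scores 1, 0, 0.5, 0.5 (fifth skipped), mean 0.5
```

Note that under this rule two abstentions count as full agreement, which is what the code in this answer does as well.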

## set up sample data
set.seed(32446)
test = data.frame(rcid = rep(seq(3), each=10), country = rep(letters[seq(10)], 3), vote = sample(c("yes","no"),30, replace = TRUE))
test$vote[sample(seq(30),3)] = "abstain" # add in a few abstentions
test = test[-sample(seq(30), 2), ] # remove some as missing
test


## set up the comparison df
allCountries = unique(test$country)

compdf = outer(allCountries, allCountries, paste)
compdf = data.frame(compdf[which(lower.tri(compdf))])
names(compdf) = "comp"

## cycle through the resolutions - use scoring as per Lijphart (1963):
## agreement is the average of the scores 1 if there is agreement,
## 0 if the votes are opposite, and 0.5 if only one country abstains.
for(r in unique(test$rcid)){
  tempVotes = data.frame(countries = allCountries,
                         test$vote[which(test$rcid==r)][match(allCountries, test$country[which(test$rcid==r)])])
  tempVotes = outer(tempVotes[,2], tempVotes[,2], 
                    FUN=function(x,y){
                      ifelse(is.na(x) | is.na(y), NA, # NA if either country didn't vote
                             ifelse(x==y, 1, # 1 if they agree (including both abstaining)
                                    ifelse(x=="abstain" | y=="abstain", 0.5, # 0.5 if only one side abstains
                                           0) # 0 otherwise (opposite votes)
                                    )
                             )
                      } 
                    )
  compdf = cbind(compdf, tempVotes[which(lower.tri(tempVotes) )] )  
  names(compdf)[ncol(compdf)] = paste0("resolution_",r)
}

## calculate the mean score across resolutions
result = data.frame(comp = compdf$comp,
                    result = rowMeans(compdf[, seq(2, ncol(compdf)), drop = FALSE], na.rm = TRUE) # drop = FALSE keeps a data frame even with a single resolution
                    )
result$result = 1-result$result # make it into a distance score, rather than agreement score

## create distance matrix and plot Dendrogram
library(ggplot2)
library(ggdendro)
distance = matrix(NA, length(allCountries), length(allCountries))
distance[which(lower.tri(distance))] = result$result
rownames(distance) = allCountries; colnames(distance) = allCountries
distance
cluster =  hclust(as.dist(distance))
ggdendrogram(cluster, rotate = FALSE, size = 2)
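
If a literal correlation of the yes votes is wanted, as the question asks, one possible sketch is to reshape the data to one column per country and call `cor()`. The toy data frame `votes` below is made up, and coding abstentions as "not yes" (0) is an assumption you may want to change.

```r
# Toy data in the same long format as the question (values are made up)
votes <- data.frame(
  rcid    = rep(1:4, each = 3),
  country = rep(c("US", "CA", "MX"), times = 4),
  vote    = c("yes", "yes",     "no",
              "no",  "no",      "yes",
              "yes", "abstain", "yes",
              "no",  "no",      "no")
)

# Assumption: 1 = yes, 0 = anything else (no or abstain)
votes$yes <- as.numeric(votes$vote == "yes")

# Wide matrix: one row per resolution, one column per country;
# country/resolution combinations without a vote become NA
wide <- tapply(votes$yes, list(votes$rcid, votes$country), mean)

# Pairwise Pearson correlation, using only resolutions both countries voted on
cor_mat <- cor(wide, use = "pairwise.complete.obs")

# Long format: one row per country pair
idx <- which(lower.tri(cor_mat), arr.ind = TRUE)
pairs <- data.frame(
  country1 = rownames(cor_mat)[idx[, "row"]],
  country2 = colnames(cor_mat)[idx[, "col"]],
  yes_cor  = cor_mat[idx]
)
pairs
```

A country that always votes the same way has zero variance, so its correlations come out as `NA` (with a warning); the agreement index above does not have that limitation.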

Lijphart, A. The Analysis of Bloc Voting in the General Assembly: A Critique and a Proposal. The American Political Science Review Vol. 57, No. 4 (Dec., 1963), pp. 902-917 https://www.jstor.org/stable/1952608
