Dummy variable if company exists in another dataframe in R

I want to create a dummy variable in R that equals 1 if a company also appears in another dataset. Explanation: I have a dataframe with key financial data for all Norwegian firms. Another dataframe lists all firms that have subsidiaries in other countries. I want a dummy variable in the financial dataframe that flags firms with foreign subsidiaries, so that I can use it in a multivariable regression. Is there a way to create such a variable? Both dataframes use the same company identification system, so it should be easy to link them.

CodePudding user response：

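Here is one option using the tidyverse, with a made-up example (I am guessing at the data structure). A left join keeps every firm in the financial data, and the dummy is 1 whenever a matched subsidiary is located outside Norway: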

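library(tidyverse)

# Keep every firm in `financial`; firms with no match get NA in `subsidiary`.
# A firm with several subsidiaries appears once per subsidiary.
financial %>%
  left_join(subsidiaries, by = c("firm", "ID")) %>%
  mutate(dummy = ifelse(!is.na(subsidiary) & subsidiary != "Norway", 1, 0))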

Output

   firm ID money subsidiary dummy
1 firm1  1   234       <NA>     0
2 firm2  2   345   country1     1
3 firm2  2   345   country2     1
4 firm3  3   352   country1     1
5 firm3  3   352   country3     1
6 firm4  4   546   country1     1
7 firm5  5   232     Norway     0

Data

financial <- data.frame(firm = c("firm1", "firm2", "firm3", "firm4", "firm5"),
                 ID = c(1, 2, 3, 4, 5),
                 money = c(234, 345, 352, 546, 232))


subsidiaries <- data.frame(firm = c("firm2", "firm2", "firm3", "firm3", "firm4", "firm5"),
                          ID = c(2, 2, 3, 3, 4, 5),
                          subsidiary = c("country1", "country2", "country1", "country3", "country1", "Norway"))
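Because the join above repeats a firm once per subsidiary, a regression on that output would count some firms more than once. If one row per firm is needed, a membership test on the shared ID may be enough. The following is only a sketch in base R using the same made-up data; the column name foreign_sub and the helper vector foreign_ids are hypothetical, and the "Norway" filter simply mirrors the condition used in the answer above.

```r
# Toy data copied from the example above.
financial <- data.frame(firm = c("firm1", "firm2", "firm3", "firm4", "firm5"),
                        ID = c(1, 2, 3, 4, 5),
                        money = c(234, 345, 352, 546, 232))
subsidiaries <- data.frame(firm = c("firm2", "firm2", "firm3", "firm3", "firm4", "firm5"),
                           ID = c(2, 2, 3, 3, 4, 5),
                           subsidiary = c("country1", "country2", "country1",
                                          "country3", "country1", "Norway"))

# IDs of firms with at least one subsidiary outside Norway.
foreign_ids <- unique(subsidiaries$ID[subsidiaries$subsidiary != "Norway"])

# One row per firm: 1 if the firm's ID is in that set, 0 otherwise.
# `foreign_sub` is a hypothetical column name.
financial$foreign_sub <- as.integer(financial$ID %in% foreign_ids)
```

With this version financial keeps its original five rows, so the new column can go straight into a model formula alongside the other financial variables.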

CodePudding user response：

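You can merge the two dataframes by firm and then use ifelse to create the dummy variable. Note that merge does an inner join by default, which drops firms that are missing from the second dataframe; set all.x = TRUE to keep every firm from the financial data.

newdf <- merge(financialdata, firmsdata, by = "firms", all.x = TRUE)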

[It would have been easier if there was an example of the datasets you are working with]
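Since no sample data was posted, here is a rough sketch of how the merge-then-ifelse idea might look end to end. Everything in it is assumed for illustration: the toy contents of financialdata and firmsdata, the columns revenue and country, and the marker column has_foreign. The second dataframe is collapsed to one row per firm first, so the merge does not duplicate firms.

```r
# Hypothetical stand-ins for the asker's two data frames.
financialdata <- data.frame(firms = c("A", "B", "C"), revenue = c(10, 20, 30))
firmsdata <- data.frame(firms = c("B", "B", "C"), country = c("SE", "DK", "SE"))

# Collapse to one row per firm and attach a marker column before merging.
flagged <- data.frame(firms = unique(firmsdata$firms), has_foreign = 1)

# all.x = TRUE keeps firms that have no match; their marker becomes NA.
newdf <- merge(financialdata, flagged, by = "firms", all.x = TRUE)

# Turn the NA markers into 0 to finish the dummy.
newdf$has_foreign <- ifelse(is.na(newdf$has_foreign), 0, 1)
```

Without all.x = TRUE, merge would return only the firms present in both data frames, and the dummy would be 1 for every remaining row.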