I want to create a dummy variable in R that has a value of 1 if the corporation exists in another specific dataset. Explanation: I have a dataframe with key financial data for all Norwegian firms. In another dataframe there is a list of all firms with subsidiaries in other countries. I want to create a dummy variable that identifies, in the financial dataframe, whether the firm has foreign subsidiaries. That way I can run a multivariable regression. Is there any way to create such a dummy variable? The companies use the same identification system, so it should be easy to connect the dataframes.
CodePudding user response:
Here is one possible option using tidyverse, with a made-up example (I am just guessing at the data structure):
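library(tidyverse)

financial %>%
  # keep every firm; firms without subsidiaries get NA in `subsidiary`
  left_join(subsidiaries, by = c("firm", "ID")) %>%
  # 1 if the firm has a subsidiary outside Norway, otherwise 0
  mutate(dummy = ifelse(!is.na(subsidiary) & subsidiary != "Norway", 1, 0))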
Output
   firm ID money subsidiary dummy
1 firm1  1   234       <NA>     0
2 firm2  2   345   country1     1
3 firm2  2   345   country2     1
4 firm3  3   352   country1     1
5 firm3  3   352   country3     1
6 firm4  4   546   country1     1
7 firm5  5   232     Norway     0
Data
financial <- data.frame(firm = c("firm1", "firm2", "firm3", "firm4", "firm5"),
                        ID = c(1, 2, 3, 4, 5),
                        money = c(234, 345, 352, 546, 232))
subsidiaries <- data.frame(firm = c("firm2", "firm2", "firm3", "firm3", "firm4", "firm5"),
                           ID = c(2, 2, 3, 3, 4, 5),
                           subsidiary = c("country1", "country2", "country1", "country3", "country1", "Norway"))
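Note that the join above returns one row per firm-subsidiary pair, so firm2 and firm3 each appear twice. For a regression you will probably want one row per firm. A minimal base R sketch of that variant, reusing the made-up data above (the column names and the "Norway" label are still guesses about the real data):

```r
# Same made-up example data as above
financial <- data.frame(firm = c("firm1", "firm2", "firm3", "firm4", "firm5"),
                        ID = c(1, 2, 3, 4, 5),
                        money = c(234, 345, 352, 546, 232))
subsidiaries <- data.frame(firm = c("firm2", "firm2", "firm3", "firm3", "firm4", "firm5"),
                           ID = c(2, 2, 3, 3, 4, 5),
                           subsidiary = c("country1", "country2", "country1", "country3", "country1", "Norway"))

# IDs of firms with at least one subsidiary outside Norway
foreign_ids <- unique(subsidiaries$ID[subsidiaries$subsidiary != "Norway"])

# 1 if the firm's ID is among them, otherwise 0; `financial` keeps one row per firm
financial$dummy <- as.integer(financial$ID %in% foreign_ids)

financial
```

Here `dummy` is 0 for firm1 and firm5 and 1 for the others, and the result can go straight into something like lm(money ~ dummy, data = financial).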
CodePudding user response:
You can merge the two dataframes by firm and then use ifelse to create the dummy variable.
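# all.x = TRUE keeps firms that have no match in firmsdata (merge drops them by default)
newdf <- merge(financialdata, firmsdata, by = "firms", all.x = TRUE)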
[It would have been easier if there was an example of the datasets you are working with]
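A fuller sketch of the merge-then-ifelse idea, assuming hypothetical data: the contents of `financialdata` and `firmsdata`, and the columns `revenue`, `country`, and `has_foreign`, are all invented for illustration and will differ from the real datasets.

```r
# Hypothetical example data; real column names will differ
financialdata <- data.frame(firms = c("A", "B", "C"),
                            revenue = c(10, 20, 30))
firmsdata <- data.frame(firms = c("B", "C", "C"),
                        country = c("Sweden", "Denmark", "Germany"))

# Collapse to one row per firm so the merge does not duplicate firms
firms_flag <- data.frame(firms = unique(firmsdata$firms), has_foreign = 1)

# Left join: keep every firm in financialdata, matched or not
newdf <- merge(financialdata, firms_flag, by = "firms", all.x = TRUE)

# Unmatched firms have NA in has_foreign; turn that into a 0/1 dummy
newdf$dummy <- ifelse(is.na(newdf$has_foreign), 0, 1)
newdf$has_foreign <- NULL

newdf
```

Collapsing `firmsdata` to unique firms first is a design choice: it keeps one row per firm in `newdf`, which is usually what a regression needs.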