The
CodePudding user response:
First of all: Your example data don't match any lines (df2
doesn't provide any names contained in your example df1
).
If I got your question right, you could use
library(dplyr)
library(purrr)
library(readr)
df1 %>%
inner_join(df2, by = c("name" = "distinctOrganizations")) %>%
split(f = .$name) %>%
walk(~write_csv(.x, paste0(unique(.x$name), ".csv")))
- We use an
inner_join
to remove all elements from df1
that don't have a match in df2
- Then we
split
the resulting data.frame by name, creating a new data.frame for each (distinct) organization
- Finally we use
purrr
's walk
function to write a .csv
-file for each of these organizations. This produces .csv
-files like Amgen Inc.csv
or ALVAS EDUCATION FOUNDATION.csv
.
Note: The address
column contains some line breaks (\n
). You should consider removing them, those could cause trouble in your .csv
and in your next steps working with those. There are also some white spaces in column type_of_sponsor
(at the beginning and the end) you perhaps want to remove.
Data
I modified df2
to get two matches:
df2 <- structure(list(distinctOrganizations = c("Amgen Inc", "A and U tibbia college and hospital",
"ALVAS EDUCATION FOUNDATION", "A KIREETI", "AAMIR ZUBAIR SHAIKH",
"Aansu Susan Varghese")), row.names = c(NA, 6L), class = "data.frame")