When I load this data frame in R, I want the following result:
if the number in the NAICS_CD column is present in top_3, the is_present column should be 1, otherwise 0. I need to do this in R.
CodePudding user response:
We could use str_detect from the stringr package together with an ifelse statement:
library(dplyr)
library(stringr)
df %>%
mutate(is_present = ifelse(str_detect(top_3, as.character(NAICS_CD)), 1, 0))
NAICS_CD top_3 is_present
1 541611 ["541611","541618","611430"] 1
2 812990 ["561720","561740","561790"] 0
3 424950 ["444120","711510","811121"] 0
4 722330 ["311991","722310","722320"] 0
5 722320 ["722320","722330","722310"] 1
6 531180 ["531110","531190","531111"] 0
7 484121 ["484121","484110","484230"] 1
8 531311 ["531110","531311","531111"] 1
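A caveat on the approach above: str_detect looks for the code anywhere inside the string, so it is a substring match. With six-digit codes throughout that is probably fine, but if exact matching matters, a base R sketch along these lines could extract the codes first and compare them with %in%. The name df_demo is hypothetical; it is a three-row subset of the data in this answer, used only for illustration.

```r
# Hypothetical three-row subset in the same shape as the question's data
df_demo <- data.frame(
  NAICS_CD = c(541611L, 812990L, 722320L),
  top_3 = c('["541611","541618","611430"]',
            '["561720","561740","561790"]',
            '["722320","722330","722310"]'),
  stringsAsFactors = FALSE
)

# Pull every run of digits out of each top_3 string (one character vector per row)
codes <- regmatches(df_demo$top_3, gregexpr("[0-9]+", df_demo$top_3))

# Compare each NAICS_CD with the codes of its own row only, as whole tokens
df_demo$is_present <- as.integer(
  mapply(function(code, set) code %in% set,
         as.character(df_demo$NAICS_CD), codes)
)

df_demo$is_present
# [1] 1 0 1
```

Because the comparison is on whole extracted codes, a shorter value such as 5416 would not count as present just because 541611 is in the list.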
data:
df <- structure(list(NAICS_CD = c(541611L, 812990L, 424950L, 722330L,
722320L, 531180L, 484121L, 531311L), top_3 = c("[\"541611\",\"541618\",\"611430\"]",
"[\"561720\",\"561740\",\"561790\"]", "[\"444120\",\"711510\",\"811121\"]",
"[\"311991\",\"722310\",\"722320\"]", "[\"722320\",\"722330\",\"722310\"]",
"[\"531110\",\"531190\",\"531111\"]", "[\"484121\",\"484110\",\"484230\"]",
"[\"531110\",\"531311\",\"531111\"]")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8"))
CodePudding user response:
You can use grepl to check whether a pattern (here, the target number) is present in the target row, and then use any to collapse the result to a single TRUE or FALSE that ifelse can work with. Then use ifelse to assign 1 or 0 to the is_present column.
For example:
top3_row1 <- '["541611", "541618","611430"]'
is_present <- ifelse(any(grepl("541611", top3_row1)), 1, 0)
is_present
[1] 1
To apply this to your data frame, you can use a for loop, among other approaches. For example:
for (k in seq_len(nrow(mydf))) {
  # Index top_3 with [k] so each code is checked against its own row only
  mydf$is_present[k] <- ifelse(any(grepl(mydf$NAICS_CD[k], mydf$top_3[k])), 1, 0)
}
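As one of the "other ways", the row-by-row check can also be written without an explicit loop. The following is a minimal sketch using mapply to pair each code with the top_3 string of the same row. The two-row mydf is made up for illustration from rows of the question's data, and fixed = TRUE is my addition so the code is treated as a literal string rather than a regular expression.

```r
# Hypothetical two-row data frame; 722330 appears in row 2's list but not in row 1's
mydf <- data.frame(
  NAICS_CD = c(722330L, 722320L),
  top_3 = c('["311991","722310","722320"]',
            '["722320","722330","722310"]'),
  stringsAsFactors = FALSE
)

# mapply calls grepl(pattern, x, fixed = TRUE) once per row, pairing elements by position
mydf$is_present <- as.integer(
  mapply(grepl, as.character(mydf$NAICS_CD), mydf$top_3,
         MoreArgs = list(fixed = TRUE))
)

mydf$is_present
# [1] 0 1
```

Row 1 gets 0 even though 722330 occurs in row 2, which is the behaviour the question asks for: each code is compared only with the list on its own row.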