Home > Enterprise >  Loop in R through columns and find the first value equal 1
Loop in R through columns and find the first value equal 1

Time:11-03

I would like to loop through the columns in the data frame (from left to right) and find the first value that is equal 1 in each row. If the value is equal 1, then I would like to add a new column to the data frame called x_time = 9,10,11,12 or 13 depending at which time point the value 1 was found first.

See the data example

df <- data.frame(x9 = c('$7', '$7', 2, '$7', 1, '$7'),
                 x10 = c('$7', 1, '$7', '$7', '$7', '$7'),
                 x11 = c('$7', '$7', 2, '$7', 1, '$7'),
                 x12 = c(1, 1, 2, '$7', '$7', '$7'),
                 x13 = c('$7', '$7', 2, '$7', 2, '$7'))
                
Desired output:

  x9 x10 x11 x12 x13 x_time
1 $7  $7  $7   1  $7     12
2 $7   1  $7   1  $7     10
3  2  $7   2   2   2     NA
4 $7  $7  $7  $7  $7     NA
5  1  $7   1  $7   2      9
6 $7  $7  $7  $7  $7     NA

Please let me know what would the most sufficient approach.

In Stata I would just make a global macro and loop through its content:

global varlist “x09 x10 x11 x12 x13” 
gen x_time = .
foreach var in $varlist {
    replace x_time = substr("`var'",-2,.) if x_time == . & `var' == 1
}

CodePudding user response:

You can loop out like this

vec <- c()
for (k in 1:nrow(df)) {
  if(length(which(as.vector(unlist(df[k,]))=="1"))>0){
    vec[k] <- as.numeric(gsub("x","",colnames(df)[which(as.vector(unlist(df[k,]))=="1")[1]]))
      
  }else{
    vec[k] <- NA
  }
  
}

df$x_time <- vec

output

> df
  x9 x10 x11 x12 x13 x_time
1 $7  $7  $7   1  $7     12
2 $7   1  $7   1  $7     10
3  2  $7   2   2   2     NA
4 $7  $7  $7  $7  $7     NA
5  1  $7   1  $7   2      9
6 $7  $7  $7  $7  $7     NA

CodePudding user response:

Here is a data.table approach

library(data.table)
# Convert to data.table, keep rownames as identifier
setDT(df, keep.rownames = TRUE)
# join molten data on df
df[melt(df, id.vars = "rn")[value == 1, .SD[1], by = .(rn)],
   x_time := gsub("x", "", i.variable),
   on = .(rn)]
#    rn x9 x10 x11 x12 x13 x_time
# 1:  1 $7  $7  $7   1  $7     12
# 2:  2 $7   1  $7   1  $7     10
# 3:  3  2  $7   2   2   2   <NA>
# 4:  4 $7  $7  $7  $7  $7   <NA>
# 5:  5  1  $7   1  $7   2      9
# 6:  6 $7  $7  $7  $7  $7   <NA>

CodePudding user response:

Here is a tidyverse approach:

  1. concatenate the column names if colum is 1.
  2. as we search for the first column with 1 we could use parse_number that selects the first number from string!
library(dplyr)
library(tidyr)

df %>% 
  mutate(across(x9:x13, ~case_when(. == "1" ~ cur_column()), .names = 'new_{col}')) %>%
  unite(New_Col, starts_with('new'), na.rm = TRUE, sep = ' ') %>% 
  mutate(x_time=parse_number(New_Col), .keep="unused")

output:

  x9 x10 x11 x12 x13 x_time
1 $7  $7  $7   1  $7     12
2 $7   1  $7   1  $7     10
3  2  $7   2   2   2     NA
4 $7  $7  $7  $7  $7     NA
5  1  $7   1  $7   2      9
6 $7  $7  $7  $7  $7     NA
  • Related