Conditional modifying of rows in a dataframe in R

Time:12-30

In my Data, I'm trying to change y to qlogis(y) when dv=="B".

But why do I get a warning message saying "NaNs produced"?

Note that .5 and .6 are between 0 and 1 and shouldn't return any NaNs.

library(dplyr)

Data = read.table(text=
"
id dv y
1  A  2
1  B  .5
1  C  11
2  A  4
2  B  .6
2  C  19
", header=TRUE)

mutate(Data, y = ifelse(dv=="B", qlogis(y), y))

CodePudding user response:

We can use replace to apply qlogis only to the subset of rows where dv == "B", rather than to the full column:

library(dplyr)
Data %>% 
   mutate(y = replace(y, dv == "B", qlogis(y[dv == "B"])))

-output

 id dv          y
1  1  A  2.0000000
2  1  B  0.0000000
3  1  C 11.0000000
4  2  A  4.0000000
5  2  B  0.4054651
6  2  C 19.0000000
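
For comparison, the same idea can be sketched in base R with subset assignment. This is only an illustration under the question's data; the helper name is_b is made up here and is not part of the original answer.

```r
# Same input as in the question
Data <- read.table(text = "
id dv y
1  A  2
1  B  .5
1  C  11
2  A  4
2  B  .6
2  C  19
", header = TRUE)

# Logical index of the rows to transform (is_b is a hypothetical helper name)
is_b <- Data$dv == "B"

# qlogis only ever sees 0.5 and 0.6, so no NaN and no warning
Data$y[is_b] <- qlogis(Data$y[is_b])
Data
```

Because qlogis is never called on the values outside [0, 1], there is nothing for it to warn about.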

The warning itself comes from values such as 2, not from 0.5, i.e.

> qlogis(2)
[1] NaN
Warning message:
In qlogis(2) : NaNs produced
> qlogis(0.5)
[1] 0

The warning is related to how ifelse processes its arguments: test, yes and no are all treated as full-length vectors. Thus, qlogis(y) is evaluated on the whole column, and only afterwards does ifelse pick out the elements where dv == "B". The NaNs at the other positions are discarded, which is why the result still looks correct, but by then qlogis has already issued the warning for the values greater than 1, i.e. in the source code

...
p = log(lower_tail ? (p / (1. - p)) : ((1. - p) / p));
...

Thus, with the default lower.tail = TRUE, a value greater than 1 means taking the log of a negative number:

> log((2/(1-2)))
[1] NaN
> log((1/(1-1)))
[1] Inf
> qlogis(1)
[1] Inf
> qlogis(0.9)
[1] 2.197225
> qlogis(1.2)
[1] NaN
Warning message:
In qlogis(1.2) : NaNs produced
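
As a small sketch of the same point, the y values from the question can be checked against the [0, 1] range before calling qlogis. The names y and outside are illustrative only.

```r
# y column from the question's data
y <- c(2, .5, 11, 4, .6, 19)

# TRUE where the value is not a valid probability
outside <- y < 0 | y > 1

which(outside)   # positions 1, 3, 4 and 6
y[outside]       # 2, 11, 4 and 19: the values for which qlogis returns NaN
```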

The relevant part of the ifelse source is:

> ifelse
function (test, yes, no) 
{
...

    ans <- test
    len <- length(ans)
    ypos <- which(test)
    npos <- which(!test)
    if (length(ypos) > 0L) 
        ans[ypos] <- rep(yes, length.out = len)[ypos]
    if (length(npos) > 0L) 
        ans[npos] <- rep(no, length.out = len)[npos]
    ans
...

Now, we follow the same steps as in the ifelse code:

> test <- Data$dv == "B"
> ans <- test
> len <- length(ans)
> ypos <- which(test)
> npos <- which(!test)
> ypos
[1] 2 5
> npos
[1] 1 3 4 6
> yes <- qlogis(Data$y)  # qlogis is evaluated on the whole column, not only on rows 2 and 5
Warning message:
In qlogis(Data$y) : NaNs produced
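
The remaining steps can be sketched as follows. This is a simplified re-enactment of the excerpt above, not the full ifelse source, and suppressWarnings is used only so the sketch runs quietly.

```r
# Minimal copy of the relevant columns from the question
Data <- data.frame(dv = c("A", "B", "C", "A", "B", "C"),
                   y  = c(2, .5, 11, 4, .6, 19))

test <- Data$dv == "B"
ans  <- test
len  <- length(ans)
ypos <- which(test)    # 2 5
npos <- which(!test)   # 1 3 4 6

# 'yes' is computed for every row; the NaNs land exactly on the npos rows
yes <- suppressWarnings(qlogis(Data$y))
no  <- Data$y
which(is.nan(yes))     # 1 3 4 6

# Only positions 2 and 5 are taken from 'yes', so the NaNs never reach 'ans'
ans[ypos] <- rep(yes, length.out = len)[ypos]
ans[npos] <- rep(no, length.out = len)[npos]
ans
```

So the final vector contains no NaN, even though qlogis produced four of them (and the warning) along the way.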