What exactly does the logical parameter of the `subset` function do in R?

I am learning R with the book Learning R by Richard Cotton (Chapter 5: Lists and Data Frames), and I don't understand one of the examples. I have this data frame and the following code:

(a_data_frame <- data.frame(
x = letters[1:5],
y = rnorm(5),
z = runif(5) > 0.5
))

  x          y     z
1 a  0.6395739 FALSE
2 b -1.1645383 FALSE
3 c -1.3616093 FALSE
4 d  0.5658254 FALSE
5 e  0.4345538 FALSE

subset(a_data_frame, y > 0 | z, x) # what exactly does y > 0 | z mean?

The book says:

subset takes up to three arguments: a data frame to subset, a logical vector of conditions for rows to include, and a vector of column names to keep

It gives no more information about the second (logical) argument.

CodePudding user response：

It's a tricky example. In subset(a_data_frame, y > 0 | z, x), the second argument, y > 0 | z, is evaluated row by row: a row is kept when y > 0 is TRUE or when the value in column z is TRUE. The | operator is the element-wise "or".

y > 0 tests the values generated by rnorm(5); your values differ from the book's because they are generated randomly. The | z part keeps a row when its z value is TRUE, even if y > 0 is FALSE for that row. In your case every z is FALSE, so only y > 0 matters (rows a, d, and e are returned) and you can't see what z contributes. As a didactic example, if we use z = rnorm(5) instead of runif(5) > 0.5, both columns are numeric and we can write a comparison on each, which makes it easier to see how the function works.
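To see what the second argument evaluates to, you can build the logical vector yourself. This is only a minimal sketch: the values are copied from the printout in the question instead of calling rnorm/runif, so the result is reproducible, and the names df, keep, res, and res2 are just for illustration.

```r
# Values hard-coded from the question's printout (assumption: stands in for the random data)
df <- data.frame(
  x = letters[1:5],
  y = c(0.6395739, -1.1645383, -1.3616093, 0.5658254, 0.4345538),
  z = c(FALSE, FALSE, FALSE, FALSE, FALSE)
)

# The condition is evaluated inside the data frame and gives one value per row
keep <- with(df, y > 0 | z)
print(keep)  # TRUE FALSE FALSE TRUE TRUE

# subset() keeps the rows where the condition is TRUE, and only column x
res <- subset(df, y > 0 | z, x)
print(res)   # rows 1, 4 and 5 (a, d, e)

# Roughly the same selection with bracket indexing;
# which() drops NA conditions, as subset() also does
res2 <- df[which(keep), "x", drop = FALSE]
print(res2)
```

So the second argument is nothing more than an expression that yields one TRUE/FALSE per row, written with the column names as if they were variables.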

(a_data_frame <- data.frame(
x = letters[1:5],
y = rnorm(5),
z = rnorm(5)
))

  x           y           z
1 a -0.91016367  2.04917552
2 b  0.01591093  0.03070526
3 c  0.19146220 -0.42056236
4 d  1.07171934  1.31511485
5 e  1.14760483 -0.09855757

So if we ask for y < 0 or z < 0, the output is column x for rows a, c, and e:

> subset(a_data_frame, y < 0 | z < 0, x)
  x
1 a
3 c
5 e
> subset(a_data_frame, y < 0 & z < 0, x) # no row has both y < 0 and z < 0
[1] x
<0 rows> (or 0-length row.names)
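> subset(a_data_frame, y < 0 & z, x) # numeric z is coerced to logical (nonzero is TRUE), so only row 1 has y < 0
  x
1 a
> subset(a_data_frame, y < 0 | z, x) # every z is nonzero, so the condition is TRUE for all rows
  x
1 a
2 b
3 c
4 d
5 e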
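
One thing to watch when z is numeric rather than logical: used on its own in a condition, a number is coerced to logical, with 0 becoming FALSE and any other number TRUE. Here is a small sketch with made-up values (df2 and the result names are hypothetical, not the data above) where some z values are exactly 0, so the effect is visible:

```r
# Made-up data for illustration only
df2 <- data.frame(
  x = letters[1:4],
  y = c(-1, 2, -3, 4),
  z = c(0, 0, 5, -2)
)

# How a bare numeric column is read by & and |
z_as_logical <- as.logical(df2$z)
print(z_as_logical)  # FALSE FALSE TRUE TRUE

# y < 0 AND z nonzero: only row c
both <- subset(df2, y < 0 & z, x)
print(both)

# y < 0 OR z nonzero: rows a, c and d
either <- subset(df2, y < 0 | z, x)
print(either)
```

If you mean "z is negative" or "z is positive", write the comparison explicitly (z < 0, z > 0) instead of relying on this coercion.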