I am learning R with the book Learning R by Richard Cotton (Chapter 5: Lists and Data Frames), and I don't understand one of the examples. I have this data frame and the following script:
(a_data_frame <- data.frame(
  x = letters[1:5],
  y = rnorm(5),
  z = runif(5) > 0.5
))
  x          y     z
1 a  0.6395739 FALSE
2 b -1.1645383 FALSE
3 c -1.3616093 FALSE
4 d  0.5658254 FALSE
5 e  0.4345538 FALSE
subset(a_data_frame, y > 0 | z, x) # what exactly does y > 0 | z mean?
I read the book and it says:
subset takes up to three arguments: a data frame to subset, a logical vector of conditions for rows to include, and a vector of column names to keep
It gives no more information about the second (logical) argument.
CodePudding user response:
It's a tricky example. In subset(a_data_frame, y > 0 | z, x), the second argument, y > 0 | z, is the row condition: a row is kept when y > 0 is TRUE or ("|") when the value in column z is TRUE.
y > 0 is evaluated on the values generated by rnorm(5); your values differ from the book's because they are random. In your data every value of z is FALSE, so the "| z" part never adds a row and only y > 0 decides the result (rows 1, 4 and 5), which is why you can't see what is going on. As a didactic example, if we change z to z = rnorm(5) instead of runif(5) > 0.5, it is easier to see how the function works:
(a_data_frame <- data.frame(
  x = letters[1:5],
  y = rnorm(5),
  z = rnorm(5)
))
  x           y           z
1 a -0.91016367  2.04917552
2 b  0.01591093  0.03070526
3 c  0.19146220 -0.42056236
4 d  1.07171934  1.31511485
5 e  1.14760483 -0.09855757
So if we ask for y < 0 or z < 0, the output is column x for rows 1, 3 and 5 (a, c, e):
> subset(a_data_frame, y < 0 | z < 0, x)
x
1 a
3 c
5 e
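To make the row condition visible, here is a minimal sketch that builds it step by step. It hard-codes the values printed above so the result is reproducible; the names fixed_df, y_neg, z_neg and keep are only illustrative.

```r
# Values copied from the data frame printed above (no random numbers involved)
fixed_df <- data.frame(
  x = letters[1:5],
  y = c(-0.91016367, 0.01591093, 0.19146220, 1.07171934, 1.14760483),
  z = c(2.04917552, 0.03070526, -0.42056236, 1.31511485, -0.09855757)
)

y_neg <- fixed_df$y < 0     # TRUE FALSE FALSE FALSE FALSE
z_neg <- fixed_df$z < 0     # FALSE FALSE TRUE FALSE TRUE
keep  <- y_neg | z_neg      # element-wise OR: TRUE FALSE TRUE FALSE TRUE

# subset() evaluates the same condition inside the data frame
# and keeps the rows where it is TRUE, returning only column x
res <- subset(fixed_df, y < 0 | z < 0, x)
print(res)                  # rows 1, 3 and 5 (a, c, e)
```

Replacing | with & in the condition requires both comparisons to be TRUE in the same row.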
> subset(a_data_frame, y < 0 & z < 0, x) # no row has both y < 0 and z < 0
[1] x
<0 rows> (or 0-length row.names)
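The last two calls use z on its own as the condition, which is how the book's example works when z is logical. The output below appears to come from a separate run in which z was logical; with the numeric z printed above, every non-zero value would be treated as TRUE.
> subset(a_data_frame, y < 0 & z, x) # TRUE only for row 2
x
2 b
> subset(a_data_frame, y < 0 | z, x) # TRUE for rows 2 and 4
x
2 b
4 d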
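For the original form of the question, y > 0 | z with a logical z, the following sketch may help. The values are made up for illustration (they are not from the book or from the question), and demo, cond, via_subset and via_bracket are hypothetical names. It also shows what I understand to be the equivalent bracket indexing.

```r
# Illustrative values only; z is a logical column, as in the book's example
demo <- data.frame(
  x = letters[1:5],
  y = c(-0.5, 1.2, -0.3, 0.8, -1.0),
  z = c(TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Row condition: y is positive, OR z is TRUE
cond <- demo$y > 0 | demo$z          # TRUE TRUE FALSE TRUE FALSE

via_subset  <- subset(demo, y > 0 | z, x)
via_bracket <- demo[cond, "x", drop = FALSE]  # same rows and column, selected by hand

print(via_subset)                    # rows 1, 2 and 4 (a, b, d)
isTRUE(all.equal(via_subset, via_bracket))
```

Row 1 is kept only because z is TRUE, row 2 only because y > 0, and row 4 because both hold, which is the effect that stays hidden when every z is FALSE.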