Dataset manipulation in R

Time:12-05

If mydata is a matrix with multiple rows and columns, why is a negative sign used in the following example? Is it a matrix inversion? Thank you!

test <- sample(1:dim(mydata)[1])
new.test <- mydata[test, ]
train <- mydata[-test, ]

CodePudding user response:

Basically, what your code is doing is:

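# Shuffle the row indexes of mydata. With no size argument, sample()
# returns all of them in random order, not just some of them.
test <- sample(1:dim(mydata)[1])
# Create the test data from the selected rows of the original data
new.test <- mydata[test, ]
# Create the training data from the remaining rows of the original data.
# The - sign drops the indexes held in test and keeps the rest. Because test
# holds every row index here, train ends up with zero rows; pass a size
# to sample() to leave rows for training.
train <- mydata[-test, ]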
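
For comparison, here is a minimal sketch of a split that does leave rows for training. The 10-row matrix and the 30% test fraction are assumptions made up for illustration; neither comes from the question.

```r
# Hypothetical stand-in for mydata: 10 rows, 3 columns.
set.seed(42)  # only so the sketch is reproducible
mydata <- matrix(rnorm(30), nrow = 10, ncol = 3)

n <- nrow(mydata)                           # same value as dim(mydata)[1]
test <- sample(n, size = floor(0.3 * n))    # a subset of distinct row indexes

new.test <- mydata[test, , drop = FALSE]    # only the sampled rows
train    <- mydata[-test, , drop = FALSE]   # every row except the sampled ones

dim(new.test)
dim(train)
```

drop = FALSE is optional here; it just keeps the result a matrix even if only one row is selected.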

CodePudding user response:

You are overthinking it. This is R's way of specifying which indices to include and which to exclude. A minus sign in front of an index tells R to drop those rows from the data frame, nothing more. It is not a matrix inversion.

> df = data.frame(x = c("a","b"), y=1:2)
> df
  x y
1 a 1
2 b 2
> df2 = df[-1, ]
> df2
  x y
2 b 2
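
The same holds for a matrix. Below is a small sketch contrasting negative indexing with an actual inversion, which base R does with solve(). The 2 x 2 matrix m is a made-up example, not something from the question.

```r
m <- matrix(c(2, 0, 0, 4), nrow = 2)   # hypothetical invertible matrix

row2     <- m[-1, ]                 # drops row 1; the remaining row comes back as a plain vector
row2_mat <- m[-1, , drop = FALSE]   # same rows, but kept as a 1 x 2 matrix
no_rows  <- m[-c(1, 2), ]           # drops both rows, leaving a matrix with zero rows

m_inv <- solve(m)                   # the matrix inverse of m
m_inv %*% m                         # multiplying back gives the identity matrix
```

Negative indexing only selects rows and never changes the values, whereas solve() computes a new matrix.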
Tags: r