I'm analyzing the wine data set (loaded with data(wine)) from the R package gclus.
I split the data set 70:30 into a training set and a test set.
library(gclus)
data("wine")
sample_size <- floor(0.70 * nrow(wine))
set.seed(123)
train_index <- sample(seq_len(nrow(wine)), size = sample_size)
train <- wine[train_index, ]
test <- wine[-train_index, ]
How can I now carry out LDA for each of the following data subsets?
wine[c("Class", "Ash", "OD280", "Nonflavanoid")]
wine[c("Class", "Hue", "Magnesium", "Flavanoids", "Alcohol", "Malic", "Intensity", "Alcalinity", "Proline")]
CodePudding user response:
You can use the lda function from the MASS package:
MASS::lda(Class ~ Ash + OD280 + Nonflavanoid, data = train)
Call:
lda(Class ~ Ash + OD280 + Nonflavanoid, data = train)
Prior probabilities of groups:
        1         2         3
0.3225806 0.3790323 0.2983871

Group means:
       Ash    OD280 Nonflavanoid
1 2.487500 3.163250    0.2950000
2 2.255106 2.766809    0.3727660
3 2.456486 1.714865    0.4575676

Coefficients of linear discriminants:
                    LD1       LD2
Ash           0.8918665  3.862102
OD280        -2.3406276 -0.482271
Nonflavanoid  1.2410102 -5.865365

Proportion of trace:
   LD1    LD2
0.8966 0.1034
You can do the same for the second subset by listing its predictors in the formula, separated by +.
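For the second subset, a minimal sketch could look like the following. It repeats the split from the question so that it runs on its own. The object names fit2, pred2, conf_mat and accuracy are made up for illustration, and the evaluation on the test set is an extra step the question did not ask for.

```r
library(gclus)  # provides the wine data set
library(MASS)   # provides lda()

# Recreate the 70:30 split from the question
data("wine")
sample_size <- floor(0.70 * nrow(wine))
set.seed(123)
train_index <- sample(seq_len(nrow(wine)), size = sample_size)
train <- wine[train_index, ]
test <- wine[-train_index, ]

# Fit LDA on the second set of predictors (fit2 is an illustrative name)
fit2 <- lda(Class ~ Hue + Magnesium + Flavanoids + Alcohol + Malic +
              Intensity + Alcalinity + Proline, data = train)

# Optional extra step: classify the held-out rows and compare with the true classes
pred2 <- predict(fit2, newdata = test)
conf_mat <- table(actual = test$Class, predicted = pred2$class)
accuracy <- mean(as.character(pred2$class) == as.character(test$Class))

print(fit2)
print(conf_mat)
print(accuracy)
```

If you would rather keep the column vectors from the question, fitting lda(Class ~ ., data = train[cols]) with cols set to the character vector of column names (including "Class") should give the same model, since the dot stands for every column other than the response.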