Carrying out a Linear Discriminant Analysis

I'm analyzing the data set data(wine) from the R package gclus.

I split the data set 70:30 into a training set and a test set.

library(gclus)
data("wine")
sample_size <- floor(0.70 * nrow(wine))
set.seed(123)
train_index <- sample(seq_len(nrow(wine)), size = sample_size)
train <- wine[train_index, ]
test <- wine[-train_index, ]

How could I now carry out LDA for the following data subsets:

wine[c("Class", "Ash", "OD280", "Nonflavanoid")]
wine[c("Class", "Hue", "Magnesium", "Flavanoids", "Alcohol", "Malic", "Intensity", "Alcalinity", "Proline")]

CodePudding user response:

You can use the lda function from the MASS package, fitted on the training set:

MASS::lda(Class ~ Ash + OD280 + Nonflavanoid, train)
Call:
lda(Class ~ Ash + OD280 + Nonflavanoid, data = train)

Prior probabilities of groups:
        1         2         3 
0.3225806 0.3790323 0.2983871 

Group means:
       Ash    OD280 Nonflavanoid
1 2.487500 3.163250    0.2950000
2 2.255106 2.766809    0.3727660
3 2.456486 1.714865    0.4575676

Coefficients of linear discriminants:
                    LD1       LD2
Ash           0.8918665  3.862102
OD280        -2.3406276 -0.482271
Nonflavanoid  1.2410102 -5.865365

Proportion of trace:
   LD1    LD2 
0.8966 0.1034

You can do the same for the second case by listing its predictors in the formula, separated by +.
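
For the second subset, a minimal sketch along the same lines might look like the following. It repeats the 70:30 split from the question so that it runs on its own. The object names fit2, pred2, conf_mat and accuracy are only illustrative, and scoring the model on the test set is an assumption about what you want to do next, not something the question asks for.

```r
library(gclus)
library(MASS)

# Recreate the 70:30 split from the question
data("wine")
sample_size <- floor(0.70 * nrow(wine))
set.seed(123)
train_index <- sample(seq_len(nrow(wine)), size = sample_size)
train <- wine[train_index, ]
test <- wine[-train_index, ]

# Second subset: list the eight predictors from the question in the formula
fit2 <- lda(Class ~ Hue + Magnesium + Flavanoids + Alcohol + Malic +
              Intensity + Alcalinity + Proline, data = train)

# Predict the held-out rows and compare with the true classes
pred2 <- predict(fit2, newdata = test)
conf_mat <- table(predicted = pred2$class, actual = test$Class)
accuracy <- mean(as.character(pred2$class) == as.character(test$Class))

print(fit2)
print(conf_mat)
print(accuracy)
```

If you would rather reuse the column vectors from the question, fitting lda(Class ~ ., data = train[cols]) with cols set to one of those vectors should give the same model, since the dot stands for every column other than Class.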