Confused about the use of [ ] to index a list of colours


I'm running an example from the book "Beckerman, A. P., & Petchey, O. L. (2012). Getting Started with R: An Introduction for Biologists. Oxford, UK: Oxford University Press."

The example is this

# plot window 
par(mfrow = c(1,1))
# the plot
plot(EGGS ~ DENSITY, data = limp, pch = 19, cex = 1.5, 
    col = c("Black","Red")[limp$SEASON], 
    xlab = list("Density", cex = 1.2), 
    ylab = list("Eggs Produced", cex = 1.2))
# add a legend 
legend(35,3,legend = c("spring","summer"),  
     col = c("black", "red"), pch = c(19, 19))

But when I include the [limp$SEASON] index in the col argument, R doesn't draw any of the data points on the plot.

This is the data

structure(list(DENSITY = c(8L, 8L, 8L, 8L, 8L, 8L, 15L, 15L, 
15L, 15L, 15L, 15L, 30L, 30L, 30L, 30L, 30L, 30L, 45L, 45L, 45L, 
45L, 45L, 45L), SEASON = c("spring", "spring", "spring", "summer", 
"summer", "summer", "spring", "spring", "spring", "summer", "summer", 
"summer", "spring", "spring", "spring", "summer", "summer", "summer", 
"spring", "spring", "spring", "summer", "summer", "summer"), 
    EGGS = c(2.875, 2.625, 1.75, 2.125, 1.5, 1.875, 2.6, 1.866, 
    2.066, 0.867, 0.933, 1.733, 2.23, 1.466, 1, 1.267, 0.467, 
    0.7, 1.4, 1.022, 1.177, 0.711, 0.356, 0.711), season = structure(c(1L, 
    1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 
    2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("spring", "summer"
    ), class = "factor")), row.names = c(NA, -24L), class = c("data.table", 
"data.frame"))

CodePudding user response:

@DanY's answer gives good general context, but the likely reason this specific example fails is that the indexing only works if limp$SEASON is a factor or an integer vector. Prior to R version 4.0 (i.e., when this example was written), the default when importing data was to convert character columns to factors automatically (stringsAsFactors = TRUE). That is no longer the default, so SEASON is now a character vector. Indexing the unnamed colour vector with strings returns NA for every point, and points with an NA colour are not drawn. You probably need to run limp$SEASON <- factor(limp$SEASON) before the plotting code.
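
A minimal sketch of the difference, using a hypothetical three-element stand-in (season_chr) for limp$SEASON rather than the real data. Note that the dput() output in the question also shows a lowercase season column that is already a factor, so indexing with limp$season may work as well.

```r
cols <- c("Black", "Red")

# Hypothetical stand-in for limp$SEASON as a character column (the R >= 4.0 default)
season_chr <- c("spring", "summer", "spring")

# Character indexing looks up element names; cols has none, so every result is NA
cols[season_chr]

# A factor is indexed by its integer codes (levels sort to spring = 1, summer = 2)
season_fct <- factor(season_chr)
cols[season_fct]
```

The first lookup returns only NA values, while the second returns one valid colour per element.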

CodePudding user response:

This is a common technique to color points on scatterplots created with the plot() function in R.

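The general idea is to index a vector of colors (color_vec) with a vector of integers (integer_vec) that has one entry per point, so that color_vec[integer_vec] returns the color for each point.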

For example, suppose x and y are length-5 vectors so that plot(x, y) creates a scatterplot of 5 points. Let's suppose we want to color some of those points red and some blue. We could define color_vec and integer_vec as:

color_vec <- c("red", "blue")
integer_vec <- c(2, 2, 1, 2, 1)

Then the code color_vec[integer_vec] produces:

## "blue" "blue" "red"  "blue" "red"

Adding it to the plot command then produces a scatterplot of 5 points where the third and fifth are colored red, the others blue:

plot(x, y, col=color_vec[integer_vec])

Note that integer_vec is commonly a factor variable in the data frame containing x and y, so that each color in color_vec is assigned to one level of the factor, in level order.
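
As a rough, self-contained sketch of the factor case: the data frame df, its group column, and the numeric values below are made up for illustration and are not from the question.

```r
# Hypothetical data: five points in two groups
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1),
  group = factor(c("b", "b", "a", "b", "a"))
)
color_vec <- c("red", "blue")

# Levels sort alphabetically: "a" is level 1 (red), "b" is level 2 (blue)
point_cols <- color_vec[df$group]

plot(df$x, df$y, col = point_cols, pch = 19)
# Reusing levels() keeps the legend in the same order as color_vec
legend("topleft", legend = levels(df$group), col = color_vec, pch = 19)
```

Building the legend from levels(df$group) instead of typing the labels by hand avoids a mismatch between legend entries and point colors if the level order ever changes.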
