Plot non-numeric data in R


I am trying to plot some non-numeric data in R.

I have a data frame with the following column names and values:

Vessel ID    P560     P765     P456     ....
Fabric       M        S        M        ....
Vessel Use   Storing  Cooking  Storing  ....

I want to plot the Storing vessels that are made of M fabric.

I tried this:

library(ggplot2)
pottery <- read.csv(file = "potter.csv", header = TRUE)
ggplot(data = pottery) +
  geom_point(mapping = aes(x = Use == "Storing", y = Fabric == "M")) +
  labs(y = "M Fabric", x = "Storing Use") +
  theme(plot.title = element_text(hjust = 0.5))

And this is what I get: (image)

Data used

df <- data.frame(ID     = paste0('p', sample(100:999, 100, FALSE)),
                 Fabric = sample(c('M', 'S'), 100, TRUE),
                 Use    = sample(c("Storing", "Cooking"), 100, TRUE))
head(df)

#>     ID Fabric     Use
#> 1 p935      M Cooking
#> 2 p778      M Cooking
#> 3 p228      S Cooking
#> 4 p608      M Storing
#> 5 p570      M Cooking
#> 6 p398      S Storing

Created on 2022-09-15 with reprex v2.0.2
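
With data of this shape, one way to look at both categorical columns at once is to cross-tabulate them and plot the counts. The following is only a minimal base R sketch, not necessarily what the original poster had in mind: the seed and the name `counts` are assumptions added here for reproducibility and illustration.

```r
# Same shape as the "Data used" block above; the seed is an assumption
# added here so that the result is reproducible.
set.seed(1)
df <- data.frame(ID     = paste0("p", sample(100:999, 100, FALSE)),
                 Fabric = sample(c("M", "S"), 100, TRUE),
                 Use    = sample(c("Storing", "Cooking"), 100, TRUE))

# Cross-tabulate the two categorical columns:
# rows are fabrics, columns are uses.
counts <- table(Fabric = df$Fabric, Use = df$Use)

# Number of Storing vessels made of M fabric.
counts["M", "Storing"]

# Grouped bar chart: one group per use, one bar per fabric.
barplot(counts, beside = TRUE, legend.text = TRUE,
        xlab = "Use", ylab = "Number of vessels")
```

The cell `counts["M", "Storing"]` is the quantity the question asks about; the other three cells give it context.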

I'm not really sure what you want to show. If it is just the count of the vessel IDs that have fabric M and are used for storage, you can do something like the following.

Your example:

> set.seed(123)
> df <- data.frame(vessel_id = sample(c("P560", "P765", "P456"), 100, replace = TRUE),
                   fabric = sample(c("M", "S"), 100, replace = TRUE),
                   vessel_use = sample(c("Storing", "Cooking"), 100, replace = TRUE))
> head(df)

  vessel_id fabric vessel_use
1      P456      M    Cooking
2      P456      S    Cooking
3      P456      S    Storing
4      P765      M    Storing
5      P456      M    Cooking
6      P765      M    Storing

Then you can use:



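library(data.table)
setDT(df)  # convert the data frame to a data.table by reference

# Subset to M fabric and Storing use, count each vessel ID, and plot the counts
plot(df[fabric == "M" & vessel_use == "Storing", table(vessel_id)],
     main = "vessel ID with M fabric and Storing use",
     xlab = "vessel ID",
     ylab = "count")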

This gives you something like this: link to imgur

Here you use:

  1. the data.table library: the data frame is converted to a data.table (for example with setDT()), which enables the dt[i, j, by] syntax
  2. the function table(), which counts the occurrences of each vessel ID in the subset df[fabric == "M" & vessel_use == "Storing", ]
  3. the base R plot() function
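
For comparison, the same three steps can be sketched in base R alone, without data.table. This is a minimal sketch rather than a replacement for the answer above: the names `m_storing` and `id_counts` are made up for illustration, and barplot() is swapped in for plot() to draw one bar per vessel ID.

```r
# Rebuild the example data from the answer (same seed and column names).
set.seed(123)
df <- data.frame(
  vessel_id  = sample(c("P560", "P765", "P456"), 100, replace = TRUE),
  fabric     = sample(c("M", "S"), 100, replace = TRUE),
  vessel_use = sample(c("Storing", "Cooking"), 100, replace = TRUE)
)

# 1. Keep only the rows with fabric M and Storing use.
m_storing <- df[df$fabric == "M" & df$vessel_use == "Storing", ]

# 2. Count how often each vessel ID occurs in that subset.
id_counts <- table(m_storing$vessel_id)

# 3. Draw one bar per vessel ID.
barplot(id_counts,
        main = "vessel ID with M fabric and Storing use",
        xlab = "vessel ID",
        ylab = "count")
```

In a plain data frame the columns have to be referenced as df$fabric and df$vessel_use inside the brackets; the bare column names used in the answer only work once the object is a data.table.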
