I am trying to plot some non-numeric data in R.
I have a data frame with the following column names and values:
Vessel ID   Fabric   Vessel Use
P560        M        Storing
P765        S        Cooking
P456        M        Storing
....        ....     ....
I want to plot the Storing vessels that are made of M fabric.
I tried this:
library(ggplot2)
pottery <- read.csv(file = "potter.csv", header = TRUE)
class(pottery)
ggplot(data = pottery) +
  geom_point(mapping = aes(x = Use == "Storing", y = Fabric == "M")) +
  ggtitle("Test") +
  labs(y = "M Fabric", x = "Storing Use") +
  theme(plot.title = element_text(hjust = 0.5))
Data used
set.seed(1)
df <- data.frame(ID = paste0('p', sample(100:999, 100, FALSE)),
                 Fabric = sample(c('M', 'S'), 100, TRUE),
                 Use = sample(c("Storing", "Cooking"), 100, TRUE))
head(df)
#> ID Fabric Use
#> 1 p935 M Cooking
#> 2 p778 M Cooking
#> 3 p228 S Cooking
#> 4 p608 M Storing
#> 5 p570 M Cooking
#> 6 p398 S Storing
Created on 2022-09-15 with reprex v2.0.2
CodePudding user response:
I'm not really sure what you want to show. If it is just the count of vessel IDs that have fabric M and are used for storing, you can do something like the following.
Your example:
set.seed(123)
df <- data.frame(vessel_id = sample(c("P560", "P765", "P456"), 100, replace = TRUE),
                 fabric = sample(c("M", "S"), 100, replace = TRUE),
                 vessel_use = sample(c("Storing", "Cooking"), 100, replace = TRUE))
head(df)
#>   vessel_id fabric vessel_use
#> 1      P456      M    Cooking
#> 2      P456      S    Cooking
#> 3      P456      S    Storing
#> 4      P765      M    Storing
#> 5      P456      M    Cooking
#> 6      P765      M    Storing
Then you can use:
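library("data.table")
setDT(df)  # convert df to a data.table by reference
# keep rows with M fabric and Storing use, count them per vessel ID, plot the counts
plot(df[fabric == "M" & vessel_use == "Storing", table(vessel_id)],
     main = "vessel ID with M fabric and Storing use",
     xlab = "vessel ID",
     ylab = "count")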
This gives you something like this: link to imgur
Here you use
- the library data.table, converting the data frame to a data.table and using its dt[i, j, by] syntax
- the function table(), which counts the occurrences of items in the vessel ID column of the subset df[fabric == "M" & vessel_use == "Storing", ]
- the base R plot() function
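Since the question started from ggplot2, the same count could probably also be drawn there. The following is only a minimal sketch under a few assumptions: it rebuilds the example data from above, the name m_storing is made up for illustration, and a bar chart of counts is my guess at what is wanted, not something the question confirms.

```r
library(ggplot2)

# Stand-in data with the same columns as the example above
set.seed(123)
df <- data.frame(
  vessel_id  = sample(c("P560", "P765", "P456"), 100, replace = TRUE),
  fabric     = sample(c("M", "S"), 100, replace = TRUE),
  vessel_use = sample(c("Storing", "Cooking"), 100, replace = TRUE)
)

# Filter first: keep only the Storing vessels made of M fabric
# (m_storing is a hypothetical name used only in this sketch)
m_storing <- subset(df, fabric == "M" & vessel_use == "Storing")

# geom_bar() counts the rows per vessel_id itself, so no y aesthetic is given
p <- ggplot(m_storing, aes(x = vessel_id)) +
  geom_bar() +
  labs(title = "Vessel IDs with M fabric and Storing use",
       x = "Vessel ID", y = "Count") +
  theme(plot.title = element_text(hjust = 0.5))

print(p)
```

The main difference from the attempt in the question is that the conditions go into the subsetting step, not into aes(). Mapping Use == "Storing" and Fabric == "M" to x and y only plots TRUE/FALSE values against each other; it does not filter any rows.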