My data frame holds the number of gifts for boys and the number of gifts for girls in each state.
I'm trying to plot the two variables side by side using the barplot() function.
I can't figure out how to get both on the plot. The way I'm doing it now assigns two colours, but to the same variable.
Essentially, I want two bars for each region (so for New Hampshire it plots df$GB and df$GG, for Massachusetts it plots df$GB and df$GG, etc.).
Any thoughts?
y <- as.matrix(df$GB, df$GG)
barplot(y, main='Gifts', col=c('green', 'red'),
beside=TRUE, legend=c('G', 'H'), xaxt='n')
axis(1, at=1: nrow(df), labels=df$Region, las=2)
CodePudding user response:
You will get a faster response, with code adapted to your data, if you provide a reproducible sample using something like dput(head(df, 5)). Without that, I'll provide some data that may resemble yours:
df <- structure(list(Region = c("A", "B", "C", "D", "E"), GB = c(1L,
5L, 10L, 8L, 2L), GG = c(4L, 2L, 9L, 1L, 10L)), class = "data.frame", row.names = c(NA,
-5L))
Now the bar plot:
barplot(t(df[, 2:3]), beside=TRUE, names=df$Region)
legend("topleft", c("Boys", "Girls"), fill=gray.colors(2))
CodePudding user response:
You are right that barplot likes matrices (or vectors) as input, where the columns represent the units and the rows the groups.
So you were close, but had not yet managed to create the needed matrix. Note that the as.<something> functions take just one object, whereas you passed the two columns as separate arguments. You want a subset of the data frame, as.matrix(df[c('GB', 'GG')]). You also want to transpose it, t(as.matrix(df[c('GB', 'GG')])), where t(df[c('GB', 'GG')]) is sufficient since t automatically coerces to a matrix.
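To make the difference concrete, here is a minimal sketch comparing the two calls. The three-row data frame toy is made up for illustration and is not the asker's data.

```r
# Hypothetical toy data, only to compare the shapes of the two results
toy <- data.frame(GB = c(6L, 2L, 5L), GG = c(8L, 5L, 4L))

# The extra argument is silently ignored, so only GB ends up in the matrix
wrong <- as.matrix(toy$GB, toy$GG)
dim(wrong)       # one column: a single series

# Subset both columns, then transpose: one row per series, one column per region
right <- t(toy[c("GB", "GG")])
dim(right)       # two rows (GB, GG), three columns
is.matrix(right) # t() has already coerced the data frame to a matrix
```

With beside=TRUE, barplot draws one group per column and one bar per row, which is why the transposed form gives two bars per region.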
barplot() is rich in arguments; it's definitely worth reading ?barplot to see what is possible. E.g. you may specify additional arguments for the legend with a list in args.legend (read ?legend for options) and define a names.arg for the x-axis labels. For the latter I'd use a named vector nm that assigns abbreviations to the regions; the explanation can be put into the caption below the graphic. (In this case this might be a better idea than rotating long labels, though it's nice to know how to do that too; see the bottom of my answer.)
(nm <- setNames(c("CT", "ME", "MA", "NH", "RI", "VT"), df$region))
# Connecticut Maine Massachusetts New Hampshire Rhode Island Vermont
# "CT" "ME" "MA" "NH" "RI" "VT"
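Indexing a named vector by name is what turns the long region names into abbreviations. A small sketch of that lookup, using a hypothetical two-entry vector abbr and a made-up regions vector rather than the real data:

```r
# Hypothetical lookup table: names are the long labels, values the abbreviations
abbr <- setNames(c("ME", "VT"), c("Maine", "Vermont"))

# Made-up input; repeated names and any order are fine
regions <- c("Maine", "Vermont", "Maine")

# Indexing by name returns the matching abbreviation for each element
short <- unname(abbr[regions])
short
```

Because the lookup is by name, the abbreviations stay attached to the right regions even if the rows of the data frame are reordered.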
So we could draw the barplot like so.
barplot(t(df[c('GB', 'GG')]), main='Gifts', col=c('green', 'red'),
beside=TRUE, legend.text=c('Boys', 'Girls'),
names.arg=nm[df$region], ylim=c(0, max(df[c('GB', 'GG')]) + 1),
args.legend=list(title='Sex', cex=.9))
If you rely on the long names, notice that barplot invisibly returns the midpoints of the bars, which can be used for label positioning. You can also enlarge the bottom margin (the first element of mar) a little.
op <- par(mar=c(6, 4, 4, 2)) ## set par
b <- barplot(t(df[c('GB', 'GG')]), beside=TRUE)
axis(side=1, at=colMeans(b), labels=df$region, las=2)
par(op) ## restore old par
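To see why colMeans(b) gives the label positions, it may help to look at the shape of the returned value. This is a sketch on a made-up 2 x 3 matrix m, drawn on a null PDF device so that it runs without opening a plot window:

```r
# Hypothetical 2 x 3 matrix: rows are the series, columns are the groups
m <- rbind(GB = c(6, 2, 5), GG = c(8, 5, 4))

pdf(NULL)                        # null device: nothing is written to disk
mids <- barplot(m, beside=TRUE)  # same shape as m, one midpoint per bar
invisible(dev.off())

dim(mids)                        # 2 rows (bars within a group), 3 columns (groups)
centers <- colMeans(mids)        # centre of each group of bars
centers
```

Each column of mids holds the x positions of the bars in one group, so the column means sit halfway between the two bars, which is where a group label belongs.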
Data:
df <- structure(list(GB = c(6L, 2L, 5L, 10L, 3L, 1L), GG = c(8L, 5L,
4L, 10L, 2L, 6L), region = c("Connecticut", "Maine", "Massachusetts",
"New Hampshire", "Rhode Island", "Vermont")), row.names = c(NA,
-6L), class = "data.frame")