I have come across a bizarre problem. I created a bar chart with plotly, but the bars have lengths that are seemingly random. In some cases, a higher value will have a higher bar, but in other cases, a higher value will have a lower bar, depending on which bars you are comparing. I cannot figure out how this is possible, let alone how it happened here.
Here is the code I have:
priceplot2 <- listings4 %>%
  plot_ly(hovertemplate = "",
          x = ~logavgprice,
          y = ~neighbourhood,
          span = I(1)
  ) %>%
  add_bars(color = ~logavgprice,
           hovertemplate = ~paste0(neighbourhood, ", Average Price: $", round(avgprice, 2), "<extra></extra>"))
priceplot2
Here is a photo of the result:
Earlier, I grouped the data by a variable (neighbourhood), and I took the average of price for each neighbourhood, and then I did a log transformation of the data.
I hope this makes sense, and I apologize that this is not easily reproducible.
I appreciate any help anyone can provide.
Thank you!
P.S. If anyone could show me how to reorder the neighbourhoods, I'd appreciate that as well. Also, I cannot seem to get rid of that legend, but that's not a huge deal.
CodePudding user response:
It looks like you are displaying unaggregated data, so the bars are stacking all the observations. See this example:
library(plotly)
data.frame(neighborhood = c("A", "B", "B", "C"),
           avgprice = c(10, 100, 100, 1000),
           logavgprice = c(1, 2, 2, 3)) %>%
  plot_ly(x = ~logavgprice,
          y = ~neighborhood) %>%
  add_bars(marker = list(line = list(color = 'rgb(200,200,200)', width = 3)))
Note that the two bars in B are each 2 long, but because they are stacked end to end the total bar looks 4 long (and it would be hundreds or thousands long if I had more observations).
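A quick way to check whether this is what is happening is to count the rows per neighborhood before plotting. The sketch below uses a small made-up data frame (listings_toy, a hypothetical stand-in for listings4, whose column is spelled neighbourhood) and assumes the averages were added with mutate, which keeps every original row:

```r
library(dplyr)

# Hypothetical stand-in for listings4: the average is added with mutate(),
# so every original listing row is still present.
listings_toy <- data.frame(
  neighborhood = c("A", "B", "B", "C"),
  price = c(1000, 90, 110, 10)
) %>%
  group_by(neighborhood) %>%
  mutate(avgprice = mean(price),
         logavgprice = log(avgprice)) %>%
  ungroup()

# Count rows per neighborhood; anything with n > 1 would be stacked
# into a single, longer bar by add_bars().
row_counts <- listings_toy %>% count(neighborhood)
duplicated_rows <- row_counts %>% filter(n > 1)
print(duplicated_rows)
```

If the same check on the real data returns any rows, the data is not aggregated to one row per neighborhood, and the bar lengths reflect the number of listings as much as the average price.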
To fix this, you could adjust your prior steps: perhaps use summarise instead of mutate when you calculate avgprice and then logavgprice, or use dplyr::distinct(neighborhood, .keep_all = TRUE) to keep just one observation per neighborhood.
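As a minimal sketch of the distinct option, again on made-up data (listings_toy is hypothetical, and it assumes avgprice is already repeated on every row of a neighborhood, as it would be after mutate):

```r
library(dplyr)

# Hypothetical data: avgprice is repeated on each row of a neighborhood.
listings_toy <- data.frame(
  neighborhood = c("A", "B", "B", "C"),
  avgprice = c(1000, 100, 100, 10)
) %>%
  mutate(logavgprice = log(avgprice))

# Keep the first row for each neighborhood, along with all other columns.
one_per_neighborhood <- listings_toy %>%
  distinct(neighborhood, .keep_all = TRUE)
print(one_per_neighborhood)
```

The result has one row per neighborhood and can be piped into plot_ly and add_bars as before. This is only safe when the averaged columns are identical within each neighborhood; otherwise summarise is the more reliable choice.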
We can also use forcats::fct_reorder to reorder the factor levels of neighborhood by logavgprice, so that the bars display in the order we want.
Here's an example:
library(plotly)
library(dplyr)  # group_by, summarize, mutate, arrange

data.frame(neighborhood = c("A", "B", "B", "C"),
           price = c(1000, 90, 110, 10)) %>%
  group_by(neighborhood) %>%
  # one row per neighborhood, so each bar is drawn from a single value
  summarize(avg_price = mean(price)) %>%
  mutate(logavgprice = log(avg_price)) %>%
  # order the factor levels by logavgprice; plotly follows the level order
  mutate(neighborhood = forcats::fct_reorder(neighborhood, logavgprice)) %>%
  arrange(neighborhood) %>% # optional, just to show reordering before plotting
  plot_ly(x = ~logavgprice,
          y = ~neighborhood) %>%
  add_bars(marker = list(line = list(color = 'rgb(200,200,200)', width = 3)))