If I create a boxplot in ggplot2 and convert it using the ggplotly command, the outliers are outlined in black.
Here is a simple example:
library(ggplot2)
library(plotly)
p <- ggplot(mpg, aes(class, hwy))
g <- p + geom_boxplot(aes(colour = "red"))
ggplotly(g)
ggplot would show this chart:
whereas plotly would show this chart:
Is this something that can be fixed?
This persists even when the outliers are supposed to be discarded, as in this example from the documentation:
library(plotly)
set.seed(123)
df <- diamonds[sample(1:nrow(diamonds), size = 1000),]
p <- ggplot(df, aes(cut, price, fill = cut)) +
  geom_boxplot(outlier.shape = NA) +
  ggtitle("Ignore outliers in ggplot2")
# Need to modify the plotly object and make outlier points have opacity equal to 0
p <- plotly_build(p)
p$data <- lapply(p$data, FUN = function(x){
  x$marker = list(opacity = 0)
  return(x)
})
p
I managed to set the opacity property of the outliers using the code below. This seems to work for the faceted charts I have tried so far also.
library(ggplot2)
library(plotly)
set.seed(123)
df <- diamonds[sample(1:nrow(diamonds), size = 1000),]
p <- ggplot(df, aes(cut, price, fill = cut)) +
  geom_boxplot(outlier.shape = NA) +
  ggtitle("Ignore outliers in ggplot2")
# Need to modify the plotly object and make outlier points have opacity equal to 0
p <- plotly_build(p)
# seq_along() is safe even if the trace list is empty
for (i in seq_along(p$x$data)) {
  p$x$data[[i]]$marker$opacity = 0
}
p
The replacement lapply code is then:
p$x$data <- lapply(p$x$data, FUN = function(x){
  x$marker = list(opacity = 0)
  return(x)
})
(Note p$x$data rather than p$data.) I'm happy to PR this to the documentation if someone can point to the source.
The problem is that when you also have geom_jitter in the plot (in addition to geom_boxplot), the lapply part will remove all the points. Is there a way to selectively remove outliers that belong to geom_boxplot only?
p$x$data <- lapply(p$x$data, FUN = function(x){
  x$marker$line$width = 0
  return(x)
})
Alternatively, modify marker$line$color.
> The problem is that when you also have geom_jitter in the plot (in addition to geom_boxplot), the lapply part will remove all the points. Is there a way to selectively remove outliers that belong to geom_boxplot only?
You can use the code above and just index to the layer you want to remove, e.g. say the boxplot outliers are on the first layer.
p$x$data[1] <- lapply(p$x$data[1], FUN = function(x){
  x$marker = list(opacity = 0)
  return(x)
})
Hi! Just wanted to bring this issue to your attention again, as none of the workarounds mentioned above seem to be working (and aren't working in the documentation either)!
(There's also an interesting phenomenon where, for coloured boxplots, the most extreme outliers are coloured with black outlines, but closer to the box they're black with coloured outlines, i.e. the reverse.)
There's a WIP here https://github.com/ropensci/plotly/pull/1514 that fixes this issue, feel free to test it out and let me know if you run into problems.
I didn't see the fix from #1514 in the latest release. I was able to get the visual I wanted by altering the lapply function so that it only touches layers whose type == "box":
p$x$data <- lapply(p$x$data, FUN = function(x){
  if (x$type == "box") {
    x$marker = list(opacity = 0)
  }
  return(x)
})
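For anyone wanting to try this alongside geom_jitter, here is a minimal end-to-end sketch. It assumes that ggplotly emits the boxplot layer as traces of type "box" and the jitter layer as traces of type "scatter". The width and alpha values are arbitrary choices for illustration, and the behaviour may differ between plotly versions.

```r
library(ggplot2)
library(plotly)

set.seed(123)
df <- diamonds[sample(nrow(diamonds), size = 1000), ]

# Boxplot plus jittered raw points. Per this thread, the converted plot
# may still draw the boxplot outliers despite outlier.shape = NA.
p <- ggplot(df, aes(cut, price)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.3)

b <- plotly_build(p)

# Hide markers on box traces only; other traces (the jitter points) are left alone.
# Assigning marker$opacity keeps the rest of the marker settings intact.
b$x$data <- lapply(b$x$data, function(tr) {
  if (identical(tr$type, "box")) {
    tr$marker$opacity <- 0
  }
  tr
})

# List the trace types to check the assumption about "box" vs "scatter"
trace_types <- vapply(
  b$x$data,
  function(tr) if (is.null(tr$type)) NA_character_ else tr$type,
  character(1)
)
print(trace_types)

b
```

Using identical() instead of == avoids an error if a trace happens to have no type field.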
This will do the trick for the original question about colouring outliers! Plotly differentiates outliers from extreme outliers, so we go under the hood and override all outlier colours manually.
library(ggplot2)
library(plotly)
p <- ggplot(mpg, aes(class, hwy)) + geom_boxplot(color="red")
output = ggplotly(p)
# overrides black outline of outliers
output$x$data[[1]]$marker$line$color = "red"
# overrides black extreme outlier color
output$x$data[[1]]$marker$outliercolor = "red"
# overrides black not as extreme outlier color
output$x$data[[1]]$marker$color = "red"
output
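If the plot has more than one box trace (for example when colour or fill is mapped to a variable), indexing data[[1]] only changes the first one. A rough sketch of the same override applied to every box trace follows; outlier_col is just an illustrative variable name, and the single fixed colour is an assumption that suits this one-colour example.

```r
library(ggplot2)
library(plotly)

p <- ggplot(mpg, aes(class, hwy)) + geom_boxplot(color = "red")
output <- ggplotly(p)

outlier_col <- "red"  # illustrative name; pick whatever colour matches the boxes

# Apply the three overrides from above to every trace of type "box"
output$x$data <- lapply(output$x$data, function(tr) {
  if (identical(tr$type, "box")) {
    tr$marker$line$color   <- outlier_col  # outline of outlier points
    tr$marker$outliercolor <- outlier_col  # extreme outliers
    tr$marker$color        <- outlier_col  # remaining outliers
  }
  tr
})

output
```

For plots with several differently coloured boxes, a per-trace colour would be needed instead of one fixed value.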