ggplot throws an error when a data is zero row and the lengths of aesthetics are not zero #2850
Comments
With reprex:
library(ggplot2)
d1 <- data.frame(xval = rep(1:5, 4), yval = 1:20)
# The following are OK
ggplot(d1, aes(xval, yval))
ggplot(d1, aes(xval, yval, colour = "Type 1"))
nrow(d1[0,])
#> [1] 0
ggplot(d1[0,], aes(xval, yval))
ggplot(d1[0,], aes(xval, yval, colour = "Type 1"))
#> Error: Aesthetics must be either length 1 or the same as the data (1): x, y, colour
Created on 2018-08-23 by the reprex package (v0.2.0.9000).
I found two workarounds, while I still hope
library(ggplot2)
d2 <- data.frame(xval = rep(1:5, 4), yval = 1:20, col_type = "Type 1")
ggplot(d2, aes(xval, yval, colour = col_type))
ggplot(d2[0,], aes(xval, yval, colour = col_type))
Created on 2018-08-24 by the reprex
library(ggplot2)
d1 <- data.frame(xval = rep(1:5, 4), yval = 1:20)
ggplot(d1, aes(xval, yval, colour = rep("Type 1", nrow(d1))))
d1zero <- d1[0,]
ggplot(d1zero, aes(xval, yval, colour = rep("Type 1", nrow(d1zero))))
Created on 2018-08-24 by the reprex
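The second workaround works because the constant is repeated once per row, so its length always matches the data, including when the data has zero rows. A minimal base R sketch of that idea follows; `const_for` is a hypothetical helper name introduced only for illustration and is not part of ggplot2.

```r
# Hypothetical helper (not part of ggplot2): repeat a constant once per row,
# so the resulting vector always has the same length as the data.
const_for <- function(data, value) rep(value, nrow(data))

d1 <- data.frame(xval = rep(1:5, 4), yval = 1:20)

length(const_for(d1, "Type 1"))       # equals nrow(d1)
length(const_for(d1[0, ], "Type 1"))  # equals nrow(d1[0, ]), i.e. zero
```

Under that assumption, the mapping could be written as aes(xval, yval, colour = const_for(d1zero, "Type 1")), which is the same call as in the workaround with the rep() tucked away.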
If the data has zero rows then the value of the longest unevaluated aesthetic gets used. Here Lines 220 to 229 in 01155ba
Can you say a bit more about your use case? The combination of zero-row data and mapping an aesthetic to a string is a little unusual.
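To illustrate how the "(1)" in the error message can arise, here is a simplified base R sketch of the rule described above. It is an assumption-based illustration, not ggplot2's actual code: `effective_n` and `bad_aesthetics` are hypothetical names, and the real check may differ in detail (for example in which aesthetics it lists).

```r
# Hypothetical sketch, NOT ggplot2's implementation.
# If the data has zero rows, fall back to the longest evaluated aesthetic.
effective_n <- function(data, evaled) {
  n <- nrow(data)
  if (n == 0 && length(evaled) > 0) {
    n <- max(lengths(evaled))
  }
  n
}

# An aesthetic is acceptable if it has length 1 or length n.
bad_aesthetics <- function(evaled, n) {
  ns <- lengths(evaled)
  names(evaled)[!(ns == 1L | ns == n)]
}

d0 <- data.frame(xval = integer(0), yval = integer(0))
evaled <- list(x = d0$xval, y = d0$yval, colour = "Type 1")

n <- effective_n(d0, evaled)  # the constant colour has length 1, so n becomes 1
bad_aesthetics(evaled, n)     # x and y have length 0, so they fail the check
```

In this sketch the zero-length x and y are the ones that fail, because the length-one constant has become the reference length.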
Thank you for your comment.
library(ggplot2)
set.seed(123)
d1_1 <- data.frame(yval = rnorm(20))
d1_2 <- data.frame(yval = rnorm(10))
d2 <- data.frame(yval = rnorm(15))
ggplot(mapping = aes(y = yval)) +
  geom_violin(data = d1_1, mapping = aes(x = 1, fill = "Type 1")) +
  geom_violin(data = d1_2, mapping = aes(x = 2, fill = "Type 1")) +
  geom_violin(data = d2, mapping = aes(x = 4, fill = "Type 2")) +
  scale_x_continuous(breaks = c(1,2,4), labels = c("Data 1_1", "Data 1_2", "Data 2"))
Created on 2018-08-24 by the reprex package (v0.2.0).
However, in the real use case, some of the data may have zero rows, in which case
Ah, I see. I think part of the issue here is that ggplot2 does have certain expectations about the format of data that is provided. You'll have the most success when mapping columns of data in a data frame to the visual variables you want to see in the plot, rather than manually creating the violins from separate data frames. Here is an example of what I mean using your data:
library("tidyverse")
set.seed(123)
d1_1 <- data.frame(yval = rnorm(20))
d1_2 <- data.frame(yval = rnorm(10))
d2 <- data.frame(yval = rnorm(15))
## Combine the data into one dataset
dat <- bind_rows(
  list(d1_1 = d1_1, d1_2 = d1_2, d2 = d2),
  .id = "id"
) %>%
  ## Extract the 1 or 2 from d1, d2 etc. to inform fill color
  mutate(fill = substr(id, 2, 2))
p <- ggplot(dat, aes(x = id, y = yval, fill = fill)) +
  geom_violin()
p
Then if you want to customize the labels etc. you can do so with
p +
  scale_fill_discrete(labels = c("Type 1", "Type 2")) +
  scale_x_discrete(labels = c("Data 1_1", "Data 1_2", "Data 2"))
This isn't so much a workaround as it is taking full advantage of ggplot2's ability to understand all the data at once, and it should hopefully solve the original problem as a) there won't be any zero-row data once the data is combined (unless everything has zero rows), and b) since
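As a rough check of point a), the following base R sketch combines three data frames, one of which has zero rows, and shows that the empty one simply contributes nothing to the result. It uses rbind instead of bind_rows so that it runs without any packages. The names `pieces` and `tagged` and the choice to make d1_2 empty are assumptions made for illustration.

```r
set.seed(123)

# Three inputs; d1_2 is deliberately empty to mimic the real use case.
pieces <- list(
  d1_1 = data.frame(yval = rnorm(20)),
  d1_2 = data.frame(yval = rnorm(10))[0, , drop = FALSE],
  d2   = data.frame(yval = rnorm(15))
)

# Tag each piece with its name; rep(id, 0) is empty for the zero-row piece.
tagged <- Map(
  function(d, id) {
    data.frame(id = rep(id, nrow(d)), yval = d$yval, stringsAsFactors = FALSE)
  },
  pieces, names(pieces)
)

# Stack everything into one data frame; the empty piece adds no rows.
dat <- do.call(rbind, unname(tagged))

# Same fill rule as above: the 1 or 2 from d1, d2 etc.
dat$fill <- substr(dat$id, 2, 2)

nrow(dat)       # 20 + 0 + 15 rows
unique(dat$id)  # the empty d1_2 does not appear
```

The resulting dat has the same columns as in the example above, so it could be passed to the same ggplot(dat, aes(x = id, y = yval, fill = fill)) call. Note that the empty group then has no position on the x axis unless its level is added to the scale limits explicitly.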
Thank you so much!
is intended, I think the issue has been solved. However, I'd like to report a result that looks odd to me, related to zero-row data. If some of the data have zero rows (or are dummy), the violins are wider than in other cases. I'm sorry if this is documented, a duplicate, or a matter of preference, but I'm reporting it in case it is undesirable.
library(tidyverse)
set.seed(123)
dat <- data.frame(x = rep(c("a","b","c"), 20), y = rnorm(60))
p <- ggplot(dat, aes(x,y)) + geom_violin()
p + scale_x_discrete(limits = c("c", "b", "a"))
# Wide!
p + scale_x_discrete(limits = c("c", "DUMMY", "a"))
#> Warning: Removed 20 rows containing non-finite values (stat_ydensity).
p + scale_x_discrete(limits = c("DUMMY1", "DUMMY2", "a"))
#> Warning: Removed 40 rows containing non-finite values (stat_ydensity).
p + scale_x_discrete(limits = c("DUMMY", "b", "a"))
#> Warning: Removed 20 rows containing non-finite values (stat_ydensity).
p + scale_x_discrete(limits = c("c", "DUMMY", "b", "a"))
# Wide!
p + scale_x_discrete(limits = c("c", "DUMMY1", "b", "DUMMY2"))
#> Warning: Removed 20 rows containing non-finite values (stat_ydensity).
Created on 2018-08-27 by the reprex package (v0.2.0).
While there are usually better ways to solve the original poster's problem, I would argue that zero-length value handling by the layers is inconsistent, as
This currently fails:
library(ggplot2)
df <- data.frame(x = numeric(0), y = numeric(0))
ggplot(df, aes(x, y, colour = "a value")) + geom_point()
#> Error: Aesthetics must be either length 1 or the same as the data (1): x, y
But these do not:
library(ggplot2)
df <- data.frame(x = numeric(0), y = numeric(0))
ggplot() + annotate("point", x = df$x, y = df$y, colour = "a value")
tibble::tibble(x = df$x, y = df$y, colour = "a value")
#> # A tibble: 0 x 3
#> # … with 3 variables: x <dbl>, y <dbl>, colour <chr>
ggplot2:::data_frame(x = df$x, y = df$y, colour = "a value")
#> [1] x y colour
#> <0 rows> (or 0-length row.names)
@paleolimbot while I get your point, there is a real difference in what happens in an
If the data has zero rows and the lengths of the aesthetics are not zero, ggplot throws an error that reports the wrong data length, as follows. I think the number in the parentheses indicates the length of the data, but it is wrong, since the length is zero.
In addition, this behavior doesn't get along with constant (length-one) aesthetics, even though they can be combined with non-zero-row data. I hope ggplot tolerates the combination of zero-row data and constant aesthetics.
Example (using R 3.5.1 and ggplot2 3.0.0):