Broken axis with ggplot2

For visualizing my data I use R and the library ggplot2. Just lately I ran some sensitivity simulations with our dynamic global vegetation model (DGVM) LPJ-GUESS. While summarizing the data per ecosystem and having a first look at it, I realized that one ecosystem has up to 10 times higher values than all the others. That made me search for “broken axis”, and since I didn’t find a satisfying solution, I had to create my own.

The data used in this example can be downloaded from my Dropbox (I hope I don’t delete it). First the required libraries and the data are loaded. Then I define a function for the base plot, which I also run immediately.

library(ggplot2)
library(Cairo)

# loads the data.frame data.sum (columns used here: value, name, sens)
load("data.sum.RData")

# base plot: points per ecosystem (name), jittered vertically, coloured by the sens column
base.plot <- function(data) {
  p <- ggplot(data, aes(x=value, y=name, col=sens))
  p <- p + theme_bw()
  p <- p + theme(legend.position="bottom")
  p <- p + geom_point(size=2.5, position=position_jitter(width=0, height=0.15), alpha=0.8)
  p <- p + scale_color_brewer(palette="Set1", guide=guide_legend(ncol=6, title=NULL))
  p <- p + xlab("") + ylab("")
  return(p)
}

p <- base.plot(data.sum)
CairoPNG(filename="base_plot.png", width=640, height=320)
print(p)
dev.off()

base_plot

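Here you see the large offset between “desert” and the other ecosystems. Therefore I created a “desert mask” column in my data.frame and divided the desert values by a scale factor, chosen so that they are still larger than the maximum of the others. The factor is derived from the minimum desert value and the maximum value of the others. Then I created custom breaks and labels: the breaks sit at the rescaled positions, while the labels in the desert range are multiplied by the factor again, so they show the original values. The step between the breaks should be 10 here.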

# flag the desert rows; the flag is used for faceting later
data.sum$mask = 0
data.sum$mask[data.sum$name == "desert"] = 1

max.value <- max(data.sum$value)
max.value.other <- max(data.sum$value[data.sum$name != "desert"])
min.value.desert <- min(data.sum$value[data.sum$name == "desert"])

# compress the desert values; they stay above the maximum of the others
scale <- floor(min.value.desert / max.value.other) - 1
data.sum$value[data.sum$mask == 1] = data.sum$value[data.sum$mask == 1] / scale

# breaks at the rescaled positions, labels in the desert range multiplied by scale again
step <- 10
low.end <- max(data.sum$value[data.sum$name != "desert"])
up.start <- ceiling(max(data.sum$value[data.sum$name != "desert"]))
breaks <- seq(0, max(data.sum$value), step)
labels <- seq(0, low.end+step, step)
labels <- append(labels, scale * seq(from=ceiling((up.start + step) / step) * step, length.out=length(breaks) - length(labels), by=step))
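To make the arithmetic easier to follow without my data file, here is a minimal sketch of the same idea on a few made-up numbers. The helper broken_axis() and the toy values are hypothetical and not part of ggplot2 or my simulation data. The labels are derived with ifelse() instead of the append() construction above, which should give the same result in the simple case but is not the exact code I used.

```r
# Hypothetical helper (an illustration only, base R, no ggplot2 needed).
# value:   numeric vector of all values
# is.high: logical vector, TRUE for the group that gets compressed
broken_axis <- function(value, is.high, step = 10) {
  max.other <- max(value[!is.high])
  min.high  <- min(value[is.high])
  # same scale factor as above; the compressed group stays above the rest
  scale <- floor(min.high / max.other) - 1
  # the trick only works if the groups are far enough apart
  stopifnot(scale >= 1)
  value[is.high] <- value[is.high] / scale
  breaks <- seq(0, max(value), by = step)
  # breaks inside the compressed range are labelled with the original values
  labels <- ifelse(breaks >= min(value[is.high]), breaks * scale, breaks)
  list(scale = scale, value = value, breaks = breaks, labels = labels)
}

# made-up toy data, not taken from the simulations
toy <- data.frame(name  = c("forest", "steppe", "tundra", "desert", "desert"),
                  value = c(12, 38, 47, 480, 700))
res <- broken_axis(toy$value, toy$name == "desert")
res$scale   # 9
res$breaks  # 0 10 20 30 40 50 60 70
res$labels  # 0 10 20 30 40 50 540 630
```

With these toy numbers the desert values 480 and 700 end up at about 53 and 78 on the plotting scale, while the tick marks at 60 and 70 are labelled 540 and 630.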

Now the rescaled data can be plotted using facet_grid with the mask column, which shows a clear break in the axis:

p <- base.plot(data.sum)
p <- p + facet_grid(. ~ mask, scales="free", space="free")
p <- p + scale_x_continuous(breaks=breaks, labels=labels, expand=c(0.075,0))
p <- p + theme(strip.background = element_blank(), strip.text.x = element_blank())
CairoPNG(filename="broken_axis.png", width=640, height=320)
print(p)
dev.off()

broken_axis

UPDATE: After a comment via Twitter, I will also show a plot with a logarithmic x-axis. In my opinion the “broken axis” above still looks better, although it is not a statistically clean approach.

# data.sum$value was rescaled above, so reload the original values first
load("data.sum.RData")
p <- base.plot(data.sum)
p <- p + scale_x_log10(breaks=c(10, 20, 30, 40, 50, 75, 500, 700))
CairoPNG(filename="log10_axis.png", width=640, height=320)
print(p)
dev.off()

log10_axis

 
