Airplanes above Mainz

Quite a while ago I bought an RTL-SDR USB dongle, originally to capture the signal from my outdoor temperature sensor. However, no one had deciphered the signal for that device so far, and I was not motivated enough to invest much time in it. I had almost forgotten the dongle when I recently read about flight radar and dump1090, which logs the ID, position and elevation of airplanes. Almost all airplanes transmit this information unencrypted about themselves at 1090 MHz, the ‘Automatic Dependent Surveillance-Broadcast’ (ADS-B). And I was curious how I could log and visualize those flight tracks from Frankfurt/Main airport towards Mainz.
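While the R package itself is behind the link, the raw data access can be sketched. If I recall correctly, dump1090 started with “--net” serves its decoded messages in the CSV-like SBS/BaseStation format on TCP port 30003, where an “MSG,3” line carries the airborne position; the field positions below follow that format, and the parser is a hypothetical helper of mine, not a function of the package:

```r
## a hedged sketch: parse one SBS/BaseStation line from dump1090's port 30003
parseSBS <- function(line) {
  msg <- strsplit(line, ",")[[1]]
  ## field 2 == "3" marks an airborne position message
  if (length(msg) < 16 || msg[1] != "MSG" || msg[2] != "3")
    return(NULL)
  data.frame(hex=msg[5],              # ICAO 24-bit transponder ID
             alt=as.numeric(msg[12]), # altitude [ft]
             lat=as.numeric(msg[15]),
             lon=as.numeric(msg[16]),
             stringsAsFactors=FALSE)
}

## live usage (requires a running "dump1090 --net"):
## con <- socketConnection("localhost", port=30003, open="r")
## while (TRUE) print(parseSBS(readLines(con, n=1)))
## close(con)
```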

I’ve built an R package to process the data from dump1090 and provide some functions for flat and 3D visualization.

Continue reading “Airplanes above Mainz”


Accelerate NetCDF-4 read access

For simulating the (global) terrestrial biosphere I need time series of climate data, which is mostly provided as a 3-dimensional NetCDF file with the dimensions longitude, latitude and time. If everything is in a single file, it is often several GB large, and extracting a time series at one position can take quite some time, since by default NetCDF files are optimized for extracting time slices. In the following I show how I speed up reading from a large NetCDF file along the time axis.
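The core idea can be sketched with the ncdf4 package (file and variable names here are hypothetical, and this is only one way to do it): rewrite the file with NetCDF-4 chunks that are long along the time axis, so that reading one grid cell's complete time series touches only a single chunk instead of thousands.

```r
library(ncdf4)

## hypothetical input: a variable "tas" with dimensions lon x lat x time
nc.in <- nc_open("climate.nc")
tas   <- ncvar_get(nc.in, "tas")

dim.lon  <- ncdim_def("lon",  "degrees_east",  nc.in$dim$lon$vals)
dim.lat  <- ncdim_def("lat",  "degrees_north", nc.in$dim$lat$vals)
dim.time <- ncdim_def("time", nc.in$dim$time$units, nc.in$dim$time$vals)

## chunksizes: one chunk holds one position's complete time series
var.out <- ncvar_def("tas", "K", list(dim.lon, dim.lat, dim.time),
                     missval=NA, chunksizes=c(1, 1, nc.in$dim$time$len))
nc.out <- nc_create("climate_rechunked.nc", var.out, force_v4=TRUE)
ncvar_put(nc.out, var.out, tas)
nc_close(nc.out)
nc_close(nc.in)
```

Afterwards, ncvar_get() with start/count along the time dimension should return almost instantly.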

Continue reading “Accelerate NetCDF-4 read access”

Twitter friends and (un-)followers with R

A few weeks ago the number of my Twitter followers exceeded 100 and then went slightly down again, and I was interested in who unfollowed me. Twitter does not offer that information, and I would have had to register at a third-party page, which even wanted write access to my Twitter account. I did not think about it for a while, until I found the R package twitteR, which lets you fetch all the required information. In the following I use it to evaluate my “friends” and “(un-)followers”.
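The gist can be sketched as follows; the twitteR calls are commented out since they need OAuth credentials, and the screen name, token objects and the “followers.txt” bookkeeping file are hypothetical choices of mine, not from the post:

```r
## a hedged sketch: fetch the current follower list with twitteR ...
## library(twitteR)
## setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
## me <- getUser("my_screen_name")
## followers.now <- sapply(me$getFollowers(), function(u) u$getScreenName())

## ... and compare it against the list saved on the previous run
whoChanged <- function(before, now) {
  list(unfollowed=setdiff(before, now),
       new=setdiff(now, before))
}

## followers.before <- readLines("followers.txt")
## whoChanged(followers.before, followers.now)
## writeLines(followers.now, "followers.txt")  # save for the next run
```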
Continue reading “Twitter friends and (un-)followers with R”

Broken axis with ggplot2

For visualizing my data I use R and the library ggplot2. Just lately I made some sensitivity simulations with our dynamic global vegetation model (DGVM) LPJ-GUESS. While summarizing the data per ecosystem and having a first look at it, I realized that one ecosystem has up to 10 times higher values than all the others. That made me search for “broken axis”, and since I didn’t find a satisfying solution, I had to create my own.
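For comparison, this is not the custom broken-axis solution from the post, but a commonly used workaround: split the data into a “high” and a “low” panel and facet with free y scales (the example data here are made up):

```r
library(ggplot2)

## made-up example data: one ecosystem ~10x larger than the others
set.seed(42)
df <- data.frame(ecosystem=rep(c("Tropical", "Boreal", "Temperate"), each=10),
                 value=c(rnorm(10, 100, 10), rnorm(10, 10, 2), rnorm(10, 12, 3)))
df$panel <- ifelse(df$ecosystem == "Tropical", "high", "low")

## free y scales give each panel its own axis range
p <- ggplot(df, aes(x=ecosystem, y=value)) +
  geom_boxplot() +
  facet_grid(panel ~ ., scales="free_y", space="free_y")
```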

Continue reading “Broken axis with ggplot2”

Mapping traceroutes with Perl

I was playing around on my smartphone, looking for interesting apps, and found one that displays the trace of a request to a server. However, since I didn’t really need it, it was too expensive, and I thought: “ok, let’s do it myself!” So I wrote a little Perl script and uploaded it to GitHub. The script traces the route of a request and either produces a static Google map as PNG or writes a GPX file, which can be used in GIS software or Google Maps.

Continue reading “Mapping traceroutes with Perl”

Retrieving and reprojecting MODIS satellite products

Data from the MODIS satellites are a great source for comparison to vegetation models and also for teaching. However, they come in a format I am not really used to and in a strange projection. The format is HDF, of which several versions exist, but there are also several tools to convert it to a file format that suits me better. HDF needs much less space compared to other formats, so it makes sense to store satellite data as HDF.

First I need to know which data I want and for which region. Just recently a colleague asked me to get him the fire product (MCD45A1) for Vietnam, so I use that as the example here.

On the map provided on the MODIS overview webpage one can try to select the correct row (v) and column (h) values. However, as I already said, it is a strange projection. I reprojected the grid to a standard equirectangular projection:

modis sinusoidal

Even with that it is not easy to select the right h/v values; with both maps together, however, it worked. I set up a Perl script using my module WebService::MODIS, which is also available on CPAN. Here is an example script to download the desired satellite data:

#!/usr/bin/env perl
use warnings;
use strict;
use WebService::MODIS;
readCache; # or "initCache;" if the cache has not been built yet
my $fire = WebService::MODIS->new();
$fire->product("MCD45A1");
$fire->h([27, 28]); # MODIS tiles covering Vietnam (roughly)
$fire->v([6, 7, 8]);
$fire->dates(["2007.01.01", "2014.12.31"]);
$fire->createUrl;
#print "$_\n" foreach $fire->url;
$fire->download;

Now the files are in the current working directory. I do the mosaicing and reprojecting with ModisDownload.R. You may now ask: “Why not download the data with the R script as well?” I did that in the beginning, and it was very frustrating, since it crashed or did not start downloading at all, and resuming a download was not possible; hence the Perl module. To make use of different directories, one for the downloaded files and one for the reprojected output, I modified the original ModisDownload.R. Here is the diff of lines 184 and 185 (version 3.2, 13th Jan. 2014):

- write(paste(getwd(),"/",hdfNames[1], sep=""), mosaicname)
- for (j in 2:length(hdfNames)) write(paste(getwd(),"/",hdfNames[j], sep=""),mosaicname,append=T)
+ write(hdfNames[1], mosaicname)
+ for (j in 2:length(hdfNames)) write(hdfNames[j],mosaicname,append=T)

Now the filenames given to the mosaicHDF function must contain the full path to the files if called from another directory. ModisDownload.R uses the MODIS Reprojection Tool (MRT), which is written in Java. Last year we used it for teaching; however, the installation under Windows is a pain in the … Therefore, we preprocessed the data this year and supplied the students with GeoTiffs. Here is now the R script, which processes the downloaded HDF files and creates one GeoTiff per timeslice in UTM projection in the current working directory:

MRT.PATH <- "/path/to/MRT/bin"
MODISDOWNLOAD.DIR <- "/path/to/ModisDownload.R"
source(paste(MODISDOWNLOAD.DIR, "ModisDownload.R", sep="/"))
MODIS <- list(product="MCD45A1", bands="1 0 0 0 0 0 0 0", pixel.size=500)
DOWNLOAD.DIR <- "/path/to/downloaded/MODIS/files"
OUTPUT.DIR <- "/path/to/large/disk/for/GeoTIFFs"
downloaded <- list.files(DOWNLOAD.DIR, pattern="hdf$", full.names=TRUE)
if (length(downloaded) != 0) {
  yyyyddd <- sub("^.*\\.A", "", downloaded)
  yyyyddd <- unique(sub("\\..*$", "", yyyyddd))
  year <- trunc(as.integer(yyyyddd) / 1000)
  day.of.year <- as.integer(yyyyddd) - year * 1000
  dates <- as.Date(paste(year,"-01-01", sep=""), format="%Y-%m-%d") + day.of.year - 1
  for (j in 1:length(dates)) {
    in.hdf <- downloaded[grepl(paste("A", yyyyddd[j], sep=""), downloaded)]
    out.hdf <- paste(MODIS[["product"]], "_", dates[j], ".hdf", sep="")
    out.tif <- paste(MODIS[["product"]], "_", dates[j], ".tif", sep="")
    if (length(in.hdf) > 1) {
      mosaicHDF(in.hdf, out.hdf, MRTpath=MRT.PATH)
    } else {
      file.symlink(in.hdf, out.hdf)
    }
    ## reproject the (mosaiced) HDF file to a GeoTiff in UTM projection
    reprojectHDF(out.hdf, filename=paste(OUTPUT.DIR, out.tif, sep="/"),
                 MRTpath=MRT.PATH, proj_type="UTM",
                 proj_params="0 0 0 0 0 0 0 0 0 0 0 0 0 0 0",
                 bands_subset=MODIS[["bands"]],
                 pixel_size=MODIS[["pixel.size"]])
    unlink(out.hdf)
  }
}

Now I have a time series of GeoTiffs in UTM projection I can work with.

Reading GPX tracks in R

I found another post on R-bloggers, which is very similar to my previous post. So I modified my script, and it does not use the library plotKML anymore:


library(XML)     # htmlTreeParse, xpathSApply
library(raster)  # pointDistance (used further below)

getTrack <- function(file, vars=c("time", "ele")) {
  pfile <- htmlTreeParse(file, error=function (...) {}, useInternalNodes=TRUE)

  trkpt <- xpathSApply(pfile, path = "//trkpt", xmlAttrs)
  create.df.str <- 'out.df <- data.frame(lon=as.numeric(trkpt["lon",]), lat=as.numeric(trkpt["lat",])'
  for (n in vars) {
    if (n=="ele") {
      ele <- as.numeric(xpathSApply(pfile, path = "//trkpt/ele", xmlValue))
      create.df.str <- paste(create.df.str, ", ele=ele", sep="")
    } else if (n=="time") {
      time <- xpathSApply(pfile, path = "//trkpt/time", xmlValue)
      time <- strptime(time, format = "%Y-%m-%dT%H:%M:%OS")
      create.df.str <- paste(create.df.str, ", time=time", sep="")
    } else {
      eval(parse(text=paste(n, ' <- xpathSApply(pfile, path = "//trkpt/', n, '", xmlValue)', sep="")))
      create.df.str <- paste(create.df.str, ", ", n, "=", n, sep="")
    }
  }
  create.df.str <- paste(create.df.str, ")", sep="")
  eval(parse(text=create.df.str))
  return(out.df)
}

track <- getTrack("track.gpx")

track$delta.dist = 0
track$delta.dist[2:nrow(track)] = pointDistance(track[2:nrow(track), c("lon", "lat")], track[1:(nrow(track)-1), c("lon", "lat")], lonlat=TRUE)

track$delta.time = 0
track$delta.time[2:nrow(track)] = as.numeric(difftime(track$time[2:nrow(track)], track$time[1:(nrow(track)-1)], units="secs"))

track$speed = 3.6 * track$delta.dist / track$delta.time

Cycling to work and visualizing the track

Each spring I intend to cycle at least once a month from Mainz, where I live, to work in Frankfurt. It’s a 43 km ride, mostly along the river Main. It is a nice route, especially in the morning, when I ride towards the sunrise. But you always have to pay attention to the rabbits, there are so many of them. In the morning, when it is still a bit dark, I sometimes see them only at the last moment. Therefore I often ring my bell, so they hear me and run away. Of course there are many more animals to spot along the nice scenery. And even the urban and industrial parts of the track are kind of nice. As I already wrote, the route follows the river most of the time; only in Flörsheim, Okriftel and Höchst does it lead further away from the river.

The largest part is on asphalt, but there are also gravel and cobblestone sections, which is ok with my bike but might not really be suitable for racing bikes. There is another option further north, which is shorter, but I am afraid it would be mostly on roads. Therefore I never tried it, although that alternative might be better for racing bikes.

Since the new runway at the airport was finished, I would not like to live closer to Frankfurt. It is so loud when the planes are descending over those villages, alternating between the old and the new runway.

I recorded my track with a smartphone and with an old Garmin GPS device. The quality does not really differ much. Here is the track with some alternative shortcuts:

You can create and publish maps like this in Google Maps.

Getting the data

To read the data from my Garmin device I use gpsbabel. You have to make sure that the kernel module garmin_gps is not loaded once the device is plugged in and turned on (run “rmmod garmin_gps” as root, or blacklist it with the entry “blacklist garmin_gps” in a file in /etc/modprobe.d/; the file name is distribution dependent).

> gpsbabel -t -i garmin -f usb: -o gpx -F garmin_tracks.gpx

On the iPhone I use iSmoothRun, which exports the data to my dropbox in gpx format. For further visualization I use R with the libraries ggplot2, sp, raster and plotKML. The last one has a lot of dependencies.
You have to find out which of the recorded tracks from the Garmin device is the one to use; in my case it is number 5. If your track was too long and cut into pieces, you have to concatenate them analogously to the iPhone GPX track.

garmin.gpx <- readGPX("garmin_tracks.gpx")
garmin.track <- garmin.gpx$tracks[[5]][[1]]
garmin.spdf <- SpatialPointsDataFrame(garmin.track[1:2], garmin.track[3:ncol(garmin.track)], proj4string=CRS("+proj=longlat +ellps=WGS84"))
iphone.gpx <- readGPX("ismoothrun_export.gpx")
iphone.track <- iphone.gpx$tracks[[1]][[1]]
for (i in 2:length(iphone.gpx$tracks[[1]])) {
  iphone.track <- rbind(iphone.track, iphone.gpx$tracks[[1]][[i]])
}
iphone.spdf <- SpatialPointsDataFrame(iphone.track[1:2], iphone.track[3:ncol(iphone.track)], proj4string=CRS("+proj=longlat +ellps=WGS84"))

I now have a SpatialPointsDataFrame in a longitude/latitude projection. To visualize anything in dependence on the distance cycled, I can reproject the data to UTM, calculate the distance between the points and append it as a new column to the data of the SpatialPointsDataFrame:

## Mainz and Frankfurt lie in UTM zone 32
iphone.utm <- coordinates(spTransform(iphone.spdf, CRS=CRS("+proj=utm +zone=32 +units=km")))
garmin.utm <- coordinates(spTransform(garmin.spdf, CRS=CRS("+proj=utm +zone=32 +units=km")))
iphone.dist <- sqrt((iphone.utm[2:nrow(iphone.utm), 1] - iphone.utm[1:(nrow(iphone.utm)-1), 1])^2 +
  (iphone.utm[2:nrow(iphone.utm), 2] - iphone.utm[1:(nrow(iphone.utm)-1), 2])^2)
garmin.dist <- sqrt((garmin.utm[2:nrow(garmin.utm), 1] - garmin.utm[1:(nrow(garmin.utm)-1), 1])^2 +
  (garmin.utm[2:nrow(garmin.utm), 2] - garmin.utm[1:(nrow(garmin.utm)-1), 2])^2)
iphone.spdf@data$dist = 0
iphone.spdf@data$dist[2:nrow(iphone.spdf@data)] = cumsum(iphone.dist)
garmin.spdf@data$dist = 0
garmin.spdf@data$dist[2:nrow(garmin.spdf@data)] = cumsum(garmin.dist)

For whatever reason, the elevation is stored as character. Therefore it has to be converted to numeric first:

garmin.spdf@data$ele = as.numeric(garmin.spdf@data$ele)
iphone.spdf@data$ele = as.numeric(iphone.spdf@data$ele)

To make it understandable by ggplot2, it has to be converted to a combined simple data.frame:

garmin.df <- data.frame(garmin.spdf)
iphone.df <- data.frame(iphone.spdf)
combined <- data.frame(garmin.df, device="Garmin", stringsAsFactors=FALSE)
combined <- rbind(combined, data.frame(iphone.df, device="iPhone", stringsAsFactors=FALSE))
p <- ggplot(combined, aes(x=dist, y=ele, color=device))
p <- p + geom_line(size=1.5)
p <- p + xlab("Distance [km]") + ylab("Elevation [m]")

I would say both devices are not really good in this flat terrain. The steepest and highest elevations are the three bridges I cross over the Rhein and Main. And yes, it goes more or less uphill all the time, but not as much as the Garmin measures, despite its barometric altimeter. However, I was curious how the data compare to a digital elevation model (DEM) and extracted the elevation along the track. I use the corresponding tile of the GTOPO30 dataset:

library(rgdal)   # readGDAL
gtopo <- readGDAL("W020N90.DEM")
gtopo.iphone.spdf <- extract(raster(gtopo), iphone.spdf, sp=TRUE, fun=mean)
gtopo.iphone.df <- data.frame(gtopo.iphone.spdf)
p <- ggplot(combined, aes(x=dist, y=ele, color=device))
p <- p + geom_line(size=1.5)
p <- p + geom_line(data=gtopo.iphone.df, aes(y=band1), color="black", linetype=3, size=1.5, alpha=0.5)
p <- p+ xlab("Distance [km]") + ylab("Elevation [m]")

To my surprise, the elevation recorded by the iPhone agrees much better with the GTOPO30 data (dotted line) than the Garmin GPS does, despite the latter’s barometric altitude calculation.
One could now also reformat the time column, calculate the speed, use time as an axis, …

## the trailing "Z" in the timestamps marks UTC ("CEST" is not a valid tz name)
garmin.df$time <- strptime(garmin.df$time, format="%Y-%m-%dT%H:%M:%SZ", tz="UTC")
iphone.df$time <- strptime(iphone.df$time, format="%Y-%m-%dT%H:%M:%SZ", tz="UTC")


The problem with my plan of cycling once a month is that I always have reasons not to cycle: the weather (it’s too cold, too hot, raining); I have a slight cold; the planes are descending over Mainz, which means easterly winds; … So this year I have made it only three times so far. However, hopefully Stadtradeln motivates me to cycle a bit more often now.

And the scenery really is nice. Here are some impressions from my tours:

Autumn sunrise above the foggy alluvial meadow of the Main

View from the bridge in Sindlingen to both sides, even industry can be nice.

And some spiderwebs

Simple greenhouse gas layer model

Since carbon dioxide in the atmosphere is still increasing, and it seems as if it will not stop increasing in the near future, it is important to know about the greenhouse effect. On Earth it is neither as cold as on Mars (-55°C) nor as hot as on Venus (>400°C). This has not only to do with the distance from the sun, but also with greenhouse gases. On Mars there is hardly any atmosphere, leading to warm temperatures during the day and very cold temperatures during the night. You might have experienced that clear nights show a much larger temperature difference to daytime, whereas with a cloudy sky the difference between day and night is not as large. This is because water vapor is a very effective greenhouse gas. However, there are hardly any possibilities to alter the water content of the atmosphere, but we could reduce our CO2 emissions. On Venus, on the other hand, the atmosphere contains a lot of CO2 and other gases, leading to a very strong greenhouse effect.

Quite a while ago I wrote a simple program to calculate the greenhouse effect, and since I’m not really good at documenting my code, the following text is an extended documentation of a simple greenhouse gas layer model written in R, which is based on chapter 3, “The layer model”, of David Archer’s book “Global Warming: Understanding the Forecast” (2008) (web page, publisher). The book is easy to read and the formulas are understandable even by a non-physicist like me. In the following I briefly explain the simple radiation budget of the earth.

Solar radiation

If the earth is in equilibrium with respect to its radiation budget, the amount of energy reaching the earth must equal the amount of energy leaving the earth in a given time period. Energy is measured in Joule, and energy per time is called power and is measured in Watts, where Watts = Joule/second, or in scientific notation W = J s^-1. The energy source for the earth is the sun, and it is possible to calculate the sun's mean power arriving at the earth per area, the so-called solar constant I_solar (it is not really a constant, but for simplicity it is assumed to be constant).

According to the Stefan-Boltzmann law, the power emitted by a so-called “black body” per area, in Watts per square meter, equals the surface temperature to the power of 4 times the Stefan-Boltzmann constant σ. With the average surface temperature of the sun T_sun = 5778 K and σ = 5.671*10^-8 W m^-2 K^-4, the sun emits σ*T_sun^4 = 63.21 MW m^-2. That is a lot. The sun has a radius of r_sun = 696*10^6 m and therefore a surface area of 4*π*r_sun^2 = 6.09*10^18 m^2. Multiplying these two large numbers, the power emitted per area and the sun's surface area, we end up with the total power emitted by the sun in all directions, which is 3.85*10^26 W. In Wikipedia's list of the largest power stations in the world, the Three Gorges Dam with 22.5*10^9 W is the largest power station to date, which is still nothing compared to the sun.

Of course, not all of this power reaches the earth; the total power emitted by the sun must be divided by the surface of a sphere whose radius is the earth's distance from the sun, on average d_sun-earth = 149.6*10^9 m. This is so far that the radii of earth and sun can be neglected. With the formula for the surface of a sphere already used above, the solar constant calculated here is I_solar = 3.85*10^26 W / (4*π*d_sun-earth^2) = 1368.123 W m^-2.
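The numbers above can be reproduced step by step in a few lines of R (a minimal sketch, using exactly the constants from the text):

```r
## constants from the text
sigma <- 5.671e-8       # Stefan-Boltzmann constant [W m^-2 K^-4]
T.sun <- 5778           # mean surface temperature of the sun [K]
r.sun <- 696e6          # radius of the sun [m]
d.sun.earth <- 149.6e9  # mean distance between sun and earth [m]

P.per.area <- sigma * T.sun^4                  # ~63.21 MW m^-2
P.total <- P.per.area * 4 * pi * r.sun^2       # ~3.85e26 W
I.solar <- P.total / (4 * pi * d.sun.earth^2)  # ~1368 W m^-2
```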


Radiation budget without atmosphere

Everybody might have seen pictures of the earth and realized that some parts are bright and others are dark. This means a part of the light, and therefore the energy from the sun, is directly reflected back to space without being taken up by the earth's surface and converted to “heat” or infrared radiation. The fraction of backscattered light or energy is called the albedo, with the symbol α, and is on average 33% of the incoming solar radiation.

As mentioned above, the average incoming power must equal the outgoing power. Because the sun only shines on one side of the earth, the incoming power is the solar constant times the cross section of the earth, a circle: I_solar*(1 - α)*π*r_earth^2. The earth, however, emits in all directions, so the outgoing power must be multiplied by the surface of a sphere, assuming that the earth also behaves like a “black body”: 4*π*r_earth^2*σ*T_earth^4. Since incoming and outgoing radiation must be equal, solving I_solar*(1 - α)*π*r_earth^2 = 4*π*r_earth^2*σ*T_earth^4 gives the earth's surface temperature if it had no atmosphere: T_earth = [(1-α)*I_solar / (4*σ)]^0.25 (where the power of 0.25 stands for the fourth root). Filling in the above values, the earth's surface would have an average temperature of 252.13 K, equal to -21°C, pretty cold. So luckily we have an atmosphere on earth, not only to breathe, but also to raise the temperature.

Greenhouse layer

If we now add the atmosphere to our model, the above radiation budget holds at the top of the atmosphere: incoming solar power must equal outgoing infrared radiation there. We assume a very simple atmosphere, which is transparent to the incoming solar radiation but absorbs and re-emits all infrared radiation. It takes up infrared radiation from the earth but emits it on both sides, back to earth and out to space. Solving the resulting set of equations, the surface temperature below the greenhouse layer equals 2^0.25 * 252.13 K = 299.8 K, a much more comfortable temperature. We could now add more and more greenhouse layers; with n layers the surface temperature becomes (n+1)^0.25 times the bare-earth temperature, so the first layer raises it by the factor 2^0.25 ≈ 1.189, and each further layer by a bit less.
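A minimal sketch of these two temperatures in R, again with the constants from the text:

```r
## constants from the text
sigma <- 5.671e-8  # Stefan-Boltzmann constant [W m^-2 K^-4]
I.solar <- 1368    # solar constant [W m^-2]
albedo <- 0.33     # mean planetary albedo

## surface temperature without an atmosphere [K]
T.bare <- ((1 - albedo) * I.solar / (4 * sigma))^0.25

## surface temperature below n perfectly absorbing greenhouse layers [K]
T.layers <- function(n) (n + 1)^0.25 * T.bare

T.bare       # ~252.1 K
T.layers(1)  # ~299.8 K
```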

Dust layer

One could also assume a dusty atmosphere layer, which absorbs a fraction of the sun's radiation and emits infrared radiation in all directions. Guess what happens: of course, the earth's surface would be cooler. I don't want to go into details here; if you want to know more, look at the code, or read David Archer's or another atmospheric physics book. I have not yet been able to combine the two models, the greenhouse and the dust layer, and I will not finish it.

Final remarks

In the real world things are not so simple. The solar constant is not equal throughout the year and over geological time periods, since the distance to the earth changes and there are times with more or fewer sun spots. The albedo is not the same everywhere; clouds and ice are much more reflective than forests or the oceans. A fraction of the solar radiation is not converted to infrared radiation and re-emitted, but used for evaporation and photosynthesis. The energy in the evaporated and transpired water vapor is transported elsewhere and released when the water condenses and precipitates. And we have neither a closed greenhouse gas layer nor an equally distributed dust layer in the atmosphere.

Most of these processes are implemented in global climate models in a simplified way, which we call parameterization. These models divide the earth's surface into small squares and the atmosphere into boxes above these squares to calculate all known processes. The models are parameterized to our best knowledge to represent reality as well as possible. And these global climate models do not run on a single PC; they need so-called clusters of tens to hundreds of machines. However, the model described above is already good enough to yield an approximation of the greenhouse effect, and reality (observed mean surface temperature: about 288 K or 15°C) lies between the two calculated temperatures of 252 K and 300 K, somewhat closer to the single greenhouse layer model.