The German newspaper Süddeutsche Zeitung (SZ) worked together with OpenDataCity to create an online train monitor of the German network: Zugmonitor. This is another great example of the new form of data journalism.
The project provides access to data on train delays collected over 150 days between 2 October 2011 and 1 March 2012 and allows you to analyse the delays in more detail.
Here is an example showing the delays by station.
This SZ article (in German) gives you an overview of the data and how to access it. I believe the most convenient way to query the data is via Google Fusion Tables, which lets you import the data into R with the read.csv function. The file name to use is a URL mixed with a little bit of SQL syntax.
Here is an example extracting the station data (Fusion table 3166152):
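A minimal sketch of such a query could look like the following. The endpoint URL and the helper name fusion_url are assumptions for illustration (based on the legacy Fusion Tables SQL interface), not taken from the SZ article, and the service may no longer respond.

```r
# Hypothetical helper: build a query URL for a given Fusion table ID.
# ASSUMPTION: the base URL is the legacy Fusion Tables SQL endpoint;
# it is not stated in the post and may no longer be available.
fusion_url <- function(table_id,
                       base = "https://www.google.com/fusiontables/api/query") {
  # The "little bit of SQL": select every column of the table
  sql <- paste("SELECT * FROM", format(table_id, scientific = FALSE))
  # Percent-encode spaces and the asterisk so the query is a valid URL
  paste0(base, "?sql=", URLencode(sql, reserved = TRUE))
}

# URL for the station data (Fusion table 3166152)
station_url <- fusion_url(3166152)
print(station_url)
```

If the endpoint is reachable, passing the result to read.csv, as in stations <- read.csv(station_url), should return the table as a data frame. The same helper could be reused with any of the other table IDs.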
The other sources can be accessed in the same way:
| Delay | Fusion table ID |
|---|---|
| between stations (all trains) | 3166064 |
| between stations (ICE trains only) | 3166328 |
| by train type | 3165124 |
I am curious what people will make of the data. Apparently more data will be made available in the future. I will keep an eye on the project page.