February 4, 2012, Saturday, 34

Dataman:AppAccess

From SwissExperiment

AppAccess

In-application data access

A series of tools has been put together to allow direct access to the data held in GSN from within the application that you are using. So far, these include tools for MATLAB, R, LabVIEW, the MeteoIO C++ library and Googledocs.


MATLAB Tools

MATLAB is a registered trademark of The MathWorks, Inc. For details of the software, see mathworks.com. It may be available to you through your institute's licences.

Several tools have been created to simplify access to data held in GSN databases. These follow the general syntax used for web queries, and generally work by downloading and parsing XML data.

The tools include several GSN query functions:

  • GSN_xml_parser requests and parses an XML file.
  • GSN_request_location returns latitude, longitude and elevation for a particular station.
  • GSN_request_list_virtual_sensors lists the variables available for a particular station.
  • GSN_multidata queries GSN for data for a particular interval from a particular station. This query can be made very specific.

For details and syntax, see the examples here.

Code is available as an archive including examples here.

This code was developed by Andy Clifton.


R Tools

R is an open source software tool available from http://www.r-project.org. It can be used with GSN data in two ways:

Standalone R

Accessing GSN data through R requires only the standard R commands read.table or read.csv, for example:

data <- read.csv(file="URL", skip=2, header=TRUE, sep=",")

where URL is the HTTP GSN query as defined in the multidata section, here. Note: to get data in R, you should choose download_format=csv and download_mode=inline.

You can generate this command automatically using the form interface at this link. Most of the form can also be filled in automatically by navigating to your sensor through the metadata database and choosing the 'download data' link.

For example, the command below will download recent data from the wan1 station:

data <- read.csv(file="http://montblanc.slf.ch:22001/multidata?nb=SPECIFIED&nb_value=10000&vs%5B0%5D=wan1&field%5B0%5D=All&time_format=iso&download_format=csv&download_mode=inline", header=TRUE, sep=",", skip=2)
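
The query string in the command above can also be assembled programmatically rather than typed by hand. The following is a minimal Python sketch, assuming that the parameter names and values are exactly those visible in the example URL; the helper build_multidata_url and its arguments are hypothetical and not part of GSN.

```python
from urllib.parse import urlencode

# Base address copied from the example command above.
GSN_BASE = "http://montblanc.slf.ch:22001/multidata"


def build_multidata_url(station, n_values=10000, field="All", base=GSN_BASE):
    """Assemble a multidata query URL (hypothetical helper, not part of GSN)."""
    # Parameter names are taken from the example URL; their order is preserved.
    params = [
        ("nb", "SPECIFIED"),
        ("nb_value", str(n_values)),
        ("vs[0]", station),
        ("field[0]", field),
        ("time_format", "iso"),
        ("download_format", "csv"),
        ("download_mode", "inline"),
    ]
    # urlencode percent-encodes the square brackets as %5B and %5D.
    return base + "?" + urlencode(params)


url = build_multidata_url("wan1")
print(url)
```

With the defaults shown, the printed URL is identical to the one used in the read.csv example, so it can be pasted directly into that command.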


Individual variables can then be accessed by name, e.g.

data$air_temperature
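
For readers scripting outside R, the same steps (skip two leading lines, read a header row, pick one column by name) can be sketched in Python. The sample text below is invented purely for illustration: the two placeholder lines, the time column name and the values are assumptions, and only the air_temperature column name comes from the example above. The helper read_gsn_csv is hypothetical.

```python
import csv

# Invented sample standing in for a downloaded CSV: two leading lines
# to skip, then a header row, then data rows. Not real GSN output.
SAMPLE = """# placeholder line 1
# placeholder line 2
time,air_temperature
2012-02-04T00:00:00,-12.5
2012-02-04T00:10:00,-12.8
"""


def read_gsn_csv(text, skip=2):
    """Return the rows as dictionaries keyed by the header names."""
    lines = text.splitlines()[skip:]  # same role as skip=2 in read.csv
    return list(csv.DictReader(lines))


rows = read_gsn_csv(SAMPLE)
# Equivalent of data$air_temperature in R
air_temperature = [float(row["air_temperature"]) for row in rows]
print(air_temperature)
```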

A more complex plotting example is downloadable here. This example is designed to plot the Wannengrat meteo stations as 'IMIS standard' plots. It could be adapted to plot any type of GSN time series.

These examples were prepared by Nicholas Dawes and Christoph Mitterer.

R in the wiki

R is enabled through an extension. To embed R code in a wiki page, use the <R></R> tags.

More information about the extension is available through [author's web site]. For an example of how to pass data into a template, see Template:ScienceRxyplot.

The listing below details the libraries that are currently installed. To have a new library installed, please contact the webmaster with details of the library you want.

These examples were prepared by Andy Clifton.


LabVIEW Tools

LabVIEW is a registered trademark of National Instruments. For details of the software, see http://www.ni.com. It may be available to you through your institute's licences.

Two tools have been created to simplify data access through GSN:

  • GSN_data_access.vi is designed to be used as a sub-VI. Inputs required: the various parameters needed for a GSN query. Outputs provided: data (including the timestamp as separate numerical fields) and a timestamp array.
  • GSN_metadata.vi can be used as a sub-VI to query the metadata, though it is most useful when built into your own application, e.g. to provide auto-populated, click-based data access as shown in the demo application.

A demo plotting application is provided. This is not a robust VI at present, but is designed to demonstrate the capabilities available when the two VIs above are used. I will find some time to make this robust at some point, then make an executable.

This code was developed by Nicholas Dawes.

Screenshots:

Auto-populated, multi-selection menus provide the parameter selection
Legends are auto labelled with the parameter names


MeteoIO C++ library

The MeteoIO library has been designed for the specific needs of numerical models that consume meteorological data. The whole task of data pre-processing is delegated to this library: retrieving, filtering and, if necessary, re-sampling the data, as well as providing spatial interpolations. The focus has been on designing an Application Programming Interface (API) that would:

  • provide a uniform interface to meteorological data in the models;
  • hide the complexity of the processing taking place;
  • guarantee robust behavior when dealing with format or transmission errors and with erroneous or missing data.

Moreover, in an operational context, this error handling should avoid interrupting the simulation as much as possible. A strong emphasis has been put on simplicity and modularity, in order to make it extremely easy to support new data formats or protocols and to allow contributors who are not familiar with the environmental sciences and/or a particular model to participate painlessly.

In this context, a plugin has been developed that retrieves data from GSN. The user has to provide some GSN parameters (server, stations to read, optional use of a proxy) and the library does the rest. More can be found about this open source (LGPL) C++ library at https://slfsmm.indefero.net/p/meteoio/ .


Googledocs

A useful tool for users who want to plot data quickly and easily is Googledocs. Googledocs plots are published as JavaScript, which can then be added to any webpage.

In order to query data from Googledocs, you must first create an HTTP query for your desired GSN data. To create this query, use the GSN data query form. The time format should be set to 'unix'. Copy the 'Show XML' link from the page created.

In Googledocs, create a new spreadsheet. In column C (for example), enter your time query:

=importXML("QUERY PASTED FROM WIKI","/result/data/tuple/field[2]")

NB:

  • You only need to paste this query into the top cell; Googledocs will fill in the rest of the cells according to the number of values in your query.
  • If your query has no aggregation, use /result/data/tuple/field[2]; if it has aggregation, use /result/data/tuple/field[3].

In e.g. column D, you can create your data query:

=importXML("QUERY PASTED FROM WIKI","/result/data/tuple/field[1]")
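
To check what the two XPath expressions select before building the spreadsheet, the same selection can be reproduced in Python. This is a sketch under stated assumptions: the XML layout is inferred only from the paths above (a result root containing data, tuple and field elements), the query has no aggregation, and the sample values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample with the element layout implied by the XPath expressions.
# Without aggregation, field[1] holds the data and field[2] the unix time.
SAMPLE_XML = """<result>
  <data>
    <tuple><field>-12.5</field><field>1328313600000</field></tuple>
    <tuple><field>-12.8</field><field>1328314200000</field></tuple>
  </data>
</result>"""

root = ET.fromstring(SAMPLE_XML)  # root is the <result> element

# Relative to the root, these match /result/data/tuple/field[2] and field[1].
# Positions in XPath are 1-based.
times = [int(f.text) for f in root.findall("data/tuple/field[2]")]
values = [float(f.text) for f in root.findall("data/tuple/field[1]")]

print(times)
print(values)
```

Each expression returns one entry per tuple, which is why a single importXML formula in the top cell fills the whole column.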

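In columns A and B, you can now create your time column and data column for plotting. The time column (A) should be given the required date/time cell format and should contain the formula below, which converts Unix time in milliseconds to a spreadsheet date serial number (86400000 is the number of milliseconds in a day and 25569 is the serial number of 1 January 1970):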

=(C2/86400000)+25569
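
The arithmetic in this formula can be verified outside the spreadsheet. The Python sketch below assumes that the timestamps are Unix times in milliseconds, that the spreadsheet counts days from 30 December 1899, and that no time zone correction is wanted; the function names are hypothetical.

```python
from datetime import datetime, timedelta

MS_PER_DAY = 86400000              # milliseconds in one day
EPOCH_SERIAL = 25569               # serial number of 1970-01-01
SERIAL_ZERO = datetime(1899, 12, 30)  # assumed day zero of the spreadsheet


def unix_ms_to_serial(ms):
    """Same arithmetic as the spreadsheet formula =(C2/86400000)+25569."""
    return ms / MS_PER_DAY + EPOCH_SERIAL


def serial_to_datetime(serial):
    """Turn a serial day number back into a calendar date for checking."""
    return SERIAL_ZERO + timedelta(days=serial)


serial = unix_ms_to_serial(1328313600000)
print(serial)                      # 40943.0
print(serial_to_datetime(serial))  # 2012-02-04 00:00:00
```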

In column B, you can perform any operation you like on the data, e.g.:

 =if(D2>0,D2,0)

You can now create your desired plots. Once you have created your plots, the JavaScript can be obtained via the 'publish chart' button.

An example of the types of queries that you can make can be found here.