February 4, 2012, Saturday, 34

Dataman:DataInput

From SwissExperiment


Data Input

Within the Swiss Experiment, we have developed several ways to store different types of data, each on the data platform best suited to the job. These platforms are GSN for time-series data and the Semantic Wiki for non-time-series data.


Metadata

The SwissEx wiki (the user-editable website that you are looking at) is the main point for metadata entry. Projects can create their own sites within the SwissEx site and use them for project organisation and as a collaborative tool for discussing work. These pages can be open or password protected within the team. Because SVN runs behind the wiki, widely distributed organisations can also store documents and work on them collaboratively. These pages are all free text.

This wiki runs Semantic MediaWiki software, which makes it possible to store information in a structured way using the SQL database behind the wiki. For the majority of metadata recording tasks, the SwissEx team has predefined templates and provides a form-based data entry system. One outcome of this is that the SwissEx wiki contains a database of field sites; drilling down within it provides all of the metadata on the sensors and data.

This method of metadata entry provides a web-based, distributed interface. Once the metadata has been entered, it can be read by both GSN and SensorMap to provide metadata alongside the data. Metadata is stored using Semantic MediaWiki; for SwissEx, this storage will be centralised. External persons may also use this central storage, but instructions will also be given on how to reproduce the wiki configuration.

Click here to start setting up your metadata database. A basic user guide for entering this information is provided here. You will require a login from the Wiki Admin.

There are various tasks that you may wish to do:

In some cases, projects may wish to store other types of data/metadata. In order to do this, projects have to define templates of the data that they wish to record (more about semantics entry is provided here).

Sensor Data

Many different types of data are collected in the field, and the SwissEx team has tried to provide one or more solutions for acquiring each of them.

For many of the solutions provided by SwissEx, you will need a locally installed GSN server.


Data measured and entered by hand

There are two choices for entering this type of data:

  • Entry into a CSV file: if the data is unlikely to be changed at a later point, the easiest way to put the data into the system is to write it to a CSV file and save it to a file system where GSN has access. Using the CSV file wrapper, GSN can continually check the file for new data. When you append the next data set to the file, GSN will automatically update the database.
  • Entry into the wiki: if you think that you are going to be adjusting your data values at a later point, the best way to acquire data would be to set up a template in the wiki and use the Excel macro developed by the SwissEx team to upload the data to the wiki. In this way, distributed access to the data is available and the data is editable. Using the GSN SPARQL wrapper, this data can be synchronised into the GSN database.
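The CSV route above can be sketched from the data-entry side as follows. This is a minimal illustration only: the function name, the column layout (timestamp, value) and the timestamp format are assumptions made for the example, and the real layout must match whatever the GSN CSV wrapper for that file is configured to expect.

```python
import csv
from datetime import datetime


def append_measurement(path, timestamp, value):
    """Append one hand-measured value as a new row at the end of a CSV file.

    The two-column layout and the timestamp format are hypothetical;
    they are not taken from the GSN documentation.
    """
    # Mode "a" only ever adds to the end of the file, so rows that
    # have already been read are never rewritten.
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([timestamp.strftime("%Y-%m-%d %H:%M:%S"), value])
```

Calling `append_measurement("snow_depth.csv", datetime.now(), 34.5)` after each field visit adds one row; the file name here is also just a placeholder.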

Logger data collected by hand

This uses the same principle as the CSV file entry above: once you have collected the data from your loggers and written it to a CSV file on the file system, GSN can acquire it using the CSV wrapper.

Logger data acquired by 3rd party technology

Loggers such as Campbell loggers, which acquire data and send it back over wireless networks to proprietary software, generally write the data to CSV files. Using the same principle as above, GSN can continually monitor these files for new data. When data is appended to a file, GSN will update the database.
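To illustrate what "monitoring a file for new data" involves, here is a minimal polling sketch. It is not GSN's implementation; the function name and the byte-offset bookkeeping are assumptions chosen for the example. The idea is to remember how far the file has been read and, on each poll, return only the complete rows appended since then.

```python
def read_new_lines(path, offset):
    """Return (complete lines appended since byte `offset`, new offset).

    Hypothetical helper for illustration; not part of GSN.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline, so that a row the logger
    # software is still writing is left for the next poll.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end
```

A caller would start with `offset = 0`, call the function at a fixed interval, and carry the returned offset into the next call.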

Streaming data from the sensor

GSN has a number of wrappers for acquiring data directly from the sensor or sensor network, e.g. for cameras, GPS sensors, SensorScope stations and DTS sensors. Using these wrappers, GSN acquires the data without it first having to be written to disk.

Non time-series data

If data does not vary on a time scale of less than 50 years, we treat it as metadata (unless it makes sense to have it as a time series) and it should be saved in the wiki. All other data is time-variant and should be saved using GSN.
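The rule above can be written down as a small decision function. This is only a sketch of the stated rule: the function name, the parameter names and the return labels are invented for the example.

```python
METADATA_THRESHOLD_YEARS = 50  # threshold stated in the rule above


def storage_target(variation_timescale_years, treat_as_time_series=False):
    """Return "wiki" for metadata and "gsn" for time-variant data.

    `variation_timescale_years` is the time scale on which the value
    changes. `treat_as_time_series` is a hypothetical flag covering the
    "unless it makes sense to have it as a time series" exception.
    """
    if treat_as_time_series:
        return "gsn"
    if variation_timescale_years >= METADATA_THRESHOLD_YEARS:
        return "wiki"
    return "gsn"
```

For example, a value that changes on a scale of centuries would go to the wiki, while an hourly reading would go to GSN.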