MarBlog

Archive for November, 2010

BODC’s New data catalogue

by on Nov.24, 2010, under Online Data Sources

The British Oceanographic Data Centre has just announced a new facility on their website to search and retrieve data series directly from the web. While a lot of data could be retrieved before, this catalogue truly opens up access across all categories and projects, with over 76,000 data series being put online in a searchable format. The series are mainly CTD casts, but also include bathymetry, meteorology, optical properties, wave data and more.

The great thing is that data is available in several recognised formats (NetCDF, ODV and ASCII files), so virtually everyone in the field can access this data in a preferred format.

There are some limitations on how you can refine searches, but most of them make sense from the perspective of optimising searches and not hanging the server with searches that return virtually everything.

Once you have narrowed your search criteria to return 1,000 series or fewer, you can retrieve results. There’s the option of downloading a KML file of coverage, and you can retrieve data in your preferred format.
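As a rough illustration of what you can do with a KML coverage file once you have it, here is a minimal Python sketch that pulls the point positions out of a KML document using only the standard library. The sample document, the placemark names and the helper `placemark_positions` are made up for illustration; I am assuming standard KML 2.2 point placemarks, and a real coverage file from BODC may well be structured differently.

```python
import xml.etree.ElementTree as ET

# Standard OGC KML 2.2 namespace (assumed; check the root element of a real file).
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Hypothetical sample, not actual BODC output.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>series-A</name>
      <Point><coordinates>-4.5,50.25,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>series-B</name>
      <Point><coordinates>-3.0,53.5,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def placemark_positions(kml_text):
    """Return (name, longitude, latitude) for every point placemark."""
    root = ET.fromstring(kml_text)
    positions = []
    for placemark in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue  # skip placemarks that are not simple points
        # KML stores coordinates as "longitude,latitude[,altitude]".
        lon, lat = coords.strip().split(",")[:2]
        positions.append((name, float(lon), float(lat)))
    return positions


for name, lon, lat in placemark_positions(SAMPLE_KML):
    print(name, lon, lat)
```

Note that KML puts longitude before latitude, which is easy to get backwards when plotting.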

It is important to note that we’re talking data series, not individual points here, so even a single series can contain thousands of data points, giving you access to a seriously large amount of oceanographic data with a wide geographic coverage.

The initial map on the start page shows waters around Britain, but make sure you either zoom out or pan around, as there is data from a much wider region (virtually all of the world) than what is shown on the map.

You do have to register an account with BODC in order to check out your “data shopping”, but there is a huge amount of data freely available. The map tells you up front which data series are freely available.

BODC has truly made their data a lot more accessible with this exercise.


Ontology Part 4: Digging a bit deeper

by on Nov.03, 2010, under General, Online Data Sources

OK, so I’m now writing the fourth post centring on ontology. What I am really interested in is how to use ontology to improve the sharing and location of data, if at all possible without wrecking entire existing data models, because that is time-consuming, expensive, and quite often (let’s be realistic) just not an option.
A SlideShare presentation by Juan Esteva addresses some interesting issues about data integration.
To sum up my main take: there are 3 main challenges to successful data integration, essentially regardless of whether you are talking about an internet-based global concept or about disparate data sources within your own organisation (admittedly, there is a scalability difference!):
  1. Syntactic Challenges – e.g. different models and languages
  2. Schematic Challenges – e.g. structural differences
  3. Semantic Challenges – e.g. different meanings and understandings.
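To make the three challenges a bit more concrete, here is a small Python sketch with two made-up data sources that report the same kind of temperature observation. Everything in it (the station names, field names, the common `sea_temp_celsius` key and the two reader functions) is hypothetical and only meant to show where each challenge turns up, not how any particular system handles it.

```python
import csv
import io
import json

# Source A: JSON, nested structure, temperature in degrees Celsius.
SOURCE_A = '{"station": "A1", "obs": {"temp": 12.5}}'

# Source B: CSV, flat structure, temperature in kelvin.
SOURCE_B = "station,TEMP\nB7,285.65\n"


def read_a(text):
    """Map a source A record onto the common representation."""
    record = json.loads(text)  # syntactic: the format is JSON
    return {
        "station": record["station"],
        # schematic: the value sits one level down, under "obs"
        "sea_temp_celsius": record["obs"]["temp"],
    }


def read_b(text):
    """Map a source B record onto the common representation."""
    row = next(csv.DictReader(io.StringIO(text)))  # syntactic: the format is CSV
    return {
        "station": row["station"],
        # semantic: "TEMP" means kelvin here, so convert to Celsius
        "sea_temp_celsius": round(float(row["TEMP"]) - 273.15, 2),
    }


print(read_a(SOURCE_A))
print(read_b(SOURCE_B))
```

The parsing and the restructuring are mechanical. The unit conversion is the part you can only get right if somebody has written down what "TEMP" actually means, which is where the semantic layer comes in.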
To achieve true interoperability, you should in theory address all 3, and it would be nice to be able to do so for most people. But some of these challenges will remain, no matter what conceptual nirvana you present.
Each of the challenges has its own issues. There will always be linguistic differences, but they are gradually being overcome.
There will always be schematic differences, simply because not everyone will ever use the same vendor and the same solution, and they probably shouldn’t either.
But in terms of semantics, there is now at least an opportunity to present the information in a structured fashion. Just as it has been possible to define your language and your schema, you can now at least represent your understanding of what it is that you share. It will probably not be perfect (in fact it is almost guaranteed that someone will argue, and as with the single vendor/schema solution, perhaps that’s a good thing!), but it does provide the opportunity to include the current level of understanding with which data is being shared.
To start approaching more “global” ontologies, I suspect there is a next step involved, whereby classifications and entity relationships are defined by all the combinations actually in use, without passing judgement on the use. Instead, perhaps it should merely be the strength of usage of particular triples that determines the most likely representation and understanding of a concept. So we’re not barring anyone from semantically expressing that if it walks like a duck, it is-a three-legged pony. But because the triple is-a duck is overwhelmingly more popular, we can regard it as more likely without dismissing its equestrian counterparts.
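A minimal Python sketch of that idea, using the duck and pony example: collect every asserted triple without filtering, then rank the competing objects for a given subject and predicate by how often they are asserted. The function name `ranked_meanings`, the sample assertions and the use of a simple relative frequency as the "strength" are all my own assumptions for illustration; a real system would need something more careful than raw counts.

```python
from collections import Counter


def ranked_meanings(assertions, subject, predicate):
    """Rank the objects asserted for (subject, predicate) by share of usage."""
    counts = Counter(
        obj for s, p, obj in assertions if s == subject and p == predicate
    )
    total = sum(counts.values())
    # Nothing is discarded: unpopular assertions simply rank lower.
    return [(obj, n / total) for obj, n in counts.most_common()]


# Hypothetical usage data: nine sources say duck, one says three-legged pony.
ASSERTIONS = (
    [("walks-like-a-duck", "is-a", "duck")] * 9
    + [("walks-like-a-duck", "is-a", "three-legged pony")]
)

for meaning, share in ranked_meanings(ASSERTIONS, "walks-like-a-duck", "is-a"):
    print(meaning, share)
```

The pony stays in the result with a low share, so nobody's view is thrown away; it is just not the first interpretation offered.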
I am actually working on something slightly more constructive in terms of ontology use, which will show up in a future part of this series, but it is the more philosophical aspects of the concept that still both excite and bug me.
