Adrià Mercader - Blog
I recently needed a bit of geoprocessing for which PostGIS turned out to be the perfect tool. I'll share my findings in case they are useful to somebody. But first of all, here goes some context: I had imported the countries dataset from Natural Earth to PostGIS (piece of cake with ogr2ogr), and I needed to obtain the extent or bounding box of each country.
Getting the extents
The most obvious choice seemed
ST_Extent, but there is a better choice:
ST_Extent is an aggregate function, so you need to group by one of the
table's columns, and it returns a
bbox3d_extent object instead of an actual geometry.
ST_Envelope will return a geometry (a polygon in our case):
CREATE TABLE world_extents AS SELECT ST_Envelope(mpoly) AS the_geom,name FROM world_worldborders;
The result of the previous statement was not as exciting as expected when rendered in QGIS:
The Natural Earth borders dataset has one multi-polygon per country, which includes all the land polygons which pertain to it. In consequence, countries like the United States had extents stretching the whole world, from Hawaii to the Bering Strait islands. I needed a previous step to obtain one separate polygon for each land mass or island.
GIS users will be familiar with the operation known as explode. It is used to obtain individual features from a multipart feature. So instead of having a single polygon for all Japan, after exploding it you will get an individual polygon for each of its islands.
There are different ways of performing it in desktop GIS applications, but in can also be done in PostGIS using a couple of different functions. I adapted this statment from the slides of the "Tips for the PostGIS power user" by Paul Ramsey, presented in last year's edition of FOSS4G (The slides are a must read for any PostGIS enthusiast). Let's have a look at the final statement and discuss it later:
CREATE TABLE world_worldborders_2 AS SELECT id,name,the_geom FROM (SELECT id,name,ST_GeometryN(mpoly,generate_series(1,ST_NumGeometries(mpoly))) AS the_geom FROM world_worldborders ) AS foo;
ST_GeometryNwill return the geometry at the provided index in a geometry collection or multipart geometry.
generate_seriesis a set returning function (it returns more than one row). It generates a series of values from the start to the end values provided.
ST_NumGeometriesdoes what it says on the tin, returning the number of geometries in a collection or multigeometry.
There is a faster approach using
ST_Dump, which returns a set of the individual geometries that make up a multi one.
Notice that it returns a composite type with a
geom properties, so you need to use the dot notation to access the returned geometry:
CREATE TABLE world_worldborders_3 AS SELECT id,name, (ST_Dump(mpoly)).geom FROM world_worldborders;
The combined statement produced a more nice looking layer, with extents following more closely the actual outline of the countries:
CREATE TABLE world_extents_2 AS SELECT id,name,ST_Envelope(the_geom) AS the_geom FROM (SELECT id,name,ST_GeometryN(mpoly,generate_series(1,ST_NumGeometries(mpoly))) AS the_geom FROM world_worldborders ) AS foo;
Notice the subtitles options below the video. It is using Universal Subtitles, an online service that makes really easy to add or edit subtitles to existing videos. I helped creating the Catalan version of the subtitles and found it a brilliant idea. You can edit the translations on site or download the file and upload the changes.
By the way, if there are any Catalan readers willing to proofread and correct the subtitles, please go ahead...
The past Friday and Saturday took place in Hilversum, near Amsterdam, the first Europeana Hackathon. Europeana is a project funded by the European Comission that aggregates cultural content from organizations across Europe. It hosts around 16 million records, and the variety of stuff you can find on it is incredible, from ancient maps to fashion clogs to dolls sitting on a monkey.
The team behind Europeana has recently published a preliminary beta version of an API, and the objectives of the Hackathon were both receiving feedback from developers and see what they could came up with after playing with it.
The hacking took place in the impressive Netherlands Institute for Sound and Vision and the room was everything but dull! The organization was absolutely flawless and there were lots of interesting ideas for enhancing and building services around the API.
Just the day before starting, new enrichment terms were included in the API, including location information extracted from GeoNames. This allowed to perform bounding box based spatial queries, and I quickly started looking at ways to use them.
The Europeana API is based on OpenSearch, and returns a RSS response of the query results. Unfortunately, the location information is not included in this first response, but in a SRW representation that has to be queried in a second query. So my first step was to create a thin wrapper around the Europeana API that captured the results, queried the details of each item and injected the coordinates back in the results RSS.
So instead of getting this:
<item> <title>Vierge à l'Enfant : Nostre Dame de Grasse</title> <link>http://www.europeana.eu/portal/record/03901/ ... </link> <description>anonyme;Culture.fr/collections</description> <enclosure url="http://www.culture.gouv.fr/Wave/image/joconde/0387/m056206_0003905_v.jpg"/> </item>
You get this:
<item> <title>Vierge à l'Enfant : Nostre Dame de Grasse</title> <link>http://www.europeana.eu/portal/record/03901/ ... </link> <description>anonyme;Culture.fr/collections</description> <enclosure url="http://www.culture.gouv.fr/Wave/image/joconde/0387/m056206_0003905_v.jpg"/> <georss:point xmlns:georss="http://www.georss.org/georss">43.41667 1.5</georss:point> </item>
A tiny but crucial difference. This effectively transforms the results in a GeoRSS feed, an open standard used to include location information in RSS which is widely supported in mapping applications and libraries like OpenLayers or Google Maps. The obvious problem is that this is terribly inefficient and very slow, so it will be great if the API itself would provide this information.
Once the GeoRSS wrapper was ready, there were several possible applications to build around it. I wrote a simple interface with OpenLayers that allows you to draw a bounding box, perform an API query and plot the results on the map. It also supports pagination and filtering by historic period (though in some cases no results are returned).
There was no time for more, but hopefully it will show the potential of geo-enabling the Europeana API, which I'm sure has a bright future ahead.
Some say blogs are not cool any more (i.e. they are "dead"), left behind by the frantic pace of Twitter and Facebook. News and ideas are more and more condensed to fit the sacred 140 characters, and they become outdated sooner than ever before. I use Twitter and Facebook daily, but I still found the most interesting and useful resources reading blogs. I value the effort the authors (at least some of them) put into presenting their ideas and knowledge. So I thought I could spend some time sharing some of my thoughts too.
A couple of quick facts for those of you who don't know me:
- I was born in Tarragona and I'm currently living in the UK, in Newcastle upon Tyne.
- I'm a web developer with a strong background in the geospatial field. I work with Open software and standards.
- If I had to choose three pieces of software to take to a desert island, I would probably pick Python, OpenLayers and PostGIS.
As a proper hacker, I wrote my own website and blog engine (with Pylons, in case you were wondering) so please bear with me if you find some glitch or odd behaviour. You can subscribe to the Atom or RSS feeds or follow me on @amercader to hear about new posts.
If you made it this far, thanks for your time and hope to see you around!