Intro to OGR, Part II: Creating New Data

Looking for Part I? (Go back to "Intro to OGR: Exploring Data" here!)

Last time, we explored an Austin Parks shapefile using both QGIS & OGR. In this part of the tutorial series, we're going to use the original Austin Parks shapefile to create new layers of different file types. After that, we'll create a simple web map using CartoDB to host & display our data.

Inspect the Field List

If you aren't there already, open a terminal & change to your working directory (where you've placed the Austin Parks shapefile).

$ cd ~/gis/austin_parks

From here, we'll take another look at the summary info given by ogrinfo. This will show us a field list, and we'll use that to determine which fields we want in our newly-created layer. We've done this already, but just to review:

$ ogrinfo -so city_of_austin_parks.shp -sql "SELECT * FROM city_of_austin_parks"

This will return an overview of city_of_austin_parks.shp, including a list of fields.
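If the summary scrolls past too quickly, you can pull out just the field names. Here's a minimal sketch, assuming ogrinfo prints each field on its own line as NAME: Type (width.precision), which is the layout I'd expect from recent GDAL releases. The list_fields helper is a made-up name for this example, not part of OGR.

```shell
# list_fields (hypothetical helper): read `ogrinfo -so` output on stdin and
# print only the field names. Assumes field lines look like
# "PARK_NAME: String (50.0)"; extend the type list if your layer needs it.
list_fields() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*: (Integer|Integer64|Real|String|Date|DateTime|Time) ' \
    | cut -d: -f1
}

# Only call ogrinfo when it is installed and the shapefile is present.
if command -v ogrinfo >/dev/null 2>&1 && [ -f city_of_austin_parks.shp ]; then
  ogrinfo -so city_of_austin_parks.shp city_of_austin_parks | list_fields
fi
```

Header lines such as "Geometry:" and "Feature Count:" don't match the pattern, so only the fields themselves should come through.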

For this project, we want parks with a "developed" status, and we only want to display each park's name and address. Judging from this field list, we know we'll need to look at both PARK_STATU and PARK_ADDRE for the new layer. However, you'll notice that these features have both PARK_NAME and PARK_LABEL fields. One of those probably has the descriptive name we want to use in our map, but we don't know which. Let's grab one of the features to see what it has in each of those fields.

Do this:

$ ogrinfo -q city_of_austin_parks.shp -sql "SELECT * FROM city_of_austin_parks" -fid 0

To get this:

Layer name: city_of_austin_parks
OGRFeature(city_of_austin_parks):0
  PARK_ID (Integer) = 167
  PARK_NAME (String) = Givens
  PARK_TYPE (String) = District
  PARK_ADDRE (String) = 3811 E 12th St.
  PARK_STATU (String) = Developed
  POLYGON_PA (Integer) = 1
  ACRES (Real) =         41.43263805
  CREATED_BY (String) = (null)
  CREATED_DA (Date) = 0000/00/00
  MODIFIED_B (String) = ahardy
  MODIFIED_D (Date) = 2009/05/01
  PARK_LABEL (String) = Givens District Park
  SHAPE_AREA (Real) = 1804805.71356000006
  SHAPE_LEN (Real) =    5518.67090340000
  POLYGON ((3131001.980158895254135 (...etc...)

Based on that, PARK_LABEL looks like the more descriptive of those two fields.
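One feature is a pretty small sample, so it can't hurt to check a few more. The sketch below selects only the two candidate fields and limits the query to the first handful of FIDs; -geom=NO keeps ogrinfo from dumping each polygon. The cutoff of 5 is arbitrary.

```shell
# Build the query once so it is easy to tweak.
layer="city_of_austin_parks"
sql="SELECT PARK_NAME, PARK_LABEL FROM ${layer} WHERE FID < 5"

if command -v ogrinfo >/dev/null 2>&1 && [ -f "${layer}.shp" ]; then
  # -q hides the header info, -geom=NO hides the geometry dump
  ogrinfo -q -geom=NO "${layer}.shp" -sql "${sql}"
else
  # Without ogrinfo or the data, just show the command that would run
  echo "Would run: ogrinfo -q -geom=NO ${layer}.shp -sql \"${sql}\""
fi
```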

Create New Layer that Meets our Requirements

At this point we are ready to create a new shapefile. To recap: we want to create a map that shows only the developed parks in Austin, and labels each park with its name and address. We're going to keep only the features where PARK_STATU = 'Developed', and keep only the PARK_ADDRE and PARK_LABEL columns. While we're at it, let's also rename those columns to make them a bit more friendly.

For this step, we're going to switch to a new OGR utility: ogr2ogr, which will allow us to create a new file. This tool is similar to ogrinfo, in that both are used from the command line and both share flags like -sql.

Do this:

$ ogr2ogr new_layer.shp city_of_austin_parks.shp -sql "SELECT PARK_LABEL AS NAME, PARK_ADDRE AS ADDRESS FROM city_of_austin_parks WHERE PARK_STATU = 'Developed'"

Congratulations! In one unbelievably fast instant, you've created a new shapefile. Check your working directory to see new_layer.shp.

In that command, we told ogr2ogr to create a new shapefile (new_layer.shp), based on the old shapefile (city_of_austin_parks.shp). We renamed PARK_LABEL to NAME, and PARK_ADDRE to ADDRESS. Finally, we kept only the features where PARK_STATU was "Developed".
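If you expect to run this step more than once (say, while tweaking the query), it can help to put the pieces in variables. This is only a sketch of the same command: the variable names are my own, and I've added the -overwrite flag, which ogr2ogr documents for replacing an output layer that already exists. Without it, a second run will likely complain that new_layer is already there.

```shell
# Same conversion as above, split into named pieces.
src="city_of_austin_parks"
dst="new_layer.shp"
sql="SELECT PARK_LABEL AS NAME, PARK_ADDRE AS ADDRESS FROM ${src} WHERE PARK_STATU = 'Developed'"

if command -v ogr2ogr >/dev/null 2>&1 && [ -f "${src}.shp" ]; then
  # -overwrite: replace the output layer if an earlier run already created it
  ogr2ogr -overwrite "${dst}" "${src}.shp" -sql "${sql}"
fi
```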

That was ridiculously fast. Let's check the first feature of the new layer to make sure the output is good.

Do this:

$ ogrinfo -q new_layer.shp -sql "SELECT * FROM new_layer WHERE FID = 0"

To get this:

Layer name: new_layer
OGRFeature(new_layer):0
  NAME (String) = Givens District Park
  ADDRESS (String) = 3811 E 12th St.
  POLYGON ((3131001.980158895254135 (...etc...)

Looks like exactly what we wanted. Great!
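Checking one feature confirms the columns, but not the filter. As a rough sanity check, we can count the developed parks in the original layer and compare that to the feature count of the new one; the two numbers should agree. A sketch, assuming both shapefiles are in the working directory:

```shell
layer="city_of_austin_parks"
count_sql="SELECT COUNT(*) FROM ${layer} WHERE PARK_STATU = 'Developed'"

if command -v ogrinfo >/dev/null 2>&1 && [ -f "${layer}.shp" ] && [ -f new_layer.shp ]; then
  # Number of developed parks in the source layer
  ogrinfo -q -geom=NO "${layer}.shp" -sql "${count_sql}"
  # "Feature Count" line from the new layer's summary
  ogrinfo -so new_layer.shp new_layer | grep "Feature Count"
fi
```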

Building the Map

Finally, let's create a map! I really like CartoDB for use cases like this one: when I want somewhere to upload spatial data and display it on a web map with minimal effort. To continue here, you'll need an account on CartoDB. If you don't have one already, sign up here: it's free!

When you first log in to CartoDB, you'll see a dashboard and a notice telling you that you have 5 free tables available. We're going to use one of those for our Austin Parks map. Before we do that, however, let's go back to OGR one last time.

Creating a KML

At this point we've finished creating our map layer. If we wanted, we could directly use this shapefile to create a web map: CartoDB accepts many spatial file types, shapefiles included. However, as a learning project this is a great opportunity to practice another common use case for ogr2ogr: quickly converting shapefiles into alternate formats. Here, we're going to convert a shapefile into a KML document.

Even if you've never worked with KML files before, don't worry - they're very simple! There is only one thing you need to know: each feature (a "placemark", in KML terms) is labeled with a name and a description, and by default OGR's KML driver fills those in from fields called Name and Description. We do have a NAME field, but our shapefile doesn't have a column named Description. That's fine though: we can tell ogr2ogr to use our current ADDRESS field as the Description.

Do this:

$ ogr2ogr -f "KML" austinparks.kml new_layer.shp -dsco DescriptionField='ADDRESS'

Here, we've explicitly told ogr2ogr that we want KML-formatted output, using the -f "KML" flag. Our new KML file is called austinparks.kml and we've used our existing ADDRESS field as the new Description. Check your working directory to verify that austinparks.kml was created.
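KML is plain XML, so you can peek inside with ordinary text tools. Here's a rough sketch that counts the placemarks and prints the first few name and description elements. It assumes the driver writes each of those elements on a single line, and kml_labels is just a helper name I made up. Note that the layer's own name element may show up in the list too.

```shell
# kml_labels (hypothetical helper): print the <name> and <description>
# elements found in the KML file given as the first argument.
kml_labels() {
  grep -E -o '<(name|description)>[^<]*</(name|description)>' "$1"
}

kml="austinparks.kml"
if [ -f "${kml}" ]; then
  # Lines containing a placemark; roughly one per park
  echo "Placemarks: $(grep -c '<Placemark' "${kml}")"
  # First few labels, to confirm ADDRESS landed in <description>
  kml_labels "${kml}" | head -n 6
fi
```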

Putting our Data on CartoDB

At this point we have our finished map layer as a KML, and are logged in to CartoDB. To put your map online, navigate to your CartoDB dashboard. Click "Create Table".

In the New Table dialog, choose "select a file" and point to your newly-created austinparks.kml file.

It may take a few minutes to upload & process your data. That's fine. When it's finished, you'll see your new austinparks table.

Click "Map View" and you'll see your data projected onto a basemap. Neat!

There is a lot you can do with CartoDB, so when you have time feel free to poke around and explore. Among other things, you can change the basemap, rename the map, and customize the popup windows. CartoDB also has quite a powerful API, which you can use to customize your map even further and to integrate your map data with other applications.

When you are finished, click the "Share" button on your map frame. From there, you can get a publicly shareable map link, as well as embeddable HTML and API integration details. Paste the embeddable HTML into an existing site to display your map there.

And that's it: you've successfully created map layers from public data, using tools from the open source OGR library, and turned your data into a web map with CartoDB. I think that's pretty awesome!

By @Sara Safavi