Written on November 17, 2023

Thoughts on PostGIS Day 2023

I blocked my calendar on Nov 16th and spent an enjoyable few hours watching presentations from PostGIS Day 2023. I wasn’t able to catch all of them, as they started at 5 AM for me, but I was impressed with the ones I did see. Paul Ramsey did his usual excellent job of hosting the event and it was very enjoyable to live in my old world again, even for just a little while.

A few presentations stood out for me and will serve as inspiration for future personal research and exploration.

Spatial SQL, the Modern Data Stack, and PostGIS

Matt Forrest, from Carto, gave a talk that was a fruit salad of the latest hotness with references to DuckDB, H3, dbt and a host of other buzzwords. While it’s easy to be cynical about the name drops, his talk was interesting and compelling - so much so that I bought his book, Spatial SQL, after his presentation was done. A couple of things that he covered that are on my list to check out:

  • DuckDB: this has been coming up more and more lately and I just discovered that my friend from Tableau days, Richard Wesley, now works for DuckDB Labs. Frankly anything that Richard touches is something I want to check out, so this was a good reminder to give this a test drive soon.

  • dbt: the dbt website is a mish-mash of blurbs without any real cohesion or insights, but this blog post describes dbt as being, “a command line tool that enables data analysts and engineers to transform data in their warehouses more effectively”. It’s still not clear to me WTF that means and honestly, I’m not sure that I need/want it, but it does seem to come up in conversations. Best I can tell, it’s a data wrangling platform that lives in the Cloud and allows SQL-based ETL. Solid maybe on this one, but would be good to understand a bit better how people are using it. Cloud-based, SQL analog to FME or Alteryx, maybe?

  • H3: Ahh H3… Uber’s much vaunted “hex grid”. Described as being, “a geospatial indexing system using a hexagonal grid that can be (approximately) subdivided into finer and finer hexagonal grids”. (Because I guess a rectangle wasn’t good enough?) I suppose if you want to cover a sphere, then the hexagon works better, but I think someone at Uber just REALLY liked playing old-school RTS games. Still, Sarah Battersby has written about using hexes and she’s right that there are some advantages to using them in certain situations. I’m still somewhat skeptical about whether they’re as cool as people say they are though.
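One concrete reason hexagons get picked over rectangles: every neighbor of a hexagonal cell sits at the same distance from its center, while a square cell has two kinds of neighbors (edge and corner). The snippet below is a minimal sketch of that property on a flat plane using axial hex coordinates. It is not H3's API and ignores the sphere entirely; the function names and the pointy-top layout are assumptions made for illustration.

```python
import math

# Axial-coordinate offsets of the six neighbors of a hexagonal cell.
HEX_NEIGHBORS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

# Offsets of the eight neighbors (edges and corners) of a square cell.
SQUARE_NEIGHBORS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def hex_center(q, r, size=1.0):
    """Planar center of a pointy-top hexagon at axial coordinates (q, r)."""
    x = size * math.sqrt(3) * (q + r / 2)
    y = size * 1.5 * r
    return x, y


def hex_neighbor_distances():
    """Distinct center-to-center distances from a hex cell to its neighbors."""
    return sorted({round(math.hypot(*hex_center(q, r)), 6) for q, r in HEX_NEIGHBORS})


def square_neighbor_distances():
    """Distinct center-to-center distances from a square cell to its neighbors."""
    return sorted({round(math.hypot(dx, dy), 6) for dx, dy in SQUARE_NEIGHBORS})


if __name__ == "__main__":
    print("hex:", hex_neighbor_distances())
    print("square:", square_neighbor_distances())
```

The hex grid yields a single neighbor distance and the square grid yields two, which is part of why hexes tend to behave better for things like movement or smoothing across adjacent cells.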

My takeaway from Matt’s talk is that some of this “hot newness” can actually be pretty useful, so I’m looking forward to working through some of his examples and seeing what I think for myself.

PgRouting: A Practical Example

Presented by Vicky Vergara, whom I found to be an absolutely lovely person. Sometimes you meet people who exude a vibe that makes them seem like superheroes, as though everything they do and touch makes the world a better place. Vicky felt like that to me. Her talk was on pgRouting, which is something I’ve wanted to experiment with for a while now to generate drivetime polygons. But what I found absolutely amazing was that the work she presented dovetails into the UN Sustainable Development Goals. I think this mission statement, “promoting the development and use of open-source software that meets UN needs and supports the aims of the UN” is simply AMAZING. I had no idea OSGeo was involved in this and I am super impressed. Nice job, people!

Refactoring the Way We Talk About SQL: It is Costing us Money

The inimitable Brian Timoney gave a talk that I feel could have been titled, “Refactoring the Way We Talk About GIS…” instead of “SQL”, but regardless - he’s absolutely right. GIS professionals don’t get the same respect as other Data Scientists, even though in many cases they use the same tools and analytic techniques - in addition to doing GIS - that other Data Scientists do. I enjoyed the content and humor of his presentation and am curious to take a stab at his Colorado well permit challenge. Maybe using R and possibly a Shiny app? That said, while I generally agree with him, there are still some GIS analysts out there who frankly need to get their skills on par with the rest of the analytic world. Simply knowing how to do spatial analysis in a desktop GIS tool like ArcMap or MapInfo doesn’t cut it anymore. And maybe having some general knowledge about statistics and machine learning wouldn’t hurt either, just sayin’…

Simple Polygonal Coverages in PostGIS

Given by Martin Davis, another superhero in my book, this was an illuminating talk about how simple polygonal coverages in PostGIS could be used in place of [more complex and cumbersome] topological models for certain tasks. Given what he showed in this presentation, the GRASS-based workflow I developed at Tableau to validate and clean admin polygons could have been replaced by in-database spatial SQL queries, which would have been REALLY nice. And the MapShaper simplification and all the Node.js bullshit that went along with it… I’m really happy to have learned about this functionality because I’m certain it’s going to come in handy for me at some point in the future.

Final Thoughts

As I mentioned in the beginning, I really enjoyed the presentations and vibe of this Conference, Summit, Day… (whatever it is). If I have one regret, it’s that I couldn’t catch the 1st talk of the day on PostgREST, as it appears to be the one talk on the agenda that discussed using an API on top of PostGIS. My personal feeling is that doing geospatial analysis purely in SQL isn’t how I want to spend my time anymore, but I can absolutely see myself building APIs on top of PostGIS functions in the future. I think more talks about APIs on top of PostGIS would have been interesting.

Secondly, I do feel there was a bit of a disconnect by having ZERO talks about the interactions between Python (or R, etc.) code and PostGIS. It would have been useful, for example, to explore how code-based interactions with the database can benefit from DB profiling to understand how they execute on the database, and what can be done to optimize the queries that are being made. Plenty of people are still slinging SQL directly, but I know plenty more who are just using ODBC connections and writing Python or R code. I know, I know, there’s nothing stopping me from submitting a talk proposal…
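To make the profiling idea a little more concrete, here is a minimal sketch of what that might look like from Python. It assumes a DB-API style cursor (for example, one from psycopg2) and relies on PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)`; the helper names `explain_sql` and `profile_query` are hypothetical, not part of any library. Keep in mind that `ANALYZE` actually runs the query, so this is not something to point at a destructive statement.

```python
import json


def explain_sql(query):
    """Wrap a SQL statement so PostgreSQL returns its executed plan as JSON."""
    return "EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) " + query.strip().rstrip(";")


def profile_query(cursor, query, params=None):
    """Run EXPLAIN ANALYZE through a DB-API cursor and summarize the plan.

    `cursor` is assumed to be a DB-API cursor connected to PostgreSQL.
    Some drivers return the JSON plan already parsed, others as a string,
    so both cases are handled.
    """
    cursor.execute(explain_sql(query), params)
    raw = cursor.fetchone()[0]
    plan = json.loads(raw) if isinstance(raw, str) else raw
    top = plan[0]
    return {
        "node_type": top["Plan"]["Node Type"],
        "planning_ms": top.get("Planning Time"),
        "execution_ms": top.get("Execution Time"),
    }
```

A "Seq Scan" as the top node type on a spatial filter, for instance, is often a hint that a spatial index is missing or not being used, which is exactly the kind of thing that is easy to miss when the SQL is buried inside application code.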

These minor gripes aside, I feel PostGIS (and PostgreSQL) continues to demonstrate that it is one of the best available products out there and that these presentations are the materialized view/proof of that. Thank you to all the people who organized the event and to the speakers who gave the presentations. I look forward to seeing you again next year.
