I am still working on the first deployment, but new things and ideas keep popping up. I had a few ideas and discoveries regarding reverse geocoding, the tile server and middleware.
Regarding the deployment – I populated the server with GADM data and started researching how to deploy a Go server. In the past I used Docker images for a similar purpose. Usually I would build and publish a Docker image to my Docker Hub account and then remotely update the running image on my virtual machine. I think this time I can simplify things a bit by building the image directly on my virtual machine. This way I can skip the Docker Hub step.
Reverse geocoding. My idea for reverse geocoding was to return the lowest possible GADM level for a given point (lat, lng). It is not obvious how to achieve this, since each level is stored in a separate table and countries differ in how many levels they have, e.g. France has all five levels while other countries only have two or three.
Idea 1 (expensive) – send five SQL queries for each reverse-geocode request, one for each GADM level. This will definitely work, but at the expense of performance, since the Postgres cluster has to process unnecessary queries while only the lowest level is returned to the client.
Idea 2 – create a mapping between a country and its number of GADM levels, e.g. FR -> 5. This way I can send two queries to Postgres: one to find the country for the requested lat, lng point, and a second to the table that stores the lowest level for this country.
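To make Idea 1 concrete, here is a minimal Go sketch of the fan-out: one lookup per level, keeping only the hit from the lowest level. Everything in it is an assumption for illustration: `Region`, `LevelLookup` and `fakeLookup` are hypothetical names, and `fakeLookup` stands in for a real SQL query against one adm_N table. Levels are numbered from 0, matching the adm_0 table name.

```go
package main

import (
	"fmt"
	"sync"
)

// Region is a hypothetical result row: the GADM level it came from and a name.
type Region struct {
	Level int
	Name  string
}

// LevelLookup stands in for one SQL query against a single per-level table.
// It reports whether the point falls inside any geometry at that level.
type LevelLookup func(level int, lat, lng float64) (Region, bool)

// deepestRegion runs one lookup per level concurrently and keeps the hit
// with the highest level number. All other results are thrown away, which
// is exactly the wasted work described above.
func deepestRegion(lookup LevelLookup, levels int, lat, lng float64) (Region, bool) {
	results := make([]*Region, levels)
	var wg sync.WaitGroup
	for lv := 0; lv < levels; lv++ {
		wg.Add(1)
		go func(lv int) {
			defer wg.Done()
			if r, ok := lookup(lv, lat, lng); ok {
				results[lv] = &r // each goroutine writes only its own slot
			}
		}(lv)
	}
	wg.Wait()

	// Walk from the most detailed level upwards and return the first hit.
	for lv := levels - 1; lv >= 0; lv-- {
		if results[lv] != nil {
			return *results[lv], true
		}
	}
	return Region{}, false
}

// fakeLookup pretends the point is covered by levels 0..2 only
// (a made-up country with three levels).
func fakeLookup(level int, lat, lng float64) (Region, bool) {
	if level > 2 {
		return Region{}, false
	}
	return Region{Level: level, Name: fmt.Sprintf("region-lv%d", level)}, true
}

func main() {
	r, ok := deepestRegion(fakeLookup, 5, 48.85, 2.35)
	fmt.Println(r.Level, r.Name, ok)
}
```

Running the lookups concurrently hides some latency, but Postgres still executes all five spatial queries per request.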
-- pseudo query one
SELECT adm_0.fid, adm_0.geom FROM adm_0
JOIN country_to_num_levels USING (fid)
WHERE ST_Within(point, adm_0.geom);
-- pseudo query two
SELECT ...columns FROM adm_? WHERE ST_Within(point, geom);
This approach limits the unnecessary work that Postgres has to perform, but it requires two sequential queries (the second one needs the country found by the first), which means a slower response time for users.
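The mapping from Idea 2 could live in the Go server as a small lookup table. The sketch below is hedged: `numLevels` and `lowestLevelTable` are hypothetical names, only the FR -> 5 entry comes from the example above, and it assumes the tables are numbered from adm_0, so a country with n levels has its lowest level in adm_(n-1).

```go
package main

import (
	"fmt"
	"strings"
)

// numLevels maps a country code to its number of GADM levels.
// Only FR comes from the example in the text; XX is a made-up placeholder.
var numLevels = map[string]int{
	"FR": 5,
	"XX": 2,
}

// lowestLevelTable returns the name of the table that should hold the lowest
// level for the given country. Assumption: tables are named adm_0, adm_1, ...
// so n levels means the lowest one is adm_(n-1).
func lowestLevelTable(country string) (string, error) {
	n, ok := numLevels[strings.ToUpper(country)]
	if !ok || n < 1 {
		return "", fmt.Errorf("no GADM level count known for country %q", country)
	}
	return fmt.Sprintf("adm_%d", n-1), nil
}

func main() {
	table, err := lowestLevelTable("fr")
	fmt.Println(table, err)

	_, err = lowestLevelTable("ZZ")
	fmt.Println(err)
}
```

Because the table name is derived from a fixed map and never from user input, it can be interpolated into the second query without opening an SQL injection hole.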
Idea 3 – I realized I am not forced to store the GADM data the way the ogr2ogr tool ingested it into Postgres (that is, in five separate tables). I can create a single table for all geometries, order it by GADM level and run a single geospatial query that limits the result to just one row. That result still needs to be joined with the appropriate adm_lv table, so I am still performing two queries, but potentially only one spatial (within) query.
I have a few more ideas that include caching, rasterizing the dataset or leveraging geohashes.
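The "order by level, take the first match" part of Idea 3 can be illustrated in memory. This is only a sketch under loud assumptions: `Row`, `sortByLevelDesc` and `lowestMatch` are hypothetical names, the coordinates are made up, and a bounding box stands in for the real polygon test that PostGIS would do. In SQL the same shape would be roughly a single SELECT with a within filter, ORDER BY level DESC and LIMIT 1.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is one entry of a hypothetical combined table: the GADM level, the id
// of the matching row in the per-level attribute table, and a bounding box
// standing in for the real geometry.
type Row struct {
	Level                          int
	FID                            int
	MinLat, MinLng, MaxLat, MaxLng float64
}

// contains is a crude replacement for a real point-in-polygon test.
func (r Row) contains(lat, lng float64) bool {
	return lat >= r.MinLat && lat <= r.MaxLat && lng >= r.MinLng && lng <= r.MaxLng
}

// sortByLevelDesc puts the most detailed (highest-numbered) level first.
func sortByLevelDesc(rows []Row) {
	sort.SliceStable(rows, func(i, j int) bool { return rows[i].Level > rows[j].Level })
}

// lowestMatch returns the first row containing the point. Because the rows
// are expected to be sorted by level descending, the first hit is also the
// lowest GADM level, which mirrors ORDER BY level DESC LIMIT 1.
func lowestMatch(rows []Row, lat, lng float64) (Row, bool) {
	for _, r := range rows {
		if r.contains(lat, lng) {
			return r, true
		}
	}
	return Row{}, false
}

func main() {
	rows := []Row{
		{Level: 0, FID: 1, MinLat: 40, MinLng: -5, MaxLat: 52, MaxLng: 10},
		{Level: 1, FID: 7, MinLat: 48, MinLng: 1, MaxLat: 50, MaxLng: 4},
		{Level: 2, FID: 42, MinLat: 48.8, MinLng: 2.2, MaxLat: 48.9, MaxLng: 2.5},
	}
	sortByLevelDesc(rows)

	r, ok := lowestMatch(rows, 48.85, 2.35) // inside all three boxes
	fmt.Println(r.Level, r.FID, ok)

	r, ok = lowestMatch(rows, 45.0, 0.0) // only inside the level 0 box
	fmt.Println(r.Level, r.FID, ok)
}
```

The returned level and FID are what the second, non-spatial query would use to fetch the attributes from the matching per-level table.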
Tile server – I found these two cool projects that serve geospatial data as vector tiles directly from PostGIS:
- tegola.io
- www.postgresql.org/about/news/pg_tileserv-for-postgresqlpostgis-2016/
Middleware – I also found the middleware package I was looking for! It handles context logging in JSON format. I still need to look into the other middleware it provides: CORS, JWT, auth and proxy.