PostGIS extension enables you to efficiently store and query geospatial data in Postgres. It introduces spatial data types such as GEOMETRY and GEOGRAPHY, which lets you store point
s, polygon
s, line
s and so on. PostGIS also supports a large set of SQL functions and spatial indexes to effectively manipulate spatial data.
The reason you chose PostGIS is that you want your geospatial workload to be treated as a first-class citizen. You should expect no less when it comes to replicating your geospatial database, instead of geospatial being an afterthought.
In this blog, we will explore what it takes to natively replicate PostGIS data from Postgres to other data stores.
Status Quo
We've overwhelmingly heard from customers that existing data movement tools are not reliable in replicating geospatial data from Postgres. Common issues include -
Suboptimal Data Type Conversion - converting spatial types to STRING or JSON rather than using native spatial types supported by the target.
Data Loss - skipping whole rows when encountering incompatible types leading to loss of data.
Replicating Geospatial data from Postgres
At PeerDB, we are building a specialized data-movement tool for Postgres. With that spirit, we took a step forward to add extensive geospatial support while replicating data from Postgres to Snowflake. This includes -
Demo of replicating Postgres Geospatial data incl. POINT, LINE, POLYGON etc of GEOMETRY and GEOGRAPHY types to Snowflake.
Support for GEOMETRY and GEOGRAPHY data types: These types are stored as GEOMETRY and GEOGRAPHY types on Snowflake.
Replicating Points, Lines and Polygons: A GEOMETRY or GEOGRAPHY in Postgres can be any of a Point, Line or Polygon. PeerDB handles replicating all these data types to Snowflake.
Let's see how to ingestPoint
,Line
andPolygon
to a Postgres table:CREATE TABLE locations ( id serial PRIMARY KEY, location_name VARCHAR(255), location_geometry GEOMETRY, location_geography GEOGRAPHY ); -- Inserting a point with generic geometry and geography INSERT INTO locations VALUES (1,'Point A', ST_GeomFromText('POINT(1 2)'), ST_GeogFromText('POINT(1 2)')); -- Inserting a line with generic geometry and geography INSERT INTO locations VALUES (2, 'Line AB', ST_GeomFromText('LINESTRING(1 2, 3 4, 5 6)'), ST_GeogFromText('LINESTRING(1 2, 3 4, 5 6)')); -- Inserting a polygon with generic geometry and geography INSERT INTO locations VALUES (3, 'Polygon ABCD', ST_GeomFromText('POLYGON((1 1, 2 1, 2 2, 1 2, 1 1))'), ST_GeogFromText('POLYGON((1 1, 2 1, 2 2, 1 2, 1 1))'));
Replicating GEOMETRYCOLLECTION: A geometry can represent a collection of Lines, Points and Polygons. PeerDB handles replicating
GEOMETRYCOLLECTION
from Postgres to Snowflake.
Let's see how to insert aGEOMETRYCOLLECTION
to a Postgres table:/* Inserting a GEOMETRYCOLLECTION into the generic geometry column */ INSERT INTO locations (location_name, location_geometry, location_geography) VALUES ( 'GeometryCollection', ST_GeomFromText('GEOMETRYCOLLECTION(POINT(1 2), LINESTRING(1 2, 3 4))'), ST_GeogFromText('GEOMETRYCOLLECTION(POINT(1 2), LINESTRING(1 2, 3 4))') );
Error handling for invalid Geometries: A difference between Postgres and Snowflake is that Postgres lets you store invalid Geometries and Snowflake doesn't. For example, a line with a single point can be stored in Postgres but not in Snowflake. In such a scenario, while replicating data from Postgres, PeerDB captures the invalid data in logs and stores
null
for that column value on Snowflake.
Let's see how an invalidGEOMETRY
looks like in Postgressource=> SELECT ST_IsValid(ST_MakeLine('POINT(1 2)'::geometry)); NOTICE: Too few points in geometry component at or near point 1 2 st_isvalid ------------ f (1 row)
Conclusion
We added extensive support to replicate geospatial data from Postgres to Snowflake. We will be expanding this to other data stores such as BigQuery and Clickhouse. Please reach out to us if you want this support for other data stores.
If you want to give PeerDB a try for replicating geospatial data from Postgres, these links should prove useful: :)