X

Oracle Spatial and Graph – technical tips, best practices, and news from the product team

Validating and Rectifying Spatial Data in the Oracle Database

Hans Viehmann, and Albert Godfrind

It always amazes me, how many of the complaints about unexpected results or errors in spatial queries we hear about can be traced back to invalid geometries. In our documentation, should you ever bother to read it, we are actually very clear about this:

“You should validate all geometry data, and fix any validation errors, before performing any spatial operations on the data. The recommended procedure for loading and validating spatial data is as follows:

  1. Load the data, using a method described in Bulk Loading or Transactional Insert Operations Using SQL.
  2. Use the SDO_GEOM.VALIDATE_GEOMETRY_WITH_CONTEXT function or the SDO_GEOM.VALIDATE_LAYER_WITH_CONTEXT procedure on all spatial data loaded into the database.
  3. For any geometries with the wrong orientation or an invalid ETYPE or GTYPE value, use SDO_MIGRATE.TO_CURRENT on these invalid geometries to fix them.
  4. For any geometries that are invalid for other reasons, use SDO_UTIL.RECTIFY_GEOMETRY to fix these geometries.“

The need to eliminate invalid geometries is probably immediately obvious. If you have, say, a self-crossing polygon, how would you know what is inside and what is outside? What makes this trickier is that across the industry, tools and solutions have varying levels of strictness when it comes to tolerating errors in spatial data. This means that an incorrect geometry may have been loaded into a GIS tool without problems, but once it is saved to the database, it could raise errors when it is used in a subsequent query. Moreover, the Oracle Database has become less forgiving from one release to the next in this regard, so that an invalid geometry may have gone unnoticed in 11gR2 while in 19c it would cause issues.

So, how can errors in SDO_GEOMETRY objects be identified and subsequently be fixed? The simplest approach for a table of geometries is to use the procedure below, which is based on the SDO_UTIL.RECTIFY_GEOMETRY() function. This function tries hard to correct the most common errors such as duplicate points, polygon orientation errors, polygon construction errors, and so on. However, any error may in turn hide other errors, which the function is not designed to correct. Here is what happens behind the scenes:

  1. The function validates the geometry
  2. If correct, it returns it unchanged
  3. If it detects one of the known errors, it tries to correct it.
  4. If it detects any uncorrectable error, it fails with an exception
  5. It repeats the process until there is no more error or it finds an uncorrectable error.

Therefore, there are two possible outcomes: Either a corrected, valid geometry; or an exception. This means you need to catch the exception to detect which of the shapes are unrecoverable. The entire procedure could look like this:


declare
  -- Declare a custom exception for uncorrectable geometries
  -- "ORA-13199: the given geometry cannot be rectified"
  cannot_rectify exception;
  pragma exception_init(cannot_rectify, -13199);
  v_geometry_fixed sdo_geometry;
begin
  -- Process the invalid geometries
  for e in (
    select rowid, shape
    from my_geom_table
    where sdo_geom.validate_geometry_with_context(shape, 0.05) != 'TRUE'
  )
  loop
    -- Try and rectify the geometry. Throws an exception if it cannot be corrected
    begin
      v_geometry_fixed := sdo_util.rectify_geometry (e.shape, 0.05);
    exception
      when cannot_rectify then
        v_geometry_fixed := null;
    end;
    if v_geometry_fixed is not null then
      -- Update the base table with the rectified geometry
      update my_geom_table
      set shape = v_geometry_fixed 
      where rowid = e.rowid;
      dbms_output.put_line('Successfully corrected geometry rowid='||e.rowid);
    else
      dbms_output.put_line('*** Unable to correct geometry rowid='||e.rowid);
    end if;
    commit;

  end loop;
end;
/

Any shapes that are left uncorrected at this point need further examination, of course.

A word about tolerance is necessary here. In the above example, we specify 0.05 as tolerance, i.e. 5cm. You should really use the actual tolerance of your data, which you will find in the metadata of your table (USER_SDO_GEOM_METADATA). Using the correct tolerance is important for both validation and rectification, and must be the same for both. Errors may exist at certain tolerance levels but not at others. For example, the above tolerance (5cm) means that two adjacent points, which are 3cm apart, will be diagnosed as “duplicate points”. Validating at a finer tolerance (say 1cm) will not report any such error.

Join the discussion

Comments ( 1 )
  • Tracy McLane Thursday, April 30, 2020
    Hans, you make some excellent points on validation and rectification. I use these functions in my current database automation validation of the USDA Natural Resources Inventory (NRI). In addition to those you mentioned, the functions SDO_UTIL.SIMPLIFY and SDO_UTIL.REMOVE_DUPLICATE_VERTICES, when used with data and site-specific tolerances, can help eliminate excessive vertices to improve data performance.

    In addition to clean topology, spatial data partitioning and the enabling of parallel processes are key factors for optimizing processing speed of large spatial datasets.
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.