It always amazes me, how many of the complaints about unexpected results or errors in spatial queries we hear about can be traced back to invalid geometries. In our documentation, should you ever bother to read it, we are actually very clear about this. 

“You should validate all geometry data, and fix any validation errors, before performing any spatial operations on the data. The recommended procedure for loading and validating spatial data is as follows:

  1. Load the data, using a method described in Bulk Loading or Transactional Insert Operations Using SQL.
  2. Use the SDO_GEOM.VALIDATE_GEOMETRY_WITH_CONTEXT function or the SDO_GEOM.VALIDATE_LAYER_WITH_CONTEXT procedure on all spatial data loaded into the database.
  3. For any geometries with the wrong orientation or an invalid ETYPE or GTYPE value, use SDO_MIGRATE.TO_CURRENT on these invalid geometries to fix them. 
  4. For any geometries that are invalid for other reasons, use SDO_UTIL.RECTIFY_GEOMETRY to fix these geometries.”

The need to eliminate invalid geometries is probably immediately obvious. If you have, say, a self-crossing polygon, how would you know what is inside and what is outside? What makes this trickier is that across the industry, tools, and solutions have varying levels of strictness when it comes to tolerating errors in spatial data. This means that an incorrect geometry may have been loaded into a GIS tool without problems, but once it is saved to the database, it could raise errors when it is used in a subsequent query. Moreover, the Oracle Database has become less forgiving from one release to the next in this regard, so that an invalid geometry may have gone unnoticed in 12gR2 while in 19c and 23ai it would cause issues. 

So, how can errors in SDO_GEOMETRY objects be identified and subsequently be fixed? The simplest approach for a table of geometries is to use the procedure below, which is based on the SDO_UTIL.RECTIFY_GEOMETRY() function. This function tries hard to correct the most common errors, such as duplicate points, polygon orientation errors, polygon construction errors, and so on. However, any error may, in turn, hide other errors, which the function is not designed to correct. Here is what happens behind the scenes:

  1. The function validates the geometry
  2. If correct, it returns it unchanged
  3. If it detects one of the known errors, it tries to correct it
  4. If it detects any uncorrectable error, it fails with an exception
  5. It repeats the process until there are no more errors or it finds an uncorrectable error

Therefore, there are two possible outcomes: Either a corrected, valid geometry; or an exception. This means you need to catch the exception to detect which of the shapes are unrecoverable. The entire procedure could look like this. 

declare
-- Declare a custom exception for uncorrectable geometries
-- "ORA-13199: the given geometry cannot be rectified"
cannot_rectify exception;
pragma exception_init(cannot_rectify, -13199);
v_geometry_fixed sdo_geometry;
begin
-- Process the invalid geometries
for e in (
select rowid, shape
from my_geom_table
where sdo_geom.validate_geometry_with_context(shape, 0.05) )
loop
-- Try and rectify the geometry.
-- Throws an exception if it cannot be corrected
begin
v_geometry_fixed := sdo_util.rectify_geometry (e.shape, 0.05);
exception
when cannot_rectify then
v_geometry_fixed := null;
end;
if v_geometry_fixed is not null then
-- Update the base table with the rectified geometry
update my_geom_table
set shape = v_geometry_fixed
where rowid = e.rowid;
dbms_output.put_line('Successfully corrected geometry rowid='||else
dbms_output.put_line('*** Unable to correct geometry rowid='||end if;
commit;
end loop;
end;
/

Any shapes left uncorrected at this point need further examination, of course.

A word about tolerance is necessary here. In the above example, we specify 0.05 tolerance, i.e., 5cm. You should really use the actual tolerance of your data, which you will find in the metadata of your table (USER_SDO_GEOM_METADATA). Using the correct tolerance is essential for both validation and rectification and must be the same for both. Errors may exist at certain tolerance levels but not at others. For example, the above tolerance (5cm) means that two adjacent points, which are 3cm apart, will be diagnosed as “duplicate points.” Validating at a finer tolerance (say 1cm) will not report any such error. 

In earlier releases of Oracle Spatial, the tolerance for geodetic data (such as data identified by latitude and longitude) had to be 0.05 or greater. We now support smaller tolerance values as well, so if you observe validation errors in geodetic data, reducing the tolerance to 0.005, i.e. 5mm, is worth a try. In any case, the tolerance should be the same for all spatial layers you want to compare.

Pro tip: If you need to validate large volumes of data on a multi-core system, you can use parallelism very effectively. For example, you can create a table of invalid geometries with 16 threads performing the validation in parallel with the following SQL statement:

CREATE TABLE validation_results PARALLEL 16 NOLOGGING
SELECT sdo_rowid, status
FROM (SELECT rowid sdo_rowid,
             sdo_geom.validate_geometry_with_context(
        FROM my_geom_table)
WHERE status <> 'TRUE';

Want to learn more? Please use these links for our documentationdownload informationLiveLab workshops, and more information