Tuesday Apr 01, 2014

Tips for using the Oracle Spatial and Graph geocoding engine to convert addresses directly in the Oracle database

By Karin Patenge, Principal Sales Consultant, Oracle Germany

Many OLTP systems and data warehouse implementations would benefit from having an address cleansing process included which at the same time associates geographic coordinates with each address for spatial analytics. Together with reference datasets from HERE/Nokia or TomTom the Oracle Geocoder can provide this functionality out-of-the-box.

The Geocoding engine in Oracle Spatial and Graph can be accessed using PL/SQL, via the SDO_GCDR package, which belongs to the user MDSYS. In addition to the the actual conversion of an address line into a point geometry (SDO_GEOMETRY with the geometry type 2001) using GEOCODE_AS_GEOMETRY, this package includes the GEOCODE function:

select SDO_GCDR.GEOCODE(user, SDO_KEYWORDARRAY('Schiffbauergasse 14', '14467 Potsdam'),'DE', 'DEFAULT') from dual;

This function provides the second important capability of the geocoder (in addition to the calculation of the geo-coordinates), namely to perform address verification. The return value is an object of type SDO_GEO_ADDR. Based on the address match against the reference dataset, this function returns information such as the quality of the match as well as the completed or corrected address from the reference dataset.

The command describe SDO_GEO_ADDR displays the structure of the object. The result for the query above can be found below, but without looking in the Oracle Spatial Developer's Guide only some parts are meaningful.

MDSYS.SDO_GEO_ADDR(MDSYS.SDO_KEYWORDARRAY(),
 NULL,NULL,NULL,NULL,NULL,
 'Potsdam','BRANDENBURG','DE','14467',
 NULL,NULL,NULL,NULL,NULL,NULL,'F','F',
 NULL,NULL,'L',0,64959047,'??????????B281CP?',4,'DEFAULT',
 13.04793,52.39935,'???11111110??400?')

To understand the output more easily, here is a PL/SQL script that extracts the array values and interprets the ERRORMESSAGE, MATCHVECTOR and MATCHMODE strings.

create or replace procedure format_geo_addr (
  address SDO_GEO_ADDR
)
AS
  type strings is table of varchar2(30);
  match_names strings := strings (
    '?            ',
    '0 (MATCHED)  ',
    '1 (ABSENT)   ',
    '2 (CORRECTED)',
    '3 (IGNORED)  ',
    '4 (SUPPLIED) '
  );
  address_elements strings := strings (
    null,
    null,
    'X Address Point',
    'O POI Name',
    '# House or building number',
    'E Street prefix',
    'N Street base name',
    'U Street suffix',
    'T Street type',
    'S Secondary unit',
    'B Built-up area or city',
    null,
    null,
    '1 Region',
    'C Country',
    'P Postal code',
    'P Postal add-on code'
  );
  element_match varchar2(128);
  element_match_code char(1);

BEGIN
  if address is not null then
    dbms_output.put_line ('- ID                  ' || address.ID);
    dbms_output.put_line ('- ADDRESSLINES');
    if address.addresslines is not null then
      for i in 1..address.addresslines.count() loop
        dbms_output.put_line ('- ADDRESSLINES['||i||']           ' || address.ADDRESSLINES(i));
      end loop;
    end if;
    dbms_output.put_line ('- PLACENAME           ' || address.PLACENAME);
    dbms_output.put_line ('- STREETNAME          ' || address.STREETNAME);
    dbms_output.put_line ('- INTERSECTSTREET     ' || address.INTERSECTSTREET);
    dbms_output.put_line ('- SECUNIT             ' || address.SECUNIT);
    dbms_output.put_line ('- SETTLEMENT          ' || address.SETTLEMENT);
    dbms_output.put_line ('- MUNICIPALITY        ' || address.MUNICIPALITY);
    dbms_output.put_line ('- REGION              ' || address.REGION);
    dbms_output.put_line ('- COUNTRY             ' || address.COUNTRY);
    dbms_output.put_line ('- POSTALCODE          ' || address.POSTALCODE);
    dbms_output.put_line ('- POSTALADDONCODE     ' || address.POSTALADDONCODE);
    dbms_output.put_line ('- FULLPOSTALCODE      ' || address.FULLPOSTALCODE);
    dbms_output.put_line ('- POBOX               ' || address.POBOX);
    dbms_output.put_line ('- HOUSENUMBER         ' || address.HOUSENUMBER);
    dbms_output.put_line ('- BASENAME            ' || address.BASENAME);
    dbms_output.put_line ('- STREETTYPE          ' || address.STREETTYPE);
    dbms_output.put_line ('- STREETTYPEBEFORE    ' || address.STREETTYPEBEFORE);
    dbms_output.put_line ('- STREETTYPEATTACHED  ' || address.STREETTYPEATTACHED);
    dbms_output.put_line ('- STREETPREFIX        ' || address.STREETPREFIX);
    dbms_output.put_line ('- STREETSUFFIX        ' || address.STREETSUFFIX);
    dbms_output.put_line ('- SIDE                ' || address.SIDE);
    dbms_output.put_line ('- PERCENT             ' || address.PERCENT);
    dbms_output.put_line ('- EDGEID              ' || address.EDGEID);
    dbms_output.put_line ('- ERRORMESSAGE        ' || address.ERRORMESSAGE);
    if address.MATCHVECTOR is not null then
      dbms_output.put_line ('- MATCHVECTOR         ' || address.MATCHVECTOR);
      for i in 1..length(address.MATCHVECTOR) loop
        if address_elements(i) is not null then
          if substr (address.matchvector,i,1) = '?' then
            element_match_code := 0;
          else
            element_match_code := substr(address.matchvector,i,1) + 1;
          end if;
          dbms_output.put_line ('-   '|| substr(address.errormessage,i,1)  || ' ' ||
            match_names (element_match_code + 1) || ' ' ||
            address_elements (i)
          );
        end if;
      end loop;
    end if;
    if address.MATCHVECTOR is not null then
      dbms_output.put_line ('- MATCHCODE           ' || address.MATCHCODE || ' = ' ||
        case address.MATCHCODE
          when  0 then 'Ambiguous'
          when  1 then 'Exact match'
          when  2 then 'Street type not matched'
          when  3 then 'House number not matched'
          when  4 then 'Street name not matched'
          when 10 then 'Postal code not matched'
          when 11 then 'City not matched'
        end
      );
    end if;
    dbms_output.put_line ('- MATCHMODE           ' || address.MATCHMODE);
    dbms_output.put_line ('- LONGITUDE           ' || address.LONGITUDE);
    dbms_output.put_line ('- LATITUDE            ' || address.LATITUDE);
  else
    dbms_output.put_line ('**** NO MATCH ****');
  end if;
end;
/
show errors

create or replace procedure format_addr_array (
  address_list SDO_ADDR_ARRAY
)

as
begin
  if address_list is not null and address_list.count() > 0 then
    for i in 1..address_list.count() loop
      dbms_output.put_line ('ADDRESS['||i||']');
      format_geo_addr (address_list(i));
    end loop;
  else
    dbms_output.put_line ('**** NO MATCH ****');
  end if;
end;
/
show errors

Now the GEOCODE function can be called again, but this time in conjunction with the format function:

exec format_geo_addr (SDO_GCDR.GEOCODE(user, SDO_KEYWORDARRAY('Schiffbauergasse 14', '14467 Potsdam'), 'DE', 'DEFAULT'));

And the resulting output is now formatted in a manner that is much easier to understand.


- ID                  0
- ADDRESSLINES
- PLACENAME          
- STREETNAME          Schiffbauergasse
- INTERSECTSTREET    
- SECUNIT            
- SETTLEMENT         
- MUNICIPALITY        Potsdam
- REGION              BRANDENBURG
- COUNTRY             DE
- POSTALCODE          14467
- POSTALADDONCODE    
- FULLPOSTALCODE     
- POBOX              
- HOUSENUMBER         14
- BASENAME            SCHIFFBAUER
- STREETTYPE          GASSE
- STREETTYPEBEFORE    F
- STREETTYPEATTACHED  F
- STREETPREFIX       
- STREETSUFFIX       
- SIDE                R
- PERCENT             .75
- EDGEID              748777426
- ERRORMESSAGE        ????#ENUT?B281CP?
- MATCHVECTOR         ???10101010??400?
-   ? ?             X Address Point
-   ? 1 (ABSENT)    O POI Name
-   # 0 (MATCHED)   # House or building number
-   E 1 (ABSENT)    E Street prefix
-   N 0 (MATCHED)   N Street base name
-   U 1 (ABSENT)    U Street suffix
-   T 0 (MATCHED)   T Street type
-   ? 1 (ABSENT)    S Secondary unit
-   B 0 (MATCHED)   B Built-up area or city
-   1 4 (SUPPLIED)  1 Region
-   C 0 (MATCHED)   C Country
-   P 0 (MATCHED)   P Postal code
-   ? ?             P Postal add-on code
- MATCHCODE           1 = Exact match
- MATCHMODE           DEFAULT
- LONGITUDE           13.0745378367008
- LATITUDE            52.4041174822031

Note: The reference dataset used for this test with the Oracle Spatial and Graph geocoder was provided by HERE/Nokia.

Upcoming Webinars: MapViewer at City of Toronto for Public Safety, Customers Achieve 300x Performance Gains with Oracle Spatial and Graph

A note to share information about two upcoming Directions Media webinars on April 23 and May 6.

City of Toronto Enhances Public Safety Using Real-time Big Data and Map Rendering with Oracle and AGSI, Wed., April 23, 2:00PM US EDT

Learn how the City of Toronto Police Services can search, review and map social media traffic in real time to quickly identify and respond to incidents, improving public safety. See live demos of their system using Oracle MapViewer's HTML5 capabilities, Oracle Spatial, and a social media mapping platform from partner AGSI. Carol Palmer of Oracle will co-present this webinar with Mike Jander of AGSI, and City of Toronto, hosted by Directions Media.

Learn more and register for this free webinar -
http://www.directionsmag.com/webinars/register/city-of-toronto-enhances-public-safety-using-real-time-big-data-and-ma/389356?DM_webinars_section&utm_medium=web&utm_campaign=389356


Learn How Customers Are Experiencing 300x Performance Gains with Oracle Spatial and Graph, Tues., May 6, 2:00PM US EDT (Free Webinar)

Nick Salem of Neustar and Steve Pierce of Think Huddle will share their realized performance benchmarks using Oracle Spatial and Graph. With Oracle Spatial and Graph in Database 12c, customers can address the largest geospatial workloads and experience performance increases of 50 to 300 times for vector operations, with minimal configuration changes. Jim Steiner of Oracle will also discuss performance gains from parallel raster processing and Exadata.

Learn more and register for this free webinar -
http://www.directionsmag.com/webinars/register/learn-how-customers-are-experiencing-300x-performance-gains-with-oracl/390239?DM_webinars_section&utm_medium=web&utm_campaign=390239

Monday Feb 24, 2014

Performance Boost with Aggregating Geometries - with 12c Spatial Vector Performance Acceleration

Nick Salem of Neustar recently shared some impressive performance gains realized using Oracle Spatial 12c Vector Performance Acceleration, in a Google+ post.   He observed performance gains of 40X to 300X -- with a use case aggregating all ZIP Code geographies in California using a SDO_AGGR_UNION operations (plain and with group by/mod functions as described in 11g documented best practices). 

Read more details of his test case and results here.  https://plus.google.com/114373574274269737617/posts/2caAypKxwff

Thanks for sharing, Nick!

Monday Feb 17, 2014

Oracle Spatial and Graph is now compliant with OGC WFS 1.1.0 and WFS-T 1.1.0

An update from Baris Kazar of the Spatial development team: 

We are happy to announce that Oracle Spatial and Graph for 12c Release 1 is now fully compliant with OGC's WFS 1.1.0 and WFS-T 1.1.0 standards.  

http://www.opengeospatial.org/resource/products/details/?pid=1197

For more information on Oracle Spatial and Graph standards compliance

http://www.opengeospatial.org/resource/products/compliant#Oracle_USA - full list of OGC standards compliance for  all Oracle Spatial product versions

http://www.oracle.com/technetwork/database/options/spatialandgraph/documentation/spatial-stds-support-094078.html -  Oracle Spatial Technologies Standards Page


Wednesday Feb 05, 2014

Data Validation with Esri and Oracle Spatial

By:  Daniel Geringer, Senior Development Manager, Oracle Spatial and Graph

One of the things users can run into when using Esri with Oracle Spatial can be problems with data validation when loading data using Esri utilities into Oracle Spatial data types. From my experience, the solution is a relatively straightforward change in the workflow used to load data.

Here is an excerpt from a user problem where it sounds like Esri is calling Oracle's validation on load (not Esri's validation).

I don't believe Esri's validation would be sensitive to the database version.

  • It seems there are tighter tolerances (or the tolerances work) in 11.2.0.3 and features that were not reporting as invalid in 11.2.0.2 are now reporting as invalid features (mainly self-intersecting polygons) in the 11.2.0.3.  We are assuming that for some reason the SDE tools are crashing when they hit an invalid feature.

This sounds more like a workflow defined by Esri than differences in validation.  I think the issue is once Esri encounters an invalid geometry, it stops loading the rest (even if they are valid).  I believe Esri chooses this workflow to ensure all geometries in a layer are valid.  I agree the outcome is optimal, and what Oracle recommends too.  A deficiency in the workflow is:

1. It prevents valid geometries from being loaded after an invalid one is encountered
2. Not loading the invalid geometry does not provide a way to inspect it

From a product perspective, a change in the Esri workflow would have to be implemented by Esri.  (Maybe the default behavior can remain the same, but enable a switch that populates invalid geometries into a different table for inspection at a later time. This is just one suggestion.  There may be other options.)

Currently, the best way to circumvent this is to do what this customer did, and load data with Oracle or third party tools like FME.  Once the data is loaded, Oracle also recommends validation, and removing/isolating invalid geometries until they can be fixed.  I discuss a strategy to do this in a best practices presentation. 

One thing I keep hearing is the claim that Esri and Oracle validate differently.

  • Both ESRI and Oracle validate by OGC rules. 
  • The only difference I see is when the data is geodetic.  Esri does not consider geodesic paths between vertices for linestrings or polygons, but Oracle does. 
  • There is only one case I have come across that is valid in Esri, but not in Oracle or by OGC standards. 
    • Imagine a box with a triangle void, where the triangle only touches the box at one point. 
    • Esri can model this as a single exterior ring with no voids, and considers it valid.  This does not comply with OGC standards.
    • Oracle and the OGC standard state that this case is valid if it is modeled with two rings, one exterior and one interior, where the interior ring touches the exterior at one point.
  • Other than geodetic data validation, I believe both Esri and Oracle offer comprehensive validation, and for all intents and purposes, the validation offered by both solutions is equivalent.

Friday Jan 10, 2014

New Point-in-Polygon function in Oracle Spatial and Graph 12c

By: Jim Steiner, Siva Ravada, Rick Anderson

With the increased adoption of Exadata for spatial workloads, we have been looking at ways to exploit more and more of the capabilities of this architecture to address problems faced in large scale spatial analysis. The new point-in-polygon function in Spatial can result in 100s of times faster UPDATE and INSERT operations with no degradation in query performance for large scale point-in-polygon operations. Mask operations (DISJOINT, TOUCH, INSIDE, ANYINTERACT) can be performed with the point-in-polygon function.

When working with point data and performing point-in-polygon analysis the existing spatial operators to do a fast query on the data if there is a Spatial index on the point data. However, in many cases, the data volume is very high, so creating and maintaining the index becomes very expensive. With 12c, we exploit Exadata smartscan by implementing a different model to take advantage of all the CPUs to do point in polygon operations and not have the overhead of a Spatial index. The mdsys.PointInPolygon() function returns those rows that reside within a specified polygon geometry. This parallel-enabled Point-In-Polygon function takes an arbitrary set of rows whose first column is a point's x-coordinate value and the second column is a point's y-coordinate value.

The mdsys.PointInPolygon() function API is the following:

mdsys.sdo_PointInPolygon(cur SYS_REFCURSOR,
                         geom_obj IN SDO_GEOMETRY,
                         tol IN NUMBER,
                         params IN VARCHAR2 DEFAULT NULL);

The "cur" parameter is used to select an "x" and "y" point coordinate from a

user table. The two columns must be of type NUMBER; this is NOT a geometry

parameter.

The "geom_obj" parameter is either a polygon geometry from a table, or a

Transient instance of a polygon geometry, against which all of the selected points from "cur" will be validated.

The "tol" parameter is the desired tolerance value, which must be greater than the value "0.0".



The following examples show the performance benefits of this new approach:

Here we select all rows from the "weather_sensor" table and query those rows against a transient polygon geometry instance. Only 1 weather_sensor row (out of 4) resides within the specified polygon.

SQL> SELECT *

2 FROM TABLE(mdsys.sdo_PointInPolygon(

3 CURSOR(select * from weather_sensor),

4 MDSYS.SDO_GEOMETRY(

5 2003,

6 NULL,

7 NULL,

8 MDSYS.SDO_ELEM_INFO_ARRAY(1, 1003, 1),

9 MDSYS.SDO_ORDINATE_ARRAY(5, 1, 8, 1, 8, 6, 5, 7, 5, 1)),

10 0.05));

In order to utilize parallel query servers, you must either specify the

"/*+ PARALLEL(4) */" optimizer hint, or enable parallel query execution,

using the command:

alter session force parallel query;

Below is the same as above, but uses 4 parallel query servers:

SQL> SELECT /*+ PARALLEL(4) */ *

2 FROM TABLE(mdsys.sdo_PointInPolygon(

3 CURSOR(select * from weather_sensor),

4 MDSYS.SDO_GEOMETRY(

5 2003,

6 NULL,

7 NULL,

8 MDSYS.SDO_ELEM_INFO_ARRAY(1, 1003, 1),

9 MDSYS.SDO_ORDINATE_ARRAY(5, 1, 8, 1, 8, 6, 5, 7, 5, 1)),

10 0.05));

There can be a huge performance benefit to using parallel query servers.

The following "worst-case" example queries 1-million rows against a transient

polygon geometry instance, using the non-parallel query execution:

SQL> -- instead of the actual data...

SQL>

SQL> -- Test "non-parallel" execution first

SQL> timing start "sdo_PointInPolygon()"

SQL> SELECT COUNT(*)

2 FROM TABLE(mdsys.sdo_PointInPolygon(

3 CURSOR(select * from pip_data),

4 MDSYS.SDO_GEOMETRY(

5 2003,

6 NULL,

7 NULL,

8 MDSYS.SDO_ELEM_INFO_ARRAY(1, 1003, 1),

9 MDSYS.SDO_ORDINATE_ARRAY(5, 1, 8, 1, 8, 6, 5, 7, 5, 1)),

10 0.05));

timing for: sdo_PointInPolygon()

Elapsed: 00:05:00.73

Enabling the parallel query servers dramatically reduces the query execution time:

SQL> -- Now test using 4 parallel query servers

SQL> timing start "sdo_PointInPolygon()"

SQL> SELECT /*+ PARALLEL(4) */ COUNT(*)

2 FROM TABLE(mdsys.sdo_PointInPolygon(

3 CURSOR(select * from pip_data),

4 MDSYS.SDO_GEOMETRY(

5 2003,

6 NULL,

7 NULL,

8 MDSYS.SDO_ELEM_INFO_ARRAY(1, 1003, 1),

9 MDSYS.SDO_ORDINATE_ARRAY(5, 1, 8, 1, 8, 6, 5, 7, 5, 1)),

10 0.05));

SQL> timing stop

timing for: sdo_PointInPolygon()

Elapsed: 00:02:18.18

For more information about this new feature, link to the documentation URL:
SDO_PointInPolygon


Thursday Dec 19, 2013

Welcome!

Welcome to the official blog for the spatial features of Oracle Spatial and Graph, an option of Oracle Database!  Our development and product managers will be posting the latest product updates, and relevant technical tips here.

Feel free to let us know in the comments of any questions, feedback, or suggested topics.

We look forward to sharing info and interacting with our great user community on this blog!

About

The official blog for the spatial features of Oracle Spatial and Graph, an option of Oracle Database - brought to you by the product managers and developers. Get technical tips, product information, and the latest news here. Visit our official product website at http://www.oracle.com/technetwork/database/options/spatialandgraph/overview/spatialfeatures-1902020.html

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today