Monday May 09, 2016

Benefits of the 12c SDO_POINTINPOLYGON Function

Guest Post By: Nick Salem, Distinguished Engineer, Neustar and Technical Chair, Oracle Spatial SIG

The mdsys.SDO_POINTINPOLYGON function API is a new feature that was released in Oracle Database 12c. There is a nice blog post that explains how this feature can be used to address the challenges of ingesting large amounts of spatial data where you can handle the loading and querying of large spatial data sets without the overhead associated with creating and maintaining a spatial index. The example shows how SDO_POINTINTPOLYGON can really benefit massive scale operations, such as those using Exadata environments.

In this post, I would like to cover some other benefits that the SDO_POINTINPOLYGON feature provides that can be very helpful – especially for applications servicing a large number of concurrent spatial operations. This can greatly improve performance for such applications that run on either Exadata or non-Exadata environments. The fact that the SDO_POINTINPOLYGON does not use a spatial index means that you can leverage data stored in an external table or a global temporary table to perform spatial point-in-polygon queries. Global temporary tables are great for multi-session environments because every user session has their own version of the data for the same global temporary table, without any contention or row locking conflicts between sessions. Furthermore in 12c, Oracle introduced some major performance optimizations to global temporary tables that result in substantially lower redo and undo generation. You will need to make sure system parameter temp_undo_enabled is set to TRUE to ensure that the 12c global temporary tables optimization is fully in effect.

Below is a screenshot from Neustar’s ElementOne platform with a map showing a trade area and a set of uploaded customer points.


At Neustar, our clients work with a lot of transient work data as part of a multi-step process for various spatial and analytical use cases. Let’s put together a quick PL/SQL script that you can use to test drive the power of the SDO_POINTINPOLYGON function. Here, I use a simple polygon in the San Diego area and generate a set of random customer points in and around the polygon. Then, I populate the global temporary table. The script is configurable: you can increase or decrease the number of randomly generated customer points, and how far from the polygon centroid you may want to allow points to extend to. Once you have the data populated, you can run the SDO_POINTINPOLYGON queries in serial or in parallel, or change some of the optional MASK parameters. Here’s a screenshot of the test polygon and a sample of randomly generated 1,000 customer points.


1. Create a global temporary table

Ok, so let’s first create a global temporary table =>

create global temporary table TMP_SPATIAL_POINT (
x number,

y number,
id varchar2(512) )

on commit preserve rows;

2. Generate a set of random points and populate the global temporary table

Next, let’s run the following script to populate table TMP_SPATIAL_POINT. The script has two variables: maxDistanceInMeters and numberOfPoints in the PL/SQL declaration section that you can adjust as needed. If you want to generate more points, then you can change the value of numberOfPoints from 1000 to a greater number. In this example, I also have maxDistanceInMeters set to 4000. This will ensure that no customer points get generated further than 4000 meters away from the polygon centroid; this can be increased or decreased as needed. The script goes through a loop up to the numberOfPoints variable and uses the SDO_UTIL.POINT_AT_BEARING function to plot points around the centroid of the polygon using randomly generated values. The goal of the script is to quickly create some test data you can play with. Of course, you can also change the test polygon as well.

declare
polygon sdo_geometry;
centroid sdo_geometry;
newPoint sdo_geometry;
maxDistanceInMeters number := 4000;
numberOfPoints number := 10000;
type tRecs is table of tmp_spatial_point%rowtype;
recs tRecs := tRecs();
begin
polygon := SDO_GEOMETRY(2003, 8307, NULL, SDO_ELEM_INFO_ARRAY(1,1003,1),
SDO_ORDINATE_ARRAY(-117.1044,32.680882,-117.08895,32.661808,-117.06148,32.675102,
-117.06045,32.697641, -117.09753,32.696774, -117.1044,32.680882));
centroid := SDO_GEOM.SDO_CENTROID( polygon, 0.05 );
recs.extend(numberOfPoints);
for i in 1 .. numberOfPoints loop
newPoint := SDO_UTIL.POINT_AT_BEARING( start_point => centroid,
bearing => dbms_random.value(0,6.283),
distance => dbms_random.value(1,maxDistanceInMeters));
recs(i).id := i;
recs(i).x := newPoint.sdo_point.x;
recs(i).y := newPoint.sdo_point.y;
end loop;
execute immediate ‘truncate table tmp_spatial_point’;
forall i in recs.first .. recs.last
insert into tmp_spatial_point values ( recs(i).x, recs(i).y, recs(i).id ) ;
commit;
end;

3. Run SDO_POINTINPOLYGON queries (in serial or parallel)

Ok, now we can start performing queries using the SDO_POINTINPOLYGON function. Here’s a sample query that returns the counts of points that fall inside the polygon. The params parameter is optional; if omitted, a MASK=ANYINTERACT query will be performed.

set timing on
select
count(*)
from
table(
SDO_PointInPolygon( cur => cursor(select * from tmp_spatial_point),
geom_obj => SDO_GEOMETRY(2003, 8307, NULL, SDO_ELEM_INFO_ARRAY(1,1003,1),
SDO_ORDINATE_ARRAY(-117.1044,32.680882, -117.08895,32.661808, -117.06148,32.675102,
-117.06045,32.697641, -117.09753,32.696774, -117.1044,32.680882)),
tol => 0.05,
Params => 'MASK=INSIDE'
)
) t ;

Here’s a another example of the query using parallelism and with the params parameter omitted. 

select /*+ parallel(8) */
count(*)
from
table(
SDO_PointInPolygon( cur => cursor(select * from tmp_spatial_point),
geom_obj => SDO_GEOMETRY(2003, 8307, NULL, SDO_ELEM_INFO_ARRAY(1,1003,1),
SDO_ORDINATE_ARRAY(-117.1044,32.680882, -117.08895,32.661808, -117.06148,32.675102,
-117.06045,32.697641, -117.09753,32.696774, -117.1044,32.680882)),
tol => 0.05
)
) t ;

The SDO_POINTINPOLYGON function has been built to leverage Oracle’s parallel processing capability.  To demonstrate the magnitude of performance gain when utilizing parallelism, I modified the point generation script in part 2 to populate a million points with a max distance of 5,000 meters from the center point.  I then tested the SDO_POINTINPOLYGON query with no parallelism, and then with parallelism of 2, 4 and 8.  Here are the elapsed response times:

Level of parallelism

Elapsed time

None

13.28 secs

2

9.62 secs

4

6.03 secs

8

3.43 secs 

Utilizing parallelism can greatly shorten query processing times.  You can use these scripts in your environment to generate different numbers of points, test various levels of parallelism, and compare the response times. 

Tuesday Apr 01, 2014

Upcoming Webinars: MapViewer at City of Toronto for Public Safety, Customers Achieve 300x Performance Gains with Oracle Spatial and Graph

A note to share information about two upcoming Directions Media webinars on April 23 and May 6.

City of Toronto Enhances Public Safety Using Real-time Big Data and Map Rendering with Oracle and AGSI, Wed., April 23, 2:00PM US EDT

Learn how the City of Toronto Police Services can search, review and map social media traffic in real time to quickly identify and respond to incidents, improving public safety. See live demos of their system using Oracle MapViewer's HTML5 capabilities, Oracle Spatial, and a social media mapping platform from partner AGSI. Carol Palmer of Oracle will co-present this webinar with Mike Jander of AGSI, and City of Toronto, hosted by Directions Media.

Learn more and register for this free webinar -
http://www.directionsmag.com/webinars/register/city-of-toronto-enhances-public-safety-using-real-time-big-data-and-ma/389356?DM_webinars_section&utm_medium=web&utm_campaign=389356


Learn How Customers Are Experiencing 300x Performance Gains with Oracle Spatial and Graph, Tues., May 6, 2:00PM US EDT (Free Webinar)

Nick Salem of Neustar and Steve Pierce of Think Huddle will share their realized performance benchmarks using Oracle Spatial and Graph. With Oracle Spatial and Graph in Database 12c, customers can address the largest geospatial workloads and experience performance increases of 50 to 300 times for vector operations, with minimal configuration changes. Jim Steiner of Oracle will also discuss performance gains from parallel raster processing and Exadata.

Learn more and register for this free webinar -
http://www.directionsmag.com/webinars/register/learn-how-customers-are-experiencing-300x-performance-gains-with-oracl/390239?DM_webinars_section&utm_medium=web&utm_campaign=390239

Monday Feb 24, 2014

Performance Boost with Aggregating Geometries - with 12c Spatial Vector Performance Acceleration

Nick Salem of Neustar recently shared some impressive performance gains realized using Oracle Spatial 12c Vector Performance Acceleration, in a Google+ post.   He observed performance gains of 40X to 300X -- with a use case aggregating all ZIP Code geographies in California using a SDO_AGGR_UNION operations (plain and with group by/mod functions as described in 11g documented best practices). 

Read more details of his test case and results here.  https://plus.google.com/114373574274269737617/posts/2caAypKxwff

Thanks for sharing, Nick!

Friday Jan 10, 2014

New Point-in-Polygon function in Oracle Spatial and Graph 12c

By: Jim Steiner, Siva Ravada, Rick Anderson

With the increased adoption of Exadata for spatial workloads, we have been looking at ways to exploit more and more of the capabilities of this architecture to address problems faced in large scale spatial analysis. The new point-in-polygon function in Spatial can result in 100s of times faster UPDATE and INSERT operations with no degradation in query performance for large scale point-in-polygon operations. Mask operations (DISJOINT, TOUCH, INSIDE, ANYINTERACT) can be performed with the point-in-polygon function.

When working with point data and performing point-in-polygon analysis the existing spatial operators to do a fast query on the data if there is a Spatial index on the point data. However, in many cases, the data volume is very high, so creating and maintaining the index becomes very expensive. With 12c, we exploit Exadata smartscan by implementing a different model to take advantage of all the CPUs to do point in polygon operations and not have the overhead of a Spatial index. The mdsys.PointInPolygon() function returns those rows that reside within a specified polygon geometry. This parallel-enabled Point-In-Polygon function takes an arbitrary set of rows whose first column is a point's x-coordinate value and the second column is a point's y-coordinate value.

The mdsys.PointInPolygon() function API is the following:

mdsys.sdo_PointInPolygon(cur SYS_REFCURSOR,
                         geom_obj IN SDO_GEOMETRY,
                         tol IN NUMBER,
                         params IN VARCHAR2 DEFAULT NULL);

The "cur" parameter is used to select an "x" and "y" point coordinate from a  user table. The two columns must be of type NUMBER; this is NOT a geometry  parameter.

The "geom_obj" parameter is either a polygon geometry from a table, or a transient instance of a polygon geometry, against which all of the selected points from "cur" will be validated.

The "tol" parameter is the desired tolerance value, which must be greater than the value "0.0".



The following examples show the performance benefits of this new approach:

Here we select all rows from the "weather_sensor" table and query those rows against a transient polygon geometry instance. Only 1 weather_sensor row (out of 4) resides within the specified polygon. 


SQL> SELECT *

2 FROM TABLE(mdsys.sdo_PointInPolygon(

3 CURSOR(select * from weather_sensor),

4 MDSYS.SDO_GEOMETRY(

5 2003,

6 NULL,

7 NULL,

8 MDSYS.SDO_ELEM_INFO_ARRAY(1, 1003, 1),

9 MDSYS.SDO_ORDINATE_ARRAY(5, 1, 8, 1, 8, 6, 5, 7, 5, 1)),

10 0.05));

In order to utilize parallel query servers, you must either specify the

"/*+ PARALLEL(4) */" optimizer hint, or enable parallel query execution,

using the command:

alter session force parallel query;

Below is the same as above, but uses 4 parallel query servers:

SQL> SELECT /*+ PARALLEL(4) */ *

2 FROM TABLE(mdsys.sdo_PointInPolygon(

3 CURSOR(select * from weather_sensor),

4 MDSYS.SDO_GEOMETRY(

5 2003,

6 NULL,

7 NULL,

8 MDSYS.SDO_ELEM_INFO_ARRAY(1, 1003, 1),

9 MDSYS.SDO_ORDINATE_ARRAY(5, 1, 8, 1, 8, 6, 5, 7, 5, 1)),

10 0.05));


There can be a huge performance benefit to using parallel query servers.

The following "worst-case" example queries 1 million rows against a transient polygon geometry instance, using the non-parallel query execution:

SQL> -- instead of the actual data...

SQL>

SQL> -- Test "non-parallel" execution first

SQL> timing start "sdo_PointInPolygon()"

SQL> SELECT COUNT(*)

2 FROM TABLE(mdsys.sdo_PointInPolygon(

3 CURSOR(select * from pip_data),

4 MDSYS.SDO_GEOMETRY(

5 2003,

6 NULL,

7 NULL,

8 MDSYS.SDO_ELEM_INFO_ARRAY(1, 1003, 1),

9 MDSYS.SDO_ORDINATE_ARRAY(5, 1, 8, 1, 8, 6, 5, 7, 5, 1)),

10 0.05));

timing for: sdo_PointInPolygon()

Elapsed: 00:05:00.73

Enabling the parallel query servers dramatically reduces the query execution time:

SQL> -- Now test using 4 parallel query servers

SQL> timing start "sdo_PointInPolygon()"

SQL> SELECT /*+ PARALLEL(4) */ COUNT(*)

2 FROM TABLE(mdsys.sdo_PointInPolygon(

3 CURSOR(select * from pip_data),

4 MDSYS.SDO_GEOMETRY(

5 2003,

6 NULL,

7 NULL,

8 MDSYS.SDO_ELEM_INFO_ARRAY(1, 1003, 1),

9 MDSYS.SDO_ORDINATE_ARRAY(5, 1, 8, 1, 8, 6, 5, 7, 5, 1)),

10 0.05));

SQL> timing stop

timing for: sdo_PointInPolygon()

Elapsed: 00:02:18.18

For more information about this new feature, link to the documentation URL:
SDO_PointInPolygon


About

The official blog for the spatial features of Oracle Spatial and Graph, an option of Oracle Database - brought to you by the product managers and developers. Get technical tips, product information, and the latest news here. Visit our official product website at http://www.oracle.com/technetwork/database/options/spatialandgraph/overview/spatialfeatures-1902020.html

Search

Categories
Archives
« June 2016
SunMonTueWedThuFriSat
   
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
  
       
Today