空間軌跡 (トラジェクトリ) データは、交通、環境、スポーツ分析、さらには人の移動研究などの多岐にわたる領域で、ますます重要になっています。複数の軌跡を比較し、その類似度を測定することにより、共通する移動パターンの特定、移動の異常検知、また履歴データを元にした最適ルートの推定などが可能になります。実際の運用では、ノイズやわずかな誤差を含むGPSデータから得られた軌跡を、既知の道路や経路にマッチングするようなデータクレンジングがよく行われます。

この記事では、MySQLの空間機能を使って軌跡の類似度を分析する方法を詳しく解説します。主たる指標として、2つの軌跡間の類似度を測定するフレシェ距離 (ST_FrechetDistance) を中心に扱い、軌跡同士がどの程度近いのかを定量的に把握するための、いくつかの幾何学的操作についても説明します。

これらの機能はMySQL Community Edition、Enterprise Edition、およびHeatWave MySQLで利用可能です。

公園での歩行軌跡の分析

都市部の公園を歩く人の、移動を分析する例を考えてみます。この例では、それぞれの歩行者の軌跡をLINESTRINGで表しています。一定期間にわたり収集したGPSデータを用い、各歩行者の歩行経路を取り扱います。GPSデータから軌跡へ変換する方法については、別記事で紹介しています。まずはデータを格納するためのテーブルを作成します:

CREATE TABLE walk_trajectories (
    id INT AUTO_INCREMENT PRIMARY KEY,
    walker_name VARCHAR(100),
    trajectory LINESTRING NOT NULL SRID 4326
);

GPSポイントは、WGS84地理座標系 (SRIDは4326) で記録されている点に注意してください。空間参照系 (SRS) やSRIDの詳細については、過去のブログ記事 (英語)をご覧ください。

続いて、このテーブルにGPSデータを挿入します:

INSERT INTO walk_trajectories (walker_name, trajectory) VALUES
('Red Walker', ST_GeomFromText('LINESTRING(37.79562087383118 -122.46702694852982,37.7956012 -122.4670506,37.7955599 -122.4671136,37.7955243 -122.4671745,37.7954966 -122.4672255,37.7954759 -122.4672803,37.7954534 -122.4673499,37.7954181 -122.4674723,37.7953104 -122.4679047,37.7952792 -122.4680356,37.7952403 -122.4681458,37.795211 -122.4682021,37.7951776 -122.4682513,37.7951367 -122.468302,37.7950253 -122.468414,37.794974 -122.468481,37.794909 -122.468587,37.794877 -122.468668,37.794855 -122.468769,37.794852 -122.468904,37.7948625 -122.4689824,37.794888 -122.469046,37.7949726 -122.4691719,37.795056 -122.469263,37.7951199 -122.4693423,37.7951531 -122.4693115,37.7952016 -122.469269,37.795257 -122.4692223,37.7952906 -122.4691971,37.7953097 -122.4691838,37.7953481 -122.4691595,37.7953838 -122.4691424,37.7954199 -122.469131,37.795457 -122.4691254,37.7954934 -122.4691247,37.7955311 -122.4691268,37.7955659 -122.4691335,37.7956027 -122.4691418,37.795629 -122.4691489,37.7956738 -122.4691624,37.7957277 -122.469183,37.7957798 -122.4692086,37.7958306 -122.4692375,37.7958781 -122.4692629,37.7959215 -122.4692814,37.7959695 -122.4692929,37.7960161 -122.4692933,37.7960564 -122.4692831,37.7960939 -122.4692659,37.7961254 -122.4692411,37.7961532 -122.4692115,37.7961741 -122.4691849,37.7961938 -122.469152,37.7962897 -122.4689668,37.7963355 -122.4688798,37.7963845 -122.4687902,37.7964298 -122.4687122,37.7964824 -122.4686279,37.7965342 -122.4685538,37.7965849 -122.4684856,37.7966363 -122.4684182,37.7966796 -122.4683636,37.796737 -122.4682994,37.7968009 -122.4682335,37.7968656 -122.4681718,37.7971681 -122.4678909,37.7972322 -122.4678313,37.7972714 -122.4677967,37.7973108 -122.467765,37.7973526 -122.4677401,37.797394 -122.4677221,37.7974527 -122.4677091,37.7975103 -122.4677104,37.7975635 -122.4677251,37.7976099 -122.4677465,37.7976599 -122.4677815,37.7977004 -122.4678246,37.7977382 -122.4678928,37.7977657 -122.4679639,37.7977847 -122.4680304,37.7979263 -122.4686621,37.7979524 -122.4687553,37.797978 -122.4688301,37.7980072 -122.4688995,37.7980396 -122.4689606,37.7983457 -122.4686602,37.7984301 -122.4685586,37.7984665 -122.4685136,37.798519 -122.468456,37.79858158135988 -122.46840682510111)', 4326));    

INSERT INTO walk_trajectories (walker_name, trajectory) VALUES
('Orange Walker', ST_GeomFromText('LINESTRING(37.79482108844495 -122.4645294833186,37.794799 -122.464502,37.794626 -122.464206,37.794467 -122.463968,37.794485 -122.463823,37.794555 -122.463591,37.794643 -122.46339,37.794732 -122.463321,37.7947455 -122.463375,37.7947782 -122.4635064,37.794867 -122.463863,37.7949197 -122.4639853,37.7949857 -122.4640935,37.7950715 -122.464169,37.7951561 -122.4642276,37.7953511 -122.4643412,37.7953905 -122.4643701,37.7955512 -122.4645025,37.7956021 -122.4644328,37.7956434 -122.4644212,37.7957046 -122.464439,37.7959564 -122.4645123,37.7961988 -122.4645231,37.7962859 -122.4644272,37.7964012 -122.4643843,37.7964096 -122.4643814,37.7965165 -122.4643499,37.7965776 -122.4643499,37.7966589 -122.4643671,37.7967471 -122.4644358,37.796774 -122.4644811,37.796841 -122.464411,37.7969942 -122.4644235,37.797037 -122.464427,37.797102 -122.464466,37.797172 -122.464572,37.797195 -122.464787,37.797263 -122.4649764,37.7972974 -122.4651364,37.797379 -122.465304,37.797378 -122.465444,37.797392 -122.465536,37.797488 -122.46579,37.797533 -122.465968,37.797576 -122.466433,37.797563 -122.466525,37.79748 -122.466678,37.797468 -122.466768,37.797506 -122.466997,37.7976 -122.467159,37.7976769 -122.4671803,37.7977357 -122.4672502,37.7977634 -122.4673603,37.7977709 -122.4673781,37.797736 -122.4675416,37.7977094 -122.4676502,37.7977004 -122.4678246,37.7977382 -122.4678928,37.7977657 -122.4679639,37.7977847 -122.4680304,37.7979263 -122.4686621,37.7979524 -122.4687553,37.797978 -122.4688301,37.7980072 -122.4688995,37.7980396 -122.4689606,37.7979617 -122.4690293,37.797899 -122.4691154,37.7978717 -122.4690874,37.7978485 -122.4690684,37.7978194 -122.4690588,37.7977902 -122.469064,37.7977491 -122.4690918,37.7977106 -122.469139,37.7976752 -122.4691978,37.7976453 -122.4692561,37.7976175 -122.4693151,37.79759 -122.4693831,37.7975464 -122.4694971,37.7975118 -122.4695889,37.7975004 -122.4696329,37.7974951 -122.469679,37.7974932 -122.4697222,37.797495 -122.4699267,37.7975004 -122.4699808,37.797506559515654 -122.47000914467381)', 4326));

INSERT INTO walk_trajectories (walker_name, trajectory) VALUES
('Blue Walker', ST_GeomFromText('LINESTRING(37.795951956800835 -122.46451100656672,37.7959564 -122.4645123,37.7961988 -122.4645231,37.7962859 -122.4644272,37.7964012 -122.4643843,37.7964096 -122.4643814,37.7965165 -122.4643499,37.7965776 -122.4643499,37.7966589 -122.4643671,37.7967471 -122.4644358,37.796774 -122.4644811,37.796841 -122.464411,37.7969942 -122.4644235,37.797037 -122.464427,37.797102 -122.464466,37.797172 -122.464572,37.797195 -122.464787,37.797263 -122.4649764,37.7972974 -122.4651364,37.797379 -122.465304,37.797378 -122.465444,37.797392 -122.465536,37.797488 -122.46579,37.797533 -122.465968,37.797576 -122.466433,37.797563 -122.466525,37.79748 -122.466678,37.797468 -122.466768,37.797506 -122.466997,37.7976 -122.467159,37.7976769 -122.4671803,37.7977357 -122.4672502,37.7977634 -122.4673603,37.7977709 -122.4673781,37.7977815 -122.4673889,37.7978012 -122.4673898,37.7978124 -122.4673731,37.7978217 -122.4671592,37.7979107 -122.4670819,37.7979545 -122.4671,37.7979791 -122.4671624,37.7980287 -122.4672285,37.7980584 -122.4672983,37.798062 -122.467368,37.7980572 -122.4675702,37.7980594 -122.4677186,37.798026 -122.467992,37.7980211 -122.4681211,37.7980226 -122.4681609,37.7980747 -122.4682775,37.7980892 -122.4683561,37.7981142 -122.4683971,37.7981561 -122.4684234,37.7983493 -122.4684585,37.7983795 -122.4684881,37.7984301 -122.4685586,37.7984665 -122.4685136,37.798519 -122.468456,37.7986357 -122.4683643,37.7986936 -122.4682506,37.7987175 -122.4681369,37.7987255 -122.4680383,37.7987186 -122.4679823,37.79869201265307 -122.46791956856391)', 4326));

INSERT INTO walk_trajectories (walker_name, trajectory) VALUES
('Green Walker', ST_GeomFromText('LINESTRING(37.79479601418929 -122.46357793761146,37.794867 -122.463863,37.7949197 -122.4639853,37.7949857 -122.4640935,37.7950715 -122.464169,37.7951561 -122.4642276,37.7953511 -122.4643412,37.7953905 -122.4643701,37.7955512 -122.4645025,37.7956021 -122.4644328,37.7956434 -122.4644212,37.7957046 -122.464439,37.7959564 -122.4645123,37.7961988 -122.4645231,37.7964195 -122.4644959,37.7965414 -122.4644814,37.7966339 -122.4645057,37.7966842 -122.4645444,37.7967399 -122.4646324,37.7967468 -122.464753,37.7967138 -122.4650008,37.7966523 -122.465259,37.7966115 -122.4655337,37.7965474 -122.4663887,37.7965157 -122.466681,37.7964589 -122.466942,37.7963396 -122.4669996,37.7962695 -122.467098,37.7962504 -122.4671278,37.7962169 -122.4671802,37.7961796 -122.4672275,37.796151 -122.467292,37.7956588 -122.4689369,37.7956419 -122.4689849,37.7956291 -122.4690289,37.7956147 -122.4690867,37.7956027 -122.4691418,37.7956532 -122.4692212,37.7957024 -122.4692982,37.7957261 -122.4693958,37.7957216 -122.469482,37.7957242 -122.4695602,37.7957464 -122.4696513,37.7957727 -122.4697368,37.795893 -122.470146,37.796021 -122.470426,37.796307 -122.470781,37.796682 -122.471141,37.797249 -122.471761,37.797406 -122.47188,37.797654 -122.472134,37.797802 -122.472393,37.79787 -122.472616,37.797888 -122.472847,37.797899 -122.472949,37.79791649714564 -122.47311959717)', 4326));
軌跡地図
サンフランシスコのプレシディオ国立公園内の歩行軌跡。地図データは 2024-11-13 時点のOpenStreetMapより取得。

フレシェ距離 (Frechet distance) は、2つの曲線の類似度を測るための数学的な手法であり、単に形状だけでなく、どのような順序で経路をたどるかも考慮します。そのため、空間軌跡を比較する際によく用いられます。MySQLでは、ST_FrechetDistance関数によって2つの軌跡間のフレシェ距離を算出できます。この関数は、経路の幾何学的形状とその辿り方の両方を考慮するため、本記事の話題として最適です。

すでに保存された各軌跡を、たとえば最初の軌跡と比較してみましょう。

SELECT walker_name, id, 
  ST_FrechetDistance(trajectory, 
                     (SELECT trajectory
                      FROM walk_trajectories
                      WHERE walker_name='Red Walker')) as similarity
FROM walk_trajectories;

+---------------+----+--------------------+
| walker_name   | id | similarity         |
+---------------+----+--------------------+
| Red Walker    |  1 |                  0 |
| Orange Walker |  2 | 340.99002878471487 |
| Blue Walker   |  3 |  342.0596889960262 |
| Green Walker  |  4 |  421.5789953668151 |
+---------------+----+--------------------+
4 rows in set (0,01 sec)

次に、Presidio Hill (POINT(37.79360088412314 -122.46428847312926)) を起点とし、National Cemetery Overlook (POINT(37.79744139891328 -122.46510386466979)) を経由し、最終的にPark BoulevardとShofieldバス停付近 (POINT(37.79937853647067 -122.46860817074773)) に至る想定でルートを考えます。データベース内のどの軌跡が、このルートに最もよく一致するかを確認してみます。述べた地点を単純に連結したLINESTRINGは以下のように示せます:

LINESTRING(37.79360088412314 -122.46428847312926,37.79744139891328 -122.46510386466979,37.79937853647067 -122.46860817074773)

これは上記マップでは、黒色の線として示されています。以下のクエリで、これと各歩行者の軌跡が比較できます:

SELECT walker_name, id,
  ST_FrechetDistance(trajectory, 
                     ST_GeomFromText('LINESTRING(37.79360088412314 -122.46428847312926,37.79744139891328 -122.46510386466979,37.79937853647067 -122.46860817074773)', 4326)) 
  as similarity
FROM walk_trajectories 
ORDER BY similarity;

+---------------+----+--------------------+
| walker_name   | id | similarity         |
+---------------+----+--------------------+
| Orange Walker |  2 | 241.65190882342154 |
| Blue Walker   |  3 | 261.68893003364565 |
| Green Walker  |  4 | 429.19109867209164 |
| Red Walker    |  1 | 453.59273618286403 |
+---------------+----+--------------------+
4 rows in set (0,00 sec)

すなわち、指定されたルートに最も近い軌跡はオレンジと青の歩行者のものであることがわかります。

最後に、少し異なる視点のクエリを実行してみましょう。軌跡同士が部分的に重なっている区間に着目し、最も重なりが大きい2つの軌跡を探したいユースケースがあるとしましょう。以下のクエリで、それを確認できます:

SELECT t1.walker_name,t2.walker_name, ST_Length(ST_Intersection(t1.trajectory, t2.trajectory)) as common_length 
FROM walk_trajectories t1 
CROSS JOIN walk_trajectories t2 ON t1.id < t2.id 
ORDER BY common_length DESC;

+---------------+---------------+--------------------+
| walker_name   | walker_name   | common_length      |
+---------------+---------------+--------------------+
| Orange Walker | Blue Walker   |  423.1932792589895 |
| Orange Walker | Green Walker  | 172.44142700478628 |
| Red Walker    | Orange Walker | 107.59951501720846 |
| Blue Walker   | Green Walker  | 26.921615355894808 |
| Red Walker    | Blue Walker   | 13.384680833787076 |
| Red Walker    | Green Walker  |               NULL |
+---------------+---------------+--------------------+
6 rows in set (0,00 sec)

CROSS JOINにより、重複を除く全ての軌跡のペアに対する結果が生成されます。結果を各ペアの交差部分の長さ (ST_Intersection関数で求めたジオメトリに対しST_Length関数を適用して導出) でソートしています。マップ上の表示でもわかるとおり、最も重なりが大きいのはオレンジと青の軌跡です。

まとめ

MySQLの空間関数を使うことで、軌跡を分析・比較できる強力なツールキットが得られます。フレシェ距離や距離・長さといった指標、そして空間集合演算 (たとえばLINESTRING同士の交差) などを組み合わせることで、GPSデータや交通ルート、野生動物の移動経路など、さまざまな移動パターンに対し深い比較が行えます。

これらの機能によって空間データを深く理解し、現実のアプリケーションにおける経路や動作を比較・分析・最適化できます。これらの機能はMySQL Community Edition、Enterprise Edition、およびHeatWave MySQLで利用できます。