The Path Through Your JSON Isn’t Always Straight

Analyzing Formula 1 JSON Data with Oracle SQL/JSON Path Expressions

As part of Oracle JSON Insights, this post show how Oracle AI Database lets you query JSON data directly from SQL using SQL/JSON path expressions. Rather than flattening JSON into relational tables first, you can extract fields, filter documents, project columns, and even perform type-aware comparisons directly from JSON.

In this post we’ll use SQL/JSON path expressions to analyze a Formula 1 stint-level dataset.

The source data is available on GitHub as final_data.csv and has been loaded into an Oracle JSON collection table named f1_stints.

Each row contains one JSON document stored in the data column.

Example document:

{
    "bestPreRaceTime": 80.452,
    "eventYear": 2021,
    "bestLapTimeIsFrom": "Qualifying",
    "RoundNumber": 3,
    "Stint": 4,
    "EventName": "Portuguese Grand Prix",
    "Position": 17,
    "Team": "Haas F1 Team",
    "fuel_bias": 86.35908974358976,
    "raceStintsNums": 4,
    "deg_bias": 85.33125917911491,
    "Rainfall": 0,
    "StintLen": 32,
    "TyreAge": 1,
    "meanTrackTemp": 38.328125,
    "meanAirTemp": 19.184375,
    "_id": "6a35c694bc11f1a36a82abe6",
    "CircuitLength": 4.684,
    "": 3348,
    "designedLaps": 66,
    "lapNumberAtBeginingOfStint": 33,
    "GridPosition": 19,
    "meanHumid": 40.721875,
    "fuel_slope": -0.05593589743589682,
    "Compound": "HARD",
    "Driver": "MSC",
    "deg_slope": -0.018910302945693302
}

Each document describes one race stint, including:

Event and season
Driver and team
Tyre compound
Stint number
Grid and finishing positions
Stint length
Weather conditions
Track temperature
Tyre degradation metrics

We’ll progressively build four SQL examples:

Simple field access
Document-level predicates
Column projection with JSON_TABLE
JSON item methods

The goal is to answer useful Formula 1 questions while introducing the SQL/JSON path syntax naturally.

Query 1 — Simple Field Access

Let’s begin with the simplest use of SQL/JSON.

Suppose we want to answer:

Which tyre compounds did race winners start on?

Each JSON document contains:

Compound
Stint number
Final finishing position

We can extract those values using JSON_VALUE and then let SQL perform the aggregation.

WITH opening_stints AS (
  SELECT
    JSON_VALUE(t.data, '$.Compound' RETURNING VARCHAR2(30)) AS compound,
    JSON_VALUE(t.data, '$.Position' RETURNING NUMBER)       AS finish_position,
    JSON_VALUE(t.data, '$.Stint' RETURNING NUMBER)          AS stint_number
  FROM f1_stints t
)
SELECT
  compound,
  COUNT(*) AS winner_first_stints
FROM opening_stints
WHERE finish_position = 1
  AND stint_number = 1
GROUP BY compound
ORDER BY winner_first_stints DESC, compound;

The JSON paths are intentionally simple:

$.Compound
$.Position
$.Stint

The $ symbol represents the root of the JSON document, and each path simply accesses a field by name.

Result

Compound	Winner first stints
MEDIUM	34
SOFT	27
ULTRASOFT	8
SUPERSOFT	6
INTERMEDIATE	5
WET	2
HYPERSOFT	2

Even this basic example provides an interesting racing insight.

Across this dataset, race winners most often started on MEDIUM tyres rather than the softest available compound.

The important takeaway is that SQL/JSON path expressions don’t have to be complicated. Often they’re simply the bridge that lets SQL work naturally with semi-structured JSON data.

Query 2 — Document-Level Predicates

In the previous query we extracted values first and filtered afterwards.

Sometimes it’s cleaner to filter the JSON document directly.

Let’s ask:

Which race winners started outside the top five?

We’re looking for documents where:

Position = 1
Stint = 1
GridPosition > 5

This is a perfect use case for JSON_EXISTS.

SELECT
  JSON_VALUE(t.data, '$.eventYear' RETURNING NUMBER)        AS event_year,
  JSON_VALUE(t.data, '$.EventName' RETURNING VARCHAR2(100)) AS event_name,
  JSON_VALUE(t.data, '$.Driver' RETURNING VARCHAR2(10))     AS driver,
  JSON_VALUE(t.data, '$.GridPosition' RETURNING NUMBER)     AS grid_position,
  JSON_VALUE(t.data, '$.Compound' RETURNING VARCHAR2(30))   AS compound,
  JSON_VALUE(t.data, '$.StintLen' RETURNING NUMBER)         AS first_stint_laps
FROM f1_stints t
WHERE JSON_EXISTS(
  t.data,
  '$?(@.Position == 1 && @.Stint == 1 && @.GridPosition > 5)'
)
ORDER BY event_year, event_name;

The key expression is the JSON path predicate:

$?(@.Position == 1 && @.Stint == 1 && @.GridPosition > 5)

Inside the predicate, @ refers to the current JSON document.

Oracle evaluates the predicate and keeps only documents that satisfy all three conditions.

Result

Year	Race	Driver	Grid	Compound	First stint laps
2018	Chinese Grand Prix	RIC	6	ULTRASOFT	16
2018	German Grand Prix	HAM	14	SOFT	41
2020	Italian Grand Prix	GAS	10	SOFT	19
2020	Turkish Grand Prix	HAM	6	WET	8
2021	Azerbaijan Grand Prix	PER	6	SOFT	13
2021	Hungarian Grand Prix	OCO	8	INTERMEDIATE	2
2021	São Paulo Grand Prix	HAM	10	MEDIUM	7

This is where SQL/JSON starts to feel more analytical.

Instead of extracting values first, we’re describing an entire racing scenario and asking Oracle Database to return only the matching JSON documents.

Query 3 — Projecting JSON with JSON_TABLE

When a query needs multiple JSON fields simultaneously, JSON_TABLE provides a much cleaner approach.

It projects JSON into relational columns so the remainder of the query looks like ordinary SQL.

Let’s compare tyre degradation by compound.

Our question is:

Across dry stints of at least 10 laps, which tyre compounds have the highest average degradation slope?

SELECT
  jt.compound,
  COUNT(*)                       AS stint_count,
  ROUND(AVG(jt.deg_slope), 4)    AS avg_deg_slope,
  ROUND(STDDEV(jt.deg_slope), 4) AS deg_slope_stddev,
  ROUND(MIN(jt.deg_slope), 4)    AS best_case,
  ROUND(MAX(jt.deg_slope), 4)    AS worst_case
FROM f1_stints t,
  JSON_TABLE(t.data, '$'
    COLUMNS (
      compound  VARCHAR2(20) PATH '$.Compound',
      deg_slope NUMBER       PATH '$.deg_slope',
      stint_len NUMBER       PATH '$.StintLen',
      rainfall  NUMBER       PATH '$.Rainfall'
    )
  ) jt
WHERE
  jt.rainfall = 0
  AND jt.deg_slope IS NOT NULL
  AND jt.stint_len >= 10
  AND jt.compound IN (
    'SOFT',
    'MEDIUM',
    'HARD',
    'SUPERSOFT',
    'ULTRASOFT',
    'HYPERSOFT'
  )
GROUP BY jt.compound
ORDER BY avg_deg_slope DESC;

The important part is the projection:

JSON_TABLE(t.data, '$'
  COLUMNS (
    compound  VARCHAR2(20) PATH '$.Compound',
    deg_slope NUMBER       PATH '$.deg_slope',
    stint_len NUMBER       PATH '$.StintLen',
    rainfall  NUMBER       PATH '$.Rainfall'
  )
)

Once projected, the remainder of the query is simply relational SQL using functions like:

AVG
STDDEV
MIN
MAX

Result

Compound	Stints	Avg deg slope	Stddev
HARD	820	-0.0204	0.1335
ULTRASOFT	151	-0.0311	0.1370
MEDIUM	1083	-0.0325	0.2345
SUPERSOFT	178	-0.0459	0.2456
SOFT	874	-0.0495	0.4904
HYPERSOFT	32	-0.1902	0.7377

The HARD compound shows the highest average degradation slope in this filtered sample, while HYPERSOFT has both the lowest average and the smallest sample size.

The standard deviation is equally informative.

SOFT and HYPERSOFT show much wider variation than HARD or ULTRASOFT, suggesting degradation depends not only on tyre compound but also on race conditions, strategy, temperature, traffic, and driver management.

Query 4 — JSON Item Methods

Our final example introduces SQL/JSON item methods.

These methods make type conversion explicit during extraction and filtering.

Let’s ask:

In dry conditions, which slick compounds were stretched the longest?

SELECT
  JSON_VALUE(data, '$.Compound.string()' RETURNING VARCHAR2(20)) AS compound,

  COUNT(*) AS stint_count,

  ROUND(
    AVG(JSON_VALUE(data, '$.StintLen.number()' RETURNING NUMBER)),
    1
  ) AS avg_stint_laps,

  MAX(
    JSON_VALUE(data, '$.StintLen.number()' RETURNING NUMBER)
  ) AS longest_stint,

  ROUND(
    AVG(JSON_VALUE(data, '$.meanTrackTemp.number()' RETURNING NUMBER)),
    1
  ) AS avg_track_temp

FROM f1_stints
WHERE JSON_EXISTS(
  data,
  '$?(
    @.Rainfall.number() == 0
    && @.StintLen.number() >= 25
    && @.Compound.string() in ("SOFT", "MEDIUM", "HARD")
  )'
)
GROUP BY
  JSON_VALUE(data, '$.Compound.string()' RETURNING VARCHAR2(20))
ORDER BY
  avg_stint_laps DESC;

Notice the explicit item methods:

$.Compound.string()
$.StintLen.number()
$.meanTrackTemp.number()

The predicate also uses item methods:

$?(
  @.Rainfall.number() == 0
  && @.StintLen.number() >= 25
  && @.Compound.string() in ("SOFT","MEDIUM","HARD")
)

These methods don’t change the analytical question—they simply make the intended data types explicit.

This becomes particularly valuable when JSON fields may contain mixed types or require coercion.

Result

Compound	Stints	Avg laps	Longest stint	Avg track temp
HARD	587	35.6	66	38.5
MEDIUM	458	32.5	62	36.6
SOFT	255	32.4	56	37.9

This gives us another useful Formula 1 insight.

Among long dry stints, the HARD tyre is consistently stretched the furthest, while MEDIUM and SOFT produce remarkably similar average stint lengths.

Conclusion

These four examples demonstrate a natural progression when working with JSON data in Oracle Database.

Simple field access is ideal when you only need a handful of values.
Document-level predicates let the JSON document itself determine whether a row matches.
JSON_TABLE projects multiple JSON fields into a relational shape for more complex analysis.
JSON item methods make type handling explicit during filtering, extraction, and aggregation.

The important point is that SQL/JSON path expressions do not replace SQL, they complement it.

JSON path expressions excel at navigating semi-structured documents and expressing document-level conditions. SQL remains the best tool for grouping, joining, aggregating, sorting, and ranking data.

Together, they make it possible to perform rich analytical queries over JSON documents without first flattening the data into a traditional relational schema.

Oracle JSON Insights

Learn from the team that build the features at Oracle. Read articles like this post, watch demos and webcasts, or start to develop using the LiveLabs, all on available from one page: Oracle JSON Insights.

The Path Through Your JSON Isn’t Always Straight

Analyzing Formula 1 JSON Data with Oracle SQL/JSON Path Expressions

Query 1 — Simple Field Access

Result

Query 2 — Document-Level Predicates

Result

Query 3 — Projecting JSON with JSON_TABLE

Result

Query 4 — JSON Item Methods

Result

Conclusion

Oracle JSON Insights

Prakashkumar Thiagarajan

Principal Data Systems Engineer

KScope26 Reflections: Essbase, AI, and the Modern Lakehouse

Upcoming Change to the Default Database Version for New Autonomous AI Database Serverless Instances

The Path Through Your JSON Isn’t Always Straight

Analyzing Formula 1 JSON Data with Oracle SQL/JSON Path Expressions

Query 1 — Simple Field Access

Result

Query 2 — Document-Level Predicates

Result

Query 3 — Projecting JSON with JSON_TABLE

Result

Query 4 — JSON Item Methods

Result

Conclusion

Oracle JSON Insights

Authors

Prakashkumar Thiagarajan

Principal Data Systems Engineer

KScope26 Reflections: Essbase, AI, and the Modern Lakehouse

Upcoming Change to the Default Database Version for New Autonomous AI Database Serverless Instances