
Celebrating the joy and power of Oracle SQL with the Oracle Developer Advocate team

  • April 2, 2020

How to Fix ORA-01450: Maximum Key Length (6398) Exceeded Errors

Chris Saxon
Developer Advocate

Another day, another slow query. This time customers are complaining that queries of the form:

select * from some_table
where  text_column = 'some value';

are taking forever to execute.

So you try and slap an index on the column. Only to be met with:

create index text_index
  on some_table ( text_column );
  
ORA-01450: maximum key length (6398) exceeded

You what now?!

The maximum size of the string is 4,000 characters. How can that not fit in 6,398 bytes? And more importantly, how do you avoid the error?


In this post you'll find out:

  • What causes ORA-01450 errors
  • How to decrease the largest string length
  • How to store the index in a tablespace with a larger block size
  • How to create STANDARD_HASH and SUBSTR function-based indexes
  • How to index with virtual columns
  • How to create an Oracle Text index

But first, why does indexing a 4,000 character string need more than 6,398 bytes?

What Causes ORA-01450 Errors

The smallest unit of data storage in Oracle Database is the block. This defaults to 8k (8,192 bytes).

Each index entry must fit within one block. So the maximum size of an indexed value must be less than the block size, minus some storage overhead. In a default installation this means the largest indexable value is around 6,400 bytes.

But… why does this mean you can't index a varchar2 (4000)?!

In older releases of Oracle Database, the maximum size of a varchar2 was 4,000 bytes. So you could only hit this problem when creating a composite index over two or more of these columns:

create index multi_col_index
  on some_table ( 
    vc_4000_byte_1, vc_4000_byte_2 
  );
  
ORA-01450: maximum key length (6398) exceeded

This changed in Oracle Database 12c with the introduction of extended data types, which increased the maximum size of a varchar2 to 32,767 bytes.

Critically, if you're using a UTF-8 character set for your database, each character can consume up to 4 bytes of storage. So if you declare a varchar2 ( 4000 char ) with character semantics, the maximum size of a string is 16,000 bytes!
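You can see the arithmetic outside the database too. This short Python sketch (purely illustrative; it demonstrates UTF-8 encoding, not Oracle itself) shows the byte counts:

```python
# UTF-8 uses 1 to 4 bytes per character; Oracle's UTF-8 character
# sets follow the same rules, so the same arithmetic applies.
for ch in ["a", "é", "€", "\U0001D11E"]:
    print(ch, "->", len(ch.encode("utf-8")), "bytes")

# Worst case for a varchar2(4000 char) column: every character
# needs 4 bytes, so 4,000 characters = 16,000 bytes of storage.
print(4000 * len("\U0001D11E".encode("utf-8")))  # 16000
```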

You can verify this by checking data_length in the *_tab_cols views:

select data_type, data_length, char_length
from   user_tab_cols
where  table_name = 'SOME_TABLE'
and    column_name = 'TEXT_COLUMN';

DATA_TYPE       DATA_LENGTH    CHAR_LENGTH   
VARCHAR2              16000           4000 

Clearly this can't fit in 8k. So what can you do about it?

Decrease the Largest String Length


The simplest solution is to decrease the maximum length of strings you can store. Of course, to do this you need to check that you can use a smaller limit.

A good first pass for this is to find the longest strings stored in the column. Both in characters and bytes:

select max ( length ( text_column ) ) mx_char_length,
       max ( lengthb ( text_column ) ) mx_byte_length
from   some_table;

Assuming these are well short of the current upper limit, it's time for good old-fashioned business analysis. Speak with the data consumers to find out what a reasonable maximum length is.

Provided you find a smaller limit is acceptable, you can alter the table to shrink the size:

alter table some_table
  modify text_column varchar2(1000 char);

But many free-form text fields genuinely can store large bodies of text. In which case you can't lower the upper limit. You need a workaround.

As the problem is the indexed values may not fit in one block, one solution is:

Use a larger block size for the indexes!

Store the Index in a Tablespace with a Larger Block Size

The approach with the least impact on application code is to store the index in a tablespace with a 16k or 32k block size.

Assuming the database uses an 8k block size, this means you'll have multiple block sizes. So to do this, first you need to set the appropriate db_Nk_cache_size parameter. Then create a tablespace with the desired block size and store the index in it:

create tablespace tblsp_32k_blocks
  datafile 'tblsp_32k_blocks' size 1m
  blocksize 32768;
  
create index text_index
  on some_table ( text_column )
  tablespace tblsp_32k_blocks;

While this is simple from a SQL perspective, it makes the database harder to manage. DBAs may need to start micro-managing buffer cache sizes, which is a poor use of their time.

If this is a no-go, there are some workarounds using function-based indexes.

Create a STANDARD_HASH Index

Hashing the text returns a fixed-length value. The exact length depends on the hash function, but even a SHA512 hash value is 128 characters. Well within the limit for an 8k block.
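You can demonstrate the fixed-length property outside the database. Here Python's hashlib stands in for standard_hash (an illustration of the concept, not Oracle code):

```python
import hashlib

# However long the input, the output length is constant.
short_hash = hashlib.sha512(b"this").hexdigest()
long_hash = hashlib.sha512(b"x" * 100_000).hexdigest()

print(len(short_hash))  # 128 hex characters (64 bytes)
print(len(long_hash))   # also 128 hex characters
```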

So you can create an index using standard_hash:

create index text_hash_index
  on some_table ( 
    standard_hash ( text_column )
  );

Those of you familiar with function-based indexes will know: normally you have to change your query to use them. The function in the index must appear in your where clause.

But the optimizer has some tricks up its sleeve. It can automatically use this index in equality or IN-list comparisons:

select * from some_table
where  text_column = 'this';

-----------------------------------------------------------------
| Id  | Operation                           | Name              |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| SOME_TABLE        |
|*  2 |   INDEX RANGE SCAN                  | TEXT_HASH_INDEX   |
-----------------------------------------------------------------

But range comparisons can't use this index. Which also means it's unusable for like queries:

select * from some_table
where  text_column >= 't'
and    text_column < 'u';

----------------------------------------                                             
| Id  | Operation         | Name       |                                             
----------------------------------------                                             
|   0 | SELECT STATEMENT  |            |                                             
|*  1 |  TABLE ACCESS FULL| SOME_TABLE |                                             
---------------------------------------- 

select * from some_table
where  text_column like 'this%';

----------------------------------------                                                 
| Id  | Operation         | Name       |                                                 
----------------------------------------                                                 
|   0 | SELECT STATEMENT  |            |                                                 
|*  1 |  TABLE ACCESS FULL| SOME_TABLE |                                                 
----------------------------------------  

So what if you do these kinds of queries?

The optimizer can also do the same magic to use a substr-based index.

Create a SUBSTR Index

The substr function returns the first N characters of a string. When searching for a string, these leading characters are often the most important for finding a particular row.

create index text_substr_index
  on some_table ( 
    substr ( text_column, 1, 10 )
  );

Again, the optimizer can use this index without you having to change the SQL. This time for range comparisons too, but sadly like still won't use the index:

select * from some_table
where  text_column = 'this';

-----------------------------------------------------------------                      
| Id  | Operation                           | Name              |                      
-----------------------------------------------------------------                      
|   0 | SELECT STATEMENT                    |                   |                      
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| SOME_TABLE        |                      
|*  2 |   INDEX RANGE SCAN                  | TEXT_SUBSTR_INDEX |                      
-----------------------------------------------------------------                      
                                                                                       
select * from some_table
where  text_column >= 't'
and    text_column < 'u';

-----------------------------------------------------------------           
| Id  | Operation                           | Name              |           
-----------------------------------------------------------------           
|   0 | SELECT STATEMENT                    |                   |           
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| SOME_TABLE        |           
|   2 |   INDEX RANGE SCAN                  | TEXT_SUBSTR_INDEX |           
-----------------------------------------------------------------

select * from some_table
where  text_column like 'this%';

----------------------------------------                    
| Id  | Operation         | Name       |                    
----------------------------------------                    
|   0 | SELECT STATEMENT  |            |                    
|   1 |  TABLE ACCESS FULL| SOME_TABLE |                    
---------------------------------------- 

So which of these two methods should you use?

If you need range comparisons on the text, you'll have to use substr. But it's likely to be less efficient for equality searches.
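To see why equality searches suffer, consider what a substr index actually stores. This toy Python sketch (illustrative only; the row values are made up) mimics an index on the first 10 characters: distinct strings can share a prefix, so the index returns several candidates that the database must then re-check against the full value.

```python
from collections import defaultdict

rows = [
    "performance tuning tips",
    "performance of indexes",
    "performance",
    "partitioning basics",
]

# Build a "substr index": first 10 characters -> matching row ids.
index = defaultdict(list)
for rowid, text in enumerate(rows):
    index[text[:10]].append(rowid)

# An equality search for one exact string...
target = "performance"
candidates = index[target[:10]]
print(candidates)  # three rows share the prefix 'performanc'

# ...so each candidate row must be fetched and re-checked.
matches = [r for r in candidates if rows[r] == target]
print(matches)
```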

So if you only have equality searches on the column, standard_hash is the way to go. You can also use this in a unique index if you want to prevent duplicates in the text.

But provided you can work within their restrictions, either of the function-based index methods works well.

That said, both of these methods are kind-of "hidden" from developers and DBAs. Which can increase the "surprise-factor" for the application.

There is a way to make these clear: use virtual columns.


Create a Virtual Column

A virtual column applies a function to a column in the table. The database only computes this at runtime. The value is not stored in the table.

So instead of creating a function-based index on standard_hash or substr, you can use one of these in a virtual column. Then create the index on that:

alter table some_table
  add text_hash varchar2(40 char) as (
    standard_hash ( text_column )
  );
  
create index vc_text_hash_index
  on some_table ( text_hash );

There are a couple of advantages to doing this over a function-based index:

  • The optimizer can gather statistics on the virtual column
  • You can see what the indexed values are, making the application easier to understand.

This can also be useful if you want to extract the middle of the string for some reason. For example, removing a common prefix or getting certain fields from delimited or fixed-width data.
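For instance, a virtual column on substr could pull a field out of fixed-width records. Sketched in Python below (the record layout and positions are invented for illustration; Oracle's substr ( record, 11, 6 ) corresponds to Python's record[10:16]):

```python
# Hypothetical fixed-width record: characters 1-10 hold a
# customer id, characters 11-16 hold a country code field.
record = "CUST000042GB    REST-OF-THE-RECORD..."

# substr ( record, 11, 6 ) in Oracle == record[10:16] in Python.
country_field = record[10:16]
print(repr(country_field))    # 'GB    '
print(country_field.strip())  # 'GB'
```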

But this highlights a general problem with indexing free-text fields. Often you want to search for specific values within a string. So you need wildcards at the start and end of the search string:

select * from some_table
where  text_column like '%middle text%';

Standard indexes are practically useless for these types of query. Which raises the question of why you're bothering to index these columns at all!

Luckily there is a way to index large text strings in Oracle Database, allowing you to do ad-hoc searches of them:

Oracle Text!

Create an Oracle Text Index

Oracle Text is a full text indexing engine with huge functionality. Most relevant to the ORA-01450 problem is that you can create these indexes on large varchar2 columns. Without needing any of the fancy tricks described above.

To use it, state which kind of text index you want with the indextype clause. The most common type is a context index:

create index oracle_text_index
  on some_table ( text_column )
  indextype is ctxsys.context;

Although there's nothing special you need to do to create these indexes, to use them, your query must use the contains operator:

select * from some_table
where  contains ( text_column, 'value' ) > 0;

This may mean rewriting lots of SQL so it can use the index. But it also opens new possibilities, such as enabling fuzzy-text search. If you go down this route, it's a good chance to speak with your customers to see if they want any of these extra searching capabilities.
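Conceptually, a context index is an inverted index: it tokenizes the text and maps each word to the rows containing it, which is why contains can find words in the middle of a string without scanning the table. A toy Python sketch of the idea (nothing like the real implementation):

```python
from collections import defaultdict

rows = [
    "some middle text here",
    "nothing to see",
    "more middle text",
]

# Tokenize each row and map word -> row ids (an inverted index).
inverted = defaultdict(set)
for rowid, text in enumerate(rows):
    for word in text.split():
        inverted[word].add(rowid)

# contains ( text_column, 'middle' ) > 0, conceptually:
print(sorted(inverted["middle"]))  # [0, 2]
```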

Summary

Perhaps the best solution to ORA-01450 errors is to avoid them in the first place: ensure you set appropriate limits for character columns!

Remember that the size of a varchar2 is a constraint. If the maximum size of a string should be ten characters, make it a varchar2(10 char). Besides avoiding the indexing issues above, this improves data quality. Avoid choosing varchar2(4000) "just in case" you need to store large blocks of text.

But there will always be cases where you need to store large pieces of text. It's worth remembering you can use two or more of the solutions above on the same column. So you could create a function-based index to make equality searches fast, while also adding an Oracle Text index to improve the performance of more free-form queries.

Ultimately all the solutions above have their pros and cons. It's worth testing a couple of solutions in your application to see which trade-offs give you the most upside with the least downside. This could well be something you need to decide on a case-by-case basis!

Join the discussion

Comments ( 2 )
  • Rajeshwaran, Jeyabal Friday, May 22, 2020
    Thanks for the detailed post.

    It would be really helpful to read, if we could get the explain plan along with "predicate information" so that we can see if any optimization/surprise done by optimizer for queries against the Function based indexes.
  • Chris Saxon Friday, May 22, 2020
    Thanks Rajesh.

    I'm not sure what you're concerned about here - the optimizer processes the function-based indexes as it would for any other query.