Inconsistencies between system catalog and DocDB schema during DDL operations

25 June 2024
Product Affected Versions Related Issues Fixed In
YSQL v2.18.0.0, v2.20.0.0, v2024.1.0.0 #22935 v2.18.9.0, v2.20.5.0, v2024.1.2.0

Description

When server-side caching of sequences is enabled by setting the flag ysql_sequence_cache_method = server, it can cause errors for columns using the sequences if multiple databases are present in the cluster, or if a database is dropped and recreated. Columns using sequences as primary keys may incorrectly return duplicate key errors, and unique constraint violations may also occur if the column has a unique constraint.

This issue arises because the same TServer cache entry is used across multiple databases, and sequence cache entries are never invalidated, remaining in memory till the TServer is restarted.

Mitigation

If the cluster has ysql_sequence_cache_method = server enabled, then update the flag to use ysql_sequence_cache_method = connection.

Details

By default, YugabyteDB caches sequence values for each YSQL connection. To improve performance, YugabyteDB added support for caching sequence values on the YB-TServer. You can enable this feature by setting the flag ysql_sequence_cache_method = server. By default, this flag is set to "connection".

For Tserver caching, the cache key is the OID of the sequence objects. PostgreSQL maintains unique OIDs within the database and thus multiple databases can allocate the same OID. This can cause the sequences in two different databases to have the same OID. Since the cache key is OID, hence the different sequences in different databases will continue to use the same cache key which results in such sequences sharing the same cache in Tserver.

For TServer caching, the cache key is the OID (Object Identifier) of sequence objects. In PostgreSQL, each database maintains unique OIDs, but multiple databases can have the same OID. This means sequences in different databases might share the same OID. As the cache key is based on OID, different sequences in different databases will continue to use the same cache key, which results in sequences sharing the same cache in TServer.

When ysql_sequence_cache_method is set to "server", sequence cache entries are never invalidated and remain in memory until the TServer shuts down. As a result, if a database containing sequences is dropped and then recreated, it might produce errors due to outdated entries from the dropped sequences.

Examples

The following example demonstrates the same TServer cache entry being used across two different sequences in two different databases. To reproduce the following issue, set the cluster flag ysql_sequence_cache_method = server.

  1. Create two databases seqtest1 and seqtest2 as follows:

    create database seqtest1;
    create database seqtest2;
    
  2. Connect to database seqtest1 and create sequence foo. Allocate three values from foo by calling the function nextval three times as follows:

    \c seqtest1
    
    You are now connected to database "seqtest1" as user "yugabyte".
    
    create sequence foo;
    select oid, relname from pg_class where relkind = 'S';
    
      oid  | relname
    -------+---------
     16384 | foo
    (1 row)
    
    select nextval('foo');
    
     nextval
    ---------
           1
    (1 row)
    
    select nextval('foo');
    
    nextval
    ---------
           2
    (1 row)
    
    select nextval('foo');
    
    nextval
    ---------
           3
    (1 row)
    
  3. Connect to database seqtest2, create a new sequence bar, and allocate value by calling the function nextval.

    \c seqtest2
    seqtest2=# create sequence bar;
    select oid, relname from pg_class where relkind = 'S';
    
      oid  | relname
    -------+---------
     16384 | foo
    (1 row)
    
    seqtest2=# select nextval('bar');
    
     nextval
    ---------
           4
    (1 row)
    

    In this scenario, sequence bar has the same OID as sequence foo, and uses the same cache on the TServer. This results in the nextval function returning 4, which is the next available sequence value in the cache. The correct behavior would be to return 1, as this is first time values are allocated from the sequence bar, and the default for a sequence minimum value is 1.

    TServer sequence cache entries are never invalidated and reside in memory until the TServer shuts down, as shown in the following steps.

  4. Drop the database seqtest1 and 1seqtest2, which drops all the objects, including sequences fooandbar. Create a new database seqtest3and connect to databaseseqtest3`.

    drop database seqtest1;
    drop database seqtest2;
    create database seqtest3;
    \c seqtest3
    
    You are now connected to database "seqtest3" as user "yugabyte".
    
  5. Create a sequence baz in seqtest3 and allocate value from sequence baz using nextval.

    create sequence baz;
    
    select oid, relname from pg_class where relkind = 'S';
    
      oid  | relname
    -------+---------
     16384 | baz
    (1 row)
    
    select nextval('baz');
    
    nextval
    ---------
           5
    (1 row)