I have an application that uses a Postgres database in one region (US West). It contains several tables, one of which holds several hundred thousand records (let's call it the "events" table, with "eventId" as an auto-incrementing integer primary key). I also recently cloned this database into another region (EU Central) for users of the application in that region.

What I'd like to achieve is uniqueness of the primary key across the two databases: if a new "events" record is created in US West, its primary key must NOT collide with that of an "events" record that already exists in EU Central, and vice versa. Since the database was cloned, overlap among records that existed before the clone doesn't matter. I also need to do this for other tables, such as Users.

I looked into Postgres partitioning and sharding (via foreign data wrappers), but from my research I cannot partition the original table (in US West) without copying it into a new partitioned table, which would require downtime (if my understanding is correct). In addition, I would have to rename the tables (e.g. events_eu_central vs. events_us_west), which would require changes to the application backend and extra work for the devs.

I'm about to look into something like a dedicated ID-lookup table in a database in one region (e.g. US West) that the other regions can use via foreign data wrappers. Then, on each INSERT into the "events" table, a database trigger would run and check whether the ID already exists.

I don't know whether this will work or whether it will cause a massive performance hit, but I would like to hear any ideas.

  • Use a UUID, not an integer: postgresql.org/docs/current/datatype-uuid.html Commented Jul 9 at 19:45
  • @FrankHeikens We need to keep it as an integer, though: 1) our users use that number to quickly access their events via the URL, and 2) we cannot change the primary key to something else, as we have hundreds of thousands of records in the table and some of the links embedded in emails that use this identifier would break
    – ct101
    Commented Jul 9 at 22:30
  • "...some of the links embedded on the emails that used this identifier..." -- It's never a good idea to expose your pk to the outside world. Use a secondary key for that purpose. Commented Jul 10 at 4:03
  • The dedicated ID-lookup table approach is a very bad idea: it doesn't scale well, has poor performance, and is prone to race conditions. A variation using an ID service is only marginally better. Consider changing the ID columns to type BIGINT and defining the sequences with min and max values that preclude overlaps between regions. This approach preserves the existing URLs and foreign key relations.
    – JohnH
    Commented Jul 10 at 4:29

1 Answer

There are two approaches:

  • use a uuid as the primary key

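  • create the sequences with different, interleaved settings on the two systems, so that one region generates only odd ids and the other only even ones: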

    id bigint GENERATED ALWAYS AS IDENTITY (START 1 INCREMENT 2)
    

    versus

    id bigint GENERATED ALWAYS AS IDENTITY (START 2 INCREMENT 2)
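
For tables that already exist and were cloned, the same odd/even split can probably be applied to the existing sequences rather than by recreating the columns. The following is only a sketch under assumptions not stated in the question: that events."eventId" is a serial-style column, and that its sequence is named "events_eventId_seq" (a hypothetical name; pg_get_serial_sequence reports the real one). For an identity column, ALTER TABLE ... ALTER COLUMN ... SET INCREMENT BY would be the equivalent.

```sql
-- Assumption: the sequence name below is a guess. Check it first with:
--   SELECT pg_get_serial_sequence('events', 'eventId');

-- On US West: keep the existing rows, hand out only odd ids from now on.
BEGIN;
-- Briefly block concurrent writes so max("eventId") cannot move underneath us.
LOCK TABLE events IN SHARE ROW EXCLUSIVE MODE;
ALTER SEQUENCE "events_eventId_seq" INCREMENT BY 2;
-- Smallest odd number above the current maximum. With is_called = false,
-- the next nextval() returns exactly this value.
SELECT setval('"events_eventId_seq"',
              (SELECT m + 1 + (m % 2)
               FROM (SELECT COALESCE(max("eventId"), 0) AS m FROM events) AS s),
              false);
COMMIT;

-- On EU Central: same steps, but start at the smallest even number
-- above the current maximum.
BEGIN;
LOCK TABLE events IN SHARE ROW EXCLUSIVE MODE;
ALTER SEQUENCE "events_eventId_seq" INCREMENT BY 2;
SELECT setval('"events_eventId_seq"',
              (SELECT m + 2 - (m % 2)
               FROM (SELECT COALESCE(max("eventId"), 0) AS m FROM events) AS s),
              false);
COMMIT;
```

Existing ids, URLs and foreign keys stay untouched; only newly generated values change. Each region then consumes every second value, so the available id space per region is halved, which is one reason the answer uses bigint. The same steps would have to be repeated for every other table whose ids must not collide, such as Users.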
    
