Implementing Time Travel In PostgreSQL


An overview of time travel in PostgreSQL

Published on December 09, 2024 by Stefanie Janine Stölting


History

Up to version 6.2, PostgreSQL had time travel implemented in its core. It was removed in PostgreSQL 6.3.

How To Implement It In Current Versions

It is possible to implement time travel in PostgreSQL without any extension.

I gave a talk about it at PGDay/MED 2024; the idea for the talk started with writing this blog post.
The slides are available on GitLab.

Implications

There is a simple solution, but it might become slow with more data and heavily updated records. Hans-Jürgen Schönig wrote about it in his 2019 blog post about AS OF queries.
The only problem is that the trigger function throws errors on PostgreSQL 17. I have not tested it with older PostgreSQL versions.
But this is solvable by changing the function.

Implementation With Range Partitions

The idea for this blog post started with a talk by Ryan Booz, Postgres Table Partitioning.

Primary Key Table

To start, we need a table that will hold the unique numeric identifiers, as they need to be unique across all partitions. PostgreSQL does not support such global unique constraints on partitioned tables yet.
Therefore we create a table which will only contain the identifiers. The table and its primary key will be small, and PostgreSQL will not have any problems holding it in memory.

Creating the primary key table:

-- Creates the primary key table which will also keep the creation timestamp
CREATE TABLE timetravel_pk (
  timetravelid BIGINT GENERATED ALWAYS AS IDENTITY,
  created TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT clock_timestamp(),
  CONSTRAINT timetravel_pk_pk PRIMARY KEY (timetravelid)
);

The table with the time travel data will later reference this table via a foreign key.
To store the time when the id has been created only once, the timestamp holding that data is located in this table.

Data table

The data table will be partitioned by range. For this example it will be by year.

One thing to keep in mind is that partitions have to exist before data can be inserted. To handle that, we create a trigger which is fired before insert.
The same trigger is used to set the id column before the data is inserted into the partitioned table.

Creating the data table:

-- Installs the btree_gist extension, used for the index on the column changed
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Creates the partitioned table which will store all versions of records
CREATE TABLE timetravel (
  timetravelid BIGINT NOT NULL,
  changed TSTZRANGE NOT NULL DEFAULT tstzrange(clock_timestamp(), 'INFINITY', '[)'),
  data_text TEXT,
  data_json JSONB,
  deleted BOOLEAN NOT NULL DEFAULT FALSE,
  CONSTRAINT timetravelid_fk FOREIGN KEY (timetravelid) REFERENCES timetravel_pk(timetravelid)
) PARTITION BY RANGE (lower(changed))
;

-- Creating indexes
CREATE INDEX timetravel_changed_idx
	ON timetravel
	USING gist
	(changed)
;

CREATE INDEX timetravel_timetravelid_fk_idx
	ON timetravel
	USING btree
	(timetravelid)
;

CREATE INDEX timetravel_not_deleted_idx
  ON timetravel
  USING btree
  (deleted)
  WHERE NOT deleted
;

Keep in mind that PostgreSQL does not create indexes on foreign key columns automatically. That is a trap many people fall into. But it also has advantages, as PostgreSQL supports several index types and you can choose the appropriate index method yourself.

The Time Range Table

To partition the table automatically we need to know for which years partitions already exist. When a partition does not exist yet, it has to be created.
The trigger will be fired before insert and before update.

There is an alternative solution, but it involves an extension to create cron jobs inside the database; I prefer pg_cron.
In that case the function can be created as a simple procedure, as there is no need to return data. A cron job inside the database executes the procedure before the current time period ends.
That is obviously much faster than the method with triggers, but it requires an extension.
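As a sketch of that pg_cron variant, assuming the extension is installed and the timetravel_partition function from below already exists (the job name and the schedule are examples, not part of this post's script):

```sql
-- Requires the pg_cron extension to be installed on the server
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Schedule a yearly job (here: December 1st at midnight) that creates
-- the partition for the upcoming year before the current year ends.
SELECT cron.schedule(
  'timetravel-next-partition',
  '0 0 1 12 *',
  $$SELECT timetravel_partition(now() + INTERVAL '1 year')$$
);
```

With this in place the trigger-based partition check would no longer be needed on the hot insert path.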

Creating the table to store the periods:

CREATE TABLE timetravel_part_vals (
  part_year SMALLINT NOT NULL,
  start_value TIMESTAMP WITH TIME ZONE NOT NULL,
  end_value TIMESTAMP WITH TIME ZONE NOT NULL,
  CONSTRAINT timetravel_part_vals_pk PRIMARY KEY (part_year)
);

The Function to Handle New Partitions

I would usually handle the automatic creation of time-based partitions in a cron job, for example with pg_cron.
That is much more efficient than checking every time a record is inserted, updated, or deleted.

Function to handle partitions:

CREATE OR REPLACE FUNCTION timetravel_partition (in_changed TIMESTAMPTZ)
  RETURNS void
  LANGUAGE PLPGSQL
  AS
$$
DECLARE
  query TEXT;
  in_range BOOLEAN = FALSE;
  year_p SMALLINT;
  start_v TIMESTAMP WITH TIME ZONE;
  end_v TIMESTAMP WITH TIME ZONE;
  part_name TEXT;
BEGIN
  -- Check the changed date to be in an existing partition
  EXECUTE 'SELECT count(*) > 0
  FROM timetravel_part_vals
  WHERE part_year = extract(year from $1)'
  INTO in_range
  USING in_changed;

  IF (NOT in_range) THEN
    -- Update the range data
    EXECUTE 'INSERT INTO timetravel_part_vals (part_year, start_value, end_value)
    SELECT extract(year from $1),
      date_trunc(''year'', $1),
      date_trunc(''year'', $1) + INTERVAL ''1 year''
    RETURNING *'
    INTO year_p, start_v, end_v
    USING in_changed;

    -- Create a new partition
    part_name := 'timetravel_' || year_p::TEXT;

    EXECUTE 'CREATE TABLE ' || part_name ||
      ' PARTITION OF timetravel FOR VALUES FROM (''' || start_v::text || ''') TO (''' || end_v::text || ''')';

    RAISE NOTICE 'Partition % created.', part_name;

  END IF;
END;
$$;


-- Create the partition for the current year
SELECT timetravel_partition (now());

The Trigger To Handle INSERT And UPDATE

The trigger function for inserts and updates will always overwrite the incoming value of the change timestamp; that way the data cannot be manipulated.

Creating the trigger function for inserts and updates:

CREATE OR REPLACE FUNCTION trigger_timetravel_in_upd ()
  RETURNS TRIGGER
  LANGUAGE PLPGSQL
  AS
$$
BEGIN
  -- Setting default values
  NEW.deleted = false;
  NEW.changed = tstzrange(clock_timestamp(), 'INFINITY', '[)');

  -- On UPDATE a new record is inserted
  CASE WHEN TG_OP = 'UPDATE' THEN

    IF NEW.timetravelid <> OLD.timetravelid THEN
      RAISE EXCEPTION 'The identity column timetravelid can not be changed!';
    END IF;

    IF NOT OLD.deleted THEN
      IF upper(OLD.changed) = 'infinity' THEN
        -- Updated the latest version of a record
        INSERT INTO timetravel (timetravelid, data_text, data_json, changed)
        SELECT NEW.timetravelid
          , NEW.data_text
          , NEW.data_json
          , NEW.changed
        ;

        -- Only the range for the old record is changed, it has an end now
        NEW.data_text = OLD.data_text;
        NEW.data_json = OLD.data_json;
        NEW.changed = tstzrange(lower(OLD.changed), lower(NEW.changed));

        RETURN NEW;
      ELSE
	      -- It is not the newest version, therefore there is nothing to do
        RETURN NULL;
      END IF;
    ELSE
      -- An already deleted record cannot be changed
      RETURN NULL;
    END IF;

  -- The new record needs its id created by inserting into the pk table
  WHEN TG_OP = 'INSERT' THEN
    INSERT INTO timetravel_pk (created)
      VALUES (clock_timestamp())
      RETURNING timetravelid
      INTO NEW.timetravelid;

  	RETURN NEW;
  ELSE
    RETURN NULL;
  END CASE;
END;
$$;

-- Attach the trigger function for inserts
CREATE TRIGGER timetravel_insert
  BEFORE INSERT
  ON timetravel
  FOR EACH ROW
  WHEN (pg_trigger_depth() < 1)
  EXECUTE PROCEDURE trigger_timetravel_in_upd()
;

-- Attach the trigger function for updates
CREATE TRIGGER timetravel_update
  BEFORE UPDATE
  ON timetravel
  FOR EACH ROW
  EXECUTE PROCEDURE trigger_timetravel_in_upd()
;

The Trigger For Deletions

The last thing we have to take care of is deleted records. We can only delete the data logically, otherwise time travel would not be possible.
You might have noticed that the table timetravel already has a column deleted. It defaults to false and cannot be changed on inserts or updates; the trigger function handles that for us. But now we need a trigger that is fired before a record gets deleted, and the job of that trigger is to turn the physical deletion into a logical one by inserting a new record where deleted is true.

You might also have noticed that the INSERT trigger has an additional condition after FOR EACH ROW: WHEN (pg_trigger_depth() < 1) ensures that the trigger is only executed when it is not fired from within another trigger. pg_trigger_depth() is documented in the PostgreSQL documentation under FUNCTIONS-INFO-SESSION-TABLE.
Without that condition, the trigger timetravel_insert would overwrite the deleted value and generate a new primary key for the rows inserted by the delete trigger.

Delete trigger:

-- The trigger inserts two records, one with the old data but with an end
-- timestamp in column changed, one with deleted = true but end timestamp
-- is INFINITY
CREATE OR REPLACE FUNCTION trigger_timetravel_del ()
  RETURNS TRIGGER
  LANGUAGE PLPGSQL
  AS
$$
DECLARE
  ts timestamp with time zone;
BEGIN
  -- When a record has already been deleted, an error is raised and no data is changed
  IF OLD.deleted THEN
     RAISE EXCEPTION 'Timetravel record with the timetravelid % is already deleted.', OLD.timetravelid;
  END IF;

  IF upper(OLD.changed) = 'infinity' THEN
  -- Only the latest version has to be taken care off
    ts = clock_timestamp();

    INSERT INTO timetravel (timetravelid, changed, data_text, data_json, deleted)
    VALUES (
      OLD.timetravelid,
      tstzrange(ts, 'INFINITY'),
      OLD.data_text,
      OLD.data_json,
      true
    ),
    (
      OLD.timetravelid,
      tstzrange(lower(OLD.changed), ts),
      OLD.data_text,
      OLD.data_json,
      false
    );

    RETURN OLD;
  ELSE
    -- All other records stay as they are
    RETURN NULL;
  END IF;
END;
$$;

-- Attach the trigger function for deletions
CREATE TRIGGER timetravel_delete
  BEFORE DELETE
  ON timetravel
  FOR EACH ROW
  EXECUTE PROCEDURE trigger_timetravel_del()
;

Now we do have everything in place to have the data automatically partitioned, data also can not be deleted, and we can create some views to access the data at certain points in time.

Inserting Data

To play with the time traveling, we need some data in the timetravel table. The statement will insert 100,000 records.

Inserting data:

-- Insert records
INSERT INTO timetravel (data_text, data_json)
SELECT substr(md5(random()::text), 1, 25) AS data_text
	, to_jsonb(substr(md5(random()::text), 1, 25)) AS data_json
FROM generate_series(1, 100000) s(i)
;

The update will change records with randomly picked ids. You may run it several times to generate several versions of each record.

Updating data:

-- Update existing records
WITH t1 AS
	(
		SELECT floor(random() * (100000-1+1) + 1)::bigint AS timetravelid
			, substr(md5(random()::text), 1, 25) AS data_text
			, to_jsonb(substr(md5(random()::text), 1, 25)) AS data_json
		FROM generate_series(1, 100000) s(i)
	)
, t2 AS
	(
		SELECT DISTINCT ON (timetravelid) timetravelid
			, data_text
			, data_json
		FROM t1
	)
UPDATE timetravel SET
	data_text = t2.data_text,
	data_json = t2.data_json
FROM t2
WHERE timetravel.timetravelid = t2.timetravelid
;

Deleting data:

-- Delete some records
DELETE FROM timetravel WHERE timetravelid IN (99, 654, 5698);

SQL Statements To Access The Data

Checking the existing amount of data:

-- Table statistics
WITH rec_v AS
  (
    SELECT t.timetravelid
      , ROW_NUMBER() OVER (PARTITION BY t.timetravelid ORDER BY lower(t.changed)) AS rec_version
    FROM timetravel AS t
  )
SELECT count(t.timetravelid) AS recordcount
  , min(t.timetravelid) AS min_primary_key
  , max(t.timetravelid) AS max_primary_key
  , min(rec_v.rec_version) AS min_version_number
  , max(rec_v.rec_version) AS max_version_number
  , count(t.timetravelid) FILTER (WHERE t.deleted) AS deleted_records
FROM timetravel AS t
INNER JOIN rec_v
  ON t.timetravelid = rec_v.timetravelid
;

The results after having updated four times:

| recordcount | min_primary_key | max_primary_key | min_version_number | max_version_number |
|---|---|---|---|---|
| 4347004 | 1 | 100000 | 1 | 4 |

The recordcount might differ on your system, as the updated timetravelid values are generated randomly.

Getting data valid on a certain point in time:

-- Result for a certain point in time
SELECT t.timetravelid
  , ROW_NUMBER() OVER (PARTITION BY t.timetravelid ORDER BY lower(t.changed)) AS rec_version
  , t.data_text
  , t.data_json
  , pk.created
  , t.changed
  , lower(t.changed) AS valid_from
  , upper(t.changed) AS valid_until
  , t.deleted
FROM timetravel_pk AS pk
INNER JOIN timetravel AS t
  ON pk.timetravelid = t.timetravelid
WHERE '2024-11-16 09:45:06.821 +0100'::timestamptz <@ t.changed
ORDER BY t.timetravelid
  , lower(t.changed)
;

You will likely have to adjust the WHERE clause to use a timestamp that exists in your table.
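The point-in-time query can also be wrapped into a small helper function, so the timestamp only has to be passed as a parameter. This is a sketch of one possible wrapper (timetravel_as_of is a hypothetical name, not part of the script above):

```sql
-- Hypothetical helper: all record versions valid at a given point in time
CREATE OR REPLACE FUNCTION timetravel_as_of (in_ts TIMESTAMPTZ)
  RETURNS SETOF timetravel
  LANGUAGE SQL
  STABLE
  AS
$$
  SELECT t.*
  FROM timetravel AS t
  WHERE in_ts <@ t.changed
    AND NOT t.deleted
  ORDER BY t.timetravelid
$$;

-- Usage, again with a timestamp that exists in your table:
SELECT * FROM timetravel_as_of('2024-11-16 09:45:06.821 +0100');
```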

View

We can use the previous statement as the base of a view; we remove the point-in-time WHERE clause and the ORDER BY clause and instead restrict the view to the latest, not deleted version of each record.

Create a view:

CREATE OR REPLACE VIEW timetravel_v AS
WITH rec_v AS
  (
    SELECT t.timetravelid
      , ROW_NUMBER() OVER (PARTITION BY t.timetravelid ORDER BY lower(t.changed)) AS rec_version
    FROM timetravel AS t
  )
SELECT DISTINCT ON (t.timetravelid)
  t.timetravelid
  , rec_v.rec_version
  , t.data_text
  , t.data_json
  , pk.created
  , t.changed
  , lower(t.changed) AS valid_from
  , upper(t.changed) AS valid_until
FROM timetravel_pk AS pk
INNER JOIN timetravel AS t
  ON pk.timetravelid = t.timetravelid
INNER JOIN rec_v
  ON pk.timetravelid = rec_v.timetravelid
WHERE NOT deleted
AND upper(t.changed) = 'infinity'::TIMESTAMPTZ
;

There is still one problem with that view: due to the calculated columns in it, it is not possible to insert new data or update existing data directly.
To achieve this we can create another trigger function, which is fired by INSTEAD OF INSERT, INSTEAD OF UPDATE, and INSTEAD OF DELETE triggers attached to the view.

-- One trigger to bind them all
CREATE OR REPLACE FUNCTION trigger_timetravel_view_in_upd_del ()
  RETURNS TRIGGER
  LANGUAGE PLPGSQL
  AS
$$
BEGIN
  CASE
    WHEN TG_OP = 'INSERT' THEN
      EXECUTE 'INSERT INTO timetravel (data_text, data_json) VALUES ($1, $2)'
        USING NEW.data_text, NEW.data_json;
    WHEN TG_OP = 'UPDATE' THEN
      EXECUTE 'UPDATE timetravel SET data_text = $1, data_json = $2 WHERE timetravelid = $3'
        USING NEW.data_text, NEW.data_json, OLD.timetravelid;
    WHEN TG_OP = 'DELETE' THEN
      EXECUTE 'DELETE FROM timetravel WHERE timetravelid = $1'
        USING OLD.timetravelid;
    ELSE
      RAISE EXCEPTION 'Operation not supported.';
  END CASE;

  -- For DELETE, NEW is NULL; OLD has to be returned to report the row as processed
  IF TG_OP = 'DELETE' THEN
    RETURN OLD;
  END IF;

  RETURN NEW;
END;
$$;

-- Attach the trigger function to the view
CREATE OR REPLACE TRIGGER timetravel_v_trigger
	INSTEAD OF INSERT OR UPDATE OR DELETE
	ON timetravel_v
	FOR EACH ROW
	EXECUTE PROCEDURE trigger_timetravel_view_in_upd_del ()
;
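With the triggers attached, the view can be used like a plain table. A quick usage sketch (the id 1 is just an example value):

```sql
-- Insert through the view; the INSTEAD OF trigger routes it to the base table
INSERT INTO timetravel_v (data_text, data_json)
VALUES ('hello', to_jsonb('world'::text));

-- Update through the view; behind the scenes a new version is created
UPDATE timetravel_v
SET data_text = 'changed'
WHERE timetravelid = 1;

-- Delete through the view; the record is only deleted logically
DELETE FROM timetravel_v
WHERE timetravelid = 1;
```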

Possible Future

The SQL:2011 standard added historical records with the key words AS OF. This is currently not implemented in PostgreSQL, up to and including version 17. It is labeled under Temporal features; the link goes to a publication about it.

In the commitfests for PostgreSQL 17 a patch was introduced to extend PostgreSQL towards the standard. That patch did not make it into PostgreSQL 17, but it may land in a future major version.
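For illustration, this is roughly what such a query looks like in databases that already implement system-versioned tables from the standard (for example Db2 or MariaDB); it is not valid in PostgreSQL 17:

```sql
-- SQL:2011 system-versioned query syntax (NOT valid in PostgreSQL 17)
SELECT *
FROM timetravel
FOR SYSTEM_TIME AS OF TIMESTAMP '2024-11-16 09:45:06';
```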

Downloading The SQL Script

The SQL script is available for download, to spare you copying the code.

I hope you enjoyed this long blog post.



Author: Stefanie Janine Stölting