Handling BLOBs In PostgreSQL


An overview over BLOBs in PostgreSQL

Published on October 16, 2024 by Stefanie Janine Stölting

PostgreSQL BLOB BYTEA

2 min READ

BLOBs In PostgreSQL

Implementation

PostgreSQL does not have a BLOB data type as specified in the SQL standard. The nearest implementation is the data type BYTEA. Since PostgreSQL 9.0 it does handle data by standard as hexadecimal data.

Limitations

BYTEA has a limit of max 1 GB of data.

Binary data cannot be indexed or searched for content. They can be inserted, updated (fully replaced), or deleted.

What Can Be Stored In 1 GB

You can store up to 3,000 jpg images with a size of up to 5 MB in 1 GB.
An English protestant bible contains roughly 809,000 letters. That results in 1,236 bible texts that can be stored in 1 GB.

That should give you an impression what 1 GB of data is.

More Than 1 GB BLOBs

Binary data can be stored in files, that PostgreSQL is referencing to. That way the binary files are stored on the file system and can be created, updated, or deleted. The functions that can be used are documented, the usage is supported in ODBC, .NET, JDBC, or libpq.

The files are referenced in a table with an OID.

Limitations

But this method has limits, too. 4 TB is what PostgreSQL can handle as maximum per file. The maximum size of all objects is 32 TB

As OIDs are 32 bit numbers, there is a maximum of files, that can be handled, it is 2^32.

Documentation

In addtion to the documentation about Binary Data Types there is a PostgreSQL Wiki Page.
There are listed some pros and cons about BLOBs in databases.

Why I Believe You Should Not Use BLOBs In A Database

Even while the PostgreSQL Wiki](https://wiki.postgresql.org/wiki/BinaryFilesInDB) has some pros, I believe the reasons that are speaking against database BLOBs are more the important ones.

Getting files through databases involves another layer handling the data: the database.
That is obviously comming with cost:

  • It requires more memory than normal data read/writes
  • Backups will take significantly longer
  • Longer transactions for accessing or storing binary data
  • May become more complicated to handle files as usually a temporary file is copied to the client and, when neccessary, copied back
  • Binary data is limited in size per object
  • Binary data is limited in size overall

In addition there are advantages to have files on the file system:

  • The number of files is only limited by the file system
  • The size of files is only limited by the file system
  • Performance increases compared to getting them through database layer

The best method is not to store binary data in a database at all. Store them in a file system and handle the information insided the database about where the file is stored. You can also add descriptions to the file record. That way it is possible to search the documentations of files with full text search (FTS).

There is one exception where I might store BLOBs in a database. That is when the amount of them is not that big, some thousands might be OK, and they are small, which would be a maximum of 100 kb.



Author: Stefanie Janine Stölting