This has already been addressed in PostgreSQL core, but the option is still available in RDS.
What does it change, and why is it so important?
Tracking statistics per database, and not just per-table counters, isn't cheap in Postgres, but for some time there has been a workaround: mounting a ramdisk at the directory pointed to by the
stats_temp_directory GUC variable. That directory holds a
global.stat file and one per-database stats file named
db_<oidOfDB>.stat. Although the mechanism for writing these files avoids extra or unnecessary flushes, it is still very write intensive.
This change does not require any downtime (in standalone installations), as a simple reload forces the stats collector to rewrite the files in the new folder. There is a pretty clear blog post on putting stats_temp_directory on a ramdisk.
The problem is that RDS does not grant the privileges needed to inspect file or directory contents, so you cannot check the current size of the directory in order to choose a proper value. You may, however, want to know that this setting is limited to 1 GB in RDS.
If you don’t want any further details and you just want to relieve your storage, set it to 256 MB and continue with your life. Even though that is a large setting (the next paragraphs explain why), you don’t want to fall short on it.
After you apply the change to
pg_stat_ramdisk_size, you will see that the location in RDS has changed:

show stats_temp_directory;
   stats_temp_directory
---------------------------
 /rdsdbramdisk/pg_stat_tmp
TL;DR: What’s the expected size of the stats_temp_directory?
Before moving forward, let's detail the structure of the entries in the stats file:

| Field | Type |
| --- | --- |
| describers | char ('T' or 'F' in this case) |
First of all, as will be explained later, not all tables, indexes and functions are written to the per-database stats file. A basic formula is:
SizeOfDBStatFile = PGSTAT_FILE_FORMAT_ID + describers + (tableCount * PgStat_StatTabEntry) + (funcCount * PgStat_StatFuncEntry) + closingChar
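Plugging in the entry sizes listed later in this article (164 bytes per PgStat_StatTabEntry, 28 bytes per PgStat_StatFuncEntry), the formula can be sketched in Python. The helper name and the one-byte describer and closing-character assumptions are mine, not Postgres definitions:

```python
# Entry sizes taken from the struct breakdown later in this article.
PGSTAT_FILE_FORMAT_ID_BYTES = 4   # the format id is a 32-bit value
TAB_ENTRY_BYTES = 164             # PgStat_StatTabEntry
FUNC_ENTRY_BYTES = 28             # PgStat_StatFuncEntry
DESCRIBER_BYTES = 1               # one char ('T' or 'F') per entry
CLOSING_CHAR_BYTES = 1            # terminator at the end of the file

def estimate_db_statfile_bytes(table_count: int, func_count: int) -> int:
    """Rough upper bound for one db_<oid>.stat file (assumes every
    table/index and function has been flushed to the file)."""
    describers = (table_count + func_count) * DESCRIBER_BYTES
    return (PGSTAT_FILE_FORMAT_ID_BYTES
            + describers
            + table_count * TAB_ENTRY_BYTES
            + func_count * FUNC_ENTRY_BYTES
            + CLOSING_CHAR_BYTES)

print(estimate_db_statfile_bytes(1000, 100))  # ~164 KB for 1000 relations
```

Even a database with thousands of relations stays well under a megabyte per stats file, which is why the 1 GB RDS cap is generous.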
To estimate the space needed for the current tables of each database (keep in mind that this assumes all tables are flushed to the file), there is a query you can execute safely on each database in your PostgreSQL instance (there is one stats file per database):
Also, you need to do the same within
pg_proc, but with a factor of 28 bytes instead. You'll need to run this on every database and sum the results. This accounts for function-usage statistics, which can be
disabled from the
postgresql.conf file with the
track_functions variable. Also, all the aspects
of runtime statistics can be found here.
Structure of the global stats:
The global statfile is smaller, and contains only the global stats and the counters across databases. Its size should be something close to:
PGSTAT_FILE_FORMAT_ID + describer + PgStat_GlobalStats + PgStat_ArchiverStats + (PgStat_StatDBEntry + describer) * numDatabases
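Using the struct sizes listed at the end of this article (92, 114 and 180 bytes), and again assuming a one-byte describer, the global file estimate can be sketched the same way; the function name and constants are mine:

```python
# Struct sizes taken from the table at the end of this article.
FORMAT_ID_BYTES = 4
DESCRIBER_BYTES = 1
GLOBAL_STATS_BYTES = 92      # PgStat_GlobalStats
ARCHIVER_STATS_BYTES = 114   # PgStat_ArchiverStats
DB_ENTRY_BYTES = 180         # PgStat_StatDBEntry

def estimate_global_statfile_bytes(num_databases: int) -> int:
    """Rough size of global.stat: fixed header plus one
    (entry + describer) pair per database."""
    return (FORMAT_ID_BYTES + DESCRIBER_BYTES
            + GLOBAL_STATS_BYTES + ARCHIVER_STATS_BYTES
            + (DB_ENTRY_BYTES + DESCRIBER_BYTES) * num_databases)

print(estimate_global_statfile_bytes(10))  # ~2 KB for 10 databases
```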
So, as you can see, the limit imposed by AWS is well above the amount of data held in this directory for most of the databases you can expect to run inside RDS.
Why does it affect RDS?
Prior to this feature being added, the
stats_temp_directory lived on the persistent storage layer. This is the same default as in any other Postgres installation; however, due to the storage characteristics of RDS, the impact could be considered higher than in a standalone setup.
If your application is write intensive, you will see the impact on write latency and write operations.
A deeper look
So the question didn't take much time to appear around the community, and I wasn't the exception: is there a way to pre-calculate the contents of the directory?
I couldn't end up with an exact number; however, the size of the files is mostly driven by the number of tables, indexes, functions and databases. The following structures are the core of this implementation. They are so important that there is a defined
PGSTAT_FILE_FORMAT_ID which is also written into the stat files.
All the structures for these file contents are placed in the
include/pgstat.h header, and the implementation is in
postmaster/pgstat.c (as it is a startup worker). Every field used for counters is an
int64, and there are some
TimestampTz fields (64 bits too); the exception is Oid, which is represented by 32 bits (unsigned int).
Backends communicate with the collector through the
StatMsgType struct whenever the counters differ from a zeroed
PgStat_TableCounts struct; these structures are kept in backend-local memory while counts accumulate. So not every table, index and function will have an entry.
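The "only flush what differs from a zeroed struct" idea can be illustrated with a small sketch; these are Python stand-ins for the C structs, with illustrative field names, not the real pgstat code:

```python
from dataclasses import dataclass, astuple

@dataclass
class TableCounts:
    """Stand-in for PgStat_TableCounts, with a subset of its counters."""
    numscans: int = 0
    tuples_inserted: int = 0
    tuples_updated: int = 0

def should_send(counts: TableCounts) -> bool:
    # Backends skip tables whose counters equal a zeroed struct,
    # so untouched tables never get an entry in the stats file.
    return astuple(counts) != astuple(TableCounts())

print(should_send(TableCounts()))                    # untouched table
print(should_send(TableCounts(tuples_inserted=5)))   # table with activity
```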
Which backends can request a file write? All of them: regular backends, the archiver and the bgwriter. They all use the same structure for passing changes (PgStat_Msg).
There are two functions for writing (
pgstat_write_statsfiles and a per-database counterpart) and two for reading (
pgstat_read_statsfiles and its counterpart), each pair handling either the global stats file or the per-database files.
The HTAB structure is opaque, and it holds a hash map of the tables and functions to be collected. We don't care about the size of these maps, as they won't be written to the stats file anyway. The whole database entry is
22 × 64-bit values +
1 × 32-bit value, per database (180 bytes).
Overall, this is the size of each structure:

| Structure | Composition | Size |
| --- | --- | --- |
| PgStat_StatTabEntry | 20 × 64-bit values + 1 × 32-bit Oid | 164 bytes |
| PgStat_StatFuncEntry | 3 × 64-bit values + 1 × 32-bit Oid | 28 bytes |
| PgStat_GlobalStats | 11 × 64-bit values (8 bytes) + 1 × 32-bit value (4 bytes) | 92 bytes |
| PgStat_ArchiverStats | 4 × 8-byte values + 2 × 41-byte char arrays | 114 bytes |
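As a sanity check, the byte counts above follow from simple arithmetic; the field counts are the article's, the constants below are mine:

```python
OID_BYTES = 4      # Oid is a 32-bit unsigned int
INT64_BYTES = 8    # counters and timestamps are 64-bit

# Field counts as listed in the table above.
sizes = {
    "PgStat_StatTabEntry":  20 * INT64_BYTES + OID_BYTES,   # tables/indexes
    "PgStat_StatFuncEntry":  3 * INT64_BYTES + OID_BYTES,   # functions
    "PgStat_StatDBEntry":   22 * INT64_BYTES + OID_BYTES,   # per-database entry
    "PgStat_GlobalStats":   11 * INT64_BYTES + 4,           # one extra 32-bit field
    "PgStat_ArchiverStats":  4 * INT64_BYTES + 2 * 41,      # two 41-char buffers
}
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```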
Hope you enjoyed the article!