A customer needed to upgrade their Aurora RDS Postgres from v13 to v16 due to v13's EOL. This is a common requirement and, on its own, a straightforward process.
The complexity arose when they also needed to upgrade their application model, which took a long time to apply because most of the tables required a full rewrite.
The Django migration took several hours to run, which made a blue/green upgrade with a subsequent migration run infeasible due to the long downtime it would incur.
Even though it is not covered in the official documentation, it is possible to avoid storing archives in PGBackRest setups. In certain scenarios you may want to skip storing archives to reduce costs, especially if you are using cloud storage for your backups.
Those situations could be:
Test environments, where you run intensive tests but don't need or care about PITR (Point in Time Recovery).
Test environments where you want to exercise your backup configuration itself, rather than relying only on logical dumps.
Highly updated Postgres databases, where you can recover changes by other means, such as scraping or restoring from external sources.
⚠️
Not storing archives may lead to data loss if you don't have a proper strategy for handling the deltas.
You still need to store archives during backup execution if you want online backups; otherwise, they won't be recoverable.
If you don't want to store archives at all, you need to run backups offline. That is, stop your instance and execute pgbackrest backup.
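As a rough sketch (the stanza name and service commands are assumptions, adjust them to your setup), an offline full backup without archiving looks like this:

```sh
# Stop Postgres so the data directory is consistent, then take an
# offline full backup. --no-online tells pgBackRest the cluster is down.
sudo systemctl stop postgresql
sudo -u postgres pgbackrest --stanza=main --type=full --no-online backup
sudo systemctl start postgresql
```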
As of today, PGCat supports two sharding mechanisms through its extended syntax:
By setting the shard explicitly: SET SHARD TO '<index>';, which allows you to do deterministic sharding, where you choose
your shard according to a rule, such as a lookup, region, customer group, etc.
This works best if you have one of those well-delimited groupings or an even distribution, but it is still a nice and fairly scalable approach either way.
We won’t be focusing on this strategy in this post, as its implementation relies on custom requirements.
Setting sharding_function to use one of the available functions: pg_bigint_hash and sha1. The extended syntax SET SHARDING KEY TO '<value>'; will calculate the shard index for you. It is not very clear from the docs how the sha1 function is used, so this post will focus on the pg_bigint_hash case.
Sharding by hash is a bold strategy, particularly if you expect a large workload, as you need enough compute across all shards.
This extended syntax can also be issued through comments; see the pgcat Sharding documentation.
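As a quick sketch of both variants of the extended syntax (the table and key values are made up for illustration):

```sql
-- Deterministic sharding: pick the shard index yourself.
SET SHARD TO '1';
SELECT * FROM users WHERE customer_id = 42;

-- Hash-based sharding: let PGCat derive the shard from the key
-- using the configured sharding_function (pg_bigint_hash here).
SET SHARDING KEY TO '42';
SELECT * FROM users WHERE customer_id = 42;

-- The same hints can also travel as comments; check the pgcat docs
-- for the exact comment format.
```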
In this laboratory, we'll focus on the pg_bigint_hash function. It is not clear from PGCat's documentation how sha1 should be implemented, but I'll extend the laboratory to cover it – that is, if I overcome my skill issues :P .
At this point, you may be aware of the complexities of implementing sharding, and what limitations we expect from the hashing approach.
Keep in mind that this PGCat feature is tied to Postgres hash partitioning, as it relies on the same HASH_PARTITION_SEED.
See also pgcat hash seed.
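To make that concrete, here is a minimal sketch (table, column, and shard count are assumptions) of a hash-partitioned table whose layout lines up with pg_bigint_hash routing:

```sql
-- Parent table partitioned by hash on the sharding key.
CREATE TABLE users (
    customer_id bigint NOT NULL,
    payload     jsonb
) PARTITION BY HASH (customer_id);

-- One partition per PGCat shard (3 shards assumed here).
CREATE TABLE users_0 PARTITION OF users FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE users_1 PARTITION OF users FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE users_2 PARTITION OF users FOR VALUES WITH (MODULUS 3, REMAINDER 2);
```

Because PGCat uses the same hash seed as Postgres hash partitioning, the shard it computes for a bigint key matches the remainder Postgres itself would assign.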
ℹ️
The laboratory for this post can be found at babelfishpg-lab.
⚠️
In this post, we cover version 4.2.0. The logging settings and format differ slightly from 4.1.1.
Even though BabelfishPG is a Postgres flavor, its configuration is done through its extensions (babelfish_tds and babelfish_tsql). In this post, we focus on how to log query timings and enable showing query plans.
For vanilla Postgres, you can log slow queries by setting log_min_duration_statement, but this won't log T-SQL statements: those queries execute through a different backend path, so statements arriving through the TDS protocol won't get logged.
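For reference, the vanilla Postgres knob looks like this (the threshold value is illustrative):

```sql
-- Log any statement slower than 500 ms (regular Postgres backends only;
-- statements coming in through the TDS port won't show up here).
ALTER SYSTEM SET log_min_duration_statement = '500ms';
SELECT pg_reload_conf();
```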
In pgbouncer's documentation it is stated that query_timeout should be set slightly
higher than Postgres' statement_timeout. Although this applies in most cases, it ultimately depends
on the business requirements.
Generally, statement_timeout should be set to the 99th percentile of your statement durations.
However, there are cases where certain statements require larger timeouts, due to particularities like
a large customer set or larger fields, as in TOAST compression cases.
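As a sketch under those assumptions (the values and the reporting role are made up), you can set a global statement_timeout near your p99 and override it only where the longer-running work lives:

```sql
-- Global cap, roughly the p99 statement duration.
ALTER SYSTEM SET statement_timeout = '30s';
SELECT pg_reload_conf();

-- Override for a role whose statements legitimately run longer.
ALTER ROLE reporting SET statement_timeout = '5min';

-- In pgbouncer.ini, keep query_timeout slightly above the Postgres cap,
-- e.g. query_timeout = 35 (seconds).
```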
Some things happen once in a lifetime, and the story behind this one is quite particular.
A customer required something that at first sounded counterintuitive: migrating an existing Postgres database to BabelfishPG.
The application was a critical part of the business core, with a large amount of code that would take years to fully migrate to another storage strategy. But the real reason was that their customer didn't want to be tied to any proprietary licensing model and required open source solutions.
The next post will cover performance tests using tdspool.
BabelfishPG connection architecture
Inherited from Postgres' connection architecture, each connection through the TDS port will
spawn a Postgres backend. As in Postgres, BabelfishPG needs a middleware to funnel
connections coming through the TDS port, to avoid running out of connections and processing capacity
on the database server.
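Since every TDS session is just another Postgres backend, a plain query is enough to watch them accumulate (a small sketch, nothing Babelfish-specific):

```sql
-- Count live backends; TDS connections appear here like any other session.
SELECT count(*) AS backends FROM pg_stat_activity;
```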
For Postgres, we have plenty of options, like PGBouncer, Odyssey, pgcat, you name it.
For T-SQL (read: the MSSQL-compatible language), there aren't many open source solutions.