
[Postgres] Solve PostgreSQL DataFileRead and buffer_io with Parameter Tuning


Solve PostgreSQL DataFileRead and buffer_io with Parameter Tuning | by monika yadav | Medium

This is an epic case study on solving DataFileRead and buffer_io wait events without changing disk hardware or IOPS, and without creating any new index!

Problem Walk Through:

You can see from the image attached above that our average active sessions (AAS) were going up to 60, which is a very high number. An active session is a database connection that has submitted a request to the database but has not yet received the response. It means that in our case, on average, 30 to 60 sessions were waiting for a response from the database server.

In Performance Insights, the database load chart legend in the top-right corner has an option to select the chart type (bar / line) and to slice by (application / database / host / session_type / user / sql / waits). We sliced the chart by waits with the bar type, which means the bars show the percentage of time spent by sessions in particular PostgreSQL wait events.

In our case, the major wait events, as can be seen, were:

1) DataFileRead: This event occurs when a connection waits on a backend process to read a required page from storage because the page isn’t available in shared memory.

2) buffer_io: This event occurs when many connections wait for pages to be read into the shared buffer pool. As soon as the pages are read into the shared buffer pool, the wait event is released.

Taken together, these two events mean our shared buffers weren't sufficient for the current workload, leading to buffer_io waits and DataFileRead reads to bring pages into shared buffers. We picked the queries that accounted for 95% of the waits on both events.
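The post reads this breakdown from Performance Insights; a rough database-side equivalent is to sample pg_stat_activity directly. A minimal sketch, not the author's exact method:

-- Sample which wait events active sessions are currently stuck on.
SELECT wait_event_type, wait_event, count(*) AS sessions
FROM pg_stat_activity
WHERE state = 'active' AND wait_event IS NOT NULL
GROUP BY wait_event_type, wait_event
ORDER BY sessions DESC;

If DataFileRead and BufferIO dominate repeated samples, that points to the same shared-buffer pressure described here.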

Possible Demons:

The possible demons leading us into this situation could be one or more of the following:

  1. Queries doing sequential scans of tables and causing havoc on the database. In our case the problematic query wasn't doing sequential scans (a quick way to check this is sketched after this list).
  2. Queries might be using an index, but the index definition isn't the best fit. In our case the index definition and usage were good.
  3. shared_buffers isn't sufficient and PostgreSQL needs more shared buffers, or maybe a larger overall memory allocation.
  4. Heavy write operations, ongoing index builds, or vacuum activity that would have slowed things down. Fortunately this was also not the case for us.
  5. A high number of read operations that would have aggravated the situation. This was also not the case for us.
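For demon #1, a minimal sketch of the check using the standard statistics views (not taken from the original post):

-- Tables that are sequentially scanned more often than index-scanned,
-- ordered by how many rows those sequential scans had to read.
SELECT relname, seq_scan, idx_scan, seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > COALESCE(idx_scan, 0)
ORDER BY seq_tup_read DESC
LIMIT 20;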

Problem Root Cause:

So what was the root cause in our case? What do you think?

?

?

?

It was an undersized shared_buffers.

Possible Solutions:

  1. Increasing the shared buffers size.
  2. Optimising the shared buffers content.

Shared buffers size can be increased by:

  • Increasing the shared_buffers parameter value, which involves downtime since the setting takes effect only after a restart (see the sketch after this list).
  • Upscaling the RDS instance size, which involves both downtime and increased cost.
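For a self-managed PostgreSQL, the first option would look roughly like the sketch below; the value is a placeholder, not a recommendation, and on RDS the same change goes through the parameter group instead:

-- Writes the new value to postgresql.auto.conf; it takes effect only
-- after the server is restarted, which is the downtime mentioned above.
ALTER SYSTEM SET shared_buffers = '8GB';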

I started looking into optimising shared buffers.

I picked and analyzed the query with the highest number of waits on buffer_io and DataFileRead. Below is the explain plan of the query.

You will see that 99.98% of the query's time is spent in the index scan. Also, actual time=43.760..6346.829 means it took about 44 ms for the index scan to return its first row and about 6.3 seconds to return the last one. The shared buffer hit ratio is only 20%; the formula to calculate the shared buffer hit ratio is (shared hit * 100) / (shared hit + shared read). I suspected index bloat as the issue here.
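For reference, the buffer counters in such a plan come from running the statement with buffer accounting enabled. The table and filter below are placeholders, not the actual query from this case:

-- Shows shared hit / read counts per plan node, which is where the
-- 20% hit ratio above is computed from.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM some_table
WHERE some_column = 'some_value';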

After analysing the bloat of the indexes associated with the problematic queries, I figured out that our indexes were 50% bloated. I have added references at the bottom of the page for finding the bloat percentage of your own indexes.

I first reindexed the indexes used in our problematic query, and it improved the performance. I also reindexed the other bloated indexes, and here is the query execution plan after that:
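Assuming PostgreSQL 12 or later, the reindexing can be done without blocking writes, roughly as below; the index name is a placeholder, not necessarily the exact command used in this case:

-- Rebuilds the bloated index without taking a write-blocking lock.
REINDEX INDEX CONCURRENTLY some_index_name;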

Permanent Fix:

High index bloat is caused by insufficient autovacuum settings. In our case, the volume of write operations was very high and autovacuum wasn't running frequently enough to keep bloat from growing. To fix the issue permanently, I made the autovacuum settings more aggressive.

Default autovacuum settings were:

  • autovacuum_vacuum_scale_factor: 0.1
  • autovacuum_analyze_scale_factor: 0.05
  • autovacuum_vacuum_cost_delay: 20ms

New customised autovacuum settings are:

  • autovacuum_analyze_scale_factor: 0.005
  • autovacuum_vacuum_scale_factor: 0.01
  • autovacuum_vacuum_cost_delay: 30ms
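To put the scale factors in perspective: autovacuum vacuums a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. Assuming the default threshold of 50 and a hypothetical table of 10 million rows, the old scale factor of 0.1 required roughly 1,000,050 dead rows to accumulate before a vacuum, while the new 0.01 triggers one after roughly 100,050, about ten times sooner.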

How to change autovacuum settings:

The above settings can be changed system-wide from the associated parameter group in RDS, or in the postgresql.conf configuration file for an on-prem / VM installation.
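On a self-managed server the same change can also be made with ALTER SYSTEM; these parameters only need a configuration reload, not a restart. The values below mirror the ones above:

-- Applies cluster-wide after a reload, e.g. SELECT pg_reload_conf();
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.01;
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.005;
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '30ms';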

I wanted to change these settings only for a specific table, so I went ahead with the ALTER TABLE command:

-- Avoid waiting indefinitely for the table lock needed by ALTER TABLE.
SET lock_timeout = '300ms';

-- autovacuum_vacuum_cost_delay is given in milliseconds here; storage
-- parameters take a plain number rather than a unit string like '30ms'.
ALTER TABLE table_name SET (autovacuum_vacuum_scale_factor = 0.01, autovacuum_analyze_scale_factor = 0.005, autovacuum_vacuum_cost_delay = 30);
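A quick way to confirm that the per-table settings took effect (a sketch, not from the original post):

-- Shows the storage parameters now attached to the table.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'table_name';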

By fixing the bloat and tuning the autovacuum settings, we reduced the number of sessions waiting on these events from 30–60 to fewer than 2.

I will write a very detailed post on tuning autovacuum settings for specific needs in my next blog post. Till then, happy learning!

What Problems We Solved:

  • Improved application performance by 97%.
  • Saved the business $100 per month in cost.
  • It was done with zero downtime, so we avoided any business downtime.

References:

https://wiki.postgresql.org/wiki/Show_database_bloat
