PostgreSQL B-Tree Index Optimizations
Up to 49.6% query speedup in PostgreSQL 17.4 B-tree indexes. Implemented linear search optimization for small leaf pages and async prefetching for range scans. Improved 37% of JOB benchmark queries.
Problem
B-tree indexes in PostgreSQL use binary search uniformly across all leaf page sizes, creating unnecessary overhead for small pages where linear scans would be more efficient. Additionally, range scans don't prefetch subsequent pages, causing I/O stalls during sequential access patterns.
Approach
Implemented two core optimizations: (1) Linear search optimization that replaces binary search with linear scanning for leaf pages containing ≤4 items, reducing algorithmic overhead and improving CPU cache locality; (2) Asynchronous leaf-page prefetching that uses PostgreSQL's native PrefetchBuffer() API to overlap I/O operations with CPU processing during range scans. Both optimizations are configurable via GUC variables and backward compatible with no changes to index structures.
Impact
Evaluated on 113 complex analytical queries from the Join Order Benchmark (JOB), the combined optimizations improved 37.2% of queries (42/113) with an average speedup of 302.29ms. Linear search alone benefited 56.6% of queries with 80.41ms average reduction, while prefetching improved 31.0% of queries by 150.91ms on average. Best case achieved 49.6% speedup (1,066ms reduction) on query-intensive workloads.
Key Metrics
Technologies
Links
My Role
Co-developer - researched PostgreSQL internals, designed and implemented linear search optimization in C, modified core B-tree access methods (_bt_binsrch), integrated GUC configuration variables, developed benchmarking framework with Docker, and led performance analysis across 113 analytical queries. Collaborated with Soumya who implemented the prefetch optimization and contributed to benchmark scripts and testing infrastructure.
Team Size: 2 people