PostgreSQL vs ParadeDB: A Search Comparison
I wanted to compare PostgreSQL's built-in search capabilities with ParadeDB. Built a benchmark using 1.6M Amazon products to test different search scenarios and write operations. Here are the results.
The Setup
Both running PostgreSQL 17. Vanilla PG needed 11 indexes (GIN for full-text, pg_trgm for fuzzy). ParadeDB uses one BM25 index.
Read Performance Results
Tested five search types with multiple queries each:
Full-Text Search
- PostgreSQL: 401ms average
- ParadeDB: 92ms average
- ParadeDB is 4.4x faster with slightly higher relevance scores
Fuzzy Search (Handling typos)
- PostgreSQL: 22,838ms average
- ParadeDB: 139ms average
- ParadeDB is 164x faster
Field-Specific Search
- PostgreSQL: 4,399ms average
- ParadeDB: 90ms average
- ParadeDB is 48x faster
Boolean Queries
- PostgreSQL: 7ms average
- ParadeDB: 2ms average
- ParadeDB is 3.5x faster
Exact Phrase
- PostgreSQL: 6ms average
- ParadeDB: 89ms average
- PostgreSQL is 14x faster
Implementation Complexity
PostgreSQL Setup
Requires understanding of ts_vector, ts_rank, GIN indexes, and pg_trgm. Each search type needs different query syntax. Performance depends heavily on having all 11 indexes properly configured.
ParadeDB Setup
Uses one BM25 index and consistent query syntax (products @@@ 'search term'
) for all search types.
Query Complexity
PostgreSQL full-text query:
WHERE to_tsvector('english', COALESCE(title, '') || ' ' ||
COALESCE(description, '') || ' ' || COALESCE(brand, ''))
@@ plainto_tsquery('english', 'wireless headphones')
ParadeDB:
WHERE products @@@ 'wireless headphones'
PostgreSQL requires more complex queries that vary by search type, while ParadeDB maintains consistent syntax.
Data Loading Performance
Both databases loaded 1.6M products:
- PostgreSQL: Data load + 11 indexes = ~7 minutes
- ParadeDB: Data load + BM25 index = ~3 minutes
ParadeDB's single index builds faster than PostgreSQL's multiple indexes.
Search Relevance
Measured relevance scores for search results. ParadeDB averaged 6 points higher for full-text search. Both databases returned similar relevance for field-specific searches.
Performance Summary
PostgreSQL performs better for:
- Exact phrase matching (14x faster)
- Single-row inserts (3.6x faster)
- Single-row updates (3.1x faster)
- Existing PostgreSQL deployments without extension support
ParadeDB performs better for:
- Fuzzy search (164x faster)
- Field-specific searches (48x faster)
- Full-text search (4.4x faster)
- Bulk data imports (8.4x faster for large batches)
- Index rebuilds (9x faster)
Write Performance Results
Tested inserts and updates with all indexes in place:
Batch Operations
- Small batches (100 rows): PostgreSQL is 2.2x faster (1,846 vs 839 rows/sec)
- Medium batches (1000 rows): ParadeDB is 5.4x faster (11,028 vs 2,056 rows/sec)
- Large batches (5000 rows): ParadeDB is 8.4x faster (18,070 vs 2,160 rows/sec)
Single Row Operations
- Single inserts: PostgreSQL is 3.6x faster (1,177 vs 330 rows/sec)
- Updates: PostgreSQL is 3.1x faster (854 vs 275 rows/sec)
- Bulk updates: PostgreSQL is 3.2x faster (867 vs 267 rows/sec)
Index Rebuilds
- PostgreSQL: 63.9 seconds
- ParadeDB: 7.1 seconds
ParadeDB performs better for bulk operations, PostgreSQL for single-row operations.
Use Case Recommendations
ParadeDB might be suitable when:
- Search performance is critical
- You need fuzzy search/autocomplete functionality
- Your workload involves bulk data imports
- You prefer simpler search implementation
- Single-row write performance is not critical
PostgreSQL might be suitable when:
- You need consistent single-row write performance
- Your searches are primarily exact phrase matching
- You have high-frequency transactional workloads
- Extension installation is not possible
- Fuzzy search is not required
Key Findings
- Bulk write performance: ParadeDB is 8.4x faster for large batch inserts
- Index rebuild time: ParadeDB rebuilds indexes 9x faster (7s vs 64s)
- Fuzzy search performance: 164x difference (22.8s vs 139ms)
- Index dependency: PostgreSQL performance heavily depends on having the right indexes
- Search relevance: ParadeDB scores 5-6 points higher on relevance metrics
Bottom Line
For search workloads, ParadeDB destroys PostgreSQL. The 164x speed difference on fuzzy search isn't fixable - PostgreSQL with 11 perfectly tuned indexes still takes 22+ seconds.
But here's the twist: ParadeDB is actually FASTER for bulk writes too. 8.4x faster for large batches. Only loses on single-row operations.
The patterns are clear:
- Need search? ParadeDB, no question
- Bulk data loads? ParadeDB again (8.4x faster)
- High frequency OTLP/Single inserts? PostgreSQL (3.6x faster)
- Mixed workload? PostgreSQL primary, ParadeDB replica
The biggest surprise? I expected ParadeDB to be slower for all writes. Wrong. It's only slower for single-row operations. For bulk operations, it's significantly faster.
Perfect PostgreSQL indexes can't fix algorithm differences. pg_trgm scanning millions of trigrams will never match BM25's inverted index. 22 seconds vs 139ms tells that story.
Github repo: https://github.com/VineethReddy02/pg_search_benchmark