PMR vs std::vector: When “More Advanced” Gets Slower
Modern C++ gives us incredibly powerful tools, and Polymorphic Memory Resources (PMR) is one of the most exciting additions from C++17.
Custom allocators, memory pools, stack-backed buffers: it all sounds like a guaranteed performance win.
But as always in performance engineering…👉 it depends.
In this post, I ran a simple benchmark to answer a practical question:
Does using PMR with std::vector and std::string actually make things faster?
The answer is surprisingly nuanced, and in one case, shockingly worse.
The Benchmark Setup 🧪
We tested push_back performance across four combinations:
vector<string>
vector<pmr::string>
pmr::vector<string>
pmr::vector<pmr::string>
We evaluated each combination under two conditions:
1. Short strings (SSO applies)
2. Long strings (no SSO, heap allocation required)
To isolate insertion cost:
reserve() was used
We avoided measuring reallocation
Focus was purely on object construction + insertion
Case 1 — Short Strings (SSO) 🟢

For short strings, all implementations perform almost identically.
Why?
Because of Small String Optimization (SSO).
With SSO:
The string does not allocate on the heap
Data is stored directly inside the object
| Type | Relative Performance |
| --- | --- |
| All variants | ~1.0x – 1.2x |
Insight
When no allocation happens, PMR cannot help.
In fact:
PMR introduces a small overhead of its own
But there is nothing for it to optimize
So the result is predictable: 👉 No meaningful gain, slight overhead
Case 2 — Long Strings (Real Allocations) 🔴

This is where things get interesting.
Now:
Strings allocate memory
Allocators matter
Cache locality matters
Pooling can make a difference
Winner — vector<pmr::string> 🏆
std::vector<std::pmr::string>
Why it wins:
Strings allocate from a pool (PMR resource)
Allocation becomes cheaper for the strings, which are the hot path
Better locality (especially with stack-backed buffers)
Vector itself remains simple and efficient
Result
👉 ~30% faster than vector<string>
Baseline — vector<string> 🥈
std::vector<std::string>
Still performs very well:
Highly optimized implementation
Efficient growth strategy
Good baseline for comparison
pmr::vector<string> 🥉
std::pmr::vector<std::string>
Why it underperforms:
The vector uses PMR, but we already call reserve()
So the vector’s own allocation cost is negligible anyway
Strings still allocate normally → the real bottleneck remains
We add overhead without solving the problem
Worst — pmr::vector<pmr::string> 💀
std::pmr::vector<std::pmr::string>
Result
👉 ~3.14× slower than the fastest case
Yes… slower. A lot slower.
Why is pmr::vector<pmr::string> so slow?
This is the key insight of the entire benchmark.
You are stacking allocator-aware abstractions twice:
1. The container (pmr::vector)
Uses polymorphic_allocator
Uses allocator-aware construction
Adds runtime indirection
2. The element (pmr::string)
Also allocator-aware
Also uses memory_resource
Also adds indirection
What happens on each push_back
Each insertion now involves:
Constructing a pmr::string
Allocating its internal buffer via PMR
Inserting into a PMR-aware vector
Triggering allocator-aware construction paths
Passing allocators through uses_allocator mechanisms
Going through polymorphic dispatch (memory_resource)
👉 You pay overhead at multiple layers
The important realization
PMR is not free - it’s a tradeoff.
When used correctly:
It reduces allocation cost
When overused:
It adds abstraction overhead
It prevents compiler optimizations
It introduces runtime dispatch
The Real Lesson 💡
This benchmark leads to a very practical takeaway:
💥 Optimize where the cost actually is - not everywhere.
✔ Good usage
std::vector<std::pmr::string>
Optimize string allocation (the hot path)
Keep container simple
Over-engineering
std::pmr::vector<std::pmr::string>
Optimize everything
Pay for everything
Lose performance
Mental Model 🧬
Think in layers:
vector allocation → small cost (after reserve)
string allocation → big cost (for long strings)
PMR overhead → non-zero cost
👉 Optimize the dominant cost, not all costs.
🏁 Final Thoughts
PMR is an excellent tool:
Memory pools
Custom allocation strategies
Control over memory layout
But like all powerful tools: 👉 it requires precision