In this article, we will understand why vector push_back is slow in C++.
The main reason why push_back is slow is because of multiple reallocation of memory.
Every vector has vector::size and vector::capacity. vector::size gives the number of elements in the vector and vector::capacity gives the number of elements vector can store.
When we push data to the vector using push_back, there are two possibilities
- vector::size < vector::capacity
In this case, push_back put the new element at the end of the vector and increment size.
- vector::size == vector::capacity
In this case, there’s no space available for the new element. Thus, reallocation takes place and all the elements are moved from the old region of memory to the new memory locations. The new capacity is generally 1.5x or 2x times the previous capacity. After that, the new element is put at the end of the vector.
Thus, multiple push_back may result in multiple reallocations of the vector which degrades performance.
We can avoid multiple reallocation using vector::reserve.
vector<int> v; v.reserve(N);
The above piece of code first initializes a vector v and then reserve memory enough to store atleast N elements. In simple words, reserve(N) sets the capacity of the vector >= N. Remember, even tho the capacity of the vector is >= N, it’s size is still 0.
Note that vector::reserve only reserves memory. It doesn’t initialize memory. This means we cannot use v[i]. We still have to use push_back to push elements to the vector.
The advantage of this technique is that allocation only takes place once.
vector::reserve is efficient if we know the upper bound of the number of elements.
Even tho reallocation may only take place once, vector::push_back is still not the most efficient way to push elements to the vector.
vector::push_back performs a bound check, change end pointer, and increment vector size. Performing multiple operations just to push 1 element can affect performance
We can use other faster techniques to push data to the vector.
The above line of code initialize an array of size N and capacity >= N.
The advantage is we can directly access any element using by its index. v[i] simply access the ith index without checking bounds or changing the size of the vector. Updating vector using v[i] is more efficient than using push_back.
The disadvantage of this method is it cannot be used in all situations.
Leave Comment if you face any issue