2014-12-16 - unordered array

recently i was in a need of fast and compact container to keep data, that changes very often. amount of insertions, removals and searches were pretty equal and very frequent. fortunately size was expected to be small - tens to hundreds of elements. of course the most cache-friendly data structure is linear array (read: vector). can we do even better?

assumptions

generally vector is fast, but removal of elements from the middle is a bit more painful. to deal with it i've decided to create an array that by design does not guarantee any particular order, after insertion and removal. when the size does not change, elements can be reorder, in terms of swapping/assigning. it's just size-changing operations that destroy any order. also all size-modifying operations invalidate all iterators. when container shrinks, unlike the vector, it can shrink as well. i've called the container unordered array (available as a part of but library).

operations

since no particular order is assumed, addition is simple. it means appending one more element at the end.

removal is also done in situ as well. instead of moving all elements from the given position until the end, one place backwards (as in vector), last element is being moved to the position of the element to be removed and the stub of last element is being removed.

searching for an element is done in a linear fashion, as usual.

operation UnorderedArray vector
add O(1) O(N)
remove O(1) O(N)
find O(N) O(N)

test scenario

to see how it behaves i've done some measurements. they are all done using algorithm, that reassembles my actual use case. it goes like this:

const vector<int> valuesToAdd = /*...*/;
Container c;
c.reserve( valuesToAdd.size() );
// fill
for(auto v: valuesToAdd)
  c.add(v);
// search and remove
const auto queries = randomizeOrder(valuesToAdd);
for(auto q: queries)
{
  const auto it = std::find( c.begin(), c.end(), q );
  c.erase(it);
}

check out an actual implementation of test case for details.

for measurements i've selected several containers to compare with UnorderedArray:

  • vector – to see if it is much slower.
  • deque – to check if it is significantly slower.
  • set – to compare with balanced search tree.
  • unordered_set – to compare with hash table and amortized O(1) access.
  • list – just for the record… ;)

measurements

all data points all data points, without the list 0-2k range 0-2k range w/o list and deque

all measurements, containing list clearly show that… list just gets in the way. ;) there is just one interesting point there, that shows sudden speedup of the list around 7k elements – looks like the list is not only slow but also haunted. ;) just stay away from it.

unordered_set was the fastest container for more than ~1.2k elements. up to that point, UnorderedArray was the fastest container there is. in terms of speed, next non-associative container is vector. here is a comparison of these two:

UnorderedArray vs. std::vector for small number of elements

final remarks

for my use case UnorderedArray was a clear winner. however please keep in mind that this comparison is far from being comprehensive. it shows a particular test case only, that i happened to need. it also does not include neither different compilers nor hardware platforms. consider it an inspiration more than an oracle… ;)

blog/2014/12/16/unordered_array.txt · Last modified: 2014/12/16 19:58 by basz
Back to top
Valid CSS Driven by DokuWiki Recent changes RSS feed Valid XHTML 1.0