Topic: std::set inplace operators yield 300% performance gain
Author: jking@incipient.com (James E. King, III)
Date: Mon, 30 Sep 2002 23:02:30 +0000 (UTC) Raw View
Sets are currently lacking in-place operators that can add significant
value. I have coded some inplace operators and found that I can
improve performance of set operations such as set_union,
set_symmetric_difference, etc. by 3 times.
I have written some tests to verify that this is indeed a good thing -
not just in beautifying set manipulation in C++ but also in optimizing
runtimes. Here are some results of a simple test case program I wrote
to illustrate the benefit of this in-place operator enhancement. This
code was compiled by the Microsoft Visual Studio .NET C++ compiler
with /O2 optimizations enabled and executed on a Dual Pentium 4 Xeon
1.8Ghz Windows 2000 Server with 512MB RAM. These are typical results
after 10 runs.
----------------------------------------------------------------------
c:\temp\stl\setoper\release>setoper
set theory time test
5000 loops of old style: 2.4 ms/loop (11954 ticks)
5000 loops of new style: 0.8 ms/loop (3938 ticks)
and let's make sure both code paths produce the same result: true
----------------------------------------------------------------------
The project and source is located at this URL:
http://www.khoromag.org/setoper.zip
I would really like to see this added to the next C++ standard spec.
Take, for example, a routine that applies two different set
operations to an input set. This fictitious function populates two
sets with values to be used as transforms on a set that the user
supplies. The first transform does a difference. The second one does
a union. The caller's set is passed in by reference so as to avoid
some memory copies.
----------------------------------------------------------------------
typedef std::set<std::string> string_set;
string_set& transform(string_set& xform)
{
string_set diffset = gen_diffs(xform);
string_set unionset = gen_union(xform);
string_set tmp; // stores the difference
set_difference(xform.begin(), xform.end(),
diffset.begin(), diffset.end(),
inserter(tmp, tmp.begin()));
xform.clear(); // clear original set so it can receive union
set_union(tmp.begin(), tmp.end(),
unionset.begin(), unionset.end(),
inserter(xform, xform.begin()));
return xform;
}
----------------------------------------------------------------------
I propose a set of operators for std::set that do in-place operations:
operator &= in-place set_intersection with right-hand operand
operator |= in-place set_union with right-hand operand
operator -= in-place set_difference with right-hand operand
operator ^= in-place set_symmetric_difference ...
as well as other operators that would make the code easier to read
(but not implemented in this example):
operator & a shortcut for set_intersection
operator | a shortcut for set_union
operator - a shortcut for set_difference
operator ^ a shortcut for set_symmetric_difference
This simplifies the code to an easier-to-read form:
----------------------------------------------------------------------
string_set& transform(string_set& xform)
{
string_set diffset = gen_diffs(xform);
string_set unionset = gen_union(xform);
string_set tmp;
tmp = xform - diffset;
xform.clear();
xform = tmp | unionset;
return xform;
}
----------------------------------------------------------------------
This likely incurs slightly additional overhead due to the creation of
a temporary set to be the return value of xform - diffset, followed by
a default operator = assignment, however it is much easier to read.
Using the in-place operators, we get even better looking code, and
also better performing code!
----------------------------------------------------------------------
string_set& transform(string_set& xform)
{
string_set diffset = gen_diffs(xform);
string_set unionset = gen_union(xform);
xform -= diffset;
xform |= unionset;
return xform;
}
----------------------------------------------------------------------
In this particular case, you avoid a using a temporary that is needed
to store results from the set_* templates, and you avoid a clear() as
well.
James E. King, III
Software Architect
Incipient, Inc.
230 Third Ave.
Waltham, MA. 02154
jking@incipient.com
----------------------------------------------------------------------
uset.h
----------------------------------------------------------------------
template<class _S> class uset : public set<_S>
{
public:
typedef uset<_S> _Myt;
inline _Myt& operator &= (const _Myt& right)
{
_Myt::iterator li = begin();
_Myt::const_iterator ri = right.begin();
while (ri != right.end() && li != end()) {
if (*li < *ri) { erase(li++); }
else if (*li > *ri) { ++ri; }
else { ++li; ++ri; }
}
if (ri == right.end())
erase(li, end());
return *this;
};
inline _Myt& operator |= (const _Myt& right)
{
_Myt::iterator li = begin();
_Myt::const_iterator ri = right.begin();
while (ri != right.end() && li != end()) {
if (*li < *ri) { ++li; }
else if (*li > *ri) { insert(*ri++); }
else { ++li; ++ri; }
}
if (ri != right.end())
insert(ri, right.end());
return *this;
};
inline _Myt& operator -= (const _Myt& right)
{
_Myt::iterator li = begin();
_Myt::const_iterator ri = right.begin();
while (ri != right.end() && li != end()) {
if (*li < *ri) { ++li; }
else if (*li > *ri) { ++ri; }
else { erase(li++); ++ri; }
}
return *this;
};
inline _Myt& operator ^= (const _Myt& right)
{
_Myt::iterator li = begin();
_Myt::const_iterator ri = right.begin();
while (ri != right.end() && li != end()) {
if (*li < *ri) { ++li; }
else if (*li > *ri) { insert(*ri++); }
else { erase(li++); ++ri; }
}
if (ri != right.end())
insert(ri, right.end());
return *this;
};
};
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]