Topic: std::set inplace operators yield 300% performance gain


Author: jking@incipient.com (James E. King, III)
Date: Mon, 30 Sep 2002 23:02:30 +0000 (UTC)
Raw View
Sets are currently lacking in-place operators that can add significant
value.  I have coded some inplace operators and found that I can
improve performance of set operations such as set_union,
set_symmetric_difference, etc. by 3 times.

I have written some tests to verify that this is indeed a good thing -
not just in beautifying set manipulation in C++ but also in optimizing
runtimes. Here are some results of a simple test case program I wrote
to illustrate the benefit of this in-place operator enhancement.  This
code was compiled by the Microsoft Visual Studio .NET C++ compiler
with /O2 optimizations enabled and executed on a Dual Pentium 4 Xeon
1.8Ghz Windows 2000 Server with 512MB RAM.  These are typical results
after 10 runs.

----------------------------------------------------------------------
c:\temp\stl\setoper\release>setoper
set theory time test

5000 loops of old style: 2.4 ms/loop (11954 ticks)
5000 loops of new style: 0.8 ms/loop (3938 ticks)
and let's make sure both code paths produce the same result: true
----------------------------------------------------------------------

The project and source is located at this URL:

http://www.khoromag.org/setoper.zip

I would really like to see this added to the next C++ standard spec.

Take, for example, a routine that applies two different set
operations to an input set. This fictitious function populates two
sets with values to be used as transforms on a set that the user
supplies. The first transform does a difference. The second one does
a union. The caller's set is passed in by reference so as to avoid
some memory copies.

----------------------------------------------------------------------
typedef std::set<std::string> string_set;

string_set& transform(string_set& xform)
{
  string_set diffset = gen_diffs(xform);
  string_set unionset = gen_union(xform);

  string_set tmp;     // stores the difference

  set_difference(xform.begin(), xform.end(),
    diffset.begin(), diffset.end(),
    inserter(tmp, tmp.begin()));

  xform.clear();      // clear original set so it can receive union

  set_union(tmp.begin(), tmp.end(),
    unionset.begin(), unionset.end(),
    inserter(xform, xform.begin()));

  return xform;
}
----------------------------------------------------------------------

I propose a set of operators for std::set that do in-place operations:

  operator &=    in-place set_intersection with right-hand operand
  operator |=    in-place set_union with right-hand operand
  operator -=    in-place set_difference with right-hand operand
  operator ^=    in-place set_symmetric_difference ...

as well as other operators that would make the code easier to read
(but not implemented in this example):

  operator &     a shortcut for set_intersection
  operator |     a shortcut for set_union
  operator -     a shortcut for set_difference
  operator ^     a shortcut for set_symmetric_difference

This simplifies the code to an easier-to-read form:

----------------------------------------------------------------------
string_set& transform(string_set& xform)
{
  string_set diffset = gen_diffs(xform);
  string_set unionset = gen_union(xform);

  string_set tmp;
  tmp = xform - diffset;

  xform.clear();
  xform = tmp | unionset;

  return xform;
}
----------------------------------------------------------------------

This likely incurs slightly additional overhead due to the creation of
a temporary set to be the return value of xform - diffset, followed by
a default operator = assignment, however it is much easier to read.

Using the in-place operators, we get even better looking code, and
also better performing code!

----------------------------------------------------------------------
string_set& transform(string_set& xform)
{
  string_set diffset = gen_diffs(xform);
  string_set unionset = gen_union(xform);

  xform -= diffset;
  xform |= unionset;

  return xform;
}
----------------------------------------------------------------------

In this particular case, you avoid a using a temporary that is needed
to store results from the set_* templates, and you avoid a clear() as
well.

James E. King, III
Software Architect
Incipient, Inc.
230 Third Ave.
Waltham, MA.  02154

jking@incipient.com

----------------------------------------------------------------------
uset.h
----------------------------------------------------------------------

template<class _S> class uset : public set<_S>
{
  public:
    typedef uset<_S> _Myt;

    inline _Myt& operator &= (const _Myt& right)
    {
        _Myt::iterator       li = begin();
        _Myt::const_iterator ri = right.begin();

        while (ri != right.end() && li != end()) {
                 if (*li < *ri) { erase(li++);   }
            else if (*li > *ri) {       ++ri;    }
            else                { ++li; ++ri;    }
        }

        if (ri == right.end())
            erase(li, end());

        return *this;
    };

    inline _Myt& operator |= (const _Myt& right)
    {
        _Myt::iterator       li = begin();
        _Myt::const_iterator ri = right.begin();

        while (ri != right.end() && li != end()) {
                 if (*li < *ri) {       ++li;    }
            else if (*li > *ri) { insert(*ri++); }
            else                { ++li; ++ri;    }
        }

        if (ri != right.end())
            insert(ri, right.end());

        return *this;
    };

    inline _Myt& operator -= (const _Myt& right)
    {
        _Myt::iterator       li = begin();
        _Myt::const_iterator ri = right.begin();

        while (ri != right.end() && li != end()) {
                 if (*li < *ri) {              ++li; }
            else if (*li > *ri) {              ++ri; }
            else                { erase(li++); ++ri; }
        }

        return *this;
    };

    inline _Myt& operator ^= (const _Myt& right)
    {
        _Myt::iterator       li = begin();
        _Myt::const_iterator ri = right.begin();

        while (ri != right.end() && li != end()) {
                 if (*li < *ri) {              ++li; }
            else if (*li > *ri) { insert(*ri++);     }
            else                { erase(li++); ++ri; }
        }

        if (ri != right.end())
            insert(ri, right.end());

        return *this;
    };
};

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]