Topic: in-place operator enhancements to std::set


Author: jking@incipient.com (James E. King, III)
Date: Sun, 21 Jul 2002 01:36:27 GMT
Raw View
Very recently I 'discovered' std::set and I've got some suggestions
for enhancing it in the next C++ standard.  Sets are currently lacking
in-place operators that would add significant value.  Take, for
example, a routine that applies two different set operations to an
input set.

This fictitious function populates two sets with values to be used as
transforms on a set that the user supplies.  The first transform does
a difference. The second one does a union. The caller's set is passed
in by reference so as to avoid some memory copies.

----------------------------------------------------------------------
typedef std::set<std::string> string_set;

string_set& transform(string_set& xform)
{
  string_set diffset = gen_diffs(xform);
  string_set unionset = gen_union(xform);

  string_set tmp;     // stores the difference

  set_difference(xform.begin(), xform.end(),
    diffset.begin(), diffset.end(),
    inserter(tmp, tmp.begin()));

  xform.clear();      // clear original set so it can receive union

  set_union(tmp.begin(), tmp.end(),
    unionset.begin(), unionset.end(),
    inserter(xform, xform.begin()));

  return xform;
}
----------------------------------------------------------------------

I propose a set of operators for std::set that do in-place operations:

  operator &=    in-place set_intersection with right-hand operand
  operator |=    in-place set_union with right-hand operand
  operator -=    in-place set_difference with right-hand operand
  operator ^=    in-place set_symmetric_difference ...

as well as other operators that would make the code easier to read
(but not implemented in this example):

  operator &     a shortcut for set_intersection
  operator |     a shortcut for set_union
  operator -     a shortcut for set_difference
  operator ^     a shortcut for set_symmetric_difference

This simplifies the code to an easier-to-read form:

----------------------------------------------------------------------
string_set& transform(string_set& xform)
{
  string_set diffset = gen_diffs(xform);
  string_set unionset = gen_union(xform);

  string_set tmp;
  tmp = xform - diffset;

  xform.clear();
  xform = tmp | unionset;

  return xform;
}
----------------------------------------------------------------------

This likely incurs slightly additional overhead due to the creation of
a temporary set to be the return value of xform - diffset, followed by
a default operator = assignment, however it is much easier to read.

Using the in-place operators, we get even better looking code, and
also better performing code!

----------------------------------------------------------------------
string_set& transform(string_set& xform)
{
  string_set diffset = gen_diffs(xform);
  string_set unionset = gen_union(xform);

  xform -= diffset;
  xform |= unionset;

  return xform;
}
----------------------------------------------------------------------

In this particular case, you avoid a using a temporary that is needed
to store results from the set_* templates, and you avoid a clear() as
well.

I have written some tests to verify that this is indeed a good thing -
not just in beautifying set manipulation in C++ but also in optimizing
runtimes. Here are some results of a simple test case program I wrote
to illustrate the benefit of this in-place operator enhancement.  This
code was compiled by the Microsoft Visual Studio .NET C++ compiler
with /O2 optimizations enabled and executed on a Dual Pentium 4 Xeon
1.8Ghz Windows 2000 Server with 512MB RAM.  These are typical results
after 10 runs.

----------------------------------------------------------------------
c:\temp\stl\setoper\release>setoper
set theory time test

5000 loops of old style: 2.4 ms/loop (11954 ticks)
5000 loops of new style: 0.8 ms/loop (3938 ticks)
and let's make sure both code paths produce the same result: true
----------------------------------------------------------------------

The project and source is located at this URL:

http://www.khoromag.com/setoper.zip

I'm hoping this will catch someone's eye and make it into the
standards review board, and maybe pick up a sponsor along the way.
Below I have placed a copy of my class called "uset" which simply
extends set and adds the in-place operators.  The test project also
uses and includes this source - I use four-character tabs to align
things and the file looks best this way.

Thanks.

James E. King, III
Software Architect
Incipient, Inc.
230 Third Ave.
Waltham, MA.  02154

jking@incipient.com

----------------------------------------------------------------------
uset.h
----------------------------------------------------------------------

template<class _S> class uset : public set<_S>
{
  public:
    typedef uset<_S> _Myt;

    inline _Myt& operator &= (const _Myt& right)
    {
        _Myt::iterator       li = begin();
        _Myt::const_iterator ri = right.begin();

        while (ri != right.end() && li != end()) {
                 if (*li < *ri) { erase(li++);   }
            else if (*li > *ri) {       ++ri;    }
            else                { ++li; ++ri;    }
        }

        if (ri == right.end())
            erase(li, end());

        return *this;
    };

    inline _Myt& operator |= (const _Myt& right)
    {
        _Myt::iterator       li = begin();
        _Myt::const_iterator ri = right.begin();

        while (ri != right.end() && li != end()) {
                 if (*li < *ri) {       ++li;    }
            else if (*li > *ri) { insert(*ri++); }
            else                { ++li; ++ri;    }
        }

        if (ri != right.end())
            insert(ri, right.end());

        return *this;
    };

    inline _Myt& operator -= (const _Myt& right)
    {
        _Myt::iterator       li = begin();
        _Myt::const_iterator ri = right.begin();

        while (ri != right.end() && li != end()) {
                 if (*li < *ri) {              ++li; }
            else if (*li > *ri) {              ++ri; }
            else                { erase(li++); ++ri; }
        }

        return *this;
    };

    inline _Myt& operator ^= (const _Myt& right)
    {
        _Myt::iterator       li = begin();
        _Myt::const_iterator ri = right.begin();

        while (ri != right.end() && li != end()) {
                 if (*li < *ri) {              ++li; }
            else if (*li > *ri) { insert(*ri++);     }
            else                { erase(li++); ++ri; }
        }

        if (ri != right.end())
            insert(ri, right.end());

        return *this;
    };
};

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]