Topic: Named Operators
Author: simon@hand-multimedia.co.nz
Date: 24 Sep 2005 04:20:23 GMT Raw View
Prefix operators are basicly functions without the parens,
so no biggy there. Infix operators are trickier, but you
can achive the effect using expression templates:
if(x <in> some_cont) {...}
using overloaded op< and op>.
I don't think there's any real implentation yet, but you
might try checking out the Boost lists to see if it goes
anywhere. Post-fix operators are pretty much untouchable.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]
Author: billclare3@aol.com
Date: 14 Sep 2005 14:40:01 GMT Raw View
Introduction:
This note is a two part proposal to significantly enhance the
expressive power of C++ code, particularly in dealing with data
structures. The first part is to provide the ability to define
identifiers, in addition to the already defined predefined operator
symbols, as binary and unary operators. The second part is to suggest
at least one way to exploit this capability; i.e., with somewhat
different versions of the current STL algorithms, that would allow
operations on sequences to return sequences.
I suspect that at least the first part of this has history, but I
didn't find it in casual searching or in the C++0x proposals. A
reference however might be Bjarne Stroustrup's somewhat whimsical
paper on Generalized Overloading for C++ 2000 (available at his website
- http://www.research.att.com/~bs/papers.html). So, at the risk of
resurrecting some hoary arguments, this attempts to re-examine the
issue.
Secondly, the STL algorithms go a long way towards introducing general
operations on collections. However, while they provide great deal of
flexibility in allowing variations on simple constructs, they do not
provide the defaults to make the simple constructs simple. This leads
to a lot of awkward syntax related to specifying functions, iterators,
composition and binding for basic operations. In particular they do
not allow operations on collections that return collections.
Lambda expressions are an alternative that provides an expressive
approach to defining functional transformations, but, for many, these
are not particularly readable. Here the objectives are both
expressiveness and readability.
Proposal:
The basic proposal is fairly simple to state:
Interpret "arg1 op arg2" ; as the best fit (or ambiguity),
according to standard function overloading rules, from among:
"arg1.op(arg2)" ; "op(arg1, arg2)" ; "op(arg2, arg1)" ;
where, "op" is an identifier for a function, a pointer or
reference to a function, a functor with "operator ( ) (arg1, arg2)"
or a pointer or reference to any of the above. Slightly better, for
consistency in providing prefix and postfix functions, would be to
follow the practice of pointer interpretation for "operator ->",
and allow for an indefinite chain of such pointers. This would allow,
for instance, for prefix/suffix operations that are specific to
functions rather than to objects.
This basic proposal appears to be only a minor change to parsers and
directories, although it is tied into other proposals for generalized
function specification. This proposal is meant to match the same
parsing and interpretation rules as for the binary operator symbols,
with the addition of the possibility of a functor for resolution (which
however corresponds to another C++0x suggestion for function name
resolution). Use of a functor allows operator modifiers in its
constructor, which essentially permits ternary and higher order
operators, as in: "arg1 op (mod1, mod2) arg2". This construct
could invoke a functor constructor, or alternatively, a "make"
functions.
For consistency, we can also have unary prefix operators as in "op
arg", which is little different from "op(arg)", and also the
postfix variant "arg op". As illustrated below, this later is
useful to specify filter predicates.
Finally it would be useful to support generalized inner products, as in
"arg1 op1 op2 arg2 ", perhaps with mapping to a functor which has a
constructor "op1(op2)" and an "operator ( ) (arg1, arg2) ", or
alternatively to a "make" function of the form "op1( op2, arg1,
arg2)" .
The obvious assumptions here are that unary operators are
right-associative, that binary operators are left associative, and that
these binary operators have lower priority than the unary ones, which
in turn have lower priority than any of the predefined operator
symbols. (Apparently such an obvious approach to binding posers has
been the subject of much controversy, particularly with an option to
specify priorities - which would make declaring and using such
operators more complicated and less readable.)
A secondary, more useful, and considerably more extensive proposal, is
to apply these operators to an additional set of library algorithms
that work on Sequences and return Sequences. A Sequence (or Range)
here, is a stream, a collection, or any series of values or objects to
be examined or operated on. Basically it is an object that supports
"Sequence::begin( )" and "Sequence::end( )" functions to
provide iterators of an appropriate type for the context of their use
(which algorithms can and should check). Also the binary operators
should be implemented to use the obvious defaults (e.g. for
"binder2nd", "compose", etc.) as illustrated below. With
somewhat more work, the operators can implement expression templates to
provide "lazy evaluations", to specify complex structures, which
are evaluated as necessary. Enthusiasts of SQL and of Ken Iverson's
APL will recognize the power of such operators.
This and the examples below are probably related to other language
extensions for functions that are currently under consideration,
although these relations are not spelled out here.
Finally, removing the restriction on function definitions within
functions might be a simple and useful adjunct, even a the
simplification of the scooping rules that only allowed such functions
to be called by the declaring function.
Some Examples:
The following assumes we have a simple (or complex) Sequence class
1. A Simple Transform:
>From Generic Programming and the STL by Matthew H. Arnold - 12.3 ,
to add two sequences of integers and write the result we have:
" transform (vector.begin(), vector.end(), array[0],
ostream_iterator<int>(cout, "\n"), plus<int>() )
;"
Certainly the intent could be expressed more easily by:
" cout ("\n") << vector Plus array ;"
2. Another Transform
Also from Generic Programming and the STL - 15.8.7, to transform a
vector of angles to the negative of a vector of sines, we have:
" transform (angles.begin(), angles.end(),sines.begin(),
compose1(negate<double>( ),
compose1(ptr_fun(sin),
bind2nd(multiplies<double>( ), pi/180))))
;"
Certainly, the intent could be expressed more easily by:
"sines = ( - sin * pi/180 ) angles ; "
or
" UnaryFunction<Sine> Convert = ( - sin * pi/180 ) ;
sines = Convert angles ; "
3. Finding Things
And again from Generic Programming and the STL - 15.8.8, to find a
list element between 1 and 10.
" findif (list.begin(),list.end(),
compose2(logical_and<bool> ( ),
bind2nd( greater_equal<int>( ), 1),
bind2nd( less_equal<int>( ), 10) ) ) ) ;"
Instead we might have,
" list FindIf ( (GE 1) And (LE 10) ) ; "
or for the more mathematically inclined,
" list FindIf (<= 1) && (>= 10) "
Workings:
To find a qualifying integer from a list of integers, the Programmer
writes: " list FindIf ( (GE 1) And (LE 10) ) " .
The interpretation, in reverse order of this, would be somewhat as
follows:
Compiler finds: " FindIf (Sequence, UnaryPredicate) "
- with the matching template parameters, for function
" Sequence::Iterator FindIf (Sequence, UnaryPredicate) "
Library writes: " FindIf " function using constructs for
" Sequence::begin( ) , Sequence::end( ) ", and
" UnaryPredicate( *Sequence::iterator) ".
Compiler finds: " UnaryPredicate " matches
" And ( (GE 1), (LE 10) )," which matches a Functor
" And " with a template specialization that inherits from
" compose_f_gxhx ," with a constructor
(or perhaps a "make" function for such a constructor)
" And(UnaryPredicate1, UnaryPredicate2 )" and
" bool operator ( ) (arg) ".
Library writer: for this specialization of " And "
defines " bool operator ( ) (arg) "
with the obvious implementation:
" return (UnaryPredicate1)(arg) && (UnaryPredicate2 ) (arg) "
with give a return value of
" (GE 1)(*sequence::iterator) && (LE 10) (*sequence::iterator)) "
Compiler finds: " (GE 1)" matches " UnaryFunctor GE (1) "
(or perhaps again a "make" function for such a constructor)
which inherits from " binder2nd ",
so from its " operator ( ) "
we get " (GE(1) ) (*sequence::iterator )" ,
with the obvious implementation of
" *sequence::iterator >= 1 " .
So the " UnaryPredicate(Argument) " is effectively
"And(GE (*sequence::iterator, 1), LE(*sequence::iterator, 10)) "
or
" (*sequence::iterator >= 1) && (*sequence::iterator <= 10))"
And, from these and appropriate template declarations,
the compiler and library writer should be able to deduce the
correct value types.
Extension to Member Functions:
An extension of this might be:
" persons FindIf ( (Age( ) GE 1) And ( Age( ) LE 10) ", where
" Age " is a member function of " Person ".
This time " GE " is found to match a " UnaryFunctor " with
constructor " GE( MemberFunction, Value) "
(or again similar "make" function) and " operator ( ) (Arg*)
"
with the implementation
" ( (*Arg).MemberFunction( ) ) >= Value ".
Of course without general binary operators, we could still write:
"FindIf (persons, Where (GE (Age ( ), 1) and LE (Age ( ), 10 ) ) )
"
but that is not nearly as elegant.
Of course without general binary operators, we could still write:
"FindIf (persons, Where (And(GE (Age ( ), 1) , LE (Age ( ), 10
))))"
but that is not nearly as elegant.
And finally another variation might be:
"persons FindIf ( Age() IN (1,10) )"
where " IN "is a postfix operator matching the functor
constructor
" IN(MemberFunction, Value1, Value2) "
and " operator ( Arg*) "
with the implementation
" ( (*Arg).MemberFunction( ) ) >= Value1 &&
( (*Arg).MemberFunction( ) ) <= Value2 "
Relational Data Base Operations:
Typically, this " FindIf " might be used in code to examine a
sequence to produce a sequence of results. So, to produce the result
sequence, we might have, for essentially the same cost;
" sequence Select predicate " .
Note that implementations of these functions can produce template
expressions for later "lazy evaluation". Also, they can be
implemented as "pipes" for operators that iteratively take an
object from one sequence, provide buffer, filter, transform and/or
accumulate operations on the object, and then send results to a target
sequence.
To provide the other basic SQL operations we need join -
" sequence Join(Predicate) sequence "
and projection -
" Project<Model> sequence ",
as well as combinations, such as -
" ( ( sequence Join sequence) Select<Model> predicate)" .
and where Model specifies the columns or column operators for the
result.
Projection could be implemented by a template specialization of a
conversion function, ideally based on some reflection capability.
Another alternative is to provide sequences of columns of tuple
variables and/or tuple functions that determine a column value. Most
useful is probably a tuple of access functions, to be applied to
imbedded sequence iterators.
Project is essentially a mapping function. One specialization would be
support for efficient C++ collections that are internal representations
of XML document structures. Somewhat more ambitious is something like,
"XMLDoc [XPath]" or " XMLDoc XQuery XMLDoc " .
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]