Thread

Topic: Feature Request: Concepts
Author: Christopher Eltschka <celtschk@dollywood.itp.tuwien.ac.at>
Date: Mon, 21 May 2001 19:10:48 GMT Raw View
1. Summary

Concepts are a way to add constrained genericity to C++ designed to
be as flexible and generic as possible.  In addition, template
overloading on concepts is possible.  This allows to replace the
complex tag and traits machinery, gives more flexibility to later
additions (including additions "in between" the concept hierarchy,
which isn't possible with the tag solution).

2. Impact

The concept extension introduces a new keyword, "concept". Therefore
programs which use this as identifier will have to be changed to use
another identifier instead.

Other than that, as far as I can see the legality or semantics of
currently legal code is not changed by this extension.

While the extension also affects overloading, if concepts are not
used, the additional operations results in a no-op.  Therefore even
the change of overloading rules don't result in a change of the
behaviour of currently legal programs.

3. Rationale

Currently, there are two ways to get functions working on different
types:

- The OO way: Have the function take an abstract base class defining
  the interface, and have the actual type to be derived from it.

  Advantages:

  * The function is assured that only classes conforming to the
    interface can call that function. If the user tries to call the
    function with an object of the wrong type, he usually gets a
    helpful error message.

  * The function can be overloaded on different interfaces.  The
    compiler can select the correct function to choose based on the
    interface. especially if for a certain derived interface, there's
    an optimized algorithm available, and the object in question has a
    static type conforming to that interface (i.e. either is of that
    type, or of a type derived from that class), the compiler will
    choose that optimized algorithm.

  * Run time polymorphism: The exact type (and therefore the exact
    behaviour) need not be fixed at compile time, provided the base
    class has the corresponding virtual functions (i.e. assuming the
    author of the base type anticipated that usage).

  Disadvantages:

  * The base classes must be specified at type definition
    time. Therefore the mechanism can only be used for interfaces
    which were considered at type definition time. If you later define
    an interface, and find the type would perfectly fit the interface,
    but you cannot change the type, you're lost. The best you can do
    is to write an adaptor class which derived from your new interface
    and forward all functions to the other type.  But even then, that
    is a new type, which therefore can be used only for new
    objects. If yo get an object of the original type, you're probably
    lost.  The same is true for the hierarchies themselves.  Someone
    may have written a generic base class for a semigroup and a
    derived class for a group, but not have thought of a base class
    specifically for a monoid.  Therefore if you want to use that type
    system, and cannot alter it, you are constrained by not being able
    to specify the relationship between groups and monoids, and
    between semigroups and monoids, and therefore f.ex.  cannot write
    functions which specifically need monoids, but work automatically
    for groups.

  * Somwhat related to the first point: If two independant class
    libraries need the exactly same interface, their objects cannot be
    used interchangeably anyway, since they will use different (though
    identical in the specification of the interface) base classes.

  * Class interfaces are fixed up to the last bit (this is of course
    needed to provide runtime polymorphism, but it applies even to
    situations where runtime polymorphism isn't needed). If your base
    class specifies the function takes an int, then the function takes
    an int, even if your derived class would happily work with longs,
    and it would even be an advantage to do so. Therefore the very
    strict definition of the interface reduces the freedom more than
    would be needed from an interface ppoint of view.

  * Class interfaces are checked individually. The best example of the
    problems this causes comes from Stepanov: You cannot write a "max"
    function which takes two objects of the same type, and returns
    another one again from the same type, where the only restriction
    of the type is that it must allow comparing. The restriction to
    comparable types can be expressed with the help of base classes
    (though only for types which have not already been defined), but
    the restriction for both types to be the same cannot be expressed
    with base classes.

  * If passing a class as base class, the derived interface
    "disappears" completely and can only be recovered through casts or
    RTTI. As an example, if you have a stringstream s, then you cannot
    write (s << something).str(), because the information that the
    stream is a stringstream and therefore supports the member
    function str() got lost.

- The generic way: Have a template which takes types as formal
  parameters, and which is then automatically instantiated over the
  correct types.  Advantages:

  * The requirements on the type are independent from the definition
    of the concrete type. Therefore even classes designed without that
    specific implementation in mind can still be used with that class
    library, as long as it conforms to the interface.

  * For the same reason, libraries which use the same interfaces are
    automatically compatible to each other.

  * The class interface is not completely fixed, but only to the
    extent that certain usages are possible. Since guaranteeing a
    certain usage is exactly what defining an interface is all about,
    this means that the templates don't force unnecessary restrictions
    on the type.

  * Since the template type parameters are specified independent from
    the object parameters, they allow to express relations between
    different types. The canonical example is of course again the max
    function: It is possible to state that the two parameters must be
    of the same type. Indeed, the standard library contains such a max
    function template.

  * The full object type "survives" on the way through the
    functions. For example, one can write a+max(b,c), because the
    objects which were passed through max "remember" that they also
    support addition, despite the fact that max itself doesn't care
    about addition.

  Disadvantages:

  * Templates don't fully check the conformance to an interface (there
    is not even a formally declared interface), but only check the
    part of the interface they actually use. While this sometimes is
    useful (for example, smart pointers can implement operator->
    without loosing the ability to work with builtin types, but in
    most cases it is not what one wants. After all, checking
    interfaces is what static typing is all about, and it is necessary
    to allow modifications of the code without the danger of breaking
    some usage.

  * The function cannot be overloaded on different interfaces (except
    for the normal, template independent overloading, which certainly
    is still present).

  * There is no runtime polymorphism.

Now one ideally wants to have all the advantages with none of the
disadvantages. While this is (probably) not possible, one notes that
the restrictions in the OO case are all necessary to allow runtime
polymorphism. So if we drop runtime polymorphism from out wish list,
the rest should be doable with templates.

Indeed, there are template techniques which allow to treat the two
disadvantages directly. The first one is the easier one: By just
making sure that the compiler has to compile code which probes all of
the operations the interface guarantees, without having the program
execute that code, one can ensure that the compiler gives an error if
the given type doesn't conform to the interface. Getting a sensible
error message is, of course, much harder.

For the second problem, a solution can be found in the implementation
of STL iterators. It is quite a complex machinery, which uses traits
(so it can work with built-in pointers) to associate with each type a
tag type that has no function but to identify the category of the
iterator, and then uses traditional inheritance on those tag types to
express the is-a relationship of the real types, and then defines for
each set of functions to overload on iterator category a forwarder
function of the given interface, which uses the traits template to get
the iterator tag, and then uses conventional overloading on the tag
type to select the real function which then finally does the work.

Now one might say, we have the solution, we can do it, fine, ready,
end of story. But the current situation has several disadvantages:

- The template tricks are quite complex. This shows essentially two
  things: First, the features are really needed (otherwise one
  wouldn't invent such a complicated machinerie to get them). Second,
  they are not well supported by the language (otherwise one wouldn't
  need such a complicated machinery to get them). The situation is
  somewhat similar to OOP in C: It is possible, but the language
  doesn't support it well. The fact that the template tricks are
  complex also means that there is a high learning curve until one can
  use them.

- The solutions to the problem are not connected to each other. That
  is, it's quite easy to declare an iterator to be a random access
  iterator, while it doesn't fully conform to that specification.
  However, that error may go unnoticed until one uses the iterator on
  an algorithm which really needs a random access operator. Moreover,
  algorithms which use the overloading machinery described above might
  only check the minimal reqirements in the forwarder (f.ex. if it
  really is a forward iterator), and then use the iterator category to
  go to the algorithm which is optimal for the given operator, which
  itself doesn't check again. Our wrongly declared random access
  iterator would then pass the test (because as forward iterator it is
  valid), and then be passed to the random access iterator version
  (based on the tierator tag), and then the error may or may not get
  noticed depending on the implementation (if it needs the usage the
  iterator doesn't support correctly).

- There is no standard way to specify the interfaces.  Therefore if
  one wants to see what the compiler really checks, one must go into
  the implementation and find out where the tests are really
  implemented.  One cannot simply scan for the interfaces, because one
  doesn't recognice them as such.

- By relying on conventional inheritance for the tag types, one gets
  back the corresponding limitations.  For example, if one needs an
  iterator category which makes more guarantees than an input
  iterator, but less than a forward iterator, one is again lost,
  because it is not possible to insert the new category into the
  middle of the hierarchy.

- And, of course, there is the problem of getting sensible error
  messages.

My concepts proposal shall overcome those problems by allowing those
interfaces to be specified explicitly, in a way unambiguously defined
as such an interface, that is, as concept. The basic goal is to solve
all those problems, without in any way imposing unnecessary
restrictions. This implies that concepts have to be defined in form of
example code which has to be compileable, because any other
specification would impose unnecessary restrictions (f.ex. by
determining if operators are implemented as members or non-members
even in cases where it should not matter). Note that all STL items
except for the concrete classes/functions are defined this way in the
standard (except that the standard also demands a certain semantics;
of course, this cannit be checked automatically at compile time). Of
course this aproach has some drawback: Example code can be tested with
a concrete type, but it cannot be used for more general
considerations, especially answering the question if all types which
compile code snippet 1 would also compile code snippet 2 would
generally be equivalent to solving the halting problem (after all,
templates give a turing-complete compile-time language!). Since such
information is crucial for partial ordering (forward iterators conform
to the input iterator requirements, but if we have overloading between
a version taking forward iterators and a version taking input
iterators, we want the forward iterator version to be selected), it
must be given explicitly by the user (but after all, with the current
workarounds, it must be given explicitly, too). A more formal
specification of the interface (similar to the specifications of class
interfaces) would possibly make automatic partial ordering possible,
but IMHO the restrictions following from that aproach would not be
compensated by this advantage.

A concept is the abstraction of a property of a type or a relation
between types. This allows constraint genericity and overloading on
concepts.

Concepts are approximately to types what types are to objects.
However, there are some important differences:

- A type applies only to one object. A concept can, however, apply to
  more than one types at once, thus allowing to express not only
  properties of, but also relations between types.

- An object has only one type (however, object orientation lifts this
  requirement through inheritance, esp. multiple inheritance, so
  objects can effectively be of more than one type at once). For
  concepts, no restriction applies.  A type can conform to many
  concepts at the same time.

- Since the type of an object determines it's representation, it must
  already be fixed at definition time. If an object is already
  defined, it's type cannot be changed anymore.  Types on the other
  hand are already completely specified by their conventional
  definition. Therefore types can conform to concepts which are
  invented long after the time the type was defined, and by a person
  who cannot change the type's definition.

- Concepts can be inserted into a concept hierarchy at any place. That
  is, if two existing concepts are "too far apart" from each other,
  it's always possible to fill the gap.

- Types must be specified exactly. Concepts are specified by giving
  supported usage. The latter is more flexible. It is in the tradition
  of templates, and it fixes exactly what an interface is about: About
  usability in certain ways in certain contexts.

4. Some notes about the specification

The concept syntax is similar to template syntax. This occurs
naturally, since checking conformance is done by test-compiling
code.

Overloading of constrained templates is based on partial
ordering. There are two aspects of partial ordering:

First, some concepts may be more constraining than others, for
example, the concept RandomAccessIterator is more constraining than
the concept BidirectionalIterator.

Second, a template may be constrained by more than one concept.  A
template constrained by the concepts A and B is more constrained than
a template which is constrained only by one of the concepts.

The first one is reduced to the second one by adding all the
constraints which are declared to be less constrained.

5. Specification

5.1 Concept definition

A concept is defined in the same way as a function template, except
the keyword "template" at the very beginning is replaced by the new
keyword "concept", optionally prefixed by the keyword "explicit", and
the return type is missing. If the definition is prefixed by
"explicit", it is an /explicit concept/, otherwise it is an /implicit
concept/.

A concept may be defined at global or namespace scope.

[Examples:

concept<class C> Container(C const& c)
{
  typename C::iterator first = c.begin(), last = c.end();
  while (first != last)
  {
    typename C::value_type v = *first;
    first = *v;
  }
  C::size_type s = c.size();
  bool b = (s > 0);
  b = s.empty();
}

This defines an implicit concept "Container". It is checking for
various container requirements. Since something which is not a
container is quite unlikely not to fail this check, it's reasonable to
be an implicit concept.

explicit concept<class G>
 Group(G const& g1, G const& g2)
{
  G g3 = g1;
  g3 = g1 * g2;
}

concept<class A, class B>
 convertable(A const& a)
{
  static_cast<B>(a);
}

This concept, describing "A can be converted to B", demonstrates the
usage of concepts for a list of types instead of a single type.

This defines the concept "Group", which declares the type G to
describee a group. Since most of the requirements of a group cannot be
checked by the compiler (the above code would, f.ex., also compile for
a general matrix class using * for multiplication), concepts like this
should be made explicit.  End example]

5.2 Conformance of types to concepts.

1 A type or list of types is /formally conforming to a concept/ in a
  given scope, if the instantiation of the template function would
  give no compiler errors, assuming the generated function had the
  same access permissions as the scope.

2 A concept is /implicit for a given type or type list/ in a given
  scope, if the typelist is suitable for the template argument list of
  the concept, and either the concept is implicit, or there's an
  explicit conformance declaration in scope (see 5.3) or can be
  generated from a suitable conformnance template (see 5.4).

3 A type or set of types is /conforming to a concept/, if it is
  formally conforming to the concept in that scope, and the concept is
  implicit for the typelist in the invoking scope (see 5.5).

5.3 Explicit conformance declaration

A type or set of types may be explicitly declared to be conforming to
a concept by a /conformance declaration/. A conformance declaration
has the form

concept ConceptName<Types>;

where ConceptName is the name of a previously declared concept, and
<Types> is a typelist which is suitable to the template argument list
given at the concept definition.

An explicit conformance declaration can be put at any scope. If an
explicit conforming declaration is given for a type or list of types
which doesn't formally conform to the concept in the scope of the
conformance declaration, the program is ill-formed.  The scope of an
explicit conformance declaration begins at the declaration and ends at
the end of the containing scope.

[Examples:

(These examples use the concept definitions of the
examples in 5.1)

concept Container<std::vector<int> >;

This checks that std::vector<int> formally conforms to the concept
Container. If it woldn't conform, the program containing this line
were ill-formed.

class G { ... };
concept Group<G>;

The second line declares G to be a group. It checks that G fulfills
the formal requirements, and since the Group concept is explicit, it
also makes Group implicit for G after this line, effectively enabling
use of G as a Group.

class X { ... };
concept convertable<X, int>;

This checks that X is convertable to int.
End example]

5.4 Conformance templates

To allow conformance declarations for parametrized types, a
conformance declaration can also be templated.  A conformance template
takes the form

template<TemplateArgList> concept ConceptName<Types>;

where ConceptName is the name of a previously defined concept, Types
is a typelist suitable to the concept's template argument list, and
TemplateArgList is a list of template arguments Types depends on.
Types has exactly the same restrictions that argument types have for
functions, for the same reason: Given a conrete list for Types, the
template arguments must be deducible.

A conformance declaration is implicitly instantiated as soon as the
concept is referenced with appropriate types. Checks are done only on
instatiation of the conformance declaration.

[Examples:

These examples use the declarations and definitions of the examples of
the previous sections.

template<class T> class MyGroup { ... };

template<class T> concept Group<MyGroup<T> >;

This declares that MyGroup conforms to Group for each T.
At this time, no check is made (as it cannot be made yet).

MyGroup<int> g;

At this point, the concept is _not_ checked! (This is, because a
concept may depend on more than one type, and it would be quite hard
to tell when to check such a concept).

template<class T: Group<T> > void foo(T t);
foo(g);

Now, the conformance is checked. Note that without the conformance
template, this call would have been illegal, since Group is an
explicit concept.

template<class T> concept Container<std::vector<T>::iterator>;

This declaration is ill-formed, because given a type Iter, it cannot
be deduced what type T has.

End examples]

5.5 Constrained templates

Templates can be constrained to a given set of templates, by adding a
/constraints clause/ after the template argument list. A constraints
clause consists of a colon, followed by a comma-separated list of
concept instantiations. Declarations which only differ in the order
of concept instantiations declare the same entity.

An empty constraints clause (i.e. only the colon) is equivalent to no
constraints clause at all.

[Example:

template<class A, class B: Group<A>, Convertable<A,B> > class X;
// A has to be a group, and A has to be convertable to B

template<class A, class B: Convertable<A,B>, Group<A> > class X;
// redeclaration of the same class template

template<class T:> void f(T);
// same as template<class T> void f<T>

End example]

Conformance is checked using the access restrictions of the generated
function or class

[Example:

concept<class X> Concept(X& x)
{
  x.foo();
}

class C
{
  friend void f<C>();
private:
  void foo();
};

template<class T: Concept<T> > void f();

int main()
{
  f<C>();
}

The call in main() is legal, because the concept is checked in the
scope of f<C>, and since f<C> is a friend of C, the code in the
concept would compile.

End example]

A conformance template can also be constrained; constrained
conformance templates are used in constraint list expansion, see 5.6.
For constrained conformance templates, the template parameters must be
deducible from the constraints clause. That is, the same restrictions
apply to those constraints clauses which also apply to the use of
template parameters in function argument lists.

A concept is /invoked/ if a template constrained to the concept is
instantiated. The scope containing the invoking code is the /invocation
scope/. The invocation scope is needed to determine if a type conforms
to a concept (see 5.3).

If a concept is invoked with a typelist which it doesn't conform to,
the program is ill-formed.

5.6 Constraints clause expansion

For a constraint list with explicit types inserted, /constraints
clause expansion/ gives an expanded constraints clause by going
through the set of constraints, and for each constraint for which a
constrained conformance template exists where the set of constraints
can be instantiated to a subset of the set of constraints, the concept
of the instantiated constrained conformance template is added to the
constraint list, if it is not already in that list. The same procedure
is recursively applied until no new constraints get added.

[Example:

Assume the concepts C1<T>, C2<T>, C3<T>, C4<T1, T2>, C5<T1, T2>
and the following constrained conformance templates (CCT):

template<class T: C2<T> >
 concept C1<T>;

template<class T1, class T2: C1<T1>, C3<T2> >
 concept C4<T1, T2>;

template<class T1, class T2: C2<T1>, C4<T1, T2> >
 concept C5<T1, T2>;

Now given the constraint list

 C2<X>, C5<X,Y>

the first CCT constraint list can be instantiated to a sublist,
resulting in

 C2<X>, C5<X,Y>, C1<X>

Now the second CCT can be instantiated:

 C2<X>, C5<X,Y>, C1<X>, C4<X,Y>

Now the third CCT can be instantiated; however, C5<X,Y>
is already in the list.

End example]

[Example:

Given C1<T>, C2<T>, C3<T>, C4<T1, T2> and the following
CCTs:

template<class T: C1<T>, C2<T> > concept C3<T>;           // 1
template<class T: C2<T> > concept C4<T,T>;                // 2
template<class T1, class T2: C4<T1, T2> > concept C1<T1>; // 3

Now we have f.ex. the following expansion:

  C2<X> -> C2<X>, C4<X,X> (2)
        -> C2<X>, C4<X,X>, C1<X> (3)
        -> C2<X>, C4<X,X>, C1<X>, C3<X> (1)

  C4<X,Y> -> C4<X,Y>, C1<X> (3)

End example]

It's reasonable to have an implementation defined ressource
limit on the concepts clause expansion, since - similar to
the template expansion case - it's possible to generate
infinite loops. That ressource limit should, however, be
reasonably large to cover all normal cases.

5.7 Overloading of constrained function templates

5.7.1 Overload resolution

The overload resoltion is modified as follows:

In a first step, the steps of ordinary overload resolution before
checking partial ordering of templates are done completely ignoring
the constraints. This results in a set of zero or more viable
function, which in the following is called the primary set.

Then, for all functions in the primary set which are specialisations
of constrained templates, constraints clause expansion (see 5.6) is
performed. All further steps are done using those expanded constraints
clauses instead of the original ones.

Then the inferred types are checked against the concepts, and all
candidate functions where concepts are not fulfilled are removed from
the primary set.

Then, normal overload resolution contines using the modified primary
set, resulting in another, smaller set of functions, which in the
following is called the secondary set.

If the secondary set is empty, the program is ill-formed.

If the secondary set contains exactly one function, this
function is used.

If the secondary set contains more than one function,
the most constrained function is determined. For this,
a function f1 is /more constrained/ than a function f2,
if the expanded constraints clause of f1 is a superset of
the expanded constraints clause of f2. The expanded
constraints clause of an unconstrained function is empty.

5.7.2 Explicit disambiguation of concept overloaded templates

Specifying template parameters without specifying
explicit concepts causes the functions to choose from
to be selected only from the set of templates with this
template list, but with arbitrary constraints.

A certain version of a template can be explicitly requested
by writing the exact constraints clause into the explicit
template parameter list. If an unconstrained template shall be
selected, an empty constraints clause has to be added.

[Example:

template<class T1, class T2> void f(T1 x, T2 y);    // 1
template<class T> void f(T x, T y);                 // 2
template<class T: Constraint<T> > void f(T x, T y); // 3

f(3, 4) // selects from (1,2,3), implicit argument deduction
f<int, int>(3, 4) // selects (1) only
f<int>(3,4) // selects from (2,3)
f<int: Constraint<int> >(3,4) // selects (3)
f<int:>(3,4) // selects (2)

End example]

5.8 Partial specialisation based on constraints

A partial specialisation of templates can be based on constraints.

[Example:

// the concept
concept<class T> has_internal_reference_count(T& t)
{
  bool b = t.unshared();
  t.increment_reference_count();
  t.decrement_reference_count();
}

// smart pointer class, general version
template<class T> class smart_pointer
{
public:
  // pointer interface
private:
  T* object;
  unsigned long* refcount;
};

// smart pointer class, partial specialisation
template<class T: has_internaal_reference_count<T> >
 class smart_pointer<T>
{
public:
  // ...
private:
  T* object;
  // no separate refcount pointer needed
};

End example]

For selection of the specialisatzion, constraints clause expansion
is done, and the most constrained applicable version is chosen.

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.research.att.com/~austern/csc/faq.html                ]