Topic: Defect Report: TR1 regex named character classes and


Author: Eric Niebler <eric@boost-consulting.com>
Date: Fri, 1 Jul 2005 18:21:23 +0000 (UTC)
[ Note: Forwarded to C++ Committee. -sdc ]

This defect is also being discussed on the Boost developers list. The
full discussion can be found here:
http://lists.boost.org/boost/2005/07/29546.php

-- Begin original message --

Also, I may have found another issue, closely related to the one under
discussion. It regards case-insensitive matching of named character
classes. The regex_traits<> template provides two functions for working
with named char classes: lookup_classname and isctype. To match a char class
such as [[:alpha:]], you pass "alpha" to lookup_classname and get a
bitmask. Later, you pass a char and the bitmask to isctype and get a
bool yes/no answer.
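
As a rough illustration of that two-step protocol, here is a minimal sketch. It assumes the std::regex_traits<char> from the C++11 <regex> header rather than the TR1 class under discussion; the helper name in_named_class is hypothetical and exists only for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: does c belong to the named class (e.g. "alpha")?
bool in_named_class(char c, const std::string& name) {
    std::regex_traits<char> traits;
    // Step 1: translate the class name into a bitmask.
    // A regex engine would do this once, while compiling the pattern.
    auto mask = traits.lookup_classname(name.begin(), name.end());
    // Step 2: test a character against the bitmask.
    // A regex engine would do this for each candidate character at match time.
    return traits.isctype(c, mask);
}
```

Under the default "C" locale, in_named_class('a', "alpha") should be true and in_named_class('1', "alpha") should be false.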

But how does case-insensitivity work in this scenario? Suppose we're
doing a case-insensitive match on [[:lower:]]. It should behave as if it
were [[:lower:][:upper:]], right? But there doesn't seem to be enough
smarts in the regex_traits interface to do this.
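
To make the gap concrete, consider a sketch of what a naive engine might try: fold the input character with translate_nocase and then call isctype with the unmodified bitmask. This is an assumption about one possible engine strategy, not a description of any real implementation, and it uses the C++11 std::regex_traits<char>. The function name naive_icase_class_match is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical naive strategy: case-fold the character, then test it
// against a bitmask that was looked up with no knowledge of icase.
bool naive_icase_class_match(char c, const std::string& name) {
    std::regex_traits<char> traits;
    auto mask = traits.lookup_classname(name.begin(), name.end());
    // translate_nocase maps c to a canonical case (lowercase in the
    // standard traits), so the test only ever sees the folded character.
    return traits.isctype(traits.translate_nocase(c), mask);
}
```

With the standard traits under the "C" locale, this accepts 'A' for "lower" but rejects both 'A' and 'a' for "upper", because the folded character is never uppercase. Folding the input alone is therefore not enough; the class test itself has to know about case-insensitivity.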

Imagine I write a traits class which recognizes [[:fubar:]], and the
"fubar" char class happens to be case-sensitive. How is the regex engine
to know that? And how should it do a case-insensitive match of a
character against the [[:fubar:]] char class? John, can you confirm this
is a legitimate problem?

I see two options:

1) Add a bool icase parameter to lookup_classname. Then,
lookup_classname( "upper", true ) will know to return lower|upper
instead of just upper.

2) Add an isctype_nocase function.

I prefer (1) because the extra computation happens at the time the
pattern is compiled rather than when it is executed.
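
The two options can be sketched side by side. For option (1), the sketch leans on the C++11 std::regex_traits<char>, whose lookup_classname accepts a trailing bool icase argument (defaulting to false). For option (2), isctype_nocase is written as a hypothetical free function, since no such member exists in the traits interface; testing both case variants of the character is just one plausible way it could behave. The names compile_class, traits_type and mask_type are also assumptions made for this example.

```cpp
#include <locale>
#include <regex>
#include <string>

using traits_type = std::regex_traits<char>;
using mask_type = traits_type::char_class_type;

// Option (1): resolve case-insensitivity once, when the pattern is compiled.
// The returned mask already accounts for icase, so match time is a plain isctype.
mask_type compile_class(const traits_type& traits, const std::string& name,
                        bool icase) {
    return traits.lookup_classname(name.begin(), name.end(), icase);
}

// Option (2), hypothetical: keep the case-sensitive mask and pay at match
// time by testing both the lowercase and uppercase forms of the character.
bool isctype_nocase(const traits_type& traits, char c, mask_type mask) {
    const auto& ct = std::use_facet<std::ctype<char>>(traits.getloc());
    return traits.isctype(ct.tolower(c), mask) ||
           traits.isctype(ct.toupper(c), mask);
}
```

With option (1), traits.isctype('a', compile_class(traits, "upper", true)) should accept the lowercase letter with no extra work per character. Option (2) reaches the same answer for "upper" but performs two case conversions and up to two class tests on every character examined, which is the run-time cost the preference above is about.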

-- End original message --

For what it's worth, John has also expressed his preference for option
(1) above.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html                       ]