Topic: Using temporary string in regex_match


Author: SG <s.gesemann@googlemail.com>
Date: Tue, 5 Mar 2013 23:50:09 -0800 (PST)
On Mar 5, 7:08 am, Vyacheslav Kononenko wrote:
>
> I hit an issue with using boost::regex_match()
>
>   boost::regex reg( ".*" );
>   std::string function();
>   boost::smatch what;
>   if( boost::regex_match( function(), what, reg ) )
>       std::cout << what[0].str() << std::endl;
>
> This code seems to have UB, as boost::smatch from boost-regex stores
> positions in the original string and uses them in the str() method. But
> the problem is that the std::string temporary is destroyed as soon as
> boost::regex_match() terminates.
>
> The bad thing is that this issue is not easy to diagnose, as binding a
> std::string temporary to a const reference is completely legitimate, and
> to understand that there is a problem with such code you have to dig
> into the implementation.

I would expect the documentation of regex_match and smatch to reflect
that. And indeed, the C++11 standard specifies smatch as an alias for
match_results<string::const_iterator>, which stores a sequence of
sub_match<string::const_iterator> objects. These are simply pairs of
iterators with a couple of extra member functions.

But I am all for lowering the chance of accidental misuse. One could
add the following overload

  template <class ST, class SA, class Allocator,
            class charT, class traits>
  bool regex_match(
            const basic_string<charT, ST, SA>&& s, // rvalue-ref!
            match_results<
              typename basic_string<charT, ST, SA>::const_iterator,
              Allocator>& m,
            const basic_regex<charT, traits>& e,
            regex_constants::match_flag_type flags =
              regex_constants::match_default) = delete; // deleted!

to the standard library, just as std::reference_wrapper<T> has a
similarly deleted constructor to prevent binding to rvalues.
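
Until something like that is in the library, the same trick can be tried
in user code. The following is only a minimal sketch: checked_match is a
hypothetical wrapper name of my own, not part of std or Boost, and it is
fixed to std::string, std::smatch and std::regex to keep it short.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper: forwards lvalue strings to std::regex_match.
inline bool checked_match(const std::string& s,
                          std::smatch& m,
                          const std::regex& e)
{
    return std::regex_match(s, m, e);
}

// Rvalue strings select this overload instead, so a call such as
//   checked_match(std::string("abc"), m, e);
// is rejected at compile time rather than leaving dangling iterators in m.
bool checked_match(const std::string&&,
                   std::smatch&,
                   const std::regex&) = delete;
```

A call with a named std::string compiles and behaves like
std::regex_match. A call with a temporary picks the deleted overload
because an rvalue binds better to the rvalue reference.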

> I believe following code has the same problem:
>
> boost::regex reg( ".*" );
> boost::smatch what;
> if( boost::regex_match( std::string( "abc" ), what, reg ) )
>     std::cout << what[0].str() << std::endl;
>
> std::regex_match has the same signature when match is passed. Does it
> have similar issue?

Yes. The smatch object called what will store dangling iterators.
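
The usual way around it is to give the string a name, so that it
outlives the match results. A minimal sketch with std::regex follows;
make_input and first_match are hypothetical names, with make_input
standing in for the function() from your example.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for function() in the quoted code.
std::string make_input() { return "abc"; }

// Returns the full match, or an empty string if there is no match.
std::string first_match(const std::regex& re)
{
    // Bind the result to a named object that lives as long as `what`.
    const std::string input = make_input();
    std::smatch what;
    if (std::regex_match(input, what, re))
        return what[0].str();  // iterators in `what` still point into `input`
    return std::string();
}
```

Here what[0].str() copies the characters while input is still alive, so
nothing dangles after first_match returns.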

Cheers!
SG


--
[ comp.std.c++ is moderated.  To submit articles, try posting with your ]
[ newsreader.  If that fails, use mailto:std-cpp-submit@vandevoorde.com ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]