Topic: parsing strings at compile-time
Author: restor <akrzemi1@gmail.com>
Date: Wed, 23 Jun 2010 17:20:40 CST Raw View
Hi,
Compile-time string parsing would be a very useful tool for
implementing user-defined literals with a custom syntax/format. Two
examples:
Date date1 = "29-Feb-2012"_date;
The compiler checks whether 2012 is a leap year, and therefore whether to
allow 29 as a day in February. If not, an error is reported during compilation.
Regex regex1 = "[a-zA-Z_]+[0-9]*"_regex;
The compiler checks that we haven't made any syntax error in the expression.
The C++ Standards Committee rejected this addition on the grounds that it
would make parsing of string literals too troublesome for compiler
implementers (if I got it right).
Now, I was thinking that we could achieve the same functionality by
allowing a small extension to constexpr functions:
Allow references to arrays of a known size as arguments to constexpr
functions (or are they already allowed?), and allow the indexing
operator[] (of the built-in array type) to be an acceptable constexpr
operation on such arguments. I.e. the following should be valid:
template< size_t N > constexpr
char fun( char (&arr) [N], int i )
{
    return arr[i];
}
Having this, we can add both run-time and compile-time reporting of
invalid index values:
template< size_t N > constexpr
char fun( char (&arr) [N], int i )
{
    return ( i >= 0 && i < N )
        ? arr[i]
        : throw Exception();
}
This works at compile-time because for valid values of i the
throw expression is not evaluated, while for invalid values of i we
get a compilation error saying that the call is not a constant expression.
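The idea can be tried out with a minimal sketch like the one below. It assumes a compiler that implements the final C++11 constexpr rules, under which a throw in a branch that is not evaluated does not prevent constant evaluation. The names `at` and `checkedAtRuntime`, and the choice of std::out_of_range as the exception, are illustrative assumptions and not part of any proposal.

```cpp
#include <cstddef>
#include <stdexcept>

// Checked element access usable in constant expressions.
// `at` is a hypothetical name; the array is taken by const reference
// so that string literals can be passed directly.
template< std::size_t N >
constexpr char at( const char (&arr)[N], std::size_t i )
{
    return i < N ? arr[i]
                 : throw std::out_of_range( "index out of range" );
}

// Valid index: the throw branch is never evaluated, so this is a
// constant expression.
static_assert( at( "abc", 1 ) == 'b', "second character is 'b'" );

// Invalid index in a constant expression: uncommenting the next line is
// expected to produce a compile-time error, because the throw would be
// evaluated.
// static_assert( at( "abc", 7 ) == 'x', "never reached" );

// With an index known only at run time, the same function reports the
// error by throwing instead.
inline char checkedAtRuntime( std::size_t i )
{
    return at( "abc", i );
}
```

The same function therefore serves both purposes: misuse in a constant expression is a compilation error, and misuse at run time is an exception.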
Below is an attempt to implement a compile-time string parsing
tool that would be useful for checking the validity of a date. I.e. it
would parse strings like "12-JUL-2007" and produce a Date or fail to
compile. I skip some parts, as I only wanted to prove the possibility
of implementing the extraction of substrings, iteration, and error reporting.
// ------ THE FACILITY FOR EXTRACTING SUBSTRINGS ------
template< size_t N >
struct SubStr
{
    const char (&arr) [N + 1];
    size_t beg, end;   // half-open range [beg, end) within arr
    constexpr SubStr( const char (&arr) [N + 1], size_t, size_t );
    constexpr char operator[]( size_t i ) const {
        return i >= end - beg ? throw Error()
                              : arr[beg + i];
    }
};
template< size_t N > constexpr
SubStr<N> subStr( const char (&arr) [N + 1], size_t beg, size_t end )
{
    return SubStr<N>( arr, beg, end );
}
// ---- FUNCTIONS FOR PARSING COMPILE-TIME STRINGS ----
template< size_t N, size_t M > constexpr
bool equals( SubStr<N>, SubStr<M> );
template< size_t N, size_t M > constexpr
bool equals( SubStr<N> str, const char (&arr) [M + 1] )
{
    // end is exclusive, so M leaves out the terminating '\0'
    return equals( str, subStr(arr, 0, M) );
}
template< size_t N > constexpr
int toInt( SubStr<N> );
// ----
template< size_t N > constexpr
unsigned month( SubStr<N> str )
{
    return equals(str, "JAN") ? 1
         : equals(str, "FEB") ? 2
         ...
         : equals(str, "DEC") ? 12
         : throw InvalidDate();
}
constexpr
Date date( int day, int mon, int year )
{
    return (day < 1 || day > 31) ? throw InvalidDate()
         : invalid31mon(day, mon) ? throw InvalidDate()
         : invalid30mon(day, mon) ? throw InvalidDate()
         : invalid29mon(day, mon, year) ? throw InvalidDate()
         : Date( day, mon, year );
}
constexpr
Date date( const char (&arr) [11 + 1] )
{
    return date(
        toInt( subStr(arr, 0, 2) ),
        month( subStr(arr, 3, 6) ),
        toInt( subStr(arr, 7, 11) )
    );
}
// ------ FUNCTIONS FOR ITERATING OVER THE STRING (2 DIRECTIONS) ----
template< size_t N, typename Trnsf, typename State > constexpr
State iterFwd( SubStr<N> str, size_t I, Trnsf transform, State state )
{
    return (I > N) ? throw Exception()
         : (I == N) ? state
         : iterFwd( str, I + 1, transform, transform(state, str[I]) );
}
template< size_t N, typename Trnsf, typename State > constexpr
State iterBack( SubStr<N> str, size_t I, Trnsf transform, State state )
{
    return (I >= N) ? throw Exception()
         : (I == 0) ? transform( state, str[0] )
         : iterBack( str, I - 1, transform, transform(state, str[I]) );
}
// ---- EXAMPLE OF USAGE OF THE ITERATION FUNCTION ----
struct GetNum;
struct NumCollector; // have to be literal types
template< size_t N > constexpr
int toInt( SubStr<N> str )
{
    return iterBack( str, N - 1, GetNum(), NumCollector() );
}
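To make the pieces above concrete, here is a self-contained sketch of the same date-parsing idea that should compile under the final C++11 rules. It is not the code above: it swaps SubStr<N> for a hypothetical pointer-plus-length view named `StrView` (so the array bound can be deduced), and it replaces iterBack/GetNum/NumCollector with plain recursion. Every name in it (`StrView`, `lit`, `digit`, `daysIn`, `makeDate`, `parseDate`) and the use of standard exception types are assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>

// Non-owning view of the characters arr[beg, end) of a string literal.
struct StrView
{
    const char* ptr;
    std::size_t len;

    template< std::size_t N >
    constexpr StrView( const char (&arr)[N], std::size_t beg, std::size_t end )
        : ptr( beg <= end && end <= N - 1 ? arr + beg
                                          : throw std::out_of_range( "bad range" ) ),
          len( end - beg ) {}

    constexpr char operator[]( std::size_t i ) const
    {
        return i < len ? ptr[i] : throw std::out_of_range( "bad index" );
    }
};

// View of a whole literal, without its terminating '\0'.
template< std::size_t N >
constexpr StrView lit( const char (&arr)[N] ) { return StrView( arr, 0, N - 1 ); }

constexpr bool equals( StrView a, StrView b, std::size_t i = 0 )
{
    return a.len != b.len ? false
         : i == a.len     ? true
         : a[i] == b[i] && equals( a, b, i + 1 );
}

constexpr int digit( char c )
{
    return c >= '0' && c <= '9' ? c - '0'
                                : throw std::invalid_argument( "not a digit" );
}

// Accumulates decimal digits from left to right.
constexpr int toInt( StrView s, std::size_t i = 0, int acc = 0 )
{
    return i == s.len ? acc : toInt( s, i + 1, acc * 10 + digit( s[i] ) );
}

constexpr int month( StrView s )
{
    return equals( s, lit( "JAN" ) ) ? 1  : equals( s, lit( "FEB" ) ) ? 2
         : equals( s, lit( "MAR" ) ) ? 3  : equals( s, lit( "APR" ) ) ? 4
         : equals( s, lit( "MAY" ) ) ? 5  : equals( s, lit( "JUN" ) ) ? 6
         : equals( s, lit( "JUL" ) ) ? 7  : equals( s, lit( "AUG" ) ) ? 8
         : equals( s, lit( "SEP" ) ) ? 9  : equals( s, lit( "OCT" ) ) ? 10
         : equals( s, lit( "NOV" ) ) ? 11 : equals( s, lit( "DEC" ) ) ? 12
         : throw std::invalid_argument( "unknown month" );
}

constexpr bool isLeap( int y )
{
    return ( y % 4 == 0 && y % 100 != 0 ) || y % 400 == 0;
}

constexpr int daysIn( int m, int y )
{
    return m == 2 ? ( isLeap( y ) ? 29 : 28 )
         : ( m == 4 || m == 6 || m == 9 || m == 11 ) ? 30 : 31;
}

struct Date { int day, month, year; };  // aggregate, hence a literal type

constexpr Date makeDate( int d, int m, int y )
{
    return d >= 1 && d <= daysIn( m, y ) ? Date{ d, m, y }
                                         : throw std::invalid_argument( "invalid date" );
}

// Parses exactly "DD-MMM-YYYY" (11 characters plus the terminator).
constexpr Date parseDate( const char (&arr)[12] )
{
    return arr[2] == '-' && arr[6] == '-'
        ? makeDate( toInt( StrView( arr, 0, 2 ) ),
                    month( StrView( arr, 3, 6 ) ),
                    toInt( StrView( arr, 7, 11 ) ) )
        : throw std::invalid_argument( "expected DD-MMM-YYYY" );
}

static_assert( parseDate( "29-FEB-2012" ).day == 29, "2012 is a leap year" );
// Expected to fail to compile if uncommented, since 2011 is not a leap year:
// static_assert( parseDate( "29-FEB-2011" ).day == 29, "" );
```

Unlike the month() above, this sketch only accepts upper-case month names of exactly three letters, and it folds the separate invalid31mon/invalid30mon/invalid29mon checks into a single days-in-month comparison.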
Regards,
&rzej
--
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@netlab.cs.rpi.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Author: Daniel Krügler <daniel.kruegler@googlemail.com>
Date: Thu, 24 Jun 2010 12:19:57 CST Raw View
On 24 Jun., 01:20, restor <akrze...@gmail.com> wrote:
[..]
> Now, I was thinking that we could achieve the same functionality by
> allowing a small extension to constexpr functions:
> Allow references to arrays of a known size as arguments to constexpr
> functions (or are they already allowed?), and allow the indexing
> operator[] (of the built-in array type) to be an acceptable constexpr
> operation on such arguments. I.e. the following should be valid:
>
> template< size_t N > constexpr
> char fun( char (&arr) [N], int i )
> {
>     return arr[i];
> }
This constexpr function is already valid as of the current FCD.
> Having this, we can add both run-time and compile-time reporting of
> invalid index values:
>
> template< size_t N > constexpr
> char fun( char (&arr) [N], int i )
> {
>     return ( i >= 0 && i < N )
>         ? arr[i]
>         : throw Exception();
> }
This won't compile in the run-time case, because the throw-
expression is always a non-constant expression.
> Below is an attempt to implement a compile-time string parsing
> tool that would be useful for checking the validity of a date. I.e. it
> would parse strings like "12-JUL-2007" and produce a Date or fail to
> compile. I skip some parts, as I only wanted to prove the possibility
> of implementing the extraction of substrings, iteration, and error reporting.
>
> // ------ THE FACILITY FOR EXTRACTING SUBSTRINGS ------
>
> template< size_t N >
> struct SubStr
> {
>     const char (&arr) [N + 1];
>     size_t beg, end;
>     constexpr SubStr( const char (&arr) [N + 1], size_t, size_t );
>     constexpr char operator[]( size_t i ) const {
>         return i >= end - beg ? throw Error()
>                               : arr[beg + i];
>     }
> };
This function (and, as far as I can see, the remaining ones as well) has
the same problem as above.
HTH & Greetings from Bremen,
Daniel Krügler
Author: Mathias Gaunard <loufoque@gmail.com>
Date: Thu, 24 Jun 2010 12:23:24 CST Raw View
On Jun 24, 12:20 am, restor <akrze...@gmail.com> wrote:
> Hi,
> Compile-time string parsing would be a very useful tool for
> implementing user-defined literals with a custom syntax/format. Two
> examples:
>
> Date date1 = "29-Feb-2012"_date;
>
> The compiler checks whether 2012 is a leap year, and therefore whether to
> allow 29 as a day in February. If not, an error is reported during compilation.
>
> Regex regex1 = "[a-zA-Z_]+[0-9]*"_regex;
>
> The compiler checks that we haven't made any syntax error in the expression.
>
> C++ Standards Committee rejected this addition on the grounds that it
> would make parsing of string literals too troublesome for compiler
> implementers (if I got it right).
There was also a proposal to turn foo<"bar"> into foo<'b', 'a', 'r'>.
I don't know what happened to that one.
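For readers unfamiliar with that form, here is a small sketch of what the expanded spelling looks like when written by hand in C++11. Only the name `foo` comes from the text above; the members `size` and `value` and the typedef `Bar` are hypothetical additions for illustration, and nothing here implements the proposed rewrite of foo<"bar"> itself.

```cpp
#include <cstddef>

// The string's characters carried as a template parameter pack.
template< char... Cs >
struct foo
{
    static constexpr std::size_t size = sizeof...( Cs );
    // Null-terminated copy of the characters, available at compile time.
    static constexpr char value[ sizeof...( Cs ) + 1 ] = { Cs..., '\0' };
};

// Out-of-class definition, needed before C++17 if `value` is odr-used.
template< char... Cs >
constexpr char foo< Cs... >::value[ sizeof...( Cs ) + 1 ];

// Today the characters have to be spelled out one by one.
typedef foo< 'b', 'a', 'r' > Bar;

static_assert( Bar::size == 3, "three characters" );
static_assert( Bar::value[0] == 'b' && Bar::value[3] == '\0',
               "characters followed by a terminator" );
```

Because each character is a separate template argument, the whole string is available to template metaprogramming, at the cost of the verbose spelling that the proposal aimed to remove.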