Topic: Why no persistence and RegEx in standard?


Author: mskuhn@cip.informatik.uni-erlangen.de (Markus Kuhn)
Date: 1995/06/01
>>>>>> On 24 May 1995 22:45:29 GMT, Kalyan Kolachala
><kal@chromatic.com> said:

>    Kalyan> I personally consider regular expressions to be very
>    Kalyan> useful.  Those from the Unix world will tell you how
>    Kalyan> frequently you need them and come across them. In most
>    Kalyan> cases one ends up using Perl, sed, etc. As it is, I have seen
>    Kalyan> a widespread use of the Regular expressions from various
>    Kalyan> class libraries but the code using them suffers from
>    Kalyan> portability problems.

A regexp library has already been standardized in the Posix standard
long ago (IEEE 1003.2). Just use this standard if you need a formal
standard that specifies regular expressions. Most good C++ programming
environments also conform to Posix. Where is the problem?
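
For what it's worth, calling the Posix interface from C++ might look roughly
like the sketch below. It assumes a platform that ships <regex.h> with
regcomp/regexec/regfree; posix_search is a made-up helper name, not something
defined by Posix or by any C++ draft.

```cpp
#include <regex.h>   // POSIX regcomp/regexec/regfree (not part of ISO C++)
#include <string>

// Hypothetical helper: returns true if `text` contains a match for the
// POSIX extended regular expression `pattern`. Returns false when there is
// no match or when the pattern fails to compile.
bool posix_search(const std::string& pattern, const std::string& text) {
    regex_t re;
    // REG_NOSUB: we only want a yes/no answer, not match offsets.
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    const int rc = regexec(&re, text.c_str(), 0, nullptr, 0);
    regfree(&re);    // always release the compiled pattern
    return rc == 0;
}
```

The C interface works on NUL-terminated strings, so this sketch cannot see
past an embedded '\0' in either argument.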

For more info: You can order Posix standards from IEEE:

  phone:  +1 908 981 1393 (TZ: Eastern Standard Time)
          +1 800 678 4333 (from US+Canada only)
  fax:    +1 908 981 9667

Markus

---
Markus Kuhn, Computer Science student -- University of Erlangen,
Internet Mail: <mskuhn@cip.informatik.uni-erlangen.de> - Germany
WWW Home: <http://wwwcip.informatik.uni-erlangen.de/user/mskuhn>





Author: akv@srl03.cacs.usl.edu (Anil Vijendran)
Date: 1995/05/29
>>>>> On 24 May 1995 22:45:29 GMT, Kalyan Kolachala
<kal@chromatic.com> said:

    Kalyan> I personally consider regular expressions to be very
    Kalyan> useful.  Those from the Unix world will tell you how
    Kalyan> frequently you need them and come across them. In most
    Kalyan> cases one ends up using Perl, sed, etc. As it is, I have seen
    Kalyan> a widespread use of the Regular expressions from various
    Kalyan> class libraries but the code using them suffers from
    Kalyan> portability problems.

Recalling from whatever little I remember, Jonathan Shopiro in "A C++
Toolkit" (PH) says that a lot of the reuse problems at AT&T (and thereby
the necessity to develop languages/constructs to alleviate them) were
discovered when they tried to reuse their reg exp libraries in
different applications.


--
Anil

___________________________________________________________________________
Anil K Vijendran                    USL Box 43007, Lafayette, LA 70504-3007
akv@cacs.usl.edu                                         (318) 232-5502 [H]





Author: maxtal@Physics.usyd.edu.au (John Max Skaller)
Date: 1995/05/28
In article <3prduj$da7@giga.bga.com>, Jamshid Afshar <jamshid@ses.com> wrote:
>It's actually easy to say why regular expressions are not part of
>(draft) Standard C++.  As far as I know no one ever made a proposal to
>the committee, so the most correct answer is "no one asked".
>
>If a regular expression proposal had been made, I would bet one of the
>major problems would have been agreeing on a standard regex syntax.
>Which one?

 NONE of them. It's obvious. The correct way to handle
regular searches is to pass the search an object representing
a regular grammar and NOT a string.

 The reason is simple -- one can then invent
parsers which translate your favourite regular expression
language into a regular grammar object.

 An example of how to construct a regular grammar:

 RegGram G0; // null
 RegGram G1("string"); // recognise a string
 ...
 G10 = (G5 || G7) && *G3 && +G2 && opt(G7);

You may prefer

 rept0(G3)
 rept1(G3)
 rept(G3,10);

.. etc for repetition.

That is, the only sensible regular language to provide is
C++ itself. It is easy enough to write

 emacs_regexp("... an emacs regexp here ..");

which parses the string and returns a RegGram.
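
To make the idea a bit more concrete, here is a minimal sketch of what such
a grammar object could look like. Only the names RegGram and opt and the
operator spelling come from the post above; everything else (the node tree,
the matches() member, treating the "null" grammar as one that matches the
empty string, and matching by collecting every reachable end position) is an
assumption made for illustration, not an existing library.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical sketch: RegGram here is an illustration, not a real library.
namespace detail {
enum class Kind { Literal, Alt, Seq, Star };
struct Node;
using NodePtr = std::shared_ptr<const Node>;
struct Node {
    Kind kind;
    std::string text;      // used by Literal only
    NodePtr left, right;   // operands of Alt/Seq; Star uses left only
};
using Ends = std::set<std::size_t>;

// Every position where `n` can stop matching `in` when started at `pos`.
inline Ends ends(const NodePtr& n, const std::string& in, std::size_t pos) {
    Ends out;
    switch (n->kind) {
    case Kind::Literal:
        if (in.compare(pos, n->text.size(), n->text) == 0)
            out.insert(pos + n->text.size());
        break;
    case Kind::Alt: {
        out = ends(n->left, in, pos);
        Ends r = ends(n->right, in, pos);
        out.insert(r.begin(), r.end());
        break;
    }
    case Kind::Seq:
        for (std::size_t mid : ends(n->left, in, pos)) {
            Ends r = ends(n->right, in, mid);
            out.insert(r.begin(), r.end());
        }
        break;
    case Kind::Star: {
        // Closure: extend from newly reached positions until none are new.
        Ends frontier{pos};
        out.insert(pos);
        while (!frontier.empty()) {
            Ends next;
            for (std::size_t p : frontier)
                for (std::size_t e : ends(n->left, in, p))
                    if (out.insert(e).second) next.insert(e);
            frontier = std::move(next);
        }
        break;
    }
    }
    return out;
}
} // namespace detail

class RegGram {
public:
    RegGram() : RegGram(std::string()) {}   // assumed: matches the empty string
    explicit RegGram(std::string s)
        : node_(make(detail::Kind::Literal, std::move(s))) {}

    // True if the whole of `in` is recognised by this grammar.
    bool matches(const std::string& in) const {
        return detail::ends(node_, in, 0).count(in.size()) != 0;
    }
    friend RegGram operator||(const RegGram& a, const RegGram& b) {   // alternation
        return RegGram(make(detail::Kind::Alt, "", a.node_, b.node_));
    }
    friend RegGram operator&&(const RegGram& a, const RegGram& b) {   // concatenation
        return RegGram(make(detail::Kind::Seq, "", a.node_, b.node_));
    }
    RegGram operator*() const {   // zero or more
        return RegGram(make(detail::Kind::Star, "", node_));
    }
    RegGram operator+() const {   // one or more: g followed by g*
        return *this && RegGram(make(detail::Kind::Star, "", node_));
    }

private:
    explicit RegGram(detail::NodePtr n) : node_(std::move(n)) {}
    static detail::NodePtr make(detail::Kind k, std::string t,
                                detail::NodePtr l = nullptr,
                                detail::NodePtr r = nullptr) {
        return std::make_shared<detail::Node>(
            detail::Node{k, std::move(t), std::move(l), std::move(r)});
    }
    detail::NodePtr node_;
};

// Optional: either g or the empty string.
inline RegGram opt(const RegGram& g) { return g || RegGram(); }
```

With this sketch, (RegGram("a") || RegGram("b")) && *RegGram("a") builds a
grammar for an 'a' or 'b' followed by any number of 'a's. Two things to keep
in mind: overloaded && and || do not short-circuit, and && binds tighter
than ||, which happens to agree with the usual convention that concatenation
binds tighter than alternation.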

--
        JOHN (MAX) SKALLER,         INTERNET:maxtal@suphys.physics.su.oz.au
 Maxtal Pty Ltd,
        81A Glebe Point Rd, GLEBE   Mem: SA IT/9/22,SC22/WG21
        NSW 2037, AUSTRALIA     Phone: 61-2-566-2189





Author: jamshid@ses.com (Jamshid Afshar)
Date: 1995/05/23
It's actually easy to say why regular expressions are not part of
(draft) Standard C++.  As far as I know no one ever made a proposal to
the committee, so the most correct answer is "no one asked".

If a regular expression proposal had been made, I would bet one of the
major problems would have been agreeing on a standard regex syntax.
Which one? Sun's, HP's, grep, agrep, GNU grep, perl, Emacs?  I
*believe* there is now a POSIX standard for a regex syntax and for C
regex functions.  If so, a C++ proposal could have been based on that.
How about regex substitutions?  What should the class or classes do
exactly?  How do they interact with the standard string class?
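
Purely as a sketch of one possible answer to those questions, a class could
own a compiled POSIX pattern and hand back ordinary standard strings. The
class name PosixRegex and the member replace_first are invented for this
illustration; nothing like this has been proposed, and it assumes a platform
that provides the POSIX <regex.h> functions.

```cpp
#include <regex.h>    // POSIX regcomp/regexec/regfree
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical wrapper; the class and member names are illustrative only.
class PosixRegex {
public:
    explicit PosixRegex(const std::string& pattern) {
        if (regcomp(&re_, pattern.c_str(), REG_EXTENDED) != 0)
            throw std::invalid_argument("bad regular expression: " + pattern);
    }
    ~PosixRegex() { regfree(&re_); }
    PosixRegex(const PosixRegex&) = delete;             // regex_t is not copyable
    PosixRegex& operator=(const PosixRegex&) = delete;

    // Returns a copy of `text` with the first match replaced by
    // `replacement` (taken literally); returns `text` unchanged if no match.
    std::string replace_first(const std::string& text,
                              const std::string& replacement) const {
        regmatch_t m;
        if (regexec(&re_, text.c_str(), 1, &m, 0) != 0)
            return text;
        const std::size_t start = static_cast<std::size_t>(m.rm_so);
        const std::size_t stop = static_cast<std::size_t>(m.rm_eo);
        std::string out(text, 0, start);          // part before the match
        out += replacement;
        out.append(text, stop, std::string::npos); // part after the match
        return out;
    }

private:
    regex_t re_;
};
```

Even this small sketch has to pick answers to the open questions: errors are
reported by exception, the replacement text has no back-references, and
strings with embedded NUL characters are not handled.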

A proposal that answered all these questions might have still gotten
rejected because the committee simply didn't want to take on this
responsibility and didn't feel C++ would be hurt by not having this
functionality in the (already huge) standard.

PS: check out Jim Morris' free SPLASH library.  It's a great "Small
Perl-like List And String Handling class library" at
ftp://ftp.netcom.com/pub/mo/morris/.  I like it a lot better than
commercial regex libraries I've seen (it has split() and regex
substitution).

Jamshid Afshar
jamshid@ses.com





Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1995/05/24
In article <1995May22.162305.2281@nlm.nih.gov> bkline%occs.nlm.nih.gov
(Bob Kline Phoenix Contract) writes:

|> James Kanze US/ESC 60/3/141 #40763 (kanze@lts.sel.alcatel.de) wrote:
|> : [snip]

|> : Re regular expressions:

|> : 1. It is really a specialized application, and not of general
|> : interest.  There's no Fourier transformation either.

|> I seriously doubt that regular expressions are of less general interest
|> than, say, complex numbers.

I seriously doubt that they are of more general interest than, say
Fourier transformations:-).

Seriously, as was pointed out in the discussion with the original
poster, regular expressions *are* an extremely powerful tool, with
widespread potential uses.  They are also not that widely known; many
people who probably should be using them are not familiar with them.
If these people did know the use of regular expressions, then I agree
that they would belong in the standard.  But it is not the role of the
standard to educate these people.
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung







Author: Kalyan Kolachala <kal@chromatic.com>
Date: 1995/05/24
kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
wrote:

>
>I seriously doubt that they are of more general interest than, say
>Fourier transformations:-).

I personally consider regular expressions to be very useful.
Those from the Unix world will tell you how frequently you
need them and come across them. In most cases one ends up
using Perl, sed, etc. As it is, I have seen widespread use of
regular expressions from various class libraries, but
the code using them suffers from portability problems.

I guess they were not added due to time constraints and I
hope will be considered in the next revision.
>
>Seriously, as was pointed out in the discussion with the original
>poster, regular expressions *are* an extremely powerful tool, with
>widespread potential uses.  They are also not that widely known;

I am not just talking of potential use (which is enormous)
but actual use and that too is quite significant, except
maybe in the DOS world as there are not enough tools
available.

- Kalyan







Author: bkline%occs.nlm.nih.gov (Bob Kline Phoenix Contract)
Date: 1995/05/22
James Kanze US/ESC 60/3/141 #40763 (kanze@lts.sel.alcatel.de) wrote:
: [snip]

: Re regular expressions:

: 1. It is really a specialized application, and not of general
: interest.  There's no Fourier transformation either.

I seriously doubt that regular expressions are of less general interest
than, say, complex numbers.

--
/*----------------------------------------------------------------------*/
/* Bob Kline                                       Stream International */
/* bob_kline@stream.com               formerly Corporate Software, Inc. */
/* voice: (703) 522-0820 x-311                      fax: (703) 522-5407 */
/*----------------------------------------------------------------------*/





Author: "Eugene Radchenko" <eugene@qsar.chem.msu.su>
Date: 1995/05/16
Hello all!

Bjarne Stroustrup <bs@research.att.com> writes:
>Subject: [NEWS] Draft ANSI/ISO C++ Standard available
[...]
>Let me give a few examples of suggestions that I personally think
>would stand little chance of acceptance at this stage:
[...]
>        adding a regular pattern matching library
>        adding a persistence library

I do agree with / can understand the (probable) rejection of the other
things. Nevertheless, these two libraries seem rather necessary and not
terribly difficult to implement. After all, there exist lots of regex and
persistence libraries which support different features and are incompatible
with one another. (That's what the standard is supposed to rectify?).
Persistence especially needs tighter integration with the language itself
since the library should support all class layout possibilities (multiple
inheritance, virtual inheritance, etc).

I am sorry if this was discussed in the D&E (C++) book. Unfortunately it is
not readily available here. (Is there by any chance an electronic version?)

     Hope this helps in making the standard better             Eugene

P.S. BTW, is there any way to disconnect a certain person from Usenet
(yes, I mean Mr. Fleming)? If not, maybe serious people should not answer
his posts? Otherwise, comp.{lang,std}.c++ will soon be drowned by the
threads generated by him. On the other hand, useful info is sometimes
hidden in the replies, if someone happens to read them despite the thread
name. Maybe we should devise some mark to the effect of 'This post also
contains useful info'? (Though Mr. Fleming will probably see this as further
evidence of a C++ conspiracy...).



--
-------------------------------------------------------------------
Eugene V. Radchenko         Graduate Student in Computer Chemistry
E-mail: eugene@qsar.chem.msu.su        Fax: +7-(095)939-0290
Ordinary mail: Chair of Organic Chemistry, Department of Chemistry,
               Moscow State University, 119899 Moscow, Russia






Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1995/05/17
In article <ABxE5klqsU@qsar.chem.msu.su> "Eugene Radchenko"
<eugene@qsar.chem.msu.su> writes:
|> Hello all!

|> Bjarne Stroustrup <bs@research.att.com> writes:
|> >Subject: [NEWS] Draft ANSI/ISO C++ Standard available
|> [...]
|> >Let me give a few examples of suggestions that I personally think
|> >would stand little chance of acceptance at this stage:
|> [...]
|> >        adding a regular pattern matching library
|> >        adding a persistence library

|> I do agree with / can understand the (probable) rejection of the other
|> things. Nevertheless, these two libraries seem rather necessary and not
|> terribly difficult to implement. After all, there exist lots of regex and
|> persistence libraries which support different features and are incompatible
|> with one another. (That's what the standard is supposed to rectify?).
|> Persistence especially needs tighter integration with the language itself
|> since the library should support all class layout possibilities (multiple
|> inheritance, virtual inheritance, etc).

Re regular expressions:

1. It is really a specialized application, and not of general
interest.  There's no Fourier transformation either.

2. At least some of the incompatibilities in the regular expression
classes are due to their meeting different constraints.  I wrote my
own regular expression class because none of the existing ones were
adequate for my application.  But I suspect that mine is not adequate
for most other applications.  In this sense, we actually need the
variety.

Re persistency:

There is a serious lack of established practice.  I agree that we need
some sort of standardization in this area, but I do not think that the
technologies are currently ripe enough to put this into the standard.
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung







Author: jak@cs.brown.edu (Jak Kirman)
Date: 1995/05/17
>>>>> "James" == James Kanze US/ESC 60/3/141 #40763 <kanze@lts.sel.alcatel.de> writes:

 James> Re regular expressions:

 James> 1. It is really a specialized application, and not of general
 James> interest.  There's no Fourier transformation either.

I don't agree with this at all -- I think regular expressions are far
more commonly used than Fourier transforms (or even complex numbers, for
that matter), or would be if there were good standards available.  One
of the most powerful features of Perl, in my experience, is its very
extensive regular-expression and string-handling capabilities.

 James> 2. At least some of the incompatibilities in the regular expression
 James> classes are due to their meeting different constraints.  I wrote my
 James> own regular expression class because none of the existing ones were
 James> adequate for my application.  But I suspect that mine is not adequate
 James> for most other applications.  In this sense, we actually need the
 James> variety.

Applications that use regular expressions very heavily probably would
need to define their own, or use one of the various available packages.
But most of the time regular expressions are used in cases where
overhead is relatively unimportant (e.g., they are often associated with
user input, or filename manipulation).  A standard regular expression
package would be sufficient for most applications.

Unfortunately, people don't realize how useful regular expressions are
until they have been given a good regexp library to use.  A lot of the
software engineers I teach don't even know what they are.

                         Jak Kirman                        jak@cs.brown.edu
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Aerodynamics are for people who can't design engines.
                                                             -- Enzo Ferrari





Author: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Date: 1995/05/18
In article <jakzndwxy5rboov@remington.cs.brown.edu> jak@cs.brown.edu
(Jak Kirman) writes:

|> Unfortunately, people don't realize how useful regular expressions are
|> until they have been given a good regexp library to use.  A lot of the
|> software engineers I teach don't even know what they are.

Agreed.  This is probably the crux of the matter.  Regular expressions
won't be considered as appropriate for the standard until they are
widely used, and they won't be widely used until they are in the
standard:-).

In fact: it is not the role of the standard to *teach* computer
science.  I agree with your comments concerning the usefulness of
regular expressions, to the degree that a regular expression class
was the third class I ever wrote (after string and associative array).
But the fact is, they are not widely known or used.  Ask any software
engineer about complex or array, and he immediately knows what you are
talking about, and what the uses are.  As you point out, however,
regular expressions are *not* in this category.  Whether they should
be or not is not really the business of the C++ standard.
--
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung