Topic: #pragma STDC ONCE: interest?


Author: giecrilj@stegny.2a.pl (=?iso-8859-1?Q?Kristof_Zelechovski?=)
Date: Mon, 29 Jan 2007 18:51:05 GMT
"James Kanze" <james.kanze@gmail.com> wrote in message news:1165591160.970620.325980@f1g2000cwa.googlegroups.com...

> Linking isn't the only problem, although linked header files do
> occasionally occur.  (I use them a lot to handle common cases
> in dependent directories---if Solaris and Linux actually need
> the same file, the Solaris and the Linux dependency directories
> each contain a link to the common file.  Of course, any single
> compile will only use one.)  Remotely mounted files pose a

Not a good idea.
You achieve the same effect if you create a file link.h that #includes "target.h",
and the additional benefit is that the __FILE__ macro and the debug information
show you where you really are.
Technically, a soft link is a regular small file with a link attribute,
and the content of the file consists of the path of the target with some structural decoration.
Why not replace this system-dependent structural decoration with the standard #include directive?
(I understand that autoconf provides support for soft links and not for preprocessor redirections;
I think it is a problem with autoconf and it should be fixed.)

Chris

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: kuyper@wizard.net
Date: Tue, 12 Dec 2006 10:40:26 CST
Greg Herlihy wrote:
> kuyper@wizard.net wrote:
> > Greg Herlihy wrote:
> > .
> > > Start with the definition of the __FILE__ macro (i.e the name of the
> > > file that contains the #pragma once directive), unstringify the name
> > > (i.e. strip the enclosing quotation marks), replace periods (and any
> > > other character not allowed in an identifier) with underscores,
> > > uppercase the name, and prepend an underscore (so the #pragma once
> > > macro should not conflict with any user-declared macros).
> >
> > There are a couple of  problems:
> > Two different files might be mapped to the same macro name by that
> > algorithm.
>
> If the two files have the same name then they would in fact have the
> same implicitly-defined macro name

I'm talking about two files with different names that get mapped to the
same macro name. You've defined a many-to-one mapping, both because all
characters not allowed in identifiers are converted to the same
character in a macro name, and because lower and upper case letters get
converted to the same character. If you're mainly used to a
case-insensitive file system, that last item probably seems
unimportant, but the standard shouldn't impose case-insensitivity on
all systems just because some systems are case insensitive.

> - so a #once directive (I agree that
> the #pragma should be dropped) in one of the files would inhibit the
> subsequent inclusion of the other, within a single translation unit.
> This behavior would be quite deliberate. In fact, one of the uses for a
> #once directive that I can anticipate would be to aid programmers in
> detecting these kinds of header name collisions in order that they can
> be addressed as soon as they have been detected.

I'm not talking about a header name collision. I'm talking about files
with different names that map to the same macro name.

> > This idea intrudes upon a very large portion of the identifier name
> > space that used to be reserved to implementations. The macro
> > corresponding to a given file name might be one the implementation is
> > already using for some other purpose. It's trivial to construct
> > filenames where the corresponding macro name is the same as one of the
> > standard-defined macros, such as "_stdc._", though those are odd enough
> > that they're not likely to come up in practice. The ones that conflict
> > with implementation-defined macros are a much bigger problem.
>
> I don't see how searching and replacing for macro names in a set of
> Standard header files is likely to prove all that formidable a task for
> most C++ implementors. After all, one has to assume that a C++
> implementor is likely to have at least a passing acquaintance with
> regular expressions - especially after having implemented a complete
> regular expression library for the Standard.

What does search and replace of standard headers have to do with it?
Consider an implementation that has documented for years to its users the
fact that setting a macro named __OS400__ changes the behavior of its
standard headers in some way that is useful to at least some of the
users. Those users have an existing code base that sets and uses that
macro. With your suggestion, a #once statement found inside a header
file whose computed macro name happens to match __OS400__ will cause
problems for such code. They would basically have to prohibit the use
of #once in files with such names, or require that all legacy code
containing that macro be changed to refer to a new macro name chosen to
be impossible to duplicate with a #once directive. Given the rules
you've suggested, the only way I can see to make a safe macro name is
to include at least one lower case letter, and that's only because
you've caused additional problems at the other end by mapping lower
case letters to uppercase letters.

Most file names that could cause conflicts would have to have rather
peculiar names. Therefore, prohibiting #once in files with such names
would be an annoyance, but no more. If there were a good reason for
imposing such a prohibition, it might even be worthwhile. But I don't
see that there's any need to define this feature in terms of defining
macro names. All you have to do is say that #once adds the current
value of __FILE__ to a list of one-time only header files, and that
#include is not required to actually open and read a header if it's on
that list.
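
The list-based alternative can be sketched roughly as follows. The class OnceList and its member names are hypothetical; a real preprocessor would keep this state internally, and the string passed in merely stands for the value of __FILE__.

```cpp
#include <set>
#include <string>

// Hypothetical model of the preprocessor state for one translation unit.
class OnceList {
public:
    // Called when a #once directive is processed; `file` stands for the
    // current value of __FILE__.
    void mark_once(const std::string& file) { once_only_.insert(file); }

    // Called before opening a header: returns false if the header is on
    // the once-only list and therefore need not be opened again.
    bool must_read(const std::string& file) const {
        return once_only_.count(file) == 0;
    }

private:
    std::set<std::string> once_only_;
};
```

No macro is defined anywhere in this sketch, so it cannot interact with macro names used by the program or the implementation.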

> And although the burden that any change to the C++ language would have
> on implementors should always be taken into account when considering
> the feature - it is really the benefit to user programs that matters
> above all else. The reason why the C++ Standard reserves certain types
> of names to the implementation is precisely to have the option of
> adding a feature like a #once directive and not run the risk of
> breaking current user programs.

The problem is that your suggestion removes an enormous range from the
list of names that an implementation can safely use. In particular, it makes
every macro name that consists solely of uppercase letters, digits, and
underscores unsafe, and that's one of the most popular categories of
implementation-defined macro names.

> > with the same value for __FILE__  as the value when the #pragma once
> > directive was processed. The macros are just a kludge to keep track of
> > that name. Why not simply define the behavior directly in terms of the
> > file name? Let the implementation worry about how to keep track of it.
>
> The proposal is for the preprocessor to test for the same
> implicitly-defined macro that a #once directive appearing in the header
> about to be included - would define. Otherwise there would be little
> point in making the comparison since there would be no possibility of
> finding a match. In other words:
>
>     #include <headers/MyHeader.h>
>     #include <MyHeader.h>
>
> would both test for the same macro, since the __FILE__
> macro in a header called "MyHeader.h" would always be defined in the
> same way.

You seem to be suggesting that there's a problem with my alternative
suggestion that would come up in this situation, that would be avoided
by using macro definitions. I don't follow that. Under my suggestion,
if the file #included by the first directive had a #once directive,
then the value of __FILE__ in that file would be added to the once-only
list. If the second #include causes a search that locates the same
header file, then the implementation could determine before opening it
that it would have a value of __FILE__ that would match an existing
entry in the once_only list, and would therefore not need to re-open
it.
With your proposal, the first #include would cause the implicit
definition of a macro; the second #include would find a file that the
implementation could determine would have the same implicitly defined
macro name, and would therefore not need to be opened.

The only differences I can see between your variation and mine are:
1. Your version interacts badly with other uses of the same macro name
in the program, whereas mine has no interaction with the macro names at
all.
2. Your version treats as identical some pairs of files with different
names, mine doesn't.

> > Of course, any scheme that uses the filename alone to identify which
> > files are the same provides less protection than header guards do. With
> > header guards, if the same file can be found in the #include search
> > path by two different names, it is still protected against double
> > inclusion, which is not the case with filename based approaches.
>
> Only if the name of the included file changes while the translation
> unit is being compiled.

I'm thinking about a file which is accessible by two different names at
the same time, not a file whose name changes during the compilation.
This situation can be created on Unix-like systems by soft links, hard
links, or by mounting overlapping portions of a file system at two
different mount points. I'm less familiar with other operating systems;
it's been a decade since I last compiled code for a non-Unix-like
system, so I have no idea whether Windows compilers treat shortcuts the
same way that Linux compilers handle soft links. However, every
operating system I've ever used allowed you to create a new copy of a
file with a different name, which is yet another way to create this
situation.

The header-guard system prevents double inclusion in all of those
cases, even the case where you copy the file to a new name. A #once
directive whose behavior is defined in terms of the filename won't
catch any of them. It may be arguable whether this difference is
important, but it's unarguably a difference.
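
To illustrate the difference, here is a sketch in which the same header text appears twice in one translation unit, as it would if the file were reached under two names or copied to a new name. The guard macro WIDGET_H_INCLUDED and the type Widget are made up for the example.

```cpp
// First copy of the header text, as reached through one file name.
#ifndef WIDGET_H_INCLUDED
#define WIDGET_H_INCLUDED
struct Widget { int id; };
#endif

// Second copy of the same text, as reached through a link or a copy
// with a different name. The guard macro is already defined, so the
// body is skipped and Widget is not redefined.
#ifndef WIDGET_H_INCLUDED
#define WIDGET_H_INCLUDED
struct Widget { int id; };
#endif

// Compiles only because the second definition above was skipped.
int widget_id(const Widget& w) { return w.id; }
```

The guard works because it is keyed on the contents of the file (the macro name written inside it), not on the name by which the file was found.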






Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 10:41:46 CST
ThosRTanner wrote:
> "Bo Persson" wrote:
> > ThosRTanner wrote:

> > > I'm sure you do - we do too. But do you really have 2 different mounts
> > > to the same server, which is what would be required to cause the
> > > issue. I would really query a system set up like that.

> > I have mounts to different department's disks. I have no idea where the
> > files are stored physically. Could be on the same server, or a different
> > one, or on some NAS device, or in another city. I don't know.

> Who needs to know where the files are stored physically? It's
> irrelevant. The point is that if a/b.h and f/g.h are actually the same
> file, you are going to have an awful lot of trouble anyway.

Such totally different names probably won't occur in practice,
but what about a/b.h and b.h, including from files in different
directories.  This is an everyday occurrence here: files in the
subsystem a use #include "b.h", and other users use #include
"a/b.h".  There are historical reasons for this, and I'm far
from sure it is the right policy.  (It's not what I do for the
files at my web site, for example.)  But I suspect that it's not
a rare occurrence.

On the other hand, if I have files "a/b.h" and "c/b.h", header
files in subsystems a and c will both include simply "b.h", each
header getting a different file.

In fact, of course, the compiler can (and generally must) keep
track of the directory where it is running, and the directory
from where it actually loaded the file, so it can recognize in
all cases which files are actually the same file.  Barring
perverse anomalies like links, multiple remote mounts of files,
etc.  But how do you specify this in standardese?  Including the
fact that you don't get guaranteed results in the perverse
cases.
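
A rough sketch of that bookkeeping, assuming only the usual rule that #include "name" is looked up first in the directory of the including file. The helper resolved_name is hypothetical, it ignores additional search paths, and it works purely lexically, so links and multiple mounts go undetected, exactly as in the perverse cases.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: resolves the name in #include "name" against the
// directory of the including file. Purely lexical, no disk access.
std::string resolved_name(const fs::path& including_file,
                          const std::string& include_name) {
    fs::path candidate = including_file.parent_path() / include_name;
    return candidate.lexically_normal().generic_string();
}
```

With this sketch, "b.h" included from a/x.h and "a/b.h" included from a file in the top-level directory both resolve to a/b.h, while "b.h" included from c/y.h resolves to c/b.h.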

> Who is going to realise that altering b.h means that all the
> files using g.h need rebuilding?

The automatically generated dependency files of make.  (Does
anyone today ever actually consider what the dependencies are,
rather than letting the compiler do this?)

> > > Like I said - I don't expect the compiler to resolve that sort of
> > > thing, And if you have 2 different hard paths to the same file,
> > > frankly, you'll have nasty build and maintenance problems anyway.

> > So I just add a #pragma once, and the compiler will sort it out for me?
> > :-)

> I don't think it's going to make your maintenance issues a lot worse.

You still have to specify when it works, and when it doesn't.

> > > I'd still prefer a #pragma includemultiple - then the default
> > > behaviour would be what everyone wants. The paranoid (or those with
> > > IMHO insane network setups) can always add this to all their
> > > headers, along with the include guards.

> > That doesn't help either, if the compiler can't tell whether it is the same
> > file, or not.

> Well, as I said, if your system is set up badly, you can put it in
> every header file. By default you wouldn't need to put anything in any
> header file.

For better or for worse, the default will have to conform with
current existing practice.  We can't break existing, working
code, regardless of how bad you think it is.  (That's not quite
true, of course, but to do so would require a very big win, and
it's obvious that any benefits here are minimal.)  The default
is multiple inclusion, with a #pragma once.

All that's missing is a reasonable definition of "once".

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 10:41:46 CST
"Andrei Alexandrescu (See Website For Email)" wrote:
> Andrew Marlow wrote:
> > On Fri, 08 Dec 2006 17:07:49 +0000, Andrei Alexandrescu (See Website For
> > Email) wrote:

> >>>>The optimization based on recognizing it is based on recognizing
> >>>>that two includes actually include the same file.

> >>>Ok, but how can a "case of doubt" be reasonably identified?

> >>A compiler running on a not-too-smart filesystem could compute and cache
> >>on disk md5 checksums for each included file.

> > IMO it's a lot simpler than that. The compiler can remember each #include
> > statement it has seen. This won't catch all cases, since people can vary
> > what comes after the '#include' and still be referring to the same file,
> > but nonetheless it seems like a simple and useful optimisation to me...

> That can be done on some systems. It has been said repeatedly in this
> thread that that's not enough.

I think the issue is a bit more subtle.  On some common systems
(Linux, Solaris), I can set it up so that:

    #include "abc.h"
    #include "abc.h"

reads two different sets of data.  I think that there's also
general agreement that the standard doesn't have to support
things that perverse.

On the other hand, if these two includes were in two different
header files, the including header files are in different
directories, and both directories contained a file "abc.h", not
only will I get two different files, that's what I expect and
want.  Simply comparing the [hq]-char-sequence is not sufficient
to say that two files are the same file.

Beyond that, I've not seen any concrete explanation of what the
standard should actually say.  The fact that whatever it says
will allow "mistakes" in some perverse cases doesn't bother me
too much (as long as you don't consider the environments I
actually work in "perverse":-)).  But you need to find some
wording to specify this, in an environment neutral manner.

> The md5 checksums database ensures that
> #pragma once is implementable on _all_ systems.

But what does it buy you, if you have to read the entire file
each time to recalculate the MD5 checksum?  You've just deferred
the problem: how does whatever system manages the MD5 checksums
know when it has to recalculate them?

> The point is that the feature is reasonably easy to implement
> on any system that has timestamps on files.

On any system that has "reliable" and meaningful timestamps on
files.  Thus, not on a system which allows you to copy in your
files from some other system.  Or to remote mount file systems,
at least not with NFS or SMB.

Timestamps are not reliable:-(.  (At least not on NFS mounted
files, or files which have been copied from other systems.  I
know this all too well.  Make depends on them, and I've had no
end of problems because of it---modified sources not getting
recompiled, etc.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 10:43:22 CST
Greg Herlihy wrote:
> A formal definition for "#pragma once" would be a paradigm of
> simplicity. Certainly file system issues would present no
> difficulties. The file system is largely irrelevant anyway because a
> #pragma once works - not with files - but with the names of files as
> they appear in an #include directive.

So you break all of the code we've got here.  And you risk
breaking a lot of libraries.  You are basically saying that the
filename which appears in an include directive must be unique
across the entire translation unit.  This has never been the
case in the past---when you write #include "abc.h", the first
place the compiler looks (at least, every compiler I've ever
used) is in the directory which contains the including file.  So
you include "x/a.h" for something in library/sub-system x, and
"y/b.h" for something in library/sub-system y, and both of these
files include "bits.h" for their internal stuff.

    [...]
> A #pragma once has an additional benefit which - while less tangible -
> is nonetheless one that should not be discounted: and that benefit is,
> what it would do for C++ itself. Frankly, the need for header guards is
> simply an embarrassment. The embarrassment is not merely that textual
> inclusion of source code is antiquated (it is), nor is it merely the
> kludgy way that header guards cover an apparent shortcoming in the
> language (they do).

I can actually agree with this.  But the solution isn't finding
a hack to avoid having to write header guards.  The solution is
to get rid of textual include completely.  (Or rather provide a
better substitute for it.  Regardless of how big an improvement
it might be, removing #include from the language would break
a bit more existing code than would be acceptable, I think.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 10:43:17 CST
Greg Herlihy wrote:
> kuyper@wizard.net wrote:
> > Greg Herlihy wrote:

> > > Start with the definition of the __FILE__ macro (i.e the name of the
> > > file that contains the #pragma once directive), unstringify the name
> > > (i.e. strip the enclosing quotation marks), replace periods (and any
> > > other character not allowed in an identifier) with underscores,
> > > uppercase the name, and prepend an underscore (so the #pragma once
> > > macro should not conflict with any user-declared macros).

> > There are a couple of  problems:
> > Two different files might be mapped to the same macro name by that
> > algorithm.

> If the two files have the same name then they would in fact have the
> same implicitly-defined macro name

You mean, of course, if the include directive of two files
contains the same [qh]-char-sequence, they will be treated as
the same file.  (At least if I understand your proposal
correctly.)  This is obviously unacceptable, since it breaks
just about every library in existence.  (What, I'm not allowed
to have an include file named ext/vstring.h or bits/gslice.h,
just because I'm compiling with g++?  Or is g++ forbidden to use
files with these names in its library implementation?)

    [...]
> > This idea intrudes upon a very large portion of the identifier name
> > space that used to be reserved to implementations. The macro
> > corresponding to a given file name might be one the implementation is
> > already using for some other purpose. It's trivial to construct
> > filenames where the corresponding macro name is the same as one of the
> > standard-defined macros, such as "_stdc._", though those are odd enough
> > that they're not likely to come up in practice. The ones that conflict
> > with implementation-defined macros are a much bigger problem.

> I don't see how searching and replacing for macro names in a set of
> Standard header files is likely to prove all that formidable a task for
> most C++ implementors.

And how do they know what to search for?  You're asking them to
search for patterns based on the names of my files.

And of course, the compiler implementor doesn't always have
access to all of the files concerned.  If I write something like
    #include "reentrant"
I'm going to run into a conflict with unistd.h on my system
(Solaris)---I don't know if Posix requires _REENTRANT, but
Solaris certainly does, and the C++ compiler implementors don't
have much say in the matter.

     [...]
> > with the same value for __FILE__  as the value when the #pragma once
> > directive was processed. The macros are just a kludge to keep track of
> > that name. Why not simply define the behavior directly in terms of the
> > file name? Let the implementation worry about how to keep track of it.

> The proposal is for the preprocessor to test for the same
> implicitly-defined macro that a #once directive appearing in the header
> about to be included - would define. Otherwise there would be little
> point in making the comparison since there would be no possibility of
> finding a match. In other words:

>     #include <headers/MyHeader.h>
>     #include <MyHeader.h>

> would both test for the same macro, since the __FILE__
> macro in a header called "MyHeader.h" would always be defined in the
> same way.

Are you sure of that?  I get different values with g++ and with
Sun CC.  (On the other hand, I get the same value for "abc.h",
even when it refers to two different files.)

> Now it is important to note that if someone mismanages their header
> files to such an extent that they wind up including two header files
> with the same name within the same translation unit,

In other words, if they happen to include a third party library
which uses internal headers whose names conflict with their own.

> then a #once
> directive will not be coming to their rescue. For better or for worse,
> using a #once directive will not spare those who carelessly or poorly
> manage their program's dependencies

Or use external libraries, like the system library of g++.

G++ (or anyone else I know) doesn't document their internal
header file names.  How am I supposed to avoid them?

    [...]
> Even today choosing a header file name should always be done carefully.
> A user header file whose name collides with the name of a Standard
> header leads to undefined behavior.

Not at all.

    #include "string"
    #include <string>

can (and probably should) refer to two different headers.  (Not
that I condone choosing such conflicting names.  But I don't know
all of the names used internally by g++ or Sun CC.  And neither
guarantees not to use additional internal headers in the next
release.)
>
> > Of course, any scheme that uses the filename alone to identify which
> > files are the same provides less protection than header guards do. With
> > header guards, if the same file can be found in the #include search
> > path by two different names, it is still protected against double
> > inclusion, which is not the case with filename based approaches.

> Only if the name of the included file changes while the translation
> unit is being compiled.

It's quite possible under most systems (Unix at least, but with
remotely mounted file systems, Windows as well) for a file to
have several names.  There are even very legitimate reasons why
this is desirable in many cases, and the absence of it in the
native Windows filesystem often drives me up the wall.

But I don't think that this concerns (or should concern) header
files.  And if there is some strange case where it does, the
user can always fall back on include guards for this one
exception.  I don't consider this per se an argument against
#once.  Presuming, of course, we are talking uniquely about the
base name.  I do expect code which uses e.g. "vstring.h" in a
file in bits, and "bits/vstring.h" elsewhere to designate the
same file to work.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 10:40:34 CST
"Andrei Alexandrescu (See Website For Email)" wrote:
> Gennaro Prota wrote:
> > On Mon, 11 Dec 2006 20:32:10 GMT, "Andrei Alexandrescu (See Website
> > For Email)" wrote:
> >> The probability of two files having identical timestamps, identical
> >> length, identical md5 hashes, while they are actually different is
> >> exceedingly low, save for concerted attacks (see
> >> http://www.mscs.dal.ca/~selinger/md5collision/).

> > Even lower for SHA-2 functions, but still... The point is, I think,
> > being able to take an appropriate decision even in case of collision.
> > That the collision is unlikely is a welcome property but we shouldn't
> > rely on it being impossible.

> If the collision probability is lower than the probability of file read
> error using a deterministic comparison function, then...

You still have to find formal wording to cover the case in the
standard.  Would you accept wording along the lines of "if two
different include files have the same SHA-2 hash code, it is
undefined behavior"?

> >  Let me see if I understood: you are compiling A.cpp, when
> > you encounter (scattered somewhere in the inclusion tree)

> >   #include "a.h"
> >   #include "mylib/a.h"

> > in that order. Now, if when reading the second included file you
> > detect that both the hash (and the date? see below) are the same as
> > for "a.h" (or any other files previously added to the "database") you
> > have to go back and read a.h for the comparison. Going back would
> > require knowing the path of the file *as determined (in an
> > implementation-defined manner) when the #include "a.h" was executed*,
> > but we kept that complete path in the database (right?). So you use
> > that path and just compare the two files; if they are different you go
> > with the normal processing (#inclusion, insertion into the database),
> > if they compare equal you do nothing.

> > Included files with no #once would never be added to the database. The
> > database is relative to the translation unit. Right? Does anyone see a
> > flaw in this?

> > (A prerequisite of this is that a complete path always corresponds to
> > the same file or, better, to the same content.

The problem, I think, is normally the opposite.  Two different
paths may refer to the same file.

Not that this changes much.  Somehow, I don't think you'll find
much support for a solution which requires an external database
to manage the files.
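
For reference, the scheme described in the quoted text might be sketched as follows. This is only an illustration under stated assumptions: OnceDatabase is a hypothetical name, std::hash stands in for MD5 or SHA-2, and the contents of earlier headers are kept in memory instead of being re-read through a stored path.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-translation-unit "database" of headers that contained
// #once. Headers without #once are assumed never to be passed in.
class OnceDatabase {
public:
    // Returns true if the header must be processed, false if an identical
    // header was seen before. A hash match alone is never trusted: the
    // full contents are compared before the header is skipped.
    bool should_include(const std::string& contents) {
        std::vector<std::string>& bucket =
            seen_[std::hash<std::string>{}(contents)];
        for (const std::string& earlier : bucket) {
            if (earlier == contents) return false;  // genuine duplicate
        }
        bucket.push_back(contents);  // new header, or a mere hash collision
        return true;
    }

private:
    std::map<std::size_t, std::vector<std::string>> seen_;
};
```

Because of the full comparison on a hash match, a collision costs time but never causes a different header to be skipped.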

> > In all this, anyway, I
> > think I didn't get the absolute need for timestamps, which means I'm
> > missing something --note that two files might have identical content
> > and timestamp, while still being distinct as filesystem entities)

> The timestamp is useful in that you save the comparison result together
> with a pair of timestamps. "Files a.h and mylib/a.h were identical when
> a.h had timestamp xxxx and mylib/a.h had timestamp yyyy." As soon as
> either file gets modified, that invalidates the comparison result.

The issue is more complex.  What happens if a.h and mylib/a.h
are on different mounts, and one of the mounts is changed?  (Say
it was an error that they were, in fact, the same file.)
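
The cached comparison described in the quoted text could be sketched like this. The struct and function names are hypothetical, and the sketch assumes that a timestamp changes whenever the contents of a file change, which is exactly the assumption in question here.

```cpp
#include <ctime>

// Hypothetical record of one earlier comparison between two headers.
struct CachedComparison {
    std::time_t stamp_a;   // timestamp of the first file when compared
    std::time_t stamp_b;   // timestamp of the second file when compared
    bool identical;        // result of that full comparison
};

// The cached result may be reused only while both timestamps are unchanged.
bool cache_is_valid(const CachedComparison& cached,
                    std::time_t current_a, std::time_t current_b) {
    return cached.stamp_a == current_a && cached.stamp_b == current_b;
}
```

If a mount is changed so that a path designates a different file with the same timestamp, this check still reports the cache as valid.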

Note too that you are adding a lot of restrictions with regards
to the systems on which C++ can be implemented.  Historically,
you could implement a compiler on a system which didn't support
timestamped files.  Personally, I have no problem with requiring
time stamps, although I'm less sure about requiring them to be
meaningful---it's already happened to me that when copying files
between systems, all of the timestamps got reset to 0.  (I'm not
too sure how meaningful timestamps are over NFS; on my system
here, make regularly warns me that some files have modification
times in the future, which makes me a bit sceptical.  But then,
I've also learned that it's best to wait a bit between writing
the file back from the editor and then compiling it; the
compiler often sees the old version of the file for a couple of
seconds after the write.  Such are the joys of networked
systems, where the editor and the compiler are running on
different machines, and the file system is on a third.)

Anyway, I'd be interested in having a look at a concrete
proposal, written in something vaguely similar to standardese.
It's easy to speculate about possible implementations for
specific systems, and say that such and such won't be a problem
(and I agree that it normally won't), but it's another to
formulate it in the terms necessary for the standard.

(FWIW: the solution with the data base has been used in the past
for managing template instantiations.  It's normally called a
repository.  And while I don't think that your suggestion runs
the same risks---you're only considering individual files, and
not possible dependencies---the solution doesn't have a good
reputation for getting things right.  So if you do try to write
something up along these lines, I'd strongly suggest that you
point out how it is different, and why it will work here, when
it caused so many problems with CFront.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 10:44:34 CST
Robert Mabee wrote:
> Anders Dalvander wrote:
> > Wouldn't a new #import <file.h> or #using <file.h> construct be better
> > overall? Then the file doesn't need to be opened and scanned for
> > #pragma once or other include guard constructs. It would also be a step
> > toward modules in C++, and perhaps a way to get rid of the need to
> > forward declare classes.

> That is surely the right long-term solution.  Imported files would have
> to be considered as file scope outside any scoping constructs that might
> bracket the import statement (probably not a # preprocessor statement)
> and considered idempotent so multiple imports would produce no error
> message.  Probably they can't do all that and still contribute to the
> preprocessor actions in the referencing file.

There's actually a concrete proposal for modules under
consideration, which would presumably make anything concerning
include files moot.  Given the time frame being aimed for, and
the lack of a concrete implementation to actually experiment
with, I rather doubt that it will make it (although from what I
have seen, it looks very good), however.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 10:44:21 CST
Bjørn Roald wrote:

    [...]
> I think use of a hash can be helpful in resolving the cases where we
> *must* do the extra check.  But before this is used the simpler *safe*
> assumptions should be exploited.  So, what are the useful *safe*
> assumptions?  I think the most useful would be that any line of the form

> #include <file_a.h>

> would cause the compiler to always see the same file no matter what file
> system we deal with.  Likewise for the #include "file_b.h" form.

As has already been pointed out, this would break too much code.
(In theory, it should work for the <...> form; this form should
only be used for system includes, and the implementor more or
less has control over those.  In practice, every compiler I know
includes paths specified by the -I option, and I've seen more
than a little code with user headers in the <...>.)

    [...]
> As I see it, there are two goals of the #pragma once, and other similar
>   proposals.

> 1. simplicity in use, and less errors as positive important side effect

A limited benefit, since external tools normally take care of
this automatically.

> 2. optimization, mainly in the use of the programmer's time - which
> often also involves waiting for the compiler to complete

Except that existing compilers (good ones, anyway) recognize the
include guard pattern and don't make redundant includes anyway.

There's also a definite esthetic benefit---include guards are
ugly, and their necessity is an embarrassment to the language.
But I don't find a pragma much better.

The current situation is thus: there's a benefit, but it's
pretty small, and there's no real working proposal on the floor.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 10:43:36 CST
Andrei Alexandrescu (See Website For Email) wrote:
> James Kanze wrote:
> > My point is that basically, I get the feeling that there is a
> > partial agreement, at least between myself and the proponents of
> > pragma once, at least with regards to two important points:

> >  1) that the concept of "same file" isn't workable, as such,
> >     since it cannot be implemented, and

> What happened to the md5 hash?

I hadn't seen it when I wrote the previous posting, but there is
still the question as to who calculates it, and when, and who
maintains the data base.

There's also a point about how you specify it formally
(supposing that you don't want to require MD5 if something
better comes along).

And a political point that it can't be guaranteed.  In
practice, I think real compilers use it---or things even less
guaranteed---for things like generating the actual name of an
anonymous namespace.  It works.  But I suspect that you'll find
some resistance concerning it anyway.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Tue, 12 Dec 2006 11:51:14 CST
Gennaro Prota wrote:
> On Mon, 11 Dec 2006 01:53:08 GMT, James Dennett wrote:

> >Gennaro Prota wrote:
> >> On Sun, 10 Dec 2006 11:05:02 CST, James Kanze wrote:

> >>> I think that the real problem is that no one has made a concrete
> >>> proposal

> >> Or perhaps that we should all be more constructive.

> >To me, James's point that you quote is a very constructive
> >one.

> To me, after making some good points James is mainly saying that
> there's no real issue because the editor automatically does everything
> needs to be done, then again correcting people who say the same by
> saying that in some places it's actually the SCM system that provides
> the guards and then again whatever else... That may be constructive to
> someone but, to me, it's just spirit of contradiction. Looks like
> another discussion on c.l.c++.m where he said that my code could work
> to reply then that I was using the term "works" without defining it...

> >There's not been a formal proposal/specification, and
> >that's the key thing that's needed in order to make progress.

> That's what I wanted to write.

> >[...]

> >Based on what I've seen so far, I cannot think of a useful
> >formal definition of what #pragma once should do, and its
> >value seems smaller than its cost.

> So what's the problem, let's just forget about it. I asked if there
> was interest; answers could have been more explicit.

Let me see if I understand you correctly.  You asked if there
was interest, then you complain because I said that there was
none on my part, and explained why (that my editor inserted the
guards automatically, in environments where the employer didn't
require the source code control system to do so).

I've also said quite explicitly that I won't oppose a concrete
proposal that would work.  After having pointed out that just
saying "if it is the same file" doesn't work.  (Well, actually,
others pointed this out more than I did.)

Both Andrei and Greg have come back with some proposals with
more exact wording.  Both proposals had serious flaws, but
that's what discussion is all about; finding the flaws and
addressing them.

FWIW: I don't think that any proposal involving timestamps has a
chance.  They're too fragile in practice to be of much use.  I
also doubt that the committee would accept anything involving a
look aside database; I'm just guessing about that, but it does
seem to go against what has been considered acceptable in the
past.  (One might argue that what has been considered acceptable
is somewhat outdated, and I think I would rather agree.  But I
don't see it changing.)  I also think that any proposal will
have to include some "implementation defined" aspects---after
all, how the file is found is already implementation dependent,
so I don't see how the rule for determining whether two
[qh]-char-sequences refer to the same file could avoid
containing some implementation defined aspects.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "Martin Bonner" <martinfrompi@yahoo.co.uk>
Date: Tue, 12 Dec 2006 11:51:47 CST
James Kanze wrote:

> ThosRTanner wrote:
>
> > That's a QOI issue though. I can see that there are issues:
> > 1) File name case differs on a case-insensitive file system - easily
> > fixable
>
> How?  I don't think that an application program can even
> necessarily know if filenames are case-insensitive or not.

It really is QOI.  GetVolumeInformation returns this information (but
that is a pretty obscure API).






Author: yechezkel@emailaccount.com (Yechezkel Mett)
Date: Tue, 12 Dec 2006 18:00:53 GMT
James Kanze wrote:
> I think the issue is a bit more subtle.  On some common systems
> (Linux, Solaris), I can set it up so that:
>
>     #include "abc.h"
>     #include "abc.h"
>
> reads two different sets of data.  I think that there's also
> general agreement that the standard doesn't have to support
> things that perverse.
>
> On the other hand, if these two includes were in two different
> header files, the including header files are in different
> directories, and both directories contained a file "abc.h", not
> only will I get two different files, that's what I expect and
> want.  Simply comparing the [hq]-char-sequence is not sufficient
> to say that two files are the same file.
>
> Beyond that, I've not seen any concrete explanation of what the
> standard should actually say.  The fact that whatever it says
> will allow "mistakes" in some perverse cases doesn't bother me
> too much (as long as you don't consider the environments I
> actually work in "perverse":-)).  But you need to find some
> wording to specify this, in an environment neutral manner.

The standard just needs to say that #pragma once (or whatever) causes a
second #include of the same file to have no effect, to note that the
definition of "the same file" is implementation defined, and then to
leave it as a QOI issue.

As far as implementation goes, in situations where there is no OS method
to check if two files are the same I would suggest simply converting all
filenames to some canonical form and comparing that. It'll fail if the
same file is accessed via two different shares (within one translation
unit), but does anyone do that?
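
For illustration only, a minimal sketch of such a canonical-name
comparison.  It assumes C++17's std::filesystem, which postdates this
discussion; OnceRegistry and first_inclusion are hypothetical names.  The
normalization is purely textual, so symbolic links, case-insensitive file
systems and the two-shares case are not detected.

```cpp
#include <filesystem>
#include <set>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: remembers which headers have already been seen,
// keyed by a textually normalized form of their path.
class OnceRegistry {
public:
    // Returns true the first time a given normalized name is presented,
    // false on every later presentation of an equivalent spelling.
    bool first_inclusion(const fs::path& header) {
        // lexically_normal() collapses "." and "dir/.." components without
        // touching the file system; generic_string() uses '/' separators.
        return seen_.insert(header.lexically_normal().generic_string()).second;
    }

private:
    std::set<std::string> seen_;
};
```

Under these assumptions "lib/a.h", "lib/./a.h" and "lib/sub/../a.h" all
count as one file, while "other/a.h" counts as a different one.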

Not that I think such a directive is necessary, but a future module-type
import system might also need to know if two files are the same.

Yechezkel Mett






Author: jdennett@acm.org (James Dennett)
Date: Tue, 12 Dec 2006 18:01:07 GMT
Greg Herlihy wrote:
> kuyper@wizard.net wrote:
>> Greg Herlihy wrote:
>> .
>>> Start with the definition of the __FILE__ macro (i.e the name of the
>>> file that contains the #pragma once directive), unstringify the name
>>> (i.e. strip the enclosing quotation marks), replace periods (and any
>>> other character not allowed in an identifier) with underscores,
>>> uppercase the name, and prepend an underscore (so the #pragma once
>>> macro should not conflict with any user-declared macros).
>> There are a couple of  problems:
>> Two different files might be mapped to the same macro name by that
>> algorithm.
>
> If the two files have the same name then they would in fact have the
> same implicitly-defined macro name - so a #once directive (I agree that
> the #pragma should be dropped) in one of the files would inhibit the
> subsequent inclusion of the other, within a single translation unit.
> This behavior would be quite deliberate. In fact, one of the uses for a
> #once directive that I can anticipate would be to aid programmers in
> detecting these kinds of header name collisions in order that they can
> be addressed as soon as they have been detected.

Many C++ development groups have already found a solution
to the problem of header name collisions: they place all
header files in directories, and refer to them by qualified
names to disambiguate.

It wouldn't be so good to provide language support for
something that ignores directory names/paths.  I also don't
think there's any experience with a feature much like what
you propose, which would be one more barrier to adopting
it into an ISO standard.
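
To make the collision concrete, the quoted name-mangling scheme can be
sketched roughly as follows.  guard_macro_for is a hypothetical helper, not
part of any proposal; it assumes plain ASCII file names that have already
had their enclosing quotation marks stripped.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derive the implicit guard macro name from a file
// name, following the steps quoted above: replace every character that is
// not a letter or digit with '_', uppercase the rest, prepend '_'.
std::string guard_macro_for(const std::string& file_name) {
    std::string macro = "_";
    for (unsigned char ch : file_name) {
        macro += std::isalnum(ch) ? static_cast<char>(std::toupper(ch)) : '_';
    }
    return macro;
}
```

With this sketch "abc.h" maps to _ABC_H, and both "a-b.h" and "a_b.h" map
to _A_B_H; two headers named abc.h in different directories also get the
same macro if only the bare file name is used.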

-- James






Author: Yechezkel Mett <yechezkel@emailaccount.com>
Date: Tue, 12 Dec 2006 13:07:44 CST
James Kanze wrote:
> There's actually a concrete proposal for modules under
> consideration, which would presumably make anything concerning
> include files moot.  Given the time frame being aimed for, and
> the lack of a concrete implementation to actually experiment
> with, I rather doubt that it will make it (although from what I
> have seen, it looks very good), however.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2122.htm has it
  under "Heading for a separate TR", meaning that it won't be in C++09,
but it is still being worked on, to be delivered afterwards.

Yechezkel Mett






Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Tue, 12 Dec 2006 19:06:59 GMT
Bjørn Roald wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> If the collision probability is lower than the probability of file
>> read error using a deterministic comparison function, then...
>
> yes, it is simply an assessment of the risk involved (likelihood *
> consequence).  And a decision on whether that is a chance we take.  If
> it approaches the likelihood that my hard disk evaporates as the result
> of a meteor hit I can bear it for most of the stuff I will ever compile.
> In any case, I think the hash collision on many systems can be made more
> unlikely using additional low cost, possibly system dependent, features
> of the file system.  In a safe and slow mode - collisions will always be
> detectable using file compare - unless that fails as well or the meteor
> strikes. So basically if you do not accept any risk, you probably should
> not write code at all.
>
> So for the sake of it, compare all this with the risk of accidental use
> of the same include guard in multiple include files.  I have encountered
> this one on more than one occasion.  We all seem to have accepted it
> as a risk we have to live with.

Hah, what a great point. Hits the nail right on the head. Made my day!
Thanks Bjørn.

Andrei






Author: Bjørn Roald <bjorn@4roald.org>
Date: Tue, 12 Dec 2006 13:37:53 CST
James Kanze wrote:
> Bjørn Roald wrote:
>
>     [...]
>> I think use of a hash can be helpful in resolving the cases where we
>> *must* do the extra check.  But before this is used the simpler *safe*
>> assumptions should be exploited.  So, what are the useful *safe*
>> assumptions?  I think the most useful would be that any line of the form
>
>> #include <file_a.h>
>
>> would cause the compiler to always see the same file no matter what file
>> system we deal with.  Likewise for the #include "file_b.h" form.
>
> As has already been pointed out, this would break too much code.
> (In theory, it should work for the <...> form; this form should
> only be used for system includes, and the implementor more or
> less has control over those.  In practice, every compiler I know
> includes paths specified by the -I option, and I've seen more
> than a little code with user headers in the <...>.)

I do not see how this relates to what I propose.  If the preprocessor
encounters #include "file_b.h" 5 times during the processing of one
translation unit, exactly how is it possible that it will look at
different files?  I can safely assume one command line, so the -I
options are not changing.


>     [...]
>> As I see it, there are two goals of the #pragma once, and other similar
>>   proposals.
>
>> 1. simplicity in use, and less errors as positive important side effect
>
> A limited benefit, since external tools normally take care of
> this automatically.

Yes, you repeatedly state this.  But it only holds halfway, and I
dislike halfway solutions.  Does your tool change the GUARD when you
rename a source file?  Or when you copy it?  Or change the namespace? Or...

Does it fix unbalanced #ifdef #endif blocks?

>> 2. optimization, mainly in the use of the programmer's time - which
>> often also involves waiting for the compiler to complete
>
> Except that existing compilers (good ones, anyway) recognize the
> include guard pattern and don't make redundant includes anyway.

Yes, but you have to maintain them, scroll past them, and make sure they
follow the ruling style and conventions; that is also wasted time.

> There's also a definite esthetic benefit---include guards are
> ugly, and their necessity is an embarrassment to the language.
> But I don't find a pragma much better.

It is an improvement but it should have been the default.  That seems to
be a bit off the chart.  But I really would have preferred a pragma
solution to be used in those files that need the legacy default
behavior.  That way you and I would not be bothered much with how it
looks.  This would probably break some code, but should be feasible to
fix.  The real question I guess is if there is a smart and nice way of
handling the transition period for code that has to deal with both
conforming and non-conforming compilers.

> The current situation is thus: there's a benefit, but it's
> pretty small, and there's no real working proposal on the floor.

Can someone point me at a good proposal text for a similar feature, or
another relevant link giving me a clue of what it takes?  My quick
assessment of what needs to be done is:

- write the proposal and revisions
- get reviews
- implement reference implementation, probably a patch to gcc
- make a test suite
- show how migration of existing code is done
- present the proposal to the standards committee

I guess I could help with all but the last.

Testing could be based on a simple tool that copies large bodies of code,
modifying them as it goes.  It should remove all #pragma once directives
and traditional guards, and add the new #pragma xxxx to files that do not
contain the standard guard.  A modified version of the boost.bcp tool comes
to mind.  Some smartness is probably required to detect situations where
the guard macro is used for other purposes as well.

----
Bjørn






Author: Robert Mabee <rmabee@comcast.net>
Date: Tue, 12 Dec 2006 14:07:18 CST
Yechezkel Mett wrote:
> The standard just needs to say that #pragma once (or whatever) causes a
> second #include of the same file to have no effect,

"No effect" is a strong claim that requires the file recognition to be
100% accurate.  A relaxed claim could tolerate sometimes not recognizing
the same file, although still must not fail to include a different file
that somehow resembles a first file.  This would still reduce compile
times when the recognition works most of the time, and compile correctly
when the OS and filesystem don't cooperate.

For example, the new #pragma could require that the include file contain
only complete declarations and #defines, which would generate no error
messages if they were duplicates, and it could be undefined behavior to
#undef anything from the header so you wouldn't be able to tell if the
second inclusion happened.

#include recursion has to be reliably stopped.  I guess in that case
the file name is identical after at most one instance of mistakenly
rereading a file, so string comparison of directory and file name is
adequate.






Author: kuyper@wizard.net
Date: Tue, 12 Dec 2006 14:28:16 CST
Bjørn Roald wrote:
..
> I do not see how this relate to what I propose.  If the preprocessor
> encounter #include "file_b.h" 5 times during the processing of one
> translation unit, exactly how is it possible that it will look at
> different files?  I can safely assume one command line, so the -I
> options are not changing.

On many (most?) implementations the search path starts in the same
directory as the file where the #include was encountered that started
the search. A single translation unit can include code #included from
many different directories, and each directory could, in principle,
have its own copy (or version) of file_b.h.







Author: Bjørn Roald <bjorn@4roald.org>
Date: Tue, 12 Dec 2006 19:08:55 CST
kuyper@wizard.net wrote:
> Bjørn Roald wrote:
> ..
>> I do not see how this relate to what I propose.  If the preprocessor
>> encounter #include "file_b.h" 5 times during the processing of one
>> translation unit, exactly how is it possible that it will look at
>> different files?  I can safely assume one command line, so the -I
>> options are not changing.
>
> On many (most?) implementations the search path starts in the same
> directory as the file where the #include was encountered that started
> the search. A single translation unit can include code #included from
> many different directories, and each directory could, in principle,
> have its own copy (or version) of file_b.h.

Ok. I see that now, thanks!  This basically removes this opportunity for
optimization, since the directory where the search starts will likely vary
among the 5 places where #include "file_b.h" is encountered during
preprocessing.

The overall idea of basing it on a hash of the file content, md5 or other,
may still be a good approach, but this simple optimization of it will fail.
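
As a rough sketch of what a content-hash check might look like: the code
below uses FNV-1a instead of MD5 purely to stay self-contained, and
ContentRegistry and first_inclusion are hypothetical names.  It also keeps
the full text of every header so that a hash match can be confirmed byte
for byte (the "safe and slow mode" mentioned earlier in the thread); a real
implementation would presumably re-read the file instead.

```cpp
#include <cstdint>
#include <map>
#include <string>

// FNV-1a, 64-bit: a simple non-cryptographic hash used here only as a
// stand-in for MD5.
std::uint64_t fnv1a(const std::string& data) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ull;                  // FNV prime
    }
    return h;
}

// Hypothetical helper: decides by content whether a header has been seen.
class ContentRegistry {
public:
    // Returns true if no previously seen header had exactly this content.
    bool first_inclusion(const std::string& content) {
        const std::uint64_t key = fnv1a(content);
        auto range = seen_.equal_range(key);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == content) {
                return false;  // same hash and same bytes: already included
            }
        }
        seen_.emplace(key, content);  // new content, or a mere hash collision
        return true;
    }

private:
    std::multimap<std::uint64_t, std::string> seen_;
};
```

Note that this treats two distinct files with identical content as "the
same file", which may or may not be what a proposal wants.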

---
Bjørn






Author: bjorn@4roald.org (Bjørn Roald)
Date: Wed, 13 Dec 2006 04:48:26 GMT
James Kanze wrote:

> On any system that has "reliable" and meaningful timestamps on
> files.  Thus, not on a system which allows you to copy in your
> files from some other system.  Or to remote mount file systems,
> at least not with NFS or SMB.
>
> Timestamps are not reliable:-(.  (At least not on NFS mounted
> files, or files which have been copied from other systems.  I
> know this all too well.  Make depends on them, and I've had no
> end of problems because of it---modified sources not getting
> recompiled, etc.)

Yes, but this is mostly caused by bad synchronization of the server's or
client's clock.  Or do I misunderstand something?  Such time skew may
not cause problems here as it obviously does with traditional make tools.

---
Bjørn






Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 13 Dec 2006 09:44:43 CST
Bjørn Roald wrote:
> James Kanze wrote:
> > Bjørn Roald wrote:

> >     [...]
> >> I think use of a hash can be helpful in resolving the cases where we
> >> *must* do the extra check.  But before this is used the simpler *safe*
> >> assumptions should be exploited.  So, what are the useful *safe*
> >> assumptions?  I think the most useful would be that any line of the form

> >> #include <file_a.h>

> >> would cause the compiler to always see the same file no matter what file
> >> system we deal with.  Likewise for the #include "file_b.h" form.

> > As has already been pointed out, this would break too much code.
> > (In theory, it should work for the <...> form; this form should
> > only be used for system includes, and the implementor more or
> > less has control over those.  In practice, every compiler I know
> > includes paths specified by the -I option, and I've seen more
> > than a little code with user headers in the <...>.)

> I do not see how this relate to what I propose.  If the preprocessor
> encounter #include "file_b.h" 5 times during the processing of one
> translation unit, exactly how is it possible that it will look at
> different files?

foo/doh.h
    #include "abc.h"

bar/doh.h
    #include "abc.h"

main.cc
    #include "foo/doh.h"
    #include "bar/doh.h"

If there is a file abc.h in both foo and bar, the include in
foo/doh.h will find the one in foo, and the include in bar/doh.h
will find the one in bar.  The situation actually occurs fairly
often, for internal headers in libraries.
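
A small model of that lookup rule, for illustration.  It assumes the common
(but implementation-defined) behaviour of searching the including file's
directory first, uses an in-memory set of names in place of a real file
system, and relies on C++17's std::filesystem only for path arithmetic;
resolve_quoted_include is a hypothetical name.

```cpp
#include <filesystem>
#include <set>
#include <string>

namespace fs = std::filesystem;

// Hypothetical model of #include "name": look for the file relative to the
// directory of the including file.  Returns the resolved name, or an empty
// string if nothing is found there (the -I search is not modelled).
std::string resolve_quoted_include(const std::set<std::string>& existing_files,
                                   const std::string& including_file,
                                   const std::string& name) {
    const fs::path candidate = fs::path(including_file).parent_path() / name;
    const std::string key = candidate.lexically_normal().generic_string();
    return existing_files.count(key) ? key : std::string();
}
```

With the layout above, the same spelling "abc.h" resolves to foo/abc.h when
seen from foo/doh.h and to bar/abc.h when seen from bar/doh.h, so comparing
the spelling alone cannot tell the two files apart.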

> I can safely assume one command line, so the -I
> options are not changing.

My mention of the -I options was purely parenthetical, and not
really relevant to the discussion at hand.

> >     [...]
> >> As I see it, there are two goals of the #pragma once, and other similar
> >>   proposals.

> >> 1. simplicity in use, and less errors as positive important side effect

> > A limited benefit, since external tools normally take care of
> > this automatically.

> Yes, you repeatedly state this.  But it only hold half way, and I
> dislike halfway solutions.  Does your tool change the GUARD when you
> rename a source file?  Or when you copy it?  Or change the namespace? Or...

Of course not (although it would be simple enough to make it do
so).  Should it?  What if the tool simply generates some random
guard, or something based on the timestamp?

> Does it fix unbalanced #ifdef #endif blocks?

No.  Nor does it fix any other broken code in the file.  (In
fact, it doesn't fix anything; it just inserts when the file is
created.  Does anyone actually use an editor which doesn't do
something this fundamental?)

> >> 2. optimization, mainly in the use of the programmer's time - which
> >> often also involves waiting for the compiler to complete

> > Except that existing compilers (good ones, anyway) recognize the
> > include guard pattern and don't make redundant includes anyway.

> Yes, but you have to maintain them, scroll past them, and make sure they
> follow the ruling style and conventions; that is also wasted time.

Agreed.  On the other hand, you also have to scroll past the
copyright notice, which is almost always longer.  As for style
and naming conventions, you program the editor to insert
something that conforms.

> > There's also a definite esthetic benefit---include guards are
> > ugly, and their necessity is an embarrassment to the language.
> > But I don't find a pragma much better.

> It is an improvement but it should have been the default.

C should have had true modules from the start.  Then we wouldn't
be having this discussion.

> That seems to
> be a bit off the chart.  But I really would have preferred a pragma
> solution to be used in those files that need the legacy default
> behavior.  That way you and I would not be bothered much with how it
> looks.  This would probably break some code, but should be feasible to
> fix.  The real question I guess is if there is a smart and nice way of
> handling the transition period for code that has to deal with both
> conforming and non-conforming compilers.

No.

> > The current situation is thus: there's a benefit, but it's
> > pretty small, and there's no real working proposal on the floor.

> Can someone point me at a good proposal text for a similar feature, or
> other relevant link giving me a clue of what it takes.

Well, a lot depends on the proposal.  This one is simple enough
and isolated enough that it shouldn't take much work: a short
introduction, a paragraph or two with a discussion of why it
will help, what the alternatives might be, etc.  (The discussion
here should provide all of the necessary material.)  And
finally, a detailed write up of exactly what
sentences/paragraphs have to be modified, and how, in the
standard.

Other aspects which normally have to be addressed are
implementability (I think some compilers already support it, in
some form or another, so implementability has been proven,
provided what you specify doesn't differ too much from what has
been implemented.) and interaction with other features, which
should be pretty close to 0, given that we're talking here about
the preprocessor, and only the preprocessor.

> My quick
> assessment of what needs to be done is:

> - write the proposal and revisions
> - get reviews

These two steps had better be done very, very quickly if you
want to get it into the next revision.  Formally, the cut-off
date for new proposals has passed.

> - implement reference implementation, probably a patch to gcc

No patch needed, g++ already supports it.

> - make a test suite
> - show how migration of existing code is done

Existing code should work without change.  It's a pure
extension.

The one thing you might have to worry about is if some compiler
has already implemented whatever syntax you chose to do
something else.

> - present the proposal to the standards committee

I don't think much testing or anything else is necessary.  The largest
single piece of work will be writing the proposal.  You then
probably need someone who attends the standardization meetings
to defend it there, and particularly, to defend accepting a new
proposal after the cut-off deadline.  IMHO, if you address the
issues raised here, and clearly delimit what you don't expect
quality implementations to support, and what you do, the two
main objections will be that it doesn't solve any real problem
(IMHO, true, but I'm willing to accept that I'm not the only
person in the world), and that it is too late for this round.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 13 Dec 2006 09:55:42 CST
Raw View
Martin Bonner wrote:
> James Kanze wrote:

> > ThosRTanner wrote:

> > > That's a QOI issue though. I can see that there are issues:
> > > 1) File name case differs on a case-insensitive file system - easily
> > > fixable

> > How?  I don't think that an application program can even
> > necessarily know if filenames are case-insensitive or not.

> It really is QOI.  GetVolumeInformation returns this information (but
> that is a pretty obscure API).

I'm afraid that my system doesn't have such a call:-).  I think
that there is something under Posix, but it's equally obscure.

(The fact that it is obscure isn't a problem.  Only the compiler
authors would have to use it.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 13 Dec 2006 09:55:33 CST
Raw View
Bjørn Roald wrote:
> James Kanze wrote:

> > On any system that has "reliable" and meaningful timestamps on
> > files.  Thus, not on a system which allows you to copy in your
> > files from some other system.  Or to remote mount file systems,
> > at least not with NFS or SMB.

> > Timestamps are not reliable:-(.  (At least not on NFS mounted
> > files, or files which have been copied from other systems.  I
> > know this all too well.  Make depends on them, and I've had no
> > end of problems because of it---modified sources not getting
> > recompiled, etc.)

> Yes, but this is mostly caused by bad synchronization of the server or
> client's clock.  Or do I misunderstand something?  Such time skew may
> not cause problems here as it obviously does with traditional make tools.

In this case, yes, it's a serious case of bad synchronization.
But my point was just in general that time stamps aren't
reliable.  There are any number of ways they can foul up.

To tell the truth, I was rather surprised at the amount of
effort being invested in this; the "specification" I was looking
for was that it was "implementation specified"; I just wanted to
be sure that the proponents understood that what they were
proposing couldn't be strictly specified.  Because let's face
it, an implementation which maps all names to /dev/null is
conforming, even if it fails on grounds of QoI or usability.

And I did want to see some discussion concerning what is
desirable from a QoI point of view.  Given the discussion, I'm
convinced that it's possible to implement it even better than I
expected.  (Or that I think necessary.)

So my final take is simply that I don't see much use for it, but
if others do, and are willing to go to the effort of writing up
a proposal, I won't oppose it.  I'll even vote for it, if I
happen to be present at the meeting it's voted on, on the
grounds that a significant number of people do think it would be
useful to them, and that I don't think it's really any burden on
the implementors (as long as we do leave it "implementation
defined", and don't require 100% accuracy in perverse cases).
But I'm not going to invest any more time in it myself.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 13 Dec 2006 09:54:43 CST
Raw View
Robert Mabee wrote:
> Yechezkel Mett wrote:
> > The standard just needs to say that #pragma once (or whatever) causes a
> > second #include of the same file to have no effect,

> "No effect" is a strong claim that requires the file recognition to be
> 100% accurate.

Not really.  As it stands, which file you get from an include is
already implementation defined.

> A relaxed claim could tolerate sometimes not recognizing
> the same file, although still must not fail to include a different file
> that somehow resembles a first file.  This would still reduce compile
> times when the recognition works most of the time, and compile correctly
> when the OS and filesystem don't cooperate.

It would still mean that include guards are required.  If I
understand the proponents correctly, this is what they want to
eliminate.  The optimization of compile times is currently
possible, and in fact, implemented in not a few compilers.

> For example, the new #pragma could require that the include file contain
> only complete declarations and #defines, which would generate no error
> messages if they were duplicates, and it could be undefined behavior to
> #undef anything from the header so you wouldn't be able to tell if the
> second inclusion happened.

That would make it pretty much useless.  No class definitions,
or template definitions (neither of which can be duplicated in a
single translation unit).
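
A minimal sketch of the problem, written as one file that pastes the
text of a hypothetical header "point.h" twice, the way two #include
directives would.  The names Point, manhattan and POINT_H_INCLUDED are
invented for the illustration.  Without the guard, the second copy of
the struct would be a redefinition error; with it, the second copy is
skipped by the preprocessor.

```cpp
// First textual copy of the hypothetical header "point.h".
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED

struct Point {            // a class definition: legal only once per translation unit
    int x;
    int y;
};

// inline, so that the function may be defined in a header at all
inline int manhattan(const Point& p) {
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}

#endif  // POINT_H_INCLUDED

// Second textual copy, as a second #include "point.h" would produce.
// POINT_H_INCLUDED is already defined, so the preprocessor drops everything
// up to the matching #endif and Point is not defined a second time.
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED

struct Point {
    int x;
    int y;
};

inline int manhattan(const Point& p) {
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}

#endif  // POINT_H_INCLUDED
```

Deleting the two guard lines from either copy turns the file into a
compile error, which is the behaviour the restricted pragma would have
to forbid headers from relying on.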

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 13 Dec 2006 09:54:00 CST
Raw View
Yechezkel Mett wrote:
> James Kanze wrote:
> > I think the issue is a bit more subtle.  On some common systems
> > (Linux, Solaris), I can set it up so that:

> >     #include "abc.h"
> >     #include "abc.h"

> > reads two different sets of data.  I think that there's also
> > general agreement that the standard doesn't have to support
> > things that perverse.

> > On the other hand, if these two includes were in two different
> > header files, the including header files are in different
> > directories, and both directories contained a file "abc.h", not
> > only will I get two different files, that's what I expect and
> > want.  Simply comparing the [hq]-char-sequence is not sufficient
> > to say that two files are the same file.

> > Beyond that, I've not seen any concrete explanation of what the
> > standard should actually say.  The fact that whatever it says
> > will allow "mistakes" in some perverse cases doesn't bother me
> > too much (as long as you don't consider the environments I
> > actually work in "perverse":-)).  But you need to find some
> > wording to specify this, in an environment neutral manner.

> The standard just needs to say that #pragma once (or whatever) causes a
> second #include of the same file to have no effect, to note that the
> definition of "the same file" is implementation defined, and then to
> leave it as a QOI issue.

Given that it is already implementation defined as to how the
implementation finds the file to begin with, I rather suspect
that you can't do much more.

Perhaps a footnote or a comment to suggest what is expected in a
good implementation, in addition?

> As far as implementation goes, in situations where there is no OS method
> to check if two files are the same I would suggest simply converting all
> filenames to some canonical form and comparing that. It'll fail if the
> same file is accessed via two different shares (within one translation
> unit), but does anyone do that?

I think that for "typical" systems, "the same file" being
interpreted as "the same name, after mapping, and found in the
same location", is probably an adequate definition.  I'm still a
little unsure about cases where the mounted file systems may
differ with regard to case significance or filename length
(remember older systems where <strstream.h> and <strstrea.h>
referred to the same file), but I'm not sure that it is a big
issue.
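
As a rough sketch of that definition (not a description of what any
existing compiler does), an implementation could key its "already
seen" table on the directory where the file was found plus the name,
after a purely lexical normalization.  The class name OnceRegistry and
its member are hypothetical; std::filesystem is used only for
convenience, and the sketch deliberately ignores links, multiple
mounts and case mapping.

```cpp
#include <filesystem>
#include <set>
#include <string>

// Hypothetical bookkeeping for a "#pragma once"-like feature:
// "the same file" == same directory + same name after lexical normalization.
class OnceRegistry {
public:
    // Returns true the first time a header is seen, false on later sightings.
    bool first_inclusion(const std::filesystem::path& directory,
                         const std::string& name) {
        // lexically_normal() removes "." and "dir/.." without touching the disk.
        std::filesystem::path key = (directory / name).lexically_normal();
        return seen_.insert(key.generic_string()).second;
    }

private:
    std::set<std::string> seen_;
};
```

With this key, "abc.h" found in two different directories counts as
two files, while the same directory spelled two ways counts as one.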

For the normative text, I think that "implementation defined" is
about the best we can do, but I would like to see something,
somewhere, along the lines of what I just wrote, so that the
implementors have some idea how hard they're expected to try.
IMHO, the real issue is case.  Most implementations currently
just pass the string on to the mounted file system, and let it
do any mapping.  If the mounted file system is case insensitive,
it returns the same file for files written with different case;
if it's not, it won't return the same file.  Do we expect a
quality implementation to find out how the mounted file system
handles case, or just do whatever is appropriate for local file
systems?  (Similar considerations apply to name length, but I
rather doubt that there are very many file servers still running
which only support 8.3.)
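
To illustrate the case question only: assuming the implementation
somehow knows whether the file system holding a header ignores case
(which is exactly the hard part), the comparison key could be folded
accordingly.  The function header_key and its flag are hypothetical,
and the folding shown handles plain ASCII names only.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: build the key under which a header name is remembered.
// case_insensitive_fs is assumed to be supplied by whatever (system-specific)
// means the implementation has of querying the mounted file system.
std::string header_key(std::string name, bool case_insensitive_fs) {
    if (case_insensitive_fs) {
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
    }
    return name;
}
```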

I also think that one of the proponents should at least look
into how ClearCase works.  I don't think that there is a problem
there, but as a file server, it does serve up data for which
there is no physical file present.

On the other hand, I think it perfectly acceptable to ignore
things like named pipes, or "files" in /dev.  Or even multiple
mounts of the same file system in different places, or links to
the same file with different names.  (I'm actually a little
unsure about the latter---I can imagine cases where it makes
sense, and would be used.  But they're infrequent enough, and
normally known in advance, so in such cases, the programmer can
simply eschew #once, and drop back to include guards.  Which
will, of course, continue to work.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 13 Dec 2006 09:56:16 CST
Raw View
Yechezkel Mett wrote:
> James Kanze wrote:
> > There's actually a concrete proposal for modules under
> > consideration, which would presumably make anything concerning
> > include files moot.  Given the time frame being aimed for, and
> > the lack of a concrete implementation to actually experiment
> > with, I rather doubt that it will make it (although from what I
> > have seen, it looks very good), however.

> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2122.htm has it
>   under "Heading for a separate TR", meaning that it won't be in C++09,
> but it is still being worked on, to be delivered afterwards.

Thanks.  I really should have looked it up myself.

Logically, that's about where I'd have placed it as well.  The
proposal seems very sound, and it's something that I think we
need, but in the absence of a concrete implementation to play
around with, I'd have estimated two or three years before the
proposal is concrete enough to be worded up into the standard.
And if the standard is going to be C++0x, and not C++1x, the
cut-off date for the wording is very, very close (middle or end
of 2007, I think).

In some ways, it's a shame, as the proposal subsumes dynamically
loaded objects, which I think are felt to be a must for C++0x.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: rmabee@comcast.net (Robert Mabee)
Date: Wed, 13 Dec 2006 20:40:20 GMT
Raw View
James Kanze wrote:
> Robert Mabee wrote:
>>For example, the new #pragma could require that the include file contain
>>only complete declarations and #defines, which would generate no error
>>messages if they were duplicates, and it could be undefined behavior to
>>#undef anything from the header so you wouldn't be able to tell if the
>>second inclusion happened.
>
> That would make it pretty much useless.  No class definitions,
> or template definitions (neither of which can be duplicated in a
> single translation unit).

Perhaps terminology or usage has drifted; I expected class definitions
only in a main file.  However, the example is easily extended to ignore
duplicate definitions in this context.

I thought it might be acceptable to not compare the entire body of the
definition.  However, that would leave conflicts between two genuinely
different files undiagnosed.






Author: "Greg Herlihy" <greghe@pacbell.net>
Date: Wed, 13 Dec 2006 14:43:55 CST
Raw View
Bjørn Roald wrote:
> kuyper@wizard.net wrote:
> > Bjørn Roald wrote:
> > ..
> >> I do not see how this relate to what I propose.  If the preprocessor
> >> encounter #include "file_b.h" 5 times during the processing of one
> >> translation unit, exactly how is it possible that it will look at
> >> different files?  I can safely assume one command line, so the -I
> >> options are not changing.
> >
> > On many (most?) implementations the search path starts in the same
> > directory as the file where the #include was encountered that started
> > the search. A single translation unit can include code #included from
> > many different directories, and each directory could, in principle,
> > have its own copy (or version of) file_b.h.
>
> Ok. I see that now, thanks!  This basically removes this opportunity of
> optimization as the directory in which the 5 #include "file_b.h"
> are encountered likely will vary where the search is starting during
> preprocessing.

The starting directory may vary from one translation unit to another -
but the relevant question is whether it is reasonable to expect that
identical #include "file_b.h" directives are likely to find two
different header files within a single translation unit. After all, in
order to include a second "file_b.h" it would be necessary to include
some other header file as an intermediary which - by the accident of
its file system location - causes the compiler to find a different
"file_b.h" than the "file_b.h" it found before in that same translation
unit. In other words, the programmer can have little confidence whether
any two, identical #include directives refer to the same - or to
different header files. And if the programmer cannot assume that two
#include directives refer to the same file, then the programmer should
not assume that they would refer to different files either. In short,
there is no rational way to manage dependencies under such
circumstances.

> The overall solution to base it on a hash of file content, md5 or other,
> may still be a good approach to a solution, but this simple
> optimization of it will fail.

No, #once would succeed in preventing a header file from being included
if a header file with that name had been included earlier within the
same translation unit. And whether or not that is the desired effect -
at least what the #once directive would do when added to a file - would
be clear. And it makes much more sense for a #once to have a
clear, predictable effect (so there is no question whether its use is
appropriate for a given situation) than it would be to have #once
mirror the same complexity, fragility, and unpredictability of existing
practices - the very same practices that #once is intended to
straighten out.

Greg







Author: kuyper@wizard.net
Date: Wed, 13 Dec 2006 16:37:02 CST
Raw View
Greg Herlihy wrote:
> Bjørn Roald wrote:
> > kuyper@wizard.net wrote:
..
> > > On many (most?) implementations the search path starts in the same
> > > directory as the file where the #include was encountered that started
> > > the search. A single translation unit can include code #included from
> > > many different directories, and each directory could, in principle,
> > > have its own copy (or version of) file_b.h.
> >
> > Ok. I see that now, thanks!  This basically removes this opportunity of
> > optimization as the directory in which the 5 #include "file_b.h"
> > are encountered likely will vary where the search is starting during
> > preprocessing.
>
> The starting directory may vary from one translation unit to another -
> but the relevant question is whether it is reasonable to expect that
> identical #include "file_b.h" directives are likely to find two
> different header files within a single translation unit. After all, in
> order to include a second "file_b.h" it would be necessary to include
> some other header file as an intermediary which - by the accident of
> its file system location - causes the compiler to find a different
> "file_b.h" than the "file_b.h" it found before in that same translation
> unit.

In the ordinary course of events, there would be nothing accidental
about that. When a file is found in or under the same directory as the
file that #included it, that's generally because those two files were
put in those particular locations, with the deliberate purpose of
achieving precisely that effect.


> In other words, the programmer can have little confidence whether
> any two, identical #include directives refer to the same - or to
> different header files. And if the programmer cannot assume that two
> #include directives refer to the same file, then the programmer should
> not assume that they would refer to different files either. In short,
> there is no rational way to manage dependencies under such
> circumstances.

When it's done deliberately, as I've described above, rational
dependency management gets a little complicated, but it's nowhere near
to being impossible.







Author: bjorn@4roald.org (=?ISO-8859-1?Q?Bj=F8rn_Roald?=)
Date: Thu, 14 Dec 2006 06:23:58 GMT
Raw View
James Kanze wrote:

> I also think that one of the proponents should at least look
> into how ClearCase works.  I don't think that there is a problem
> there, but as a file server, it does serve up data for which
> there is no physical file present.

If needed I can help verify and test issues in ClearCase.  As I see it
now, it is technically possible to mess things up, but then you are
doing stuff that is so far out you don't deserve better.

---
Bjørn






Author: James Dennett <jdennett@acm.org>
Date: Thu, 14 Dec 2006 02:56:50 CST
Raw View
Robert Mabee wrote:
> James Kanze wrote:
>> Robert Mabee wrote:
>>> For example, the new #pragma could require that the include file contain
>>> only complete declarations and #defines, which would generate no error
>>> messages if they were duplicates, and it could be undefined behavior to
>>> #undef anything from the header so you wouldn't be able to tell if the
>>> second inclusion happened.
>>
>> That would make it pretty much useless.  No class definitions,
>> or template definitions (neither of which can be duplicated in a
>> single translation unit).
>
> Perhaps terminology or usage has drifted; I expected class definitions
> only in a main file.

Your terminology appears non-standard.  A class definition is
that which declares the members/bases of the class; a
non-definition declaration of a class merely names the class.
A class definition does not have to define the members, only
declare them.
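
A short illustration of the distinction, using an invented class
Widget:

```cpp
class Widget;                    // non-definition declaration: only names the class

class Widget {                   // class definition: declares bases and members
public:
    explicit Widget(int size);   // member declarations, not member definitions
    int size() const;

private:
    int size_;
};

// Member definitions.  These would normally live in a single source file,
// while the class definition above is what goes into the header.
Widget::Widget(int size) : size_(size) {}
int Widget::size() const { return size_; }
```

The class definition is the part every user of Widget has to see,
which is why it ends up in a header and why the header has to be
protected against double inclusion.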

-- James






Author: "James Kanze" <james.kanze@gmail.com>
Date: Thu, 14 Dec 2006 12:20:23 CST
Raw View
Greg Herlihy wrote:
> Bjørn Roald wrote:
> > kuyper@wizard.net wrote:
> > > Bjørn Roald wrote:
> > > ..
> > >> I do not see how this relate to what I propose.  If the preprocessor
> > >> encounter #include "file_b.h" 5 times during the processing of one
> > >> translation unit, exactly how is it possible that it will look at
> > >> different files?  I can safely assume one command line, so the -I
> > >> options are not changing.

> > > On many (most?) implementations the search path starts in the same
> > > directory as the file where the #include was encountered that started
> > > the search. A single translation unit can include code #included from
> > > many different directories, and each directory could, in principle,
> > > have its own copy (or version of) file_b.h.

> > Ok. I see that now, thanks!  This basically removes this opportunity of
> > optimization as the directory in which the 5 #include "file_b.h"
> > are encountered likely will vary where the search is starting during
> > preprocessing.

> The starting directory may vary from one translation unit to another -
> but the relevant question is whether it is reasonable to expect that
> identical #include "file_b.h" directives are likely to find two
> different header files within a single translation unit.

It's not only reasonable, it's a frequent occurrence in some
installations.  Officially, I don't know that g++ uses an
include file named bits/vstring.h, for example; what is supposed
to happen if I happen to have an include file with the same
name?

> After all, in
> order to include a second "file_b.h" it would be necessary to include
> some other header file as an intermediary which - by the accident of
> its file system location - causes the compiler to find a different
> "file_b.h" than the "file_b.h" it found before in that same translation
> unit.

Certainly.  That's what ensures, for example, that when the g++
headers include "bits/vstring.h", they get their implementation
file, and when my application includes "bits/vstring.h", I get
my header, and not the g++ library implementation header.
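
A toy model of that lookup rule, assuming the common (but
implementation-defined) behaviour of searching the including file's
directory before the -I directories.  The function resolve_include,
the in-memory file list and all the paths are invented for the
example; no real compiler is being described.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of #include "name" resolution:
// 1. look in the directory of the file containing the directive,
// 2. then in each include directory, in order.
// `files` stands in for the file system.
std::optional<std::string> resolve_include(
        const std::set<std::string>& files,
        const std::string& including_dir,
        const std::vector<std::string>& include_dirs,
        const std::string& name) {
    const std::string local = including_dir + "/" + name;
    if (files.count(local) != 0) {
        return local;
    }
    for (const std::string& dir : include_dirs) {
        const std::string candidate = dir + "/" + name;
        if (files.count(candidate) != 0) {
            return candidate;
        }
    }
    return std::nullopt;  // not found anywhere
}
```

Given files "lib/bits/vstring.h" and "app/bits/vstring.h", the same
directive #include "bits/vstring.h" resolves to the first when it
appears in a header under lib and to the second when it appears in a
source under app, so the two identical spellings name different
files within one translation unit.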

> In other words, the programmer can have little confidence whether
> any two, identical #include directives refer to the same - or to
> different header files.

Actually, he can have confidence that no implementation include
files in the libraries he's using can conflict with his own
naming scheme.

> And if the programmer cannot assume that two
> #include directives refer to the same file, then the programmer should
> not assume that they would refer to different files either. In short,
> there is no rational way to manage dependencies under such
> circumstances.

And yet... People are doing it.  I have yet to see a company
which maintains a list of the implementation include files in
the various libraries it used, in order to avoid them.

> > The overall solution to base it on a hash of file content, md5 or other,
> > may still be a good approach to a solution, but this simple
> > optimization of it will fail.

> No, #once would succeed in preventing a header file from being included
> if a header file with that name had been included earlier within the
> same translation unit. And whether or not that is the desired effect -
> at least what the #once directive would do when added to a file - would
> be clear. And it makes much more sense for a #once to have a
> clear, predictable effect (so there is no question whether its use is
> appropriate for a given situation) than it would be to have #once
> mirror the same complexity, fragility, and unpredictability of existing
> practices - the very same practices that #once is intended to
> straighten out.

Realistically... Why should #once have such clear, predictable
effects when #include doesn't?  Where on some systems, #include
<strstream.h> has the same effect as #include <strstrea.h>, and
on others not.  Where depending on which file system the actual
files are mounted on, #include <MyHeader.hh> and #include
<MYHEADER.h> refer to the same file, or to different files.
Where depending on the compiler options, #include <iostream> and
#include "iostream" may or may not include the same file.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Thu, 14 Dec 2006 15:52:39 CST
Raw View
kuyper@wizard.net wrote:

    [...]
> When it's done deliberately, as I've described above, rational
> dependency management gets a little complicated, but it's nowhere near
> to being impossible.

Just curious, but does anyone still do dependency management
manually today?  Isn't this something that is taken care of by
some sort of collaboration between the compiler and make (or
whatever one uses in its place), either implicitly and
automatically, or explicitly via a special target for make?
(Or by some other tool completely---I have no idea how this
occurs in an IDE, but I can't imagine users of Visual Studio
typing in dependency lists manually.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: usenet-nospam@nmhq.net (Niklas Matthies)
Date: Thu, 14 Dec 2006 21:54:58 GMT
Raw View
On 2006-12-12 16:43, James Kanze wrote:
:
> There's also a point about how you specify it formally
> (supposing that you don't want to require MD5 if something
> better comes along).

Since the standard doesn't specify the source character encoding
it would be kinda moot anyway.
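
For reference, the content-based idea discussed in this thread can be
sketched as follows.  This is only an illustration: FNV-1a stands in
for MD5, the class name ContentOnce is made up, and the hash is taken
over raw bytes, so the same text stored in a different encoding or
with different line endings would count as a different file.

```cpp
#include <cstdint>
#include <set>
#include <string>

// 64-bit FNV-1a over the raw bytes of a file (a stand-in for MD5 here).
std::uint64_t fnv1a(const std::string& bytes) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : bytes) {
        hash ^= c;
        hash *= 1099511628211ULL;                  // FNV prime
    }
    return hash;
}

// Hypothetical registry: a header is "the same file" if its bytes hash equal.
// A hash collision would wrongly suppress a genuinely different header.
class ContentOnce {
public:
    // Returns true the first time this content is seen, false afterwards.
    bool first_inclusion(const std::string& contents) {
        return seen_.insert(fnv1a(contents)).second;
    }

private:
    std::set<std::uint64_t> seen_;
};
```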

-- Niklas Matthies






Author: "James Kanze" <james.kanze@gmail.com>
Date: Thu, 14 Dec 2006 15:51:40 CST
Raw View
Robert Mabee wrote:
> James Kanze wrote:
> > Robert Mabee wrote:
> >>For example, the new #pragma could require that the include file contain
> >>only complete declarations and #defines, which would generate no error
> >>messages if they were duplicates, and it could be undefined behavior to
> >>#undef anything from the header so you wouldn't be able to tell if the
> >>second inclusion happened.

> > That would make it pretty much useless.  No class definitions,
> > or template definitions (neither of which can be duplicated in a
> > single translation unit).

> Perhaps terminology or usage has drifted;

Not since the ARM, at any rate.  I'm pretty sure, in fact, not
since the first edition of Stroustrup.

> I expected class definitions
> only in a main file.

The standard requires quite a few in its headers.  Unless the
class definition is in a header, in fact, no one else can use
it.

> However, the example is easily extended to ignore
> duplicate definitions in this context.

I'm not sure what you mean by "the example".  Your proposal
seemed to say: forbid anything in a header whose definition
cannot occur twice in a single translation unit, so a conforming
program cannot tell whether the header was included twice.  If
that were feasible, we wouldn't be using include guards at
present, and nobody would be proposing #pragma once.
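
A minimal sketch of the distinction, assuming a hypothetical header whose
text is simply seen twice in one translation unit (the names twice, counter,
Widget, id_type and BUFFER_SIZE are invented for illustration). Pure
declarations may legally be repeated; a class definition may not, which is
why most real headers still need a guard:

```cpp
// First "inclusion" of the hypothetical header text.
int twice(int);                 // function declaration
extern int counter;             // variable declaration, not a definition
class Widget;                   // forward declaration of a class
typedef unsigned long id_type;  // typedef
#define BUFFER_SIZE 64          // macro definition

// Second "inclusion": the same text again. All of this is still well-formed.
int twice(int);
extern int counter;
class Widget;
typedef unsigned long id_type;  // redeclaring the same type is allowed in C++
#define BUFFER_SIZE 64          // an identical macro redefinition is allowed

// By contrast, a line such as
//     class Widget { int x; };
// appearing twice in the same translation unit would be a redefinition error.

// The single definitions live in exactly one place.
int counter = 0;
int twice(int n) { return 2 * n; }
```

A header restricted to the first kind of content could be included any
number of times unnoticed, but it could not define a class or a template.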

> I thought it might be acceptable to not compare the entire body of the
> definition.  However, that would leave conflicts between two genuinely
> different files undiagnosed.

One could, of course, permit multiple definitions in the same
translation unit of everything, with the requirement that they
be identical.  I'm not sure that that's a road I want to take,
however.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: kuyper@wizard.net
Date: Thu, 14 Dec 2006 17:14:17 CST
Raw View
James Kanze wrote:
> kuyper@wizard.net wrote:
>
>     [...]
> > When it's done deliberately, as I've described above, rational
> > dependency management gets a little complicated, but it's nowhere near
> > to being impossible.
>
> Just curious, but does anyone still do dependency management
> manually today?  Isn't this something that is taken care of by
> some sort of collaboration between the compiler and make (or
> whatever one uses in its place), either implicitly and
> automatically, or explicitly via a special target for make?
> (Or by some other tool completely---I have no idea how this
> occurs in an IDE, but I can't imagine users of Visual Studio
> typing in dependency lists manually.)

The thing that occurs automatically might more accurately be described
as dependency tracking. I've never used the phrase "dependency
management" before, but I'd expect it to include that portion of the
design phase where someone figures out how the files that make up a
system should be allocated to different directories. I don't think that
part of the task can easily be automated.






Author: =?ISO-8859-1?Q?Bj=F8rn_Roald?= <bjorn@4roald.org>
Date: Thu, 14 Dec 2006 21:58:26 CST
Raw View
James Kanze skrev:

> foo/doh.h
>     #include "abc.h"
>
> bar/doh.h
>     #include "abc.h"
>
> main.cc
>     #include "foo/doh.h"
>     #include "bar/doh.h"
>
> If there is a file abc.h in both foo and bar, the include in
> foo/doh.h will find the one in foo, and the include in bar/doh.h
> will find the one in bar.  The situation actually occurs fairly
> often, for internal headers in libraries.

Yes, I got that part straight in my head now.  Thanks.

>> I can safely assume one command line, so the -I
>> options are not changing.
>
> My mention of the -I options was purely parenthetical, and not
> really relevant to the discussion at hand.

ok :-)


>>>     [...]
>>>> As I see it, there are two goals of the #pragma once, and other similar
>>>>   proposals.
>
>>>> 1. simplicity in use, and fewer errors as an important positive side effect
>
>>> A limited benefit, since external tools normally take care of
>>> this automatically.
>
>> Yes, you repeatedly state this.  But it only holds halfway, and I
>> dislike halfway solutions.  Does your tool change the GUARD when you
>> rename a source file?  Or when you copy it?  Or change the namespace? Or...
>
> Of course not (although it would be simple enough to make it do
> so).  Should it?  What if the tool simply generates some random
> guard, or something based on the timestamp?

fair enough, but it would be much better if it was not needed.

>> Does it fix unbalanced #ifdef #endif blocks?
>
> No.  Nor does it fix any other broken code in the file.  (In
> fact, it doesn't fix anything; it just inserts when the file is
> created.  Does anyone actually use an editor which doesn't do
> something this fundamental?)

Well, this code error would never happen.

>>>> 2. optimization, mainly in the use of the programmer's time - which
>>>> often also involves waiting for the compiler to complete
>
>>> Except that existing compilers (good ones, anyway) recognize the
>>> include guard pattern and don't make redundant includes anyway.
>
>> Yes, but you have to maintain them, scroll past them, make sure they
>> follow the ruling style and conventions, that is also wasted time.
>
> Agreed.  On the other hand, you also have to scroll past the
> copyright notice, which is almost always longer.  As for style
> and naming conventions, you program the editor to insert
> something conforming.

You and I do, but it is not my experience that all developers do.

>>> There's also a definite esthetic benefit---include guards are
>>> ugly, and their necessity is an embarrassment to the language.
>>> But I don't find a pragma much better.
>
>> It is an improvement but it should have been the default.
>
> C should have had true modules from the start.  Then we wouldn't
> be having this discussion.

Agree

>> That seems to
>> be a bit off the chart.  But I really would have preferred a pragma
>> solution to be used in those files that need the legacy default
>> behavior.  That way you and I would not be bothered much with how it
>> looks.  This would probably break some code, but should be feasible to
>> fix.  The real question I guess is if there is a smart and nice way of
>> handling the transition period for code that has to deal with
>> conforming and non-conforming compilers.
>
> No.

So you don't think changing the default should be part of a proposal,
even if a good migration method is identified?

>>> The current situation is thus: there's a benefit, but it's
>>> pretty small, and there's no real working proposal on the floor.
>
>> Can someone point me at a good proposal text for a similar feature, or
>> other relevant link giving me a clue of what it takes.
>
> Well, a lot depends on the proposal.  This one is simple enough
> and isolated enough that it shouldn't take much work: a short
> introduction, a paragraph or two with a discussion of why it
> will help, what the alternatives might be, etc.  (The discussion
> here should provide all of the necessary material.)  And
> finally, a detailed write up of exactly what
> sentences/paragraphs have to be modified, and how, in the
> standard.

ok.  I have to get a real copy then and have a look.  I only have some
draft versions. I think n1905 is the latest I have.  Is use and
reference to the latest draft appropriate?

> Other aspects which normally have to be addressed are
> implementability (I think some compilers already support it, in
> some form or another, so implementability has been proven,
> provided what you specify doesn't differ too much from what has
> been implemented.) and interaction with other features, which
> should be pretty close to 0, given that we're talking here about
> the preprocessor, and only the preprocessor.
>
>> My quick
>> assessment of what need to be done is:
>
>> - write the proposal and revisions
>> - get reviews
>
> These two steps had better be done very, very quickly if you
> want to get it into the next revision.  Formally, the cut-off
> date for new proposals has passed.

I do not worry too much whether change is in that revision.  It would be
very nice, but not critical for me.  It is more important to get it
right. One important issue I am not sure of is whether the #once
behavior should be default.  I am fully aware of a number of issues, but
it seems logical to make that change.  But pragmatically, the Microsoft
style

#pragma once

is the most realistic.  Additional forms

#pragma STD ONCE
#once

could be allowed, but then we need to deal with deprecation.

>> - implement reference implementation, probably a patch to gcc
>
> No patch needed, g++ already supports it.

As long as I don't want to switch defaults :-)  Then I need a patch.

<snip>


---
Bjørn






Author: rmabee@comcast.net (Robert Mabee)
Date: Fri, 15 Dec 2006 04:31:11 GMT
Raw View
James Kanze wrote:
> Robert Mabee wrote:
>>However, the example is easily extended to ignore
>>duplicate definitions in this context.
>
> I'm not sure what you mean by "the example".  Your proposal
> seemed to say: forbid anything in a header whose definition
> cannot occur twice in a single translation unit, so a conforming
> program cannot tell whether the header was included twice.

I was referring to a little idea introduced by "For example", which
was an alternative to reliable file identification, supposing that
instead the contents of the tagged file could be forced to be
idempotent, suppressing errors on duplications and adding errors
on violations of the restrictions, which should be on unbalanced
syntax and preprocessor hacks, not on declarations and those
definitions (as previously pointed out) that belong in headers.






Author: "James Kanze" <james.kanze@gmail.com>
Date: Fri, 15 Dec 2006 09:45:17 CST
Raw View
Bjørn Roald wrote:
> James Kanze skrev:
> > C should have had true modules from the start.  Then we wouldn't
> > be having this discussion.

> Agree

It's being addressed.  The committee has said (via a vote) that
it definitely wants true modules, but that the current proposal
is not far enough along to make it this time.

> >> That seems to
> >> be a bit off the chart.  But I really would have preferred a pragma
> >> solution to be used in those files that need the legacy default
> >> behavior.  That way you and I would not be bothered much with how it
> >> looks.  This would probably break some code, but should be feasible to
> >> fix.  The real question I guess is if there is a smart and nice way of
> >> handling the transition period for code that has to deal with
> >> conforming and non-conforming compilers.

> > No.

> So you don't think changing the default should be part of a proposal,
> even if a good migration method is identified?

I'm against breaking code except for very good reasons.  I could
accept some code breakage (e.g. a new keyword) for full module
support.  I don't think it's worth it here, and there are other
alternatives which wouldn't break code.  (Using #import instead
of #include to imply that the file should only be included once,
for example.)

> >>> The current situation is thus: there's a benefit, but it's
> >>> pretty small, and there's no real working proposal on the floor.

> >> Can someone point me at a good proposal text for a similar feature, or
> >> other relevant link giving me a clue of what it takes.

> > Well, a lot depends on the proposal.  This one is simple enough
> > and isolated enough that it shouldn't take much work: a short
> > introduction, a paragraph or two with a discussion of why it
> > will help, what the alternatives might be, etc.  (The discussion
> > here should provide all of the necessary material.)  And
> > finally, a detailed write up of exactly what
> > sentences/paragraphs have to be modified, and how, in the
> > standard.

> ok.  I have to get a real copy then and have a look.  I only have some
> draft versions. I think n1905 is the latest I have.  Is use and
> reference to the latest draft appropriate?

Probably.  I have N2009, and I don't think it's the latest.  On
the other hand, we're talking here about something that would go
into §16.2, and I don't think that there have been any
modifications there since the original standard.

    [...]
> > These two steps had better be done very, very quickly if you
> > want to get it into the next revision.  Formally, the cut-off
> > date for new proposals has passed.

> I do not worry too much whether change is in that revision.  It would be
> very nice, but not critical for me.

I think that the current feeling is that true modules will be in
whatever comes after this revision (a TR, perhaps, and certainly
in the next revision).  You may find that there's not much
interest in looking into this when full modules are coming.

> It is more important to get it
> right. One important issue I am not sure of is whether the #once
> behavior should be default.  I am fully aware of a number of issues, but
> it seems logical to make that change.  But pragmatically, the Microsoft
> style

> #pragma once

> is the most realistic.  Additional forms

> #pragma STD ONCE
> #once

> could be allowed, but then we need to deal with deprecation.

> >> - implement reference implementation, probably a patch to gcc

> > No patch needed, g++ already supports it.

> As long as I don't want to switch defaults :-)  Then I need a patch.

If it's just a question of spelling... I'm sure that the
committee would accept that the implementation of #once is close
enough to that of "#pragma once" for the existence of "#pragma
once" to count as existing practice.  Or even treating it as the
default, and having a new pre-processor directive to disable it.

There's another point I forgot: C compatibility.  There is
likely to be significant feeling that this is something where we
should ensure that C does likewise.  Which probably means that
it would take more time---and that you should write the proposal
with that in view (which shouldn't be too hard---C and C++ are
practically identical here).

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: geunnaro_prouta@yahoo.com (Gennaro Prota)
Date: Thu, 7 Dec 2006 18:08:20 GMT
Raw View
On Wed,  6 Dec 2006 11:45:50 CST, James Kanze wrote:

>Nevin :-] Liber wrote:
>
>> The include guard idiom is based on macro symbols, not on whether or not
>> two files are the same file.  Information about the latter must be
>> provided by the operating system.
>
>The optimization based on recognizing it is based on recognizing
>that two includes actually include the same file.  And the
>reason that it can be implemented correctly is that it isn't an
>error if they do happen to read the same file twice; it just
>results in a slower compile time.  So they can take a
>conservative strategy, and in case of doubt, skip the
>optimization.  This would not be the case in the case of #pragma
>once.

Ok, but how can a "case of doubt" be reasonably identified?

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)






Author: "eric_backus@alum.mit.edu" <eric_backus@alum.mit.edu>
Date: Thu, 7 Dec 2006 14:00:36 CST
Raw View
James Kanze wrote:
> eric_backus@alum.mit.edu wrote:
> > Which failure is more likely:
> >  * Your system supports linked files, the include files in your project
> > make use of that, but the compiler can't figure out that two files are
> > linked together, or
>
> A fairly usual case, I think.  At least, it's been the case in
> most places I've worked.  (The general context which would allow
> such errors, that is.  In practice, I would imagine that it
> would be pretty rare for even something as simple as a simple
> literal comparison to give a wrong answer.  In any project, all
> include files have a "canonical" name, regardless of the
> different ways they can be accessed, and that canonical name is
> the one used when including the file.)

A usual case?  I find that very hard to believe.  I suspect that linked
header files are quite rare to begin with, and most of the places where
they would be used are places where the compiler would almost certainly
be able to tell that they're linked.  Even if today's compiler does not
already know how to tell that they're linked, if "#pragma once" were
standardized, the compiler could (and would) easily be made to tell.

> >  * You use good old include guards, but accidentally use the same
> > include guard in different include files, due to using cut-and-paste to
> > create one of the files.
>
> I'll admit that I can't imagine that ever happening.  You might
> cut-and-paste the contents of the header file, but never the
> include guards, which have always been created automatically,

You've never done something like "cp header1.h header2.h" and then
edited header2.h?  Even if you haven't, I guarantee that others have.

> using naming conventions guaranteed to generate a unique name
> within the company, and with some specific prefix or suffix to
> make it unlikely to conflict with the naming conventions used in
> some third party library.
>
> If it's really a worry, of course, you can append the timestamp,
> the IP or MAC address of the machine and the process id of the
> editor/generator script to the guard.  Maybe with some bytes
> from /dev/random as well, just for good measure.

Sure, that would work, if you went to the effort of doing it.  Have you
ever done that?  I haven't.

The reality is that few people worry much about this problem, so
include guards occasionally fail.  It's not a big deal, but #pragma
once solves the problem, improves the global namespace, is simple to
specify, is not difficult to implement, and is already existing
practice.
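
A minimal sketch of the "cp header1.h header2.h" failure, with both
hypothetical headers written out inline in one file so the effect is visible.
The guard name PROJECT_HEADER1_H and the functions from_header1 and
from_header2 are invented for illustration:

```cpp
// ---- contents of the hypothetical header1.h ----
#ifndef PROJECT_HEADER1_H
#define PROJECT_HEADER1_H
inline int from_header1() { return 1; }
#endif

// ---- contents of the hypothetical header2.h, created by copying ----
// ---- header1.h; the guard was never renamed                      ----
#ifndef PROJECT_HEADER1_H      // already defined above, so this block is skipped
#define PROJECT_HEADER1_H
#define HEADER2_WAS_SEEN 1
inline int from_header2() { return 2; }
#endif

// Record whether the body of header2.h survived preprocessing.
#ifdef HEADER2_WAS_SEEN
constexpr bool header2_was_seen = true;
#else
constexpr bool header2_was_seen = false;
#endif
```

Here header2_was_seen is false and from_header2 is never declared, so any
later use of it fails to compile. The failure is loud, but its cause (a
stale guard in another file) is not obvious from the error message.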

> > I'm pretty sure the second is much more likely--I've seen it happen.
>
> The place to address that is your software development process,
> not the language definition.

Well, that's a valid point, but it doesn't change the fact that this
error does and will continue to happen occasionally.






Author: rmabee@comcast.net (Robert Mabee)
Date: Fri, 8 Dec 2006 01:08:39 GMT
Raw View
James Kanze wrote:
> And what happens if it recognizes two paths as pointing to the
> same file, when they don't?  I just tried including "/dev/tty"

Shouldn't the compiler be permitted to assume a "normal" model of usage,
such that a source file is a static object that can be "seek"ed in and
reread or cached as needed?  Otherwise the common case could be burdened
with the cost of copying the contents just in case they're needed later
(say for an error message quoting the source line).  I'd also want to be
able to get the file size from the OS and read the whole thing in at
once (memory permitting) which won't work with devices, pipes, or
sockets.

Then it's a small jump to assume that if two references are to the same
file (determined by an OS function still to be disputed) then they are
to the same contents, and any internal equivalent can be substituted,
such as a cached sequence of preprocessor tokens, or omitted if the
compiler can prove the file will have no effect.  "#pragma once" would
be taken as such proof, and if believing it (or ignoring it on an old
compiler or messed-up configuration) results in errors then I as the
programmer am at fault and will fix it -- nothing new there.






Author: jdennett@acm.org (James Dennett)
Date: Fri, 8 Dec 2006 05:46:58 GMT
Raw View
Gennaro Prota wrote:
> On Wed,  6 Dec 2006 11:45:50 CST, James Kanze wrote:
>
>> Nevin :-] Liber wrote:
>>
>>> The include guard idiom is based on macro symbols, not on whether or not
>>> two files are the same file.  Information about the latter must be
>>> provided by the operating system.
>> The optimization based on recognizing it is based on recognizing
>> that two includes actually include the same file.  And the
>> reason that it can be implemented correctly is that it isn't an
>> error if they do happen to read the same file twice; it just
>> results in a slower compile time.  So they can take a
>> conservative strategy, and in case of doubt, skip the
>> optimization.  This would not be the case in the case of #pragma
>> once.
>
> Ok, but how can a "case of doubt" be reasonably identified?

If the compiler can't prove that the files are the same,
there is "doubt", and it should include the file anyway.

-- James






Author: Fabio Fracassi <f.fracassi@gmx.net>
Date: Fri, 8 Dec 2006 09:49:10 CST
Raw View
eric_backus@alum.mit.edu wrote:

> James Kanze wrote:
>> eric_backus@alum.mit.edu wrote:
>> > Which failure is more likely:
[snip]
>> >  * You use good old include guards, but accidentally use the same
>> > include guard in different include files, due to using cut-and-paste to
>> > create one of the files.
>>
>> I'll admit that I can't imagine that ever happening.  You might
>> cut-and-paste the contents of the header file, but never the
>> include guards, which have always been created automatically,
>
> You've never done something like "cp header1.h header2.h" and then
> edited header2.h?  Even if you haven't, I guarantee that others have.
>
>> using naming conventions guaranteed to generate a unique name
>> within the company, and with some specific prefix or suffix to
>> make it unlikely to conflict with the naming conventions used in
>> some third party library.
>>
>> If it's really a worry, of course, you can append the timestamp,
>> the IP or MAC address of the machine and the process id of the
>> editor/generator script to the guard.  Maybe with some bytes
>> from /dev/random as well, just for good measure.
>
> Sure, that would work, if you went to the effort of doing it.  Have you
> ever done that?  I haven't.
>
> The reality is that few people worry much about this problem, so
> include guards occasionally fail.  It's not a big deal, but #pragma
> once solves the problem, improves the global namespace, is simple to
> specify, is not difficult to implement, and is already existing
> practice.
>

But #pragma once fails on different occasions, too.
The thing is that when include guards fail (let's say because someone copied
them) then you get some compile time errors shortly after, and fix them.

With #pragma once everything might be fine on your machine, but someone else
checks out the code and it won't work. Think Open Source, where the source
code is often also your install package, so the users don't know or care
how to fix this. Besides you cannot fix this in the code.

--
Fabio Fracassi












Author: "James Kanze" <james.kanze@gmail.com>
Date: Fri, 8 Dec 2006 09:52:02 CST
Raw View
ThosRTanner wrote:
> Ben Hutchings wrote:
> > On 2006-12-06, Peter Steiner <pnsteiner@gmail.com> wrote:
> > > Seungbeom Kim wrote:

> > >> Then I don't see much benefit in such a change. Maybe it's way easier
> > >> just to write a program that generates the include guards.

> > > besides the improvement in usability, a pragma once statement
> > > decreases preprocessor runtime cost. include guards don't allow the
> > > preprocessor to omit the file for obvious reasons, while pragma once
> > > does so.
> > <snip>

> > #pragma once has been suggested many times, but it's difficult to
> > specify and may be impossible to implement on some common systems.
> > I know a popular compiler that implements #pragma once which allows a
> > file containing #pragma once that's on a case-insensitive file-system
> > to be included more than once if the include directives use differing
> > capitalisation,

> That's a QOI issue though. I can see that there are issues:
> 1) File name case differs on a case-insensitive file system - easily
> fixable

How?  I don't think that an application program can even
necessarily know if filenames are case-insensitive or not.  On
my Windows machine at home, the files on drive c: are
case-insensitive, and those on drive f: are case-sensitive.  I
don't know if even the kernel knows this, and I'm certainly not
aware of an application level request to find out (but then, I
don't know the Windows API very well).

> 2) Use of soft links - more difficult, though fairly easy to fix (at
> least on unix)

It depends on whether the soft links are resolved transparently
by the NFS or SMB server, no?  I think Samba handles them
transparently, so that Windows clients don't see them (but I'd
have to verify to make sure).

> 3) Use of hard links - I would have described that as a "sanity of
> coder issue". There are certain facilities on unix you really don't
> need to use...

Actually, I think hard links are easier than soft links to
handle.

> 4) Multiple network mounts to same point - again, that is a
> questionable system setup.

I don't know.  I've seen it used in the past for generating
backups, for example.  The fact remains that it is legal.

> Me, I'd be happy to have #pragma once (or some such name) even if I was
> told that 2, 3 and 4 would break it - all of those setups cause
> confusion in the minds of programmers anyway - you think you are
> editing a header file that'll only affect module X, and it could well
> affect module Y without your realising if points 2, 3 or 4 happen to be
> issues.

That's you.  Personally, I've yet to figure out what the
supposed advantages of #pragma once are supposed to be.  There
are a lot of things wrong with the current system---textual
inclusion is NOT the best solution for this sort of
problem---but I don't see where pragma once solves any of them.

> > Another popular compiler recognises the #ifndef FOO...#define FOO...
> > #endif pattern and avoids repeatedly reading such files (so long as it
> > recognises them) if the controlling macro is still defined.   Even if
> > it fails to recognise two paths as pointing to the same file, this only
> > makes compilation a little slower.

> > I know which behaviour I prefer.

> And I know of many cases where the same include guard has been used for
> 2 header files - because of
> 1) Insufficiently specific include guards
> 2) copy and paste
> 3) rename header file without changing include guards, followed by
> creation of new header with same name and same include guard.

> I know which behaviour I prefer.

> I have to admit - I am lazy. If I have to do the same thing repeatedly
> for a particular tool, I feel the tool should be doing it for me.

And isn't it?  Don't you automatically get a guaranteed unique
set of include guards inserted every time you open a new file
with a name which ends in .hh, .hpp, or whatever you use?
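
As an illustration of what such an editor hook might compute, here is a
small sketch. The function make_guard and its unique_suffix parameter are
hypothetical; the caller is assumed to supply whatever makes the name unique
(a timestamp, a host id, random bytes):

```cpp
#include <cctype>
#include <string>

// Builds an include-guard macro name from a header path plus a
// caller-supplied suffix that is assumed to be unique.
std::string make_guard(const std::string& path, const std::string& unique_suffix)
{
    std::string guard;
    for (unsigned char c : path + "_" + unique_suffix) {
        // Letters and digits are upper-cased; everything else becomes '_'.
        char out = std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_';
        // Collapse runs of '_': names containing "__" are reserved in C++.
        if (out == '_' && !guard.empty() && guard.back() == '_')
            continue;
        guard += out;
    }
    // A macro name must not start with a digit, and a leading '_' followed
    // by an upper-case letter is reserved, so add a harmless prefix.
    if (guard.empty() || std::isdigit(static_cast<unsigned char>(guard.front())))
        guard.insert(0, "H_");
    else if (guard.front() == '_')
        guard.insert(0, "H");
    return guard;
}
```

For example, make_guard("foo/doh.h", "20061208") yields FOO_DOH_H_20061208.
The name is generated once, when the file is created; nothing here updates
it when the file is later renamed or copied.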

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "James Kanze" <james.kanze@gmail.com>
Date: Fri, 8 Dec 2006 10:06:52 CST
Raw View
Robert Mabee wrote:
> James Kanze wrote:
> > And what happens if it recognizes two paths as pointing to the
> > same file, when they don't?  I just tried including "/dev/tty"

> Shouldn't the compiler be permitted to assume a "normal" model of usage,

What's normal?  The Intel ASM 86 assembler actually explicitly
allowed reading from console input, and I used the feature on
one or two occasions.  (I'll admit that trying to include
"/dev/tty" is a bit perverse, though, since the input has to be
direct C++ syntax---I can't grab the string and process it
further before the compiler sees it.)

It was just the first example which came to mind.

> such that a source file is a static object that can be "seek"ed in and
> reread or cached as needed?

> Otherwise the common case could be burdened
> with the cost of copying the contents just in case they're needed later
> (say for an error message quoting the source line).  I'd also want to be
> able to get the file size from the OS and read the whole thing in at
> once (memory permitting) which won't work with devices, pipes, or
> sockets.

> Then it's a small jump to assume that if two references are to the same
> file (determined by an OS function still to be disputed) then they are
> to the same contents, and any internal equivalent can be substituted,
> such as a cached sequence of preprocessor tokens, or omitted if the
> compiler can prove the file will have no effect.

Well, compilers already do this, so it is justified by existing
practice, if nothing else.  How a compiler maps the
[qh]-char-sequence to where it searches for the contents is
implementation defined, so I suppose that a compiler could
define it 1) to being undefined behavior unless it is a normal
file (in the Unix sense), and 2) to say changes in the
underlying file while the compiler is running result in
undefined behavior.  In practice, that's probably the current
state of affairs anyway, and I think it would be difficult for
it to be otherwise.

> "#pragma once" would be taken as such proof, and if believing
> it (or ignoring it on an old compiler or messed-up
> configuration) results in errors then I as the programmer am
> at fault and will fix it -- nothing new there.

The real problem with "#pragma once" is in the other direction,
as others have pointed out.  There are simply too many
cases---cases which occur in practice---where the compiler
cannot tell, where what looks like two different includes might
in fact be the same file.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: jdennett@acm.org (James Dennett)
Date: Fri, 8 Dec 2006 16:09:56 GMT
eric_backus@alum.mit.edu wrote:
> James Kanze wrote:
>> eric_backus@alum.mit.edu wrote:
>>> Which failure is more likely:
>>>  * Your system supports linked files, the include files in your project
>>> make use of that, but the compiler can't figure out that two files are
>>> linked together, or
>> A fairly usual case, I think.  At least, it's been the case in
>> most places I've worked.  (The general context which would allow
>> such errors, that is.  In practice, I would imagine that it
>> would be pretty rare for even something as simple as a simple
>> literal comparison to give a wrong answer.  In any project, all
>> include files have a "canonical" name, regardless of the
>> different ways they can be accessed, and that canonical name is
>> the one used when including the file.)
>
> A usual case?  I find that very hard to believe.  I suspect that linked
> header files are quite rare to begin with, and most of the places where
> they would be used are places where the compiler would almost certainly
> be able to tell that they're linked.  Even if today's compiler does not
> already know how to tell that they're linked, if "#pragma once" were
> standardized, the compiler could (and would) easily be made to tell.
>
>>>  * You use good old include guards, but accidentally use the same
>>> include guard in different include files, due to using cut-and-paste to
>>> create one of the files.
>> I'll admit that I can't imagine that ever happening.  You might
>> cut-and-paste the contents of the header file, but never the
>> include guards, which have always been created automatically,
>
> You've never done something like "cp header1.h header2.h" and then
> edited header2.h?  Even if you haven't, I guarantee that others have.
>
>> using naming conventions guaranteed to generate a unique name
>> within the company, and with some specific prefix or suffix to
>> make it unlikely to conflict with the naming conventions used in
>> some third party library.
>>
>> If it's really a worry, of course, you can append the timestamp,
>> the IP or MAC address of the machine and the process id of the
>> editor/generator script to the guard.  Maybe with some bytes
>> from /dev/random as well, just for good measure.
>
> Sure, that would work, if you went to the effort of doing it.  Have you
> ever done that?  I haven't.
>
> The reality is that few people worry much about this problem, so
> include guards occasionally fail.  It's not a big deal, but #pragma
> once solves the problem, improves the global namespace, is simple to
> specify, is not difficult to implement, and is already existing
> practice.

We seem to be going in circles: I (and some others) hold that
there has been no robust specification and that the specification
we have seen appears to be unimplementable in many environments.
We've discussed why that is so, and I've not seen any refutation
of those arguments.  A portable version of #pragma once, it would
seem, would likely be nothing more than a hint, so that include
guards are required anyway (rendering the #pragma redundant).

My include guard names are generally created by my editor when
I create a new header file.  I expect that any good programmers'
editor should be able to do similarly.

-- James






Author: "ThosRTanner" <ttanner2@bloomberg.net>
Date: Fri, 8 Dec 2006 10:16:19 CST
"Bo Persson" wrote:
> ThosRTanner wrote:
> > Ben Hutchings wrote:
>
> >> #pragma once has been suggested many times, but it's difficult to
> >> specify and may be impossible to implement on some common systems.
> >> I know a popular compiler that implements #pragma once which
> >> allows a file containing #pragma once that's on a case-insensitive
> >> file-system to be included more than once if the include
> >> directives use differing capitalisation,
> > That's a QOI issue though. I can see that there are issues:
> > 1) File name case differs on a case-insensitive file system - easily
> > fixable
> > 2) Use of soft links - more difficult, though fairly easy to fix (at
> > least on unix)
> > 3) Use of hard links - I would have described that as a "sanity of
> > coder issue". There are certain facilities on unix you really don't
> > need to use...
> > 4) Multiple network mounts to same point - again, that is a
> > questionable system setup.
>
> It's not always that the developers can influence the design of the
> corporate network.
>
> For example, I have network mounts to Windows servers, various NAS disks,
> ClearCase on a UNIX server, and MVS on an IBM mainframe. Should we require a
> C++ compiler to resolve this?

I'm sure you do - we do too. But do you really have 2 different mounts
to the same server, which is what would be required to cause the issue?
I would really query a system set up like that.

Like I said - I don't expect the compiler to resolve that sort of
thing. And if you have 2 different hard paths to the same file,
frankly, you'll have nasty build and maintenance problems anyway.

>
> >>
> > And I know of many cases where the same include guard has been used
> > for 2 header files - because of
> > 1) Insufficiently specific include guards
> > 2) copy and paste
> > 3) rename header file without changing include guards, followed by
> > creation of new header with same name and same include guard.
> >
> > I know which behaviour I prefer.
>
> That the development system takes care of that, so the language standard
> doesn't have to?  :-)
>
>

I'd still prefer a #pragma includemultiple - then the default behaviour
would be what everyone wants. The paranoid (or those with IMHO insane
network setups) can always add this to all their headers, along with
the include guards.






Author: "James Kanze" <james.kanze@gmail.com>
Date: Fri, 8 Dec 2006 11:07:38 CST
eric_backus@alum.mit.edu wrote:
> James Kanze wrote:
> > eric_backus@alum.mit.edu wrote:
> > > Which failure is more likely:
> > >  * Your system supports linked files, the include files in your project
> > > make use of that, but the compiler can't figure out that two files are
> > > linked together, or

> > A fairly usual case, I think.  At least, it's been the case in
> > most places I've worked.  (The general context which would allow
> > such errors, that is.  In practice, I would imagine that it
> > would be pretty rare for even something as simple as a simple
> > literal comparison to give a wrong answer.  In any project, all
> > include files have a "canonical" name, regardless of the
> > different ways they can be accessed, and that canonical name is
> > the one used when including the file.)

> A usual case?  I find that very hard to believe.  I suspect that linked
> header files are quite rare to begin with, and most of the places where
> they would be used are places where the compiler would almost certainly
> be able to tell that they're linked.  Even if today's compiler does not
> already know how to tell that they're linked, if "#pragma once" were
> standardized, the compiler could (and would) easily be made to tell.

Linking isn't the only problem, although linked header files do
occasionally occur.  (I use them a lot to handle common cases
in dependent directories---if Solaris and Linux actually need
the same file, the Solaris and the Linux dependency directories
each contain a link to the common file.  Of course, any single
compile will only use one.)  Remotely mounted files pose a
problem in general, and I've never worked at a place where the
source files weren't remotely mounted.  Remotely mounted files
across different file system families cause a lot more
problems, and we regularly compile Windows code from files
physically located on Unix machines.  And as far as I know,
there's no way the compiler can determine the links in that
case---I don't think that SMB provides the information.
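
To make the "can the compiler tell?" question concrete, here is a minimal
sketch of the identity test on a system where the OS does answer.  It
assumes C++17's std::filesystem (which postdates this discussion); the
helper name surely_same_file is hypothetical, and treating every error as
"don't know" is my assumption, not something any compiler is known to do.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Returns true only when the OS positively reports that both paths name
// the same file.  Any error (missing file, query not supported by the
// file system) is treated as "unknown" and yields false.
bool surely_same_file(const fs::path& a, const fs::path& b)
{
    std::error_code ec;
    const bool same = fs::equivalent(a, b, ec);  // compares file identity, not spelling
    return !ec && same;
}
```

A false answer here only means "not proven identical".  That is harmless
for the include guard optimization, which can simply read the file again,
but it is exactly the case in which a #pragma once built on such a test
would include the same file twice.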

> > >  * You use good old include guards, but accidentally use the same
> > > include guard in different include files, due to using cut-and-paste to
> > > create one of the files.

> > I'll admit that I can't imagine that ever happening.  You might
> > cut-and-paste the contents of the header file, but never the
> > include guards, which have always been created automatically,

> You've never done something like "cp header1.h header2.h" and then
> edited header2.h?  Even if you haven't, I guarantee that others have.

Not really.  The more obvious solution would be to edit both, in
different editor windows, and then copy/paste the parts I
wanted.

> > using naming conventions guaranteed to generate a unique name
> > within the company, and with some specific prefix or suffix to
> > make it unlikely to conflict with the naming conventions used in
> > some third party library.

> > If it's really a worry, of course, you can append the timestamp,
> > the IP or MAC address of the machine and the process id of the
> > editor/generator script to the guard.  Maybe with some bytes
> > from /dev/random as well, just for good measure.

> Sure, that would work, if you went to the effort of doing it.  Have you
> ever done that?  I haven't.

What effort?  My editor does it.  I've never been in a situation
where I've had to worry to the point of using the additional
random bytes, but it would only take about 5 minutes to add to
the editor script.
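
As a rough illustration of what such an editor script computes, here is a
minimal sketch in C++.  The function names, the "ACME" style prefix and
the hexadecimal suffix format are all hypothetical; a real script would
follow whatever convention the company uses.

```cpp
#include <cctype>
#include <random>
#include <sstream>
#include <string>

// Turns a header path into an identifier-safe guard name: letters are
// upper-cased, digits kept, and every other run of characters collapses
// to a single '_' (so the result never contains a double underscore,
// which is reserved).  The caller supplies the uniquifying suffix.
std::string make_guard(const std::string& path, const std::string& prefix,
                       unsigned long suffix)
{
    std::string guard = prefix + "_";
    for (unsigned char c : path) {
        if (std::isalnum(c))
            guard += static_cast<char>(std::toupper(c));
        else if (guard.back() != '_')
            guard += '_';
    }
    if (guard.back() != '_')
        guard += '_';
    std::ostringstream out;
    out << guard << std::hex << std::uppercase << suffix;
    return out.str();
}

// Convenience wrapper: draws the suffix from std::random_device, standing
// in for the "bytes from /dev/random" mentioned above.
std::string make_unique_guard(const std::string& path, const std::string& prefix)
{
    std::random_device rd;
    return make_guard(path, prefix, rd());
}
```

For example, make_guard("net/socket.h", "ACME", 0x1F) yields
ACME_NET_SOCKET_H_1F.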

> The reality is that few people worry much about this problem, so
> include guards occasionally fail.

The reality is also that most sysadmins aren't even aware that
the problem exists, and so gleefully give you multiple paths to
the same file, or allow the file to show up in some cases as all
lower case, in others in its native case.  Or who knows what all
else---I've seen includes for Windows programs using the 8.3
mangling that the remote file server provided for filenames that
were too long; other headers in the same program used the full
name, because by the time they were written, Windows had evolved
beyond the 8.3 era.

The problem with making something standard is that it has to be
well defined, in all possible cases.  Adding pragma once with
the specification that it might prevent multiple inclusion, but
it isn't guaranteed to, isn't a good idea.  And the guarantee
simply isn't implementable in practice.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Fri, 8 Dec 2006 17:07:49 GMT
Gennaro Prota wrote:
> On Wed,  6 Dec 2006 11:45:50 CST, James Kanze wrote:
>
>
>>Nevin :-] Liber wrote:
>>
>>
>>>The include guard idiom is based on macro symbols, not on whether or not
>>>two files are the same file.  Information about the latter must be
>>>provided by the operating system.
>>
>>The optimization based on recognizing it is based on recognizing
>>that two includes actually include the same file.  And the
>>reason that it can be implemented correctly is that it isn't an
>>error if they do happen to read the same file twice; it just
>>results in a slower compile time.  So they can take a
>>conservative strategy, and in case of doubt, skip the
>>optimization.  This would not be the case in the case of #pragma
>>once.
>
>
> Ok, but how can a "case of doubt" be reasonably identified?

A compiler running on a not-too-smart filesystem could compute and cache
on disk md5 checksums for each included file. But I guess that's not
foolproof either: now we can't include two files with identical content,
even if we wanted! :o)
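
A minimal sketch of the idea, with the caveat built in.  It swaps MD5
for 64-bit FNV-1a purely to stay self-contained, leaves out the on-disk
cache, and the class name ContentOnce is made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>

// 64-bit FNV-1a, used here only as a stand-in for MD5.
std::uint64_t fnv1a(const std::string& bytes)
{
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ull;                  // FNV prime
    }
    return h;
}

// Decides "same file" purely from content: a header is processed only
// the first time its checksum is seen in this compilation.
class ContentOnce {
public:
    // Returns true if this content has not been seen yet.
    bool should_include(const std::string& contents)
    {
        return seen_.insert(fnv1a(contents)).second;
    }

private:
    std::unordered_set<std::uint64_t> seen_;
};
```

Two distinct files that happen to hold identical text get the same
checksum, so the second one is skipped, which is the limitation noted
above; a hash collision between different contents would have the same
effect.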

Andrei






Author: geunnaro_prouta@yahoo.com (Gennaro Prota)
Date: Fri, 8 Dec 2006 20:14:50 GMT
On Fri,  8 Dec 2006 17:07:49 GMT, "Andrei Alexandrescu (See Website
For Email)" wrote:

>Gennaro Prota wrote:
>> Ok, but how can a "case of doubt" be reasonably identified?
>
>A compiler running on a not-too-smart filesystem could compute and cache
>on disk md5 checksums for each included file. But I guess that's not
>foolproof either: now we can't include two files with identical content,
>even if we wanted! :o)

<incidental>

So it should still read and hash the file content but then eventually
ignore it. Not necessarily an optimization, though semantically useful
*if there were no collisions* :-) I had thought of this, as hash
functions are an area in which I'm working a lot in these days.

</incidental>

But you can mount different filesystems. That's what I didn't
understand in James' reply: he seems to refer to a runtime "discerning
capability" on the part of the compiler. No?

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)






Author: bop@gmb.dk ("Bo Persson")
Date: Fri, 8 Dec 2006 21:14:41 GMT
ThosRTanner wrote:
> "Bo Persson" wrote:
>> ThosRTanner wrote:
>>> Ben Hutchings wrote:
>>
>>>> #pragma once has been suggested many times, but it's difficult to
>>>> specify and may be impossible to implement on some common
>>>> systems. I know a popular compiler that implements #pragma once
>>>> which allows a file containing #pragma once that's on a
>>>> case-insensitive file-system to be included more than once if
>>>> the include directives use differing capitalisation,
>>> That's a QOI issue though. I can see that there are issues:
>>> 1) File name case differs on a case-insensitive file system -
>>> easily fixable
>>> 2) Use of soft links - more difficult, though fairly easy to fix
>>> (at least on unix)
>>> 3) Use of hard links - I would have described that as a "sanity of
>>> coder issue". There are certain facilities on unix you really
>>> don't need to use...
>>> 4) Multiple network mounts to same point - again, that is a
>>> questionable system setup.
>>
>> It's not always that the developers can influence the design of the
>> corporate network.
>>
>> For example, I have network mounts to Windows servers, various NAS
>> disks, ClearCase on a UNIX server, and MVS on an IBM mainframe.
>> Should we require a C++ compiler to resolve this?
>
> I'm sure you do - we do too. But do you really have 2 different
> mounts
> to the same server, which is what would be required to cause the
> issue? I would really query a system set up like that.

I have mounts to different department's disks. I have no idea where the
files are stored physically. Could be on the same server, or a different
one, or on some NAS device, or in another city. I don't know.

>
> Like I said - I don't expect the compiler to resolve that sort of
> thing, And if you have 2 different hard paths to the same file,
> frankly, you'll have nasty build and maintenance problems anyway.

So I just add a #pragma once, and the compiler will sort it out for me?
:-)

>
> I'd still prefer a #pragma includemultiple - then the default
> behaviour would be what everyone wants. The paranoid (or those with
> IMHO insane network setups) can always add this to all their
> headers, along with
> the include guards.

That doesn't help either, if the compiler can't tell whether it is the same
file, or not.

Bo Persson







Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Sat, 9 Dec 2006 05:41:02 GMT
Gennaro Prota wrote:
> On Fri,  8 Dec 2006 17:07:49 GMT, "Andrei Alexandrescu (See Website
> For Email)" wrote:
>
>> Gennaro Prota wrote:
>>> Ok, but how can a "case of doubt" be reasonably identified?
>> A compiler running on a not-too-smart filesystem could compute and
>> cache on disk md5 checksums for each included file. But I guess
>> that's not foolproof either: now we can't include two files with
>> identical content, even if we wanted! :o)
>
> <incidental>
>
> So it should still read and hash the file content but then eventually
>  ignore it. Not necessarily an optimization, though semantically
> useful *if there were no collisions* :-) I had thought of this, as
> hash functions are an area in which I'm working a lot in these days.

Nonono. It reads and hashes file content, after which the hash is
persisted in the same directory as the file. While the file is older
than the hash, the hash is directly used. On second thought, that
requires write access to the directory, which is not too appealing.


Andrei






Author: m@remove.this.part.rtij.nl (Martijn Lievaart)
Date: Sat, 9 Dec 2006 15:43:05 GMT
On Thu, 07 Dec 2006 11:28:00 -0600, James Kanze wrote:

> And what happens if it recognizes two paths as pointing to the
> same file, when they don't?  I just tried including "/dev/tty"
> with g++ (probably the first compiler to implement this), and if

I doubt that: http://www.ioccc.org/years-spoiler.html#1988_spinellis

M4






Author: rmabee@comcast.net (Robert Mabee)
Date: Sat, 9 Dec 2006 19:10:28 GMT
Andrei Alexandrescu (See Website For Email) wrote:
> Nonono. It reads and hashes file content, after which the hash is
> persisted in the same directory as the file. While the file is older
> than the hash, the hash is directly used. On second thought, that
> requires write access to the directory, which is not too appealing.

Getting close now.  A database anywhere of filenames, hashes, and dates
will let the compiler avoid reading a file which matches a hash-date
pair already encountered in this compilation.  The filename must be an
absolute path but need not be canonical for all the possible filesystem
gotchas.  If a filename is not yet in the database, the file might still
need to be excluded; so, only in that case, it must be read, hashed and
entered in the database, and then either used or not used as compiler
input according to the #pragma seen on a prior inclusion of the "same"
file (by hash-date pair) in this compilation.

The only cost of using two name variants for the same file is having
both variants in the database, and hashing the file once for each name
on the first reference to that name after a change to the file.  The
only OS requirement is a dependable file version stamp, for which the
change timestamp is a common proxy.  I trust there are no filesystems
where "same" file can have different contents, such as optional <CR>
characters before <NL>.

The same database will also permit the compiler to believe cached copies
of files despite name vagaries.  I don't suppose this will permit any
significant precompiling, though, because the result varies with macros
and include path.
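
A minimal in-memory sketch of that database, under some loud assumptions:
every header is taken to carry the #pragma, std::hash stands in for a
real checksum, the version stamp is a plain integer supplied by the
caller, and the names IncludeDatabase, Stamp and Version are invented
here.  A real one would persist the by-name table between compilations.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

using Stamp = long long;                          // file version stamp, e.g. change time
using Version = std::pair<std::uint64_t, Stamp>;  // (content hash, stamp)

class IncludeDatabase {
public:
    // `read` loads the file's contents and is called only when the name
    // is unknown or its stamp has changed.  Returns true if the file
    // must be fed to the compiler.
    bool should_include(const std::string& abs_path, Stamp stamp,
                        const std::function<std::string()>& read)
    {
        auto it = by_name_.find(abs_path);
        if (it == by_name_.end() || it->second.second != stamp) {
            // Unknown spelling or changed file: read and hash it once.
            const Version v{std::hash<std::string>{}(read()), stamp};
            it = by_name_.insert_or_assign(abs_path, v).first;
        }
        // Is this the first sighting of the hash-date pair in this compilation?
        return seen_.insert(it->second).second;
    }

private:
    std::map<std::string, Version> by_name_;  // would be persistent in the scheme above
    std::set<Version> seen_;                  // would be reset for each compilation
};
```

With this, a second spelling of the same file costs one extra read and
hash the first time it appears, and nothing after that until the file
changes.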






Author: "James Kanze" <james.kanze@gmail.com>
Date: Sun, 10 Dec 2006 11:05:02 CST
James Dennett wrote:
> eric_backus@alum.mit.edu wrote:
> > James Kanze wrote:

    [...]
> We seem to be going in circles: I (and some others) hold that
> there has been no robust specification and that the specification
> we have seen appears to be unimplementable in many environments.
> We've discussed why that is so, and I've not seen any refutation
> of those arguments.  A portable version of #pragma once, it would
> seem, would likely be nothing more than a hint, so that include
> guards are required anyway (rendering the #pragma redundant).

I think that the real problem is that no one has made a concrete
proposal, so it's hard to say whether it is portable or not.  I
think that even those in favor of #pragma once have acknowledged
that there are cases where it is difficult, if not impossible,
for the compiler to know when two includes refer to the same
physical file; my impression is that they are saying that such
cases aren't important in practice.

I'm willing to admit that a lot of them aren't.  But before
commenting further, I'd like to see the actual wording which
specifies which cases aren't supported.  I think it's harder to
specify than it would seem off hand.  The easy way out would be
to simply say that the compiler bases equality on the
[qh]-char-sequence, that if different [qh]-char-sequence refer
in fact to the same file, or that if the same [qh]-char-sequence
refers to different files, the behavior is undefined.
Regretfully, that would probably break a lot of code.  And while
it's nice to say that it usually won't matter, I think at the
very least, we'd have to define when it works, and when it might
not.

> My include guard names are generally created by my editor when
> I create a new header file.  I expect that any good programmers'
> editor should be able to do similarly.

It depends.  In some places I've worked, it is the source code
management system which generates the guards, before the editor
even gets the chance to see the file.

--
James Kanze (Gabi Software)            email: james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: Gennaro Prota <geunnaro_prouta@yahoo.com>
Date: Sun, 10 Dec 2006 15:08:17 CST
On Sun, 10 Dec 2006 11:05:02 CST, James Kanze wrote:

>I think that the real problem is that no one has made a concrete
>proposal

Or perhaps that we should all be more constructive.

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)






Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Mon, 11 Dec 2006 00:42:01 GMT
Robert Mabee wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>
>> Nonono. It reads and hashes file content, after which the hash is
>> persisted in the same directory as the file. While the file is older
>> than the hash, the hash is directly used. On second thought, that
>> requires write access to the directory, which is not too appealing.
>
>
> Getting close now.  A database anywhere of filenames, hashes, and dates
> will let the compiler avoid reading a file which matches a hash-date
> pair already encountered in this compilation.  The filename must be an
> absolute path but need not be canonical for all the possible filesystem
> gotchas.  If a filename does not match it might still need to be
> excluded so, only in that case, it must be read, hashed and entered in
> the database, and then either used or not used as compiler input
> according to the #pragma seen on a prior inclusion of the "same" file
> (by hash-date pair) in this compilation.
>
> The only cost of using two name variants for the same file is having
> both variants in the database, and hashing the file once for each name
> on the first reference to that name after a change to the file.  The
> only OS requirement is a dependable file version stamp, for which the
> change timestamp is a common proxy.  I trust there are no filesystems
> where "same" file can have different contents, such as optional <CR>
> characters before <NL>.
>
> The same database will also permit the compiler to believe cached copies
> of files despite name vagaries.  I don't suppose this will permit any
> significant precompiling, though, because the result varies with macros
> and include path.

I think this is eminently workable, and of real value. Thanks for
replying to my idle thoughts with a great idea!

But then again, some applications do need to include the same file
multiple times. How are they going to achieve that?


Andrei






Author: jdennett@acm.org (James Dennett)
Date: Mon, 11 Dec 2006 01:53:08 GMT
Gennaro Prota wrote:
> On Sun, 10 Dec 2006 11:05:02 CST, James Kanze wrote:
>
>> I think that the real problem is that no one has made a concrete
>> proposal
>
> Or perhaps that we should all be more constructive.

To me, James's point that you quote is a very constructive
one.  There's not been a formal proposal/specification, and
that's the key thing that's needed in order to make progress.
Those who feel that this is not a valuable use of time are
unlikely to volunteer for that, so it takes one of the
proponents of a standardized #pragma once to be willing to
do it, otherwise this is going nowhere.

If it's important enough to change the standard and require
all users to learn about this pragma, and all implementors
of C++ preprocessors to implement it, it seems likely that
it is important enough for one of its supporters to take the
lead in getting a proposal written up (obviously it would
be reasonable to ask others to help).

Based on what I've seen so far, I cannot think of a useful
formal definition of what #pragma once should do, and its
value seems smaller than its cost.  I'm always willing to
be shown to be wrong (so long as there's real evidence).

-- James






Author: Gennaro Prota <geunnaro_prouta@yahoo.com>
Date: Mon, 11 Dec 2006 10:06:31 CST
On Mon, 11 Dec 2006 01:53:08 GMT, James Dennett wrote:

>Gennaro Prota wrote:
>> On Sun, 10 Dec 2006 11:05:02 CST, James Kanze wrote:
>>
>>> I think that the real problem is that no one has made a concrete
>>> proposal
>>
>> Or perhaps that we should all be more constructive.
>
>To me, James's point that you quote is a very constructive
>one.

To me, after making some good points James is mainly saying that
there's no real issue because the editor automatically does everything
that needs to be done, then again correcting people who say the same by
saying that in some places it's actually the SCM system that provides
the guards and then again whatever else... That may be constructive to
someone but, to me, it's just spirit of contradiction. It looks like
another discussion on c.l.c++.m, where he said that my code could work,
only to reply afterwards that I was using the term "works" without
defining it...

>There's not been a formal proposal/specification, and
>that's the key thing that's needed in order to make progress.

That's what I wanted to write.

>[...]
>
>Based on what I've seen so far, I cannot think of a useful
>formal definition of what #pragma once should do, and its
>value seems smaller than its cost.

So what's the problem, let's just forget about it. I asked if there
was interest; answers could have been more explicit.

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)






Author: kuyper@wizard.net
Date: Mon, 11 Dec 2006 11:39:10 CST
Gennaro Prota wrote:
> On Mon, 11 Dec 2006 01:53:08 GMT, James Dennett wrote:
>
> >Gennaro Prota wrote:
> >> On Sun, 10 Dec 2006 11:05:02 CST, James Kanze wrote:
> >>

[Constructive statement:]
> >>> I think that the real problem is that no one has made a concrete
> >>> proposal

> >> Or perhaps that we should all be more constructive.
> >
> >To me, James's point that you quote is a very constructive
> >one.
>
> To me, after making some good points James is mainly saying that
> there's no real issue because the editor automatically does everything
> that needs to be done, then again correcting people who say the same by
> saying that in some places it's actually the SCM system that provides
> the guards and then again whatever else... That may be constructive to
> someone but, to me, it's just spirit of contradiction.

That's not the statement that James Dennett described as constructive.
See above.

.
> >Based on what I've seen so far, I cannot think of a useful
> >formal definition of what #pragma once should do, and its
> >value seems smaller than its cost.
>
> So what's the problem, let's just forget about it. I asked if there
> was interest; answers could have been more explicit.

No one has said "forget about it". They've said "make a proposal". The
proper response to James Dennett's statement that "I cannot think of a
useful formal definition of what #pragma once should do", is to present
him with something that you think is a useful formal definition of what
#pragma once should do.

---





Author: "ThosRTanner" <ttanner2@bloomberg.net>
Date: Mon, 11 Dec 2006 11:39:49 CST
Raw View
"Bo Persson" wrote:
> ThosRTanner wrote:
> >
> > I'm sure you do - we do too. But do you really have 2 different mounts
> > to the same server, which is what would be required to cause the
> > issue? I would really query a system set up like that.
>
> I have mounts to different department's disks. I have no idea where the
> files are stored physically. Could be on the same server, or a different
> one, or on some NAS device, or in another city. I don't know.
>
Who needs to know where the files are stored physically? It's
irrelevant. The point is that if a/b.h and f/g.h are actually the same
file, you are going to have an awful lot of trouble anyway. Who is
going to realise that altering b.h means that all the files using g.h
need rebuilding?

> >
> > Like I said - I don't expect the compiler to resolve that sort of
> > thing. And if you have 2 different hard paths to the same file,
> > frankly, you'll have nasty build and maintenance problems anyway.
>
> So I just add a #pragma once, and the compiler will sort it out for me?
> :-)
I don't think it's going to make your maintenance issues a lot worse.

> >
> > I'd still prefer a #pragma includemultiple - then the default
> > behaviour would be what everyone wants. The paranoid (or those with
> > IMHO insane network setups) can always add this to all their
> > headers, along with the include guards.
>
> That doesn't help either, if the compiler can't tell whether it is the same
> file, or not.
Well, as I said, if your system is set up badly, you can put it in
every header file. By default you wouldn't need to put anything in any
header file.

> Bo Persson

---





Author: usenet@marlowa.plus.com ("Andrew Marlow")
Date: Mon, 11 Dec 2006 17:42:23 GMT
Raw View
On Fri, 08 Dec 2006 17:07:49 +0000, Andrei Alexandrescu (See Website For
Email) wrote:
>>>The optimization based on recognizing it is based on recognizing
>>>that two includes actually include the same file.
>> Ok, but how can a "case of doubt" be reasonably identified?
> A compiler running on a not-too-smart filesystem could compute and cache
> on disk md5 checksums for each included file.

IMO it's a lot simpler than that. The compiler can remember each #include
statement it has seen. This won't catch all cases, since people can vary
what comes after the '#include' and still be referring to the same file,
but nonetheless it seems like a simple and useful optimisation to me...

--
There is an emerald here the size of a plover's egg!
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html

---





Author: brangdon@cix.co.uk (Dave Harris)
Date: Mon, 11 Dec 2006 17:43:37 GMT
Raw View
james.kanze@gmail.com (James Kanze) wrote (abridged):
> The easy way out would be to simply say that the compiler bases
> equality on the [qh]-char-sequence, that if different
> [qh]-char-sequence refer in fact to the same file, or that if the
> same [qh]-char-sequence refers to different files, the behavior
> is undefined. Regretfully, that would probably break a lot of code.

Indeed. My shop puts different library source in different directories.
The same header may be included as both:
     #include "header.h"
     #include "my_lib\header.h"

in the same compilation unit, if the first is found in a file in my_lib
and the second isn't.

-- Dave Harris, Nottingham, UK.

---





Author: "Greg Herlihy" <greghe@pacbell.net>
Date: Mon, 11 Dec 2006 11:42:28 CST
Raw View
James Dennett wrote:
> Gennaro Prota wrote:
> > On Sun, 10 Dec 2006 11:05:02 CST, James Kanze wrote:
> >
> >> I think that the real problem is that no one has made a concrete
> >> proposal
> >
> > Or perhaps that we should all be more constructive.
>
> To me, James's point that you quote is a very constructive
> one.  There's not been a formal proposal/specification, and
> that's the key thing that's needed in order to make progress.
> Those who feel that this is not a valuable use of time are
> unlikely to volunteer for that, so it takes one of the
> proponents of a standardized #pragma once to be willing to
> do it, otherwise this is going nowhere.
>
> If it's important enough to change the standard and require
> all users to learn about this pragma, and all implementors
> of C++ preprocessors to implement it, it seems likely that
> it is important enough for one of its supporters to take the
> lead in getting a proposal written up (obviously it would
> be reasonable to ask others to help).
>
> Based on what I've seen so far, I cannot think of a useful
> formal definition of what #pragma once should do, and its
> value seems smaller than its cost.  I'm always willing to
> be shown to be wrong (so long as there's real evidence).

A formal definition for "#pragma once" would be a paradigm of
simplicity. Certainly file system issues would present no
difficulties. The file system is largely irrelevant anyway because a
#pragma once works - not with files - but with the names of files as
they appear in an #include directive.

There are only two preprocessor requirements needed to support a
#pragma once. The first concerns the interpretation of the pragma
itself: the preprocessor would interpret #pragma once as an implicit
macro definition - the name of which would be determined as follows:

Start with the definition of the __FILE__ macro (i.e. the name of the
file that contains the #pragma once directive), unstringify the name
(i.e. strip the enclosing quotation marks), replace periods (and any
other character not allowed in an identifier) with underscores,
uppercase the name, and prepend an underscore (so the #pragma once
macro should not conflict with any user-declared macros).

For example, assume that the "MyHeader.h" header file contains:

      #pragma once

The compiler would interpret this #pragma like so:

      #define _MYHEADER_H 1
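
The naming steps above can be sketched as a small function. This is only
an illustration under stated assumptions: once_macro_name is a
hypothetical helper, not part of any existing preprocessor; its argument
is the value of __FILE__ with the quotation marks already stripped; and
"not allowed in an identifier" is taken to mean anything outside
[A-Za-z0-9_].

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derives the implicit macro name from a file name,
// following the steps described in the post.
std::string once_macro_name(const std::string& file_name)
{
    std::string macro = "_";  // prepend an underscore
    for (unsigned char c : file_name) {
        if (std::isalnum(c) || c == '_')
            macro += static_cast<char>(std::toupper(c));  // uppercase the name
        else
            macro += '_';  // '.', '/', '-' and the like become '_'
    }
    return macro;
}
```

Under these assumptions, once_macro_name("MyHeader.h") yields
"_MYHEADER_H", which matches the example above.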

The second requirement would affect how #include directives are
handled: in particular, for each #include directive encountered, the
preprocessor would first test whether the implicitly declared macro -
corresponding to the name of the file being included - has been
defined. If it has, then the preprocessor would skip the #include
directive - and the contents of the header file would not be inserted
into the current context.

A "#pragma once" would - at its most basic level - be header guards
done right. "Right" in terms of efficiency: because the preprocessor
would no longer have to load each included file, scan its contents,
only to verify that, yes, the matching #endif really does come at the
end. A #pragma once eliminates that entire series of steps.
Furthermore, managing (or analyzing) program dependencies becomes
easier when a portable way exists to test whether a particular header
file has been processed within the current context.

A #pragma once has an additional benefit which - while less tangible -
is nonetheless one that should not be discounted: and that benefit is,
what it would do for C++ itself. Frankly, the need for header guards is
simply an embarrassment. The embarrassment is not merely that textual
inclusion of source code is antiquated (it is), nor is it merely the
kludgy way that header guards cover an apparent shortcoming in the
language (they do). Rather the biggest embarrassment is the mindset
betrayed by their very existence. That mindset could be summed up like
this: computing time is expensive and a programmer's time is cheap (in
comparison). In other words, because computing cycles are too valuable
to be spent managing header files - the programmer has to do it.

Computing has changed since the days when leasing 2K of memory cost $12
a month. Nowadays it is the programmer's time that is expensive and the
computer's time that is cheap in comparison. So it no longer makes
economic sense for a programmer to be working so that the computer does
not have to. And although a #pragma once would probably add up to a
minuscule amount of processing time for a modern computer - in symbolic
terms, even a small acknowledgement in the C++ Standard that a
programmer's time might be just as - if not more - valuable than a
computer's - would be a welcome change indeed.

Greg

---





Author: "James Kanze" <james.kanze@gmail.com>
Date: Mon, 11 Dec 2006 11:42:34 CST
Raw View
James Dennett wrote:
> Gennaro Prota wrote:
> > On Sun, 10 Dec 2006 11:05:02 CST, James Kanze wrote:

> >> I think that the real problem is that no one has made a concrete
> >> proposal

> > Or perhaps that we should all be more constructive.

> To me, James's point that you quote is a very constructive
> one.  There's not been a formal proposal/specification, and
> that's the key thing that's needed in order to make progress.
> Those who feel that this is not a valuable use of time are
> unlikely to volunteer for that, so it takes one of the
> proponents of a standardized #pragma once to be willing to
> do it, otherwise this is going nowhere.

My point is that basically, I get the feeling that there is a
partial agreement, at least between myself and the proponents of
pragma once, at least with regards to two important points:

 1) that the concept of "same file" isn't workable, as such,
    since it cannot be implemented, and

 2) that the cases where the compiler cannot tell are probably
    exotic or perverse enough that we don't have to support
    them.

The problem is, of course, that if #pragma once doesn't mean
don't include the same file a second time, what does it mean?
Until we have some sort of definition as to what it really
means, all further discussion is fruitless.  And as you point
out, I don't feel any particular need for it, so I'm not going
to invest any time I don't have to---a more concrete
specification will have to come from the proponents.

> If it's important enough to change the standard and require
> all users to learn about this pragma, and all implementors
> of C++ preprocessors to implement it, it seems likely that
> it is important enough for one of its supporters to take the
> lead in getting a proposal written up (obviously it would
> be reasonable to ask others to help).

Even without a formal written proposal, for any further
discussion to be useful, we have to know what we're discussing.
(But of course, if it is to be adopted, at some point, someone
will have to write it up.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


---





Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Mon, 11 Dec 2006 18:40:11 GMT
Raw View
Andrew Marlow wrote:
> On Fri, 08 Dec 2006 17:07:49 +0000, Andrei Alexandrescu (See Website For
> Email) wrote:
>
>>>>The optimization based on recognizing it is based on recognizing
>>>>that two includes actually include the same file.
>>>
>>>Ok, but how can a "case of doubt" be reasonably identified?
>>
>>A compiler running on a not-too-smart filesystem could compute and cache
>>on disk md5 checksums for each included file.
>
>
> IMO it's a lot simpler than that. The compiler can remember each #include
> statement it has seen. This won't catch all cases, since people can vary
> what comes after the '#include' and still be referring to the same file,
> but nonetheless it seems like a simple and useful optimisation to me...

That can be done on some systems. It has been said repeatedly in this
thread that that's not enough. The md5 checksums database ensures that
#pragma once is implementable on _all_ systems.

The point is that the feature is reasonably easy to implement on any
system that has timestamps on files.


Andrei

---





Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Mon, 11 Dec 2006 18:59:10 GMT
Raw View
James Kanze wrote:
> My point is that basically, I get the feeling that there is a
> partial agreement, at least between myself and the proponents of
> pragma once, at least with regards to two important points:
>
>  1) that the concept of "same file" isn't workable, as such,
>     since it cannot be implemented, and

What happened to the md5 hash?

Andrei

---





Author: Robert Mabee <rmabee@comcast.net>
Date: Mon, 11 Dec 2006 13:16:41 CST
Raw View
Anders Dalvander wrote:
> Wouldn't a new #import <file.h> or #using <file.h> construct be better
> overall? Then the file doesn't need to be opened and scanned for
> #pragma once or other include guard constructs. It would also be a step
> toward modules in C++, and perhaps a way to get rid of the need to
> forward declare classes.

That is surely the right long-term solution.  Imported files would have
to be considered as file scope outside any scoping constructs that might
bracket the import statement (probably not a # preprocessor statement)
and considered idempotent so multiple imports would produce no error
message.  Probably they can't do all that and still contribute to the
preprocessor actions in the referencing file.

Textual inclusion could be deprecated but still available like other
preprocessor hackery for those cases where it really is worth inserting
source into the middle of some context, and of course needed forever for
legacy code.

However, this probably can't be worked out in time for the next
standard.  The proposal (to write a formal proposal) for #pragma once
could be a quick fix or it could be a trap that would remove much of the
impetus to do it right.

---





Author: Niklas Matthies <usenet-nospam@nmhq.net>
Date: Mon, 11 Dec 2006 13:35:46 CST
Raw View
On 2006-12-09 19:10, Robert Mabee wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Nonono. It reads and hashes file content, after which the hash is
>> persisted in the same directory as the file. While the file is older
>> than the hash, the hash is directly used. On second thought, that
>> requires write access to the directory, which is not too appealing.
>
> Getting close now.  A database anywhere of filenames, hashes, and
> dates will let the compiler avoid reading a file which matches a
> hash-date pair already encountered in this compilation.  The
> filename must be an absolute path but need not be canonical for all
> the possible filesystem gotchas.  If a filename does not match it
> might still need to be excluded so, only in that case, it must be
> read, hashed and entered in the database, and then either used or
> not used as compiler input according to the #pragma seen on a prior
> inclusion of the "same" file (by hash-date pair) in this
> compilation.

How do you ensure there are no collisions (i.e. two different files
with same hash and timestamp)?

-- Niklas Matthies

---





Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Mon, 11 Dec 2006 20:32:10 GMT
Raw View
Niklas Matthies wrote:
> On 2006-12-09 19:10, Robert Mabee wrote:
>
>>Andrei Alexandrescu (See Website For Email) wrote:
>>
>>>Nonono. It reads and hashes file content, after which the hash is
>>>persisted in the same directory as the file. While the file is older
>>>than the hash, the hash is directly used. On second thought, that
>>>requires write access to the directory, which is not too appealing.
>>
>>Getting close now.  A database anywhere of filenames, hashes, and
>>dates will let the compiler avoid reading a file which matches a
>>hash-date pair already encountered in this compilation.  The
>>filename must be an absolute path but need not be canonical for all
>>the possible filesystem gotchas.  If a filename does not match it
>>might still need to be excluded so, only in that case, it must be
>>read, hashed and entered in the database, and then either used or
>>not used as compiler input according to the #pragma seen on a prior
>>inclusion of the "same" file (by hash-date pair) in this
>>compilation.
>
>
> How do you ensure there are no collisions (i.e. two different files
> with same hash and timestamp)?

The probability of two files having identical timestamps, identical
length, identical md5 hashes, while they are actually different is
exceedingly low, save for concerted attacks (see
http://www.mscs.dal.ca/~selinger/md5collision/).

But anyhow, I know some people never get convinced by such arguments
:o). For files with distinct names and identical hash, the system can
compare them and record whether they are identical for that timestamp.
Until both of the files have been modified, there is no need to run the
comparison again.


Andrei

---





Author: kuyper@wizard.net
Date: Mon, 11 Dec 2006 16:14:28 CST
Raw View
Greg Herlihy wrote:
.
> Start with the definition of the __FILE__ macro (i.e. the name of the
> file that contains the #pragma once directive), unstringify the name
> (i.e. strip the enclosing quotation marks), replace periods (and any
> other character not allowed in an identifier) with underscores,
> uppercase the name, and prepend an underscore (so the #pragma once
> macro should not conflict with any user-declared macros).

There are a couple of problems:
Two different files might be mapped to the same macro name by that
algorithm.

This idea intrudes upon a very large portion of the identifier name
space that used to be reserved to implementations. The macro
corresponding to a given file name might be one the implementation is
already using for some other purpose. It's trivial to construct
filenames where the corresponding macro name is the same as one of the
standard-defined macros, such as "_stdc._", though those are odd enough
that they're not likely to come up in practice. The ones that conflict
with implementation-defined macros are a much bigger problem.
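
Both problems are easy to reproduce. The sketch below assumes the
name-mangling steps quoted above and reads "not allowed in an identifier"
as anything outside [A-Za-z0-9_]; mangle and collide are hypothetical
names used only for this illustration.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: uppercase the name, replace non-identifier
// characters with '_', and prepend '_'.
std::string mangle(const std::string& file_name)
{
    std::string macro = "_";
    for (unsigned char c : file_name)
        macro += (std::isalnum(c) || c == '_')
                     ? static_cast<char>(std::toupper(c))
                     : '_';
    return macro;
}

// True when two distinct file names would share one implicit macro.
bool collide(const std::string& a, const std::string& b)
{
    return a != b && mangle(a) == mangle(b);
}
```

With these definitions, "my-lib.h" and "my_lib.h" collide, as do
"MyHeader.h" and "myheader.h", and mangle("_stdc._") produces "__STDC__".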

All you're really trying to do is prevent the #inclusion of a new file
with the same value for __FILE__  as the value when the #pragma once
directive was processed. The macros are just a kludge to keep track of
that name. Why not simply define the behavior directly in terms of the
file name? Let the implementation worry about how to keep track of it.

Of course, any scheme that uses the filename alone to identify which
files are the same provides less protection than header guards do. With
header guards, if the same file can be found in the #include search
path by two different names, it is still protected against double
inclusion, which is not the case with filename based approaches.
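
A minimal sketch of tracking the file name directly, with its limitation,
might look like this. OnceByName and its member names are hypothetical;
the class only stands for whatever bookkeeping an implementation chooses.

```cpp
#include <set>
#include <string>

// Hypothetical bookkeeping keyed purely on the spelling of the file name.
class OnceByName {
public:
    // Called when "#pragma once" is seen while processing file_name.
    void mark_once(const std::string& file_name) { once_.insert(file_name); }

    // Called for each #include; true means the directive can be skipped.
    bool should_skip(const std::string& file_name) const
    {
        return once_.count(file_name) != 0;
    }

private:
    std::set<std::string> once_;
};
```

After mark_once("a.h"), should_skip("a.h") is true but
should_skip("mylib/a.h") is false, even if both names reach the same file.
That is the case where header guards still protect and a purely
name-based scheme does not.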

---





Author: geunnaro_prouta@yahoo.com (Gennaro Prota)
Date: Mon, 11 Dec 2006 23:32:30 GMT
Raw View
On Mon, 11 Dec 2006 20:32:10 GMT, "Andrei Alexandrescu (See Website
For Email)" wrote:

>The probability of two files having identical timestamps, identical
>length, identical md5 hashes, while they are actually different is
>exceedingly low, save for concerted attacks (see
>http://www.mscs.dal.ca/~selinger/md5collision/).

Even lower for SHA-2 functions, but still... The point is, I think,
being able to take an appropriate decision even in case of collision.
That the collision is unlikely is a welcome property but we shouldn't
rely on it being impossible.

>But anyhow, I know some people never get convinced by such arguments
>:o).

Cough... :-) Let me see if I understood: you are compiling A.cpp, when
you encounter (scattered somewhere in the inclusion tree)

  #include "a.h"
  #include "mylib/a.h"

in that order. Now, if when reading the second included file you
detect that both the hash (and the date? see below) are the same as
for "a.h" (or any other files previously added to the "database") you
have to go back and read a.h for the comparison. Going back would
require knowing the path of the file *as determined (in an
implementation-defined manner) when the #include "a.h" was executed*,
but we kept that complete path in the database (right?). So you use
that path and just compare the two files; if they are different you go
with the normal processing (#inclusion, insertion into the database),
if they compare equal you do nothing.

Included files with no #once would never be added to the database. The
database is relative to the translation unit. Right? Does anyone see a
flaw in this?
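
The "just compare the two files" step could be sketched as follows. This
assumes the two complete paths are already known from the database;
same_content is a hypothetical name, and treating an unreadable file as
"different" is a choice made for this sketch, not something settled in
the thread.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical fallback used only when two paths report the same hash:
// returns true when both files can be opened and hold identical bytes.
bool same_content(const std::string& path_a, const std::string& path_b)
{
    std::ifstream a(path_a, std::ios::binary);
    std::ifstream b(path_b, std::ios::binary);
    if (!a || !b)
        return false;  // unreadable files are treated as "not the same"

    std::istreambuf_iterator<char> a_it(a), b_it(b), end;
    // The four-iterator form (C++14) also fails on a length mismatch.
    return std::equal(a_it, end, b_it, end);
}
```

If same_content returns true the second #include is skipped; otherwise
normal processing continues and the new file is entered in the database.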

(A prerequisite of this is that a complete path always corresponds to
the same file or, better, to the same content. In all this, anyway, I
think I didn't get the absolute need for timestamps, which means I'm
missing something --note that two files might have identical content
and timestamp, while still being distinct as filesystem entities)

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)

---





Author: geunnaro_prouta@yahoo.com (Gennaro Prota)
Date: Mon, 11 Dec 2006 23:39:42 GMT
Raw View
On Mon, 11 Dec 2006 00:42:01 GMT, "Andrei Alexandrescu (See Website
For Email)" wrote:

>I think this is eminently workable, and of real value. Thanks for
>replying to my idle thoughts with a great idea!
>
>But then again, some applications do need to include the same file
>multiple times. How are they going to achieve that?

If I'm following all this they would just omit (include guards and)
#once, no?

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)

---





Author: SeeWebsiteForEmail@erdani.org ("Andrei Alexandrescu (See Website For Email)")
Date: Tue, 12 Dec 2006 02:51:11 GMT
Raw View
Gennaro Prota wrote:
> On Mon, 11 Dec 2006 20:32:10 GMT, "Andrei Alexandrescu (See Website
> For Email)" wrote:
>> The probability of two files having identical timestamps, identical
>> length, identical md5 hashes, while they are actually different is
>> exceedingly low, save for concerted attacks (see
>> http://www.mscs.dal.ca/~selinger/md5collision/).
>
> Even lower for SHA-2 functions, but still... The point is, I think,
> being able to take an appropriate decision even in case of collision.
> That the collision is unlikely is a welcome property but we shouldn't
> rely on it being impossible.

If the collision probability is lower than the probability of file read
error using a deterministic comparison function, then...

>  Let me see if I understood: you are compiling A.cpp, when
> you encounter (scattered somewhere in the inclusion tree)
>
>   #include "a.h"
>   #include "mylib/a.h"
>
> in that order. Now, if when reading the second included file you
> detect that both the hash (and the date? see below) are the same as
> for "a.h" (or any other files previously added to the "database") you
> have to go back and read a.h for the comparison. Going back would
> require knowing the path of the file *as determined (in an
> implementation-defined manner) when the #include "a.h" was executed*,
> but we kept that complete path in the database (right?). So you use
> that path and just compare the two files; if they are different you go
> with the normal processing (#inclusion, insertion into the database),
> if they compare equal you do nothing.
>
> Included files with no #once would never be added to the database. The
> database is relative to the translation unit. Right? Does anyone see a
> flaw in this?
>
> (A prerequisite of this is that a complete path always corresponds to
> the same file or, better, to the same content. In all this, anyway, I
> think I didn't get the absolute need for timestamps, which means I'm
> missing something --note that two files might have identical content
> and timestamp, while still being distinct as filesystem entities)

The timestamp is useful in that you save the comparison result together
with a pair of timestamps. "Files a.h and mylib/a.h were identical when
a.h had timestamp xxxx and mylib/a.h had timestamp yyyy." As soon as
either file gets modified, that invalidates the comparison result.
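
A rough sketch of that bookkeeping, with the timestamps passed in by the
caller so the example stays independent of any particular filesystem API.
Timestamp, Verdict and ComparisonCache are hypothetical names introduced
only for this illustration.

```cpp
#include <map>
#include <string>
#include <utility>

// Stand-in for a file modification time (an assumption for this sketch).
using Timestamp = long long;

// "a and b compared equal (or not) when they had these timestamps."
struct Verdict {
    Timestamp time_a;
    Timestamp time_b;
    bool identical;
};

// Hypothetical cache of byte-for-byte comparison results.
class ComparisonCache {
public:
    void store(const std::string& a, Timestamp time_a,
               const std::string& b, Timestamp time_b, bool identical)
    {
        results_[std::make_pair(a, b)] = Verdict{time_a, time_b, identical};
    }

    // Returns true and fills `identical` only when a result was recorded
    // for exactly these timestamps; otherwise the caller compares again.
    bool lookup(const std::string& a, Timestamp time_a,
                const std::string& b, Timestamp time_b,
                bool& identical) const
    {
        auto it = results_.find(std::make_pair(a, b));
        if (it == results_.end())
            return false;
        const Verdict& v = it->second;
        if (v.time_a != time_a || v.time_b != time_b)
            return false;  // either file was modified: the result is stale
        identical = v.identical;
        return true;
    }

private:
    std::map<std::pair<std::string, std::string>, Verdict> results_;
};
```

A lookup that fails simply means the two files have to be compared again
and the new result stored with the current timestamps.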


Andrei

---





Author: Bjørn Roald <bjorn@4roald.org>
Date: Mon, 11 Dec 2006 23:08:33 CST
Raw View
Andrei Alexandrescu (See Website For Email) skrev:
> James Kanze wrote:
>> My point is that basically, I get the feeling that there is a
>> partial agreement, at least between myself and the proponents of
>> pragma once, at least with regards to two important points:
>>
>>  1) that the concept of "same file" isn't workable, as such,
>>     since it cannot be implemented, and
>
> What happened to the md5 hash?

I think use of a hash can be helpful in resolving the cases where we
*must* do the extra check.  But before this is used the simpler *safe*
assumptions should be exploited.  So, what are the useful *safe*
assumptions?  I think the most useful would be that any line of the form

#include <file_a.h>

would cause the compiler to always see the same file no matter what file
system we deal with.  Likewise for the #include "file_b.h" form.

I think the possible use of macros like

#include SOME_MAGIC_MACRO

can be ignored as we can let the above logic apply after the macro
expansion, which should render one of the normal #include forms.

Ok, if this assumption can be used - what does that give us?

Basically we can keep useful information associated with strings of
the <filename> or "filename" forms. If the compiler has included a file
by use of the <file_a.h> string, it can keep useful information it can
later utilize.  The pseudocode below is an attempt to show the logic that
may be involved during preprocessing.

// map from include_string to bool indicating
// whether the file contains a once pattern
map<string,bool> has_once_pattern;

// hashes of included files with a once pattern
set<md5_hash> once_pattern_files;


void handle_include_directive(string include_string)
{
   // called for each included file

   pp_file_data ppd;
   if( has_once_pattern.count(include_string) > 0 )
   {
      // ok, we know the file has been seen before
      if( has_once_pattern[include_string] )
      {
         return;                              // done
      }
      else
      {
         pp_file_read(ppd, include_string);   // we know we must read
         pp_include(ppd);                     // and use
      }
   }
   else
   {
      // include_string not seen before
      pp_file_read(ppd, include_string);      // must read
      has_once_pattern[include_string] = ppd.found_once_pattern;
      if( ppd.found_once_pattern )
      {
         if( once_pattern_files.count(ppd.md5_hash) == 0 )
         {
            // file content not seen before
            once_pattern_files.insert(ppd.md5_hash);
            pp_include(ppd);                  // use it
         }
      }
      else
      {
         pp_include(ppd);                     // no once, we use it
      }
   }
}
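For anyone who wants to play with the logic, here is a compilable sketch of the same idea. It is not a real preprocessor: the file system is faked with an in-memory map, the "once pattern" is assumed to be a literal `#pragma once` in the content, and `std::hash<std::string>` stands in for MD5. The names `FakeFs`, `Preprocessor` and `included` are made up for the illustration.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the file system: include string -> file content.
using FakeFs = std::map<std::string, std::string>;

struct Preprocessor {
    const FakeFs& fs;
    std::map<std::string, bool> has_once_pattern;  // include string -> once pattern found?
    std::set<std::size_t> once_pattern_files;      // content hashes of once-files already used
    std::vector<std::string> included;             // include strings whose content was used

    explicit Preprocessor(const FakeFs& files) : fs(files) {}

    void handle_include_directive(const std::string& include_string) {
        auto seen = has_once_pattern.find(include_string);
        if (seen != has_once_pattern.end() && seen->second)
            return;  // known once-file: no need to read it again

        const std::string& content = fs.at(include_string);  // "read" the file
        if (seen == has_once_pattern.end()) {
            // First time this include string is seen: classify the file.
            bool once = content.find("#pragma once") != std::string::npos;
            has_once_pattern[include_string] = once;
            if (once) {
                // std::hash is only a placeholder for a real content hash.
                std::size_t h = std::hash<std::string>{}(content);
                if (!once_pattern_files.insert(h).second)
                    return;  // same content already included under another name
            }
        }
        included.push_back(include_string);  // stands in for pp_include
    }
};
```

Including `a.h` twice, then an identical copy under another name, then a file without the pattern twice, would leave `included` holding `a.h` once and the pattern-free file twice.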


As I see it, there are two goals of #pragma once and other similar
proposals.

1. simplicity in use, with fewer errors as an important positive side effect

2. optimization, mainly of the programmer's time - which
often also involves waiting for the compiler to complete

I think both of these are achievable and worth the effort.  I do not
wish to be seen as a proponent of any of the forms:

#once
#pragma once
#pragma STDC ONCE

as I do not like any of them; they are ugly - even if they are not as
bad as the traditional guards.  I would rather like it to be the default and
see support for:

#pragma STDC MORE_THAN_ONCE

or whatever for those who really need it.

-----
Bjørn

---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html                      ]





Author: bjorn@4roald.org (=?ISO-8859-1?Q?Bj=F8rn_Roald?=)
Date: Tue, 12 Dec 2006 07:23:26 GMT
Raw View
Andrei Alexandrescu (See Website For Email) skrev:
> Gennaro Prota wrote:
>> On Mon, 11 Dec 2006 20:32:10 GMT, "Andrei Alexandrescu (See Website
>> For Email)" wrote:
>>> The probability of two files having identical timestamps, identical
>>> length, identical md5 hashes, while they are actually different is
>>> exceedingly low, save for concerted attacks (see
>>> http://www.mscs.dal.ca/~selinger/md5collision/).
>>
>> Even lower for SHA-2 functions, but still... The point is, I think,
>> being able to take an appropriate decision even in case of collision.
>> That the collision is unlikely is a welcome property but we shouldn't
>> rely on it being impossible.
>
> If the collision probability is lower than the probability of file read
> error using a deterministic comparison function, then...

yes, it is simply an assessment of the risk involved (likelihood *
consequence), and a decision on whether that is a chance we take.  If
it approaches the likelihood that my hard disk evaporates as the result
of a meteor hit, I can bear it for most of the stuff I will ever compile.
In any case, I think the hash collision on many systems can be made
more unlikely using additional low cost, possibly system dependent,
features of the file system.  In a safe and slow mode, collisions will
always be detectable using file compare - unless that fails as well or
the meteor strikes. So basically if you do not accept any risk, you
probably should not write code at all.
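To put a rough number on the "likelihood" half of that product, the usual birthday approximation can be evaluated directly. This is a sketch under the assumption that the hash behaves like a uniform random function (which says nothing about deliberately constructed collisions); the function name is made up for the illustration.

```cpp
#include <cmath>

// Approximate probability that at least two of n distinct files share a
// `bits`-bit hash value, assuming a uniformly distributed hash.
// Birthday approximation: p ~ n(n-1)/2 / 2^bits, valid while p is small.
double collision_probability(double n, int bits) {
    return std::ldexp(n * (n - 1.0) / 2.0, -bits);
}
```

For a million distinct headers and a 128-bit hash this comes out on the order of 1e-27, far below any realistic rate of disk read errors.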

So for the sake of it, compare all this with the risk of accidental use
of the same include guard in multiple include files.  I have encountered
this one on more than one occasion.  We all seem to have accepted it
as a risk we have to live with.

have a risky day
Bjørn






Author: "Greg Herlihy" <greghe@pacbell.net>
Date: Tue, 12 Dec 2006 01:20:56 CST
Raw View
kuyper@wizard.net wrote:
> Greg Herlihy wrote:
> .
> > Start with the definition of the __FILE__ macro (i.e the name of the
> > file that contains the #pragma once directive), unstringify the name
> > (i.e. strip the enclosing quotation marks), replace periods (and any
> > other character not allowed in an identifier) with underscores,
> > uppercase the name, and prepend an underscore (so the #pragma once
> > macro should not conflict with any user-declared macros).
>
> There are a couple of  problems:
> Two different files might be mapped to the same macro name by that
> algorithm.

If the two files have the same name then they would in fact have the
same implicitly-defined macro name - so a #once directive (I agree that
the #pragma should be dropped) in one of the files would inhibit the
subsequent inclusion of the other, within a single translation unit.
This behavior would be quite deliberate. In fact, one of the uses for a
#once directive that I can anticipate would be to aid programmers in
detecting these kinds of header name collisions in order that they can
be addressed as soon as they have been detected.

> This idea intrudes upon a very large portion of the identifier name
> space that used to be reserved to implementations. The macro
> corresponding to a given file name might be one the implementation is
> already using for some other purpose. It's trivial to construct
> filenames where the corresponding macro name is the same as one of the
> standard-defined macros, such as "_stdc._", though those are odd enough
> that they're not likely to come up in practice. The ones that conflict
> with implementation-defined macros are a much bigger problem.

I don't see how searching and replacing for macro names in a set of
Standard header files is likely to prove all that formidable a task for
most C++ implementors. After all, one has to assume that a C++
implementor is likely to have at least a passing acquaintance with
regular expressions - especially after having implemented a complete
regular expression library for the Standard.

And although the burden that any change to the C++ language would have
on implementors should always be taken into account when considering
the feature - it is really the benefit to user programs that matters
above all else. The reason why the C++ Standard reserves certain types
of names to the implementation is precisely to have the option of
adding a feature like a #once directive and not run the risk of
breaking current user programs. So in a sense an implementor is
accustomed to bearing the brunt of keeping up with the Standard as it
evolves. An implementor's role is in essence to move the language
forward while making sure that no one is left behind.

> with the same value for __FILE__  as the value when the #pragma once
> directive was processed. The macros are just a kludge to keep track of
> that name. Why not simply define the behavior directly in terms of the
> file name? Let the implementation worry about how to keep track of it.

The proposal is for the preprocessor to test for the same
implicitly-defined macro that a #once directive appearing in the header
about to be included would define. Otherwise there would be little
point in making the comparison since there would be no possibility of
finding a match. In other words:

    #include <headers/MyHeader.h>
    #include <MyHeader.h>

would both test for the same macro, since the __FILE__
macro in a header called "MyHeader.h" would always be defined in the
same way.
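The name derivation described in the quoted proposal can be sketched as follows. This assumes the input is the bare file name (no directory part), which is not what every implementation puts in __FILE__; the function name is made up for the illustration.

```cpp
#include <cctype>
#include <string>

// Derive the implicit macro name: replace every character that is not
// allowed in an identifier with '_', uppercase the rest, prepend '_'.
std::string once_macro_name(const std::string& file_name) {
    std::string macro = "_";
    for (unsigned char c : file_name) {
        if (std::isalnum(c))
            macro += static_cast<char>(std::toupper(c));
        else
            macro += '_';
    }
    return macro;
}
```

With this mapping "MyHeader.h" gives `_MYHEADER_H`, the distinct names "my.header.h" and "my_header.h" both give `_MY_HEADER_H`, and the file name "_stdc._" mentioned in the objection gives `__STDC__`.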

Now it is important to note that if someone mismanages their header
files to such an extent that they wind up including two header files
with the same name within the same translation unit, then a #once
directive will not be coming to their rescue. For better or for worse,
using a #once directive will not spare those who carelessly or poorly
manage their program's dependencies - from the consequences of their
actions. A #once directive would prove useful in straightening out
programs whose header files are in a mess - and would also help a
program with well-maintained dependencies to remain that way. But
unless the programmer is on board or "with the program" (so to speak)
and maintains a program's dependencies at least semi-conscientiously,
then a #once directive is unlikely to be of much help to a programmer.

Even today choosing a header file name should always be done carefully.
A user header file whose name collides with the name of a Standard
header leads to undefined behavior. And I see little reason why other
header name collisions would be likely to turn out any better. In fact,
the only way even to differentiate between two headers with the same
name is implementation-defined (assuming it is even possible) and
therefore not a portable practice in the first place. So rather than
condone non-portable, error-prone and potentially dangerous practices
when managing dependencies, the purpose of the #once
directive as specified is instead directed toward eliminating such
practices.

> Of course, any scheme that uses the filename alone to identify which
> files are the same provides less protection than header guards do. With
> header guards, if the same file can be found in the #include search
> path by two different names, it is still protected against double
> inclusion, which is not the case with filename based approaches.

Only if the name of the included file changes while the translation
unit is being compiled. And while one should always be prepared for
anything, I'm not sure that "anything" is broad enough a term to cover
this concern.

Greg






Author: Pete Becker <pete@versatilecoding.com>
Date: Tue, 5 Dec 2006 15:05:30 CST
Raw View
Gennaro Prota wrote:
>
> given that there will be much rewording and expansion in clause 16 of
> the standard to bring it in synch with C99

Those changes have already been made. See
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1653.htm.

> I was thinking to submit a
> proposal for a "pragma once" directive, in the form
>
>   #pragma STDC ONCE
>
> I'd like to know, however, if committee members here feel that the
> proposal will have a concrete chance to be considered.
>

I'm under the impression that everyone who has implemented this has come
to the conclusion that it's not worth the bother. But aside from
technical merits, the deadline for new submissions for C++0x has passed,
so I doubt that a new proposal would get much consideration.

For an overview of the status of the various extension proposals for
C++0x see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2122.htm.

--

 -- Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com)
Author of "The Standard C++ Library Extensions: a Tutorial and
Reference." (www.petebecker.com/tr1book)






Author: Gennaro Prota <geunnaro_prouta@yahoo.com>
Date: Tue, 5 Dec 2006 16:18:34 CST
Raw View
On Tue,  5 Dec 2006 15:05:30 CST, Pete Becker wrote:

>I'm under the impression that everyone who has implemented this has come
>to the conclusion that it's not worth the bother. But aside from
>technical merits, the deadline for new submissions for C++0x has passed,
>so I doubt that a new proposal would get much consideration.

The advantage is for the user, as it relieves them from inventing a name
for the include guard macro. Also it is so simple to specify and
implement... Perhaps for C++1x, then.

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)






Author: pete@versatilecoding.com (Pete Becker)
Date: Tue, 5 Dec 2006 22:59:35 GMT
Raw View
Gennaro Prota wrote:
>
> The advantage is for the user, as it relieves from inventing a name
> for the include guard macro.  Also it is so simple to specify and
> implement... Perhaps for C++1x, then.
>

It is not simple to specify, nor is it simple to implement (if it were,
the simplistic analysis on Wikipedia wouldn't have a list of compilers
that got it wrong). Remember, an included file can be out on a network
somewhere, on an unknown file system, accessible through multiple paths,
with different names.

--

 -- Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com)
Author of "The Standard C++ Library Extensions: a Tutorial and
Reference." (www.petebecker.com/tr1book)






Author: geunnaro_prouta@yahoo.com (Gennaro Prota)
Date: Wed, 6 Dec 2006 00:22:15 GMT
Raw View
On Tue,  5 Dec 2006 22:59:35 GMT, Pete Becker wrote:

>Gennaro Prota wrote:
>>
>> The advantage is for the user, as it relieves from inventing a name
>> for the include guard macro.  Also it is so simple to specify and
>> implement... Perhaps for C++1x, then.
>>
>
>It is not simple to specify, nor is it simple to implement (if it were,
>the simplistic analysis on Wikipedia wouldn't have a list of compilers
>that got it wrong).

Hmm, could you please provide a link? I can't find any article with
such a list.

>Remember, an included file can be out on a network
>somewhere, on an unknown file system, accessible through multiple paths,
>with different names.

I know. But the EDG front-end, and gcc, claim to recognize the include
guard idiom. How can they implement it correctly?

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)






Author: "peter koch" <peter.koch.larsen@gmail.com>
Date: Tue, 5 Dec 2006 22:52:31 CST
Raw View
Gennaro Prota skrev:
> Hi,
>
> given that there will be much rewording and expansion in clause 16 of
> the standard to bring it in synch with C99 I was thinking to submit a
> proposal for a "pragma once" directive, in the form
>
>   #pragma STDC ONCE
>
> I'd like to know, however, if committee members here feel that the
> proposal will have a concrete chance to be considered.
>
> --
> Gennaro Prota.    C++ developer. For hire.
> (to mail me, remove any 'u' from the address)
>
I don't really see the purpose of that pragma in the first place, but
as you are doubtlessly more knowledgeable than me in this area, let's
ignore that and assume that pragma once is useful.
In that case, why not simply use the de facto standard format - #pragma
once. Looks nicer that way, I think.

/Peter






Author: Seungbeom Kim <musiphil@bawi.org>
Date: Wed, 6 Dec 2006 00:46:16 CST
Raw View
Gennaro Prota wrote:
> On Tue,  5 Dec 2006 15:05:30 CST, Pete Becker wrote:
>
>> I'm under the impression that everyone who has implemented this has come
>> to the conclusion that it's not worth the bother. But aside from
>> technical merits, the deadline for new submissions for C++0x has passed,
>> so I doubt that a new proposal would get much consideration.
>
> The advantage is for the user, as it relieves from inventing a name
> for the include guard macro. Also it is so simple to specify and
> implement... Perhaps for C++1x, then.

Even if it were admitted into C++0x or C++1x, you would still have to
write the include guards yourself anyway in any serious program to
support old compilers, wouldn't you? And you couldn't argue otherwise
because supporting old compilers is trivially easy (3 more lines).

Then I don't see much benefit in such a change. Maybe it's way easier
just to write a program that generates the include guards.

$ genhdr headername.h
$ cat headername.h
#ifndef HEADERNAME_H_INCLUDED
#define HEADERNAME_H_INCLUDED

#endif
$ _

Or there probably are ones already. It's just that writing one myself or
even searching for a good one costs more than writing out the include
guard when I need it. :)
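A generator in the spirit of the hypothetical genhdr above could be as small as this. It is a sketch only: the function name is made up, and the `_INCLUDED` suffix simply follows the session shown.

```cpp
#include <cctype>
#include <string>

// Build the include-guard skeleton for a header name, e.g.
// "headername.h" -> guard macro HEADERNAME_H_INCLUDED.
std::string guard_skeleton(const std::string& header_name) {
    std::string guard;
    for (unsigned char c : header_name)
        guard += std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_';
    guard += "_INCLUDED";
    return "#ifndef " + guard + "\n#define " + guard + "\n\n#endif\n";
}
```

Writing the returned text to a new file reproduces the `cat headername.h` output shown in the session.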

--
Seungbeom Kim






Author: "Nevin :-] Liber" <nevin@eviloverlord.com>
Date: Wed, 6 Dec 2006 00:46:40 CST
Raw View
In article <5d2cn2tvg6te1r2h5otocpmsec5e9gl2sk@4ax.com>,
 geunnaro_prouta@yahoo.com (Gennaro Prota) wrote:

> I know. But the EDG front-end, and gcc, claim to recognize the include
> guard idiom. How can they implement it correctly?

The include guard idiom is based on macro symbols, not on whether or not
two files are the same file.  Information about the latter must be
provided by the operating system.

--
 Nevin ":-)" Liber  <mailto:nevin@eviloverlord.com>






Author: kuyper@wizard.net
Date: Wed, 6 Dec 2006 09:37:42 CST
Raw View
peter koch wrote:
> Gennaro Prota skrev:
> > Hi,
> >
> > given that there will be much rewording and expansion in clause 16 of
> > the standard to bring it in synch with C99 I was thinking to submit a
> > proposal for a "pragma once" directive, in the form
> >
> >   #pragma STDC ONCE
.
> In that case, why not simply use the de facto standard format - #pragma
> once. Looks nicer that way, I think.

Because C99, which was the first C standard to provide pragmas with
standard-defined meanings, reserved pragma names starting with "STDC"
for that purpose. I gather that this idea is to be adopted in the next
C++ standard, too. Giving "#pragma once" a standard-defined meaning
would intrude on a name space that has been reserved for implementors.






Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 6 Dec 2006 10:07:36 CST
Raw View
Seungbeom Kim wrote:
> Gennaro Prota wrote:
> > On Tue,  5 Dec 2006 15:05:30 CST, Pete Becker wrote:

> >> I'm under the impression that everyone who has implemented this has come
> >> to the conclusion that it's not worth the bother. But aside from
> >> technical merits, the deadline for new submissions for C++0x has passed,
> >> so I doubt that a new proposal would get much consideration.

> > The advantage is for the user, as it relieves from inventing a name
> > for the include guard macro. Also it is so simple to specify and
> > implement... Perhaps for C++1x, then.

> Even if it were admitted into C++0x or C++1x, you would still have to
> write the include guards yourself anyway in any serious program to
> support old compilers, wouldn't you? And you couldn't argue otherwise
> because supporting old compilers is trivially easy (3 more lines).

> Then I don't see much benefit in such a change. Maybe it's way easier
> just to write a program that generates the include guards.

> $ genhdr headername.h
> $ cat headername.h
> #ifndef HEADERNAME_H_INCLUDED
> #define HEADERNAME_H_INCLUDED

> #endif
> $ _

> Or there probably are ones already. It's just that writing one myself or
> even searching for a good one costs more than writing out the include
> guard when I need it. :)

Does anyone use an editor which doesn't do this (and insert
other boilerplate text as well) whenever you open a file which
doesn't exist and whose name ends in .h, .hh, .hpp, etc.?  It
seems like a standard feature, present in all of the editors I
use (vim, emacs) except the original vi.  Of course, you have to
configure it to do so, but I would expect this to be the case in
any normal development environment---you also want boilerplate
copyright notices, and who knows what all else.  (I've found it
useful to put

    //  Local Variables:                    --- for emacs
    //  mode: c++                           --- for emacs
    //  tab-width: 8                        --- for emacs
    //  End:                                --- for emacs
    //  vim: set ts=8 sw=4 filetype=cpp:    --- for vim

at the end of all of my sources, for example.  And you should
see what gets generated if the filename ends with .html:-).)

If you don't know how to do this with your current editor, ask
in the appropriate forum for that editor.  You'll typically find
a package already available which does most of the work.  (I
forget what I used with emacs, but I know that I didn't have to
write a single line of elisp:-).)

Of course, in a lot of cases, your source code management system
will take care of these details, so you don't have to bother
configuring the editor.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: geunnaro_prouta@yahoo.com (Gennaro Prota)
Date: Wed, 6 Dec 2006 17:42:44 GMT
Raw View
On Tue,  5 Dec 2006 22:52:31 CST, peter koch wrote:

>> [ #pragma STDC ONCE ]
>>
>> I'd like to know, however, if committee members here feel that the
>> proposal will have a concrete chance to be considered.
>>
>I don't really see the purpose of that pragma in the first place, but
>as you are doubtlessly more knowledgeable than me in this area,

Not at all :-)

>lets ignore that and assume that pragma once is useful.
>In that case, why not simply use the de facto standard format - #pragma
>once. Looks nicer that way, I think.

My favourite form would actually be:

 #once

There's no need, I think, to bring "pragma" into it. As I said, the
usefulness of the construct lies in the fact that it doesn't require
any user-defined identifier. (In retrospect, I guess the include once
behavior should have been the default, with a directive to request
otherwise, but that's (perhaps) too late to change, especially now
that many "recursive inclusion" techniques have been popularized by
the boost preprocessor library)

The names of my controlling macros usually encode the project name,
the filename, the developer's name and the date of creation of the
file, and are automatically generated, but still that's not enough to
avoid any reasonable conflict.

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)






Author: "James Kanze" <james.kanze@gmail.com>
Date: Wed, 6 Dec 2006 11:45:50 CST
Raw View
Nevin :-] Liber wrote:
> In article <5d2cn2tvg6te1r2h5otocpmsec5e9gl2sk@4ax.com>,
>  geunnaro_prouta@yahoo.com (Gennaro Prota) wrote:

> > I know. But the EDG front-end, and gcc, claim to recognize the include
> > guard idiom. How can they implement it correctly?

> The include guard idiom is based on macro symbols, not on whether or not
> two files are the same file.  Information about the latter must be
> provided by the operating system.

The optimization based on recognizing it is based on recognizing
that two includes actually include the same file.  And the
reason that it can be implemented correctly is that it isn't an
error if they do happen to read the same file twice; it just
results in a slower compile time.  So they can take a
conservative strategy, and in case of doubt, skip the
optimization.  This would not be the case in the case of #pragma
once.
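The conservative strategy can be sketched like this. It is a toy model, not how any particular compiler does it: the cache is keyed on the path exactly as spelled, and `GuardCache`, `remember` and `can_skip` are made-up names.

```cpp
#include <map>
#include <set>
#include <string>

struct GuardCache {
    std::map<std::string, std::string> guard_of;  // path as spelled -> controlling macro
    std::set<std::string> defined_macros;         // macros currently defined

    // Record that the file reached through `path` is entirely wrapped in
    // #ifndef `macro` ... #endif.
    void remember(const std::string& path, const std::string& macro) {
        guard_of[path] = macro;
    }

    // True only when skipping the read is provably harmless: the path is
    // known and its controlling macro is still defined. Any doubt means
    // "read the file again", which costs time but never changes the result.
    bool can_skip(const std::string& path) const {
        auto it = guard_of.find(path);
        return it != guard_of.end() && defined_macros.count(it->second) > 0;
    }
};
```

A second spelling of the same file, such as "./foo.h" after "foo.h", simply misses the cache; the guard inside the file still does its job. A #pragma once implementation has no such fallback.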

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: "Peter Steiner" <pnsteiner@gmail.com>
Date: Wed, 6 Dec 2006 11:46:41 CST
Raw View
Seungbeom Kim wrote:
>
> Then I don't see much benefit in such a change. Maybe it's way easier
> just to write a program that generates the include guards.

besides the improvement in usability, a pragma once statement
decreases preprocessor runtime cost. include guards don't allow the
preprocessor to omit the file for obvious reasons, while pragma once
does so.

this can result in a significant preprocessor speed up.

-- peter






Author: ben-public-nospam@decadentplace.org.uk (Ben Hutchings)
Date: Wed, 6 Dec 2006 23:23:13 GMT
Raw View
On 2006-12-06, Peter Steiner <pnsteiner@gmail.com> wrote:
> Seungbeom Kim wrote:
>>
>> Then I don't see much benefit in such a change. Maybe it's way easier
>> just to write a program that generates the include guards.
>
> besides the improvement in usability, a pragma once statements
> decreases preprocessor runtime cost. include guards don't allow the
> preprocessor to omit the file for obvious reasons, while pragma once
> does so.
<snip>

#pragma once has been suggested many times, but it's difficult to
specify and may be impossible to implement on some common systems.
I know a popular compiler that implements #pragma once which allows a
file containing #pragma once that's on a case-insensitive file-system
to be included more than once if the include directives use differing
capitalisation.

Another popular compiler recognises the #ifndef FOO...#define FOO...
#endif pattern and avoids repeatedly reading such files (so long as it
recognises them) if the controlling macro is still defined.   Even if
it fails to recognise two paths as pointing to the same file, this only
makes compilation a little slower.

I know which behaviour I prefer.

Ben.

--
Ben Hutchings
Never attribute to conspiracy what can adequately be explained by stupidity.






Author: "peter koch" <peter.koch.larsen@gmail.com>
Date: Wed, 6 Dec 2006 17:23:50 CST
Raw View
Gennaro Prota skrev:
> On Tue,  5 Dec 2006 22:52:31 CST, peter koch wrote:
>
> >> [ #pragma STDC ONCE ]
> >>
> >> I'd like to know, however, if committee members here feel that the
[snip]
> > why not simply use the de facto standard format - #pragma
> >once. Looks nicer that way, I think.
>

I now understand that the "STDC" form of the pragma is for
"standardised" pragmas and would thus support that form as well.

> My favourite form would actually be:
>
>  #once
>
> There's no need, I think, to bring "pragma" into it. As I said, the
> usefulness of the construct lies in the fact that it doesn't require
> any user-defined identifier. (In retrospect, I guess the include once
> behavior should have been the default, with a directive to request
> otherwise, but that's (perhaps) too late to change, especially now
> that many "recursive inclusion" techniques have been popularized by
> the boost preprocessor library)
>
> The names of my controlling macros usually encode the project name,
> the filename, the developer's name and the date of creation of the
> file, and are automatically generated, but still that's not enough to
> avoid any reasonable conflict.

My controlling macro once had a tail of "blind typing" so it looked
somewhat like
#if !defined INCLUDE_HPPlsfvugasovytgsurkxjtlouejrasvnyhjsytdjcyvlistcge
#define INCLUDE_HPPlsfvugasovytgsurkxjtlouejrasvnyhjsytdjcyvlistcge

much to the amusement of my fellow colleagues. I fell victim to their
teasing and now follow your way (including also the time of day). I
doubt the likelihood of having identical guards is more than
hypothetical.
One more serious consideration is what to do with linked files. Is it
always possible to determine that a header file has already been drawn
in when the name is different? I would not bet on that and would thus
prefer the good old guards.

Kind regards
Peter






Author: "eric_backus@alum.mit.edu" <eric_backus@alum.mit.edu>
Date: Thu, 7 Dec 2006 01:21:52 CST
Raw View
peter koch wrote:
> One more serious consideration is what to do with linked files. Is it
> always possible to determine that a headerfile has already been drawn
> in when the name is different? I would not bet on that and would thus
> prefer the good old guards.

Which failure is more likely:

 * Your system supports linked files, the include files in your project
make use of that, but the compiler can't figure out that two files are
linked together, or
 * You use good old include guards, but accidentally use the same
include guard in different include files, due to using cut-and-paste to
create one of the files.

I'm pretty sure the second is much more likely--I've seen it happen.
To me, the issue is not that include guards are trivially easy for
people to use (they are), the issue is that people occasionally make
mistakes and duplicate the include guards, and that include guards by
definition pollute the global namespace.

I understand the difficulties in determining file identity when you
involve networking, hard and soft links, and so on.  But I would think
the standard could just say that file identity is determined in an
implementation defined way, and leave this as a quality of
implementation issue.
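As an illustration of what "implementation defined file identity" can look like in practice: C++17 (long after this thread) added std::filesystem::equivalent, which asks the implementation whether two paths resolve to the same file system entity. The wrapper below is a sketch with a made-up name; it treats any error as "not known to be the same", which is the safe answer for an optimization but not necessarily for #pragma once.

```cpp
#include <filesystem>
#include <system_error>

// Conservative identity test: true only when the implementation positively
// reports that both paths resolve to the same file system entity.
bool same_file(const std::filesystem::path& a, const std::filesystem::path& b) {
    std::error_code ec;
    bool same = std::filesystem::equivalent(a, b, ec);
    return !ec && same;
}
```

How well this copes with hard links, soft links and multiple network mounts is exactly the quality of implementation question raised above.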






Author: "ThosRTanner" <ttanner2@bloomberg.net>
Date: Thu, 7 Dec 2006 09:39:43 CST
Ben Hutchings wrote:
> On 2006-12-06, Peter Steiner <pnsteiner@gmail.com> wrote:
> > Seungbeom Kim wrote:
> >>
> >> Then I don't see much benefit in such a change. Maybe it's way easier
> >> just to write a program that generates the include guards.
> >
> > besides the improvement in usability, a pragma once statements
> > decreases preprocessor runtime cost. include guards don't allow the
> > preprocessor to omit the file for obvious reasons, while pragma once
> > does so.
> <snip>
>
> #pragma once has been suggested many times, but it's difficult to
> specify and may be impossible to implement on some common systems.
> I know a popular compiler that implements #pragma once which allows a
> file containing #pragma once that's on a case-insensitive file-system
> to be included more than once if the include directives use differing
> capitalisation,

That's a QOI issue though. I can see that there are issues:
1) File name case differs on a case-insensitive file system - easily
fixable
2) Use of soft links - more difficult, though fairly easy to fix (at
least on unix)
3) Use of hard links - I would have described that as a "sanity of
coder issue". There are certain facilities on unix you really don't
need to use...
4) Multiple network mounts to same point - again, that is a
questionable system setup.

Me, I'd be happy to have #pragma once (or some such name) even if I was
told that 2, 3 and 4 would break it - all of those setups cause
confusion in the minds of programmers anyway - you think you are
editing a header file that'll only affect module X, and it could well
affect module Y without your realising if points 2, 3 or 4 happen to be
issues.
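
As a rough illustration of how an implementation might handle points 1 to 3, here is a sketch using std::filesystem::equivalent from C++17 (which did not exist when this thread was written). The helper name same_header is invented for this example. On POSIX systems the library call typically compares device and inode numbers, so differently spelled paths, soft links and hard links compare equal; it is an assumption, not a guarantee, that point 4 (the same file reached through different network mounts) is detected.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Returns true when both paths resolve to the same file system object.
// Returns false when they differ or when either path cannot be examined.
inline bool same_header(const fs::path& a, const fs::path& b) {
    std::error_code ec;
    const bool same = fs::equivalent(a, b, ec);  // follows symbolic links
    return !ec && same;
}
```

A compiler doing this still has to open (or at least stat) the second path, so it saves the re-scan of the file, not the file system lookup.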

> Another popular compiler recognises the #ifndef FOO...#define FOO...
> #endif pattern and avoids repeatedly reading such files (so long as it
> recognises them) if the controlling macro is still defined.   Even if
> it fails to recognise two paths as pointing to the same file, this only
> makes compilation a little slower.
>
> I know which behaviour I prefer.
>
And I know of many cases where the same include guard has been used for
2 header files - because of
1) Insufficiently specific include guards
2) copy and paste
3) rename header file without changing include guards, followed by
creation of new header with same name and same include guard.

I know which behaviour I prefer.

I have to admit - I am lazy. If I have to do the same thing repeatedly
for a particular tool, I feel the tool should be doing it for me.

Actually, as we end up having include guards in ALL our headers, I'd
prefer the compiler to require a '#pragma everytime' to allow the file
to be repeatedly included - it'd make more sense (the only header I've
EVER seen that would need that is assert.h)
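
For reference, a small sketch of why assert.h has to stay re-includable: the standard redefines the assert macro according to the current state of NDEBUG each time the header is included. The function names are invented for this example, and putting a side effect inside assert() is done here only to make the difference observable.

```cpp
#undef NDEBUG
#include <cassert>

// Compiled while assertions are enabled: the expression inside assert() runs.
inline bool assertion_runs_when_enabled() {
    bool ran = false;
    assert((ran = true));
    return ran;
}

#define NDEBUG
#include <cassert>   // second inclusion: assert is redefined as a no-op

// Compiled while NDEBUG is defined: the expression is not evaluated.
inline bool assertion_runs_when_disabled() {
    bool ran = false;
    assert((ran = true));
    return ran;
}

#undef NDEBUG
#include <cassert>   // third inclusion: assertions are active again
```

The first function returns true and the second returns false, which would be impossible if the header were processed only once per translation unit.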






Author: "Geo" <gg@remm.org>
Date: Thu, 7 Dec 2006 09:41:23 CST
Gennaro Prota wrote:

> The names of my controlling macros usually encode the project name,
> the filename, the developer's name and the date of creation of the
> file, and are automatically generated, but still that's not enough to
> avoid any reasonable conflict.
>

My macro includes the time as well, down to the second. To get a
conflict, I would need to create two include guards with the same file
name within a second of each other, which I reckon is fairly
unlikely.
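
A sketch of what such a generator could look like. The function name, the parameter layout and the ordering of the parts are assumptions made for this example, not a description of any poster's actual tool. Characters that cannot appear in an identifier become underscores, and runs of underscores are collapsed because identifiers containing a double underscore are reserved in C++. The project name is assumed to start with a letter.

```cpp
#include <cctype>
#include <string>

// Builds an include-guard name from a project name, a file name and a
// creation time in seconds. Letters are upper-cased, every other
// non-alphanumeric character becomes '_', and consecutive underscores
// are collapsed into one.
inline std::string make_include_guard(const std::string& project,
                                      const std::string& file_name,
                                      long long created_at_seconds) {
    const std::string raw =
        project + "_" + file_name + "_" + std::to_string(created_at_seconds);
    std::string guard;
    guard.reserve(raw.size());
    for (unsigned char c : raw) {
        if (std::isalnum(c)) {
            guard += static_cast<char>(std::toupper(c));
        } else if (guard.empty() || guard.back() != '_') {
            guard += '_';
        }
    }
    return guard;
}
```

For example, make_include_guard("mylib", "net/socket.h", 1165500000) returns "MYLIB_NET_SOCKET_H_1165500000".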






Author: "James Kanze" <james.kanze@gmail.com>
Date: Thu, 7 Dec 2006 10:12:19 CST
eric_backus@alum.mit.edu wrote:
> peter koch wrote:
> > One more serious consideration is what to do with linked files. Is it
> > always possible to determine that a headerfile has already been drawn
> > in when the name is different? I would not bet on that and would thus
> > prefer the good old guards.

> Which failure is more likely:

It probably depends on how you're organized.  (Or maybe the
correct connective is "whether".)

>  * Your system supports linked files, the include files in your project
> make use of that, but the compiler can't figure out that two files are
> linked together, or

A fairly usual case, I think.  At least, it's been the case in
most places I've worked.  (The general context which would allow
such errors, that is.  In practice, I would imagine that it
would be pretty rare for even something as simple as a simple
literal comparison to give a wrong answer.  In any project, all
include files have a "canonical" name, regardless of the
different ways they can be accessed, and that canonical name is
the one used when including the file.)

>  * You use good old include guards, but accidentally use the same
> include guard in different include files, due to using cut-and-paste to
> create one of the files.

I'll admit that I can't imagine that ever happening.  You might
cut-and-paste the contents of the header file, but never the
include guards, which have always been created automatically,
using naming conventions guaranteed to generate a unique name
within the company, and with some specific prefix or suffix to
make it unlikely to conflict with the naming conventions used in
some third party library.

If it's really a worry, of course, you can append the timestamp,
the IP or MAC address of the machine and the process id of the
editor/generator script to the guard.  Maybe with some bytes
from /dev/random as well, just for good measure.

> I'm pretty sure the second is much more likely--I've seen it happen.

The place to address that is your software development process,
not the language definition.  If you're getting the same names
for include guards (which are generated automatically, and which
don't need to mean anything to a human reader if you don't want),
then you're probably getting the same names for other things.

> To me, the issue is not that include guards are trivially easy for
> people to use (they are), the issue is that people occasionally make
> mistakes and duplicate the include guards, and that include guards by
> definition pollute the global namespace.

All macros by definition pollute the global namespace.  All projects
that work have naming policies to avoid collisions.  Not just
for include guards, but for everything that might end up at
global scope.

In practice, the only time I've actually encountered a problem
had nothing to do with include guards.  A third party library
had actually defined a macro named String, and we were using the
USL library as well.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: jdennett@acm.org (James Dennett)
Date: Thu, 7 Dec 2006 17:21:59 GMT
eric_backus@alum.mit.edu wrote:
>
> I understand the difficulties in determining file identity when you
> involve networking, hard and soft links, and so on.  But I would think
> the standard could just say that file identity is determined in an
> implementation defined way, and leave this as a quality of
> implementation issue.

But given that there are many real-world systems where the best
achievable QoI would be inadequate, this is not appropriate for
standardization -- unless #pragma once is just a hint on an
idempotent header, in which case it's not needed as compilers
can already do that optimization in many cases.

-- James






Author: "Anders Dalvander" <google@dalvander.com>
Date: Thu, 7 Dec 2006 11:28:29 CST
Gennaro Prota wrote:
> given that there will be much rewording and expansion in clause 16 of
> the standard to bring it in synch with C99 I was thinking to submit a
> proposal for a "pragma once" directive, in the form
>
>   #pragma STDC ONCE

Wouldn't a new #import <file.h> or #using <file.h> construct be better
overall? Then the file doesn't need to be opened and scanned for
#pragma once or other include guard constructs. It would also be a step
toward modules in C++, and perhaps a way to get rid of the need to
forward declare classes.

// foo.h
#import <bar.h>
class foo
{
   bar* child;
   void call_child() { child->doit(); }
   void doit() { ... }
};

// bar.h
#import <foo.h>
class bar
{
   foo* parent;
   void call_parent() { parent->doit(); }
   void doit() { ... }
};

Also see http://gamearchitect.net/Articles/ExperimentsWithIncludes.html
for some discussions about external include guards, internal include
guards and pragma once guards.
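
For comparison, a sketch of how the same foo/bar cycle is written under the current rules, collapsed into a single translation unit: one forward declaration, and the member function that needs the complete type is defined after both classes. The constructors, the public access and the int return values are additions made for this example so that the behaviour can be checked.

```cpp
class bar;  // forward declaration: enough to declare a bar* member

class foo {
public:
    explicit foo(bar* c = nullptr) : child(c) {}
    int call_child() const;            // defined below, once bar is complete
    int doit() const { return 1; }
private:
    bar* child;
};

class bar {
public:
    explicit bar(foo* p = nullptr) : parent(p) {}
    int call_parent() const { return parent->doit(); }  // foo is complete here
    int doit() const { return 2; }
private:
    foo* parent;
};

// bar is a complete type at this point, so its members can be called.
inline int foo::call_child() const { return child->doit(); }
```

An #import that removed the need for the forward declaration would have to let the compiler see both class definitions before checking either member function body, which is closer to a module system than to textual inclusion.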

// Anders Dalvander






Author: "James Kanze" <james.kanze@gmail.com>
Date: Thu, 7 Dec 2006 11:28:00 CST
Ben Hutchings wrote:

> Another popular compiler recognises the #ifndef FOO...#define FOO...
> #endif pattern and avoids repeatedly reading such files (so long as it
> recognises them) if the controlling macro is still defined.   Even if
> it fails to recognise two paths as pointing to the same file, this only
> makes compilation a little slower.

That should be "Other popular compilers".  I think that there's
more than one, today.

And what happens if it recognizes two paths as pointing to the
same file, when they don't?  I just tried including "/dev/tty"
with g++ (probably the first compiler to implement this), and if
I input the include guards the first time, it doesn't read from
the keyboard a second time.  (Admittedly, reading from
"/dev/tty" is pretty exotic to begin with, and manually typing
in include guards on top of that when I really want the compiler
to read tty input twice is downright ridiculous---normally, you
want your code such that it can be compiled in an at job, with
input redirected.)

I can think of a couple of other ways in which a file would have
different contents the second time you read it, but they're all
of even less practical interest than including /dev/tty.  Still,
I suppose that it wouldn't hurt to say that if the names of two
include files (the h-char-sequence or the q-char-sequence) are
textually equal, and the contents that the compiler reads are
different, the behavior is undefined.
Even if we are making formerly well defined behavior undefined,
I'd be very surprised if there were a single real program that
it broke.
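
A toy model of the optimization under discussion, to make the two failure directions concrete. The class and its members are invented for this sketch and do not describe how g++ or any other compiler stores this information. The cache is keyed on the textual path: a different spelling of the same file merely misses the cache (the file is read again and its guard suppresses the contents), while the same spelling with different contents, as in the /dev/tty experiment, is skipped.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

class IncludeCache {
public:
    // Called after a file has been read and found to be wrapped entirely
    // in #ifndef GUARD ... #endif.
    void remember(const std::string& path, const std::string& guard) {
        guard_for_path_[path] = guard;
    }

    // True if the file need not be opened again: it is known to be fully
    // guarded and its controlling macro is still defined.
    bool can_skip(const std::string& path,
                  const std::unordered_set<std::string>& defined_macros) const {
        const auto it = guard_for_path_.find(path);
        return it != guard_for_path_.end() &&
               defined_macros.count(it->second) != 0;
    }

private:
    std::unordered_map<std::string, std::string> guard_for_path_;
};
```

After remember("foo.h", "FOO_H"), can_skip("foo.h", ...) is true while FOO_H is defined, can_skip("./foo.h", ...) is false, and undefining FOO_H makes the file eligible for reading again.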

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34







Author: bop@gmb.dk ("Bo Persson")
Date: Thu, 7 Dec 2006 17:32:08 GMT
ThosRTanner wrote:
> Ben Hutchings wrote:

>> #pragma once has been suggested many times, but it's difficult to
>> specify and may be impossible to implement on some common systems.
>> I know a popular compiler that implements #pragma once which
>> allows a file containing #pragma once that's on a case-insensitive
>> file-system to be included more than once if the include
>> directives use differing capitalisation,
> That's a QOI issue though. I can see that there are issues:
> 1) File name case differs on a case-insensitive file system - easily
> fixable
> 2) Use of soft links - more difficult, though fairly easy to fix (at
> least on unix)
> 3) Use of hard links - I would have described that as a "sanity of
> coder issue". There are certain facilities on unix you really don't
> need to use...
> 4) Multiple network mounts to same point - again, that is a
> questionable system setup.

It's not always that the developers can influence the design of the
corporate network.

For example, I have network mounts to Windows servers, various NAS disks,
ClearCase on a UNIX server, and MVS on an IBM mainframe. Should we require a
C++ compiler to resolve this?

>>
> And I know of many cases where the same include guard has been used
> for 2 header files - because of
> 1) Insufficiently specific include guards
> 2) copy and paste
> 3) rename header file without changing include guards, followed by
> creation of new header with same name and same include guard.
>
> I know which behaviour I prefer.

That the development system takes care of that, so the language standard
doesn't have to?  :-)


Bo Persson







Author: geunnaro_prouta@yahoo.com (Gennaro Prota)
Date: Tue, 5 Dec 2006 18:39:50 GMT
Hi,

Given that there will be much rewording and expansion in clause 16 of
the standard to bring it in sync with C99, I was thinking of submitting
a proposal for a "pragma once" directive, in the form

  #pragma STDC ONCE

I'd like to know, however, whether committee members here feel that the
proposal would have a realistic chance of being considered.

--
Gennaro Prota.    C++ developer. For hire.
(to mail me, remove any 'u' from the address)
