Topic: Directories
Author: Christopher Eltschka <celtschk@physik.tu-muenchen.de>
Date: 1999/11/04
David R Tribble wrote:
>
> [Discussing a portable set of functions to traverse directory
> hierarchies.]
>
> Christopher Eltschka wrote:
> > A directory is a persistent container of files and directories.
> > For each entry, it contains (at least) a name, and a way to get
> > on the entity (file or directory) connected with that name.
> >
> > A simple interface could be:
> [code omitted]
> >
> > On systems with no directories, creating a directory
> > object would always fail. On systems with directories,
> > the information provided with this interface should
> > always exist.
> >
> > In addition, there should be a "C string" constant
> [code omitted]
> > which contains all allowed directory separators, with the
> > main separator (i.e. the one most commonly used) first.
> >
> > That is, on Unix it would just be "/", on DOS, Win and OS/2
> > it would be "\\/", and on other systems it would be something
> > else. On systems not supporting directories, it would be
> > the empty string.
>
> As has been discussed elsewhere (news:comp.std.c), this isn't enough
> to cover all implementations of directory systems. VMS, for example,
> has names like "NODE::DEV:[STD.LIB.C]FILE.EXT;12" which don't really
> have a "directory separator" character.
I don't see any problem passing "NODE::DEV:[STD.LIB.C]"
to the directory constructor.
The proposed dirsep variable would then probably be the empty
string.
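
To make the role of the separator string concrete, here is a minimal sketch of how a program might use it. The helper name split_at_last_separator is hypothetical and not part of the proposed interface, and the separator string is passed as a parameter instead of being read from the proposed constant. With an empty separator string, as suggested for VMS above, nothing is split and the whole string is returned as the name.

```cpp
#include <string>
#include <utility>

// Hypothetical helper, not part of the proposed interface: split a path
// at the last character that appears in dirsep.
// Returns {directory part including the separator, final name}.
std::pair<std::string, std::string>
split_at_last_separator(const std::string& path, const std::string& dirsep)
{
    // find_last_of yields npos when dirsep is empty or nothing matches,
    // so on a system without separators the string is left untouched.
    const std::string::size_type pos = path.find_last_of(dirsep);
    if (pos == std::string::npos)
        return {std::string(), path};
    return {path.substr(0, pos + 1), path.substr(pos + 1)};
}
```

Called with "/usr/lib/libc.a" and "/", this yields "/usr/lib/" and "libc.a"; called with a VMS name and an empty separator string, the directory part is empty.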
After all, I can even imagine a system where "directories"
are built by giving properties. On such a system, you'd
for example create a "directory" with
    directory foodoc("type:text,package:foo");
which would mean "all text files of package foo".
Also, a Unix implementation could allow multiple dirs in PATH format
and wildcard expressions as directory:
    directory txt("~/text:~/doc/*.txt");
(which would be a "directory" composed of all files in ~/text and
all files matching *.txt in ~/doc).
Or an implementation could decide to allow access to archives
as directory:
    directory foo("foo.zip");
Or it could allow remote files:
    directory remote("user@host:dir");
Or URLs:
    directory leo("ftp://ftp.leo.org/");
It's up to the implementor (and the users' demand) what strings are
accepted, and how they are interpreted (although it's of course
strongly recommended that any string which names a legal directory
on the host system is treated as a normal path).
> You also left out the
> equivalents of "." and ".." (the latter is "[-]" in VMS).
I don't see a problem with this:
    // Unix:
    directory current(".");
    directory parent("..");
    // VMS:
    directory parent("[-]");
> You
> would also have to cover the concept of "members" within "datasets",
> such as those on MVS mainframe systems.
How does a "dataset" differ from a directory?
> You also should have a
> way of handling directory entries that are neither files nor
> directories (i.e., devices).
OK: add an is_file() method to direntry. If neither is_file()
nor is_dir() returns true, then it must be something else.
(OTOH, most devices _are_ files, as far as C++ is concerned:
You can create an istream and/or ostream for them, and read/write.)
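
As a rough sketch of that classification: the direntry below is a hypothetical stand-in with only the two predicates discussed here (the originally posted interface is not reproduced), and the entry_kind enumeration and the classify function are names invented for illustration.

```cpp
#include <string>

// Hypothetical stand-in for the proposed direntry; only the two
// predicates under discussion are modelled.
struct direntry {
    std::string name;
    bool file;  // can be opened as a stream
    bool dir;   // can be opened as a directory

    bool is_file() const { return file; }
    bool is_dir() const { return dir; }
};

// Illustrative only: the four combinations the two predicates allow.
enum class entry_kind { file, directory, both, other };

entry_kind classify(const direntry& e)
{
    if (e.is_file() && e.is_dir()) return entry_kind::both;
    if (e.is_file()) return entry_kind::file;
    if (e.is_dir()) return entry_kind::directory;
    return entry_kind::other;  // neither predicate holds
}
```

An entry for which both predicates are false falls into the "other" case, which is all a portable program needs to know in order to skip it.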
> VMS also had the concept of revision
> numbers on filenames.
I guess a reasonable behaviour would be to open the newest
revision by default; if someone wants an older revision, he
gets the pathname, adds the revision number and passes that
string to the streambuf constructor.
Another possibility would be to let the user access the
revisions as a "directory". That is, a file could be opened
as a file as well as a directory: as a file, you'd get the
newest revision, while as a directory, you'd iterate through
the revisions. Of course, if already given a revision number,
you could only open it as a file, which gives exactly the
revision you asked for.
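
The second possibility can be sketched with a small in-memory model. Everything here is hypothetical: the class name revisioned_file, its member functions, and the use of strings as file contents are illustration only, and the sketch assumes that the highest revision number is the newest one.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of a file that keeps numbered revisions.
class revisioned_file {
public:
    void add_revision(int number, const std::string& contents)
    {
        revisions_[number] = contents;
    }

    // Opened "as file": the newest revision, assumed to be the one
    // with the highest number.
    const std::string& open_newest() const
    {
        if (revisions_.empty())
            throw std::runtime_error("file has no revisions");
        return revisions_.rbegin()->second;
    }

    // Opened with an explicit revision number: exactly that revision.
    const std::string& open_revision(int number) const
    {
        const auto it = revisions_.find(number);
        if (it == revisions_.end())
            throw std::out_of_range("no such revision");
        return it->second;
    }

    // Opened "as directory": one entry per revision, oldest first.
    std::vector<int> list_revisions() const
    {
        std::vector<int> numbers;
        for (const auto& r : revisions_)
            numbers.push_back(r.first);
        return numbers;
    }

private:
    std::map<int, std::string> revisions_;  // ordered by revision number
};
```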
>
> I've suggested an alternate approach, based on the Common LISP
> library functions for pathnames, that allows a program to build up
> a pathname from component pieces, to extract component pieces from
> pathnames, and to combine partial pathnames.
That's IMHO a more limited model, since it assumes there's
such a thing as a pathname that can be split into components.
My model makes only one assumption: That a set of files can
be named by giving a string. No further assumption about that
string is made, although the dirsep variable exists for the
benefit of the common implementations which have directory
separators.
>
> If we have this kind of capability, we could build on it with
> additional standard functions to provide further information, such
> as file type (dir/file/other), modification date, file size, etc.
Those functions would get into the direntry in my model.
Also, I made the direntry interface minimal on purpose
(what's the size of /dev/tty? Or of PRN on DOS?)
The goal is just to provide the basic functionality
(iterate through a directory, and open files/subdirs
from it).
>
> Some people like the idea, but too many others say that it's still
> not universal enough, it'll never fly, and that it doesn't really
> belong in a language standard anyway.
It seems to me that your concept is much too concrete.
So I agree that it's possibly not universal enough.
My concept, OTOH, is unspecific on purpose. All it assumes is:
- There is a way to access files in groups which can be named
  (i.e. a string can be given that selects certain files). Those
  groups are called "directories".
- The members of directories may be other directories as well as
  files (and with the addition of is_file to direntry, they
  can also be something else entirely).
- The members of directories all have names.
That's all. There's no requirement that a directory is part
of a tree-like (or graph-like) structure. There's no requirement
that directories are non-intersecting. There's (after addition
of is_file) no requirement that something is either a file or a
directory - it may be both at once, or neither.
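
A toy model may help to show how little these assumptions require. The sketch below is purely illustrative: toy_filesystem, define, open_directory and shared_members are hypothetical names, and a "directory" is nothing more than a set of member names selected by a string. No tree is involved, and two directories are free to share members.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical in-memory model: a naming string selects a set of
// member names. There is no hierarchy of any kind.
class toy_filesystem {
public:
    void define(const std::string& spec, const std::set<std::string>& members)
    {
        groups_[spec] = members;
    }

    // "Creating a directory object" fails if the string names nothing.
    const std::set<std::string>& open_directory(const std::string& spec) const
    {
        const auto it = groups_.find(spec);
        if (it == groups_.end())
            throw std::runtime_error("no such directory: " + spec);
        return it->second;
    }

private:
    std::map<std::string, std::set<std::string>> groups_;
};

// Count the names two directories have in common; a non-zero result
// means the directories intersect.
std::size_t shared_members(const std::set<std::string>& a,
                           const std::set<std::string>& b)
{
    std::size_t count = 0;
    for (const auto& name : a)
        if (b.count(name) != 0)
            ++count;
    return count;
}
```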
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]
Author: David R Tribble <david@tribble.com>
Date: 1999/11/03
[Discussing a portable set of functions to traverse directory
hierarchies.]
Christopher Eltschka wrote:
> A directory is a persistent container of files and directories.
> For each entry, it contains (at least) a name, and a way to get
> on the entity (file or directory) connected with that name.
>
> A simple interface could be:
[code omitted]
>
> On systems with no directories, creating a directory
> object would always fail. On systems with directories,
> the information provided with this interface should
> always exist.
>
> In addition, there should be a "C string" constant
[code omitted]
> which contains all allowed directory separators, with the
> main separator (i.e. the one most commonly used) first.
>
> That is, on Unix it would just be "/", on DOS, Win and OS/2
> it would be "\\/", and on other systems it would be something
> else. On systems not supporting directories, it would be
> the empty string.
As has been discussed elsewhere (news:comp.std.c), this isn't enough
to cover all implementations of directory systems. VMS, for example,
has names like "NODE::DEV:[STD.LIB.C]FILE.EXT;12" which don't really
have a "directory separator" character. You also left out the
equivalents of "." and ".." (the latter is "[-]" in VMS). You
would also have to cover the concept of "members" within "datasets",
such as those on MVS mainframe systems. You also should have a
way of handling directory entries that are neither files nor
directories (i.e., devices). VMS also had the concept of revision
numbers on filenames.
I've suggested an alternate approach, based on the Common LISP
library functions for pathnames, that allows a program to build up
a pathname from component pieces, to extract component pieces from
pathnames, and to combine partial pathnames.
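
As an illustration only (this is not the code of the proposal mentioned above), a component-based pathname might look like the sketch below. The field set loosely follows the Common Lisp pathname components (host, device, directory, name, type, version); the struct name pathname and the functions to_vms and to_unix are hypothetical, and the Unix rendering assumes an absolute path and simply drops the components Unix has no syntax for.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical component set, loosely following Common Lisp pathnames.
struct pathname {
    std::string host;
    std::string device;
    std::vector<std::string> directory;
    std::string name;
    std::string type;
    int version;  // 0 is used here to mean "no version given"
};

// Render the components in VMS syntax.
std::string to_vms(const pathname& p)
{
    std::string s;
    if (!p.host.empty()) s += p.host + "::";
    if (!p.device.empty()) s += p.device + ":";
    if (!p.directory.empty()) {
        s += "[";
        for (std::size_t i = 0; i < p.directory.size(); ++i) {
            if (i != 0) s += ".";
            s += p.directory[i];
        }
        s += "]";
    }
    s += p.name;
    if (!p.type.empty()) s += "." + p.type;
    if (p.version > 0) s += ";" + std::to_string(p.version);
    return s;
}

// Render the same components in Unix syntax (assumed absolute);
// host, device and version are ignored.
std::string to_unix(const pathname& p)
{
    std::string s = "/";
    for (const auto& d : p.directory) s += d + "/";
    s += p.name;
    if (!p.type.empty()) s += "." + p.type;
    return s;
}
```

With host "NODE", device "DEV", directory components "STD", "LIB", "C", name "FILE", type "EXT" and version 12, to_vms reproduces the VMS name quoted earlier, while to_unix gives "/STD/LIB/C/FILE.EXT".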
If we have this kind of capability, we could build on it with
additional standard functions to provide further information, such
as file type (dir/file/other), modification date, file size, etc.
Some people like the idea, but too many others say that it's still
not universal enough, it'll never fly, and that it doesn't really
belong in a language standard anyway.
-- David R. Tribble, david@tribble.com, http://www.david.tribble.com --