Thread

Topic: Why no B-Tree container in STL?

Author: Tony@ask.me (Tony)
Date: 1998/12/17 Raw View

In article <36767C87.35F161D0@altisimo.com>, kasper@altisimo.com says...
>
> Tony wrote:
> >
> > In article <367031AD.156E1C17@altisimo.com>, kasper@altisimo.com says...
> > > A problem with B-Tree as a standard class is that it's file-system
> > > dependent.
> >
> > Is it?  Or is it just block size dependent?
>
> Yes, that's what I meant.  I had posted a follow-up with that
> correction, but I guess it got lost.

So what's the hang up then?  We have read/write in the standard library.

Tony

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Richard Kasperowski <kasper@altisimo.com>
Date: 1998/12/15 Raw View

Tony wrote:
>
> In article <367031AD.156E1C17@altisimo.com>, kasper@altisimo.com says...
> > A problem with B-Tree as a standard class is that it's file-system
> > dependent.
>
> Is it?  Or is it just block size dependent?

Yes, that's what I meant.  I had posted a follow-up with that
correction, but I guess it got lost.

--
Richard Kasperowski
kasper@altisimo.com
Tel: (617)576-1552
Fax: (617)576-2441

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/12/15 Raw View

>"I thought I might make the persistent allocator concept work by partially
>specializing each container to overload its constructors to communicate
with
>the allocator.

After some experiments I find that it is desireable to partially specialize
member functions instead of the entire template so that I do not have to
reimplement all of STL in order to facilitate this isolated specialization.
It is not clear to me from the standard that one can do that. The egcs 1.1
compiler, the only compiler here that supports partial specialization, blows
up on the following code, which attempts to partially specialize the
template's constructor. Consequently, I am not sure whether: (1) partial
specialization of member functions is legal and the compiler is wrong
(certainly it is wrong to blow up; it does so only when the function is a
ctor; it issues an error for partial specialization of other member
functions), (2) partial specialization of member functions is legal and my
code is wrong, or (3) partial specialization of member functions alone
without also partially specializing the template class is not legal.

template <class T, class A> class foo {
public:
 foo(T t, A a);
};
template <class T> foo<T, int>::foo(T t, int a) { }

I don't find an idiom that resembles this example of mine in any of the
examples in the specification, Therefore, assuming (3), let me re-address
the question:

> What extensions to the standard would you suggest, to allow disk-based
> associative containers (whether or not they're implemented using
> b-trees)?

I would permit partial specialization of member functions in the manner
shown above, following, of course, an investigation into the other
ramifications of such a requirement. This feature would facilitate the
implementation of persistent containers (not just disk-based associative
containers) by using allocators instead of inheritance.





[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/12/11 Raw View

In article <367031AD.156E1C17@altisimo.com>, kasper@altisimo.com says...
> A problem with B-Tree as a standard class is that it's file-system
> dependent.

Is it?  Or is it just block size dependent?

> It would be nice, though, to have some kind of disk-based container.

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/12/13 Raw View

>Thanks, Nathan; there go my weekends.


I went back to my original article about a persistent STL and found this
paragraph:

"I thought I might make the persistent allocator concept work by partially
specializing each container to overload its constructors to communicate with
the allocator. Until PC compilers implement partial specialization, however,
I cannot experiment with this approach."

Memory is the second thing to go.





[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Richard Kasperowski <kasper@altisimo.com>
Date: 1998/12/10 Raw View

Tony wrote:
>
> In article <3662E2F8.446B@wizard.net>, kuyper@wizard.net says...
> > No, I'm referring to disk access efficiency. My question is about
> > comparing a disk-based container using a B-tree data structure, to a
> > disk-based container using a red-black tree data structure. For a
> > disk-based container, in-memory access efficiency will almost never be a
> > significant issue.
>
> I think the liturature shows that r-b trees are inappropriate for disk-
> based storage (binary-unbalanced tree vs. n-way-balanced tree?).  I
> believe it comes down to being able to page to and from disk effectively
> so B-tree fits the bill.  OTOH, maybe it is the bottom-up growth of B-
> trees which eliminates linear insertion cost that is the primary reason
> why B-tree is the data structure of choice for disk-based storage
> (access).

Right.  Red-black trees are a way to implement a specific B-tree, one
that is not the right size for any file system's disk block.  A B-Tree
has nodes that are optimized to fit in a single disk block.  Each node
(disk block) will contain the maximum number of search keys and pointers
to child nodes.  The idea is to minimize the number of disk reads while
traversing the tree, while keeping O(log n) search time.  I have some
notes at URL
"http://www.fas.harvard.edu/~adm119/sections/thursday/dec03.html" if
anyone's interested.  Most Data Structures texts discuss the topic.

A problem with B-Tree as a standard class is that it's file-system
dependent.

It would be nice, though, to have some kind of disk-based container.

--
Richard Kasperowski
kasper@altisimo.com
Tel: (617)576-1552
Fax: (617)576-2441

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/12/10 Raw View

>First, why do you expect the contents of your persistent vector to
>"persist" after you have *destroyed* it?

Semantics. I probably used the wrong word. When the memory copy object's
dtor is executed, its contents are returned to the persistent data store. I
don't like "goes out of scope" because that phrase can be misleading.

>  I would expect such a container to be new'd,

It can also be constructed on the stack as an auto object or as an external
variable. There is no reason to restrict it to being constructed on the
heap. Assuming that's what you mean by "new'd."

> and at some point to be synchronized to
>persistent media

That happens when the memory copy of the container is constructed.

>Second, I would implement such a feature by partial specialization
>on the allocator type, so I would not be bound by the standard
>container-allocator interface if it were not sufficient to my needs.
>Of course that requires a more-or-less up-to-date compiler, but those
>are common now.

You need to specialize the container class to communicate with the
allocator, which seems to be what you are saying. I remember now that I
wanted to do that only to realize that I needed partial specialization,
which was not available to me when I did this work. I have not looked at it
or thought about it much since then until this discussion. You might be
right that by using partial specialization no change to the standard
interface is necessary. That would be really good news and I am eager to
give it another go. It would solve a bunch of other problems that caused me
to compromise. Thanks, Nathan; there go my weekends.

>The virtue of using a vector specialized (explicitly or otherwise)
>on an allocator type, rather than a different template such as a
>"persistent_vector", is that it can be passed to template functions
>expecting an actual vector.

I used inheritance from the std containers, which provides similar virtues
when the template functions expect container references. Partially
specializing the std containers would be a better approach because they
would permit generic algorithms to create persistent containers (he said,
thinking out loud).

> I don't think one should expect to be
>able to hide all details about the true nature of the container from
>the code that created it; abstraction is for other users of the object
>or type.  Thus, synchronizing a persistent vector with external media
>should not happen in the destructor, where there is no avenue to report
>errors, but in a separate named function specifically responsible for
>the job and called by the code that knows the "persistent" nature of
>the container.

I was able to implement it that way, however, and so far it has served me
well. The only errors that the destructor needs to process are any inability
to access the disk file (or whatever), which presents the same runtime
problems whether the error occurs in the dtor or in a special function
called prior to the dtor. My published version ignored that problem for the
sake of brevity, however.

One of the side effects of the way I did it was that a particular persistent
data store was unconcerned about what kind of container created it or read
it back in other than to differentiate between maps and the others (maps
have to store two objects). You could create a persistent vector and read it
back into a persistent list, for example.




[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/12/10 Raw View

Al Stevens<alstevens@midifitz.com> wrote:
>... I tried to implement an interface something
>like this (simplified for this discussion).
>
>std::vector<int, PAlloc("filename")> pints;
>
>The idea is that the PAlloc allocator uses the contents of "filename" to
>initialize pints with its persistent values when pints is instantiated.
>The program then uses pints like any other std::vector<int>. When pints
>is destroyed, its contents are returned to "filename" if any changes have
>occurred. What I found was that allocators have no way to tell containers
>that they are not only allocating memory but initializing it, too, and
>containers have no way to know that that is happening. Allocators, under
>the current specification, only allocate internal memory, and containers
>tell them how much to allocate.

Two comments:

First, why do you expect the contents of your persistent vector to
"persist" after you have *destroyed* it?  I would expect such a
container to be new'd, and at some point to be synchronized to
persistent media, and either leaked on program termination or
"disconnected" or "unmounted" from the media so that further events
(such as destruction) would not affect the persistent image.

Second, I would implement such a feature by partial specialization
on the allocator type, so I would not be bound by the standard
container-allocator interface if it were not sufficient to my needs.
Of course that requires a more-or-less up-to-date compiler, but those
are common now.

The virtue of using a vector specialized (explicitly or otherwise)
on an allocator type, rather than a different template such as a
"persistent_vector", is that it can be passed to template functions
expecting an actual vector.  I don't think one should expect to be
able to hide all details about the true nature of the container from
the code that created it; abstraction is for other users of the object
or type.  Thus, synchronizing a persistent vector with external media
should not happen in the destructor, where there is no avenue to report
errors, but in a separate named function specifically responsible for
the job and called by the code that knows the "persistent" nature of
the container.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/12/09 Raw View

[I'll try again. It seems that I now have to send a message twice to get it
posted. I sent this originally on 12/6/98 at 5:14 PM EST. Numerous other
messages from others have been posted since then.]

>What extensions to the standard would you suggest, to allow disk-based
>associative containers (whether or not they're implemented using
>b-trees)?


(This is all from memory and might be somewhat inaccurate, so please don't
take aim at the details, only the concept.) My first shot at this assumed
that Alexander and Bjarne were correct when they postulated that allocators
were the path to persistence. I tried to implement an interface something
like this (simplified for this discussion).

std::vector<int, PAlloc("filename")> pints;

The idea is that the PAlloc allocator uses the contents of "filename" to
initialize pints with its persistent values when pints is instantiated. The
program then uses pints like any other std::vector<int>. When pints is
destroyed, its contents are returned to "filename" if any changes have
occurred. What I found was that allocators have no way to tell containers
that they are not only allocating memory but initializing it, too, and
containers have no way to know that that is happening. Allocators, under the
current specification, only allocate internal memory, and containers tell
them how much to allocate.

I could have extended a particular STL implementation to achieve this
objective, but the extension would have been implementation-dependent in
that it would have used the implementation details rather standard parts of
the specification's interface. I did this research with the Microsoft VC++
5.0 implementation and the 12/96 public review draft.

There is much more to consider when the disk-based container is associative
and/or you want not only persistence but cached nodes of objects coming in
and out as they are needed.

To answer your question, I am not now prepared to propose a specific
extension that would support these things, only to suggest that some
research in this area is appropriate. Having done a lot of work with
so-called persistent objects in the past, I would be willing to--would even
like to--participate in such research and in the formulation of a proposal.
I have to believe that others are working along the same lines.




[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/12/05 Raw View

Tony wrote:
....
> OK, but the newsgroup is std.c++ and not the.c++.std.  Which reads to me
> like: "the language c++ has been standardized, now let's talk about the
> usage (and implementations)" rather than "let's keep beating this dead
> horse with a stick".

Sure, it's entirely appropriate to discuss general questions about how
to use or implement an actual feature of standard C++. I didn't think
your question fell into that category.

You were asking about why something wasn't in the standard; by
implication, you were suggesting that it should be added to the standard
(or at least to some existing implementations, in preparation for
standardization in the next release). What you wanted to add was a
something defined by its implementation, rather than by its interface.
I'm not saying that it's wrong to discuss that in this group. I am
saying that such a change would be inappropriate, as being inconsistent
with the purposes of the standard. Adding disk-based associative
container would be quite reasonable (though there are lots of issues to
be resolved). Requiring such a container to use a B-tree data structure
would not be appropriate. On the other hand, requiring such a container
to meet complexity requirements that can be met by a B-tree
implementation would be.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/12/06 Raw View

Al Stevens wrote:

> >That's the focus of implementors. The focus of standardizers is to
> >specify the interface (including the very important issue of what to
> >call it), and to leave the implementors free to choose whatever
> >implmentation they want, that meets the specified requirements.

> And of concern is that the specified STL requirements perhaps cannot be met
> by an implemention that uses some particular underlying disk-based data
> structure, such as b-trees. I think that one could probably do it, but the
> STL interface is clearly specified to work well with memory-based
> containers. The specifiers could have considered the requirements for
> disk-based containers, too, but since they did not, implementors must either
> work to fit the solution to the specification, which could result in kludged
> solutions, or move outside the specification, which results in nonstandard
> solutions.

At the moment, those are your only two solutions, since the standard is
final. However, it's reasonable to discuss what extensions new
implementation should support, to create a basis for future
standardization.

What extensions to the standard would you suggest, to allow disk-based
associative containers (whether or not they're implemented using
b-trees)?

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/12/06 Raw View

In article <AP%92.5288$Wv1.1351010@newscene.newscene.com>,
alstevens@midifitz.com says...
>
> >That's the focus of implementors. The focus of standardizers is to
> >specify the interface (including the very important issue of what to
> >call it), and to leave the implementors free to choose whatever
> >implmentation they want, that meets the specified requirements.
>
>
> And of concern is that the specified STL requirements perhaps cannot be met
> by an implemention that uses some particular underlying disk-based data
> structure, such as b-trees. I think that one could probably do it, but the
> STL interface is clearly specified to work well with memory-based
> containers. The specifiers could have considered the requirements for
> disk-based containers, too, but since they did not, implementors must either
> work to fit the solution to the specification, which could result in kludged
> solutions, or move outside the specification, which results in nonstandard
> solutions.

Tis not a failing or a roadblock by any means though.  It just means that
another spec is potentially worthwhile right?  "In the spirit of STl"
rather than "in rigid conformance to the STL".

> My work developing a persistent container library based on the standard
> containers revealed that the specification falls short with respect to the
> requirements for persistent containers; allocators and containers have no
> way to communicate with each other, making it impossible to use the
> allocator mechanism to implement persistence in  the ways that I thought
> would provide an elegant and intuitive interface. That's not a criticism of
> the standard, only an observation. I was able to achieve my objective and
> remain somewhat within the specification, but with a different interface
> than I preferred. Implementors of disk-based containers might run into the
> same kind of obstacle because the specification does not address the
> particular requirements of such containers.

So why not define what an STL-like map, e.g., would mean?

> That situation is one of the consequences of standardizing language and
> library features without years of prior art from which to draw. We are
> seeing a lot of that in the debates and discussions that now surround these
> and other new C++ features.

No, no, no.... don't go there.  Disk-based containers are very doable.  I
think that a lot more flexibility will be left to the implementation is
all. (?)  (Aside: I seem to be creeping up to an STL DB in my own work at
present.)

Tony



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/12/04 Raw View

In article <3666DC2D.2781@wizard.net>, kuyper@wizard.net says...
> Why are you talking about it here? One of the principles of the C++
> standard is that implementation is an issue for implementors, not for
> the standard itself.

Well EXCUUUUUSE me!

> > Obviously.  But we kinda know already that B-trees work and are the data
> > structure/algorithm of choice for most DB implementations.  So it
> > probably makes sense that it would come up a lot in the discussions about
> > the implementation.  Again, I thought the focus was on how to build this
> > thing rather than what to call it.
>
> That's the focus of implementors. The focus of standardizers is to
> specify the interface (including the very important issue of what to
> call it), and to leave the implementors free to choose whatever
> implmentation they want, that meets the specified requirements.

I thought that "throw it over the wall" paradigm was long gone since it
is decidedly less than optimal?  Indeed, anyone who thinks they can just
pontificate all day (design without implementation investigation) is
probably just looking for status or retirement.  A good designer also
designs for manufacture and a host of other things.

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/12/04 Raw View

Tony wrote:
>
> In article <3666DC2D.2781@wizard.net>, kuyper@wizard.net says...
> > Why are you talking about it here? One of the principles of the C++
> > standard is that implementation is an issue for implementors, not for
> > the standard itself.
>
> Well EXCUUUUUSE me!

You're excused. :-)

...
> > That's the focus of implementors. The focus of standardizers is to
> > specify the interface (including the very important issue of what to
> > call it), and to leave the implementors free to choose whatever
> > implmentation they want, that meets the specified requirements.
>
> I thought that "throw it over the wall" paradigm was long gone since it
> is decidedly less than optimal?  Indeed, anyone who thinks they can just
> pontificate all day (design without implementation investigation) is
> probably just looking for status or retirement.  A good designer also
> designs for manufacture and a host of other things.

You're misunderstanding the purpose of the standard. The standard is a
requirements document, not a design document. Like all requirements
documents, an important issue is whether or not the requirement is
efficiently implementable. However, good requirements documents leave
implementation details up to the detailed design phase. This is
particularly true of language standards. Implementations will be done to
meet these requirements, on an astoundingly large range of platforms.
Every detail that can safely be left unspecified in the standard, makes
it that much easier for each implementor to choose an implementation
tailored to their specific context.

The interface is the key part that can't be left unspecified in the
standard, and that is what standardization quite properly focuses on.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/12/04 Raw View

In article <36684D3B.167E@wizard.net>, kuyper@wizard.net says...
> You're misunderstanding the purpose of the standard. The standard is a
> requirements document, not a design document. Like all requirements
> documents, an important issue is whether or not the requirement is
> efficiently implementable. However, good requirements documents leave
> implementation details up to the detailed design phase. This is
> particularly true of language standards. Implementations will be done to
> meet these requirements, on an astoundingly large range of platforms.
> Every detail that can safely be left unspecified in the standard, makes
> it that much easier for each implementor to choose an implementation
> tailored to their specific context.
>
> The interface is the key part that can't be left unspecified in the
> standard, and that is what standardization quite properly focuses on.

OK, but the newsgroup is std.c++ and not the.c++.std.  Which reads to me
like: "the language c++ has been standardized, now let's talk about the
usage (and implementations)" rather than "let's keep beating this dead
horse with a stick".

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/12/05 Raw View

NNTP-Posting-Host: 199.233.112.82
NNTP-Posting-Date: Fri, 04 Dec 1998 19:16:16 CDT
X-Trace: newscene.newscene.com 912820576 199.233.112.82 (Fri, 04 Dec 1998 19:16:16 CDT)
X-Newsreader: Microsoft Outlook Express 4.72.3110.1
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3
Content-Length: 1998

[I'll try again. I posted this message originally a couple of days ago, but
it got lost in the wash.]

>That's the focus of implementors. The focus of standardizers is to
>specify the interface (including the very important issue of what to
>call it), and to leave the implementors free to choose whatever
>implmentation they want, that meets the specified requirements.


And of concern is that the specified STL requirements perhaps cannot be met
by an implemention that uses some particular underlying disk-based data
structure, such as b-trees. I think that one could probably do it, but the
STL interface is clearly specified to work well with memory-based
containers. The specifiers could have considered the requirements for
disk-based containers, too, but since they did not, implementors must either
work to fit the solution to the specification, which could result in kludged
solutions, or move outside the specification, which results in nonstandard
solutions.

My work developing a persistent container library based on the standard
containers revealed that the specification falls short with respect to the
requirements for persistent containers; allocators and containers have no
way to communicate with each other, making it impossible to use the
allocator mechanism to implement persistence in  the ways that I thought
would provide an elegant and intuitive interface. That's not a criticism of
the standard, only an observation. I was able to achieve my objective and
remain somewhat within the specification, but with a different interface
than I preferred. Implementors of disk-based containers might run into the
same kind of obstacle because the specification does not address the
particular requirements of such containers.

That situation is one of the consequences of standardizing language and
library features without years of prior art from which to draw. We are
seeing a lot of that in the debates and discussions that now surround these
and other new C++ features.




[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/12/03 Raw View

Tony wrote:

> In article <741k2s$bff$1@newsource.ihug.co.nz>, ross.s@ihug.co.nz says...
...
> > But there's an important point that several posters in this thread have
> > tried to make and you still seem to be missing. The fact that it's a
> > B-tree is an *implementation detail*.
>
> But we're talking about _implementation_ here.

Why are you talking about it here? One of the principles of the C++
standard is that implementation is an issue for implementors, not for
the standard itself.

...
> > Your hypothetical disk-based container(s) should follow the same
> > principle. Specify the public interface and the complexity of
> > operations. These may well be such that only a B-tree can satisfy them,
> > but that's an implementation detail of interest only to the library
> > writers, not something that should be made explicit in the public
> > interface.

> Obviously.  But we kinda know already that B-trees work and are the data
> structure/algorithm of choice for most DB implementations.  So it
> probably makes sense that it would come up a lot in the discussions about
> the implementation.  Again, I thought the focus was on how to build this
> thing rather than what to call it.

That's the focus of implementors. The focus of standardizers is to
specify the interface (including the very important issue of what to
call it), and to leave the implementors free to choose whatever
implmentation they want, that meets the specified requirements.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/12/03 Raw View

In article <741k2s$bff$1@newsource.ihug.co.nz>, ross.s@ihug.co.nz says...
>
> Tony wrote in message ...
> >
> >Again though, I don't think STL-spec adherence is the goal here.  But
> >rather something _similar_ for a "standard B-tree".  Perhaps B-tree is
> >only an example of the whole set of disk-based containers, but usually
> >it's good to have a first project to develop an idea upon and then make
> >it more general.
>
> But there's an important point that several posters in this thread have
> tried to make and you still seem to be missing. The fact that it's a
> B-tree is an *implementation detail*.

But we're talking about _implementation_ here.

> The appropriate name for such a
> container would be something like external_array or persistent_map; it
> may well be implemented as a B-tree, but *calling* it that, or making
> the fact that it's a B-tree visible in the public interface, would be
> entirely inappropriate.

I'm not so concerned about the name as you are, but rather the actual
solution.

> Your hypothetical disk-based container(s) should follow the same
> principle. Specify the public interface and the complexity of
> operations. These may well be such that only a B-tree can satisfy them,
> but that's an implementation detail of interest only to the library
> writers, not something that should be made explicit in the public
> interface.

Obviously.  But we kinda know already that B-trees work and are the data
structure/algorithm of choice for most DB implementations.  So it
probably makes sense that it would come up a lot in the discussions about
the implementation.  Again, I thought the focus was on how to build this
thing rather than what to call it.

Tony



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/12/01 Raw View

In article <3662E2F8.446B@wizard.net>, kuyper@wizard.net says...
> No, I'm referring to disk access efficiency. My question is about
> comparing a disk-based container using a B-tree data structure, to a
> disk-based container using a red-black tree data structure. For a
> disk-based container, in-memory access efficiency will almost never be a
> significant issue.

I think the liturature shows that r-b trees are inappropriate for disk-
based storage (binary-unbalanced tree vs. n-way-balanced tree?).  I
believe it comes down to being able to page to and from disk effectively
so B-tree fits the bill.  OTOH, maybe it is the bottom-up growth of B-
trees which eliminates linear insertion cost that is the primary reason
why B-tree is the data structure of choice for disk-based storage
(access).

> What I'm trying to find out is whether or not you see any such
> functional differences between a B-tree and a red-black tree, or is it
> purely an efficiency issue?

See above.

> Is there any feature of a B-tree that you prefer, other than it's
> greater efficiency for disk-based storage?

See above, again.

> Another way to put it is: would you care whether the standard required a
> disk-based container to use a B-tree, so long as it allowed an
> implementation to use a B-tree for that purpose? Would you be willing to
> allow the implementors to use their own best judgement as to the best
> way to implement that container?

Of course (2nd question)!  I'm not the computer scientist-like algorithm
builder (for the most part), but I do do SE.  Sometimes though, the
standards stuff gets to be overkill in complexity and understandability
to satisfy the most abstract perspective, so reinventing is the order of
the day.  I recently started playing around with the BDB btree routines,
e.g., but there's just too much undocumented stuff going on in there for
me to support at this time.

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/12/01 Raw View

In article <w3y82.4590$bB2.730777@newscene.newscene.com>,
alstevens@midifitz.com says...
> External containers have been developed that fit the STL model. I published
> a persistent template library a while back in Dr. Dobb's Journal. In that
> implementation, which encapsulated all the STL containers, the container
> contents are stored on disk and brought into memory when instantiated. The
> complete container comes into memory while it is in use and its contents are
> rewritten to disk when no longer needed in memory--when the container object
> instantiation is destroyed. Thus its persistent property.

This sounds like it won't work for the case that B-trees were invented
for: indexes too large to fit into memory and avoidance of all-at-once
loading.

Again though, I don't think STL-spec adherence is the goal here.  But
rather something _similar_ for a "standard B-tree".  Perhaps B-tree is
only an example of the whole set of disk-based containers, but usually
it's good to have a first project to develop an idea upon and then make
it more general.

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Ross Smith" <ross.s@ihug.co.nz>
Date: 1998/12/01 Raw View

Tony wrote in message ...
>
>Again though, I don't think STL-spec adherence is the goal here.  But
>rather something _similar_ for a "standard B-tree".  Perhaps B-tree is
>only an example of the whole set of disk-based containers, but usually
>it's good to have a first project to develop an idea upon and then make
>it more general.

But there's an important point that several posters in this thread have
tried to make and you still seem to be missing. The fact that it's a
B-tree is an *implementation detail*. The appropriate name for such a
container would be something like external_array or persistent_map; it
may well be implemented as a B-tree, but *calling* it that, or making
the fact that it's a B-tree visible in the public interface, would be
entirely inappropriate.

For an analogy, consider the three standard sequential containers:
vector, deque, and list. They all have very similar public interfaces,
because to a first approximation they're all the same thing: a linear
sequential container. But the complexity requirements specified in the
standard imply that they're useful in different contexts: broadly, you
use a vector when the sequence is static, when most insertions and
deletions take place at one end, or when random access is important;
you use a deque when insertions/deletions are frequent at either end;
you use a list when insertions/deletions at arbitrary points in the
sequence are common.

The internal implementations are different, of course. But that's not
important from the user's point of view: what matters are the
complexity specifications for various operations. A vector is usually
a plain array internally, a list is a doubly-linked list, and frankly
I haven't a clue how a deque is normally implemented. I don't care,
because the public interface and complexity specification tells me
everything I need to know to use a deque appropriately.

Your hypothetical disk-based container(s) should follow the same
principle. Specify the public interface and the complexity of
operations. These may well be such that only a B-tree can satisfy them,
but that's an implementation detail of interest only to the library
writers, not something that should be made explicit in the public
interface.

--
Ross Smith ................................... mailto:ross.s@ihug.co.nz
.............. The Internet Group, Auckland, New Zealand ..............
                               * * * * *
    To err is human. To really screw up requires the root password.



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: jcoffin@taeus.com (Jerry Coffin)
Date: 1998/11/30 Raw View

In article <MPG.10c44eccc8860c4098998b@netnews.worldnet.att.net>,
Tony@ask.me says...

[ ... ]

> I then won't use the B-tree template!  I think the spec should be
> "containers" rather than "in-memory containers".

The language has currently defined methods for dealing with objects in
memory, using copy ctors, assignment operators, comparison operators,
etc.

The standard containers generalize these capabilities to carry out
operations on groups of objects rather than individual objects, but
they do so using the operations defined for the individual objects in
the group.

By contrast, the language does NOT provide any standardized means for
storing objects on disk, or dealing with objects that have been stored
on disk.

C++ lacks the fundamental capability, and a library can't generalize
something that doesn't exist to start with.

To make all this work, you'd have to deal with some basic issues about
what you were going to provide.  For example, with most individual
implementations, you could write out some arbitrary identifier to
identify the type of an object, then the values of the objects it
contained, in some pre-defined order.

However, to standardize this capability, you'd have to standardize the
identifier, the order, etc.  You'd have to provide some method of
converting from an identifier to a call to the appropriate ctor (or
re-ctor) and so on.

I suspect dealing with this issue alone would take close to as much
time and effort as it took to create the C++ standard we have.  It's
also difficult to standardize something with which the world as a
whole has relatively little experience.  Just for example, they were
making adjustments to details of how templates worked right up to the
very end of the standardization process, despite the world having
around 15 years worth of experience with templates in Ada.  Namespaces
were much the same way -- they've been around in a number of other
languages (e.g. Eiffel) for quite a while, but are open to some
controversy even today.

I don't know of many languages at all similar to C++ that attempt to
define I/O for anything similar to an object in C++.  This would mean
that standardization would more or less have to start with completely
original research rather than being a matter of standardizing existing
practice.  IMO, the committee was quite right to avoid that.  They
already get accused of having gone too far with original features,
despite most of them having been around for a decade or more in
languages similar enough to C++ to make that experience useful and
valid.


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/12/01 Raw View

In article <Pine.A32.3.96.981129003249.101266A-100000@falcon.lhup.edu>,
  "John E. Potter" <jpotter@falcon.lhup.edu> wrote:
>
> On 28 Nov 1998, James Kuyper wrote:
>
> > Tony wrote:
> > >
> > > In article <365A33AE.704A4556@wizard.net>, kuyper@wizard.net says...
> > > > What would a B-tree give you that a map<> doesn't?
> > >
> > > Large indexes (too big to fit into memory) backed by secondary storage.
> >
> > So it's not specifically the B-tree that you want; it's the disk-based
> > container that is your real issue?
> > Let's put it this way - I gather that the main reasons why B-trees tend
> > to be associated with disk-based storage are efficiency issues (about
> > which I know next to nothing).
>
> Yes, it is all about efficiency and disk based.  A b-tree is an n-ary tree
> balanced by number of nodes (disk reads) from root to leaf.  The red-black
> tree is a binary tree implementation (see my post on clc++m) of a minimal
> b-tree which takes advantage of the b-tree balancing to produce a tree in
> which the greatest distance from root to leaf is no more than twice the
> shortest distance.
>
> A binary tree is not an efficient external storage data structure.  A poor
> implementation of an external b-tree would still out perform any external
> binary tree.  An internal n-ary b-tree could not be used for map/set since
> it could not meet the iterator requirements.
>
> You have hit the issue.  COBOL has VSAM, why doesn't C++ have a reasonable
> external storage container?  Of course, the answer is that nobody proposed
> it.
>
> There may well be other reasons.  Could we redirect the discussion to
> Tony's concern?  Why does an external container not fit in the STL model?

Whoa! This horse just changed colors!

There are at least two different reasons to want to implement
containers that swap to disk. One is to use a lot of data on systems
that do not have virtual memory. The other is for persistant storage.

When using disk-based containers to minimize memory, the containers
can allocate memory from a fixed "pool." There must be a method to
lock elements that are in use and free them when they can be swapped
out. Programs should avoid repetitive passes over the elements
because this could lead to thrashing.

When using disk-based containers for persistant storage, containers
should each have a filename, specified when the container is created
or "opened." If sufficient virtual memory is expected there is no
need to lock and unlock, since the members can be brought in on
demand (or during initialization) and then remain resident. There is
no reason for programs to avoid repetitive passes over the elements.

That's just what I came up with on the spur of the moment; there are
many many more differences, I'm sure. The two reasons for disk-based
containers do share a few common problems, but in general a solution
to limited-memory containers will not satisfy the needs of persistent
containers, nor vice-versa.

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/30 Raw View

In article <365EC7E0.2CED2029@wizard.net>, kuyper@wizard.net says...
>
> Tony wrote:
> >
> > In article <365A33AE.704A4556@wizard.net>, kuyper@wizard.net says...
> > >
> > > Tony wrote:
> > > ....
> > > > Is STL not evolving anymore perhaps or am I just out of touch?  Being
> > >
> > > Yes, you are out of touch. The STL is specified by the standard, which
> > > has been officially approved. For the next several years, changes to the
> > > STL will be entirely a matter of vendor-specific extensions.
> >
> > So I was right then right?  What am I out of touch about?
> >
> > > > that it's a specification, where are all the implementations (other than
> > > > the original stuff)?  B-tree seems like it would be a shoe-in, yet it's
> > > > not available anywhere.
> > >
> > > What would a B-tree give you that a map<> doesn't?
> >
> > Large indexes (too big to fit into memory) backed by secondary storage.
>
> So it's not specifically the B-tree that you want; it's the disk-based
> container that is your real issue?

Yep, the disk-based container is the thing.  Basing it on B-tree
principles seems like the right thing to do since all (a lot) the
research has been done already and would give the developer familiarity
with it immediately.

> Let's put it this way - I gather that the main reasons why B-trees tend
> to be associated with disk-based storage are efficiency issues (about
> which I know next to nothing).

Access efficiency yes (again, this is for data sets that are too large or
too troublesome to hold in or bring into memory at once).

> Assume that you had two different
> implmentations of a disk-based map<>-like container: one had a very
> efficiently implemented red-black tree data structure; the other had a
> technically correct B-tree data structure, but implemented so poorly
> that it had no efficiency advantage over the red-black tree
> implementation. Is there any specific feature of B-trees that would make
> you favor the second implementation?

(I have a feeling you're alluding to in-memory access efficiency which
isn't the issue--it's disk access that's the issue)

Anyway...
Why would anyone choose a "poor implementation"?  If red-black trees
weren't good enough for disk storage before why would they be now?
Overall, B-trees (even simple ones) do a darn good job.  A little caching
and storage optimization and they're considered excellent.  So why
reinvent the wheel?

Tony

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/30 Raw View

In article <Pine.A32.3.96.981129003249.101266A-100000@falcon.lhup.edu>,
jpotter@falcon.lhup.edu says...
> You have hit the issue.  COBOL has VSAM, why doesn't C++ have a reasonable
> external storage container?  Of course, the answer is that nobody proposed
> it.
>
> There may well be other reasons.  Could we redirect the discussion to
> Tony's concern?  Why does an external container not fit in the STL model?

Actually, I believe the more relevant discussion is what IS required to
make one of these things.  Perhaps some of the STL spec or style could be
used, but we're probably talking about a template class rather than
another entire spec.  Maybe you were asking the same thing?

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/30 Raw View

In article <Ste82.2070$bB2.150103@newscene.newscene.com>,
alstevens@midifitz.com says...
>
> Not functionally, which is at issue here. You need an associative container.
> That comprises its functions. Speed, capacity, and persistence are
> performance issues, although I might concede a minor point on persistence.
>
> Tony wrote in message ...
> >So you don't think that "I need a disk-based container" is any different
> >from "I need an in-memory container" eh???

I guess we'll have to agree to disagree on this one as I'd hardly
consider disk-based storage (persistence) a performance issue rather than
a functional one.  In-memory data structures vs. file structures are two
different animals (at least until OS's abstract those with a single-level
view of storage).

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/30 Raw View

>COBOL has VSAM, why doesn't C++ have a reasonable
>external storage container?  Of course, the answer is that nobody proposed
it.

I don't know about within the committee, but out here the notion has been
discussed ever since we first heard about STL.

>There may well be other reasons.

Venturing a guess--too complex for the committee to specify the first time
out. Without an existing working implementation, they would have had to
build and validate. There was no prior art to start from--in the STL model,
that is.

> Why does an external container not fit in the STL model?


External containers have been developed that fit the STL model. I published
a persistent template library a while back in Dr. Dobb's Journal. In that
implementation, which encapsulated all the STL containers, the container
contents are stored on disk and brought into memory when instantiated. The
complete container comes into memory while it is in use and its contents are
rewritten to disk when no longer needed in memory--when the container object
instantiation is destroyed. Thus its persistent property.

A container that resides on disk while in use, such as a b-tree, b*tree, or
b'tree, presents other problems related to iterators, which cannot be
intermixed with memory pointers as in the existing STL model. That however
is an implementation detail. Iterators would have to be smart enough to get
into the memory<->disk swapping logic; whenever a program dereferences an
iterator, the iterator must decide whether what it refers to is in memory
and what to do if not. Memory based iterators, of course, don't have this
problem. And, to truly fit the STL generic programming model, disk-based
iterators would have to be smart enough to work with all the generic
algorithms that work with memory-based iterators.



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/11/30 Raw View

Tony wrote:
>
> In article <365EC7E0.2CED2029@wizard.net>, kuyper@wizard.net says...
...
> > So it's not specifically the B-tree that you want; it's the disk-based
> > container that is your real issue?
>
> Yep, the disk-based container is the thing.  Basing it on B-tree
> principles seems like the right thing to do since all (a lot) the
> research has been done already and would give the developer familiarity
> with it immediately.
>
> > Let's put it this way - I gather that the main reasons why B-trees tend
> > to be associated with disk-based storage are efficiency issues (about
> > which I know next to nothing).
>
> Access efficiency yes (again, this is for data sets that are too large or
> too troublesome to hold in or bring into memory at once).
>
> > Assume that you had two different
> > implmentations of a disk-based map<>-like container: one had a very
> > efficiently implemented red-black tree data structure; the other had a
> > technically correct B-tree data structure, but implemented so poorly
> > that it had no efficiency advantage over the red-black tree
> > implementation. Is there any specific feature of B-trees that would make
> > you favor the second implementation?
>
> (I have a feeling you're alluding to in-memory access efficiency which
> isn't the issue--it's disk access that's the issue)

No, I'm referring to disk access efficiency. My question is about
comparing a disk-based container using a B-tree data structure, to a
disk-based container using a red-black tree data structure. For a
disk-based container, in-memory access efficiency will almost never be a
significant issue.

> Anyway...
> Why would anyone choose a "poor implementation"?  If red-black trees

Because a poor implementation of one data structure could have
capabilities that a good implementation of a different data structure
didn't. For instance, for some applications, a poor implementation of a
linked list is preferable to a good implementation of a stack, because
there are frequent needs for push_back(), pop_back(), or insert(in the
middle).

What I'm trying to find out is whether or not you see any such
functional differences between a B-tree and a red-black tree, or is it
purely an efficiency issue?

> weren't good enough for disk storage before why would they be now?
> Overall, B-trees (even simple ones) do a darn good job.  A little caching
> and storage optimization and they're considered excellent.  So why
> reinvent the wheel?

You miss the point of my question; I'm trying distinguish the disk
access issue from the data structure issue. In real life, those issues
are tied together by the fact that one data structure is more efficient
when used with disk-based storage, and the other data structure is more
efficient when used with with memory-based storage. In order to seperate
the issues, I postulated an implementation of what is normally the more
efficient data structure, that was technically correct but so poorly
implemented that it wiped out the efficiency difference. This isn't as
unrealistic as you might think - I've seen some awful implementations of
good ideas, that were technically correct.
Is there any feature of a B-tree that you prefer, other than it's
greater efficiency for disk-based storage?

Another way to put it is: would you care whether the standard required a
disk-based container to use a B-tree, so long as it allowed an
implementation to use a B-tree for that purpose? Would you be willing to
allow the implementors to use their own best judgement as to the best
way to implement that container?

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/26 Raw View

In article <365A33AE.704A4556@wizard.net>, kuyper@wizard.net says...
>
> Tony wrote:
> ....
> > Is STL not evolving anymore perhaps or am I just out of touch?  Being
>
> Yes, you are out of touch. The STL is specified by the standard, which
> has been officially approved. For the next several years, changes to the
> STL will be entirely a matter of vendor-specific extensions.

So I was right then right?  What am I out of touch about?

> > that it's a specification, where are all the implementations (other than
> > the original stuff)?  B-tree seems like it would be a shoe-in, yet it's
> > not available anywhere.
>
> What would a B-tree give you that a map<> doesn't?

Large indexes (too big to fit into memory) backed by secondary storage.

Tony

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/26 Raw View

In article <UID62.5081$Q6.1075310@newscene.newscene.com>,
alstevens@midifitz.com says...
> >What would a B-tree give you that a map<> doesn't?
>
> Functionally nothing,
  ^^^^^^^^^^^^^^^^^^^^

> but are we talking apples and oranges, maybe? std::map
> is an associative container with a defined interface. B-tree is a data
> structure that you can use to implement an associative container. The
> framers chose red-black trees, a different data structure to implement
> std::map because red-black trees are more appropriate for in-memory
                                                            ^^^^^^^^^
> containers. Btrees are more appropriate for disk-based containers. But you
                                              ^^^^^^^^^^
> can implement in-memory b-trees and disk-based red-black trees, too,
> although I'm not sure when you would want to do that.

Hardly "functionally nothing".  Fundamentally different if you ask me.

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/27 Raw View

>Hardly "functionally nothing".  Fundamentally different if you ask me.

No, the functions supported by the STL associative container would remain
the same irrespective of the underlying data structure. The performance
might be different. Functional vs performance requirements, and all that.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/27 Raw View

In article <Lvp62.3691$Q6.671595@newscene.newscene.com>,
alstevens@midifitz.com says...
>
> Tony wrote in message ...
> >Is STL not evolving anymore perhaps or am I just out of touch?  Being
> >that it's a specification, where are all the implementations (other than
> >the original stuff)?  B-tree seems like it would be a shoe-in, yet it's
> >not available anywhere.
>
> It's not that easy.

Perhaps some simplifying assumptions would make the task easier.

> Parts of the STL container interface are difficult to
> implement with disk-based containers. Iterators, for example. An iterator
> value might not be able to survive successive retrievals with no intervening
> updates to the container because the memory could be reused. Iterators would
> need to be a lot smarter. Then you get into the whole problem of persistence
> with objects that can be key and data values. How does a template class
> write a <T> object to disk? How much are you willing to constrain the
> designer of T? It's no wonder the framers did not try to add  disk-based
> containers to STL first time out.
>
> I've implemented b-trees in C++ before templates and then later with them
> and have been wanting to find the time to see if I can fit the structure
> into the STL interface. Everytime I get started, something gets in the way.
> Somebody's bound to do it eventually; somebody might have already done it.

I thing Rogue-Wave includes a B-tree.  I don't know if it's a template
though.

Tony
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/11/28 Raw View

Tony wrote:
>
> In article <365A33AE.704A4556@wizard.net>, kuyper@wizard.net says...
> >
> > Tony wrote:
> > ....
> > > Is STL not evolving anymore perhaps or am I just out of touch?  Being
> >
> > Yes, you are out of touch. The STL is specified by the standard, which
> > has been officially approved. For the next several years, changes to the
> > STL will be entirely a matter of vendor-specific extensions.
>
> So I was right then right?  What am I out of touch about?
>
> > > that it's a specification, where are all the implementations (other than
> > > the original stuff)?  B-tree seems like it would be a shoe-in, yet it's
> > > not available anywhere.
> >
> > What would a B-tree give you that a map<> doesn't?
>
> Large indexes (too big to fit into memory) backed by secondary storage.

So it's not specifically the B-tree that you want; it's the disk-based
container that is your real issue?
Let's put it this way - I gather that the main reasons why B-trees tend
to be associated with disk-based storage are efficiency issues (about
which I know next to nothing). Assume that you had two different
implmentations of a disk-based map<>-like container: one had a very
efficiently implemented red-black tree data structure; the other had a
technically correct B-tree data structure, but implemented so poorly
that it had no efficiency advantage over the red-black tree
implementation. Is there any specific feature of B-trees that would make
you favor the second implementation?

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/11/28 Raw View

Tony wrote:
>
> In article <UID62.5081$Q6.1075310@newscene.newscene.com>,
> alstevens@midifitz.com says...
> > >What would a B-tree give you that a map<> doesn't?
> >
> > Functionally nothing,
>   ^^^^^^^^^^^^^^^^^^^^
>
> > but are we talking apples and oranges, maybe? std::map
> > is an associative container with a defined interface. B-tree is a data
> > structure that you can use to implement an associative container. The
> > framers chose red-black trees, a different data structure to implement
> > std::map because red-black trees are more appropriate for in-memory
>                                                             ^^^^^^^^^
> > containers. Btrees are more appropriate for disk-based containers. But you
>                                               ^^^^^^^^^^
> > can implement in-memory b-trees and disk-based red-black trees, too,
> > although I'm not sure when you would want to do that.
>
> Hardly "functionally nothing".  Fundamentally different if you ask me.

The STL container requirements make it difficult, if not impossible, to
implment safe disk-based containers. The fundamental problem is that a
user can extract an ordinary pointer or reference into a contained
object, and use it even after that object has been swapped back to disk,
which means you cannot safely release the memory allocated for that
object. This applies equally well, regardless of which data structure is
used to implment map<>.

On the flip side, both Btrees and red-black trees can be used to
implement in-memory contains. Thus, the implementation method is purely
an efficiency issue; it's not a functionality issue.

Now, a real functionality issue is why the standard doesn't provide
disk-based containers, regardless of which data structure was used to
implement them. Such a container would have to provide iterators with a
lock() member to force a given element into memory, and an unlock()
member which invalidates all pointers and references previously derived
from dereferencing that iterator. Arguably, lock() might be redundant,
being implicitly called by operator*(), operator->(), and operator[]().
I suspect the issue never came up.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/28 Raw View

In article <fIp72.1829$Cm.332259@newscene.newscene.com>,
alstevens@midifitz.com says...
> >Hardly "functionally nothing".  Fundamentally different if you ask me.
>
> No, the functions supported by the STL associative container would remain
> the same irrespective of the underlying data structure. The performance
> might be different. Functional vs performance requirements, and all that.

So you don't think that "I need a disk-based container" is any different
from "I need an in-memory container" eh???

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "John E. Potter" <jpotter@falcon.lhup.edu>
Date: 1998/11/29 Raw View

On 28 Nov 1998, James Kuyper wrote:

>
> Tony wrote:
> >
> > In article <365A33AE.704A4556@wizard.net>, kuyper@wizard.net says...
> > > What would a B-tree give you that a map<> doesn't?
> >
> > Large indexes (too big to fit into memory) backed by secondary storage.
>
> So it's not specifically the B-tree that you want; it's the disk-based
> container that is your real issue?
> Let's put it this way - I gather that the main reasons why B-trees tend
> to be associated with disk-based storage are efficiency issues (about
> which I know next to nothing).

Yes, it is all about efficiency and disk based.  A b-tree is an n-ary tree
balanced by number of nodes (disk reads) from root to leaf.  The red-black
tree is a binary tree implementation (see my post on clc++m) of a minimal
b-tree which takes advantage of the b-tree balancing to produce a tree in
which the greatest distance from root to leaf is no more than twice the
shortest distance.

A binary tree is not an efficient external storage data structure.  A poor
implementation of an external b-tree would still out perform any external
binary tree.  An internal n-ary b-tree could not be used for map/set since
it could not meet the iterator requirements.

You have hit the issue.  COBOL has VSAM, why doesn't C++ have a reasonable
external storage container?  Of course, the answer is that nobody proposed
it.

There may well be other reasons.  Could we redirect the discussion to
Tony's concern?  Why does an external container not fit in the STL model?

John

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/29 Raw View

Not functionally, which is at issue here. You need an associative container.
That comprises its functions. Speed, capacity, and persistence are
performance issues, although I might concede a minor point on persistence.

Tony wrote in message ...

>In article <fIp72.1829$Cm.332259@newscene.newscene.com>,
>alstevens@midifitz.com says...
>> >Hardly "functionally nothing".  Fundamentally different if you ask me.
>>
>> No, the functions supported by the STL associative container would remain
>> the same irrespective of the underlying data structure. The performance
>> might be different. Functional vs performance requirements, and all that.
>
>So you don't think that "I need a disk-based container" is any different
>from "I need an in-memory container" eh???

[ moderator's note: Excessive quoting removed. Please delete
  unnecessary quoted material before posting. -sdc ]


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/11/30 Raw View

John E. Potter wrote:
>
> On 28 Nov 1998, James Kuyper wrote:
....
> You have hit the issue.  COBOL has VSAM, why doesn't C++ have a reasonable
> external storage container?  Of course, the answer is that nobody proposed
> it.
>
> There may well be other reasons.  Could we redirect the discussion to
> Tony's concern?  Why does an external container not fit in the STL model?

It doesn't fit the container model. The container requirements don't
provide a mechanism for telling a container that it's safe to swap
in-memory data back to the file. Any such swap would invalidate all
pointers and references to the in-memory data, causing potential
disaster.

Some kind of extension to the current container requirements would be
needed to cover a container that keeps some of its data on disk.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/11/26 Raw View

Al Stevens wrote:
>
[I wrote:]
> >What would a B-tree give you that a map<> doesn't?
>
> Functionally nothing, but are we talking apples and oranges, maybe? std::map
> is an associative container with a defined interface. B-tree is a data
> structure that you can use to implement an associative container. The

I understand that; the subject line on this thread asks about "B-Tree
container"; presumably a conventionally-defined associative container,
guaranteed by the standard to be implemented using a B-tree. I was using
"B-tree" as shorthand for that concept. I'd expect such a container to
have an interface functionally equivalent to std::map<>, hence my
question.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/11/24 Raw View

Tony wrote:
....
> Is STL not evolving anymore perhaps or am I just out of touch?  Being

Yes, you are out of touch. The STL is specified by the standard, which
has been officially approved. For the next several years, changes to the
STL will be entirely a matter of vendor-specific extensions.

> that it's a specification, where are all the implementations (other than
> the original stuff)?  B-tree seems like it would be a shoe-in, yet it's
> not available anywhere.

What would a B-tree give you that a map<> doesn't?

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/24 Raw View

Tony wrote in message ...
>Is STL not evolving anymore perhaps or am I just out of touch?  Being
>that it's a specification, where are all the implementations (other than
>the original stuff)?  B-tree seems like it would be a shoe-in, yet it's
>not available anywhere.

It's not that easy. Parts of the STL container interface are difficult to
implement with disk-based containers. Iterators, for example. An iterator
value might not be able to survive successive retrievals with no intervening
updates to the container because the memory could be reused. Iterators would
need to be a lot smarter. Then you get into the whole problem of persistence
with objects that can be key and data values. How does a template class
write a <T> object to disk? How much are you willing to constrain the
designer of T? It's no wonder the framers did not try to add  disk-based
containers to STL first time out.

I've implemented b-trees in C++ before templates and then later with them
and have been wanting to find the time to see if I can fit the structure
into the STL interface. Everytime I get started, something gets in the way.
Somebody's bound to do it eventually; somebody might have already done it.

As for evolution, the SGI version of STL goes beyond the standard with more
kinds of memory-based containers.




[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/24 Raw View

In article <3659B24D.64031C77@dba-sys.com>, roeder@dba-sys.com says...
 >Sam@sam.the.man wrote:
 >>
 >> In article <MPG.10b3efa67673ae49896f2@news.rmi.net>, jcoffin@taeus.com
 >> says...
 >> >
 >> > In article <MPG.10b316e0b012a1ff98991c@netnews.worldnet.att.net>,
 >> > Sam@sam.the.man says...
 >> > >
 >> > > Why is there no B-Tree container in the STL class libs?
 >> >
 >> > In theory, map and multimap could both be implemented as B-trees, but
 >> > they're generally not because B-trees provide advantages primarily
 >> > when your index doesn't all fit in memory.  Since the standard
 >> > collection classes are intended primarily for use where the entire
 >> > collection is in memory, B-trees probably aren't particularly useful.
 >>
 >> Certainly the spec isn't so constrained?? Why would it be limited to only
 >> memory-based containers especially since the B-tree is arguably the most
 >> influential container ever?
 >>
 >The spec applys to all platforms.  What are you going to do if the
 >platform doesn't have a disk?

I then won't use the B-tree template!  I think the spec should be
"containers" rather than "in-memory containers".

Tony


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Beman Dawes <beman@removethis.visi.net>
Date: 1998/11/24 Raw View

Al Stevens wrote:

> A binary tree is not a b-tree. The "b" stands for "balanced."

Well, actually, no one knows what the "B" stands for.  Bayer (who
invented them) never said. Comer, "The Ubiquitous B-Tree", ACM Computing
Surveys", June, 1979, says "balanced", "broad", or "bushy", but also
possibly "Boeing" since Bayer worked for Boeing.  He adds "Because of
his [Bayer's] contributions, however, it seems appropriate to think of
B-trees as 'Bayer'-trees."

--Beman Dawes


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/25 Raw View

>What would a B-tree give you that a map<> doesn't?

Functionally nothing, but are we talking apples and oranges, maybe? std::map
is an associative container with a defined interface. B-tree is a data
structure that you can use to implement an associative container. The
framers chose red-black trees, a different data structure to implement
std::map because red-black trees are more appropriate for in-memory
containers. Btrees are more appropriate for disk-based containers. But you
can implement in-memory b-trees and disk-based red-black trees, too,
although I'm not sure when you would want to do that.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Greg Colvin" <gcolvin@us.oracle.com>
Date: 1998/11/21 Raw View

Paul Sorensen <psorensen@wedgewood.net> wrote in article <72r1f6$3vo$1@supernews.com>...
>
> Generally B-tree classes are useful when there is some form of block
> structure in the storage used to hold the container (for example a disk
> would often have a block size of 4096 bytes).  There is no real advantage in
> using a btree instead of a map for non-blocked storage.  On the other hand,
> if you wrote an allocator which stored objects on a disk, then the perfect
> implementation of a map container would be a btree.

Virtual memory is swapped to disk in fixed length pages, and
CPU cache lines are fixed length, so there may still be advantages
to a map class which allocates storage in fixed size, properly
aligned pages and arranges map elements so as to minimize the
number of pages touched by searches and insertions.

------------------------------------------------------------------
Greg Colvin          s/-//g          c-o-l-v-i-n-g-@-a-c-m-.-o-r-g



[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/11/24 Raw View

Al Stevens<alstevens@midifitz.com> wrote:
>(somebody) wrote:
>>How about the 2-3-4-tree?  This is a forward balancable extension
>>of the B1-tree.  Is there a standard name for these things where
>>a node can have d to 2d+1 items?  Anyway, if we accept that as some
>>kind of B-tree and a binary tree implementation as still being a
>>B-tree, the red-black tree used in the sgi set/map is a B-tree.
>
>A binary tree is not a b-tree. The "b" stands for "balanced." A b-tree
>is an m-way tree where m is the maximum number of key/data pairs in a
>node. (A b-tree offers little or no advantage over other data structures
>when m is a small number.)... I don't think that STL's red-black trees
>have or need those properties. I think that a need exists for such a
>data structure that has the STL interface.

Much as I hate to disagree with Al, red-black trees do qualify as
a variety of b-tree (with small N).  Also, the "b" doesn't imply
balanced, it implies "almost-balanced".  The almost-balanced quality
of b-trees is a valuable thing even for in-memory structures, because
it reduces fiddling with the balance.

(Followups set to comp.lang.c++.moderated, per moderator request.)
--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/24 Raw View

In article <Fdg52.7198$Py3.1744075@newscene.newscene.com>,
alstevens@midifitz.com says...
> I think that a need exists for such a data structure [B-tree] that has
> the STL interface.

Is STL not evolving anymore perhaps or am I just out of touch?  Being
that it's a specification, where are all the implementations (other than
the original stuff)?  B-tree seems like it would be a shoe-in, yet it's
not available anywhere.

Tony
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: William Roeder <roeder@dba-sys.com>
Date: 1998/11/24 Raw View

Sam@sam.the.man wrote:
>
> In article <MPG.10b3efa67673ae49896f2@news.rmi.net>, jcoffin@taeus.com
> says...
> >
> > In article <MPG.10b316e0b012a1ff98991c@netnews.worldnet.att.net>,
> > Sam@sam.the.man says...
> > >
> > > Why is there no B-Tree container in the STL class libs?
> >
> > In theory, map and multimap could both be implemented as B-trees, but
> > they're generally not because B-trees provide advantages primarily
> > when your index doesn't all fit in memory.  Since the standard
> > collection classes are intended primarily for use where the entire
> > collection is in memory, B-trees probably aren't particularly useful.
>
> Certainly the spec isn't so constrained?? Why would it be limited to only
> memory-based containers especially since the B-tree is arguably the most
> influential container ever?
>
The spec applys to all platforms.  What are you going to do if the
platform doesn't have a disk?
--
Bill Roeder
My opinions are my own and do not reflect those of my employer.
Why do people put the above line in their signature?
If they reflected my employer then he would have sent the message!
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/24 Raw View

William Roeder wrote in message <3659B24D.64031C77@dba-sys.com>...
>The spec applys to all platforms.  What are you going to do if the
>platform doesn't have a disk?

Same thing you do with fstreams. Don't use them.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/20 Raw View

>I thought that the only changes were in the values of the stored
>pointers that determine the structure of the B-tree; that the data
>locations themselves remained unchanged?

Yes it is often implemented that way. In the traditional b-tree model, the
upper nodes have keys and pointers to lower nodes, and the leaf nodes have
keys and data. The data values in the leaves can indeed be pointers to
external data records distributed anyway you want. However, if the data
values are integers, for example, the implementation can take a shortcut and
omit the external data records.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Tony@ask.me (Tony)
Date: 1998/11/20 Raw View

In article <2NI42.2413$Py3.697983@newscene.newscene.com>,
alstevens@midifitz.com says...
> >I thought that the only changes were in the values of the stored
> >pointers that determine the structure of the B-tree; that the data
> >locations themselves remained unchanged?
>
> Yes it is often implemented that way.

> In the traditional b-tree model, the
> upper nodes have keys and pointers to lower nodes, and the leaf nodes have
> keys and data.

So this is the "standard" B-tree implementation then?  Berkely DB seems
pretty old and I believe it stores data (keys) throughout the tree(?).

Can someone here classify the various B-tree implementations (textbook or
product)?

Tony
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/20 Raw View

>> In the traditional b-tree model, the
>> upper nodes have keys and pointers to lower nodes, and the leaf nodes
have
>> keys and data.
>
>So this is the "standard" B-tree implementation then?  Berkely DB seems
>pretty old and I believe it stores data (keys) throughout the tree(?).

The key is the argument to the data. All the data values in a b-tree are in
the leaf nodes in the positions that the upper nodes use for pointers to the
lower nodes. All key values have a corresponding data value. The search has
to navigate down to the leaf node to get the data value for a key found in a
non-leaf node.

>Can someone here classify the various B-tree implementations (textbook or
>product)?


Taken from memory, so someone correct me if I get it wrong. There are two
textbook varieties, b-tree and b*tree. The difference is in the node
redistribution logic that b*trees add.

[ moderator's note: This discussion started out asking if there was
  or should be a B-tree implementation in STL. We are moving on to
  implementations of B-trees and B*-trees, which doesn't belong in
  this newsgroup. Let's keep the discussion within the charter
  guidelines, please. -sdc ]


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/21 Raw View

>How about the 2-3-4-tree?  This is a forward balancable extension
>of the B1-tree.  Is there a standard name for these things where
>a node can have d to 2d+1 items?  Anyway, if we accept that as some
>kind of B-tree and a binary tree implementation as still being a
>B-tree, the red-black tree used in the sgi set/map is a B-tree.

A binary tree is not a b-tree. The "b" stands for "balanced." A b-tree is an
m-way tree where m is the maximum number of key/data pairs in a node. (A
b-tree offers little or no advantage over other data structures when m is a
small number.) The characteristic that sets a b-tree apart is that it keeps
itself in balance, which means that the number of node levels from the root
to all leaves is always the same. The b*tree goes further by redistributing
key/data pairs among sibling nodes to maintain the best distribution among
all the nodes at a given level. Key rebalancing and and node redistribution
occur whenever an insert or delete makes it necessary. These are
advantageous properties for data structures that are too large to keep in
memory (inverted indexes into large databases, e.g.) or that need to be
frequently returned to disk to keep persistent synchronization with the
database files. I don't think that STL's red-black trees have or need those
properties. I think that a need exists for such a data structure that has
the STL interface.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/11/18 Raw View

Al Stevens wrote:

> >Actually, the only thing that invalidates a map<>::iterator or
> >reference is deleting the very element it refers to.

> That would not be true if map was implemented as a b*tree. Every insertion
> or deletion exposes every other entry in the b*tree to potential relocation
> due to balancing algorithms and entry redistribution.

I thought that the only changes were in the values of the stored
pointers that determine the structure of the B-tree; that the data
locations themselves remained unchanged?

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "John E. Potter" <jpotter@falcon.lhup.edu>
Date: 1998/11/20 Raw View

On 18 Nov 1998, James Kuyper wrote:

>
> Al Stevens wrote:
>
> > >Actually, the only thing that invalidates a map<>::iterator or
> > >reference is deleting the very element it refers to.
>
> > That would not be true if map was implemented as a b*tree. Every insertion
> > or deletion exposes every other entry in the b*tree to potential relocation
> > due to balancing algorithms and entry redistribution.
>
> I thought that the only changes were in the values of the stored
> pointers that determine the structure of the B-tree; that the data
> locations themselves remained unchanged?

I think the difference is about implementation details.  Consider
a B1-tree, each node has one or two keys, also known as a 2-3-tree
by number of children.  If nodes contain two item fields and three
pointer fields, balancing can move items within and between nodes.
A binary tree implementation would only change pointers.

How about the 2-3-4-tree?  This is a forward balancable extension
of the B1-tree.  Is there a standard name for these things where
a node can have d to 2d+1 items?  Anyway, if we accept that as some
kind of B-tree and a binary tree implementation as still being a
B-tree, the red-black tree used in the sgi set/map is a B-tree.

John
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/18 Raw View

>Actually, the only thing that invalidates a map<>::iterator or
>reference is deleting the very element it refers to.

That would not be true if map was implemented as a b*tree. Every insertion
or deletion exposes every other entry in the b*tree to potential relocation
due to balancing algorithms and entry redistribution.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Sam@sam.the.man
Date: 1998/11/16 Raw View

In article <MPG.10b3efa67673ae49896f2@news.rmi.net>, jcoffin@taeus.com
says...
>
> In article <MPG.10b316e0b012a1ff98991c@netnews.worldnet.att.net>,
> Sam@sam.the.man says...
> >
> > Why is there no B-Tree container in the STL class libs?
>
> In theory, map and multimap could both be implemented as B-trees, but
> they're generally not because B-trees provide advantages primarily
> when your index doesn't all fit in memory.  Since the standard
> collection classes are intended primarily for use where the entire
> collection is in memory, B-trees probably aren't particularly useful.

Certainly the spec isn't so constrained?? Why would it be limited to only
memory-based containers especially since the B-tree is arguably the most
influential container ever?

-S
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Bill Wade" <bill.wade@stoner.com>
Date: 1998/11/17 Raw View

Sam@sam.the.man wrote in message ...
>
>> [STL containers must be memory based]
>
>Certainly the spec isn't so constrained?? Why would it be limited to only
>memory-based containers especially since the B-tree is arguably the most
>influential container ever?

1) In some cases STL containers return references to their contents (one
example is operator[]).

2) C++ has no mechanism to make smart references which can completely
replace a real reference (for instance operator.() can't be replaced).

3) Some implementers want to implement STL in C++ rather than some other
language ;-).  As an added bonus implementers might want to be able to
perform POD type operations (void*) with memory allocated by a container's
Allocator.

One of these has to change to allow containers which aren't memory based.
(1) and (3) are too popular to change easily.  (2) is difficult to change.

Certainly you, or your favorite library supplier, can provide a file-based
B-tree container which acts very much like an STL container but avoids
returning real references.
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: AllanW@my-dejanews.com
Date: 1998/11/17 Raw View

In article <MPG.10b98d743affc7e898993c@netnews.worldnet.att.net>,
  Sam@sam.the.man wrote:
> In article <MPG.10b3efa67673ae49896f2@news.rmi.net>, jcoffin@taeus.com
> says...
> >
> > In article <MPG.10b316e0b012a1ff98991c@netnews.worldnet.att.net>,
> > Sam@sam.the.man says...
> > >
> > > Why is there no B-Tree container in the STL class libs?
> >
> > In theory, map and multimap could both be implemented as B-trees, but
> > they're generally not because B-trees provide advantages primarily
> > when your index doesn't all fit in memory.  Since the standard
> > collection classes are intended primarily for use where the entire
> > collection is in memory, B-trees probably aren't particularly useful.
>
> Certainly the spec isn't so constrained?? Why would it be limited to only
> memory-based containers especially since the B-tree is arguably the most
> influential container ever?

If you're arguing that map doesn't preclude normal paging and
swapping mechanisms in multi-task operating systems, then of
course you are correct.

If you're arguing that the implementation of map should be
allowed to swap the object out to disk, and then read it
back in on demand -- well, there's several reasons why this
would be difficult.

The biggest reason is that other objects, or even user code,
can take the address of an object in the map. Now there are
certain well-defined operations, such as adding or deleting
new objects to the map, which invalidate all of those
addresses. But accessing existing elements does not invalidate
anything. If your implementation needed to swap an element
in to memory, and in order to make room it first swapped out
the element that the user had a pointer to... the user might
use or modify some other element instead!

Of course, all of this works if you can find some way to make
element addresses unique (and non-overlapping) and load-on-
demand. That's essentially a per-container implementation
of paging, not an easy assignment. But the ability to do this
implies that you are using a virtual memory system, which in
turn implies that your operating system already has paging.
OS-based paging works better than any per-container paging
system can even hope to accomplish.

For non-VM systems I can definately see the point of a new
container class, perhaps named pagemap. When my company's
program ran on Win3x, I wrote one named mfVArray<T>. Code
would .lock() an element to prevent it from swapping, then
use it, then .unlock() it. (Sounds simple? But matching the
.lock() and .unlock() calls was surprisingly difficult for
some programmers... This was before STL; if I had to do it
over again today, I would use iterators or something very
similar!) Also, my class used COW, which at the time I
thought was so clever that I should patent it. Later I
discovered that it had been created long before I even
heard of C++...

--
AllanW@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Paul Sorensen" <psorensen@wedgewood.net>
Date: 1998/11/17 Raw View

Generally B-tree classes are useful when there is some form of block
structure in the storage used to hold the container (for example a disk
would often have a block size of 4096 bytes).  There is no real advantage in
using a btree instead of a map for non-blocked storage.  On the other hand,
if you wrote an allocator which stored objects on a disk, then the perfect
implementation of a map container would be a btree.


<Sam@sam.the.man> wrote in message
news:MPG.10b316e0b012a1ff98991c@netnews.worldnet.att.net...
>
>Why is there no B-Tree container in the STL class libs?

[ moderator's note: excessive quoting deleted. Please do not quote
  more than in necessary. -sdc ]


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: abrahams@spam.motu.com (David Abrahams)
Date: 1998/11/17 Raw View

On 17 Nov 98 13:00:35 GMT, AllanW@my-dejanews.com wrote:

>Now there are
>certain well-defined operations, such as adding or deleting
>new objects to the map, which invalidate all of those
>addresses.

Actually, the only thing that invalidates a map<>::iterator or
reference is deleting the very element it refers to.

-Dave

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: jcoffin@taeus.com (Jerry Coffin)
Date: 1998/11/12 Raw View

In article <MPG.10b316e0b012a1ff98991c@netnews.worldnet.att.net>,
Sam@sam.the.man says...
>
> Why is there no B-Tree container in the STL class libs?

In theory, map and multimap could both be implemented as B-trees, but
they're generally not because B-trees provide advantages primarily
when your index doesn't all fit in memory.  Since the standard
collection classes are intended primarily for use where the entire
collection is in memory, B-trees probably aren't particularly useful.


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Sam@sam.the.man
Date: 1998/11/11 Raw View

Why is there no B-Tree container in the STL class libs?

Sam


[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: James Kuyper <kuyper@wizard.net>
Date: 1998/11/11 Raw View

Sam@sam.the.man wrote:
>
> Why is there no B-Tree container in the STL class libs?

The map<> container was written to allow (but not require) an
implementation using B-Tree techniques.

[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: ncm@nospam.cantrip.org (Nathan Myers)
Date: 1998/11/12 Raw View

Sam wrote:
>
>Why is there no B-Tree container in the STL class libs?

None was proposed.

Anyway the containers in the standard are just examples.
If they are not precisely what you need, you can add your
own and as long as they satisfy the Requirements tables,
they will work as well with the algorithms.

--
Nathan Myers
ncm@nospam.cantrip.org  http://www.cantrip.org/
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: Valentin Bonnard <bonnardv@pratique.fr>
Date: 1998/11/12 Raw View

Sam@sam.the.man wrote:

> Why is there no B-Tree container in the STL class libs?

What do you mean ?

What would be the interface of B-Tree ?

How would it fit in the STL ?

--

Valentin Bonnard                mailto:bonnardv@pratique.fr
info about C++/a propos du C++: http://pages.pratique.fr/~bonnardv/
---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]

Author: "Al Stevens" <alstevens@midifitz.com>
Date: 1998/11/12 Raw View

Sam@sam.the.man wrote in message ...
>
>Why is there no B-Tree container in the STL class libs?