Topic: does std::basic_stringbuf require a contiguous character buffer?


Author: "Claude Quézel" <Claude_Quezel@Syntell.corba>
Date: 2000/05/26
I once had a performance problem with string streams. I wrote a large
amount of data to a std::basic_iostringstream. In the implementation I
was using, the std::basic_stringbuf reallocated the buffer whenever the
capacity was reached. I thought this was rather inefficient and that a
list of buffers could be maintained instead, avoiding the reallocation
and the copy. I then studied the standard and concluded that
std::basic_stringstream could be made more efficient because of the
specification of std::basic_streambuf. My interpretation is that the
std::basic_streambuf internal character buffer does not have to be
contiguous because of the way eback(), gptr(), egptr()...in_avail() work.
I think that in_avail() does not have to return the actual number of
characters left in the string stream, only the number in the get buffer.
This way, if in_avail() returns 0 there are no characters left in the
string stream; if it returns non-zero, there are at least that many
characters left in the string stream. Is this true? Could a valid
implementation specify that the put and get buffers of the
std::basic_streambuf (std::basic_stringbuf) span only part (different
parts?) of the internal buffer?

--

Claude Quézel (claude_quezel@syntell.corba)
anti-spam: replace corba by com in private replies



---
[ comp.std.c++ is moderated.  To submit articles, try just posting with ]
[ your news-reader.  If that fails, use mailto:std-c++@ncar.ucar.edu    ]
[              --- Please see the FAQ before posting. ---               ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html              ]






Author: Dietmar Kuehl <dietmar.kuehl@claas-solutions.de>
Date: 2000/05/26
Hi,
In article <392D3958.C9DB103C@Syntell.corba>,
  Claude_Quezel@Syntell.corba wrote:
> I once had a performance problem with string streams.

As specified by the standard, a performance problem is mandated. This
was detected, turned into a defect report, and will probably be
addressed by the next technical corrigendum, which is supposed to be
voted on at the next meeting (Toronto, October 2000). However, this is
probably not what you experienced. But then, there are some lame
implementations around...

> I wrote large
> amount of data in a std::basic_iostringstream. In the implementation I
> was using, the std::basic_stringbuf  reallocated the buffer when the
> capacity was reached. I thought this was rather inefficient and that a
> list of buffers could be maintained avoiding the reallocation and the
> copy. I then studied the standard and concluded that
> std::basic_stringstream could be made more efficient because of the
> specification of std::basic_streambuf. My interpretation is that the
> std::basic_streambuf internal character buffer does not have to be
> contiguous because of the way eback(), gptr(), egptr()...in_avail()
> work.

I think you are right that the internal area used by string streams
does not have to be contiguous, but a contiguous buffer is still a
likely implementation which probably works well for most applications.
Using e.g. a list of blocks or some other indirection mechanism would
complicate certain parts of the implementation which are fairly easy to
implement using just one buffer (... and I managed to fail even at
this :-( ).
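
A minimal sketch of the list-of-blocks idea, for the output side only
and assuming fixed-size blocks. The names chunkedbuf, chunk_count() and
demo_chunkedbuf() are made up for illustration and do not come from any
library. When the current block is full, overflow() allocates a fresh
block instead of reallocating and copying the old data:

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical write-only stream buffer that stores its data in a list
// of fixed-size blocks, so earlier blocks are never copied.
class chunkedbuf : public std::streambuf {
public:
    explicit chunkedbuf(std::size_t chunk_size)
        : chunk_size_(chunk_size) { assert(chunk_size > 0); }

    std::size_t chunk_count() const { return chunks_.size(); }

    // Concatenate all blocks; only the last one may be partially filled.
    std::string str() const {
        std::string result;
        typedef std::list<std::vector<char> >::const_iterator iter;
        for (iter it = chunks_.begin(); it != chunks_.end(); ++it) {
            const char* first = &(*it)[0];
            const bool is_last = (&*it == &chunks_.back());
            result.append(first, is_last ? pptr() : first + it->size());
        }
        return result;
    }

protected:
    // Called when the put area is full (or not yet set up).
    int_type overflow(int_type c) {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);
        chunks_.push_back(std::vector<char>(chunk_size_));
        char* p = &chunks_.back()[0];
        setp(p, p + chunk_size_);          // put area = the new block only
        *pptr() = traits_type::to_char_type(c);
        pbump(1);
        return c;
    }

private:
    std::size_t chunk_size_;
    std::list<std::vector<char> > chunks_;
};

// Usage: 12 characters with 4-character blocks end up in 3 blocks.
inline void demo_chunkedbuf() {
    chunkedbuf buf(4);
    std::ostream out(&buf);
    out << "hello, world";
    assert(buf.chunk_count() == 3);
    assert(buf.str() == "hello, world");
}
```

Reading and seeking are left out here, and that is where the extra
complexity mentioned above would show up.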

If you have specific needs, e.g. a huge string stream, you can fairly
easily implement a stream buffer that suits your particular purpose
best. String streams are often used for simple purposes like formatting
numbers or parsing individual lines. I would expect that most
applications of string streams use them on fairly small strings (e.g.
8k characters). Thus, a single buffer seems about right: maintenance of
multiple buffers would probably result in a slowdown compared to a
simple implementation. However, if you need a string stream for huge
buffers, you can just create one, maybe even preallocating the whole
buffer, resulting in a trivial but highly efficient implementation:

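  #include <cstddef>
  #include <streambuf>

  // Public inheritance, so the object can be handed to a stream.
  class memorybuf: public std::streambuf {
  public:
    explicit memorybuf(std::size_t size): m_buffer(new char[size]) {
      setg(m_buffer, m_buffer, m_buffer);  // get area starts out empty
      setp(m_buffer, m_buffer + size);     // put area spans the whole block
    }
    ~memorybuf() { delete[] m_buffer; }
    char* begin() const { return pbase(); }
    char* end() const { return pptr(); }
  private:
    memorybuf(const memorybuf&);             // not copyable: owns m_buffer
    memorybuf& operator=(const memorybuf&);

    int_type underflow() {
      // Extend the get area up to what has been written so far (pptr()),
      // keeping the current read position.
      setg(pbase(), pbase() + (gptr() - eback()), pptr());
      return gptr() != egptr()? traits_type::to_int_type(*gptr())
                              : traits_type::eof();
    }

    char* m_buffer;
  };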

(I haven't tested the code but it should be fairly close...).

> I think that in_avail() does not have to return the actual number of
> characters left in the string stream but only those in the get buffer.

You can return a precise number of characters even for string streams
because you can override 'showmanyc()' and return an accurate number.
Basically, to tell whether a string stream uses segments or not, you
would have to derive from 'std::basic_stringbuf' and find out about the
current buffer setup. ... and even this will not tell much about the
internal organization of the string stream...
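
One detail worth keeping in mind: in_avail() returns egptr() - gptr()
while the get area still holds characters and calls showmanyc() only
when it is exhausted. A rough sketch of how this looks with a segmented
source, assuming the segments are simply a vector of strings
(segmented_inbuf and demo_segmented_inbuf() are hypothetical names used
only for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical read-only stream buffer over several separate segments.
class segmented_inbuf : public std::streambuf {
public:
    explicit segmented_inbuf(const std::vector<std::string>& segments)
        : segments_(segments), next_(0) {}

protected:
    // Make the next non-empty segment the get area.
    int_type underflow() {
        if (gptr() != egptr())
            return traits_type::to_int_type(*gptr());
        while (next_ < segments_.size()) {
            std::string& s = segments_[next_++];
            if (!s.empty()) {
                setg(&s[0], &s[0], &s[0] + s.size());
                return traits_type::to_int_type(*gptr());
            }
        }
        return traits_type::eof();
    }

    // Only consulted by in_avail() when the get area is exhausted:
    // report what is left in the segments not yet handed out.
    std::streamsize showmanyc() {
        std::streamsize n = 0;
        for (std::size_t i = next_; i < segments_.size(); ++i)
            n += static_cast<std::streamsize>(segments_[i].size());
        return n > 0 ? n : -1;   // -1: no more characters will come
    }

private:
    std::vector<std::string> segments_;
    std::size_t next_;
};

inline void demo_segmented_inbuf() {
    std::vector<std::string> segments;
    segments.push_back("abc");
    segments.push_back("de");
    segmented_inbuf buf(segments);

    assert(buf.in_avail() == 5);   // get area empty: showmanyc() answers
    assert(buf.sgetc() == 'a');    // loads the first segment
    assert(buf.in_avail() == 3);   // now only the current get area counts
    buf.sbumpc(); buf.sbumpc(); buf.sbumpc();
    assert(buf.in_avail() == 2);   // first segment used up: showmanyc() again
}
```

So with segments the value is exact at segment boundaries and a lower
bound in between, which matches the reading in the question.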

> Could a valid
> implementation specify that the put and get buffers of the
> std::basic_streambuf (std::basic_stringbuf) span only part (different
> parts?) of the internal buffer ?

For 'std::basic_streambuf' this is definitely true: There are no
requirements on the size of the buffers used and you can even create
stream "buffers" which are unbuffered. For string streams I haven't
gone through all those details but I think it is allowed for them, too,
to span multiple buffers.
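
To illustrate the unbuffered case, here is a small sketch of a stream
"buffer" that never sets up a put area, so every character goes
straight to overflow(). The class countingbuf and the function
demo_countingbuf() are invented for this example; it merely counts the
characters written:

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>

// Hypothetical stream "buffer" without any buffer: no put area is ever
// set up, so every character written arrives in overflow().
class countingbuf : public std::streambuf {
public:
    countingbuf() : count_(0) {}
    std::size_t count() const { return count_; }

protected:
    int_type overflow(int_type c) {
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            ++count_;                       // "consume" the character
        return traits_type::not_eof(c);     // anything but eof = success
    }

private:
    std::size_t count_;
};

inline void demo_countingbuf() {
    countingbuf buf;
    std::ostream out(&buf);
    out << 12345 << "abc";
    assert(buf.count() == 8);               // 5 digits + 3 letters
}
```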
--
<mailto:dietmar.kuehl@claas-solutions.de>
homepage: <http://www.informatik.uni-konstanz.de/~kuehl>

