Topic: memcmp source


Author: kanze@gabi-soft.fr (J. Kanze)
Date: 1997/12/16
"Paul D. DeRocco" <pderocco@ix.netcom.com> writes:

|>  Glenn Morris wrote:
|>  >
|>  > Can anyone tell me how memcmp() would be implemented under most systems?
|>  > Specifically, would it be implemented in C or assembler?
|>
|>  A good compiler will recognize that particular library function, and turn it
|>  into carefully optimized inline code. On Intel processors, there's a repeated
|>  string op that can do the comparison without bothering to fetch the comparison
|>  instruction over and over again. This is what Borland compiles it into.
|>
|>  It could also be implemented as a call to a library routine which is hand coded
|>  in assembler, which would provide even greater optimization opportunity, at the
|>  cost of a call. A really aggressive implementation might compare up to three
|>  bytes until one of the operands is dword aligned, and then do the bulk of the
|>  comparison a dword at a time to conserve bus bandwidth. However, since most
|>  uses of memcmp are probably on short strings, I doubt it would be worth the
|>  setup overhead.

If given constants, or values about which the compiler has some
information, the compiler can generate optimal inline code without the
tests.  On an Intel architecture, for example, it would be a very poor
compiler that would still use movsb when the length is an even
constant.  From actual measurements, done a long time ago
on an 8086, over actual code, using different implementations of memcpy
(as a function, and thus always with run-time tests), shifting the count
right and copying words was a definite win, even with relatively short
strings, despite the overhead of the extra tests.  On the other hand,
trying to align wasn't: statistically, enough of the
sources/destinations were already aligned that the rare improvement
didn't offset the cost of the tests in the other calls.
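
A minimal sketch of the "shift the count right and copy words" idea, written in C rather than 8086 assembler. The name copy_by_words is hypothetical, and the two fixed-size memcpy calls are an assumption of this sketch: they stand in for a single 16-bit load and store (compilers typically reduce them to one move each) while keeping the C well-defined for any alignment. The measured implementations are not shown in the post, so this is only an illustration of the structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; illustrates copying 16-bit words plus an odd tail byte. */
void *copy_by_words(void *dst, const void *src, size_t count)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = count >> 1;      /* number of whole 16-bit words */

    while (words-- != 0) {
        uint16_t w;
        memcpy(&w, s, sizeof w);    /* stands in for one 16-bit load  */
        memcpy(d, &w, sizeof w);    /* stands in for one 16-bit store */
        s += sizeof w;
        d += sizeof w;
    }
    if (count & 1u)                 /* odd length: one trailing byte */
        *d = *s;
    return dst;
}
```

Note that there is no alignment test here, only the shift and the odd-byte check, which matches the variant described as a win above.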

--
James Kanze    +33 (0)1 39 23 84 71    mailto: kanze@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
              -- Beratung in objektorientierter Datenverarbeitung
---
[ comp.std.c++ is moderated.  To submit articles: Try just posting with your
                newsreader.  If that fails, use mailto:std-c++@ncar.ucar.edu
  comp.std.c++ FAQ: http://reality.sgi.com/austern/std-c++/faq.html
  Moderation policy: http://reality.sgi.com/austern/std-c++/policy.html
  Comments? mailto:std-c++-request@ncar.ucar.edu
]





Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1997/12/17
J. Kanze wrote:
>
> If given constants, or values about which the compiler has some
> information, the compiler can generate optimal inline code without the
> tests.  On an Intel architecture, for example, it would be a very poor
> compiler that would still use movsb when the length is an even
> constant.  From actual measurements, done a long time ago
> on an 8086, over actual code, using different implementations of memcpy
> (as a function, and thus always with run-time tests), shifting the count
> right and copying words was a definite win, even with relatively short
> strings, despite the overhead of the extra tests.  On the other hand,
> trying to align wasn't: statistically, enough of the
> sources/destinations were already aligned that the rare improvement
> didn't offset the cost of the tests in the other calls.

What you're saying is undoubtedly correct for memcpy, but the original question
was about memcmp. While large memcpy calls are certainly common (e.g., disk
buffers), I think memcmp calls are typically much shorter, so even the small
optimization you suggest may not be worth it. For instance, the Borland compiler's
library functions (which are used when intrinsics are disabled) perform this
optimization for memcpy but not for memcmp.

I did a version of memcpy once that did a simple test at the beginning: if the
count was greater than a certain value (I think I used 16), I did the
full-blown optimization, including alignment. Otherwise, it just moved bytes.
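
A rough reconstruction of that shape in C, under stated assumptions: the name threshold_memcpy and the constant SMALL_COPY_LIMIT are hypothetical, the threshold of 16 is the value recalled above, and the choice to align the destination (rather than the source) to a 4-byte boundary is a guess, since the original routine is not shown. The fixed-size memcpy calls stand in for single 32-bit moves.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SMALL_COPY_LIMIT = 16 };  /* assumed threshold, per the recollection above */

/* Hypothetical name; small copies go bytewise, large ones align first. */
void *threshold_memcpy(void *dst, const void *src, size_t count)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (count > SMALL_COPY_LIMIT) {
        /* At most three byte moves until the destination is 4-byte aligned. */
        while (((uintptr_t)d & 3u) != 0) {
            *d++ = *s++;
            --count;
        }
        /* Bulk of the copy, 32 bits at a time. */
        while (count >= sizeof(uint32_t)) {
            uint32_t w;
            memcpy(&w, s, sizeof w);   /* stands in for one 32-bit load  */
            memcpy(d, &w, sizeof w);   /* stands in for one 32-bit store */
            s += sizeof w;
            d += sizeof w;
            count -= sizeof w;
        }
    }
    /* Short copies, and the tail of long ones, just move bytes. */
    while (count-- != 0)
        *d++ = *s++;
    return dst;
}
```

The single comparison against the threshold is the only cost a short copy pays before falling into the byte loop.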

--

Ciao,
Paul





Author: "Glenn Morris" <gmorris@flash.net>
Date: 1997/12/12
Can anyone tell me how memcmp() would be implemented under most systems?
Specifically, would it be implemented in C or assembler?
--
Glenn Morris
gmorris@flash.net





Author: "Paul D. DeRocco" <pderocco@ix.netcom.com>
Date: 1997/12/13
Glenn Morris wrote:
>
> Can anyone tell me how memcmp() would be implemented under most systems?
> Specifically, would it be implemented in C or assembler?

A good compiler will recognize that particular library function, and turn it
into carefully optimized inline code. On Intel processors, there's a repeated
string op that can do the comparison without bothering to fetch the comparison
instruction over and over again. This is what Borland compiles it into.

It could also be implemented as a call to a library routine which is hand coded
in assembler, which would provide even greater optimization opportunity, at the
cost of a call. A really aggressive implementation might compare up to three
bytes until one of the operands is dword aligned, and then do the bulk of the
comparison a dword at a time to conserve bus bandwidth. However, since most
uses of memcmp are probably on short strings, I doubt it would be worth the
setup overhead.
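
The "really aggressive" variant described above might look roughly like this in C. This is only a sketch: the name aligned_memcmp is hypothetical, aligning the first operand (rather than the second) is an arbitrary choice, and the fixed-size memcpy calls stand in for 32-bit loads so that the code stays well-defined when the other operand is misaligned. When two dwords differ, the sketch falls back to bytes to find the first differing one, which keeps the result independent of byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; byte prelude, dword bulk, byte tail. */
int aligned_memcmp(const void *lhs, const void *rhs, size_t count)
{
    const unsigned char *a = lhs;
    const unsigned char *b = rhs;

    /* Step 1: up to three byte compares until `a` is 4-byte aligned. */
    while (count != 0 && ((uintptr_t)a & 3u) != 0) {
        if (*a != *b)
            return *a < *b ? -1 : 1;
        ++a;
        ++b;
        --count;
    }
    /* Step 2: bulk of the comparison, a dword at a time. */
    while (count >= sizeof(uint32_t)) {
        uint32_t wa, wb;
        memcpy(&wa, a, sizeof wa);
        memcpy(&wb, b, sizeof wb);
        if (wa != wb)
            break;                  /* the byte loop locates the difference */
        a += sizeof wa;
        b += sizeof wb;
        count -= sizeof wa;
    }
    /* Step 3: remaining bytes, or the dword that differed. */
    for (; count != 0; --count, ++a, ++b) {
        if (*a != *b)
            return *a < *b ? -1 : 1;
    }
    return 0;
}
```

For a three-byte compare, all the work happens in steps 1 and 3, and the extra tests are pure overhead, which is the setup cost referred to above.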

--

Ciao,
Paul





Author: Edward Diener <eddielee@abraxis.com>
Date: 1997/12/13
I can give you the rundown for my C++ compilers under Windows.

Borland = assembler
Microsoft = C
Symantec = assembler

I can't find the source for Watcom. They don't seem to ship it with their C++
compiler.

I have the latest versions for all.

Glenn Morris wrote:

> Can anyone tell me how memcmp() would be implemented under most systems?
> Specifically, would it be implemented in C or assembler?
> --
> Glenn Morris
> gmorris@flash.net





Author: Valentin Bonnard <bonnardv@pratique.fr>
Date: 1997/12/13
Glenn Morris <gmorris@flash.net> writes:

> Can anyone tell me how memcmp() would be implemented under most systems?
> Specifically, would it be implemented in C or assembler?

It can be implemented portably by using unsigned char
pointers. unsigned char is the type through which any
object can be examined as raw memory.
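
A minimal sketch of such a portable version, assuming nothing beyond standard C. The name portable_memcmp is hypothetical, chosen only to avoid clashing with the library function.

```c
#include <stddef.h>

/* Hypothetical name; compares two memory blocks byte by byte. */
int portable_memcmp(const void *lhs, const void *rhs, size_t count)
{
    const unsigned char *a = lhs;
    const unsigned char *b = rhs;

    for (; count != 0; --count, ++a, ++b) {
        if (*a != *b)
            return *a < *b ? -1 : 1;   /* bytes compare as unsigned values */
    }
    return 0;                          /* all count bytes were equal */
}
```

Going through unsigned char matters for the sign of the result: a byte with value 0xFF has to compare greater than 0x01, which plain char does not guarantee.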

The portable implementation, however, is often inefficient.
So it's implemented in non-portable C or in assembler. It
can also be a primitive for the compiler, which generates
the best code inline.

A non portable implementation in C/C++ may copy longs (or
unsigned longs), or even doubles.
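
For comparison rather than copying, the analogous trick is to test unsigned long units for equality. The sketch below is an illustration, not any vendor's code: the name wordwise_memcmp is hypothetical, and it loads each unit with a fixed-size memcpy instead of a pointer cast so that it remains valid C even for misaligned operands. It assumes unsigned long has no padding bits, which holds on common platforms. Doubles would not be suitable for comparing, since a NaN compares unequal to itself and +0.0 compares equal to -0.0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; equality test per unsigned long, bytes for the rest. */
int wordwise_memcmp(const void *lhs, const void *rhs, size_t count)
{
    const unsigned char *a = lhs;
    const unsigned char *b = rhs;

    while (count >= sizeof(unsigned long)) {
        unsigned long wa, wb;
        memcpy(&wa, a, sizeof wa);
        memcpy(&wb, b, sizeof wb);
        if (wa != wb)
            break;                  /* difference is inside this unit */
        a += sizeof wa;
        b += sizeof wb;
        count -= sizeof wa;
    }
    /* Locate the first differing byte, or finish the short tail. */
    for (; count != 0; --count, ++a, ++b) {
        if (*a != *b)
            return *a < *b ? -1 : 1;
    }
    return 0;
}
```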

--

Valentin Bonnard                mailto:bonnardv@pratique.fr
info about C++/a propos du C++: http://www.pratique.fr/~bonnardv/