Topic: bitset search function
Author: jononanon@googlemail.com
Date: Sat, 24 Jan 2015 11:31:05 -0800 (PST)
Raw View
------=_Part_402_294913550.1422127865540
Content-Type: multipart/alternative;
boundary="----=_Part_403_219777574.1422127865540"
------=_Part_403_219777574.1422127865540
Content-Type: text/plain; charset=UTF-8
Hi there,
I'd love to see a quick and efficient std::bitset search function: Example: *Search
for bitstate "true", starting from bit-index 0*.
It could have an interface like this, maby:
template<size_t N>
size_t std::bitset<N>::find(size_t idx, bool bitstate=true);
*The effect of bs.find(idx, bitstate): *
we search from idx (usually from 0, to start from the rightmost lsb bit),
until we find the first index from which:
bs[idx] == bitstate
If no such index is found, return idx = std::bitset::npos (e.g. equal
to numeric_limits<size_t>::max())
*Why?*
Well some datastructures can profit from using bitset and providing and
efficient search for a particular bitstate.
I'm particularly thinking of a memory pool here: Mark the mem-objects that
are in use, in a bitset.
Then use something like this, to iterate and find the appropriate bit (e.g.
in the destructor of the pool):
for (size_t idx = 0; (idx = bs.find(idx)) != bitset::npos; ++idx) {
/* ... example: release resource idx */
// unmark that bit
bs.reset(idx);
}
The most important point is this: we can *search incredibly fast in a
bitset, while having a nice compressed representation*.
In the current std::bitset, all we have is a compressed representation (if
the library-designers decided to use 8 bits in a byte); but NO good search
possibility.
And that's a pity.
Here's my example of an efficient search
<https://groups.google.com/d/msg/comp.lang.c++/y0LnADpFb2A/CwgCRzi0ihYJ>.
This is just an example for a speedy search, for library designers who are
interested. But yes: this example (just an example!) abuses a bitset, by
assuming that the internal bit-representation is in the first member, and
therefore only plays with libraries such as libstdc++, which do indeed put
the bit-representation in the first member (link
<https://github.com/gcc-mirror/gcc/blob/d353bf189d2bbaf4059f402ee4d2a5ea074c349f/libstdc%2B%2B-v3/include/std/bitset#L76>).
But such a search really needs to be part of the bitset class, to hide
complexities...
Regards,
John
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.
------=_Part_403_219777574.1422127865540
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<div dir=3D"ltr">Hi there,<br><br>I'd love to see a quick and efficient <sp=
an style=3D"font-family: courier new,monospace;">std::bitset</span> search =
function: Example: <i>Search for bitstate "true", starting from bit-index 0=
</i>.<br><br>It could have an interface like this, maby:<br><br><div style=
=3D"margin-left: 40px;"><span style=3D"font-family: courier new,monospace;"=
>template<size_t N></span><br><span style=3D"font-family: courier new=
,monospace;">size_t std::bitset<N>::find(size_t <span style=3D"color:=
rgb(102, 0, 0);">idx</span>, bool <span style=3D"color: rgb(120, 63, 4);">=
bitstate</span>=3Dtrue);</span><br></div><br><br><b>The effect of &n=
bsp; <span style=3D"font-family: courier new,monospace;">bs.find(<spa=
n style=3D"color: rgb(102, 0, 0);">idx</span>, <span style=3D"color: rgb(12=
0, 63, 4);">bitstate</span>)</span>: </b><br><div style=3D"margin-lef=
t: 40px;">we search from <span style=3D"color: rgb(102, 0, 0);"><span style=
=3D"font-family: courier new,monospace;">idx</span></span> (usually from 0,=
to start from the rightmost lsb bit), until we find the first index from w=
hich:<br><span style=3D"font-family: courier new,monospace;">bs[<span style=
=3D"color: rgb(102, 0, 0);">idx</span>] =3D=3D <span style=3D"color: rgb(12=
0, 63, 4);">bitstate</span></span><br>If no such index is found, return <sp=
an style=3D"font-family: courier new,monospace;"><span style=3D"color: rgb(=
102, 0, 0);">idx</span> =3D</span> <span style=3D"font-family: courier new,=
monospace;">std::bitset::npos</span> (e.g. equal to=
<span style=3D"font-family: courier new,monospace;">numeric_limits<size=
_t>::max()</span>)<br><br></div><b>Why?</b><br><div style=3D"margin-left=
: 40px;">Well some datastructures can profit from using bitset and providin=
g and efficient search for a particular bitstate. <br>I'm particularly thin=
king of a memory pool here: Mark the mem-objects that are in use, in a bits=
et.<br>Then use something like this, to iterate and find the appropriate bi=
t (e.g. in the destructor of the pool):<br><br><span style=3D"font-family: =
courier new,monospace;">for (size_t idx =3D 0; (idx =3D bs.find(idx)) !=3D =
bitset::npos; ++idx) {<br> <br> /* ... =
example: release resource idx */<br><br> // unmark that b=
it<br> bs.reset(idx);<br>}</span><br><br><br>The most imp=
ortant point is this: we can <b>search incredibly fast in a bitset, while h=
aving a nice compressed representation</b>.<br>In the current <span style=
=3D"font-family: courier new,monospace;">std::bitset</span>, all we have is=
a compressed representation (if the library-designers decided to use 8 bit=
s in a byte); but NO good search possibility.<br>And that's a pity.<br><br>=
Here's my example of an <a href=3D"https://groups.google.com/d/msg/comp.lan=
g.c++/y0LnADpFb2A/CwgCRzi0ihYJ">efficient search </a>. This is just an exam=
ple for a speedy search, for library designers who are interested. But yes:=
this example (just an example!) abuses a bitset, by assuming that the inte=
rnal bit-representation is in the first member, and therefore only plays wi=
th libraries such as libstdc++, which do indeed put the bit-representation =
in the first member (<a href=3D"https://github.com/gcc-mirror/gcc/blob/d353=
bf189d2bbaf4059f402ee4d2a5ea074c349f/libstdc%2B%2B-v3/include/std/bitset#L7=
6">link</a>). But such a search really needs to be part of the bitset class=
, to hide complexities...<br><br></div>Regards,<br>John<br></div>
<p></p>
-- <br />
<br />
--- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
Visit this group at <a href=3D"http://groups.google.com/a/isocpp.org/group/=
std-proposals/">http://groups.google.com/a/isocpp.org/group/std-proposals/<=
/a>.<br />
------=_Part_403_219777574.1422127865540--
------=_Part_402_294913550.1422127865540--
.
Author: Howard Hinnant <howard.hinnant@gmail.com>
Date: Sat, 24 Jan 2015 16:10:18 -0500
Raw View
On Jan 24, 2015, at 2:31 PM, jononanon@googlemail.com wrote:
> Hi there,
>=20
> I'd love to see a quick and efficient std::bitset search function: Exampl=
e: Search for bitstate "true", starting from bit-index 0.
>=20
> It could have an interface like this, maby:
>=20
> template<size_t N>
> size_t std::bitset<N>::find(size_t idx, bool bitstate=3Dtrue);
>=20
>=20
> The effect of bs.find(idx, bitstate): =20
> we search from idx (usually from 0, to start from the rightmost lsb bit),=
until we find the first index from which:
> bs[idx] =3D=3D bitstate
> If no such index is found, return idx =3D std::bitset::npos (e.g. equ=
al to numeric_limits<size_t>::max())
>=20
> Why?
> Well some datastructures can profit from using bitset and providing and e=
fficient search for a particular bitstate.=20
> I'm particularly thinking of a memory pool here: Mark the mem-objects tha=
t are in use, in a bitset.
> Then use something like this, to iterate and find the appropriate bit (e.=
g. in the destructor of the pool):
>=20
> for (size_t idx =3D 0; (idx =3D bs.find(idx)) !=3D bitset::npos; ++idx) {
> =20
> /* ... example: release resource idx */
>=20
> // unmark that bit
> bs.reset(idx);
> }
>=20
>=20
> The most important point is this: we can search incredibly fast in a bits=
et, while having a nice compressed representation.
> In the current std::bitset, all we have is a compressed representation (i=
f the library-designers decided to use 8 bits in a byte); but NO good searc=
h possibility.
> And that's a pity.
>=20
> Here's my example of an efficient search . This is just an example for a =
speedy search, for library designers who are interested. But yes: this exam=
ple (just an example!) abuses a bitset, by assuming that the internal bit-r=
epresentation is in the first member, and therefore only plays with librari=
es such as libstdc++, which do indeed put the bit-representation in the fir=
st member (link). But such a search really needs to be part of the bitset c=
lass, to hide complexities=E2=80=A6
I very much like the functionality you describe here. But what if we gave =
bitset iterators, and then made the interface look like:
template <size_t N>
bitset<N>::const_iterator
find(bitset<N>::const_iterator first, bitset<N>::const_iterator last, b=
ool value);
This would mimic in syntax the generic function:
template <class InputIterator, class T>
InputIterator
find(InputIterator first, InputIterator last, const T& value);
Then generic code could search for things, be they bits or whatever, with t=
he same syntax and maximal efficiency for each type.
A detail here is that bitset<N>::const_iterator would have to be an alias f=
or an iterator defined at namespace scope, as the N is not in a deducible c=
ontext the way I=E2=80=99ve written it. I /believe/ that such an iterator =
could be designed to be independent of N. I also believe that the algorith=
m with the iterator interface would could still look similar to your exampl=
e efficient search.
Here is an example implementation of it with the iterator interface:
http://llvm.org/svn/llvm-project/libcxx/trunk/include/__bit_reference
The __bit_iterator at this site is not necessarily the best design of bitse=
t<N>::iterator. But it illustrates the concept.
Here is some performance data on these algorithms:
http://howardhinnant.github.io/onvectorbool.html
Howard
--=20
---=20
You received this message because you are subscribed to the Google Groups "=
ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposa=
ls/.
.
Author: jononanon@googlemail.com
Date: Sat, 24 Jan 2015 16:10:42 -0800 (PST)
Raw View
------=_Part_3097_754569982.1422144642680
Content-Type: multipart/alternative;
boundary="----=_Part_3098_1166751939.1422144642680"
------=_Part_3098_1166751939.1422144642680
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
On Saturday, January 24, 2015 at 10:10:19 PM UTC+1, Howard Hinnant wrote:
>
> On Jan 24, 2015, at 2:31 PM, jono...@googlemail.com <javascript:> wrote:=
=20
>
> > Hi there,=20
> >=20
> > I'd love to see a quick and efficient std::bitset search function:=20
> Example: Search for bitstate "true", starting from bit-index 0.=20
> >=20
> > It could have an interface like this, maby:=20
> >=20
> > template<size_t N>=20
> > size_t std::bitset<N>::find(size_t idx, bool bitstate=3Dtrue);=20
> >=20
> >=20
> > The effect of bs.find(idx, bitstate): =20
> > we search from idx (usually from 0, to start from the rightmost lsb=20
> bit), until we find the first index from which:=20
> > bs[idx] =3D=3D bitstate=20
> > If no such index is found, return idx =3D std::bitset::npos (e.g.=
=20
> equal to numeric_limits<size_t>::max())=20
> >=20
> > Why?=20
> > Well some datastructures can profit from using bitset and providing and=
=20
> efficient search for a particular bitstate.=20
> > I'm particularly thinking of a memory pool here: Mark the mem-objects=
=20
> that are in use, in a bitset.=20
> > Then use something like this, to iterate and find the appropriate bit=
=20
> (e.g. in the destructor of the pool):=20
> >=20
> > for (size_t idx =3D 0; (idx =3D bs.find(idx)) !=3D bitset::npos; ++idx)=
{=20
> > =20
> > /* ... example: release resource idx */=20
> >=20
> > // unmark that bit=20
> > bs.reset(idx);=20
> > }=20
> >=20
> >=20
> > The most important point is this: we can search incredibly fast in a=20
> bitset, while having a nice compressed representation.=20
> > In the current std::bitset, all we have is a compressed representation=
=20
> (if the library-designers decided to use 8 bits in a byte); but NO good=
=20
> search possibility.=20
> > And that's a pity.=20
> >=20
> > Here's my example of an efficient search . This is just an example for =
a=20
> speedy search, for library designers who are interested. But yes: this=20
> example (just an example!) abuses a bitset, by assuming that the internal=
=20
> bit-representation is in the first member, and therefore only plays with=
=20
> libraries such as libstdc++, which do indeed put the bit-representation i=
n=20
> the first member (link). But such a search really needs to be part of the=
=20
> bitset class, to hide complexities=E2=80=A6=20
>
> I very much like the functionality you describe here. But what if we gav=
e=20
> bitset iterators, and then made the interface look like:=20
>
> template <size_t N>=20
> bitset<N>::const_iterator=20
> find(bitset<N>::const_iterator first, bitset<N>::const_iterator last,=
=20
> bool value);=20
>
> This would mimic in syntax the generic function:=20
>
> template <class InputIterator, class T>=20
> InputIterator=20
> find(InputIterator first, InputIterator last, const T& value);=20
>
> Then generic code could search for things, be they bits or whatever, with=
=20
> the same syntax and maximal efficiency for each type.=20
>
> A detail here is that bitset<N>::const_iterator would have to be an alias=
=20
> for an iterator defined at namespace scope, as the N is not in a deducibl=
e=20
> context the way I=E2=80=99ve written it. I /believe/ that such an iterat=
or could=20
> be designed to be independent of N. I also believe that the algorithm wi=
th=20
> the iterator interface would could still look similar to your example=20
> efficient search.=20
>
> Here is an example implementation of it with the iterator interface:=20
>
> http://llvm.org/svn/llvm-project/libcxx/trunk/include/__bit_reference=20
>
> The __bit_iterator at this site is not necessarily the best design of=20
> bitset<N>::iterator. But it illustrates the concept.=20
>
> Here is some performance data on these algorithms:=20
>
> http://howardhinnant.github.io/onvectorbool.html=20
>
> Howard=20
>
>
Hi Howard,
thanks for your reply.
You're definitely right. Using find and iterators is the way to do this.=20
The C++ way!
I've given it a go...
As a very basic proof of concept. ;)
(Please realize: I'm probably way out of league here... This is how one=20
codes after just having almost finished a dedicated reading of Stroustup's=
=20
rather 'basic' "Principles and Practice" book... and having done most of=
=20
the book exercises.
In particular: I'm wondering about const_iterator (it's not really=20
mentioned in "Principles and Practice"), so any recommendations or tips=20
will be appreciated.)
C++ wizards will certainly be able to improve this!
/////////////////////////////////////////////////////////////////
#include <iostream>
#include <bitset>
#include <climits>
#include <cassert>
using namespace std;
int find_lowest_idx_set_in_ulong(unsigned long val);
struct Bititerator {
Bititerator(unsigned long *mem1, size_t quot1, size_t rem1) : mem{mem1},=
=20
quot{quot1}, rem{rem1} {}
constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BIT;
=20
Bititerator& operator++()
{
++rem;
if (rem =3D=3D num_bits_ulong) {
++quot;
rem =3D 0;
}
return *this;
}
Bititerator& operator--()
{
if (rem =3D=3D 0) {
--quot;
rem =3D num_bits_ulong-1;
} else {
--rem;
}
return *this;
}
=20
bool operator=3D=3D(const Bititerator& it) const {
return ((mem =3D=3D it.mem) &&
(quot =3D=3D it.quot) &&
(rem =3D=3D it.rem));
}
bool operator!=3D(const Bititerator& it) const {
return !(*this =3D=3D it);
}
bool operator*() { return mem[quot] & (1UL << rem); }
=20
friend Bititerator find(Bititerator first, Bititerator last, const bool=
=20
bitstate);
friend int operator-(const Bititerator& a, const Bititerator& b);
=20
private:
// the current bit is (mem[quot] & (1UL << rem)), which has index (quot *=
=20
(num_bits_ulong) + rem)
=20
unsigned long *mem;
size_t quot;
size_t rem;
};
Bititerator find(Bititerator first, Bititerator last, const bool bitstate)
/* bitstate not yet taken into account.=20
This algorithm (currently) always takes bitstate =3D=3D true, i.e only=
=20
search for set true bits
Just a demonstration...
Todo: handle bitstate...
*/
{
if (first =3D=3D last)
return last;
=20
using ulong =3D unsigned long;
size_t lim_quot;
size_t lim_rem;
if (last.rem > 0) {
lim_quot =3D last.quot;
lim_rem =3D last.rem-1;
} else {
lim_quot =3D last.quot-1;
lim_rem =3D Bititerator::num_bits_ulong-1;
}
ulong val =3D first.mem[first.quot] & (-1UL << first.rem);
if (first.quot < lim_quot) {
if (val) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
=20
while (++first.quot < lim_quot) {
if (first.mem[first.quot]) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
}
val =3D first.mem[first.quot];
}
val &=3D (-1UL >> (Bititerator::num_bits_ulong - lim_rem - 1));
if (val) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
return last; // not found
}
int operator-(const Bititerator& a, const Bititerator& b)
{
return ((a.quot * (Bititerator::num_bits_ulong) + a.rem) -
(b.quot * (Bititerator::num_bits_ulong) + b.rem));
}
#if (ULONG_MAX =3D=3D 4294967295UL)
// unsigned long has 32 bits
static_assert(sizeof(unsigned long) =3D=3D 4, "unsigned long must have 4=20
bytes");
static const int MultiplyDeBruijnBitPosition[32] =3D
{
0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};
#else
#if (ULONG_MAX =3D=3D 18446744073709551615UL)
// unsigned long has 64 bits
static_assert(sizeof(unsigned long) =3D=3D 8, "unsigned long must have 8=20
bytes");
static const int MultiplyDeBruijnBitPosition[64] =3D
{
0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,
54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,
46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,
25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63
};
#else
static_assert(0, "strange machine architecture");
#endif
#endif
int find_lowest_idx_set_in_ulong(unsigned long val)
// only call this function if val does definately have a bit that is set
{
#if (ULONG_MAX =3D=3D 4294967295UL)
// unsigned long has 32 bits
=20
// Reference: http://stackoverflow.com/a/757266 (=20
http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup =
)
return MultiplyDeBruijnBitPosition[(((val & -val) * 0x077CB531UL)) >> 27]=
;
=20
#else
#if (ULONG_MAX =3D=3D 18446744073709551615UL)
// unsigned long has 64 bits
// return __builtin_ctzl(val);
// Reference: http://stackoverflow.com/a/24146182 (=20
http://chessprogramming.wikispaces.com/BitScan#KimWalisch )
return MultiplyDeBruijnBitPosition[(((val ^ (val-1)) *=20
0x03F79D71B4CB0A89UL)) >> 58];
#endif
#endif =20
}
template<size_t N> class Mybitset : public bitset<N> {
public:
using bitset<N>::bitset;
constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BIT;
using iterator =3D Bititerator;
=20
Bititerator begin()
{
assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsigned=
=20
*>(this)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment */
return Bititerator{reinterpret_cast<unsigned long*>(this), 0, 0};
}
Bititerator end()
{
assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsigned=
=20
*>(this)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment */
size_t quot =3D N / num_bits_ulong;
return Bititerator{reinterpret_cast<unsigned long*>(this), quot, N -=20
quot*num_bits_ulong};
}
};
int main()
{
Mybitset<10> x;
x.set(0);
x.set(3);
// print bitpattern of x using iterators
if (x.begin() !=3D x.end()) {
Mybitset<10>::iterator it =3D x.end();
do {
--it;
cout << *it;
} while (it !=3D x.begin());
}
cout << "\n\n";
// search for bit
for (; true; x <<=3D 1) {
auto it =3D find(x.begin(), x.end(), true);
cout << x << '\t';
if (it !=3D x.end())
cout << (it - x.begin()) << "\n";
else
cout << "not found\n";
if (x.none())
break;
}
=20
return 0;
}
/////////////////////////////////////////////////////////////////
Regards,
John
--=20
---=20
You received this message because you are subscribed to the Google Groups "=
ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposa=
ls/.
------=_Part_3098_1166751939.1422144642680
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<div dir=3D"ltr"><br><br>On Saturday, January 24, 2015 at 10:10:19 PM UTC+1=
, Howard Hinnant wrote:<blockquote class=3D"gmail_quote" style=3D"margin: 0=
;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">On Jan =
24, 2015, at 2:31 PM, <a href=3D"javascript:" target=3D"_blank" gdf-obfusca=
ted-mailto=3D"a_OdTAdvCUsJ" rel=3D"nofollow" onmousedown=3D"this.href=3D'ja=
vascript:';return true;" onclick=3D"this.href=3D'javascript:';return true;"=
>jono...@googlemail.com</a> wrote:
<br>
<br>> Hi there,
<br>>=20
<br>> I'd love to see a quick and efficient std::bitset search function:=
Example: Search for bitstate "true", starting from bit-index 0.
<br>>=20
<br>> It could have an interface like this, maby:
<br>>=20
<br>> template<size_t N>
<br>> size_t std::bitset<N>::find(size_t idx, bool bitstate=3Dtrue=
);
<br>>=20
<br>>=20
<br>> The effect of bs.find(idx, bitstate):
<br>> we search from idx (usually from 0, to start from the rightmost ls=
b bit), until we find the first index from which:
<br>> bs[idx] =3D=3D bitstate
<br>> If no such index is found, return idx =3D std::bitset::npos =
(e.g. equal to numeric_limits<size_t>::max())
<br>>=20
<br>> Why?
<br>> Well some datastructures can profit from using bitset and providin=
g and efficient search for a particular bitstate.=20
<br>> I'm particularly thinking of a memory pool here: Mark the mem-obje=
cts that are in use, in a bitset.
<br>> Then use something like this, to iterate and find the appropriate =
bit (e.g. in the destructor of the pool):
<br>>=20
<br>> for (size_t idx =3D 0; (idx =3D bs.find(idx)) !=3D bitset::npos; +=
+idx) {
<br>> =20
<br>> /* ... example: release resource idx */
<br>>=20
<br>> // unmark that bit
<br>> bs.reset(idx);
<br>> }
<br>>=20
<br>>=20
<br>> The most important point is this: we can search incredibly fast in=
a bitset, while having a nice compressed representation.
<br>> In the current std::bitset, all we have is a compressed representa=
tion (if the library-designers decided to use 8 bits in a byte); but NO goo=
d search possibility.
<br>> And that's a pity.
<br>>=20
<br>> Here's my example of an efficient search . This is just an example=
for a speedy search, for library designers who are interested. But yes: th=
is example (just an example!) abuses a bitset, by assuming that the interna=
l bit-representation is in the first member, and therefore only plays with =
libraries such as libstdc++, which do indeed put the bit-representation in =
the first member (link). But such a search really needs to be part of the b=
itset class, to hide complexities=E2=80=A6
<br>
<br>I very much like the functionality you describe here. But what if=
we gave bitset iterators, and then made the interface look like:
<br>
<br> template <size_t N>
<br> bitset<N>::const_iterator
<br> find(bitset<N>::const_iterator first, bitset<N&g=
t;::const_iterator last, bool value);
<br>
<br>This would mimic in syntax the generic function:
<br>
<br> template <class InputIterator, class T>
<br> InputIterator
<br> find(InputIterator first, InputIterator last, const T&=
; value);
<br>
<br>Then generic code could search for things, be they bits or whatever, wi=
th the same syntax and maximal efficiency for each type.
<br>
<br>A detail here is that bitset<N>::const_iterator would have to be =
an alias for an iterator defined at namespace scope, as the N is not in a d=
educible context the way I=E2=80=99ve written it. I /believe/ that su=
ch an iterator could be designed to be independent of N. I also belie=
ve that the algorithm with the iterator interface would could still look si=
milar to your example efficient search.
<br>
<br>Here is an example implementation of it with the iterator interface:
<br>
<br><a href=3D"http://llvm.org/svn/llvm-project/libcxx/trunk/include/__bit_=
reference" target=3D"_blank" rel=3D"nofollow" onmousedown=3D"this.href=3D'h=
ttp://www.google.com/url?q\75http%3A%2F%2Fllvm.org%2Fsvn%2Fllvm-project%2Fl=
ibcxx%2Ftrunk%2Finclude%2F__bit_reference\46sa\75D\46sntz\0751\46usg\75AFQj=
CNHHyzNP1SYzrii1LzQGRLVigs5TKQ';return true;" onclick=3D"this.href=3D'http:=
//www.google.com/url?q\75http%3A%2F%2Fllvm.org%2Fsvn%2Fllvm-project%2Flibcx=
x%2Ftrunk%2Finclude%2F__bit_reference\46sa\75D\46sntz\0751\46usg\75AFQjCNHH=
yzNP1SYzrii1LzQGRLVigs5TKQ';return true;">http://llvm.org/svn/llvm-<wbr>pro=
ject/libcxx/trunk/include/_<wbr>_bit_reference</a>
<br>
<br>The __bit_iterator at this site is not necessarily the best design of b=
itset<N>::iterator. But it illustrates the concept.
<br>
<br>Here is some performance data on these algorithms:
<br>
<br><a href=3D"http://howardhinnant.github.io/onvectorbool.html" target=3D"=
_blank" rel=3D"nofollow" onmousedown=3D"this.href=3D'http://www.google.com/=
url?q\75http%3A%2F%2Fhowardhinnant.github.io%2Fonvectorbool.html\46sa\75D\4=
6sntz\0751\46usg\75AFQjCNFoTLKbhxJd9zvsOcdnpr-HKXoRzw';return true;" onclic=
k=3D"this.href=3D'http://www.google.com/url?q\75http%3A%2F%2Fhowardhinnant.=
github.io%2Fonvectorbool.html\46sa\75D\46sntz\0751\46usg\75AFQjCNFoTLKbhxJd=
9zvsOcdnpr-HKXoRzw';return true;">http://howardhinnant.github.<wbr>io/onvec=
torbool.html</a>
<br>
<br>Howard
<br>
<br></blockquote><div><br>Hi Howard,<br><br>thanks for your reply.<br><br>Y=
ou're definitely right. Using find and iterators is the way to do this. The=
C++ way!<br>I've given it a go...<br><br>As a very basic proof of concept.=
;)<br><br>(Please realize: I'm probably way out of league here... This is =
how one codes after just having almost finished a dedicated reading of Stro=
ustup's rather 'basic' "Principles and Practice" book... and having d=
one most of the book exercises.<br>In particular: I'm wondering about <span=
style=3D"font-family: courier new,monospace;">const_iterator</span> <font =
size=3D"1">(it's not really mentioned in "Principles and Practice")</font>,=
so any recommendations or tips will be appreciated.)<br><br>C++ wizards wi=
ll certainly be able to improve this!<br><br><font size=3D"1"><span style=
=3D"font-family: courier new,monospace;">//////////////////////////////////=
///////////////////////////////<br>#include <iostream><br>#include &l=
t;bitset><br>#include <climits><br>#include <cassert><br><br=
>using namespace std;<br><br>int find_lowest_idx_set_in_ulong(unsigned long=
val);<br><br><br><br><br>struct Bititerator {<br> Bititerator(unsign=
ed long *mem1, size_t quot1, size_t rem1) : mem{mem1}, quot{quot1}, rem{rem=
1} {}<br><br> constexpr static int num_bits_ulong =3D sizeof(unsigned=
long) * CHAR_BIT;<br> <br> Bititerator& operator++()<br>&n=
bsp; {<br> ++rem;<br> if (rem =3D=3D nu=
m_bits_ulong) {<br> ++quot;<br> &n=
bsp; rem =3D 0;<br> }<br> r=
eturn *this;<br> }<br><br> Bititerator& operator--()<br>&nb=
sp; {<br> if (rem =3D=3D 0) {<br> =
--quot;<br> rem =3D num_bits_ulong-1;<=
br> } else {<br> --rem;<br>=
}<br> return *this;<br> }<br><br=
> <br> bool operator=3D=3D(const Bititerator& it) const {<b=
r> return ((mem =3D=3D it.mem) &&<br>=
(quot =
=3D=3D it.quot) &&<br> &nb=
sp; (rem =3D=3D it.rem));<br> }<br><br> =
bool operator!=3D(const Bititerator& it) const {<br> =
return !(*this =3D=3D it);<br> }<br><br> bool operator*() { re=
turn mem[quot] & (1UL << rem); }<br> <br> friend Biti=
terator find(Bititerator first, Bititerator last, const bool bitstate);<br>=
<br> friend int operator-(const Bititerator& a, const Bititerator=
& b);<br> <br>private:<br> // the current bit i=
s (mem[quot] & (1UL << rem)), which has index (quot * (num_bits_u=
long) + rem)<br> <br> unsigned long *mem;<br> size_t quot=
;<br> size_t rem;<br>};<br><br><br>Bititerator find(Bititerator first=
, Bititerator last, const bool bitstate)<br>/* bitstate not yet taken into =
account. <br> This algorithm (currently) always takes bitstate =
=3D=3D true, i.e only search for set true bits<br><br> Just a d=
emonstration...<br> Todo: handle bitstate...<br> */<br>{<b=
r> if (first =3D=3D last)<br> return last;<br> =
; <br> using ulong =3D unsigned long;<br><br> size_t lim_quot;<=
br> size_t lim_rem;<br> if (last.rem > 0) {<br> &=
nbsp; lim_quot =3D last.quot;<br> lim_rem =3D last.=
rem-1;<br> } else {<br> lim_quot =3D last.quot-1;<b=
r> lim_rem =3D Bititerator::num_bits_ulong-1;<br>&n=
bsp; }<br><br> ulong val =3D first.mem[first.quot] & (-1UL <&l=
t; first.rem);<br><br> if (first.quot < lim_quot) {<br>  =
; if (val) {<br> first.rem =3D find_low=
est_idx_set_in_ulong(val);<br> return first;<=
br> }<br> <br> while (++first.quo=
t < lim_quot) {<br> if (first.mem[first.qu=
ot]) {<br> first.rem =3D find_low=
est_idx_set_in_ulong(val);<br> re=
turn first;<br> }<br> }<br>=
<br> val =3D first.mem[first.quot];<br> }<br><br>&n=
bsp; val &=3D (-1UL >> (Bititerator::num_bits_ulong - lim_rem - 1=
));<br> if (val) {<br> first.rem =3D find_lowest_id=
x_set_in_ulong(val);<br> return first;<br> }<br><br=
> return last; // not found<br>}<br><br><br>int operator-(const Bitit=
erator& a, const Bititerator& b)<br>{<br> return ((a.quot * (=
Bititerator::num_bits_ulong) + a.rem) -<br> &n=
bsp; (b.quot * (Bititerator::num_bits_ulong) + b.rem));<b=
r>}<br><br><br><br><br>#if (ULONG_MAX =3D=3D 4294967295UL)<br><br>// unsign=
ed long has 32 bits<br>static_assert(sizeof(unsigned long) =3D=3D 4, "unsig=
ned long must have 4 bytes");<br><br>static const int MultiplyDeBruijnBitPo=
sition[32] =3D<br> {<br> 0, 1, 28, 2, 29, 14, 24, 3=
, 30, 22, 20, 15, 25, 17, 4, 8,<br> 31, 27, 13, 23, 21, 1=
9, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9<br> };<br><br>#else<br>#if (ULO=
NG_MAX =3D=3D 18446744073709551615UL)<br><br>// unsigned long has 64 bits<b=
r>static_assert(sizeof(unsigned long) =3D=3D 8, "unsigned long must have 8 =
bytes");<br><br>static const int MultiplyDeBruijnBitPosition[64] =3D<br>&nb=
sp; {<br> 0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28=
, 16, 3, 61,<br> 54, 58, 35, 52, 50, 42, 21, 44, 38, 32, =
29, 23, 17, 11, 4, 62,<br> 46, 55, 26, 59, 40, 36, 15, 53=
, 34, 51, 20, 43, 31, 22, 10, 45,<br> 25, 39, 14, 33, 19,=
30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63<br> };<br><br>#else<br><br>sta=
tic_assert(0, "strange machine architecture");<br><br>#endif<br>#endif<br><=
br>int find_lowest_idx_set_in_ulong(unsigned long val)<br>// only call this=
function if val does definately have a bit that is set<br>{<br>#if (ULONG_=
MAX =3D=3D 4294967295UL)<br><br> // unsigned long has 32 bits<br>&nbs=
p; <br> // Reference: http://stackoverflow.com/a/757266 ( http://grap=
hics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup )<br> =
return MultiplyDeBruijnBitPosition[(((val & -val) * 0x077CB531UL)) >=
;> 27];<br><br> <br>#else<br>#if (ULONG_MAX =3D=3D 184467440737095=
51615UL)<br><br> // unsigned long has 64 bits<br><br> // return=
__builtin_ctzl(val);<br><br> // Reference: http://stackoverflow.com/=
a/24146182 ( http://chessprogramming.wikispaces.com/BitScan#KimWalisch )<br=
> return MultiplyDeBruijnBitPosition[(((val ^ (val-1)) * 0x03F79D71B4=
CB0A89UL)) >> 58];<br><br>#endif<br>#endif <br>}<br><br><br>tem=
plate<size_t N> class Mybitset : public bitset<N> {<br>public:<=
br> using bitset<N>::bitset;<br><br> constexpr static int=
num_bits_ulong =3D sizeof(unsigned long) * CHAR_BIT;<br><br> using i=
terator =3D Bititerator;<br> <br> Bititerator begin()<br> =
{<br> assert((reinterpret_cast<unsigned long>(rein=
terpret_cast<const unsigned *>(this)) % sizeof(unsigned long)) =3D=3D=
0); /* check alignment */<br> return Bititerator{reinter=
pret_cast<unsigned long*>(this), 0, 0};<br> }<br> Bititer=
ator end()<br> {<br> assert((reinterpret_cast<un=
signed long>(reinterpret_cast<const unsigned *>(this)) % sizeof(un=
signed long)) =3D=3D 0); /* check alignment */<br> size_t=
quot =3D N / num_bits_ulong;<br> return Bititerator{rein=
terpret_cast<unsigned long*>(this), quot, N - quot*num_bits_ulong};<b=
r> }<br>};<br><br><br>int main()<br>{<br> Mybitset<10> x;=
<br> x.set(0);<br> x.set(3);<br><br> // print bitpattern =
of x using iterators<br> if (x.begin() !=3D x.end()) {<br>  =
; Mybitset<10>::iterator it =3D x.end();<br> =
do {<br> --it;<br> &nb=
sp; cout << *it;<br> } while (it !=3D x.begin());<b=
r> }<br> cout << "\n\n";<br><br> // search for bit<=
br> for (; true; x <<=3D 1) {<br> auto it =3D=
find(x.begin(), x.end(), true);<br> cout << x <=
< '\t';<br> if (it !=3D x.end())<br> =
cout << (it - x.begin()) << "\n";<br> &=
nbsp; else<br> cout << "not found\n";<b=
r><br> if (x.none())<br> br=
eak;<br> }<br> <br> return 0;<br>}<br>///////////////////=
//////////////////////////////////////////////</span></font><br><br>Regards=
,<br>John<br></div></div>
<p></p>
-- <br />
<br />
--- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
Visit this group at <a href=3D"http://groups.google.com/a/isocpp.org/group/=
std-proposals/">http://groups.google.com/a/isocpp.org/group/std-proposals/<=
/a>.<br />
------=_Part_3098_1166751939.1422144642680--
------=_Part_3097_754569982.1422144642680--
.
Author: Howard Hinnant <howard.hinnant@gmail.com>
Date: Sat, 24 Jan 2015 19:22:35 -0500
Raw View
On Jan 24, 2015, at 7:10 PM, jononanon@googlemail.com wrote:
>=20
>=20
> On Saturday, January 24, 2015 at 10:10:19 PM UTC+1, Howard Hinnant wrote:
> On Jan 24, 2015, at 2:31 PM, jono...@googlemail.com wrote:=20
>=20
> > Hi there,=20
> >=20
> > I'd love to see a quick and efficient std::bitset search function: Exam=
ple: Search for bitstate "true", starting from bit-index 0.=20
> >=20
> > It could have an interface like this, maby:=20
> >=20
> > template<size_t N>=20
> > size_t std::bitset<N>::find(size_t idx, bool bitstate=3Dtrue);=20
> >=20
> >=20
> > The effect of bs.find(idx, bitstate): =20
> > we search from idx (usually from 0, to start from the rightmost lsb bit=
), until we find the first index from which:=20
> > bs[idx] =3D=3D bitstate=20
> > If no such index is found, return idx =3D std::bitset::npos (e.g. e=
qual to numeric_limits<size_t>::max())=20
> >=20
> > Why?=20
> > Well some datastructures can profit from using bitset and providing and=
efficient search for a particular bitstate.=20
> > I'm particularly thinking of a memory pool here: Mark the mem-objects t=
hat are in use, in a bitset.=20
> > Then use something like this, to iterate and find the appropriate bit (=
e.g. in the destructor of the pool):=20
> >=20
> > for (size_t idx =3D 0; (idx =3D bs.find(idx)) !=3D bitset::npos; ++idx)=
{=20
> > =20
> > /* ... example: release resource idx */=20
> >=20
> > // unmark that bit=20
> > bs.reset(idx);=20
> > }=20
> >=20
> >=20
> > The most important point is this: we can search incredibly fast in a bi=
tset, while having a nice compressed representation.=20
> > In the current std::bitset, all we have is a compressed representation =
(if the library-designers decided to use 8 bits in a byte); but NO good sea=
rch possibility.=20
> > And that's a pity.=20
> >=20
> > Here's my example of an efficient search . This is just an example for =
a speedy search, for library designers who are interested. But yes: this ex=
ample (just an example!) abuses a bitset, by assuming that the internal bit=
-representation is in the first member, and therefore only plays with libra=
ries such as libstdc++, which do indeed put the bit-representation in the f=
irst member (link). But such a search really needs to be part of the bitset=
class, to hide complexities=E2=80=A6=20
>=20
> I very much like the functionality you describe here. But what if we gav=
e bitset iterators, and then made the interface look like:=20
>=20
> template <size_t N>=20
> bitset<N>::const_iterator=20
> find(bitset<N>::const_iterator first, bitset<N>::const_iterator last,=
bool value);=20
>=20
> This would mimic in syntax the generic function:=20
>=20
> template <class InputIterator, class T>=20
> InputIterator=20
> find(InputIterator first, InputIterator last, const T& value);=20
>=20
> Then generic code could search for things, be they bits or whatever, with=
the same syntax and maximal efficiency for each type.=20
>=20
> A detail here is that bitset<N>::const_iterator would have to be an alias=
for an iterator defined at namespace scope, as the N is not in a deducible=
context the way I=E2=80=99ve written it. I /believe/ that such an iterato=
r could be designed to be independent of N. I also believe that the algori=
thm with the iterator interface would could still look similar to your exam=
ple efficient search.=20
>=20
> Here is an example implementation of it with the iterator interface:=20
>=20
> http://llvm.org/svn/llvm-project/libcxx/trunk/include/__bit_reference=20
>=20
> The __bit_iterator at this site is not necessarily the best design of bit=
set<N>::iterator. But it illustrates the concept.=20
>=20
> Here is some performance data on these algorithms:=20
>=20
> http://howardhinnant.github.io/onvectorbool.html=20
>=20
> Howard=20
>=20
>=20
> Hi Howard,
>=20
> thanks for your reply.
>=20
> You're definitely right. Using find and iterators is the way to do this. =
The C++ way!
> I've given it a go...
>=20
> As a very basic proof of concept. ;)
>=20
> (Please realize: I'm probably way out of league here... This is how one c=
odes after just having almost finished a dedicated reading of Stroustup's r=
ather 'basic' "Principles and Practice" book... and having done most of th=
e book exercises.
> In particular: I'm wondering about const_iterator (it's not really mentio=
ned in "Principles and Practice"), so any recommendations or tips will be a=
ppreciated.)
>=20
> C++ wizards will certainly be able to improve this!
>=20
> /////////////////////////////////////////////////////////////////
> #include <iostream>
> #include <bitset>
> #include <climits>
> #include <cassert>
>=20
> using namespace std;
>=20
> int find_lowest_idx_set_in_ulong(unsigned long val);
>=20
>=20
>=20
>=20
> struct Bititerator {
> Bititerator(unsigned long *mem1, size_t quot1, size_t rem1) : mem{mem1}=
, quot{quot1}, rem{rem1} {}
>=20
> constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BI=
T;
> =20
> Bititerator& operator++()
> {
> ++rem;
> if (rem =3D=3D num_bits_ulong) {
> ++quot;
> rem =3D 0;
> }
> return *this;
> }
>=20
> Bititerator& operator--()
> {
> if (rem =3D=3D 0) {
> --quot;
> rem =3D num_bits_ulong-1;
> } else {
> --rem;
> }
> return *this;
> }
>=20
> =20
> bool operator=3D=3D(const Bititerator& it) const {
> return ((mem =3D=3D it.mem) &&
> (quot =3D=3D it.quot) &&
> (rem =3D=3D it.rem));
> }
>=20
> bool operator!=3D(const Bititerator& it) const {
> return !(*this =3D=3D it);
> }
>=20
> bool operator*() { return mem[quot] & (1UL << rem); }
> =20
> friend Bititerator find(Bititerator first, Bititerator last, const bool=
bitstate);
>=20
> friend int operator-(const Bititerator& a, const Bititerator& b);
> =20
> private:
> // the current bit is (mem[quot] & (1UL << rem)), which has index (quot=
* (num_bits_ulong) + rem)
> =20
> unsigned long *mem;
> size_t quot;
> size_t rem;
> };
>=20
>=20
> Bititerator find(Bititerator first, Bititerator last, const bool bitstate=
)
> /* bitstate not yet taken into account.=20
> This algorithm (currently) always takes bitstate =3D=3D true, i.e only=
search for set true bits
>=20
> Just a demonstration...
> Todo: handle bitstate...
> */
> {
> if (first =3D=3D last)
> return last;
> =20
> using ulong =3D unsigned long;
>=20
> size_t lim_quot;
> size_t lim_rem;
> if (last.rem > 0) {
> lim_quot =3D last.quot;
> lim_rem =3D last.rem-1;
> } else {
> lim_quot =3D last.quot-1;
> lim_rem =3D Bititerator::num_bits_ulong-1;
> }
>=20
> ulong val =3D first.mem[first.quot] & (-1UL << first.rem);
>=20
> if (first.quot < lim_quot) {
> if (val) {
> first.rem =3D find_lowest_idx_set_in_ulong(val);
> return first;
> }
> =20
> while (++first.quot < lim_quot) {
> if (first.mem[first.quot]) {
> first.rem =3D find_lowest_idx_set_in_ulong(val);
> return first;
> }
> }
>=20
> val =3D first.mem[first.quot];
> }
>=20
> val &=3D (-1UL >> (Bititerator::num_bits_ulong - lim_rem - 1));
> if (val) {
> first.rem =3D find_lowest_idx_set_in_ulong(val);
> return first;
> }
>=20
> return last; // not found
> }
>=20
>=20
> int operator-(const Bititerator& a, const Bititerator& b)
> {
> return ((a.quot * (Bititerator::num_bits_ulong) + a.rem) -
> (b.quot * (Bititerator::num_bits_ulong) + b.rem));
> }
>=20
>=20
>=20
>=20
> #if (ULONG_MAX =3D=3D 4294967295UL)
>=20
> // unsigned long has 32 bits
> static_assert(sizeof(unsigned long) =3D=3D 4, "unsigned long must have 4 =
bytes");
>=20
> static const int MultiplyDeBruijnBitPosition[32] =3D
> {
> 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
> 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
> };
>=20
> #else
> #if (ULONG_MAX =3D=3D 18446744073709551615UL)
>=20
> // unsigned long has 64 bits
> static_assert(sizeof(unsigned long) =3D=3D 8, "unsigned long must have 8 =
bytes");
>=20
> static const int MultiplyDeBruijnBitPosition[64] =3D
> {
> 0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,
> 54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,
> 46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,
> 25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63
> };
>=20
> #else
>=20
> static_assert(0, "strange machine architecture");
>=20
> #endif
> #endif
>=20
> int find_lowest_idx_set_in_ulong(unsigned long val)
> // only call this function if val does definately have a bit that is set
> {
> #if (ULONG_MAX =3D=3D 4294967295UL)
>=20
> // unsigned long has 32 bits
> =20
> // Reference: http://stackoverflow.com/a/757266 ( http://graphics.stanf=
ord.edu/~seander/bithacks.html#ZerosOnRightMultLookup )
> return MultiplyDeBruijnBitPosition[(((val & -val) * 0x077CB531UL)) >> 2=
7];
>=20
> =20
> #else
> #if (ULONG_MAX =3D=3D 18446744073709551615UL)
>=20
> // unsigned long has 64 bits
>=20
> // return __builtin_ctzl(val);
>=20
> // Reference: http://stackoverflow.com/a/24146182 ( http://chessprogram=
ming.wikispaces.com/BitScan#KimWalisch )
> return MultiplyDeBruijnBitPosition[(((val ^ (val-1)) * 0x03F79D71B4CB0A=
89UL)) >> 58];
>=20
> #endif
> #endif =20
> }
>=20
>=20
> template<size_t N> class Mybitset : public bitset<N> {
> public:
> using bitset<N>::bitset;
>=20
> constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BI=
T;
>=20
> using iterator =3D Bititerator;
> =20
> Bititerator begin()
> {
> assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsign=
ed *>(this)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment */
> return Bititerator{reinterpret_cast<unsigned long*>(this), 0, 0};
> }
> Bititerator end()
> {
> assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsign=
ed *>(this)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment */
> size_t quot =3D N / num_bits_ulong;
> return Bititerator{reinterpret_cast<unsigned long*>(this), quot, N - =
quot*num_bits_ulong};
> }
> };
>=20
>=20
> int main()
> {
> Mybitset<10> x;
> x.set(0);
> x.set(3);
>=20
> // print bitpattern of x using iterators
> if (x.begin() !=3D x.end()) {
> Mybitset<10>::iterator it =3D x.end();
> do {
> --it;
> cout << *it;
> } while (it !=3D x.begin());
> }
> cout << "\n\n";
>=20
> // search for bit
> for (; true; x <<=3D 1) {
> auto it =3D find(x.begin(), x.end(), true);
> cout << x << '\t';
> if (it !=3D x.end())
> cout << (it - x.begin()) << "\n";
> else
> cout << "not found\n";
>=20
> if (x.none())
> break;
> }
> =20
> return 0;
> }
> /////////////////////////////////////////////////////////////////
Looks like an excellent start to me John!
Howard
--=20
---=20
You received this message because you are subscribed to the Google Groups "=
ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposa=
ls/.
.
Author: David Krauss <potswa@gmail.com>
Date: Sun, 25 Jan 2015 13:27:23 +0800
Raw View
--Apple-Mail=_DCFDB34C-5071-4438-AD93-D78E0D540069
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=UTF-8
> On 2015=E2=80=9301=E2=80=9325, at 8:10 AM, jononanon@googlemail.com wrote=
:
>=20
> But what if we gave bitset iterators,
Implementations should be able to reuse std::vector<bool>::iterator. If arr=
ay and vector<bool> both have iterators, then bitset should too.
Also, referring to GCC=E2=80=99s standard library, SGI already defined bits=
et::find_first and bitset::find_next functions; these were just never stand=
ardized. (They are available in GCC as _Find_first and _Find_next.) Such fu=
nctions could be adapted into optimized std::find overloads.
This is really a job for the standard library, because intrinsic count-lead=
ing-zeroes functionality is going to be faster than any bitwise arithmetic =
tricks.
Finding a bit-string in a bit-sequence with std::search or comparing bit-st=
rings with std::equal or std::mismatch are also interesting operations with=
applications in data compression. I wonder if any library already optimize=
s these. GCC does not, so I guess it=E2=80=99s all rather niche.
--=20
---=20
You received this message because you are subscribed to the Google Groups "=
ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposa=
ls/.
--Apple-Mail=_DCFDB34C-5071-4438-AD93-D78E0D540069
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset=UTF-8
<html><head><meta http-equiv=3D"Content-Type" content=3D"text/html charset=
=3Dutf-8"></head><body style=3D"word-wrap: break-word; -webkit-nbsp-mode: s=
pace; -webkit-line-break: after-white-space;" class=3D""><br class=3D""><di=
v><blockquote type=3D"cite" class=3D""><div class=3D"">On 2015=E2=80=9301=
=E2=80=9325, at 8:10 AM, <a href=3D"mailto:jononanon@googlemail.com" class=
=3D"">jononanon@googlemail.com</a> wrote:</div><br class=3D"Apple-interchan=
ge-newline"><div class=3D""><span style=3D"font-family: Helvetica; font-siz=
e: 12px; font-style: normal; font-variant: normal; font-weight: normal; let=
ter-spacing: normal; line-height: normal; orphans: auto; text-align: start;=
text-indent: 0px; text-transform: none; white-space: normal; widows: auto;=
word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: i=
nline !important;" class=3D"">But what if we gave bitset iterators,</span><=
/div></blockquote></div><br class=3D""><div class=3D"">Implementations shou=
ld be able to reuse <font face=3D"Courier" class=3D"">std::vector<bool&g=
t;::iterator</font>. If <font face=3D"Courier" class=3D"">array</font> =
;and <font face=3D"Courier" class=3D"">vector<bool></font> both have =
iterators, then <font face=3D"Courier" class=3D"">bitset</font> should too.=
</div><div class=3D""><br class=3D""></div><div class=3D"">Also, referring =
to GCC=E2=80=99s standard library, SGI already defined <font face=3D"C=
ourier" class=3D"">bitset::find_first</font> and <span style=3D"font-f=
amily: Courier;" class=3D"">bitset::</span><font face=3D"Courier" class=3D"=
">find_next</font> functions; these were just never standardized. (They are=
available in GCC as <font face=3D"Courier" class=3D"">_Find_first</font> a=
nd <font face=3D"Courier" class=3D"">_Find_next</font>.) Such functions cou=
ld be adapted into optimized <font face=3D"Courier" class=3D"">std::find</f=
ont> overloads.</div><div class=3D""><br class=3D""></div><div class=3D"">T=
his is really a job for the standard library, because intrinsic count-leadi=
ng-zeroes functionality is going to be faster than any bitwise arithmetic t=
ricks.</div><div class=3D""><br class=3D""></div><div class=3D"">Finding a =
bit-string in a bit-sequence with <font face=3D"Courier" class=3D"">std::se=
arch</font> or comparing bit-strings with <font face=3D"Courier" class=3D""=
>std::equal</font> or <font face=3D"Courier" class=3D"">std::mism=
atch</font> are also interesting operations with applications in data compr=
ession. I wonder if any library already optimizes these. GCC does not, so I=
guess it=E2=80=99s all rather niche.</div></body></html>
<p></p>
-- <br />
<br />
--- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
Visit this group at <a href=3D"http://groups.google.com/a/isocpp.org/group/=
std-proposals/">http://groups.google.com/a/isocpp.org/group/std-proposals/<=
/a>.<br />
--Apple-Mail=_DCFDB34C-5071-4438-AD93-D78E0D540069--
.
Author: jononanon@googlemail.com
Date: Sun, 25 Jan 2015 07:34:30 -0800 (PST)
Raw View
------=_Part_190_1856221431.1422200070041
Content-Type: multipart/alternative;
boundary="----=_Part_191_1018650848.1422200070041"
------=_Part_191_1018650848.1422200070041
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
On Sunday, January 25, 2015 at 1:10:42 AM UTC+1, jono...@googlemail.com=20
wrote:
>
>
>
> On Saturday, January 24, 2015 at 10:10:19 PM UTC+1, Howard Hinnant wrote:
>>
>> On Jan 24, 2015, at 2:31 PM, jono...@googlemail.com wrote:=20
>>
>> > Hi there,=20
>> >=20
>> > I'd love to see a quick and efficient std::bitset search function:=20
>> Example: Search for bitstate "true", starting from bit-index 0.=20
>> >=20
>> > It could have an interface like this, maby:=20
>> >=20
>> > template<size_t N>=20
>> > size_t std::bitset<N>::find(size_t idx, bool bitstate=3Dtrue);=20
>> >=20
>> >=20
>> > The effect of bs.find(idx, bitstate): =20
>> > we search from idx (usually from 0, to start from the rightmost lsb=20
>> bit), until we find the first index from which:=20
>> > bs[idx] =3D=3D bitstate=20
>> > If no such index is found, return idx =3D std::bitset::npos (e.g.=
=20
>> equal to numeric_limits<size_t>::max())=20
>> >=20
>> > Why?=20
>> > Well some datastructures can profit from using bitset and providing an=
d=20
>> efficient search for a particular bitstate.=20
>> > I'm particularly thinking of a memory pool here: Mark the mem-objects=
=20
>> that are in use, in a bitset.=20
>> > Then use something like this, to iterate and find the appropriate bit=
=20
>> (e.g. in the destructor of the pool):=20
>> >=20
>> > for (size_t idx =3D 0; (idx =3D bs.find(idx)) !=3D bitset::npos; ++idx=
) {=20
>> > =20
>> > /* ... example: release resource idx */=20
>> >=20
>> > // unmark that bit=20
>> > bs.reset(idx);=20
>> > }=20
>> >=20
>> >=20
>> > The most important point is this: we can search incredibly fast in a=
=20
>> bitset, while having a nice compressed representation.=20
>> > In the current std::bitset, all we have is a compressed representation=
=20
>> (if the library-designers decided to use 8 bits in a byte); but NO good=
=20
>> search possibility.=20
>> > And that's a pity.=20
>> >=20
>> > Here's my example of an efficient search . This is just an example for=
=20
>> a speedy search, for library designers who are interested. But yes: this=
=20
>> example (just an example!) abuses a bitset, by assuming that the interna=
l=20
>> bit-representation is in the first member, and therefore only plays with=
=20
>> libraries such as libstdc++, which do indeed put the bit-representation =
in=20
>> the first member (link). But such a search really needs to be part of th=
e=20
>> bitset class, to hide complexities=E2=80=A6=20
>>
>> I very much like the functionality you describe here. But what if we=20
>> gave bitset iterators, and then made the interface look like:=20
>>
>> template <size_t N>=20
>> bitset<N>::const_iterator=20
>> find(bitset<N>::const_iterator first, bitset<N>::const_iterator last=
,=20
>> bool value);=20
>>
>> This would mimic in syntax the generic function:=20
>>
>> template <class InputIterator, class T>=20
>> InputIterator=20
>> find(InputIterator first, InputIterator last, const T& value);=20
>>
>> Then generic code could search for things, be they bits or whatever, wit=
h=20
>> the same syntax and maximal efficiency for each type.=20
>>
>> A detail here is that bitset<N>::const_iterator would have to be an alia=
s=20
>> for an iterator defined at namespace scope, as the N is not in a deducib=
le=20
>> context the way I=E2=80=99ve written it. I /believe/ that such an itera=
tor could=20
>> be designed to be independent of N. I also believe that the algorithm w=
ith=20
>> the iterator interface would could still look similar to your example=20
>> efficient search.=20
>>
>> Here is an example implementation of it with the iterator interface:=20
>>
>> http://llvm.org/svn/llvm-project/libcxx/trunk/include/__bit_reference=20
>>
>> The __bit_iterator at this site is not necessarily the best design of=20
>> bitset<N>::iterator. But it illustrates the concept.=20
>>
>> Here is some performance data on these algorithms:=20
>>
>> http://howardhinnant.github.io/onvectorbool.html=20
>>
>> Howard=20
>>
>>
> Hi Howard,
>
> thanks for your reply.
>
> You're definitely right. Using find and iterators is the way to do this.=
=20
> The C++ way!
> I've given it a go...
>
> As a very basic proof of concept. ;)
>
> (Please realize: I'm probably way out of league here... This is how one=
=20
> codes after just having almost finished a dedicated reading of Stroustup'=
s=20
> rather 'basic' "Principles and Practice" book... and having done most of=
=20
> the book exercises.
> In particular: I'm wondering about const_iterator (it's not really=20
> mentioned in "Principles and Practice"), so any recommendations or tips=
=20
> will be appreciated.)
>
> C++ wizards will certainly be able to improve this!
>
> /////////////////////////////////////////////////////////////////
> #include <iostream>
> #include <bitset>
> #include <climits>
> #include <cassert>
>
> using namespace std;
>
> int find_lowest_idx_set_in_ulong(unsigned long val);
>
>
>
>
> struct Bititerator {
> Bititerator(unsigned long *mem1, size_t quot1, size_t rem1) : mem{mem1}=
,=20
> quot{quot1}, rem{rem1} {}
>
> constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BI=
T;
> =20
> Bititerator& operator++()
> {
> ++rem;
> if (rem =3D=3D num_bits_ulong) {
> ++quot;
> rem =3D 0;
> }
> return *this;
> }
>
> Bititerator& operator--()
> {
> if (rem =3D=3D 0) {
> --quot;
> rem =3D num_bits_ulong-1;
> } else {
> --rem;
> }
> return *this;
> }
>
> =20
> bool operator=3D=3D(const Bititerator& it) const {
> return ((mem =3D=3D it.mem) &&
> (quot =3D=3D it.quot) &&
> (rem =3D=3D it.rem));
> }
>
> bool operator!=3D(const Bititerator& it) const {
> return !(*this =3D=3D it);
> }
>
> bool operator*() { return mem[quot] & (1UL << rem); }
> =20
> friend Bititerator find(Bititerator first, Bititerator last, const bool=
=20
> bitstate);
>
> friend int operator-(const Bititerator& a, const Bititerator& b);
> =20
> private:
> // the current bit is (mem[quot] & (1UL << rem)), which has index (quot=
=20
> * (num_bits_ulong) + rem)
> =20
> unsigned long *mem;
> size_t quot;
> size_t rem;
> };
>
>
> Bititerator find(Bititerator first, Bititerator last, const bool bitstate=
)
> /* bitstate not yet taken into account.=20
> This algorithm (currently) always takes bitstate =3D=3D true, i.e only=
=20
> search for set true bits
>
> Just a demonstration...
> Todo: handle bitstate...
> */
> {
> if (first =3D=3D last)
> return last;
> =20
> using ulong =3D unsigned long;
>
> size_t lim_quot;
> size_t lim_rem;
> if (last.rem > 0) {
> lim_quot =3D last.quot;
> lim_rem =3D last.rem-1;
> } else {
> lim_quot =3D last.quot-1;
> lim_rem =3D Bititerator::num_bits_ulong-1;
> }
>
> ulong val =3D first.mem[first.quot] & (-1UL << first.rem);
>
> if (first.quot < lim_quot) {
> if (val) {
> first.rem =3D find_lowest_idx_set_in_ulong(val);
> return first;
> }
> =20
> while (++first.quot < lim_quot) {
> if (first.mem[first.quot]) {
> first.rem =3D find_lowest_idx_set_in_ulong(val);
> return first;
> }
> }
>
> val =3D first.mem[first.quot];
> }
>
> val &=3D (-1UL >> (Bititerator::num_bits_ulong - lim_rem - 1));
> if (val) {
> first.rem =3D find_lowest_idx_set_in_ulong(val);
> return first;
> }
>
> return last; // not found
> }
>
>
> int operator-(const Bititerator& a, const Bititerator& b)
> {
> return ((a.quot * (Bititerator::num_bits_ulong) + a.rem) -
> (b.quot * (Bititerator::num_bits_ulong) + b.rem));
> }
>
>
>
>
> #if (ULONG_MAX =3D=3D 4294967295UL)
>
> // unsigned long has 32 bits
> static_assert(sizeof(unsigned long) =3D=3D 4, "unsigned long must have 4=
=20
> bytes");
>
> static const int MultiplyDeBruijnBitPosition[32] =3D
> {
> 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
> 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
> };
>
> #else
> #if (ULONG_MAX =3D=3D 18446744073709551615UL)
>
> // unsigned long has 64 bits
> static_assert(sizeof(unsigned long) =3D=3D 8, "unsigned long must have 8=
=20
> bytes");
>
> static const int MultiplyDeBruijnBitPosition[64] =3D
> {
> 0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,
> 54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,
> 46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,
> 25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63
> };
>
> #else
>
> static_assert(0, "strange machine architecture");
>
> #endif
> #endif
>
> int find_lowest_idx_set_in_ulong(unsigned long val)
> // only call this function if val does definately have a bit that is set
> {
> #if (ULONG_MAX =3D=3D 4294967295UL)
>
> // unsigned long has 32 bits
> =20
> // Reference: http://stackoverflow.com/a/757266 (=20
> http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLooku=
p=20
> )
> return MultiplyDeBruijnBitPosition[(((val & -val) * 0x077CB531UL)) >>=
=20
> 27];
>
> =20
> #else
> #if (ULONG_MAX =3D=3D 18446744073709551615UL)
>
> // unsigned long has 64 bits
>
> // return __builtin_ctzl(val);
>
> // Reference: http://stackoverflow.com/a/24146182 (=20
> http://chessprogramming.wikispaces.com/BitScan#KimWalisch )
> return MultiplyDeBruijnBitPosition[(((val ^ (val-1)) *=20
> 0x03F79D71B4CB0A89UL)) >> 58];
>
> #endif
> #endif =20
> }
>
>
> template<size_t N> class Mybitset : public bitset<N> {
> public:
> using bitset<N>::bitset;
>
> constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BI=
T;
>
> using iterator =3D Bititerator;
> =20
> Bititerator begin()
> {
> assert((reinterpret_cast<unsigned long>(reinterpret_cast<const=20
> unsigned *>(this)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment=
*/
> return Bititerator{reinterpret_cast<unsigned long*>(this), 0, 0};
> }
> Bititerator end()
> {
> assert((reinterpret_cast<unsigned long>(reinterpret_cast<const=20
> unsigned *>(this)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment=
*/
> size_t quot =3D N / num_bits_ulong;
> return Bititerator{reinterpret_cast<unsigned long*>(this), quot, N -=
=20
> quot*num_bits_ulong};
> }
> };
>
>
> int main()
> {
> Mybitset<10> x;
> x.set(0);
> x.set(3);
>
> // print bitpattern of x using iterators
> if (x.begin() !=3D x.end()) {
> Mybitset<10>::iterator it =3D x.end();
> do {
> --it;
> cout << *it;
> } while (it !=3D x.begin());
> }
> cout << "\n\n";
>
> // search for bit
> for (; true; x <<=3D 1) {
> auto it =3D find(x.begin(), x.end(), true);
> cout << x << '\t';
> if (it !=3D x.end())
> cout << (it - x.begin()) << "\n";
> else
> cout << "not found\n";
>
> if (x.none())
> break;
> }
> =20
> return 0;
> }
> /////////////////////////////////////////////////////////////////
>
> Regards,
> John
>
Here's an improvement of the above code:
In find, I got rid of lim_quot and lim_rem (which was used above)
What else has changed?
Bititerator find(Bititerator first, Bititerator last, const bool bitstate)
now calls either
inline Bititerator find_1(Bititerator first, Bititerator last)
(this one is superfast!!! On my machine MultiplyDeBruijnBitPosition=20
actually seemed slightly faster than __builtin_ctzl(val))
or=20
inline Bititerator find_0(Bititerator first, Bititerator last)
(this one basically inverts words and then uses the same search as find_1=
=20
uses on words. this can probably be improved: any fast "count trailing=20
ones" available??)
/////////////////////////////////////////////////////////////////
#include <iostream>
#include <bitset>
#include <climits> /* CHAR_BIT */
#include <cassert>
#include <cstddef> /* ptrdiff_t */
using namespace std;
int find_lowest_idx_set_in_ulong(unsigned long val);
struct Bititerator {
Bititerator(unsigned long *mem1, size_t quot1, size_t rem1) : mem{mem1},=
=20
quot{quot1}, rem{rem1} {}
constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BIT;
=20
Bititerator& operator++()
{
++rem;
if (rem =3D=3D num_bits_ulong) {
++quot;
rem =3D 0;
}
return *this;
}
Bititerator& operator--()
{
if (rem =3D=3D 0) {
--quot;
rem =3D num_bits_ulong-1;
} else {
--rem;
}
return *this;
}
=20
bool operator=3D=3D(const Bititerator& it) const {
return ((mem =3D=3D it.mem) &&
(quot =3D=3D it.quot) &&
(rem =3D=3D it.rem));
}
bool operator!=3D(const Bititerator& it) const {
return !(*this =3D=3D it);
}
bool operator*() { return mem[quot] & (1UL << rem); }
=20
friend inline Bititerator find_1(Bititerator first, Bititerator last);
friend inline Bititerator find_0(Bititerator first, Bititerator last);
friend ptrdiff_t operator-(const Bititerator& a, const Bititerator& b);
=20
private:
// the current bit is (mem[quot] & (1UL << rem)), which has index (quot *=
=20
(num_bits_ulong) + rem)
=20
unsigned long *mem;
size_t quot;
size_t rem;
};
inline Bititerator find_1(Bititerator first, Bititerator last)
{
/*
// optional, so gain efficiency by leaving this
if (first =3D=3D last)
return last;
*/
using ulong =3D unsigned long;
ulong val =3D first.mem[first.quot] & (-1UL << first.rem);
if (first.quot < last.quot) {
if (val) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
=20
while (++first.quot < last.quot) {
if (first.mem[first.quot]) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
}
val =3D first.mem[first.quot];
}
if (last.rem) { /* avoid last.rem =3D=3D 0,=20
otherwise (Bititerator::num_bits_ulong - last.rem) >=
=3D=20
width of type
and the right-shift will then misbehave */
val &=3D (-1UL >> (Bititerator::num_bits_ulong - last.rem));
if (val) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
}
=20
return last; // not found
}
inline Bititerator find_0(Bititerator first, Bititerator last)
{
/*
// optional, so gain efficiency by leaving this
if (first =3D=3D last)
return last;
*/
using ulong =3D unsigned long;
ulong val =3D first.mem[first.quot] & (-1UL << first.rem);
constexpr ulong fail =3D -1UL; // fail with all 1's -- we're looking for=
=20
0's here
=20
if (first.quot < last.quot) {
val =3D ~val;
if (val) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
=20
while (++first.quot < last.quot) {
val =3D ~first.mem[first.quot];
if (val) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
}
val =3D first.mem[first.quot];
}
if (last.rem) { /* avoid last.rem =3D=3D 0,=20
otherwise (Bititerator::num_bits_ulong - last.rem) >=
=3D=20
width of type
and the right-shift will then misbehave */
val &=3D (-1UL >> (Bititerator::num_bits_ulong - last.rem));
val =3D ~val;
if (val) {
first.rem =3D find_lowest_idx_set_in_ulong(val);
return first;
}
}
=20
return last; // not found
}
Bititerator find(Bititerator first, Bititerator last, const bool bitstate)
{
if (bitstate)
return find_1(first, last);
return find_0(first, last);
}
ptrdiff_t operator-(const Bititerator& a, const Bititerator& b)
{
return ((a.quot * (Bititerator::num_bits_ulong) + a.rem) -
(b.quot * (Bititerator::num_bits_ulong) + b.rem));
}
#if (ULONG_MAX =3D=3D 4294967295UL)
// unsigned long has 32 bits
static_assert(sizeof(unsigned long) =3D=3D 4, "unsigned long must have 4=20
bytes");
static const int MultiplyDeBruijnBitPosition[32] =3D
{
0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};
#else
#if (ULONG_MAX =3D=3D 18446744073709551615UL)
// unsigned long has 64 bits
static_assert(sizeof(unsigned long) =3D=3D 8, "unsigned long must have 8=20
bytes");
static const int MultiplyDeBruijnBitPosition[64] =3D
{
0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,
54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,
46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,
25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63
};
#else
static_assert(0, "strange machine architecture");
#endif
#endif
inline int find_lowest_idx_set_in_ulong(unsigned long val)
// only call this function if val does definately have a bit that is set
{
#if (ULONG_MAX =3D=3D 4294967295UL)
// unsigned long has 32 bits
#ifdef __GNUC__
return __builtin_ctz(val);
#else
// Reference: http://stackoverflow.com/a/757266 (=20
http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup =
)
return MultiplyDeBruijnBitPosition[(((val & -val) * 0x077CB531UL)) >> 27]=
;
#endif
=20
#else
#if (ULONG_MAX =3D=3D 18446744073709551615UL)
// unsigned long has 64 bits
#ifdef __GNUC__
return __builtin_ctzl(val);
#else
// Reference: http://stackoverflow.com/a/24146182 (=20
http://chessprogramming.wikispaces.com/BitScan#KimWalisch )
return MultiplyDeBruijnBitPosition[(((val ^ (val-1)) *=20
0x03F79D71B4CB0A89UL)) >> 58];
#endif
=20
#endif
#endif =20
}
template<size_t N> class Mybitset : public bitset<N> {
public:
using bitset<N>::bitset;
constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BIT;
using iterator =3D Bititerator;
=20
Bititerator begin()
{
assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsigned=
=20
*>(this)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment */
return Bititerator{reinterpret_cast<unsigned long*>(this), 0, 0};
}
Bititerator end()
{
assert((reinterpret_cast<unsigned long>(reinterpret_cast<const unsigned=
=20
*>(this)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment */
size_t quot =3D N / num_bits_ulong;
return Bititerator{reinterpret_cast<unsigned long*>(this), quot, N -=20
quot*num_bits_ulong};
}
};
int main()
{
Mybitset<50> x;
x.set(0);
x.set(3);
// print bitpattern of x using iterators
if (x.begin() !=3D x.end()) {
Mybitset<10>::iterator it =3D x.end();
do {
--it;
cout << *it;
} while (it !=3D x.begin());
}
cout << "\n\n";
// search for 1-bit
cout << "search for 1\n"
<< "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\n";
auto it_end =3D x.end(); // --(--(x.end()));
for (; true; x <<=3D 1) {
auto it =3D find(x.begin(), it_end, true);
cout << x << '\t';
if (it !=3D it_end)
cout << (it - x.begin()) << "\n";
else
cout << "not found\n";
if (x.none())
break;
}
// search for 0-bit
cout << "\n"
<< "search for 0\n"
<< "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\n";
x.reset();
x.flip();
it_end =3D x.end(); // --(--(x.end()));
for (; true; x >>=3D 1) {
auto it =3D find(x.begin(), it_end, false);
cout << x << '\t';
if (it !=3D it_end)
cout << (it - x.begin()) << "\n";
else
cout << "not found\n";
if (x.none())
break;
}
constexpr unsigned num =3D 100000;
cout << "\n"
<< "searching " << num+1 << " times through a Mybitset<" << num <<=
=20
"> with shifting bits\n"
<< "(please wait for a few minor seconds...)\n";
Mybitset<100000> y;
y.set(0);
for (; true ; y <<=3D1 ) {
auto it =3D find(y.begin(), y.end(), true);
/*
if (it !=3D y.end()) {
cout << (it - y.begin()) << " ";
}
else {
cout << "not found\n";
}
*/
if (y.none())
break;
}
return 0;
}
/////////////////////////////////////////////////////////////////
=20
--=20
---=20
You received this message because you are subscribed to the Google Groups "=
ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposa=
ls/.
------=_Part_191_1018650848.1422200070041
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<div dir=3D"ltr"><br><br>On Sunday, January 25, 2015 at 1:10:42 AM UTC+1, j=
ono...@googlemail.com wrote:<blockquote class=3D"gmail_quote" style=3D"marg=
in: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;"><d=
iv dir=3D"ltr"><br><br>On Saturday, January 24, 2015 at 10:10:19 PM UTC+1, =
Howard Hinnant wrote:<blockquote class=3D"gmail_quote" style=3D"margin:0;ma=
rgin-left:0.8ex;border-left:1px #ccc solid;padding-left:1ex">On Jan 24, 201=
5, at 2:31 PM, <a rel=3D"nofollow">jono...@googlemail.com</a> wrote:
<br>
<br>> Hi there,
<br>>=20
<br>> I'd love to see a quick and efficient std::bitset search function:=
Example: Search for bitstate "true", starting from bit-index 0.
<br>>=20
<br>> It could have an interface like this, maby:
<br>>=20
<br>> template<size_t N>
<br>> size_t std::bitset<N>::find(size_t idx, bool bitstate=3Dtrue=
);
<br>>=20
<br>>=20
<br>> The effect of bs.find(idx, bitstate):
<br>> we search from idx (usually from 0, to start from the rightmost ls=
b bit), until we find the first index from which:
<br>> bs[idx] =3D=3D bitstate
<br>> If no such index is found, return idx =3D std::bitset::npos =
(e.g. equal to numeric_limits<size_t>::max())
<br>>=20
<br>> Why?
<br>> Well some datastructures can profit from using bitset and providin=
g and efficient search for a particular bitstate.=20
<br>> I'm particularly thinking of a memory pool here: Mark the mem-obje=
cts that are in use, in a bitset.
<br>> Then use something like this, to iterate and find the appropriate =
bit (e.g. in the destructor of the pool):
<br>>=20
<br>> for (size_t idx =3D 0; (idx =3D bs.find(idx)) !=3D bitset::npos; +=
+idx) {
<br>> =20
<br>> /* ... example: release resource idx */
<br>>=20
<br>> // unmark that bit
<br>> bs.reset(idx);
<br>> }
<br>>=20
<br>>=20
<br>> The most important point is this: we can search incredibly fast in=
a bitset, while having a nice compressed representation.
<br>> In the current std::bitset, all we have is a compressed representa=
tion (if the library-designers decided to use 8 bits in a byte); but NO goo=
d search possibility.
<br>> And that's a pity.
<br>>=20
<br>> Here's my example of an efficient search . This is just an example=
for a speedy search, for library designers who are interested. But yes: th=
is example (just an example!) abuses a bitset, by assuming that the interna=
l bit-representation is in the first member, and therefore only plays with =
libraries such as libstdc++, which do indeed put the bit-representation in =
the first member (link). But such a search really needs to be part of the b=
itset class, to hide complexities=E2=80=A6
<br>
<br>I very much like the functionality you describe here. But what if=
we gave bitset iterators, and then made the interface look like:
<br>
<br> template <size_t N>
<br> bitset<N>::const_iterator
<br> find(bitset<N>::const_iterator first, bitset<N&g=
t;::const_iterator last, bool value);
<br>
<br>This would mimic in syntax the generic function:
<br>
<br> template <class InputIterator, class T>
<br> InputIterator
<br> find(InputIterator first, InputIterator last, const T&=
; value);
<br>
<br>Then generic code could search for things, be they bits or whatever, wi=
th the same syntax and maximal efficiency for each type.
<br>
<br>A detail here is that bitset<N>::const_iterator would have to be =
an alias for an iterator defined at namespace scope, as the N is not in a d=
educible context the way I=E2=80=99ve written it. I /believe/ that su=
ch an iterator could be designed to be independent of N. I also belie=
ve that the algorithm with the iterator interface would could still look si=
milar to your example efficient search.
<br>
<br>Here is an example implementation of it with the iterator interface:
<br>
<br><a href=3D"http://llvm.org/svn/llvm-project/libcxx/trunk/include/__bit_=
reference" rel=3D"nofollow" target=3D"_blank" onmousedown=3D"this.href=3D'h=
ttp://www.google.com/url?q\75http%3A%2F%2Fllvm.org%2Fsvn%2Fllvm-project%2Fl=
ibcxx%2Ftrunk%2Finclude%2F__bit_reference\46sa\75D\46sntz\0751\46usg\75AFQj=
CNHHyzNP1SYzrii1LzQGRLVigs5TKQ';return true;" onclick=3D"this.href=3D'http:=
//www.google.com/url?q\75http%3A%2F%2Fllvm.org%2Fsvn%2Fllvm-project%2Flibcx=
x%2Ftrunk%2Finclude%2F__bit_reference\46sa\75D\46sntz\0751\46usg\75AFQjCNHH=
yzNP1SYzrii1LzQGRLVigs5TKQ';return true;">http://llvm.org/svn/llvm-<wbr>pro=
ject/libcxx/trunk/include/_<wbr>_bit_reference</a>
<br>
<br>The __bit_iterator at this site is not necessarily the best design of b=
itset<N>::iterator. But it illustrates the concept.
<br>
<br>Here is some performance data on these algorithms:
<br>
<br><a href=3D"http://howardhinnant.github.io/onvectorbool.html" rel=3D"nof=
ollow" target=3D"_blank" onmousedown=3D"this.href=3D'http://www.google.com/=
url?q\75http%3A%2F%2Fhowardhinnant.github.io%2Fonvectorbool.html\46sa\75D\4=
6sntz\0751\46usg\75AFQjCNFoTLKbhxJd9zvsOcdnpr-HKXoRzw';return true;" onclic=
k=3D"this.href=3D'http://www.google.com/url?q\75http%3A%2F%2Fhowardhinnant.=
github.io%2Fonvectorbool.html\46sa\75D\46sntz\0751\46usg\75AFQjCNFoTLKbhxJd=
9zvsOcdnpr-HKXoRzw';return true;">http://howardhinnant.github.<wbr>io/onvec=
torbool.html</a>
<br>
<br>Howard
<br>
<br></blockquote><div><br>Hi Howard,<br><br>thanks for your reply.<br><br>Y=
ou're definitely right. Using find and iterators is the way to do this. The=
C++ way!<br>I've given it a go...<br><br>As a very basic proof of concept.=
;)<br><br>(Please realize: I'm probably way out of league here... This is =
how one codes after just having almost finished a dedicated reading of Stro=
ustup's rather 'basic' "Principles and Practice" book... and having d=
one most of the book exercises.<br>In particular: I'm wondering about <span=
style=3D"font-family:courier new,monospace">const_iterator</span> <font si=
ze=3D"1">(it's not really mentioned in "Principles and Practice")</font>, s=
o any recommendations or tips will be appreciated.)<br><br>C++ wizards will=
certainly be able to improve this!<br><br><font size=3D"1"><span style=3D"=
font-family:courier new,monospace">//////////////////////////////<wbr>/////=
/////////////////////////<wbr>/////<br>#include <iostream><br>#includ=
e <bitset><br>#include <climits><br>#include <cassert><br=
><br>using namespace std;<br><br>int find_lowest_idx_set_in_ulong(<wbr>unsi=
gned long val);<br><br><br><br><br>struct Bititerator {<br> Bititerat=
or(unsigned long *mem1, size_t quot1, size_t rem1) : mem{mem1}, quot{quot1}=
, rem{rem1} {}<br><br> constexpr static int num_bits_ulong =3D sizeof=
(unsigned long) * CHAR_BIT;<br> <br> Bititerator& operator+=
+()<br> {<br> ++rem;<br> if (rem =
=3D=3D num_bits_ulong) {<br> ++quot;<br> =
; rem =3D 0;<br> }<br>  =
; return *this;<br> }<br><br> Bititerator& operator--=
()<br> {<br> if (rem =3D=3D 0) {<br> &nb=
sp; --quot;<br> rem =3D num_bits_=
ulong-1;<br> } else {<br> -=
-rem;<br> }<br> return *this;<br> =
}<br><br> <br> bool operator=3D=3D(const Bititerator& it) =
const {<br> return ((mem =3D=3D it.mem) &=
&<br> =
(quot =3D=3D it.quot) &&<br> &n=
bsp; (rem =3D=3D it.rem));<br> }<br><br=
> bool operator!=3D(const Bititerator& it) const {<br>  =
; return !(*this =3D=3D it);<br> }<br><br> bool operator*=
() { return mem[quot] & (1UL << rem); }<br> <br> frie=
nd Bititerator find(Bititerator first, Bititerator last, const bool bitstat=
e);<br><br> friend int operator-(const Bititerator& a, const Biti=
terator& b);<br> <br>private:<br> // the curren=
t bit is (mem[quot] & (1UL << rem)), which has index (quot * (num=
_bits_ulong) + rem)<br> <br> unsigned long *mem;<br> size=
_t quot;<br> size_t rem;<br>};<br><br><br>Bititerator find(Bititerato=
r first, Bititerator last, const bool bitstate)<br>/* bitstate not yet take=
n into account. <br> This algorithm (currently) always takes bi=
tstate =3D=3D true, i.e only search for set true bits<br><br> J=
ust a demonstration...<br> Todo: handle bitstate...<br> */=
<br>{<br> if (first =3D=3D last)<br> return last;<b=
r> <br> using ulong =3D unsigned long;<br><br> size_t lim=
_quot;<br> size_t lim_rem;<br> if (last.rem > 0) {<br> =
lim_quot =3D last.quot;<br> lim_rem =
=3D last.rem-1;<br> } else {<br> lim_quot =3D last.=
quot-1;<br> lim_rem =3D Bititerator::num_bits_ulong=
-1;<br> }<br><br> ulong val =3D first.mem[first.quot] & (-1=
UL << first.rem);<br><br> if (first.quot < lim_quot) {<br>&n=
bsp; if (val) {<br> first.rem =3D=
find_lowest_idx_set_in_ulong(<wbr>val);<br> =
return first;<br> }<br> <br> whil=
e (++first.quot < lim_quot) {<br> if (firs=
t.mem[first.quot]) {<br> first.re=
m =3D find_lowest_idx_set_in_ulong(<wbr>val);<br> &n=
bsp; return first;<br> }<br> =
; }<br><br> val =3D first.mem[first.quot];<br=
> }<br><br> val &=3D (-1UL >> (Bititerator::num_bits_=
ulong - lim_rem - 1));<br> if (val) {<br> first.rem=
=3D find_lowest_idx_set_in_ulong(<wbr>val);<br> return f=
irst;<br> }<br><br> return last; // not found<br>}<br><br><br>i=
nt operator-(const Bititerator& a, const Bititerator& b)<br>{<br>&n=
bsp; return ((a.quot * (Bititerator::num_bits_ulong) + a.rem) -<br> &n=
bsp; (b.quot * (Bititerator::num_=
bits_ulong) + b.rem));<br>}<br><br><br><br><br>#if (ULONG_MAX =3D=3D 429496=
7295UL)<br><br>// unsigned long has 32 bits<br>static_assert(sizeof(unsigne=
d long) =3D=3D 4, "unsigned long must have 4 bytes");<br><br>static const i=
nt MultiplyDeBruijnBitPosition[<wbr>32] =3D<br> {<br> &nbs=
p; 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,<br>  =
; 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9<br> =
; };<br><br>#else<br>#if (ULONG_MAX =3D=3D 18446744073709551615UL)<br><br>/=
/ unsigned long has 64 bits<br>static_assert(sizeof(unsigned long) =3D=3D 8=
, "unsigned long must have 8 bytes");<br><br>static const int MultiplyDeBru=
ijnBitPosition[<wbr>64] =3D<br> {<br> 0, 47, 1, 56,=
48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,<br> 54, 58=
, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,<br> &nb=
sp; 46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,<br>&nbs=
p; 25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63<br=
> };<br><br>#else<br><br>static_assert(0, "strange machine architectu=
re");<br><br>#endif<br>#endif<br><br>int find_lowest_idx_set_in_ulong(<wbr>=
unsigned long val)<br>// only call this function if val does definately hav=
e a bit that is set<br>{<br>#if (ULONG_MAX =3D=3D 4294967295UL)<br><br>&nbs=
p; // unsigned long has 32 bits<br> <br> // Reference: <a href=
=3D"http://stackoverflow.com/a/757266" target=3D"_blank" rel=3D"nofollow" o=
nmousedown=3D"this.href=3D'http://www.google.com/url?q\75http%3A%2F%2Fstack=
overflow.com%2Fa%2F757266\46sa\75D\46sntz\0751\46usg\75AFQjCNGb9lzFui8ZaEEu=
rbqY1wocW7ch5g';return true;" onclick=3D"this.href=3D'http://www.google.com=
/url?q\75http%3A%2F%2Fstackoverflow.com%2Fa%2F757266\46sa\75D\46sntz\0751\4=
6usg\75AFQjCNGb9lzFui8ZaEEurbqY1wocW7ch5g';return true;">http://stackoverfl=
ow.com/a/<wbr>757266</a> ( <a href=3D"http://graphics.stanford.edu/~seander=
/bithacks.html#ZerosOnRightMultLookup" target=3D"_blank" rel=3D"nofollow" o=
nmousedown=3D"this.href=3D'http://www.google.com/url?q\75http%3A%2F%2Fgraph=
ics.stanford.edu%2F~seander%2Fbithacks.html%23ZerosOnRightMultLookup\46sa\7=
5D\46sntz\0751\46usg\75AFQjCNGhoVqBIIAAElYNggMh4MJHg9CUEw';return true;" on=
click=3D"this.href=3D'http://www.google.com/url?q\75http%3A%2F%2Fgraphics.s=
tanford.edu%2F~seander%2Fbithacks.html%23ZerosOnRightMultLookup\46sa\75D\46=
sntz\0751\46usg\75AFQjCNGhoVqBIIAAElYNggMh4MJHg9CUEw';return true;">http://=
graphics.stanford.edu/~<wbr>seander/bithacks.html#<wbr>ZerosOnRightMultLook=
up</a> )<br> return MultiplyDeBruijnBitPosition[((<wbr>(val & -va=
l) * 0x077CB531UL)) >> 27];<br><br> <br>#else<br>#if (ULONG_MAX=
=3D=3D 18446744073709551615UL)<br><br> // unsigned long has 64 bits<=
br><br> // return __builtin_ctzl(val);<br><br> // Reference: <a=
href=3D"http://stackoverflow.com/a/24146182" target=3D"_blank" rel=3D"nofo=
llow" onmousedown=3D"this.href=3D'http://www.google.com/url?q\75http%3A%2F%=
2Fstackoverflow.com%2Fa%2F24146182\46sa\75D\46sntz\0751\46usg\75AFQjCNGRZJG=
tQmRNgP-aKfhvSmoQUd8lyw';return true;" onclick=3D"this.href=3D'http://www.g=
oogle.com/url?q\75http%3A%2F%2Fstackoverflow.com%2Fa%2F24146182\46sa\75D\46=
sntz\0751\46usg\75AFQjCNGRZJGtQmRNgP-aKfhvSmoQUd8lyw';return true;">http://=
stackoverflow.com/a/<wbr>24146182</a> ( <a href=3D"http://chessprogramming.=
wikispaces.com/BitScan#KimWalisch" target=3D"_blank" rel=3D"nofollow" onmou=
sedown=3D"this.href=3D'http://www.google.com/url?q\75http%3A%2F%2Fchessprog=
ramming.wikispaces.com%2FBitScan%23KimWalisch\46sa\75D\46sntz\0751\46usg\75=
AFQjCNEE7YekpiEBqy7ieacMcc8-k1xyiw';return true;" onclick=3D"this.href=3D'h=
ttp://www.google.com/url?q\75http%3A%2F%2Fchessprogramming.wikispaces.com%2=
FBitScan%23KimWalisch\46sa\75D\46sntz\0751\46usg\75AFQjCNEE7YekpiEBqy7ieacM=
cc8-k1xyiw';return true;">http://chessprogramming.<wbr>wikispaces.com/BitSc=
an#<wbr>KimWalisch</a> )<br> return MultiplyDeBruijnBitPosition[((<wb=
r>(val ^ (val-1)) * 0x03F79D71B4CB0A89UL)) >> 58];<br><br>#endif<br>#=
endif <br>}<br><br><br>template<size_t N> class Mybitset : publ=
ic bitset<N> {<br>public:<br> using bitset<N>::bitset;<br=
><br> constexpr static int num_bits_ulong =3D sizeof(unsigned long) *=
CHAR_BIT;<br><br> using iterator =3D Bititerator;<br> <br>&nbs=
p; Bititerator begin()<br> {<br> assert((reinterpre=
t_cast<<wbr>unsigned long>(reinterpret_cast<const unsigned *>(t=
his)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment */<br> &n=
bsp; return Bititerator{reinterpret_cast<<wbr>unsigned long*>(t=
his), 0, 0};<br> }<br> Bititerator end()<br> {<br> &=
nbsp; assert((reinterpret_cast<<wbr>unsigned long>(reinterpret_=
cast<const unsigned *>(this)) % sizeof(unsigned long)) =3D=3D 0); /* =
check alignment */<br> size_t quot =3D N / num_bits_ulong=
;<br> return Bititerator{reinterpret_cast<<wbr>unsigne=
d long*>(this), quot, N - quot*num_bits_ulong};<br> }<br>};<br><br=
><br>int main()<br>{<br> Mybitset<10> x;<br> x.set(0);<br=
> x.set(3);<br><br> // print bitpattern of x using iterators<br=
> if (x.begin() !=3D x.end()) {<br> Mybitset<10&=
gt;::iterator it =3D x.end();<br> do {<br> &nb=
sp; --it;<br> cout << *it;<=
br> } while (it !=3D x.begin());<br> }<br> co=
ut << "\n\n";<br><br> // search for bit<br> for (; true; =
x <<=3D 1) {<br> auto it =3D find(x.begin(), x.end(=
), true);<br> cout << x << '\t';<br> &nb=
sp; if (it !=3D x.end())<br> cout <&=
lt; (it - x.begin()) << "\n";<br> else<br> &nb=
sp; cout << "not found\n";<br><br>  =
; if (x.none())<br> break;<br> }<br>&nb=
sp; <br> return 0;<br>}<br>//////////////////////////////<wbr>///////=
///////////////////////<wbr>/////</span></font><br><br>Regards,<br>John<br>=
</div></div></blockquote><div><br><br>Here's an improvement of the above co=
de:<br><br>In <span style=3D"font-family: courier new,monospace;">find</spa=
n>, I got rid of <font size=3D"1"><span style=3D"font-family:courier new,mo=
nospace">lim_quot <font size=3D"2"><span style=3D"font-family: arial,sans-s=
erif;">and</span></font> </span></font><font size=3D"1"><span style=3D"font=
-family:courier new,monospace">lim_rem</span></font> (which was used above)=
<br><br><br>What else has changed?<br><div style=3D"margin-left: 40px;"><fo=
nt face=3D"courier new,monospace">Bititerator find(Bititerator first, Bitit=
erator last, const bool bitstate)</font><br></div>now calls either<br><div =
style=3D"margin-left: 40px;"><font face=3D"courier new,monospace" size=3D"2=
">inline Bititerator <span style=3D"color: rgb(56, 118, 29);">find_1</span>=
(Bititerator first, Bititerator last)</font><br>(this one<span style=3D"fon=
t-family: arial,sans-serif;"> is <span style=3D"color: rgb(56, 118, 29);">s=
uperfast</span>!!! On my machin<font size=3D"2">e <span style=3D"font-famil=
y: courier new,monospace;">MultiplyDeBruijnBitPosition</span> actually seem=
ed slightly faster than</font></span><font face=3D"courier new,monospace"><=
font size=3D"1"><span style=3D"font-family: arial,sans-serif;"><font size=
=3D"2"> <span style=3D"font-family: courier new,monospace;">__builtin_ctzl(=
val)</span>)</font></span></font></font><br></div>or <br><div style=3D"marg=
in-left: 40px;"><font face=3D"courier new,monospace" size=3D"2">inline Biti=
terator find_0(Bititerator first, Bititerator last)</font><br>(this one bas=
ically inverts words and then uses the same search as <span style=3D"font-f=
amily: courier new,monospace;">find_1</span> uses on words. this can probab=
ly be improved: any fast "count trailing ones" available??)<br></div><br><b=
r><br><font face=3D"courier new,monospace"><font face=3D"courier new,monosp=
ace" size=3D"1">///////////////////////////////////////////////////////////=
//////<br></font><font size=3D"1">#include <iostream><br>#include <=
;bitset><br>#include <climits> /* CHAR_BIT */<br>#include <cass=
ert><br>#include <cstddef> /* ptrdiff_t */<br><br>using namespace =
std;<br><br>int find_lowest_idx_set_in_ulong(unsigned long val);<br><br><br=
><br><br>struct Bititerator {<br> Bititerator(unsigned long *mem1, si=
ze_t quot1, size_t rem1) : mem{mem1}, quot{quot1}, rem{rem1} {}<br><br>&nbs=
p; constexpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BIT=
;<br> <br> Bititerator& operator++()<br> {<br> &=
nbsp; ++rem;<br> if (rem =3D=3D num_bits_ulong) {<b=
r> ++quot;<br> =
rem =3D 0;<br> }<br> return *this;<br>&=
nbsp; }<br><br> Bititerator& operator--()<br> {<br> &n=
bsp; if (rem =3D=3D 0) {<br> --quot;<br=
> rem =3D num_bits_ulong-1;<br> &n=
bsp; } else {<br> --rem;<br>  =
; }<br> return *this;<br> }<br><br> <br> =
; bool operator=3D=3D(const Bititerator& it) const {<br> &nb=
sp; return ((mem =3D=3D it.mem) &&<br>  =
; (quot =3D=3D it.quot) &am=
p;&<br> &nbs=
p; (rem =3D=3D it.rem));<br> }<br><br> bool operator!=3D(=
const Bititerator& it) const {<br> return !(*this =3D=
=3D it);<br> }<br><br> bool operator*() { return mem[quot] &=
; (1UL << rem); }<br> <br> friend inline Bititerator find=
_1(Bititerator first, Bititerator last);<br> friend inline Bititerato=
r find_0(Bititerator first, Bititerator last);<br><br> friend ptrdiff=
_t operator-(const Bititerator& a, const Bititerator& b);<br> =
<br>private:<br> // the current bit is (mem[quot] & =
(1UL << rem)), which has index (quot * (num_bits_ulong) + rem)<br>&nb=
sp; <br> unsigned long *mem;<br> size_t quot;<br> size_t =
rem;<br>};<br><br><br>inline Bititerator find_1(Bititerator first, Bititera=
tor last)<br>{<br> /*<br> // optional, so gain efficiency by le=
aving this<br> if (first =3D=3D last)<br> return la=
st;<br> */<br> using ulong =3D unsigned long;<br><br> ulo=
ng val =3D first.mem[first.quot] & (-1UL << first.rem);<br><br>&n=
bsp; if (first.quot < last.quot) {<br> if (val) {<br>&=
nbsp; first.rem =3D find_lowest_idx_set_in_ulong(va=
l);<br> return first;<br> }=
<br> <br> while (++first.quot < last.quot) {<br>=
if (first.mem[first.quot]) {<br> =
first.rem =3D find_lowest_idx_set_in_ulong(v=
al);<br> return first;<br> &=
nbsp; }<br> }<br><br> =
val =3D first.mem[first.quot];<br> }<br><br> if (last.rem) { /=
* avoid last.rem =3D=3D 0, <br> &n=
bsp;  =
; otherwise (Bititerator::num_bits_ulong - last.rem) >=3D width of type<=
br> =
and the right-shift will t=
hen misbehave */<br> val &=3D (-1UL >> (Bititer=
ator::num_bits_ulong - last.rem));<br> if (val) {<br>&nbs=
p; first.rem =3D find_lowest_idx_set_in_ulong(val);=
<br> return first;<br> }<br=
> }<br> <br> return last; // not found<br>}<br><br>inline=
Bititerator find_0(Bititerator first, Bititerator last)<br>{<br> /*<=
br> // optional, so gain efficiency by leaving this<br> if (fir=
st =3D=3D last)<br> return last;<br> */<br> u=
sing ulong =3D unsigned long;<br><br> ulong val =3D first.mem[first.q=
uot] & (-1UL << first.rem);<br> constexpr ulong fail =3D -1=
UL; // fail with all 1's -- we're looking for 0's here<br> <br> =
if (first.quot < last.quot) {<br> val =3D ~val;<br>&n=
bsp; if (val) {<br> first.rem =3D=
find_lowest_idx_set_in_ulong(val);<br> retur=
n first;<br> }<br> <br> while (++=
first.quot < last.quot) {<br> val =3D ~fir=
st.mem[first.quot];<br> if (val) {<br> &=
nbsp; first.rem =3D find_lowest_idx_set_in_ul=
ong(val);<br> return first;<br>&n=
bsp; }<br> }<br><br> &=
nbsp; val =3D first.mem[first.quot];<br> }<br><br> if (last.rem=
) { /* avoid last.rem =3D=3D 0, <br> &nb=
sp; =
otherwise (Bititerator::num_bits_ulong - last.rem) >=3D width of =
type<br> &=
nbsp; and the right-shift w=
ill then misbehave */<br> val &=3D (-1UL >> (Bi=
titerator::num_bits_ulong - last.rem));<br> val =3D ~val;=
<br> if (val) {<br> first.r=
em =3D find_lowest_idx_set_in_ulong(val);<br> =
return first;<br> }<br> }<br> <br> ret=
urn last; // not found<br>}<br><br><br>Bititerator find(Bititerator first, =
Bititerator last, const bool bitstate)<br>{<br> if (bitstate)<br>&nbs=
p; return find_1(first, last);<br> return find_0(first, l=
ast);<br>}<br><br>ptrdiff_t operator-(const Bititerator& a, const Bitit=
erator& b)<br>{<br> return ((a.quot * (Bititerator::num_bits_ulon=
g) + a.rem) -<br> (b.=
quot * (Bititerator::num_bits_ulong) + b.rem));<br>}<br><br><br><br><br>#if=
(ULONG_MAX =3D=3D 4294967295UL)<br><br>// unsigned long has 32 bits<br>sta=
tic_assert(sizeof(unsigned long) =3D=3D 4, "unsigned long must have 4 bytes=
");<br><br>static const int MultiplyDeBruijnBitPosition[32] =3D<br> {=
<br> 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, =
4, 8,<br> 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 1=
1, 5, 10, 9<br> };<br><br>#else<br>#if (ULONG_MAX =3D=3D 184467440737=
09551615UL)<br><br>// unsigned long has 64 bits<br>static_assert(sizeof(uns=
igned long) =3D=3D 8, "unsigned long must have 8 bytes");<br><br>static con=
st int MultiplyDeBruijnBitPosition[64] =3D<br> {<br>  =
; 0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,<br> &nbs=
p; 54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,<br>=
46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, =
10, 45,<br> 25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12,=
7, 6, 5, 63<br> };<br><br>#else<br><br>static_assert(0, "strange mac=
hine architecture");<br><br>#endif<br>#endif<br><br><br><br>inline int find=
_lowest_idx_set_in_ulong(unsigned long val)<br>// only call this function i=
f val does definately have a bit that is set<br>{<br>#if (ULONG_MAX =3D=3D =
4294967295UL)<br><br> // unsigned long has 32 bits<br><br>#ifdef __GN=
UC__<br> return __builtin_ctz(val);<br>#else<br> // Reference: =
http://stackoverflow.com/a/757266 ( http://graphics.stanford.edu/~seander/b=
ithacks.html#ZerosOnRightMultLookup )<br> return MultiplyDeBruijnBitP=
osition[(((val & -val) * 0x077CB531UL)) >> 27];<br>#endif<br>&nbs=
p; <br>#else<br>#if (ULONG_MAX =3D=3D 18446744073709551615UL)<br><br> =
// unsigned long has 64 bits<br><br>#ifdef __GNUC__<br> return __bui=
ltin_ctzl(val);<br>#else<br> // Reference: http://stackoverflow.com/a=
/24146182 ( http://chessprogramming.wikispaces.com/BitScan#KimWalisch )<br>=
return MultiplyDeBruijnBitPosition[(((val ^ (val-1)) * 0x03F79D71B4C=
B0A89UL)) >> 58];<br>#endif<br> <br>#endif<br>#endif <br>=
}<br><br><br>template<size_t N> class Mybitset : public bitset<N&g=
t; {<br>public:<br> using bitset<N>::bitset;<br><br> cons=
texpr static int num_bits_ulong =3D sizeof(unsigned long) * CHAR_BIT;<br><b=
r> using iterator =3D Bititerator;<br> <br> Bititerator b=
egin()<br> {<br> assert((reinterpret_cast<unsign=
ed long>(reinterpret_cast<const unsigned *>(this)) % sizeof(unsign=
ed long)) =3D=3D 0); /* check alignment */<br> return Bit=
iterator{reinterpret_cast<unsigned long*>(this), 0, 0};<br> }<b=
r> Bititerator end()<br> {<br> assert((reinte=
rpret_cast<unsigned long>(reinterpret_cast<const unsigned *>(th=
is)) % sizeof(unsigned long)) =3D=3D 0); /* check alignment */<br> &nb=
sp; size_t quot =3D N / num_bits_ulong;<br> return =
Bititerator{reinterpret_cast<unsigned long*>(this), quot, N - quot*nu=
m_bits_ulong};<br> }<br>};<br><br><br>int main()<br>{<br> Mybit=
set<50> x;<br> x.set(0);<br> x.set(3);<br><br> // p=
rint bitpattern of x using iterators<br> if (x.begin() !=3D x.end()) =
{<br> Mybitset<10>::iterator it =3D x.end();<br>&nb=
sp; do {<br> --it;<br>  =
; cout << *it;<br> } while (it !=
=3D x.begin());<br> }<br> cout << "\n\n";<br><br> /=
/ search for 1-bit<br> cout << "search for 1\n"<br> =
<< "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\n";=
<br> auto it_end =3D x.end(); &nbs=
p; // --(--(x.end()));<br> for (; true; x <<=3D 1) {<br>&=
nbsp; auto it =3D find(x.begin(), it_end, true);<br>  =
; cout << x << '\t';<br> if (it !=3D it=
_end)<br> cout << (it - x.begin()) <=
< "\n";<br> else<br> cou=
t << "not found\n";<br><br> if (x.none())<br> =
break;<br> }<br><br> // search for 0-b=
it<br> cout << "\n"<br> <=
;< "search for 0\n"<br> << "=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\n";<br> x.reset();<br> x.f=
lip();<br> it_end =3D x.end(); &nb=
sp; // --(--(x.end()));<br> for (; true; x >>=3D 1) {<br>=
auto it =3D find(x.begin(), it_end, false);<br> &nb=
sp; cout << x << '\t';<br> if (it !=3D =
it_end)<br> cout << (it - x.begin()) &l=
t;< "\n";<br> else<br> c=
out << "not found\n";<br><br> if (x.none())<br>&nbs=
p; break;<br> }<br><br><br> constexpr u=
nsigned num =3D 100000;<br> cout << "\n"<br> &=
nbsp; << "searching " << num+1 << " times thr=
ough a Mybitset<" << num << "> with shifting bits\n"<br>&=
nbsp; << "(please wait for a few minor =
seconds...)\n";<br> Mybitset<100000> y;<br> y.set(0);<br>=
for (; true ; y <<=3D1 ) {<br> auto it =3D f=
ind(y.begin(), y.end(), true);<br> /*<br> &nbs=
p; if (it !=3D y.end()) {<br> cout << (=
it - y.begin()) << " ";<br> }<br> =
else {<br> cout << "not found\n";<br>&=
nbsp; }<br> */<br> if (y.no=
ne())<br> break;<br> }<br> return=
0;<br>}<br><br>///////////////////////////////////////////////////////////=
//////<br></font><br><br><br></font> <br></div></div>
<p></p>
-- <br />
<br />
--- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
Visit this group at <a href=3D"http://groups.google.com/a/isocpp.org/group/=
std-proposals/">http://groups.google.com/a/isocpp.org/group/std-proposals/<=
/a>.<br />
------=_Part_191_1018650848.1422200070041--
------=_Part_190_1856221431.1422200070041--
.
Author: Howard Hinnant <howard.hinnant@gmail.com>
Date: Sun, 25 Jan 2015 16:13:16 -0500
Raw View
On Jan 25, 2015, at 12:27 AM, David Krauss <potswa@gmail.com> wrote:
>=20
>> On 2015=E2=80=9301=E2=80=9325, at 8:10 AM, jononanon@googlemail.com wrot=
e:
>>=20
>> But what if we gave bitset iterators,
>=20
> Implementations should be able to reuse std::vector<bool>::iterator. If a=
rray and vector<bool> both have iterators, then bitset should too.
Agreed. And reuse the vector<bool>::reference as well.
> Also, referring to GCC=E2=80=99s standard library, SGI already defined bi=
tset::find_first and bitset::find_next functions; these were just never sta=
ndardized. (They are available in GCC as _Find_first and _Find_next.) Such =
functions could be adapted into optimized std::find overloads.
>=20
> This is really a job for the standard library, because intrinsic count-le=
ading-zeroes functionality is going to be faster than any bitwise arithmeti=
c tricks.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3864.html is a pro=
posal to /finally/ make these intrinsics available. It is long overdue.
>=20
> Finding a bit-string in a bit-sequence with std::search or comparing bit-=
strings with std::equal or std::mismatch are also interesting operations wi=
th applications in data compression. I wonder if any library already optimi=
zes these. GCC does not, so I guess it=E2=80=99s all rather niche.
libc++ optimizes:
swap, find, count, fill, fill_n, copy, copy_backward, move, move_backward, =
swap_ranges, rotate and equal. No doubt there is more work to be done in t=
his area.
Howard
--=20
---=20
You received this message because you are subscribed to the Google Groups "=
ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposa=
ls/.
.