Topic: Proposal n2050


Author: mstankus@gmail.com
Date: Sun, 25 Feb 2018 20:49:23 -0800 (PST)
Raw View
------=_Part_13288_2094367528.1519620563172
Content-Type: multipart/alternative;
 boundary="----=_Part_13289_1837089058.1519620563173"

------=_Part_13289_1837089058.1519620563173
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


  I was reading=20
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2050.pdf.

 Note std::vector<T> has a data element (
http://en.cppreference.com/w/cpp/container/vector/data).
It provides a pointer to the underlying data. I really like having=20
something like =E2=80=9Cto_block_range=E2=80=9D,
so that people can use their bit manipulation techniques. This can be used=
=20
to reduce cache hits.
Note that gcc 7.0.3 has an implementation of dynamic_bitset and quotes your=
=20
proposal, but does
not include to_block_range.

  I have some hesitations about to_block_range,  though.

  One thing which dynamic_bitset could be used for is to  store DNA strands=
..
A DNA strand consists of a large number of letters and each letter (A,C,G=
=20
and T) can be stored in 2 bits.
Therefore, you can store 4 letters in 8 bits and sizeof(unsigned long long=
=20
int)*CHAR_BIT/2 letters
per block in the default case.

  Suppose that a dynamic_bitset holds 2000 blocks (e.g., unsigned long long=
=20
int)
and someone wants to look at blocks 999 through 1010. The to_block_range
operation would be called 2000 times, even though only twelve blocks are=20
necessary.
Whatever bit manipulation savings which could be realized would be dwarfed
by the unnecessary work.

  There are applications for bioinformatics being worked on by an=20
organization
called =E2=80=9CSeqan=E2=80=9D. See https://github.com/seqan/seqan. Their a=
pplication is
award winning and they are revising it (https://github.com/seqan/seqan3).
I am not part of that organization.  I am just starting to look at seqan an=
d
very much am into  C++.

  Why not create a block_const_iterator (and possibly block_iterator)
inside of the dynamic_bitset class. You could implement all of the
normal operations:

    auto w =3D my_dyn_bitset.block_begin()
    ++w;
    auto n =3D *w;
   w +=3D 10;

and so on. You could create a dynamic_bitset<>::const_iterator class so tha=
t
you can revise the implementation if possible.

  My particular interest is math:  Groebner bases in the noncommutative=20
case.
When computing Groebner basis, some people use suffix trees. Alot of
substring matching occurs and storing words in a dynamic_bitset could
work well.

--=20
You received this message because you are subscribed to the Google Groups "=
ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-proposals@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp=
..org/d/msgid/std-proposals/cfafda21-9363-4041-8f0d-c468e7276b75%40isocpp.or=
g.

------=_Part_13289_1837089058.1519620563173
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div><span style=3D"font-family: arial, sans-serif; font-s=
ize: 15.36px; text-size-adjust: auto;"><br></span></div><span style=3D"font=
-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">=
=C2=A0 I was reading<span class=3D"Apple-converted-space">=C2=A0</span></sp=
an><a href=3D"http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2050=
..pdf" data-saferedirecturl=3D"https://www.google.com/url?hl=3Den&amp;q=3Dht=
tp://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2050.pdf&amp;source=
=3Dgmail&amp;ust=3D1519698845455000&amp;usg=3DAFQjCNEdTOx-MSjJB6QyvEXs5gXgj=
iTXxQ" rel=3D"noreferrer" target=3D"_blank" style=3D"color: rgb(17, 85, 204=
); font-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: au=
to;">http://www.open-std.org/jtc1/<wbr>sc22/wg21/docs/papers/2006/<wbr>n205=
0.pdf</a><span style=3D"font-family: arial, sans-serif; font-size: 15.36px;=
 text-size-adjust: auto;">.</span><div><br style=3D"font-family: arial, san=
s-serif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-f=
amily: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">=C2=
=A0Note std::vector&lt;T&gt; has a data element (</span><a href=3D"http://e=
n.cppreference.com/w/cpp/container/vector/data" data-saferedirecturl=3D"htt=
ps://www.google.com/url?hl=3Den&amp;q=3Dhttp://en.cppreference.com/w/cpp/co=
ntainer/vector/data&amp;source=3Dgmail&amp;ust=3D1519698845455000&amp;usg=
=3DAFQjCNEVewny1rHytHeEGmOgsEhnMv1BJw" rel=3D"noreferrer" target=3D"_blank"=
 style=3D"color: rgb(17, 85, 204); font-family: arial, sans-serif; font-siz=
e: 15.36px; text-size-adjust: auto;">http://en.cppreference.com/w/<wbr>cpp/=
container/vector/data</a><span style=3D"font-family: arial, sans-serif; fon=
t-size: 15.36px; text-size-adjust: auto;">).</span><br style=3D"font-family=
: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><span sty=
le=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-adjust:=
 auto;">It provides a pointer to the underlying data. I really like having =
something like =E2=80=9Cto_block_range=E2=80=9D,</span><br style=3D"font-fa=
mily: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><span=
 style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-adj=
ust: auto;">so that people can use their bit manipulation techniques. This =
can be used to reduce cache hits.</span><br style=3D"font-family: arial, sa=
ns-serif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-=
family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">Not=
e that gcc 7.0.3 has an implementation of dynamic_bitset and quotes your pr=
oposal, but does</span><br style=3D"font-family: arial, sans-serif; font-si=
ze: 15.36px; text-size-adjust: auto;"><span style=3D"font-family: arial, sa=
ns-serif; font-size: 15.36px; text-size-adjust: auto;">not include to_block=
_range.</span><br style=3D"font-family: arial, sans-serif; font-size: 15.36=
px; text-size-adjust: auto;"><br style=3D"font-family: arial, sans-serif; f=
ont-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-family: ari=
al, sans-serif; font-size: 15.36px; text-size-adjust: auto;">=C2=A0 I have =
some hesitations about to_block_range,=C2=A0 though.</span><br style=3D"fon=
t-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><=
br style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-a=
djust: auto;"><span style=3D"font-family: arial, sans-serif; font-size: 15.=
36px; text-size-adjust: auto;">=C2=A0 One thing which dynamic_bitset could =
be used for is to=C2=A0 store DNA strands.</span><br style=3D"font-family: =
arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><span style=
=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: a=
uto;">A DNA strand consists of a large number of letters and each letter (A=
,C,G and T) can be stored in 2 bits.</span><br style=3D"font-family: arial,=
 sans-serif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"fo=
nt-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">=
Therefore, you can store 4 letters in 8 bits and sizeof(unsigned long long =
int)*CHAR_BIT/2 letters</span><br style=3D"font-family: arial, sans-serif; =
font-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-family: ar=
ial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">per block in =
the default case.</span><br style=3D"font-family: arial, sans-serif; font-s=
ize: 15.36px; text-size-adjust: auto;"><br style=3D"font-family: arial, san=
s-serif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-f=
amily: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">=C2=
=A0 Suppose that a dynamic_bitset holds 2000 blocks (e.g., unsigned long lo=
ng int)</span><br style=3D"font-family: arial, sans-serif; font-size: 15.36=
px; text-size-adjust: auto;"><span style=3D"font-family: arial, sans-serif;=
 font-size: 15.36px; text-size-adjust: auto;">and someone wants to look at =
blocks 999 through 1010. The to_block_range</span><br style=3D"font-family:=
 arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><span styl=
e=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: =
auto;">operation would be called 2000 times, even though only twelve blocks=
 are necessary.</span><br style=3D"font-family: arial, sans-serif; font-siz=
e: 15.36px; text-size-adjust: auto;"><span style=3D"font-family: arial, san=
s-serif; font-size: 15.36px; text-size-adjust: auto;">Whatever bit manipula=
tion savings which could be realized would be dwarfed</span><br style=3D"fo=
nt-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">=
<span style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-siz=
e-adjust: auto;">by the unnecessary work.</span><br style=3D"font-family: a=
rial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><br style=3D=
"font-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto=
;"><span style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-=
size-adjust: auto;">=C2=A0 There are applications for bioinformatics being =
worked on by an organization</span><br style=3D"font-family: arial, sans-se=
rif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-famil=
y: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">called =
=E2=80=9CSeqan=E2=80=9D. See<span class=3D"Apple-converted-space">=C2=A0</s=
pan></span><a href=3D"https://github.com/seqan/seqan" data-saferedirecturl=
=3D"https://www.google.com/url?hl=3Den&amp;q=3Dhttps://github.com/seqan/seq=
an&amp;source=3Dgmail&amp;ust=3D1519698845456000&amp;usg=3DAFQjCNEeAeeg3wc7=
ZNvnUXWYHIN0YsTz0g" rel=3D"noreferrer" target=3D"_blank" style=3D"color: rg=
b(17, 85, 204); font-family: arial, sans-serif; font-size: 15.36px; text-si=
ze-adjust: auto;">https://github.com/seqan/seqan</a><wbr style=3D"font-fami=
ly: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><span s=
tyle=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-adjus=
t: auto;">. Their application is</span><br style=3D"font-family: arial, san=
s-serif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-f=
amily: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">awar=
d winning and they are revising it (</span><a href=3D"https://github.com/se=
qan/seqan3" data-saferedirecturl=3D"https://www.google.com/url?hl=3Den&amp;=
q=3Dhttps://github.com/seqan/seqan3&amp;source=3Dgmail&amp;ust=3D1519698845=
456000&amp;usg=3DAFQjCNEXqOfSgR_sN_vtfppmUkfdXRlYdQ" rel=3D"noreferrer" tar=
get=3D"_blank" style=3D"color: rgb(17, 85, 204); font-family: arial, sans-s=
erif; font-size: 15.36px; text-size-adjust: auto;">https://github.com/seqan=
/<wbr>seqan3</a><span style=3D"font-family: arial, sans-serif; font-size: 1=
5.36px; text-size-adjust: auto;">).</span><br style=3D"font-family: arial, =
sans-serif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"fon=
t-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">I=
 am not part of that organization.=C2=A0 I am just starting to look at seqa=
n and</span><br style=3D"font-family: arial, sans-serif; font-size: 15.36px=
; text-size-adjust: auto;"><span style=3D"font-family: arial, sans-serif; f=
ont-size: 15.36px; text-size-adjust: auto;">very much am into=C2=A0 C++.</s=
pan><br style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-s=
ize-adjust: auto;"><br style=3D"font-family: arial, sans-serif; font-size: =
15.36px; text-size-adjust: auto;"><span style=3D"font-family: arial, sans-s=
erif; font-size: 15.36px; text-size-adjust: auto;">=C2=A0 Why not create a =
block_const_iterator (and possibly block_iterator)</span><br style=3D"font-=
family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><sp=
an style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-a=
djust: auto;">inside of the dynamic_bitset class. You could implement all o=
f the</span><br style=3D"font-family: arial, sans-serif; font-size: 15.36px=
; text-size-adjust: auto;"><span style=3D"font-family: arial, sans-serif; f=
ont-size: 15.36px; text-size-adjust: auto;">normal operations:</span><br st=
yle=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-adjust=
: auto;"><br style=3D"font-family: arial, sans-serif; font-size: 15.36px; t=
ext-size-adjust: auto;"><span style=3D"font-family: arial, sans-serif; font=
-size: 15.36px; text-size-adjust: auto;">=C2=A0 =C2=A0 auto w =3D my_dyn_bi=
tset.block_begin()</span><br style=3D"font-family: arial, sans-serif; font-=
size: 15.36px; text-size-adjust: auto;"><span style=3D"font-family: arial, =
sans-serif; font-size: 15.36px; text-size-adjust: auto;">=C2=A0 =C2=A0 ++w;=
</span><br style=3D"font-family: arial, sans-serif; font-size: 15.36px; tex=
t-size-adjust: auto;"><span style=3D"font-family: arial, sans-serif; font-s=
ize: 15.36px; text-size-adjust: auto;">=C2=A0 =C2=A0 auto n =3D *w;</span><=
br style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size-a=
djust: auto;"><span style=3D"font-family: arial, sans-serif; font-size: 15.=
36px; text-size-adjust: auto;">=C2=A0 =C2=A0w +=3D 10;</span><br style=3D"f=
ont-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;"=
><br style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-size=
-adjust: auto;"><span style=3D"font-family: arial, sans-serif; font-size: 1=
5.36px; text-size-adjust: auto;">and so on. You could create a dynamic_bits=
et&lt;&gt;::const_</span><wbr style=3D"font-family: arial, sans-serif; font=
-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-family: arial,=
 sans-serif; font-size: 15.36px; text-size-adjust: auto;">iterator class so=
 that</span><br style=3D"font-family: arial, sans-serif; font-size: 15.36px=
; text-size-adjust: auto;"><span style=3D"font-family: arial, sans-serif; f=
ont-size: 15.36px; text-size-adjust: auto;">you can revise the implementati=
on if possible.</span><br style=3D"font-family: arial, sans-serif; font-siz=
e: 15.36px; text-size-adjust: auto;"><br style=3D"font-family: arial, sans-=
serif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"font-fam=
ily: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;">=C2=A0=
 My particular interest is math:=C2=A0 Groebner bases in the noncommutative=
 case.</span><br style=3D"font-family: arial, sans-serif; font-size: 15.36p=
x; text-size-adjust: auto;"><span style=3D"font-family: arial, sans-serif; =
font-size: 15.36px; text-size-adjust: auto;">When computing Groebner basis,=
 some people use suffix trees. Alot of</span><br style=3D"font-family: aria=
l, sans-serif; font-size: 15.36px; text-size-adjust: auto;"><span style=3D"=
font-family: arial, sans-serif; font-size: 15.36px; text-size-adjust: auto;=
">substring matching occurs and storing words in a dynamic_bitset could</sp=
an><br style=3D"font-family: arial, sans-serif; font-size: 15.36px; text-si=
ze-adjust: auto;"><span style=3D"font-family: arial, sans-serif; font-size:=
 15.36px; text-size-adjust: auto;">work well.</span><br></div></div>

<p></p>

-- <br />
You received this message because you are subscribed to the Google Groups &=
quot;ISO C++ Standard - Future Proposals&quot; group.<br />
To unsubscribe from this group and stop receiving emails from it, send an e=
mail to <a href=3D"mailto:std-proposals+unsubscribe@isocpp.org">std-proposa=
ls+unsubscribe@isocpp.org</a>.<br />
To post to this group, send email to <a href=3D"mailto:std-proposals@isocpp=
..org">std-proposals@isocpp.org</a>.<br />
To view this discussion on the web visit <a href=3D"https://groups.google.c=
om/a/isocpp.org/d/msgid/std-proposals/cfafda21-9363-4041-8f0d-c468e7276b75%=
40isocpp.org?utm_medium=3Demail&utm_source=3Dfooter">https://groups.google.=
com/a/isocpp.org/d/msgid/std-proposals/cfafda21-9363-4041-8f0d-c468e7276b75=
%40isocpp.org</a>.<br />

------=_Part_13289_1837089058.1519620563173--

------=_Part_13288_2094367528.1519620563172--

.