Issue 561350044: Issue 5705: int->long in Stem::get_beaming and set_beaming

Issue 561350044: Issue 5705: int->long in Stem::get_beaming and set_beaming (Closed)

Can't Edit
Can't Publish+Mail
Start Review

Created:
5 years, 3 months ago by Dan Eble

Modified:
5 years, 3 months ago

Reviewers:
dak, dan, lemzwerg, hanwenn

CC:
lilypond-devel_gnu.org

Visibility:
Public.

More Reviews

Description

https://sourceforge.net/p/testlilyissues/issues/5705/ Fixes a type-conversion warning because scm_ilength () returns a long. Other changes are for consistency.

Patch Set 1 #

Total comments: 1

Created: 5 years, 3 months ago

Download [raw] [tar.bz2]

	Unified diffs	Side-by-side diffs	Stats (+12 lines, -12 lines)			Patch
M	lily/include/stem.hh	View	1 chunk	+2 lines, -2 lines	0 comments	Download
M	lily/stem.cc	View	3 chunks	+6 lines, -6 lines	1 comment	Download
M	lily/stem-engraver.cc	View	1 chunk	+2 lines, -2 lines	0 comments	Download
M	lily/tuplet-number.cc	View	1 chunk	+2 lines, -2 lines	0 comments	Download

Messages

Total messages: 13

Expand All Messages | Collapse All Messages

hanwenn

sorry to complain late about this change. I understand that this gets rid of a ...

5 years, 3 months ago (2020-02-03 08:04:24 UTC) #2

dak

hanwenn@gmail.com writes: > sorry to complain late about this change. I understand that this gets ...

5 years, 3 months ago (2020-02-03 11:27:30 UTC) #3

dan_faithful.be

On Feb 3, 2020, at 06:27, David Kastrup <dak@gnu.org> wrote: > > hanwenn@gmail.com writes: > ...

5 years, 3 months ago (2020-02-03 13:40:15 UTC) #4

On Feb 3, 2020, at 06:27, David Kastrup <dak@gnu.org> wrote:
> 
> hanwenn@gmail.com writes:
> 
>> sorry to complain late about this change. I understand that this gets
>> rid of a conversion warning, which is something that we want, but I am
>> missing the big picture here. Is there a plan for the big picture?
> 
> As far as I can see the big picture is matching the used variables to
> the used data types to that overflows become a non-issue.
...
> Just choosing the right types means that when a problem crops up likely
> due to an exceeded size, this is not a place you need to check up on and
> verify when going through all warning-silencing casts.

It's a fair question, especially on seeing a case where I chose to keep a signed
type, after seeing many other cases where I changed a signed index to size_t.

"Choosing the right types" is a nice ideal, but there's plenty of pragmatism in
this campaign, because I don't want to spend my whole life fixing warnings, but
I very much want to avoid either hiding or introducing problems.  A type that
seems more fitting on its face might require refitting a lot of code.  I do
start out trying to understand the bigger picture, but at some point, the effort
becomes more than I'm willing to spend.  This is one of those cases.  Changing
int to long seemed like a conservative way to eliminate the reason for the
warning, and I tried to do it in a way that was coherent in the scope of the
Stem code, if not in the big picture.

Han-Wen wrote:
> We could globally replace [int] with int64_t which has the
> additional advantage of being explicit about its size.

I'm used to using those typedefs and I think they're beneficial, but I don't
expect that switching to them would be merely automatic, given libraries that
use basic C types (e.g. scm_ilength returning long).  I'd expect some human
judgment to be required.

Han-Wen also wrote:
> Using a long here is 7 bytes too many.

Do you believe that in general, no value should ever be stored in a larger space
than is necessary, or are you speaking specifically about this beaming code
being a memory hog?

In my experience, for parameters, local variables, and return values, it's
usually better to reduce the number of conversions and not worry about the size
until there is evidence of a problem.  For object data that will be used in a
large number of instances, concern is appropriate.

I confess, I've not educated myself on how Guile stores int versus long, like
whether it actually uses more space for scm_from_long(50) than for
scm_from_int(50); however, this code didn't strike me as deserving that extra
effort.  If you think it's important, I'm willing to revisit it and try to save
space in the list while while still avoiding questionable silent conversions.
— 
Dan

dak

Dan Eble <dan@faithful.be> writes: > I confess, I've not educated myself on how Guile stores ...

5 years, 3 months ago (2020-02-03 13:43:43 UTC) #5

dan_faithful.be

On Feb 3, 2020, at 08:43, David Kastrup <dak@gnu.org> wrote: > > Dan Eble <dan@faithful.be> ...

5 years, 3 months ago (2020-02-03 13:52:33 UTC) #6

hanwenn

On Mon, Feb 3, 2020 at 2:40 PM Dan Eble <dan@faithful.be> wrote: > > On ...

5 years, 3 months ago (2020-02-03 19:53:45 UTC) #7

On Mon, Feb 3, 2020 at 2:40 PM Dan Eble <dan@faithful.be> wrote:
>
> On Feb 3, 2020, at 06:27, David Kastrup <dak@gnu.org> wrote:
> >
> > hanwenn@gmail.com writes:
> >
> >> sorry to complain late about this change. I understand that this gets
> >> rid of a conversion warning, which is something that we want, but I am
> >> missing the big picture here. Is there a plan for the big picture?
> >
> > As far as I can see the big picture is matching the used variables to
> > the used data types to that overflows become a non-issue.
> ...
> > Just choosing the right types means that when a problem crops up likely
> > due to an exceeded size, this is not a place you need to check up on and
> > verify when going through all warning-silencing casts.
>
> It's a fair question, especially on seeing a case where I chose to keep a
signed type, after seeing many other cases where I changed a signed index to
size_t.
>
> "Choosing the right types" is a nice ideal, but there's plenty of pragmatism
in this campaign, because I don't want to spend my whole life fixing warnings,
but I very much want to avoid either hiding or introducing problems.  A type
that seems more fitting on its face might require refitting a lot of code.  I do
start out trying to understand the bigger picture, but at some point, the effort
becomes more than I'm willing to spend.  This is one of those cases.  Changing
int to long seemed like a conservative way to eliminate the reason for the
warning, and I tried to do it in a way that was coherent in the scope of the
Stem code, if not in the big picture.
>
> Han-Wen wrote:
> > We could globally replace [int] with int64_t which has the
> > additional advantage of being explicit about its size.
>
> I'm used to using those typedefs and I think they're beneficial, but I don't
expect that switching to them would be merely automatic, given libraries that
use basic C types (e.g. scm_ilength returning long).  I'd expect some human
judgment to be required.
>
> Han-Wen also wrote:
> > Using a long here is 7 bytes too many.
>
> Do you believe that in general, no value should ever be stored in a larger
space than is necessary, or are you speaking specifically about this beaming
code being a memory hog?
>
> In my experience, for parameters, local variables, and return values, it's
usually better to reduce the number of conversions and not worry about the size
until there is evidence of a problem.  For object data that will be used in a
large number of instances, concern is appropriate.

My concern is this: the beaming number of a stem is a small integer
(it is generally equal to Stem::duration_log - 2).  By making it a
'long',  the Stem class is now inconsistent with itself, because
Stem::duration_log is an int, instead of a long.

I am worried that if we continue making changes to types based on the
signatures of libraries external to lilypond, the code will become a
patchwork of types that are consistent with the external libraries
(Guile, STL, etc.), but internally inconsistent with what things mean
in the LilyPond codebase, ie. some small integers are 'int', some
other set is 'long'.

There is also a cognitive overhead. If most of the code uses "int" for
standard integers, then places that use "long" are out of the
ordinary, and they imply that there is something special going on that
requires an extra-wide type. This will make the code harder to
understand if there actually is no special reason.

There are multiple ways out of this:

1) cast to the appropriate type as close to the external interface as
possible (ie. cast scm_ilength() to an int directly)

2) standardize on a type that is wide enough for our purposes, ie.
int64_t or long, so all of the code uses the same type, so we don't
imply distinctions that are not there.

3) remove the convenience method Stem::get_beaming, and have all the
callers deal with scm_ilength directly.

4) introduce a ly_int_length() which would encapsulate the cast in 1).

I prefer 1) because it is simplest change to the existing code base,
but there is a good case to make that 2) will require less
case-by-case deliberation to make the cast warnings go away overall.

David makes a case that the Rational code has overflows, and he is
right in that the Rational code is a place where it is important to be
very careful about the size of integers (*). However, almost all other
integers in LilyPond describe entities that realistically occur in
much smaller quantities than maxint32, so I'd rather see tradeoffs
biased to internal coherency of the LilyPond codebase itself.

(*) - I doubt that there are overflows in the Rational code due to
erroneous casts, but I'm happy to be proven wrong.

> I confess, I've not educated myself on how Guile stores int versus long, like
whether it actually uses more space for scm_from_long(50) than for
scm_from_int(50); however, this code didn't strike me as deserving that extra
effort.  If you think it's important, I'm willing to revisit it and try to save
space in the list while while still avoiding questionable silent conversions.

GUILE values are machine words (8 byte on 64bit, 4 byte on 32 bit). If
the lower 3 (4) bits are zero (ie. it looks like a pointer to a cell
of 2 SCM values), it's considered a pointer, otherwise the low order
bits are type tags, and there are other values stored inside, eg.
small integers, booleans and other direct values. You shift out the
lower order tag bits to get to the integer. This means you get around
60 bit of space for integers on 64-bit. However, GUILE supports
arbitrary precision integers, so there is no C type that is guaranteed
to fit an SCM integer.

Speaking of rationals, it would be interesting to move to SCM values
for Rationals instead of the current C++ class,  but we'd need to be
guile 2.x to garbage collection for values on the stack.

-- 
Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen

dan_faithful.be

On Feb 3, 2020, at 14:53, Han-Wen Nienhuys <hanwenn@gmail.com> wrote: > 'long', the Stem class ...

5 years, 3 months ago (2020-02-03 20:21:03 UTC) #8

dak

Han-Wen Nienhuys <hanwenn@gmail.com> writes: > On Mon, Feb 3, 2020 at 2:40 PM Dan Eble ...

5 years, 3 months ago (2020-02-03 20:23:09 UTC) #9

Han-Wen Nienhuys <hanwenn@gmail.com> writes:

> On Mon, Feb 3, 2020 at 2:40 PM Dan Eble <dan@faithful.be> wrote:
>>
>> Han-Wen wrote:
>> > We could globally replace [int] with int64_t which has the
>> > additional advantage of being explicit about its size.
>>
>> I'm used to using those typedefs and I think they're beneficial, but
>> I don't expect that switching to them would be merely automatic,
>> given libraries that use basic C types (e.g. scm_ilength returning
>> long).  I'd expect some human judgment to be required.
>>
>> Han-Wen also wrote:
>> > Using a long here is 7 bytes too many.
>>
>> Do you believe that in general, no value should ever be stored in a
>> larger space than is necessary, or are you speaking specifically
>> about this beaming code being a memory hog?
>>
>> In my experience, for parameters, local variables, and return
>> values, it's usually better to reduce the number of conversions and
>> not worry about the size until there is evidence of a problem.  For
>> object data that will be used in a large number of instances,
>> concern is appropriate.
>
> My concern is this: the beaming number of a stem is a small integer
> (it is generally equal to Stem::duration_log - 2).  By making it a
> 'long',  the Stem class is now inconsistent with itself, because
> Stem::duration_log is an int, instead of a long.
>
> I am worried that if we continue making changes to types based on the
> signatures of libraries external to lilypond, the code will become a
> patchwork of types that are consistent with the external libraries
> (Guile, STL, etc.), but internally inconsistent with what things mean
> in the LilyPond codebase, ie. some small integers are 'int', some
> other set is 'long'.
>
> There is also a cognitive overhead. If most of the code uses "int" for
> standard integers, then places that use "long" are out of the
> ordinary, and they imply that there is something special going on that
> requires an extra-wide type. This will make the code harder to
> understand if there actually is no special reason.
>
> There are multiple ways out of this:
>
> 1) cast to the appropriate type as close to the external interface as
> possible (ie. cast scm_ilength() to an int directly)
>
> 2) standardize on a type that is wide enough for our purposes, ie.
> int64_t or long, so all of the code uses the same type, so we don't
> imply distinctions that are not there.
>
> 3) remove the convenience method Stem::get_beaming, and have all the
> callers deal with scm_ilength directly.
>
> 4) introduce a ly_int_length() which would encapsulate the cast in 1).

5) do not recompute the size information as an implicit property of the
list (which is an O(n) call every time it is needed) but rather keep it
around separately.

The principal problem we have here is that something starts as an
integer and then is only implicitly available as the length of a list.
Moving to an STL list would turn the length call into O(1) but it would
still be an implicitly available measure and the wrong type.

> I prefer 1) because it is simplest change to the existing code base,
> but there is a good case to make that 2) will require less
> case-by-case deliberation to make the cast warnings go away overall.

Until size_t and int64_t diverge which is the kind of thing that got us
all the warnings on moving to 64bit in the first place.

After looking at more of the code, it would appear that the way the list
is initialized, it cannot really have non-int length (as the
initialization count is an int).  If the length is supposed to become
large, scm_ilength is a bad idea to call anyway because of its cost.

> Speaking of rationals, it would be interesting to move to SCM values
> for Rationals instead of the current C++ class, but we'd need to be
> guile 2.x to garbage collection for values on the stack.

Guile 1.8 collects values on the stack.  That is not the problem.  The
problem is the values elsewhere, and we use, for example, Moment
elements in the Midi structs which are not SCM values.  The Midi data
structures are not scmified at all which would be a good idea for making
the Midi layer a whole lot more user-accessible than it is now.  But we
are not there yet, and as long as a Moment is in there, at least with
Guile 1.8 this part of the heap is not marked.

-- 
David Kastrup

Dan Eble

On 2020/02/03 20:51:05, Dan Eble wrote: > revert int->long Never mind this here. I forgot ...

5 years, 3 months ago (2020-02-03 21:11:23 UTC) #11

dak

5 years, 3 months ago (2020-02-03 21:58:27 UTC) #13

Message was sent while issue was closed.

On 2020/02/03 21:33:11, Dan Eble wrote:
> Next try: http://codereview.appspot.com/565610043

It's probably more an academical remark, but a "kosher" way of doing that might
be using scm_to_int (scm_length (...)) instead of scm_ilength (...).  This will
take out the length in the form it has been produced without a cast, throwing an
error if it does not fit.  It is likely that it comes at the cost of converting
a size_t to SCM and then out again as int which would be sort of ridiculous.  At
the actual sizes we are talking about, the cost would probably not be all that
large.  Probably not worth the trouble, but there might be other situations
where such an approach could prove more satisfactory.

Expand All Messages | Collapse All Messages