Description

On every GC cycle, increase the heap so we have at least 2000 bytes
per non-simple Smob. This increases the size of the heap relative to
the live set of data.

The number 2000 is based on benchmarking Robert Carver's "Missa
Dum sacrum mysterium", edited by Vaughan McAlley (2013), a 104-page
choir score.
By setting GC_INITIAL_HEAP_SIZE and the maximum heap size, we can see how
GC time and heap size relate. On a Lenovo T460p (i5-6440HQ, 2.6 GHz),
this yields the following timings:
* 2G heap: 49.4s (1.8s GC)
* 1G heap: 51.7s (3.3s GC)
* 800M heap: 54.6s (5.8s GC)
* 500M heap: 68.2s (17.9s GC)
We see that 800M to 1G is a sweet spot, where around 10% of the time is
spent on GC. Running the same score with 2000 bytes per smob, we get a
final heap size of 1.2G and a GC time of 3.4s.
This is necessary because during the interpretation stage we build up a
large set of Grobs. Without tuning, GC is ineffective: the Grobs are
live data and cannot be collected.
We can observe this problem by running with GC_PRINT_STATS=1, which
prints
In-use heap: 85% (370824 KiB pointers + 60065 KiB other)
In-use heap: 85% (370725 KiB pointers + 59737 KiB other)
...
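The mechanism the description outlines can be sketched as follows. This is a hypothetical illustration, not the actual patch: the function and constant names are invented here, though the 2000-bytes-per-smob figure and the `reclaimed < 0.2` threshold appear in the review comments below.

```cpp
#include <cstddef>

// Invented names for illustration; see the patch itself for the real code.
const std::size_t BYTES_PER_SMOB = 2000; // from the benchmark above
const double RECLAIM_THRESHOLD = 0.2;    // reclaiming <20% counts as ineffective

// Decide how many bytes to grow the heap by after a GC cycle, so that the
// heap ends up at least BYTES_PER_SMOB per live (non-simple) smob.
std::size_t
heap_expansion (std::size_t heap_size, std::size_t free_bytes_after_gc,
                std::size_t live_smob_count)
{
  double reclaimed = static_cast<double> (free_bytes_after_gc)
                     / static_cast<double> (heap_size);
  if (reclaimed >= RECLAIM_THRESHOLD)
    return 0; // GC is still effective; leave the heap alone.

  std::size_t target = live_smob_count * BYTES_PER_SMOB;
  return target > heap_size ? target - heap_size : 0;
}
```

The computed amount would then be handed to libgc (e.g. via GC_expand_hp, as discussed later in this thread); the point is that growing the heap relative to the live set keeps the fraction of time spent in GC near the 10% sweet spot measured above.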
Patch Set 1 #
Total comments: 1
Patch Set 2 : nit #
Total comments: 8
Patch Set 3 : move to lily-guile.cc #
Patch Set 4 : autoconf #
Total comments: 4
Patch Set 5 : jonas' comments #
Patch Set 6 : rebase #
Patch Set 7 : hook into GC event #
Patch Set 8 : Use smob count as memory proxy #
Total comments: 8
Patch Set 9 : Jonas' comments #
Patch Set 10 : comment #
Messages
Total messages: 65
LGTM

https://codereview.appspot.com/561390043/diff/563450046/lily/score-engraver.cc
File lily/score-engraver.cc (right):

https://codereview.appspot.com/561390043/diff/563450046/lily/score-engraver.c...
lily/score-engraver.cc:200: // This double the heap. TODO: don't do this if we get close to
s/double/doubles/
nit
https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
File lily/include/score-engraver.hh (right):

https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
lily/include/score-engraver.hh:33: GC_word last_gc_count_;
FYI: Because we're using C++11 now, you have the option of providing the
default value of this member right here.

    GC_word last_gc_count_ = -1;

If you did that, you wouldn't have to mention this member in any constructors.
IMO it would be even better than usual in this case because it would get rid of
a preprocessor conditional in the cc file.

https://codereview.appspot.com/561390043/diff/557260051/lily/score-engraver.cc
File lily/score-engraver.cc (right):

https://codereview.appspot.com/561390043/diff/557260051/lily/score-engraver.c...
lily/score-engraver.cc:174: #include <gc.h>
It would be better to move this to the top, or if there is a reason it can't be
moved to the top, to comment. Includes in the middle of a file can frustrate
maintainers on occasion when someone uses a namespace or defines file-scope
things before them that conflict with what's in them.

https://codereview.appspot.com/561390043/diff/557260051/lily/score-engraver.c...
lily/score-engraver.cc:199: if (reclaimed < 0.2) {
TODO (?) make this threshold configurable
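The suggestion above can be illustrated with a minimal sketch. The class here is a hypothetical stand-in for the real engraver, and `unsigned long` stands in for libgc's `GC_word`:

```cpp
// C++11 non-static data member initializer: the member gets its default
// value at the point of declaration, so no constructor (and no preprocessor
// conditional) needs to mention it.
struct Score_engraver_sketch
{
  unsigned long last_gc_count_ = static_cast<unsigned long> (-1);
};
```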
https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
File lily/include/score-engraver.hh (right):

https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
lily/include/score-engraver.hh:26: #include <gc.h>
This effectively adds a dependency on libgc which you should search for at
configure - it is not necessarily installed in a default location.
https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
File lily/include/score-engraver.hh (right):

https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
lily/include/score-engraver.hh:26: #include <gc.h>
On 2020/02/01 11:00:40, hahnjo wrote:
> This effectively adds a dependency on libgc which you should search for at
> configure - it is not necessarily installed in a default location.

I'm confused - does this mean `gc/gc.h` or `libguile/gc.h`? I think the latter
only includes `gc/gc.h` transitively through `libguile/bdw-gc.h` if
BUILDING_LIBGUILE is defined, so this code needs the first.
On 2020/02/01 11:00:41, hahnjo wrote:
> https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
> File lily/include/score-engraver.hh (right):
>
> https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
> lily/include/score-engraver.hh:26: #include <gc.h>
> This effectively adds a dependency on libgc which you should search for at
> configure - it is not necessarily installed in a default location.

I've added a TODO for now. One complication is that libgc doesn't come with
pkg-config.
move to lily-guile.cc
https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
File lily/include/score-engraver.hh (right):

https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
lily/include/score-engraver.hh:33: GC_word last_gc_count_;
On 2020/02/01 10:55:50, Dan Eble wrote:
> FYI: Because we're using C++11 now, you have the option of providing the
> default value of this member right here.
>
> GC_word last_gc_count_ = -1;
>
> If you did that, you wouldn't have to mention this member in any constructors.
> IMO it would be even better than usual in this case because it would get rid
> of a preprocessor conditional in the cc file.

Done.

https://codereview.appspot.com/561390043/diff/557260051/lily/score-engraver.cc
File lily/score-engraver.cc (right):

https://codereview.appspot.com/561390043/diff/557260051/lily/score-engraver.c...
lily/score-engraver.cc:174: #include <gc.h>
On 2020/02/01 10:55:50, Dan Eble wrote:
> It would be better to move this to the top, or if there is a reason it can't
> be moved to the top, to comment. Includes in the middle of a file can
> frustrate maintainers on occasion when someone uses a namespace or defines
> file-scope things before them that conflict with what's in them.

moved to lily-guile.{cc,hh}

https://codereview.appspot.com/561390043/diff/557260051/lily/score-engraver.c...
lily/score-engraver.cc:199: if (reclaimed < 0.2) {
On 2020/02/01 10:55:50, Dan Eble wrote:
> TODO (?) make this threshold configurable

1) Before we make this fancy, I'd rather see GUILE 2.x working fully. I've
added a note on how to disable this as a comment for now.
2) if people want to configure this locally, they should just set
GC_MAX_HEAPSIZE to their amount of RAM available.
On 2020/02/01 20:49:09, hanwenn wrote: > On 2020/02/01 11:00:41, hahnjo wrote: > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > File lily/include/score-engraver.hh (right): > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > lily/include/score-engraver.hh:26: #include <gc.h> > > This effectively adds a dependency on libgc which you should search for at > > configure - it is not necessarily installed in a default location. > > I've added a TODO for now. One complication is that libgc doesn't come with > pkg-config. It seems like a bad idea to change the bgc parameters behind Guile's back. Does Guile not have any GC hooks of its own that one could use for increasing the heap size or its behavior with regard to requesting memory?
On 2020/02/01 20:59:06, dak wrote: > On 2020/02/01 20:49:09, hanwenn wrote: > > On 2020/02/01 11:00:41, hahnjo wrote: > > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > > File lily/include/score-engraver.hh (right): > > > > > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > > lily/include/score-engraver.hh:26: #include <gc.h> > > > This effectively adds a dependency on libgc which you should search for at > > > configure - it is not necessarily installed in a default location. > > > > I've added a TODO for now. One complication is that libgc doesn't come with > > pkg-config. > > It seems like a bad idea to change the bgc parameters behind Guile's back. Can you give a specific reason why? > Does > Guile not have any GC hooks of its own that one could use for increasing the > heap size or its behavior with regard to requesting memory? I guess you could allocate a large block of memory, and then deallocate it again. The whole idea of GUILE moving to libgc is that it gets out of the business of trying to do GC. Why would they insert themselves into that game again?
On 2020/02/01 21:23:51, hanwenn wrote: > On 2020/02/01 20:59:06, dak wrote: > > On 2020/02/01 20:49:09, hanwenn wrote: > > > On 2020/02/01 11:00:41, hahnjo wrote: > > > > > > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > > > File lily/include/score-engraver.hh (right): > > > > > > > > > > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > > > lily/include/score-engraver.hh:26: #include <gc.h> > > > > This effectively adds a dependency on libgc which you should search for at > > > > configure - it is not necessarily installed in a default location. > > > > > > I've added a TODO for now. One complication is that libgc doesn't come with > > > pkg-config. > > > > It seems like a bad idea to change the bgc parameters behind Guile's back. > > Can you give a specific reason why? > > > Does > > Guile not have any GC hooks of its own that one could use for increasing the > > heap size or its behavior with regard to requesting memory? > > I guess you could allocate a large block of memory, and then deallocate it > again. > > The whole idea of GUILE moving to libgc is that it gets out of the business of > trying > to do GC. Why would they insert themselves into that game again? From the Guile-2.2 manual: -- C Function: void * scm_malloc (size_t SIZE) -- C Function: void * scm_calloc (size_t SIZE) [...] These functions will (indirectly) call ‘scm_gc_register_allocation’. -- C Function: void scm_gc_register_allocation (size_t SIZE) Informs the garbage collector that SIZE bytes have been allocated, which the collector would otherwise not have known about. In general, Scheme will decide to collect garbage only after some amount of memory has been allocated. Calling this function will make the Scheme garbage collector know about more allocation, and thus run more often (as appropriate). 
It is especially important to call this function when large unmanaged allocations, like images, may be freed by small Scheme allocations, like foreign objects. So it would appear that Guile does have an interface for the notion of getting in memory pressure, and we incidentally also call those hooks when we are allocating smob structures.
On 2020/02/01 20:49:09, hanwenn wrote:
> On 2020/02/01 11:00:41, hahnjo wrote:
> > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
> > File lily/include/score-engraver.hh (right):
> >
> > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
> > lily/include/score-engraver.hh:26: #include <gc.h>
> > This effectively adds a dependency on libgc which you should search for at
> > configure - it is not necessarily installed in a default location.
>
> I've added a TODO for now. One complication is that libgc doesn't come with
> pkg-config.

This is just not true:

    $ pkg-config --libs bdw-gc
    -lgc
On 2020/02/02 09:49:50, hahnjo wrote: > On 2020/02/01 20:49:09, hanwenn wrote: > > On 2020/02/01 11:00:41, hahnjo wrote: > > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > > File lily/include/score-engraver.hh (right): > > > > > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > > lily/include/score-engraver.hh:26: #include <gc.h> > > > This effectively adds a dependency on libgc which you should search for at > > > configure - it is not necessarily installed in a default location. > > > > I've added a TODO for now. One complication is that libgc doesn't come with > > pkg-config. > > This is just not true: > $ pkg-config --libs bdw-gc > -lgc Oops! my bad. I didn't pay close enough attention to the file list.
On 2020/02/01 21:59:14, dak wrote: > On 2020/02/01 21:23:51, hanwenn wrote: > > On 2020/02/01 20:59:06, dak wrote: > > > On 2020/02/01 20:49:09, hanwenn wrote: > > > > On 2020/02/01 11:00:41, hahnjo wrote: > > > > > > > > > > > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > > > > File lily/include/score-engraver.hh (right): > > > > > > > > > > > > > > > > > > > > https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en... > > > > > lily/include/score-engraver.hh:26: #include <gc.h> > > > > > This effectively adds a dependency on libgc which you should search for > at > > > > > configure - it is not necessarily installed in a default location. > > > > > > > > I've added a TODO for now. One complication is that libgc doesn't come > with > > > > pkg-config. > > > > > > It seems like a bad idea to change the bgc parameters behind Guile's back. > > > > Can you give a specific reason why? > > > > > Does > > > Guile not have any GC hooks of its own that one could use for increasing the > > > heap size or its behavior with regard to requesting memory? > > > > I guess you could allocate a large block of memory, and then deallocate it > > again. > > > > The whole idea of GUILE moving to libgc is that it gets out of the business of > > trying > > to do GC. Why would they insert themselves into that game again? > > From the Guile-2.2 manual: > > -- C Function: void * scm_malloc (size_t SIZE) > -- C Function: void * scm_calloc (size_t SIZE) > [...] > > These functions will (indirectly) call > ‘scm_gc_register_allocation’. > > -- C Function: void scm_gc_register_allocation (size_t SIZE) > Informs the garbage collector that SIZE bytes have been allocated, > which the collector would otherwise not have known about. > > In general, Scheme will decide to collect garbage only after some > amount of memory has been allocated. 
Calling this function will > make the Scheme garbage collector know about more allocation, and > thus run more often (as appropriate). > > It is especially important to call this function when large > unmanaged allocations, like images, may be freed by small Scheme > allocations, like foreign objects. > > So it would appear that Guile does have an interface for the notion of getting > in memory pressure, and we incidentally also call those hooks when we are > allocating smob structures. that is a hook, but it's not the one we need. What we want is a way to make BDW/Guile expand the heap even if it thinks it's not necessary. scm_register_allocation is meant to increase the GC frequency, but we want the opposite. We could do scm_gc_malloc + scm_gc_free, but 1) it's doing more work than we need 2) there is a risk that the GC library will treat our large allocations especially (e.g returning to the O/S on GC_free) I think asking libgc to expand the heap is the best way to ask for a larger heap.
On 2020/02/01 11:00:41, hahnjo wrote:
> https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
> File lily/include/score-engraver.hh (right):
>
> https://codereview.appspot.com/561390043/diff/557260051/lily/include/score-en...
> lily/include/score-engraver.hh:26: #include <gc.h>
> This effectively adds a dependency on libgc which you should search for at
> configure - it is not necessarily installed in a default location.

added an autoconf stanza.
autoconf
https://codereview.appspot.com/561390043/diff/565600046/configure.ac
File configure.ac (right):

https://codereview.appspot.com/561390043/diff/565600046/configure.ac#newcode270
configure.ac:270: PKG_CHECK_MODULES(BDWGC, bdw-gc)
This should fail if not found, you're using it unconditionally

https://codereview.appspot.com/561390043/diff/565600046/lily/lily-guile.cc
File lily/lily-guile.cc (right):

https://codereview.appspot.com/561390043/diff/565600046/lily/lily-guile.cc#ne...
lily/lily-guile.cc:30: // pkg-config or other type of autodiscovery mechanism on Fedora.
Remove comment
jonas' comments
https://codereview.appspot.com/561390043/diff/565600046/configure.ac
File configure.ac (right):

https://codereview.appspot.com/561390043/diff/565600046/configure.ac#newcode270
configure.ac:270: PKG_CHECK_MODULES(BDWGC, bdw-gc)
On 2020/02/02 13:21:09, hahnjo wrote:
> This should fail if not found, you're using it unconditionally

Done.

https://codereview.appspot.com/561390043/diff/565600046/lily/lily-guile.cc
File lily/lily-guile.cc (right):

https://codereview.appspot.com/561390043/diff/565600046/lily/lily-guile.cc#ne...
lily/lily-guile.cc:30: // pkg-config or other type of autodiscovery mechanism on Fedora.
On 2020/02/02 13:21:09, hahnjo wrote:
> Remove comment

Done.
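For reference, a pkg-config check along the lines discussed here might look like this in configure.ac (a sketch, not necessarily the exact stanza the patch adds; the BDWGC variable prefix comes from the review comment above). Without an ACTION-IF-NOT-FOUND argument, PKG_CHECK_MODULES aborts configure when the module is missing, which addresses the "should fail if not found" point:

```
# Locate the Boehm GC via pkg-config; configure fails hard when the
# bdw-gc module is absent, since the result is used unconditionally.
PKG_CHECK_MODULES([BDWGC], [bdw-gc])
CPPFLAGS="$CPPFLAGS $BDWGC_CFLAGS"
LIBS="$LIBS $BDWGC_LIBS"
```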
On 2020/02/02 13:43:25, hanwenn wrote:
> jonas' comments

The uploaded diff has the wrong base, it's reverting quite some changes from
master.
rebase
On 2020/02/02 13:45:34, hahnjo wrote:
> On 2020/02/02 13:43:25, hanwenn wrote:
> > jonas' comments
>
> The uploaded diff has the wrong base, it's reverting quite some changes from
> master.

I hate Rietveld. Fixed.
I just tried to reproduce the timings for commits already in master and this
patch. To be honest I don't see a clear picture yet.

Yes, this change seems to improve the time spent for garbage collection, but
the real time reported by "time" only decreases by a fraction (less than 50%
of the saved time for gc). Also I consistently measure increased total and gc
time when toggling the setting of the initial heap size, i.e. the change in
master actually makes it slower for me.

My conclusion would be that we need to measure larger scores, not executions
of less than 10s. This may be the use case that most users care about, but
AFAICS it's actually pretty hard to get reliable data for now. I've tried to
use the MSDM example from
https://lists.gnu.org/archive/html/lilypond-user/2016-11/msg00700.html
which runs for around ~40s on my system, but it crashes with Guile 2.2:

    GUILE signaled an error for the expression beginning here
    #
    (define-music-function (parser location )()
    Unbound variable: ol

Should we maybe prioritize correctness over performance for now? Comparing
performance with a fully working LilyPond should be much easier.
jonas.hahnfeld@gmail.com writes: > I just tried to reproduce the timings for commits already in master and > this patch. To be honest I don't see a clear picture yet. > > Yes, this change seems to improve the time spent for garbage collection, > but the real time reported by "time" only decreases by a fraction (less > than 50% of the saved time for gc). Also I consistently measure > increased total and gc time when toggling the setting of the initial > heap size, ie the change in master actually makes it slower for me. > > My conclusion would be that we need to measure larger scores, not > executions less than 10s. This may be the use case that most users care > about, but AFAICS it's actually pretty hard to get reliable data for > now. > I've tried to use the MSDM example from > https://lists.gnu.org/archive/html/lilypond-user/2016-11/msg00700.html > which runs for around ~40s on my system, but it crashes with Guile 2.2: > GUILE signaled an error for the expression beginning here > # > > (define-music-function (parser location )() > > Unbound variable: ol The preceding line is col = so this is likely a matter of passing the wrong part of the file into Guile when encountering # . The file contains two 3-byte UTF-8 sequences above which could be thought to throw off the interpretation by 4 bytes. But it actually is off by 6 bytes if it is running into the preceding "ol", as if the special characters/bytes are not seen at all. -- David Kastrup
For testing, use https://github.com/hanwen/lilypond/tree/guile22-experiment in particular, you need https://github.com/hanwen/lilypond/commit/b696550379831ecec1519be6d59cd8a3003... for the UTF-8 parsing. I fixed this a week ago, but due to the delays in getting the preceding fix in ("cleanup embedded SCM parsing"), I couldn't send that for review yet. For me, juggling 15 different outstanding code reviews at the same time is the bane of the current development process. On Sun, Feb 2, 2020 at 9:51 PM David Kastrup <dak@gnu.org> wrote: > > jonas.hahnfeld@gmail.com writes: > > > I just tried to reproduce the timings for commits already in master and > > this patch. To be honest I don't see a clear picture yet. > > > > Yes, this change seems to improve the time spent for garbage collection, > > but the real time reported by "time" only decreases by a fraction (less > > than 50% of the saved time for gc). Also I consistently measure > > increased total and gc time when toggling the setting of the initial > > heap size, ie the change in master actually makes it slower for me. > > > > My conclusion would be that we need to measure larger scores, not > > executions less than 10s. This may be the use case that most users care > > about, but AFAICS it's actually pretty hard to get reliable data for > > now. > > I've tried to use the MSDM example from > > https://lists.gnu.org/archive/html/lilypond-user/2016-11/msg00700.html > > which runs for around ~40s on my system, but it crashes with Guile 2.2: > > GUILE signaled an error for the expression beginning here > > # > > > > (define-music-function (parser location )() > > > > Unbound variable: ol > > The preceding line is > > col = > > so this is likely a matter of passing the wrong part of the file into > Guile when encountering # . The file contains two 3-byte UTF-8 > sequences above which could be thought to throw off the interpretation > by 4 bytes. 
But it actually is off by 6 bytes if it is running into the > preceding "ol", as if the special characters/bytes are not seen at all. > > -- > David Kastrup -- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
On 02/02/2020 22:33, Han-Wen Nienhuys wrote:
> For me, juggling 15 different outstanding code reviews at the same
> time is the bane of the current development process.

what do you suggest?

James
On Sun, Feb 2, 2020 at 11:47 PM <pkx166h@posteo.net> wrote: > > On 02/02/2020 22:33, Han-Wen Nienhuys wrote: > > For me, juggling 15 different outstanding code reviews at the same > > time is the bane of the current development process. > > what do you suggest? I think we should move to a git-based code review tool; Github, Gitlab or Gerrit. Gerrit is probably the closest to the current workflow. Then, you could push an entire branch with several commits for review. The state of the change could be administered within the code review itself, so we don't need to keep tools in sync with each other. -- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
On Mon, Feb 3, 2020 at 9:35 AM Han-Wen Nienhuys <hanwenn@gmail.com> wrote: > > On Sun, Feb 2, 2020 at 11:47 PM <pkx166h@posteo.net> wrote: > > > > On 02/02/2020 22:33, Han-Wen Nienhuys wrote: > > > For me, juggling 15 different outstanding code reviews at the same > > > time is the bane of the current development process. > > > > what do you suggest? > > I think we should move to a git-based code review tool; Github, Gitlab > or Gerrit. Gerrit is probably the closest to the current workflow. > > Then, you could push an entire branch with several commits for review. > > The state of the change could be administered within the code review > itself, so we don't need to keep tools in sync with each other. In Gerrit, you can define your own labels (with permission who can vote on them). See for example, https://review.gerrithub.io/c/hanwen/lilypond/+/483638 This has a label for "passes make doc" and "regtest visual inspection". One can drive CI systems by querying for the result, eg. https://review.gerrithub.io/q/project:hanwen/lilypond++label:Passes-Make-Doc (with the label) or https://review.gerrithub.io/q/project:hanwen/lilypond++-label:Passes-Make-Doc (without the label). -- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
On Mon, Feb 3, 2020 at 9:35 AM Han-Wen Nienhuys <hanwenn@gmail.com> wrote: > > > For me, juggling 15 different outstanding code reviews at the same > > > time is the bane of the current development process. > > > > what do you suggest? > > I think we should move to a git-based code review tool; Github, Gitlab > or Gerrit. Gerrit is probably the closest to the current workflow. I think it would also be good if we can find a mode in which we get rid of the "countdown". Instead of waiting for complaints, a change is pushed once it passes tests and someone LGTM'd it. If we discover a problem with the change afterwards, we could simply revert it and discuss further. -- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
On Mon, Feb 3, 2020 at 9:35 AM Han-Wen Nienhuys <hanwenn@gmail.com> wrote: > > On 02/02/2020 22:33, Han-Wen Nienhuys wrote: > > > For me, juggling 15 different outstanding code reviews at the same > > > time is the bane of the current development process. > > > > what do you suggest? > > I think we should move to a git-based code review tool; Github, Gitlab > or Gerrit. Gerrit is probably the closest to the current workflow. For Github, you can define labels and checks to apply to pull-requests, https://help.github.com/en/github/managing-your-work-on-github/applying-label... and you can define rules that require checks to pass before merging the PR. https://developer.github.com/v3/repos/statuses/ Gitlab has support for CI within the gitlab product itself, https://docs.gitlab.com/ee/ci/introduction/index.html#basic-cicd-workflow Gitlab/Github have the advantage over Gerrit that they include bug tracker integration (ie. you can easily track which commits fix which bugs and vice versa). Gerrit offers a more refined reviewing experience. Gitlab and Gerrit are open source, with the latter probably being easier to deploy (as it does less things.) -- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
Han-Wen Nienhuys <hanwenn@gmail.com> writes: > On Sun, Feb 2, 2020 at 11:47 PM <pkx166h@posteo.net> wrote: >> >> On 02/02/2020 22:33, Han-Wen Nienhuys wrote: >> > For me, juggling 15 different outstanding code reviews at the same >> > time is the bane of the current development process. >> >> what do you suggest? > > I think we should move to a git-based code review tool; Github, Gitlab > or Gerrit. Gerrit is probably the closest to the current workflow. > > Then, you could push an entire branch with several commits for review. > > The state of the change could be administered within the code review > itself, so we don't need to keep tools in sync with each other. I don't see how this changes the problem of a controversial patch blocking the progress of dependent patches while the controversy is not yet resolved. -- David Kastrup
Han-Wen Nienhuys <hanwenn@gmail.com> writes: > On Mon, Feb 3, 2020 at 9:35 AM Han-Wen Nienhuys <hanwenn@gmail.com> wrote: >> > > For me, juggling 15 different outstanding code reviews at the same >> > > time is the bane of the current development process. >> > >> > what do you suggest? >> >> I think we should move to a git-based code review tool; Github, Gitlab >> or Gerrit. Gerrit is probably the closest to the current workflow. > > I think it would also be good if we can find a mode in which we get > rid of the "countdown". Instead of waiting for complaints, a change is > pushed once it passes tests and someone LGTM'd it. If I may remind you, that was the effective state when I started on LilyPond. I submitted several patches that improved upon internals, so nobody currently active felt up to giving a comment or LGTM. Eventually I raised such a shitstorm over this complete blockage of new work that in the aftermath of dealing with it (that eventually led to the adoption of rules and procedures more amenable to new developers) Valentin left. I am not proud of that outcome. I have a Guile patch submitted in this state of waiting for "someone LGTM it" for 6 years already <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=17474> because it involves internals of Guile, so people will rely on Andy Wingo for the LGTM and Andy Wingo cannot be arsed to even comment. That is a very dissatisfactory state particularly for newcomers since it draws a sharp and terminal dividing line between newcomers to the project and existing members who can just go ahead pushing their own patches or rely on stock LGTMs by others. We did not adopt the current system in a vacuum, we adopted it to address a situation that was very frustrating for new developers. > If we discover a problem with the change afterwards, we could simply > revert it and discuss further. 
Let me make this clear: you are currently in the process of submitting patches that change code conventions from current C++ practice used by current contributors to standards you decided on decades ago for your own work on LilyPond. If you follow the current discussion, how many LGTMs on those changes do you count? Our rules give you a chance at outsitting objections until your patch makes it in by default. If you instead had to wait for an LGTM, you'd be held up for an indefinite time. People could just ignore you and your work would never get anywhere, unless you use your commit privileges to ignore the process. That's the way things were handled when I started working on LilyPond, and it is not, in my book, a state worth returning to. In short: your proposed solution would not address your problem if you took reviews seriously. Instead you would have to rely on ignoring what people say and go ahead anyway. That is an approach that does not scale to projects of arbitrary size. -- David Kastrup
On Feb 3, 2020, at 04:30, Han-Wen Nienhuys <hanwenn@gmail.com> wrote: > > Instead of waiting for complaints, a change is > pushed once it passes tests and someone LGTM'd it. I've worked in places where commits were handled with self-discipline and mutual accountability, and I've worked in places where a UI enforced upper-management's policy that every change had to be approved by two other developers before it could be merged. I prefer self-discipline and mutual accountability to having to nag people with a superficial understanding of a change to put themselves on record as approving it. Therefore, regarding the countdown, I think it's a bad idea to require approval before pushing, unless we grant that the patch meister can approve pushing with the reason "countdown complete." Regarding accelerating the process, I wouldn't have a problem with a developer pushing a change after tests have passed, after receiving the clear approval of a developer competent in the subject, and when others aren't likely to disagree. It would take a little more trust and clarity of feedback than always waiting for the countdown. I don't expect that it would be the norm. — Dan
Sign in to reply to this message.
Dan Eble <dan@faithful.be> writes:

> On Feb 3, 2020, at 04:30, Han-Wen Nienhuys <hanwenn@gmail.com> wrote:
>> Instead of waiting for complaints, a change is
>> pushed once it passes tests and someone LGTM'd it.
>
> I've worked in places where commits were handled with self-discipline
> and mutual accountability, and I've worked in places where a UI
> enforced upper-management's policy that every change had to be
> approved by two other developers before it could be merged. I prefer
> self-discipline and mutual accountability to having to nag people with
> a superficial understanding of a change to put themselves on record as
> approving it.
>
> Therefore, regarding the countdown, I think it's a bad idea to require
> approval before pushing, unless we grant that the patch meister can
> approve pushing with the reason "countdown complete."
>
> Regarding accelerating the process, I wouldn't have a problem with a
> developer pushing a change after tests have passed, after receiving
> the clear approval of a developer competent in the subject, and when
> others aren't likely to disagree. It would take a little more trust
> and clarity of feedback than always waiting for the countdown. I
> don't expect that it would be the norm.

It does happen, and different developers are differently comfortable taking the responsibility for bypassing the opportunity of feedback depending on the situation. The staging procedure reduces the potential size of consequences of such a step, even in case of completely bypassing any kind of review for trivial changes (trivial typically also implying that no computer-interpreted syntax is changed). It mostly makes sense where it is unexpected others will comment, or when in the course of larger interdependent changes that cannot be defused by rebases, the queue is expected to continue stalling for prolonged amounts of time.

-- David Kastrup
Sign in to reply to this message.
On Monday, 2020-02-03 at 09:58 -0500, Dan Eble wrote:
> On Feb 3, 2020, at 04:30, Han-Wen Nienhuys <hanwenn@gmail.com> wrote:
> > Instead of waiting for complaints, a change is
> > pushed once it passes tests and someone LGTM'd it.
>
> I've worked in places where commits were handled with self-discipline and mutual accountability, and I've worked in places where a UI enforced upper-management's policy that every change had to be approved by two other developers before it could be merged. I prefer self-discipline and mutual accountability to having to nag people with a superficial understanding of a change to put themselves on record as approving it.
>
> Therefore, regarding the countdown, I think it's a bad idea to require approval before pushing, unless we grant that the patch meister can approve pushing with the reason "countdown complete."
>
> Regarding accelerating the process, I wouldn't have a problem with a developer pushing a change after tests have passed, after receiving the clear approval of a developer competent in the subject, and when others aren't likely to disagree.

Right now, the Countdown is also the time when other developers get the possibility to disagree. How can you know if others disagree before they tell you? (I worked in another project where it could happen that a patch was accepted within 10 minutes and pushed 30 minutes after upload. No way I can be online 24/7, and there were serious problems with the changes.)

> It would take a little more trust and clarity of feedback than always waiting for the countdown. I don't expect that it would be the norm.

We already have this opportunity (see David's concurrent reply). No, this should not be the norm and certainly is not, which is good in my opinion.

To get forward, I'd propose a fresh thread on lilypond-devel with a concrete proposal on what to change, its pros and cons. Discussing this on a random review thread is just a waste of time.

Jonas
Sign in to reply to this message.
On 2020/02/02 22:33:50, hanwenn wrote:
> For testing, use
> https://github.com/hanwen/lilypond/tree/guile22-experiment

So I ran this with the large example and I see

GC Warning: Failed to expand heap by 30635458560 bytes

a few times (that's 30 GB, my laptop only has 8 GB!!) and finally

warning: g_spawn_sync failed (0): gs: Failed to fork (Cannot allocate memory)

Maybe exponential growth is not the best choice here? At least my system can't handle this score anymore.

Also most of the 'speedup' by this patch seems to come from segments _after_ music interpretation (only 10% improvement in music interpretation, but around 2x for the total time). I think the later parts just benefit from the enormous heap because GC doesn't need to run at all? But that's not really the idea of this patch, is it?
Sign in to reply to this message.
On Tue, Feb 4, 2020 at 7:50 PM <jonas.hahnfeld@gmail.com> wrote:
> On 2020/02/02 22:33:50, hanwenn wrote:
> > For testing, use
> > https://github.com/hanwen/lilypond/tree/guile22-experiment
>
> So I ran this with the large example and I see
> GC Warning: Failed to expand heap by 30635458560 bytes
> a few times (that's 30 GB, my laptop only has 8 GB!!) and finally
> warning: g_spawn_sync failed (0): gs: Failed to fork (Cannot allocate memory)
> Maybe exponential growth is not the best choice here? At least my system
> can't handle this score anymore.
>
> Also most of the 'speedup' by this patch seems to come from segments
> _after_ music interpretation (only 10% improvement in music
> interpretation, but around 2x for the total time). I think the later
> parts just benefit from the enormous heap because GC doesn't need to run
> at all? But that's not really the idea of this patch, is it?

In part, it actually is. If you have a lot of RAM, you should use it instead of wasting CPU cycles recycling it.

Do you have a pointer to the score for testing? Can you do some timings with different values for GC_MAXIMUM_HEAP_SIZE, e.g. GC_MAXIMUM_HEAP_SIZE=100M?

Also, it would be interesting to run with GC_PRINT_STATS=1 and see how much time is spent in GC, and how much a typical GC run reclaims across the process.

-- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
Sign in to reply to this message.
On 2020/02/04 22:23:45, hanwenn wrote:
> On Tue, Feb 4, 2020 at 7:50 PM <jonas.hahnfeld@gmail.com> wrote:
> > [...]
>
> In part, it actually is. If you have a lot of RAM, you should use it
> instead of trying to waste CPU cycles recycling it.
>
> do you have a pointer to the score for testing?

I posted this above: https://lists.gnu.org/archive/html/lilypond-user/2016-11/msg00700.html

> Can you do some timings with different values for GC_MAXIMUM_HEAP_SIZE, eg.
> GC_MAXIMUM_HEAP_SIZE=100M

Not sure if you actually mean GC_MAXIMUM_HEAP_SIZE or rather GC_INITIAL_HEAP_SIZE?

guile22-experiment w/o the growing added in this patch:
* default: ~2m33s (gc time taken: 153.316484488)
* GC_MAXIMUM_HEAP_SIZE=100M -> "GC Warning: Out of Memory! Trying to continue..." (kind of makes sense because it needs ~900MB of RAM)
* GC_INITIAL_HEAP_SIZE=100M: 2m30s (gc time taken: 2.189005726; can't believe this)
* GC_INITIAL_HEAP_SIZE=900M: 0m59s (gc time taken: 8.895736892)

> Also, it would be interesting to GC_PRINT_STATS=1 and see how much time is
> spent in GC, and how much a typical GC runs reclaim across the process.

Um, that outputs a lot. Anything in particular?
Sign in to reply to this message.
Some more numbers: I took guile22-experiment and removed the following:
* GC_set_free_space_divisor,
* GC_INITIAL_HEAP_SIZE=40M
* heap growing in Score_engraver

This gives me:
* default: ~2m30s (although I saw one run in 1m55s?!?)
* GC_INITIAL_HEAP_SIZE=40M: ~2m10s (one run in 1m40s)
* GC_FREE_SPACE_DIVISOR=1: very diverse - from ~1m10s (2 executions) to ~2m40s (3 executions)
* GC_NPROCS=1: from ~1m45s to ~2m10s

This variation makes any statement about performance impractical. I tried taskset -c 2 + GC_NPROCS=1: ~2m10s and it seems stable, but I have yet to build a theory why OS thread migration actually improves performance... Han-Wen, did you see similar variations in your experiments?
Sign in to reply to this message.
On Feb 5, 2020, at 07:59, jonas.hahnfeld@gmail.com wrote:
> and it seems stable, but I have yet to build a theory why OS thread
> migration actually improves performance…

Educated guess: The core gets too hot. The OS is not allowed to move the thread to a cooler core, so it throttles the clock rate instead.

— Dan
Sign in to reply to this message.
Dan Eble <dan@faithful.be> writes:

> On Feb 5, 2020, at 07:59, jonas.hahnfeld@gmail.com wrote:
> > and it seems stable, but I have yet to build a theory why OS thread
> > migration actually improves performance…
>
> Educated guess: The core gets too hot. The OS is not allowed to move
> the thread to a cooler core, so it throttles the clock rate instead.

Heat dependent throttling would explain the rather drastic variation in performance since it acts with large latencies.

-- David Kastrup
Sign in to reply to this message.
On Wed, Feb 5, 2020 at 12:33 PM <jonas.hahnfeld@gmail.com> wrote:
> > Can you do some timings with different values for GC_MAXIMUM_HEAP_SIZE, eg.
> > GC_MAXIMUM_HEAP_SIZE=100M
>
> Not sure if you actually mean GC_MAXIMUM_HEAP_SIZE or rather
> GC_INITIAL_HEAP_SIZE?

I mean maximum, it sets the total amount of memory. If you set the initial heap size, it will still grow the heap, but from a larger starting point.

I am wondering
* what amount of RAM this score needs as a minimum,
* how much over the minimum you need to be to get decent runtime.

This could give us hints on how aggressively we should scale up the heap.

> guile22-experiment w/o the growing added in this patch:
> default: ~2m33s (gc time taken: 153.316484488)
> GC_MAXIMUM_HEAP_SIZE=100M -> "GC Warning: Out of Memory! Trying to
> continue..." (kind of makes sense because it needs ~900MB of RAM)
> GC_INITIAL_HEAP_SIZE=100M: 2m30s (gc time taken: 2.189005726; can't
> believe this)
> GC_INITIAL_HEAP_SIZE=900M: 0m59s (gc time taken: 8.895736892)
>
> > Also, it would be interesting to GC_PRINT_STATS=1 and see how much
> > time is spent in GC, and how much a typical GC runs reclaim across
> > the process.
>
> Um, that outputs a lot. Anything in particular?

When you see things like:

World-stopped marking took 187 msecs (77 in average)
In-use heap: 51% (40071 KiB pointers + 6357 KiB other)

it means it took 187ms for GC, and then reclaimed 49% of the memory. If you often see higher percentages, we'll spend more CPU in marking memory without getting fresh memory.

I tried to run the carver score, but it needs updating and,

[hanwen@localhost lilypond]$ ./out/bin/convert-ly carver/*ly
convert-ly (GNU LilyPond) 2.21.0

Traceback (most recent call last):
  File "./out/bin/convert-ly", line 413, in <module>
    main ()
  File "./out/bin/convert-ly", line 387, in main
    f = f.decode (sys.stdin.encoding or "utf-8")
AttributeError: 'str' object has no attribute 'decode'
[hanwen@localhost lilypond]$ python2 ./out/bin/convert-ly carver/*ly
Traceback (most recent call last):
  File "./out/bin/convert-ly", line 59, in <module>
    import lilylib as ly
  File "/home/hanwen/vc/lilypond/out/lib/lilypond/current/python/lilylib.py", line 216
    print('command failed:', cmd, file=sys.stderr)
        ^
SyntaxError: invalid syntax

Is there still work left for the python3 conversion?

> https://codereview.appspot.com/561390043/

-- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
Sign in to reply to this message.
On Thu, Feb 6, 2020 at 12:19 AM Han-Wen Nienhuys <hanwenn@gmail.com> wrote:
> > Um, that outputs a lot. Anything in particular?
>
> When you see things like:
>
> World-stopped marking took 187 msecs (77 in average)
> In-use heap: 51% (40071 KiB pointers + 6357 KiB other)
>
> it means it took 187ms for GC, and then reclaimed 49% of the memory.
>
> If you often see higher percentages, we'll spend more CPU in marking
> memory without getting fresh memory.

So the question is: are there phases in the process where the percentage printed is high, and if so, what are they?

-- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
Sign in to reply to this message.
Han-Wen Nienhuys <hanwenn@gmail.com> writes:

> On Wed, Feb 5, 2020 at 12:33 PM <jonas.hahnfeld@gmail.com> wrote:
>
> when you see things like:
>
> World-stopped marking took 187 msecs (77 in average)
> In-use heap: 51% (40071 KiB pointers + 6357 KiB other)
>
> it means it took 187ms for GC, and then reclaimed 49% of the memory.
>
> If you often see higher percentages, we'll spend more CPU in marking memory
> without getting fresh memory.
>
> I tried to run the carver score, but it needs updating and,
>
> [hanwen@localhost lilypond]$ ./out/bin/convert-ly carver/*ly
> convert-ly (GNU LilyPond) 2.21.0
>
> Traceback (most recent call last):
>   File "./out/bin/convert-ly", line 413, in <module>
>     main ()
>   File "./out/bin/convert-ly", line 387, in main
>     f = f.decode (sys.stdin.encoding or "utf-8")
> AttributeError: 'str' object has no attribute 'decode'
> [hanwen@localhost lilypond]$ python2 ./out/bin/convert-ly carver/*ly
> Traceback (most recent call last):
>   File "./out/bin/convert-ly", line 59, in <module>
>     import lilylib as ly
>   File "/home/hanwen/vc/lilypond/out/lib/lilypond/current/python/lilylib.py", line 216
>     print('command failed:', cmd, file=sys.stderr)
>     ^
> SyntaxError: invalid syntax
>
> Is there still work left for the python3 conversion?
>
>> https://codereview.appspot.com/561390043/

Yes, separate issue. Jonas, I think you should push the fix for that one to staging: from "doesn't work" there cannot be much more of a regression to be afraid of.

-- David Kastrup
Sign in to reply to this message.
On Thursday, 2020-02-06 at 00:27 +0100, David Kastrup wrote:
> [...]
> Yes, separate issue. Jonas, I think you should push the fix for that
> one to staging: from "doesn't work" there cannot be much more of a
> regression to be afraid of.

Right, in staging now.
Sign in to reply to this message.
Thanks, I ran the carver score successfully now. I took some pauses between runs so thermal throttling wasn't an issue.

My standard guile2.2 branch (with 40M initial heap) takes about 2m wall time, 1m of GC time. The stats printing shows that we have a very full heap:

In-use heap: 85% (370824 KiB pointers + 60065 KiB other)
In-use heap: 85% (370725 KiB pointers + 59737 KiB other)
In-use heap: 85% (370143 KiB pointers + 59687 KiB other)
..

so we have to scan a lot of live data to reclaim just 15% of the heap.

Tinkering with INITIAL_HEAP lets us tune things a little bit:
* initial heap 2G: 50s (2.69s GC)
* initial heap 900M: 0:54 (6s GC)
* initial heap 500M: 1:09 (33s GC)

The GC timings seem wonky, but since BDW is multithreaded, it's possible that the computation times don't completely add up.

GUILE 1.8 in default configuration: wall time 40s, 100% CPU.

On Thu, Feb 6, 2020 at 1:45 PM Jonas Hahnfeld <hahnjo@hahnjo.de> wrote:
> [...]
> > Yes, separate issue. Jonas, I think you should push the fix for that
> > one to staging: from "doesn't work" there cannot be much more of a
> > regression to be afraid of.
>
> Right, in staging now.

-- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
Sign in to reply to this message.
On Fri, Feb 7, 2020 at 5:53 PM Han-Wen Nienhuys <hanwenn@gmail.com> wrote:
> Thanks, I ran the carver score successfully now.
>
> I took some pauses between runs so thermal throttling wasn't an issue.
>
> My standard guile2.2 branch (with 40M initial heap), takes about 2m wall
> time, 1m of GC time.
>
> The stats printing shows that we have a very full heap:
>
> In-use heap: 85% (370824 KiB pointers + 60065 KiB other)
> In-use heap: 85% (370725 KiB pointers + 59737 KiB other)
> In-use heap: 85% (370143 KiB pointers + 59687 KiB other)
> ..
>
> so we have to scan a lot of live data to reclaim just 15% of the heap.
>
> Tinkering with INITIAL_HEAP lets us tune things a little bit:
>
> initial heap 2G: 50s (2.69s GC)
> initial heap 900M: 0:54 (6s GC)
> initial heap 500M: 1:09 (33s GC)
>
> The GC timings seem wonky, but since BDW is multithreaded, it's possible
> that the computation times don't completely add up.

Single threaded, the numbers make more sense:

MAX=INIT=2G
gc time taken: 1.843968805
User time (seconds): 49.36

MAX=INIT=1G
gc time taken: 3.291264925
User time (seconds): 51.74

MAX=INIT=800M
gc time taken: 5.760042906
User time (seconds): 54.62

500M
gc time taken: 17.921457247
User time (seconds): 68.24

It's interesting to note that the multithreaded collector doesn't really help. With a 500M heap, wall clock is 1:08 for the single threaded case, and 1:09 for the multithreaded case. The 800M case seems like a good configuration: it spends about 10% of the time doing GC.

$ grep In-use log
In-use heap: 0% (0 KiB pointers + 0 KiB other)
In-use heap: 2% (20959 KiB pointers + 2430 KiB other)
In-use heap: 2% (14698 KiB pointers + 2415 KiB other)
In-use heap: 39% (292008 KiB pointers + 29486 KiB other)
In-use heap: 47% (334968 KiB pointers + 53702 KiB other)
In-use heap: 45% (317237 KiB pointers + 53424 KiB other)
In-use heap: 57% (407528 KiB pointers + 67897 KiB other)
In-use heap: 53% (378243 KiB pointers + 60449 KiB other)
In-use heap: 53% (378249 KiB pointers + 60519 KiB other)
In-use heap: 53% (377488 KiB pointers + 60322 KiB other)

so we should scale the heap so that approximately 50% is collected on GC.

-- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
Sign in to reply to this message.
Han-Wen Nienhuys <hanwenn@gmail.com> writes:

> Single threaded, the numbers make more sense:
>
> MAX=INIT=2G
> gc time taken: 1.843968805
> User time (seconds): 49.36
>
> MAX=INIT=1G
> gc time taken: 3.291264925
> User time (seconds): 51.74
>
> MAX=INIT=800M
> gc time taken: 5.760042906
> User time (seconds): 54.62
>
> 500M
> gc time taken: 17.921457247
> User time (seconds): 68.24
>
> It's interesting to note that the multithreaded collector doesn't really
> help.

I think that this may be due to both/either our use of mark hooks and of finalisers for calling destructors. Either may cause serialisation. Another serialisation is because Guile itself switches BGC to Java mode where finalised objects can no longer be marked (or something like that: the exact semantics I do not remember). And of course the C++ free store still has to do its full job.

That's one reason why I think it may be a good idea to "punt" a bit on the encoding stuff by keeping it basically in 8-bit domain: it may give us an operative LilyPond for experimenting with other Guilev2 aspects, like making a custom allocator for Scheme-containing data structures instead of being in a state where only parts are operative.

-- David Kastrup

My replies have a tendency to cause friction. To help mitigating damage, feel free to forward problematic posts to me adding a subject like "timeout 1d" (for a suggested timeout of 1 day) or "offensive".
Sign in to reply to this message.
hook into GC event
Sign in to reply to this message.
On Fri, Feb 7, 2020 at 7:27 PM David Kastrup <dak@gnu.org> wrote:
> Han-Wen Nienhuys <hanwenn@gmail.com> writes:
> > [...]
> > It's interesting to note that the multithreaded collector doesn't really
> > help.
>
> I think that this may be due to both/either our use of mark hooks and of
> finalisers for calling destructors. Either may cause serialisation.
> Another serialisation is because Guile itself switches BGC to Java mode
> where finalised objects can no longer be marked (or something like that:
> the exact semantics I do not remember). And of course the C++ free
> store still has to do its full job.

Interesting. I'll try to put this somewhere as a comment.

> That's one reason why I think it may be a good idea to "punt" a bit on
> the encoding stuff by keeping it basically in 8-bit domain: it may give
> us an operative LilyPond for experimenting with other Guilev2 aspects,
> like making a custom allocator for Scheme-containing data structures
> instead of being in a state where only parts are operative.

It looks like we'll rather have to punt on the GC scaling. Jonas' run of the carver score exploded so spectacularly because the GC stats from libgc don't mean what I thought they meant, which means that the scaling was overly aggressive.

It seems impossible to scale up the heap automatically for now. We should instead print a message directing people to set GC_INITIAL_HEAP_SIZE=xxx in the environment.

For more info, see also https://github.com/ivmai/bdwgc/issues/304. Ivan is open to extending the API, but until that percolates through all the distributions ...

-- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
Sign in to reply to this message.
Use smob count as memory proxy
Sign in to reply to this message.
On 2020/02/08 20:05:53, hanwenn wrote:
> Use smob count as memory proxy

This looks good to me now; softcoding the 2000 is left as an exercise to the reader in another commit.
Sign in to reply to this message.
https://codereview.appspot.com/561390043/diff/567180043/lily/include/smobs.hh File lily/include/smobs.hh (right): https://codereview.appspot.com/561390043/diff/567180043/lily/include/smobs.hh... lily/include/smobs.hh:312: static size_t count; It seems that this is initialized to zero because it is static, but if it simply had an "= 0", I wouldn't have had to go refresh my memory with a web search. Is it correct that there are no static Smobs anywhere in the program? If there were, it would be responsible to spend some time checking whether this could be affected by the "static initialization order fiasco." (Maybe you already have.)
Sign in to reply to this message.
The approach sounds good to me. I can confirm that this works on my system and the runtime is pretty good. Please re-add the autoconf logic to detect bdw-gc.

FWIW I tried to research the internals of GC. I found the following statement that decides whether to collect or just expand the heap: https://github.com/ivmai/bdwgc/blob/v8.0.4/alloc.c#L1435 We are hit by the case

(GC_fo_entries > (last_fo_entries + 500) && (last_bytes_finalized | GC_bytes_finalized) != 0)

Removing this case gets me to a state very close to this patch, both in terms of performance and memory usage.

If I understand the approach correctly, GC_fo_entries counts the number of outstanding finalizations. It seems like Guile is making good use of this functionality, so GC tries to reclaim that memory first instead of just expanding the heap. Please note that the number 500 is hard-coded, so we have no means to influence it via environment variables or APIs. I guess something like that could be added, but it would again take some time to bubble through all distributions that we care about.

TL;DR: For now it's probably easiest to scale the heap as this patch does.

https://codereview.appspot.com/561390043/diff/567180043/lily/include/smobs.hh
File lily/include/smobs.hh (right):

https://codereview.appspot.com/561390043/diff/567180043/lily/include/smobs.hh...
lily/include/smobs.hh:312: static size_t count;
On 2020/02/09 03:29:26, Dan Eble wrote:
> It seems that this is initialized to zero because it is static, but if it simply
> had an "= 0", I wouldn't have had to go refresh my memory with a web search.
>
> Is it correct that there are no static Smobs anywhere in the program? If there
> were, it would be responsible to spend some time checking whether this could be
> affected by the "static initialization order fiasco." (Maybe you already have.)

IIRC static objects without initialization are put into .bss, which is initialized to 0 when loading the binary. Adding "= 0" puts the variable into the data segment, but it'll still be initialized at load time. The "static initialization order fiasco" is only a problem when the constructor is doing work, which is not the case for an integer. Anyway, I think adding a comment here and in smobs.cc on how this is initialized (referring to static lifetime in C/C++) would likely clarify the intent.

https://codereview.appspot.com/561390043/diff/567180043/lily/smobs.cc
File lily/smobs.cc (right):

https://codereview.appspot.com/561390043/diff/567180043/lily/smobs.cc#newcode23
lily/smobs.cc:23: #include <gc/gc.h>
You probably want this to be guarded by GUILEV2?

https://codereview.appspot.com/561390043/diff/567180043/lily/smobs.cc#newcode46
lily/smobs.cc:46: printf ("bytes per obj %d\n", size / count);
Please remove before committing.
Sign in to reply to this message.
On Sun, Feb 9, 2020 at 10:25 AM <jonas.hahnfeld@gmail.com> wrote:
> The approach sounds good to me. I can confirm that this works on my
> system and the runtime is pretty good. Please readd the autoconf logic
> to detect bdw-gc.

Will do.

> FWIW I tried to research on the internals of GC. I found the following
> statement that decides whether to collect or just expand the heap:
> https://github.com/ivmai/bdwgc/blob/v8.0.4/alloc.c#L1435 We are hit by
> the case
> (GC_fo_entries > (last_fo_entries + 500) && (last_bytes_finalized |
> GC_bytes_finalized) != 0)
> Removing this case gets me to a state very close to this patch, both in
> terms of performance and memory usage.
>
> If I understand the approach correctly, GC_fo_entries counts the number
> of outstanding finalizations. It seems like Guile is making good use of
> this functionality, so GC tries to reclaim that memory first instead of
> just expanding the heap. Please note that the number 500 is hard-coded,
> so we have no means to influence it via environment variables or APIs. I
> guess something like that could be added, but it would again take some
> time to bubble through all distributions that we care about.

Aha! That explains why we see such poor performance, because we rely on finalizers a lot. If we were to move away from finalizers and let bdwgc handle all of the memory management, then it would also become a non-issue?

> TL;DR: For now it's probably easiest to scale the heap as this patch
> does.
>
> https://codereview.appspot.com/561390043/diff/567180043/lily/smobs.cc#newcode23
> lily/smobs.cc:23: #include <gc/gc.h>
> You probably want this to be guarded by GUILEV2?

Yep.

-- Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
Sign in to reply to this message.
On 2020/02/09 09:36:40, hanwenn wrote:
> On Sun, Feb 9, 2020 at 10:25 AM <jonas.hahnfeld@gmail.com> wrote:
> > FWIW I tried to research on the internals of GC. I found the following
> > statement that decides whether to collect or just expand the heap:
> > https://github.com/ivmai/bdwgc/blob/v8.0.4/alloc.c#L1435 We are hit by
> > the case
> > (GC_fo_entries > (last_fo_entries + 500) && (last_bytes_finalized |
> > GC_bytes_finalized) != 0)
> > Removing this case gets me to a state very close to this patch, both in
> > terms of performance and memory usage.
> >
> > If I understand the approach correctly, GC_fo_entries counts the number
> > of outstanding finalizations. It seems like Guile is making good use of
> > this functionality, so GC tries to reclaim that memory first instead of
> > just expanding the heap. Please note that the number 500 is hard-coded,
> > so we have no means to influence it via environment variables or APIs. I
> > guess something like that could be added, but it would again take some
> > time to bubble through all distributions that we care about.
>
> Aha! That explains why we see such poor performance, because we rely on
> finalizers a lot. If we were to move away from finalizers and let bdwgc
> handle all of the memory management, then it would also become a
> non-issue?

Disclaimer: I'm not yet very familiar with LilyPond's Scheme code. If you refer to scm_set_smob_free / Smob::free_smob, then I think you are right. Is there an easy way to get rid of this, maybe just for testing?
On 2020/02/09 09:55:12, hahnjo wrote:
> On 2020/02/09 09:36:40, hanwenn wrote:
> > Aha! That explains why we see such poor performance, because we rely on
> > finalizers a lot. If we were to move away from finalizers and let bdwgc
> > handle all of the memory management, then it would also become a
> > non-issue?
>
> Disclaimer: I'm not yet very familiar with LilyPond's Scheme code. If you
> refer to scm_set_smob_free / Smob::free_smob, then I think you are right.
> Is there an easy way to get rid of this, maybe just for testing?

You could define operator new/delete in the smob class (using
scm_gc_malloc), and then let BDWGC handle memory for the smobs. The
problem is that this will not reclaim any STL containers (i.e. vectors)
contained in the smobs, and especially engravers are full of them.

A way around that is to change all instances of vectors holding SCM
values to a new gc_vector type that has a custom allocator. That will be
significant work, but probably desirable in the long term.
Jonas' comments
https://codereview.appspot.com/561390043/diff/567180043/lily/include/smobs.hh
File lily/include/smobs.hh (right):

https://codereview.appspot.com/561390043/diff/567180043/lily/include/smobs.hh...
lily/include/smobs.hh:312: static size_t count;
On 2020/02/09 03:29:26, Dan Eble wrote:
> It seems that this is initialized to zero because it is static, but if it
> simply had an "= 0", I wouldn't have had to go refresh my memory with a
> web search.
>
> Is it correct that there are no static Smobs anywhere in the program? If
> there were, it would be responsible to spend some time checking whether
> this could be affected by the "static initialization order fiasco."
> (Maybe you already have.)

I've added an = 0 to the code in the .cc file. FWIW, this behavior is
ANSI C (C89 even), which is a tiny standard compared to the C++11
standard, so I think it is reasonable to expect that people know this.

Smobs are objects exposed as SCM values, so it is impossible to create
them before GUILE is booted up; the protected status of the Smob_core
ctor enforces this.

https://codereview.appspot.com/561390043/diff/567180043/lily/smobs.cc
File lily/smobs.cc (right):

https://codereview.appspot.com/561390043/diff/567180043/lily/smobs.cc#newcode23
lily/smobs.cc:23: #include <gc/gc.h>
On 2020/02/09 09:25:20, hahnjo wrote:
> You probably want this to be guarded by GUILEV2?

Done.

https://codereview.appspot.com/561390043/diff/567180043/lily/smobs.cc#newcode46
lily/smobs.cc:46: printf ("bytes per obj %d\n", size / count);
On 2020/02/09 09:25:20, hahnjo wrote:
> Please remove before committing.

Done.
comment
LGTM, thanks for this clearly documented change!
https://codereview.appspot.com/561390043/diff/567180043/lily/include/smobs.hh
File lily/include/smobs.hh (right):

https://codereview.appspot.com/561390043/diff/567180043/lily/include/smobs.hh...
lily/include/smobs.hh:312: static size_t count;
On 2020/02/09 03:29:26, Dan Eble wrote:
> It seems that this is initialized to zero because it is static, but if it
> simply had an "= 0", I wouldn't have had to go refresh my memory with a
> web search.
>
> Is it correct that there are no static Smobs anywhere in the program?

Depends on what you call "Smob". Protected_scm is a data type
_explicitly_ intended for static variables containing SCM values. But it
must not contain anything but "immediate" values at program start time,
since the memory protection machinery does not start until well into
main.

> If there were, it would be responsible to spend some time checking
> whether this could be affected by the "static initialization order
> fiasco." (Maybe you already have.)

There was something where code was getting run by constructors inserting
into a static STL vector that had its own constructors not yet run. I
don't remember the details. I think it was part of the motivation to
create Protected_scm.
On Feb 9, 2020, at 04:25, jonas.hahnfeld@gmail.com wrote:
>
> Anyway I think adding a comment here and in smobs.cc how this is
> initialized (refer to static lifetime in C/C++) would likely clarify the
> intent.

Initialization is an area where the legacy of the language is complicated
(more connotative adjectives come to mind), mutable static data is not in
daily use, and failure to initialize is a well-known class of error, so a
"what it does" comment (the kind people usually don't like) would help.
—
Dan
On 2020/02/09 10:08:20, hanwenn wrote:
> A way around that is to change all instances of vectors holding SCM
> values to a new gc_vector type that has a custom allocator. That will
> be significant work, but probably desirable in the long term.

I'd recommend a two-pronged approach for now: lobby for changes in bdwgc
that make it suitable for dealing with a larger ratio of finalisers,
and, while angling for a move to vectors with their own allocators,
stick with code that can do either. That way bdwgc will eventually
become better for our current workload, and if it turns out faster or
more resource-efficient to rely on finalisers for marking rather than
treating everything as conservative pointers, we still have the option
to return to what we are doing, or to pick a mixed strategy. (There
would likely be no point in ever marking SCM arrays via mark hooks if we
can just allocate them in mark-everything memory areas; at least that
part should never be contentious, and STL structures of SCM are the most
irksome with regard to not causing trouble before creation or after
destruction.)
commit a19aed147bf1605b21cbe7b1909ff6cbf519fb64
Author: Han-Wen Nienhuys <hanwen@lilypond.org>
Date:   Sat Feb 8 21:02:12 2020 +0100

    GUILE2: Scale GC heap with the number of smobs