
Issue 6965050: [google 4.7] atomic update of profile counters

Created:
11 years, 4 months ago by xur
Modified:
11 years, 4 months ago
Reviewers:
richard.guenther, davidxl, hubicka, pinskia
CC:
gcc-patches_gcc.gnu.org
Base URL:
svn+ssh://gcc.gnu.org/svn/gcc/branches/google/gcc-4_7/
Visibility:
Public.

Patch Set 1 #

Stats: +104 lines, -14 lines
M gcc/common.opt      1 chunk   +9 lines, -0 lines    0 comments
M gcc/gcov-io.h       2 chunks  +20 lines, -0 lines   0 comments
M gcc/tree-profile.c  3 chunks  +40 lines, -14 lines  0 comments
M libgcc/libgcov.c    4 chunks  +35 lines, -0 lines   0 comments

Messages

Total messages: 11
xur
Hi, This patch adds support for atomic updates of the profile counters. Tested with google ...
11 years, 4 months ago (2012-12-19 20:08:30 UTC) #1
davidxl
This looks good to me for google branches. Useful for trunk too. David On Wed, ...
11 years, 4 months ago (2012-12-20 00:25:00 UTC) #2
pinskia_gmail.com
On Wed, Dec 19, 2012 at 12:08 PM, Rong Xu <xur@google.com> wrote: > Hi, > ...
11 years, 4 months ago (2012-12-20 00:29:32 UTC) #3
xur
On Wed, Dec 19, 2012 at 4:29 PM, Andrew Pinski <pinskia@gmail.com> wrote: > > On ...
11 years, 4 months ago (2012-12-20 00:56:39 UTC) #4
hubicka_ucw.cz
> On Wed, Dec 19, 2012 at 4:29 PM, Andrew Pinski <pinskia@gmail.com> wrote: > > ...
11 years, 4 months ago (2012-12-20 16:20:55 UTC) #5
pinskia_gmail.com
On Thu, Dec 20, 2012 at 8:20 AM, Jan Hubicka <hubicka@ucw.cz> wrote: >> On Wed, ...
11 years, 4 months ago (2012-12-20 16:57:50 UTC) #6
xur
We have this patch primarily for getting valid profile counts. We observe that for some ...
11 years, 4 months ago (2012-12-20 19:35:30 UTC) #7
pinskia_gmail.com
On Thu, Dec 20, 2012 at 11:35 AM, Rong Xu <xur@google.com> wrote: > we have ...
11 years, 4 months ago (2012-12-20 19:42:25 UTC) #8
hubicka_ucw.cz
> On Thu, Dec 20, 2012 at 8:20 AM, Jan Hubicka <hubicka@ucw.cz> wrote: > >> ...
11 years, 4 months ago (2012-12-21 09:13:39 UTC) #9
richard.guenther_gmail.com
On Fri, Dec 21, 2012 at 10:13 AM, Jan Hubicka <hubicka@ucw.cz> wrote: >> On Thu, ...
11 years, 4 months ago (2012-12-21 09:55:28 UTC) #10
hubicka_ucw.cz
11 years, 4 months ago (2012-12-21 10:36:28 UTC) #11
> On Fri, Dec 21, 2012 at 10:13 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> On Thu, Dec 20, 2012 at 8:20 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >> On Wed, Dec 19, 2012 at 4:29 PM, Andrew Pinski <pinskia@gmail.com> wrote:
> >> >> >
> >> >> > On Wed, Dec 19, 2012 at 12:08 PM, Rong Xu <xur@google.com> wrote:
> >> >> > > Hi,
> >> >> > >
> >> >> > > This patch adds support for atomic updates of the profile counters.
> >> >> > > Tested with google internal benchmarks and fdo kernel build.
> >> >> >
> >> >> > I think you should use the __atomic_ functions instead of __sync_
> >> >> > functions as they allow better performance for simple counters as you
> >> >> > can use __ATOMIC_RELAXED.
> >> >>
> >> >> You are right. I think __ATOMIC_RELAXED should be OK here.
> >> >> Thanks for the suggestion.
> >> >>
> >> >> >
> >> >> > And this would be useful for the trunk also.  I was going to implement
> >> >> > this exact thing this week but some other important stuff came up.
> >> >>
> >> >> I'll post trunk patch later.
> >> >
> >> > Yes, I like that patch, too. Even if the costs are quite high (and this is why
> >> > atomic updates were sort of voted down in the past), the alternative of using TLS
> >> > has problems with too much per-thread memory.
> >>
> >> Actually, sometimes (on some processors) atomic increments are cheaper
> >> than doing a regular increment, mainly because there is an
> >> instruction which can handle it in the L2 cache rather than populating
> >> the L1.  Octeon is one such processor where this is true.
> >
> > One reason for large divergence may be the fact that we optimize the counter
> > update code.  Perhaps declaring counters volatile will prevent load/store motion
> > and reduce the racing, too.
> 
> Well, that will make it slower, too.  The best benchmark to check is tramp3d
> for all this stuff.  I remember that ICC when it had a function call for each
> counter update was about 100000x slower instrumented than w/o instrumentation
> (that is, I never waited long enough to make it finish even one iteration ...)
> 
> Thus, it's very important that counter updates are subject to loop
> invariant / store motion (and SCEV const-prop)!  GCC does a wonderful job
> here at the moment,
> please do not regress here.
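
The store-motion point above can be made concrete with a small sketch. It is
not taken from the patch: the counter variables and function names are
hypothetical and stand in for the instrumentation GCC would emit. With a plain
counter the optimizer may keep the running count in a register and store it
once after the loop; with a volatile counter every iteration must perform its
own load and store.

```c
#include <stdint.h>

/* Hypothetical stand-ins for compiler-generated profile counters. */
static int64_t plain_counter;
static volatile int64_t volatile_counter;

/* Plain counter: the compiler is free to hoist the load and sink the
   store out of the loop, leaving a single update of plain_counter. */
int64_t sum_with_plain_counter(const int *a, int n)
{
  int64_t sum = 0;
  for (int i = 0; i < n; i++)
    {
      plain_counter++;
      sum += a[i];
    }
  return sum;
}

/* Volatile counter: each increment is an observable access, so the
   load and store have to stay inside the loop body. */
int64_t sum_with_volatile_counter(const int *a, int n)
{
  int64_t sum = 0;
  for (int i = 0; i < n; i++)
    {
      volatile_counter++;
      sum += a[i];
    }
  return sum;
}
```

Both functions leave the same final counter value in a single-threaded run;
the difference is only in how much memory traffic the loop generates, which
can be checked by comparing the generated assembly at -O2.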

Well, this feature is enabled by user switch.  I do not think we should change
the default behaviour...

Which makes me ask: the patch is very isolated (i.e. enabled by command line
only) and has obvious value for the end user.  Would it be fine for stage3?

Honza
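
For reference, the __sync_ versus __atomic_ distinction raised earlier in the
thread can be sketched as follows. This is a minimal illustration under stated
assumptions, not the code from the patch: the type counter_t and the helper
names are hypothetical, and only the GCC builtins themselves
(__sync_fetch_and_add, __atomic_fetch_add, __atomic_load_n, __ATOMIC_RELAXED)
are real.

```c
#include <stdint.h>

/* Hypothetical stand-in for the profile counter type. */
typedef int64_t counter_t;

/* Legacy builtin: atomic add that also acts as a full memory barrier. */
static void counter_add_sync(counter_t *c, counter_t v)
{
  __sync_fetch_and_add(c, v);
}

/* Newer builtin with relaxed ordering: the add itself is atomic, but it
   imposes no ordering on surrounding memory accesses, which is enough
   for a counter that is only summed up later. */
static void counter_add_relaxed(counter_t *c, counter_t v)
{
  __atomic_fetch_add(c, v, __ATOMIC_RELAXED);
}

/* Relaxed atomic read of the current counter value. */
static counter_t counter_read(const counter_t *c)
{
  return __atomic_load_n(c, __ATOMIC_RELAXED);
}
```

Either helper can be called concurrently from several threads without losing
increments. Whether the relaxed form is actually cheaper depends on the target;
on some architectures both may compile to the same instruction.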