Issue 168870043: [dev.power64] code review 168870043: 9g: fix under-zeroing in clearfat

Issue 168870043: [dev.power64] code review 168870043: 9g: fix under-zeroing in clearfat (Closed)

Can't Edit
Can't Publish+Mail
Start Review

Created:
10 years, 5 months ago by austin

Modified:
10 years, 5 months ago

Reviewers:
dave, bradfitz

CC:
rsc, dave_cheney.net, bradfitz, golang-codereviews

Visibility:
Public.

Description

9g: fix under-zeroing in clearfat All three cases of clearfat were wrong on power64x. The cases that handle 1032 bytes and up and 32 bytes and up both use MOVDU (one directly generated in a loop and the other via duffzero), which leaves the pointer register pointing at the *last written* address. The generated code was not accounting for this, so the byte fill loop was re-zeroing the last zeroed dword, rather than the bytes following the last zeroed dword. Fix this by simply adding an additional 8 byte offset to the byte zeroing loop. The case that handled under 32 bytes was also wrong. It didn't update the pointer register at all, so the byte zeroing loop was simply re-zeroing the beginning of region. Again, the fix is to add an offset to the byte zeroing loop to account for this.

Patch Set 1 #

Patch Set 2 : diff -r 38a2a61c4a2ee2e8bd17296e415d6013011991b4 https://code.google.com/p/go #

Patch Set 3 : diff -r 38a2a61c4a2ee2e8bd17296e415d6013011991b4 https://code.google.com/p/go #

Patch Set 4 : diff -r 38a2a61c4a2ee2e8bd17296e415d6013011991b4 https://code.google.com/p/go #

Patch Set 5 : diff -r 38a2a61c4a2ee2e8bd17296e415d6013011991b4 https://code.google.com/p/go #

Created: 10 years, 5 months ago

Download [raw] [tar.bz2]

	Unified diffs	Side-by-side diffs	Delta from patch set	Stats (+82 lines, -8 lines)			Patch
M	src/cmd/9g/ggen.c	View	1 2 3	3 chunks	+13 lines, -7 lines	0 comments	Download
M	src/runtime/asm_power64x.s	View	1 2 3	1 chunk	+1 line, -1 line	0 comments	Download
A	test/clearfat.go	View	1 2 3	1 chunk	+68 lines, -0 lines	0 comments	Download

Messages

Total messages: 11

Expand All Messages | Collapse All Messages

austin

Hello rsc@golang.org, dave@cheney.net (cc: golang-codereviews@googlegroups.com), I'd like you to review this change to the dev.power64 ...

10 years, 5 months ago (2014-10-30 21:36:34 UTC) #1

dave_cheney.net

Some minor commentary nits. Your fix looks good, but I'm a bit wary of all ...

10 years, 5 months ago (2014-10-30 22:16:25 UTC) #2

Some minor commentary nits.

Your fix looks good, but I'm a bit wary of all the subtracting and adding
of word widths, it looks fragile.

Dave (phone)

On 31 Oct 2014 08:36, <austin@google.com> wrote:
>
> Reviewers: rsc, dfc,
>
> Message:
> Hello rsc@golang.org, dave@cheney.net (cc:
> golang-codereviews@googlegroups.com),
>
> I'd like you to review this change to the dev.power64 branch of
> https://code.google.com/p/go
>
>
> Description:
> 9g: fix under-zeroing in clearfat
>
> All three cases of clearfat were wrong on power64x.
>
> The cases that handle 1032 bytes and up and 32 bytes and up
> both use MOVDU (one directly generated in a loop and the other
> via duffzero), which leaves the pointer register pointing at
> the *last written* address.  The generated code was not
> accounting for this, so the byte fill loop was re-zeroing the
> last zeroed dword, rather than the bytes following the last
> zeroed dword.  Fix this by simply adding an additional 8 byte
> offset to the byte zeroing loop.
>
> The case that handled under 32 bytes was also wrong.  It
> didn't update the pointer register at all, so the byte zeroing
> loop was simply re-zeroing the beginning of region.  Again,
> the fix is to add an offset to the byte zeroing loop to
> account for this.
>
> Please review this at https://codereview.appspot.com/168870043/
>
> Affected files (+14, -7 lines):
>   M src/cmd/9g/ggen.c
>   M src/runtime/asm_power64x.s
>
>
> Index: src/cmd/9g/ggen.c
> ===================================================================
> --- a/src/cmd/9g/ggen.c
> +++ b/src/cmd/9g/ggen.c
> @@ -900,7 +900,7 @@ ret:
>  void
>  clearfat(Node *nl)
>  {
> -       uint64 w, c, q, t;
> +       uint64 w, c, q, t, boff;
>         Node dst, end, r0, *f;
>         Prog *p, *pl;
>
> @@ -944,6 +944,8 @@ clearfat(Node *nl)
>                 patch(gbranch(ABNE, T, 0), pl);
>
>                 regfree(&end);
> +               // The above loop leaves R3 on the last zeroed dword

S/above//

> +               boff = 8;
>         } else if(q >= 4) {
>                 p = gins(ASUB, N, &dst);
>                 p->from.type = D_CONST;
> @@ -953,17 +955,21 @@ clearfat(Node *nl)
>                 afunclit(&p->to, f);
>                 // 4 and 128 = magic constants: see
../../runtime/asm_power64x.s
>                 p->to.offset = 4*(128-q);
> -       } else
> -       for(t = 0; t < q; t++) {
> -               p = gins(AMOVD, &r0, &dst);
> -               p->to.type = D_OREG;
> -               p->to.offset = 8*t;
> +               // duffzero leaves R3 on the last zeroed dword
> +               boff = 8;
> +       } else {
> +               for(t = 0; t < q; t++) {
> +                       p = gins(AMOVD, &r0, &dst);
> +                       p->to.type = D_OREG;
> +                       p->to.offset = 8*t;
> +               }
> +               boff = 8*q;
>         }
>
>         for(t = 0; t < c; t++) {
>                 p = gins(AMOVB, &r0, &dst);
>                 p->to.type = D_OREG;
> -               p->to.offset = t;
> +               p->to.offset = t+boff;
>         }
>         reg[REGRT1]--;
>  }
> Index: src/runtime/asm_power64x.s
> ===================================================================
> --- a/src/runtime/asm_power64x.s
> +++ b/src/runtime/asm_power64x.s
> @@ -830,6 +830,7 @@ notfound:
>  // R0: always zero
>  // R3 (aka REGRT1): ptr to memory to be zeroed - 8
>  // R3 is updated as a side effect.

Drop the line, the following line is enough
> +// On return, R3 points to the last zeroed dword.
>  TEXT runtime·duffzero(SB), NOSPLIT, $-8-0
>         MOVDU   R0, 8(R3)
>         MOVDU   R0, 8(R3)
>
>

bradfitz

And is this something observable from Go code? Possible to add a test? On Oct ...

10 years, 5 months ago (2014-10-30 22:24:49 UTC) #3

And is this something observable from Go code? Possible to add a test?
 On Oct 30, 2014 7:16 PM, "Dave Cheney" <dave@cheney.net> wrote:

> Some minor commentary nits.
>
> Your fix looks good, but I'm a bit wary of all the subtracting and adding
> of word widths, it looks fragile.
>
> Dave (phone)
>
> On 31 Oct 2014 08:36, <austin@google.com> wrote:
> >
> > Reviewers: rsc, dfc,
> >
> > Message:
> > Hello rsc@golang.org, dave@cheney.net (cc:
> > golang-codereviews@googlegroups.com),
> >
> > I'd like you to review this change to the dev.power64 branch of
> > https://code.google.com/p/go
> >
> >
> > Description:
> > 9g: fix under-zeroing in clearfat
> >
> > All three cases of clearfat were wrong on power64x.
> >
> > The cases that handle 1032 bytes and up and 32 bytes and up
> > both use MOVDU (one directly generated in a loop and the other
> > via duffzero), which leaves the pointer register pointing at
> > the *last written* address.  The generated code was not
> > accounting for this, so the byte fill loop was re-zeroing the
> > last zeroed dword, rather than the bytes following the last
> > zeroed dword.  Fix this by simply adding an additional 8 byte
> > offset to the byte zeroing loop.
> >
> > The case that handled under 32 bytes was also wrong.  It
> > didn't update the pointer register at all, so the byte zeroing
> > loop was simply re-zeroing the beginning of region.  Again,
> > the fix is to add an offset to the byte zeroing loop to
> > account for this.
> >
> > Please review this at https://codereview.appspot.com/168870043/
> >
> > Affected files (+14, -7 lines):
> >   M src/cmd/9g/ggen.c
> >   M src/runtime/asm_power64x.s
> >
> >
> > Index: src/cmd/9g/ggen.c
> > ===================================================================
> > --- a/src/cmd/9g/ggen.c
> > +++ b/src/cmd/9g/ggen.c
> > @@ -900,7 +900,7 @@ ret:
> >  void
> >  clearfat(Node *nl)
> >  {
> > -       uint64 w, c, q, t;
> > +       uint64 w, c, q, t, boff;
> >         Node dst, end, r0, *f;
> >         Prog *p, *pl;
> >
> > @@ -944,6 +944,8 @@ clearfat(Node *nl)
> >                 patch(gbranch(ABNE, T, 0), pl);
> >
> >                 regfree(&end);
> > +               // The above loop leaves R3 on the last zeroed dword
>
> S/above//
>
> > +               boff = 8;
> >         } else if(q >= 4) {
> >                 p = gins(ASUB, N, &dst);
> >                 p->from.type = D_CONST;
> > @@ -953,17 +955,21 @@ clearfat(Node *nl)
> >                 afunclit(&p->to, f);
> >                 // 4 and 128 = magic constants: see
> ../../runtime/asm_power64x.s
> >                 p->to.offset = 4*(128-q);
> > -       } else
> > -       for(t = 0; t < q; t++) {
> > -               p = gins(AMOVD, &r0, &dst);
> > -               p->to.type = D_OREG;
> > -               p->to.offset = 8*t;
> > +               // duffzero leaves R3 on the last zeroed dword
> > +               boff = 8;
> > +       } else {
> > +               for(t = 0; t < q; t++) {
> > +                       p = gins(AMOVD, &r0, &dst);
> > +                       p->to.type = D_OREG;
> > +                       p->to.offset = 8*t;
> > +               }
> > +               boff = 8*q;
> >         }
> >
> >         for(t = 0; t < c; t++) {
> >                 p = gins(AMOVB, &r0, &dst);
> >                 p->to.type = D_OREG;
> > -               p->to.offset = t;
> > +               p->to.offset = t+boff;
> >         }
> >         reg[REGRT1]--;
> >  }
> > Index: src/runtime/asm_power64x.s
> > ===================================================================
> > --- a/src/runtime/asm_power64x.s
> > +++ b/src/runtime/asm_power64x.s
> > @@ -830,6 +830,7 @@ notfound:
> >  // R0: always zero
> >  // R3 (aka REGRT1): ptr to memory to be zeroed - 8
> >  // R3 is updated as a side effect.
>
> Drop the line, the following line is enough
> > +// On return, R3 points to the last zeroed dword.
> >  TEXT runtime·duffzero(SB), NOSPLIT, $-8-0
> >         MOVDU   R0, 8(R3)
> >         MOVDU   R0, 8(R3)
> >
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-codereviews" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-codereviews+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

dave_cheney.net

I've been able to come up with one failing test case http://paste.ubuntu.com/8753330/ If this sounds ...

10 years, 5 months ago (2014-10-31 00:04:14 UTC) #4

I've been able to come up with one failing test case

http://paste.ubuntu.com/8753330/

If this sounds sensible I can expand this to be a real test in the
runtime package.

On Fri, Oct 31, 2014 at 9:24 AM, Brad Fitzpatrick <bradfitz@golang.org> wrote:
> And is this something observable from Go code? Possible to add a test?
>
> On Oct 30, 2014 7:16 PM, "Dave Cheney" <dave@cheney.net> wrote:
>>
>> Some minor commentary nits.
>>
>> Your fix looks good, but I'm a bit wary of all the subtracting and adding
>> of word widths, it looks fragile.
>>
>> Dave (phone)
>>
>> On 31 Oct 2014 08:36, <austin@google.com> wrote:
>> >
>> > Reviewers: rsc, dfc,
>> >
>> > Message:
>> > Hello rsc@golang.org, dave@cheney.net (cc:
>> > golang-codereviews@googlegroups.com),
>> >
>> > I'd like you to review this change to the dev.power64 branch of
>> > https://code.google.com/p/go
>> >
>> >
>> > Description:
>> > 9g: fix under-zeroing in clearfat
>> >
>> > All three cases of clearfat were wrong on power64x.
>> >
>> > The cases that handle 1032 bytes and up and 32 bytes and up
>> > both use MOVDU (one directly generated in a loop and the other
>> > via duffzero), which leaves the pointer register pointing at
>> > the *last written* address.  The generated code was not
>> > accounting for this, so the byte fill loop was re-zeroing the
>> > last zeroed dword, rather than the bytes following the last
>> > zeroed dword.  Fix this by simply adding an additional 8 byte
>> > offset to the byte zeroing loop.
>> >
>> > The case that handled under 32 bytes was also wrong.  It
>> > didn't update the pointer register at all, so the byte zeroing
>> > loop was simply re-zeroing the beginning of region.  Again,
>> > the fix is to add an offset to the byte zeroing loop to
>> > account for this.
>> >
>> > Please review this at https://codereview.appspot.com/168870043/
>> >
>> > Affected files (+14, -7 lines):
>> >   M src/cmd/9g/ggen.c
>> >   M src/runtime/asm_power64x.s
>> >
>> >
>> > Index: src/cmd/9g/ggen.c
>> > ===================================================================
>> > --- a/src/cmd/9g/ggen.c
>> > +++ b/src/cmd/9g/ggen.c
>> > @@ -900,7 +900,7 @@ ret:
>> >  void
>> >  clearfat(Node *nl)
>> >  {
>> > -       uint64 w, c, q, t;
>> > +       uint64 w, c, q, t, boff;
>> >         Node dst, end, r0, *f;
>> >         Prog *p, *pl;
>> >
>> > @@ -944,6 +944,8 @@ clearfat(Node *nl)
>> >                 patch(gbranch(ABNE, T, 0), pl);
>> >
>> >                 regfree(&end);
>> > +               // The above loop leaves R3 on the last zeroed dword
>>
>> S/above//
>>
>> > +               boff = 8;
>> >         } else if(q >= 4) {
>> >                 p = gins(ASUB, N, &dst);
>> >                 p->from.type = D_CONST;
>> > @@ -953,17 +955,21 @@ clearfat(Node *nl)
>> >                 afunclit(&p->to, f);
>> >                 // 4 and 128 = magic constants: see
>> > ../../runtime/asm_power64x.s
>> >                 p->to.offset = 4*(128-q);
>> > -       } else
>> > -       for(t = 0; t < q; t++) {
>> > -               p = gins(AMOVD, &r0, &dst);
>> > -               p->to.type = D_OREG;
>> > -               p->to.offset = 8*t;
>> > +               // duffzero leaves R3 on the last zeroed dword
>> > +               boff = 8;
>> > +       } else {
>> > +               for(t = 0; t < q; t++) {
>> > +                       p = gins(AMOVD, &r0, &dst);
>> > +                       p->to.type = D_OREG;
>> > +                       p->to.offset = 8*t;
>> > +               }
>> > +               boff = 8*q;
>> >         }
>> >
>> >         for(t = 0; t < c; t++) {
>> >                 p = gins(AMOVB, &r0, &dst);
>> >                 p->to.type = D_OREG;
>> > -               p->to.offset = t;
>> > +               p->to.offset = t+boff;
>> >         }
>> >         reg[REGRT1]--;
>> >  }
>> > Index: src/runtime/asm_power64x.s
>> > ===================================================================
>> > --- a/src/runtime/asm_power64x.s
>> > +++ b/src/runtime/asm_power64x.s
>> > @@ -830,6 +830,7 @@ notfound:
>> >  // R0: always zero
>> >  // R3 (aka REGRT1): ptr to memory to be zeroed - 8
>> >  // R3 is updated as a side effect.
>>
>> Drop the line, the following line is enough
>> > +// On return, R3 points to the last zeroed dword.
>> >  TEXT runtime·duffzero(SB), NOSPLIT, $-8-0
>> >         MOVDU   R0, 8(R3)
>> >         MOVDU   R0, 8(R3)
>> >
>> >
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "golang-codereviews" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to golang-codereviews+unsubscribe@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

dave_cheney.net

LGTM. https://codereview.appspot.com/162570043 contains the beginnings of a test which failed prior to this CL and ...

10 years, 5 months ago (2014-10-31 00:42:53 UTC) #5

austin

On Thu, Oct 30, 2014 at 6:16 PM, Dave Cheney <dave@cheney.net> wrote: > Some minor ...

10 years, 5 months ago (2014-10-31 13:39:00 UTC) #6

On Thu, Oct 30, 2014 at 6:16 PM, Dave Cheney <dave@cheney.net> wrote:

> Some minor commentary nits.
>
> Your fix looks good, but I'm a bit wary of all the subtracting and adding
> of word widths, it looks fragile.
>
This code is already keenly aware of low level details like this, so I
think it would merely obscure it if I tried to be more abstract for these
changes, but I'm open to suggestions.

>  Dave (phone)
>
> On 31 Oct 2014 08:36, <austin@google.com> wrote:
> >
> > Reviewers: rsc, dfc,
> >
> > Message:
> > Hello rsc@golang.org, dave@cheney.net (cc:
> > golang-codereviews@googlegroups.com),
> >
> > I'd like you to review this change to the dev.power64 branch of
> > https://code.google.com/p/go
> >
> >
> > Description:
> > 9g: fix under-zeroing in clearfat
> >
> > All three cases of clearfat were wrong on power64x.
> >
> > The cases that handle 1032 bytes and up and 32 bytes and up
> > both use MOVDU (one directly generated in a loop and the other
> > via duffzero), which leaves the pointer register pointing at
> > the *last written* address.  The generated code was not
> > accounting for this, so the byte fill loop was re-zeroing the
> > last zeroed dword, rather than the bytes following the last
> > zeroed dword.  Fix this by simply adding an additional 8 byte
> > offset to the byte zeroing loop.
> >
> > The case that handled under 32 bytes was also wrong.  It
> > didn't update the pointer register at all, so the byte zeroing
> > loop was simply re-zeroing the beginning of region.  Again,
> > the fix is to add an offset to the byte zeroing loop to
> > account for this.
> >
> > Please review this at https://codereview.appspot.com/168870043/
> >
> > Affected files (+14, -7 lines):
> >   M src/cmd/9g/ggen.c
> >   M src/runtime/asm_power64x.s
> >
> >
> > Index: src/cmd/9g/ggen.c
> > ===================================================================
> > --- a/src/cmd/9g/ggen.c
> > +++ b/src/cmd/9g/ggen.c
> > @@ -900,7 +900,7 @@ ret:
> >  void
> >  clearfat(Node *nl)
> >  {
> > -       uint64 w, c, q, t;
> > +       uint64 w, c, q, t, boff;
> >         Node dst, end, r0, *f;
> >         Prog *p, *pl;
> >
> > @@ -944,6 +944,8 @@ clearfat(Node *nl)
> >                 patch(gbranch(ABNE, T, 0), pl);
> >
> >                 regfree(&end);
> > +               // The above loop leaves R3 on the last zeroed dword
>
> S/above//
>
Done.

> > +               boff = 8;
> >         } else if(q >= 4) {
> >                 p = gins(ASUB, N, &dst);
> >                 p->from.type = D_CONST;
> > @@ -953,17 +955,21 @@ clearfat(Node *nl)
> >                 afunclit(&p->to, f);
> >                 // 4 and 128 = magic constants: see
> ../../runtime/asm_power64x.s
> >                 p->to.offset = 4*(128-q);
> > -       } else
> > -       for(t = 0; t < q; t++) {
> > -               p = gins(AMOVD, &r0, &dst);
> > -               p->to.type = D_OREG;
> > -               p->to.offset = 8*t;
> > +               // duffzero leaves R3 on the last zeroed dword
> > +               boff = 8;
> > +       } else {
> > +               for(t = 0; t < q; t++) {
> > +                       p = gins(AMOVD, &r0, &dst);
> > +                       p->to.type = D_OREG;
> > +                       p->to.offset = 8*t;
> > +               }
> > +               boff = 8*q;
> >         }
> >
> >         for(t = 0; t < c; t++) {
> >                 p = gins(AMOVB, &r0, &dst);
> >                 p->to.type = D_OREG;
> > -               p->to.offset = t;
> > +               p->to.offset = t+boff;
> >         }
> >         reg[REGRT1]--;
> >  }
> > Index: src/runtime/asm_power64x.s
> > ===================================================================
> > --- a/src/runtime/asm_power64x.s
> > +++ b/src/runtime/asm_power64x.s
> > @@ -830,6 +830,7 @@ notfound:
> >  // R0: always zero
> >  // R3 (aka REGRT1): ptr to memory to be zeroed - 8
> >  // R3 is updated as a side effect.
>
> Drop the line, the following line is enough
>
Done.

> > +// On return, R3 points to the last zeroed dword.
> >  TEXT runtime·duffzero(SB), NOSPLIT, $-8-0
> >         MOVDU   R0, 8(R3)
> >         MOVDU   R0, 8(R3)
> >
> >
>

dave_cheney.net

My initial reaction was an overreaction. I agree with Brad that adding a test is ...

10 years, 5 months ago (2014-10-31 13:56:57 UTC) #7

My initial reaction was an overreaction. I agree with Brad that adding
a test is probably sufficient coverage for this bug.

On Sat, Nov 1, 2014 at 12:39 AM, Austin Clements <austin@google.com> wrote:
> On Thu, Oct 30, 2014 at 6:16 PM, Dave Cheney <dave@cheney.net> wrote:
>>
>> Some minor commentary nits.
>>
>> Your fix looks good, but I'm a bit wary of all the subtracting and adding
>> of word widths, it looks fragile.
>
> This code is already keenly aware of low level details like this, so I think
> it would merely obscure it if I tried to be more abstract for these changes,
> but I'm open to suggestions.
>>
>> Dave (phone)
>>
>> On 31 Oct 2014 08:36, <austin@google.com> wrote:
>> >
>> > Reviewers: rsc, dfc,
>> >
>> > Message:
>> > Hello rsc@golang.org, dave@cheney.net (cc:
>> > golang-codereviews@googlegroups.com),
>> >
>> > I'd like you to review this change to the dev.power64 branch of
>> > https://code.google.com/p/go
>> >
>> >
>> > Description:
>> > 9g: fix under-zeroing in clearfat
>> >
>> > All three cases of clearfat were wrong on power64x.
>> >
>> > The cases that handle 1032 bytes and up and 32 bytes and up
>> > both use MOVDU (one directly generated in a loop and the other
>> > via duffzero), which leaves the pointer register pointing at
>> > the *last written* address.  The generated code was not
>> > accounting for this, so the byte fill loop was re-zeroing the
>> > last zeroed dword, rather than the bytes following the last
>> > zeroed dword.  Fix this by simply adding an additional 8 byte
>> > offset to the byte zeroing loop.
>> >
>> > The case that handled under 32 bytes was also wrong.  It
>> > didn't update the pointer register at all, so the byte zeroing
>> > loop was simply re-zeroing the beginning of region.  Again,
>> > the fix is to add an offset to the byte zeroing loop to
>> > account for this.
>> >
>> > Please review this at https://codereview.appspot.com/168870043/
>> >
>> > Affected files (+14, -7 lines):
>> >   M src/cmd/9g/ggen.c
>> >   M src/runtime/asm_power64x.s
>> >
>> >
>> > Index: src/cmd/9g/ggen.c
>> > ===================================================================
>> > --- a/src/cmd/9g/ggen.c
>> > +++ b/src/cmd/9g/ggen.c
>> > @@ -900,7 +900,7 @@ ret:
>> >  void
>> >  clearfat(Node *nl)
>> >  {
>> > -       uint64 w, c, q, t;
>> > +       uint64 w, c, q, t, boff;
>> >         Node dst, end, r0, *f;
>> >         Prog *p, *pl;
>> >
>> > @@ -944,6 +944,8 @@ clearfat(Node *nl)
>> >                 patch(gbranch(ABNE, T, 0), pl);
>> >
>> >                 regfree(&end);
>> > +               // The above loop leaves R3 on the last zeroed dword
>>
>> S/above//
>
> Done.
>>
>> > +               boff = 8;
>> >         } else if(q >= 4) {
>> >                 p = gins(ASUB, N, &dst);
>> >                 p->from.type = D_CONST;
>> > @@ -953,17 +955,21 @@ clearfat(Node *nl)
>> >                 afunclit(&p->to, f);
>> >                 // 4 and 128 = magic constants: see
>> > ../../runtime/asm_power64x.s
>> >                 p->to.offset = 4*(128-q);
>> > -       } else
>> > -       for(t = 0; t < q; t++) {
>> > -               p = gins(AMOVD, &r0, &dst);
>> > -               p->to.type = D_OREG;
>> > -               p->to.offset = 8*t;
>> > +               // duffzero leaves R3 on the last zeroed dword
>> > +               boff = 8;
>> > +       } else {
>> > +               for(t = 0; t < q; t++) {
>> > +                       p = gins(AMOVD, &r0, &dst);
>> > +                       p->to.type = D_OREG;
>> > +                       p->to.offset = 8*t;
>> > +               }
>> > +               boff = 8*q;
>> >         }
>> >
>> >         for(t = 0; t < c; t++) {
>> >                 p = gins(AMOVB, &r0, &dst);
>> >                 p->to.type = D_OREG;
>> > -               p->to.offset = t;
>> > +               p->to.offset = t+boff;
>> >         }
>> >         reg[REGRT1]--;
>> >  }
>> > Index: src/runtime/asm_power64x.s
>> > ===================================================================
>> > --- a/src/runtime/asm_power64x.s
>> > +++ b/src/runtime/asm_power64x.s
>> > @@ -830,6 +830,7 @@ notfound:
>> >  // R0: always zero
>> >  // R3 (aka REGRT1): ptr to memory to be zeroed - 8
>> >  // R3 is updated as a side effect.
>>
>> Drop the line, the following line is enough
>
> Done.
>>
>> > +// On return, R3 points to the last zeroed dword.
>> >  TEXT runtime·duffzero(SB), NOSPLIT, $-8-0
>> >         MOVDU   R0, 8(R3)
>> >         MOVDU   R0, 8(R3)
>> >
>> >

austin

Hello rsc@golang.org, dave@cheney.net, bradfitz@golang.org (cc: golang-codereviews@googlegroups.com), Please take another look.

10 years, 5 months ago (2014-10-31 14:12:54 UTC) #8

austin

I took this test and sort of ran with it and combined it with the ...

10 years, 5 months ago (2014-10-31 14:16:29 UTC) #9

austin

10 years, 5 months ago (2014-10-31 15:08:32 UTC) #11

*** Submitted as https://code.google.com/p/go/source/detail?r=1e9dfa88c1e2 ***

[dev.power64] 9g: fix under-zeroing in clearfat

All three cases of clearfat were wrong on power64x.

The cases that handle 1032 bytes and up and 32 bytes and up
both use MOVDU (one directly generated in a loop and the other
via duffzero), which leaves the pointer register pointing at
the *last written* address.  The generated code was not
accounting for this, so the byte fill loop was re-zeroing the
last zeroed dword, rather than the bytes following the last
zeroed dword.  Fix this by simply adding an additional 8 byte
offset to the byte zeroing loop.

The case that handled under 32 bytes was also wrong.  It
didn't update the pointer register at all, so the byte zeroing
loop was simply re-zeroing the beginning of region.  Again,
the fix is to add an offset to the byte zeroing loop to
account for this.

LGTM=dave, bradfitz
R=rsc, dave, bradfitz
CC=golang-codereviews
https://codereview.appspot.com/168870043

Expand All Messages | Collapse All Messages