Issue 331290043: ICU ticket #13467: make U8_NEXT() never call a function

https://codereview.appspot.com/331290043/diff/1/icu4c/source/common/unicode/utf8.h File icu4c/source/common/unicode/utf8.h (right): https://codereview.appspot.com/331290043/diff/1/icu4c/source/common/unicode/utf8.h#newcode351 icu4c/source/common/unicode/utf8.h:351: #define U8_NEXT(s, i, length, c) U8_INTERNAL_NEXT_OR_SUB(s, i, length, c, ...

7 years, 7 months ago (2017-11-17 16:58:01 UTC) #3

sffc


7 years, 7 months ago (2017-11-30 08:39:34 UTC) #4

I made some benchmarks.  I compared four implementations:

   1. U8_NEXT on trunk
   2. U8_NEXT_MARKUS from this code review (as in the old code, 3-byte is
   consumed before 2-byte)
   3. U8_NEXT_SHANE for my first implementation, which consumes in byte
   order (2-byte is consumed before 3-byte)
   4. QU8_NEXT_SHANE for a second implementation I was testing, which
   reduces the number of bytes of machine code on x86 to the lowest of these
   options except for trunk U8_NEXT
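
For readers unfamiliar with the difference between the orderings, here is a minimal sketch of a decoder that tests lead bytes in byte order (2-byte before 3-byte) and never calls an out-of-line helper. It is not any of the four implementations above: the names `next_in_byte_order` and `IS_TRAIL` are hypothetical, it is written as a function rather than a macro for readability, and its error handling is an assumption (it consumes only the lead byte and returns -1).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper macro: true for a UTF-8 trail byte (10xxxxxx). */
#define IS_TRAIL(b) (((b) & 0xC0) == 0x80)

/*
 * Illustrative sketch only, not the ICU macro.
 * Decodes one code point starting at s[*pi] and advances *pi.
 * Returns -1 for ill-formed input, in which case only the lead byte
 * is consumed (an assumption made to keep the sketch short).
 */
static int32_t next_in_byte_order(const uint8_t *s, int32_t *pi, int32_t length) {
    assert(*pi >= 0 && *pi < length);
    int32_t i = *pi;
    uint8_t b0 = s[i++];
    int32_t c = -1;
    if (b0 < 0x80) {                                  /* ASCII */
        c = b0;
    } else if (b0 >= 0xC2 && b0 <= 0xDF) {            /* 2-byte lead, tested first */
        if (i < length && IS_TRAIL(s[i])) {
            c = ((b0 & 0x1F) << 6) | (s[i] & 0x3F);
            i += 1;
        }
    } else if ((b0 & 0xF0) == 0xE0) {                 /* 3-byte lead */
        if (i + 1 < length && IS_TRAIL(s[i]) && IS_TRAIL(s[i + 1])) {
            int32_t v = ((b0 & 0x0F) << 12) | ((s[i] & 0x3F) << 6) | (s[i + 1] & 0x3F);
            /* Reject overlong forms and surrogates. */
            if (v >= 0x800 && (v < 0xD800 || v > 0xDFFF)) { c = v; i += 2; }
        }
    } else if (b0 >= 0xF0 && b0 <= 0xF4) {            /* 4-byte lead */
        if (i + 2 < length && IS_TRAIL(s[i]) && IS_TRAIL(s[i + 1]) && IS_TRAIL(s[i + 2])) {
            int32_t v = ((b0 & 0x07) << 18) | ((s[i] & 0x3F) << 12) |
                        ((s[i + 1] & 0x3F) << 6) | (s[i + 2] & 0x3F);
            /* Reject overlong forms and values above U+10FFFF. */
            if (v >= 0x10000 && v <= 0x10FFFF) { c = v; i += 3; }
        }
    }
    *pi = i;
    return c;
}
```

Moving the 3-byte branch ahead of the 2-byte branch gives the other ordering; the decoded results are the same, only the number of comparisons before a given sequence length is reached changes.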

Results are below.  Although it's basically a toss-up, there are a few
interesting results:

   - The new implementations, the ones that don't depend on the library
   function utf8_nextCharSafeBody, are significantly faster on 4-byte
   characters.  My chart only shows 12% faster, but this is on a corpus of
   mostly English with a few Emoji; on a string of all 4-byte characters,
   trunk U8_NEXT is more than twice as slow as any of the other
   implementations.
   - U8_NEXT and U8_NEXT_MARKUS are slightly faster on Chinese (3-byte
   characters) and slightly slower on Russian (2-byte characters) than
   QU8_NEXT_SHANE, consistent with the logic of the functions.
   - U8_NEXT_MARKUS is a bit slower on Russian than any of the other
   implementations.  It's not immediately obvious to me why that would be the
   case.  Maybe something hard to measure like code locality?
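
To see why the branch order interacts with the corpus, it can help to count how many sequences of each length a test string contains. The following is a rough sketch under my own assumptions, not part of the benchmark code: `count_lead_bytes` is a hypothetical name, and it classifies bytes by lead-byte range only, without validating trail bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch only.
 * counts[k] for k = 1..4 receives the number of lead bytes that start a
 * k-byte sequence; counts[0] receives every other byte (trail bytes and
 * bytes that can never be a lead byte).
 */
static void count_lead_bytes(const uint8_t *s, size_t length, size_t counts[5]) {
    assert(s != NULL || length == 0);
    for (int k = 0; k < 5; ++k) {
        counts[k] = 0;
    }
    for (size_t i = 0; i < length; ++i) {
        uint8_t b = s[i];
        if (b < 0x80) {
            counts[1]++;                      /* ASCII */
        } else if (b >= 0xC2 && b <= 0xDF) {
            counts[2]++;                      /* 2-byte lead, e.g. Cyrillic */
        } else if (b >= 0xE0 && b <= 0xEF) {
            counts[3]++;                      /* 3-byte lead, e.g. CJK */
        } else if (b >= 0xF0 && b <= 0xF4) {
            counts[4]++;                      /* 4-byte lead, e.g. Emoji */
        } else {
            counts[0]++;                      /* trail or invalid byte */
        }
    }
}
```

A corpus where counts[2] dominates favors a decoder that tests the 2-byte case first, and one where counts[3] dominates favors the 3-byte-first order.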


My code: http://go/bit/sffc/5784002938011648

Shane



On Fri, Nov 17, 2017 at 8:58 AM, <markus.icu@gmail.com> wrote:

>
> https://codereview.appspot.com/331290043/diff/1/icu4c/source/common/unicode/utf8.h
> File icu4c/source/common/unicode/utf8.h (right):
>
> https://codereview.appspot.com/331290043/diff/1/icu4c/source/common/unicode/utf8.h#newcode351
> icu4c/source/common/unicode/utf8.h:351: #define U8_NEXT(s, i, length, c) U8_INTERNAL_NEXT_OR_SUB(s, i, length, c, U_SENTINEL)
> On 2017/11/17 08:55:22, sffc wrote:
>> The function utf8_nextCharSafeBody has a call to U_IS_UNICODE_NONCHAR that I
>> don't see here.  Is that case covered somewhere else?
>
> That function supports obsolete behaviors (for utf_old.h compatibility)
> as well as current behaviors. Look at what it does for strict=-1 and
> strict=-3.
>
> https://codereview.appspot.com/331290043/diff/1/icu4c/source/common/unicode/utf8.h#newcode428
> icu4c/source/common/unicode/utf8.h:428: uint32_t __uc=(c); \
> On 2017/11/17 08:55:22, sffc wrote:
>> Presumably the compiler will optimize out the assignment operation?
>
> It should. Same value, just different interpretation of what less-than
> means.
>
>> Any risk of name collision with another variable?
>
> Shouldn't, because it's inside a { block }. We have not had that sort of
> trouble. The double underscore prefix is probably overkill, too, just
> extra careful. I guess some compiler could warn about hiding a member
> field or something. See other macros in these files.
>
> https://codereview.appspot.com/331290043/
>
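
As an illustration of the two points in the quoted reply (the unsigned copy changes what less-than means, and the temporary lives inside a { block }), here is a small sketch. The macro `UTF8_LENGTH_OR_ZERO`, the temporary `uc_`, and the function `utf8_length_demo` are hypothetical names chosen for this example and are not taken from utf8.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative sketch only, not an ICU macro.
 * Sets len to the number of UTF-8 bytes needed for code point c,
 * or to 0 if c is not encodable. Copying c into an unsigned temporary
 * turns negative inputs into very large values, so the upper-bound
 * checks alone reject them. The temporary is declared inside the
 * block, so it does not clash with an unrelated caller variable of
 * the same name (passing such a variable as c would still be a problem).
 */
#define UTF8_LENGTH_OR_ZERO(c, len) { \
    uint32_t uc_ = (uint32_t)(c); \
    if (uc_ <= 0x7F) { (len) = 1; } \
    else if (uc_ <= 0x7FF) { (len) = 2; } \
    else if (uc_ <= 0xFFFF) { (len) = (uc_ >= 0xD800 && uc_ <= 0xDFFF) ? 0 : 3; } \
    else if (uc_ <= 0x10FFFF) { (len) = 4; } \
    else { (len) = 0; } \
}

/* Small usage example: a negative input fails every range check. */
static void utf8_length_demo(void) {
    int32_t len = -1;
    UTF8_LENGTH_OR_ZERO(-1, len);
    assert(len == 0);
    UTF8_LENGTH_OR_ZERO(0x20AC, len);   /* U+20AC needs three bytes */
    assert(len == 3);
}
```

With a signed comparison, c = -1 would pass the `<= 0x7F` test; with the unsigned copy it compares as 0xFFFFFFFF and falls through to the final else.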

markus.icu

On Thu, Nov 30, 2017 at 12:39 AM, Shane Carr <sffc@google.com> wrote: > I made ...

7 years, 7 months ago (2017-11-30 22:15:15 UTC) #5

markus.icu

I looked at the size of my proposed U8_NEXT() code with Linux clang 3.4 with ...

7 years, 7 months ago (2017-12-01 18:45:49 UTC) #6

markus.icu

I realized that since the test function did not return the final index, the last ...

7 years, 7 months ago (2017-12-01 23:09:31 UTC) #7

markus.icu

expand validity macros: do not leave it up to the compiler to detect that masking ...

7 years, 7 months ago (2017-12-02 00:08:11 UTC) #9

markus.icu

This passes all ICU tests. Shane, maybe you can plug this into your little benchmark. ...

7 years, 7 months ago (2017-12-02 00:10:05 UTC) #10

sffc

LGTM When I run this with -O3, the newest implementation (patch set 3) is faster ...

7 years, 7 months ago (2017-12-02 02:24:55 UTC) #11

mark_macchiato.com

Nice work! Mark <https://twitter.com/mark_e_davis> On Sat, Dec 2, 2017 at 3:24 AM, <sffc@google.com> wrote: > ...

7 years, 7 months ago (2017-12-02 06:14:25 UTC) #12

markus.icu

7 years, 7 months ago (2017-12-05 19:23:31 UTC) #13

Submitted: http://bugs.icu-project.org/trac/changeset/40698


Issue 331290043: ICU ticket #13467: make U8_NEXT() never call a function (Closed)

Description

Patch Set 1 #

Patch Set 2 : smaller U8_NEXT() with no function call #

Patch Set 3 : expand validity macros: do not leave it up to the compiler to detect that masking the lead bytes i… #

Messages