Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(675)

Issue 97690043: Update libjpeg_turbo to use clz for bitcounting on ARM

Can't Edit
Can't Publish+Mail
Start Review
Created:
10 years, 8 months ago by Olle Liljenzin
Modified:
10 years, 8 months ago
Reviewers:
noel1, noel chromium
Base URL:
http://src.chromium.org/svn/trunk/deps/third_party/libjpeg_turbo
Visibility:
Public.

Description

Update libjpeg_turbo to use clz for bitcounting on ARM Cherry-picked r1220 from upstream: Use clz/bsr instructions on ARM for bit counting rather than the lookup table (reduces memory footprint and can improve performance in some cases.) Upstream review: http://sourceforge.net/p/libjpeg-turbo/patches/57/ Original review: https://codereview.appspot.com/77480045/ Removing the lookup table saves 64k data for each process that uses jpeg encoding. Benchmarks on a few ARM devices shows encoding performance changes, from a slowdown of 3-4% on some devices, to a speedup of 10-20% on other devices. In average performance improves. x86 will still use the lookup table because the bsr instruction showed to be slower on some chips. BUG=

Patch Set 1 #

Patch Set 2 : Updated google.patch file #

Unified diffs Side-by-side diffs Delta from patch set Stats (+108 lines, -2 lines) Patch
M README.chromium View 1 chunk +1 line, -0 lines 0 comments Download
M google.patch View 1 1 chunk +75 lines, -0 lines 0 comments Download
M jchuff.c View 5 chunks +32 lines, -2 lines 0 comments Download

Messages

Total messages: 13
Olle Liljenzin
10 years, 8 months ago (2014-05-22 15:09:32 UTC) #1
noel chromium
Looking good. Please also compute the diff for jchuff.c and add it to the google.patch ...
10 years, 8 months ago (2014-05-23 12:13:52 UTC) #2
noel chromium
Minor thing: the title of this issue should be the first line of the change ...
10 years, 8 months ago (2014-05-23 12:25:09 UTC) #3
Olle Liljenzin
Updated title of this issue and uploaded the updated google.patch file.
10 years, 8 months ago (2014-05-23 12:43:27 UTC) #4
noel chromium
LGTM
10 years, 8 months ago (2014-05-23 13:19:13 UTC) #5
Olle Liljenzin
Noel, I don't think I have permission to run 'git cl dcommit'. So could you ...
10 years, 8 months ago (2014-05-23 13:35:44 UTC) #6
noel chromium
Committed the patch manually. Reset your local libjpeg_turbo, git svn fetch, and let me know ...
10 years, 8 months ago (2014-05-23 23:57:48 UTC) #7
Olle Liljenzin
On 2014/05/23 23:57:48, noel chromium wrote: > Committed the patch manually. Reset your local libjpeg_turbo, ...
10 years, 8 months ago (2014-05-26 11:12:21 UTC) #8
noel chromium
Yes, prepare a patch to roll DEPS and upload it, reviewer inferno@chromium.org, cc me, and ...
10 years, 8 months ago (2014-05-26 12:26:54 UTC) #9
Olle Liljenzin
On 2014/05/26 12:26:54, noel chromium wrote: > Yes, prepare a patch to roll DEPS and ...
10 years, 8 months ago (2014-05-26 12:47:41 UTC) #10
noel chromium
Doing the DEPS roll on https://codereview.chromium.org/300553007
10 years, 8 months ago (2014-05-27 00:11:00 UTC) #11
noel chromium
re the x86 comment: "bsr instruction showed to be slower on some chips." Just curious: ...
10 years, 8 months ago (2014-05-28 03:36:39 UTC) #12
Olle Liljenzin
10 years, 8 months ago (2014-05-28 07:18:05 UTC) #13
On 2014/05/28 03:36:39, noel chromium wrote:
> re the x86 comment: "bsr instruction showed to be slower on some chips."  Just
> curious: which chips - old ones, new ones, specific types ...?

In the tests by DCR and me we found older AMDs to be slow.

When searching around I found the following draft paper by Torbjörn Granlund
that claims slow bsr timings on Intel P4, AMD K6-K9 and on Intel Atom, although
I haven't verified if these numbers apply directly to the jpeg encoding
performance:
https://gmplib.org/~tege/x86-timing.pdf


And thanks for all help with the review!
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b