Rietveld Code Review Tool

Issue 6506058: Replace D3DXFloat16To32Array. (Closed)

Created:
12 years, 11 months ago by apatrick1
Modified:
12 years, 11 months ago
Reviewers:
dgkoch, nicolas
CC:
angleproject-review_googlegroups.com
Base URL:
http://angleproject.googlecode.com/svn/trunk/
Visibility:
Public.

Description

Replace D3DXFloat16To32Array.
Method described here: ftp://ftp.fox-toolkit.org/pub/fasthalffloatconversion.pdf
Committed: https://code.google.com/p/angleproject/source/detail?r=1270

Patch Set 1

Stats: +2293 lines, -8 lines
M src/build_angle.gypi: 1 chunk, +1 line, -0 lines, 0 comments
M src/libGLESv2/Context.cpp: 1 chunk, +4 lines, -8 lines, 0 comments
A src/libGLESv2/Float16ToFloat32.cpp: 1 chunk, +2203 lines, -0 lines, 0 comments
A src/libGLESv2/Float16ToFloat32.py: 1 chunk, +78 lines, -0 lines, 0 comments
M src/libGLESv2/libGLESv2.vcproj: 1 chunk, +4 lines, -0 lines, 0 comments
M src/libGLESv2/mathutil.h: 1 chunk, +3 lines, -0 lines, 0 comments

Messages

Total messages: 2
apatrick1
Daniel, I tested all 16-bit floats between 0000 and FFFF. The results differed from DirectXMath::PackedVector::XMConvertHalfToFloat ...
12 years, 11 months ago (2012-08-31 21:58:21 UTC) #1
nicolas
12 years, 11 months ago (2012-09-05 19:17:02 UTC) #2
Looks OK to me. The OES_texture_half_float extension allows an implementation
either to support INF/NaN representations or not, so this change is fine and
probably for the better.

In my experience that algorithm isn't optimal, though. It always takes
multiple clock cycles per conversion (reciprocal throughput), whereas a
65536-entry table needs only a single lookup. Even though the latter doesn't fit
in the L1 cache, this only affects latency, and dozens of lookups can be in
flight simultaneously to hide it. Also, the input typically has some coherence,
so a conversion pass usually doesn't touch all 256 kB of a complete lookup table.
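
To make the full-table alternative concrete, here is a minimal Python sketch of a 65536-entry table. It is not code from the patch: the names `decode_half`, `FULL_TABLE`, and `half_to_float` are hypothetical, and the table is filled with Python's `struct` half-float format (`'e'`), which follows IEEE 754 including INF/NaN, so its special-value behavior may differ from what the patch does.

```python
import struct
from array import array


def decode_half(h):
    """Hypothetical helper: interpret a 16-bit pattern as an IEEE 754 half."""
    return struct.unpack('<e', struct.pack('<H', h))[0]


# One 32-bit float per possible 16-bit input, built once up front.
FULL_TABLE = array('f', (decode_half(h) for h in range(0x10000)))

# 65536 entries * 4 bytes each = 262144 bytes, i.e. the 256 kB quoted above.
TABLE_BYTES = len(FULL_TABLE) * FULL_TABLE.itemsize


def half_to_float(h):
    """With the table in place, a conversion is a single indexed load."""
    return FULL_TABLE[h]
```

The size arithmetic is the point of the sketch: the table costs 256 kB regardless of how many distinct inputs a given image actually contains, while each conversion is one load.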

Anyway, since there's no particular expectation for glReadPixels to be fast,
this implementation seems to be a fair compromise between speed and size.
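
The size side of that compromise can be sketched as follows, assuming the patch uses the three-table layout from the paper linked in the description (a 2048-entry mantissa table plus 64-entry exponent and offset tables). This is a reconstruction from the paper, not the committed code, and the names `MANTISSA`, `EXPONENT`, `OFFSET`, and `half_to_float_bits` are hypothetical.

```python
import math
import struct


def _convert_mantissa(i):
    """Normalize a subnormal half mantissa (1..1023) into float32 bits."""
    m = i << 13                 # align the 10-bit mantissa with float32's 23 bits
    e = 0x38800000              # float32 exponent field for 2**-14
    while not (m & 0x00800000):
        e -= 0x00800000         # each left shift lowers the exponent by one
        m <<= 1
    return (m & ~0x00800000) | e    # drop the now-implicit leading 1


# Entries 0..1023 serve zero and subnormals, 1024..2047 serve everything else.
MANTISSA = ([0]
            + [_convert_mantissa(i) for i in range(1, 1024)]
            + [0x38000000 + ((i - 1024) << 13) for i in range(1024, 2048)])

# Indexed by the top 6 bits of the half: sign bit plus 5 exponent bits.
EXPONENT = [0] * 64
for i in range(1, 31):
    EXPONENT[i] = i << 23
EXPONENT[31] = 0x47800000       # +INF / NaN
EXPONENT[32] = 0x80000000       # negative zero and negative subnormals
for i in range(33, 63):
    EXPONENT[i] = 0x80000000 + ((i - 32) << 23)
EXPONENT[63] = 0xC7800000       # -INF / NaN

# Selects which half of MANTISSA applies to a given sign/exponent combination.
OFFSET = [0 if i in (0, 32) else 1024 for i in range(64)]


def half_to_float_bits(h):
    """Two small-table lookups, one mantissa lookup, and one add."""
    top = h >> 10
    return MANTISSA[OFFSET[top] + (h & 0x3FF)] + EXPONENT[top]


def half_to_float(h):
    """Reinterpret the 32-bit result as a float."""
    return struct.unpack('<f', struct.pack('<I', half_to_float_bits(h)))[0]


print(hex(half_to_float_bits(0x3C00)))    # 0x3f800000, i.e. 1.0
print(math.isinf(half_to_float(0x7C00)))  # True
```

The three tables hold 2048 + 64 + 64 = 2176 entries, about 8.5 KiB at 4 bytes per entry, against 256 kB for the full table; the cost is the extra lookups and the add per conversion.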

That said, glReadPixels doesn't strictly have to support this conversion. See
http://code.google.com/p/angleproject/issues/detail?id=295 and page 103 of the
OpenGL ES 2.0.24 spec. But let's not make any behavior changes here and just get
rid of this D3DX dependency first. Thanks!