Overall, this looks very good. Thanks for doing it! http://codereview.appspot.com/11905/diff/1/2 File Python/ceval.c (left): http://codereview.appspot.com/11905/diff/1/2#oldcode669 Line ...
15 years, 11 months ago
(2009-01-12 20:05:54 UTC)
#1
Overall, this looks very good. Thanks for doing it!
http://codereview.appspot.com/11905/diff/1/2
File Python/ceval.c (left):
http://codereview.appspot.com/11905/diff/1/2#oldcode669
Line 669: #define PREDICTED(op) PRED_##op: next_instr++
Does the compiler leave these instructions in the assembly even if no code jumps
to them?
http://codereview.appspot.com/11905/diff/1/2
File Python/ceval.c (right):
http://codereview.appspot.com/11905/diff/1/2#newcode82
Line 82: #define LLTRACE 1 /* Low-level trace feature */
Since Py_DEBUG defines LLTRACE, I don't think you should turn off COMPUTED_GOTOs
based on it. We want builds with --with-pydebug to use the same dispatch
mechanism by default as optimized builds.
You could do the same thing in response to lltrace==true as you do in response
to _Py_TracingPossible since most of the time lltrace is false.
http://codereview.appspot.com/11905/diff/1/2#newcode623
Line 623: #if (defined(__GNUC__) || defined (__SUNPRO_C)) \
Arguably, you should use configure to check if computed gotos are supported,
rather than a specific list of compilers.
http://codereview.appspot.com/11905/diff/1/2#newcode652
Line 652: { \
Consider making the macro expand to "do { ... } while(0)" to force the semicolon
at the end.
http://codereview.appspot.com/11905/diff/1/2#newcode653
Line 653: /* Avoid multiple loads from _Py_Ticker despite `volatile` */ \
This is really a separate optimization worth, IIRC, ~1%. It's worth submitting
separately if that'll get it in faster. You can also get the same result more
subtly by using _Py_Ticker-- instead of --_Py_Ticker, but I like the temporary
variable better.
http://codereview.appspot.com/11905/diff/1/2#newcode659
Line 659: continue; \
Oh wait, this will do the wrong thing if you wrap it in a do/while loop. You
could add a new jump target at the top of the loop, but leaving this as a block
is fine too.
http://codereview.appspot.com/11905/diff/1/2#newcode758
Line 758: Opcode prediction is disabled with threaded code, since the latter
allows
This isn't necessarily a good idea. Most processors have a much better predictor
for conditional branches (from the PREDICT() macro) than indirect branches (from
the FAST_DISPATCH() macro). I don't know the exact numbers involved, but the
processor will need most or all targets of an indirect branch to be the same in
order to predict it accurately, while it can handle simple taken/untaken
patterns and still get a conditional branch right.
None of the above is conclusive, of course, just an argument to measure the
difference before committing to removing the static prediction.
http://codereview.appspot.com/11905/diff/1/2#newcode2259
Line 2259: _setup_finally:
Labels like this produce gcc warnings when compiled without USE_COMPUTED_GOTOS,
so consider making a macro for them.
http://codereview.appspot.com/11905/diff/1/3
File Python/makeopcodetargets.py (right):
http://codereview.appspot.com/11905/diff/1/3#newcode14
Line 14: os.path.dirname(os.path.dirname(__file__)), "Lib")
Very funny way to spell "..". ;)
Thanks for your comments! http://codereview.appspot.com/11905/diff/1/2 File Python/ceval.c (left): http://codereview.appspot.com/11905/diff/1/2#oldcode669 Line 669: #define PREDICTED(op) PRED_##op: next_instr++ ...
15 years, 11 months ago
(2009-01-12 21:19:48 UTC)
#2
Thanks for your comments!
http://codereview.appspot.com/11905/diff/1/2
File Python/ceval.c (left):
http://codereview.appspot.com/11905/diff/1/2#oldcode669
Line 669: #define PREDICTED(op) PRED_##op: next_instr++
On 2009/01/12 20:05:55, jyasskin wrote:
> Does the compiler leave these instructions in the assembly even if no code
jumps
> to them?
I don't think so, but I thought that maybe this code could end up invoked
mistakingly because of a fall-through.
http://codereview.appspot.com/11905/diff/1/2
File Python/ceval.c (right):
http://codereview.appspot.com/11905/diff/1/2#newcode82
Line 82: #define LLTRACE 1 /* Low-level trace feature */
Ok, I'll try that.
http://codereview.appspot.com/11905/diff/1/2#newcode623
Line 623: #if (defined(__GNUC__) || defined (__SUNPRO_C)) \
Hmm, well let's see. There are three possibilities now:
- use a specific list of compilers (arguably a rather primitive solution)
- use automatic detection at configure-time (your proposal)
- use an explicit configure option (MaL's proposal)
http://codereview.appspot.com/11905/diff/1/2#newcode653
Line 653: /* Avoid multiple loads from _Py_Ticker despite `volatile` */ \
I don't want to open a separate entry for such a small change. And there's no
point in wanting to get it in faster :)
As for using the postdecrement form, I seem to remember gcc didn't optimize it
any more than the predecrement form. I suppose the use of volatile disables a
lot of optimization paths in the compiler.
http://codereview.appspot.com/11905/diff/1/2#newcode659
Line 659: continue; \
Let's keep it simple :)
http://codereview.appspot.com/11905/diff/1/2#newcode758
Line 758: Opcode prediction is disabled with threaded code, since the latter
allows
> just an argument to measure the difference before
> committing to removing the static prediction.
I find it hard to believe leaving PREDICT() would make a positive difference.
Figures welcome anyway.
http://codereview.appspot.com/11905/diff/1/2#newcode2259
Line 2259: _setup_finally:
On 2009/01/12 20:05:55, jyasskin wrote:
> Labels like this produce gcc warnings when compiled without
USE_COMPUTED_GOTOS,
> so consider making a macro for them.
I think adding "if(0) goto impl" to the non-computed goto version of
TARGET_WITH_IMPL will do the trick.
http://codereview.appspot.com/11905/diff/1/3
File Python/makeopcodetargets.py (right):
http://codereview.appspot.com/11905/diff/1/3#newcode14
Line 14: os.path.dirname(os.path.dirname(__file__)), "Lib")
On 2009/01/12 20:05:55, jyasskin wrote:
> Very funny way to spell "..". ;)
Well, not if you are running the script from a different location than the root
of the checkout!
Hi Marc, there's a question for you below. Antoine, I look forward to an updated ...
15 years, 11 months ago
(2009-01-14 16:56:35 UTC)
#3
Hi Marc, there's a question for you below.
Antoine, I look forward to an updated version of the patch. :)
http://codereview.appspot.com/11905/diff/1/2
File Python/ceval.c (right):
http://codereview.appspot.com/11905/diff/1/2#newcode623
Line 623: #if (defined(__GNUC__) || defined (__SUNPRO_C)) \
On 2009/01/12 21:19:48, Antoine Pitrou wrote:
> Hmm, well let's see. There are three possibilities now:
> - use a specific list of compilers (arguably a rather primitive solution)
> - use automatic detection at configure-time (your proposal)
> - use an explicit configure option (MaL's proposal)
I prefer detecting this automatically in the long run since many people
compiling Python won't realize that they have to set a configure option to get
optimal performance. That said, having an explicit option with an auto-detected
default seems like a good idea, and we could initially default it to 'off' in
order to shake out any bugs. If MaL feels strongly, though, I won't argue.
http://codereview.appspot.com/11905/diff/1/2#newcode2259
Line 2259: _setup_finally:
On 2009/01/12 21:19:48, Antoine Pitrou wrote:
> On 2009/01/12 20:05:55, jyasskin wrote:
> > Labels like this produce gcc warnings when compiled without
> USE_COMPUTED_GOTOS,
> > so consider making a macro for them.
>
> I think adding "if(0) goto impl" to the non-computed goto version of
> TARGET_WITH_IMPL will do the trick.
>
Good idea.
On 2009/01/14 16:56:35, jyasskin wrote: > That said, having an explicit option with an auto-detected ...
15 years, 11 months ago
(2009-01-14 18:15:08 UTC)
#4
On 2009/01/14 16:56:35, jyasskin wrote:
> That said, having an explicit option with an auto-detected
> default seems like a good idea, and we could initially default it to 'off' in
> order to shake out any bugs.
Hmm, my configure skills are exactly equal to 0.0, so the simpler the better :)
I'll take a look but I might need help.
Issue 11905: threadedceval5.patch
(Closed)
Created 15 years, 11 months ago by Jeffrey Yasskin
Modified 15 years, 5 months ago
Reviewers: Jeffrey Yasskin, Antoine Pitrou
Base URL: http://svn.python.org/view/*checkout*/python/branches/py3k/
Comments: 19