|
|
Descriptioncompiler: Basic escape analysis for the go frontend.
This is an implementation of the algorithm described in
"Escape Analysis in Java" by Choi et. al.
It relies on dataflow information to discover variable
references to one another. Handles assignments to closures
and association between closures variables and the variables
of the enclosing scope.
Dataflow analysis does not discover references through range
statements e.g. for _, v := range a will not recognize that
all values of v are references to a.
Patch Set 1 #Patch Set 2 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #Patch Set 3 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #Patch Set 4 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #Patch Set 5 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #
Total comments: 31
Patch Set 6 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #
Total comments: 14
Patch Set 7 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #
Total comments: 40
Patch Set 8 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #
Total comments: 6
Patch Set 9 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #
Total comments: 6
Patch Set 10 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #
Total comments: 7
Patch Set 11 : diff -r d42a0819e2eb https://code.google.com/p/gofrontend #Patch Set 12 : diff -r 2169f7d99472 https://code.google.com/p/gofrontend #Patch Set 13 : diff -r ee25dd0adff9 https://code.google.com/p/gofrontend #Patch Set 14 : diff -r ee25dd0adff9 https://code.google.com/p/gofrontend #
MessagesTotal messages: 26
Hello iant@golang.org (cc: gofrontend-dev@googlegroups.com), I'd like you to review this change to https://code.google.com/p/gofrontend
Sign in to reply to this message.
On 2015/02/23 19:44:04, cmang wrote: > Hello mailto:iant@golang.org (cc: mailto:gofrontend-dev@googlegroups.com), > > I'd like you to review this change to > https://code.google.com/p/gofrontend Diff to gcc/go/Make-lang.in: --- a/gcc/go/Make-lang.in +++ b/gcc/go/Make-lang.in @@ -51,6 +51,7 @@ go-warn = $(STRICT_WARN) GO_OBJS = \ go/ast-dump.o \ go/dataflow.o \ + go/escape.o \ go/export.o \ go/expressions.o \ go/go-backend.o \
Sign in to reply to this message.
On 2015/02/23 19:44:35, cmang wrote: > Diff to gcc/go/Make-lang.in: > > --- a/gcc/go/Make-lang.in > +++ b/gcc/go/Make-lang.in > @@ -51,6 +51,7 @@ go-warn = $(STRICT_WARN) > GO_OBJS = \ > go/ast-dump.o \ > go/dataflow.o \ > + go/escape.o \ where is this go/escape.cc? > go/export.o \ > go/expressions.o \ > go/go-backend.o \
Sign in to reply to this message.
On 2015/02/23 21:02:18, minux wrote: > On 2015/02/23 19:44:35, cmang wrote: > > Diff to gcc/go/Make-lang.in: > > > > --- a/gcc/go/Make-lang.in > > +++ b/gcc/go/Make-lang.in > > @@ -51,6 +51,7 @@ go-warn = $(STRICT_WARN) > > GO_OBJS = \ > > go/ast-dump.o \ > > go/dataflow.o \ > > + go/escape.o \ > where is this go/escape.cc? > > go/export.o \ > > go/expressions.o \ > > go/go-backend.o \ Sorry, I thought it picked up the newly added files earlier.
Sign in to reply to this message.
A quick first look. https://codereview.appspot.com/204110043/diff/70001/go/dataflow.cc File go/dataflow.cc (right): https://codereview.appspot.com/204110043/diff/70001/go/dataflow.cc#newcode177 go/dataflow.cc:177: // analysis for case statements here; the analysis for the blocks will s/case/select/ https://codereview.appspot.com/204110043/diff/70001/go/escape.cc File go/escape.cc (right): https://codereview.appspot.com/204110043/diff/70001/go/escape.cc#newcode1 go/escape.cc:1: // escape.cc -- Go frontend escape analysis. -*- C++ -*- You don't need the -*- C++ -*- here. It's only needed for .h files. https://codereview.appspot.com/204110043/diff/70001/go/escape.cc#newcode138 go/escape.cc:138: std::set<Node*> edges = this->edges(); Make edges a const reference. https://codereview.appspot.com/204110043/diff/70001/go/escape.cc#newcode139 go/escape.cc:139: for (std::set<Node*>::const_iterator p1 = edges.begin(); Most iterators in the frontend are called p. May as well call this one p rather than p1. https://codereview.appspot.com/204110043/diff/70001/go/escape.h File go/escape.h (right): https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode23 go/escape.h:23: // of reducing the overhead cost of garbage collection by allocation objects s/allocation/allocating/ https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode24 go/escape.h:24: // on the stack when they do escape the method they were created in. s/do/do not/ https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode24 go/escape.h:24: // on the stack when they do escape the method they were created in. You use the word "method" here and several times below but I think you mean "function or method" and I think it might be clearer to simply say "function." https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode113 go/escape.h:113: std::set<Node*> I haven't looked at the call sites, but most likely this should return const std::set<Node*>& so that it doesn't have to copy the entire set. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode118 go/escape.h:118: std::string name_; These fields should be private. Also they should be down at the bottom of the class definition. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode118 go/escape.h:118: std::string name_; If the name and label methods are going to simply return these fields after initializing them, then probably name and label should return const std::string& rather than copying the string. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode163 go/escape.h:163: virtual std::string const std::string& ? https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode179 go/escape.h:179: virtual std::string const std::string& ? https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode200 go/escape.h:200: std::set<Node*> const std::set<Node*>& ? https://codereview.appspot.com/204110043/diff/70001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/70001/go/gogo.cc#newcode4977 go/gogo.cc:4977: // TODO(cmang): Implement a backend function to allocate memory on the I suggested that but actually I think that to way to do this is to make a zero-initialized temporary variable, and return the address of that. That is, we don't need a new function, we already have the functionality we need. https://codereview.appspot.com/204110043/diff/70001/go/gogo.h File go/gogo.h (right): https://codereview.appspot.com/204110043/diff/70001/go/gogo.h#newcode130 go/gogo.h:130: std::set<Node*> const std::set<Node*>& ? Also below.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/70001/go/dataflow.cc File go/dataflow.cc (right): https://codereview.appspot.com/204110043/diff/70001/go/dataflow.cc#newcode177 go/dataflow.cc:177: // analysis for case statements here; the analysis for the blocks will On 2015/02/24 03:34:49, iant wrote: > s/case/select/ Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.cc File go/escape.cc (right): https://codereview.appspot.com/204110043/diff/70001/go/escape.cc#newcode1 go/escape.cc:1: // escape.cc -- Go frontend escape analysis. -*- C++ -*- On 2015/02/24 03:34:49, iant wrote: > You don't need the -*- C++ -*- here. It's only needed for .h files. Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.cc#newcode138 go/escape.cc:138: std::set<Node*> edges = this->edges(); On 2015/02/24 03:34:49, iant wrote: > Make edges a const reference. Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.cc#newcode139 go/escape.cc:139: for (std::set<Node*>::const_iterator p1 = edges.begin(); On 2015/02/24 03:34:49, iant wrote: > Most iterators in the frontend are called p. May as well call this one p rather > than p1. Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.h File go/escape.h (right): https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode23 go/escape.h:23: // of reducing the overhead cost of garbage collection by allocation objects On 2015/02/24 03:34:49, iant wrote: > s/allocation/allocating/ Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode24 go/escape.h:24: // on the stack when they do escape the method they were created in. On 2015/02/24 03:34:49, iant wrote: > s/do/do not/ Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode24 go/escape.h:24: // on the stack when they do escape the method they were created in. On 2015/02/24 03:34:50, iant wrote: > You use the word "method" here and several times below but I think you mean > "function or method" and I think it might be clearer to simply say "function." Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode113 go/escape.h:113: std::set<Node*> On 2015/02/24 03:34:49, iant wrote: > I haven't looked at the call sites, but most likely this should return > const std::set<Node*>& > so that it doesn't have to copy the entire set. Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode118 go/escape.h:118: std::string name_; On 2015/02/24 03:34:50, iant wrote: > These fields should be private. Also they should be down at the bottom of the > class definition. I meant for these to be protected so Call_node and Connection_node could set them. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode118 go/escape.h:118: std::string name_; On 2015/02/24 03:34:50, iant wrote: > If the name and label methods are going to simply return these fields after > initializing them, then probably name and label should return const std::string& > rather than copying the string. Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode163 go/escape.h:163: virtual std::string On 2015/02/24 03:34:50, iant wrote: > const std::string& ? Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode179 go/escape.h:179: virtual std::string On 2015/02/24 03:34:49, iant wrote: > const std::string& ? Done. https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode200 go/escape.h:200: std::set<Node*> On 2015/02/24 03:34:50, iant wrote: > const std::set<Node*>& ? Done. https://codereview.appspot.com/204110043/diff/70001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/70001/go/gogo.cc#newcode4977 go/gogo.cc:4977: // TODO(cmang): Implement a backend function to allocate memory on the On 2015/02/24 03:34:50, iant wrote: > I suggested that but actually I think that to way to do this is to make a > zero-initialized temporary variable, and return the address of that. That is, > we don't need a new function, we already have the functionality we need. Acknowledged. https://codereview.appspot.com/204110043/diff/70001/go/gogo.h File go/gogo.h (right): https://codereview.appspot.com/204110043/diff/70001/go/gogo.h#newcode130 go/gogo.h:130: std::set<Node*> On 2015/02/24 03:34:50, iant wrote: > const std::set<Node*>& ? > > Also below. Done.
Sign in to reply to this message.
A few more comments. Still haven't really gone through the whole thing yet. https://codereview.appspot.com/204110043/diff/70001/go/escape.h File go/escape.h (right): https://codereview.appspot.com/204110043/diff/70001/go/escape.h#newcode118 go/escape.h:118: std::string name_; On 2015/02/24 18:35:17, cmang wrote: > On 2015/02/24 03:34:50, iant wrote: > > These fields should be private. Also they should be down at the bottom of the > > class definition. > > I meant for these to be protected so Call_node and Connection_node could set > them. In this C++ coding style, all class members are private. You should add protected accessors and setters as needed. https://codereview.appspot.com/204110043/diff/90001/go/dataflow.cc File go/dataflow.cc (right): https://codereview.appspot.com/204110043/diff/90001/go/dataflow.cc#newcode193 go/dataflow.cc:193: for (Expression_list::const_iterator p1 = ce->args()->begin(); s/p1/p/ https://codereview.appspot.com/204110043/diff/90001/go/dataflow.cc#newcode197 go/dataflow.cc:197: Named_object* var = get_var(*p1); It seems to me that you need to traverse the whole expression here. You are handling the case of go f(a), but not the case of go f(*a). https://codereview.appspot.com/204110043/diff/90001/go/escape.h File go/escape.h (right): https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode17 go/escape.h:17: class Gogo; I don't know if you need to #include "gogo.h" above, but I know that you don't need to both #include "gogo.h" and declare class Gogo. Drop one or the other. https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode21 go/escape.h:21: // algorithm from "Escape Analysis for Java" by Choi et. al in OPSLA '99. s/OPSLA/OOPSLA/ (I think) https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode37 go/escape.h:37: class Node The Node class should have a comment specific to the class, explaining what it is (presumably a general graph node). Actually you should say something about the graph representation somewhere. E.g., what sorts of objects can appear in the graph? https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode46 go/escape.h:46: Node(Node_classification, Named_object* object); Perhaps the Node constructor should be protected? (Probably should be true for Expression and Statement as well, really.) https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode98 go/escape.h:98: name(); Both Call_node and Connection_node have a name method; should this method be = 0?
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/90001/go/dataflow.cc File go/dataflow.cc (right): https://codereview.appspot.com/204110043/diff/90001/go/dataflow.cc#newcode193 go/dataflow.cc:193: for (Expression_list::const_iterator p1 = ce->args()->begin(); On 2015/02/24 19:27:14, iant wrote: > s/p1/p/ Done. https://codereview.appspot.com/204110043/diff/90001/go/dataflow.cc#newcode197 go/dataflow.cc:197: Named_object* var = get_var(*p1); On 2015/02/24 19:27:14, iant wrote: > It seems to me that you need to traverse the whole expression here. You are > handling the case of go f(a), but not the case of go f(*a). Done. Thunk statements will be handled in the above traversal. https://codereview.appspot.com/204110043/diff/90001/go/escape.h File go/escape.h (right): https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode17 go/escape.h:17: class Gogo; On 2015/02/24 19:27:14, iant wrote: > I don't know if you need to #include "gogo.h" above, but I know that you don't > need to both #include "gogo.h" and declare class Gogo. Drop one or the other. I included "gogo.h" to use Named_object::is_function and Named_object::is_function_declaration. Removed both Gogo and Named_object forward declarations. https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode21 go/escape.h:21: // algorithm from "Escape Analysis for Java" by Choi et. al in OPSLA '99. On 2015/02/24 19:27:14, iant wrote: > s/OPSLA/OOPSLA/ (I think) Done. https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode37 go/escape.h:37: class Node On 2015/02/24 19:27:14, iant wrote: > The Node class should have a comment specific to the class, explaining what it > is (presumably a general graph node). Actually you should say something about > the graph representation somewhere. E.g., what sorts of objects can appear in > the graph? Done. https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode46 go/escape.h:46: Node(Node_classification, Named_object* object); On 2015/02/24 19:27:14, iant wrote: > Perhaps the Node constructor should be protected? (Probably should be true for > Expression and Statement as well, really.) Done. https://codereview.appspot.com/204110043/diff/90001/go/escape.h#newcode98 go/escape.h:98: name(); On 2015/02/24 19:27:14, iant wrote: > Both Call_node and Connection_node have a name method; should this method be = > 0? Done.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/110001/go/dataflow.cc File go/dataflow.cc (right): https://codereview.appspot.com/204110043/diff/110001/go/dataflow.cc#newcode173 go/dataflow.cc:173: || statement->thunk_statement() != NULL) Needs a comment about thunk_statement. https://codereview.appspot.com/204110043/diff/110001/go/escape.h File go/escape.h (right): https://codereview.appspot.com/204110043/diff/110001/go/escape.h#newcode35 go/escape.h:35: // This is general graph node representing a named object used in a call graph s/is/is a/ https://codereview.appspot.com/204110043/diff/110001/go/expressions.h File go/expressions.h (right): https://codereview.appspot.com/204110043/diff/110001/go/expressions.h#newcode... go/expressions.h:2589: do_check_types(Gogo*) You can omit the empty do_check_types, that's the default anyhow. https://codereview.appspot.com/204110043/diff/110001/go/go.cc File go/go.cc (right): https://codereview.appspot.com/204110043/diff/110001/go/go.cc#newcode113 go/go.cc:113: // Consider escape analysis information when deciding if a variable should Let's move this below do_exports. It shouldn't make any difference. I like the idea of having do_exports be the first thing we do. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode1948 go/gogo.cc:1948: && object->var_value() != NULL If object->is_variable() is true then object->var_value() != NULL is guaranteed. You don't need to test it. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3201 go/gogo.cc:3201: // A traversal class used to build a call graph for this program. Perhaps Build_call_graph and Build_connection_graph and friends should be in escape.cc. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3233 go/gogo.cc:3233: this->current_function_ = fn; For future proofing, please add go_assert(this->current_function_ == NULL); https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3244 go/gogo.cc:3244: if (this->current_function_ == NULL) What happens for something like var global = f() Do we need a special node in the call graph to represent initialization? It may not matter for escape analysis. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3260 go/gogo.cc:3260: fn = expr->func_expression()->named_object(); So this isn't precisely a call graph, since it includes references to functions that are only referenced, not called. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3303 go/gogo.cc:3303: handle_call(Named_object*, Expression*); Should handle_call be a private method? https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3320 go/gogo.cc:3320: // Given an expression, return the Named_object that it refers to. A single expression can of course refer to multiple Named_objects. You should clarify which one you are looking for. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3325 go/gogo.cc:3325: { Perhaps this should be a static function in the Build_connection_graphs class. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3390 go/gogo.cc:3390: // e.g. 'func(var)' which could make var escape. I think this comment is slightly misleading. You aren't really trying to find func(var) that could make var escape--a call to an unknown function means that the variable escapes. You're trying to find func(var) for which you can prove that the variable does not escape. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3474 go/gogo.cc:3474: && var->var_value() != NULL No need to test var_value() != NULL. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3482 go/gogo.cc:3482: for (Dataflow::Defs::const_iterator p1 = defs->begin(); s/p1/p/ https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3527 go/gogo.cc:3527: for (Dataflow::Refs::const_iterator p1 = refs->begin(); s/p1/p/ https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3811 go/gogo.cc:3811: return; We should probably skip this is saw_errors() returns true. https://codereview.appspot.com/204110043/diff/110001/go/gogo.h File go/gogo.h (right): https://codereview.appspot.com/204110043/diff/110001/go/gogo.h#newcode1494 go/gogo.h:1494: && !this->is_non_escaping_address_taken_ This change isn't right. The is_non_escaping_address_taken_ field is set to true if the address of the variable is taken in such a way that it does not force the variable to escape. However, it is possible that the variable still escapes in some other way. Looking farther I see that you are using this flag to mean what you want. I think you need to introduce a new flag instead. Otherwise you will be messed up by the current code that calls set_non_escaping_address_taken, like the code you get from the call to set_does_not_escape in Unary_expression::do_lower. That call only means that this instance of &v does not escape. https://codereview.appspot.com/204110043/diff/110001/go/gogo.h#newcode1778 go/gogo.h:1778: && !this->is_non_escaping_address_taken_; As above, this change is not correct.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/110001/go/dataflow.cc File go/dataflow.cc (right): https://codereview.appspot.com/204110043/diff/110001/go/dataflow.cc#newcode173 go/dataflow.cc:173: || statement->thunk_statement() != NULL) On 2015/02/24 22:11:14, iant wrote: > Needs a comment about thunk_statement. Done. https://codereview.appspot.com/204110043/diff/110001/go/escape.h File go/escape.h (right): https://codereview.appspot.com/204110043/diff/110001/go/escape.h#newcode35 go/escape.h:35: // This is general graph node representing a named object used in a call graph On 2015/02/24 22:11:14, iant wrote: > s/is/is a/ Done. https://codereview.appspot.com/204110043/diff/110001/go/expressions.h File go/expressions.h (right): https://codereview.appspot.com/204110043/diff/110001/go/expressions.h#newcode... go/expressions.h:2589: do_check_types(Gogo*) On 2015/02/24 22:11:14, iant wrote: > You can omit the empty do_check_types, that's the default anyhow. Done. https://codereview.appspot.com/204110043/diff/110001/go/go.cc File go/go.cc (right): https://codereview.appspot.com/204110043/diff/110001/go/go.cc#newcode113 go/go.cc:113: // Consider escape analysis information when deciding if a variable should On 2015/02/24 22:11:15, iant wrote: > Let's move this below do_exports. It shouldn't make any difference. I like the > idea of having do_exports be the first thing we do. Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode1948 go/gogo.cc:1948: && object->var_value() != NULL On 2015/02/24 22:11:15, iant wrote: > If object->is_variable() is true then object->var_value() != NULL is guaranteed. > You don't need to test it. Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3201 go/gogo.cc:3201: // A traversal class used to build a call graph for this program. On 2015/02/24 22:11:15, iant wrote: > Perhaps Build_call_graph and Build_connection_graph and friends should be in > escape.cc. Acknowledged. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3233 go/gogo.cc:3233: this->current_function_ = fn; On 2015/02/24 22:11:15, iant wrote: > For future proofing, please add > go_assert(this->current_function_ == NULL); Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3244 go/gogo.cc:3244: if (this->current_function_ == NULL) On 2015/02/24 22:11:15, iant wrote: > What happens for something like > > var global = f() > > Do we need a special node in the call graph to represent initialization? It may > not matter for escape analysis. I think this might okay for escape analysis. f will have a node in the call graph and global will have a node in the global connection graph. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3260 go/gogo.cc:3260: fn = expr->func_expression()->named_object(); On 2015/02/24 22:11:15, iant wrote: > So this isn't precisely a call graph, since it includes references to functions > that are only referenced, not called. Acknowledged. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3303 go/gogo.cc:3303: handle_call(Named_object*, Expression*); On 2015/02/24 22:11:15, iant wrote: > Should handle_call be a private method? Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3320 go/gogo.cc:3320: // Given an expression, return the Named_object that it refers to. On 2015/02/24 22:11:15, iant wrote: > A single expression can of course refer to multiple Named_objects. You should > clarify which one you are looking for. Done, kinda. I don't think "outermost" is the right word since it doesn't make sense for things like binary expressions, but we don't check for those here. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3325 go/gogo.cc:3325: { On 2015/02/24 22:11:15, iant wrote: > Perhaps this should be a static function in the Build_connection_graphs class. It's only used in Build_connection_graphs so it can probably be private. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3390 go/gogo.cc:3390: // e.g. 'func(var)' which could make var escape. On 2015/02/24 22:11:15, iant wrote: > I think this comment is slightly misleading. You aren't really trying to find > func(var) that could make var escape--a call to an unknown function means that > the variable escapes. You're trying to find func(var) for which you can prove > that the variable does not escape. Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3474 go/gogo.cc:3474: && var->var_value() != NULL On 2015/02/24 22:11:15, iant wrote: > No need to test var_value() != NULL. Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3482 go/gogo.cc:3482: for (Dataflow::Defs::const_iterator p1 = defs->begin(); On 2015/02/24 22:11:15, iant wrote: > s/p1/p/ Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3527 go/gogo.cc:3527: for (Dataflow::Refs::const_iterator p1 = refs->begin(); On 2015/02/24 22:11:15, iant wrote: > s/p1/p/ Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3811 go/gogo.cc:3811: return; On 2015/02/24 22:11:15, iant wrote: > We should probably skip this is saw_errors() returns true. Done. https://codereview.appspot.com/204110043/diff/110001/go/gogo.h File go/gogo.h (right): https://codereview.appspot.com/204110043/diff/110001/go/gogo.h#newcode1494 go/gogo.h:1494: && !this->is_non_escaping_address_taken_ On 2015/02/24 22:11:15, iant wrote: > This change isn't right. The is_non_escaping_address_taken_ field is set to > true if the address of the variable is taken in such a way that it does not > force the variable to escape. However, it is possible that the variable still > escapes in some other way. > > Looking farther I see that you are using this flag to mean what you want. I > think you need to introduce a new flag instead. Otherwise you will be messed up > by the current code that calls set_non_escaping_address_taken, like the code you > get from the call to set_does_not_escape in Unary_expression::do_lower. That > call only means that this instance of &v does not escape. Done, introduced escapes_. https://codereview.appspot.com/204110043/diff/110001/go/gogo.h#newcode1778 go/gogo.h:1778: && !this->is_non_escaping_address_taken_; On 2015/02/24 22:11:15, iant wrote: > As above, this change is not correct. Done.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/110001/go/go.cc File go/go.cc (right): https://codereview.appspot.com/204110043/diff/110001/go/go.cc#newcode113 go/go.cc:113: // Consider escape analysis information when deciding if a variable should On 2015/02/25 00:28:48, cmang wrote: > On 2015/02/24 22:11:15, iant wrote: > > Let's move this below do_exports. It shouldn't make any difference. I like > the > > idea of having do_exports be the first thing we do. > > Done. Sorry, I'm an idiot. We need the escape information for the next phase, when we record function parameter escape info in the export data. Move optimize_allocations back before do_exports. https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc#newcode3395 go/gogo.cc:3395: return; If I have a function f(fn func(a []int)) and somewhere inside f I write fn(x), then x escapes, because the compiler has no idea what fn is. Where in this code do we mark x as escaping in that case? I would have thought it would be here. https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc#newcode3433 go/gogo.cc:3433: if (resolve_var_reference(*arg) == object) This doesn't seem reliable. Given func f(a []int, b []int) it's possible that a does not escape but b does. Now I could call f(x, x). It looks like this code is going to associate x with a both times, so it will think that x does not escape, but of course since x is also passed as b it does escape.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc#newcode3395 go/gogo.cc:3395: return; On 2015/02/25 19:35:07, iant wrote: > If I have a function f(fn func(a []int)) and somewhere inside f I write fn(x), > then x escapes, because the compiler has no idea what fn is. Where in this code > do we mark x as escaping in that case? I would have thought it would be here. Done. https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc#newcode3433 go/gogo.cc:3433: if (resolve_var_reference(*arg) == object) On 2015/02/25 19:35:07, iant wrote: > This doesn't seem reliable. Given func f(a []int, b []int) it's possible that a > does not escape but b does. Now I could call f(x, x). It looks like this code > is going to associate x with a both times, so it will think that x does not > escape, but of course since x is also passed as b it does escape. Done. Edges should be made between the object and each parameter it refers to.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc#newcode3395 go/gogo.cc:3395: return; On 2015/02/25 22:43:30, cmang wrote: > On 2015/02/25 19:35:07, iant wrote: > > If I have a function f(fn func(a []int)) and somewhere inside f I write fn(x), > > then x escapes, because the compiler has no idea what fn is. Where in this > code > > do we mark x as escaping in that case? I would have thought it would be here. > > Done. What did you change? I'm not seeing it. https://codereview.appspot.com/204110043/diff/150001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/150001/go/gogo.cc#newcode3233 go/gogo.cc:3233: go_assert(this->current_function_ != NULL); I think I suggested this assertion, but I think what I meant was this: go_assert(this->current_function_ == NULL); this->current_function_ = fn; https://codereview.appspot.com/204110043/diff/150001/go/gogo.cc#newcode3355 go/gogo.cc:3355: // p <- c A receive expression is just <-c, or, rather <-p. p <- c would be a send statement. https://codereview.appspot.com/204110043/diff/150001/go/gogo.cc#newcode3648 go/gogo.cc:3648: // Traverse statements to find interesting references that might have not What kinds of references does this code find? Can we make a note for each edge that it adds that was not added because of dataflow? I think this is of particular interest because resolve_var_reference is imprecise.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/130001/go/gogo.cc#newcode3395 go/gogo.cc:3395: return; On 2015/02/26 18:21:33, iant wrote: > On 2015/02/25 22:43:30, cmang wrote: > > On 2015/02/25 19:35:07, iant wrote: > > > If I have a function f(fn func(a []int)) and somewhere inside f I write > fn(x), > > > then x escapes, because the compiler has no idea what fn is. Where in this > > code > > > do we mark x as escaping in that case? I would have thought it would be > here. > > > > Done. > > What did you change? I'm not seeing it. Maybe looking at the wrong patchset. I added a special case for function calls where the function is a parameter variable that makes every argument escape. https://codereview.appspot.com/204110043/diff/150001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/150001/go/gogo.cc#newcode3233 go/gogo.cc:3233: go_assert(this->current_function_ != NULL); On 2015/02/26 18:21:34, iant wrote: > I think I suggested this assertion, but I think what I meant was this: > go_assert(this->current_function_ == NULL); > this->current_function_ = fn; Done. https://codereview.appspot.com/204110043/diff/150001/go/gogo.cc#newcode3355 go/gogo.cc:3355: // p <- c On 2015/02/26 18:21:33, iant wrote: > A receive expression is just <-c, or, rather <-p. p <- c would be a send > statement. Done. https://codereview.appspot.com/204110043/diff/150001/go/gogo.cc#newcode3648 go/gogo.cc:3648: // Traverse statements to find interesting references that might have not On 2015/02/26 18:21:33, iant wrote: > What kinds of references does this code find? Can we make a note for each edge > that it adds that was not added because of dataflow? > > I think this is of particular interest because resolve_var_reference is > imprecise. I'll add notes for the edges added that weren't in dataflow. In some small tests I wrote, I noticed dataflow doesn't record a variable being defined as a function or closure e.g. (from escape.go:for_escapes3) f[n] = func(){} is not recorded. The other cases I just added on, but I was under the impression that dataflow wasn't recording definitions and references inside of anonymous closures e.g. func() { a = b } would not record the def of a or ref of b. I'll need to take another look at some of the smaller cases.
Sign in to reply to this message.
More comments. I'm troubled by the imprecision of resolve_var_reference, because this code must be either precise or conservative, and right now I'm not convinced it is either. https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc#newcode3326 go/gogo.cc:3326: resolve_var_reference(Expression* expr) I'm starting to think that the purpose of this function is: given an Expression EXPR, find the variable that is relevant for purposes of escape analysis. If we see &x, or x[i], or x.f, then x is what we care about. If that is accurate, then I think we need to handle some additional cases in which we care about x: T(x) uintptr(unsafe.Pointer(x)) + n set_and_use_temporary bound_method_expression conditional expression compound expression In the case of a composite literal, there can be multiple escaping values. We don't necessarily need to handle all these cases. But if there are any cases that we don't handle here, the code calling this needs to be conservative when resolve_var_reference returns NULL. So I think we need to return a three part answer: 1) a Named_object (or, if we handle a composite literal, a list of Named_objects); 2) a determination that there is no named object; 3) uncertain, in which case the calling code must be conservative and assume that the value escapes. https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc#newcode3328 go/gogo.cc:3328: while (expr->unary_expression() != NULL You have a while statement around a switch statement, and they are both checking the classification. Let's change it to a simple while (!done), and change the switch statemen to do done = true in the default case. https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc#newcode3463 go/gogo.cc:3463: if (resolve_var_reference(*arg) == object) Instead of doing this, let's consider changing dataflow. Right now all dataflow gives you for a variable reference is a Statement. But in principle there is no reason it can't give you more detailed information. With some more bookkeeping, dataflow could give you a parameter number in a call expression. Then we wouldn't have to work it out here. I don't know if this will work but I think it's worth a try.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc#newcode3326 go/gogo.cc:3326: resolve_var_reference(Expression* expr) On 2015/02/27 21:17:03, iant wrote: > I'm starting to think that the purpose of this function is: given an Expression > EXPR, find the variable that is relevant for purposes of escape analysis. If we > see &x, or x[i], or x.f, then x is what we care about. > > If that is accurate, then I think we need to handle some additional cases in > which we care about x: > T(x) > uintptr(unsafe.Pointer(x)) + n > set_and_use_temporary > bound_method_expression > conditional expression > compound expression > > In the case of a composite literal, there can be multiple escaping values. > > We don't necessarily need to handle all these cases. But if there are any cases > that we don't handle here, the code calling this needs to be conservative when > resolve_var_reference returns NULL. So I think we need to return a three part > answer: 1) a Named_object (or, if we handle a composite literal, a list of > Named_objects); 2) a determination that there is no named object; 3) uncertain, > in which case the calling code must be conservative and assume that the value > escapes. I think your description of what this function is intended for is correct. The cases handled here were derived from the escape*.go files and so the extra cases you mentioned should be handled here as well. I'm not sure if I understand the second and third part of the three part answer you suggested. What does it mean if we determine there is a named object in the expression, but this returns NULL or an empty list? The expression is one we don't care about analyzing, like a binary expression e.g. a + b where neither a nor b are being explicitly referenced? Presumably, if we determine there is no named object, this expression is the definition of some other object as a literal or new/make, etc. What exactly does uncertain translate to? That we found a named object or several, but don't know which one to refer to e.g. a[b[i]]? https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc#newcode3328 go/gogo.cc:3328: while (expr->unary_expression() != NULL On 2015/02/27 21:17:04, iant wrote: > You have a while statement around a switch statement, and they are both checking > the classification. Let's change it to a simple while (!done), and change the > switch statemen to do done = true in the default case. Done. https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc#newcode3463 go/gogo.cc:3463: if (resolve_var_reference(*arg) == object) On 2015/02/27 21:17:03, iant wrote: > Instead of doing this, let's consider changing dataflow. Right now all dataflow > gives you for a variable reference is a Statement. But in principle there is no > reason it can't give you more detailed information. With some more bookkeeping, > dataflow could give you a parameter number in a call expression. Then we > wouldn't have to work it out here. > > I don't know if this will work but I think it's worth a try. Acknowledged. Added TODO.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc#newcode3326 go/gogo.cc:3326: resolve_var_reference(Expression* expr) On 2015/02/27 22:00:52, cmang wrote: > > I'm not sure if I understand the second and third part of the three part answer > you suggested. What does it mean if we determine there is a named object in the > expression, but this returns NULL or an empty list? The expression is one we > don't care about analyzing, like a binary expression e.g. a + b where neither a > nor b are being explicitly referenced? Presumably, if we determine there is no > named object, this expression is the definition of some other object as a > literal or new/make, etc. What exactly does uncertain translate to? That we > found a named object or several, but don't know which one to refer to e.g. > a[b[i]]? For a[b[i]], we would return a. That is the value that escapes. For a - b, we would return NULL. We know for sure that neither value escapes. I don't know if we need an uncertain result. But it's clear that the function right now is not always certain. Given an expression a(b, c), where we don't know a, we have to assume that both b and c escape. We don't currently have a return value that indicates that. Or maybe we don't have a return value that indicates that in a - b neither escapes. Basically there are more cases than we are returning right now.
Sign in to reply to this message.
https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc File go/gogo.cc (right): https://codereview.appspot.com/204110043/diff/110001/go/gogo.cc#newcode3201 go/gogo.cc:3201: // A traversal class used to build a call graph for this program. On 2015/02/25 00:28:48, cmang wrote: > On 2015/02/24 22:11:15, iant wrote: > > Perhaps Build_call_graph and Build_connection_graph and friends should be in > > escape.cc. > > Acknowledged. Done.
Sign in to reply to this message.
On 2015/02/27 23:49:57, iant wrote: > https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc > File go/gogo.cc (right): > > https://codereview.appspot.com/204110043/diff/170001/go/gogo.cc#newcode3326 > go/gogo.cc:3326: resolve_var_reference(Expression* expr) > On 2015/02/27 22:00:52, cmang wrote: > > > > I'm not sure if I understand the second and third part of the three part > answer > > you suggested. What does it mean if we determine there is a named object in > the > > expression, but this returns NULL or an empty list? The expression is one we > > don't care about analyzing, like a binary expression e.g. a + b where neither > a > > nor b are being explicitly referenced? Presumably, if we determine there is no > > named object, this expression is the definition of some other object as a > > literal or new/make, etc. What exactly does uncertain translate to? That we > > found a named object or several, but don't know which one to refer to e.g. > > a[b[i]]? > > For a[b[i]], we would return a. That is the value that escapes. > > For a - b, we would return NULL. We know for sure that neither value escapes. > > I don't know if we need an uncertain result. But it's clear that the function > right now is not always certain. Given an expression a(b, c), where we don't > know a, we have to assume that both b and c escape. We don't currently have a > return value that indicates that. Or maybe we don't have a return value that > indicates that in a - b neither escapes. Basically there are more cases than we > are returning right now. I added the extra cases you mentioned to resolve_var_reference, but I'm still not sure about needing a multi-part answer here. resolve_var_reference just needs to find the "interesting" part of the expression which is usually the outermost variable being referenced; if it returns NULL, that means we don't have any interest (in terms of escape analysis) in exploring this expression. In the case of 'a-b', neither a nor b are interesting to explore because 'a-b' is not something to be considered towards escape analysis. In expressions like `a[b[i]` or `a(b, c)`, we decide a is of interest and in the context of the analysis code, it should do that correct thing. For `a(b, c)`, that gets handled appropriately in handle_call, where the function being unknown causes b and c to escape. I'm just not sure if resolve_var_reference needs to decide more than whether or not the expression contains a named object of interest in it, which I believe it currently does.
Sign in to reply to this message.
The case I'm most worried about in resolve_var_reference is composite literals. Right now I think it will return NULL for a composite literal. But it seems clear that a composite literal can cause a value to escape. Where do we handle that? That is, in code like var global interface{} func f(a, b *byte) { global = []*byte{a, b} } both a and b escape. How is that recorded?
Sign in to reply to this message.
On 2015/03/09 22:34:38, iant wrote: > The case I'm most worried about in resolve_var_reference is composite literals. > Right now I think it will return NULL for a composite literal. But it seems > clear that a composite literal can cause a value to escape. Where do we handle > that? > > That is, in code like > > var global interface{} > > func f(a, b *byte) { > global = []*byte{a, b} > } > > both a and b escape. How is that recorded? Acknowledged; composite literals aren't being handled at all currently. I think composite literals need to be handled specially, similarly to how call expressions are handled specially in the current analysis.
Sign in to reply to this message.
On 2015/03/09 22:49:06, cmang wrote: > On 2015/03/09 22:34:38, iant wrote: > > The case I'm most worried about in resolve_var_reference is composite > literals. > > Right now I think it will return NULL for a composite literal. But it seems > > clear that a composite literal can cause a value to escape. Where do we > handle > > that? > > > > That is, in code like > > > > var global interface{} > > > > func f(a, b *byte) { > > global = []*byte{a, b} > > } > > > > both a and b escape. How is that recorded? > > Acknowledged; composite literals aren't being handled at all currently. I think > composite literals need to be handled specially, similarly to how call > expressions are handled specially in the current analysis. PTAL: All tests now pass. Analysis is now conservative about calls to functions/methods defined in a foreign package and builtins until escape analysis info is added to the export data. Analysis now handles composite literals and various combinations of embedded calls and composite literals. For example, there were failing tests in net/http because expressions of the form &Type{Field: package.Func(&Type2{Field2: var})} were not recursively analyzed to discover that var escapes via a composite literal argument to an unknown function. Currently trying to figure out how to analyze how effective this optimization is.
Sign in to reply to this message.
LGTM Let's get this in and see what happens next.
Sign in to reply to this message.
*** Submitted as https://code.google.com/p/gofrontend/source/detail?r=b0d3b9400c63 *** compiler: Basic escape analysis for the go frontend. This is an implementation of the algorithm described in "Escape Analysis in Java" by Choi et. al. It relies on dataflow information to discover variable references to one another. Handles assignments to closures and association between closures variables and the variables of the enclosing scope. Dataflow analysis does not discover references through range statements e.g. for _, v := range a will not recognize that all values of v are references to a. LGTM=iant R=iant, minux CC=gofrontend-dev https://codereview.appspot.com/204110043 Committer: Ian Lance Taylor <iant@golang.org>
Sign in to reply to this message.
On 2015/03/12 22:47:33, iant wrote: > LGTM > > Let's get this in and see what happens next. Ah, didn't think you LGTM this already. This still doesn't stack allocate variables that don't escape; I'll add that next and get some statistics on how effective it is.
Sign in to reply to this message.
|