Some initial design comments on this approach. http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrection.h File include/clang/Sema/TypoCorrection.h (right): http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrection.h#newcode25 include/clang/Sema/TypoCorrection.h:25: typedef UnresolvedSet<4> ...
13 years, 3 months ago
(2011-07-16 01:33:58 UTC)
#1
Some initial design comments on this approach.
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrection.h
File include/clang/Sema/TypoCorrection.h (right):
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrectio...
include/clang/Sema/TypoCorrection.h:25: typedef UnresolvedSet<4>
CorrectionDeclSet;
Why the typedef? Doesn't seem all that onerous to type out.
Also, why UnresolvedSet? That stores DeclAccessPairs, and it doesn't seem like
you're using the access side of it.
I would just use llvm::SmallVector<NamedDecl *, 1>
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrectio...
include/clang/Sema/TypoCorrection.h:33: IsKeyword(false) {
Why not set the Decl* to be a KeywordDecl as before? That seemed like a good
solution, and it keeps everything simple. It also ensures that a non-empty
vector of Decls does correspond to a viable result, keyword or otherwise.
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrectio...
include/clang/Sema/TypoCorrection.h:48: IsKeyword(false) {
Why not just initialize this to 'Name == KeywordDecl()'?
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrectio...
include/clang/Sema/TypoCorrection.h:52: CorrectionDecls = new CorrectionDeclSet;
Why heap allocate this? UnresolvedSet is essentially a SmallVector; it already
has a small number of elements with inline storage and the remainder of the
storage goes on the heap.
What might be best is to use UnresolvedSet<1> so that we optimize for the common
case of exactly one decl, and put them on the heap when there are more.
13 years, 3 months ago
(2011-07-18 18:36:41 UTC)
#2
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrection.h
File include/clang/Sema/TypoCorrection.h (right):
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrectio...
include/clang/Sema/TypoCorrection.h:25: typedef UnresolvedSet<4>
CorrectionDeclSet;
On 2011/07/16 01:33:58, chandlerc wrote:
> Why the typedef? Doesn't seem all that onerous to type out.
>
> Also, why UnresolvedSet? That stores DeclAccessPairs, and it doesn't seem like
> you're using the access side of it.
>
> I would just use llvm::SmallVector<NamedDecl *, 1>
The typedef was to reduce the chances of mismatch since it was used in more than
one file and I'd already changed the type a couple of times. And I used
UnresolvedSet because the functions I was looking at for doing overload
resolution used them--Matt pointed me at Sema::isExprCallable which takes an
UnresolvedSet, and I found the OverloadCandidateSet stuff that seems to do what
I needed, with Sema::AddFunctionCandidates populating an OverloadCandidateSet
from an UnresolvedSet. I figured it was better to just use UnresolvedSet in the
TypoCorrection than to loop through the set of Decls to either add them to an
UnresolvedSet to feed into the OverloadCandidateSet or to explicitly convert
them to DeclAccessPairs and add them individually to the OverloadCandidateSet
(which would require me to specify an access).
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrectio...
include/clang/Sema/TypoCorrection.h:33: IsKeyword(false) {
On 2011/07/16 01:33:58, chandlerc wrote:
> Why not set the Decl* to be a KeywordDecl as before? That seemed like a good
> solution, and it keeps everything simple. It also ensures that a non-empty
> vector of Decls does correspond to a viable result, keyword or otherwise.
That raises the potential for other complexities, such as always having to dip
into the vector to determine if the result is a keyword, and to not always have
the set of Decls be a set of valid pointers. And while it may not be possible
from a language standpoint, very bad things could happen if the "special"
KeywordDecl was in the vector with regular Decl pointers (unless there was
additional logic to avoid that happening, in which case we're pretty much back
where we started).
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrectio...
include/clang/Sema/TypoCorrection.h:48: IsKeyword(false) {
On 2011/07/16 01:33:58, chandlerc wrote:
> Why not just initialize this to 'Name == KeywordDecl()'?
Wasn't thinking. ;)
http://codereview.appspot.com/4747047/diff/1/include/clang/Sema/TypoCorrectio...
include/clang/Sema/TypoCorrection.h:52: CorrectionDecls = new CorrectionDeclSet;
On 2011/07/16 01:33:58, chandlerc wrote:
> Why heap allocate this? UnresolvedSet is essentially a SmallVector; it already
> has a small number of elements with inline storage and the remainder of the
> storage goes on the heap.
>
> What might be best is to use UnresolvedSet<1> so that we optimize for the
common
> case of exactly one decl, and put them on the heap when there are more.
I think I mentioned in my email to cfe-commits that I'd realized it would
probably be better to not dynamically allocate the UnresolvedSet. I just wanted
confirmation before changing it (just in case I was mistaken; path of least
resistence/work even), so I'll go ahead and remove the "new"s now, and make the
change to UnresolvedSet<1> too.
I've updated the patchset though I'm going to wait until it's a bit more final ...
13 years, 3 months ago
(2011-07-19 17:52:24 UTC)
#4
I've updated the patchset though I'm going to wait until it's a bit more final
before remailing it to cfe-commits again.
http://codereview.appspot.com/4747047/diff/5001/include/clang/Sema/TypoCorrec...
File include/clang/Sema/TypoCorrection.h (right):
http://codereview.appspot.com/4747047/diff/5001/include/clang/Sema/TypoCorrec...
include/clang/Sema/TypoCorrection.h:83: void setCorrectionDecl(NamedDecl
*CDecl){
On 2011/07/19 08:15:05, chandlerc wrote:
> space before {
Oops. Fixed.
http://codereview.appspot.com/4747047/diff/5001/include/clang/Sema/TypoCorrec...
include/clang/Sema/TypoCorrection.h:100: *(CorrectionDecls.begin()) ==
KeywordDecl();
On 2011/07/19 08:15:05, chandlerc wrote:
> .front() is designed to simplify code like this.
Didn't think about trying to use .front() because it doesn't show up in the
"complete" list of members for SmallPtrSet in the doxygen-generated API docs
(http://llvm.org/doxygen/classllvm_1_1SmallPtrSet-members.html).
http://codereview.appspot.com/4747047/diff/5001/include/clang/Sema/TypoCorrec...
include/clang/Sema/TypoCorrection.h:110: const llvm::SmallPtrSet<NamedDecl*, 1>
&getCorrectionDecls() const { return CorrectionDecls; }
On 2011/07/19 08:15:05, chandlerc wrote:
> 80-columns please
Damned search and replace screwing up my formatting! ;)
http://codereview.appspot.com/4747047/diff/5001/include/clang/Sema/TypoCorrec...
include/clang/Sema/TypoCorrection.h:123: llvm::SmallPtrSet<NamedDecl*, 1>
CorrectionDecls;
On 2011/07/19 08:15:05, chandlerc wrote:
> Do we actually need a uniquing container here?
Something akin to habit to use SmallPtrSet for a set/list of pointers where
ordering is not important. I'm assuming you bring this up because of a slight
performance penalty of using SmallPtrSet instead of SmallVector...
http://codereview.appspot.com/4747047/diff/5001/lib/Sema/SemaExpr.cpp
File lib/Sema/SemaExpr.cpp (right):
http://codereview.appspot.com/4747047/diff/5001/lib/Sema/SemaExpr.cpp#newcode...
lib/Sema/SemaExpr.cpp:1415: AddOverloadCandidate(cast<FunctionDecl>(*CD),
On 2011/07/19 08:15:05, chandlerc wrote:
> I have to admit I'm a bit surprised we never end up with a non-FunctionDecl
> here. It might be worth a comment or an explicit assert with a string
> documenting the precondition being violated.
Well, when I was storing the NamedDecls in an UnresolvedSet, the APIs I was
using didn't (seem to) require me to check that the ND was a FunctionDecl much
less cast it. And as far as I'm aware, variables can't be overridden so
Corrected.isOverloaded() would only be true for FunctionDecls (especially since
typo corrections for e.g. C++ member functions don't hit this code path). I've
added an assertion just in case there is ever a situation where my understanding
is proven incorrect.
http://codereview.appspot.com/4747047/diff/10001/lib/Sema/SemaLookup.cpp File lib/Sema/SemaLookup.cpp (right): http://codereview.appspot.com/4747047/diff/10001/lib/Sema/SemaLookup.cpp#newcode3693 lib/Sema/SemaLookup.cpp:3693: // FIXME: This sets the CorrectionDecl to NULL for ...
13 years, 3 months ago
(2011-07-27 23:33:55 UTC)
#6
http://codereview.appspot.com/4747047/diff/10001/lib/Sema/SemaLookup.cpp
File lib/Sema/SemaLookup.cpp (right):
http://codereview.appspot.com/4747047/diff/10001/lib/Sema/SemaLookup.cpp#newc...
lib/Sema/SemaLookup.cpp:3693: // FIXME: This sets the CorrectionDecl to NULL for
overloaded functions.
On 2011/07/27 14:01:00, hwennborg wrote:
> This fixme could be removed now.
Good catch. Thanks :)
http://codereview.appspot.com/4747047/diff/10001/lib/Sema/SemaLookup.cpp#newc...
lib/Sema/SemaLookup.cpp:3824: if (!CDecl) return;
On 2011/07/27 14:01:00, hwennborg wrote:
> I'm probably missing something, but why would one call addCorrectionDecl with
a
> NULL CDecl?
setCorrectionDecl currently calls addCorrectionDecl after clearing the list of
decls, so it keeps setCorrectionDecl(NULL) (which clears the correction decls in
the TypoCorrection) from breaking things. I could move the NULL check to within
setCorrectionDecl but it opens the path for accidental breakage in other places
where addCorrectionDecl is called, such as when iterating through the decls in a
LookupResult. It's both safer and simpler to make addCorrectionDecl do nothing
if passed a NULL decl than to force all callers to guarantee they won't ever
pass NULL to addCorrectionDecl. ;)
13 years, 3 months ago
(2011-08-01 18:22:10 UTC)
#8
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/Sema.h
File include/clang/Sema/Sema.h (right):
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/Sema.h#ne...
include/clang/Sema/Sema.h:2209: Expr **Args = NULL, unsigned NumArgs = 0);
On 2011/08/01 17:43:32, chandlerc wrote:
> Use '0' for null pointers rather than NULL.
Ewww! Grudgingly changed.
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
File include/clang/Sema/TypoCorrection.h (right):
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
include/clang/Sema/TypoCorrection.h:30: CorrectionDecls(),
On 2011/08/01 17:43:32, chandlerc wrote:
> No need to list this constructor.
According to the compiler I do. When I remove it I get build errors about "no
matching function for call to 'clang::TypoCorrection::TypoCorrection()'"
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
include/clang/Sema/TypoCorrection.h:76: return hasCorrectionDecl() ?
*(CorrectionDecls.begin()) : NULL;
On 2011/08/01 17:43:32, chandlerc wrote:
> s/NULL/0/
*sigh*... changed
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
include/clang/Sema/TypoCorrection.h:97: static inline NamedDecl *KeywordDecl() {
return (NamedDecl*)~0x3; }
On 2011/08/01 17:43:32, chandlerc wrote:
> Why is this a static method instead of a static constant? Also, it needs a
nice
> healthy comment to explain whats going on here.
It's a static inline method as it was the only way I could come up with for
making it an inline constant without the compiler complaining about an "invalid
in-class initialization of static data member of non-integral type", other than
doing:
#define KEYWORD_DECL ((NamedDecl*)~0x3)
>
> I'm not actually convinced this is safe... is this pointer aligned properly?
The reason it's ~0x3 instead of -1 is not about alignment so much as it is about
making sure the low few bits aren't set in case it is used in llvm data
structures that store information about pointers in the low few bits, such as
PointerUnion or UnresolvedSet.
>
> It seems like maybe what you want here is a wrapper around NamedDecl*,
> potentially through a PointerUnion, but maybe not. I'd be interested if Doug
has
> any ideas about how best to structure this.
A wrapper like PointerUnion seems a bit heavyweight for dealing with the single
special case of storing something in a SmallVector<NamedDecl*> when the
correction is a keyword, for which there never actually any NamedDecl.
http://codereview.appspot.com/4747047/diff/10001/lib/Sema/SemaExpr.cpp
File lib/Sema/SemaExpr.cpp (right):
http://codereview.appspot.com/4747047/diff/10001/lib/Sema/SemaExpr.cpp#newcod...
lib/Sema/SemaExpr.cpp:1423: case OR_Ambiguous:
On 2011/08/01 17:43:32, chandlerc wrote:
> ? Ambiguous? Might be worth a comment.
Actually, I looked at the definition of
OverloadCandidateSet::BestViableFunction, and handling OR_Ambiguous like
OR_Success was a mistake on my part.
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorrection.h File include/clang/Sema/TypoCorrection.h (right): http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorrection.h#newcode30 include/clang/Sema/TypoCorrection.h:30: CorrectionDecls(), On 2011/08/01 18:22:10, rikka wrote: > On 2011/08/01 ...
13 years, 3 months ago
(2011-08-01 22:51:35 UTC)
#9
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
File include/clang/Sema/TypoCorrection.h (right):
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
include/clang/Sema/TypoCorrection.h:30: CorrectionDecls(),
On 2011/08/01 18:22:10, rikka wrote:
> On 2011/08/01 17:43:32, chandlerc wrote:
> > No need to list this constructor.
>
> According to the compiler I do. When I remove it I get build errors about "no
> matching function for call to 'clang::TypoCorrection::TypoCorrection()'"
>
I just meant line 30... the CorrectionDecls member shouldn't need to be called
out in the initializer list.
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
include/clang/Sema/TypoCorrection.h:97: static inline NamedDecl *KeywordDecl() {
return (NamedDecl*)~0x3; }
On 2011/08/01 18:22:10, rikka wrote:
> On 2011/08/01 17:43:32, chandlerc wrote:
> > Why is this a static method instead of a static constant? Also, it needs a
> nice
> > healthy comment to explain whats going on here.
>
> It's a static inline method as it was the only way I could come up with for
> making it an inline constant without the compiler complaining about an
"invalid
> in-class initialization of static data member of non-integral type", other
than
> doing:
> #define KEYWORD_DECL ((NamedDecl*)~0x3)
>
> >
> > I'm not actually convinced this is safe... is this pointer aligned properly?
>
> The reason it's ~0x3 instead of -1 is not about alignment so much as it is
about
> making sure the low few bits aren't set in case it is used in llvm data
> structures that store information about pointers in the low few bits, such as
> PointerUnion or UnresolvedSet.
Yes, but what about alignment? And aliasing rules? We can't just make up an
address and call it a NamedDecl.
We could potentially use a NULL pointer if there is no other context where a
NULL pointer gets inserted into the small vector. Would that work instead?
>
> >
> > It seems like maybe what you want here is a wrapper around NamedDecl*,
> > potentially through a PointerUnion, but maybe not. I'd be interested if Doug
> has
> > any ideas about how best to structure this.
>
> A wrapper like PointerUnion seems a bit heavyweight for dealing with the
single
> special case of storing something in a SmallVector<NamedDecl*> when the
> correction is a keyword, for which there never actually any NamedDecl.
See above; my concern is with the C++ semantics for this. We need a construct
that is reasonably well defined according to the standard.
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorrection.h File include/clang/Sema/TypoCorrection.h (right): http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorrection.h#newcode30 include/clang/Sema/TypoCorrection.h:30: CorrectionDecls(), On 2011/08/01 22:51:36, chandlerc wrote: > On 2011/08/01 ...
13 years, 3 months ago
(2011-08-01 23:27:19 UTC)
#10
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
File include/clang/Sema/TypoCorrection.h (right):
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
include/clang/Sema/TypoCorrection.h:30: CorrectionDecls(),
On 2011/08/01 22:51:36, chandlerc wrote:
> On 2011/08/01 18:22:10, rikka wrote:
> > On 2011/08/01 17:43:32, chandlerc wrote:
> > > No need to list this constructor.
> >
> > According to the compiler I do. When I remove it I get build errors about
"no
> > matching function for call to 'clang::TypoCorrection::TypoCorrection()'"
> >
>
> I just meant line 30... the CorrectionDecls member shouldn't need to be called
> out in the initializer list.
I totally misread the context of the comment in the email sent out. Removed the
"CorrectionDecls()" calls.
http://codereview.appspot.com/4747047/diff/10001/include/clang/Sema/TypoCorre...
include/clang/Sema/TypoCorrection.h:97: static inline NamedDecl *KeywordDecl() {
return (NamedDecl*)~0x3; }
On 2011/08/01 22:51:36, chandlerc wrote:
> On 2011/08/01 18:22:10, rikka wrote:
> > On 2011/08/01 17:43:32, chandlerc wrote:
> > > Why is this a static method instead of a static constant? Also, it needs a
> > nice
> > > healthy comment to explain whats going on here.
> >
> > It's a static inline method as it was the only way I could come up with for
> > making it an inline constant without the compiler complaining about an
> "invalid
> > in-class initialization of static data member of non-integral type", other
> than
> > doing:
> > #define KEYWORD_DECL ((NamedDecl*)~0x3)
> >
> > >
> > > I'm not actually convinced this is safe... is this pointer aligned
properly?
> >
> > The reason it's ~0x3 instead of -1 is not about alignment so much as it is
> about
> > making sure the low few bits aren't set in case it is used in llvm data
> > structures that store information about pointers in the low few bits, such
as
> > PointerUnion or UnresolvedSet.
>
> Yes, but what about alignment? And aliasing rules? We can't just make up an
> address and call it a NamedDecl.
>
> We could potentially use a NULL pointer if there is no other context where a
> NULL pointer gets inserted into the small vector. Would that work instead?
>
> >
> > >
> > > It seems like maybe what you want here is a wrapper around NamedDecl*,
> > > potentially through a PointerUnion, but maybe not. I'd be interested if
Doug
> > has
> > > any ideas about how best to structure this.
> >
> > A wrapper like PointerUnion seems a bit heavyweight for dealing with the
> single
> > special case of storing something in a SmallVector<NamedDecl*> when the
> > correction is a keyword, for which there never actually any NamedDecl.
>
> See above; my concern is with the C++ semantics for this. We need a construct
> that is reasonably well defined according to the standard.
Actually, you're right. Now that it is storing a list of CorrectionDecl pointers
instead of a single pointer, using NULL for the keyword decl should be just
fine. Just have to fix up the code a bit to distinguish between a TypoCorrection
being initialized without a valid CorrectionDecl and when the new TypoCorrection
is a keyword.
The current patch has been submitted as r136807, with a separate patch adding support for ...
13 years, 3 months ago
(2011-08-03 20:38:29 UTC)
#11
The current patch has been submitted as r136807, with a separate patch adding
support for resolving overloaded template functions and using declarations
coming soon.
Issue 4747047: Better handle overloaded functions in typo correction
(Closed)
Created 13 years, 3 months ago by Kaelyn
Modified 13 years, 3 months ago
Reviewers: chandlerc, hwennborg
Base URL:
Comments: 36