|
|
Descriptionruntime: record type information for hashtable internal structures.
Remove all hashtable-specific GC code.
Fixes bug 6119.
Patch Set 1 #Patch Set 2 : diff -r cddeaea05b11 https://khr%40golang.org@code.google.com/p/go/ #Patch Set 3 : diff -r da8cd843043b https://khr%40golang.org@code.google.com/p/go/ #Patch Set 4 : diff -r da8cd843043b https://khr%40golang.org@code.google.com/p/go/ #
Total comments: 21
Patch Set 5 : diff -r da8cd843043b https://khr%40golang.org@code.google.com/p/go/ #
Total comments: 3
Patch Set 6 : diff -r da8cd843043b https://khr%40golang.org@code.google.com/p/go/ #Patch Set 7 : diff -r da8cd843043b https://khr%40golang.org@code.google.com/p/go/ #
Total comments: 4
Patch Set 8 : diff -r da8cd843043b https://khr%40golang.org@code.google.com/p/go/ #
Total comments: 3
Patch Set 9 : diff -r f795546f1c55 https://khr%40golang.org@code.google.com/p/go/ #
MessagesTotal messages: 25
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://khr%40golang.org@code.google.com/p/go/
Sign in to reply to this message.
minor comments, the compiler part looks good to me https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c File src/cmd/gc/reflect.c (right): https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcod... src/cmd/gc/reflect.c:104: // Builds a type respresenting a Bucket structure for A description of the bucket contents, or a pointer to a description of the bucket contents, would be great. https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcod... src/cmd/gc/reflect.c:106: // we include only enough information to generate a correct GC GC gc? Should this be one word or two? https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcod... src/cmd/gc/reflect.c:116: int32 bucketsize = 8; Is initializing a variable when its declared permitted by the prevailing coding conventions? https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcod... src/cmd/gc/reflect.c:133: overflowfield->width = bucketsize; // "width" is offset in structure can we make some note about the field that is not shown (the one that induces the bucketsize offset) https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcod... src/cmd/gc/reflect.c:142: keysfield->width = bucketsize + widthptr; It would be great if these width computations could be driven by the width values they are offset from, rather than being composed of the elementary size values. for example, this would be something like overflowfield->width + overflowfield->type->width.
Sign in to reply to this message.
https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c File src/cmd/gc/reflect.c (right): https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcod... src/cmd/gc/reflect.c:116: int32 bucketsize = 8; On 2013/08/21 23:37:44, cshapiro1 wrote: > Is initializing a variable when its declared permitted by the prevailing coding > conventions? yes, please don't do this for constants we usually use enums https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcod... src/cmd/gc/reflect.c:122: if(keytype->width > maxkeysize) { drop {} we generally do not use {} for if/for/while with one-line body https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcod... src/cmd/gc/reflect.c:125: if(valtype->width > maxvalsize) { same https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c File src/pkg/runtime/hashmap.c (right): https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:85: // Note: the format of the Bucket is encoded in ../../cmd/gc/reflect.c:mapbucket(). please move this to line 77 https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:259: // Note: the array really does have pointers, but we tell the gc about s/gc/GC/ gc is the compiler https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:261: // the gc from scanning the bucket array itself. Fixes issue 6119. s/gc/GC/ https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:263: // we set the proper type and erase the FlagNoPointers mark so gc will s/gc/GC/ https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:459: new_buckets = runtime·mallocgc((uintptr)h->bucketsize << (h->B + 1), 0, FlagNoZero|FlagNoPointers); i afraid it becomes more complicated when you start attaching types to blocks consider that you do not occupy all buckets in the block before next hash_grow or just have some spare memory after the buckets (physical memory block may be larger than you ask) that non-occupied memory can contain old pointers (1) at the very best it will cause block rescanning in GC (2) worse scenario is memory leaks (3) the nightmare scenario is GC uses wrong types to scan the pointers now that you have types attached it all can happen similarly you use FlagNoZero when allocating satellite key/data also clearly can cause at least (1) and (2) i am not sure about (3), current GC algorithm is incomprehensible https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:680: kmem = runtime·mallocgc(t->key->size, (uintptr)t->key, FlagNoZero); see the comment where you allocate new_buckets with FlagNoZero https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:686: vmem = runtime·mallocgc(t->elem->size, (uintptr)t->elem, FlagNoZero); see the comment where you allocate new_buckets with FlagNoZero https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:732: *(byte**)k = nil; are you sure this writes nil into correct place? i would expect nil to be written into *(byte**)k before k is updated on the previous line https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... src/pkg/runtime/hashmap.c:738: *(byte**)v = nil; ditto https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc File src/pkg/runtime/malloc.goc (right): https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#... src/pkg/runtime/malloc.goc:221: runtime·free_to_gc(void *v, uintptr size, uintptr typ) { please do not use underscores in name "free" is confusing in the name, it does not free and object is usable after this functions, also it was available for GC before this functions I think update/change/attachtype would be better https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#... src/pkg/runtime/malloc.goc:221: runtime·free_to_gc(void *v, uintptr size, uintptr typ) { open brace on the next line for C functions https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#... src/pkg/runtime/malloc.goc:224: if(!UseSpanType) you need to reset FlagNoPointers even is !UseSpanType otherwise GC will corrupt heap https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#... src/pkg/runtime/malloc.goc:240: m->locks++; you need to disable preemption around the whole function otherwise if g is descheduled in settype_flush, another g on this m will overflow settype_buf
Sign in to reply to this message.
On Thu, Aug 22, 2013 at 5:12 AM, <dvyukov@google.com> wrote: > > https://codereview.appspot.**com/13078044/diff/8001/src/**cmd/gc/reflect.c<ht... > File src/cmd/gc/reflect.c (right): > > https://codereview.appspot.**com/13078044/diff/8001/src/** > cmd/gc/reflect.c#newcode116<https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcode116> > src/cmd/gc/reflect.c:116: int32 bucketsize = 8; > On 2013/08/21 23:37:44, cshapiro1 wrote: > >> Is initializing a variable when its declared permitted by the >> > prevailing coding > >> conventions? >> > > yes, please don't do this > for constants we usually use enums > > https://codereview.appspot.**com/13078044/diff/8001/src/** > cmd/gc/reflect.c#newcode122<https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcode122> > src/cmd/gc/reflect.c:122: if(keytype->width > maxkeysize) { > drop {} > we generally do not use {} for if/for/while with one-line body > > https://codereview.appspot.**com/13078044/diff/8001/src/** > cmd/gc/reflect.c#newcode125<https://codereview.appspot.com/13078044/diff/8001/src/cmd/gc/reflect.c#newcode125> > src/cmd/gc/reflect.c:125: if(valtype->width > maxvalsize) { > same > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c> > File src/pkg/runtime/hashmap.c (right): > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode85<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode85> > src/pkg/runtime/hashmap.c:85: // Note: the format of the Bucket is > encoded in ../../cmd/gc/reflect.c:**mapbucket(). > please move this to line 77 > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode259<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode259> > src/pkg/runtime/hashmap.c:259: // Note: the array really does have > pointers, but we tell the gc about > s/gc/GC/ > gc is the compiler > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode261<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode261> > src/pkg/runtime/hashmap.c:261: // the gc from scanning the bucket array > itself. Fixes issue 6119. > s/gc/GC/ > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode263<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode263> > src/pkg/runtime/hashmap.c:263: // we set the proper type and erase the > FlagNoPointers mark so gc will > s/gc/GC/ > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode459<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode459> > src/pkg/runtime/hashmap.c:459: new_buckets = > runtime·mallocgc((uintptr)h->**bucketsize << (h->B + 1), 0, > FlagNoZero|FlagNoPointers); > i afraid it becomes more complicated when you start attaching types to > blocks > consider that you do not occupy all buckets in the block before next > hash_grow or just have some spare memory after the buckets (physical > memory block may be larger than you ask) > that non-occupied memory can contain old pointers > (1) at the very best it will cause block rescanning in GC > (2) worse scenario is memory leaks > (3) the nightmare scenario is GC uses wrong types to scan the pointers > now that you have types attached it all can happen > > similarly you use FlagNoZero when allocating satellite key/data > also clearly can cause at least (1) and (2) > i am not sure about (3), current GC algorithm is incomprehensible > > The algorithm guarantees that all the buckets get initialized before the next grow (every insert initializes at least one bucket, and the number of inserts needed to trigger a grow is bigger than the number of buckets). So that's not a problem. But the extra space worries me. How can FlagNoZero (without FlagNoPointers or FlagNoGC) ever be correct, then? Maybe the allocator needs to zero any space between the requested size and the rounded-up size. > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode680<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode680> > src/pkg/runtime/hashmap.c:680: kmem = runtime·mallocgc(t->key->size, > (uintptr)t->key, FlagNoZero); > see the comment where you allocate new_buckets with FlagNoZero > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode686<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode686> > src/pkg/runtime/hashmap.c:686: vmem = runtime·mallocgc(t->elem->**size, > (uintptr)t->elem, FlagNoZero); > see the comment where you allocate new_buckets with FlagNoZero > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode732<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode732> > src/pkg/runtime/hashmap.c:732: *(byte**)k = nil; > are you sure this writes nil into correct place? > i would expect nil to be written into *(byte**)k before k is updated on > the previous line > Yes, I had that wrong. Fixed. > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/hashmap.c#**newcode738<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode738> > src/pkg/runtime/hashmap.c:738: *(byte**)v = nil; > ditto > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/malloc.goc<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc> > File src/pkg/runtime/malloc.goc (right): > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/malloc.goc#**newcode221<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#newcode221> > src/pkg/runtime/malloc.goc:**221: runtime·free_to_gc(void *v, uintptr > size, uintptr typ) { > please do not use underscores in name > "free" is confusing in the name, it does not free and object is usable > after this functions, also it was available for GC before this functions > I think update/change/attachtype would be better > How about turnonscanning? > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/malloc.goc#**newcode221<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#newcode221> > src/pkg/runtime/malloc.goc:**221: runtime·free_to_gc(void *v, uintptr > size, uintptr typ) { > open brace on the next line for C functions > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/malloc.goc#**newcode224<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#newcode224> > src/pkg/runtime/malloc.goc:**224: if(!UseSpanType) > you need to reset FlagNoPointers even is !UseSpanType > otherwise GC will corrupt heap > > https://codereview.appspot.**com/13078044/diff/8001/src/** > pkg/runtime/malloc.goc#**newcode240<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#newcode240> > src/pkg/runtime/malloc.goc:**240: m->locks++; > you need to disable preemption around the whole function > otherwise if g is descheduled in settype_flush, another g on this m will > overflow settype_buf > > https://codereview.appspot.**com/13078044/<https://codereview.appspot.com/130... > I think I've addressed all the other points from both of you.
Sign in to reply to this message.
changes look good https://codereview.appspot.com/13078044/diff/15001/src/cmd/gc/reflect.c File src/cmd/gc/reflect.c (right): https://codereview.appspot.com/13078044/diff/15001/src/cmd/gc/reflect.c#newco... src/cmd/gc/reflect.c:123: // constants from ../../pkg/runtime/hashmap.c Shouldn't this comment be moved above line 110 now? (Or maybe even removed, since the same information is on line 108.)
Sign in to reply to this message.
https://codereview.appspot.com/13078044/diff/15001/src/pkg/runtime/hashmap.c File src/pkg/runtime/hashmap.c (right): https://codereview.appspot.com/13078044/diff/15001/src/pkg/runtime/hashmap.c#... src/pkg/runtime/hashmap.c:730: if((h->flags & IndirectKey) != 0) { here you use &IndirectKey, but below you use &CanFreeKey is it the same? or is it a bug? please either make them the same, or add a comment, or something else to make it non confusing https://codereview.appspot.com/13078044/diff/15001/src/pkg/runtime/malloc.goc File src/pkg/runtime/malloc.goc (right): https://codereview.appspot.com/13078044/diff/15001/src/pkg/runtime/malloc.goc... src/pkg/runtime/malloc.goc:240: if(m->settype_bufsize == nelem(m->settype_buf)) { please drop {}
Sign in to reply to this message.
>
pkg/runtime/malloc.goc#**newcode221<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/malloc.goc#newcode221>
> > src/pkg/runtime/malloc.goc:**221: runtime·free_to_gc(void *v, uintptr
> > size, uintptr typ) {
> > please do not use underscores in name
> > "free" is confusing in the name, it does not free and object is usable
> > after this functions, also it was available for GC before this functions
> > I think update/change/attachtype would be better
> >
>
> How about turnonscanning?
sounds better
Sign in to reply to this message.
> pkg/runtime/hashmap.c#**newcode459<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode459> > > src/pkg/runtime/hashmap.c:459: new_buckets = > > runtime·mallocgc((uintptr)h->**bucketsize << (h->B + 1), 0, > > FlagNoZero|FlagNoPointers); > > i afraid it becomes more complicated when you start attaching types to > > blocks > > consider that you do not occupy all buckets in the block before next > > hash_grow or just have some spare memory after the buckets (physical > > memory block may be larger than you ask) > > that non-occupied memory can contain old pointers > > (1) at the very best it will cause block rescanning in GC > > (2) worse scenario is memory leaks > > (3) the nightmare scenario is GC uses wrong types to scan the pointers > > now that you have types attached it all can happen > > > > similarly you use FlagNoZero when allocating satellite key/data > > also clearly can cause at least (1) and (2) > > i am not sure about (3), current GC algorithm is incomprehensible > > > > > The algorithm guarantees that all the buckets get initialized before the > next grow (every insert initializes at least one bucket, and the number of > inserts needed to trigger a grow is bigger than the number of buckets). So > that's not a problem. OK > But the extra space worries me. How can FlagNoZero > (without FlagNoPointers or FlagNoGC) ever be correct, then? Maybe the > allocator needs to zero any space between the requested size and the > rounded-up size. This is a very good question. Problem is not only with the extra space. Now you attach types to satellite keys/values, allocate them with FlagNoZero and then copy key/value in soon after. But it happens not fast enough to make it invisible to GC. If GC is triggered in between allocation and copy, it will scan garbage as typed memory and destroy heap. Now that you attach types to buckets and keys/values, and as the consequence you also zero removed keys/values and zero buckets, there seems to be no reasons to not make the final step -- allocate new buckets w/o FlagNoZero (effectively this is done anyway, but incrementally), allocate satellite keys/values w/o FlagNoZero (this needs to be done anyway to fix the bug), attach proper type to Hmap, and finally remove all that special code from GC, gciter, turnonscanning, etc. Then GC will consider hashmaps as just independent set of typed blocks. What do you think?
Sign in to reply to this message.
On Sat, Aug 24, 2013 at 6:44 AM, <dvyukov@google.com> wrote: > > pkg/runtime/hashmap.c#****newcode459<https://codereview.** > appspot.com/13078044/diff/**8001/src/pkg/runtime/hashmap.**c#newcode459<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode459> > > > >> > src/pkg/runtime/hashmap.c:459: new_buckets = >> > runtime·mallocgc((uintptr)h->****bucketsize << (h->B + 1), 0, >> >> > FlagNoZero|FlagNoPointers); >> > i afraid it becomes more complicated when you start attaching types >> > to > >> > blocks >> > consider that you do not occupy all buckets in the block before next >> > hash_grow or just have some spare memory after the buckets (physical >> > memory block may be larger than you ask) >> > that non-occupied memory can contain old pointers >> > (1) at the very best it will cause block rescanning in GC >> > (2) worse scenario is memory leaks >> > (3) the nightmare scenario is GC uses wrong types to scan the >> > pointers > >> > now that you have types attached it all can happen >> > >> > similarly you use FlagNoZero when allocating satellite key/data >> > also clearly can cause at least (1) and (2) >> > i am not sure about (3), current GC algorithm is incomprehensible >> > >> > >> The algorithm guarantees that all the buckets get initialized before >> > the > >> next grow (every insert initializes at least one bucket, and the >> > number of > >> inserts needed to trigger a grow is bigger than the number of >> > buckets). So > >> that's not a problem. >> > > OK > > > But the extra space worries me. How can FlagNoZero >> (without FlagNoPointers or FlagNoGC) ever be correct, then? Maybe the >> allocator needs to zero any space between the requested size and the >> rounded-up size. >> > > This is a very good question. > Problem is not only with the extra space. Now you attach types to > satellite keys/values, allocate them with FlagNoZero and then copy > key/value in soon after. But it happens not fast enough to make it > invisible to GC. If GC is triggered in between allocation and copy, it > will scan garbage as typed memory and destroy heap. > > Now that you attach types to buckets and keys/values, and as the > consequence you also zero removed keys/values and zero buckets, there > seems to be no reasons to not make the final step -- allocate new > buckets w/o FlagNoZero (effectively this is done anyway, but > incrementally), allocate satellite keys/values w/o FlagNoZero (this > needs to be done anyway to fix the bug), attach proper type to Hmap, and > finally remove all that special code from GC, gciter, turnonscanning, > etc. Then GC will consider hashmaps as just independent set of typed > blocks. > What do you think? > > I was trying to avoid the expense of zeroing a large bucket array all at once. All the rest of the hashtable implementation does only a constant amount of work per op. But yes, maybe we should go that route - a lot of pieces get a whole lot easier. > https://codereview.appspot.**com/13078044/<https://codereview.appspot.com/130... >
Sign in to reply to this message.
On Sat, Aug 24, 2013 at 8:57 PM, Keith Randall <khr@google.com> wrote: pkg/runtime/hashmap.c#**newcode459<https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#newcode459> >>> >>> > src/pkg/runtime/hashmap.c:459: new_buckets = >>> > runtime·mallocgc((uintptr)h->**bucketsize << (h->B + 1), 0, >>> >>> > FlagNoZero|FlagNoPointers); >>> > i afraid it becomes more complicated when you start attaching types >> >> to >>> >>> > blocks >>> > consider that you do not occupy all buckets in the block before next >>> > hash_grow or just have some spare memory after the buckets (physical >>> > memory block may be larger than you ask) >>> > that non-occupied memory can contain old pointers >>> > (1) at the very best it will cause block rescanning in GC >>> > (2) worse scenario is memory leaks >>> > (3) the nightmare scenario is GC uses wrong types to scan the >> >> pointers >>> >>> > now that you have types attached it all can happen >>> > >>> > similarly you use FlagNoZero when allocating satellite key/data >>> > also clearly can cause at least (1) and (2) >>> > i am not sure about (3), current GC algorithm is incomprehensible >>> > >>> > >>> The algorithm guarantees that all the buckets get initialized before >> >> the >>> >>> next grow (every insert initializes at least one bucket, and the >> >> number of >>> >>> inserts needed to trigger a grow is bigger than the number of >> >> buckets). So >>> >>> that's not a problem. >> >> >> OK >> >> >>> But the extra space worries me. How can FlagNoZero >>> (without FlagNoPointers or FlagNoGC) ever be correct, then? Maybe the >>> allocator needs to zero any space between the requested size and the >>> rounded-up size. >> >> >> This is a very good question. >> Problem is not only with the extra space. Now you attach types to >> satellite keys/values, allocate them with FlagNoZero and then copy >> key/value in soon after. But it happens not fast enough to make it >> invisible to GC. If GC is triggered in between allocation and copy, it >> will scan garbage as typed memory and destroy heap. >> >> Now that you attach types to buckets and keys/values, and as the >> consequence you also zero removed keys/values and zero buckets, there >> seems to be no reasons to not make the final step -- allocate new >> buckets w/o FlagNoZero (effectively this is done anyway, but >> incrementally), allocate satellite keys/values w/o FlagNoZero (this >> needs to be done anyway to fix the bug), attach proper type to Hmap, and >> finally remove all that special code from GC, gciter, turnonscanning, >> etc. Then GC will consider hashmaps as just independent set of typed >> blocks. >> What do you think? >> > > I was trying to avoid the expense of zeroing a large bucket array all at > once. All the rest of the hashtable implementation does only a constant > amount of work per op. But yes, maybe we should go that route - a lot of > pieces get a whole lot easier. I would not expect zeroing to be a significant problem. Today machines can zero 5-10-20 GB/s. During Go1.1 release we have lots of very tough issues with GC caused by using points-to type (rather than block's own type). What we have in this CL seems to be the same sort of problems. I don't know whether we've identified all issues or not. Also I suspect that we will need to turn down this special handling of maps and channel in future anyway. It leads to too non-uniform code in GC and also required fat type info. I have a similar CL that removes special handling of channels, but it's not yet time to submit it. I would consider this is a very positive change. If you agree with the plan, then you will need type info for Hmap. Search by "gc_m_ptr", that's how we do it for M.
Sign in to reply to this message.
Ok, take another look. I've removed all of the FlagNoZero flags and the hashmap free calls, made another Type for the Hmap structure, and deleted all the special casing for maps inside the GC. On Sat, Aug 24, 2013 at 10:07 AM, Dmitry Vyukov <dvyukov@google.com> wrote: > On Sat, Aug 24, 2013 at 8:57 PM, Keith Randall <khr@google.com> wrote: > pkg/runtime/hashmap.c#**newcode459< > https://codereview.appspot.com/13078044/diff/8001/src/pkg/runtime/hashmap.c#n... > > > >>> > >>> > src/pkg/runtime/hashmap.c:459: new_buckets = > >>> > runtime·mallocgc((uintptr)h->**bucketsize << (h->B + 1), 0, > >>> > >>> > FlagNoZero|FlagNoPointers); > >>> > i afraid it becomes more complicated when you start attaching types > >> > >> to > >>> > >>> > blocks > >>> > consider that you do not occupy all buckets in the block before next > >>> > hash_grow or just have some spare memory after the buckets (physical > >>> > memory block may be larger than you ask) > >>> > that non-occupied memory can contain old pointers > >>> > (1) at the very best it will cause block rescanning in GC > >>> > (2) worse scenario is memory leaks > >>> > (3) the nightmare scenario is GC uses wrong types to scan the > >> > >> pointers > >>> > >>> > now that you have types attached it all can happen > >>> > > >>> > similarly you use FlagNoZero when allocating satellite key/data > >>> > also clearly can cause at least (1) and (2) > >>> > i am not sure about (3), current GC algorithm is incomprehensible > >>> > > >>> > > >>> The algorithm guarantees that all the buckets get initialized before > >> > >> the > >>> > >>> next grow (every insert initializes at least one bucket, and the > >> > >> number of > >>> > >>> inserts needed to trigger a grow is bigger than the number of > >> > >> buckets). So > >>> > >>> that's not a problem. > >> > >> > >> OK > >> > >> > >>> But the extra space worries me. How can FlagNoZero > >>> (without FlagNoPointers or FlagNoGC) ever be correct, then? Maybe the > >>> allocator needs to zero any space between the requested size and the > >>> rounded-up size. > >> > >> > >> This is a very good question. > >> Problem is not only with the extra space. Now you attach types to > >> satellite keys/values, allocate them with FlagNoZero and then copy > >> key/value in soon after. But it happens not fast enough to make it > >> invisible to GC. If GC is triggered in between allocation and copy, it > >> will scan garbage as typed memory and destroy heap. > >> > >> Now that you attach types to buckets and keys/values, and as the > >> consequence you also zero removed keys/values and zero buckets, there > >> seems to be no reasons to not make the final step -- allocate new > >> buckets w/o FlagNoZero (effectively this is done anyway, but > >> incrementally), allocate satellite keys/values w/o FlagNoZero (this > >> needs to be done anyway to fix the bug), attach proper type to Hmap, and > >> finally remove all that special code from GC, gciter, turnonscanning, > >> etc. Then GC will consider hashmaps as just independent set of typed > >> blocks. > >> What do you think? > >> > > > > I was trying to avoid the expense of zeroing a large bucket array all at > > once. All the rest of the hashtable implementation does only a constant > > amount of work per op. But yes, maybe we should go that route - a lot of > > pieces get a whole lot easier. > > > I would not expect zeroing to be a significant problem. Today machines > can zero 5-10-20 GB/s. > > During Go1.1 release we have lots of very tough issues with GC caused > by using points-to type (rather than block's own type). What we have > in this CL seems to be the same sort of problems. I don't know whether > we've identified all issues or not. > > Also I suspect that we will need to turn down this special handling of > maps and channel in future anyway. It leads to too non-uniform code in > GC and also required fat type info. > I have a similar CL that removes special handling of channels, but > it's not yet time to submit it. > > I would consider this is a very positive change. > > If you agree with the plan, then you will need type info for Hmap. > Search by "gc_m_ptr", that's how we do it for M. >
Sign in to reply to this message.
Thanks for doing this, I liked seeing all of the hash table knowledge disappear from the GC. https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/go.h File src/cmd/gc/go.h (right): https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/go.h#newcode190 src/cmd/gc/go.h:190: Type* bucket; // (internal) type of hash bucket What is conveyed by the parenthesized use of internal in the comment? If it is not too much trouble, can we spell this out and get rid of the parenthesis? (I like the phrasing in type.h if that is okay) https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c File src/cmd/gc/reflect.c (right): https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newco... src/cmd/gc/reflect.c:186: Type *hmap, *bucket; Pardon my C ignorance, but is it legal to have the hmap local variable name in a function called hmap? Are we certain the local variable name will shadow the function name such as down on line 229? (I assume this is okay in at least some contexts because I assume you have compiled and run this.) https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newco... src/cmd/gc/reflect.c:197: offset += 4; // flags Can we use sizeof here instead of bare numbers? https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newco... src/cmd/gc/reflect.c:220: offset += widthint; // nevacuate What does this word mean?
Sign in to reply to this message.
On Tue, Aug 27, 2013 at 1:42 PM, <cshapiro@google.com> wrote: > Thanks for doing this, I liked seeing all of the hash table knowledge > disappear from the GC. > > > https://codereview.appspot.**com/13078044/diff/29001/src/**cmd/gc/go.h<https:... > File src/cmd/gc/go.h (right): > > https://codereview.appspot.**com/13078044/diff/29001/src/** > cmd/gc/go.h#newcode190<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/go.h#newcode190> > src/cmd/gc/go.h:190: Type* bucket; // (internal) type of hash > bucket > What is conveyed by the parenthesized use of internal in the comment? > If it is not too much trouble, can we spell this out and get rid of the > parenthesis? > > (I like the phrasing in type.h if that is okay) > > Sure. > https://codereview.appspot.**com/13078044/diff/29001/src/** > cmd/gc/reflect.c<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c> > File src/cmd/gc/reflect.c (right): > > https://codereview.appspot.**com/13078044/diff/29001/src/** > cmd/gc/reflect.c#newcode186<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newcode186> > src/cmd/gc/reflect.c:186: Type *hmap, *bucket; > Pardon my C ignorance, but is it legal to have the hmap local variable > name in a function called hmap? Are we certain the local variable name > will shadow the function name such as down on line 229? > > (I assume this is okay in at least some contexts because I assume you > have compiled and run this.) > > changed the local variable to h. > https://codereview.appspot.**com/13078044/diff/29001/src/** > cmd/gc/reflect.c#newcode197<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newcode197> > src/cmd/gc/reflect.c:197: offset += 4; // flags > Can we use sizeof here instead of bare numbers? > > I suppose, but "sizeof(uint32)" is just a verbose name for "4". Ideally "sizeof(flags)" would be nice but not feasible, as that's the host size not the target size. > https://codereview.appspot.**com/13078044/diff/29001/src/** > cmd/gc/reflect.c#newcode220<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newcode220> > src/cmd/gc/reflect.c:220: offset += widthint; // nevacuate > What does this word mean? > > That's the name of the last field, in parallel with the comments earlier. I added "(last field in Hmap)" > https://codereview.appspot.**com/13078044/<https://codereview.appspot.com/130... >
Sign in to reply to this message.
Dmitry, can you take another look? On Tue, Aug 27, 2013 at 1:57 PM, Keith Randall <khr@google.com> wrote: > > > > On Tue, Aug 27, 2013 at 1:42 PM, <cshapiro@google.com> wrote: > >> Thanks for doing this, I liked seeing all of the hash table knowledge >> disappear from the GC. >> >> >> https://codereview.appspot.**com/13078044/diff/29001/src/**cmd/gc/go.h<https:... >> File src/cmd/gc/go.h (right): >> >> https://codereview.appspot.**com/13078044/diff/29001/src/** >> cmd/gc/go.h#newcode190<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/go.h#newcode190> >> src/cmd/gc/go.h:190: Type* bucket; // (internal) type of >> hash bucket >> What is conveyed by the parenthesized use of internal in the comment? >> If it is not too much trouble, can we spell this out and get rid of the >> parenthesis? >> >> (I like the phrasing in type.h if that is okay) >> >> > Sure. > > >> https://codereview.appspot.**com/13078044/diff/29001/src/** >> cmd/gc/reflect.c<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c> >> File src/cmd/gc/reflect.c (right): >> >> https://codereview.appspot.**com/13078044/diff/29001/src/** >> cmd/gc/reflect.c#newcode186<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newcode186> >> src/cmd/gc/reflect.c:186: Type *hmap, *bucket; >> Pardon my C ignorance, but is it legal to have the hmap local variable >> name in a function called hmap? Are we certain the local variable name >> will shadow the function name such as down on line 229? >> >> (I assume this is okay in at least some contexts because I assume you >> have compiled and run this.) >> >> > changed the local variable to h. > > >> https://codereview.appspot.**com/13078044/diff/29001/src/** >> cmd/gc/reflect.c#newcode197<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newcode197> >> src/cmd/gc/reflect.c:197: offset += 4; // flags >> Can we use sizeof here instead of bare numbers? >> >> > I suppose, but "sizeof(uint32)" is just a verbose name for "4". Ideally > "sizeof(flags)" would be nice but not feasible, as that's the host size not > the target size. > > >> https://codereview.appspot.**com/13078044/diff/29001/src/** >> cmd/gc/reflect.c#newcode220<https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newcode220> >> src/cmd/gc/reflect.c:220: offset += widthint; // nevacuate >> What does this word mean? >> >> > That's the name of the last field, in parallel with the comments earlier. > I added "(last field in Hmap)" > > >> https://codereview.appspot.**com/13078044/<https://codereview.appspot.com/130... >> > >
Sign in to reply to this message.
I will look tomorrow. I am seriously behind my inbox. On Thu, Aug 29, 2013 at 11:39 PM, Keith Randall <khr@google.com> wrote: > Dmitry, can you take another look? > > > > On Tue, Aug 27, 2013 at 1:57 PM, Keith Randall <khr@google.com> wrote: >> >> >> >> >> On Tue, Aug 27, 2013 at 1:42 PM, <cshapiro@google.com> wrote: >>> >>> Thanks for doing this, I liked seeing all of the hash table knowledge >>> disappear from the GC. >>> >>> >>> https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/go.h >>> File src/cmd/gc/go.h (right): >>> >>> >>> https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/go.h#newcode190 >>> src/cmd/gc/go.h:190: Type* bucket; // (internal) type of >>> hash bucket >>> What is conveyed by the parenthesized use of internal in the comment? >>> If it is not too much trouble, can we spell this out and get rid of the >>> parenthesis? >>> >>> (I like the phrasing in type.h if that is okay) >>> >> >> Sure. >> >>> >>> https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c >>> File src/cmd/gc/reflect.c (right): >>> >>> >>> https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newco... >>> src/cmd/gc/reflect.c:186: Type *hmap, *bucket; >>> Pardon my C ignorance, but is it legal to have the hmap local variable >>> name in a function called hmap? Are we certain the local variable name >>> will shadow the function name such as down on line 229? >>> >>> (I assume this is okay in at least some contexts because I assume you >>> have compiled and run this.) >>> >> >> changed the local variable to h. >> >>> >>> >>> https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newco... >>> src/cmd/gc/reflect.c:197: offset += 4; // flags >>> Can we use sizeof here instead of bare numbers? >>> >> >> I suppose, but "sizeof(uint32)" is just a verbose name for "4". Ideally >> "sizeof(flags)" would be nice but not feasible, as that's the host size not >> the target size. >> >>> >>> >>> https://codereview.appspot.com/13078044/diff/29001/src/cmd/gc/reflect.c#newco... >>> src/cmd/gc/reflect.c:220: offset += widthint; // nevacuate >>> What does this word mean? >>> >> >> That's the name of the last field, in parallel with the comments earlier. >> I added "(last field in Hmap)" >> >>> >>> https://codereview.appspot.com/13078044/ >> >> >
Sign in to reply to this message.
https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.h File src/pkg/runtime/hashmap.h (right): https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.h#... src/pkg/runtime/hashmap.h:1: // Copyright 2009 The Go Authors. All rights reserved. remove this file? https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/type.h File src/pkg/runtime/type.h (right): https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/type.h#new... src/pkg/runtime/type.h:74: Type *hmap; // internal type representing a Hmap Why do you use additional hmap rather than MapType.Type.gc? MapType.Type.gc looks like a reasonable place for Hmap GC info. Related to this, if we have v map[T]V is it GC_PTR or GC_APTR? Do we have points-to type info associated with the pointer, or we must consult heap block type info? I am fine either way, just want to make sure that it does not have wrong type info. Does Hmap heap block type info is correct? It must describe Hmap. If MapType.Type.gc==nil, then I guess we will scan Hmap conservatively.
Sign in to reply to this message.
On 2013/08/27 20:42:08, cshapiro1 wrote: > Thanks for doing this, I liked seeing all of the hash table knowledge disappear > from the GC. Yes, this is awesome. Thanks!
Sign in to reply to this message.
> Does Hmap heap block type info is correct? It must describe Hmap. If > MapType.Type.gc==nil, then I guess we will scan Hmap conservatively. This is obviously nonsense. You explicitly pass typ->hmap into mallocgc when allocating Hmap.
Sign in to reply to this message.
https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.c File src/pkg/runtime/hashmap.c (right): https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.c#... src/pkg/runtime/hashmap.c:932: h = runtime·mallocgc(sizeof(*h), (uintptr)typ->hmap, 0); I guess you can use runtime·cnew/runtime·cnewarray in most places now. cnew/cnewarray are good, because they say "this is a normal typed allocation" rather than some runtime trickery with flags and types.
Sign in to reply to this message.
On Fri, Aug 30, 2013 at 3:28 AM, <dvyukov@google.com> wrote: > > https://codereview.appspot.**com/13078044/diff/36001/src/** > pkg/runtime/hashmap.h<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.h> > File src/pkg/runtime/hashmap.h (right): > > https://codereview.appspot.**com/13078044/diff/36001/src/** > pkg/runtime/hashmap.h#newcode1<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.h#newcode1> > src/pkg/runtime/hashmap.h:1: // Copyright 2009 The Go Authors. All > rights reserved. > remove this file? > > I'll clean that up in a separate CL. > https://codereview.appspot.**com/13078044/diff/36001/src/** > pkg/runtime/type.h<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/type.h> > File src/pkg/runtime/type.h (right): > > https://codereview.appspot.**com/13078044/diff/36001/src/** > pkg/runtime/type.h#newcode74<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/type.h#newcode74> > src/pkg/runtime/type.h:74: Type *hmap; // internal type representing a > Hmap > Why do you use additional hmap rather than MapType.Type.gc? > MapType.Type.gc looks like a reasonable place for Hmap GC info. > > We need a whole Type to pass to mallocgc. > Related to this, if we have > v map[T]V > is it GC_PTR or GC_APTR? Do we have points-to type info associated with > the pointer, or we must consult heap block type info? > I am fine either way, just want to make sure that it does not have wrong > type info. > > It is GC_PTR because we know the exact type of the referent. It will be an Hmap. > Does Hmap heap block type info is correct? It must describe Hmap. If > MapType.Type.gc==nil, then I guess we will scan Hmap conservatively. > > https://codereview.appspot.**com/13078044/<https://codereview.appspot.com/130... >
Sign in to reply to this message.
On Fri, Aug 30, 2013 at 3:39 AM, <dvyukov@google.com> wrote: > > https://codereview.appspot.**com/13078044/diff/36001/src/** > pkg/runtime/hashmap.c<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.c> > File src/pkg/runtime/hashmap.c (right): > > https://codereview.appspot.**com/13078044/diff/36001/src/** > pkg/runtime/hashmap.c#**newcode932<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.c#newcode932> > src/pkg/runtime/hashmap.c:932: h = runtime·mallocgc(sizeof(*h), > (uintptr)typ->hmap, 0); > I guess you can use runtime·cnew/runtime·cnewarray in most places now. > cnew/cnewarray are good, because they say "this is a normal typed > allocation" rather than some runtime trickery with flags and types. > > I'll clean this up in a separate CL also. > https://codereview.appspot.**com/13078044/<https://codereview.appspot.com/130... >
Sign in to reply to this message.
On 2013/08/31 05:37:53, khr1 wrote: > On Fri, Aug 30, 2013 at 3:28 AM, <mailto:dvyukov@google.com> wrote: > > > > > https://codereview.appspot.**com/13078044/diff/36001/src/** > > > pkg/runtime/hashmap.h<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.h> > > File src/pkg/runtime/hashmap.h (right): > > > > https://codereview.appspot.**com/13078044/diff/36001/src/** > > > pkg/runtime/hashmap.h#newcode1<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.h#newcode1> > > src/pkg/runtime/hashmap.h:1: // Copyright 2009 The Go Authors. All > > rights reserved. > > remove this file? > > > > > I'll clean that up in a separate CL. OK > > https://codereview.appspot.**com/13078044/diff/36001/src/** > > > pkg/runtime/type.h<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/type.h> > > File src/pkg/runtime/type.h (right): > > > > https://codereview.appspot.**com/13078044/diff/36001/src/** > > > pkg/runtime/type.h#newcode74<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/type.h#newcode74> > > src/pkg/runtime/type.h:74: Type *hmap; // internal type representing a > > Hmap > > Why do you use additional hmap rather than MapType.Type.gc? > > MapType.Type.gc looks like a reasonable place for Hmap GC info. > > > > > We need a whole Type to pass to mallocgc. I meant to pass whole MapType to mallocgc. > > Related to this, if we have > > v map[T]V > > is it GC_PTR or GC_APTR? Do we have points-to type info associated with > > the pointer, or we must consult heap block type info? > > I am fine either way, just want to make sure that it does not have wrong > > type info. > > > > > It is GC_PTR because we know the exact type of the referent. It will be an > Hmap. OK
Sign in to reply to this message.
On 2013/08/31 05:38:47, khr1 wrote: > On Fri, Aug 30, 2013 at 3:39 AM, <mailto:dvyukov@google.com> wrote: > > > > > https://codereview.appspot.**com/13078044/diff/36001/src/** > > > pkg/runtime/hashmap.c<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.c> > > File src/pkg/runtime/hashmap.c (right): > > > > https://codereview.appspot.**com/13078044/diff/36001/src/** > > > pkg/runtime/hashmap.c#**newcode932<https://codereview.appspot.com/13078044/diff/36001/src/pkg/runtime/hashmap.c#newcode932> > > src/pkg/runtime/hashmap.c:932: h = runtime·mallocgc(sizeof(*h), > > (uintptr)typ->hmap, 0); > > I guess you can use runtime·cnew/runtime·cnewarray in most places now. > > cnew/cnewarray are good, because they say "this is a normal typed > > allocation" rather than some runtime trickery with flags and types. > > > > > I'll clean this up in a separate CL also. OK Note that the freeze starts tomorrow
Sign in to reply to this message.
LGTM
Sign in to reply to this message.
*** Submitted as https://code.google.com/p/go/source/detail?r=f605488ef701 *** runtime: record type information for hashtable internal structures. Remove all hashtable-specific GC code. Fixes bug 6119. R=cshapiro, dvyukov, khr CC=golang-dev https://codereview.appspot.com/13078044
Sign in to reply to this message.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
