Code review - Issue 6344063: TypeId: add hashing and map lookuphttps://codereview.appspot.com/2012-11-29T08:07:16+00:00rietveld
Message from unknown
2012-06-29T23:31:40+00:00Peter Barnesurn:md5:cddfa51d7f74c84e1474a85f8d84603c
Message from jefeyoung313@gmail.com
2012-07-23T19:13:11+00:00Jeff Yurn:md5:a3fc131d9aaf5514188520e6324b5037
Hi Peter,
This looks good. I had two minor comments:
1) Possibly use "TypeId" instead of "Type-id" at least for NS3-related object and test names.
2) Include your hash chaining description in some higher-level doc (I suggested as part of the existing hash doc) because it would be really useful for anyone who encounters a hash collision (and may not think to check the type-id source code).
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc
File src/core/model/type-id.cc (left):
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc#oldcode106
src/core/model/type-id.cc:106: }
Extra comment suggestion: "//Handle hash collisions based on the tag name"
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc#oldcode115
src/core/model/type-id.cc:115: NS_ASSERT (uid <= 0xffff);
Extra comment suggestion: "//Add new UID to both the name map and the hash map. This allows for looking up UID by name or hash value."
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc
File src/core/model/type-id.cc (right):
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc#newcode36
src/core/model/type-id.cc:36:
Can you combine these two "namespace ns3" calls?
Also, I think this should be defined as "TypeId" to match up with NS-3 style.
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc#newcode40
src/core/model/type-id.cc:40:
I'm assuming you are talking about IidManager here, but might be useful to make clear in the comment since namespace ns3 is used above. Something like "IidManager needs to be in..."
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc#newcode111
src/core/model/type-id.cc:111:
This is a good explanation of the hash chaining functionality, but I'm wondering if it wouldn't also be really useful in higher-level documentation.
One suggestion might be to have a brief discussion of how hashes are used with TypeIDs in doc/manual/source/hash-functions.rst and link in this text there. Collisions are a pretty uncommon occurrence, but this brief explanation would really assist anyone should they encounter the two-fold or three-fold collision.
http://codereview.appspot.com/6344063/diff/1/src/core/test/type-id-test-suite.cc
File src/core/test/type-id-test-suite.cc (right):
http://codereview.appspot.com/6344063/diff/1/src/core/test/type-id-test-suite.cc#newcode259
src/core/test/type-id-test-suite.cc:259:
type-id => TypeId to match up with NS-3 style and suggested change to name in type-id.cc.
Message from tomh.org@gmail.com
2012-07-25T13:25:25+00:00Tom Hendersonurn:md5:30b5d28ac4c527904bafb91edac55673
http://codereview.appspot.com/6344063/diff/1/src/core/test/type-id-test-suite.cc
File src/core/test/type-id-test-suite.cc (right):
http://codereview.appspot.com/6344063/diff/1/src/core/test/type-id-test-suite.cc#newcode169
src/core/test/type-id-test-suite.cc:169: // Performance test
we have a performance test suite type; suggest to move this test to type PERFORMANCE and decouple it from the other UNIT tests
http://codereview.appspot.com/6344063/diff/1/src/core/test/type-id-test-suite.cc#newcode170
src/core/test/type-id-test-suite.cc:170:
We have a separate PERFORMANCE test suite type for this; suggest to decouple this test from UNIT and make it a PERFORMANCE.
http://codereview.appspot.com/6344063/diff/1/src/core/test/type-id-test-suite.cc#newcode259
src/core/test/type-id-test-suite.cc:259:
On 2012/07/23 19:13:11, Jeff Y wrote:
> type-id => TypeId to match up with NS-3 style and suggested change to name in
> type-id.cc.
I think it is consistent with test suite labeling to keep it at 'type-id' (see output of ./test.py)
Message from mathieu.lacage@gmail.com
2012-07-28T08:01:30+00:00Mathieu Lacageurn:md5:912da148efaf4e4d0b5eb7939533f5de
ack
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc
File src/core/model/type-id.cc (right):
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc#newcode125
src/core/model/type-id.cc:125: // is only 2 x 10^-4. The probability for a three-fold collision is
should be 10^-21, no ?
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.h
File src/core/model/type-id.h (right):
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.h#newcode75
src/core/model/type-id.h:75: typedef Hash32_t hash_value_type;
oh, come on....
Not another layer of indirection, please. Just pick a 32 or a 64 bit type and let's live with it.
Message from pdbj@mac.com
2012-11-29T06:57:36+00:00Peter Barnesurn:md5:497724f8715894a5225864df0515a170
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc
File src/core/model/type-id.cc (right):
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.cc#newcode125
src/core/model/type-id.cc:125: // is only 2 x 10^-4. The probability for a three-fold collision is
On 2012/07/28 08:01:30, Mathieu Lacage wrote:
> should be 10^-21, no ?
Do you mean n/k = 2^-21? That's the average occupancy per bucket. We need the expected number of collisions.
Referring to:
http://www.math.dartmouth.edu/archive/m19w03/public_html/Section6-5.pdf
Number of buckets = k = 2^bits
E(collisions) = n - k + k (1 - 1/k)^n
where n = number of entries in table.
This form is numerically unstable for low collision rates. Expanding for large k we get
1 n 1 n 1 n
E(c) = --- ( ) - --- ( ) + --- ( ) - ...
k 2 k^2 3 k^3 4
1 n [ n - 2 (n - 2)(n - 3) ]
= --- ( ) | 1 - ----- + -------------- - ... |
k 2 [ 3 k 12 k^2 ]
Using the first two terms, k = 2^31, and n = 2^10, I calculate E(c) = 2.4 10^-4.
For the three-fold probability, I take n/k E(c) = 1.2 10^-10
I'll update the numbers.
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.h
File src/core/model/type-id.h (right):
http://codereview.appspot.com/6344063/diff/1/src/core/model/type-id.h#newcode75
src/core/model/type-id.h:75: typedef Hash32_t hash_value_type;
On 2012/07/28 08:01:30, Mathieu Lacage wrote:
> oh, come on....
I was waiting for that ;)
I removed this, then realized collisions can be dealt with via a trivial fourth method: switch to 64-bit hashes. To do this, you have to change all the uint32_t type names *that refer to hashes* (there are many) and not get any other uint32_t's (there are also many).
On balance, it seems to me that hash_value_type is a reasonable future-proofing compromise: it clearly indicates value semantics, and makes it trivial to switch to 64-bit hashes if needed in the future.
Message from unknown
2012-11-29T08:07:16+00:00Peter Barnesurn:md5:4e1876b34599d7eef600263ed0919c2c