|
|
|
Created:
16 years, 9 months ago by Josh Kelley Modified:
11 years, 1 month ago Reviewers:
Zhanyong Wan CC:
googletestframework_googlegroups.com Base URL:
http://googletest.googlecode.com/svn/trunk/ Visibility:
Public. |
Patch Set 1 #
Total comments: 44
Patch Set 2 : test shuffling - flag parsing and management only #
Total comments: 6
Patch Set 3 : option to run the tests in a random order #Patch Set 4 : option to run the tests in a random order #
Total comments: 3
Patch Set 5 : final patch implementing running tests in a random order #
Total comments: 42
MessagesTotal messages: 8
Hi Josh, I apologize for the massive delay. In general I like what I'm seeing. Thanks for another high-quality patch! There are several major issues to be sorted out: - Shuffling lists is no fun and error-prone. Since we plan to replace the list type used in gtest with something that supports random access (as part of the work on the event listener API), it may make sense to implement test shuffling when that is done. It will simplify both the shuffling code and the tests. Vlad is working on the event listener, so we should coordinate with him. - We need many more tests to cover the new behavior. - I need to make sure this works well with test sharding and possibly other features. http://codereview.appspot.com/52057/diff/1/4 File include/gtest/gtest.h (right): http://codereview.appspot.com/52057/diff/1/4#newcode140 Line 140: GTEST_DECLARE_bool_(shuffle); Keeping the flags sorted by name? http://codereview.appspot.com/52057/diff/1/3 File include/gtest/internal/gtest-internal.h (right): http://codereview.appspot.com/52057/diff/1/3#newcode755 Line 755: // A simple Linear Congruential Generator for generating random numbers. Unlike Comment on what kind of distribution can be expected. (Uniform?) http://codereview.appspot.com/52057/diff/1/3#newcode761 Line 761: explicit Random(UInt32 state = 1) : state_(state) {} Google's C++ style guide bans default arguments. http://codereview.appspot.com/52057/diff/1/3#newcode763 Line 763: int Generate(int range); Comment on the range of the return value (e.g. [0, range - 1]). http://codereview.appspot.com/52057/diff/1/3#newcode763 Line 763: int Generate(int range); Why aren't the range and return type UInt32? http://codereview.appspot.com/52057/diff/1/5 File src/gtest-internal-inl.h (right): http://codereview.appspot.com/52057/diff/1/5#newcode89 Line 89: const char kShuffleFlag[] = "shuffle"; Keep the flags sorted by name. http://codereview.appspot.com/52057/diff/1/5#newcode111 Line 111: shuffle_ = GTEST_FLAG(shuffle); The same. http://codereview.appspot.com/52057/diff/1/5#newcode130 Line 130: GTEST_FLAG(shuffle) = shuffle_; The same. http://codereview.appspot.com/52057/diff/1/5#newcode149 Line 149: bool shuffle_; And this. http://codereview.appspot.com/52057/diff/1/5#newcode439 Line 439: // Performs an in-place shuffle of a range of this list's nodes. I know it's implied, but it helps to emphasize that it's a user error to call this with nodes that aren't in this list. http://codereview.appspot.com/52057/diff/1/5#newcode445 Line 445: { Style: put { at the end of the previous line. http://codereview.appspot.com/52057/diff/1/5#newcode452 Line 452: ListNode<E> **array = new ListNode<E>*[size()]; ListNode<T>** const array = ... http://codereview.appspot.com/52057/diff/1/5#newcode454 Line 454: for ( ListNode<E> * node = from; Style: put the for() loop conditions on one line. No space after "(" or before ")". http://codereview.appspot.com/52057/diff/1/5#newcode464 Line 464: int k = random->Generate(n); // 0 <= k < n const int http://codereview.appspot.com/52057/diff/1/5#newcode466 Line 466: ListNode<E> *temp = array[n]; ListNode<E>* const temp http://codereview.appspot.com/52057/diff/1/5#newcode1284 Line 1284: internal::Random* Random(); By Google's style, the getter should be named random() (all lower case). http://codereview.appspot.com/52057/diff/1/5#newcode1373 Line 1373: internal::Random* random_; Why not use an object (instead of a pointer)? http://codereview.appspot.com/52057/diff/1/6 File src/gtest.cc (right): http://codereview.appspot.com/52057/diff/1/6#newcode243 Line 243: shuffle, Sort the flags by name. http://codereview.appspot.com/52057/diff/1/6#newcode257 Line 257: const UInt32 kM = 0x7fffffffu; // analogous to RAND_MAX Why isn't this 0xffffffff? http://codereview.appspot.com/52057/diff/1/6#newcode258 Line 258: const UInt32 kA = 1103515245; Comment on how the values of kA and kC are chosen? http://codereview.appspot.com/52057/diff/1/6#newcode264 Line 264: return int(double(state_) / (double(kM) + 1) * range); Is this any better than state_ % range? http://codereview.appspot.com/52057/diff/1/6#newcode2677 Line 2677: "Note: Randomizing tests' orders with a seed of %i\n", This info is better printed in OnUnitTestStart() as we want it to be easily findable. http://codereview.appspot.com/52057/diff/1/6#newcode3694 Line 3694: GTEST_FLAG(random_seed++); GTEST_FLAG(random_seed)++; http://codereview.appspot.com/52057/diff/1/6#newcode3995 Line 3995: random_ = new internal::Random(static_cast<unsigned>(GetTimeInMillis())); unsigned => UInt32 http://codereview.appspot.com/52057/diff/1/6#newcode4000 Line 4000: void UnitTestImpl::ShuffleTestCases() { In general, shuffling a list is awkward and error-prone. I'm a bit concerned with having to maintain such code. It may make sense to first implement a vector-like data structure to replace the list type (it's useful for the event listener API as well). http://codereview.appspot.com/52057/diff/1/6#newcode4311 Line 4311: ParseBoolFlag(arg, kThrowOnFailureFlag, >EST_FLAG(throw_on_failure)) || Style: <= 80 columns per line. http://codereview.appspot.com/52057/diff/1/6#newcode4312 Line 4312: ParseBoolFlag(arg, kShuffleFlag, >EST_FLAG(shuffle)) || Sort the flags by name. http://codereview.appspot.com/52057/diff/1/6#newcode4376 Line 4376: if (!GTEST_FLAG(random_seed)) { Style: GTEST_FLAG(random_seed) == 0 http://codereview.appspot.com/52057/diff/1/6#newcode4377 Line 4377: GTEST_FLAG(random_seed) = static_cast<Int32>(GetTimeInMillis()); It's best to treat the random_seed flag as read-only by gtest. We don't want to lose the information that the user set it to 0, for example.
Sign in to reply to this message.
One thing I learned from this is that big patches can take a long time to be reviewed and get ready for check-in, as we are all multi-tasking and have other important work to tend to. During the time it takes to review and improve the patch, the code base is constantly changing, often creating merge conflicts. This makes the experience suck. One thing we can try to make our lives easier is to break down a big patch into smaller, but still coherent ones. For example, your first patch in this series can be adding the logic to parse the new flags (but don't actually implement any behavior for these flags). I imagine that can be reviewed and checked in quickly. Once that's done, you may want to contribute the code to shuffle a list (but don't yet hook it up with the rest of gtest), or maybe an easier-to-shuffle data structure to replace List. Then we take yet another step...
Sign in to reply to this message.
I uploaded a reduced patch set, as you suggested, that only implements parsing, managing, and displaying the flags. (I realize that a couple of aspects of managing and displaying may change depending on your review here.) I tried to address your other comments, too, but feel free to ignore those until those patches are ready. I honestly didn't think that shuffling the list was that bad (although the unit tests for shuffling were pretty ugly). To help me understand better, could you explain your concerns? (Of course, if the event listener implementation will be replacing List with something randomly accessible anyway, then it's irrelevant.) You mentioned that many more tests are needed. I know that I'd originally suggested adding a Python script gtest_shuffle_test.py that does the following: 1) Runs --gtest_shuffle --gtest_repeat=3 and verifies non-repeating seeds. 2) Runs --gtest_shuffle --gtest_random_seed=n and verifies that the order does in fact change. 3) Runs a test suite containing death tests 10 times or so and verifies that death tests always occur before non-death tests. What other tests would you like to see? http://codereview.appspot.com/52057/diff/1/3 File include/gtest/internal/gtest-internal.h (right): http://codereview.appspot.com/52057/diff/1/3#newcode761 Line 761: explicit Random(UInt32 state = 1) : state_(state) {} On 2009/06/16 06:35:11, Zhanyong Wan wrote: > Google's C++ style guide bans default arguments. I saw that the guide also bans overloading used to simulate default arguments, so I wasn't certain what to do. Is Random() : state_(1) {} explicit Random(UInt32 state) : state_(state) {} okay even though it's overloaded to (sort of) simulate default arguments? http://codereview.appspot.com/52057/diff/1/3#newcode763 Line 763: int Generate(int range); On 2009/06/16 06:35:11, Zhanyong Wan wrote: > Why aren't the range and return type UInt32? I saw that the style guide strongly discourages the use of unsigned, and so it seemed better to treat the unsignedness and bit width as implementation details and use the default int for the public interface. http://codereview.appspot.com/52057/diff/1/5 File src/gtest-internal-inl.h (right): http://codereview.appspot.com/52057/diff/1/5#newcode1373 Line 1373: internal::Random* random_; On 2009/06/16 06:35:11, Zhanyong Wan wrote: > Why not use an object (instead of a pointer)? I think that was left over from an earlier design I'd tried. Thanks for catching this. http://codereview.appspot.com/52057/diff/1/6 File src/gtest.cc (right): http://codereview.appspot.com/52057/diff/1/6#newcode257 Line 257: const UInt32 kM = 0x7fffffffu; // analogous to RAND_MAX On 2009/06/16 06:35:11, Zhanyong Wan wrote: > Why isn't this 0xffffffff? This (as well as kA and kC) came from glibc's implementation of rand(). After rereading the Wikipedia article on LCGs, it looks like other numbers would give better results, but only if I use a 64-bit temporary. I assume that's not worth doing. http://codereview.appspot.com/52057/diff/1/6#newcode264 Line 264: return int(double(state_) / (double(kM) + 1) * range); On 2009/06/16 06:35:11, Zhanyong Wan wrote: > Is this any better than state_ % range? state_ % range introduces a downward bias, since values for state_ in the range of kM - kM % range through kM, inclusive, result in numbers in the lower portion of [0, range). Converting via double also introduces bias, since the number of values of state_ that map to each value of [0, range) isn't consistent, but the bias is at least spread throughout. Converting via double should also help avoid the problem that some LCGs have where the lower bits have relatively little randomness. http://codereview.appspot.com/52057/diff/1/6#newcode2677 Line 2677: "Note: Randomizing tests' orders with a seed of %i\n", On 2009/06/16 06:35:11, Zhanyong Wan wrote: > This info is better printed in OnUnitTestStart() as we want it to be easily > findable. Printing in OnUnitTestStart() would look nicer, but printing it here reflects any changes that the user makes in a test environment. Which do you think is better? http://codereview.appspot.com/52057/diff/1/6#newcode4377 Line 4377: GTEST_FLAG(random_seed) = static_cast<Int32>(GetTimeInMillis()); On 2009/06/16 06:35:11, Zhanyong Wan wrote: > It's best to treat the random_seed flag as read-only by gtest. We don't want to > lose the information that the user set it to 0, for example. I don't follow. If it defaults to 0, then how does treating it as read-only preserve the information that the user set it to 0? If 0 means "use a default/time-based seed," then what's wrong with overwriting 0, whether 0 came from the user or the default? If it is tentatively a goal to let the user manipulate the shuffling-related flags in global and test case setup and teardown, then I thought that reusing or overwriting random_seed would be the best way to do this.
Sign in to reply to this message.
http://codereview.appspot.com/52057/diff/1/3 File include/gtest/internal/gtest-internal.h (right): http://codereview.appspot.com/52057/diff/1/3#newcode761 Line 761: explicit Random(UInt32 state = 1) : state_(state) {} On 2009/06/25 02:07:02, Josh Kelley wrote: > On 2009/06/16 06:35:11, Zhanyong Wan wrote: > > Google's C++ style guide bans default arguments. > > I saw that the guide also bans overloading used to simulate default arguments, > so I wasn't certain what to do. > > Is > Random() : state_(1) {} Why do we have to have this default ctor? We can always initialize Random objects with an explicit initial state. > explicit Random(UInt32 state) : state_(state) {} > okay even though it's overloaded to (sort of) simulate default arguments? http://codereview.appspot.com/52057/diff/1/3#newcode762 Line 762: void Reseed(UInt32 state) { state_ = state; } "seed" and "state" are different concepts. The projection from seeds to states doesn't have to be a bijection. For example, we limit the valid seeds to the range [1, 9999], but the range of valid states can be much larger. Therefore the argument should be renamed seed. I think we should keep the state an implementation detail, as there's no need for the user to know about it. This means the ctor's argument should be "seed" too. http://codereview.appspot.com/52057/diff/1/3#newcode763 Line 763: int Generate(int range); On 2009/06/25 02:07:02, Josh Kelley wrote: > On 2009/06/16 06:35:11, Zhanyong Wan wrote: > > Why aren't the range and return type UInt32? > > I saw that the style guide strongly discourages the use of unsigned, and so it > seemed better to treat the unsignedness and bit width as implementation details > and use the default int for the public interface. I think Random is one of the places where you do care about the signedness and width of the value. A generic int type isn't good enough. Suppose you use the code on an architecture where sizeof(int) is 8, you will get very bad result as Random can only generate results in [0, 2^32 - 1]. We should make this fact explicit in the type. That means we should use UInt32 as the input and output type. http://codereview.appspot.com/52057/diff/1/6 File src/gtest.cc (right): http://codereview.appspot.com/52057/diff/1/6#newcode257 Line 257: const UInt32 kM = 0x7fffffffu; // analogous to RAND_MAX On 2009/06/25 02:07:02, Josh Kelley wrote: > On 2009/06/16 06:35:11, Zhanyong Wan wrote: > > Why isn't this 0xffffffff? > > This (as well as kA and kC) came from glibc's implementation of rand(). > > After rereading the Wikipedia article on LCGs, it looks like other numbers would > give better results, but only if I use a 64-bit temporary. I assume that's not > worth doing. Correct, it's not worth doing. What I'd like to see if a comment on how the constant was picked, such that the reader knows it's not arbitrary. http://codereview.appspot.com/52057/diff/1/6#newcode264 Line 264: return int(double(state_) / (double(kM) + 1) * range); On 2009/06/25 02:07:02, Josh Kelley wrote: > On 2009/06/16 06:35:11, Zhanyong Wan wrote: > > Is this any better than state_ % range? > > state_ % range introduces a downward bias, since values for state_ in the range > of kM - kM % range through kM, inclusive, result in numbers in the lower portion > of [0, range). > > Converting via double also introduces bias, since the number of values of state_ > that map to each value of [0, range) isn't consistent, but the bias is at least > spread throughout. Yes. However I doubt this bias matters for our purpose. > Converting via double should also help avoid the problem that some LCGs have > where the lower bits have relatively little randomness. I think % is good enough for our purpose. I prefer to keep it simple. http://codereview.appspot.com/52057/diff/1/6#newcode2677 Line 2677: "Note: Randomizing tests' orders with a seed of %i\n", On 2009/06/25 02:07:02, Josh Kelley wrote: > On 2009/06/16 06:35:11, Zhanyong Wan wrote: > > This info is better printed in OnUnitTestStart() as we want it to be easily > > findable. > > Printing in OnUnitTestStart() would look nicer, but printing it here reflects > any changes that the user makes in a test environment. > > Which do you think is better? To keep it simple, I think gtest should only look at the values of the shuffle and random_seed flags once at the beginning of the test program. Modifying the flags afterwards should have no effect. Therefore we should print the message in OnUnitTestStart(). http://codereview.appspot.com/52057/diff/1/6#newcode4377 Line 4377: GTEST_FLAG(random_seed) = static_cast<Int32>(GetTimeInMillis()); On 2009/06/25 02:07:02, Josh Kelley wrote: > On 2009/06/16 06:35:11, Zhanyong Wan wrote: > > It's best to treat the random_seed flag as read-only by gtest. We don't want > to > > lose the information that the user set it to 0, for example. > > I don't follow. If it defaults to 0, then how does treating it as read-only > preserve the information that the user set it to 0? If 0 means "use a > default/time-based seed," then what's wrong with overwriting 0, whether 0 came > from the user or the default? By "treating it as read-only" I mean that gtest should never modify the flags (except in the flag parser). We want to have a record on what the user specified and we don't want to lose that information. I don't care to distinguish between explicit --gtest_random_seed=0 and no gtest_random_seed flag. They both mean that the user wants to use the default, time-based, random seed. In other words, the user's intention is the same in the two cases. However, I do care about the difference between --gtest_random_seed=1234 and --gtest_random_seed=0 where the clock happens to tell gtest to pick 1234, as the user intention is different in the two cases. What gtest should do is to use a separate actual_random_seed variable somewhere to hold the actual random seed. > If it is tentatively a goal to let the user manipulate the shuffling-related > flags in global and test case setup and teardown, then I thought that reusing or > overwriting random_seed would be the best way to do this. No, this is not a goal. http://codereview.appspot.com/52057/diff/4001/5004#newcode223 Line 223: Also say that 0 means to get the seed from the current time. Also explain what the valid range is. http://codereview.appspot.com/52057/diff/4001/5004#newcode2754 Line 2754: fflush(stdout); We should normalize the seed into the range [0, 9999]. http://codereview.appspot.com/52057/diff/4001/5004#newcode4467 Line 4467: Function names should normally be verb phrases. Also we should make it obvious that the seed is obtained from the current time. GetRandomSeedFromTime()? http://codereview.appspot.com/52057/diff/4001/5004#newcode4468 Line 4468: Change 10000 to 9999 to match the name "max default random seed". http://codereview.appspot.com/52057/diff/4001/5004#newcode4469 Line 4469: static_cast<Int32>((GetTimeInMillis() % kMaxDefaultRandomSeed) + 1) - The cast should enclose the entire expression. - The result range should be [1, 9999] as 0 is invalid (if we pick 0, the user won't be able to reproduce the failure as he cannot specify 0 as the seed).
Sign in to reply to this message.
A general question: I assume that assertions are a good idea, even if existing code doesn't always use them? When should I use assert, and when should I use GTEST_CHECK_? Is the intent to use GTEST_CHECK_ for code that clients (such as event listeners) may use? http://codereview.appspot.com/52057/diff/1/3 File include/gtest/internal/gtest-internal.h (right): http://codereview.appspot.com/52057/diff/1/3#newcode761 Line 761: explicit Random(UInt32 state = 1) : state_(state) {} On 2009/06/29 21:20:16, Zhanyong Wan wrote: > Why do we have to have this default ctor? We can always initialize Random > objects with an explicit initial state. You're right, sorry. I was trying to optionally hide implementation details from Random's callers, but that's not even possible for a decent RNG. http://codereview.appspot.com/52057/diff/4001/5004 File src/gtest.cc (right): http://codereview.appspot.com/52057/diff/4001/5004#newcode223 Line 223: "Random number seed to use when shuffling test orders."); On 2009/06/29 21:20:16, Zhanyong Wan wrote: > Also say that 0 means to get the seed from the current time. > > Also explain what the valid range is. Is there benefit to constraining users' manually chosen seeds if they really want a seed >99999? Is there a non-hackish way of using kMaxDefaultRandomSeed here, to keep the text in sync with any future changes to kMaxDefaultRandomSeed?
Sign in to reply to this message.
Thanks for keeping working on this, Josh! http://codereview.appspot.com/52057/diff/9001/9003 File test/gtest_list_tests_unittest.py (right): http://codereview.appspot.com/52057/diff/9001/9003#newcode57 Line 57: BarDeathTest. Why this change? http://codereview.appspot.com/52057/diff/9001/9002 File test/gtest_list_tests_unittest_.cc (right): http://codereview.appspot.com/52057/diff/9001/9002#newcode79 Line 79: TEST(BarDeathTest, Test1) { Why this change? http://codereview.appspot.com/52057/diff/11001/12003 File src/gtest-internal-inl.h (right): http://codereview.appspot.com/52057/diff/11001/12003#newcode56 Line 56: #include <algorithm> Move this before <string> to keep the group sorted? http://codereview.appspot.com/52057/diff/11001/12003#newcode389 Line 389: // Swaps the a-th and b-th elements of the list. Nit: i and j are more common names for int-typed variables. People immediately know they are integers. a and b don't have this kind of implication. http://codereview.appspot.com/52057/diff/11001/12003#newcode389 Line 389: // Swaps the a-th and b-th elements of the list. list => vector http://codereview.appspot.com/52057/diff/11001/12003#newcode397 Line 397: // TODO: Okay to use std::swap here? gtest is used in some environments that have a broken STL. Therefore it's safer to avoid depending on <algorithm>. http://codereview.appspot.com/52057/diff/11001/12003#newcode401 Line 401: // Performs an in-place shuffle of a range of this list's nodes. list => vector http://codereview.appspot.com/52057/diff/11001/12003#newcode404 Line 404: void ShuffleRange(internal::Random* random, int from, int to) { Since this function manipulates indices heavily, it helps to make the code more obviously correct if we declare from and to as const here. http://codereview.appspot.com/52057/diff/11001/12003#newcode405 Line 405: GTEST_CHECK_(0 <= from && from <= size_) from should be < size. http://codereview.appspot.com/52057/diff/11001/12003#newcode408 Line 408: GTEST_CHECK_(0 <= to && to <= size_) to should be >= from. If it's < from, it's likely an user error and we should complain. http://codereview.appspot.com/52057/diff/11001/12003#newcode420 Line 420: const int k = random->Generate(n - from) + from; // from <= k < n It's slightly more intuitive like this: for (int range_width = to - from; range_width >= 2; range_width--) { const int last_in_range = from + range_width - 1; const int selected = from + random->Generate(range_width); Swap(selected, last_in_range); } - The termination condition is more intuitive: the range must contain >= 2 elements to be worth shuffling. - Give the two elements being swapped meaningful names. http://codereview.appspot.com/52057/diff/11001/12003#newcode976 Line 976: // Gets the random seed used at the start of the current test run. Won't it be more useful for this to be the seed used at the beginning of the current test *iteration*? http://codereview.appspot.com/52057/diff/11001/12003#newcode984 Line 984: void ShuffleTestCases(); If it shuffles the tests within each test case too, it should be called ShuffleTests(). ShuffleTestCases() implies that only the test cases are shuffled. http://codereview.appspot.com/52057/diff/11001/12004 File src/gtest.cc (right): http://codereview.appspot.com/52057/diff/11001/12004#newcode3703 Line 3703: random_(1), Why 1? http://codereview.appspot.com/52057/diff/11001/12004#newcode3921 Line 3921: random()->Reseed(random_seed_); If I see a failure in the 700-th iteration, how can I repro it without running the first 699 iterations? I think we should restore the original test order at the end of an iteration, and then shuffle with a new seed. This allows the user to reconstruct the test order in any iteration without running all iterations before it. http://codereview.appspot.com/52057/diff/11001/12004#newcode3921 Line 3921: random()->Reseed(random_seed_); Can we assert that random_seed_ is not 0 here? It's important as the user has no way to specify a 0 seed (--gtest_random_seed=0 has a special meaning). http://codereview.appspot.com/52057/diff/11001/12004#newcode4172 Line 4172: } else { Add a comment "// All death tests must be run before non-death tests, even in the shuffle mode."? http://codereview.appspot.com/52057/diff/11001/12004#newcode4173 Line 4173: test_cases_.ShuffleRange(random(), Add a comment "// Shuffles all death test cases."? http://codereview.appspot.com/52057/diff/11001/12004#newcode4175 Line 4175: test_cases_.ShuffleRange(random(), Add a comment "// Shuffles all non-death test cases."? http://codereview.appspot.com/52057/diff/11001/12001 File test/gtest_unittest.cc (right): http://codereview.appspot.com/52057/diff/11001/12001#newcode780 Line 780: static ::std::ostream& operator<<(::std::ostream& os, const TestingVector& vector) { Fit the line within 80 columns. http://codereview.appspot.com/52057/diff/11001/12001#newcode790 Line 790: static const int kVectorSize = 20; Style: data members should be defined at the end of the class. http://codereview.appspot.com/52057/diff/11001/12001#newcode794 Line 794: static bool IsVectorNotCorrupt(const TestingVector& vector); A predicate should read like a statement, not a question. Hence VectorIsNotCorrupt() is a better name. The same for other predicates. http://codereview.appspot.com/52057/diff/11001/12001#newcode797 Line 797: int count); We use [begin, end) style range everywhere else, so a (start, count) pair can be confusing. You cannot tell the second value is a count at the caller site. http://codereview.appspot.com/52057/diff/11001/12001#newcode822 Line 822: if (vector.GetElement(i) < 0 || Save vector.GetElement(i) to a const int variable instead of calling it 4 times. http://codereview.appspot.com/52057/diff/11001/12001#newcode836 Line 836: int start_element, Fix indentation. http://codereview.appspot.com/52057/diff/11001/12001#newcode837 Line 837: int count) { The same. http://codereview.appspot.com/52057/diff/11001/12001#newcode840 Line 840: if (i != vector.GetElement(i)) { This is more strict than "is sequential". The sequence "2 3 4" and "1 2 3" (starting at location 2) are both sequential, but the function only returns true for the former. Rename to RangeIsUnshuffled? http://codereview.appspot.com/52057/diff/11001/12001#newcode890 Line 890: EXPECT_EQ(kVectorSize - 1, vector_.GetElement(kVectorSize - 1)); Why this? http://codereview.appspot.com/52057/diff/11001/12001#newcode898 Line 898: // there are no off-by-one problems in our shuffle algorithm. I don't understand what problem the following two assertions are catching. http://codereview.appspot.com/52057/diff/11001/12001#newcode957 Line 957: EXPECT_EQ(vector_.GetElement(i), vector2.GetElement(i)); ... << " at position " << i;
Sign in to reply to this message.
http://codereview.appspot.com/52057/diff/9001/9003 File test/gtest_list_tests_unittest.py (right): http://codereview.appspot.com/52057/diff/9001/9003#newcode57 Line 57: BarDeathTest. On 2009/09/22 05:46:24, Zhanyong Wan wrote: > Why this change? This (and the change in gtest_list_tests_unittest_.cc) were so that a Python test script could call gtest_list_tests_unittest_ and verify that death tests were shuffled properly. I reverted this change in my most recent patch since you said you'd take care of the Python scripts. http://codereview.appspot.com/52057/diff/11001/12003 File src/gtest-internal-inl.h (right): http://codereview.appspot.com/52057/diff/11001/12003#newcode405 Line 405: GTEST_CHECK_(0 <= from && from <= size_) On 2009/09/22 05:46:24, Zhanyong Wan wrote: > from should be < size. Since to == size is permitted (to shuffle the entire list), and since from == to is permitted (shuffle an empty range), then from == size should be permitted. If from cannot equal size, then we'd have to add special cases to handle, e.g., shuffling a test suite containing only death test cases. http://codereview.appspot.com/52057/diff/11001/12003#newcode420 Line 420: const int k = random->Generate(n - from) + from; // from <= k < n On 2009/09/22 05:46:24, Zhanyong Wan wrote: > It's slightly more intuitive like this: > > for (int range_width = to - from; range_width >= 2; range_width--) { > const int last_in_range = from + range_width - 1; > const int selected = from + random->Generate(range_width); > Swap(selected, last_in_range); > } > > - The termination condition is more intuitive: the range must contain >= 2 > elements to be worth shuffling. > - Give the two elements being swapped meaningful names. > I took the code with almost no modification from Wikipedia, which apparently took the algorithm (including the variable names n and k) from Durstenfield's and Knuth's work. I see how your approach is clearer for those new to the code, but I would guess that people familiar with the algorithm or reading the complete Wikipedia article would find Wikipedia's approach clearer. Thoughts? http://codereview.appspot.com/52057/diff/11001/12003#newcode976 Line 976: // Gets the random seed used at the start of the current test run. On 2009/09/22 05:46:24, Zhanyong Wan wrote: > Won't it be more useful for this to be the seed used at the beginning of the > current test *iteration*? My terminology was poor. It's for the current iteration. http://codereview.appspot.com/52057/diff/11001/12004 File src/gtest.cc (right): http://codereview.appspot.com/52057/diff/11001/12004#newcode3703 Line 3703: random_(1), On 2009/09/22 05:46:24, Zhanyong Wan wrote: > Why 1? A default c'tor for Random didn't really seem appropriate, so I needed some seed to use until we got a random seed from flags or from the clock, and 0 is a special case and not a valid seed, so I picked 1. Although I guess seemingly magic numbers are bad too. What would be a better approach? http://codereview.appspot.com/52057/diff/11001/12004#newcode3921 Line 3921: random()->Reseed(random_seed_); On 2009/09/22 05:46:24, Zhanyong Wan wrote: > If I see a failure in the 700-th iteration, how can I repro it without running > the first 699 iterations? > > I think we should restore the original test order at the end of an iteration, > and then shuffle with a new seed. This allows the user to reconstruct the test > order in any iteration without running all iterations before it. Shoot. I failed to consider how each shuffling currently depends on the results of the previous shuffle. I'll have to fix that. http://codereview.appspot.com/52057/diff/11001/12001 File test/gtest_unittest.cc (right): http://codereview.appspot.com/52057/diff/11001/12001#newcode790 Line 790: static const int kVectorSize = 20; On 2009/09/22 05:46:24, Zhanyong Wan wrote: > Style: data members should be defined at the end of the class. The style guide says constants go towards the beginning? http://codereview.appspot.com/52057/diff/11001/12001#newcode840 Line 840: if (i != vector.GetElement(i)) { On 2009/09/22 05:46:24, Zhanyong Wan wrote: > This is more strict than "is sequential". The sequence "2 3 4" and "1 2 3" > (starting at location 2) are both sequential, but the function only returns true > for the former. > > Rename to RangeIsUnshuffled? I was having trouble thinking of a good name. RangeIsUnshuffled is good. Thanks. http://codereview.appspot.com/52057/diff/11001/12001#newcode890 Line 890: EXPECT_EQ(kVectorSize - 1, vector_.GetElement(kVectorSize - 1)); On 2009/09/22 05:46:24, Zhanyong Wan wrote: > Why this? It's sort of a precondition, or a test of the test code, to make sure that the list is set up as this test expects and that the assertions below won't accidentally pass. http://codereview.appspot.com/52057/diff/11001/12001#newcode898 Line 898: // there are no off-by-one problems in our shuffle algorithm. On 2009/09/22 05:46:24, Zhanyong Wan wrote: > I don't understand what problem the following two assertions are catching. I wanted to make sure that the shuffle algorithm changes the first and last elements of the range; if so, then it seems safe to assume that it shuffles the entire range. The Wikipedia article I was using as a reference had a warning about common implementation errors such as an off-by-one mistake that could cause the last element to be shuffled incorrectly. Reading that prompted me to add these tests, although the comments I wrote aren't very descriptive. Sorry.
Sign in to reply to this message.
http://codereview.appspot.com/52057/diff/11001/12003 File src/gtest-internal-inl.h (right): http://codereview.appspot.com/52057/diff/11001/12003#newcode405 Line 405: GTEST_CHECK_(0 <= from && from <= size_) On 2009/09/23 03:35:19, Josh Kelley wrote: > On 2009/09/22 05:46:24, Zhanyong Wan wrote: > > from should be < size. > > Since to == size is permitted (to shuffle the entire list), and since from == to > is permitted (shuffle an empty range), then from == size should be permitted. > If from cannot equal size, then we'd have to add special cases to handle, e.g., > shuffling a test suite containing only death test cases. Makes sense! http://codereview.appspot.com/52057/diff/11001/12003#newcode420 Line 420: const int k = random->Generate(n - from) + from; // from <= k < n On 2009/09/23 03:35:19, Josh Kelley wrote: > On 2009/09/22 05:46:24, Zhanyong Wan wrote: > > It's slightly more intuitive like this: > > > > for (int range_width = to - from; range_width >= 2; range_width--) { > > const int last_in_range = from + range_width - 1; > > const int selected = from + random->Generate(range_width); > > Swap(selected, last_in_range); > > } > > > > - The termination condition is more intuitive: the range must contain >= 2 > > elements to be worth shuffling. > > - Give the two elements being swapped meaningful names. > > > > I took the code with almost no modification from Wikipedia, which apparently > took the algorithm (including the variable names n and k) from Durstenfield's > and Knuth's work. I see how your approach is clearer for those new to the code, > but I would guess that people familiar with the algorithm or reading the > complete Wikipedia article would find Wikipedia's approach clearer. Thoughts? The code should be as self-contained and self-explaining as possible. It's a distraction to ask the reader to open a web page and read it. This algorithm is simple enough that one doesn't need to read the wikipedia entry to understand how it works. We should optimize for people who don't read the article. Also, the code I wrote is implementing the same algorithm - just a different encoding. I don't think people who already know about the wiki article or Knuth's work will be confused, unless they memorize it word by word without understanding the essence of it - in that case I don't feel sorry for them. :-) http://codereview.appspot.com/52057/diff/11001/12004 File src/gtest.cc (right): http://codereview.appspot.com/52057/diff/11001/12004#newcode3703 Line 3703: random_(1), On 2009/09/23 03:35:19, Josh Kelley wrote: > On 2009/09/22 05:46:24, Zhanyong Wan wrote: > > Why 1? > > A default c'tor for Random didn't really seem appropriate, so I needed some seed > to use until we got a random seed from flags or from the clock, and 0 is a > special case and not a valid seed, so I picked 1. Although I guess seemingly > magic numbers are bad too. What would be a better approach? The fact that 0 is not a seed usable by the user makes it a good choice here. This tells the reader "we are just initializing random_ with an invalid seed here, and we'll reseed it later, so don't read too much into the seed value." I'd suggest to initialize it with 0 and add a comment that we'll reseed it before first use. http://codereview.appspot.com/52057/diff/11001/12001 File test/gtest_unittest.cc (right): http://codereview.appspot.com/52057/diff/11001/12001#newcode790 Line 790: static const int kVectorSize = 20; On 2009/09/23 03:35:19, Josh Kelley wrote: > On 2009/09/22 05:46:24, Zhanyong Wan wrote: > > Style: data members should be defined at the end of the class. > > The style guide says constants go towards the beginning? You're absolutely right. Ashamed that I didn't do my homework. :) http://codereview.appspot.com/52057/diff/11001/12001#newcode890 Line 890: EXPECT_EQ(kVectorSize - 1, vector_.GetElement(kVectorSize - 1)); On 2009/09/23 03:35:19, Josh Kelley wrote: > On 2009/09/22 05:46:24, Zhanyong Wan wrote: > > Why this? > > It's sort of a precondition, or a test of the test code, to make sure that the > list is set up as this test expects and that the assertions below won't > accidentally pass. In general, we don't test the test code, as it obscures the intention of the test. (It confused me in this case, for example.) Instead, the test code itself should be obviously correct. If that's not the case, we probably have a bigger problem. Pretty much all tests in VectorShuffleTest depends on vector_ being {0,1,2,...}. We don't assert it else where. I don't think we need to make a special case here.
Sign in to reply to this message.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
