|
|
Created:
15 years, 1 month ago by brian.olson Modified:
15 years ago Reviewers:
kenton Base URL:
http://protobuf.googlecode.com/svn/trunk/ Visibility:
Public. |
DescriptionCreate classes GZIPInputStream and GZIPOutputStream, subclasses of the C++ ZeroCopy*Stream classes, data-compatible with similar Java classes under java.util.zip.* and command line gzip and gunzip.
Includes tests for internal consistency and compatibility with command line gzip and gunzip.
Patch Set 1 #Patch Set 2 : added files #
Total comments: 62
Patch Set 3 : response to first round of code review #
Total comments: 35
Patch Set 4 : response to review #
Total comments: 15
Patch Set 5 : constructor reorg, zlib/gzip mode, cleanup #
Total comments: 14
Patch Set 6 : s/GZIP/Gzip/ and some other minor cleanup #
Total comments: 4
Patch Set 7 : license notice for gzip test script, author set for new files #
MessagesTotal messages: 11
We're definitely going to need to have the configure script auto-detect zlib and disable this code if it isn't present, as at least one supported platform (MinGW) does not have any libz by default. We should also provide a configure option to disable zlib support even if available since some people may not appreciate linking in the 100k zlib library if they aren't going to use it. Can you add this? http://codereview.appspot.com/32086/diff/7/1015 File README.txt (right): http://codereview.appspot.com/32086/diff/7/1015#newcode56 Line 56: To compile fat binaries, configure as follows: I'm not sure this belongs here. This is information that would apply to any project, not just protobuf. http://codereview.appspot.com/32086/diff/7/1015#newcode58 Line 58: ./configure CXXFLAGS="-arch ppc -arch ppc64 -arch i386 -arch x86_64" --enable-dependency-tracking=no If we keep this line, please wrap it to 80 chars. End the first line with a backslash so that it can still be directly copied and pasted into a shell. http://codereview.appspot.com/32086/diff/7/1013 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/7/1013#newcode8 Line 8: static const size_t GZIP_STREAM_BUFFER_SIZE = 64*1024; 64k buffer? Seems large. Any particular reasoning behind it? IMO you should make this configurable via a constructor parameter, if for no other reason than so that you can have a unit test that uses a very small buffer size (to more thoroughly exercise the code). You'll still need a default, of course. http://codereview.appspot.com/32086/diff/7/1013#newcode13 Line 13: zcontext_.zalloc = Z_NULL; Maybe use memset to zero-out this structure, so it's clear that you got everything? Or could Z_NULL be something other than zero? (Why does zlib have its own NULL definition?) http://codereview.appspot.com/32086/diff/7/1013#newcode21 Line 21: output_buffer_ = malloc(GZIP_STREAM_BUFFER_SIZE); Use operator new, like: output_buffer_ = operator new(GZIP_STREAM_BUFFER_SIZE); And delete with: operator delete(output_buffer_); http://codereview.appspot.com/32086/diff/7/1013#newcode74 Line 74: GOOGLE_LOG(INFO) << "GZIPInputStream::Next fails, zerror_=" << zerror_; Is this necessary? Remember that some people use libprotobuf in non-server apps where printing random messages to the console may be undesirable. http://codereview.appspot.com/32086/diff/7/1013#newcode75 Line 75: return ok; return false; http://codereview.appspot.com/32086/diff/7/1013#newcode88 Line 88: if (zcontext_.next_out == NULL) { Brief comment, please. http://codereview.appspot.com/32086/diff/7/1013#newcode89 Line 89: return false; If someone were to call Next() again after this return, I believe DoNextOutput() would produce an invalid buffer with a negative size. Please correct this so that Next() returns false on subsequent calls. http://codereview.appspot.com/32086/diff/7/1013#newcode91 Line 91: if (zcontext_.avail_out != 0) { Here too. http://codereview.appspot.com/32086/diff/7/1013#newcode99 Line 99: return ok; return false; http://codereview.appspot.com/32086/diff/7/1013#newcode106 Line 106: reinterpret_cast<uintptr_t>(output_position_) - count); I'd prefer that you canst to char* or uint8* here to perform this arithmetic, rather than cast to an integer. http://codereview.appspot.com/32086/diff/7/1013#newcode126 Line 126: Write a separator here, like: // =================================================================== http://codereview.appspot.com/32086/diff/7/1013#newcode129 Line 129: input_buffer_ = malloc(GZIP_STREAM_BUFFER_SIZE); operator new http://codereview.appspot.com/32086/diff/7/1013#newcode133 Line 133: zcontext_.zalloc = Z_NULL; memset zero again? http://codereview.appspot.com/32086/diff/7/1013#newcode176 Line 176: // Always immediately notify lower layer of data. Why? This seems like a waste if we aren't flushing. http://codereview.appspot.com/32086/diff/7/1013#newcode203 Line 203: // zlib is holding on to some state My interpretation of the zlib docs is that this will never happen as long as enough output space has been provided, which in our case it has. If I'm wrong and this can happen, the code you have in this block to handle it looks dubious. You set *data to next_in + avail_in, but presumably it is then pointing to the end of the input buffer; writing data there would be invalid. Can you please figure out under what circumstances this can happen and either remove this code (replace with LOG(DFATAL)) if it can't or write a test that exercises it if it can? http://codereview.appspot.com/32086/diff/7/1014 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/7/1014#newcode24 Line 24: // Return last error message or NULL if no error. Please place any new methods *above* the "implements ZeroCopyInputStream" comment, since these aren't overriding methods of ZeroCopyInputStream. http://codereview.appspot.com/32086/diff/7/1014#newcode58 Line 58: // Return last error message or NULL if no error. Here too. http://codereview.appspot.com/32086/diff/7/1014#newcode66 Line 66: // This moves more data towards the output, but does not guarantee a full This comment is confusing. If Flush() can't guarantee that all data is flushed, why would anyone use it? But from the zlib docs it looks like it actually does guarantee this. Please reword. Please also mention that it is the caller's responsibility to flush the underlying stream if necessary. http://codereview.appspot.com/32086/diff/7/1014#newcode71 Line 71: // Writes out all data and closes the gzip stream. Mention that it is the caller's responsibility to close the underlying stream if necessary. http://codereview.appspot.com/32086/diff/7/1011 File src/google/protobuf/io/zero_copy_stream_impl.h (right): http://codereview.appspot.com/32086/diff/7/1011#newcode434 Line 434: // Flushes any buffers but does not close the underlying file. Is this necessary for this change, or just something you felt like adding? Not that I mind, just wondering if I missed something. http://codereview.appspot.com/32086/diff/7/1012 File src/google/protobuf/io/zero_copy_stream_unittest.cc (right): http://codereview.appspot.com/32086/diff/7/1012#newcode282 Line 282: ArrayOutputStream output(buffer, kBufferSize, -1); Use varying sizes for the ArrayOutputStream's block size. Otherwise this test isn't really testing what happens when a write crosses a buffer boundary. It would also be nice to test varying sizes for the uncompressed buffers, as I mentioned in the .cc file. http://codereview.appspot.com/32086/diff/7/1012#newcode341 Line 341: } Add a blank line between the tests. http://codereview.appspot.com/32086/diff/7/1012#newcode342 Line 342: TEST_F(IoTest, GzipFileIo) { Why do we need a test that uses a file? This seems like it is testing unrelated functionality, and unnecessarily hitting the filesystem. http://codereview.appspot.com/32086/diff/7/1009 File src/google/protobuf/testing/zcgunzip.cc (right): http://codereview.appspot.com/32086/diff/7/1009#newcode2 Line 2: Test program to verify that GZIPInputStream is compatible with command line gunzip or java.util.zip.GZIPInputStream Please wrap lines at 80 chars, and use line comments rather than block comments. http://codereview.appspot.com/32086/diff/7/1009#newcode18 Line 18: FileInputStream* fin = new FileInputStream(STDIN_FILENO); You might as well stack-allocate these... http://codereview.appspot.com/32086/diff/7/1008 File src/google/protobuf/testing/zcgzip.cc (right): http://codereview.appspot.com/32086/diff/7/1008#newcode1 Line 1: /* Same comments as zcgunzip.cc. http://codereview.appspot.com/32086/diff/7/1008#newcode19 Line 19: void* buffer = malloc(BUFFLEN); This buffer is not used in the code. http://codereview.appspot.com/32086/diff/7/1006 File src/test_gzip_streams.sh (right): http://codereview.appspot.com/32086/diff/7/1006#newcode1 Line 1: #!/bin/sh -x Please put this file in src/google/protobuf/io and call it "gzip_stream_unittest.sh". http://codereview.appspot.com/32086/diff/7/1006#newcode5 Line 5: # test compatibility between command line gzip/gunzip binaries and ZeroCopyStream versions. wrap at 80 chars http://codereview.appspot.com/32086/diff/7/1006#newcode9 Line 9: # result of "(cmd) && (cmd)" implicitly becomes result of this script and thus the test. wrap at 80 chars
Sign in to reply to this message.
Things fixed, new patch uploaded. Includes changes to configure.ac and src/Makefile.am http://codereview.appspot.com/32086/diff/7/1013 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/7/1013#newcode8 Line 8: static const size_t GZIP_STREAM_BUFFER_SIZE = 64*1024; Made into argument of constructors with default of 64k. I think there are some zlib internals that function on chunks of 32 or 64k, and so this is a reasonable scale of buffer to use externally as well. http://codereview.appspot.com/32086/diff/7/1013#newcode13 Line 13: zcontext_.zalloc = Z_NULL; On 2009/04/02 21:48:38, kenton wrote: > Maybe use memset to zero-out this structure, so it's clear that you got > everything? > > Or could Z_NULL be something other than zero? (Why does zlib have its own NULL > definition?) Z_NULL is zero, but zlib.h is pretty clear that it expects certain field to be set to Z_NULL, so I went 'by the book'. http://codereview.appspot.com/32086/diff/7/1013#newcode21 Line 21: output_buffer_ = malloc(GZIP_STREAM_BUFFER_SIZE); On 2009/04/02 21:48:38, kenton wrote: > Use operator new, like: > output_buffer_ = operator new(GZIP_STREAM_BUFFER_SIZE); > > And delete with: > operator delete(output_buffer_); Done. http://codereview.appspot.com/32086/diff/7/1013#newcode74 Line 74: GOOGLE_LOG(INFO) << "GZIPInputStream::Next fails, zerror_=" << zerror_; On 2009/04/02 21:48:38, kenton wrote: > Is this necessary? Remember that some people use libprotobuf in non-server apps > where printing random messages to the console may be undesirable. Done. http://codereview.appspot.com/32086/diff/7/1013#newcode75 Line 75: return ok; On 2009/04/02 21:48:38, kenton wrote: > return false; Done. http://codereview.appspot.com/32086/diff/7/1013#newcode88 Line 88: if (zcontext_.next_out == NULL) { On 2009/04/02 21:48:38, kenton wrote: > Brief comment, please. Done. http://codereview.appspot.com/32086/diff/7/1013#newcode89 Line 89: return false; On 2009/04/02 21:48:38, kenton wrote: > If someone were to call Next() again after this return, I believe DoNextOutput() > would produce an invalid buffer with a negative size. Please correct this so > that Next() returns false on subsequent calls. Done. http://codereview.appspot.com/32086/diff/7/1013#newcode91 Line 91: if (zcontext_.avail_out != 0) { On 2009/04/02 21:48:38, kenton wrote: > Here too. Turns out this logic could be simplified out a bit. Done. http://codereview.appspot.com/32086/diff/7/1013#newcode99 Line 99: return ok; On 2009/04/02 21:48:38, kenton wrote: > return false; Done. http://codereview.appspot.com/32086/diff/7/1013#newcode129 Line 129: input_buffer_ = malloc(GZIP_STREAM_BUFFER_SIZE); On 2009/04/02 21:48:38, kenton wrote: > operator new Done. http://codereview.appspot.com/32086/diff/7/1013#newcode176 Line 176: // Always immediately notify lower layer of data. On 2009/04/02 21:48:38, kenton wrote: > Why? This seems like a waste if we aren't flushing. Changed to only happen on flush. Done. http://codereview.appspot.com/32086/diff/7/1013#newcode203 Line 203: // zlib is holding on to some state On 2009/04/02 21:48:38, kenton wrote: > My interpretation of the zlib docs is that this will never happen as long as > enough output space has been provided, which in our case it has. > > If I'm wrong and this can happen, the code you have in this block to handle it > looks dubious. You set *data to next_in + avail_in, but presumably it is then > pointing to the end of the input buffer; writing data there would be invalid. > > Can you please figure out under what circumstances this can happen and either > remove this code (replace with LOG(DFATAL)) if it can't or write a test that > exercises it if it can? Done. http://codereview.appspot.com/32086/diff/7/1014 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/7/1014#newcode24 Line 24: // Return last error message or NULL if no error. On 2009/04/02 21:48:38, kenton wrote: > Please place any new methods *above* the "implements ZeroCopyInputStream" > comment, since these aren't overriding methods of ZeroCopyInputStream. Done. http://codereview.appspot.com/32086/diff/7/1014#newcode58 Line 58: // Return last error message or NULL if no error. On 2009/04/02 21:48:38, kenton wrote: > Here too. Done. http://codereview.appspot.com/32086/diff/7/1014#newcode66 Line 66: // This moves more data towards the output, but does not guarantee a full On 2009/04/02 21:48:38, kenton wrote: > This comment is confusing. If Flush() can't guarantee that all data is flushed, > why would anyone use it? But from the zlib docs it looks like it actually does > guarantee this. Please reword. > > Please also mention that it is the caller's responsibility to flush the > underlying stream if necessary. Yes, zlib does actually create a coherent stream at the point of the flush, reworded most of the comment. Done. http://codereview.appspot.com/32086/diff/7/1014#newcode71 Line 71: // Writes out all data and closes the gzip stream. On 2009/04/02 21:48:38, kenton wrote: > Mention that it is the caller's responsibility to close the underlying stream if > necessary. Done. http://codereview.appspot.com/32086/diff/7/1011 File src/google/protobuf/io/zero_copy_stream_impl.h (right): http://codereview.appspot.com/32086/diff/7/1011#newcode434 Line 434: // Flushes any buffers but does not close the underlying file. On 2009/04/02 21:48:38, kenton wrote: > Is this necessary for this change, or just something you felt like adding? Not > that I mind, just wondering if I missed something. I added this for zcgzip.cc or another test while I was developing so that I could check the total number of bytes written out of a stream when that count wasn't updated until Flush. The only way to flush was to delete the object, and that didn't allow for getting the ByteCount(). Not actually used in remaining code, but still a good addition, eh? http://codereview.appspot.com/32086/diff/7/1012 File src/google/protobuf/io/zero_copy_stream_unittest.cc (right): http://codereview.appspot.com/32086/diff/7/1012#newcode282 Line 282: ArrayOutputStream output(buffer, kBufferSize, -1); On 2009/04/02 21:48:38, kenton wrote: > Use varying sizes for the ArrayOutputStream's block size. Otherwise this test > isn't really testing what happens when a write crosses a buffer boundary. > > It would also be nice to test varying sizes for the uncompressed buffers, as I > mentioned in the .cc file. Array block sizes and gzip buffer sizes testing added. http://codereview.appspot.com/32086/diff/7/1012#newcode341 Line 341: } On 2009/04/02 21:48:38, kenton wrote: > Add a blank line between the tests. Done. http://codereview.appspot.com/32086/diff/7/1012#newcode342 Line 342: TEST_F(IoTest, GzipFileIo) { On 2009/04/02 21:48:38, kenton wrote: > Why do we need a test that uses a file? This seems like it is testing unrelated > functionality, and unnecessarily hitting the filesystem. There were things uncovered by this test that didn't turn up in the other test, possibly due to buffer size vs file data size. http://codereview.appspot.com/32086/diff/7/1009 File src/google/protobuf/testing/zcgunzip.cc (right): http://codereview.appspot.com/32086/diff/7/1009#newcode2 Line 2: Test program to verify that GZIPInputStream is compatible with command line gunzip or java.util.zip.GZIPInputStream On 2009/04/02 21:48:38, kenton wrote: > Please wrap lines at 80 chars, and use line comments rather than block comments. Done. http://codereview.appspot.com/32086/diff/7/1009#newcode18 Line 18: FileInputStream* fin = new FileInputStream(STDIN_FILENO); On 2009/04/02 21:48:38, kenton wrote: > You might as well stack-allocate these... Done. http://codereview.appspot.com/32086/diff/7/1008 File src/google/protobuf/testing/zcgzip.cc (right): http://codereview.appspot.com/32086/diff/7/1008#newcode1 Line 1: /* On 2009/04/02 21:48:38, kenton wrote: > Same comments as zcgunzip.cc. Done. http://codereview.appspot.com/32086/diff/7/1008#newcode19 Line 19: void* buffer = malloc(BUFFLEN); On 2009/04/02 21:48:38, kenton wrote: > This buffer is not used in the code. Done. http://codereview.appspot.com/32086/diff/7/1006 File src/test_gzip_streams.sh (right): http://codereview.appspot.com/32086/diff/7/1006#newcode1 Line 1: #!/bin/sh -x On 2009/04/02 21:48:38, kenton wrote: > Please put this file in src/google/protobuf/io and call it > "gzip_stream_unittest.sh". Done. http://codereview.appspot.com/32086/diff/7/1006#newcode5 Line 5: # test compatibility between command line gzip/gunzip binaries and ZeroCopyStream versions. On 2009/04/02 21:48:38, kenton wrote: > wrap at 80 chars Done. http://codereview.appspot.com/32086/diff/7/1006#newcode9 Line 9: # result of "(cmd) && (cmd)" implicitly becomes result of this script and thus the test. On 2009/04/02 21:48:38, kenton wrote: > wrap at 80 chars Done.
Sign in to reply to this message.
Almost done. Thanks for doing this, BTW -- I think a lot of people should find it useful. What do you think about also supporting zlib streaming format (i.e. RFC 1950)? It's basically gzip minus unnecessary headers (like the original file name, etc.). I think it's just an option you pass to zlib somehow, so you should be able to add it as a constructor parameter. http://codereview.appspot.com/32086/diff/7/1015 File README.txt (right): http://codereview.appspot.com/32086/diff/7/1015#newcode56 Line 56: To compile fat binaries, configure as follows: On 2009/04/02 21:48:38, kenton wrote: > I'm not sure this belongs here. This is information that would apply to any > project, not just protobuf. You didn't address this. http://codereview.appspot.com/32086/diff/7/1015#newcode58 Line 58: ./configure CXXFLAGS="-arch ppc -arch ppc64 -arch i386 -arch x86_64" --enable-dependency-tracking=no On 2009/04/02 21:48:38, kenton wrote: > If we keep this line, please wrap it to 80 chars. End the first line with a > backslash so that it can still be directly copied and pasted into a shell. Or this. http://codereview.appspot.com/32086/diff/2006/2016 File configure.ac (right): http://codereview.appspot.com/32086/diff/2006/2016#newcode47 Line 47: AS_IF([test "x$with_zlib" != xno], Isn't the leading x unnecessary as long as you have quotes around the variable (as you do)? http://codereview.appspot.com/32086/diff/2006/2007 File src/Makefile.am (right): http://codereview.appspot.com/32086/diff/2006/2007#newcode4 Line 4: GZFLAGS = -DHAVE_ZLIB=1 We use autoheader, so HAVE_ZLIB will be #defined in config.h, so using -D shouldn't be necessary. http://codereview.appspot.com/32086/diff/2006/2007#newcode5 Line 5: GZHEADERS = google/protobuf/io/gzip_stream.h When HAVE_ZLIB is not enabled, we probably need to declare these unused source files in EXTRA_DIST so that they still end up in the distribution. Also, maybe we should have gzip_stream.cc in the sources either way but #ifdef out the entire body? This will probably be easier to deal with in the MSVC project files, which I don't think support conditional compiling of entire source files. http://codereview.appspot.com/32086/diff/2006/2007#newcode57 Line 57: $(GZHEADERS) \ Please line up the backslash. http://codereview.appspot.com/32086/diff/2006/2007#newcode72 Line 72: libprotobuf_la_LIBADD = $(PTHREAD_LIBS) Don't we still need -lz if gzip is enabled? http://codereview.appspot.com/32086/diff/2006/2007#newcode101 Line 101: $(GZSOURCES) \ Here too. http://codereview.appspot.com/32086/diff/2006/2014 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/2006/2014#newcode8 Line 8: GZIPInputStream::GZIPInputStream(ZeroCopyInputStream* sub_stream, int buffer_size) 80 char limit (maybe you should set your editor to detect this for you?) http://codereview.appspot.com/32086/diff/2006/2014#newcode19 Line 19: GOOGLE_CHECK(output_buffer_ != NULL); Nothing else in the library actually verifies that memory allocation returns non-null... it's kind of assumed that if this happens then we're screwed anyway. http://codereview.appspot.com/32086/diff/2006/2014#newcode68 Line 68: if ((!((zerror_ == Z_OK) || (zerror_ == Z_STREAM_END) I find it very hard to read a boolean expression with an outer complement. Can you factor out the "ok" variable like you had before? Below, too. http://codereview.appspot.com/32086/diff/2006/2014#newcode118 Line 118: GZIPOutputStream::GZIPOutputStream(ZeroCopyOutputStream* sub_stream, int buffer_size) 80 chars http://codereview.appspot.com/32086/diff/2006/2015 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/2006/2015#newcode15 Line 15: explicit GZIPInputStream(ZeroCopyInputStream* sub_stream, int buffer_size=65536); Please don't define the default value like this. This violates the Google C++ style guide. Instead, use a default of -1 here and check for it in the constructor body. http://codereview.appspot.com/32086/diff/2006/2015#newcode51 Line 51: explicit GZIPOutputStream(ZeroCopyOutputStream* sub_stream, int buffer_size=65536); Here too. http://codereview.appspot.com/32086/diff/2006/2015#newcode63 Line 63: // The underlying stream may need to be separately flushed. This sounds ambiguous, as if there are some cases where this stream may flush the underlying stream for you. Please use my wording: "It is the caller's responsibility to flush the underlying stream if necessary." http://codereview.appspot.com/32086/diff/2006/2015#newcode69 Line 69: // The underlying stream may need to be separately closed. same issue here http://codereview.appspot.com/32086/diff/2006/2011 File src/google/protobuf/io/zero_copy_stream_impl.h (right): http://codereview.appspot.com/32086/diff/2006/2011#newcode358 Line 358: bool Flush(); I don't see a definition for this one. What would Flush() do on an input stream, anyway? http://codereview.appspot.com/32086/diff/2006/2011#newcode437 Line 437: // Flushes any buffers but does not close the underlying file. Comment that this only flushes FileOutputStream's buffer. It does not ensure that the OS flushes the data to the physical disk. http://codereview.appspot.com/32086/diff/2006/2013 File src/google/protobuf/io/zero_copy_stream_unittest.cc (right): http://codereview.appspot.com/32086/diff/2006/2013#newcode62 Line 62: #if HAVE_ZLIB #include "config.h" to get this definition. http://codereview.appspot.com/32086/diff/2006/2013#newcode287 Line 287: if (gzip_buffer_size == -1) { This if block won't be necessary after comments are applied in gzip_stream.h.
Sign in to reply to this message.
Things replied to (I think I got them all) and new patch uploaded. http://codereview.appspot.com/32086/diff/7/1015 File README.txt (right): http://codereview.appspot.com/32086/diff/7/1015#newcode56 Line 56: To compile fat binaries, configure as follows: On 2009/04/03 22:03:52, kenton wrote: > On 2009/04/02 21:48:38, kenton wrote: > > I'm not sure this belongs here. This is information that would apply to any > > project, not just protobuf. > > You didn't address this. Ok, decided that the note wasn't that special. dropped. http://codereview.appspot.com/32086/diff/2006/2016 File configure.ac (right): http://codereview.appspot.com/32086/diff/2006/2016#newcode47 Line 47: AS_IF([test "x$with_zlib" != xno], On 2009/04/03 22:03:53, kenton wrote: > Isn't the leading x unnecessary as long as you have quotes around the variable > (as you do)? I was just parroting style from autoconf docs. works without the x. http://codereview.appspot.com/32086/diff/2006/2007 File src/Makefile.am (right): http://codereview.appspot.com/32086/diff/2006/2007#newcode4 Line 4: GZFLAGS = -DHAVE_ZLIB=1 On 2009/04/03 22:03:53, kenton wrote: > We use autoheader, so HAVE_ZLIB will be #defined in config.h, so using -D > shouldn't be necessary. Some .cc files needed to #include "config.h", which is done now. http://codereview.appspot.com/32086/diff/2006/2007#newcode5 Line 5: GZHEADERS = google/protobuf/io/gzip_stream.h On 2009/04/03 22:03:53, kenton wrote: > When HAVE_ZLIB is not enabled, we probably need to declare these unused source > files in EXTRA_DIST so that they still end up in the distribution. > > Also, maybe we should have gzip_stream.cc in the sources either way but #ifdef > out the entire body? This will probably be easier to deal with in the MSVC > project files, which I don't think support conditional compiling of entire > source files. Done. http://codereview.appspot.com/32086/diff/2006/2007#newcode57 Line 57: $(GZHEADERS) \ On 2009/04/03 22:03:53, kenton wrote: > Please line up the backslash. Done. http://codereview.appspot.com/32086/diff/2006/2007#newcode72 Line 72: libprotobuf_la_LIBADD = $(PTHREAD_LIBS) On 2009/04/03 22:03:53, kenton wrote: > Don't we still need -lz if gzip is enabled? It's found by autoconf and added to other LDFLAGS http://codereview.appspot.com/32086/diff/2006/2007#newcode101 Line 101: $(GZSOURCES) \ On 2009/04/03 22:03:53, kenton wrote: > Here too. Done. http://codereview.appspot.com/32086/diff/2006/2014 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/2006/2014#newcode8 Line 8: GZIPInputStream::GZIPInputStream(ZeroCopyInputStream* sub_stream, int buffer_size) On 2009/04/03 22:03:53, kenton wrote: > 80 char limit (maybe you should set your editor to detect this for you?) Done. http://codereview.appspot.com/32086/diff/2006/2014#newcode68 Line 68: if ((!((zerror_ == Z_OK) || (zerror_ == Z_STREAM_END) On 2009/04/03 22:03:53, kenton wrote: > I find it very hard to read a boolean expression with an outer complement. Can > you factor out the "ok" variable like you had before? Below, too. Done. http://codereview.appspot.com/32086/diff/2006/2014#newcode118 Line 118: GZIPOutputStream::GZIPOutputStream(ZeroCopyOutputStream* sub_stream, int buffer_size) On 2009/04/03 22:03:53, kenton wrote: > 80 chars Done. http://codereview.appspot.com/32086/diff/2006/2015 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/2006/2015#newcode15 Line 15: explicit GZIPInputStream(ZeroCopyInputStream* sub_stream, int buffer_size=65536); On 2009/04/03 22:03:53, kenton wrote: > Please don't define the default value like this. This violates the Google C++ > style guide. Instead, use a default of -1 here and check for it in the > constructor body. Done. http://codereview.appspot.com/32086/diff/2006/2015#newcode51 Line 51: explicit GZIPOutputStream(ZeroCopyOutputStream* sub_stream, int buffer_size=65536); On 2009/04/03 22:03:53, kenton wrote: > Here too. Done. http://codereview.appspot.com/32086/diff/2006/2015#newcode63 Line 63: // The underlying stream may need to be separately flushed. On 2009/04/03 22:03:53, kenton wrote: > This sounds ambiguous, as if there are some cases where this stream may flush > the underlying stream for you. Please use my wording: "It is the caller's > responsibility to flush the underlying stream if necessary." Done. http://codereview.appspot.com/32086/diff/2006/2015#newcode69 Line 69: // The underlying stream may need to be separately closed. On 2009/04/03 22:03:53, kenton wrote: > same issue here Done. http://codereview.appspot.com/32086/diff/2006/2011 File src/google/protobuf/io/zero_copy_stream_impl.h (right): http://codereview.appspot.com/32086/diff/2006/2011#newcode358 Line 358: bool Flush(); On 2009/04/03 22:03:53, kenton wrote: > I don't see a definition for this one. What would Flush() do on an input > stream, anyway? Done. http://codereview.appspot.com/32086/diff/2006/2011#newcode437 Line 437: // Flushes any buffers but does not close the underlying file. On 2009/04/03 22:03:53, kenton wrote: > Comment that this only flushes FileOutputStream's buffer. It does not ensure > that the OS flushes the data to the physical disk. Done. http://codereview.appspot.com/32086/diff/2006/2013 File src/google/protobuf/io/zero_copy_stream_unittest.cc (right): http://codereview.appspot.com/32086/diff/2006/2013#newcode62 Line 62: #if HAVE_ZLIB On 2009/04/03 22:03:53, kenton wrote: > #include "config.h" to get this definition. Done. http://codereview.appspot.com/32086/diff/2006/2013#newcode287 Line 287: if (gzip_buffer_size == -1) { On 2009/04/03 22:03:53, kenton wrote: > This if block won't be necessary after comments are applied in gzip_stream.h. Done.
Sign in to reply to this message.
Sorry, more issues. If you are sick of this, feel free to just let me make the remaining edits when I get time, though it may take longer before the patch gets submitted. http://codereview.appspot.com/32086/diff/19/20 File src/Makefile.am (right): http://codereview.appspot.com/32086/diff/19/20#newcode51 Line 51: google/protobuf/io/gzip_stream.h \ Sorry, I should have been clearer. I don't think we should install this header if we did not compile with gzip support enabled, so it should still be conditional (and EXTRA_DIST if it isn't listed in the headers). It's just the .cc file which doesn't need to be conditional, since it is #ifdef'd. http://codereview.appspot.com/32086/diff/19/27 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/19/27#newcode1 Line 1: #if HAVE_CONFIG_H Don't do "#if HAVE_CONFIG_H" -- we always have a config.h (I think, unless autoconf has some way to disable it?) and I don't think the MSVC project files define this. http://codereview.appspot.com/32086/diff/19/28 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/19/28#newcode13 Line 13: class LIBPROTOBUF_EXPORT GZIPInputStream : public ZeroCopyInputStream { Sorry for not catching this earlier, but please use correct capitalization here: "GZipInputStream", not "GZIPInputStream". Same for the output stream. (Though, arguably it should now be ZLibInputStream since it supports multiple formats... though GZipInputStream may be more recognizeable. What do you think?) http://codereview.appspot.com/32086/diff/19/28#newcode15 Line 15: // buffer_size and format may be -1 for default of 64kB and GZIP format No no, use -1 as the default argument. Don't force people to specify it manually. Like: GZipInputStream(ZeroCopyInputStream* sub_stream, int buffer_size = -1); Sorry for being unclear. Look at ArrayInputStream for an example. I don't think "format" should have a default -- make people be explicit about this. http://codereview.appspot.com/32086/diff/19/28#newcode17 Line 17: ZeroCopyInputStream* sub_stream, int buffer_size, int format); Make "format" be the second parameter, since it's more likely people will want to specify it than buffer_size. http://codereview.appspot.com/32086/diff/19/28#newcode23 Line 23: static const int GZIP; For type-safety, please use an enum rather than int constants. http://codereview.appspot.com/32086/diff/19/26 File src/google/protobuf/io/zero_copy_stream_unittest.cc (right): http://codereview.appspot.com/32086/diff/19/26#newcode49 Line 49: #if HAVE_CONFIG_H Remove the #if. http://codereview.appspot.com/32086/diff/19/26#newcode284 Line 284: TEST_F(IoTest, GzipIo) { Please make sure to test the ZLIB mode, at least by doing a round-trip. You could pretty easily add another nested for loop which does it. http://codereview.appspot.com/32086/diff/19/22 File src/google/protobuf/testing/zcgunzip.cc (right): http://codereview.appspot.com/32086/diff/19/22#newcode11 Line 11: #if HAVE_ZLIB This isn't necessary since we don't compile this file if zlib isn't enabled. (Same with zcgzip.cc.)
Sign in to reply to this message.
The big difference in this round is the constructor arguments. Also the format argument actually gets used and I added testing to make sure GZIP, ZLIB, and AUTO formats all work. I didn't do the global find-replace on GZIP*Stream yet, as I think there's an issue about class name and default arguments to resolve. http://codereview.appspot.com/32086/diff/19/20 File src/Makefile.am (right): http://codereview.appspot.com/32086/diff/19/20#newcode51 Line 51: google/protobuf/io/gzip_stream.h \ On 2009/04/07 23:18:49, kenton wrote: > Sorry, I should have been clearer. I don't think we should install this header > if we did not compile with gzip support enabled, so it should still be > conditional (and EXTRA_DIST if it isn't listed in the headers). It's just the > .cc file which doesn't need to be conditional, since it is #ifdef'd. Done. http://codereview.appspot.com/32086/diff/19/28 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/19/28#newcode13 Line 13: class LIBPROTOBUF_EXPORT GZIPInputStream : public ZeroCopyInputStream { On 2009/04/07 23:18:49, kenton wrote: > Sorry for not catching this earlier, but please use correct capitalization here: > "GZipInputStream", not "GZIPInputStream". Same for the output stream. > (Though, arguably it should now be ZLibInputStream since it supports multiple > formats... though GZipInputStream may be more recognizeable. What do you > think?) My goal was to be conceptually and data compatible with http://java.sun.com/j2se/1.5.0/docs/api/java/util/zip/GZIPInputStream.html Java has other names for the raw zlib streams, "InflaterInputStream" "DeflaterOutputStream". One could also contemplate 'Zlib' (lower L) Input and Output streams. If the default mode is going to continue to be with the gzip header (as I'd prefer, I think command-line gzip compatibility is also a win) then I think the name of the class should still reflect gzip, probably 'Gzip' Input and Output Streams to go with the preferred capitalization. http://codereview.appspot.com/32086/diff/19/28#newcode15 Line 15: // buffer_size and format may be -1 for default of 64kB and GZIP format On 2009/04/07 23:18:49, kenton wrote: > No no, use -1 as the default argument. Don't force people to specify it > manually. Like: > > GZipInputStream(ZeroCopyInputStream* sub_stream, int buffer_size = -1); > > Sorry for being unclear. Look at ArrayInputStream for an example. > > I don't think "format" should have a default -- make people be explicit about > this. As above, if the class name is 'GzipInputStream' then gzip format is a logical default. (arguments changed to have default of -1) http://codereview.appspot.com/32086/diff/19/28#newcode17 Line 17: ZeroCopyInputStream* sub_stream, int buffer_size, int format); On 2009/04/07 23:18:49, kenton wrote: > Make "format" be the second parameter, since it's more likely people will want > to specify it than buffer_size. Pending further contemplation of class names and defaults. http://codereview.appspot.com/32086/diff/19/28#newcode23 Line 23: static const int GZIP; On 2009/04/07 23:18:49, kenton wrote: > For type-safety, please use an enum rather than int constants. Done.
Sign in to reply to this message.
http://codereview.appspot.com/32086/diff/19/28 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/19/28#newcode13 Line 13: class LIBPROTOBUF_EXPORT GZIPInputStream : public ZeroCopyInputStream { On 2009/04/09 01:03:01, brian.olson wrote: > On 2009/04/07 23:18:49, kenton wrote: > > Sorry for not catching this earlier, but please use correct capitalization > here: > > "GZipInputStream", not "GZIPInputStream". Same for the output stream. > > (Though, arguably it should now be ZLibInputStream since it supports multiple > > formats... though GZipInputStream may be more recognizeable. What do you > > think?) > > My goal was to be conceptually and data compatible with > http://java.sun.com/j2se/1.5.0/docs/api/java/util/zip/GZIPInputStream.html > > Java has other names for the raw zlib streams, "InflaterInputStream" > "DeflaterOutputStream". One could also contemplate 'Zlib' (lower L) Input and > Output streams. If the default mode is going to continue to be with the gzip > header (as I'd prefer, I think command-line gzip compatibility is also a win) > then I think the name of the class should still reflect gzip, probably 'Gzip' > Input and Output Streams to go with the preferred capitalization. OK, sounds good to me. (I don't understand why Java used that capitalization, since gzip is not an acronym. I'm happy with "Gzip".) http://codereview.appspot.com/32086/diff/46/2029 File src/Makefile.am (right): http://codereview.appspot.com/32086/diff/46/2029#newcode5 Line 5: GZHEADERS = This should not be blank. http://codereview.appspot.com/32086/diff/46/2036 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/46/2036#newcode1 Line 1: #include "config.h" Please make sure all your new files have copyright notices matching the rest of the library. http://codereview.appspot.com/32086/diff/46/2036#newcode58 Line 58: case GZIP: windowBitsFormat = 16; break; I don't see these bits mentioned in the zlib docs. Where did you find them? http://codereview.appspot.com/32086/diff/46/2036#newcode61 Line 61: default: GOOGLE_CHECK(false); You can remove this default case. GCC will recognize that you have cases covering all possible values of format_ (and would produce a warning otherwise). You will need to initialize windowBitsFormat to zero to avoid an uninitialized variable warning, though. http://codereview.appspot.com/32086/diff/46/2037 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/46/2037#newcode72 Line 72: GZIP = -1, You can make this be 1, or even omit the numeric values for these enums. The style guide doesn't say that default values have to be -1, just that they shouldn't be magic numbers. http://codereview.appspot.com/32086/diff/46/2035 File src/google/protobuf/io/zero_copy_stream_unittest.cc (right): http://codereview.appspot.com/32086/diff/46/2035#newcode292 Line 292: GZIPOutputStream gzout(&output, GZIPOutputStream::GZIP, gzip_buffer_size); You have a bunch of lines crossing 80 chars again. Please fix. http://codereview.appspot.com/32086/diff/46/2035#newcode334 Line 334: TEST_F(IoTest, ZlibIoInputAutodetect) { For this test I think it's unnecessary to loop through all combinations of block sizes, since the block size and the auto-detection logic are not really related.
Sign in to reply to this message.
new sources have header copied from similar sources. s/GZIP/Gzip/g and a few other cleanups. http://codereview.appspot.com/32086/diff/46/2029 File src/Makefile.am (right): http://codereview.appspot.com/32086/diff/46/2029#newcode5 Line 5: GZHEADERS = On 2009/04/09 19:43:38, kenton wrote: > This should not be blank. Done. http://codereview.appspot.com/32086/diff/46/2036 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/46/2036#newcode1 Line 1: #include "config.h" On 2009/04/09 19:43:38, kenton wrote: > Please make sure all your new files have copyright notices matching the rest of > the library. Done. http://codereview.appspot.com/32086/diff/46/2036#newcode58 Line 58: case GZIP: windowBitsFormat = 16; break; On 2009/04/09 19:43:38, kenton wrote: > I don't see these bits mentioned in the zlib docs. Where did you find them? I found it in the description of inflateInit2 in my /usr/include/zlib.h on either Mac or Linux. http://codereview.appspot.com/32086/diff/46/2036#newcode61 Line 61: default: GOOGLE_CHECK(false); On 2009/04/09 19:43:38, kenton wrote: > You can remove this default case. GCC will recognize that you have cases > covering all possible values of format_ (and would produce a warning otherwise). > You will need to initialize windowBitsFormat to zero to avoid an uninitialized > variable warning, though. Done. http://codereview.appspot.com/32086/diff/46/2037 File src/google/protobuf/io/gzip_stream.h (right): http://codereview.appspot.com/32086/diff/46/2037#newcode72 Line 72: GZIP = -1, On 2009/04/09 19:43:38, kenton wrote: > You can make this be 1, or even omit the numeric values for these enums. The > style guide doesn't say that default values have to be -1, just that they > shouldn't be magic numbers. Done. http://codereview.appspot.com/32086/diff/46/2035 File src/google/protobuf/io/zero_copy_stream_unittest.cc (right): http://codereview.appspot.com/32086/diff/46/2035#newcode292 Line 292: GZIPOutputStream gzout(&output, GZIPOutputStream::GZIP, gzip_buffer_size); On 2009/04/09 19:43:38, kenton wrote: > You have a bunch of lines crossing 80 chars again. Please fix. Done. http://codereview.appspot.com/32086/diff/46/2035#newcode334 Line 334: TEST_F(IoTest, ZlibIoInputAutodetect) { On 2009/04/09 19:43:38, kenton wrote: > For this test I think it's unnecessary to loop through all combinations of block > sizes, since the block size and the auto-detection logic are not really related. Done.
Sign in to reply to this message.
Just two trivial things left... http://codereview.appspot.com/32086/diff/6001/6009 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/6001/6009#newcode31 Line 31: // Author: kenton@google.com (Kenton Varda) Put yourself as the author of the new files, not me. :) (Feel free to use your @google.com e-mail address, or omit it if you prefer.) http://codereview.appspot.com/32086/diff/6001/6007 File src/google/protobuf/io/gzip_stream_unittest.sh (right): http://codereview.appspot.com/32086/diff/6001/6007#newcode1 Line 1: #!/bin/sh -x This file needs a copyright notice too.
Sign in to reply to this message.
Done http://codereview.appspot.com/32086/diff/6001/6009 File src/google/protobuf/io/gzip_stream.cc (right): http://codereview.appspot.com/32086/diff/6001/6009#newcode31 Line 31: // Author: kenton@google.com (Kenton Varda) On 2009/04/14 00:46:47, kenton wrote: > Put yourself as the author of the new files, not me. :) (Feel free to use your > @google.com e-mail address, or omit it if you prefer.) Done. http://codereview.appspot.com/32086/diff/6001/6007 File src/google/protobuf/io/gzip_stream_unittest.sh (right): http://codereview.appspot.com/32086/diff/6001/6007#newcode1 Line 1: #!/bin/sh -x On 2009/04/14 00:46:47, kenton wrote: > This file needs a copyright notice too. Done.
Sign in to reply to this message.
|