Description: crypto/tls: make Conn.Read return (n, io.EOF) when EOF is next in buffer
Update Issue 3514
An io.Reader is permitted to return either (n, nil)
or (n, io.EOF) on EOF or other error.
The tls package previously always returned (n, nil) for a read
of size n if n bytes were available, not surfacing errors at
the same time.
Amazon's HTTPS frontends like to hang up on clients without
sending the appropriate HTTP headers. (In their defense,
they're allowed to hang up any time, but generally a server
hangs up after a bit of inactivity, not immediately.) In any
case, the Go HTTP client tries to re-use connections by
looking at whether the response headers say to keep the
connection open, and because the connection looks okay, under
heavy load it's possible we'll reuse it immediately, writing
the next request, just as the Transport's always-reading
goroutine returns from tls.Conn.Read and sees (0, io.EOF).
But because Amazon does send an AlertCloseNotify record before
it hangs up on us, and the tls package does its own internal
buffering (up to 1024 bytes) of pending data, we have the
AlertCloseNotify in an unread buffer when our Conn.Read (to
the HTTP Transport code) reads its final bit of data in the
HTTP response body.
This change makes that final Read return (n, io.EOF) when
an AlertCloseNotify record is buffered right after, if we'd
otherwise return (n, nil).
A dependent change in the HTTP code then notes whether a
client connection has seen an io.EOF and uses that as an
additional signal to not reuse an HTTPS connection. With both
changes, the majority of Amazon request failures go
away. Without either one, 10-20 goroutines hitting the S3 API
leads to such an error rate that empirically up to 5 retries
are needed to complete an API call.
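
To make the (n, io.EOF) convention concrete, here is a minimal, hypothetical caller-side sketch (not taken from the CL): a conforming io.Reader caller has to consume the n bytes before acting on the error, so returning io.EOF together with the final bytes loses nothing and simply surfaces the close one Read earlier.

package main

import (
	"fmt"
	"io"
	"strings"
)

// readAll drains r, handling both (n, nil) and (n, io.EOF) returns.
// Data returned alongside io.EOF is still valid and must be used.
func readAll(r io.Reader) ([]byte, error) {
	var out []byte
	buf := make([]byte, 16)
	for {
		n, err := r.Read(buf)
		out = append(out, buf[:n]...) // consume the data first, even on error
		if err == io.EOF {
			return out, nil // EOF may arrive together with the final bytes
		}
		if err != nil {
			return out, err
		}
	}
}

func main() {
	body, err := readAll(strings.NewReader("final bytes of an HTTP response body"))
	fmt.Printf("%q %v\n", body, err)
}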
Patch Set 1
Patch Set 2: diff -r 62052ebe728b https://go.googlecode.com/hg/
Patch Set 3: diff -r 62052ebe728b https://go.googlecode.com/hg/
Total comments: 1
Patch Set 4: diff -r 4450c84265f0 https://go.googlecode.com/hg/
Messages
Total messages: 14
Hello agl@golang.org (cc: golang-codereviews@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg/
Most sites don't send close notify alerts so this seems rather AWS specific. (In fact, I'm pretty surprised that they do.) Also, isn't this only going to work if the close-notify is in the buffer at read time? What if it's slightly delayed (as will frequently be the case because they are usually in their own packet)?

Additionally, changing the behaviour of Read is a little worrisome this late in the 1.3 process, even if it is within the Reader contract. I think that's a minor point though.
On Mon, Mar 24, 2014 at 7:29 AM, <agl@golang.org> wrote:
> Most sites don't send close notify alerts so this seems rather AWS
> specific. (In fact, I'm pretty surprised that they do.) Also, isn't this
> only going to work if the close-notify is in the buffer at read time?
> What if it's slightly delayed (as will frequently be the case because
> they are usually in their own packet)?

I can't say whether it's AWS-specific or not, but if it is, then empirically the close-notify record is ~always there in the buffer. Perhaps their load balancer makes its keep-alive decision while the response body is still being proxied and then sends the close-notify before its final flush. Or maybe we're just getting lucky timing-wise and the two packets are getting combined before we do the raw net.Conn.Read. Without the support for writing per-session keys to disk for Wireshark, I didn't actually look at packets.

In any case, this change makes a night & day difference around whether 20 goroutines can pound on AWS APIs without errors. Reportedly, this includes CloudFront and any site on the web behind it. Previously I had a loop around all API calls, retrying 5 times, because 4 times wasn't always enough. It was that flaky before.

I agree that there's still an inherent HTTP race, though, since HTTP has no lame duck mechanism. With an idle keep-alive connection, one side can decide to write while the other side decides to close. Retrying requests in that case is https://code.google.com/p/go/issues/detail?id=4677 but that's not for Go 1.3. I was hoping this change was safe enough for Go 1.3, though.

> Additionally, changing the behaviour of Read is a little worrisome this
> late in the 1.3 process, even if it is within the Reader contract. I
> think that's a minor point though.

FWIW, I did this to a few other Readers in Go 1.3, and instead of problems, we actually got a nice performance surprise (https://codereview.appspot.com/49570044) because ioutil.ReadAll works better when it sees the EOF earlier.
On 2014/03/24 15:54:24, bradfitz wrote:
> In any case, this change makes a night & day difference around whether 20
> goroutines can pound on AWS APIs without errors. Reportedly, this includes
> CloudFront and any site on the web behind it. Previously I had a loop
> around all API calls, retrying 5 times, because 4 times wasn't always
> enough. It was that flaky before.
>
> I agree that there's still an inherent HTTP race, though, since HTTP has no
> lame duck mechanism. With an idle keep-alive connection, one side can
> decide to write while the other side decides to close. Retrying requests
> in that case is https://code.google.com/p/go/issues/detail?id=4677 but
> that's not for Go 1.3. I was hoping this change was safe enough for Go 1.3,
> though.

I do think that this change is nonsense from a theoretical point of view, but perhaps it's a reasonable bandaid for 1.3. If you are planning on doing retries at the HTTP layer for 1.4, then I guess this is ok, although please add a TODO(bradfitz): remove for 1.4.

LGTM.
On Mon, Mar 24, 2014 at 9:38 AM, <agl@golang.org> wrote:
> I do think that this change is nonsense from a theoretical point of
> view, but perhaps it's a reasonable bandaid for 1.3.
>
> If you are planning on doing retries at the HTTP layer for 1.4, then I
> guess this is ok, although please add a TODO(bradfitz): remove for 1.4.

I don't think this is a bandaid, and I don't want to see it removed in Go 1.4. If we've seen the end, I still want to return (n, EOF) instead of (n, nil). It conveys more information.

It's better to let the http package (or anything using TLS) know immediately to not re-use a connection, rather than try to write to it and fail. The HTTP write-retry mechanism would be a last-ditch effort. But if there's an obvious signal (this CL) to avoid failing in the first place, it's much better to not write the request at all, especially since many requests aren't retry-able, like POSTs or anything with an io.Reader Body (which we can only write once, or buffer).

I'm happy to have your LGTM but I'd rather not submit until we're on the same page. Why is this change nonsense from a theoretical point of view? I didn't feel that there was anything wrong or hacky in it.
this seems okay but it makes me uneasy. yes, crypto/tls is allowed to behave this way, but why does the http package need it to? it seems like there is a bug in http. why is there no read pending? does the read not start until after writing the next request? why is that? can the read be started earlier? that will fix a broader class of related notice-EOF-early bugs (https and also http).

russ

https://codereview.appspot.com/76400046/diff/40001/src/pkg/crypto/tls/conn.go
File src/pkg/crypto/tls/conn.go (right):

src/pkg/crypto/tls/conn.go:455: // that we're throwing away r.Read's err
here. it's okay, you got the data. the next read should return the same error.
On Mon, Mar 24, 2014 at 3:47 PM, <rsc@golang.org> wrote:
> this seems okay but it makes me uneasy. yes, crypto/tls is allowed to
> behave this way, but why does the http package need it to? it seems like
> there is a bug in http. why is there no read pending? does the read not
> start until after writing the next request? why is that? can the read be
> started earlier? that will fix a broader class of related
> notice-EOF-early bugs (https and also http).

The http package doesn't need it, but it gives it more information.

The http transport code is always either reading a response, or peeking a byte waiting for a close. It makes the decision about whether or not the connection is reusable after the response has been read, and before the 1-byte Read begins. Imagine the Content-Length was 100 bytes: we read the headers, find the 100-byte content-length, and then we Read(buf[:100]). crypto/tls then returns (100, nil). The Response headers say keep-alive. Great. We loop around and begin the (blocking!) 1-byte peek, waiting either to be hung up on or for the response to our next request.

We could start an n-millisecond timer waiting for the Read(1 byte) to not say anything (and not hang up on us) before we put that connection back into the free pool, but that is gross for a number of reasons: it unnecessarily delays the majority of connections, and it introduces a new class of connection (in-use, free, and maybe-free-soon) which complicates a lot of the free pool get/put code. Or at least it complicates the dial code, which is another way it could be done, if connections in the maybe-reusable state are plucked off by the dial code before new connections are made.

I do not believe there is a bug in net/http here. This CL provides new information that the tls package already has, just at the time that http could use it: before it decides to reuse a connection, as it's transitioning from reading a response to deciding whether to kill the connection or freelist it and go back to reading a byte. I can't read a byte before deciding to freelist it, because that might block in the common case.
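
To illustrate the flow described above, here is a rough, hypothetical sketch (the names persistConn, readBody, sawEOF, and putIdle are made up for this example and are not net/http's actual API): with this CL, the body read can observe io.EOF together with the final bytes, so the reuse decision can be made before the connection ever goes back to the pool.

package sketch

import (
	"io"
	"net"
)

// persistConn is a hypothetical stand-in for an HTTP client connection.
type persistConn struct {
	conn   net.Conn
	sawEOF bool // set when a body read ends in (n, io.EOF)
}

// readBody reads up to contentLength bytes of response body.
// With this CL, tls.Conn.Read may return io.EOF together with the
// final bytes, so we can record that the server has hung up.
// (Error handling for a truncated body is elided in this sketch.)
func (pc *persistConn) readBody(contentLength int) ([]byte, error) {
	buf := make([]byte, contentLength)
	total := 0
	for total < contentLength {
		n, err := pc.conn.Read(buf[total:])
		total += n
		if err == io.EOF {
			pc.sawEOF = true // close-notify was buffered right behind the data
			break
		}
		if err != nil {
			return buf[:total], err
		}
	}
	return buf[:total], nil
}

// finish decides whether the connection goes back to the idle pool
// (where a goroutine blocks on a 1-byte peek) or is closed immediately.
func (pc *persistConn) finish(putIdle func(*persistConn)) {
	if pc.sawEOF {
		pc.conn.Close() // the peer already said goodbye; don't reuse
		return
	}
	putIdle(pc)
	go func() {
		var b [1]byte
		if _, err := pc.conn.Read(b[:]); err != nil {
			pc.conn.Close() // hangup noticed on the idle connection
		}
	}()
}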
On 2014/03/24 23:09:16, bradfitz wrote:
> The http package doesn't need it, but it gives it more information.

Indeed, it's information that crypto/tls has, so why not expose it? The problem is that I feel that this is fragile and creates perverse dependencies. If you're depending on it then it could well be that subtle changes to timing or the network code suddenly `break' this behaviour. That will create a negative effect for users, and so remote parts of the code base suddenly become more dangerous to work on.

I searched for any URL on S3 and found https://jets3t.s3.amazonaws.com/toolkit/configuration.html. I ran Wireshark on a fetch of that URL and I see the last byte of data arriving at 1.281 seconds from the SYN and the close-notify at 2.263. So I believe that S3 has a connection timeout of 1 second, and I'm surprised that you see the close-notify queued when reading the end of the response.

What we really want is to do a non-blocking read on the network to see whether a FIN or close-notify has appeared. As you note, an n-millisecond timer is a bit ugly, but non-blocking reads aren't exposed from net, although perhaps SetDeadline can be used to make an inefficient one. Having goroutines reading from the open sockets would do it, but I assume that's too large a change for 1.3 at this point. It's a shame that HTTP/1.1 didn't have servers advertise their timeout in responses.

So that's why I have reservations and thought that it should be a temporary fix. But I'll leave it up to you because this does sound like an issue with significant, practical implications for people.
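
As a rough illustration of the SetDeadline idea above (purely hypothetical: the function name and the 1ms figure are invented, and a real implementation would have to deal with the byte this sketch can swallow):

package main

import (
	"fmt"
	"net"
	"time"
)

// peerHungUp reports whether the peer appears to have already closed or
// written on an otherwise idle connection. A read timeout is treated as
// "nothing pending"; anything else means the connection should not be
// reused. Note this sketch discards any byte it happens to read, which is
// one reason a real implementation would not look like this.
func peerHungUp(c net.Conn) bool {
	c.SetReadDeadline(time.Now().Add(time.Millisecond))
	defer c.SetReadDeadline(time.Time{}) // clear the deadline again

	var b [1]byte
	_, err := c.Read(b[:])
	if ne, ok := err.(net.Error); ok && ne.Timeout() {
		return false // deadline expired with nothing to read: looks reusable
	}
	return true // EOF, close-notify, or unexpected data: don't reuse
}

func main() {
	client, server := net.Pipe()
	server.Close() // simulate the peer hanging up on an idle connection
	fmt.Println("peer hung up:", peerHungUp(client)) // prints true
}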
LGTM

Adam, I think you and I are viewing this the same way, but after a chat with Brad I think I understand better what's going on and why this is reasonable. Perhaps this will help.

In package http there *are* goroutines reading from all the open cached network connections, even when there are no pending requests. If a goroutine reader observes a hangup, it closes the connection and removes it from the cache/pool. But there is a fundamental race where a client might try to use the connection at the same time that the reader observes the hangup and starts the teardown, and we don't know whether the request made it over to the server or not. Brad says Chrome has various heuristics for which requests are idempotent enough to retry (you can't just do it with all GETs); the Go package never retries in this case, to avoid maintaining the heuristics.

So for the general case things should be working pretty well. But the race means that it can't always work well. What Brad seems to be seeing (in contrast to what you see) is that on some requests Amazon decides to close the connection immediately, despite generally saying it will keep things alive. In that case it has to do a TCP linger wait to send the final FIN, but it can and does send the close notify earlier. Maybe we get slow enough that we get both the response and the close notify before we bother to read either, or maybe in some cases Amazon sends them together. Let's just suppose that happens somehow. Then what happens is this:

A. goroutine 1
   - receive HTTPS response
   - put HTTPS connection back in pool

B. goroutine 2
   - start request
   - take HTTPS connection out of pool to use, get same one
   - send request

C. goroutine 3 (reader for that connection)
   - observe close-notify
   - poison and close HTTPS connection, remove from pool

The race here is between B and C, which are both purely local execution if you assume the close notify has already arrived. Really it comes down to which goroutine manages to run first. If C manages to tear down the connection and pull it from the pool before B attempts to reuse it, all is well. But if not, B sees itself write a request and then the connection dies. It's a narrow race, nothing as dumb as we're just sitting on connections not noticing the close at all, but if you pound on the system it can happen. I have gotten a report off list about seeing the same behavior from IIS, so I don't think it's anything Amazon is doing wrong per se.

So the right things are already going on as far as tearing down closed connections promptly; it's just that there's always an opportunity for a race, and Brad's CL is arranging for this specific instance of the race not to happen anymore.

This problem is exacerbated by HTTPS because when a goroutine needs a new connection, we kick off a new dial+TLS handshake but then wait for either that to complete or a connection to be dropped back into the pool for reuse, whichever happens first. So B above might be waiting for a dial+TLS and then A comes along and drops this not-really-useful connection into the pool and B says "oh look! I'll use this one right now!" And it gets there before C pulls the connection back out of the pool.

Given that the basic background observation of closes is in fact present and working (and has been for multiple releases), I think there's not much harm to nibbling away at the race windows. This CL hits one that seems to be arising in practice.
Brad, can you please make the comment:

// If a close-notify alert is waiting, read it so that we can return (n, EOF)
// instead of (n, nil), to signal to the HTTP response reading goroutine that
// the connection is now closed. This eliminates a race where the HTTP
// response reading goroutine would otherwise not observe the EOF until its
// next read, by which time a client goroutine might have already tried to
// reuse the HTTP connection for a new request.
// See https://codereview.appspot.com/76400046 and golang.org/issue/3514.

Thanks.
Russ
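
To make the behaviour that comment describes concrete, here is a small, self-contained sketch of the general pattern rather than the actual crypto/tls patch: a Read that peeks ahead in its own buffer and, when it can already tell that EOF is next, returns it together with the data.

package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// eagerEOFReader wraps a buffered reader and returns (n, io.EOF) from Read
// when it can already tell that EOF is next in the buffer, instead of
// waiting for the caller's next Read. This is the same shape of behaviour
// the CL gives tls.Conn.Read when a close-notify alert is buffered right
// behind the final data.
type eagerEOFReader struct {
	br *bufio.Reader
}

func (r *eagerEOFReader) Read(p []byte) (int, error) {
	n, err := r.br.Read(p)
	if n > 0 && err == nil {
		// Peek to see whether the stream has already ended; if so,
		// surface io.EOF together with the data we just returned.
		if _, peekErr := r.br.Peek(1); peekErr == io.EOF {
			err = io.EOF
		}
	}
	return n, err
}

func main() {
	r := &eagerEOFReader{br: bufio.NewReader(strings.NewReader("response body"))}
	buf := make([]byte, 64)
	n, err := r.Read(buf)
	fmt.Printf("%q %v\n", buf[:n], err) // data and EOF arrive in one call
}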
Thanks for the writeup, Russ. LGTM.
*** Submitted as https://code.google.com/p/go/source/detail?r=2e0323211d75 ***

crypto/tls: make Conn.Read return (n, io.EOF) when EOF is next in buffer

Update Issue 3514
(commit message as in the description above)

LGTM=agl, rsc
R=agl, rsc
CC=golang-codereviews
https://codereview.appspot.com/76400046
Message was sent while issue was closed.
This CL appears to have broken the darwin-amd64-cheney builder. See http://build.golang.org/log/6591ea870ca7b750a512c71d8dc70425dcd55c03
This is a real failure:

# ../test/bench/go1
ok  _/Users/builder/workspace/darwin-amd64-cheney-2e0323211d75/go/test/bench/go1  2.660s

# ../test
Build complete, duration 1h0m0.003083969s. Result: error: timed out after 1h0m0s

The builder normally takes less than 10 minutes for the whole thing, but something in ./test is wedging.

On Thu, Mar 27, 2014 at 6:38 AM, <gobot@golang.org> wrote:
> This CL appears to have broken the darwin-amd64-cheney builder.
> See http://build.golang.org/log/6591ea870ca7b750a512c71d8dc70425dcd55c03
>
> https://codereview.appspot.com/76400046/
This has nothing to do with this tls change, though.

On Wed, Mar 26, 2014 at 12:48 PM, Dave Cheney <dave@cheney.net> wrote:
> This is a real failure
>
> # ../test/bench/go1
> ok  _/Users/builder/workspace/darwin-amd64-cheney-2e0323211d75/go/test/bench/go1  2.660s
>
> # ../test
> Build complete, duration 1h0m0.003083969s. Result: error: timed out after 1h0m0s
>
> The builder normally takes less than 10 minutes for the whole thing,
> but something in ./test is wedging