Description: crypto/tls: make Conn.Read return (n, io.EOF) when EOF is next in buffer
Update Issue 3514
An io.Reader is permitted to return either (n, nil)
or (n, io.EOF) on EOF or other error.
The tls package previously always returned (n, nil) for a read
of size n if n bytes were available, not surfacing errors at
the same time.
Amazon's HTTPS frontends like to hang up on clients without
sending the appropriate HTTP headers. (In their defense,
they're allowed to hang up any time, but generally a server
hangs up after a bit of inactivity, not immediately.) In any
case, the Go HTTP client tries to re-use connections by
looking at whether the response headers say to keep the
connection open, and because the connection looks okay, under
heavy load it's possible we'll reuse it immediately, writing
the next request, just as the Transport's always-reading
goroutine returns from tls.Conn.Read and sees (0, io.EOF).
But because Amazon does send an AlertCloseNotify record before
it hangs up on us, and the tls package does its own internal
buffering (up to 1024 bytes) of pending data, we have the
AlertCloseNotify in an unread buffer when our Conn.Read (to
the HTTP Transport code) reads its final bit of data in the
HTTP response body.
This change makes that final Read return (n, io.EOF) when
an AlertCloseNotify record is buffered right after, if we'd
otherwise return (n, nil).
A dependent change in the HTTP code then notes whether a
client connection has seen an io.EOF and uses that as an
additional signal to not reuse an HTTPS connection. With both
changes, the majority of Amazon request failures go
away. Without either one, 10-20 goroutines hitting the S3 API
leads to such an error rate that empirically up to 5 retries
are needed to complete an API call.
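
To make the (n, io.EOF) convention concrete, here is a minimal, hypothetical caller-side sketch (not taken from the CL): a conforming io.Reader caller has to consume the n bytes before acting on the error, so returning io.EOF together with the final bytes loses nothing and simply surfaces the close one Read earlier.

package main

import (
	"fmt"
	"io"
	"strings"
)

// readAll drains r, handling both (n, nil) and (n, io.EOF) returns.
// Data returned alongside io.EOF is still valid and must be used.
func readAll(r io.Reader) ([]byte, error) {
	var out []byte
	buf := make([]byte, 16)
	for {
		n, err := r.Read(buf)
		out = append(out, buf[:n]...) // consume the data first, even on error
		if err == io.EOF {
			return out, nil // EOF may arrive together with the final bytes
		}
		if err != nil {
			return out, err
		}
	}
}

func main() {
	body, err := readAll(strings.NewReader("final bytes of an HTTP response body"))
	fmt.Printf("%q %v\n", body, err)
}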
Patch Set 1
Patch Set 2: diff -r 62052ebe728b https://go.googlecode.com/hg/
Patch Set 3: diff -r 62052ebe728b https://go.googlecode.com/hg/
Total comments: 1
Patch Set 4: diff -r 4450c84265f0 https://go.googlecode.com/hg/
Messages
Total messages: 14
Hello agl@golang.org (cc: golang-codereviews@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg/
Most sites don't send close notify alerts so this seems rather AWS specific. (In fact, I'm pretty surprised that they do.) Also, isn't this only going to work if the close-notify is in the buffer at read time? What if it's slightly delayed (as will frequently be the case because they are usually in their own packet)?

Additionally, changing the behaviour of Read is a little worrisome this late in the 1.3 process, even if it is within the Reader contract. I think that's a minor point though.
On Mon, Mar 24, 2014 at 7:29 AM, <agl@golang.org> wrote:
> Most sites don't send close notify alerts so this seems rather AWS
> specific. (In fact, I'm pretty surprised that they do.) Also, isn't this
> only going to work if the close-notify is in the buffer at read time?
> What if it's slightly delayed (as will frequently be the case because
> they are usually in their own packet)?

I can't say whether it's AWS-specific or not, but if it is, then empirically the close-notify record is ~always there in the buffer. Perhaps their load balancer makes its keep-alive decision while the response body is still being proxied and then sends the close-notify before its final flush. Or maybe we're just getting lucky timing-wise and the two packets are getting combined before we do the raw net.Conn.Read. Without the support for writing per-session keys to disk for Wireshark, I didn't actually look at packets.

In any case, this change makes a night & day difference around whether 20 goroutines can pound on AWS APIs without errors. Reportedly, this includes CloudFront and any site on the web behind it. Previously I had a loop around all API calls, retrying 5 times, because 4 times wasn't always enough. It was that flaky before.

I agree that there's still an inherent HTTP race, though, since HTTP has no lame duck mechanism. With an idle keep-alive connection, one side can decide to write while the other side decides to close. Retrying requests in that case is https://code.google.com/p/go/issues/detail?id=4677 but that's not for Go 1.3. I was hoping this change was safe enough for Go 1.3, though.

> Additionally, changing the behaviour of Read is a little worrisome this
> late in the 1.3 process, even if it is within the Reader contract. I
> think that's a minor point though.

FWIW, I did this to a few other Readers in Go 1.3, and instead of problems, we actually got a nice performance surprise (https://codereview.appspot.com/49570044) because ioutil.ReadAll works better when it sees the EOF earlier.
On 2014/03/24 15:54:24, bradfitz wrote:
> In any case, this change makes a night & day difference around whether 20
> goroutines can pound on AWS APIs without errors. Reportedly, this includes
> CloudFront and any site on the web behind it. Previously I had a loop
> around all API calls, retrying 5 times, because 4 times wasn't always
> enough. It was that flaky before.
>
> I agree that there's still an inherent HTTP race, though, since HTTP has no
> lame duck mechanism. With an idle keep-alive connection, one side can
> decide to write while the other side decides to close. Retrying requests
> in that case is https://code.google.com/p/go/issues/detail?id=4677 but
> that's not for Go 1.3. I was hoping this change was safe enough for Go 1.3,
> though.

I do think that this change is nonsense from a theoretical point of view, but perhaps it's a reasonable bandaid for 1.3. If you are planning on doing retries at the HTTP layer for 1.4, then I guess this is ok, although please add a TODO(bradfitz): remove for 1.4.

LGTM.
On Mon, Mar 24, 2014 at 9:38 AM, <agl@golang.org> wrote:
> I do think that this change is nonsense from a theoretical point of
> view, but perhaps it's a reasonable bandaid for 1.3.
>
> If you are planning on doing retries at the HTTP layer for 1.4, then I
> guess this is ok, although please add a TODO(bradfitz): remove for 1.4.

I don't think this is a bandaid, and I don't want to see it removed in Go 1.4. If we've seen the end, I still want to return (n, EOF) instead of (n, nil). It conveys more information.

It's better to let the http package (or anything using TLS) know immediately to not re-use a connection, rather than try to write to it and fail. The HTTP write-retry mechanism would be a last-ditch effort. But if there's an obvious signal (this CL) to avoid failing in the first place, it's much better to not write the request at all, especially since many requests aren't retry-able, like POSTs or anything with an io.Reader Body (which we can only write once, or buffer).

I'm happy to have your LGTM but I'd rather not submit until we're on the same page. Why is this change nonsense from a theoretical point of view? I didn't feel that there was anything wrong or hacky in it.
this seems okay but it makes me uneasy. yes, crypto/tls is allowed to behave this way, but why does the http package need it to? it seems like there is a bug in http. why is there no read pending? does the read not start until after writing the next request? why is that? can the read be started earlier? that will fix a broader class of related notice-EOF-early bugs (https and also http).

russ

https://codereview.appspot.com/76400046/diff/40001/src/pkg/crypto/tls/conn.go
File src/pkg/crypto/tls/conn.go (right):

src/pkg/crypto/tls/conn.go:455: // that we're throwing away r.Read's err
here. it's okay, you got the data. the next read should return the same error.
On Mon, Mar 24, 2014 at 3:47 PM, <rsc@golang.org> wrote:
> this seems okay but it makes me uneasy. yes, crypto/tls is allowed to
> behave this way, but why does the http package need it to? it seems like
> there is a bug in http. why is there no read pending? does the read not
> start until after writing the next request? why is that? can the read be
> started earlier? that will fix a broader class of related
> notice-EOF-early bugs (https and also http).

The http package doesn't need it, but it gives it more information.

The http transport code is always either reading a response, or peeking a byte waiting for a close. It makes the decision about whether or not the connection is reusable after the response has been read, and before the 1-byte Read begins. Imagine the Content-Length was 100 bytes: we read the headers, find the 100-byte content-length, and then we Read(buf[:100]). crypto/tls then returns (100, nil). The Response headers say keep-alive. Great. We loop around and begin the (blocking!) 1-byte peek, waiting either to be hung up on or for the response to our next request.

We could start an n-millisecond timer waiting for the Read(1 byte) to not say anything (and not hang up on us) before we put that connection back into the free pool, but that is gross for a number of reasons: it unnecessarily delays the majority of connections, and it introduces a new class of connection (in-use, free, and maybe-free-soon) which complicates a lot of the free pool get/put code. Or at least it complicates the dial code, which is another way it could be done, if connections in the maybe-reusable state are plucked off by the dial code before new connections are made.

I do not believe there is a bug in net/http here. This CL provides new information that the tls package already has, just at the time that http could use it: before it decides to reuse a connection, as it's transitioning from reading a response to deciding whether to kill the connection or freelist it and go back to reading a byte. I can't read a byte before deciding to freelist it, because that might block in the common case.
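
To illustrate the flow described above, here is a rough, hypothetical sketch (the names persistConn, readBody, sawEOF, and putIdle are made up for this example and are not net/http's actual API): with this CL, the body read can observe io.EOF together with the final bytes, so the reuse decision can be made before the connection ever goes back to the pool.

package sketch

import (
	"io"
	"net"
)

// persistConn is a hypothetical stand-in for an HTTP client connection.
type persistConn struct {
	conn   net.Conn
	sawEOF bool // set when a body read ends in (n, io.EOF)
}

// readBody reads up to contentLength bytes of response body.
// With this CL, tls.Conn.Read may return io.EOF together with the
// final bytes, so we can record that the server has hung up.
// (Error handling for a truncated body is elided in this sketch.)
func (pc *persistConn) readBody(contentLength int) ([]byte, error) {
	buf := make([]byte, contentLength)
	total := 0
	for total < contentLength {
		n, err := pc.conn.Read(buf[total:])
		total += n
		if err == io.EOF {
			pc.sawEOF = true // close-notify was buffered right behind the data
			break
		}
		if err != nil {
			return buf[:total], err
		}
	}
	return buf[:total], nil
}

// finish decides whether the connection goes back to the idle pool
// (where a goroutine blocks on a 1-byte peek) or is closed immediately.
func (pc *persistConn) finish(putIdle func(*persistConn)) {
	if pc.sawEOF {
		pc.conn.Close() // the peer already said goodbye; don't reuse
		return
	}
	putIdle(pc)
	go func() {
		var b [1]byte
		if _, err := pc.conn.Read(b[:]); err != nil {
			pc.conn.Close() // hangup noticed on the idle connection
		}
	}()
}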
On 2014/03/24 23:09:16, bradfitz wrote:
> The http package doesn't need it, but it gives it more information.

Indeed, it's information that crypto/tls has, so why not expose it? The problem is that I feel that this is fragile and creates perverse dependencies. If you're depending on it then it could well be that subtle changes to timing or the network code suddenly `break' this behaviour. That will create a negative effect for users, and so remote parts of the code base suddenly become more dangerous to work on.

I searched for any URL on S3 and found https://jets3t.s3.amazonaws.com/toolkit/configuration.html. I ran Wireshark on a fetch of that URL and I see the last byte of data arriving at 1.281 seconds from the SYN and the close-notify at 2.263. So I believe that S3 has a connection timeout of 1 second, and I'm surprised that you see the close-notify queued when reading the end of the response.

What we really want is to do a non-blocking read on the network to see whether a FIN or close-notify has appeared. As you note, an n-millisecond timer is a bit ugly, but non-blocking reads aren't exposed from net, although perhaps SetDeadline can be used to make an inefficient one. Having goroutines reading from the open sockets would do it, but I assume that's too large a change for 1.3 at this point. It's a shame that HTTP/1.1 didn't have servers advertise their timeout in responses.

So that's why I have reservations and thought that it should be a temporary fix. But I'll leave it up to you because this does sound like an issue with significant, practical implications for people.
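
As a rough illustration of the SetDeadline idea above (purely hypothetical: the function name and the 1ms figure are invented, and a real implementation would have to deal with the byte this sketch can swallow):

package main

import (
	"fmt"
	"net"
	"time"
)

// peerHungUp reports whether the peer appears to have already closed or
// written on an otherwise idle connection. A read timeout is treated as
// "nothing pending"; anything else means the connection should not be
// reused. Note this sketch discards any byte it happens to read, which is
// one reason a real implementation would not look like this.
func peerHungUp(c net.Conn) bool {
	c.SetReadDeadline(time.Now().Add(time.Millisecond))
	defer c.SetReadDeadline(time.Time{}) // clear the deadline again

	var b [1]byte
	_, err := c.Read(b[:])
	if ne, ok := err.(net.Error); ok && ne.Timeout() {
		return false // deadline expired with nothing to read: looks reusable
	}
	return true // EOF, close-notify, or unexpected data: don't reuse
}

func main() {
	client, server := net.Pipe()
	server.Close() // simulate the peer hanging up on an idle connection
	fmt.Println("peer hung up:", peerHungUp(client)) // prints true
}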
LGTM

Adam, I think you and I are viewing this the same way, but after a chat with Brad I think I understand better what's going on and why this is reasonable. Perhaps this will help.

In package http there *are* goroutines reading from all the open cached network connections, even when there are no pending requests. If a goroutine reader observes a hangup, it closes the connection and removes it from the cache/pool. But there is a fundamental race where a client might try to use the connection at the same time that the reader observes the hangup and starts the teardown, and we don't know whether the request made it over to the server or not. Brad says Chrome has various heuristics for which requests are idempotent enough to retry (you can't just do it with all GETs); the Go package never retries in this case, to avoid maintaining the heuristics.

So for the general case things should be working pretty well. But the race means that it can't always work well. What Brad seems to be seeing (in contrast to what you see) is that on some requests Amazon decides to close the connection immediately, despite generally saying it will keep things alive. In that case it has to do a TCP linger wait to send the final FIN, but it can and does send the close notify earlier. Maybe we get slow enough that we get both the response and the close notify before we bother to read either, or maybe in some cases Amazon sends them together. Let's just suppose that happens somehow. Then what happens is this:

A. goroutine 1
   - receive HTTPS response
   - put HTTPS connection back in pool

B. goroutine 2
   - start request
   - take HTTPS connection out of pool to use, get same one
   - send request

C. goroutine 3 (reader for that connection)
   - observe close-notify
   - poison and close HTTPS connection, remove from pool

The race here is between B and C, which are both purely local execution if you assume the close notify has already arrived. Really it comes down to which goroutine manages to run first. If C manages to tear down the connection and pull it from the pool before B attempts to reuse it, all is well. But if not, B sees itself write a request and then the connection dies. It's a narrow race, nothing as dumb as we're just sitting on connections not noticing the close at all, but if you pound on the system it can happen. I have gotten a report off list about seeing the same behavior from IIS, so I don't think it's anything Amazon is doing wrong per se.

So the right things are already going on as far as tearing down closed connections promptly; it's just that there's always an opportunity for a race, and Brad's CL is arranging for this specific instance of the race not to happen anymore.

This problem is exacerbated by HTTPS because when a goroutine needs a new connection, we kick off a new dial+TLS handshake but then wait for either that to complete or a connection to be dropped back into the pool for reuse, whichever happens first. So B above might be waiting for a dial+TLS and then A comes along and drops this not-really-useful connection into the pool and B says "oh look! I'll use this one right now!" And it gets there before C pulls the connection back out of the pool.

Given that the basic background observation of closes is in fact present and working (and has been for multiple releases), I think there's not much harm to nibbling away at the race windows. This CL hits one that seems to be arising in practice.
Brad, can you please make the comment:

// If a close-notify alert is waiting, read it so that we can return (n, EOF)
// instead of (n, nil), to signal to the HTTP response reading goroutine that
// the connection is now closed. This eliminates a race where the HTTP
// response reading goroutine would otherwise not observe the EOF until its
// next read, by which time a client goroutine might have already tried to
// reuse the HTTP connection for a new request.
// See https://codereview.appspot.com/76400046 and golang.org/issue/3514.

Thanks.
Russ
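
To make the behaviour that comment describes concrete, here is a small, self-contained sketch of the general pattern rather than the actual crypto/tls patch: a Read that peeks ahead in its own buffer and, when it can already tell that EOF is next, returns it together with the data.

package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// eagerEOFReader wraps a buffered reader and returns (n, io.EOF) from Read
// when it can already tell that EOF is next in the buffer, instead of
// waiting for the caller's next Read. This is the same shape of behaviour
// the CL gives tls.Conn.Read when a close-notify alert is buffered right
// behind the final data.
type eagerEOFReader struct {
	br *bufio.Reader
}

func (r *eagerEOFReader) Read(p []byte) (int, error) {
	n, err := r.br.Read(p)
	if n > 0 && err == nil {
		// Peek to see whether the stream has already ended; if so,
		// surface io.EOF together with the data we just returned.
		if _, peekErr := r.br.Peek(1); peekErr == io.EOF {
			err = io.EOF
		}
	}
	return n, err
}

func main() {
	r := &eagerEOFReader{br: bufio.NewReader(strings.NewReader("response body"))}
	buf := make([]byte, 64)
	n, err := r.Read(buf)
	fmt.Printf("%q %v\n", buf[:n], err) // data and EOF arrive in one call
}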
Thanks for the writeup, Russ. LGTM.
*** Submitted as https://code.google.com/p/go/source/detail?r=2e0323211d75 ***

crypto/tls: make Conn.Read return (n, io.EOF) when EOF is next in buffer

Update Issue 3514
(commit message as in the description above)

LGTM=agl, rsc
R=agl, rsc
CC=golang-codereviews
https://codereview.appspot.com/76400046
Message was sent while issue was closed.
This CL appears to have broken the darwin-amd64-cheney builder. See http://build.golang.org/log/6591ea870ca7b750a512c71d8dc70425dcd55c03
This is a real failure:

# ../test/bench/go1
ok  _/Users/builder/workspace/darwin-amd64-cheney-2e0323211d75/go/test/bench/go1  2.660s

# ../test
Build complete, duration 1h0m0.003083969s. Result: error: timed out after 1h0m0s

The builder normally takes less than 10 minutes for the whole thing, but something in ./test is wedging.

On Thu, Mar 27, 2014 at 6:38 AM, <gobot@golang.org> wrote:
> This CL appears to have broken the darwin-amd64-cheney builder.
> See http://build.golang.org/log/6591ea870ca7b750a512c71d8dc70425dcd55c03
>
> https://codereview.appspot.com/76400046/
This has nothing to do with this tls change, though.

On Wed, Mar 26, 2014 at 12:48 PM, Dave Cheney <dave@cheney.net> wrote:
> This is a real failure
>
> # ../test/bench/go1
> ok  _/Users/builder/workspace/darwin-amd64-cheney-2e0323211d75/go/test/bench/go1  2.660s
>
> # ../test
> Build complete, duration 1h0m0.003083969s. Result: error: timed out after 1h0m0s
>
> The builder normally takes less than 10 minutes for the whole thing,
> but something in ./test is wedging