commit eb134e46e41b06f6022f1c9a481205a8180515bd Author: Willy Tarreau Date: Tue Jan 14 11:42:59 2020 +0100 BUG/MEDIUM: mux-h2: don't stop sending when crossing a buffer boundary In version 2.0, after commit 9c218e7521 ("MAJOR: mux-h2: switch to next mux buffer on buffer full condition."), the H2 mux started to use a ring buffer for the output data in order to reduce competition between streams. However, one corner case was suboptimally covered: when crossing a buffer boundary, we have to shrink the outgoing frame size to the one left in the output buffer, but this shorter size is later used as a signal of incomplete send due to a buffer full condition (which used to be true when using a single buffer). As a result, function h2s_frt_make_resp_data() used to return less than requested, which in turn would cause h2_snd_buf() to stop sending and leave some unsent data in the buffer, and si_cs_send() to subscribe for sending more later. But it goes a bit further than this, because subscribing to send again causes the mux's send_list not to be empty anymore, hence extra streams can be denied the access to the mux till the first stream is woken again. This causes a nasty wakeup-sleep dance between streams that makes it totally impractical to try to remove the sending list. A test showed that it was possible to observe 3 million h2_snd_buf() giveups for only 100k requests when using 100 concurrent streams on 20kB objects. It doesn't seem likely that a stream could get blocked and time out due to this bug, though it's not possible either to demonstrate the opposite. One risk is that incompletely sent streams do not have any blocking flags so they may not be identified as blocked. However on first scan of the send_list they meet all conditions for a wakeup. This patch simply allows to continue on a new frame after a partial frame. with only this change, the number of failed h2_snd_buf() was divided by 800 (4% of calls). And by slightly increasing the H2C_MBUF_CNT size, it can go down to zero. This fix must be backported to 2.1 and 2.0. (cherry picked from commit c7ce4e3e7fb2d7f9f037b4df318df7d6e23e8f7a) Signed-off-by: Christopher Faulet diff --git a/src/mux_h2.c b/src/mux_h2.c index d46a316ac..8a82f60fd 100644 --- a/src/mux_h2.c +++ b/src/mux_h2.c @@ -5157,6 +5157,7 @@ static size_t h2s_frt_make_resp_data(struct h2s *h2s, struct buffer *buf, size_t struct htx_blk *blk; enum htx_blk_type type; int idx; + int trunc_out; /* non-zero if truncated on out buf */ TRACE_ENTER(H2_EV_TX_FRAME|H2_EV_TX_DATA, h2c->conn, h2s); @@ -5183,6 +5184,7 @@ static size_t h2s_frt_make_resp_data(struct h2s *h2s, struct buffer *buf, size_t type = htx_get_blk_type(blk); // DATA or EOM bsize = htx_get_blksz(blk); fsize = bsize; + trunc_out = 0; if (type == HTX_BLK_EOM) { if (h2s->flags & H2_SF_ES_SENT) { @@ -5345,6 +5347,7 @@ static size_t h2s_frt_make_resp_data(struct h2s *h2s, struct buffer *buf, size_t b_data(mbuf) <= MAX_DATA_REALIGN) goto realign_again; fsize = outbuf.size - 9; + trunc_out = 1; if (fsize <= 0) { /* no need to send an empty frame here */ @@ -5402,6 +5405,8 @@ static size_t h2s_frt_make_resp_data(struct h2s *h2s, struct buffer *buf, size_t } else { /* we've truncated this block */ htx_cut_data_blk(htx, blk, fsize); + if (trunc_out) + goto new_frame; } if (es_now) {