summaryrefslogtreecommitdiff
path: root/include
diff options
context:
space:
mode:
authorJulian Anastasov <ja@ssi.bg>2020-07-01 18:17:19 +0300
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2020-08-19 08:14:56 +0200
commit7106f943302247ed9fcde84afdca06cbe9e19dce (patch)
tree24bf38015522f504774ffc421ef70b584cbcb8bd /include
parentfdac85326f40c7ba6ae2b9e4a2c710f26b708ab8 (diff)
ipvs: allow connection reuse for unconfirmed conntrack
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 https://github.com/kubernetes/kubernetes/issues/70747 - Apache Bench can fill up ipvs service proxy in seconds #544 https://github.com/cloudnativelabs/kube-router/issues/544 - Additional 1s latency in `host -> service IP -> pod` https://github.com/kubernetes/kubernetes/issues/90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <yx.atom1@gmail.com> Signed-off-by: YangYuxi <yx.atom1@gmail.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Diffstat (limited to 'include')
-rw-r--r--include/net/ip_vs.h10
1 files changed, 4 insertions, 6 deletions
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index af0ede9ad4d0..c31e54a41b5c 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1614,18 +1614,16 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
}
#endif /* CONFIG_IP_VS_NFCT */
-/* Really using conntrack? */
-static inline bool ip_vs_conn_uses_conntrack(struct ip_vs_conn *cp,
- struct sk_buff *skb)
+/* Using old conntrack that can not be redirected to another real server? */
+static inline bool ip_vs_conn_uses_old_conntrack(struct ip_vs_conn *cp,
+ struct sk_buff *skb)
{
#ifdef CONFIG_IP_VS_NFCT
enum ip_conntrack_info ctinfo;
struct nf_conn *ct;
- if (!(cp->flags & IP_VS_CONN_F_NFCT))
- return false;
ct = nf_ct_get(skb, &ctinfo);
- if (ct)
+ if (ct && nf_ct_is_confirmed(ct))
return true;
#endif
return false;