summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-02-14Linux 3.7.8v3.7.8Greg Kroah-Hartman
2013-02-14drm/nouveau: add lockdep annotationsMarcin Slusarz
commit 5f97ab913cf0fbc378ea8ffc3ee66f4890d11c55 upstream. 1) Lockdep thinks all nouveau subdevs belong to the same class and can be locked in arbitrary order, which is not true (at least in general case). Tell it to distinguish subdevs by (o)class type. 2) DRM client can be locked under user client lock - tell lockdep to put DRM client lock in a separate class. Reported-by: Arend van Spriel <arend@broadcom.com> Reported-by: Peter Hurley <peter@hurleysoftware.com> Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reported-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: splice: fix __splice_segment()Eric Dumazet
[ Upstream commit bc9540c637c3d8712ccbf9dcf28621f380ed5e64 ] commit 9ca1b22d6d2 (net: splice: avoid high order page splitting) forgot that skb->head could need a copy into several page frags. This could be the case for loopback traffic mostly. Also remove now useless skb argument from linear_to_page() and __splice_segment() prototypes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: splice: avoid high order page splittingEric Dumazet
[ Upstream commit 82bda6195615891181115f579a480aa5001ce7e9 ] splice() can handle pages of any order, but network code tries hard to split them in PAGE_SIZE units. Not quite successfully anyway, as __splice_segment() assumed poff < PAGE_SIZE. This is true for the skb->data part, not necessarily for the fragments. This patch removes this logic to give the pages as they are in the skb. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix splice() and tcp collapsing interactionEric Dumazet
[ Upstream commit f26845b43c75d3f32f98d194c1327b5b1e6b3fb0 ] Under unusual circumstances, TCP collapse can split a big GRO TCP packet while its being used in a splice(socket->pipe) operation. skb_splice_bits() releases the socket lock before calling splice_to_pipe(). [ 1081.353685] WARNING: at net/ipv4/tcp.c:1330 tcp_cleanup_rbuf+0x4d/0xfc() [ 1081.371956] Hardware name: System x3690 X5 -[7148Z68]- [ 1081.391820] cleanup rbuf bug: copied AD3BCF1 seq AD370AF rcvnxt AD3CF13 To fix this problem, we must eat skbs in tcp_recv_skb(). Remove the inline keyword from tcp_recv_skb() definition since it has three call sites. Reported-by: Christian Becker <c.becker@traviangames.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: splice: fix an infinite loop in tcp_read_sock()Eric Dumazet
[ Upstream commit ff905b1e4aad8ccbbb0d42f7137f19482742ff07 ] commit 02275a2ee7c0 (tcp: don't abort splice() after small transfers) added a regression. [ 83.843570] INFO: rcu_sched self-detected stall on CPU [ 83.844575] INFO: rcu_sched detected stalls on CPUs/tasks: { 6} (detected by 0, t=21002 jiffies, g=4457, c=4456, q=13132) [ 83.844582] Task dump for CPU 6: [ 83.844584] netperf R running task 0 8966 8952 0x0000000c [ 83.844587] 0000000000000000 0000000000000006 0000000000006c6c 0000000000000000 [ 83.844589] 000000000000006c 0000000000000096 ffffffff819ce2bc ffffffffffffff10 [ 83.844592] ffffffff81088679 0000000000000010 0000000000000246 ffff880c4b9ddcd8 [ 83.844594] Call Trace: [ 83.844596] [<ffffffff81088679>] ? vprintk_emit+0x1c9/0x4c0 [ 83.844601] [<ffffffff815ad449>] ? schedule+0x29/0x70 [ 83.844606] [<ffffffff81537bd2>] ? tcp_splice_data_recv+0x42/0x50 [ 83.844610] [<ffffffff8153beaa>] ? tcp_read_sock+0xda/0x260 [ 83.844613] [<ffffffff81537b90>] ? tcp_prequeue_process+0xb0/0xb0 [ 83.844615] [<ffffffff8153c0f0>] ? tcp_splice_read+0xc0/0x250 [ 83.844618] [<ffffffff814dc0c2>] ? sock_splice_read+0x22/0x30 [ 83.844622] [<ffffffff811b820b>] ? do_splice_to+0x7b/0xa0 [ 83.844627] [<ffffffff811ba4bc>] ? sys_splice+0x59c/0x5d0 [ 83.844630] [<ffffffff8119745b>] ? putname+0x2b/0x40 [ 83.844633] [<ffffffff8118bcb4>] ? do_sys_open+0x174/0x1e0 [ 83.844636] [<ffffffff815b6202>] ? system_call_fastpath+0x16/0x1b if recv_actor() returns 0, we should stop immediately, because looping wont give a chance to drain the pipe. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: don't abort splice() after small transfersWilly Tarreau
[ Upstream commit 02275a2ee7c0ea475b6f4a6428f5df592bc9d30b ] TCP coalescing added a regression in splice(socket->pipe) performance, for some workloads because of the way tcp_read_sock() is implemented. The reason for this is the break when (offset + 1 != skb->len). As we released the socket lock, this condition is possible if TCP stack added a fragment to the skb, which can happen with TCP coalescing. So let's go back to the beginning of the loop when this happens, to give a chance to splice more frags per system call. Doing so fixes the issue and makes GRO 10% faster than LRO on CPU-bound splice() workloads instead of the opposite. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix for zero packets_in_flight was too broadIlpo Järvinen
[ Upstream commit 6731d2095bd4aef18027c72ef845ab1087c3ba63 ] There are transients during normal FRTO procedure during which the packets_in_flight can go to zero between write_queue state updates and firing the resulting segments out. As FRTO processing occurs during that window the check must be more precise to not match "spuriously" :-). More specificly, e.g., when packets_in_flight is zero but FLAG_DATA_ACKED is true the problematic branch that set cwnd into zero would not be taken and new segments might be sent out later. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: frto should not set snd_cwnd to 0Eric Dumazet
[ Upstream commit 2e5f421211ff76c17130b4597bc06df4eeead24f ] Commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start()) uncovered a bug in FRTO code : tcp_process_frto() is setting snd_cwnd to 0 if the number of in flight packets is 0. As Neal pointed out, if no packet is in flight we lost our chance to disambiguate whether a loss timeout was spurious. We should assume it was a proper loss. Reported-by: Pasi Kärkkäinen <pasik@iki.fi> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix an infinite loop in tcp_slow_start()Eric Dumazet
[ Upstream commit 973ec449bb4f2b8c514bacbcb4d9506fc31c8ce3 ] Since commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start()), a nul snd_cwnd triggers an infinite loop in tcp_slow_start() Avoid this infinite loop and log a one time error for further analysis. FRTO code is suspected to cause this bug. Reported-by: Pasi Kärkkäinen <pasik@iki.fi> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: detect SYN/data drop when F-RTO is disabledYuchung Cheng
[ Upstream commit 66555e92fb7a619188c02cceae4bbc414f15f96d ] On receiving the SYN-ACK, Fast Open checks icsk_retransmit for SYN retransmission to detect SYN/data drops. But if F-RTO is disabled, icsk_retransmit is reset at step D of tcp_fastretrans_alert() ( under tcp_ack()) before tcp_rcv_fastopen_synack(). The fix is to use total_retrans instead which accounts for SYN retransmission regardless the use of F-RTO. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: sctp: sctp_endpoint_free: zero out secret key dataDaniel Borkmann
[ Upstream commit b5c37fe6e24eec194bb29d22fdd55d73bcc709bf ] On sctp_endpoint_destroy, previously used sensitive keying material should be zeroed out before the memory is returned, as we already do with e.g. auth keys when released. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfreeDaniel Borkmann
[ Upstream commit 6ba542a291a5e558603ac51cda9bded347ce7627 ] In sctp_setsockopt_auth_key, we create a temporary copy of the user passed shared auth key for the endpoint or association and after internal setup, we free it right away. Since it's sensitive data, we should zero out the key before returning the memory back to the allocator. Thus, use kzfree instead of kfree, just as we do in sctp_auth_key_put(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14sctp: refactor sctp_outq_teardown to insure proper re-initalizationNeil Horman
[ Upstream commit 2f94aabd9f6c925d77aecb3ff020f1cc12ed8f86 ] Jamie Parsons reported a problem recently, in which the re-initalization of an association (The duplicate init case), resulted in a loss of receive window space. He tracked down the root cause to sctp_outq_teardown, which discarded all the data on an outq during a re-initalization of the corresponding association, but never reset the outq->outstanding_data field to zero. I wrote, and he tested this fix, which does a proper full re-initalization of the outq, fixing this problem, and hopefully future proofing us from simmilar issues down the road. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Jamie Parsons <Jamie.Parsons@metaswitch.com> Tested-by: Jamie Parsons <Jamie.Parsons@metaswitch.com> CC: Jamie Parsons <Jamie.Parsons@metaswitch.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Fix route refcount on pmtu discoverySteffen Klassert
[ Upstream commit b44108dbdbaa07c609bb5755e8dd6c2035236251 ] git commit 9cb3a50c (ipv4: Invalidate the socket cached route on pmtu events if possible) introduced a refcount problem. We don't get a refcount on the route if we get it from__sk_dst_get(), but we need one if we want to reuse this route because __sk_dst_set() releases the refcount of the old route. This patch adds proper refcount handling for that case. We introduce a 'new' flag to indicate that we are going to use a new route and we release the old route only if we replace it by a new one. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Add a socket release callback for datagram socketsSteffen Klassert
[ Upstream commit 8141ed9fcedb278f4a3a78680591bef1e55f75fb ] This implements a socket release callback function to check if the socket cached route got invalid during the time we owned the socket. The function is used from udp, raw and ping sockets. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Invalidate the socket cached route on pmtu events if possibleSteffen Klassert
[ Upstream commit 9cb3a50c5f63ed745702972f66eaee8767659acd ] The route lookup in ipv4_sk_update_pmtu() might return a route different from the route we cached at the socket. This is because standart routes are per cpu, so each cpu has it's own struct rtable. This means that we do not invalidate the socket cached route if the NET_RX_SOFTIRQ is not served by the same cpu that the sending socket uses. As a result, the cached route reused until we disconnect. With this patch we invalidate the socket cached route if possible. If the socket is owened by the user, we can't update the cached route directly. A followup patch will implement socket release callback functions for datagram sockets to handle this case. Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv6: Add an error handler for icmp6Steffen Klassert
[ Upstream commit 6f809da27c94425e07be4a64d5093e1df95188e9 ] pmtu and redirect events are now handled in the protocols error handler, so add an error handler for icmp6 to do this. It is needed in the case when we have no socket context. Based on a patch by Duan Jiong. Reported-by: Duan Jiong <djduanjiong@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Don't update the pmtu on mtu locked routesSteffen Klassert
[ Upstream commit fa1e492aa3cbafba9f8fc6d05e5b08a3091daf4a ] Routes with locked mtu should not use learned pmtu informations, so do not update the pmtu on these routes. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Remove output route check in ipv4_mtuSteffen Klassert
[ Upstream commit 38d523e2948162776903349c89d65f7b9370dadb ] The output route check was introduced with git commit 261663b0 (ipv4: Don't use the cached pmtu informations for input routes) during times when we cached the pmtu informations on the inetpeer. Now the pmtu informations are back in the routes, so this check is obsolete. It also had some unwanted side effects, as reported by Timo Teras and Lukas Tribus. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14netback: correct netbk_tx_err to handle wrap around.Ian Campbell
[ Upstream commit b9149729ebdcfce63f853aa54a404c6a8f6ebbf3 ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14xen/netback: free already allocated memory on failure in xen_netbk_get_requestsIan Campbell
[ Upstream commit 4cc7c1cb7b11b6f3515bd9075527576a1eecc4aa ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.Matthew Daley
[ Upstream commit 7d5145d8eb2b9791533ffe4dc003b129b9696c48 ] Signed-off-by: Matthew Daley <mattjd@gmail.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14xen/netback: shutdown the ring if it contains garbage.Ian Campbell
[ Upstream commit 48856286b64e4b66ec62b94e504d0b29c1ade664 ] A buggy or malicious frontend should not be able to confuse netback. If we spot anything which is not as it should be then shutdown the device and don't try to continue with the ring in a potentially hostile state. Well behaved and non-hostile frontends will not be penalised. As well as making the existing checks for such errors fatal also add a new check that ensures that there isn't an insane number of requests on the ring (i.e. more than would fit in the ring). If the ring contains garbage then previously is was possible to loop over this insane number, getting an error each time and therefore not generating any more pending requests and therefore not exiting the loop in xen_netbk_tx_build_gops for an externded period. Also turn various netdev_dbg calls which no precipitate a fatal error into netdev_err, they are rate limited because the device is shutdown afterwards. This fixes at least one known DoS/softlockup of the backend domain. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14atm/iphase: rename fregt_t -> ffreg_tHeiko Carstens
[ Upstream commit ab54ee80aa7585f9666ff4dd665441d7ce41f1e8 ] We have conflicting type qualifiers for "freg_t" in s390's ptrace.h and the iphase atm device driver, which causes the compile error below. Unfortunately the s390 typedef can't be renamed, since it's a user visible api, nor can I change the include order in s390 code to avoid the conflict. So simply rename the iphase typedef to a new name. Fixes this compile error: In file included from drivers/atm/iphase.c:66:0: drivers/atm/iphase.h:639:25: error: conflicting type qualifiers for 'freg_t' In file included from next/arch/s390/include/asm/ptrace.h:9:0, from next/arch/s390/include/asm/lowcore.h:12, from next/arch/s390/include/asm/thread_info.h:30, from include/linux/thread_info.h:54, from include/linux/preempt.h:9, from include/linux/spinlock.h:50, from include/linux/seqlock.h:29, from include/linux/time.h:5, from include/linux/stat.h:18, from include/linux/module.h:10, from drivers/atm/iphase.c:43: next/arch/s390/include/uapi/asm/ptrace.h:197:3: note: previous declaration of 'freg_t' was here Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()Tommi Rantala
[ Upstream commit 41ab3e31bd50b42c85ac0aa0469642866aee2a9a ] ip6gre_tunnel_xmit() is leaking the skb when we hit this error branch, and the -1 return value from this function is bogus. Use the error handling we already have in place in ip6gre_tunnel_xmit() for this error case to fix this. Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14packet: fix leakage of tx_ring memoryPhil Sutter
[ Upstream commit 9665d5d62487e8e7b1f546c00e11107155384b9a ] When releasing a packet socket, the routine packet_set_ring() is reused to free rings instead of allocating them. But when calling it for the first time, it fills req->tp_block_nr with the value of rb->pg_vec_len which in the second invocation makes it bail out since req->tp_block_nr is greater zero but req->tp_block_size is zero. This patch solves the problem by passing a zeroed auto-variable to packet_set_ring() upon each invocation from packet_release(). As far as I can tell, this issue exists even since 69e3c75 (net: TX_RING and packet mmap), i.e. the original inclusion of TX ring support into af_packet, but applies only to sockets with both RX and TX ring allocated, which is probably why this was unnoticed all the time. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> Cc: Johann Baudy <johann.baudy@gnu-log.net> Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14via-rhine: Fix bugs in NAPI support.David S. Miller
[ Upstream commit 559bcac35facfed49ab4f408e162971612dcfdf3 ] 1) rhine_tx() should use dev_kfree_skb() not dev_kfree_skb_irq() 2) rhine_slow_event_task's NAPI triggering logic is racey, it should just hit the interrupt mask register. This is the same as commit 7dbb491878a2c51d372a8890fa45a8ff80358af1 ("r8169: avoid NAPI scheduling delay.") made to fix the same problem in the r8169 driver. From Francois Romieu. Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com> Tested-by: Jamie Gloudon <jamie.gloudon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv6: do not create neighbor entries for local deliveryMarcelo Ricardo Leitner
[ Upstream commit bd30e947207e2ea0ff2c08f5b4a03025ddce48d3 ] They will be created at output, if ever needed. This avoids creating empty neighbor entries when TPROXYing/Forwarding packets for addresses that are not even directly reachable. Note that IPv4 already handles it this way. No neighbor entries are created for local input. Tested by myself and customer. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14pktgen: correctly handle failures when adding a deviceCong Wang
[ Upstream commit 604dfd6efc9b79bce432f2394791708d8e8f6efc ] The return value of pktgen_add_device() is not checked, so even if we fail to add some device, for example, non-exist one, we still see "OK:...". This patch fixes it. After this patch, I got: # echo "add_device non-exist" > /proc/net/pktgen/kpktgend_0 -bash: echo: write error: No such device # cat /proc/net/pktgen/kpktgend_0 Running: Stopped: Result: ERROR: can not add device non-exist # echo "add_device eth0" > /proc/net/pktgen/kpktgend_0 # cat /proc/net/pktgen/kpktgend_0 Running: Stopped: eth0 Result: OK: add_device=eth0 (Candidate for -stable) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14IP_GRE: Fix kernel panic in IP_GRE with GRE csum.Pravin B Shelar
[ Upstream commit 5465740ace36f179de5bb0ccb5d46ddeb945e309 ] Due to IP_GRE GSO support, GRE can recieve non linear skb which results in panic in case of GRE_CSUM. Following patch fixes it by using correct csum API. Bug introduced in commit 6b78f16e4bdde3936b (gre: add GSO support) Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: loopback: fix a dst refcounting issueEric Dumazet
[ Upstream commit 794ed393b707f01858f5ebe2ae5eabaf89d00022 ] Ben Greear reported crashes in ip_rcv_finish() on a stress test involving many macvlans. We tracked the bug to a dst use after free. ip_rcv_finish() was calling dst->input() and got garbage for dst->input value. It appears the bug is in loopback driver, lacking a skb_dst_force() before calling netif_rx(). As a result, a non refcounted dst, normally protected by a RCU read_lock section, was escaping this section and could be freed before the packet being processed. [<ffffffff813a3c4d>] loopback_xmit+0x64/0x83 [<ffffffff81477364>] dev_hard_start_xmit+0x26c/0x35e [<ffffffff8147771a>] dev_queue_xmit+0x2c4/0x37c [<ffffffff81477456>] ? dev_hard_start_xmit+0x35e/0x35e [<ffffffff8148cfa6>] ? eth_header+0x28/0xb6 [<ffffffff81480f09>] neigh_resolve_output+0x176/0x1a7 [<ffffffff814ad835>] ip_finish_output2+0x297/0x30d [<ffffffff814ad6d5>] ? ip_finish_output2+0x137/0x30d [<ffffffff814ad90e>] ip_finish_output+0x63/0x68 [<ffffffff814ae412>] ip_output+0x61/0x67 [<ffffffff814ab904>] dst_output+0x17/0x1b [<ffffffff814adb6d>] ip_local_out+0x1e/0x23 [<ffffffff814ae1c4>] ip_queue_xmit+0x315/0x353 [<ffffffff814adeaf>] ? ip_send_unicast_reply+0x2cc/0x2cc [<ffffffff814c018f>] tcp_transmit_skb+0x7ca/0x80b [<ffffffff814c3571>] tcp_connect+0x53c/0x587 [<ffffffff810c2f0c>] ? getnstimeofday+0x44/0x7d [<ffffffff810c2f56>] ? ktime_get_real+0x11/0x3e [<ffffffff814c6f9b>] tcp_v4_connect+0x3c2/0x431 [<ffffffff814d6913>] __inet_stream_connect+0x84/0x287 [<ffffffff814d6b38>] ? inet_stream_connect+0x22/0x49 [<ffffffff8108d695>] ? _local_bh_enable_ip+0x84/0x9f [<ffffffff8108d6c8>] ? local_bh_enable+0xd/0x11 [<ffffffff8146763c>] ? lock_sock_nested+0x6e/0x79 [<ffffffff814d6b38>] ? inet_stream_connect+0x22/0x49 [<ffffffff814d6b49>] inet_stream_connect+0x33/0x49 [<ffffffff814632c6>] sys_connect+0x75/0x98 This bug was introduced in linux-2.6.35, in commit 7fee226ad2397b (net: add a noref bit on skb dst) skb_dst_force() is enforced in dev_queue_xmit() for devices having a qdisc. Reported-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14r8169: remove the obsolete and incorrect AMD workaroundTimo Teräs
[ Upstream commit 5d0feaff230c0abfe4a112e6f09f096ed99e0b2d ] This was introduced in commit 6dccd16 "r8169: merge with version 6.001.00 of Realtek's r8169 driver". I did not find the version 6.001.00 online, but in 6.002.00 or any later r8169 from Realtek this hunk is no longer present. Also commit 05af214 "r8169: fix Ethernet Hangup for RTL8110SC rev d" claims to have fixed this issue otherwise. The magic compare mask of 0xfffe000 is dubious as it masks parts of the Reserved part, and parts of the VLAN tag. But this does not make much sense as the VLAN tag parts are perfectly valid there. In matter of fact this seems to be triggered with any VLAN tagged packet as RxVlanTag bit is matched. I would suspect 0xfffe0000 was intended to test reserved part only. Finally, this hunk is evil as it can cause more packets to be handled than what was NAPI quota causing net/core/dev.c: net_rx_action(): WARN_ON_ONCE(work > weight) to trigger, and mess up the NAPI state causing device to hang. As result, any system using VLANs and having high receive traffic (so that NAPI poll budget limits rtl_rx) would result in device hang. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14netxen: fix off by one bug in netxen_release_tx_buffer()Eric Dumazet
[ Upstream commit a05948f296ce103989b28a2606e47d2e287c3c89 ] Christoph Paasch found netxen could trigger a BUG in its dismantle phase, in netxen_release_tx_buffer(), using full size TSO packets. cmd_buf->frag_count includes the skb->data part, so the loop must start at index 1 instead of 0, or else we can make an out of bound access to cmd_buff->frag_array[MAX_SKB_FRAGS + 2] Christoph provided the fixes in netxen_map_tx_skb() function. In case of a dma mapping error, its better to clear the dma fields so that we don't try to unmap them again in netxen_release_tx_buffer() Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Christoph Paasch <christoph.paasch@uclouvain.be> Cc: Sony Chacko <sony.chacko@qlogic.com> Cc: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14isdn/gigaset: fix zero size border case in debug dumpTilman Schmidt
[ Upstream commit d721a1752ba544df8d7d36959038b26bc92bdf80 ] If subtracting 12 from l leaves zero we'd do a zero size allocation, leading to an oops later when we try to set the NUL terminator. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix incorrect LOCKDROPPEDICMPS counterEric Dumazet
[ Upstream commit b74aa930ef49a3c0d8e4c1987f89decac768fb2c ] commit 563d34d057 (tcp: dont drop MTU reduction indications) added an error leading to incorrect accounting of LINUX_MIB_LOCKDROPPEDICMPS If socket is owned by the user, we want to increment this SNMP counter, unless the message is a (ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED) one. Reported-by: Maciej ¯enczykowski <maze@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Maciej ¯enczykowski <maze@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net/mlx4_core: Set number of msix vectors under SRIOV mode to firmware defaultsOr Gerlitz
[ Upstream commit ca4c7b35f75492de7fbf5ee95be07481c348caee ] The lines if (mlx4_is_mfunc(dev)) { nreq = 2; } else { which hard code the number of requested msi-x vectors under multi-function mode to two can be removed completely, since the firmware sets num_eqs and reserved_eqs appropriately Thus, the code line: nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs, nreq); is by itself sufficient and correct for all cases. Currently, for mfunc mode num_eqs = 32 and reserved_eqs = 28, hence four vectors will be enabled. This triples (one vector is used for the async events and commands EQ) the horse power provided for processing of incoming packets on netdev RSS scheme, IO initiators/targets commands processing flows, etc. Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV modeYan Burman
[ Upstream commit 213815a1e6ae70b9648483b110bc5081795f99e8 ] Commit 5b4c4d36860e "mlx4_en: Allow communication between functions on same host" introduced a regression under which a bridge acting as vSwitch whose uplink is an mlx4 Ethernet device become non-operative in native (non sriov) mode. This happens since broadcast ARP requests sent by VMs were loopback-ed by the HW and hence the bridge learned VM source MACs on both the VM and the uplink ports. The fix is to place the DMAC in the send WQE only under SRIOV/eSwitch configuration or when the device is in selftest. Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: calxedaxgmac: throw away overrun framesRob Herring
[ Upstream commit d6fb3be544b46a7611a3373fcaa62b5b0be01888 ] The xgmac driver assumes 1 frame per descriptor. If a frame larger than the descriptor's buffer size is received, the frame will spill over into the next descriptor. So check for received frames that span more than one descriptor and discard them. This prevents a crash if we receive erroneous large packets. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14macvlan: fix macvlan_get_size()Eric Dumazet
[ Upstream commit 01fe944f1024bd4e5c327ddbe8d657656b66af2f ] commit df8ef8f3aaa (macvlan: add FDB bridge ops and macvlan flags) forgot to update macvlan_get_size() after the addition of IFLA_MACVLAN_FLAGS Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv6: fix header length calculation in ip6_append_data()Romain KUNTZ
[ Upstream commit 7efdba5bd9a2f3e2059beeb45c9fa55eefe1bced ] Commit 299b0767 (ipv6: Fix IPsec slowpath fragmentation problem) has introduced a error in the header length calculation that provokes corrupted packets when non-fragmentable extensions headers (Destination Option or Routing Header Type 2) are used. rt->rt6i_nfheader_len is the length of the non-fragmentable extension header, and it should be substracted to rt->dst.header_len, and not to exthdrlen, as it was done before commit 299b0767. This patch reverts to the original and correct behavior. It has been successfully tested with and without IPsec on packets that include non-fragmentable extensions headers. Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14MAINTAINERS: Stephen Hemminger email changeStephen Hemminger
[ Upstream commit adbbf69d1a54abf424e91875746a610dcc80017d ] I changed my email because the vyatta.com mail server is now redirected to brocade.com; and the Brocade mail system is not friendly to Linux desktop users. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix a panic on UP machines in reqsk_fastopen_removeEric Dumazet
[ Upstream commit cce894bb824429fd312706c7012acae43e725865 ] spin_is_locked() on a non !SMP build is kind of useless. BUG_ON(!spin_is_locked(xx)) is guaranteed to crash. Just remove this check in reqsk_fastopen_remove() as the callers do hold the socket lock. Reported-by: Ketan Kulkarni <ketkulka@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Dave Taht <dave.taht@gmail.com> Acked-by: H.K. Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net, wireless: overwrite default_ethtool_opsStanislaw Gruszka
[ Upstream commit d07d7507bfb4e23735c9b83e397c43e1e8a173e8 ] Since: commit 2c60db037034d27f8c636403355d52872da92f81 Author: Eric Dumazet <edumazet@google.com> Date: Sun Sep 16 09:17:26 2012 +0000 net: provide a default dev->ethtool_ops wireless core does not correctly assign ethtool_ops. After alloc_netdev*() call, some cfg80211 drivers provide they own ethtool_ops, but some do not. For them, wireless core provide generic cfg80211_ethtool_ops, which is assigned in NETDEV_REGISTER notify call: if (!dev->ethtool_ops) dev->ethtool_ops = &cfg80211_ethtool_ops; But after Eric's commit, dev->ethtool_ops is no longer NULL (on cfg80211 drivers without custom ethtool_ops), but points to &default_ethtool_ops. In order to fix the problem, provide function which will overwrite default_ethtool_ops and use it by wireless core. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv6: fix the noflags test in addrconf_get_prefix_routeRomain Kuntz
[ Upstream commit 85da53bf1c336bb07ac038fb951403ab0478d2c5 ] The tests on the flags in addrconf_get_prefix_route() does no make much sense: the 'noflags' parameter contains the set of flags that must not match with the route flags, so the test must be done against 'noflags', and not against 'flags'. Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: prevent setting ttl=0 via IP_TTLCong Wang
[ Upstream commit c9be4a5c49cf51cc70a993f004c5bb30067a65ce ] A regression is introduced by the following commit: commit 4d52cfbef6266092d535237ba5a4b981458ab171 Author: Eric Dumazet <eric.dumazet@gmail.com> Date: Tue Jun 2 00:42:16 2009 -0700 net: ipv4/ip_sockglue.c cleanups Pure cleanups but it is not a pure cleanup... - if (val != -1 && (val < 1 || val>255)) + if (val != -1 && (val < 0 || val > 255)) Since there is no reason provided to allow ttl=0, change it back. Reported-by: nitin padalia <padalia.nitin@gmail.com> Cc: nitin padalia <padalia.nitin@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14dm thin: fix queue limits stackingMike Snitzer
commit 0f640dca08330dfc7820d610578e5935b5e654b2 upstream. thin_io_hints() is blindly copying the queue limits from the thin-pool which can lead to incorrect limits being set. The fix here simply deletes the thin_io_hints() hook which leaves the existing stacking infrastructure to set the limits correctly. When a thin-pool uses an MD device for the data device a thin device from the thin-pool must respect MD's constraints about disallowing a bio from spanning multiple chunks. Otherwise we can see problems. If the raid0 chunksize is 1152K and thin-pool chunksize is 256K I see the following md/raid0 error (with extra debug tracing added to thin_endio) when mkfs.xfs is executed against the thin device: md/raid0:md99: make_request bug: can't convert block across chunks or bigger than 1152k 6688 127 device-mapper: thin: bio sector=2080 err=-5 bi_size=130560 bi_rw=17 bi_vcnt=32 bi_idx=0 This extra DM debugging shows that the failing bio is spanning across the first and second logical 1152K chunk (sector 2080 + 255 takes the bio beyond the first chunk's boundary of sector 2304). So the bio splitting that DM is doing clearly isn't respecting the MD limits. max_hw_sectors_kb is 127 for both the thin-pool and thin device (queue_max_hw_sectors returns 255 so we'll excuse sysfs's lack of precision). So this explains why bi_size is 130560. But the thin device's max_hw_sectors_kb should be 4 (PAGE_SIZE) given that it doesn't have a .merge function (for bio_add_page to consult indirectly via dm_merge_bvec) yet the thin-pool does sit above an MD device that has a compulsory merge_bvec_fn. This scenario is exactly why DM must resort to sending single PAGE_SIZE bios to the underlying layer. Some additional context for this is available in the header for commit 8cbeb67a ("dm: avoid unsupported spanning of md stripe boundaries"). Long story short, the reason a thin device doesn't properly get configured to have a max_hw_sectors_kb of 4 (PAGE_SIZE) is that thin_io_hints() is blindly copying the queue limits from the thin-pool device directly to the thin device's queue limits. Fix this by eliminating thin_io_hints. Doing so is safe because the block layer's queue limits stacking already enables the upper level thin device to inherit the thin-pool device's discard and minimum_io_size and optimal_io_size limits that get set in pool_io_hints. But avoiding the queue limits copy allows the thin and thin-pool limits to be different where it is important, namely max_hw_sectors_kb. Reported-by: Daniel Browning <db@kavod.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14mfd: db8500-prcmu: Fix irqdomain usageLinus Walleij
commit 89d9b1c99374997d68910ba49d5b7df80e7f2061 upstream. This fixes two issues with the DB8500 PRCMU irqdomain: - You have to state the irq base 0 to get a linear domain for the DT case from irq_domain_add_simple() - The irqdomain was not used to translate the initial irq request using irq_create_mapping() making the linear case fail as it was lacking a proper descriptor. I took this opportunity to fix two lines of whitespace errors in related code as I was anyway messing around with it. Acked-by Lee Jones <lee.jones@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ath9k_hw: fix calibration issues on chainmask that don't include chain 0Felix Fietkau
commit 4a8f199508d79ff8a7d1e22f47b912baaf225336 upstream. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14media: pwc-if: must check vb2_queue_init() successMauro Carvalho Chehab
commit eda94710d6502672c5ee7de198fa78a63ddfae3a upstream. drivers/media/usb/pwc/pwc-if.c: In function 'usb_pwc_probe': drivers/media/usb/pwc/pwc-if.c:1003:16: warning: ignoring return value of 'vb2_queue_init', declared with attribute warn_unused_result [-Wunused-result] In the past, it used to have a logic there at queue init that would BUG() on errors. This logic got removed. Drivers are now required to explicitly handle the queue initialization errors, or very bad things may happen. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Paul Bolle <pebolle@tiscali.nl>