summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-09-18bcache: Possible flush fixHEADbcache-3.10-stableKent Overstreet
Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
2013-09-18bcache: Fix for handling overlapping extents when reading in a btree nodeKent Overstreet
btree_sort_fixup() was overly clever, because it was trying to avoid pulling a key off the btree iterator in more than one place. This led to a really obscure bug where we'd break early from the loop in btree_sort_fixup() if the current key overlapped with keys in more than one older set, and the next key it overlapped with was zero size. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
2013-09-18bcache: Fix a shrinker deadlockKent Overstreet
GFP_NOIO means we could be getting called recursively - mca_alloc() -> mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then. Whoops. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
2013-09-18bcache: Fix a dumb CPU spinning bug in writebackKent Overstreet
schedule_timeout() != schedule_timeout_uninterruptible() Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
2013-09-18bcache: Fix a flush/fua performance bugKent Overstreet
bch_journal_meta() was missing the flush to make the journal write actually go down (instead of waiting up to journal_delay_ms)... Whoops Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
2013-09-18bcache: Fix a writeback performance regressionKent Overstreet
Background writeback works by scanning the btree for dirty data and adding those keys into a fixed size buffer, then for each dirty key in the keybuf writing it to the backing device. When read_dirty() finishes and it's time to scan for more dirty data, we need to wait for the outstanding writeback IO to finish - they still take up slots in the keybuf (so that foreground writes can check for them to avoid races) - without that wait, we'll continually rescan when we'll be able to add at most a key or two to the keybuf, and that takes locks that starves foreground IO. Doh. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
2013-09-14Linux 3.10.12Greg Kroah-Hartman
2013-09-14ARM: at91: dt: sam9260: add i2c gpio pinctrlJean-Christophe PLAGNIOL-VILLARD
commit f89ae61bd74ae195c464bdd97a134e30908884d5 upstream. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Boris BREZILLON <b.brezillon@overkiz.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14mwifiex: do not create AP and P2P interfaces upon driver loadingBing Zhao
commit 1211c961170cedb21c30d5bb7e2033c8720b38db upstream. Bug 60747 - 1286:2044 [Microsoft Surface Pro] Marvell 88W8797 wifi show 3 interface under network https://bugzilla.kernel.org/show_bug.cgi?id=60747 This issue was also reported previously by OLPC and some folks from the community. There are 3 network interfaces with different types being created when mwifiex driver is loaded: 1. mlan0 (infra. STA) 2. uap0 (AP) 3. p2p0 (P2P_CLIENT) The Network Manager attempts to use all 3 interfaces above without filtering the managed interface type. As the result, 3 identical interfaces are displayed under network manager. If user happens to click on an entry under which its interface is uap0 or p2p0, the association will fail. Work around it by removing the creation of AP and P2P interfaces at driver loading time. These interfaces can be added with 'iw' or other applications manually when they are needed. Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: Avinash Patil <patila@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14drivers/rtc/rtc-max77686.c: Fix wrong registerSangjung Woo
commit 1748cbf7f7c464593232cde914f5a103181a83b5 upstream. Fix a read of the wrong register when checking whether the RTC timer has reached the alarm time. Signed-off-by: Sangjung Woo <sangjung.woo@samsung.com> Signed-off-by: Myugnjoo Ham <myungjoo.ham@samsung.com> Reviewed-by: Jonghwa Lee <jonghwa3.lee@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14crypto: xor - Check for osxsave as well as avx in crypto/xorJohn Haxby
commit edb6f29464afc65fc73767540b854abf63ae7144 upstream. This affects xen pv guests with sufficiently old versions of xen and sufficiently new hardware. On such a system, a guest with a btrfs root won't even boot. Signed-off-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reported-by: Michael Marineau <michael.marineau@coreos.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14net: mvneta: properly disable HW PHY polling and ensure adjust_link() worksThomas Petazzoni
[ Upstream commit 714086029116b6b0a34e67ba1dd2f0d1cf26770c ] This commit fixes a long-standing bug that has been reported by many users: on some Armada 370 platforms, only the network interface that has been used in U-Boot to tftp the kernel works properly in Linux. The other network interfaces can see a 'link up', but are unable to transmit data. The reports were generally made on the Armada 370-based Mirabox, but have also been given on the Armada 370-RD board. The network MAC in the Armada 370/XP (supported by the mvneta driver in Linux) has a functionality that allows it to continuously poll the PHY and directly update the MAC configuration accordingly (speed, duplex, etc.). The very first versions of the driver submitted for review were using this hardware mechanism, but due to this, the driver was not integrated with the kernel phylib. Following reviews, the driver was changed to use the phylib, and therefore a software based polling. In software based polling, Linux regularly talks to the PHY over the MDIO bus, and sees if the link status has changed. If it's the case then the adjust_link() callback of the driver is called to update the MAC configuration accordingly. However, it turns out that the adjust_link() callback was not configuring the hardware in a completely correct way: while it was setting the speed and duplex bits correctly, it wasn't telling the hardware to actually take into account those bits rather than what the hardware-based PHY polling mechanism has concluded. So, in fact the adjust_link() callback was basically a no-op. However, the network happened to be working because on the network interfaces used by U-Boot for tftp on Armada 370 platforms because the hardware PHY polling was enabled by the bootloader, and left enabled by Linux. However, the second network interface not used for tftp (or both network interfaces if the kernel is loaded from USB, NAND or SD card) didn't had the hardware PHY polling enabled. This patch fixes this situation by: (1) Making sure that the hardware PHY polling is disabled by clearing the MVNETA_PHY_POLLING_ENABLE bit in the MVNETA_UNIT_CONTROL register in the driver ->probe() function. (2) Making sure that the duplex and speed selections made by the adjust_link() callback are taken into account by clearing the MVNETA_GMAC_AN_SPEED_EN and MVNETA_GMAC_AN_DUPLEX_EN bits in the MVNETA_GMAC_AUTONEG_CONFIG register. This patch has been tested on Armada 370 Mirabox, and now both network interfaces are usable after boot. [ Problem introduced by commit c5aff18 ("net: mvneta: driver for Marvell Armada 370/XP network unit") ] Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Jochen De Smet <jochen.armkernel@leahnim.org> Cc: Peter Sanford <psanford@nearbuy.io> Cc: Ethan Tuttle <ethan@ethantuttle.com> Cc: Chény Yves-Gael <yves@cheny.fr> Cc: Ryan Press <ryan@presslab.us> Cc: Simon Guinot <simon.guinot@sequanux.org> Cc: vdonnefort@lacie.com Cc: stable@vger.kernel.org Acked-by: Jason Cooper <jason@lakedaemon.net> Tested-by: Vincent Donnefort <vdonnefort@gmail.com> Tested-by: Yves-Gael Cheny <yves@cheny.fr> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14net: ipv6: tcp: fix potential use after free in tcp_v6_do_rcvDaniel Borkmann
[ Upstream commit 3a1c756590633c0e86df606e5c618c190926a0df ] In tcp_v6_do_rcv() code, when processing pkt options, we soley work on our skb clone opt_skb that we've created earlier before entering tcp_rcv_established() on our way. However, only in condition ... if (np->rxopt.bits.rxtclass) np->rcv_tclass = ipv6_get_dsfield(ipv6_hdr(skb)); ... we work on skb itself. As we extract every other information out of opt_skb in ipv6_pktoptions path, this seems wrong, since skb can already be released by tcp_rcv_established() earlier on. When we try to access it in ipv6_hdr(), we will dereference freed skb. [ Bug added by commit 4c507d2897bd9b ("net: implement IP_RECVTOS for IP_PKTOPTIONS") ] Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ipv6: fix null pointer dereference in __ip6addrlbl_addHannes Frederic Sowa
[ Upstream commit 639739b5e609a5074839bb22fc061b37baa06269 ] Commit b67bfe0d42cac56c512dd5da4b1b347a23f4b70a ("hlist: drop the node parameter from iterators") changed the behavior of hlist_for_each_entry_safe to leave the p argument NULL. Fix this up by tracking the last argument. Reported-by: Michele Baldessari <michele@acksyn.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Tested-by: Michele Baldessari <michele@acksyn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14vhost_net: poll vhost queue after marking DMA is doneJason Wang
[ Upstream commit 19c73b3e08d16ee923f3962df4abf6205127896a ] We used to poll vhost queue before making DMA is done, this is racy if vhost thread were waked up before marking DMA is done which can result the signal to be missed. Fix this by always polling the vhost thread before DMA is done. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tg3: Don't turn off led on 5719 serdes port 0Nithin Sujir
[ Upstream commit 989038e217e94161862a959e82f9a1ecf8dda152 ] Turning off led on port 0 of the 5719 serdes causes all other ports to lose power and stop functioning. Add tg3_phy_led_bug() function to check for this condition. We use a switch() in tg3_phy_led_bug() for consistency with the tg3_phy_power_bug() function. Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ICMPv6: treat dest unreachable codes 5 and 6 as EACCES, not EPROTOJiri Bohac
[ Upstream commit 61e76b178dbe7145e8d6afa84bb4ccea71918994 ] RFC 4443 has defined two additional codes for ICMPv6 type 1 (destination unreachable) messages: 5 - Source address failed ingress/egress policy 6 - Reject route to destination Now they are treated as protocol error and icmpv6_err_convert() converts them to EPROTO. RFC 4443 says: "Codes 5 and 6 are more informative subsets of code 1." Treat codes 5 and 6 as code 1 (EACCES) Btw, connect() returning -EPROTO confuses firefox, so that fallback to other/IPv4 addresses does not work: https://bugzilla.mozilla.org/show_bug.cgi?id=910773 Signed-off-by: Jiri Bohac <jbohac@suse.cz> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delayDaniel Borkmann
[ Upstream commit 2d98c29b6fb3de44d9eaa73c09f9cf7209346383 ] While looking into MLDv1/v2 code, I noticed that bridging code does not convert it's max delay into jiffies for MLDv2 messages as we do in core IPv6' multicast code. RFC3810, 5.1.3. Maximum Response Code says: The Maximum Response Code field specifies the maximum time allowed before sending a responding Report. The actual time allowed, called the Maximum Response Delay, is represented in units of milliseconds, and is derived from the Maximum Response Code as follows: [...] As we update timers that work with jiffies, we need to convert it. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Linus Lüssing <linus.luessing@web.de> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14net: revert 8728c544a9c ("net: dev_pick_tx() fix")Eric Dumazet
[ Upstream commit 702821f4ea6f68db18aa1de7d8ed62c6ba586a64 ] commit 8728c544a9cbdc ("net: dev_pick_tx() fix") and commit b6fe83e9525a ("bonding: refine IFF_XMIT_DST_RELEASE capability") are quite incompatible : Queue selection is disabled because skb dst was dropped before entering bonding device. This causes major performance regression, mainly because TCP packets for a given flow can be sent to multiple queues. This is particularly visible when using the new FQ packet scheduler with MQ + FQ setup on the slaves. We can safely revert the first commit now that 416186fbf8c5b ("net: Split core bits of netdev_pick_tx into __netdev_pick_tx") properly caps the queue_index. Reported-by: Xi Wang <xii@google.com> Diagnosed-by: Xi Wang <xii@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Denys Fedorysychenko <nuclearcat@nuclearcat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tipc: set sk_err correctly when connection failsErik Hugne
[ Upstream commit 2c8d85182348021fc0a1bed193a4be4161dc8364 ] Should a connect fail, if the publication/server is unavailable or due to some other error, a positive value will be returned and errno is never set. If the application code checks for an explicit zero return from connect (success) or a negative return (failure), it will not catch the error and subsequent send() calls will fail as shown from the strace snippet below. socket(0x1e /* PF_??? */, SOCK_SEQPACKET, 0) = 3 connect(3, {sa_family=0x1e /* AF_??? */, sa_data="\2\1\322\4\0\0\322\4\0\0\0\0\0\0"}, 16) = 111 sendto(3, "test", 4, 0, NULL, 0) = -1 EPIPE (Broken pipe) The reason for this behaviour is that TIPC wrongly inverts error codes set in sk_err. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tcp: tcp_make_synack() should use sock_wmallocPhil Oester
[ Upstream commit eb8895debe1baba41fcb62c78a16f0c63c21662a ] In commit 90ba9b19 (tcp: tcp_make_synack() can use alloc_skb()), Eric changed the call to sock_wmalloc in tcp_make_synack to alloc_skb. In doing so, the netfilter owner match lost its ability to block the SYNACK packet on outbound listening sockets. Revert the change, restoring the owner match functionality. This closes netfilter bugzilla #847. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ipv6: Don't depend on per socket memory for neighbour discovery messagesThomas Graf
[ Upstream commit 25a6e6b84fba601eff7c28d30da8ad7cfbef0d43 ] Allocating skbs when sending out neighbour discovery messages currently uses sock_alloc_send_skb() based on a per net namespace socket and thus share a socket wmem buffer space. If a netdevice is temporarily unable to transmit due to carrier loss or for other reasons, the queued up ndisc messages will cosnume all of the wmem space and will thus prevent from any more skbs to be allocated even for netdevices that are able to transmit packets. The number of neighbour discovery messages sent is very limited, use of alloc_skb() bypasses the socket wmem buffer size enforcement while the manual call to skb_set_owner_w() maintains the socket reference needed for the IPv6 output path. This patch has orginally been posted by Eric Dumazet in a modified form. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Fabio Estevam <festevam@gmail.com> Tested-by: Fabio Estevam <fabio.estevam@freescale.com> Tested-by: Stephen Warren <swarren@nvidia.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ipv4: sendto/hdrincl: don't use destination address found in headerChris Clark
[ Upstream commit c27c9322d015dc1d9dfdf31724fca71c0476c4d1 ] ipv4: raw_sendmsg: don't use header's destination address A sendto() regression was bisected and found to start with commit f8126f1d5136be1 (ipv4: Adjust semantics of rt->rt_gateway.) The problem is that it tries to ARP-lookup the constructed packet's destination address rather than the explicitly provided address. Fix this using FLOWI_FLAG_KNOWN_NH so that given nexthop is used. cf. commit 2ad5b9e4bd314fc685086b99e90e5de3bc59e26b Reported-by: Chris Clark <chris.clark@alcatel-lucent.com> Bisected-by: Chris Clark <chris.clark@alcatel-lucent.com> Tested-by: Chris Clark <chris.clark@alcatel-lucent.com> Suggested-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Chris Clark <chris.clark@alcatel-lucent.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tcp: don't apply tsoffset if rcv_tsecr is zeroAndrew Vagin
[ Upstream commit e3e12028315749b7fa2edbc37328e5847be9ede9 ] The zero value means that tsecr is not valid, so it's a special case. tsoffset is used to customize tcp_time_stamp for one socket. tsoffset is usually zero, it's used when a socket was moved from one host to another host. Currently this issue affects logic of tcp_rcv_rtt_measure_ts. Due to incorrect value of rcv_tsecr, tcp_rcv_rtt_measure_ts sets rto to TCP_RTO_MAX. Reported-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tcp: initialize rcv_tstamp for restored socketsAndrew Vagin
[ Upstream commit c7781a6e3c4a9a17e144ec2db00ebfea327bd627 ] u32 rcv_tstamp; /* timestamp of last received ACK */ Its value used in tcp_retransmit_timer, which closes socket if the last ack was received more then TCP_RTO_MAX ago. Currently rcv_tstamp is initialized to zero and if tcp_retransmit_timer is called before receiving a first ack, the connection is closed. This patch initializes rcv_tstamp to a timestamp, when a socket was restored. Reported-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14net: usb: Add HP hs2434 device to ZLP exception tableRob Gardner
[ Upstream commit 03803a59e32453ee5737c6096a295f748f03cc49 ] This patch adds another entry (HP hs2434 Mobile Broadband) to the list of exceptional devices that require a zero length packet in order to function properly. This list was added in commit 844e88f0. The hs2434 is manufactured by Sierra Wireless, who also produces the MC7710, which the ZLP exception list was created for in the first place. So hopefully it is just this one producer's devices that will need this workaround. Tested on a DM1-4310NR HP notebook, which does not function without this change. Signed-off-by: Rob Gardner <robmatic@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14be2net: fix disabling TX in be_close()Sathya Perla
[ Upstream commit 6e1f99757a2b24b7255263b2240a0eb04215174d ] commit fba875591 ("disable TX in be_close()") disabled TX in be_close() to protect be_xmit() from touching freed up queues in the AER recovery flow. But, TX must be disabled *before* cleaning up TX completions in the close() path, not after. This allows be_tx_compl_clean() to free up all TX-req skbs that were notified to the HW. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14sfc: Fix lookup of default RX MAC filters when steered using ethtoolBen Hutchings
[ Upstream commit f3851b0acc5a75bd33c6d344a2e4f920e1622ff0 ] commit 385904f819e3 ('sfc: Don't use efx_filter_{build,hash,increment}() for default MAC filters') used the wrong name to find the index of default RX MAC filters at insertion/ update time. This could result in memory corruption and would in any case silently fail to update the filter. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14net_sched: restore "linklayer atm" handlingJesper Dangaard Brouer
[ Upstream commit 8a8e3d84b1719a56f9151909e80ea6ebc5b8e318 ] commit 56b765b79 ("htb: improved accuracy at high rates") broke the "linklayer atm" handling. tc class add ... htb rate X ceil Y linklayer atm The linklayer setting is implemented by modifying the rate table which is send to the kernel. No direct parameter were transferred to the kernel indicating the linklayer setting. The commit 56b765b79 ("htb: improved accuracy at high rates") removed the use of the rate table system. To keep compatible with older iproute2 utils, this patch detects the linklayer by parsing the rate table. It also supports future versions of iproute2 to send this linklayer parameter to the kernel directly. This is done by using the __reserved field in struct tc_ratespec, to convey the choosen linklayer option, but only using the lower 4 bits of this field. Linklayer detection is limited to speeds below 100Mbit/s, because at high rates the rtab is gets too inaccurate, so bad that several fields contain the same values, this resembling the ATM detect. Fields even start to contain "0" time to send, e.g. at 1000Mbit/s sending a 96 bytes packet cost "0", thus the rtab have been more broken than we first realized. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14bridge: Use the correct bit length for bitmap functions in the VLAN codeToshiaki Makita
[ Upstream commit ef40b7ef181b7b1a24df2ef2d1ef84956bffa635 ] The VLAN code needs to know the length of the per-port VLAN bitmap to perform its most basic operations (retrieving VLAN informations, removing VLANs, forwarding database manipulation, etc). Unfortunately, in the current implementation we are using a macro that indicates the bitmap size in longs in places where the size in bits is expected, which in some cases can cause what appear to be random failures. Use the correct macro. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14packet: restore packet statistics tp_packets to include dropsWillem de Bruijn
[ Upstream commit 8bcdeaff5ed544704a9a691d4aef0adb3f9c5b8f ] getsockopt PACKET_STATISTICS returns tp_packets + tp_drops. Commit ee80fbf301 ("packet: account statistics only in tpacket_stats_u") cleaned up the getsockopt PACKET_STATISTICS code. This also changed semantics. Historically, tp_packets included tp_drops on return. The commit removed the line that adds tp_drops into tp_packets. This patch reinstates the old semantics. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tcp: set timestamps for restored skb-sAndrey Vagin
[ Upstream commit 7ed5c5ae96d23da22de95e1c7a239537acd378b1 ] When the repair mode is turned off, the write queue seqs are updated so that the whole queue is considered to be 'already sent. The "when" field must be set for such skb. It's used in tcp_rearm_rto for example. If the "when" field isn't set, the retransmit timeout can be calculated incorrectly and a tcp connected can stop for two minutes (TCP_RTO_MAX). Acked-by: Pavel Emelyanov <xemul@parallels.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ipv6: drop packets with multiple fragmentation headersHannes Frederic Sowa
[ Upstream commit f46078cfcd77fa5165bf849f5e568a7ac5fa569c ] It is not allowed for an ipv6 packet to contain multiple fragmentation headers. So discard packets which were already reassembled by fragmentation logic and send back a parameter problem icmp. The updates for RFC 6980 will come in later, I have to do a bit more research here. Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ipv6: remove max_addresses check from ipv6_create_tempaddrHannes Frederic Sowa
[ Upstream commit 4b08a8f1bd8cb4541c93ec170027b4d0782dab52 ] Because of the max_addresses check attackers were able to disable privacy extensions on an interface by creating enough autoconfigured addresses: <http://seclists.org/oss-sec/2012/q4/292> But the check is not actually needed: max_addresses protects the kernel to install too many ipv6 addresses on an interface and guards addrconf_prefix_rcv to install further addresses as soon as this limit is reached. We only generate temporary addresses in direct response of a new address showing up. As soon as we filled up the maximum number of addresses of an interface, we stop installing more addresses and thus also stop generating more temp addresses. Even if the attacker tries to generate a lot of temporary addresses by announcing a prefix and removing it again (lifetime == 0) we won't install more temp addresses, because the temporary addresses do count to the maximum number of addresses, thus we would stop installing new autoconfigured addresses when the limit is reached. This patch fixes CVE-2013-0343 (but other layer-2 attacks are still possible). Thanks to Ding Tianhong to bring this topic up again. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: George Kargiotakis <kargig@void.gr> Cc: P J P <ppandit@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tun: signedness bug in tun_get_user()Dan Carpenter
[ Upstream commit 15718ea0d844e4816dbd95d57a8a0e3e264ba90e ] The recent fix d9bf5f1309 "tun: compare with 0 instead of total_len" is not totally correct. Because "len" and "sizeof()" are size_t type, that means they are never less than zero. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg headerAsbjoern Sloth Toennesen
[ Upstream commit 3e805ad288c524bb65aad3f1e004402223d3d504 ] Fix the iproute2 command `bridge vlan show`, after switching from rtgenmsg to ifinfomsg. Let's start with a little history: Feb 20: Vlad Yasevich got his VLAN-aware bridge patchset included in the 3.9 merge window. In the kernel commit 6cbdceeb, he added attribute support to bridge GETLINK requests sent with rtgenmsg. Mar 6th: Vlad got this iproute2 reference implementation of the bridge vlan netlink interface accepted (iproute2 9eff0e5c) Apr 25th: iproute2 switched from using rtgenmsg to ifinfomsg (63338dca) http://patchwork.ozlabs.org/patch/239602/ http://marc.info/?t=136680900700007 Apr 28th: Linus released 3.9 Apr 30th: Stephen released iproute2 3.9.0 The `bridge vlan show` command haven't been working since the switch to ifinfomsg, or in a released version of iproute2. Since the kernel side only supports rtgenmsg, which iproute2 switched away from just prior to the iproute2 3.9.0 release. I haven't been able to find any documentation, about neither rtgenmsg nor ifinfomsg, and in which situation to use which, but kernel commit 88c5b5ce seams to suggest that ifinfomsg should be used. Fixing this in kernel will break compatibility, but I doubt that anybody have been using it due to this bug in the user space reference implementation, at least not without noticing this bug. That said the functionality is still fully functional in 3.9, when reversing iproute2 commit 63338dca. This could also be fixed in iproute2, but thats an ugly patch that would reintroduce rtgenmsg in iproute2, and from searching in netdev it seams like rtgenmsg usage is discouraged. I'm assuming that the only reason that Vlad implemented the kernel side to use rtgenmsg, was because iproute2 was using it at the time. Signed-off-by: Asbjoern Sloth Toennesen <ast@fiberby.net> Reviewed-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id.Pravin B Shelar
[ Upstream commit 4221f40513233fa8edeef7fc82e44163fde03b9b ] Using inner-id for tunnel id is not safe in some rare cases. E.g. packets coming from multiple sources entering same tunnel can have same id. Therefore on tunnel packet receive we could have packets from two different stream but with same source and dst IP with same ip-id which could confuse ip packet reassembly. Following patch reverts optimization from commit 490ab08127 (IP_GRE: Fix IP-Identification.) Signed-off-by: Pravin B Shelar <pshelar@nicira.com> CC: Jarno Rajahalme <jrajahalme@nicira.com> CC: Ansis Atteka <aatteka@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14genl: Hold reference on correct module while netlink-dump.Pravin B Shelar
[ Upstream commit 33c6b1f6b154894321f5734e50c66621e9134e7e ] netlink dump operations take module as parameter to hold reference for entire netlink dump duration. Currently it holds ref only on genl module which is not correct when we use ops registered to genl from another module. Following patch adds module pointer to genl_ops so that netlink can hold ref count on it. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> CC: Jesse Gross <jesse@nicira.com> CC: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14genl: Fix genl dumpit() locking.Pravin B Shelar
[ Upstream commit 9b96309c5b0b9e466773c07a5bc8b7b68fcf010a ] In case of genl-family with parallel ops off, dumpif() callback is expected to run under genl_lock, But commit def3117493eafd9df (genl: Allow concurrent genl callbacks.) changed this behaviour where only first dumpit() op was called under genl-lock. For subsequent dump, only nlk->cb_lock was taken. Following patch fixes it by defining locked dumpit() and done() callback which takes care of genl-locking. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> CC: Jesse Gross <jesse@nicira.com> CC: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14rtnetlink: Fix inverted check in ndo_dflt_fdb_del()Sridhar Samudrala
[ Upstream commit 645359930231d5e78fd3296a38b98c1a658a7ade ] Fix inverted check when deleting an fdb entry. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-148139cp: Fix skb leak in rx_status_loop failure path.Dave Jones
[ Upstream commit d06f5187469eee1b2932c02fd093d113cfc60d5e ] Introduced in cf3c4c03060b688cbc389ebc5065ebcce5653e96 ("8139cp: Add dma_mapping_error checking") Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ip_gre: fix ipgre_header to return correct offset MIME-Version: 1.0Timo Teräs
[ Upstream commit 77a482bdb2e68d13fae87541b341905ba70d572b ] Fix ipgre_header() (header_ops->create) to return the correct amount of bytes pushed. Most callers of dev_hard_header() seem to care only if it was success, but af_packet.c uses it as offset to the skb to copy from userspace only once. In practice this fixes packet socket sendto()/sendmsg() to gre tunnels. Regression introduced in c54419321455631079c7d6e60bc732dd0c5914c5 ("GRE: Refactor GRE tunneling code.") Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14ipv6: don't stop backtracking in fib6_lookup_1 if subtree does not matchHannes Frederic Sowa
[ Upstream commit 3e3be275851bc6fc90bfdcd732cd95563acd982b ] In case a subtree did not match we currently stop backtracking and return NULL (root table from fib_lookup). This could yield in invalid routing table lookups when using subtrees. Instead continue to backtrack until a valid subtree or node is found and return this match. Also remove unneeded NULL check. Reported-by: Teco Boot <teco@inf-net.nl> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: David Lamparter <equinox@diac24.net> Cc: <boutier@pps.univ-paris-diderot.fr> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tcp: cubic: fix bug in bictcp_acked()Eric Dumazet
[ Upstream commit cd6b423afd3c08b27e1fed52db828ade0addbc6b ] While investigating about strange increase of retransmit rates on hosts ~24 days after boot, Van found hystart was disabled if ca->epoch_start was 0, as following condition is true when tcp_time_stamp high order bit is set. (s32)(tcp_time_stamp - ca->epoch_start) < HZ Quoting Van : At initialization & after every loss ca->epoch_start is set to zero so I believe that the above line will turn off hystart as soon as the 2^31 bit is set in tcp_time_stamp & hystart will stay off for 24 days. I think we've observed that cubic's restart is too aggressive without hystart so this might account for the higher drop rate we observe. Diagnosed-by: Van Jacobson <vanj@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14tcp: cubic: fix overflow error in bictcp_update()Eric Dumazet
[ Upstream commit 2ed0edf9090bf4afa2c6fc4f38575a85a80d4b20 ] commit 17a6e9f1aa9 ("tcp_cubic: fix clock dependency") added an overflow error in bictcp_update() in following code : /* change the unit from HZ to bictcp_HZ */ t = ((tcp_time_stamp + msecs_to_jiffies(ca->delay_min>>3) - ca->epoch_start) << BICTCP_HZ) / HZ; Because msecs_to_jiffies() being unsigned long, compiler does implicit type promotion. We really want to constrain (tcp_time_stamp - ca->epoch_start) to a signed 32bit value, or else 't' has unexpected high values. This bugs triggers an increase of retransmit rates ~24 days after boot [1], as the high order bit of tcp_time_stamp flips. [1] for hosts with HZ=1000 Big thanks to Van Jacobson for spotting this problem. Diagnosed-by: Van Jacobson <vanj@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14bridge: don't try to update timers in case of broken MLD queriesLinus Lüssing
[ Upstream commit 248ba8ec05a2c3b118c2224e57eb10c128176ab1 ] Currently we are reading an uninitialized value for the max_delay variable when snooping an MLD query message of invalid length and would update our timers with that. Fixing this by simply ignoring such broken MLD queries (just like we do for IGMP already). This is a regression introduced by: "bridge: disable snooping if there is no querier" (b00589af3b04) Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14fib_trie: remove potential out of bound accessEric Dumazet
[ Upstream commit aab515d7c32a34300312416c50314e755ea6f765 ] AddressSanitizer [1] dynamic checker pointed a potential out of bound access in leaf_walk_rcu() We could allocate one more slot in tnode_new() to leave the prefetch() in-place but it looks not worth the pain. Bug added in commit 82cfbb008572b ("[IPV4] fib_trie: iterator recode") [1] : https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14bonding: modify only neigh_parms owned by usVeaceslav Falico
[ Upstream commit 9918d5bf329d0dc5bb2d9d293bcb772bdb626e65 ] Otherwise, on neighbour creation, bond_neigh_init() will be called with a foreign netdev. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14neighbour: populate neigh_parms on alloc before calling ndo_neigh_setupVeaceslav Falico
[ Upstream commit 63134803a6369dcf7dddf7f0d5e37b9566b308d2 ] dev->ndo_neigh_setup() might need some of the values of neigh_parms, so populate them before calling it. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14macvlan: validate flagsMichael S. Tsirkin
[ Upstream commit 1512747820367c8b3b8b72035f0f78c62f2bf1e9 ] commit df8ef8f3aaa6692970a436204c4429210addb23a macvlan: add FDB bridge ops and macvlan flags added a flags field to macvlan, which can be controlled from userspace. The idea is to make the interface future-proof so we can add flags and not new fields. However, flags value isn't validated, as a result, userspace can't detect which flags are supported. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>