path: root/net
Age  Commit message  Author
2021-03-07  Bluetooth: Fix null pointer dereference in amp_read_loc_assoc_final_data  Gopal Tiwari
[ Upstream commit e8bd76ede155fd54d8c41d045dda43cd3174d506 ]

The kernel panic trace looks like:

 #5 [ffffb9e08698fc80] do_page_fault at ffffffffb666e0d7
 #6 [ffffb9e08698fcb0] page_fault at ffffffffb70010fe
    [exception RIP: amp_read_loc_assoc_final_data+63]
    RIP: ffffffffc06ab54f  RSP: ffffb9e08698fd68  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff8c8845a5a000  RCX: 0000000000000004
    RDX: 0000000000000000  RSI: ffff8c8b9153d000  RDI: ffff8c8845a5a000
    RBP: ffffb9e08698fe40   R8: 00000000000330e0   R9: ffffffffc0675c94
    R10: ffffb9e08698fe58  R11: 0000000000000001  R12: ffff8c8b9cbf6200
    R13: 0000000000000000  R14: 0000000000000000  R15: ffff8c8b2026da0b
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb9e08698fda8] hci_event_packet at ffffffffc0676904 [bluetooth]
 #8 [ffffb9e08698fe50] hci_rx_work at ffffffffc06629ac [bluetooth]
 #9 [ffffb9e08698fe98] process_one_work at ffffffffb66f95e7

hcon->amp_mgr appears to be NULL, which triggers the kernel panic at the following line in amp_read_loc_assoc_final_data():

    set_bit(READ_LOC_AMP_ASSOC_FINAL, &mgr->state);

Fix this by checking mgr for NULL.

Signed-off-by: Gopal Tiwari <gtiwari@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
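The guard described in this commit can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the actual patch: `struct conn`, `struct mgr`, `READ_LOC_ASSOC_FINAL_BIT`, and `mark_assoc_final()` are hypothetical stand-ins for `hcon`, `hcon->amp_mgr`, the state flag, and the kernel function, and a plain bit-OR stands in for `set_bit()`.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; illustration only. */
#define READ_LOC_ASSOC_FINAL_BIT 0x1UL

struct mgr {
	unsigned long state;
};

struct conn {
	struct mgr *amp_mgr;	/* may be NULL if no manager is attached */
};

/*
 * Returns 0 if the flag was set, -1 if there was nothing to update.
 * Without the NULL checks, a connection that has no manager would be
 * dereferenced, which is the kind of crash the commit describes.
 */
static int mark_assoc_final(struct conn *c)
{
	struct mgr *m;

	if (!c)
		return -1;

	m = c->amp_mgr;
	if (!m)
		return -1;

	m->state |= READ_LOC_ASSOC_FINAL_BIT;
	return 0;
}
```

The early return keeps the normal path unchanged and turns the missing-manager case into a no-op instead of a fault.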
2021-03-07  Bluetooth: Add new HCI_QUIRK_NO_SUSPEND_NOTIFIER quirk  Hans de Goede
[ Upstream commit 219991e6be7f4a31d471611e265b72f75b2d0538 ] Some devices, e.g. the RTL8723BS bluetooth part, some USB attached devices, completely drop from the bus on a system-suspend. These devices will have their driver unbound and rebound on resume (when the dropping of the bus gets detected) and will show up as a new HCI after resume. These devices do not benefit from the suspend / resume handling work done by the hci_suspend_notifier. At best this unnecessarily adds some time to the suspend/resume time. But this may also actually cause problems, if the code doing the driver unbinding runs after the pm-notifier then the hci_suspend_notifier code will try to talk to a device which is now in an uninitialized state. This commit adds a new HCI_QUIRK_NO_SUSPEND_NOTIFIER quirk which allows drivers to opt-out of the hci_suspend_notifier when they know beforehand that their device will be fully re-initialized / reprobed on resume. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-07  pktgen: fix misuse of BUG_ON() in pktgen_thread_worker()  Di Zhu
[ Upstream commit 275b1e88cabb34dbcbe99756b67e9939d34a99b6 ] pktgen creates threads for all online CPUs and binds each thread to its respective CPU. When such a thread is first woken up, it compares the CPU it is currently running on with the CPU specified at creation time; if the two differ, BUG_ON() fires and panics the system. Note that these threads can be migrated to other CPUs before they start running, because of CPU hotplug after the threads have been created. The BUG_ON() used here therefore seems unreasonable; replace it with WARN_ON() to print a warning rather than panic the system. Signed-off-by: Di Zhu <zhudi21@huawei.com> Link: https://lore.kernel.org/r/20210125124229.19334-1-zhudi21@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-07  Bluetooth: btusb: fix memory leak on suspend and resume  Vamshi K Sthambamkadi
[ Upstream commit 5ff20cbe6752a5bc06ff58fee8aa11a0d5075819 ] kmemleak report: unreferenced object 0xffff9b1127f00500 (size 208): comm "kworker/u17:2", pid 500, jiffies 4294937470 (age 580.136s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 60 ed 05 11 9b ff ff 00 00 00 00 00 00 00 00 .`.............. backtrace: [<000000006ab3fd59>] kmem_cache_alloc_node+0x17a/0x480 [<0000000051a5f6f9>] __alloc_skb+0x5b/0x1d0 [<0000000037e2d252>] hci_prepare_cmd+0x32/0xc0 [bluetooth] [<0000000010b586d5>] hci_req_add_ev+0x84/0xe0 [bluetooth] [<00000000d2deb520>] hci_req_clear_event_filter+0x42/0x70 [bluetooth] [<00000000f864bd8c>] hci_req_prepare_suspend+0x84/0x470 [bluetooth] [<000000001deb2cc4>] hci_prepare_suspend+0x31/0x40 [bluetooth] [<000000002677dd79>] process_one_work+0x209/0x3b0 [<00000000aaa62b07>] worker_thread+0x34/0x400 [<00000000826d176c>] kthread+0x126/0x140 [<000000002305e558>] ret_from_fork+0x22/0x30 unreferenced object 0xffff9b1125c6ee00 (size 512): comm "kworker/u17:2", pid 500, jiffies 4294937470 (age 580.136s) hex dump (first 32 bytes): 04 00 00 00 0d 00 00 00 05 0c 01 00 11 9b ff ff ................ 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ 
backtrace: [<000000009f07c0cc>] slab_post_alloc_hook+0x59/0x270 [<0000000049431dc2>] __kmalloc_node_track_caller+0x15f/0x330 [<00000000027a42f6>] __kmalloc_reserve.isra.70+0x31/0x90 [<00000000e8e3e76a>] __alloc_skb+0x87/0x1d0 [<0000000037e2d252>] hci_prepare_cmd+0x32/0xc0 [bluetooth] [<0000000010b586d5>] hci_req_add_ev+0x84/0xe0 [bluetooth] [<00000000d2deb520>] hci_req_clear_event_filter+0x42/0x70 [bluetooth] [<00000000f864bd8c>] hci_req_prepare_suspend+0x84/0x470 [bluetooth] [<000000001deb2cc4>] hci_prepare_suspend+0x31/0x40 [bluetooth] [<000000002677dd79>] process_one_work+0x209/0x3b0 [<00000000aaa62b07>] worker_thread+0x34/0x400 [<00000000826d176c>] kthread+0x126/0x140 [<000000002305e558>] ret_from_fork+0x22/0x30 unreferenced object 0xffff9b112b395788 (size 8): comm "kworker/u17:2", pid 500, jiffies 4294937470 (age 580.136s) hex dump (first 8 bytes): 20 00 00 00 00 00 04 00 ....... backtrace: [<0000000052dc28d2>] kmem_cache_alloc_trace+0x15e/0x460 [<0000000046147591>] alloc_ctrl_urb+0x52/0xe0 [btusb] [<00000000a2ed3e9e>] btusb_send_frame+0x91/0x100 [btusb] [<000000001e66030e>] hci_send_frame+0x7e/0xf0 [bluetooth] [<00000000bf6b7269>] hci_cmd_work+0xc5/0x130 [bluetooth] [<000000002677dd79>] process_one_work+0x209/0x3b0 [<00000000aaa62b07>] worker_thread+0x34/0x400 [<00000000826d176c>] kthread+0x126/0x140 [<000000002305e558>] ret_from_fork+0x22/0x30 In pm sleep-resume context, while the btusb device rebinds, it enters hci_unregister_dev(), whilst there is a possibility of hdev receiving PM_POST_SUSPEND suspend_notifier event, leading to generation of msg frames. When hci_unregister_dev() completes, i.e. hdev context is destroyed/freed, those intermittently sent msg frames cause memory leak. BUG details: Below is stack trace of thread that enters hci_unregister_dev(), marks the hdev flag HCI_UNREGISTER to 1, and then goes onto to wait on notifier lock - refer unregister_pm_notifier(). 
  hci_unregister_dev+0xa5/0x320 [bluetoot]
  btusb_disconnect+0x68/0x150 [btusb]
  usb_unbind_interface+0x77/0x250
  ? kernfs_remove_by_name_ns+0x75/0xa0
  device_release_driver_internal+0xfe/0x1
  device_release_driver+0x12/0x20
  bus_remove_device+0xe1/0x150
  device_del+0x192/0x3e0
  ? usb_remove_ep_devs+0x1f/0x30
  usb_disable_device+0x92/0x1b0
  usb_disconnect+0xc2/0x270
  hub_event+0x9f6/0x15d0
  ? rpm_idle+0x23/0x360
  ? rpm_idle+0x26b/0x360
  process_one_work+0x209/0x3b0
  worker_thread+0x34/0x400
  ? process_one_work+0x3b0/0x3b0
  kthread+0x126/0x140
  ? kthread_park+0x90/0x90
  ret_from_fork+0x22/0x30

Below is the stack trace of the thread executing hci_suspend_notifier(), which processes the PM_POST_SUSPEND event while the unbinding thread is waiting on the lock.

  hci_suspend_notifier.cold.39+0x5/0x2b [bluetooth]
  blocking_notifier_call_chain+0x69/0x90
  pm_notifier_call_chain+0x1a/0x20
  pm_suspend.cold.9+0x334/0x352
  state_store+0x84/0xf0
  kobj_attr_store+0x12/0x20
  sysfs_kf_write+0x3b/0x40
  kernfs_fop_write+0xda/0x1c0
  vfs_write+0xbb/0x250
  ksys_write+0x61/0xe0
  __x64_sys_write+0x1a/0x20
  do_syscall_64+0x37/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix hci_suspend_notifier() so that it does not act on events when the HCI_UNREGISTER flag is set.

Signed-off-by: Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-07  net: fix dev_ifsioc_locked() race condition  Cong Wang
commit 3b23a32a63219f51a5298bc55a65ecee866e79d0 upstream.

dev_ifsioc_locked() is called with only RCU read lock, so when there is a parallel writer changing the mac address, it could get a partially updated mac address, as shown below:

Thread 1                                          Thread 2
// eth_commit_mac_addr_change()
memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
                                                  // dev_ifsioc_locked()
                                                  memcpy(ifr->ifr_hwaddr.sa_data,
                                                         dev->dev_addr,...);

Close this race condition by guarding them with a RW semaphore, like netdev_get_name(). We can not use seqlock here as it does not allow blocking. The writers already take RTNL anyway, so this does not affect the slow path. To avoid bothering existing dev_set_mac_address() callers in drivers, introduce a new wrapper just for user-facing callers on ioctl and rtnetlink paths.

Note, bonding also changes slave mac addresses but that requires a separate patch due to the complexity of bonding code.

Fixes: 3710becf8a58 ("net: RCU locking for simple ioctl()")
Reported-by: "Gong, Sishuai" <sishuai@purdue.edu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07  net: psample: Fix netlink skb length with tunnel info  Chris Mi
commit a93dcaada2ddb58dbc72652b42548adedd646d7a upstream. Currently, the psample netlink skb is allocated with a size that does not account for the nested 'PSAMPLE_ATTR_TUNNEL' attribute and the padding required for the 64-bit attribute 'PSAMPLE_TUNNEL_KEY_ATTR_ID'. This can result in failure to add attributes to the netlink skb due to insufficient tail room. The following error message is printed to the kernel log: "Could not create psample log message". Fix this by adjusting the allocation size to take into account the nested attribute and the padding. Fixes: d8bed686ab96 ("net: psample: Add tunnel support") CC: Yotam Gigi <yotam.gi@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Chris Mi <cmi@nvidia.com> Link: https://lore.kernel.org/r/20210225075145.184314-1-cmi@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
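The sizing problem in this commit comes down to attribute arithmetic: every netlink attribute has a 4-byte header and is aligned to 4 bytes, a nest adds one more header around its children, and a 64-bit attribute may need an extra empty pad attribute in front of it. The following is a userspace sketch of that arithmetic, not the psample code. The helper names are hypothetical (loosely modeled on the kernel's `nla_total_size()` family), the 64-bit helper assumes the worst case where padding is always needed, and the 4-byte sibling field in `tunnel_nest_size()` is invented for the example.

```c
#include <stddef.h>

/* Netlink attributes are aligned to 4 bytes and carry a 4-byte header. */
#define ATTR_ALIGNTO 4u
#define ATTR_ALIGN(len) (((len) + ATTR_ALIGNTO - 1) & ~(ATTR_ALIGNTO - 1))
#define ATTR_HDRLEN ATTR_ALIGN(4u)

/* Space for one attribute carrying `payload` bytes of data. */
static unsigned int attr_total_size(unsigned int payload)
{
	return ATTR_ALIGN(ATTR_HDRLEN + payload);
}

/* Worst case for a 64-bit attribute: an empty pad attribute precedes it. */
static unsigned int attr_total_size_64bit(unsigned int payload)
{
	return attr_total_size(payload) + attr_total_size(0);
}

/* A nest is one more header wrapped around already-sized children. */
static unsigned int attr_nest_total_size(unsigned int inner)
{
	return attr_total_size(inner);
}

/* Hypothetical nest: one 64-bit id plus one 4-byte field. */
static unsigned int tunnel_nest_size(void)
{
	unsigned int inner = attr_total_size_64bit(8) + attr_total_size(4);

	return attr_nest_total_size(inner);
}
```

Summing only the payload attributes would give 20 bytes here; accounting for the pad attribute and the nest header gives 28, which is the kind of shortfall that leaves too little tail room.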
2021-03-07  net: hsr: add support for EntryForgetTime  Marco Wenzel
commit f176411401127a07a9360dec14eca448eb2e9d45 upstream. In IEC 62439-3, EntryForgetTime is defined with a value of 400 ms. When a node does not send any frame within this time, the sequence number check can be ignored. This solves communication issues with Cisco IE 2000 in Redbox mode. Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Marco Wenzel <marco.wenzel@a-eberle.de> Reviewed-by: George McCollister <george.mccollister@gmail.com> Tested-by: George McCollister <george.mccollister@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210224094653.1440-1-marco.wenzel@a-eberle.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
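The rule stated in this commit can be sketched as a small decision function: a frame is normally dropped when its sequence number is not newer than the last one accepted, but that check is skipped once the node has been silent for longer than EntryForgetTime. This is a userspace illustration under assumptions, not the HSR implementation: `struct node_entry`, `seq_after()`, and `accept_frame()` are hypothetical names, time is a plain millisecond counter, and whether the boundary is `>` or `>=` is a guess.

```c
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_FORGET_TIME_MS 400u	/* value quoted in the commit message */

/* Hypothetical per-node state; not the kernel's data structure. */
struct node_entry {
	uint16_t last_seq;	/* last accepted sequence number */
	uint32_t last_seen_ms;	/* time the last frame was accepted */
};

/* True if `a` is strictly newer than `b`, allowing 16-bit wrap-around. */
static bool seq_after(uint16_t a, uint16_t b)
{
	uint16_t d = (uint16_t)(a - b);

	return d != 0 && d < 0x8000u;
}

static bool accept_frame(struct node_entry *n, uint16_t seq, uint32_t now_ms)
{
	uint32_t idle_ms = now_ms - n->last_seen_ms;
	bool forgotten = idle_ms > ENTRY_FORGET_TIME_MS;

	/* Within the forget time, old or repeated sequence numbers are dropped. */
	if (!forgotten && !seq_after(seq, n->last_seq))
		return false;

	n->last_seq = seq;
	n->last_seen_ms = now_ms;
	return true;
}
```

A peer that restarts its sequence counter after a pause is therefore accepted again once the forget time has elapsed, instead of being rejected until its counter catches up.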
2021-03-07  net: dsa: tag_rtl4_a: Support also egress tags  Linus Walleij
commit 86dd9868b8788a9063893a97649594af93cd5aa6 upstream. Support also transmitting frames using the custom "8899 A" 4 byte tag. Qingfang came up with the solution: we need to pad the ethernet frame to 60 bytes using eth_skb_pad(), then the switch will happily accept frames with custom tags. Cc: Mauri Sandberg <sandberg@mailfence.com> Reported-by: DENG Qingfang <dqfext@gmail.com> Fixes: efd7fe68f0c6 ("net: dsa: tag_rtl4_a: Implement Realtek 4 byte A tag") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
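The padding step mentioned in this commit is simple to show in isolation. The sketch below is a userspace stand-in, not the tagger code: `pad_frame()` and `MIN_FRAME_LEN` are hypothetical names that mimic what padding a frame to the 60-byte Ethernet minimum amounts to, operating on a raw byte buffer rather than an skb.

```c
#include <stddef.h>
#include <string.h>

#define MIN_FRAME_LEN 60u	/* minimum Ethernet frame length without FCS */

/*
 * Zero-pad the frame in `buf` (currently `len` bytes, capacity `cap`)
 * up to MIN_FRAME_LEN. Returns the new length, or 0 if the buffer is
 * too small to hold the padded frame.
 */
static size_t pad_frame(unsigned char *buf, size_t len, size_t cap)
{
	if (len >= MIN_FRAME_LEN)
		return len;
	if (cap < MIN_FRAME_LEN)
		return 0;

	memset(buf + len, 0, MIN_FRAME_LEN - len);
	return MIN_FRAME_LEN;
}
```

Padding before the tag is inserted means the switch always sees a frame of at least the minimum length, whatever the size of the original payload.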
2021-03-07  net/sched: cls_flower: Reject invalid ct_state flags rules  wenxu
commit 1bcc51ac0731aab1b109b2cd5c3d495f1884e5ca upstream. Reject the unsupported and invalid ct_state flags of cls flower rules. Fixes: e0ace68af2ac ("net/sched: cls_flower: Add matching on conntrack info") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07  net: bridge: use switchdev for port flags set through sysfs too  Vladimir Oltean
commit 8043c845b63a2dd88daf2d2d268a33e1872800f0 upstream. Looking through patchwork I don't see that there was any consensus to use switchdev notifiers only in case of netlink provided port flags but not sysfs (as a sort of deprecation, punishment or anything like that), so we should probably keep the user interface consistent in terms of functionality. http://patchwork.ozlabs.org/project/netdev/patch/20170605092043.3523-3-jiri@resnulli.us/ http://patchwork.ozlabs.org/project/netdev/patch/20170608064428.4785-3-jiri@resnulli.us/ Fixes: 3922285d96e7 ("net: bridge: Add support for offloading port attributes") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07  mptcp: fix DATA_FIN generation on early shutdown  Paolo Abeni
commit d87903b63e3ce1eafaa701aec5cc1d0ecd0d84dc upstream. If the msk is closed before sending or receiving any data, no DATA_FIN is generated; instead, an MPC ack packet is crafted. In this scenario, the MPTCP protocol creates and sends a pure ack. Such a packet also matches the criteria for an MPC ack, so the protocol first tries to insert MPC options, leading to the described error. This change addresses the issue by avoiding the insertion of an MPC option for DATA_FIN packets or if the sub-flow is not established. To avoid performing the same test multiple times, fetch the data_fin flag into a bool variable and pass it to both of the helpers that need it. Fixes: 6d0060f600ad ("mptcp: Write MPTCP DSS headers to outgoing data packets") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07  mptcp: do not wakeup listener for MPJ subflows  Paolo Abeni
commit 52557dbc7538ecceb27ef2206719a47a8039a335 upstream. MPJ subflows are not exposed as fds to user space. As such, incoming MPJ subflows are removed from the accept queue by tcp_check_req()/tcp_get_cookie_sock(). Later, tcp_child_process() invokes subflow_data_ready() on the parent socket regardless of the subflow kind, leading to poll wakeups even if the later accept will block. Address the issue by double-checking the queue state before waking user space. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/164 Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07  mptcp: fix spurious retransmissions  Paolo Abeni
commit 64b9cea7a0afe579dd2682f1f1c04f2e4e72fd25 upstream. Syzkaller was able to trigger the following splat again: WARNING: CPU: 1 PID: 12512 at net/mptcp/protocol.c:761 mptcp_reset_timer+0x12a/0x160 net/mptcp/protocol.c:761 Modules linked in: CPU: 1 PID: 12512 Comm: kworker/1:6 Not tainted 5.10.0-rc6 #52 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:mptcp_reset_timer+0x12a/0x160 net/mptcp/protocol.c:761 Code: e8 4b 0c ad ff e8 56 21 88 fe 48 b8 00 00 00 00 00 fc ff df 48 c7 04 03 00 00 00 00 48 83 c4 40 5b 5d 41 5c c3 e8 36 21 88 fe <0f> 0b 41 bc c8 00 00 00 eb 98 e8 e7 b1 af fe e9 30 ff ff ff 48 c7 RSP: 0018:ffffc900018c7c68 EFLAGS: 00010293 RAX: ffff888108cb1c80 RBX: 1ffff92000318f8d RCX: ffffffff82ad0307 RDX: 0000000000000000 RSI: ffffffff82ad036a RDI: 0000000000000007 RBP: ffff888113e2d000 R08: ffff888108cb1c80 R09: ffffed10227c5ab7 R10: ffff888113e2d5b7 R11: ffffed10227c5ab6 R12: 0000000000000000 R13: ffff88801f100000 R14: ffff888113e2d5b0 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88811b500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd76a874ef8 CR3: 000000001689c005 CR4: 0000000000170ee0 Call Trace: mptcp_worker+0xaa4/0x1560 net/mptcp/protocol.c:2334 process_one_work+0x8d3/0x1200 kernel/workqueue.c:2272 worker_thread+0x9c/0x1090 kernel/workqueue.c:2418 kthread+0x303/0x410 kernel/kthread.c:292 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:296 The mptcp_worker tries to update the MPTCP retransmission timer even if such timer is not currently scheduled. The mptcp_rtx_head() return value is bogus: we can have enqueued data not yet transmitted. The above may additionally cause spurious, unneeded MPTCP-level retransmissions. Fix the issue adding an explicit clearing of the rtx queue before trying to retransmit and checking for unacked data. 
Additionally drop an unneeded timer stop call and the unused mptcp_rtx_tail() helper. Reported-by: Christoph Paasch <cpaasch@apple.com> Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07  net: fix up truesize of cloned skb in skb_prepare_for_shift()  Marco Elver
commit 097b9146c0e26aabaa6ff3e5ea536a53f5254a79 upstream. Avoid the assumption that ksize(kmalloc(S)) == ksize(kmalloc(S)): when cloning an skb, save and restore truesize after pskb_expand_head(). This can occur if the allocator decides to service an allocation of the same size differently (e.g. use a different size class, or pass the allocation on to KFENCE). Because truesize is used for bookkeeping (such as sk_wmem_queued), a modified truesize of a cloned skb may result in corrupt bookkeeping and relevant warnings (such as in sk_stream_kill_queues()). Link: https://lkml.kernel.org/r/X9JR/J6dMMOy1obu@elver.google.com Reported-by: syzbot+7b99aafdcc2eedea6178@syzkaller.appspotmail.com Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20210201160420.2826895-1-elver@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
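The save-and-restore idea in this commit can be shown with a toy model. The sketch below is not skb code: `struct buf`, `expand_bigger()`, and `expand_keep_truesize()` are hypothetical, and the "expansion" merely simulates an allocator that reports a different usable size for the same request.

```c
/* Hypothetical miniature of a buffer descriptor; illustration only. */
struct buf {
	unsigned int truesize;	/* bytes charged to the owning socket */
};

/* Stand-in for a reallocation step that may change the accounted size. */
typedef int (*expand_fn)(struct buf *b);

/* Simulates an allocator that serves the same request from a larger class. */
static int expand_bigger(struct buf *b)
{
	b->truesize += 64;
	return 0;
}

/*
 * Expand the buffer but keep its accounted size unchanged, so that the
 * owner's bookkeeping (what was charged at enqueue time) still balances
 * when the buffer is later released.
 */
static int expand_keep_truesize(struct buf *b, expand_fn expand)
{
	unsigned int saved = b->truesize;
	int err = expand(b);

	b->truesize = saved;
	return err;
}
```

Because the charge made when the buffer was queued used the old value, restoring it avoids a mismatch between what was added to and what is later subtracted from the owner's counters.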
2021-03-07  net/af_iucv: remove WARN_ONCE on malformed RX packets  Alexander Egorenkov
commit 27e9c1de529919d8dd7d072415d3bcae77709300 upstream. syzbot reported the following finding: AF_IUCV failed to receive skb, len=0 WARNING: CPU: 0 PID: 522 at net/iucv/af_iucv.c:2039 afiucv_hs_rcv+0x174/0x190 net/iucv/af_iucv.c:2039 CPU: 0 PID: 522 Comm: syz-executor091 Not tainted 5.10.0-rc1-syzkaller-07082-g55027a88ec9f #0 Hardware name: IBM 3906 M04 701 (KVM/Linux) Call Trace: [<00000000b87ea538>] afiucv_hs_rcv+0x178/0x190 net/iucv/af_iucv.c:2039 ([<00000000b87ea534>] afiucv_hs_rcv+0x174/0x190 net/iucv/af_iucv.c:2039) [<00000000b796533e>] __netif_receive_skb_one_core+0x13e/0x188 net/core/dev.c:5315 [<00000000b79653ce>] __netif_receive_skb+0x46/0x1c0 net/core/dev.c:5429 [<00000000b79655fe>] netif_receive_skb_internal+0xb6/0x220 net/core/dev.c:5534 [<00000000b796ac3a>] netif_receive_skb+0x42/0x318 net/core/dev.c:5593 [<00000000b6fd45f4>] tun_rx_batched.isra.0+0x6fc/0x860 drivers/net/tun.c:1485 [<00000000b6fddc4e>] tun_get_user+0x1c26/0x27f0 drivers/net/tun.c:1939 [<00000000b6fe0f00>] tun_chr_write_iter+0x158/0x248 drivers/net/tun.c:1968 [<00000000b4f22bfa>] call_write_iter include/linux/fs.h:1887 [inline] [<00000000b4f22bfa>] new_sync_write+0x442/0x648 fs/read_write.c:518 [<00000000b4f238fe>] vfs_write.part.0+0x36e/0x5d8 fs/read_write.c:605 [<00000000b4f2984e>] vfs_write+0x10e/0x148 fs/read_write.c:615 [<00000000b4f29d0e>] ksys_write+0x166/0x290 fs/read_write.c:658 [<00000000b8dc4ab4>] system_call+0xe0/0x28c arch/s390/kernel/entry.S:415 Last Breaking-Event-Address: [<00000000b8dc64d4>] __s390_indirect_jump_r14+0x0/0xc Malformed RX packets shouldn't generate any warnings because debugging info already flows to dropmon via the kfree_skb(). Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04  net_sched: fix RTNL deadlock again caused by request_module()  Cong Wang
commit d349f997686887906b1183b5be96933c5452362a upstream.

tcf_action_init_1() loads tc action modules automatically with request_module() after parsing the tc action names, and it drops the RTNL lock before request_module() and re-takes it afterwards. This causes a lot of trouble, as discovered by syzbot, because we can be in the middle of batch initializations when we create an array of tc actions.

One of the problems is a deadlock:

CPU 0                                   CPU 1
rtnl_lock();
for (...) {
  tcf_action_init_1();
    -> rtnl_unlock();
    -> request_module();
                                        rtnl_lock();
                                        for (...) {
                                          tcf_action_init_1();
                                            -> tcf_idr_check_alloc();
                                            // Insert one action into idr,
                                            // but it is not committed until
                                            // tcf_idr_insert_many(), then drop
                                            // the RTNL lock in the _next_
                                            // iteration
                                            -> rtnl_unlock();
    -> rtnl_lock();
    -> a_o->init();
      -> tcf_idr_check_alloc();
      // Now waiting for the same index
      // to be committed
                                            -> request_module();
                                            -> rtnl_lock()
                                            // Now waiting for RTNL lock
                                        }
                                        rtnl_unlock();
}
rtnl_unlock();

This is not easy to solve. We can move request_module() before this loop, pre-load all the modules we need for this netlink message, and then do the rest of the initialization. So the loop breaks down into two:

        for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
                struct tc_action_ops *a_o;

                a_o = tc_action_load_ops(name, tb[i]...);
                ops[i - 1] = a_o;
        }

        for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
                act = tcf_action_init_1(ops[i - 1]...);
        }

Although this looks serious, it has only been reported by syzbot, so it seems hard for humans to trigger. And given the size of this patch, I'd suggest sending it to net-next and not backporting it to stable.

This patch has been tested by syzbot and tested with tdc.py by me.
Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together") Reported-and-tested-by: syzbot+82752bc5331601cf4899@syzkaller.appspotmail.com Reported-and-tested-by: syzbot+b3b63b6bff456bd95294@syzkaller.appspotmail.com Reported-by: syzbot+ba67b12b1ca729912834@syzkaller.appspotmail.com Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20210117005657.14810-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04  net: qrtr: Fix memory leak in qrtr_tun_open  Takeshi Misawa
commit fc0494ead6398609c49afa37bc949b61c5c16b91 upstream. If qrtr_endpoint_register() failed, tun is leaked. Fix this, by freeing tun in error path. syzbot report: BUG: memory leak unreferenced object 0xffff88811848d680 (size 64): comm "syz-executor684", pid 10171, jiffies 4294951561 (age 26.070s) hex dump (first 32 bytes): 80 dd 0a 84 ff ff ff ff 00 00 00 00 00 00 00 00 ................ 90 d6 48 18 81 88 ff ff 90 d6 48 18 81 88 ff ff ..H.......H..... backtrace: [<0000000018992a50>] kmalloc include/linux/slab.h:552 [inline] [<0000000018992a50>] kzalloc include/linux/slab.h:682 [inline] [<0000000018992a50>] qrtr_tun_open+0x22/0x90 net/qrtr/tun.c:35 [<0000000003a453ef>] misc_open+0x19c/0x1e0 drivers/char/misc.c:141 [<00000000dec38ac8>] chrdev_open+0x10d/0x340 fs/char_dev.c:414 [<0000000079094996>] do_dentry_open+0x1e6/0x620 fs/open.c:817 [<000000004096d290>] do_open fs/namei.c:3252 [inline] [<000000004096d290>] path_openat+0x74a/0x1b00 fs/namei.c:3369 [<00000000b8e64241>] do_filp_open+0xa0/0x190 fs/namei.c:3396 [<00000000a3299422>] do_sys_openat2+0xed/0x230 fs/open.c:1172 [<000000002c1bdcef>] do_sys_open fs/open.c:1188 [inline] [<000000002c1bdcef>] __do_sys_openat fs/open.c:1204 [inline] [<000000002c1bdcef>] __se_sys_openat fs/open.c:1199 [inline] [<000000002c1bdcef>] __x64_sys_openat+0x7f/0xe0 fs/open.c:1199 [<00000000f3a5728f>] do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 [<000000004b38b7ec>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 28fb4e59a47d ("net: qrtr: Expose tunneling endpoint to user space") Reported-by: syzbot+5d6e4af21385f5cfc56a@syzkaller.appspotmail.com Signed-off-by: Takeshi Misawa <jeliantsurux@gmail.com> Link: https://lore.kernel.org/r/20210221234427.GA2140@DESKTOP Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04  net: sched: fix police ext initialization  Vlad Buslov
commit 396d7f23adf9e8c436dd81a69488b5b6a865acf8 upstream.

When a police action is created by the cls API through the first conditional in tcf_exts_validate(), which calls tcf_action_init_1() directly, the action idr is not updated according to the latest changes in the action API, which require the caller to commit a newly created action to the idr with tcf_idr_insert_many(). As a result, the action is not accessible through the act API, which causes the crash reported by syzbot:

==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline]
BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
BUG: KASAN: null-ptr-deref in __tcf_idr_release net/sched/act_api.c:178 [inline]
BUG: KASAN: null-ptr-deref in tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
Read of size 4 at addr 0000000000000010 by task kworker/u4:5/204

CPU: 0 PID: 204 Comm: kworker/u4:5 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 __kasan_report mm/kasan/report.c:400 [inline]
 kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413
 check_memory_region_inline mm/kasan/generic.c:179 [inline]
 check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
 instrument_atomic_read include/linux/instrumented.h:71 [inline]
 atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
 __tcf_idr_release net/sched/act_api.c:178 [inline]
 tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
 tc_action_net_exit include/net/act_api.h:151 [inline]
 police_exit_net+0x168/0x360 net/sched/act_police.c:390
 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 ================================================================== Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 204 Comm: kworker/u4:5 Tainted: G B 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 panic+0x306/0x73d kernel/panic.c:231 end_report+0x58/0x5e mm/kasan/report.c:100 __kasan_report mm/kasan/report.c:403 [inline] kasan_report.cold+0x67/0xd5 mm/kasan/report.c:413 check_memory_region_inline mm/kasan/generic.c:179 [inline] check_memory_region+0x13d/0x180 mm/kasan/generic.c:185 instrument_atomic_read include/linux/instrumented.h:71 [inline] atomic_read include/asm-generic/atomic-instrumented.h:27 [inline] __tcf_idr_release net/sched/act_api.c:178 [inline] tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598 tc_action_net_exit include/net/act_api.h:151 [inline] police_exit_net+0x168/0x360 net/sched/act_police.c:390 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Kernel Offset: disabled Fix the issue by calling tcf_idr_insert_many() after successful action initialization. Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together") Reported-by: syzbot+151e3e714d34ae4ce7e8@syzkaller.appspotmail.com Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04  net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending  Jason A. Donenfeld
commit ee576c47db60432c37e54b1e2b43a8ca6d3a8dca upstream.

The icmp{,v6}_send functions make all sorts of use of skb->cb, casting it with IPCB or IP6CB, assuming the skb to have come directly from the inet layer. But when the packet comes from the ndo layer, especially when forwarded, there's no telling what might be in skb->cb at that point. As a result, the icmp sending code risks reading bogus memory contents, which can result in nasty stack overflows such as this one reported by a user:

    panic+0x108/0x2ea
    __stack_chk_fail+0x14/0x20
    __icmp_send+0x5bd/0x5c0
    icmp_ndo_send+0x148/0x160

In icmp_send, skb->cb is cast with IPCB and an ip_options struct is read from it. The optlen parameter there is of particular note, as it can induce writes beyond bounds. There are quite a few ways that can happen in __ip_options_echo. For example:

    // sptr/skb are attacker-controlled skb bytes
    sptr = skb_network_header(skb);
    // dptr/dopt points to stack memory allocated by __icmp_send
    dptr = dopt->__data;
    // sopt is the corrupt skb->cb in question
    if (sopt->rr) {
        optlen  = sptr[sopt->rr+1]; // corrupt skb->cb + skb->data
        soffset = sptr[sopt->rr+2]; // corrupt skb->cb + skb->data
        // this now writes potentially attacker-controlled data, over
        // flowing the stack:
        memcpy(dptr, sptr+sopt->rr, optlen);
    }

In the icmpv6_send case, the story is similar, but not as dire, as only IP6CB(skb)->iif and IP6CB(skb)->dsthao are used. The dsthao case is worse than the iif case, but it is passed to ipv6_find_tlv, which does a bit of bounds checking on the value.

This is easy to simulate by doing a `memset(skb->cb, 0x41, sizeof(skb->cb));` before calling icmp{,v6}_ndo_send, and it's only by good fortune and the rarity of icmp sending from that context that we've avoided reports like this until now.
For example, in KASAN: BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0xa0e/0x12b0 Write of size 38 at addr ffff888006f1f80e by task ping/89 CPU: 2 PID: 89 Comm: ping Not tainted 5.10.0-rc7-debug+ #5 Call Trace: dump_stack+0x9a/0xcc print_address_description.constprop.0+0x1a/0x160 __kasan_report.cold+0x20/0x38 kasan_report+0x32/0x40 check_memory_region+0x145/0x1a0 memcpy+0x39/0x60 __ip_options_echo+0xa0e/0x12b0 __icmp_send+0x744/0x1700 Actually, out of the 4 drivers that do this, only gtp zeroed the cb for the v4 case, while the rest did not. So this commit actually removes the gtp-specific zeroing, while putting the code where it belongs in the shared infrastructure of icmp{,v6}_ndo_send. This commit fixes the issue by passing an empty IPCB or IP6CB along to the functions that actually do the work. For the icmp_send, this was already trivial, thanks to __icmp_send providing the plumbing function. For icmpv6_send, this required a tiny bit of refactoring to make it behave like the v4 case, after which it was straight forward. Fixes: a2b78e9b2cac ("sunvnet: generate ICMP PTMUD messages for smaller port MTUs") Reported-by: SinYu <liuxyon@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/netdev/CAF=yD-LOF116aHub6RMe8vB8ZpnrrnoTdqhobEx+bvoA8AsP0w@mail.gmail.com/T/ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20210223131858.72082-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04svcrdma: Hold private mutex while invoking rdma_accept()Chuck Lever
[ Upstream commit 0ac24c320c4d89a9de6ec802591398b8675c7b3c ] RDMA core mutex locking was restructured by commit d114c6feedfe ("RDMA/cma: Add missing locking to rdma_accept()") [Aug 2020]. When lock debugging is enabled, the RPC/RDMA server trips over the new lockdep assertion in rdma_accept() because it doesn't call rdma_accept() from its CM event handler. As a temporary fix, have svc_rdma_accept() take the handler_mutex explicitly. In the meantime, let's consider how to restructure the RPC/RDMA transport to invoke rdma_accept() from the proper context. Calls to svc_rdma_accept() are serialized with calls to svc_rdma_free() by the generic RPC server layer. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/linux-rdma/20210209154014.GO4247@nvidia.com/ Fixes: d114c6feedfe ("RDMA/cma: Add missing locking to rdma_accept()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04tty: convert tty_ldisc_ops 'read()' function to take a kernel pointerLinus Torvalds
[ Upstream commit 3b830a9c34d5897be07176ce4e6f2d75e2c8cfd7 ] The tty line discipline .read() function was passed the final user pointer destination as an argument, which doesn't match the 'write()' function, and makes it very inconvenient to do a splice method for ttys. This is a conversion to use a kernel buffer instead. NOTE! It does this by passing the tty line discipline ->read() function an additional "cookie" to fill in, and an offset into the cookie data. The line discipline can fill in the cookie data with its own private information, and then the reader will repeat the read until either the cookie is cleared or it runs out of data. The only real user of this is N_HDLC, which can use this to handle big packets, even if the kernel buffer is smaller than the whole packet. Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04Bluetooth: Remove hci_req_le_suspend_configAbhishek Pandit-Subedi
[ Upstream commit 295fa2a5647b13681594bb1bcc76c74619035218 ] Add a missing SUSPEND_SCAN_ENABLE in passive scan, remove the separate function for configuring le scan during suspend and update the request complete function to clear both enable and disable tasks. Fixes: dce0a4be8054 ("Bluetooth: Set missing suspend task bits") Reviewed-by: Alain Michaud <alainm@chromium.org> Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04bpf: Fix bpf_fib_lookup helper MTU check for SKB ctxJesper Dangaard Brouer
[ Upstream commit 2c0a10af688c02adcf127aad29e923e0056c6b69 ] BPF end-user on Cilium slack-channel (Carlo Carraro) wants to use bpf_fib_lookup for doing MTU-check, but *prior* to extending packet size, by adjusting fib_params 'tot_len' with the packet length plus the expected encap size. (Just like the bpf_check_mtu helper supports). He discovered that for SKB ctx the param->tot_len was not used, instead skb->len was used (via MTU check in is_skb_forwardable() that checks against netdev MTU). Fix this by using fib_params 'tot_len' for MTU check. If not provided (e.g. zero) then keep existing TC behaviour intact. Notice that 'tot_len' for MTU check is done like XDP code-path, which checks against FIB-dst MTU. V16: - Revert V13 optimization, 2nd lookup is against egress/resulting netdev V13: - Only do ifindex lookup one time, calling dev_get_by_index_rcu(). V10: - Use same method as XDP for 'tot_len' MTU check Fixes: 4c79579b44b1 ("bpf: Change bpf_fib_lookup to return lookup status") Reported-by: Carlo Carraro <colrack@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/161287789444.790810.15247494756551413508.stgit@firesoul Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04mac80211: fix potential overflow when multiplying to u32 integersColin Ian King
[ Upstream commit 6194f7e6473be78acdc5d03edd116944bdbb2c4e ] The multiplication of the u32 variables tx_time and estimated_retx is performed using a 32 bit multiplication and the result is stored in a u64 result. This has a potential u32 overflow issue, so avoid this by casting tx_time to a u64 to force a 64 bit multiply. Addresses-Coverity: ("Unintentional integer overflow") Fixes: 050ac52cbe1f ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210205175352.208841-1-colin.king@canonical.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
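The overflow described above can be reproduced outside the kernel. The following is a minimal userspace sketch, not the mac80211 code: the function names are hypothetical, and it assumes the common case of a 32-bit int, where two 32-bit operands are multiplied in 32 bits and the product wraps before it is widened.

```c
#include <assert.h>
#include <stdint.h>

/* The wrap shown below relies on uint32_t not being promoted to a wider int. */
static_assert(sizeof(int) == sizeof(uint32_t), "sketch assumes a 32-bit int");

/* Hypothetical buggy form: the product is computed in 32 bits and wraps
 * modulo 2^32 before it is converted to the 64-bit return type. */
static uint64_t demo_retx_cost_wrapping(uint32_t tx_time, uint32_t estimated_retx)
{
	return tx_time * estimated_retx;
}

/* Hypothetical fixed form: widening one operand first forces a 64-bit
 * multiply, which is the technique the commit message describes. */
static uint64_t demo_retx_cost_widened(uint32_t tx_time, uint32_t estimated_retx)
{
	return (uint64_t)tx_time * estimated_retx;
}
```

With both operands equal to 0x10000, the first function returns 0 while the second returns 0x100000000.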
2021-03-04Bluetooth: Put HCI device if inquiry procedure interruptsPan Bian
[ Upstream commit 28a758c861ff290e39d4f1ee0aa5df0f0b9a45ee ] Jump to the label done to decrement the reference count of the HCI device hdev on the path where the Inquiry procedure is interrupted. Fixes: 3e13fa1e1fab ("Bluetooth: Fix hci_inquiry ioctl usage") Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04Bluetooth: drop HCI device reference before returnPan Bian
[ Upstream commit 5a3ef03afe7e12982dc3b978f4c5077c907f7501 ] Call hci_dev_put() to decrement the reference count of the HCI device hdev if duplicating memory fails. Fixes: 0b26ab9dce74 ("Bluetooth: AMP: Handle Accept phylink command status evt") Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04Bluetooth: Fix initializing response id after clearing structChristopher William Snowhill
[ Upstream commit a5687c644015a097304a2e47476c0ecab2065734 ] Looks like this was missed when patching the source to clear the structures throughout, causing this one instance to clear the struct after the response id is assigned. Fixes: eddb7732119d ("Bluetooth: A2MP: Fix not initializing all members") Signed-off-by: Christopher William Snowhill <chris@kode54.net> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: "Another pile of networking fixes: 1) ath9k build error fix from Arnd Bergmann. 2) dma memory leak fix in mediatek driver from Lorenzo Bianconi. 3) bpf int3 kprobe fix from Alexei Starovoitov. 4) bpf stackmap integer overflow fix from Bui Quang Minh. 5) Add usb device ids for Cinterion MV31 to qmi_wwan driver, from Christoph Schemmel. 6) Don't update deleted entry in xt_recent netfilter module, from Jozsef Kadlecsik. 7) Use after free in nftables, fix from Pablo Neira Ayuso. 8) Header checksum fix in flowtable from Sven Auhagen. 9) Validate user controlled length in qrtr code, from Sabyrzhan Tasbolatov. 10) Fix race in xen/netback, from Juergen Gross. 11) New device ID in cxgb4, from Raju Rangoju. 12) Fix ring locking in rxrpc release call, from David Howells. 13) Don't return LAPB error codes from x25_open(), from Xie He. 14) Missing error returns in gsi_channel_setup() from Alex Elder. 15) Get skb_copy_and_csum_datagram working properly with odd segment sizes, from Willem de Bruijn. 16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean. 17) Do teardown on probe failure in DSA, from Vladimir Oltean. 18) Fix compilation failures of txtimestamp selftest, from Vadim Fedorenko. 19) Limit rx per-napi gro queue size to fix latency regression, from Eric Dumazet. 20) dpaa_eth xdp fixes from Camelia Groza. 21) Missing txq mode update when switching CBS off, in stmmac driver, from Mohammad Athari Bin Ismail. 22) Failover pending logic fix in ibmvnic driver, from Sukadev Bhattiprolu. 23) Null deref fix in vmw_vsock, from Norbert Slusarek. 24) Missing verdict update in xdp paths of ena driver, from Shay Agroskin. 25) seq_file iteration fix in sctp from Neil Brown. 26) bpf 32-bit src register truncation fix on div/mod, from Daniel Borkmann. 27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann. 28) Fix locking in vsock_shutdown(), from Stefano Garzarella.
29) Various missing index bound checks in hns3 driver, from Yufeng Mo. 30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from Vladimir Oltean. 31) Don't mix up stp and mrp port states in bridge layer, from Horatiu Vultur. 32) Fix locking during netif_tx_disable(), from Edwin Peer" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) bpf: Fix 32 bit src register truncation on div/mod bpf: Fix verifier jmp32 pruning decision logic bpf: Fix verifier jsgt branch analysis on max bound vsock: fix locking in vsock_shutdown() net: hns3: add a check for index in hclge_get_rss_key() net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx() net: hns3: add a check for queue_id in hclge_reset_vf_queue() net: dsa: felix: implement port flushing on .phylink_mac_link_down switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state net: watchdog: hold device global xmit lock during tx disable netfilter: nftables: relax check for stateful expressions in set definition netfilter: conntrack: skip identical origin tuple in same zone only vsock/virtio: update credit only if socket is not closed net: fix iteration for sctp transport seq_files net: ena: Update XDP verdict upon failure net/vmw_vsock: improve locking in vsock_connect_timeout() net/vmw_vsock: fix NULL pointer dereference ibmvnic: Clear failover_pending if unable to schedule net: stmmac: set TxQ mode back to DCB after disabling CBS ...
2021-02-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) nf_conntrack_tuple_taken() needs to recheck zone for NAT clash resolution, from Florian Westphal. 2) Restore support for stateful expressions when set definition specifies no stateful expressions. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09vsock: fix locking in vsock_shutdown()Stefano Garzarella
In vsock_shutdown() we touched some socket fields without holding the socket lock, such as 'state' and 'sk_flags'. Also, after the introduction of multi-transport, we are accessing 'vsk->transport' in vsock_send_shutdown() without holding the lock and this call can be made while the connection is in progress, so the transport can change in the meantime. To avoid issues, we hold the socket lock when we enter in vsock_shutdown() and release it when we leave. Among the transports that implement the 'shutdown' callback, only hyperv_transport acquired the lock. Since the caller now holds it, we no longer take it. Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_stateHoratiu Vultur
The function br_mrp_port_switchdev_set_state was called both with MRP port state and STP port state, which is an issue because they don't match exactly. Therefore, update the function to be used only with STP port state and use the id SWITCHDEV_ATTR_ID_PORT_STP_STATE. The choice of using STP over MRP is that the drivers already implement SWITCHDEV_ATTR_ID_PORT_STP_STATE and already in SW we update the port STP state. Fixes: 9a9f26e8f7ea30 ("bridge: mrp: Connect MRP API with the switchdev API") Fixes: fadd409136f0f2 ("bridge: switchdev: mrp: Implement MRP API for switchdev") Fixes: 2f1a11ae11d222 ("bridge: mrp: Add MRP interface.") Reported-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09netfilter: nftables: relax check for stateful expressions in set definitionPablo Neira Ayuso
Restore the original behaviour where users are allowed to add an element with any stateful expression if the set definition specifies no stateful expressions. Make sure upper maximum number of stateful expressions of NFT_SET_EXPR_MAX is not reached. Fixes: 8cfd9b0f8515 ("netfilter: nftables: generalize set expressions support") Fixes: 48b0ae046ee9 ("netfilter: nftables: netlink support for several set element expressions") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-02-09netfilter: conntrack: skip identical origin tuple in same zone onlyFlorian Westphal
The origin skip check needs to re-test the zone. Else, we might skip a colliding tuple in the reply direction. This only occurs when using 'directional zones' where origin tuples reside in different zones but the reply tuples share the same zone. This causes the new conntrack entry to be dropped at confirmation time because NAT clash resolution was elided. Fixes: 4e35c1cb9460240 ("netfilter: nf_nat: skip nat clash resolution for same-origin entries") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-02-08vsock/virtio: update credit only if socket is not closedStefano Garzarella
If the socket is closed or is being released, some resources used by virtio_transport_space_update() such as 'vsk->trans' may be released. To avoid a use after free bug we should only update the available credit when we are sure the socket is still open and we have the lock held. Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20210208144454.84438-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-08net: fix iteration for sctp transport seq_filesNeilBrown
The sctp transport seq_file iterators take a reference to the transport in the ->start and ->next functions and release the reference in the ->show function. The preferred handling for such resources is to release them in the subsequent ->next or ->stop function call. Since commit 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface") there is no guarantee that ->show will be called after ->next, so this function can now leak references. So move the sctp_transport_put() call to ->next and ->stop. Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface") Reported-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06net/vmw_vsock: improve locking in vsock_connect_timeout()Norbert Slusarek
Eric Dumazet identified a possible locking issue in vsock_connect_timeout() that might cause a null pointer dereference in vsock_transport_cancel_pkt(). This patch ensures that vsock_transport_cancel_pkt() is called with the lock held, so the race that could leave vsk->transport set to NULL can no longer occur. Fixes: 380feae0def7 ("vsock: cancel packets when failing to connect") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Norbert Slusarek <nslusarek@gmx.net> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/trinity-f8e0937a-cf0e-4d80-a76e-d9a958ba3ef1-1612535522360@3c-app-gmx-bap12 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06net/vmw_vsock: fix NULL pointer dereferenceNorbert Slusarek
In vsock_stream_connect(), a thread will enter schedule_timeout(). While being scheduled out, another thread can enter vsock_stream_connect() as well and set vsk->transport to NULL. In case a signal was sent, the first thread can leave schedule_timeout() and vsock_transport_cancel_pkt() will be called right after. Inside vsock_transport_cancel_pkt(), a null dereference will happen on transport->cancel_pkt. Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Signed-off-by: Norbert Slusarek <nslusarek@gmx.net> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/trinity-c2d6cede-bfb1-44e2-85af-1fbc7f541715-1612535117028@3c-app-gmx-bap12 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06Merge tag 'wireless-drivers-2021-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-driversJakub Kicinski
Kalle Valo says: ==================== wireless-drivers fixes for v5.11 Third, and most likely the last, set of fixes for v5.11. Two very small fixes. ath9k * fix build regression related to LEDS_CLASS mt76 * fix a memory leak * tag 'wireless-drivers-2021-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers: mt76: dma: fix a possible memory leak in mt76_add_fragment() ath9k: fix build error with LEDS_CLASS=m ==================== Link: https://lore.kernel.org/r/20210205163434.14D94C433ED@smtp.codeaurora.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05net: gro: do not keep too many GRO packets in napi->rx_listEric Dumazet
Commit c80794323e82 ("net: Fix packet reordering caused by GRO and listified RX cooperation") had the unfortunate effect of adding latencies in common workloads. Before the patch, GRO packets were immediately passed to upper stacks. After the patch, we can accumulate quite a lot of GRO packets (depending on NAPI budget). My fix is to count the number of segments in napi->rx_count instead of the number of logical packets. Fixes: c80794323e82 ("net: Fix packet reordering caused by GRO and listified RX cooperation") Signed-off-by: Eric Dumazet <edumazet@google.com> Bisected-by: John Sperbeck <jsperbeck@google.com> Tested-by: Jian Yang <jianyang@google.com> Cc: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Alexander Lobakin <alobakin@pm.me> Link: https://lore.kernel.org/r/20210204213146.4192368-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
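The accounting change in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct demo_napi, DEMO_BATCH and demo_queue() are hypothetical stand-ins, not the kernel's napi_struct or its batching threshold. It shows why charging a GRO packet by its segment count makes the pending list flush sooner than charging it as one logical packet.

```c
#include <assert.h>

#define DEMO_BATCH 8u	/* hypothetical flush threshold */

struct demo_napi {
	unsigned int rx_count;	/* units currently queued on the pending list */
	unsigned int flushes;	/* how many times the list was handed upward */
};

/* Queue one packet and charge it `units` against the batch threshold.
 * Passing 1 models counting logical packets; passing the packet's
 * segment count models the behaviour described in the commit message. */
static void demo_queue(struct demo_napi *n, unsigned int units)
{
	assert(units > 0);
	n->rx_count += units;
	if (n->rx_count >= DEMO_BATCH) {
		n->flushes++;
		n->rx_count = 0;
	}
}
```

Queueing three packets of four segments each triggers one flush when charged per segment, and none when each packet is charged as a single unit.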
2021-02-05Merge tag 'nfsd-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd fix from Chuck Lever: "Fix non-page-aligned NFS READs" * tag 'nfsd-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Fix NFS READs that start at non-page-aligned offsets
2021-02-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Fix combination of --reap and --update in xt_recent that triggers UAF, from Jozsef Kadlecsik. 2) Fix current year in nft_meta selftest, from Fabian Frederick. 3) Fix possible UAF in the netns destroy path of nftables. 4) Fix incorrect checksum calculation when mangling ports in flowtable, from Sven Auhagen. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: flowtable: fix tcp and udp header checksum update netfilter: nftables: fix possible UAF over chains from packet path in netns selftests: netfilter: fix current year netfilter: xt_recent: Fix attempt to update deleted entry ==================== Link: https://lore.kernel.org/r/20210205001727.2125-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04net: dsa: call teardown method on probe failureVladimir Oltean
Since teardown is supposed to undo the effects of the setup method, it should be called in the error path for dsa_switch_setup, not just in dsa_switch_teardown. Fixes: 5e3f847a02aa ("net: dsa: Add teardown callback for drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210204163351.2929670-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04udp: fix skb_copy_and_csum_datagram with odd segment sizesWillem de Bruijn
When iteratively computing a checksum with csum_block_add, track the offset "pos" to correctly rotate in csum_block_add when offset is odd. The open coded implementation of skb_copy_and_csum_datagram did this. With the switch to __skb_datagram_iter calling csum_and_copy_to_iter, pos was reinitialized to 0 on each call. Bring back the pos by passing it along with the csum to the callback. Changes v1->v2 - pass csum value, instead of csump pointer (Alexander Duyck) Link: https://lore.kernel.org/netdev/20210128152353.GB27281@optiplex/ Fixes: 950fcaecd5cc ("datagram: consolidate datagram copy to iter helpers") Reported-by: Oliver Graute <oliver.graute@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20210203192952.1849843-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
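Why the running offset matters can be seen in a simplified userspace model of the ones' complement checksum. This is a sketch, not the kernel's csum_block_add(): it works on folded 16-bit sums read as big-endian words rather than on the kernel's 32-bit partial sums, and every demo_ name is hypothetical. A block that starts at an odd offset has its bytes in the opposite halves of each 16-bit word, so its sum is byte-swapped before being added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator down to a 16-bit ones' complement sum. */
static uint16_t demo_fold(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Sum a buffer as big-endian 16-bit words; a trailing odd byte is
 * treated as the high byte of a word padded with zero. */
static uint16_t demo_csum(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)
		sum += (uint32_t)buf[i] << 8;
	return demo_fold(sum);
}

/* Add a block's sum to the running sum. If the block began at an odd
 * offset, swap its two bytes first so they land in the right halves. */
static uint16_t demo_block_add(uint16_t csum, uint16_t csum2, size_t offset)
{
	if (offset & 1)
		csum2 = (uint16_t)((csum2 << 8) | (csum2 >> 8));
	return demo_fold((uint64_t)csum + csum2);
}

/* Checksum buf in pieces of `chunk` bytes. With track_pos == 0 the
 * offset is reported as 0 for every piece, modelling the lost "pos". */
static uint16_t demo_csum_chunked(const uint8_t *buf, size_t len,
				  size_t chunk, int track_pos)
{
	uint16_t csum = 0;
	size_t pos = 0;

	assert(chunk > 0);
	while (pos < len) {
		size_t n = len - pos < chunk ? len - pos : chunk;

		csum = demo_block_add(csum, demo_csum(buf + pos, n),
				      track_pos ? pos : 0);
		pos += n;
	}
	return csum;
}
```

With odd-sized pieces, the chunked result matches the single-pass checksum only when the offset is tracked; with even-sized pieces the offset never becomes odd and both variants agree.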
2021-02-04rxrpc: Fix clearance of Tx/Rx ring when releasing a callDavid Howells
At the end of rxrpc_release_call(), rxrpc_cleanup_ring() is called to clear the Rx/Tx skbuff ring, but this doesn't lock the ring whilst it's accessing it. Unfortunately, rxrpc_resend() might be trying to retransmit a packet concurrently with this - and whilst it does lock the ring, this isn't protection against rxrpc_cleanup_call(). Fix this by removing the call to rxrpc_cleanup_ring() from rxrpc_release_call(). rxrpc_cleanup_ring() will be called again anyway from rxrpc_cleanup_call(). The earlier call is just an optimisation to recycle skbuffs more quickly. Alternative solutions include rxrpc_release_call() could try to cancel the work item or wait for it to complete or rxrpc_cleanup_ring() could lock when accessing the ring (which would require a bh lock). This can produce a report like the following: BUG: KASAN: use-after-free in rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372 Read of size 4 at addr ffff888011606e04 by task kworker/0:0/5 ... Workqueue: krxrpcd rxrpc_process_call Call Trace: ... kasan_report.cold+0x79/0xd5 mm/kasan/report.c:413 rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372 rxrpc_resend net/rxrpc/call_event.c:266 [inline] rxrpc_process_call+0x1634/0x1f60 net/rxrpc/call_event.c:412 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 ... Allocated by task 2318: ... sock_alloc_send_pskb+0x793/0x920 net/core/sock.c:2348 rxrpc_send_data+0xb51/0x2bf0 net/rxrpc/sendmsg.c:358 rxrpc_do_sendmsg+0xc03/0x1350 net/rxrpc/sendmsg.c:744 rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:560 ... Freed by task 2318: ... 
kfree_skb+0x140/0x3f0 net/core/skbuff.c:704 rxrpc_free_skb+0x11d/0x150 net/rxrpc/skbuff.c:78 rxrpc_cleanup_ring net/rxrpc/call_object.c:485 [inline] rxrpc_release_call+0x5dd/0x860 net/rxrpc/call_object.c:552 rxrpc_release_calls_on_socket+0x21c/0x300 net/rxrpc/call_object.c:579 rxrpc_release_sock net/rxrpc/af_rxrpc.c:885 [inline] rxrpc_release+0x263/0x5a0 net/rxrpc/af_rxrpc.c:916 __sock_release+0xcd/0x280 net/socket.c:597 ... The buggy address belongs to the object at ffff888011606dc0 which belongs to the cache skbuff_head_cache of size 232 Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: syzbot+174de899852504e4a74a@syzkaller.appspotmail.com Reported-by: syzbot+3d1c772efafd3c38d007@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Hillf Danton <hdanton@sina.com> Link: https://lore.kernel.org/r/161234207610.653119.5287360098400436976.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-03net/qrtr: restrict user-controlled length in qrtr_tun_write_iter()Sabyrzhan Tasbolatov
syzbot found WARNING in qrtr_tun_write_iter [1] when write_iter length exceeds KMALLOC_MAX_SIZE causing order >= MAX_ORDER condition. Additionally, there is no check for 0 length write. [1] WARNING: mm/page_alloc.c:5011 [..] Call Trace: alloc_pages_current+0x18c/0x2a0 mm/mempolicy.c:2267 alloc_pages include/linux/gfp.h:547 [inline] kmalloc_order+0x2e/0xb0 mm/slab_common.c:837 kmalloc_order_trace+0x14/0x120 mm/slab_common.c:853 kmalloc include/linux/slab.h:557 [inline] kzalloc include/linux/slab.h:682 [inline] qrtr_tun_write_iter+0x8a/0x180 net/qrtr/tun.c:83 call_write_iter include/linux/fs.h:1901 [inline] Reported-by: syzbot+c2a7e5c5211605a90865@syzkaller.appspotmail.com Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Link: https://lore.kernel.org/r/20210202092059.1361381-1-snovitoll@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
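The class of check this commit adds can be sketched in userspace as follows. It is an illustration only: DEMO_MAX_WRITE_LEN is a hypothetical stand-in for KMALLOC_MAX_SIZE, demo_copy_write() is not the qrtr function, and the choice of error codes is an assumption rather than a statement about the actual patch. The point is that a caller-supplied length is rejected before it reaches the allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound standing in for KMALLOC_MAX_SIZE. */
#define DEMO_MAX_WRITE_LEN (4u * 1024u * 1024u)

/* Copy `len` bytes from `src` into a freshly allocated buffer.
 * Zero-length and oversized requests are refused up front, so the
 * allocator never sees a length the caller fully controls.
 * Returns 0 on success or a negative errno value (assumed codes). */
static int demo_copy_write(const void *src, size_t len, void **out)
{
	void *buf;

	assert(out != NULL);
	*out = NULL;

	if (len == 0)
		return -EINVAL;
	if (len > DEMO_MAX_WRITE_LEN)
		return -ENOMEM;

	buf = malloc(len);
	if (!buf)
		return -ENOMEM;

	memcpy(buf, src, len);
	*out = buf;
	return 0;
}
```

The caller owns the returned buffer and releases it with free().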
2021-02-04netfilter: flowtable: fix tcp and udp header checksum updateSven Auhagen
When updating the tcp or udp header checksum on port nat, the function inet_proto_csum_replace2 is called with the last parameter pseudohdr as true. This leads to an error in the case that GRO is used and packets are split up in GSO. The tcp or udp checksum of all packets is incorrect. The error is probably masked by the fact that most network drivers implement tcp/udp checksum offloading. It also only happens when GRO is applied and not on single packets. The error is most visible when using a pppoe connection which is not triggering the tcp/udp checksum offload. Fixes: ac2a66665e23 ("netfilter: add generic flow table infrastructure") Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-02-04netfilter: nftables: fix possible UAF over chains from packet path in netnsPablo Neira Ayuso
Although hooks are released via call_rcu(), chain and rule objects are immediately released while packets are still walking over these bits. This patch adds the .pre_exit callback which is invoked before synchronize_rcu() in the netns framework to stay safe. Remove a comment which is not valid anymore since the core does not use synchronize_net() anymore since 8c873e219970 ("netfilter: core: free hooks with call_rcu"). Suggested-by: Florian Westphal <fw@strlen.de> Fixes: df05ef874b28 ("netfilter: nf_tables: release objects on netns destruction") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-02-04netfilter: xt_recent: Fix attempt to update deleted entryJozsef Kadlecsik
When both --reap and --update flag are specified, there's a code path at which the entry to be updated is reaped beforehand, which then leads to kernel crash. Reap only entries which won't be updated. Fixes kernel bugzilla #207773. Link: https://bugzilla.kernel.org/show_bug.cgi?id=207773 Reported-by: Reindl Harald <h.reindl@thelounge.net> Fixes: 0079c5aee348 ("netfilter: xt_recent: add an entry reaper") Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-02-02Merge tag 'net-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.11-rc7, including fixes from bpf and mac80211 trees. Current release - regressions: - ip_tunnel: fix mtu calculation - mlx5: fix function calculation for page trees Previous releases - regressions: - vsock: fix the race conditions in multi-transport support - neighbour: prevent a dead entry from updating gc_list - dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add Previous releases - always broken: - bpf, cgroup: two copy_{from,to}_user() warn_on_once splats for BPF cgroup getsockopt infra when user space is trying to race against optlen, from Loris Reiff. - bpf: add missing fput() in BPF inode storage map update helper - udp: ipv4: manipulate network header of NATed UDP GRO fraglist - mac80211: fix station rate table updates on assoc - r8169: work around RTL8125 UDP HW bug - igc: report speed and duplex as unknown when device is runtime suspended - rxrpc: fix deadlock around release of dst cached on udp tunnel" * tag 'net-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits) net: hsr: align sup_multicast_addr in struct hsr_priv to u16 boundary net: ipa: fix two format specifier errors net: ipa: use the right accessor in ipa_endpoint_status_skip() net: ipa: be explicit about endianness net: ipa: add a missing __iomem attribute net: ipa: pass correct dma_handle to dma_free_coherent() r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set net/rds: restrict iovecs length for RDS_CMSG_RDMA_ARGS net: mvpp2: TCAM entry enable should be written after SRAM data net: lapb: Copy the skb before sending a packet net/mlx5e: Release skb in case of failure in tc update skb net/mlx5e: Update max_opened_tc also when channels are closed net/mlx5: Fix leak upon failure of rule creation net/mlx5: Fix function calculation for page trees docs: networking: swap words in icmp_errors_use_inbound_ifaddr doc udp: ipv4: manipulate network header of NATed UDP GRO fraglist net: ip_tunnel: fix mtu calculation vsock: fix the race conditions in multi-transport support net: sched: replaced invalid qdisc tree flush helper in qdisc_replace ibmvnic: device remove has higher precedence over reset ...
2021-02-02net: hsr: align sup_multicast_addr in struct hsr_priv to u16 boundaryAndreas Oetken
sup_multicast_addr is passed to ether_addr_equal for address comparison which casts the address inputs to u16 leading to an unaligned access. Aligning the sup_multicast_addr to u16 boundary fixes the issue. Signed-off-by: Andreas Oetken <andreas.oetken@siemens.com> Link: https://lore.kernel.org/r/20210202090304.2740471-1-ennoerlangen@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
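The layout issue can be shown with standard C11 alignment specifiers. This is a userspace sketch with hypothetical struct and field names, not the real struct hsr_priv, and it uses alignas where the kernel would use its own alignment attribute. A 6-byte address that follows a single-byte field sits at an odd offset unless its alignment is raised to that of a 16-bit word.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the address follows one byte, so it lands at
 * offset 1 and a 16-bit load from it would be misaligned. */
struct demo_priv_unaligned {
	uint8_t flag;
	uint8_t addr[6];
};

/* Same fields, but the address is forced onto a 16-bit boundary. */
struct demo_priv_aligned {
	uint8_t flag;
	alignas(uint16_t) uint8_t addr[6];
};

static_assert(offsetof(struct demo_priv_aligned, addr) % alignof(uint16_t) == 0,
	      "addr must start on a 16-bit boundary");

/* Report whether a pointer is suitably aligned for 16-bit access. */
static int demo_is_u16_aligned(const void *p)
{
	return ((uintptr_t)p % alignof(uint16_t)) == 0;
}
```

The alignment specifier also raises the alignment of the enclosing struct, so any instance of the aligned variant has a correctly aligned address field.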