summaryrefslogtreecommitdiff
path: root/net/bridge
AgeCommit message (Collapse)Author
2020-04-28netfilter: ebtables: compat: reject all padding in matches/watchersFlorian Westphal
commit e608f631f0ba5f1fc5ee2e260a3a35d13107cbfe upstream. syzbot reported following splat: BUG: KASAN: vmalloc-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline] BUG: KASAN: vmalloc-out-of-bounds in compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155 Read of size 4 at addr ffffc900004461f4 by task syz-executor267/7937 CPU: 1 PID: 7937 Comm: syz-executor267 Not tainted 5.5.0-rc1-syzkaller #0 size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline] compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155 compat_do_replace+0x344/0x720 net/bridge/netfilter/ebtables.c:2249 compat_do_ebt_set_ctl+0x22f/0x27e net/bridge/netfilter/ebtables.c:2333 [..] Because padding isn't considered during computation of ->buf_user_offset, "total" is decremented by fewer bytes than it should. Therefore, the first part of if (*total < sizeof(*entry) || entry->next_offset < sizeof(*entry)) will pass, -- it should not have. This causes oob access: entry->next_offset is past the vmalloced size. Reject padding and check that computed user offset (sum of ebt_entry structure plus all individual matches/watchers/targets) is same value that userspace gave us as the offset of the next entry. Reported-by: syzbot+f68108fed972453a0ad4@syzkaller.appspotmail.com Fixes: 81e675c227ec ("netfilter: ebtables: add CONFIG_COMPAT support") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2020-04-28netfilter: ebtables: convert BUG_ONs to WARN_ONsFlorian Westphal
commit fc6a5d0601c5ac1d02f283a46f60b87b2033e5ca upstream. All of these conditions are not fatal and should have been WARN_ONs from the get-go. Convert them to WARN_ONs and bail out. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2020-04-28netfilter: bridge: make sure to pull arp header in br_nf_forward_arp()Eric Dumazet
commit 5604285839aaedfb23ebe297799c6e558939334d upstream. syzbot is kind enough to remind us we need to call skb_may_pull() BUG: KMSAN: uninit-value in br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665 CPU: 1 PID: 11631 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108 __msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245 br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665 nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline] nf_hook_slow+0x18b/0x3f0 net/netfilter/core.c:512 nf_hook include/linux/netfilter.h:260 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] __br_forward+0x78f/0xe30 net/bridge/br_forward.c:109 br_flood+0xef0/0xfe0 net/bridge/br_forward.c:234 br_handle_frame_finish+0x1a77/0x1c20 net/bridge/br_input.c:162 nf_hook_bridge_pre net/bridge/br_input.c:245 [inline] br_handle_frame+0xfb6/0x1eb0 net/bridge/br_input.c:348 __netif_receive_skb_core+0x20b9/0x51a0 net/core/dev.c:4830 __netif_receive_skb_one_core net/core/dev.c:4927 [inline] __netif_receive_skb net/core/dev.c:5043 [inline] process_backlog+0x610/0x13c0 net/core/dev.c:5874 napi_poll net/core/dev.c:6311 [inline] net_rx_action+0x7a6/0x1aa0 net/core/dev.c:6379 __do_softirq+0x4a1/0x83a kernel/softirq.c:293 do_softirq_own_stack+0x49/0x80 arch/x86/entry/entry_64.S:1091 </IRQ> do_softirq kernel/softirq.c:338 [inline] __local_bh_enable_ip+0x184/0x1d0 kernel/softirq.c:190 local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32 rcu_read_unlock_bh include/linux/rcupdate.h:688 [inline] __dev_queue_xmit+0x38e8/0x4200 net/core/dev.c:3819 dev_queue_xmit+0x4b/0x60 net/core/dev.c:3825 packet_snd net/packet/af_packet.c:2959 [inline] packet_sendmsg+0x8234/0x9100 net/packet/af_packet.c:2984 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] __sys_sendto+0xc44/0xc70 net/socket.c:1952 __do_sys_sendto net/socket.c:1964 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:1960 __x64_sys_sendto+0x6e/0x90 net/socket.c:1960 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45a679 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f0a3c9e5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 000000000045a679 RDX: 000000000000000e RSI: 0000000020000200 RDI: 0000000000000003 RBP: 000000000075bf20 R08: 00000000200000c0 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0a3c9e66d4 R13: 00000000004c8ec1 R14: 00000000004dfe28 R15: 00000000ffffffff Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline] kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132 kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86 slab_alloc_node mm/slub.c:2773 [inline] __kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381 __kmalloc_reserve net/core/skbuff.c:141 [inline] __alloc_skb+0x306/0xa10 net/core/skbuff.c:209 alloc_skb include/linux/skbuff.h:1049 [inline] alloc_skb_with_frags+0x18c/0xa80 net/core/skbuff.c:5662 sock_alloc_send_pskb+0xafd/0x10a0 net/core/sock.c:2244 packet_alloc_skb net/packet/af_packet.c:2807 [inline] packet_snd net/packet/af_packet.c:2902 [inline] packet_sendmsg+0x63a6/0x9100 net/packet/af_packet.c:2984 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] __sys_sendto+0xc44/0xc70 net/socket.c:1952 __do_sys_sendto net/socket.c:1964 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:1960 __x64_sys_sendto+0x6e/0x90 net/socket.c:1960 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c4e70a87d975 ("netfilter: bridge: rename br_netfilter.c to br_netfilter_hooks.c") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2020-02-11net: bridge: deny dev_set_mac_address() when unregisteringNikolay Aleksandrov
commit c4b4c421857dc7b1cf0dccbd738472360ff2cd70 upstream. We have an interesting memory leak in the bridge when it is being unregistered and is a slave to a master device which would change the mac of its slaves on unregister (e.g. bond, team). This is a very unusual setup but we do end up leaking 1 fdb entry because dev_set_mac_address() would cause the bridge to insert the new mac address into its table after all fdbs are flushed, i.e. after dellink() on the bridge has finished and we call NETDEV_UNREGISTER the bond/team would release it and will call dev_set_mac_address() to restore its original address and that in turn will add an fdb in the bridge. One fix is to check for the bridge dev's reg_state in its ndo_set_mac_address callback and return an error if the bridge is not in NETREG_REGISTERED. Easy steps to reproduce: 1. add bond in mode != A/B 2. add any slave to the bond 3. add bridge dev as a slave to the bond 4. destroy the bridge device Trace: unreferenced object 0xffff888035c4d080 (size 128): comm "ip", pid 4068, jiffies 4296209429 (age 1413.753s) hex dump (first 32 bytes): 41 1d c9 36 80 88 ff ff 00 00 00 00 00 00 00 00 A..6............ d2 19 c9 5e 3f d7 00 00 00 00 00 00 00 00 00 00 ...^?........... backtrace: [<00000000ddb525dc>] kmem_cache_alloc+0x155/0x26f [<00000000633ff1e0>] fdb_create+0x21/0x486 [bridge] [<0000000092b17e9c>] fdb_insert+0x91/0xdc [bridge] [<00000000f2a0f0ff>] br_fdb_change_mac_address+0xb3/0x175 [bridge] [<000000001de02dbd>] br_stp_change_bridge_id+0xf/0xff [bridge] [<00000000ac0e32b1>] br_set_mac_address+0x76/0x99 [bridge] [<000000006846a77f>] dev_set_mac_address+0x63/0x9b [<00000000d30738fc>] __bond_release_one+0x3f6/0x455 [bonding] [<00000000fc7ec01d>] bond_netdev_event+0x2f2/0x400 [bonding] [<00000000305d7795>] notifier_call_chain+0x38/0x56 [<0000000028885d4a>] call_netdevice_notifiers+0x1e/0x23 [<000000008279477b>] rollback_registered_many+0x353/0x6a4 [<0000000018ef753a>] unregister_netdevice_many+0x17/0x6f [<00000000ba854b7a>] rtnl_delete_link+0x3c/0x43 [<00000000adf8618d>] rtnl_dellink+0x1dc/0x20a [<000000009b6395fd>] rtnetlink_rcv_msg+0x23d/0x268 Fixes: 43598813386f ("bridge: add local MAC address to forwarding table (v2)") Reported-by: syzbot+2add91c08eb181fea1bf@syzkaller.appspotmail.com Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-11-22net: bridge: mcast: don't delete permanent entries when fast leave is enabledNikolay Aleksandrov
commit 5c725b6b65067909548ac9ca9bc777098ec9883d upstream. When permanent entries were introduced by the commit below, they were exempt from timing out and thus igmp leave wouldn't affect them unless fast leave was enabled on the port which was added before permanent entries existed. It shouldn't matter if fast leave is enabled or not if the user added a permanent entry it shouldn't be deleted on igmp leave. Before: $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent $ bridge mdb show dev br0 port eth4 grp 229.1.1.1 permanent < join and leave 229.1.1.1 on eth4 > $ bridge mdb show $ After: $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent $ bridge mdb show dev br0 port eth4 grp 229.1.1.1 permanent < join and leave 229.1.1.1 on eth4 > $ bridge mdb show dev br0 port eth4 grp 229.1.1.1 permanent Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.16: Check PERMANENT flag in net_bridge_port_group::state, not net_bridge_port_group::flags.] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-10-31net: bridge: stp: don't cache eth dest pointer before skb pullNikolay Aleksandrov
commit 2446a68ae6a8cee6d480e2f5b52f5007c7c41312 upstream. Don't cache eth dest pointer before calling pskb_may_pull. Fixes: cf0f02d04a83 ("[BRIDGE]: use llc for receiving STP packets") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-09-23netfilter: ebtables: CONFIG_COMPAT: reject trailing data after last ruleFlorian Westphal
commit 680f6af5337c98d116e4f127cea7845339dba8da upstream. If userspace provides a rule blob with trailing data after last target, we trigger a splat, then convert ruleset to 64bit format (with trailing data), then pass that to do_replace_finish() which then returns -EINVAL. Erroring out right away avoids the splat plus unneeded translation and error unwind. Fixes: 81e675c227ec ("netfilter: ebtables: add CONFIG_COMPAT support") Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-08-13netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ONFlorian Westphal
commit 7caa56f006e9d712b44f27b32520c66420d5cbc6 upstream. It means userspace gave us a ruleset where there is some other data after the ebtables target but before the beginning of the next rule. Fixes: 81e675c227ec ("netfilter: ebtables: add CONFIG_COMPAT support") Reported-by: syzbot+659574e7bcc7f7eb4df7@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-08-13net: bridge: multicast: use rcu to access port list from ↵Nikolay Aleksandrov
br_multicast_start_querier commit c5b493ce192bd7a4e7bd073b5685aad121eeef82 upstream. br_multicast_start_querier() walks over the port list but it can be called from a timer with only multicast_lock held which doesn't protect the port list, so use RCU to walk over it. Fixes: c83b8fab06fc ("bridge: Restart queries when last querier expires") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-08-13netfilter: bridge: set skb transport_header before entering NF_INET_PRE_ROUTINGXin Long
commit e166e4fdaced850bee3d5ee12a5740258fb30587 upstream. Since Commit 21d1196a35f5 ("ipv4: set transport header earlier"), skb->transport_header has been always set before entering INET netfilter. This patch is to set skb->transport_header for bridge before entering INET netfilter by bridge-nf-call-iptables. It also fixes an issue that sctp_error() couldn't compute a right csum due to unset skb->transport_header. Fixes: e6d8b64b34aa ("net: sctp: fix and consolidate SCTP checksumming code") Reported-by: Li Shuang <shuali@redhat.com> Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [bwh: Backported to 3.16: adjust filenames, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-05-02net: bridge: Fix ethernet header pointer before check skb forwardableYunjian Wang
commit 28c1382fa28f2e2d9d0d6f25ae879b5af2ecbd03 upstream. The skb header should be set to ethernet header before using is_skb_forwardable. Because the ethernet header length has been considered in is_skb_forwardable(including dev->hard_header_len length). To reproduce the issue: 1, add 2 ports on linux bridge br using following commands: $ brctl addbr br $ brctl addif br eth0 $ brctl addif br eth1 2, the MTU of eth0 and eth1 is 1500 3, send a packet(Data 1480, UDP 8, IP 20, Ethernet 14, VLAN 4) from eth0 to eth1 So the expect result is packet larger than 1500 cannot pass through eth0 and eth1. But currently, the packet passes through success, it means eth1's MTU limit doesn't take effect. Fixes: f6367b4660dd ("bridge: use is_skb_forwardable in forward path") Cc: bridge@lists.linux-foundation.org Cc: Nkolay Aleksandrov <nikolay@cumulusnetworks.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.16: - br_dev_queue_push_xmit() has to call nf_bridge_maybe_copy_header() before pushing the Ethernet header - adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2018-10-21netfilter: ebtables: handle string from userspace with carePaolo Abeni
commit 94c752f99954797da583a84c4907ff19e92550a4 upstream. strlcpy() can't be safely used on a user-space provided string, as it can try to read beyond the buffer's end, if the latter is not NULL terminated. Leveraging the above, syzbot has been able to trigger the following splat: BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline] BUG: KASAN: stack-out-of-bounds in compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline] BUG: KASAN: stack-out-of-bounds in ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline] BUG: KASAN: stack-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline] BUG: KASAN: stack-out-of-bounds in compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194 Write of size 33 at addr ffff8801b0abf888 by task syz-executor0/4504 CPU: 0 PID: 4504 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #40 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 memcpy+0x37/0x50 mm/kasan/kasan.c:303 strlcpy include/linux/string.h:300 [inline] compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline] ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline] size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline] compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194 compat_do_replace+0x483/0x900 net/bridge/netfilter/ebtables.c:2285 compat_do_ebt_set_ctl+0x2ac/0x324 net/bridge/netfilter/ebtables.c:2367 compat_nf_sockopt net/netfilter/nf_sockopt.c:144 [inline] compat_nf_setsockopt+0x9b/0x140 net/netfilter/nf_sockopt.c:156 compat_ip_setsockopt+0xff/0x140 net/ipv4/ip_sockglue.c:1279 inet_csk_compat_setsockopt+0x97/0x120 net/ipv4/inet_connection_sock.c:1041 compat_tcp_setsockopt+0x49/0x80 net/ipv4/tcp.c:2901 compat_sock_common_setsockopt+0xb4/0x150 net/core/sock.c:3050 __compat_sys_setsockopt+0x1ab/0x7c0 net/compat.c:403 __do_compat_sys_setsockopt net/compat.c:416 [inline] __se_compat_sys_setsockopt net/compat.c:413 [inline] __ia32_compat_sys_setsockopt+0xbd/0x150 net/compat.c:413 do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline] do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7fb3cb9 RSP: 002b:00000000fff0c26c EFLAGS: 00000282 ORIG_RAX: 000000000000016e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000000 RDX: 0000000000000080 RSI: 0000000020000300 RDI: 00000000000005f4 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 The buggy address belongs to the page: page:ffffea0006c2afc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x2fffc0000000000() raw: 02fffc0000000000 0000000000000000 0000000000000000 00000000ffffffff raw: 0000000000000000 ffffea0006c20101 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Fix the issue replacing the unsafe function with strscpy() and taking care of possible errors. Fixes: 81e675c227ec ("netfilter: ebtables: add CONFIG_COMPAT support") Reported-and-tested-by: syzbot+4e42a04e0bc33cb6c087@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2018-06-16netfilter: bridge: ebt_among: add more missing match size checksFlorian Westphal
commit c8d70a700a5b486bfa8e5a7d33d805389f6e59f9 upstream. ebt_among is special, it has a dynamic match size and is exempt from the central size checks. commit c4585a2823edf ("bridge: ebt_among: add missing match size checks") added validation for pool size, but missed fact that the macros ebt_among_wh_src/dst can already return out-of-bound result because they do not check value of wh_src/dst_ofs (an offset) vs. the size of the match that userspace gave to us. v2: check that offset has correct alignment. Paolo Abeni points out that we should also check that src/dst wormhash arrays do not overlap, and src + length lines up with start of dst (or vice versa). v3: compact wormhash_sizes_valid() part NB: Fixes tag is intentionally wrong, this bug exists from day one when match was added for 2.6 kernel. Tag is there so stable maintainers will notice this one too. Tested with same rules from the earlier patch. Fixes: c4585a2823edf ("bridge: ebt_among: add missing match size checks") Reported-by: <syzbot+bdabab6f1983a03fc009@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2018-06-16netfilter: bridge: ebt_among: add missing match size checksFlorian Westphal
commit c4585a2823edf4d1326da44d1524ecbfda26bb37 upstream. ebt_among is special, it has a dynamic match size and is exempt from the central size checks. Therefore it must check that the size of the match structure provided from userspace is sane by making sure em->match_size is at least the minimum size of the expected structure. The module has such a check, but its only done after accessing a structure that might be out of bounds. tested with: ebtables -A INPUT ... \ --among-dst fe:fe:fe:fe:fe:fe --among-dst fe:fe:fe:fe:fe:fe --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fb,fe:fe:fe:fe:fc:fd,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fa,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe,fe:fe:fe:fe:fe:fe Reported-by: <syzbot+fe0b19af568972814355@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2018-06-16bridge: check brport attr show in brport_showXin Long
commit 1b12580af1d0677c3c3a19e35bfe5d59b03f737f upstream. Now br_sysfs_if file flush doesn't have attr show. To read it will cause kernel panic after users chmod u+r this file. Xiong found this issue when running the commands: ip link add br0 type bridge ip link add type veth ip link set veth0 master br0 chmod u+r /sys/devices/virtual/net/veth0/brport/flush timeout 3 cat /sys/devices/virtual/net/veth0/brport/flush kernel crashed with NULL a pointer dereference call trace. This patch is to fix it by return -EINVAL when brport_attr->show is null, just the same as the check for brport_attr->store in brport_store(). Fixes: 9cf637473c85 ("bridge: add sysfs hook to flush forwarding table") Reported-by: Xiong Zhou <xzhou@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2018-06-16netfilter: ebtables: fix erroneous reject of last ruleFlorian Westphal
commit 932909d9b28d27e807ff8eecb68c7748f6701628 upstream. The last rule in the blob has next_entry offset that is same as total size. This made "ebtables32 -A OUTPUT -d de:ad:be:ef:01:02" fail on 64 bit kernel. Fixes: b71812168571fa ("netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2018-06-16netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsetsFlorian Westphal
commit b71812168571fa55e44cdd0254471331b9c4c4c6 upstream. We need to make sure the offsets are not out of range of the total size. Also check that they are in ascending order. The WARN_ON triggered by syzkaller (it sets panic_on_warn) is changed to also bail out, no point in continuing parsing. Briefly tested with simple ruleset of -A INPUT --limit 1/s' --log plus jump to custom chains using 32bit ebtables binary. Reported-by: <syzbot+845a53d13171abf8bf29@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2018-03-03net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaksNikolay Aleksandrov
commit 84aeb437ab98a2bce3d4b2111c79723aedfceb33 upstream. The early call to br_stp_change_bridge_id in bridge's newlink can cause a memory leak if an error occurs during the newlink because the fdb entries are not cleaned up if a different lladdr was specified, also another minor issue is that it generates fdb notifications with ifindex = 0. Another unrelated memory leak is the bridge sysfs entries which get added on NETDEV_REGISTER event, but are not cleaned up in the newlink error path. To remove this special case the call to br_stp_change_bridge_id is done after netdev register and we cleanup the bridge on changelink error via br_dev_delete to plug all leaks. This patch makes netlink bridge destruction on newlink error the same as dellink and ioctl del which is necessary since at that point we have a fully initialized bridge device. To reproduce the issue: $ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1 RTNETLINK answers: Invalid argument $ rmmod bridge [ 1822.142525] ============================================================================= [ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown() [ 1822.144821] ----------------------------------------------------------------------------- [ 1822.145990] Disabling lock debugging due to kernel taint [ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100 [ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87 [ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 1822.150008] Call Trace: [ 1822.150510] dump_stack+0x78/0xa9 [ 1822.151156] slab_err+0xb1/0xd3 [ 1822.151834] ? __kmalloc+0x1bb/0x1ce [ 1822.152546] __kmem_cache_shutdown+0x151/0x28b [ 1822.153395] shutdown_cache+0x13/0x144 [ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb [ 1822.154669] SyS_delete_module+0x194/0x244 [ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a [ 1822.156343] RIP: 0033:0x7f929bd38b17 [ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0 [ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17 [ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0 [ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11 [ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80 [ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090 [ 1822.161278] INFO: Object 0x000000007645de29 @offset=0 [ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128 Fixes: 30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device") Fixes: 5b8d5429daa0 ("bridge: netlink: register netdevice before executing changelink") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.16: register_netdevice() was the last thing done in br_dev_newlink(), so no extra cleanup is needed] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-11-11dst: Increase alignment of metrics to allow extra flag on pointersBen Hutchings
For the backport of "ipv4: add reference counting to metrics", we will need a third flag on metrics pointers. This was not needed upstream as the DST_METRICS_FORCE_OVERWRITE flag has been eliminated there. In order to use three flag bits we need to increase the alignment of metrics from 4 to 8 bytes. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-11-11net: bridge: fix dest lookup when vlan proto doesn't matchNikolay Aleksandrov
commit 31a4562d7408493c6377933ff2f7d7302dbdea80 upstream. With 802.1ad support the vlan_ingress code started checking for vlan protocol mismatch which causes the current tag to be inserted and the bridge vlan protocol & pvid to be set. The vlan tag insertion changes the skb mac_header and thus the lookup mac dest pointer which was loaded prior to calling br_allowed_ingress in br_handle_frame_finish is VLAN_HLEN bytes off now, pointing to the last two bytes of the destination mac and the first four of the source mac causing lookups to always fail and broadcasting all such packets to all ports. Same thing happens for locally originated packets when passing via br_dev_xmit. So load the dest pointer after the vlan checks and possible skb change. Fixes: 8580e2117c06 ("bridge: Prepare for 802.1ad vlan filtering support") Reported-by: Anitha Narasimha Murthy <anitha@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-03-16net: bridge: fix old ioctl unlocked net device walkNikolay Aleksandrov
[ Upstream commit 31ca0458a61a502adb7ed192bf9716c6d05791a5 ] get_bridge_ifindices() is used from the old "deviceless" bridge ioctl calls which aren't called with rtnl held. The comment above says that it is called with rtnl but that is not really the case. Here's a sample output from a test ASSERT_RTNL() which I put in get_bridge_ifindices and executed "brctl show": [ 957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30) [ 957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G W O 4.6.0-rc4+ #157 [ 957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [ 957.423009] 0000000000000000 ffff880058adfdf0 ffffffff8138dec5 0000000000000400 [ 957.423009] ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32 0000000000000001 [ 957.423009] 00007ffec1a444b0 0000000000000400 ffff880053c19130 0000000000008940 [ 957.423009] Call Trace: [ 957.423009] [<ffffffff8138dec5>] dump_stack+0x85/0xc0 [ 957.423009] [<ffffffffa05ead32>] br_ioctl_deviceless_stub+0x212/0x2e0 [bridge] [ 957.423009] [<ffffffff81515beb>] sock_ioctl+0x22b/0x290 [ 957.423009] [<ffffffff8126ba75>] do_vfs_ioctl+0x95/0x700 [ 957.423009] [<ffffffff8126c159>] SyS_ioctl+0x79/0x90 [ 957.423009] [<ffffffff8163a4c0>] entry_SYSCALL_64_fastpath+0x23/0xc1 Since it only reads bridge ifindices, we can use rcu to safely walk the net device list. Also remove the wrong rtnl comment above. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-02-23bridge: multicast: restore perm router ports on multicast enableNikolay Aleksandrov
commit 7cb3f9214dfa443c1ccc2be637dcc6344cc203f0 upstream. Satish reported a problem with the perm multicast router ports not getting reenabled after some series of events, in particular if it happens that the multicast snooping has been disabled and the port goes to disabled state then it will be deleted from the router port list, but if it moves into non-disabled state it will not be re-added because the mcast snooping is still disabled, and enabling snooping later does nothing. Here are the steps to reproduce, setup br0 with snooping enabled and eth1 added as a perm router (multicast_router = 2): 1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping 2. $ ip l set eth1 down ^ This step deletes the interface from the router list 3. $ ip l set eth1 up ^ This step does not add it again because mcast snooping is disabled 4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping 5. $ bridge -d -s mdb show <empty> At this point we have mcast enabled and eth1 as a perm router (value = 2) but it is not in the router list which is incorrect. After this change: 1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping 2. $ ip l set eth1 down ^ This step deletes the interface from the router list 3. $ ip l set eth1 up ^ This step does not add it again because mcast snooping is disabled 4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping 5. $ bridge -d -s mdb show router ports on br0: eth1 Note: we can directly do br_multicast_enable_port for all because the querier timer already has checks for the port state and will simply expire if it's in blocking/disabled. See the comment added by commit 9aa66382163e7 ("bridge: multicast: add a comment to br_port_state_selection about blocking state") Fixes: 561f1103a2b7 ("bridge: Add multicast_snooping sysfs toggle") Reported-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-08-22Revert "netfilter: ensure number of counters is >0 in do_replace()"Bernhard Thaler
commit d26e2c9ffa385dd1b646f43c1397ba12af9ed431 upstream. This partially reverts commit 1086bbe97a07 ("netfilter: ensure number of counters is >0 in do_replace()") in net/bridge/netfilter/ebtables.c. Setting rules with ebtables does not work any more with 1086bbe97a07 place. There is an error message and no rules set in the end. e.g. ~# ebtables -t nat -A POSTROUTING --src 12:34:56:78:9a:bc -j DROP Unable to update the kernel. Two possible causes: 1. Multiple ebtables programs were executing simultaneously. The ebtables userspace tool doesn't by default support multiple ebtables programs running Reverting the ebtables part of 1086bbe97a07 makes this work again. Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-08-22netfilter: ensure number of counters is >0 in do_replace()Dave Jones
commit 1086bbe97a074844188c6c988fa0b1a98c3ccbb9 upstream. After improving setsockopt() coverage in trinity, I started triggering vmalloc failures pretty reliably from this code path: warn_alloc_failed+0xe9/0x140 __vmalloc_node_range+0x1be/0x270 vzalloc+0x4b/0x50 __do_replace+0x52/0x260 [ip_tables] do_ipt_set_ctl+0x15d/0x1d0 [ip_tables] nf_setsockopt+0x65/0x90 ip_setsockopt+0x61/0xa0 raw_setsockopt+0x16/0x60 sock_common_setsockopt+0x14/0x20 SyS_setsockopt+0x71/0xd0 It turns out we don't validate that the num_counters field in the struct we pass in from userspace is initialized. The same problem also exists in ebtables, arptables, ipv6, and the compat variants. Signed-off-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-08-22Bridge: Fix ipv6 mc snooping if bridge has no ipv6 addressdaniel
commit 0888d5f3c0f183ea6177355752ada433d370ac89 upstream. The bridge is falsly dropping ipv6 mulitcast packets if there is: 1. No ipv6 address assigned on the brigde. 2. No external mld querier present. 3. The internal querier enabled. When the bridge fails to build mld queries, because it has no ipv6 address, it slilently returns, but keeps the local querier enabled. This specific case causes confusing packet loss. Ipv6 multicast snooping can only work if: a) An external querier is present OR b) The bridge has an ipv6 address an is capable of sending own queries Otherwise it has to forward/flood the ipv6 multicast traffic, because snooping cannot work. This patch fixes the issue by adding a flag to the bridge struct that indicates that there is currently no ipv6 address assinged to the bridge and returns a false state for the local querier in __br_multicast_querier_exists(). Special thanks to Linus Lüssing. Fixes: d1d81d4c3dd8 ("bridge: check return value of ipv6_dev_get_saddr()") Signed-off-by: Daniel Danzberger <daniel@dd-wrt.com> Acked-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-02-02bridge: fix lockdep addr_list_lock false positive splatNikolay Aleksandrov
commit c6894dec8ea9ae05747124dce98b3b5c2e69b168 upstream. After promisc mode management was introduced a bridge device could do dev_set_promiscuity from its ndo_change_rx_flags() callback which in turn can be called after the bridge's addr_list_lock has been taken (e.g. by dev_uc_add). This causes a false positive lockdep splat because the port interfaces' addr_list_lock is taken when br_manage_promisc() runs after the bridge's addr list lock was already taken. To remove the false positive introduce a custom bridge addr_list_lock class and set it on bridge init. A simple way to reproduce this is with the following: $ brctl addbr br0 $ ip l add l br0 br0.100 type vlan id 100 $ ip l set br0 up $ ip l set br0.100 up $ echo 1 > /sys/class/net/br0/bridge/vlan_filtering $ brctl addif br0 eth0 Splat: [ 43.684325] ============================================= [ 43.684485] [ INFO: possible recursive locking detected ] [ 43.684636] 4.4.0-rc8+ #54 Not tainted [ 43.684755] --------------------------------------------- [ 43.684906] brctl/1187 is trying to acquire lock: [ 43.685047] (_xmit_ETHER){+.....}, at: [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40 [ 43.685460] but task is already holding lock: [ 43.685618] (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80 [ 43.686015] other info that might help us debug this: [ 43.686316] Possible unsafe locking scenario: [ 43.686743] CPU0 [ 43.686967] ---- [ 43.687197] lock(_xmit_ETHER); [ 43.687544] lock(_xmit_ETHER); [ 43.687886] *** DEADLOCK *** [ 43.688438] May be due to missing lock nesting notation [ 43.688882] 2 locks held by brctl/1187: [ 43.689134] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81510317>] rtnl_lock+0x17/0x20 [ 43.689852] #1: (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80 [ 43.690575] stack backtrace: [ 43.690970] CPU: 0 PID: 1187 Comm: brctl Not tainted 4.4.0-rc8+ #54 [ 43.691270] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [ 43.691770] ffffffff826a25c0 ffff8800369fb8e0 ffffffff81360ceb ffffffff826a25c0 [ 43.692425] ffff8800369fb9b8 ffffffff810d0466 ffff8800369fb968 ffffffff81537139 [ 43.693071] ffff88003a08c880 0000000000000000 00000000ffffffff 0000000002080020 [ 43.693709] Call Trace: [ 43.693931] [<ffffffff81360ceb>] dump_stack+0x4b/0x70 [ 43.694199] [<ffffffff810d0466>] __lock_acquire+0x1e46/0x1e90 [ 43.694483] [<ffffffff81537139>] ? netlink_broadcast_filtered+0x139/0x3e0 [ 43.694789] [<ffffffff8153b5da>] ? nlmsg_notify+0x5a/0xc0 [ 43.695064] [<ffffffff810d10f5>] lock_acquire+0xe5/0x1f0 [ 43.695340] [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40 [ 43.695623] [<ffffffff815edea5>] _raw_spin_lock_bh+0x45/0x80 [ 43.695901] [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40 [ 43.696180] [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40 [ 43.696460] [<ffffffff8150189c>] dev_set_promiscuity+0x3c/0x50 [ 43.696750] [<ffffffffa0586845>] br_port_set_promisc+0x25/0x50 [bridge] [ 43.697052] [<ffffffffa05869aa>] br_manage_promisc+0x8a/0xe0 [bridge] [ 43.697348] [<ffffffffa05826ee>] br_dev_change_rx_flags+0x1e/0x20 [bridge] [ 43.697655] [<ffffffff81501532>] __dev_set_promiscuity+0x132/0x1f0 [ 43.697943] [<ffffffff81501672>] __dev_set_rx_mode+0x82/0x90 [ 43.698223] [<ffffffff815072de>] dev_uc_add+0x5e/0x80 [ 43.698498] [<ffffffffa05b3c62>] vlan_device_event+0x542/0x650 [8021q] [ 43.698798] [<ffffffff8109886d>] notifier_call_chain+0x5d/0x80 [ 43.699083] [<ffffffff810988b6>] raw_notifier_call_chain+0x16/0x20 [ 43.699374] [<ffffffff814f456e>] call_netdevice_notifiers_info+0x6e/0x80 [ 43.699678] [<ffffffff814f4596>] call_netdevice_notifiers+0x16/0x20 [ 43.699973] [<ffffffffa05872be>] br_add_if+0x47e/0x4c0 [bridge] [ 43.700259] [<ffffffffa058801e>] add_del_if+0x6e/0x80 [bridge] [ 43.700548] [<ffffffffa0588b5f>] br_dev_ioctl+0xaf/0xc0 [bridge] [ 43.700836] [<ffffffff8151a7ac>] dev_ifsioc+0x30c/0x3c0 [ 43.701106] [<ffffffff8151aac9>] dev_ioctl+0xf9/0x6f0 [ 43.701379] [<ffffffff81254345>] ? mntput_no_expire+0x5/0x450 [ 43.701665] [<ffffffff812543ee>] ? mntput_no_expire+0xae/0x450 [ 43.701947] [<ffffffff814d7b02>] sock_do_ioctl+0x42/0x50 [ 43.702219] [<ffffffff814d8175>] sock_ioctl+0x1e5/0x290 [ 43.702500] [<ffffffff81242d0b>] do_vfs_ioctl+0x2cb/0x5c0 [ 43.702771] [<ffffffff81243079>] SyS_ioctl+0x79/0x90 [ 43.703033] [<ffffffff815eebb6>] entry_SYSCALL_64_fastpath+0x16/0x7a CC: Vlad Yasevich <vyasevic@redhat.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Bridge list <bridge@lists.linux-foundation.org> CC: Andy Gospodarek <gospo@cumulusnetworks.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 2796d0c648c9 ("bridge: Automatically manage port promiscuous mode.") Reported-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2016-01-28bridge: Only call /sbin/bridge-stp for the initial network namespaceHannes Frederic Sowa
commit ff62198553e43cdffa9d539f6165d3e83f8a42bc upstream. [I stole this patch from Eric Biederman. He wrote:] > There is no defined mechanism to pass network namespace information > into /sbin/bridge-stp therefore don't even try to invoke it except > for bridge devices in the initial network namespace. > > It is possible for unprivileged users to cause /sbin/bridge-stp to be > invoked for any network device name which if /sbin/bridge-stp does not > guard against unreasonable arguments or being invoked twice on the > same network device could cause problems. [Hannes: changed patch using netns_eq] Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-08-25bridge: mdb: fix delmdb state in the notificationNikolay Aleksandrov
commit 7ae90a4f96486e3e20274afa1b8329802f5e1981 upstream. Since mdb states were introduced when deleting an entry the state was left as it was set in the delete request from the user which leads to the following output when doing a monitor (for example): $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 permanent $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 temp ^^^ Note the "temp" state in the delete notification which is wrong since the entry was permanent, the state in a delete is always reported as "temp" regardless of the real state of the entry. After this patch: $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 permanent $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 permanent There's one important note to make here that the state is actually not matched when doing a delete, so one can delete a permanent entry by stating "temp" in the end of the command, I've chosen this fix in order not to break user-space tools which rely on this (incorrect) behaviour. So to give an example after this patch and using the wrong state: $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent (monitor) dev br0 port eth3 grp 239.0.0.1 permanent $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 temp (monitor) dev br0 port eth3 grp 239.0.0.1 permanent Note the state of the entry that got deleted is correct in the notification. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-08-12bridge: mdb: fix double add notificationNikolay Aleksandrov
commit 5ebc784625ea68a9570d1f70557e7932988cd1b4 upstream. Since the mdb add/del code was introduced there have been 2 br_mdb_notify calls when doing br_mdb_add() resulting in 2 notifications on each add. Example: Command: bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent Before patch: root@debian:~# bridge monitor all [MDB]dev br0 port eth1 grp 239.0.0.1 permanent [MDB]dev br0 port eth1 grp 239.0.0.1 permanent After patch: root@debian:~# bridge monitor all [MDB]dev br0 port eth1 grp 239.0.0.1 permanent Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: cfd567543590 ("bridge: add support of adding and deleting mdb entries") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-08-10bridge: mdb: zero out the local br_ip variable before useNikolay Aleksandrov
commit f1158b74e54f2e2462ba5e2f45a118246d9d5b43 upstream. Since commit b0e9a30dd669 ("bridge: Add vlan id to multicast groups") there's a check in br_ip_equal() for a matching vlan id, but the mdb functions were not modified to use (or at least zero it) so when an entry was added it would have a garbage vlan id (from the local br_ip variable in __br_mdb_add/del) and this would prevent it from being matched and also deleted. So zero out the whole local ip var to protect ourselves from future changes and also to fix the current bug, since there's no vlan id support in the mdb uapi - use always vlan id 0. Example before patch: root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent RTNETLINK answers: Invalid argument After patch: root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent root@debian:~# bridge mdb Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Fixes: b0e9a30dd669 ("bridge: Add vlan id to multicast groups") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-08-10bridge: mdb: start delete timer for temp static entriesSatish Ashok
commit f7e2965db17dd3b60f05fad88e7afc79ea75b48f upstream. Start the delete timer when adding temp static entries so they can expire. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-07-15bridge: multicast: restore router configuration on port link down/upSatish Ashok
commit 754bc547f0a79f7568b5b81c7fc0a8d044a6571a upstream. When a port goes through a link down/up the multicast router configuration is not restored. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries") Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-07-06bridge: fix br_stp_set_bridge_priority race conditionsNikolay Aleksandrov
commit 2dab80a8b486f02222a69daca6859519e05781d9 upstream. After the ->set() spinlocks were removed br_stp_set_bridge_priority was left running without any protection when used via sysfs. It can race with port add/del and could result in use-after-free cases and corrupted lists. Tested by running port add/del in a loop with stp enabled while setting priority in a loop, crashes are easily reproducible. The spinlocks around sysfs ->set() were removed in commit: 14f98f258f19 ("bridge: range check STP parameters") There's also a race condition in the netlink priority support that is fixed by this change, but it was introduced recently and the fixes tag covers it, just in case it's needed the commit is: af615762e972 ("bridge: add ageing_time, stp_state, priority over netlink") Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Fixes: 14f98f258f19 ("bridge: range check STP parameters") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-07-06bridge: fix multicast router rlist endless loopNikolay Aleksandrov
commit 1a040eaca1a22f8da8285ceda6b5e4a2cb704867 upstream. Since the addition of sysfs multicast router support if one set multicast_router to "2" more than once, then the port would be added to the hlist every time and could end up linking to itself and thus causing an endless loop for rlist walkers. So to reproduce just do: echo 2 > multicast_router; echo 2 > multicast_router; in a bridge port and let some igmp traffic flow, for me it hangs up in br_multicast_flood(). Fix this by adding a check in br_multicast_add_router() if the port is already linked. The reason this didn't happen before the addition of multicast_router sysfs entries is because there's a !hlist_unhashed check that prevents it. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries") Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-06-12bridge: disable softirqs around br_fdb_update to avoid lockupNikolay Aleksandrov
commit c4c832f89dc468cf11dc0dd17206bace44526651 upstream. br_fdb_update() can be called in process context in the following way: br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set) so we need to disable softirqs because there are softirq users of the hash_lock. One easy way to reproduce this is to modify the bridge utility to set NTF_USE, enable stp and then set maxageing to a low value so br_fdb_cleanup() is called frequently and then just add new entries in a loop. This happens because br_fdb_cleanup() is called from timer/softirq context. The spin locks in br_fdb_update were _bh before commit f8ae737deea1 ("[BRIDGE]: forwarding remove unneeded preempt and bh diasables") and at the time that commit was correct because br_fdb_update() couldn't be called from process context, but that changed after commit: 292d1398983f ("bridge: add NTF_USE support") Using local_bh_disable/enable around br_fdb_update() allows us to keep using the spin_lock/unlock in br_fdb_update for the fast-path. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 292d1398983f ("bridge: add NTF_USE support") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-06-02bridge: fix parsing of MLDv2 reportsThadeu Lima de Souza Cascardo
commit 47cc84ce0c2fe75c99ea5963c4b5704dd78ead54 upstream. When more than a multicast address is present in a MLDv2 report, all but the first address is ignored, because the code breaks out of the loop if there has not been an error adding that address. This has caused failures when two guests connected through the bridge tried to communicate using IPv6. Neighbor discoveries would not be transmitted to the other guest when both used a link-local address and a static address. This only happens when there is a MLDv2 querier in the network. The fix will only break out of the loop when there is a failure adding a multicast address. The mdb before the patch: dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp dev ovirtmgmt port bond0.86 grp ff02::2 temp After the patch: dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp dev ovirtmgmt port bond0.86 grp ff02::fb temp dev ovirtmgmt port bond0.86 grp ff02::2 temp dev ovirtmgmt port bond0.86 grp ff02::d temp dev ovirtmgmt port vnet0 grp ff02::1:ff00:76 temp dev ovirtmgmt port bond0.86 grp ff02::16 temp dev ovirtmgmt port vnet1 grp ff02::1:ff00:77 temp dev ovirtmgmt port bond0.86 grp ff02::1:ff00:def temp dev ovirtmgmt port bond0.86 grp ff02::1:ffa1:40bf temp Fixes: 08b202b67264 ("bridge br_multicast: IPv6 MLD support.") Reported-by: Rik Theys <Rik.Theys@esat.kuleuven.be> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Tested-by: Rik Theys <Rik.Theys@esat.kuleuven.be> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-05-12bridge/mdb: remove wrong use of NLM_F_MULTINicolas Dichtel
commit 821996795973fd52703c35811a03db9fec1ac141 upstream. NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact, it is sent only at the end of a dump. Libraries like libnl will wait forever for NLMSG_DONE. Fixes: 37a393bc4932 ("bridge: notify mdb changes via netlink") CC: Cong Wang <amwang@redhat.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: bridge@lists.linux-foundation.org Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2015-01-16bridge: fix netfilter/NF_BR_LOCAL_OUT for own, locally generated queriesLinus Lüssing
commit f0b4eeced518c632210ef2aea44fc92cc9e86cce upstream. Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected for both locally generated IGMP and MLD queries. The IP header specific filter options are off by 14 Bytes for netfilter (actual output on interfaces is fine). NF_HOOK() expects the skb->data to point to the IP header, not the ethernet one (while dev_queue_xmit() does not). Luckily there is an br_dev_queue_push_xmit() helper function already - let's just use that. Introduced by eb1d16414339a6e113d89e2cca2556005d7ce919 ("bridge: Add core IGMP snooping support") Ebtables example: $ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \ --log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROP before (broken): ~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \ MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \ SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \ DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \ IPv6 priority=0x3, Next Header=2 after (working): ~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \ MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \ SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \ DST=ff02:0000:0000:0000:0000:0000:0000:0001, \ IPv6 priority=0x0, Next Header=0 Signed-off-by: Linus Lüssing <linus.luessing@web.de> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
2014-10-15bridge: Fix br_should_learn to check vlan_enabledVlad Yasevich
[ Upstream commit c095f248e63ada504dd90c90baae673ae10ee3fe ] As Toshiaki Makita pointed out, the BRIDGE_INPUT_SKB_CB will not be initialized in br_should_learn() as that function is called only from br_handle_local_finish(). That is an input handler for link-local ethernet traffic so it perfectly correct to check br->vlan_enabled here. Reported-by: Toshiaki Makita<toshiaki.makita1@gmail.com> Fixes: 20adfa1 bridge: Check if vlan filtering is enabled only once. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15bridge: Check if vlan filtering is enabled only once.Vlad Yasevich
[ Upstream commit 20adfa1a81af00bf2027644507ad4fa9cd2849cf ] The bridge code checks if vlan filtering is enabled on both ingress and egress. When the state flip happens, it is possible for the bridge to currently be forwarding packets and forwarding behavior becomes non-deterministic. Bridge may drop packets on some interfaces, but not others. This patch solves this by caching the filtered state of the packet into skb_cb on ingress. The skb_cb is guaranteed to not be over-written between the time packet entres bridge forwarding path and the time it leaves it. On egress, we can then check the cached state to see if we need to apply filtering information. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-15net: Always untag vlan-tagged traffic on input.Vlad Yasevich
[ Upstream commit 0d5501c1c828fb97d02af50aa9d2b1a5498b94e4 ] Currently the functionality to untag traffic on input resides as part of the vlan module and is build only when VLAN support is enabled in the kernel. When VLAN is disabled, the function vlan_untag() turns into a stub and doesn't really untag the packets. This seems to create an interesting interaction between VMs supporting checksum offloading and some network drivers. There are some drivers that do not allow the user to change tx-vlan-offload feature of the driver. These drivers also seem to assume that any VLAN-tagged traffic they transmit will have the vlan information in the vlan_tci and not in the vlan header already in the skb. When transmitting skbs that already have tagged data with partial checksum set, the checksum doesn't appear to be updated correctly by the card thus resulting in a failure to establish TCP connections. The following is a packet trace taken on the receiver where a sender is a VM with a VLAN configued. The host VM is running on doest not have VLAN support and the outging interface on the host is tg3: 10:12:43.503055 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q (0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27243, offset 0, flags [DF], proto TCP (6), length 60) 10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect -> 0x48d9), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val 4294837885 ecr 0,nop,wscale 7], length 0 10:12:44.505556 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q (0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27244, offset 0, flags [DF], proto TCP (6), length 60) 10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect -> 0x44ee), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val 4294838888 ecr 0,nop,wscale 7], length 0 This connection finally times out. I've only access to the TG3 hardware in this configuration thus have only tested this with TG3 driver. There are a lot of other drivers that do not permit user changes to vlan acceleration features, and I don't know if they all suffere from a similar issue. The patch attempt to fix this another way. It moves the vlan header stipping code out of the vlan module and always builds it into the kernel network core. This way, even if vlan is not supported on a virtualizatoin host, the virtual machines running on top of such host will still work with VLANs enabled. CC: Patrick McHardy <kaber@trash.net> CC: Nithin Nayak Sujir <nsujir@broadcom.com> CC: Michael Chan <mchan@broadcom.com> CC: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-12bridge: fix compile error when compiling without IPv6 supportLinus Lüssing
Some fields in "struct net_bridge" aren't available when compiling the kernel without IPv6 support. Therefore adding a check/macro to skip the complaining code sections in that case. Introduced by 2cd4143192e8c60f66cb32c3a30c76d0470a372d ("bridge: memorize and export selected IGMP/MLD querier port") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12bridge: fix smatch warning / potential null pointer dereferenceLinus Lüssing
"New smatch warnings: net/bridge/br_multicast.c:1368 br_ip6_multicast_query() error: we previously assumed 'group' could be null (see line 1349)" In the rare (sort of broken) case of a query having a Maximum Response Delay of zero, we could create a potential null pointer dereference. Fixing this by skipping the multicast specific MLD Query parsing again if no multicast group address is available. Introduced by dc4eb53a996a78bfb8ea07b47423ff5a3aadc362 ("bridge: adhere to querier election mechanism specified by RFCs") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11bridge: Support 802.1ad vlan filteringToshiaki Makita
This enables us to change the vlan protocol for vlan filtering. We come to be able to filter frames on the basis of 802.1ad vlan tags through a bridge. This also changes br->group_addr if it has not been set by user. This is needed for an 802.1ad bridge. (See IEEE 802.1Q-2011 8.13.5.) Furthermore, this sets br->group_fwd_mask_required so that an 802.1ad bridge can forward the Nearest Customer Bridge group addresses except for br->group_addr, which should be passed to higher layer. To change the vlan protocol, write a protocol in sysfs: # echo 0x88a8 > /sys/class/net/br0/bridge/vlan_protocol Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11bridge: Prepare for forwarding another bridge group addressesToshiaki Makita
If a bridge is an 802.1ad bridge, it must forward another bridge group addresses (the Nearest Customer Bridge group addresses). (For details, see IEEE 802.1Q-2011 8.6.3.) As user might not want group_fwd_mask to be modified by enabling 802.1ad, introduce a new mask, group_fwd_mask_required, which indicates addresses the bridge wants to forward. This will be set by enabling 802.1ad. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11bridge: Prepare for 802.1ad vlan filtering supportToshiaki Makita
This enables a bridge to have vlan protocol informantion and allows vlan tag manipulation (retrieve, insert and remove tags) according to the vlan protocol. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11bridge: Add 802.1ad tx vlan accelerationToshiaki Makita
Bridge device doesn't need to embed S-tag into skb->data. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10bridge: memorize and export selected IGMP/MLD querier portLinus Lüssing
Adding bridge support to the batman-adv multicast optimization requires batman-adv knowing about the existence of bridged-in IGMP/MLD queriers to be able to reliably serve any multicast listener behind this same bridge. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10bridge: add export of multicast database adjacent to net_devLinus Lüssing
With this new, exported function br_multicast_list_adjacent(net_dev) a list of IPv4/6 addresses is returned. This list contains all multicast addresses sensed by the bridge multicast snooping feature on all bridge ports of the bridge interface of net_dev, excluding addresses from the specified net_device itself. Adding bridge support to the batman-adv multicast optimization requires batman-adv knowing about the existence of bridged-in multicast listeners to be able to reliably serve them with multicast packets. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10bridge: adhere to querier election mechanism specified by RFCsLinus Lüssing
MLDv1 (RFC2710 section 6), MLDv2 (RFC3810 section 7.6.2), IGMPv2 (RFC2236 section 3) and IGMPv3 (RFC3376 section 6.6.2) specify that the querier with lowest source address shall become the selected querier. So far the bridge stopped its querier as soon as it heard another querier regardless of its source address. This results in the "wrong" querier potentially becoming the active querier or a potential, unnecessary querying delay. With this patch the bridge memorizes the source address of the currently selected querier and ignores queries from queriers with a higher source address than the currently selected one. This slight optimization is supposed to make it more RFC compliant (but is rather uncritical and therefore probably not necessary to be queued for stable kernels). Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>