freebsd-dev/sys
Hans Petter Selasky 468a6b5055 ibcore: Fix use-after-free in IB mad completion handling.
We encountered a use-after-free bug when unloading the driver:

BUG: KASAN: use-after-free in ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
Read of size 4 at addr ffff8882ca5aa868 by task kworker/u13:2/23862

Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
Call Trace:
dump_stack+0x9a/0xeb
print_address_description+0xe3/0x2e0
ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
__kasan_report+0x15c/0x1df
ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
kasan_report+0xe/0x20
ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
find_mad_agent+0xa00/0xa00 [ib_core]
qlist_free_all+0x51/0xb0
mlx4_ib_sqp_comp_worker+0x1970/0x1970 [mlx4_ib]
quarantine_reduce+0x1fa/0x270
kasan_unpoison_shadow+0x30/0x40
ib_mad_recv_done+0xdf6/0x3000 [ib_core]
_raw_spin_unlock_irqrestore+0x46/0x70
ib_mad_send_done+0x1810/0x1810 [ib_core]
mlx4_ib_destroy_cq+0x2a0/0x2a0 [mlx4_ib]
_raw_spin_unlock_irqrestore+0x46/0x70
debug_object_deactivate+0x2b9/0x4a0
__ib_process_cq+0xe2/0x1d0 [ib_core]
ib_cq_poll_work+0x45/0xf0 [ib_core]
process_one_work+0x90c/0x1860
pwq_dec_nr_in_flight+0x320/0x320
worker_thread+0x87/0xbb0
__kthread_parkme+0xb6/0x180
process_one_work+0x1860/0x1860
kthread+0x320/0x3e0
kthread_park+0x120/0x120
ret_from_fork+0x24/0x30
...
Freed by task 31682:
save_stack+0x19/0x80
__kasan_slab_free+0x11d/0x160
kfree+0xf5/0x2f0
ib_mad_port_close+0x200/0x380 [ib_core]
ib_mad_remove_device+0xf0/0x230 [ib_core]
remove_client_context+0xa6/0xe0 [ib_core]
disable_device+0x14e/0x260 [ib_core]
__ib_unregister_device+0x79/0x150 [ib_core]
ib_unregister_device+0x21/0x30 [ib_core]
mlx4_ib_remove+0x162/0x690 [mlx4_ib]
mlx4_remove_device+0x204/0x2c0 [mlx4_core]
mlx4_unregister_interface+0x49/0x1d0 [mlx4_core]
mlx4_ib_cleanup+0xc/0x1d [mlx4_ib]
__x64_sys_delete_module+0x2d2/0x400
do_syscall_64+0x95/0x470
entry_SYSCALL_64_after_hwframe+0x49/0xbe

The problem was that the MAD PD was deallocated before the MAD CQ.
There was completion work pending for the CQ when the PD got deallocated.
When the MAD completion handling reached procedure
ib_mad_post_receive_mads(), it hit a use-after-free on the following
line of code in that procedure:
sg_list.lkey = qp_info->port_priv->pd->local_dma_lkey;
(The pd pointer in the above line is no longer valid because the
PD has already been deallocated.)

We fix this by allocating the PD before the CQ in procedure
ib_mad_port_open(), and deallocating the PD after freeing the CQ
in procedure ib_mad_port_close().

Since the CQ completion work queue is flushed during ib_free_cq(),
no completions will be pending for that CQ when the PD is later
deallocated.
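
The ordering argument above can be illustrated with a small userspace
model. This is only a sketch and not the ibcore code: every toy_* name
is hypothetical, the PD is marked "not live" instead of being freed so
the bad ordering can be counted safely, and the flush is modeled as a
simple loop over pending completions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the ibcore structures. */
struct toy_pd {
	bool live;			/* false once "deallocated" */
	unsigned int local_dma_lkey;
};

struct toy_cq {
	struct toy_pd *pd;
	int pending;			/* completions not yet processed */
	int stale_reads;		/* reads of a dead PD (UAF in the kernel) */
	unsigned int last_lkey;
};

/* Models a completion handler that reads pd->local_dma_lkey. */
static void
toy_handle_completion(struct toy_cq *cq)
{
	if (cq->pd->live)
		cq->last_lkey = cq->pd->local_dma_lkey;
	else
		cq->stale_reads++;
}

/* Models the flush of pending completion work when the CQ is freed. */
static void
toy_free_cq(struct toy_cq *cq)
{
	while (cq->pending > 0) {
		toy_handle_completion(cq);
		cq->pending--;
	}
}

static void
toy_dealloc_pd(struct toy_pd *pd)
{
	pd->live = false;
}

/* Tears down in the given order; returns the number of stale PD reads. */
static int
toy_port_close(bool cq_first)
{
	struct toy_pd pd = { .live = true, .local_dma_lkey = 42 };
	struct toy_cq cq = { .pd = &pd, .pending = 3 };

	if (cq_first) {
		toy_free_cq(&cq);	/* fixed order: CQ, then PD */
		toy_dealloc_pd(&pd);
	} else {
		toy_dealloc_pd(&pd);	/* old order: PD, then CQ */
		toy_free_cq(&cq);
	}
	return (cq.stale_reads);
}
```

In this model, toy_port_close(true) sees no stale reads, while
toy_port_close(false) counts one stale read per pending completion.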

Note that freeing the CQ before deallocating the PD is the practice
in the ULPs.

Linux commit:
770b7d96cfff6a8bf6c9f261ba6f135dc9edf484

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:31 +02:00
amd64 igc(4): Introduce new driver for the Intel I225 Ethernet controller. 2021-07-12 14:57:18 +10:00
arm Revert "Pass the syscall number to capsicum permission-denied signals" 2021-07-10 20:26:01 +01:00
arm64 Revert "Pass the syscall number to capsicum permission-denied signals" 2021-07-10 20:26:01 +01:00
bsm
cam cam: enable kern.cam.ada.enable_uma_ccbs by default 2021-07-07 09:40:34 +01:00
cddl
compat LinuxKPI: Force the usleep_range() function to sleep instead of spinning on the timer. 2021-07-10 21:59:31 +02:00
conf igc(4): Introduce new driver for the Intel I225 Ethernet controller. 2021-07-12 14:57:18 +10:00
contrib zfs: merge openzfs/zfs@bdd11cbb9 (master) into main 2021-07-07 23:31:52 +02:00
crypto ossl: Use crypto_cursor_segment(). 2021-05-25 16:59:19 -07:00
ddb
dev mlx5core: Make sure error code is propagated on error. 2021-07-12 14:22:31 +02:00
dts dts: Bump the freebsd branding version to 5.13 2021-07-01 18:48:56 +02:00
fs nfscl: Add a Linux compatible "nconnect" mount option 2021-07-08 17:39:04 -07:00
gdb
geom geom_label: Remove an old sysinstall(8) workaround 2021-07-05 16:15:32 +01:00
gnu
i386 igc(4): Introduce new driver for the Intel I225 Ethernet controller. 2021-07-12 14:57:18 +10:00
isa newbus: Move from bus_child_{pnpinfo,location}_src to bus_child_{pnpinfo,location} with sbuf 2021-06-22 20:52:06 -06:00
kern cache: add cache_enter_time_flags 2021-07-12 07:03:14 +02:00
kgssapi
libkern
mips Revert "Pass the syscall number to capsicum permission-denied signals" 2021-07-10 20:26:01 +01:00
modules igc(4): Introduce new driver for the Intel I225 Ethernet controller. 2021-07-12 14:57:18 +10:00
net pf: add DIOCGETSTATESV2 2021-07-09 10:29:53 +02:00
net80211 net80211: ieee80211_probereq_ie fix length calculation for hw scans 2021-06-28 12:17:11 +00:00
netgraph Consistently use the SOLISTENING() macro 2021-06-14 17:32:27 -04:00
netinet libalias: fix divide by zero causing panic 2021-07-10 13:08:18 +02:00
netinet6 sctp: Fix errno in case of association setup failures 2021-07-09 23:19:25 +02:00
netipsec ipsec: globalize lft zone and zero out buffers at allocation time 2021-06-28 08:14:26 +00:00
netpfil pf: bound DIOCGETSTATESV2 memory use 2021-07-09 10:30:02 +02:00
netsmb netsmb: Avoid a read-after-free in smb_t2_request_int() 2021-05-26 10:45:40 -04:00
nfs
nfsclient
nfsserver
nlm
ofed ibcore: Fix use-after-free in IB mad completion handling. 2021-07-12 14:22:31 +02:00
opencrypto crypto: Remove now-unused crypto_cursor_seg{base,len}. 2021-06-16 15:23:16 -07:00
powerpc Revert "Pass the syscall number to capsicum permission-denied signals" 2021-07-10 20:26:01 +01:00
riscv Revert "Pass the syscall number to capsicum permission-denied signals" 2021-07-10 20:26:01 +01:00
rpc Consistently use the SOLISTENING() macro 2021-06-14 17:32:27 -04:00
security mac: cheaper check for ifnet_create_mbuf and ifnet_check_transmit 2021-06-29 15:06:45 +02:00
sys cache: add cache_enter_time_flags 2021-07-12 07:03:14 +02:00
teken
tests tests: Revise FIB lookups per second benchmarking routines 2021-06-17 08:49:09 +02:00
tools
ufs ffs_softdep: force sync if journal is low in journal_check_space 2021-06-23 23:47:05 +03:00
vm uma: Fix a few problems with KASAN integration 2021-07-09 20:38:50 -04:00
x86 x86: Mark the trapframe as initialized in ipi_bitmap_handler() 2021-07-09 20:38:50 -04:00
xdr
xen
Makefile