iFix udp_output() lock inconsistency.

In r297225 the initial INP_RLOCK() was replaced by an early
acquisition of an r- or w-lock depending on input variables
possibly extending the write locked area for reasons not entirely
clear but possibly to avoid a later case of unlock and relock
leading to a possible race condition and possibly in order to
allow the route cache to work for connected sockets.

Unfortunately the conditions were not 1:1 replicated (probably
because of the route cache needs). While this would not be a
problem the legacy IP code compared to IPv6 has an extra case
when dealing with IP_SENDSRCADDR. In a particular case we were
holding an exclusive inp lock and acquired the shared udbinfo
lock (now epoch).
When then running into an error case, the locking assertions
on release fired as the udpinfo and inp lock levels did not match.

Break up the special case and in that particular case acquire
and udpinfo lock depending on the exclusitivity of the inp lock.

MFC After:	9 days
Reported-by:	syzbot+1f5c6800e4f99bdb1a48@syzkaller.appspotmail.com
Reviewed by:	tuexen
Differential Revision:	https://reviews.freebsd.org/D19594
This commit is contained in:
Bjoern A. Zeeb 2019-04-23 10:12:33 +00:00
parent 3bed0179ee
commit d86ecbe993
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=346595

View File

@ -1258,20 +1258,22 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct sockaddr *addr,
}
/*
* Depending on whether or not the application has bound or connected
* the socket, we may have to do varying levels of work. The optimal
* case is for a connected UDP socket, as a global lock isn't
* required at all.
*
* In order to decide which we need, we require stability of the
* inpcb binding, which we ensure by acquiring a read lock on the
* inpcb. This doesn't strictly follow the lock order, so we play
* the trylock and retry game; note that we may end up with more
* conservative locks than required the second time around, so later
* assertions have to accept that. Further analysis of the number of
* misses under contention is required.
*
* XXXRW: Check that hash locking update here is correct.
* In the old days, depending on whether or not the application had
* bound or connected the socket, we had to do varying levels of work.
* The optimal case was for a connected UDP socket, as a global lock
* wasn't required at all.
* In order to decide which we need, we required stability of the
* inpcb binding, which we ensured by acquiring a read lock on the
* inpcb. This didn't strictly follow the lock order, so we played
* the trylock and retry game.
* With the re-introduction of the route-cache in some cases, we started
* to acquire an early inp wlock and a possible race during re-lock
* went away. With the introduction of epoch(9) some read locking
* became epoch(9) and the lock-order issues also went away.
* Due to route-cache we may now hold more conservative locks than
* otherwise required and have split up the 2nd case in case 2 and 3
* in order to keep the udpinfo lock level in sync with the inp one
* for the IP_SENDSRCADDR case below.
*/
pr = inp->inp_socket->so_proto->pr_protocol;
pcbinfo = udp_get_inpcbinfo(pr);
@ -1279,14 +1281,21 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct sockaddr *addr,
(inp->inp_laddr.s_addr == INADDR_ANY && inp->inp_lport == 0)) {
INP_HASH_WLOCK(pcbinfo);
unlock_udbinfo = UH_WLOCKED;
} else if ((sin != NULL &&
(sin->sin_addr.s_addr == INADDR_ANY ||
sin->sin_addr.s_addr == INADDR_BROADCAST ||
inp->inp_laddr.s_addr == INADDR_ANY ||
inp->inp_lport == 0)) ||
src.sin_family == AF_INET) {
} else if (sin != NULL &&
(sin->sin_addr.s_addr == INADDR_ANY ||
sin->sin_addr.s_addr == INADDR_BROADCAST ||
inp->inp_laddr.s_addr == INADDR_ANY ||
inp->inp_lport == 0)) {
INP_HASH_RLOCK_ET(pcbinfo, et);
unlock_udbinfo = UH_RLOCKED;
} else if (src.sin_family == AF_INET) {
if (unlock_inp == UH_WLOCKED) {
INP_HASH_WLOCK(pcbinfo);
unlock_udbinfo = UH_WLOCKED;
} else {
INP_HASH_RLOCK_ET(pcbinfo, et);
unlock_udbinfo = UH_RLOCKED;
}
} else
unlock_udbinfo = UH_UNLOCKED;