smr: Fix synchronization in smr_enter()

smr_enter() must publish its observed read sequence number before
issuing any subsequent memory operations.  The ordering provided by
atomic_add_acq_int() is insufficient on some platforms, at least on
arm64, because it permits reordering of subsequent loads with the store
to c_seq.
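
To make the hazard concrete, here is a minimal sketch using C11
stdatomic rather than the kernel's atomic(9) API; the globals and
function names are illustrative, not part of the patch:

    #include <stdatomic.h>

    _Atomic unsigned int wr_seq;    /* global write sequence */
    _Atomic unsigned int c_seq;     /* per-CPU observed sequence */
    _Atomic int protected_obj;      /* datum read inside the section */

    void
    enter_broken(void)
    {
        unsigned int s = atomic_load_explicit(&wr_seq,
            memory_order_relaxed);

        /*
         * Acquire ordering constrains the load half of the RMW:
         * later operations cannot move before it.  It does not
         * prevent a later load from being satisfied before the
         * store to c_seq becomes visible to other CPUs, which is
         * the reordering observed on arm64.
         */
        atomic_fetch_add_explicit(&c_seq, s, memory_order_acquire);

        /* May effectively execute before c_seq is published. */
        (void)atomic_load_explicit(&protected_obj, memory_order_relaxed);
    }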

Thus, use atomic_thread_fence_seq_cst() to issue a store-load barrier
after publishing the read sequence number.
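
Continuing the sketch above, the corrected ordering stores the
sequence number and then fences:

    void
    enter_fixed(void)
    {
        unsigned int s = atomic_load_explicit(&wr_seq,
            memory_order_relaxed);

        atomic_store_explicit(&c_seq, s, memory_order_relaxed);

        /*
         * Full store-load barrier: the store to c_seq is globally
         * visible before any subsequent load is satisfied.
         */
        atomic_thread_fence(memory_order_seq_cst);

        (void)atomic_load_explicit(&protected_obj, memory_order_relaxed);
    }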

On x86, take advantage of the fact that memory operations are not
reordered with locked instructions to improve code density: we can store
the observed read sequence and provide a store-load barrier with a
single operation.
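
A sketch of the x86 shortcut, relying on the fact (recorded in the
new comment below) that c_seq holds SMR_SEQ_INVALID, i.e. zero,
outside a read section, so an add is equivalent to a store:

    void
    enter_x86(void)
    {
        unsigned int s = atomic_load_explicit(&wr_seq,
            memory_order_relaxed);

        /*
         * A seq_cst RMW compiles to a locked instruction on x86
         * (e.g. lock xaddl), which is a full barrier, so one
         * instruction both publishes the sequence number and
         * provides the store-load barrier.
         */
        atomic_fetch_add_explicit(&c_seq, s, memory_order_seq_cst);
    }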

Based on a patch from Pierre Habouzit <pierre@habouzit.net>.

PR:		265974
Reviewed by:	alc
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36370
commit 8694fd3335
parent c2d27b0ec7
Mark Johnston	2022-09-24 09:18:04 -04:00

@@ -122,8 +122,12 @@ smr_enter(smr_t smr)
 	 * Frees that are newer than this stored value will be
 	 * deferred until we call smr_exit().
 	 *
-	 * An acquire barrier is used to synchronize with smr_exit()
-	 * and smr_poll().
+	 * Subsequent loads must not be re-ordered with the store. On
+	 * x86 platforms, any locked instruction will provide this
+	 * guarantee, so as an optimization we use a single operation to
+	 * both store the cached write sequence number and provide the
+	 * requisite barrier, taking advantage of the fact that
+	 * SMR_SEQ_INVALID is zero.
 	 *
 	 * It is possible that a long delay between loading the wr_seq
 	 * and storing the c_seq could create a situation where the
@@ -132,8 +136,12 @@ smr_enter(smr_t smr)
 	 * the load. See smr_poll() for details on how this condition
 	 * is detected and handled there.
 	 */
-	/* This is an add because we do not have atomic_store_acq_int */
+#if defined(__amd64__) || defined(__i386__)
 	atomic_add_acq_int(&smr->c_seq, smr_shared_current(smr->c_shared));
+#else
+	atomic_store_int(&smr->c_seq, smr_shared_current(smr->c_shared));
+	atomic_thread_fence_seq_cst();
+#endif
 }

 /*