Commit Graph

26 Commits

Author SHA1 Message Date
Rick Macklem
9c2065607f Change the xid for client side krpc over UDP to a global value.
Without this patch, the xid used for the client side krpc requests over
UDP was initialized for each "connection". A "connection" for UDP is
rather sketchy and for the kernel NLM a new one is created every 2minutes.
A problem with client side interoperability with a Netapp server for the NLM
was reported and it is believed to be caused by reuse of the same xid.
Although this was never completely diagnosed by the reporter, I could see
how the same xid might get reused, since it is initialized to a value
based on the TOD clock every two minutes.
I suspect initializing the value for every "connection" was inherited from
userland library code, where having a global xid was not practical.
However, implementing a global "xid" for the kernel rpc is straightforward
and will ensure that an xid value is not reused for a long time. This
patch does that and is hoped it will fix the Netapp interoperability
problem.

PR:		245022
Reported by:	danny@cs.huji.ac.il
MFC after:	2 weeks
2020-04-05 21:08:17 +00:00
Alexander Kabaev
151ba7933a Do pass removing some write-only variables from the kernel.
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
2017-12-25 04:48:39 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Enji Cooper
f99529597c Deobfuscate cleanup path in clnt_dg_create(..)
Similar to r300836 and r301800, cl and cu will always be non-NULL as they're
allocated using the mem_alloc routines, which always use
`malloc(..., M_WAITOK)`.

Deobfuscating the cleanup path fixes a leak where if cl was NULL and
cu was not, cu would not be free'd, and also removes a duplicate test for
cl not being NULL.

MFC after: 1 week
Reported by: Coverity
CID: 1007033, 1007344
Sponsored by: EMC / Isilon Storage Division
2016-07-11 06:58:24 +00:00
Pedro F. Giffuni
6244c6e7db sys/rpc: minor spelling fixes.
No functional change.
2016-05-06 01:49:46 +00:00
Pedro F. Giffuni
4ed3c0e713 sys: Make use of our rounddown() macro when sys/param.h is available.
No functional change.
2016-04-30 14:41:18 +00:00
Dimitry Andric
a6132f60af Remove some unused static const strings under sys/rpc, which have never
been used since the initial commit (r177633).

MFC after:	3 days
2013-12-24 20:55:22 +00:00
Hiroki Sato
2e322d3796 Replace Sun RPC license in TI-RPC library with a 3-clause BSD license,
with the explicit permission of Sun Microsystems in 2009.
2013-11-25 19:04:36 +00:00
Rick Macklem
318677ad92 It was reported via email that the cu_sent field used by the
krpc client side UDP was observed as way out of range and
caused the rpc.lockd daemon to hang trying to do an RPC.
Inspection of the code found two places where the RPC request
is re-queued, but the value of cu_sent was not incremented.
Since cu_sent is always decremented when the RPC request is
dequeued, I think this could have caused cu_sent to go out of
range. This patch adds lines to increment cu_sent for these
two cases.

Reported by:	dwhite@ixsystems.com
Discussed with:	dwhite@ixsystems.com
MFC after:	2 weeks
2013-09-06 02:34:34 +00:00
Gleb Smirnoff
bd54830bcb Use m_get(), m_gethdr() and m_getcl() instead of historic macros.
Sponsored by:	Nginx, Inc.
2013-03-12 12:17:19 +00:00
Gleb Smirnoff
eb1b1807af Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
Rick Macklem
2ba476324b Both a crash reported on freebsd-current on Oct. 18 under the
subject heading "mtx_lock() of destroyed mutex on NFS" and
PR# 156168 appear to be caused by clnt_dg_destroy() closing
down the socket prematurely. When to close down the socket
is controlled by a reference count (cs_refs), but clnt_dg_create()
checks for sb_upcall being non-NULL to decide if a new socket
is needed. I believe the crashes were caused by the following race:
  clnt_dg_destroy() finds cs_refs == 0 and decides to delete socket
  clnt_dg_destroy() then loses race with clnt_dg_create() for
    acquisition of the SOCKBUF_LOCK()
  clnt_dg_create() finds sb_upcall != NULL and increments cs_refs to 1
  clnt_dg_destroy() then acquires SOCKBUF_LOCK(), sets sb_upcall to
    NULL and destroys socket

This patch fixes the above race by changing clnt_dg_destroy() so
that it acquires SOCKBUF_LOCK() before testing cs_refs.

Tested by:	bz
PR:		156168
Reviewed by:	dfr
MFC after:	2 weeks
2011-11-03 14:38:03 +00:00
Artem Belevich
fa3db771d2 Make sure RPC calls over UDP return RPC_INTR status is the process has
been interrupted in a restartable syscall. Otherwise we could end up
in an (almost) endless loop in clnt_reconnect_call().

PR: kern/160198
Reviewed by: rmacklem
Approved by: re (kib), avg (mentor)
MFC after: 1 week
2011-08-28 18:09:17 +00:00
Rick Macklem
5e8eb3cd4e Fix a couple of mbuf leaks introduced by r217242. I do
not believe that these leaks had a practical impact,
since the situations in which they would have occurred
would have been extremely rare.

MFC after:	2 weeks
2011-04-13 00:03:49 +00:00
Bjoern A. Zeeb
1fb51a12f2 Mfp4 CH=177274,177280,177284-177285,177297,177324-177325
VNET socket push back:
  try to minimize the number of places where we have to switch vnets
  and narrow down the time we stay switched.  Add assertions to the
  socket code to catch possibly unset vnets as seen in r204147.

  While this reduces the number of vnet recursion in some places like
  NFS, POSIX local sockets and some netgraph, .. recursions are
  impossible to fix.

  The current expectations are documented at the beginning of
  uipc_socket.c along with the other information there.

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by:  jhb
  Tested by:    zec

Tested by:	Mikolaj Golub (to.my.trociny gmail.com)
MFC after:	2 weeks
2011-02-16 21:29:13 +00:00
Rick Macklem
2a1e0fb436 Fix a bug in the client side krpc where it was, sometimes
erroneously, assumed that 4 bytes of data were in the first
mbuf of a list by replacing the bcopy() with m_copydata().
Also, replace the uses of m_pullup(), which can fail for
reasons other than not enough data, with m_copydata().
For the cases where it isn't known that there is enough
data in the mbuf list, check first via m_len and m_length().
This is believed to fix a problem reported by dpd at dpdtech.com
and george+freebsd at m5p.com.

Reviewed by:	jhb
MFC after:	8 days
2011-01-10 21:35:10 +00:00
Rick Macklem
cec077bc8f Fix the krpc so that it can handle NFSv3,UDP mounts with a read/write
data size greater than 8192. Since soreserve(so, 256*1024, 256*1024)
would always fail for the default value of sb_max, modify clnt_dg.c
so that it uses the calculated values and checks for an error return
from soreserve(). Also, add a check for error return from soreserve()
to clnt_vc.c and change __rpc_get_t_size() to use sb_max_adj instead of
the bogus maxsize == 256*1024.

PR:		kern/150910
Reviewed by:	jhb
MFC after:	2 weeks
2010-10-13 00:57:14 +00:00
Martin Blapp
c2ede4b379 Remove extraneous semicolons, no functional changes.
Submitted by:	Marc Balmer <marc@msys.ch>
MFC after:	1 week
2010-01-07 21:01:37 +00:00
Marko Zec
0348c661d1 Fix NFS panics with options VIMAGE kernels by apropriately setting curvnet
context inside the RPC code.

Temporarily set td's cred to mount's cred before calling socreate() via
__rpc_nconf2socket().

Submitted by:	rmacklem (in part)
Reviewed by:	rmacklem, rwatson
Discussed with:	dfr, bz
Approved by:	re (rwatson), julian (mentor)
MFC after:	3 days
2009-08-24 10:09:30 +00:00
Rick Macklem
b766fabd9c Make sure that cr_error is set to ESHUTDOWN when closing the connection.
This is normally done by a loop in clnt_dg_close(), but requests that aren't
in the pending queue at the time of closing, don't get set. This avoids a
panic in xdrmbuf_create() when it is called with a NULL cr_mrep if
cr_error doesn't get set to ESHUTDOWN while closing.

Reviewed by:	dfr
Approved by:	re (Ken Smith), kib (mentor)
2009-07-01 16:38:18 +00:00
Rick Macklem
3144f81221 Fix upcall races in the client side krpc. For the client side upcall,
holding SOCKBUF_LOCK() isn't sufficient to guarantee that there is
no upcall in progress, since SOCKBUF_LOCK() is released/re-acquired
in the upcall. An upcall reference counter was added to the upcall
structure that is incremented at the beginning of the upcall and
decremented at the end of the upcall. As such, a reference count == 0
when holding the SOCKBUF_LOCK() guarantees there is no upcall in
progress. Add a function that is called just after soupcall_clear(),
which waits until the reference count == 0.
Also, move the mtx_destroy() down to after soupcall_clear(), so that
the mutex is not destroyed before upcalls are done.

Reviewed by:	dfr, jhb
Tested by:	pho
Approved by:	kib (mentor)
2009-06-04 14:49:27 +00:00
John Baldwin
74fb0ba732 Rework socket upcalls to close some races with setup/teardown of upcalls.
- Each socket upcall is now invoked with the appropriate socket buffer
  locked.  It is not permissible to call soisconnected() with this lock
  held; however, so socket upcalls now return an integer value.  The two
  possible values are SU_OK and SU_ISCONNECTED.  If an upcall returns
  SU_ISCONNECTED, then the soisconnected() will be invoked on the
  socket after the socket buffer lock is dropped.
- A new API is provided for setting and clearing socket upcalls.  The
  API consists of soupcall_set() and soupcall_clear().
- To simplify locking, each socket buffer now has a separate upcall.
- When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from
  the receive socket buffer automatically.  Note that a SO_SND upcall
  should never return SU_ISCONNECTED.
- All this means that accept filters should now return SU_ISCONNECTED
  instead of calling soisconnected() directly.  They also no longer need
  to explicitly clear the upcall on the new socket.
- The HTTP accept filter still uses soupcall_set() to manage its internal
  state machine, but other accept filters no longer have any explicit
  knowlege of socket upcall internals aside from their return value.
- The various RPC client upcalls currently drop the socket buffer lock
  while invoking soreceive() as a temporary band-aid.  The plan for
  the future is to add a new flag to allow soreceive() to be called with
  the socket buffer locked.
- The AIO callback for socket I/O is now also invoked with the socket
  buffer locked.  Previously sowakeup() would drop the socket buffer
  lock only to call aio_swake() which immediately re-acquired the socket
  buffer lock for the duration of the function call.

Discussed with:	rwatson, rmacklem
2009-06-01 21:17:03 +00:00
Doug Rabson
a9148abd9d Implement support for RPCSEC_GSS authentication to both the NFS client
and server. This replaces the RPC implementation of the NFS client and
server with the newer RPC implementation originally developed
(actually ported from the userland sunrpc code) to support the NFS
Lock Manager.  I have tested this code extensively and I believe it is
stable and that performance is at least equal to the legacy RPC
implementation.

The NFS code currently contains support for both the new RPC
implementation and the older legacy implementation inherited from the
original NFS codebase. The default is to use the new implementation -
add the NFS_LEGACYRPC option to fall back to the old code. When I
merge this support back to RELENG_7, I will probably change this so
that users have to 'opt in' to get the new code.

To use RPCSEC_GSS on either client or server, you must build a kernel
which includes the KGSSAPI option and the crypto device. On the
userland side, you must build at least a new libc, mountd, mount_nfs
and gssd. You must install new versions of /etc/rc.d/gssd and
/etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf.

As long as gssd is running, you should be able to mount an NFS
filesystem from a server that requires RPCSEC_GSS authentication. The
mount itself can happen without any kerberos credentials but all
access to the filesystem will be denied unless the accessing user has
a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There
is currently no support for situations where the ticket file is in a
different place, such as when the user logged in via SSH and has
delegated credentials from that login. This restriction is also
present in Solaris and Linux. In theory, we could improve this in
future, possibly using Brooks Davis' implementation of variant
symlinks.

Supporting RPCSEC_GSS on a server is nearly as simple. You must create
service creds for the server in the form 'nfs/<fqdn>@<REALM>' and
install them in /etc/krb5.keytab. The standard heimdal utility ktutil
makes this fairly easy. After the service creds have been created, you
can add a '-sec=krb5' option to /etc/exports and restart both mountd
and nfsd.

The only other difference an administrator should notice is that nfsd
doesn't fork to create service threads any more. In normal operation,
there will be two nfsd processes, one in userland waiting for TCP
connections and one in the kernel handling requests. The latter
process will create as many kthreads as required - these should be
visible via 'top -H'. The code has some support for varying the number
of service threads according to load but initially at least, nfsd uses
a fixed number of threads according to the value supplied to its '-n'
option.

Sponsored by:	Isilon Systems
MFC after:	1 month
2008-11-03 10:38:00 +00:00
Doug Rabson
c675522fc4 Re-implement the client side of rpc.lockd in the kernel. This implementation
provides the correct semantics for flock(2) style locks which are used by the
lockf(1) command line tool and the pidfile(3) library. It also implements
recovery from server restarts and ensures that dirty cache blocks are written
to the server before obtaining locks (allowing multiple clients to use file
locking to safely share data).

Sponsored by:	Isilon Systems
PR:		94256
MFC after:	2 weeks
2008-06-26 10:21:54 +00:00
Doug Rabson
ee31b83a3a Minor changes to improve compatibility with older FreeBSD releases. 2008-03-28 09:50:32 +00:00
Doug Rabson
dfdcada31e Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
  client handle safely with replies being de-multiplexed at the socket
  upcall (typically driven directly by the NIC interrupt) and handed
  off to whichever thread matches the reply. For UDP sockets, many RPC
  clients can share the same socket. This allows the use of a single
  privileged UDP port number to talk to an arbitrary number of remote
  hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
  server would be relatively straightforward and would follow
  approximately the Solaris KPI. A single thread should be sufficient
  for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
  callbacks. I've tested the NLM server reasonably extensively - it
  passes both my own tests and the NFS Connectathon locking tests
  running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
  support for the local NFS client's locking needs, it does have to
  field async replies and granted callbacks from remote NLMs that the
  local client has contacted. We relay these replies to the userland
  rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
  it will detect deadlocks caused by a lock request that covers more
  than one blocking request. As required by the NLM protocol, all
  deadlock detection happens synchronously - a user is guaranteed that
  if a lock request isn't rejected immediately, the lock will
  eventually be granted. The old system allowed for a 'deferred
  deadlock' condition where a blocked lock request could wake up and
  find that some other deadlock-causing lock owner had beaten them to
  the lock.

* Since both local and remote locks are managed by the same kernel
  locking code, local and remote processes can safely use file locks
  for mutual exclusion. Local processes have no fairness advantage
  compared to remote processes when contending to lock a region that
  has just been unlocked - the local lock manager enforces a strict
  first-come first-served model for both local and remote lockers.

Sponsored by:	Isilon Systems
PR:		95247 107555 115524 116679
MFC after:	2 weeks
2008-03-26 15:23:12 +00:00