freebsd-skq

Author	SHA1	Message	Date
imp	7e6cabd06e	Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
avg	498edf8d92	add svcpool_close to handle killed nfsd threads This patch adds a new function to the server krpc called svcpool_close(). It is similar to svcpool_destroy(), but does not free the data structures, so that the pool can be used again. This function is then used instead of svcpool_destroy(), svcpool_create() when the nfsd threads are killed. PR: 204340 Reported by: Panzura Approved by: rmacklem Obtained from: rmacklem MFC after: 1 week	2017-02-14 17:49:08 +00:00
kib	42da5a6952	Hide the boottime and bootimebin globals, provide the getboottime(9) and getboottimebin(9) KPI. Change consumers of boottime to use the KPI. The variables were renamed to avoid shadowing issues with local variables of the same name. Issue is that boottime* should be adjusted from tc_windup(), which requires them to be members of the timehands structure. As a preparation, this commit only introduces the interface. Some uses of boottime were found doubtful, e.g. NLM uses boottime to identify the system boot instance. Arguably the identity should not change on the leap second adjustment, but the commit is about the timekeeping code and the consumers were kept bug-to-bug compatible. Tested by: pho (as part of the bigger patch) Reviewed by: jhb (same) Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 1 month X-Differential revision: https://reviews.freebsd.org/D7302	2016-07-27 11:08:59 +00:00
ngie	dcf6ba27e8	Don't test for xpt not being NULL before calling svc_xprt_free(..) svc_xprt_alloc(..) will always return initialized memory as it uses mem_alloc(..) under the covers, which uses malloc(.., M_WAITOK, ..). MFC after: 1 week Reported by: Coverity CID: 1007341 Sponsored by: EMC / Isilon Storage Division	2016-07-11 07:24:56 +00:00
ngie	f5b8c276a7	Convert `svc_xprt_alloc(..)` and `svc_xprt_free(..)`'s prototypes to ANSI C style prototypes MFC after: 1 week Sponsored by: EMC / Isilon Storage Division	2016-07-11 07:17:52 +00:00
ngie	73e3956f3e	Deobfuscate cleanup path in clnt_vc_create(..) Similar to r300836, r301800, and r302550, cl and ct will always be non-NULL as they're allocated using the mem_alloc routines, which always use `malloc(..., M_WAITOK)`. MFC after: 1 week Reported by: Coverity CID: 1007342 Sponsored by: EMC / Isilon Storage Division	2016-07-11 07:07:15 +00:00
ngie	e240ec3bb0	Deobfuscate cleanup path in clnt_dg_create(..) Similar to r300836 and r301800, cl and cu will always be non-NULL as they're allocated using the mem_alloc routines, which always use `malloc(..., M_WAITOK)`. Deobfuscating the cleanup path fixes a leak where if cl was NULL and cu was not, cu would not be free'd, and also removes a duplicate test for cl not being NULL. MFC after: 1 week Reported by: Coverity CID: 1007033, 1007344 Sponsored by: EMC / Isilon Storage Division	2016-07-11 06:58:24 +00:00
ngie	722a8ee711	Deobfuscate cleanup path in clnt_bck_create(..) Similar to r300836, cl and ct will always be non-NULL as they're allocated using the mem_alloc routines, which always use `malloc(..., M_WAITOK)`. Deobfuscating the cleanup path fixes a leak where if cl was NULL and ct was not, ct would not be free'd, and also removes a duplicate test for cl not being NULL. Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D6801 MFC after: 1 week Reported by: Coverity CID: 1229999 Reviewed by: cem Sponsored by: EMC / Isilon Storage Division	2016-06-10 17:53:28 +00:00
kevlo	3623b3448c	Fix the rpcb_getaddr() definition to match its declaration. Submitted by: Sebastian Huber <sebastian dot huber at embedded-brains dot de>	2016-06-09 14:33:00 +00:00
ngie	24650ad43b	Quell false positives in svc_vc_create and svc_vc_create_conn with cd and xprt Both cd and xprt will be non-NULL after their respective malloc(9) wrappers are called (mem_alloc and svc_xprt_alloc, which calls mem_alloc) as mem_alloc always gets called with M_WAITOK\|M_ZERO today. Thus, testing for them being non-NULL is incorrect -- it misleads Coverity and it misleads the reader. Remove some unnecessary NULL initializations as a follow up to help solidify the fact that these pointers will be initialized properly in sys/rpc/.. with the interfaces the way they are currently. Differential Revision: https://reviews.freebsd.org/D6572 MFC after: 2 weeks Reported by: Coverity CID: 1007338, 1007339, 1007340 Reviewed by: markj, truckman Sponsored by: EMC / Isilon Storage Division	2016-05-27 08:48:33 +00:00
ngie	3e61cae63a	Remove unnecessary memset(.., 0, ..)'s The mem_alloc macro calls calloc (userspace) / malloc(.., M_WAITOK\|M_ZERO) under the covers, so zeroing out memory is already handled by the underlying calls MFC after: 1 week Sponsored by: EMC / Isilon Storage Division	2016-05-24 20:06:41 +00:00
pfg	68fbe09129	sys/rpc: minor spelling fixes. No functional change.	2016-05-06 01:49:46 +00:00
pfg	e72339bbf0	sys: Make use of our rounddown() macro when sys/param.h is available. No functional change.	2016-04-30 14:41:18 +00:00
cem	9b64f241b8	kgssapi(4): Fix string overrun in Kerberos principal construction 'buf.value' was previously treated as a nul-terminated string, but only allocated with strlen() space. Rectify this. Reported by: Coverity CID: 1007639 Sponsored by: EMC / Isilon Storage Division	2016-04-20 04:45:23 +00:00
pfg	507a1d08d6	RPC: for pointers replace 0 with NULL. These are mostly cosmetical, no functional change. Found with devel/coccinelle.	2016-04-14 17:06:37 +00:00
pfg	b63211eed5	Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.	2016-04-10 23:07:00 +00:00
trasz	ca92bb3067	Remove some NULL checks for M_WAITOK allocations. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2016-03-29 13:56:59 +00:00
mav	55b7402d49	Fix incorrect (fortunately bigger) malloc size. Submitted by: pfg MFC after: 1 week	2016-03-19 11:48:06 +00:00
glebius	306a6faf84	These files were getting sys/malloc.h and vm/uma.h with header pollution via sys/mbuf.h	2016-02-01 17:41:21 +00:00
mav	62e7403834	Improve locking of sg_threadcount. MFC after: 1 week	2015-11-19 08:04:05 +00:00
jpaetzel	bd730ad2f4	Increase group limit for kerberized NFSv4 PR: 202659 Submitted by: matthew.l.dailey@dartmouth.edu Reviewed by: rmacklem dfr MFC after: 1 week Sponsored by: iXsystems	2015-09-26 16:30:16 +00:00
delphij	277c92f612	Set curvnet context inside the RPC code in more places. Reviewed by: melifaro MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D3398	2015-08-18 18:12:46 +00:00
kib	2b8c79506d	Remove useless acquire semantic from the atomic_add operation before sosend(). The only release on the xp_snt_cnt is done after sosend(), with an intent to synchronize with load_acq in svc_vc_ack(). Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-07-28 06:58:10 +00:00
mav	979d590154	Remove hard limits on number of accepting NFS connections. Limits of 5 connections set long ago creates problems for SPEC benchmark. Make the NFS follow system-wide maximum. MFC after: 1 week	2015-04-07 10:25:27 +00:00
wollman	299467e040	Fix overflow bugs in and remove obsolete limit from kernel RPC implementation. The kernel RPC code, which is responsible for the low-level scheduling of incoming NFS requests, contains a throttling mechanism that prevents too much kernel memory from being tied up by NFS requests that are being serviced. When the throttle is engaged, the RPC layer stops servicing incoming NFS sockets, resulting ultimately in backpressure on the clients (if they're using TCP). However, this is a very heavy-handed mechanism as it prevents all clients from making any requests, regardless of how heavy or light they are. (Thus, when engaged, the throttle often prevents clients from even mounting the filesystem.) The throttle mechanism applies specifically to requests that have been received by the RPC layer (from a TCP or UDP socket) and are queued waiting to be serviced by one of the nfsd threads; it does not limit the amount of backlog in the socket buffers. The original implementation limited the total bytes of queued requests to the minimum of a quarter of (nmbclusters * MCLBYTES) and 45 MiB. The former limit seems reasonable, since requests queued in the socket buffers and replies being constructed to the requests in progress will all require some amount of network memory, but the 45 MiB limit is plainly ridiculous for modern memory sizes: when running 256 service threads on a busy server, 45 MiB would result in just a single maximum-sized NFS3PROC_WRITE queued per thread before throttling. Removing this limit exposed integer-overflow bugs in the original computation, and related bugs in the routines that actually account for the amount of traffic enqueued for service threads. The old implementation also attempted to reduce accounting overhead by batching updates until each queue is fully drained, but this is prone to livelock, resulting in repeated accumulate-throttle-drain cycles on a busy server. Various data types are changed to long or unsigned long; explicit 64-bit types are not used due to the unavailability of 64-bit atomics on many 32-bit platforms, but those platforms also cannot support nmbclusters large enough to cause overflow. This code (in a 10.1 kernel) is presently running on production NFS servers at CSAIL. Summary of this revision: * Removes 45 MiB limit on requests queued for nfsd service threads * Fixes integer-overflow and signedness bugs * Avoids unnecessary throttling by not deferring accounting for completed requests Differential Revision: https://reviews.freebsd.org/D2165 Reviewed by: rmacklem, mav MFC after: 30 days Relnotes: yes Sponsored by: MIT Computer Science & Artificial Intelligence Laboratory	2015-04-01 00:45:47 +00:00
pfg	a9fa41d648	rpc: Uninitialized pointer read Initialize *xprt to avoid exposing a random value in cleanup_svc_vc_create. This is the kernel counterpart of r278041. CID: 1007340	2015-02-02 16:07:07 +00:00
kib	d41bf48327	Add facility to stop all userspace processes. The supposed use of the feature is to quisce the system before suspend. Stop is implemented by reusing the thread_single(9) with the special mode SINGLE_ALLPROC. SINGLE_ALLPROC differs from the existing single-threading modes by allowing (requiring) caller to operate on other process. Interruptible sleeps for !TDF_SBDRY threads are suspended like SIGSTOP does it, instead of aborting the sleep, like SINGLE_NO_EXIT, to avoid spurious EINTRs on resume. Provide debugging sysctl debug.stop_all_proc, which causes total stop and suspends syncer, while waiting for variable reset for resume. It is used for debugging; should be removed after the real use of the interface is added. In collaboration with: pho Discussed with: avg Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-12-13 16:18:29 +00:00
kib	e83a8b4fa1	Current reaction of the nfsd worker threads to any signal is exit. This is not correct at least for the stop requests. Check for stop conditions and suspend threads if requested. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:33:18 +00:00
glebius	c0b38b545a	In preparation of merging projects/sendfile, transform bare access to sb_cc member of struct sockbuf to a couple of inline functions: sbavail() and sbused() Right now they are equal, but once notion of "not ready socket buffer data", will be checked in, they are going to be different. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-12 09:57:15 +00:00
rmacklem	8c4948bf02	Merge the NFSv4.1 server code in projects/nfsv4.1-server over into head. The code is not believed to have any effect on the semantics of non-NFSv4.1 server behaviour. It is a rather large merge, but I am hoping that there will not be any regressions for the NFS server. MFC after: 1 month	2014-07-01 20:47:16 +00:00
mav	9264ec94fe	Fix race in r267221. MFC after: 2 weeks	2014-06-09 15:00:43 +00:00
mav	18c81a620b	Split RPC pool threads into number of smaller semi-isolated groups. Old design with unified thread pool was good from the point of thread utilization. But single pool-wide mutex became huge congestion point for systems with many CPUs. To reduce the congestion create several thread groups within a pool (one group for every 6 CPUs and 12 threads), each group with own mutex. Each connection during its registration is assigned to one of the groups in round-robin fashion. File affinify code may still move requests between the groups, but otherwise groups are self-contained. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2014-06-08 11:19:32 +00:00
mav	c73f36ad0d	Remove st_idle variable, duplicating st_xprt. MFC after: 2 weeks	2014-06-08 10:18:22 +00:00
mav	aa54e738fe	Introduce new per-thread lock to protect the list of requests. This allows to slightly simplify svc_run_internal() code: if we processed all the requests in a queue, then we know that new one will not appear. MFC after: 2 weeks	2014-06-08 09:40:26 +00:00
brueffer	c006433c40	Properly free resources in case of error. CID: 1007032 Found with: Coverity Prevent(tm) MFC after: 2 weeks	2014-05-02 20:45:55 +00:00
mav	7cbc3205cc	Fix lock acquisition in case no request space available, missed in r260097. MFC after: 3 days	2014-02-04 00:00:01 +00:00
peter	3244f6064b	Don't expose svc_loss_reg / _unreg to userland as they're kernel-only additions from r260229 and the SVCPOOL type doesn't exist in userland.	2014-01-08 22:37:18 +00:00
mav	b421f931ee	Fix NULL dereference panic on UDP requests introduced in r260229.	2014-01-06 12:40:46 +00:00
mav	048eda83eb	Replace locks added in r260229 to protect sequence counters with atomics. New algorithm does not create additional lock congestion, while some races it includes should not be a problem. Those races may keep requests in DRC cache for some more time by returning ACK position smaller then actual, but it still should be able to drop thems when proper ACK finally read. Races of the original algorithm based on TCP seq number were worse because they happened when reply sequence number were recorded. After that even correctly read ACKs could not clean DRC sometimes.	2014-01-04 15:51:31 +00:00
mav	d75fe782e9	Rework NFS Duplicate Request Cache cleanup logic. - Introduce additional hash to group requests by hash of sockref. This allows to process TCP acknowledgements without looping though all the cache, and as result allows to do it every time. - Indroduce additional callbacks to notify application layer about sockets disconnection. Without this last few requests processed just before socket disconnection never processed their ACKs and stuck in cache for many hours. - Implement transport-specific method for tracking reply acknowledgements. New implementation does not cross multiple stack layers to get the data and does not have race conditions that previously made some requests stuck in cache. This could be done more efficiently at sockbuf layer, but that would broke some KBIs, while I don't know other consumers for it aside NFS. - Instead of traversing all DRC twice per request, run cleaning only once per request, and except in some conditions traverse only single hash slot at a time. Together this limits NFS DRC growth only to situations of real connectivity problems. If network is working well, and so all replies are acknowledged, cache remains almost empty even after hours of heavy load. Without this change on the same test cache was growing to many thousand requests even with perfectly working local network. As another result this reduces CPU time spent on the DRC handling during SPEC NFS benchmark from about 10% to 0.5%. Sponsored by: iXsystems, Inc.	2014-01-03 15:09:59 +00:00
mav	16b0bb1da4	Move most of NFS file handle affinity code out of the heavily congested global RPC thread pool lock and protect it with own set of locks. On synthetic benchmarks this improves peak NFS request rate by 40%.	2013-12-30 20:23:15 +00:00
mav	082b09553b	Introduce xprt_inactive_self() -- variant for use when sure that port is assigned to thread. For example, withing receive handlers. In that case the function reduces to single assignment and can avoid locking.	2013-12-29 11:19:09 +00:00
mav	6ab680bb65	In addition to r259632 completely block receive upcalls if we have more data than we need. This reduces lock pressure from xprt_active() side.	2013-12-29 03:43:25 +00:00
dim	b1d5d16e85	Move a static const variable to the #if 0 part where it is only used. (Note the #if 0 part has been inactive since the initial commit, r177633, so maybe it should be removed altogether). MFC after: 3 days	2013-12-24 20:57:26 +00:00
dim	7e48e15976	Remove some unused static const strings under sys/rpc, which have never been used since the initial commit (r177633). MFC after: 3 days	2013-12-24 20:55:22 +00:00
mav	dfcd443c18	Fix a bug introduced at r259632, triggering infinite loop in some cases.	2013-12-24 17:28:27 +00:00
glebius	303df467d6	Fix build.	2013-12-20 19:44:29 +00:00
mav	88262ed57f	Remove several linear list traversals per request from RPC server code. Do not insert active ports into pool->sp_active list if they are success- fully assigned to some thread. This makes that list include only ports that really require attention, and so traversal can be reduced to simple taking the first one. Remove idle thread from pool->sp_idlethreads list when assigning some work (port of requests) to it. That again makes possible to replace list traversals with simple taking the first element.	2013-12-20 17:39:07 +00:00
mav	17e956838c	Rework flow control for connection-oriented (TCP) RPC server. When processing receive buffer, write the amount of data, expected in present request record, into socket's so_rcv.sb_lowat to make stack aware about our needs. When processing following upcalls, ignore them until socket collect enough data to be read and processed in one turn. This change reduces number of context switches and other operations in RPC stack during large NFS writes (especially via non-Jumbo networks) by order of magnitude. After precessing current packet, take another look into the pending buffer to find out whether the next packet had been already received. If not, deactivate this port right there without making RPC code to push this port to another thread just to find that there is nothing. If the next packet is received partially, also deactivate the port, but also update socket's so_rcv.sb_lowat to not be woken up prematurely. This change additionally reduces number of context switches per NFS request about in half.	2013-12-19 21:31:28 +00:00
hrs	aee8502f1b	Replace Sun Industry Standards Source License for Sun RPC code with a 3-clause BSD license as specified by Oracle America, Inc. in 2010. This license change was approved by Wim Coekaerts, Senior Vice President, Linux and Virtualization at Oracle Corporation.	2013-11-25 19:08:38 +00:00

1 2 3 4

177 Commits