freebsd-skq

Author	SHA1	Message	Date
jtl	be1c33dfcc	The TCPPCAP debugging feature caches recently-used mbufs for use in debugging TCP connections. This commit provides a mechanism to free those mbufs when the system is under memory pressure. Because this will result in lost debugging information, the behavior is controllable by a sysctl. The default setting is to free the mbufs. Reviewed by: gnn Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D6931 Input from: novice_techie.com	2016-07-06 16:17:13 +00:00
nwhitehorn	89d01c24d1	Replace a number of conflations of mp_ncpus and mp_maxid with either mp_maxid or CPU_FOREACH() as appropriate. This fixes a number of places in the kernel that assumed CPU IDs are dense in [0, mp_ncpus) and would try, for example, to run tasks on CPUs that did not exist or to allocate too few buffers on systems with sparse CPU IDs in which there are holes in the range and mp_maxid > mp_ncpus. Such circumstances generally occur on systems with SMT, but on which SMT is disabled. This patch restores system operation at least on POWER8 systems configured in this way. There are a number of other places in the kernel with potential problems in these situations, but where sparse CPU IDs are not currently known to occur, mostly in the ARM machine-dependent code. These will be fixed in a follow-up commit after the stable/11 branch. PR: kern/210106 Reviewed by: jhb Approved by: re (glebius)	2016-07-06 14:09:49 +00:00
hselasky	6716d2f9c5	Fix regression issue with XHCI on 32-bit ARMv7 Armada-38x. Make sure "struct xhci_dev_ctx_addr" fits into a single 4K page until further. Approved by: re (hrs) MFC after: 1 week	2016-07-06 10:57:04 +00:00
bz	a1ab1fdac6	Only set the ipfilter running state to 'not running' if we are doing the teardown. ipf_destroy_all() may free ipfmain in case of ipf_dynamic_softc being true, thus we are avoiding a possible memory modified after free as well. Reported by: Coverity Coverity CID: 1357320 Approved by: re (hrs) MFC after: 10 days	2016-07-06 10:29:29 +00:00
cem	5cdf98a966	ioat(4): Block asynchronous work during HW reset Fix the race between ioat_reset_hw and ioat_process_events. HW reset isn't protected by a lock because it can sleep for a long time (40.1 ms). This resulted in a race where we would process bogus parts of the descriptor ring as if it had completed. This looked like duplicate completions on old events, if your ring had looped at least once. Block callout and interrupt work while reset runs so the completion end of things does not observe indeterminate state and process invalid parts of the ring. Start the channel with a manually implemented ioat_null() to keep other submitters quiesced while we wait for the channel to start (100 us). r295605 may have made the race between ioat_reset_hw and ioat_process_events wider, but I believe it already existed before that revision. ioat_process_events can be invoked by two asynchronous sources: callout (softclock) and device interrupt. Those could race each other, to the same effect. Reviewed by: markj Approved by: re Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7097	2016-07-05 20:53:32 +00:00
cem	74e2a82c2d	ioat(4): Serialize ioat_reset_hw invocations Reviewed by: markj Approved by: re Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7097	2016-07-05 20:52:35 +00:00
cem	daf01694f3	ioat(4): Split timer into poll and shrink functions Poll should happen quickly, while shrink should happen infrequently. Protect is_completion_pending with submit_lock. Reviewed by: markj Approved by: re Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7097	2016-07-05 20:51:52 +00:00
glebius	2b2f782761	The paradigm of a callout is that it has three consequent states: not scheduled -> scheduled -> running -> not scheduled. The API and the manual page assume that, some comments in the code assume that, and looks like some contributors to the code also did. The problem is that this paradigm isn't true. A callout can be scheduled and running at the same time, which makes API description ambigouous. In such case callout_stop() family of functions/macros should return 1 and 0 at the same time, since it successfully unscheduled future callout but the current one is running. Before this change we returned 1 in such a case, with an exception that if running callout was migrating we returned 0, unless CS_MIGRBLOCK was specified. With this change, we now return 0 in case if future callout was unscheduled, but another one is still in action, indicating to API users that resources are not yet safe to be freed. However, the sleepqueue code relies on getting 1 return code in that case, and there already was CS_MIGRBLOCK flag, that covered one of the edge cases. In the new return path we will also use this flag, to keep sleepqueue safe. Since the flag CS_MIGRBLOCK doesn't block migration and now isn't limited to migration edge case, rename it to CS_EXECUTING. This change fixes panics on a high loaded TCP server. Reviewed by: jch, hselasky, rrs, kib Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D7042	2016-07-05 18:47:17 +00:00
glebius	0417c9be8f	Compile in the kassert_panic() function with INVARIANT_SUPPORT option, not INVARIANTS. The function is required if we want to load in a module that is compiled with INVARIANTS. Reviewed by: jhb Approved by: re (gjb)	2016-07-05 18:34:34 +00:00
markj	d6ae30f932	Ensure that spinlock sections are balanced even after a panic. vpanic() uses spinlock_enter() to disable interrupts before dumping core. However, when the scheduler is stopped and INVARIANTS is not configured, thread_lock() does not acquire a spinlock section, while thread_unlock() releases one. This can result in interrupts staying enabled while the kernel dumps core, complicating post-mortem analysis of the crash. Approved by: re (gjb) MFC after: 1 week Sponsored by: EMC / Isilon Storage Division	2016-07-05 17:59:04 +00:00
rwatson	0696f7cf29	Call audit hooks to capture vnode attributes for three file-descriptor method implementations: fstat(2), close(2), and poll(2). This change synchronises auditing here with similar auditing for VFS-specific system calls such as stat(2) that audit more complete vnode information. Sponsored by: DARPA, AFRL Approved by: re (kib) MFC after: 1 week	2016-07-05 16:37:01 +00:00
emaste	c769866ed1	add description for debug.elf{32,64}_legacy_coredump sysctl Approved by: re (kib) MFC after: 1 week Sponsored by: The FreeBSD Foundation	2016-07-05 14:46:06 +00:00
kib	b028e9a2d9	Clarify the vnode_destroy_vobject() logic handling for already terminated objects. Assert that there is no new waiters for the already terminated objects. Old waiters should have been notified by the termination calling vnode_pager_dealloc() (old/new are with regard of the lock acquisition interval). Only clear the vp->v_object for the case of already terminated object, since other branches call vnode_pager_dealloc(), which should clear the pointer. Assert this. Tested by: pho Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)	2016-07-05 11:21:02 +00:00
jhibbits	d88fefb346	Remove SoC-specific integrations from dTSEC, to make it SoC agnostic. This will allow a single kernel to run on all SoCs supported by the dTSEC driver. Approved by: re@(gjb)	2016-07-05 06:16:42 +00:00
jhibbits	3720bcbd30	Unbreak the LBC driver, broken with the large RMan and 36-bit physical address changes. Remove the use of fdt_data_to_res(), and instead construct the resources manually. Additionally, avoid the 32-bit size limitation of fdt_data_get(), by building physical addresses manually from the lbc ranges property. Approved by: re@(gjb)	2016-07-05 06:14:23 +00:00
np	58667ff11c	cxgbe(4): Changes to the CPL-handler registration mechanism and code related to "shared" CPLs. a) Combine t4_set_tcb_field and t4_set_tcb_field_rpl into a single function. Allow callers to direct the response to any iq. Tidy up set_ulp_mode_iscsi while there to use names from t4_tcb.h instead of magic constants. b) Remove all CPL handler tables from struct adapter. This reduces its size by around 2KB. All handlers are now registered at MOD_LOAD instead of attach or some kind of initialization/activation. The registration functions do not need an adapter parameter any more. c) Add per-iq handlers to deal with CPLs whose destination cannot be determined solely from the opcode. There are 2 such CPLs in use right now: SET_TCB_RPL and L2T_WRITE_RPL. The base driver continues to send filter and L2T_WRITEs over the mgmtq and solicits the reply on fwq. t4_tom (including the DDP code) now uses the port's ctrlq to send L2T_WRITEs and SET_TCB_FIELDs and solicits the reply on an ofld_rxq. fwq and ofld_rxq have different handlers that know what kind of tid to expect in the reply. Update t4_write_l2e and callers to to support any wrq/iq combination. Approved by: re@ (kib@) Sponsored by: Chelsio Communications	2016-07-05 01:29:24 +00:00
truckman	9e6e43be95	Fix a race condition between the main thread in aqm_pie_cleanup() and the callout thread that can cause a kernel panic. Always do the final cleanup in the callout thread by passing a separate callout function for that task to callout_reset_sbt(). Protect the ref_count decrement in the callout with DN_BH_WLOCK(). All other ref_count manipulation is protected with this lock. There is still a tiny window between ref_count reaching zero and the end of the callout function where it is unsafe to unload the module. Fixing this would require the use of callout_drain(), but this can't be done because dummynet holds a mutex and callout_drain() might sleep. Remove the callout_pending(), callout_active(), and callout_deactivate() calls from calculate_drop_prob(). They are not needed because this callout uses callout_init_mtx(). Submitted by: Rasool Al-Saadi <ralsaadi@swin.edu.au> Approved by: re (gjb) MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D6928	2016-07-05 00:53:01 +00:00
hselasky	1649e59d99	Fix interrupt loop when switching from USB device to USB host mode by clearing all endpoint interrupt bits. PR: 210736 Approved by: re (glebius) MFC after: 1 week	2016-07-04 17:12:22 +00:00
emaste	bca2b7caa7	boot1.efi: fix assignment / comparison expression PR: 210706 Submitted by: David Binderman <dcb314@hotmail.com> Approved by: re (kib) MFC after: 1 week	2016-07-04 16:50:21 +00:00
kib	2ff8e9c636	Provide helper macros to detect 'non-silent SBDRY' state and to calculate appropriate return value for stops. Simplify the code by using them. Fix typo in sig_suspend_threads(). The thread which sleep must be aborted is td2. () In issignal(), when handling stopping signal for thread in TD_SBDRY_INTR state, do not stop, this is wrong and fires assert. This is yet another place where execution should be forced out of SBDRY-protected region. For such case, return -1 from issignal() and translate it to corresponding error code in sleepq_catch_signals(). Assert that other consumers of cursig() are not affected by the new return value. () Micro-optimize, mostly VFS and VOP methods, by avoiding calling the functions when SIGDEFERSTOP_NOP non-change is requested. (*) Reported and tested by: pho () Requested by: bde (**) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)	2016-07-03 18:19:48 +00:00
kib	c31bb3499e	Remove racy assert. The thread which changes vnode usecount from 0 to 1 does it under the vnode interlock, but the interlock is not owned by the asserting thread. As result, we might read increased use counter but also still see VI_OWEINACT. In collaboration with: nwhitehorn Hardware donated by: IBM LTC Sponsored by: The FreeBSD Foundation (kib) Approved by: re (gjb)	2016-07-03 01:56:48 +00:00
kib	fb6d455926	Change type of the 'dead' variable to boolean. Requested by: alc MFC after: 1 week Approved by: re (gjb)	2016-07-03 00:08:17 +00:00
np	d08d8c2ff3	cxgbe(4): Avoid a NULL dereference while dumping the L2 table. Entries used by switching filters that rewrite L2 information do not have any associated ifnet. Approved by: re@ (gjb@) Sponsored by: Chelsio Communications	2016-07-01 23:18:49 +00:00
nwhitehorn	1a2acce994	Clean up some FDT-related code in the PowerPC bootloader, improving error checking and robustness. Prevents errors and crashes in FDT commands on PowerMac G5 systems. Approved by: re (gjb)	2016-07-01 21:09:30 +00:00
kib	38d067a317	When a process knote was attached to the process which is already exiting, the knote is activated immediately. If the exit1() later activates knotes, such knote is attempted to be activated second time. Detect the condition by zeroed kn_ptr.p_proc pointer, and avoid excessive activation. Before r302235, such knotes were removed from the knlist immediately upon activation. Reported by: truckman Sponsored by: The FreeBSD Foundation Approved by: re (gjb)	2016-07-01 20:11:28 +00:00
adrian	d6284376d0	[net80211] teach AMRR to log the initial MCS rate as "MCS X" Otheriwse it logs it as the rate value, which is 0x80 (MCS flag) + MCS, which isn't that helpful. Approved by: re (gjb)	2016-07-01 19:58:13 +00:00
hselasky	e531a038fb	Fix detection of USB device disconnects in USB host mode when the USB device is connected directly to the USB port of the DWC OTG, in this case a RPI-zero. PR: 210695 Approved by: re (gjb) MFC after: 1 week	2016-07-01 07:27:33 +00:00
gjb	cce5cf3ef4	Update 11.0 to ALPHA6. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation	2016-07-01 00:00:35 +00:00
bz	2af4ea8fc2	In case of the global eventhandler make sure the current VNET is still operational before doing any work; otherwise we might run into, e.g., destroyed locks. PR: 210724 Reported by: olevole olevole.ru Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Obtained from: projects/vnet Approved by: re (gjb)	2016-06-30 19:32:45 +00:00
bz	0c1171f994	Virtualise ipfilter. Split initializzation an teardown into module (global state) and VNET (per virtual network stack) parts. Virtualise global state, which is not "const". Cleanup eventhandlers, so that we can make use of the passed in argument to get the vnet state from the ifp; disable the "cloner" event as it is too early, has no state, and can fire before initialisation (see comment in the source). Handle the dynamic sysctls specially. The problem is that "ipmain" is the virtualized struct, but the fields used for the sysctls are hanging off memory allocated and attached to the virtualized "ipmain" thus standard VNET macros and sysctl handling do not work. We still say it is VNET sysctls to get the proper protection checks in the VIMAGE case; to solve the problem of accessing the right bit of memory hanging of each per-VNET ipmain, we use a dedicated handler function wrapping around sysctl_ipf_int() undoing the base calculation from kern_sysctl.c and then adding the passed-in offset into the right struct depending on handler. A bit of a mess exposing VNET-internals this way but the only way to keep the code without having to massively restructure ipf internals. Approved by: re (hrs) Sponsored by: The FreeBSD Foundation Obtained from: projects/vnet MFC after: 2 weeks Reviewed by: cy Differential Revision: https://reviews.freebsd.org/D7000	2016-06-30 15:01:07 +00:00
mav	b281d573a9	Revert r299454 and r299448. Those changes were found confusing FreeBSD libc ACL code, that doesn't differentiate ACL for directories and files, and report ACLs for all directories created after those patches as non-trivial. On the other side these changes were considered wrong from POSIX and NFSv4 points of view. Until further investigation done upstream, revert those changes locally in preparation for FreeBSD 11.0 release. Approved by: re (hrs)	2016-06-30 14:55:49 +00:00
tuexen	557adfd043	This patch fixes two bugs related to the setting of the I-Bit for SCTP DATA and I-DATA chunks. * For fragmented user messages, set the I-Bit only on the last fragment. * When using explicit EOR mode, set the I-Bit on the last fragment, whenever SCTP_SACK_IMMEDIATELY was set in snd_flags for any of the send() calls. Approved by: re (hrs) MFC after: 1 week	2016-06-30 06:06:35 +00:00
wma	eaf77c9cc6	ARM, ARM64: Workaround for buf_ring reordering This patch offers a workaround to buf_ring reordering visible on armv7 and armv8. This is supposed to be removed once new buf_ring implementation is integrated into the tree. Obtained from: Semihalf Reviewed by: alc,emaste Differential Revision: https://reviews.freebsd.org/D6986 Approved by: re (gjb)	2016-06-30 05:18:37 +00:00
wma	2da75cc60b	ARM64: fix DMAP calculation Use arithmetic operators instead of logical. This fixes DMAP ranges calculation for ThunderX Dual Socket. Obtained from: Semihalf Sponsored by: Cavium Reviewed by: zbb Differential Revision: https://reviews.freebsd.org/D7023 Approved by: re (gjb)	2016-06-30 04:58:19 +00:00
bz	c79242bce1	Move the ipfw_log_bpf() calls from global module initialisation to per-VNET initialisation and virtualise the interface cloning to allow a dedicated ipfw log interface per VNET. Approved by: re (gjb) MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2016-06-30 01:33:14 +00:00
bz	47f08657c2	Remove unused global variables as well as unused memory allocations from ipfilter in preparation for VNET support. Suggested by: cy (see D7000) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Approved by: re (gjb)	2016-06-30 01:32:12 +00:00
gonzo	62a7c9548d	Fixed FreeBSD/mips MALTA support for QEMU Recource management functions in GT PCI controller driver treated memory/IO resources as KSEG1 addresses, later during activation these values would be increased by KSEG1 base again rendering the address invalid and causing "bus error" trap. Actual logic was converted to use real physical addresses, so mapping takes place only during activation. Submitted by: Aleksandr Rybalko <ray@FreeBSD.org> Approved by: re (gjb)	2016-06-29 23:33:44 +00:00
bdrewery	3ef66a1afe	WITH_META_MODE: Avoid false-positive error due to missing .meta with build commands. Sponsored by: EMC / Isilon Storage Division Approved by: re (blanket, META_MODE)	2016-06-29 22:39:22 +00:00
sobomax	fa8fbeaaf4	1.Improve handling around last compressed block of the file, which is necessary because CLOOP format lacks explicit EOF or length, so that in the presence of padding or when the CLOOP is put onto a larger partition upper level provider size may be larger. Bound amount of extra data that we might touch to the max length of the compressed block and detect zero-padding in the last cluster, which when sector is all-zero might cause us to emit bogus I/O error after decompression of that fails. To not make code any more complicated that it needs to be deal with it in lazy-manner, i.e. when we first access that specific cluster. This change also fixes stupid mistake in the LZMA code, inherited from geom_lzma, which does not share length of the output buffer buffer with the decompression routine, so that in the presence of corrupted or purposedly tailored data may easily cause heap overflow and kernel memory corruption. Beef up validation of the CLOOP TOC by checking that lengths of all but the last compressed clusters match upper limit set by the decompressor and improve some error diagnostic output while I am here. 2.Add kern.geom.uzip.attach_to tunable to artifically limit attaching uzip to certain devices in the dev tree only. For example the following only makes us attaching to the GPT labels: kern.geom.uzip.attach_to="gpt/" 3.Add kern.geom.uzip.noattach_to, which does opposite to the (2) above, i.e. prevents geom_uzip from tasting / attaching to providers matching some pattern. By default we don't attach to our own kind, i.e. kern.geom.uzip.noattach_to=".uzip". It saves us quite some CPU cycles, esp on low-end embedded systems. Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D7013	2016-06-29 18:19:05 +00:00
avos	19e196315e	net80211: fix LOR/deadlock in ieee80211_ff_node_cleanup(). Add new lock for stageq (part of ieee80211_superg structure) and ni_tx_superg (part of ieee80211_node structure); drop com_lock protection where it is used to protect them. While here, drop duplicate OPACKETS counter incrementation. ni_tx_ampdu is not protected with it (however, it is also used without locking in other places; probably, it requires some other solution to be thread-safe). Tested with RTL8188CUS (AP) and RTL8188EU (STA). NOTE: Since this change breaks KBI, all wireless drivers need to be recompiled. Reviewed by: adrian Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D6958	2016-06-29 17:25:46 +00:00
sbruno	7fe30bec23	Correct PERSISTENT RESERVE OUT command and populate scsi_cmd->length. PR: 202625 Submitted by: niakrisn@gmail.com Reviewed by: scottl kenm Approved by: re (gjb) MFC after: 2 weeks	2016-06-29 16:41:37 +00:00
nwhitehorn	032d51d92b	Fix fat-fingering: #if AIM should have been #ifdef AIM to avoid failures on Book-E kernels. Approved by: re (gjb) Pointy hat to: nwhitehorn	2016-06-29 16:34:56 +00:00
nwhitehorn	74554ccb4a	Do not rely on firmware having pre-enabled the MMU in a reasonable way for late boot: enable it explicitly after installing the page tables. If booting from an FDT, also make sure to escape the firmware's MMU context early before overwriting firmware page tables. Approved by: re (gjb)	2016-06-29 14:40:43 +00:00
smh	b24634094a	Allow ZFS ARC min / max to be tuned at runtime Prior to this change ZFS ARC min / max could only be changed using boot time tunables, this allows the values to be tuned at runtime using the sysctls: * vfs.zfs.arc_max * vfs.zfs.arc_min When adjusting ZFS ARC minimum the memory used will only reduce to the new minimum given memory pressure. Reviewed by: allanjude Approved by: re (gjb) MFC after: 2 weeks Relnotes: yes Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D5907	2016-06-29 07:55:45 +00:00
np	ecbf2c3532	cxgbe(4): Do not bring up an interface when IFCAP_TOE is enabled on it. The interface's queues are functional after VI_INIT_DONE (which is short of interface-up) and that's all that's needed for t4_tom to communicate with the chip. Approved by: re@ (gjb@) Sponsored by: Chelsio Communications	2016-06-29 06:55:30 +00:00
cem	eaa90873e9	USB: Add Garmin FR230 device quirk (broken INQUIRY) PR: 210544 Reviewed by: hps Approved by: re	2016-06-29 06:42:20 +00:00
bz	2acea814a2	Several device drivers call if_alloc() and then do further checks and will cal if_free() in case of conflict, error, .. if_free() however sets the VNET instance from the ifp->if_vnet which was not yet initialized but would only in if_attach(). Fix this by setting the curvnet from where we allocate the interface in if_alloc(). if_attach() will later overwrite this as needed. We do not set the home_vnet early on as we only want to prevent the if_free() panic but not change any of the other housekeeping, e.g., triggered through ifioctl()s. Reviewed by: brooks Approved by: re (gjb) MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D7010	2016-06-29 05:21:25 +00:00
sbruno	c6d7fbc03e	Revert svn r302253 at the request/review of Ken M. This commit is incorrect. PR: 202625 Approved by: re (implicit)	2016-06-28 18:32:15 +00:00
sbruno	90baee121f	Correct PERSISTENT RESERVE OUT command and populate scsi_cmd->length. PR: 202625 Submitted by: niakrisn@gmail.com Reviewed by: scottl Approved by: re (hrs) MFC after: 2 weeks	2016-06-28 18:08:47 +00:00
kib	42d92058c3	Currently the ntptime code and resettodr() are Giant-locked. In particular, the Giant is supposed to protect against parallel ntp_adjtime(2) invocations. But, for instance, sys_ntp_adjtime() does copyout(9) under Giant and then examines time_status to return syscall result. Since copyout(9) could sleep, the syscall result might be inconsistent. Another and more important issue is that if PPS is configured, hardpps(9) is executed without any protection against the parallel top-level code invocation. Potentially, this may result in the inconsistent state of the ntptime state variables, but I cannot say how serious such distortion is. The non-functional splclock() call in sys_ntp_adjtime() protected against clock interrupts calling hardpps() in the pre-SMP era. Modernize the locking. A mutex protects ntptime data. Due to the hardpps() KPI legitimately serving from the interrupt filters (and e.g. uart(4) does call it from filter), the lock cannot be sleepable mutex if PPS_SYNC is defined. Otherwise, use normal sleepable mutex to reduce interrupt latency. Reviewed by: imp, jhb Sponsored by: The FreeBSD Foundation Approved by: re (gjb) Differential revision: https://reviews.freebsd.org/D6825	2016-06-28 16:43:23 +00:00

1 2 3 4 5 ...

115476 Commits