proper virtualisation, teardown, avoiding use-after-free, race conditions,
no longer creating a thread per VNET (which could easily be a couple of
thousand threads), gracefully ignoring global events (e.g., eventhandlers)
on teardown, clearing various globally cached pointers and checking
them before use.
Reviewed by: kp
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6924
use. Update comments regarding the spare fields in struct inpcb.
Bump __FreeBSD_version for the changes to the size of the structures.
Reviewed by: gnn@
Approved by: re@ (gjb@)
Sponsored by: Chelsio Communications
While reading the code, I noticed that shm_read() returns without unlocking
foffset and rangelock if mac_posixshm_check_read() rejects the read.
Reviewed by: kib, jhb, rwatson
Approved by: re (gjb)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6927
The GEOM disk d_mtx is only acquired on disk creation and destruction.
It is a good candidate for replacement with a pool mutex. This eliminates
the mutex initialization and teardown and the mutex and name variables
themselves from struct disk.
sys/geom/geom_disk.h:
Take d_mtx and d_mtx_name out of struct disk.
sys/geom/geom_disk.c:
Use mtx_pool_lock() and mtx_pool_unlock() to guard the disk
initialization state instead of a dedicated mutex.
This allows removing the initialization and destruction of
d_mtx.
sys/sys/param.h:
Bump __FreeBSD_version to 1100119 for the change to struct disk.
Suggested by: jhb
Sponsored by: Spectra Logic
Approved by: re (gjb)
this code in the userland stack, it could result in a loop. This happened on iOS.
However, I was not able to reproduce this when using the code in the kernel.
Thanks to Eugen-Andrei Gavriloaie for reporting the issue and proving detailed
information to find the root of the problem.
Approved by: re (gjb)
MFC after: 1 week
If the allocation attempt fails, we may otherwise VM_WAIT after a failed
attempt to reclaim contiguous memory in the requested range. After r297466,
this results in the thread going to sleep, causing a hang during boot.
Reviewed by: jkim, kib
Approved by: re (gjb)
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D6945
They are currently not supported by makefs(1).
PR: 194703
Reviewed by: brooks
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6925
A larger EFI file system size will facilitate multi-boot configurations
and the installation other EFI applications like firmware update tools.
200MB matches OS X.
Note that this changes only the partition size, not the file system that
bsdinstall places there. We need to do both, but as the partition size
is difficult to adjust later make this change for now so that at least
systems installed with FreeBSD 11.0 have a partition layout with room
to grow.
Reviewed by: allanjude, imp
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6935
cddl/lib/libavl/Makefile
cddl/lib/libctf/Makefile
cddl/lib/libnvpair/Makefile
cddl/lib/libumem/Makefile
cddl/lib/libuutil/Makefile
Increase WARNS to the highest working level for each of these
libraries
Approved by: re (gjb, hrs)
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
waiters exist, same as for vm_page_xunbusy(). If previous value of
busy_lock was VPB_SINGLE_EXCLUSIVER, no waiters existed and wakeup is
not needed.
Move common code from vm_page_xunbusy_maybelocked() and
vm_page_xunbusy_hard() to vm_page_xunbusy_locked().
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Approved by: re (gjb)
getzfsvfs() called vfs_busy() in the waiting mode while having a hold on
a pool (via a call to dmu_objset_hold). In other words,
dp_config_rwlock was held in the shared mode while a thread could be
sleeping in vfs_busy().
The pool's txg sync thread needs to take dp_config_rwlock in the
exclusive mode for some actions, e.g., for executing sync tasks. If the
sync thread gets blocked, then any thread waiting for its sync task to
get executed is also blocked. Which, in turn, could mean that
vfs_busy() will keep waiting indefinitely.
The solution is to use vfs_ref() in the locked section and to call
vfs_busy() only after dropping other locks.
Note that a reference on a struct mount object does not prevent an
associated zfsvfs_t object from being destroyed. So, we have to be
careful to operate only on the struct mount object until we successfully
vfs_busy it.
Approved by: re (gjb)
MFC after: 2 weeks
vcxgbe/vcxl interfaces and retire the 'n' interfaces. The main
cxgbe/cxl interfaces and tunables related to them are not affected by
any of this and will continue to operate as usual.
The driver used to create an additional 'n' interface for every
cxgbe/cxl interface if "device netmap" was in the kernel. The 'n'
interface shared the wire with the main interface but was otherwise
autonomous (with its own MAC address, etc.). It did not have normal
tx/rx but had a specialized netmap-only data path. r291665 added
another set of virtual interfaces (the 'v' interfaces) to the driver.
These had normal tx/rx but no netmap support.
This revision consolidates the features of both the interfaces into the
'v' interface which now has a normal data path, TOE support, and native
netmap support. The 'v' interfaces need to be created explicitly with
the hw.cxgbe.num_vis tunable. This means "device netmap" will not
result in the automatic creation of any virtual interfaces.
The following tunables can be used to override the default number of
queues allocated for each 'v' interface. nofld* = 0 will disable TOE on
the virtual interface and nnm* = 0 to will disable native netmap
support.
# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi
# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi
# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi
hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.
--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf. num_vis > 2 is ok too, you'll
end up with multiple autonomous netmap-capable interfaces for every
port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has
netmap queues.
4) Use any of the 'v' interfaces for netmap. pkt-gen -i vcxl<n>... .
One major improvement is that the netmap interface has a normal data
path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only. No
need to bring them up. The vcxl interfaces are completely independent
and everything should just work.
---------------------
Approved by: re@ (gjb@)
Relnotes: Yes
Sponsored by: Chelsio Communications
This patch addes missing implementation of BHND_BUS_RESET_CORE function for BCMA.
The reset procedure is very simple: enable reset mode, stop clocking,
enable clocking & force clock gating, disable reset mode, stop clock gating.
Tested:
* (michael) Tested on ASUS RT-N53 for enabling/reset USB core
Submitted by: Michael Zhilin <mizhka@gmail.com>
Approved by: re (gjb)
We also need to consider the size of large firmware commands in iwm_alloc_tx_ring(),
in the dma tag creation, when qid == IWM_MVM_CMD_QUEUE. The old code apparently
only allocated a 2KB (MCLBYTES) sized buffer when it actually expected 4KB.
Submitted by: Imre Vadasz <imre@vdsz.com>
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D6824
(Together with other iwm(4) memory leak fixes) Memory leakage in M_DEVBUF
is now at ca. 2KB for each iwm(4) module load/unload cycle.
Submitted by: Imre Vadasz <imre@vdsz.com>
Approved by: re (gjb)
Obtained from: DragonflyBSD git eaf551a1d464c643e98ce5781971dd32124e9af1
Differential Revision: https://reviews.freebsd.org/D6819
* When bus_dmamem_alloc is used, the bus_dmamap_t is usually set to NULL, so
we were never actually freeing any dma memory allocations done via
iwm_dma_contig_alloc(). So we should check dma->vaddr instead of dma->map here.
* Also, the dmamap is actually supposed to be invalidated as part of
bus_dmamem_free(), so bus_dmamap_destroy() is never needed here.
Submitted by: Imre Vadasz <imre@vdsz.com>
Approved by: re (gjb)
Obtained from: DragonflyBSD git ef2b29a7ba6ca8a9d2c82ab591c0622227ff84cb
ic_macaddr is only used for the initial mac address provided by NVM. We should
rather use vap->iv_myaddr when vap != NULL, to allow the MAC address
to be changed later with ifconfig(8).
Submitted by: Imre Vadasz <imre@vdsz.com>
Reviewed by: avos
Approved by: re (gjb)
Obtained from: DragonflyBSD git 4aee7a78275676d22d14c04177bd0c9377d91478
Differential Revision: https://reviews.freebsd.org/D6743
I keep asking myself "what do these fields mean" and so now I've clarified
it for myself.
Tested:
* Reading the comments, going "a-ha!" a couple times.
Approved by: re (gjb)
That way timers can finish cleanly and we do not gamble with a DELAY().
Reviewed by: gnn, jtl
Approved by: re (gjb)
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6923
Timewait code does a proper cleanup after itself.
Reviewed by: gnn
Approved by: re (gjb)
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6922
Mark the pipe() system call as COMPAT10.
As of r302092 libc uses pipe2() with a zero flags value instead of pipe().
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D6816
As of r302092 libc uses pipe2() with a zero flags value instead of pipe().
Commit with regenerated files and implementation to follow.
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D6816
value.
This eliminates the need for machine dependant assembly wrappers for
pipe(2).
It also make passing an invalid address to pipe(2) return EFAULT rather
than triggering a segfault. Document this behavior (which was already
true for pipe2(2), but undocumented).
Reviewed by: andrew
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D6815
This will result in lock recursion and is more generally incorrect since
the completion handlers will just reinsert the BIOs into the queue we're
trying to drain.
Reviewed by: imp, ngie
Approved by: re (gjb)
MFC after: 3 weeks
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D6908
There is an order between covered vnode lock and allproc_lock, which
is established by calling mountcheckdirs() while owning the covered
vnode lock. mountcheckdirs() iterates over the processes, protected by
allproc_lock. This order is needed and seems to be not avoidable.
On the other hand, various VM daemons also need to iterate over all
processes, and they lock and unlock user maps. Since unlock of the
user map may trigger processing of the deferred map entries, it causes
vnode locking to occur. Or, when vmspace is freed, dropping references
on the vnode-backed object also lock vnodes. We get reverted order
comparing with the mount/unmount order.
For VM daemons, there is no need to own allproc_lock while we operate
on vmspaces. If the process is held, it serves as the marker for
allproc list, which allows to continue the iteration.
Add _PHOLD_LITE() macro, similar to _PHOLD(), but not causing swap-in
of the kernel stacks. It is used instead of _PHOLD() in vm code,
since e.g. calling faultin() in OOM conditions only exaggerates the
problem.
Modernize comment describing PHOLD.
Reported by: lists@yamagi.org
Tested by: pho (previous version)
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 week
Approved by: re (gjb)
Differential revision: https://reviews.freebsd.org/D6679
after the underlying device went away.
The problem was that callers who queue the GEOM resize provider
event didn't check to make sure that the provider had not been
withered. For the other equivalent case, g_new_provider_event(),
the code checks to see whether the provider has been withered
before queueing a g_new_provider_event() to the event thread.
In some cases, a resize provider event would come through after
the provider had been withered and all of the existing consumers
had been orphaned. When the resize event triggered a taste of
the provider, that would attach a new consumer to the now
withered provider. The wither washer (g_wither_washer() would
never be able to completely tear down the GEOM because of the
consumers that were hanging around.
The solution was to check the G_PF_WITHER provider flag before
queueing the g_resize_provider_event(), and add an assert to
g_resize_provider_event() to insure that it isn't called on a
withered provider.
sys/geom/geom_subr.c:
In g_resize_provider(), don't try to continue if the
G_PF_WITHER flag is set.
In g_resize_provider_event(), add an assert that the
G_PF_WITHER flag is not set.
In g_access(), if a provider has an error, print out the
name of the provider with the error.
Sponsored by: Spectra Logic
Approved by: re (marius)
MFC after: 3 days
some fields to match the order in the struct. Especially needed
if_pf_kif to do pf(4) VNET debugging.
Approved by: re (marius)
Obtained from: projects/vnet
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
uncategorised reason. We need to read the fault address register before
enabling interrupts as the interrupt handler may cause this register to
change.
Approved by: re (marius, kib)
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
without VIMAGE support would dereference a NULL point unconditionally
leading to a panic. Wrap the entire VIMAGE related code with #ifdefs
rather than just the decision making part to save an extra bit of
resources.
Reported by: np
Sponsored by: The FreeBSD Foundation
MFC After: 13 days
Approved by: re (marius)
libusb_hotplug_deregister_callback() for the LibUSB v1.0 API and
update the libusb(3) manual page.
Approved by: re (kib)
Requested by: swills
MFC after: 1 week
version of the XHCI specification. Make sure the code can handle the
maximum number of allowed scratch pages.
Submitted by: Shichun_Ma@Dell.com
Approved by: re (hrs)
MFC after: 1 week
Update libarchive to 3.2.1 (bugfix and security fix release)
List of vendor fixes:
- fix exploitable heap overflow vulnerability in Rar decompression
(vendor issue 719, CVE-2016-4302, TALOS-2016-0154)
- fix exploitable stack based buffer overflow vulnebarility in mtree
parse_device functionality (vendor PR 715, CVE-2016-4301, TALOS-2016-0153)
- fix exploitable heap overflow vulnerability in 7-zip read_SubStreamsInfo
(vendor issue 718, CVE-2016-4300, TALOS-2016-152)
- fix integer overflow when computing location of volume descriptor
(vendor issue 717)
- fix buffer overflow when reading a crafred rar archive (vendor issue 521)
- fix possible buffer overflow when reading ISO9660 archives on machines
where sizeof(int) < sizeof(size_t) (vendor issue 711)
- tar and cpio should fail if an input file named on the command line is
missing (vendor issue 708)
- fix incorrect writing of gnutar filenames that are exactly 512 bytes
long (vendor issue 682)
- allow tests to be run from paths that are equal or longer than 128
characters (vendor issue 657)
- add memory allocation errors in archive_entry_xattr.c (vendor PR 603)
- remove dead code in archive_entry_xattr_add_entry() (vendor PR 716)
- fix broken decryption of ZIP files (vendor issue 553)
- manpage style, typo and description fixes
Post-3.2.1 vendor fixes:
- fix typo in cpio version reporting (Vendor PR 725, 726)
- fix argument range of ctype functions in libarchive_fe/passphrase.c
- fix ctype use and avoid empty loop bodies in WARC reader
MFC after: 1 week
Security: CVE-2016-4300, CVE-2016-4301, CVE-2016-4302
Approved by: re (kib)
File and disk-backed I/O requests store counts of read/written disk
blocks in each AIO job so that they can be charged to the thread that
completes an AIO request via aio_return() or aio_waitcomplete(). This
change extends AIO jobs to store counts of received/sent messages and
updates socket backends to set these counts accordingly. Note that
the socket backends are careful to only charge a single messages for
each AIO request even though a single request on a blocking socket might
invoke sosend or soreceive multiple times. This is to mimic the
resource accounting of synchronous read/write.
Adjust the UNIX socketpair AIO test to verify that the message resource
usage counts update accordingly for aio_read and aio_write.
Approved by: re (hrs)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D6911