m_dup_pkthdr(), that uses M_COPYFLAGS mask to copy m_flags field.
If original mbuf chain has M_RDONLY flag, its copy also will have it.
Reset this flag explicitly.
MFC after: 2 weeks
The function clearning credentials on failure asserts the process is a
zombie, which is not true when fork fails.
Changing creds to NULL is unnecessary, but is still being done for
consistency with other code.
Pointy hat: mjg
Reported by: pho
interface without breaking ABI or API compatibility with existing drivers.
The existing data structures used to communicate between the kernel and
driver portions of PPS processing contain no spare/padding fields and no
flags field or other straightforward mechanism for communicating changes
in the structures or behaviors of the code. This makes it difficult to
MFC new features added to the PPS facility. ABI compatibility is
important; out-of-tree drivers in module form are known to exist. (Note
that the existing api_version field in the pps_params structure must
contain the value mandated by RFC 2783 and any RFCs that come along after.)
These changes introduce a pair of abi-version fields which are filled in
by the driver and the kernel respectively to indicate the interface
version. The driver sets its version field before calling the new
pps_init_abi() function. That lets the kernel know how much of the
pps_state structure is understood by the driver and it can avoid using
newer fields at the end of the structure that it knows about if the driver
is a lower version. The kernel fills in its version field during the init
call, letting the driver know what features and data the kernel supports.
To implement the new version information in a way that is backwards
compatible with code from before these changes, the high bit of the
lightly-used 'kcmode' field is repurposed as a flag bit that indicates the
driver is aware of the abi versioning scheme. Basically if this bit is
clear that indicates a "version 0" driver and if it is set the driver_abi
field indicates the version.
These changes also move the recently-added 'mtx' field of pps_state from
the middle to the end of the structure, and make the kernel code that uses
this field conditional on the driver being abi version 1 or higher. It
changes the only driver currently supplying the mtx field, usb_serial, to
use pps_init_abi().
Reviewed by: hselasky@
multiple values using the same key in a nvlist.
Approved by: pjd (mentor)
Obtained from: WHEEL Systems (http://www.wheelsystems.com)
Update man page.
Reviewed by: AllanJude
Approved by: pjd (mentor)
Change the nvlist_recv() function to take additional argument that
specifies flags expected on the received nvlist. Receiving a nvlist with
different set of flags than the ones we expect might lead to undefined
behaviour, which might be potentially dangerous.
Update consumers of this and related functions and update the tests.
Approved by: pjd (mentor)
Update man page for nvlist_unpack, nvlist_recv, nvlist_xfer, cap_recv_nvlist
and cap_xfer_nvlist.
Reviewed by: AllanJude
Approved by: pjd (mentor)
and follow-up assertion errors on at least ARM after r282257,
with nvp_magic being 0x6e7600:
Assertion failed: ((nvp)->nvp_magic == 0x6e7670), function nvpair_name, file .../subr_nvpair.c, line 713.
Sponsored by: DARPA/AFRL
remains. Xen is planning to phase out support for PV upstream since it
is harder to maintain and has more overhead. Modern x86 CPUs include
virtualization extensions that support HVM guests instead of PV guests.
In addition, the PV code was i386 only and not as well maintained recently
as the HVM code.
- Remove the i386-only NATIVE option that was used to disable certain
components for PV kernels. These components are now standard as they
are on amd64.
- Remove !XENHVM bits from PV drivers.
- Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH, LOAD_CR3,
etc.)
- Remove duplicate copy of <xen/features.h>.
- Remove unused, i386-only xenstored.h.
Differential Revision: https://reviews.freebsd.org/D2362
Reviewed by: royger
Tested by: royger (i386/amd64 HVM domU and amd64 PVH dom0)
Relnotes: yes
Those functions are problematic, because there is no way to report
memory allocation problems without complicating the API, so we can
either abort or potentially return invalid results. None of which is
acceptable.
In most cases the caller knows the size of the name, so he can allocate
buffer on the stack and use snprintf(3) to prepare the name.
After some discussion the conclusion is to removed those functions,
which also simplifies the API.
Discussed with: pjd, rstone
Approved by: pjd (mentor)
The point of this is to be able to add RACCT (with RACCT_DISABLED)
to GENERIC, to avoid having to rebuild the kernel to use rctl(8).
Differential Revision: https://reviews.freebsd.org/D2369
Reviewed by: kib@
MFC after: 1 month
Relnotes: yes
Sponsored by: The FreeBSD Foundation
ctld(8) child processes to indicate initiator address and name in
their titles, similar to what iscsid(8) child processes do.
PR: 181352
Differential Revision: https://reviews.freebsd.org/D2363
Reviewed by: rwatson@, mjg@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
bufdaemon in getnewbuf(), do use buf_flush(). The difference is that
bufdaemon uses TRYLOCK to get buffer locks, which allows calls to
getnewbuf() while another buffer is locked.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
largest size for a buffer in the buffer cache. This patch
defines a new constant MAXBCACHEBUF, which is the largest
size for a buffer in the buffer cache. Having a separate
constant allows MAXBCACHEBUF to be set larger than MAXBSIZE
on a per-architecture basis, so that NFS can do larger read/writes
for these architectures. It modifies sys/param.h so that BKVASIZE
can also be set on a per-architecture basis.
A couple of cases where NFS used MAXBSIZE instead of NFS_MAXBSIZE
is fixed as well.
Differential Revision: https://reviews.freebsd.org/D2330
Reviewed by: mav, kib
MFC after: 2 weeks
the page size, len is the total transfer length, which may be larger
than zero_region.
Reported and tested by: clusteradm (gjb)
Sponsored by: The FreeBSD Foundation
X-MFC-With: r281442
While there, make few more performance optimizations.
On 40-core system doing many 512-byte AIO reads from array of raw SSDs
this change removes lock congestions inside pbuf allocator and devfs,
and bottleneck on single AIO completion taskqueue thread. It improves
peak AIO performance from ~600K to ~1.3M IOPS.
MFC after: 2 weeks
It is not network-specific code and would
be better as part of libkern instead.
Move zlib.h and zutil.h from net/ to sys/
Update includes to use sys/zlib.h and sys/zutil.h instead of net/
Submitted by: Steve Kiernan stevek@juniper.net
Obtained from: Juniper Networks, Inc.
GitHub Pull Request: https://github.com/freebsd/freebsd/pull/28
Relnotes: yes
* Add VCREAT flag to indicate when a new file is being created
* Add VVERIFY to indicate verification is required
* Both VCREAT and VVERIFY are only passed on the MAC method vnode_check_open
and are removed from the accmode after
* Add O_VERIFY flag to rtld open of objects
* Add 'v' flag to __sflags to set O_VERIFY flag.
Submitted by: Steve Kiernan <stevek@juniper.net>
Obtained from: Juniper Networks, Inc.
GitHub Pull Request: https://github.com/freebsd/freebsd/pull/27
Relnotes: yes
argument. This will be used for the Linux emulation layer - for Linux,
PATH_MAX is 4096 and not 1024.
Differential Revision: https://reviews.freebsd.org/D2335
Reviewed by: kib@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
pbufs is a limited resource, and their allocator is not SMP-scalable.
So instead of always allocating pbuf to immediately convert it to bio,
allocate bio just here. If buffer needs kernel mapping, then pbuf is
still allocated, but used only as a source of KVA and storage for a list
of held pages.
On 40-core system doing many 512-byte reads from user level to array of
raw SSDs this change removes huge lock congestion inside pbuf allocator.
It improves peak performance from ~300K to ~1.2M IOPS. On my previous
24-core system this problem also existed, but was less serious.
Reviewed by: kib
MFC after: 2 weeks
It is truer to the semantics of logging for messages to *always*
go to the message buffer, where they can eventually be collected
and, in fact, be put into a log file.
This restores the behavior prior to r70239, which seems to have
changed it inadvertently.
Submitted by: Eric Badger <eric@badgerio.us>
Reviewed by: jhb
Approved by: kib (mentor)
Obtained from: Dell Inc.
MFC after: 1 week
pwrite(2) syscalls are wrapped to provide compatibility with pre-7.x
kernels which required padding before the off_t parameter. The
fcntl(2) contains compatibility code to handle kernels before the
struct flock was changed during the 8.x CURRENT development. The
shims were reasonable to allow easier revert to the older kernel at
that time.
Now, two or three major releases later, shims do not serve any
purpose. Such old kernels cannot handle current libc, so revert the
compatibility code.
Make padded syscalls support conditional under the COMPAT6 config
option. For COMPAT32, the syscalls were under COMPAT6 already.
Remove WITHOUT_SYSCALL_COMPAT build option, which only purpose was to
(partially) disable the removed shims.
Reviewed by: jhb, imp (previous versions)
Discussed with: peter
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
They were added for compatibility with the sched provider in Solaris and
illumos, but our sched provider is already incompatible since it uses native
types, so there isn't much point in keeping them around.
Differential Revision: https://reviews.freebsd.org/D2167
Reviewed by: rpaulo
on the initial allocation, but seltdinit() assumes that td_sel is NULL
or a valid pointer. Note that thread_fini()/seltdfini() also relies
on this, but correctly resets td_sel to NULL.
Submitted by: luke.tw@gmail.com
PR: 199518
MFC after: 1 week
sysctl_debug_hashstat_nchash() and sysctl_debug_hashstat_rawnchash().
These changes are in preparation for allowing changes in the size
of the vnode hash tables driven by increases and decreases in the
maximum number of vnodes in the system.
Reviewed by: kib@
Phabric: D2265
A new loader.conf(5) option of geom_eli_passphrase_prompt="YES" will now
allow you to enter your geli(8) root-mount credentials prior to invoking
the kernel.
See check-password.4th(8) for details.
Differential Revision: https://reviews.freebsd.org/D2105
Reviewed by: imp, kmoore
Discussed on: -current
MFC after: 3 days
X-MFC-to: stable/10
Relnotes: yes
use VOP_FSYNC() to perform the NFS server's Commit operation.
This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which
is set by file systems that use the buffer cache. If this flag
is not set, the NFS server always does a VOP_FSYNC().
This should be ok for old file system modules that do not set
MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although
it might not be optimal for file systems that use the buffer cache.
Reviewed by: kib
MFC after: 2 weeks
Device probe value of BUS_PROBE_NOWILDCARD should be treated specially only
if the device has a fixed devclass. Otherwise it should be interpreted just
as if the driver doesn't want to claim the device.
Prior to this change a device that was not claimed explicitly by its driver
would remain "attached" to the driver that returned BUS_PROBE_NOWILDCARD.
This would bump up the reference on 'driver->refs' and its 'dev->ops' would
point to the 'driver->ops'. When the driver is subsequently unloaded the
'dev->ops->cls' is left pointing to freed memory.
This fixes an easily reproducible #GP fault caused by loading and unloading
vmm.ko multiple times.
Differential Revision: https://reviews.freebsd.org/D2294
Reviewed by: imp, jhb
Discussed with: rstone
Reported by: Leon Dang (ldang@nahannisys.com)
MFC after: 2 weeks
initial thread. It is read by the ELF image activator as the virtual
size of the PT_GNU_STACK program header entry, and can be specified by
the linker option -z stack-size in newer binutils.
The soft RLIMIT_STACK is auto-increased if possible, to satisfy the
binary' request.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
the capability do not try to take the mutex at all.
Replaces misbegotten attempt from reverted commit 281276
Pointed out by: glebius
Sponsored by: Rubicon Communications (Netgate)
Differential Revision: https://reviews.freebsd.org/D2262
Check whether the page being requested is either resident or on swap. If
not, read from the zero_region instead of instantiating an unnecessary page.
This avoids consuming memory for sparse files on tmpfs, when they are read
by applications that do not use SEEK_HOLE/SEEK_DATA (which is most of them).
Reviewed by: kib
MFC after: 1 week
Sponsored by: Spectra Logic