It's possible to get EFAULT when writing a segment backed by a file
if the segment extends beyond the file.
The core dump could still be useful if we skip the rest of the segment
and proceed to other segements.
The skipped segment (or a portion of it) will be zero-filled.
While there, use 'const' to signify that core_write() only reads the
buffer and use __DECONST before calling vn_rdwr_inchunks() because it
can be used for both reading and writing.
Before the change:
kernel: Failed to write core file for process mmap_trunc_core (error 14)
kernel: pid 77718 (mmap_trunc_core), uid 1001: exited on signal 6
After the change:
kernel: Failed to fully fault in a core file segment at VA 0x800645000 with size 0x4000 to be written at offset 0x29000 for process mmap_trunc_core
kernel: pid 4901 (mmap_trunc_core), uid 1001: exited on signal 6 (core dumped)
Reviewed by: julian, kib
Obtained from: Panzura (older version of the change)
MFC after: 5 days
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D9233
The error is:
vmm_dev.c: In function 'alloc_memseg':
vmm_dev.c:261:11: error: null argument where non-null required (argument 1) [-Werror=nonnull]
Apparently, the gcc is unable to figure out that if a ternary operator
produced a non-NULL value once, then the operator with exactly the same
operands would produce the same value again.
MFC after: 1 week
Add own state variable to track if a sendqueue is stopped or not.
This will prevent traffic from entering the sendqueue while it is
being destroyed.
Update drain function to wait for traffic to be transmitted before
returning when the link state is active.
Add extra checks in transmit path for stopped SQ's.
While at it:
- Use likely() for a mbuf pointer check.
- Remove redundant IFF_DRV_RUNNING check.
MFC after: 1 week
Sponsored by: Mellanox Technologies
There were several places where reference to compression were left
unfinished. Furthermore, KASSERTs contained references to MPPC_INVALID
which is not defined in the tree and therefore were sure to break with
INVARIANTS: comment them out.
Reported by: Eugene Grosbein
PR: 216265
MFC after: 3 days
All of the printing from the tables file now has wrappers so that the
handling is cleaner and it's possible to print something out (say, during
development) without having to fight the global debug flags. This re-org
will also make it easier to have the tables be compiled out at build time
if desired.
Other than fixing some minor bugs, there are no user-visible changes from
this change
Sponsored by: Netflix, Inc.
Differential Revision: D9238
users that choose not to use EARLY_AP_STARTUP.
There is still an initialization issue/panic with !SMP and !EARLY_AP_STARTUP
that we have yet to resolve.
Submitted by: bde
The option "nonc" disables using of namecache for the created mount,
by default namecache is used. The rationale for the option is that
namecache duplicates the information which is already kept in memory
by tmpfs. Since it believed that namecache scales better than tmpfs,
or will scale better, do not enable the option by default. On the
other hand, smaller machines may benefit from lesser namecache
pressure.
Discussed with: mjg
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
For directories, node->tn_spec.tn_dir.tn_parent pointer to the parent
is used. For non-directories, the implementation is naive, all
directory nodes are scanned to find a dirent linking the specified
node. This can be significantly improved by maintaining tn_parent for
all nodes, later.
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
On dotdot lookup and fhtovp operations, it is possible for the file
represented by tmpfs node to be removed after the thread calculated
the pointer. In this case, tmpfs_alloc_vp() accesses freed memory.
Introduce the reference count on the nodes. The allnodes list from
tmpfs mount owns 1 reference, and threads performing unlocked
operations on the node, add one transient reference. Similarly, since
struct tmpfs_mount maintains the list where nodes are enlisted,
refcount it by one reference from struct mount and one reference from
each node on the list. Both nodes and tmpfs_mounts are removed when
refcount goes to zero.
Note that this means that nodes and tmpfs_mounts might survive some
time after the node is deleted or tmpfs_unmount() finished. The
tmpfs_alloc_vp() in these cases returns error either due to node
removal (tn_nlinks == 0) or because of insmntque1(9) error.
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Commit r270423 fixed a regression in sched_yield() that was introduced
in earlier changes. Unfortunately, at the same time it introduced an
new regression. The problem is that SWT_RELINQUISH (6), like all other
SWT_* constants and unlike SW_* flags, is not a bit flag. So, (flags &
SWT_RELINQUISH) is true in cases where that was not really indended,
for example, with SWT_OWEPREEMPT (2) and SWT_REMOTEPREEMPT (11).
A straight forward fix would be to use (flags & SW_TYPE_MASK) ==
SWT_RELINQUISH, but my impression is that the switch types are designed
mostly for gathering statistics, not for influencing scheduling
decisions.
So, I decided that it would be better to check for SW_PREEMPT flag
instead. That's also the same flag that was checked before r239157.
I double-checked how that flag is used and I am confident that the flag
is set only in the places where we really have the preemption:
- critical_exit + td_owepreempt
- sched_preempt in the ULE scheduler
- sched_preempt in the 4BSD scheduler
Reviewed by: kib, mav
MFC after: 4 days
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D9230
sure the XHCI controller is reset after halting it. The problem is
clearly a BIOS bug as the suspend and resume is failing without
loading the XHCI driver. The same happens when using Linux and the
XHCI driver is not loaded.
Submitted by: Yanko Yankulov <yanko.yankulov@gmail.com>
PR: 216261
MFC after: 1 week
As suggested in r167010, use the structure type and macros to access and
modify UFS2 extended attributes. Add assertions that pointers are
aligned in places where we now access the data through a structure
pointer, instead of character-by-character.
PR: 216127
Reported by: dewayne at heuristicsystems.com.au
Reviewed by: kib@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D9225
disabled (Hi netmap!).
Only remove the CRC bytes from packets when the hardware tell us to do so.
Fixes the 'discard frame w/o leading ethernet header' issues.
Sponsored by: Rubicon Communications, LLC (Netgate)
Remove TMPFS_ASSERT_ELOCKED(). Its claims are already stated by other
asserts nearby and by VFS guarantees.
Change TMPFS_ASSERT_LOCKED() and one inlined place to use
ASSERT_VOP_(E)LOCKED() instead of hand-rolled imprecise asserts.
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Edit comments which explain no longer relevant details, and add
locking annotations to the struct tmpfs_node members.
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The ea_name string is not nul-terminated. Correct the documentation.
Because the subsequent field is padded to 8 bytes, and the padding is
zeroed, the ea_name string will appear to be nul-terminated whenever the
length isn't exactly one (mod eight).
This was introduced in r167010 (2007).
Additionally, mark the length fields as unsigned. This particularly
matters for the single byte ea_namelength field, which can represent
extended attribute names up to 255 bytes long.
No functional change.
PR: 216127
Reported by: dewayne at heuristicsystems.com.au
Reviewed by: kib@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D9206
- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.
- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API support rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.
- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().
- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.
- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.
- How rate limiting works:
1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.
2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.
3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. Upon next ip_output() a new "snd_tag" will be tried allocated.
4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.
Reviewed by: wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision: https://reviews.freebsd.org/D3687
Sponsored by: Mellanox Technologies
MFC after: 3 months
the fifth argument to functions being traced, however there was an error
where the userspace stack was being used. This may be invalid leading to
a kernel panic if this address is unmapped.
Submitted by: Graeme Jenkinson <graeme.jenkinson@cl.cam.ac.uk>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9229
As the efi_devpath_last_node() and efi_devpath_trim() can return NULL
pointers, the consumers of this API should check the the NULL pointers.
Same for efinet_dev_init() using calloc().
Reported by: Robert Mustacchi <rm@joyent.com>
Reviewed by: jhb, allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9203
Clang apparently requires the explicit form of this instruction, and rejects
uses which ignore the optional cmpD register. This was the only use of the
shorthand form of the instruction, so just fix it up to match the others.
PR: kern/215681
Submitted by: Mark Millard
Reported by: Mark Millard <markmi _AT_ dsl-only.net>
MFC after: 2 weeks
Languages like C++17 and Go provide direct support for slice types:
pointer/length pairs. The CloudABI generator now has more complete for
this, meaning that for the C binding, pointer/length pairs now use an
automatic naming scheme of ${name} and ${name}_len.
Apart from this change and some reformatting, the ABI definitions are
identical. Binary compatibility is preserved entirely.
This field has no practical use and never readed. Initiators already
receive respective residual size from frontends. Removed field had
different semantics, which looks useless, and was never passed through
by any frontend.
While there, fix kern_data_resid field support in case of HA, missed in
r312291.
MFC after: 13 days
This lock was replaced from rwlock in r272840. But unlike rwlock, rmlock
doesn't allow recursion on rm_rlock(), so at this time fix this with
RM_RECURSE flag. Later we need to change ipfw to avoid such recursions.
PR: 216171
Reported by: Eugene Grosbein
MFC after: 1 week
The Zedboard has a hardware bug where initialization of the USB PHY
occasionally fails on boot-up. Fix regression in -CURRENT when
kernel panics on such occasion. 11-RELEASE branch works fine
PR: 215862
Submitted by: Thomas Skibo <thoma555-bsd@yahoo.com>
Previously "panic: msleep" could happen for a few different reasons.
Break the KASSERTs out into individual cases to identify the failing
condition. Found during the investigation that resulted in r308288.
Reviewed by: kib, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8604
sources to return timestamps when SO_TIMESTAMP is enabled. Two additional
clock sources are:
o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME);
o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC).
In addition to this, this option provides unified interface to get bintime
(equivalent of using SO_BINTIME), except it also supported with IPv6 where
SO_BINTIME has never been supported. The long term plan is to depreciate
SO_BINTIME and move everything to using SO_TS_CLOCK.
Idea for this enhancement has been briefly discussed on the Net session
during dev summit in Ottawa last June and the general input was positive.
This change is believed to benefit network benchmarks/profiling as well
as other scenarios where precise time of arrival measurement is necessary.
There are two regression test cases as part of this commit: one extends unix
domain test code (unix_cmsg) to test new SCM_XXX types and another one
implementis totally new test case which exchanges UDP packets between two
processes using both conventional methods (i.e. calling clock_gettime(2)
before recv(2) and after send(2)), as well as using setsockopt()+recv() in
receive path. The resulting delays are checked for sanity for all supported
clock types.
Reviewed by: adrian, gnn
Differential Revision: https://reviews.freebsd.org/D9171
gtaskqueue bits at SI_SUB_INIT_IF instead of waiting until SI_SUB_SMP
which is far too late.
Add an assertion in taskqgroup_attach() to catch startup initialization
failures in the future.
Reported by: kib bde
it into pmap-v4.h where they are used. Other than those few lines of
support for different MMU types, nothing in cpuconf.h has been used in our
code for quite a while.
The file existed to set up a variety of symbols to describe the
architecture. Over the past few years we have converted all of our source
to use the new architecture symbols standardized by ARM Inc, and predefined
by both clang and gcc.
PR: 216104
It seems like kern_data_resid was never really implemented. This change
finally does it. Now frontends update this field while transferring data,
while CTL/backends getting it can more flexibly handle the result.
At this point behavior should not change significantly, still reporting
errors on write overrun, but that may be changed later, if we decide so.
CAM target frontend still does not properly handle overruns due to CAM API
limitations. We may need to add some fields to struct ccb_accept_tio to
pass information about initiator requested transfer size(s).
MFC after: 2 weeks
This patch adds driver for temperature/humidity sensor connected via GPIO.
To compile it into kernel add "device gpioths". To activate driver, use
hints (.at and .pins) for gpiobus. As result it will provide temperature &
humidity values via sysctl.
DHT11 is cheap & popular temperature/humidity sensor used via GPIO on ARM
or MIPS devices like Raspberry Pi or Onion Omega.
Reviewed by: adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D9185
The find_currdev() is using variable "copy" to store the reference to trimmed
devpath pointer, if for some reason the efi_devpath_handle() fails, we will
leak this copy.
Also we can simplify the code there a bit.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9191
Replace archaic "busses" with modern form "buses."
Intentionally excluded:
* Old/random drivers I didn't recognize
* Old hardware in general
* Use of "busses" in code as identifiers
No functional change.
http://grammarist.com/spelling/buses-busses/
PR: 216099
Reported by: bltsrc at mail.ru
Sponsored by: Dell EMC Isilon
arswitch_setled() and a number of _global_setup functions did not acquire the
lock before calling arswitch_modifyreg(). With WITNESS enabled this would
instantly panic.
Discovered on a TPLink-3600:
("panic: mutex arswitch not owned at sys/dev/etherswitch/arswitch/arswitch_reg.c:236")
Reviewed by: adrian, kan
Differential Revision: https://reviews.freebsd.org/D9187
between exp(3) and `exp` var.
The approach taken previously was not ideal for multiple
functional and stylistic reasons.
Add to existing sed call in Makefile to replace `exp` with
`exponent` instead.
MFC after: 13 days
Requested by: bde
vm_object_madvise() is frequently used to apply advice to a contiguous
set of pages in an object with no backing object. Optimize this case by
skipping non-resident subranges in constant time, and by iterating over
resident pages using the object memq, thus avoiding radix tree lookups on
each page index in the specified range.
While here, move MADV_WILLNEED handling to vm_page_advise(), and rename the
"advise" parameter to vm_object_madvise() to "advice."
Reviewed by: alc, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D9098
sys/net/iflib.c:
Add ctx to filter_info and don't skpi interrupt early on unless we're on an
SMP system
sys/kern/subr_gtaskqueue.c:
Skip smp check if we're running UP
Submitted by: Matt Macy <mmacy@nextbsd.org>
Reported by: emaste bde
This is Micrel KSZ8995MA driver code. KSZ8995MA uses SPI bus to control.
This code is written & tested on @SRCHACK's ksz8995ma board and FON2100
with gpiospi.
etherswitchcfg support commands: addtag, ingress, striptag, dropuntagged.
Submitted by: Hiroki Mori <yamori813@yahoo.co.jp>
Reviewed by: mizhka, adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D8790
This patch adds missing hints for ath0 (eepromaddr) and GPIO (mask & leds).
ath0 doesn't work without eeprom hints, so this commit should make wifi
works on Onion Omega.
GPIO mask is required if you want to use gpiobus and GPIO pins on your
board. Onion Omega has several leds connected to gpio pins (one on board,
one color on dock).
This commit adds mask for gpiobus and allow you to turn off/on leds via
/dev/leds/{board,blue,green,red} (on by default).
Tested on Onion Omega 1.
Reviewed by: adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D9107
This is needed for kernel dumps to work, as the panicking thread will call
into code that makes use of kernel locks.
Reported and tested by: Eugene Grosbein
MFC after: 1 week
the cmap lock. Releasing the lock first may result in the thread
being immediately rescheduled and bound to the same CPU, only to
unpin itself upon resuming execution.
Noted by: skra (in review for armv6 equivalent)
MFC after: 1 week
This helps fix a -Wshadow issue with exp(3) with tests/sys/acct/acct_test,
which include math.h, which in turn defines exp(3)
MFC after: 2 weeks
Tested with: clang, gcc 4.2.1, gcc 4.9
Sponsored by: Dell EMC Isilon
dropped then reacquired due to using M_WAITOK, which opens a window in
which the tty device can disappear. Check for this and return ENXIO
back up the call chain so that callers can cope.
This closes a race where TF_GONE would get set while buffers were being
allocated as part of ttydev_open(), causing a subsequent call to
ttydevsw_modem() later in ttydev_open() to assert.
Reported by: pho
Reviewed by: kib
header match when using a raw socket to send IPv4 packets and
providing the header. If they don't match, let send return -1
and set errno to EINVAL.
Before this patch is was only enforced that the length in the header
is not larger then the buffer length.
PR: 212283
Reviewed by: ae, gnn
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9161