Commit Graph

245132 Commits

Author SHA1 Message Date
ian
4a4f9b093d Enhance mdmfs(8) to work with tmpfs(5).
Existing scripts and associated config such as rc.initdiskless, rc.d/var,
and others, use mdmfs to create memory filesystems. That program accepts a
size argument which allows SI suffixes and treats an unsuffixed number as a
count of 512 byte sectors. That makes it difficult to convert existing
scripts to use tmpfs instead of mdmfs, because tmpfs treats unsuffixed
numbers as a count of bytes. The script logic to deal with existing user
config that might include suffixed and unsuffixed numbers is... unpleasant.

Also, there is no g'tee that tmpfs will be available. It is sometimes
configured out of small-resource embedded systems to save memory and flash
storage space.

These changes enhance mdmfs(8) so that it accepts two new values for the
'md-device' arg: 'tmpfs' and 'auto'. With tmpfs, the program always uses
tmpfs(5) (and fails if it's not available). With 'auto' the program prefers
tmpfs, but falls back to using md(4) if tmpfs isn't available. It also
handles the -s <size> argument so that the mdconfig interpetation of
unsuffixed numbers applies when tmpfs is used as well, so that existing user
config keeps working after a switch to tmpfs.

A new rc setting, mfs_type, is added to etc/defaults/rc.conf to let users
force the use of tmpfs or md; the default value is "auto".

Differential Revision:	https://reviews.freebsd.org/D12301
2017-09-29 22:13:26 +00:00
cem
ff255b872c aesni(4): Fix GCC build
The GCC xmmintrin.h header brokenly includes mm_malloc.h unconditionally.
(The Clang version of xmmintrin.h only includes mm_malloc.h if not compiling
in standalone mode.)

Hack around GCC's broken header by defining the include guard macro ahead of
including xmmintrin.h.

Reported by:	lwhsu, jhb
Tested by:	lwhsu
Sponsored by:	Dell EMC Isilon
2017-09-29 19:56:09 +00:00
jkim
47a1bf954c Import ACPICA 20170929. 2017-09-29 17:08:30 +00:00
bdrewery
3f9dafbc2b __setrunelocale: Fix asprintf(3) failure not returning an error.
Also fix the style of the asprintf(3) call in __collate_load_tables_l().
Both of these lines were modified away from snprintf(3) during the
import from DragonFly/Illumos.

Reviewed by:	jilles (briefly over shoulder)
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-09-29 16:30:50 +00:00
cem
e240c2435f netsmb: Fix buggy/racy smb_strdupin()
smb_strdupin() tried to roll a copyin() based strlen to allocate a buffer
and then blindly copyin that size.  Of course, a malicious user program
could simultaneously manipulate the buffer, resulting in a non-terminated
string being copied.

Later assumptions in the code rely upon the string being nul-terminated.

Just use copyinstr() and drop the racy sizing.

PR:		222687
Reported by:	Meng Xu <meng.xu AT gatech.edu>
Security:	possible local DoS
Sponsored by:	Dell EMC Isilon
2017-09-29 15:53:26 +00:00
bapt
d21eead6ff man(1): silent the output of mandoc when testing
This reduce the spam a user may face when mandoc tries to
figure out if it can renders a manpage or fallback on groff(1)

Reported by:	bdrewery
MFC after:	3 days
2017-09-29 07:44:48 +00:00
wma
b974bc3b71 Compile loader as Little-Endian on PPC64/POWER8
Add flag to the makefile to allow loader compilation as
  Little-Endian 32-bit executable.
  Usage:

  make WITH_LOADER_FORCE_LE=yes -C sys/boot all

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           imp, nwhitehorn
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
Differential revision: https://reviews.freebsd.org/D12421
2017-09-29 06:36:19 +00:00
ae
3193525e89 Some mbuf related fixes in icmp_error()
* check mbuf length before doing mtod() and accessing to IP header;
* update oip pointer and all depending pointers after m_pullup();
* remove extra checks and extra parentheses, wrap long lines;

PR:		222670
Reported by:	Prabhakar Lakhera
MFC after:	1 week
2017-09-29 06:24:45 +00:00
scottl
fed34c6aff Convert sysctl sbuf usage to use a fully dynaic sbuf. This is strictly
needed, but it silences an erroneous Coverity warning and makes the code a
little more logically consistent.  Also mark the sysctl as MPSAFE.

Sponsored by:	Netflix
2017-09-29 04:52:15 +00:00
kevlo
ba5b326d83 Add ThinkPad USB 3.0 Ethernet Adapter.
Submitted by:	jh
2017-09-29 01:19:22 +00:00
rmacklem
fce41e8299 Add the NFS client state flag that enables Flexible File Layout.
This patch adds a NFSSTA_FLEXFILE flag that will be used to enable
Flexible File Layout for the NFSv4.1 pNFS client. It is not yet
used, but will be after a future commit adds Flex File Layout support.
2017-09-28 23:05:08 +00:00
rmacklem
888b65faac Change nfsv4_getipaddr() and nfsrpc_fillsa() to not use sockaddr_storage.
This patch changes nfsv4_getipaddr() and nfsrpc_fillsa() to use
a sockaddr_in * and sockaddr_in6 * instead of sockaddr_storage, to
avoid allocating the latter on the stack. It also moves the nfsrpc_fillsa()
call to after the completion of parsing of the DeviceInfo reply from
the server. This patch is in preparation for addition of Flex File
Layout support in a future commit.
It only affects the "pnfs" NFSv4.1 client mount option and should not
have changed its semantics.
2017-09-28 22:33:01 +00:00
n_hibma
ab72bcdbd0 Make this compile if NO_SYSCTL_DESCR is defined.
Defining a variable with the description and then only use it in the
SYSCTL declaration led to an unused variable warning. In the SYSCTL the
passed value is discarded using __DESCR.
2017-09-28 19:57:46 +00:00
n_hibma
e52d348cee Make this compile with DEVICE_POLLING set.
smc_poll had the wrong prototype. It returns 0 as it does not check
anything but submits a taskqueue.

Reviewed by:	benno
MFC after:	2 weeks
2017-09-28 19:33:36 +00:00
alc
7ad59282da Optimize vm_object_page_remove() by eliminating pointless calls to
pmap_remove_all().  If the object to which a page belongs has no
references, then that page cannot possibly be mapped.

Reviewed by:	kib
MFC after:	1 week
2017-09-28 17:55:41 +00:00
mav
13858695a6 Alike to ZFS disable cache flush after first ENOTSUP error.
MFC after:	1 week
2017-09-28 15:58:41 +00:00
n_hibma
6d7dddc618 Typo in filename in comment. 2017-09-28 12:43:25 +00:00
eugen
85321e8a2f Correction after r323873: #include <sys/lock.h> in addition to <sys/rmlock.h>
PR:		220076
Approved by:	mav (mentor)
MFC after:	3 days
2017-09-28 11:26:37 +00:00
kib
50cb59e230 A different fix for the issue from r323722.
Split the handlers for pop of invalid selectors from the trap frame
into usermode and kernel variants.  Usermode handler is kept as is, it
restores the already loaded parts of the trap frame and jumps to set
up a signal delivery to the user process.

New kernel part of the handler emulates IRET treatment of the segments
which would violate access right.  It loads NUL selector in the
segment register which load causes the fault, and then continues the
return to interrupted kernel code.  Since invalid selectors in the
segment registers in the kernel mode can only exist while kernel still
enters or exits from userspace, we only zero invalid userspace
selectors.  If userspace tries to use the segment register, it gets a
signal, as if the processor segment descriptor cache was reloaded.

Reported by:	Maxime Villard <max@m00nbsd.net>
Suggested and reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-28 09:01:28 +00:00
kib
071c00f495 Restore a part of r323722.
Do not return from interrupt using the POP_FRAME;iret instruction
sequence, always jump to doreti.

The user segments selectors saved on the stack might become invalid
because userspace manipulated LDT in a parallel thread.  trap() is
aware of such issue, but it is only prepared to handle it at iret and
segment registers load operations in doreti path.

Also remove POP_FRAME macro because it is no longer used.

Reviewed by:	bde, jhb (as part of r323722)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-28 08:46:15 +00:00
kib
81faa225ff Revert r323722. A better fix will be committed shortly, as well as
some still useful bits of the reverted revision.

The problem with the committed fix is that there are still issues with
returning from NMI, when NMI interrupted kernel in a moment where the
kernel segments selectors were still not loaded into registers.  If
this happens, the NMI return would loose the userspace selectors
because r323722 does not reload segment registers on return to kernel
mode.

Fixing the problem is complicated.  Since an alternative approach to
handle the original bug exists, it makes sence to stop adding more
complexity.

Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-09-28 08:38:24 +00:00
sephe
11859da3f9 hyperv/hn: Unbreak i386 building.
Reported by:	cy
MFC after:	1 week
Sponsored by:	Microsoft
2017-09-28 07:02:56 +00:00
imp
ff7911a913 Tweak performance of nda completions
Use xpt_done_direct in preference to xpt_done when completing a
successful I/O. Continue to use xpt_done when there's an error, or for
completion of the submission of a CCB. This eliminates a context
switch to the cam_doneq thread.

Sponsored by: Netflix
Suggested by: scottl@
2017-09-28 01:27:00 +00:00
rmacklem
9dd547c597 Fix a memory leak that occurred in the pNFS client.
When a "pnfs" NFSv4.1 mount was unmounted, it didn't free up the layouts
and deviceinfo structures. This leak only affects "pnfs" mounts and only
when the mount is umounted.
Found while testing the pNFS Flexible File layout client code.

MFC after:	2 weeks
2017-09-27 23:23:41 +00:00
jhb
bb8b530dd0 Use UMA_ALIGNOF() for name cache UMA zones.
This fixes kernel crashes due to misaligned accesses to the 64-bit
time_t embedded in struct namecache_ts in MIPS n32 kernels.

MFC after:	1 week
Sponsored by:	DARPA / AFRL
2017-09-27 23:18:57 +00:00
jhb
13b1e2684d Add UMA_ALIGNOF().
This is a wrapper around _Alignof() that sets the alignment for a zone
to the alignment required by a given type.  This allows the compiler to
determine the proper alignment rather than having the programmer try to
guess.

Discussed on:	arch@
MFC after:	1 week
Sponsored by:	DARPA / AFRL
2017-09-27 23:15:33 +00:00
landonf
67cf61ca22 bhnd: Add support for supplying bus I/O callbacks when initializing an EROM
parser.

This allows us to use the EROM parser API in cases where the standard bus
space I/O APIs are unsuitable. In particular, this will allow us to parse
the device enumeration table directly from bhndb(4) drivers, prior to
full attach and configuration of the bridge.

Approved by:	adrian (mentor)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12510
2017-09-27 19:48:34 +00:00
landonf
2e486532df bhnd: Implement bhnd(4) platform device registration.
Add bhnd(4) API for explicitly registering BHND platform devices (ChipCommon,
PMU, NVRAM, etc) with the bus, rather than walking the newbus hierarchy to
discover platform devices. These devices are now also refcounted; attempting
to deregister an actively used platform device will return EBUSY.

This resolves a lock ordering incompatibility with bwn(4)'s firmware loading
threads; previously it was necessary to acquire Giant to protect newbus access
when locating and querying the NVRAM device.

Approved by:	adrian (mentor)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12392
2017-09-27 19:44:23 +00:00
imp
f0a60eb85f Since the human readable name is actually ignored, and not matching a
'human' pnp string, change it to #, the name reserved for fields that
are ignored.
2017-09-27 19:22:10 +00:00
imp
ee8813ba3b Improve description of the PNP string a bit. 2017-09-27 19:21:52 +00:00
cem
8a39a65595 Unrevert r324059
With a colon and bogus name ("#") added to appease the simplistic parser
used in kldxref.

Sponsored by:	Dell EMC Isilon
2017-09-27 19:14:00 +00:00
markj
ae879c6393 Use C99 initializers for DTrace provider methods.
This makes the definitions easier to read and more cscope-friendly.

MFC after:	1 week
2017-09-27 17:46:38 +00:00
davidcs
270b655971 Tx Ring Shadow Consumer Index Register needs to be cleared prior
to passing it's physical address to the FW during Tx Create Context.

MFC after:3 days
2017-09-27 17:46:11 +00:00
fsu
ac00ff77c6 Add check to avoid raw inode iblocks fields overflow in case of huge_file feature.
Use the Linux logic for now.

Reviewed by:    pfg (mentor)
Approved by:    pfg (mentor)
MFC after:      2 weeks
Differential Revision: https://reviews.freebsd.org/D12131
2017-09-27 16:12:13 +00:00
cem
df978a4bc8 Remove PNP metadata from drm2 drivers until kldxref problem is resolved
Reported by:	np
Sponsored by:	Dell EMC Isilon
2017-09-27 14:59:18 +00:00
tuexen
a1f4d252b8 Remove unused function.
MFC after:	1 week
2017-09-27 13:05:23 +00:00
manu
444f0ddd90 vfs_export: Simplify vfs_export_lookup
If the filesystem is not exported directly return NULL.
If no address is given and filesystem is exported using some default
one return it directly, if it doesn't have a default one directly
return NULL.

Reviewed by:	kib, bapt
MFC after:	1 week
Sponsored by:	Gandi.net
Differential Revision:	https://reviews.freebsd.org/D12505
2017-09-27 09:39:16 +00:00
sephe
04335d9414 kernel: Bump __FreeBSD_version for the removal of M_HASHTYPE_RSS_UDP_IPV4_EX
Sponsored by:	Microsoft
2017-09-27 06:33:55 +00:00
sephe
f72be2a2dc mbuf: Remove UDP_IPV4_EX, which was never defined.
Add comment to explain the IPV6_EX suffix.  The confusion about
these RSS hash type probably stems from the facts that they were
never widely implemented by hardwares.

Reviewed by:	rwatson
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12453
2017-09-27 06:31:35 +00:00
sephe
f1d4cb25ab ixl: Fix mbuf hash type settings.
IPV6_EXs in RSS never mean fragment.  They mean:
"- Home address from the home address option in the IPv6 destination
   options header.  If the extension header is not present, use the
   Source IPv6 Address.
 - IPv6 address that is contained in the Routing-Header-Type-2 from
   the associated extension header.  If the extension header is not
   present, use the Destination IPv6 Address."

UDP_IPV4_EX is an invalid RSS hash type, which will be removed.

Quoted from:
https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-hashing-types#ndishashipv6ex

Reviewed by:	erj
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12450
2017-09-27 05:59:54 +00:00
sephe
e1f9aeedee tcp: Don't "negotiate" MSS.
_NO_ OSes actually "negotiate" MSS.

RFC 879:
"... This Maximum Segment Size (MSS) announcement (often mistakenly
called a negotiation) ..."

This negotiation behaviour was introduced 11 years ago by r159955
without any explaination about why FreeBSD had to "negotiate" MSS:

    In syncache_respond() do not reply with a MSS that is larger than what
    the peer announced to us but make it at least tcp_minmss in size.

    Sponsored by:   TCP/IP Optimization Fundraise 2005

The tcp_minmss behaviour is still kept.

Syncookie fix was prodded by tuexen, who also helped to test this
patch w/ packetdrill.

Reviewed by:	tuexen, karels, bz (previous version)
MFC after:	2 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12430
2017-09-27 05:52:37 +00:00
sephe
b69ef721ad hyperv/hn: Fix UDP checksum offload issue in Azure.
UDP checksum offload does not work in Azure if following conditions are
met:
- sizeof(IP hdr + UDP hdr + payload) > 1420.
- IP_DF is not set in IP hdr

Use software checksum for UDP datagrams falling into this category.

Add two tunables to disable UDP/IPv4 and UDP/IPv6 checksum offload, in
case something unexpected happened.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12429
2017-09-27 05:44:50 +00:00
sephe
8fc41247cb hyperv/hn: Set tcp header offset for CSUM/LSO offloading.
No observable effect; better safe than sorry.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12417
2017-09-27 04:42:40 +00:00
mjg
92eb817a05 sysctl: remove target buffer read/write checks prior to calling the handler
Said checks were inherently racy anyway as jokers could unmap target areas
before the handler got around to accessing them.

This saves time by avoiding locking the address space.

MFC after:	1 week
2017-09-27 01:31:52 +00:00
mjg
84d28f19d5 Annotate sysctlmemlock with __exclusive_cache_line.
MFC after:	1 week
2017-09-27 01:27:43 +00:00
mjg
1b521971f3 Remove manpage entries about crshared(9)
The function itself was removed years ago in r272546

Submitted by:	Paulm <paulm tetrardus.net>
MFC after:	2 weeks
2017-09-27 01:12:47 +00:00
mjg
c8836df626 Whack procctl(8)
It was supposed to provide a recovery mechanism against bugs in procfs's
long deprecated tracing capabilities.

Remove the tool as a prerequisite to axing the kernel side.

The tracing facility to use is ptrace(2).

MFC after:	2 weeks
2017-09-27 01:03:00 +00:00
mjg
77579b97a2 mtx: drop the tid argument from _mtx_lock_sleep
tid must be equal to curthread and the target routine was already reading
it anyway, which is not a problem. Not passing it as a parameter allows for
a little bit shorter code in callers.

MFC after:	1 week
2017-09-27 00:57:05 +00:00
rmacklem
3ccccd3209 Add major and minor version arguments to nfscl_reqstart().
This patch adds "vers" and "minorvers" arguments to nfscl_reqstart().
The patch always passes them in as "0" and that implies no change
in semantics. These arguments will be used by a future commit that
adds support for the Flexible File Layout.
2017-09-26 23:42:44 +00:00
jhb
c991591342 Don't defer wakeup()s for completed journal workitems.
Normally wakeups() are performed for completed softupdates work items
in workitem_free() before the underlying memory is free()'d.
complete_jseg() was clearing the "wakeup needed" flag in work items to
defer the wakeup until the end of each loop iteration.  However, this
resulted in the item being free'd before it's address was used with
wakeup().  As a result, another part of the kernel could allocate this
memory from malloc() and use it as a wait channel for a different
"event" with a different lock.  This triggered an assertion failure
when the lock passed to sleepq_add() did not match the existing lock
associated with the sleep queue.  Fix this by removing the code to
defer the wakeup in complete_jseg() allowing the wakeup to occur
slightly earlier in workitem_free() before free() is called.

The main reason I can think of for deferring a wakeup() would be to
avoid waking up a waiter while holding a lock that the waiter would
need.  However, no locks are dropped in between the wakeup() in
workitem_free() and the end of the loop in complete_jseg() as far as I
can tell.

In general I think it is not safe to do a wakeup() after free() as one
cannot control how other parts of the kernel that might reuse the
address for a different wait channel will handle spurious wakeups.

Reported by:	pho
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D12494
2017-09-26 23:24:15 +00:00