126497 Commits

Author SHA1 Message Date
Alan Somers
4683b90591 fusefs: don't disappear a vnode on entry cache expiration
When the entry cache expires, it's only necessary to purge the cache.
Disappearing a vnode also purges the attribute cache, which is unnecessary,
and invalidates the data cache, which could be harmful.

Sponsored by:	The FreeBSD Foundation
2019-04-11 21:13:54 +00:00
Alan Somers
6124fd7106 fusefs: Finish supporting -o default_permissions
I got most of -o default_permissions working in r346088.  This commit adds
sticky bit checks.  One downside is that sometimes there will be an extra
FUSE_GETATTR call for the parent directory during unlink or rename.  But in
actual use I think those attributes will almost always be cached.

PR:		216391
Sponsored by:	The FreeBSD Foundation
2019-04-11 21:00:40 +00:00
Alan Somers
dc14d593a6 fusefs: use vn_vget_ino_gen in fuse_vnop_lookup
vn_vget_ino_gen is a helper function added in r268606 to simplify cases just
like this.

Sponsored by:	The FreeBSD Foundation
2019-04-11 17:20:15 +00:00
Alan Somers
438b8a6fa2 fusefs: eliminate a superfluous FUSE_GETATTR from VOP_LOOKUP
fuse_vnop_lookup was using a FUSE_GETATTR operation when looking up "." and
"..", even though the only information it needed was the file type and file
size.  "." and ".." are obviously always going to be directories; there's no
need to double check.

Sponsored by:	The FreeBSD Foundation
2019-04-11 05:11:02 +00:00
Alan Somers
73825da397 fusefs: remove "early permission check hack"
fuse_vnop_lookup contained an awkward hack meant to reduce daemon activity
during long lookup chains.  However, the hack is no longer necessary now
that we properly cache file attributes.  Also, I'm 99% certain that it
could've bypassed permission checks when using openat to open a file
relative to a directory that lacks execute permission.

Sponsored by:	The FreeBSD Foundation
2019-04-10 21:46:59 +00:00
Alan Somers
666f8543bb fusefs: various cleanups
* Eliminate fuse_access_param.  Whatever it was supposed to do, it seems
  like it was never complete.  The only real function it ever seems to have
  had was a minor performance optimization, which I've already eliminated.
* Make extended attribute operations obey the allow_other mount option.
* Allow unprivileged access to the SYSTEM extattr namespace when
  -o default_permissions is not in use.
* Disallow setextattr and deleteextattr on read-only mounts.
* Add tests for a few more error cases.

Sponsored by:	The FreeBSD Foundation
2019-04-10 21:10:21 +00:00
Alan Somers
ff4fbdf548 fusefs: WIP supporting -o default_permissions
Normally all permission checking is done in the fuse server.  But when -o
default_permissions is used, it should be done in the kernel instead.  This
commit adds appropriate permission checks through fusefs when -o
default_permissions is used.  However, sticky bit checks aren't working yet.
I'll handle those in a follow-up commit.

There are no checks for file flags, because those aren't supported by our
version of the FUSE protocol.  Nor is there any support for ACLs, though
that could be added if there were any demand.

PR:		216391
Reported by:	hiyorin@gmail.com
Sponsored by:	The FreeBSD Foundation
2019-04-10 17:31:00 +00:00
Alan Somers
44f10c6e40 fusefs: cache negative lookups
The FUSE protocol includes a way for a server to tell the client that a
negative lookup response is cacheable for a certain amount of time.

PR:		236226
Sponsored by:	The FreeBSD Foundation
2019-04-09 21:22:02 +00:00
Alan Somers
ccb75e4939 fusefs: implement entry cache timeouts
Follow-up to r346046.  These two commits implement fuse cache timeouts for
both entries and attributes.  They also remove the vfs.fusefs.lookup_cache
enable sysctl, which is no longer needed now that cache timeouts are
honored.

PR:		235773
Sponsored by:	The FreeBSD Foundation
2019-04-09 17:23:34 +00:00
Alan Somers
3f2c630c74 fusefs: implement attribute cache timeouts
The FUSE protocol allows the server to specify the timeout period for the
client's attribute and entry caches.  This commit implements the timeout
period for the attribute cache.  The entry cache's timeout period is
currently disabled because it panics, and is guarded by the
vfs.fusefs.lookup_cache_expire sysctl.

PR:		235773
Reported by:	cem
Sponsored by:	The FreeBSD Foundation
2019-04-09 00:47:38 +00:00
Alan Somers
cad677915f fusefs: cache file attributes
FUSE_LOOKUP, FUSE_GETATTR, FUSE_SETATTR, FUSE_MKDIR, FUSE_LINK,
FUSE_SYMLINK, FUSE_MKNOD, and FUSE_CREATE all return file attributes with a
cache validity period.  fusefs will now cache the attributes, if the server
returns a non-zero cache validity period.

This change does _not_ implement finite attr cache timeouts.  That will
follow as part of PR 235773.

PR:		235775
Reported by:	cem
Sponsored by:	The FreeBSD Foundation
2019-04-08 18:45:41 +00:00
Alan Somers
caf5f57d2d fusefs: implement VOP_ACCESS
VOP_ACCESS was never fully implemented in fusefs.  This change:
* Removes the FACCESS_DO_ACCESS flag, which pretty much disabled the whole
  vop.
* Removes a quixotic special case for VEXEC on regular files.  I don't know
  why that was in there.
* Removes another confusing special case for VADMIN.
* Removes the FACCESS_NOCHECKSPY flag.  It seemed to be a performance
  optimization, but I'm unconvinced that it was a net positive.
* Updates test cases.

This change does NOT implement -o default_permissions.  That will be handled
separately.

PR:		236291
Sponsored by:	The FreeBSD Foundation
2019-04-05 18:37:48 +00:00
Alan Somers
efa23d9784 fusefs: enforce -onoallow_other even beneath the mountpoint
When -o allow_other is not in use, fusefs is supposed to prevent access to
the filesystem by any user other than the one who owns the daemon.  Our
fusefs implementation was only enforcing that restriction at the mountpoint
itself.  That was usually good enough because lookup usually descends from
the mountpoint.  However, there are cases when it doesn't, such as when
using openat relative to a file beneath the mountpoint.

PR:		237052
Sponsored by:	The FreeBSD Foundation
2019-04-05 17:21:23 +00:00
Alan Somers
140bb4927a fusefs: correctly return EROFS from VOP_ACCESS
Sponsored by:	The FreeBSD Foundation
2019-04-05 15:33:43 +00:00
Alan Somers
a7e81cb3db fusefs: properly handle FOPEN_KEEP_CACHE
If a fuse file system returne FOPEN_KEEP_CACHE in the open or create
response, then the client is supposed to _not_ clear its caches for that
file.  I don't know why clearing the caches would be the default given that
there's a separate flag to bypass the cache altogether, but that's the way
it is.  fusefs(5) will now honor this flag.

Our behavior is slightly different than Linux's because we reuse file
handles.  That means that open(2) wont't clear the cache if there's a
reusable file handle, even if the file server wouldn't have sent
FOPEN_KEEP_CACHE had we opened a new file handle like Linux does.

PR:		236560
Sponsored by:	The FreeBSD Foundation
2019-04-04 20:30:14 +00:00
Alan Somers
8d013bec7a fusefs: fix some uninitialized memory references
This bug was long present, but was exacerbated by r345876.

The problem is that fiov_refresh was bzero()ing a buffer _before_ it
reallocated that buffer.  That's obviously the wrong order.  I fixed the
order in r345876, which exposed the main problem.  Previously, the first 160
bytes of the buffer were getting bzero()ed when it was first allocated in
fiov_init.  Subsequently, as that buffer got recycled between callers, the
portion used by the _previous_ caller was getting bzero()ed by the current
caller in fiov_refresh.  The problem was never visible simply because no
caller was trying to use more than 160 bytes.

Now the buffer gets properly bzero()ed both at initialization time and any
time it gets enlarged or reallocated.

Sponsored by:	The FreeBSD Foundation
2019-04-04 20:24:58 +00:00
Alan Somers
9a696dc6bb MFHead@r345880 2019-04-04 18:26:32 +00:00
Alan Somers
12292a99ac fusefs: correctly handle short writes
If a FUSE daemon returns FOPEN_DIRECT_IO when a file is opened, then it's
allowed to write less data than was requested during a FUSE_WRITE operation
on that file handle.  fusefs should simply return a short write to userland.

The old code attempted to resend the unsent data.  Not only was that
incorrect behavior, but it did it in an ineffective way, by attempting to
"rewind" the uio and uiomove the unsent data again.

This commit correctly handles short writes by returning directly to
userland if FOPEN_DIRECT_IO was set.  If it wasn't set (making the short
write technically a protocol violation), then we resend the unsent data.
But instead of rewinding the uio, just resend the data that's already in the
kernel.

That necessitated a few changes to fuse_ipc.c to reduce the amount of bzero
activity.  fusefs may be marginally faster as a result.

PR:		236381
Sponsored by:	The FreeBSD Foundation
2019-04-04 16:51:34 +00:00
Rick Macklem
52cab12c09 Fix malloc stats for the RPCSEC_GSS server code when DEBUG is enabled.
The code enabled when "DEBUG" is defined uses mem_alloc(), which is a
malloc(.., M_RPC, M_WAITOK | M_ZERO), but then calls gss_release_buffer()
which does a free(.., M_GSSAPI) to free the memory.
This patch fixes the problem by replacing mem_alloc() with a
malloc(.., M_GSSAPI, M_WAITOK | M_ZERO).
This bug affects almost no one, since the sources are not normally built
with "DEBUG" defined.

Submitted by:	peter@ifm.liu.se
MFC after:	2 weeks
2019-04-04 01:23:06 +00:00
Conrad Meyer
a8a16c7128 Replace read_random(9) with more appropriate arc4rand(9) KPIs
Reviewed by:	ae, delphij
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19760
2019-04-04 01:02:50 +00:00
Pawel Jakub Dawidek
2f07cdf871 Implement automatic online expansion of GELI providers - if the underlying
provider grows, GELI will expand automatically and will move the metadata
to the new location of the last sector.

This functionality is turned on by default. It can be turned off with the
-R flag, but it is not recommended - if the underlying provider grows and
automatic expansion is turned off, it won't be possible to attach this
provider again, as the metadata is no longer located in the last sector.

If the automatic expansion is turned off and the underlying provider grows,
GELI will only log a message with the previous size of the provider, so
recovery can be easier.

Obtained from:	Fudo Security
2019-04-03 23:57:37 +00:00
Ed Maste
5eb264119e cpsw: use phy-handle in FDT to find PHY address
In r337703 DTS files were updated to Linux 4.18, including Linux commit
4d8b032d3c03f4e9788a18bbb51b10e6c9e8a56b which removed the `phy_id`
property from am335x-bone-common (as the property was deprecated).

Use `phy-handle` via fdt_get_phyaddr, keeping the existing code as a
fallback for old DTBs.

PR:		236624
Submitted by:	manu, Gerald Aryeetey <aryeeteygerald_rogers.com>
Reported by:	Gerald Aryeetey
Reviewed by:	manu
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19814
2019-04-03 21:01:53 +00:00
Alan Somers
35cf0e7e56 fusefs: fix a panic in VOP_READDIR
The original fusefs import, r238402, contained a bug in fuse_vnop_close that
could close a directory's file handle while there were still other open file
descriptors.  The code looks deliberate, but there is no explanation for it.
This necessitated a workaround in fuse_vnop_readdir that would open a new
file handle if, "for some mysterious reason", that vnode didn't have any
open file handles.  r345781 had the effect of causing the workaround to
panic, making the problem more visible.

This commit removes the workaround and the original bug, which also fixes
the panic.

Sponsored by:	The FreeBSD Foundation
2019-04-03 20:57:43 +00:00
Alan Somers
9f10f423a9 fusefs: send FUSE_FLUSH during VOP_CLOSE
The FUSE protocol says that FUSE_FLUSH should be send every time a file
descriptor is closed.  That's not quite possible in FreeBSD because multiple
file descriptors can share a single struct file, and closef doesn't call
fo_close until the last close.  However, we can still send FUSE_FLUSH on
every VOP_CLOSE, which is probably good enough.

There are two purposes for FUSE_FLUSH.  One is to allow file systems to
return EIO if they have an error when writing data that's cached
server-side.  The other is to release POSIX file locks (which fusefs(5) does
not yet support).

PR:		236405, 236327
Sponsored by:	The FreeBSD Foundation
2019-04-03 19:59:45 +00:00
Randall Stewart
fd29ff5d53 Undo my previous erroneous commit changing the tcp_output kassert.
Hmm now the question is where did the tcp_log_id change go :o
2019-04-03 19:35:07 +00:00
Alexander Motin
71f06f81af Fix typos in r345849.
MFC after:	1 week
2019-04-03 18:35:13 +00:00
Alexander Motin
9345f88f8c List few more ATA commands.
MFC after:	1 week
2019-04-03 18:27:54 +00:00
Konstantin Belousov
c973ee9e06 msdosfs: zero tail of the last block on truncation for VREG vnodes as well.
Despite the call to vtruncbuf() from detrunc(), which results in
zeroing part of the partial page after EOF, there still is a
possibility to retain the stale data which is revived on file
enlargement.  If the filesystem block size is greater than the page
size, partial block might keep other after-EOF pages wired and they
get reused then.  Fix it by zeroing whole part of the partial buffer
after EOF, not relying on vnode_pager_setsize().

PR:	236977
Reported by:	asomers
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-03 17:02:18 +00:00
Marcin Wojtas
3a3e5039b9 Add a cv_wait to the TPM2.0 harvesting function
Harvesting has to compete for the TPM chip with userspace.
Before this change the callout could hijack an unread buffer
causing a userspace call to the TPM to fail.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: delphij
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D19712
2019-04-03 08:22:58 +00:00
Justin Hibbits
62c7ea1f1d powerpc: Allow emulating optional FPU instructions on CPUs with an FPU
The e5500 has an FPU, but lacks the optional fsqrt instruction.  This
instruction gets emulated in the kernel, but the emulation uses stale data,
from the last switch out, and does not return the result of the operation
immediately.  Fix both of these conditions by saving and restoring the FPRs
around the emulation point.

MFC after:	1 week
MFC with:	r345829
2019-04-03 04:01:08 +00:00
Marcin Wojtas
b0fefb25c5 Create kernel module to parse Veriexec manifest based on envs
The current approach of injecting manifest into mac_veriexec is to
verify the integrity of it in userspace (veriexec (8)) and pass its
entries into kernel using a char device (/dev/veriexec).
This requires verifying root partition integrity in loader,
for example by using memory disk and checking its hash.
Otherwise if rootfs is compromised an attacker could inject their own data.

This patch introduces an option to parse manifest in kernel based on envs.
The loader sets manifest path and digest.
EVENTHANDLER is used to launch the module right after the rootfs is mounted.
It has to be done this way, since one might want to verify integrity of the init file.
This means that manifest is required to be present on the root partition.
Note that the envs have to be set right before boot to make sure that no one can spoof them.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: sjg
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D19281
2019-04-03 03:57:37 +00:00
Justin Hibbits
81dd9c5e69 powerpc: Apply r178139 from sparc64 to powerpc's fpu_sqrt
This fix was committed less than 2 months after the code was forked into the
powerpc kernel.  Though powerpc doesn't use quad-precision floating point,
or need it for emulation, the changes do look like correctness fixes
overall.

This was found while trying to get fsqrt emulation working on e5500, which
does have a real FPU, but lacks the fsqrt instruction.  This is not the
complete fix, the rest is to be committed separately.

MFC after:	1 week
2019-04-03 03:54:30 +00:00
Rick Macklem
1406895958 Add a comment to the r345818 patch to explain why cl_refs is initialized to 2.
PR:		235582
MFC after:	2 weeks
2019-04-03 03:50:16 +00:00
Alan Somers
e312493b37 fusefs: during ftruncate, discard cached data past truncation point
During truncate, fusefs was discarding entire cached blocks, but it wasn't
zeroing out the unused portion of a final partial block.  This resulted in
reads returning stale data.

PR:		233783
Reported by:	fsx
Sponsored by:	The FreeBSD Foundation
2019-04-03 02:29:56 +00:00
Rick Macklem
b0e14530a0 Fix a race in the RPCSEC_GSS server code that caused crashes.
When a new client structure was allocated, it was added to the list
so that it was visible to other threads before the expiry time was
initialized, with only a single reference count.
The caller would increment the reference count, but it was possible
for another thread to decrement the reference count to zero and free
the structure before the caller incremented the reference count.
This could occur because the expiry time was still set to zero when
the new client structure was inserted in the list and the list was
unlocked.

This patch fixes the race by initializing the reference count to two
and initializing all fields, including the expiry time, before inserting
it in the list.

Tested by:	peter@ifm.liu.se
PR:		235582
MFC after:	2 weeks
2019-04-02 23:51:08 +00:00
Alexander Motin
154c6ffd71 Build NVMe CAM transport unrelated to NVMe SIM.
Before this I suppose it was impossible load CAM-based NVMe as module.
Plus this appeared to be needed to build r345815 without NVMe driver.

MFC after:	2 weeks
2019-04-02 20:27:56 +00:00
Alexander Motin
e40d8dbbcb Make cam_error_print() decode NVMe commands.
MFC after:	2 weeks
2019-04-02 19:37:52 +00:00
Alan Somers
d3a8f2dd09 fusefs: fix a just-introduced panic in readdir
r345808 changed the interface of fuse_filehandle_open, but failed to update
one caller.

Sponsored by:	The FreeBSD Foundation
2019-04-02 19:20:55 +00:00
Tycho Nightingale
b80b32a2ab ioat(4) should use bus_dma(9) for the operation source and destination
addresses

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19725
2019-04-02 19:08:06 +00:00
Tycho Nightingale
a8d9ee9c50 ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and
crc-copy modes.

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19780
2019-04-02 19:06:25 +00:00
Tycho Nightingale
9708c3a2b8 DMAR driver assumes all physical addresses are backed by a fully
initialized struct vm_page.

Reviewed by:	kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19753
2019-04-02 18:50:49 +00:00
Navdeep Parhar
1b3fc371b7 cxgbe(4): Add a flag to indicate that bits in interrupt cause but not in
interrupt enable are not fatal.

The firmware sets up all the interrupt enables based on run time
configuration, which means the information in the enables is more
accurate than what's compiled into the driver.  This change also allows
the fatal bits to be updated without any changes in the driver in some
cases.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-04-02 18:50:33 +00:00
Alan Somers
9e4448719b fusefs: cleanup and refactor some recent commits
This commit cleans up after recent commits, especially 345766, 345768, and
345781.  There is no functional change.  The most important change is to add
comments documenting why we can't send flags like O_APPEND in
FUSE_WRITE_OPEN.

PR:		236340
Sponsored by:	The FreeBSD Foundation
2019-04-02 18:09:40 +00:00
Alexander Motin
99bad9ca9a Unify SCSI_STATUS_BUSY retry handling with other cases.
- Do not retry if periph was invalidated.
 - Do not decrement retry_count if already zero.
 - Report action_string when applicable.

MFC after:	2 weeks
2019-04-02 14:46:10 +00:00
Konstantin Belousov
5c4ce6fac2 tmpfs: plug holes on rw->ro mount update.
In particular:
- suspend the mount around vflush() to avoid new writes come after the
  vnode is processed;
- flush pending metadata updates (mostly node times);
- remap all rw mappings of files from the mount into ro.

It is not clear to me how to handle writeable mappings on rw->ro for
tmpfs best.  Other filesystems, which use vnode vm object, call
vgone() on vnodes with writers, which sets the vm object type to
OBJT_DEAD, and keep the resident pages and installed ptes as is.  In
particular, the existing mappings continue to work as far as
application only accesses resident pages, but changes are not flushed
to file.

For tmpfs the vm object of VREG vnodes also serves as the data pages
container, giving single copy of the mapped pages, so it cannot be set
to OBJT_DEAD.  Alternatives for making rw mappings ro could be either
invalidating them at all, or marking as CoW.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19737
2019-04-02 13:59:04 +00:00
Konstantin Belousov
e1cdc30faa tmpfs: ignore tmpfs_set_status() if mount point is read-only.
In particular, this fixes atimes still changing for ro tmpfs.
tmpfs_set_status() gains tmpfs_mount * argument.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19737
2019-04-02 13:49:32 +00:00
Konstantin Belousov
ae26575394 Block creation of the new nodes for read-only tmpfs mounts.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19737
2019-04-02 13:41:26 +00:00
Ruslan Bukin
a8cb655d2e o Grab the number of devices supported by PLIC from FDT.
o Fix bug in PLIC_ENABLE macro when irq >= 32.

Tested on the real hardware, which is HiFive Unleashed board.

Thanks to SiFive, Inc. for the board provided.

Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19775
2019-04-02 12:02:35 +00:00
Justin Hibbits
95a1f0e81c ipmi: Fixes for ipmi_opal(powernv)
* Crank the OPAL state machine during the receive loop, to make sure the
  pollers are executed
* Add a proper detach function, so the module can be unloaded and reloaded
  at runtime.

It still doesn't reliably work 100% of the time on POWER9, and it appears
timing and/or cache related.  It may work on POWER8 now.

MFC after:	2 weeks
2019-04-02 04:12:06 +00:00
Justin Hibbits
fbf7737949 powernv: Port OPAL asynchronous framework to use the new message framework
Since OPAL_GET_MSG does not discriminate between message types, asynchronous
completion events may be received in the OPAL_GET_MSG call, which dequeues
them from the list, thus preventing OPAL_CHECK_ASYNC_COMPLETION from
succeeding.  Handle this case by integrating with the messaging framework.
2019-04-02 04:02:57 +00:00