In my eagerness to eliminate a branch which is taken once per 2^38
bytes of keystream, I forgot that the state words are in host order.
Thus, the counter increment code worked fine on little-endian
machines, but not on big-endian ones. Switch to a simpler (branchful)
solution.
The NFSv4 Setattr operation always has reply data even when it fails,
so don't set the ND_NOMOREDATA for it. This would only affect unusual
cases where Setattr fails and the RPC code wants to parse the rest of
the compound. Detected during recent development related to the pNFS server.
MFC after: 2 weeks
An NFSv4.1 client connection to a Data Server (DS) should not have a
backchannel. This patch fixes the NFSv4.1/pNFS client to not do a backchannel
for this case.
Found during recent testing with the pNFS server under development.
MFC after: 2 weeks
Implement FUSE open flag FOPEN_KEEP_CACHE. Without this flag, cached file
contents should be invalidated on open. Apparently, fusefs-encfs relies
upon this behavior.
PR: 218636
Submitted by: Ben RUBSON <ben.rubson at gmail.com>
The nfscl_mtofh() function didn't check for failed operations and, as such,
would have returned EBADRPC for these cases, due to parsing failure.
This patch adds checks, so that it returns with ND_NOMOREDATA set.
This is needed for future use in the pNFS server and acts as a safety
belt in the meantime.
MFC after: 2 weeks
The nfsuserd.8 man page stated that a usertimeout of 0 would disable
the cache timeout. This was simply not true, so this patch deletes
the sentence.
This is a content change.
PR: 217406
MFC after: 2 weeks
The default uid/gid for NFSv4 are set by the nfsuserd(8) daemon.
However, they were 0 until the nfsuserd(8) was run. Since it is
possible to use NFSv4 without running the nfsuserd(8) daemon, set them
to nobody/nogroup initially.
Without this patch, the values would be set by the nfsuserd(8) daemon
and left changed even if the nfsuserd(8) daemon was killed. The default
values of 0 meant that setting a group to "wheel" would fail even when
done by root.
It also adds a definition of GID_NOGROUP to sys/conf.h.
Discussed on: freebsd-current@
MFC after: 2 weeks
7386 zfs get does not work properly with bookmarks
illumos/illumos-gate@edb901aab9edb901aab9https://www.illumos.org/issues/7386
The zfs get command does not work with the bookmark parameter while it works
properly with both filesystem and snapshot:
# zfs get -t all -r creation rpool/test
NAME PROPERTY VALUE SOURCE
rpool/test creation Fri Sep 16 15:00 2016 -
rpool/test@snap creation Fri Sep 16 15:00 2016 -
rpool/test#bkmark creation Fri Sep 16 15:00 2016 -
# zfs get -t all -r creation rpool/test@snap
NAME PROPERTY VALUE SOURCE
rpool/test@snap creation Fri Sep 16 15:00 2016 -
# zfs get -t all -r creation rpool/test#bkmark
cannot open 'rpool/test#bkmark': invalid dataset name
#
The zfs get command should be modified to work properly with bookmarks too.
Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>
Make some use of reallocarray, attempting to limit it to cases where the
parameters are unsigned and there is some theoretical chance of overflow.
MFC afer: 2 weeks
Differential Revision: https://reviews.freebsd.org/D9980
remove the former.
All other EGA/VGA methods were already shared, with VGA-only features
mostly not used and no decisions in inner loops to optimize fof VGA,
but this method was split up because it is the only important one and
using VGA methods if possible is about twice as fast. The speed is
mostly not from splitting to reduce branches but from doing half as
many bus accesses, so make this easier to maintain by not splitting.
There is now 1 extra branch in an inner loop where it costs less than
1% of the bus access overhead on Haswell even if the compiler schedules
it poorly.
The GNU extension bits in the base system are old, no longer faithful
to upstream, and surprising in some regards. Switch to documenting
WITH_GNU_GREP_COMPAT and default GNU_GREP_COMPAT to OFF in the name of
good behavior.
According to http://www.regular-expressions.info, GNU extensions:
- Add missing quantifiers to BREs: \?, \+
- Add branching to BREs: \|
- Add backreferences (\1 through \9) to EREs
- Add \w, \W, \s, and \S corresponding to :alnum:, [^[:alnum:]],
:space:, and [^[:space:]] respectively
- Add word boundaries and anchors:
\b: word boundary
\B: not word boundary
\<: Strt of word
\>: End of word
\`: Start of subject string
\': End of subject string
These extensions are still available in /usr/bin/grep by default today,
as it is still GNU grep. As part of the bsdgrep migration plan these
extensions may be added to bsdgrep's regex support if necessary.
Submitted by: Kyle Evans <kevans91 at ksu.edu>
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D10114
Bugs have been found in the fastmatch implementation as used in bsdgrep.
Some have been fixed (r316495) while fixes for others are in review
(D10098).
In comparison with the fastmatch implementation, Kyle Evans found that:
- regex(3)'s performance with literal expressions offers a speed
improvement over fastmatch
- regex(3)'s performance, both with simple BREs and EREs, seems to be
comparable
The regex implementation was imported in r226035, and the commit message
reports:
This is a temporary solution until the whole regex library is
not replaced so that BSD grep development can continue and the
backported code gets some review and testing. This change only
improves scalability slightly, there is no big performance boost
yet but several minor bugs have been found and fixed.
Introduce a WITH_/WITHOUT_BSD_GREP_FASTMATCH knob to support testing
of both approaches.
PR: 175314, 194823
Submitted by: Kyle Evans <kevans91 at ksu.edu>
Reviewed by: bdrewery (in part)
Differential Revision: https://reviews.freebsd.org/D10282
After r307655 MK_GDB is forced to no if MK_BINUTILS is no, and similarly
MK_GROFF is forced to no if MK_CXX is no, so we can remove nested
conditionals.
Reviewed by: bapt, brooks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8287
Before this change it was impossible to set number of PKCS#5v2 iterations,
required to set passphrase, if it has two keys and never had any passphrase.
Due to present metadata format limitations there are still cases when number
of iterations can not be changed, but now it works in cases when it can.
PR: 218512
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D10338
vga planar method (for testing that was supposed to be local that the
former still works). The ega method works on vga but is about twice
as slow. The vga method doesn't work on ega.
Optimize the main vga planar method a little. For changing the
background color (which was otherwise optimized better than most
things), don't switch the write mode from 3 to 0 just to select
the pixel mask of 0xff obscurely by writing 0. Just write 0xff
directly.
-(SYNCOOKIE_LIFETIME + 1) instead of INT64_MIN, since it is
good enough and works when time_t is int32 or int64.
This fixes the issue reported by cy@ on i386.
Reported by: cy
MFC after: 1 week
Sponsored by: Netflix, Inc.
The default uid/gid for NFSv4 are set by the nfsuserd(8) daemon.
However, they were 0 until the nfsuserd(8) was run. Since it is
possible to use NFSv4 without running the nfsuserd(8) daemon, set them
to nobody/nogroup initially.
Without this patch, the values would be set by the nfsuserd(8) daemon
and left changed even if the nfsuserd(8) daemon was killed. Also, the default
values of 0 meant that setting a group to "wheel" would fail even when
done by root and this patch fixes this issue.
MFC after: 2 weeks
7490 real checksum errors are silenced when zinject is on
illumos/illumos-gate@6cedfc397d6cedfc397dhttps://www.illumos.org/issues/7490
When zinject is on, error codes from zfs_checksum_error() can be overwritten
due to an incorrect and overly-complex if condition.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
7448 ZFS doesn't notice when disk vdevs have no write cache
illumos/illumos-gate@295438ba32295438ba32https://www.illumos.org/issues/7448
I built a SmartOS image with all the NVMe commits including 7372
(support NVMe volatile write cache) and repeated my dd testing:
> #!/bin/bash
> for i in `seq 1 1000`; do
> dd if=/dev/zero of=file00 bs=1M count=102400 oflag=sync &
> dd if=/dev/zero of=file01 bs=1M count=102400 oflag=sync &
> wait
> rm file00 file01
> done
>
Previously each dd command took ~145 seconds to finish, now it takes
~400 seconds.
Eventually I figured out it is 7372 that causes unnecessary
nvme_bd_sync() executions which wasted CPU cycles.
If a NVMe device doesn't support a write cache, the nvme_bd_sync function will
return ENOTSUP to indicate this to upper layers.
It seems this returned value is ignored by ZFS, and as such this bug is not
really specific to NVMe. In vdev_disk_io_start() ZFS sends the flush to the
disk driver (blkdev) with a callback to vdev_disk_ioctl_done(). As nvme filled
in the bd_sync_cache function pointer, blkdev will not return ENOTSUP, as the
nvme driver in general does support cache flush. Instead it will issue an
asynchronous flush to nvme and immediately return 0, and hence ZFS will not set
vdev_nowritecache here. The nvme driver will at some point process the cache
flush command, and if there is no write cache on the device it will return
ENOTSUP, which will be delivered to the vdev_disk_ioctl_done() callback. This
function will not check the error code and not set nowritecache.
The right place to check the error code from the cache flush is in
zio_vdev_io_assess(). This would catch both cases, synchronous and asynchronous
cache flushes. This would also be independent of the implementation detail that
some drivers can return ENOTSUP immediately.
Reviewed by: Dan Fields <dan.fields@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Obtained from: Illumos
The FreeBSD NFSv4 server did not set the attribute bit for TimeAccess in
the reply to an Open with exclusive_create, as required by the RFCs.
(This is required since the FreeBSD NFS server stores the create_verifier
in the va_atime attribute.)
As such, the Linux NFSv4 client did not set the TimeAccess (atime) in
the Setattr done in an RPC after the one with the Open/exclusive_create.
This patch fixes the server to set the TimeAccess bit in the reply.
I believe that storing the create_verifier in an extended attribute for
file systems that support extended attributes might be a good idea,
but I will wait for a discussion of this on the freebsd-fs@ email list
before considering committing a patch to do this.
Reported by: jim@ks.uiuc.edu
Suggested by: dfr
MFC after: 2 weeks
7430 Backfill metadnode more intelligently
illumos/illumos-gate@af346df588af346df588https://www.illumos.org/issues/7430
Description and patch from brought over from the following ZoL commit: https://
github.com/zfsonlinux/zfs/commit/68cbd56e182ab949f58d004778d463aeb3f595c6
Only attempt to backfill lower metadnode object numbers if at least
4096 objects have been freed since the last rescan, and at most once
per transaction group. This avoids a pathology in dmu_object_alloc()
that caused O(N^2) behavior for create-heavy workloads and
substantially improves object creation rates. As summarized by
@mahrens in #4636:
"Normally, the object allocator simply checks to see if the next
object is available. The slow calls happened when dmu_object_alloc()
checks to see if it can backfill lower object numbers. This happens
every time we move on to a new L1 indirect block (i.e. every 32 *
128 = 4096 objects). When re-checking lower object numbers, we use
the on-disk fill count (blkptr_t:blk_fill) to quickly skip over
indirect blocks that don?t have enough free dnodes (defined as an L2
with at least 393,216 of 524,288 dnodes free). Therefore, we may
find that a block of dnodes has a low (or zero) fill count, and yet
we can?t allocate any of its dnodes, because they've been allocated
in memory but not yet written to disk. In this case we have to hold
each of the dnodes and then notice that it has been allocated in
memory.
The end result is that allocating N objects in the same TXG can
require CPU usage proportional to N^2."
Add a tunable dmu_rescan_dnode_threshold to define the number of
objects that must be freed before a rescan is performed. Don't bother
to export this as a module option because testing doesn't show a
compelling reason to change it. The vast majority of the performance
gain comes from limit the rescan to at most once per TXG.
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Ned Bass <bass6@llnl.gov>
Obtained from: Illumos
properly anyway. (Upstream has reorganized this somewhat in the mean
time, but for proper backtraces we would need llvm-symbolizer in base.)
MFC after: 3 days
overflows, syncookies are used.
This patch restricts the usage of syncookies in this case: accept
syncookies only if there was an overflow of the syncache recently.
This mitigates a problem reported in PR217637, where is syncookie was
accepted without any recent drops.
Thanks to glebius@ for suggesting an improvement.
PR: 217637
Reviewed by: gnn, glebius
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D10272
I inadvertedly soubled the size of the memset without noticing the
start address had changed. The size for the memset in pt_map_thread()
shouldn't actually match the reallocarray() so undo that part of r317200.
This is a re-commit of r317201 to clarify the log.
X-MFC with: r317200
Lengths are not negative, so map_len should be unsigned. Unsign the
corresponding indexes too and bring a small use of reallocarray(3).
Reorder the memset to be consistent with the realloc: it appears we
were only clearing half the memory in pt_map_thread().
MFC after: 2 weeks
a pointer to the main ega drawing method which is misoptimized be in
a different function than the main vga planar mode drawing method.
Vga initialization handles everything with no extra code except for
selecting the different function.
corresponding to the gaps between characters. This fixes distortion
of the cursor due to expanding it across the gaps.
Again for character width 9, when the cursor characters are not in the
graphics range (0xb0-0xdf), the gaps were always there (filled in the
background color for the previous char). They still look strange, but
don't cause distortion. When the cursor characters are in the graphics
range, the gaps are filled by repeating the previous line. This gives
distortion with cilia. Removing vertical lines reduces the distortion
to vertical cilia.
Move the default for the cursor characters out of the graphics range.
With character width 9, this gives gaps instead of distortion and
other problems. With character width 8, it just fixes a smaller set
of other problems. Some distortion and other problems can be recovered
using vidcontrol -M. Presumably the default was to fill the gaps
intentionally, but it is much better to leave gaps. The gaps can even
be considered as a feature for text processing -- they give sub-pointers
to character boundaries. The other problems are: (1) with character
width 9, characters near the cursor are moved into the graphics range
and thus distorted if any of their 8th bits is set; (2) conflicts with
national characters in the graphics range.
The default range for the graphics cursor characters is now 8-11. This
doesn't conflict with anything, since the glyphs for the characters in
this range are unreachable.
Use the 10x16 mouse cursor in text mode too (if the font size is >= 14).
When the character width is 9, removal of 1 or 2 vertical lines makes
10x16 cursor no wider than the 9x13 one usually was. We could even
handle cursors 1 pixel wider in 2 character cells and gaps without
more clipping than given by the gaps (the worst case is 1 pixel in the
left cell, 1 removed in the middle gap, 8 in the right cell and 1
removed in the right gap. The pixel in the right gap is removed so
it doesn't matter if it is in the font).
When the character width is 8, we now clip the 10-wide cursor by 1
pixel in the worst case. This clipping is usually invisible since it
is of the border and and the border usually merges with the background
so is invisible. There should be an option to use reverse video to
highlight the border and its tip instead of the interior (graphics
mode can do better using separate colors). This needs the 9x13 cursor
again.
Ideas from: ache (especially about the bad default character range)
Note that KVA mapping of the framebuffer already uses write-combining
mode, so the change, besides improving speed of user mode writes, also
satisfies requirement of the IA32 architecture of using consistent
caching modes for multiple mappings of the same page.
Reported and tested by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week