When deleting files on filesystems that are stored on flash-memory
(solid-state) disk drives, the filesystem notifies the underlying
disk of the blocks that it is no longer using. The notification
allows the drive to avoid saving these blocks when it needs to
flash (zero out) one of its flash pages. These notifications of
no-longer-being-used blocks are referred to as TRIM notifications.
In FreeBSD these TRIM notifications are sent from the filesystem
to the drive using the BIO_DELETE command.
Until now, the filesystem would send a separate message to the drive
for each block of the file that was deleted. Each Gigabyte of file
size resulted in over 3000 TRIM messages being sent to the drive.
This burst of messages can overwhelm the drive's task queue causing
multiple second delays for read and write requests.
This implementation collects runs of contiguous blocks in the file
and then consolodates them into a single BIO_DELETE command to the
drive. The BIO_DELETE command describes the run of blocks as a
single large block being deleted. Each Gigabyte of file size can
result in as few as two BIO_DELETE commands and is typically less
than ten. Though these larger BIO_DELETE commands take longer to
run, they do not clog the drive task queue, so read and write
commands can intersperse effectively with them.
Though this new feature has been throughly reviewed and tested, it
is being added disabled by default so as to minimize the possibility
of disrupting the upcoming 12.0 release. It can be enabled by running
``sysctl vfs.ffs.dotrimcons=1''. Users are encouraged to test it.
If no problems arise, we will consider requesting that it be enabled
by default for 12.0.
Reviewed by: kib
Tested by: Peter Holm
Sponsored by: Netflix
The support for lazy pmap invalidations on i386 was removed in r281707.
This removes the constant for the IPI and stops accounting for it when
sizing the interrupt count arrays.
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16801
The TCP client side or the TCP server side when not using SYN-cookies
used the uptime as the TCP timestamp value. This patch uses in all
cases an offset, which is the result of a keyed hash function taking
the source and destination addresses and port numbers into account.
The keyed hash function is the same a used for the initial TSN.
Reviewed by: rrs@
MFC after: 1 month
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D16636
Mention abort_handler_s(3) and ignore_handler_s(3), provide
cross-reference from memset(3).
Submitted by: Yuri Pankov <yuripv@yuripv.net>
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D16797
After years in the making, lualoader is ready to make its debut. Both
flavors of loader are still built by default, and may be installed as
/boot/loader or /boot/loader.efi as appropriate either by manually creating
hard links or using LOADER_DEFAULT_INTERP as documented in build(7).
Discussed with: imp
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D16795
becomes -1, except these are unsigned integers, so they become very large
numbers. Thus are always larger than the maximum bucket; the hash table
insertion fails causing NAT to fail.
This commit ensures that if the index is already zero it is not reduced
prior to insertion into the hash table.
PR: 208566
are situated next to error counters and/or in one instance prior to the
-1 return from various functions. This was useful in diagnosis of
PR/208566 and will be handy in the future diagnosing NAT failures.
PR: 208566
MFC after: 3 days
Instead of doing a second pass to skip empty lines if we've specified -I, go
ahead and check both at once. Ignore critera has been split out into its own
function to try and keep the logic cleaner.
As noted by cem in r338035, coccinelle invokes diff(1) with the -B flag.
This was not previously implemented here, so one was forced to create a link
for GNU diff to /usr/local/bin/diff
Implement the -B flag and add some primitive tests for it. It is implemented
in the same fashion that -I is implemented; each chunk's lines are scanned,
and if a non-blank line is encountered then the chunk will be output.
Otherwise, it's skipped.
MFC after: 2 weeks
Without this patch, some CS4614 cards will need users to reload the driver manually or
the hardware won't be initialised properly. Something like:
# kldload snd_csa
# kldunload snd_csa
# kldload snd_csa
Tested with: Terratec SiXPack 5.1+
I was not aware Warner was making or planning to make forward progress in
this area and have since been informed of that.
It's easy to apply/reapply when churn dies down.
Inspired by r338025, just remove the element size parameter to the
MODULE_PNP_INFO macro entirely. The 'table' parameter is now required to
have correct pointer (or array) type. Since all invocations of the macro
already had this property and the emitted PNP data continues to include the
element size, there is no functional change.
Mostly done with the coccinelle 'spatch' tool:
$ cat modpnpsize0.cocci
@normaltables@
identifier b,c;
expression a,d,e;
declarer MODULE_PNP_INFO;
@@
MODULE_PNP_INFO(a,b,c,d,
-sizeof(d[0]),
e);
@singletons@
identifier b,c,d;
expression a;
declarer MODULE_PNP_INFO;
@@
MODULE_PNP_INFO(a,b,c,&d,
-sizeof(d),
1);
$ rg -l MODULE_PNP_INFO -- sys | \
xargs spatch --in-place --sp-file modpnpsize0.cocci
(Note that coccinelle invokes diff(1) via a PATH search and expects diff to
tolerate the -B flag, which BSD diff does not. So I had to link gdiff into
PATH as diff to use spatch.)
Tinderbox'd (-DMAKE_JUST_KERNELS).
driven by problems found with the algorithms being tested for TRIM
consolodation.
Reported by: Peter Holm
Suggested by: kib
Reviewed by: kib
Sponsored by: Netflix
These aliases are supported and documented in the man page. For now, they
will not be mentioned in the error when an invalid argument is encountered,
instead keeping that list to the shorter 'preferred' names of each argument.
Reported by: rgrimes
instead of the size of the whole bge_devs array.
This should stop kldxref searching beyond the end of .rodata when it
processes relocations, and emitting "unhandled relocation type" errors,
at least on i386.
This is very primitive code to inspect the PCI error state and AER
error state, dump the log and clear errors, from ddb.
pci_print_faulted_dev() is made external to allow calling it from
other places. It was called from NMI handler but this chunk is not
included.
Also there is a tunable-controlled code to clear AER on device attach,
disabled by default.
All this code was useful to me when I debugged ACPI_DMAR failures (not
faults) long time ago.
Reviewed by: cem, imp (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D7813
- In configurations with a pseudo devices section, move 'device crypto'
into that section.
- Use a consistent comment. Note that other things common in kernel
configs such as GELI also require 'device crypto', not just IPSEC.
Reviewed by: rgrimes, cem, imp
Differential Revision: https://reviews.freebsd.org/D16775
Compiling FreeBSD/i386 with modern GCC triggers warnings for various
places that convert 64-bit EFI_ADDRs to pointers and vice versa.
- Cast pointers to uintptr_t rather than to uint64_t when assigning
to a 64-bit integer.
- Cast 64-bit integers to uintptr_t before a cast to a pointer.
Reviewed by: kevans
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16586
The fallback logic was broken if hints were found in multiple environments.
If we found a hint in either the loader environment or the static
environment, fallback would be incremented excessively when we returned to
the environment-selection bits. These checks should have also been guarded
by the fbacklvl checks. As a result, fbacklvl could quickly get to a point
where we skip either the static environment and/or the static hints
depending on which environments contained valid hints.
The impact of this bug is minimal, mostly affecting mips boards that use
static hints and may have hints in either the loader environment or the
static environment.
There may be better ways to express the searchable environments and
describing their characteristics (immutable, already searched, etc.) but
this may be revisited after 12 branches.
Reported by: Dan Nelson <dnelson_1901@yahoo.com>
Triaged by: Dan Nelson <dnelson_1901@yahoo.com>
MFC after: 3 days
When coding the pNFS server, I added vn_start_write() calls in nfsrv_copymr()
done while the vnodes were locked, not realizing I had introduced LORs and
possible deadlock when an exported file system on the MDS is suspended.
This patch fixes the LORs by moving the vn_start_write() calls up to before
where the vnodes are locked. For "tvp", the vn_start_write() probaby isn't
necessary, because NFS mounts can't be suspended. However, I think doing
so is harmless.
Thanks go to kib@ for letting me know that I had introduced these LORs.
This patch only affects the behaviour of the pNFS server when pnfsdscopymr(8)
is used to recover a mirrored DS.
The domain and flags parameters suffice. In fact, the related functions
kmem_alloc_{attr,contig}_domain() don't have an arena parameter.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D16713
* correctly prepare a buffer to obtain interface description from a kernel and
truncate long description instead of dropping it altogether and
spamming logs;
* skip calling strlen() for each description and each SNMP request
for MIB-II/ifXTable's ifAlias.
* teach bsnmpd to allocate memory dynamically for interface descriptions
to decrease memory usage for common case and not to break
if long description occurs;
PR: 217763
Reviewed by: harti and others
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D16459
Some background: in the GSoC project, libbe/Makefile lived in lib/libbe. I
created projects/bectl branch, maintained the above for all of five
minutes before I misread Makefile.inc1 and decided that it couldn't possibly
build outside of cddl/, so I kicked the Makefile out into the cddl/ build
and all was good. The misreading was of the bit where .WAIT is added to
SUBDIR after lib, libexec but prior to building bin and cddl *only during
the install targets*, which is the critical part.
Fast forward- buildworld was still broken in my branch unbeknownst to me
because I didn't nuke my OBJDIR. Combing through Makefile.inc1 eventually
revealed the necessary magic to make sure that libbe's dependencies are
specified well enough, and it becomes clear what needs done to make a
non-cddl/ build work. This is an interesting prospect, because the build
split is kind of annoying to work with.
IGNORE_PRAGMA is added to avoid dropping WARNS by one more. This was
previously pulled in via cddl/Makefile.inc.
Instead of always running /bin/sh, allow the user to specify the command
to run. The jail is not removed when the command finishes. Meaning,
`bectl unjail` will still need to be run.
For example:
```
bectl jail newBE pkg upgrade
bectl ujail newBE
```
Submitted by: Shawn Webb
Obtained from: HardenedBSD (8b451014ab)
This basically adds makes use of the C99 restrict keyword, and also
adds some 'const's to four threading functions: pthread_mutexattr_gettype(),
pthread_mutexattr_getprioceiling(), pthread_mutexattr_getprotocol(), and
pthread_mutex_getprioceiling. The changes are in accordance to POSIX/SUSv4-2018.
Hinted by: DragonFlyBSD
Relnotes: yes
MFC after: 1 month
Differential Revision: D16722
In the create-world-packages target we manually piece this together (unless
it is undefined), without the DISTDIR. Normally DISTDIR is empty (unset) and
no one notices. Now DISTDIR is a well known long-standing PORTS environment
variable and if that is set in the local environment the path to METALOG
is wrong as it no longer is ${DESTDIR}/METALOG.
Long-term we should start to avoid "publicly well known" names for global
variables, for now just piece ${DISTDIR} in as well. This allows
create-world-packages to continue if DISTDIR is set in the env.
When coding the pNFS server, I added several vn_start_write() calls done
while the vnode was locked, not realizing I had introduced LORs and
possible deadlock when an exported file system on the MDS is suspended.
This patch fixes this by removing the added vn_start_write() calls and
modifying the code so that the extant vn_start_write() call before the
NFS RPC/operation is done when needed by the pNFS server.
Flags are changed so that LayoutCommit and LayoutReturn now get a
vn_start_write() done for them.
When the pNFS server is enabled, the code now also changes the flags for
Getattr, so that the vn_start_write() is done for Getattr, since it may
need to do a vn_set_extattr(). The nfs_writerpc flag array was made global
to the NFS server and renamed nfsrv_writerpc, which is consistent naming
for globals in the NFS server.
Thanks go to kib@ for reporting that doing vn_start_write() while the vnode is
locked results in a LOR.
This patch only affects the behaviour of the pNFS server.