Implement ffs_getpages_async(), which when possible calls the asynchronous
flavor of the generic pager's getpages function. When the underlying
block size is larger than the system page size, however, it will invoke
the (synchronous) buffer cache pager, followed by a call to the client
completion routine. This retains true asynchronous completion in the most
common (block size <= page size) case, which is important for the performance
of the new sendfile(2). The behavior in the larger block size case mirrors
the default implementation of VOP_GETPAGES_ASYNC, which most other
filesystems use anyway as they do not override the getpages_async method.
PR: 235708
Reported by: pho
Reviewed by: kib, glebius
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19340
r328169 removed the copy of bootinfo that would've made this somewhat
functional. However, this is irrelevant- earlier work in r292338 was done to
exit boot services in the MI bi_load() rather than having N copies of the
GetMemoryMap/ExitBootServices dance.
i386 never quite caught up to that; ldr_enter was still being called but
the prereq for that, ldr_bootinfo, was no longer. As a consequence, this
ExitBootServices() was being called with a mapkey=0, clearly bogus, and
reportedly breaking the boot in some instances.
Reported by: bcran
MFC after: 1 week
Eliminate trailing whitespace on inet, inet6, and groups lines. I think the
"list txpower" command will still show some, but I'm not able to test that.
PR: 153731
Reported-by: Nikolay Denev <ndenev@gmail.com>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19004
to go to the FS image we are making cannot be read (e.g. EPERM).
Current behaviour when we issue waring but still proceeed and
return success is definitely not correct: masking out error
condition as well as making a slighly inconsistent FS where
attempt to access the file in question ends up in EBADF. See
linked DR for details.
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D18584
shorter than its size resulting in a hole as its final block (which
is a violation of the invarients of the UFS filesystem).
Soft updates will always ensure that the file size is correct when
writing inodes to disk for files that contain only direct block
pointers. However soft updates does not roll back sizes for files
with indirect blocks that it has set to unallocated because their
contents have not yet been written to disk. Hence, the file can
appear to have a hole at its end because the block pointer has been
rolled back to zero when its inode was written to disk. Thus,
fsck_ffs calculates the last allocated block in the file. For files
that extend into indirect blocks, fsck_ffs checks for a size past
the last allocated block of the file and if that is found, shortens
the file to reference the last allocated block thus avoiding having
it reference a hole at its end.
Submitted by: Chuck Silvers <chs@netflix.com>
Tested by: Chuck Silvers <chs@netflix.com>
MFC after: 1 week
Sponsored by: Netflix
Split the rights-limiting code into two cases: if one of the input
files isn't a regular file, use caph_limit_stream(3) instead of
open-coding the same logic; if both input files are regular files,
and the initial attempts to map them succeed, we limit the rights on
those files to CAP_MMAP_R.
Add a regression test for PR 234885.
PR: 234885
Reviewed by: delphij
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19216
On platforms without a direct map (i.e., platforms without
UMA_MD_SMALL_ALLOC defined), the boundary tag allocator reserves a
number of tags for use when allocating a new slab of boundary tags,
as such platforms require free boundary tags in order to allocate
boundary tags. r327899 increased the number of boundary tags required
for a KVA allocation in the worst case, and the aforementioned
reservation was not updated accordingly. In some cases, this could
lead to a system hang. Fix the problem by increasing this reservation.
Also reduce KVA_QUANTUM on systems lacking superpage support.
The previous import quantum (4MB with a 4KB page size) was quite large
for systems with limited KVA, and fragmentation in kernel_arena could
cause kernel memory allocation failures even with a substantial amount
of free KVA.
Reported and tested by: jhibbits
Reviewed by: alc, kib
No objections: jeff
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19337
the tag wasn't being computed properly due to chaning a >= comparison
to an == comparison.
Specifically: CBC-MAC encodes the length of the authorization data
into the the stream to be encrypted/hashed. For short data, this is
two bytes (big-endian 16 bit value); for larger data, it's 6 bytes
(a prefix of 0xff, 0xfe, followed by a 32-bit big-endian length). And
there's a larger size, which is 10 bytes. These extra bytes weren't
being accounted for with the post-review code. The other bit that then came
into play was that OCF only calls the Update code with blksiz=16, which
meant that I had to ignore the length variable. (It also means that it
can't be called with a single buffer containing the AAD and payload;
however, OCF doesn't do this for the software-only algorithsm.)
I tested with this script:
ALG=aes-ccm
DEV=soft
for aad in 0 1 2 3 4 14 16 24 30 32 34 36 1020
do
for dln in 16 32 1024 2048 10240
do
echo "Testing AAD length ${aad} data length ${dln}"
/root/cryptocheck -A ${aad} -a ${ALG} -d ${DEV} ${dln}
done
done
Reviewed by: cem
Sponsored by: iXsystems Inc.
Instead of PRIVATELIB + NO_PIC. This avoids the need for the wlandebug
PIE special case added in r344211, and provides a stronger guarantee
against 3rd party software coming to depend on the API or ABI.
If / when we declare the API/ABI to be stable we can make it a normal
library.
Discussed with: bapt
Sponsored by: The FreeBSD Foundation
RockChip clocks have a write mask in the upper 16bits of the mux register
which wasn't set in the set_mux function.
Also the wrong parent was tested instead of the real current one, when
switch parent, test with the current one before.
Pointy Hat: manu
MFC after: 1 week
Both SpiFlash (mx25l) and DataFlash (at45d) drivers create a disk device
with a name of /dev/flash/spiN where N is the driver's unit number. If
both types of devices are present in the same system, this creates a fatal
conflict that prevents attachment of whichever device attaches second
(because mx25l0 and at45d0 both try to create a spi0).
This gives each type of device a unique name (mx25lN or at45dN respectively)
and also adds an alias of spiN for compatibility. When both device types
appear in the same system, only the first to attach gets the spiN alias.
When the second device attaches there is a non-fatal warning that the alias
can't be created, but both devices are still accessible via their primary
names (and there is no need for the spiN name to work for backwards
compatibility on such a system, because it has never been possible to use
the spiN names when both devices exist).
jedec ID as its older cousin the AT45DB642D, but uses a different page size.
The only way to distinguish between the two chips is that the 2D chip has
0 bytes of extended ID info and the new 1E has 1 byte of extended ID. The
actual value of the extended ID byte is all zeroes. In other words, it's
the presence of the extended info that identifies this chip. (Presumably
a future upgrade might define non-zero values for the extended ID byte.)
- Do not use nvf = 4 as it is not really supported by the firmware.
Firmwares 1.23.3.0 and above will ignore it silently.
- Increase PF4's share of the VIs and let it use all of the RSS table.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
to match a chip to our table of metadata describing the chips. At least one
new DataFlash chip has a 3-byte jedec ID identical to its predecessors and
differs only in the extended info, and it has different metadata requiring a
unique entry in the table. This paves the way for supporting such chips.
The metadata table now includes two new fields, extmask and extid. The two
bytes of extended info obtained from the chip are ANDed with extmask then
compared to extid, so it's possible to use only a subset of the extended
info in the matching.
We now always read 6 bytes of jedec ID info. Most chips don't return any
extended info, and the values read back for those two bytes may be
indeterminate, but such chips have extmask and extid values of 0x0000 in the
table, so the extid effectively doesn't participate in the matching on those
chips and it doesn't matter what they return in the extended info bytes.
This fixes a panic during configuration if the tx channel of a port
isn't the same as its port id.
Reported by: Fabrice Bruel
MFC after: 1 week
Sponsored by: Chelsio Communications
Update the diff to include other missing sysctl types found in sysctl.h.
Some of these sysctls are already documented in other pages (e.g counter(9)
and ZONE(9)), but they should at least be mentioned here for completeness.
This patch now documents all of the following:
- SYSCTL_BOOL/SYSCTL_ADD_BOOL
- SYSCTL_COUNTER_U64/SYSCTL_ADD_COUNTER_U64
- SYSCTL_COUNTER_U64_ARRAY/SYSCTL_ADD_COUNTER_U64_ARRAY
- SYSCTL_SBINTIME_MSEC/SYSCTL_ADD_SBINTIME_MSEC
- SYSCTL_SBINTIME_USEC/SYSCTL_ADD_SBINTIME_USEC
- SYSCTL_UMA_CUR/SYSCTL_ADD_UMA_CUR
- SYSCTL_UMA_MAX/SYSCTL_ADD_UMA_MAX
Submitted by: mhorne063_gmail.com
Reviewed by: bcr, hselasky
Approved by: bcr (doc), hselasky (src)
Approved by: krion (mentor, implicit), mat (mentor, implicit)
Differential Revision: https://reviews.freebsd.org/D19272
If an interrupt fires while writing the cmp entry we may have a partial
entry. Work around this by using atomic_cmpset to set the new index. If it
fails we need to set the previous index value and try again as the entry
may be in an inconsistent state.
This fixes messages similar to the following from syzkaller:
bad comp 224 type 2163727253
Reviewed by: tuexen
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D19287
in the delayed attach to use early returns, which allows reducing the level
of indentation. So all in all, what looks like a lot of changes is really
no change in behavior, mostly just moving whitespace around.
arprequest() is a void function and in case of error we simply
return without any feedback. In case of any local operation
or *if_output() failing no feedback is send up the stack for the
packet which triggered the arp request to be sent.
arpresolve_full() has three pre-canned possible errors returned
(if we have not yet sent enough arp requests or if we tried
often enough without success) otherwise "no error" is returned.
Make arprequest() an "internal" function arprequest_internal() which
does return a possible error to the caller. Preserve arprequest()
as a void wrapper function for external consumers.
In arpresolve_full() add an extra error checking. Use the
arprequest_internal() function and only return an error if non
of the three ones (mentioend above) are already set.
This will return possible errors all the way up the stack and
allows functions and programs to react on the send errors rather
than leaving them in the dark. Also they might get more detailed
feedback of why packets cannot be sent and they will receive it
quicker.
Reviewed by: karels, hselasky
Differential Revision: https://reviews.freebsd.org/D18904
[X86] Fix tls variable lowering issue with large code model
Summary:
The problem here is the lowering for tls variable. Below is the DAG
for the code. SelectionDAG has 11 nodes:
t0: ch = EntryToken
t8: i64,ch = load<(load 8 from `i8 addrspace(257)* null`,
addrspace 257)> t0, Constant:i64<0>, undef:i64
t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32*
@x> 0 [TF=10]
t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64
t12: i64 = add t8, t11
t4: i32,ch = load<(dereferenceable load 4 from @x)> t0, t12,
undef:i64
t6: ch = CopyToReg t0, Register:i32 %0, t4
And when mcmodel is large, below instruction can NOT be folded.
t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0
[TF=10]
t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64
So "t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64" is
lowered to " Morphed node: t11: i64,ch = MOV64rm<Mem:(load 8 from
got)> t10, TargetConstant:i8<1>, Register:i64 $noreg,
TargetConstant:i32<0>, Register:i32 $noreg, t0"
When llvm start to lower "t10: i64 = X86ISD::WrapperRIP
TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10]", it fails.
The patch is to fold the load and X86ISD::WrapperRIP.
Fixes PR26906
Patch by LuoYuanke
Reviewers: craig.topper, rnk, annita.zhang, wxiao3
Reviewed By: rnk
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58336
This should fix "fatal error: error in backend: Cannot select" messages
when compiling <ctype.h> functions using -mcmodel=large.
Reported by: phk
PR: 233143
MFC after: 3 days
The pipefail option allows checking the exit status of all commands in a
pipeline more easily, at a limited cost of complexity in sh itself. It works
similarly to the option in bash, ksh93 and mksh.
Like ksh93 and unlike bash and mksh, the state of the option is saved when a
pipeline is started. Therefore, even in the case of commands like
A | B &
a later change of the option does not change the exit status, the same way
(A | B) &
works.
Since SIGPIPE is not handled specially, more work in the script is required
for a proper exit status for pipelines containing commands such as head that
may terminate successfully without reading all input. This can be something
like
(
cmd1
r=$?
if [ "$r" -gt 128 ] && [ "$(kill -l "$r")" = PIPE ]; then
exit 0
else
exit "$r"
fi
) | head
PR: 224270
Relnotes: yes
A big security advantage of Wayland is not allowing applications to read
input devices all the time. Having /dev/input/* accessible to the user
account subverts this advantage.
libudev-devd was opening the evdev devices to detect their types (mouse,
keyboard, touchpad, etc). This don't work if /dev/input/* is inaccessible.
With the kernel exposing this information as sysctls (kern.evdev.input.*),
we can work w/o /dev/input/* access, preserving the Wayland security model.
Submitted by: Greg V <greg@unrelenting.technology>
Reviewed by: wulf, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18694
Because fetching a counter is a rather expansive function we should use
counter_u64_fetch() in pf_state_expires() only when necessary. A "rdr
pass" rule should not cause more effort than separate "rdr" and "pass"
rules. For rules with adaptive timeout values the call of
counter_u64_fetch() should be accepted, but otherwise not.
From the man page:
The adaptive timeout values can be defined both globally and for
each rule. When used on a per-rule basis, the values relate to the
number of states created by the rule, otherwise to the total number
of states.
This handling of adaptive timeouts is done in pf_state_expires(). The
calculation needs three values: start, end and states.
1. Normal rules "pass .." without adaptive setting meaning "start = 0"
runs in the else-section and therefore takes "start" and "end" from
the global default settings and sets "states" to pf_status.states
(= total number of states).
2. Special rules like
"pass .. keep state (adaptive.start 500 adaptive.end 1000)"
have start != 0, run in the if-section and take "start" and "end"
from the rule and set "states" to the number of states created by
their rule using counter_u64_fetch().
Thats all ok, but there is a third case without special handling in the
above code snippet:
3. All "rdr/nat pass .." statements use together the pf_default_rule.
Therefore we have "start != 0" in this case and we run the
if-section but we better should run the else-section in this case and
do not fetch the counter of the pf_default_rule but take the total
number of states.
Submitted by: Andreas Longwitz <longwitz@incore.de>
MFC after: 2 weeks
- Collapse original_sc and serializing_sc fields into one, since they
are never used simultanously, we have only one local I/O and one remote.
- Move remote_sglist and local_sglist fields into CTL_PRIV_BACKEND,
since they are used only on Originating SC in XFER mode, where requests
don't ever reach backends, so we can reuse backend's private storage.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
MTU if we've set it once and there were no changes on the DHCP server
side since the last refresh. This is consistent I believe with how dhclient
handles other settings like IP address, mask etc.
Approved by: cem, eugen
Differential Revision: https://reviews.freebsd.org/D18546
add gcov support and export results as files in debugfs
Reviewed by: hps@
MFC after: 1 week
Sponsored by: iX Systems
Differential Revision: https://reviews.freebsd.org/D19260
- offsets can be negative, loff_t needs to be signed, it also simplifies
interop with the rest of the code base to use off_t than the actual linux
definition "long long"
- don't rely on the defining "file" to "linux_file" in interface definitions
as that causes heartache with includes
Reviewed by: hps@
MFC after: 1 week
Sponsored by: iX Systems
Differential Revision: https://reviews.freebsd.org/D19274
for struct ip_mreq remains in place.
The struct ip_mreqn is Linux extension to classic BSD multicast API. It
has extra field allowing to specify the interface index explicitly. In
Linux it used as argument for IP_MULTICAST_IF and IP_ADD_MEMBERSHIP.
FreeBSD kernel also declares this structure and supports it as argument
to IP_MULTICAST_IF since r170613. So, we have structure declared but
not fully supported, this confused third party application configure
scripts.
Code handling IP_ADD_MEMBERSHIP was mixed together with code for
IP_ADD_SOURCE_MEMBERSHIP. Bringing legacy and new structure support
into the mess would made the "argument switcharoo" intolerable, so
code was separated into its own switch case clause.
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D19276
It was not only disabled for quite a while, but also appeared to be broken
at r325517, when maximum number of ports was made configurable.
MFC after: 1 week