It turns out the callers of vm_page_replace know exactly which page they are
replacing and would like to assert about it. Change those from hard panics to
KASSERTs, and provide them with a wrapper so they don't have to deal with
warnings from an INVARIANTS-dependent dead store of the return value of
vm_page_replace.
Submitted by: Ryan Libby <rlibby@gmail.com>
Reviewed by: alc, kib (earlier version)
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D4497
the #address-cells property set. For this we need to read more data before
the parent interrupt description.
this is only enabled on arm64 for now as it's not quite compliant with the
ePAPR spec. We should use a default of 2 where the #address-cells property
is missing, however this will need further testing across architectures.
Obtained from: ABT Systems Ltd
Sponsored by: SoftIron Inc
Differential Revision: https://reviews.freebsd.org/D4518
Rather than pushing all eight possible arguments into dtrace_probe()'s
stack frame, make the syscall_args struct for the current syscall available
via the current thread. Using a custom getargval method for the systrace
provider, this allows any syscall argument to be fetched, even in kernels
that have modified the maximum number of system call arguments.
Sponsored by: EMC / Isilon Storage Division
With the new VOP_GETPAGES() KPI the "count" argument counts pages already,
and doesn't need to be translated from bytes to pages.
While here make it consistent that *rbehind and *rahead are updated only
if we doesn't return error.
Pointy hat to: glebius
- Use SDT_PROBE<N>() instead of SDT_PROBE(). This has no functional effect
at the moment, but will be needed for some future changes.
- Don't hardcode the module component of the probe identifier. This is
set automatically by the SDT framework.
MFC after: 1 week
o With new KPI consumers can request contiguous ranges of pages, and
unlike before, all pages will be kept busied on return, like it was
done before with the 'reqpage' only. Now the reqpage goes away. With
new interface it is easier to implement code protected from race
conditions.
Such arrayed requests for now should be preceeded by a call to
vm_pager_haspage() to make sure that request is possible. This
could be improved later, making vm_pager_haspage() obsolete.
Strenghtening the promises on the business of the array of pages
allows us to remove such hacks as swp_pager_free_nrpage() and
vm_pager_free_nonreq().
o New KPI accepts two integer pointers that may optionally point at
values for read ahead and read behind, that a pager may do, if it
can. These pages are completely owned by pager, and not controlled
by the caller.
This shifts the UFS-specific readahead logic from vm_fault.c, which
should be file system agnostic, into vnode_pager.c. It also removes
one VOP_BMAP() request per hard fault.
Discussed with: kib, alc, jeff, scottl
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
pxeboot in tftp loader mode (when built with LOADER_TFTP_SUPPORT) now
prefix all the path to open with the path obtained via the option 'root-path'
directive.
This allows to be able to use the traditional content /boot out of box. Meaning
it now works pretty much like all other loaders. It simplifies hosting hosting
multiple version of FreeBSD on a tftp server.
As a consequence, pxeboot does not look anymore for a pxeboot.4th (which was
never provided)
Note: that pxeboot in tftp loader mode is not built by default.
Reviewed by: rpokala
Relnotes: yes
Sponsored by: Gandi.net
Differential Revision: https://reviews.freebsd.org/D4590
The EFI memory map may change before or during the first
ExitBootServices call. In that case ExitBootServices returns an error,
and GetMemoryMap and ExitBootServices must be retried.
Glue together calls to GetMemoryMap(), ExitBootServices() and storage of
(now up-to-date) MODINFOMD_EFI_MAP metadata within a single function.
That new function - bi_add_efi_data_and_exit() - uses space previously
allocated in bi_load_efi_data() to store the memory map (it will fail if
that space is too short). It handles re-calling GetMemoryMap() once to
update the map key if necessary. Finally, if ExitBootServices() is
successful, it stores the memory map and its header as MODINFOMD_EFI_MAP
metadata.
ExitBootServices() calls are now done earlier, from within arch-
independent bi_load() code.
PR: 202455
Submitted by: Ganael LAPLANCHE
Reviewed by: kib
MFC after: 2 weeks
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D4296
Before the change, things like lle state were queried via
SIOCGNBRINFO_IN6 by ndp(8) for _each_ lle entry in dump.
This ioctl was added in 1999, probably to avoid touching rtsock code.
This change maps SIOCGNBRINFO_IN6 data to standard rtsock dump the
following way:
expire (already) maps to rtm_rmx.rmx_expire
isrouter -> rtm_flags & RTF_GATEWAY
asked -> rtm_rmx.rmx_pksent
state -> rtm_rmx.rmx_state (maps to rmx_weight via define)
Reviewed by: ae
If source of ARP request didn't pass the routing check
(e.g. not in directly connected network), be polite and
still answer the request instead of dropping frame.
Reported by: quadro at irc@rusnet
buffer for each block number in the range with gbincore(), look up the
next instantiated buffer with the logical block number which is
greater or equal to the next lblkno. This significantly speeds up the
iteration for sparce-populated range.
Move the iteration into new helper bnoreuselist(), which is structured
similarly to flushbuflist().
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
the type stability of the buffers memory. Instead of memoizing
pointer to the next buffer and validating it, remember the next
logical block number in the bo list and re-lookup.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
to do is to clean up the timer handling using the async-drain.
Other optimizations may be coming to go with this. Whats here
will allow differnet tcp implementations (one included).
Reviewed by: jtl, hiren, transports
Sponsored by: Netflix Inc.
Differential Revision: D4055
When a multipath member is orphaned its access members are zeroed before its
removed if marked for wither, so prevent any future calls to g_access on
such members.
This prevents a panic on debug kernels which validates the resultant values
aren't negative.
Reviewed by: mav
MFC after: 2 weeks
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D4416
The intention was to just limit leading zeroes on numeric names. That
check is now improved to also catch the leading spaces and '+' that
strtoul can pass through.
PR: 204897
MFC after: 3 days
(1) The pmap argument passed to the function must be current pmap only.
(2) The process must be single threaded as the function is called either
when a process is exiting or from exec_new_vmspace().
Remove pmap_tlb_flush_ng() which is not used anywhere now.
Approved by: kib (mentor)
When using lagg failover mode neither Gratuitous ARP (IPv4) or Unsolicited
Neighbour Advertisements (IPv6) are sent to notify other nodes that the
address may have moved.
This results is slow failover, dropped packets and network outages for the
lagg interface when the primary link goes down.
We now use the new if_link_state_change_cond with the force param set to
allow lagg to force through link state changes and hence fire a
ifnet_link_event which are now monitored by rip and nd6.
Upon receiving these events each protocol trigger the relevant
notifications:
* inet4 => Gratuitous ARP
* inet6 => Unsolicited Neighbour Announce
This also fixes the carp IPv6 NA's that stopped working after r251584 which
added the ipv6_route__llma route.
The new behavour can be controlled using the sysctls:
* net.link.ether.inet.arp_on_link
* net.inet6.icmp6.nd6_on_link
Also removed unused param from lagg_port_state and added descriptions for the
sysctls while here.
PR: 156226
MFC after: 1 month
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D4111
in pmap_remove_pages().
Some points were considered:
(1) There is no range TLB flush cp15 function.
(2) There is no target selection for hardware TLB flush broadcasting.
(3) Some memory ranges could be mapped sparsely.
(4) Some memory ranges could be quite large.
Tested by buildworld on RPi2 and Jetson TK1, i.e. 4 core platforms.
It turned out that the buildworld time is faster. On the other hand,
when the postponed TLB flush was also removed from pmap_remove_pages(),
the result was worse. But pmap_remove_pages() is called for removing
all user mapping from a process, thus it's quite expected.
Note that the postponed TLB flushes came here from i386 pmap where
hardware TLB flush broadcasting is not available.
Approved by: kib (mentor)
This fixes an issue observed on Cortex A7 (RPi2) and on Cortex A15
(Jetson TK1) causing various memory corruptions. It turned out that
even L2 page table with no valid mapping might be a subject of such
caching.
Note that not all platforms have intermediate TLB caching implemented.
An open question is if this fix is sufficient for all platforms with
this feature.
Approved by: kib (mentor)
Without the patch, there is a race condition: when poll() is invoked(),
if kvp_globals.daemon_busy is false, the daemon won't be timely
woke up, because hv_kvp_send_msg_to_daemon() can't wake up the daemon
in this case.
Submitted by: Dexuan Cui <decui@microsoft.com>
Sponsored by: Microsoft OSTC
Reviewed by: delphij, royger
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D4258
Current code doesn't try to make use of the full page when bouncing because
the size is only expanded to be a multiple of the alignment. Instead try to
always create segments of PAGE_SIZE when using bounce pages.
This allows us to remove the specific casing done for
BUS_DMA_KEEP_PG_OFFSET, since the requirement is to make sure the offsets
into contiguous segments are aligned, and now this is done by default.
Sponsored by: Citrix Systems R&D
Reviewed by: hps, kib
Differential revision: https://reviews.freebsd.org/D4119
panics when unloading the dummynet and IPFW modules:
- The callout drain function can sleep and should not be called having
a non-sleepable lock locked. Remove locks around "ipfw_dyn_uninit(0)".
- Add a new "dn_gone" variable to prevent asynchronous restart of
dummynet callouts when unloading the dummynet kernel module.
- Call "dn_reschedule()" locked so that "dn_gone" can be set and
checked atomically with regard to starting a new callout.
Reviewed by: hiren
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D3855
This is a holdover from how reset is handled in the ARGE_MDIO world.
You need to define the mdio bus device if you want to use the ethernet
device or the arge setup path doesn't bring the MAC out of reset.
The new flag, -c <period>, sets the interrupt coalescing period in
microseconds through the new ioat(4) API ioat_set_interrupt_coalesce().
Also add a -z flag to zero ioat statistics before tests, to make it easy
to measure results.
Sponsored by: EMC / Isilon Storage Division
In I/OAT, this is done through the INTRDELAY register. On supported
platforms, this register can coalesce interrupts in a set period to
avoid excessive interrupt load for small descriptor workflows. The
period is configurable anywhere from 1 microsecond to 16.38
milliseconds, in microsecond granularity.
Sponsored by: EMC / Isilon Storage Division
directly into a loader (and thus kernel) env var, using the syntax
ubenv import ldvarname=ubvarname
Without the varname= prefix it uses the historical behavior of importing
to the name uboot.ubvarname.
Tested with RTL8188EU and RTL8188CUS in STA mode.
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4523
Certain interfaces (e.g. pfsync0) do not have ip6 addresses (in other words,
ifp->if_afdata[AF_INET6] is NULL). Ensure we don't panic when the MTU is
updated.
pfsync interfaces will never have ip6 support, because it's explicitly disabled
in in6_domifattach().
PR: 205194
Reviewed by: melifaro, hrs
Differential Revision: https://reviews.freebsd.org/D4522
sys/dev/mpr/mpr_sas_lsi.c
sys/dev/mps/mps_sas_lsi.c
When mp[rs]sas_get_sata_identify returns
MPI2_IOCSTATUS_SCSI_PROTOCOL_ERROR, don't bother retrying. Protocol
errors aren't likely to be fixed by sleeping.
Without this change, a system that generated may protocol errors due
to signal integrity issues was taking more than an hour to boot, due
to all the retries.
Reviewed by: slm
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D4553