* use IN6_IS_ADDR_XXX() macro instead of hardcoded values;
* for multicast addresses just return scope value, the only exception
is addresses with 0x0F scope value (RFC 4291 p2.7.0);
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
an mbuf's storage (internal or external).
Add a new M_SIZE() mbuf macro that returns the size of an mbuf's
storage (internal or external).
These contrast with m_data and m_len, which are with respect to data
in the buffer, rather than the buffer itself.
Rewrite M_LEADINGSPACE() and M_TRAILINGSPACE() in terms of M_START()
and M_SIZE().
This is done as we currently have many instances of using mbuf flags
to generate pointers or lengths for internal storage in header and
regular mbufs, as well as to external storage. Rather than replicate
this logic throughout the network stack, centralising the
implementation will make it easier for us to refine mbuf storage.
This should also help reduce bugs by limiting the amount of
mbuf-type-specific pointer arithmetic. Followup changes will
propagate use of the macros throughout the stack.
M_SIZE() conflicts with one macro in the Chelsio driver; rename that
macro in a slightly unsatisfying way to eliminate the collision.
MFC after: 3 days
Obtained from: jeff (with enhancements)
Sponsored by: EMC / Isilon Storage Division
Reviewed by: bz, glebius, np
Differential Revision: https://reviews.freebsd.org/D753
AP startup and AP resume (it was already used for BSP startup and BSP
resume).
- Split code to do one-time probing of cache properties out of
initializecpu() and into initializecpucache(). This is called once on
the BSP during boot.
- Move enable_sse() into initializecpu().
- Call initializecpu() for AP startup instead of enable_sse() and
manually frobbing MSR_EFER to enable PG_NX.
- Call initializecpu() when an AP resumes. In theory this will now
properly re-enable PG_NX in MSR_EFER when resuming a PAE kernel on
APs.
the local APIC in initializecpu() and re-enables it if the APIC code
decides to use the local APIC after all. Rework this workaround
slightly so that initializecpu() won't re-disable the local APIC if
it is called after the APIC code re-enables the local APIC.
None of existing STEC devices need UNMAP or even support it well, having
many limitations and even hanging sometimes executing those commands.
New devices that may use UNMAP going to be released under HGST name.
MFC after: 3 days
bootloader. Implement the following routines:
pcibios-device-count count the number of instances of a devid
pcibios-read-config read pci config space
pcibios-write-config write pci config space
pcibios-find-devclass find the nth device with a given devclass
pcibios-find-device find the nth device with a given devid
pcibios-locator convert bus device function ti pcibios locator
These commands are thin wrappers over their PCI BIOS 2.1 counterparts. More
informaiton, such as it is, can be found in the standard.
Export a nunmber of pcibios.X variables into the environment to report
what the PCI IDENTIFY command returned.
Also implmenet a new command line primitive (pci-device-count), but don't
include it by default just yet, since it depends on the recently added
words and any errors here can render a system unbootable.
This is intended to allow the boot loader to do special things based
on the hardware it finds. This could be have special settings that are
optimized for the specific cards, or even loading special drivers. It
goes without saying that writing to pci config space should not be
done without a just cause and a sound mind.
Sponsored by: Netflix
A non-global IPv6 address can be used in more than one zone of the same
scope. This zone index is used to identify to which zone a non-global
address belongs.
Also we can have many foreign hosts with equal non-global addresses,
but from different zones. So, they can have different metrics in the
host cache.
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
* Return ENETDOWN when interface specified by ipi6_ifindex is not
enabled for IPv6 use.
* Return EADDRNOTAVAIL when ipi6_ifindex specifies an interface, but the
address ipi6_addr is not available for use on that interface.
* Return EINVAL when ipi6_addr is multicast address.
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
kern.cam.ctl.iscsi.ping_timeout to 0. This fixes interoperability with
some initiators that don't properly support NOP-Ins, namely iPXE/gPXE.
Submitted by: Chen Wen <pokkys@gmail.com>
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
reboot/halt/debug.
o Add support for most key combinations supported by syscons(4).
Reviewed by: dumbbell, emaste (prev revision of D747)
MFC after: 5 days
Sponsored by: The FreeBSD Foundation
device drivers with calls to the centralised m_print() implementation.
While the formatting and output details differ a little, the content
is essentially the same, and it is unlikely anyone has used this
debugging output in some time.
This change reduces awareness of mbuf cluster allocation (and,
especially, the M_EXT flag) outside of the mbuf allocator, which will
make it easier to refine the external storage mechanism without
disrupting drivers in the future.
Style bugs are preserved.
Reviewed by: bz, glebius
MFC after: 3 days
Sponsored by: EMC / Isilon Storage Division
isn't being allocated for the last of the requested pages, because a
reservation won't fit in the gap between allocated pages, then the
reservation structure shouldn't be initialized.
While I'm here, improve the nearby comments.
Reported by: jeff, pho
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
The nmdm code enforces a number between the 'nmdm' and 'A|B' portions
of the device name. This is then used as a unit number, and sprintf'd
back into the tty name. If leading zeros were used in the name,
the created device name is different than the string used for the
clone-open (e.g. /dev/nmdm0001A will result in /dev/nmdm1A).
Since unit numbers are no longer required with the updated tty
code, there seems to be no reason to force the string to be a
number. The fix is to allow an arbitrary string between
'nmdm' and 'A|B', within the constraints of devfs names. This allows
all existing user of numeric strings to continue to work, and also
allows more meaningful names to be used, such as bhyve VM names.
Tested on amd64, i386 and ppc64.
Reported by: Dave Smith
PR: 192281
Reviewed by: neel, glebius
Phabric: D729
MFC after: 3 days
At this moment it works only for files and ZVOLs in device mode since BIOs
have no respective respective cache control flags (DPO/FUA).
MFC after: 1 month
Sponsored by: iXsystems, Inc.
It affects the IPv6 source address selection algorithm (RFC 6724)
and allows override the last rule ("longest matching prefix") for
choosing among equivalent addresses. The address with `prefer_source'
will be preferred source address.
Obtained from: Yandex LLC
MFC after: 1 month
Sponsored by: Yandex LLC
This doesn't include the same kind of userland overriding that the IPv4
path has; nor does it yet know about 2-tuple versus 4-tuple hashing.
That'll come later.
Differential Revision: https://reviews.freebsd.org/D527
Reviewed by: grehan
with no RSS hash.
When doing RSS:
* Create a new IPv4 netisr which expects the frames to have been verified;
it just directly dispatches to the IPv4 input path.
* Once IPv4 reassembly is done, re-calculate the RSS hash with the new
IP and L3 header; then reinject it as appropriate.
* Update the IPv4 netisr to be a CPU affinity netisr with the RSS hash
function (rss_soft_m2cpuid) - this will do a software hash if the
hardware doesn't provide one.
NICs that don't implement hardware RSS hashing will now benefit from RSS
distribution - it'll inject into the correct destination netisr.
Note: the netisr distribution doesn't work out of the box - netisr doesn't
query RSS for how many CPUs and the affinity setup. Yes, netisr likely
shouldn't really be doing CPU stuff anymore and should be "some kind of
'thing' that is a workqueue that may or may not have any CPU affinity";
that's for a later commit.
Differential Revision: https://reviews.freebsd.org/D527
Reviewed by: grehan
and egress.
* rss_mbuf_software_hash_v4 - look at the IPv4 mbuf to fetch the IPv4 details
+ direction to calculate a hash.
* rss_proto_software_hash_v4 - hash the given source/destination IPv4 address,
port and direction.
* rss_soft_m2cpuid - map the given mbuf to an RSS CPU ("bucket" for now)
These functions are intended to be used by the stack to support
the following:
* Not all NICs do RSS hashing, so we should support some way of doing
a hash in software;
* The NIC / driver may not hash frames the way we want (eg UDP 4-tuple
hashing when the stack is only doing 2-tuple hashing for UDP); so we
may need to re-hash frames;
* .. same with IPv4 fragments - they will need to be re-hashed after
reassembly;
* .. and same with things like IP tunneling and such;
* The transmit path for things like UDP, RAW and ICMP don't currently
have any RSS information attached to them - so they'll need an
RSS calculation performed before transmit.
TODO:
* Counters! Everywhere!
* Add a debug mode that software hashes received frames and compares them
to the hardware hash provided by the hardware to ensure they match.
The IPv6 part of this is missing - I'm going to do some re-juggling of
where various parts of the RSS framework live before I add the IPv6
code (read: the IPv6 code is going to go into netinet6/in6_rss.[ch],
rather than living here.)
Note: This API is still fluid. Please keep that in mind.
Differential Revision: https://reviews.freebsd.org/D527
Reviewed by: grehan
information as part of recvmsg().
This is primarily used for debugging/verification of the various
processing paths in the IP, PCB and driver layers.
Unfortunately the current implementation of the control message path
results in a ~10% or so drop in UDP frame throughput when it's used.
Differential Revision: https://reviews.freebsd.org/D527
Reviewed by: grehan
overriding an existing flowid/flowtype field in the outbound mbuf with
the inp_flowid/inp_flowtype details.
The upcoming RSS UDP support calculates a valid RSS value for outbound
mbufs and since it may change per send, it doesn't cache it in the inpcb.
So overriding it here would be wrong.
Differential Revision: https://reviews.freebsd.org/D527
Reviewed by: grehan