Commit Graph

103796 Commits

Author SHA1 Message Date
ae
3275dab818 * constify argument of in6_addrscope();
* use IN6_IS_ADDR_XXX() macro instead of hardcoded values;
* for multicast addresses just return scope value, the only exception
  is addresses with 0x0F scope value (RFC 4291 p2.7.0);

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-09-11 10:27:59 +00:00
rwatson
f1ff024818 Add a new M_START() mbuf macro that returns a pointer to the start of
an mbuf's storage (internal or external).

Add a new M_SIZE() mbuf macro that returns the size of an mbuf's
storage (internal or external).

These contrast with m_data and m_len, which are with respect to data
in the buffer, rather than the buffer itself.

Rewrite M_LEADINGSPACE() and M_TRAILINGSPACE() in terms of M_START()
and M_SIZE().

This is done as we currently have many instances of using mbuf flags
to generate pointers or lengths for internal storage in header and
regular mbufs, as well as to external storage. Rather than replicate
this logic throughout the network stack, centralising the
implementation will make it easier for us to refine mbuf storage.
This should also help reduce bugs by limiting the amount of
mbuf-type-specific pointer arithmetic.  Followup changes will
propagate use of the macros throughout the stack.

M_SIZE() conflicts with one macro in the Chelsio driver; rename that
macro in a slightly unsatisfying way to eliminate the collision.

MFC after:	3 days
Obtained from:	jeff (with enhancements)
Sponsored by:	EMC / Isilon Storage Division
Reviewed by:	bz, glebius, np
Differential Revision:	https://reviews.freebsd.org/D753
2014-09-11 07:16:15 +00:00
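
The relationship between M_START(), M_SIZE(), M_LEADINGSPACE() and M_TRAILINGSPACE() described in the commit above can be illustrated with a minimal sketch. It does not use the real FreeBSD `struct mbuf` or the real macros: `struct toy_mbuf`, `TOY_EXT` and every `toy_*` function are hypothetical stand-ins, assumed only for illustration. The point is that once "where does the storage start" and "how big is it" are centralised, the space calculations no longer need to inspect the flags themselves.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_EXT           0x1  /* hypothetical flag: storage is external */
#define TOY_INTERNAL_SIZE 64   /* hypothetical size of internal storage */

/* Simplified stand-in for an mbuf; NOT the real FreeBSD layout. */
struct toy_mbuf {
	int     flags;
	char   *data;      /* start of valid data (plays the role of m_data) */
	size_t  len;       /* length of valid data (plays the role of m_len) */
	char   *ext_buf;   /* external storage, valid only with TOY_EXT */
	size_t  ext_size;
	char    internal[TOY_INTERNAL_SIZE];
};

/* Start of the storage buffer, internal or external. */
char *toy_start(struct toy_mbuf *m)
{
	assert(m != NULL);
	return ((m->flags & TOY_EXT) ? m->ext_buf : m->internal);
}

/* Size of the storage buffer, internal or external. */
size_t toy_size(struct toy_mbuf *m)
{
	assert(m != NULL);
	return ((m->flags & TOY_EXT) ? m->ext_size : sizeof(m->internal));
}

/* Free bytes before the data, expressed only via toy_start(). */
size_t toy_leadingspace(struct toy_mbuf *m)
{
	return ((size_t)(m->data - toy_start(m)));
}

/* Free bytes after the data, via toy_start() and toy_size(). */
size_t toy_trailingspace(struct toy_mbuf *m)
{
	return ((size_t)((toy_start(m) + toy_size(m)) - (m->data + m->len)));
}
```

In this sketch, changing how external storage is represented would only require touching `toy_start()` and `toy_size()`, which is the kind of centralisation the commit message argues for.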
alc
6af056918d Update a stale comment. 2014-09-11 03:16:57 +00:00
jhb
fa53fae9e4 MFamd64: Use initializecpu() to set various model-specific registers on
AP startup and AP resume (it was already used for BSP startup and BSP
resume).
- Split code to do one-time probing of cache properties out of
  initializecpu() and into initializecpucache().  This is called once on
  the BSP during boot.
- Move enable_sse() into initializecpu().
- Call initializecpu() for AP startup instead of enable_sse() and
  manually frobbing MSR_EFER to enable PG_NX.
- Call initializecpu() when an AP resumes.  In theory this will now
  properly re-enable PG_NX in MSR_EFER when resuming a PAE kernel on
  APs.
2014-09-10 21:37:47 +00:00
jhb
6f8d6cd57b To work around an erratum on certain Pentium Pro CPUs, i386 disables
the local APIC in initializecpu() and re-enables it if the APIC code
decides to use the local APIC after all.  Rework this workaround
slightly so that initializecpu() won't re-disable the local APIC if
it is called after the APIC code re-enables the local APIC.
2014-09-10 21:25:54 +00:00
mav
a3e07f1c6a Extend UNMAP blacklist to all STEC SSD models.
None of the existing STEC devices needs UNMAP or even supports it well; they
have many limitations and sometimes even hang while executing those commands.
New devices that may use UNMAP are going to be released under the HGST name.

MFC after:	3 days
2014-09-10 21:24:15 +00:00
imp
ea3c3dd245 Add support for calling pcibios routines from the
bootloader. Implement the following routines:
	pcibios-device-count	count the number of instances of a devid
	pcibios-read-config	read pci config space
	pcibios-write-config	write pci config space
	pcibios-find-devclass	find the nth device with a given devclass
	pcibios-find-device	find the nth device with a given devid
	pcibios-locator		convert bus device function to pcibios locator
These commands are thin wrappers over their PCI BIOS 2.1 counterparts. More
information, such as it is, can be found in the standard.

Export a number of pcibios.X variables into the environment to report
what the PCI IDENTIFY command returned.

Also implement a new command line primitive (pci-device-count), but don't
include it by default just yet, since it depends on the recently added
words and any errors here can render a system unbootable.

This is intended to allow the boot loader to do special things based
on the hardware it finds. This could mean applying special settings that are
optimized for specific cards, or even loading special drivers. It
goes without saying that writing to pci config space should not be
done without a just cause and a sound mind.

Sponsored by:	Netflix
2014-09-10 21:07:00 +00:00
jhb
2a48d5d52c Move code to set various MSRs on AMD cpus out of printcpuinfo() and
into initializecpu() instead.
2014-09-10 21:04:44 +00:00
mav
73b5b7048f Add PCI ID for Promise TX8660 8-port 3Gbps HBA.
This device reports RAID subclass, but appears to be AHCI compatible.

Submitted by:	Yuri Perejilin <yuri@rivera.ru>
MFC after:	1 week
2014-09-10 19:53:31 +00:00
ae
1576b695b6 Add scope zone id to the in_endpoints and hc_metrics structures.
A non-global IPv6 address can be used in more than one zone of the same
scope. This zone index is used to identify to which zone a non-global
address belongs.

Also we can have many foreign hosts with equal non-global addresses,
but from different zones. So, they can have different metrics in the
host cache.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-09-10 16:26:18 +00:00
andrew
249f5cbad9 Unify interrupts bit definition and usage. While here remove PSR_C_bit.
Submitted by:	Svatopluk Kraus <onwahe at gmail.com>,
		Michal Meloun <meloun at miracle.cz>
Differential Revision: https://reviews.freebsd.org/D754
2014-09-10 15:25:15 +00:00
ae
c8498c6b7f Add additional checks for IPV6_PKTINFO handling (RFC 3542):
* Return ENETDOWN when interface specified by ipi6_ifindex is not
  enabled for IPv6 use.
* Return EADDRNOTAVAIL when ipi6_ifindex specifies an interface, but the
  address ipi6_addr is not available for use on that interface.
* Return EINVAL when ipi6_addr is a multicast address.

Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-09-10 14:32:07 +00:00
trasz
d8aaa0480a Make sure we handle less than zero timeouts in iSCSI initiator and target
in a reasonable way.

Sponsored by:	The FreeBSD Foundation
2014-09-10 14:04:10 +00:00
andrew
a363841a18 Add more register values to armreg.h and remove CPU_CONTROL_32BP_ENABLE
from asm.h as they were already defined in armreg.h.

Submitted by:	Michal Meloun <meloun at miracle.cz>
2014-09-10 13:38:52 +00:00
trasz
6cacb7cf77 Make it possible to disable NOP-In PDUs by the iSCSI initiator by setting
kern.cam.ctl.iscsi.ping_timeout to 0.  This fixes interoperability with
some initiators that don't properly support NOP-Ins, namely iPXE/gPXE.

Submitted by:	Chen Wen <pokkys@gmail.com>
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2014-09-10 13:34:27 +00:00
ae
9186ea6a59 Make in6_pcblookup_hash_locked and in6_pcbladdr static.
Obtained from:	Yandex LLC
Sponsored by:	Yandex LLC
2014-09-10 13:17:35 +00:00
glebius
5939c729a8 Remove unused arguments for VOP_GETPAGES(), VOP_PUTPAGES(). 2014-09-10 12:36:41 +00:00
ae
82d0b71937 Introduce INP6_PCBHASHKEY macro. Replace usage of hardcoded part of
IPv6 address as hash key in all places.

Obtained from:	Yandex LLC
2014-09-10 12:35:42 +00:00
ray
7c22eb535c Fix one more spelling mistake.
Pointed by:	danfe
2014-09-10 11:48:13 +00:00
ray
25f116b73a spelling fixes
Submitted by:	"Sam Fourman Jr." <sfourman@gmail.com>
MFC after:	1 week
2014-09-10 11:27:33 +00:00
ray
a95d8f6006 o Add sysctls to enable/disable potentially dangerous key combinations, like
reboot/halt/debug.
o Add support for most key combinations supported by syscons(4).

Reviewed by:	dumbbell, emaste (prev revision of D747)
MFC after:	5 days
Sponsored by:	The FreeBSD Foundation
2014-09-10 11:13:13 +00:00
andrew
ae8d0bb549 Move if_smc_fdt.c to live in sys/dev/smc. It's not specific to the ARM
Versatile hardware.
2014-09-10 10:59:17 +00:00
rwatson
da0f8310e7 Replace local copy-and-paste implementations of printmbuf() in several
device drivers with calls to the centralised m_print() implementation.
While the formatting and output details differ a little, the content
is essentially the same, and it is unlikely anyone has used this
debugging output in some time.

This change reduces awareness of mbuf cluster allocation (and,
especially, the M_EXT flag) outside of the mbuf allocator, which will
make it easier to refine the external storage mechanism without
disrupting drivers in the future.

Style bugs are preserved.

Reviewed by:	bz, glebius
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2014-09-10 09:57:32 +00:00
mav
645f3f53cf Make ctl_port_mask an array to support more than 32 ports.
Overflow reported by Coverity.

CID:		1229894
MFC after:	3 days
2014-09-10 07:16:17 +00:00
mav
20136fb37c Remove uninitialized and unused variable, reported by Coverity.
CID:		1230015
2014-09-10 07:00:36 +00:00
mav
4db7e51eb2 Fix array overrun, reported by Coverity.
CID:		1229970
2014-09-10 06:56:45 +00:00
mav
a209c226c5 Fix a couple of off-by-one range check errors, reported by Coverity.
CID:		1007837
2014-09-10 06:35:00 +00:00
mav
8633346319 Fix memory leak on error, reported by Coverity.
CID:		1007773
2014-09-10 06:29:31 +00:00
mav
aa30f2d2de Fix minor buffer overflow reported by Coverity.
CID:		1006781
2014-09-10 06:25:18 +00:00
alc
996a2e8d68 Fix a boundary case error in vm_reserv_alloc_contig(): If a reservation
isn't being allocated for the last of the requested pages, because a
reservation won't fit in the gap between allocated pages, then the
reservation structure shouldn't be initialized.

While I'm here, improve the nearby comments.

Reported by:	jeff, pho
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2014-09-10 05:52:30 +00:00
grehan
f30a88f9b1 Fix issue with nmdm and leading zeros in device name.
The nmdm code enforces a number between the 'nmdm' and 'A|B' portions
of the device name. This is then used as a unit number, and sprintf'd
back into the tty name. If leading zeros were used in the name,
the created device name is different than the string used for the
clone-open (e.g. /dev/nmdm0001A will result in /dev/nmdm1A).

Since unit numbers are no longer required with the updated tty
code, there seems to be no reason to force the string to be a
number. The fix is to allow an arbitrary string between
'nmdm' and 'A|B', within the constraints of devfs names. This allows
all existing users of numeric strings to continue to work, and also
allows more meaningful names to be used, such as bhyve VM names.

Tested on amd64, i386 and ppc64.

Reported by:	Dave Smith
PR:		192281
Reviewed by:	neel, glebius
Phabric:	D729
MFC after:	3 days
2014-09-10 05:44:15 +00:00
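
The leading-zero problem described in the nmdm commit above comes from parsing the unit as a number and then printing it back. The following is a minimal userland sketch of that round trip, not the nmdm driver code; `toy_rebuild_name()` is a hypothetical helper that assumes names of the form `nmdm<digits><A|B>`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Rebuild a device name by parsing the unit number and printing it back,
 * mimicking the old behaviour described in the commit message.
 * Returns 0 on success, -1 if the name is not "nmdm<digits><A|B>".
 */
int toy_rebuild_name(const char *name, char *out, size_t outlen)
{
	char *end;
	unsigned long unit;

	assert(name != NULL && out != NULL);
	if (strncmp(name, "nmdm", 4) != 0)
		return (-1);
	/* Require a digit so strtoul() cannot skip whitespace or a sign. */
	if (name[4] < '0' || name[4] > '9')
		return (-1);
	unit = strtoul(name + 4, &end, 10);
	if ((end[0] != 'A' && end[0] != 'B') || end[1] != '\0')
		return (-1);
	/* The numeric round trip silently drops any leading zeros. */
	snprintf(out, outlen, "nmdm%lu%c", unit, end[0]);
	return (0);
}
```

Passing `"nmdm0001A"` yields `"nmdm1A"`, which no longer matches the string used for the clone-open. Treating the middle portion as an opaque string, as the commit does, removes the round trip entirely.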
gjb
bd9dfb26f5 Bump __FreeBSD_version after SA-14:18
Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2014-09-10 00:19:33 +00:00
np
d63cc6f68a Whitespace nit.
MFC after:	1 week
2014-09-09 18:36:00 +00:00
trasz
76b4ee1f43 Avoid unlocking unlocked mutex in RCTL jail code. Specific test case
is attached to PR.

PR:		193457
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2014-09-09 16:05:33 +00:00
mav
d6b90c0b3c Report that DPO and FUA bits are supported after r271311. 2014-09-09 15:19:38 +00:00
mav
fbe6280d13 Oops, missed piece of r271311. 2014-09-09 14:20:55 +00:00
ray
4f2d1acf2c Revert r269474. Special keyboard combinations should be handled by separate
sysctls.
2014-09-09 14:18:56 +00:00
mav
abf46035ae Add support for Mode Page Policy (0x87) VPD page. 2014-09-09 14:09:51 +00:00
ian
b62cc0fb26 Rename new to newval in inline asm code, to avoid clashes with C++ new.
Also rename cmp to cmpval just to keep the asm variable names similar to
the C variable names.
2014-09-09 13:50:21 +00:00
mav
57ec24a023 Improve cache control support, including DPO/FUA flags and the mode page.
At this moment it works only for files and ZVOLs in device mode since BIOs
have no respective cache control flags (DPO/FUA).

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-09-09 11:38:29 +00:00
mav
7797473e53 Make ZVOL writes in device mode support IO_SYNC flag.
MFC after:	1 month
2014-09-09 11:29:55 +00:00
ae
7d73ba1804 Add the ability to set the `prefer_source' flag on an IPv6 address.
It affects the IPv6 source address selection algorithm (RFC 6724)
and allows overriding the last rule ("longest matching prefix") for
choosing among equivalent addresses. The address with `prefer_source'
will be the preferred source address.

Obtained from:	Yandex LLC
MFC after:	1 month
Sponsored by:	Yandex LLC
2014-09-09 10:52:50 +00:00
kevlo
f4235299d0 Drop frames that are larger than MCLBYTES. 2014-09-09 05:21:31 +00:00
adrian
d01f185e80 Add basic RSS awareness for the UDPv6 send path.
This doesn't include the same kind of userland overriding that the IPv4
path has; nor does it yet know about 2-tuple versus 4-tuple hashing.
That'll come later.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 04:20:53 +00:00
adrian
63bc177a08 Calculate the RSS hash for outbound UDPv4 frames.
Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 04:19:36 +00:00
adrian
b73995e058 Update the IPv4 input path to handle reassembled frames and incoming frames
with no RSS hash.

When doing RSS:

* Create a new IPv4 netisr which expects the frames to have been verified;
  it just directly dispatches to the IPv4 input path.
* Once IPv4 reassembly is done, re-calculate the RSS hash with the new
  IP and L3 header; then reinject it as appropriate.
* Update the IPv4 netisr to be a CPU affinity netisr with the RSS hash
  function (rss_soft_m2cpuid) - this will do a software hash if the
  hardware doesn't provide one.

NICs that don't implement hardware RSS hashing will now benefit from RSS
distribution - it'll inject into the correct destination netisr.

Note: the netisr distribution doesn't work out of the box - netisr doesn't
query RSS for how many CPUs and the affinity setup.  Yes, netisr likely
shouldn't really be doing CPU stuff anymore and should be "some kind of
'thing' that is a workqueue that may or may not have any CPU affinity";
that's for a later commit.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 04:18:20 +00:00
adrian
e5ddfb30ab Implement IPv4 RSS software hash functions to use during packet ingress
and egress.

* rss_mbuf_software_hash_v4 - look at the IPv4 mbuf to fetch the IPv4 details
  + direction to calculate a hash.
* rss_proto_software_hash_v4 - hash the given source/destination IPv4 address,
  port and direction.
* rss_soft_m2cpuid - map the given mbuf to an RSS CPU ("bucket" for now)

These functions are intended to be used by the stack to support
the following:

* Not all NICs do RSS hashing, so we should support some way of doing
  a hash in software;
* The NIC / driver may not hash frames the way we want (eg UDP 4-tuple
  hashing when the stack is only doing 2-tuple hashing for UDP); so we
  may need to re-hash frames;
* .. same with IPv4 fragments - they will need to be re-hashed after
  reassembly;
* .. and same with things like IP tunneling and such;
* The transmit path for things like UDP, RAW and ICMP don't currently
  have any RSS information attached to them - so they'll need an
  RSS calculation performed before transmit.

TODO:

* Counters! Everywhere!
* Add a debug mode that software hashes received frames and compares them
  to the hardware hash provided by the hardware to ensure they match.

The IPv6 part of this is missing - I'm going to do some re-juggling of
where various parts of the RSS framework live before I add the IPv6
code (read: the IPv6 code is going to go into netinet6/in6_rss.[ch],
rather than living here.)

Note: This API is still fluid.  Please keep that in mind.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 03:10:21 +00:00
adrian
e623d51cd5 Add support for receiving and setting flowtype, flowid and RSS bucket
information as part of recvmsg().

This is primarily used for debugging/verification of the various
processing paths in the IP, PCB and driver layers.

Unfortunately the current implementation of the control message path
results in a ~10% or so drop in UDP frame throughput when it's used.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 01:45:39 +00:00
adrian
4f769d2ecf Add IP_NODEFAULTFLOWID awareness to ip6_output().
Differential Revision:	https://reviews.freebsd.org/D527
2014-09-09 00:21:21 +00:00
adrian
aab402c146 Add a flag to ip_output() - IP_NODEFAULTFLOWID - which prevents it from
overriding an existing flowid/flowtype field in the outbound mbuf with
the inp_flowid/inp_flowtype details.

The upcoming RSS UDP support calculates a valid RSS value for outbound
mbufs and since it may change per send, it doesn't cache it in the inpcb.
So overriding it here would be wrong.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 00:19:02 +00:00
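
The effect of the IP_NODEFAULTFLOWID flag described in the commit above can be sketched as follows. This is not ip_output(): `struct toy_pcb`, `struct toy_pkt`, `TOY_NODEFAULTFLOWID` and `toy_output()` are hypothetical stand-ins, and the flag value is an assumption. The sketch only shows the decision the message describes: without the flag, the cached per-connection flowid overwrites whatever the packet carries; with it, a value computed per send is left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NODEFAULTFLOWID 0x1  /* hypothetical flag value */

/* Stand-in for the per-connection state holding a cached flowid. */
struct toy_pcb {
	uint32_t flowid;
};

/* Stand-in for an outbound packet carrying its own flowid. */
struct toy_pkt {
	uint32_t flowid;
};

/*
 * Apply the default flowid from the connection state unless the caller
 * asked to keep the value already present on the packet.
 */
void toy_output(struct toy_pkt *pkt, const struct toy_pcb *pcb, int flags)
{
	assert(pkt != NULL);
	if (pcb != NULL && (flags & TOY_NODEFAULTFLOWID) == 0)
		pkt->flowid = pcb->flowid;
}
```

A sender that hashes each datagram itself would pass the flag so that its per-packet value survives the output path.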