Commit Graph

117952 Commits

Author SHA1 Message Date
John Baldwin
319e8f3e36 Fix a typo. 2017-08-11 22:47:32 +00:00
Ed Maste
67d4951ddf arm: enable ARM_MANY_BOARD in NOTES for LINT build
Added in r238189, ARM_MANY_BOARD adds support for multiple ARM boards in
a single kernel. Include it for LINT builds to avoid duplicate symbol
errors when linking with lld.

Sponsored by:	The FreeBSD Foundation
2017-08-11 19:49:29 +00:00
Mark Johnston
d055a46cda Bump KERNELDUMP_BUFFER_SIZE to 4096.
The encrypted kernel dump code writes data in blocks of this size. A buffer
size of 4096 allows encrypted dumps to work with 4Kn drives.

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11870
2017-08-11 19:24:08 +00:00
Ian Lepore
c82d887d47 Stop calling atrtc_set() from the xen timer clock_settime() method. That
removes the only reference to atrtc_set() from outside of atrtc.c, so make
it static.

The xen timer driver registers as a realtime clock with 1us resolution.  In
the past that resulted in only the xen timer's clock_settime() getting
called, so it would call atrtc_set() to set the hardware clock as well.  As
of r32090, the clock_settime() method of all registered realtime clocks gets
called, so the xen driver no longer needs to chain-call the lower-resolution
driver.

Thanks to royger@ for talking me through the xen stuff, and for testing.
2017-08-11 19:02:11 +00:00
Ed Maste
9432a9bd9f Rename at91_pmc's M_PMC malloc type to avoid duplicate definition
M_PMC is defined in sys/dev/hwpmc/hwpmc_mod.c, and the LINT kernel build
fails when linking with lld due to a duplicate symbol error.

Sponsored by:	The FreeBSD Foundation
2017-08-11 18:09:26 +00:00
David C Somayajulu
45f1312387 Performance enhancements to reduce CPU utililization for large number of
TCP connections (order of tens of thousands), with predominantly Transmits.

Choice to perform receive operations either in IThread or Taskqueue Thread.

Submitted by:Vaishali.Kulkarni@cavium.com
MFC after:5 days
2017-08-11 17:43:25 +00:00
Ryan Libby
9de5f67de2 x86/crc32_sse42.c: quiet unused function warning
Reviewed by:	cem
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11980
2017-08-11 17:05:31 +00:00
Mark Johnston
af0460beda Have sendfile_swapin() use vm_page_grab_pages().
Reviewed by:	alc, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11942
2017-08-11 16:32:24 +00:00
Mark Johnston
9df950b35d Modify vm_page_grab_pages() to handle VM_ALLOC_NOWAIT.
This will allow its use in sendfile_swapin().

Reviewed by:	alc, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11942
2017-08-11 16:29:22 +00:00
Alan Cox
6921451dab An invalid page can't be dirty.
Reviewed by:	kib
MFC after:	1 week
2017-08-11 16:27:54 +00:00
Roger Pau Monné
c642d2f5b5 acpi/srat: fix build without DMAP
Use pmap_mapbios to map memory used to store the cpus array.

Reported by:	lwhsu
X-MFC-with:	r322348
2017-08-11 14:19:55 +00:00
Andrew Turner
a92a2f00b1 Only return the current cpu if it's in the cpumask. When we restrict the
cpumask it probably means we are unable to sent interrupts to CPUs outside
the map. As such only return the current CPU when it's within the mask
otherwise return the first valid CPU.

This is needed on ThunderX as, in a dual socket configuration, we are
unable to send MSI/MSI-X interrupts between sockets.

Reviewed by:	mmel
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11957
2017-08-11 12:45:58 +00:00
Hans Petter Selasky
ebf854802d Make sure the "vm_flags" and "vm_page_prot" fields get set correctly
in the VM area structure in the LinuxKPI when doing mmap() and that
unsupported bits are masked away.

While at it fix some redundant use of parenthesing inside some related
macros.

Found by:	KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-11 10:44:40 +00:00
Mark Johnston
0b7bd01a82 Add a specialized function for DRM drivers to register themselves.
Such drivers attach to a vgapci bus rather than directly to a pci bus. For
the rest of the LinuxKPI to work correctly in this case, we override the
vgapci bus' ivars with those of the grandparent.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11932
2017-08-11 03:59:48 +00:00
Mark Johnston
7e05ffa6e6 Micro-optimize kmem_unback().
We can remove some unnecessary object radix tree lookups by using the
object memq to iterate over pages in the specified range. This does not,
however, eliminate the lookup needed in vm_page_free_toq() to remove each
tree entry.

Reviewed by:	alc, kib (previous revision)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11945
2017-08-11 03:09:11 +00:00
Mark Johnston
2c642ec1e7 Make vm_page_sunbusy() assert that the page is unlocked.
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11946
2017-08-10 22:43:38 +00:00
Ian Lepore
c89cfb4377 Ensure the clocks driver is attached before any drivers that need to enable
clocks in their attach().
2017-08-10 19:42:30 +00:00
Roger Pau Monné
3f0a9fe06c mptable: fix i386 build failure
Reported by:	emaste
X-MFC-with:	r322347
2017-08-10 17:46:57 +00:00
Kenneth D. Merry
6d4ffcb4ac Changes to make mps(4) and mpr(4) handle reinit with reallocation.
When the mps(4) and mpr(4) drivers need to reinitialize the
firmware, they sometimes need to reallocate all of the memory
allocated by the driver.  The reallocation happens whenever the IOC
Facts change.  That should only happen after a firmware upgrade.

If the reinitialization happens as a result of a timed out command
sent to the card, the command that timed out and triggered the
reinit may have been freed if iocfacts_allocate() reallocated all
memory.  If the caller attempts to access the command after that,
the kernel will panic because the caller will be dereferencing
freed memory.

The solution is to set a flag in the softc when we reallocate,
and avoid dereferencing the command strucure if we've reallocated.

The changes are largely the same in both drivers, since mpr(4) is a
derivative of mps(4).

 o In iocfacts_allocate(), if the IOC Facts have changed and we
   need to reallocate, set the REALLOCATED flag in the softc.

 o Change wait_command() to take a struct mps_command ** instead of
   a struct mps_command *.  This allows us to NULL out the caller's
   command pointer if we have to reinit the controller and the data
   structures get reallocated.  (The REALLOCATED flag will be set
   in the softc if that has happened.)

 o In every place that calls wait_command(), make sure we handle
   the case where the command is NULL after the call.

 o The mpr(4) driver has mpr_request_polled() which can also
   reinitialize the card.  Also check for reallocation there.

Reviewed by:	scottl, slm
MFC after:	1 week
Sponsored by:	Spectra Logic
2017-08-10 14:59:17 +00:00
Sean Bruno
5f593927a8 Purge deprecated locking macros.
Submitted by:	Matt Macy <matt@mattmacy.io>
Sponsored by:	Limelight Networks
2017-08-10 14:54:36 +00:00
Ruslan Bukin
af19cc59ca Support for v1.10 (latest) of RISC-V privilege specification.
New version is not compatible on supervisor mode with v1.9.1
(previous version).

Highlights:
    o BBL (Berkeley Boot Loader) provides no initial page tables
      anymore allowing us to choose VM, to build page tables manually
      and enable MMU in S-mode.
    o SBI interface changed.
    o GENERIC kernel.
      FDT is now chosen standard for RISC-V hardware description.
      DTB is now provided by Spike (golden model simulator). This
      allows us to introduce GENERIC kernel. However, description
      for console and timer devices is not provided in DTB, so move
      these devices temporary to nexus bus.
    o Supervisor can't access userspace by default. Solution is to
      set SUM (permit Supervisor User Memory access) bit in sstatus
      register.
    o Compressed extension is now turned on by default.
    o External GCC 7.1 compiler used.
    o _gp renamed to __global_pointer$
    o Compiler -march= string is now in use allowing us to choose
      required extensions (compressed, FPU, atomic, etc).

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11800
2017-08-10 14:18:09 +00:00
Marcin Wojtas
e2b9d20234 Enable OF_setprop API function to add property in FDT
This patch modifies function ofw_fdt_setprop (called by OF_setprop),
so that it can add property, when replacing is not possible.
Adding property is needed to fixup FDT's that have missing
properties.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11879
2017-08-10 13:45:56 +00:00
Hans Petter Selasky
f6800be3ce Use integer type to pass around jiffies and/or ticks values in the
LinuxKPI because in FreeBSD ticks are 32-bit.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 13:05:40 +00:00
Hans Petter Selasky
4ef8a6301f Fixes for wait event in the LinuxKPI. These are regression issues
after r319757.

1) Correct the return value from __wait_event_common() from 1 to 0 in
case the timeout is specified as MAX_SCHEDULE_TIMEOUT. In the other
case __ret is zero and will be substituted in the last part of the
macro with the appropriate value before return.

2) Make sure the "timeout" argument is casted to "int" before
evaluating negativity. Else the signedness of a "long" might be
checked instead of the signedness of an integer.

3) The wait_event() function should not have a return value.

Found by:	KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 13:00:10 +00:00
Hans Petter Selasky
8ea4441598 Make sure the linux_wait_event_common() function in the LinuxKPI properly
handles a timeout value of MAX_SCHEDULE_TIMEOUT which basically means there
is no timeout. This is a regression issue after r319757.

While at it change the type of returned variable from "long" to "int" to
match the actual return type.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 12:51:04 +00:00
Roger Pau Monné
a74bb29ada x86: bump MAX_APIC_ID to 512
Introduce a new define to take int account the xAPIC ID limit, for
systems where x2APIC is not available/reliable.

Also change some of the usages of the APIC ID to use an unsigned int
(which is the correct storage type to deal with x2APIC IDs as found in
x2APIC MADT entries).

This allows booting FreeBSD on a box with 256 CPUs and APIC IDs up to
295:

FreeBSD/SMP: Multiprocessor System Detected: 256 CPUs
FreeBSD/SMP: 1 package(s) x 64 core(s) x 4 hardware threads
Package HW ID = 0
	Core HW ID = 0
		CPU0 (BSP): APIC ID: 0
		CPU1 (AP/HT): APIC ID: 1
		CPU2 (AP/HT): APIC ID: 2
		CPU3 (AP/HT): APIC ID: 3
[...]
	Core HW ID = 73
		CPU252 (AP): APIC ID: 292
		CPU253 (AP/HT): APIC ID: 293
		CPU254 (AP/HT): APIC ID: 294
		CPU255 (AP/HT): APIC ID: 295

Submitted by:		kib (previous version)
Relnotes:		yes
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11913
2017-08-10 09:16:40 +00:00
Roger Pau Monné
84525e55c1 x86: make the arrays that depend on MAX_APIC_ID dynamic
So that MAX_APIC_ID can be bumped without wasting memory.

Note that the usage of MAX_APIC_ID in the SRAT parsing forces the
parser to allocate memory directly from the phys_avail physical memory
array, which is not the best approach probably, but I haven't found
any other way to allocate memory so early in boot. This memory is not
returned to the system afterwards, but at least it's sized according
to the maximum APIC ID found in the MADT table.

Sponsored by:		Citrix Systems R&D
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11912
2017-08-10 09:16:03 +00:00
Roger Pau Monné
fd1f83fb45 apic_enumerator: only set mp_ncpus and mp_maxid at probe cpus phase
Populate the lapics arrays and call cpu_add/lapic_create in the setup
phase instead. Also store the max APIC ID found in the newly
introduced max_apic_id global variable.

This is a requirement in order to make the static arrays currently
using MAX_LAPIC_ID dynamic.

Sponsored by:		Citrix Systems R&D
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11911
2017-08-10 09:15:18 +00:00
Sean Bruno
5c5ca36ca2 Don't leak mbufs if clusers exceeds the number of segments. This would
leak mbufs over time causing crashes.

PR:		221202
Submitted by:	Matt Macy <matt@mattmacy.io>
Reported by:	gergely.czuczy@harmless.hu
Sponsored by:	Limelight Networks
2017-08-10 03:43:23 +00:00
Sean Bruno
18a660b344 Export IFCAP_HWSTATS so that we don't experience double stats counting
on iflib enabled devices.

PR:		220198
Submitted by:	Matt Macy <matt@mattmacy.io>
Reported by:	Ben Woods <woodsb02@freebsd.org>
Sponsored by:	Limelight Networks
2017-08-10 03:11:05 +00:00
David C Somayajulu
b284b46dc4 Provide compile to choose receive processing in either Ithread or Taskqueue Thread. 2017-08-09 22:18:49 +00:00
Ryan Libby
2473c1b145 i386/boot2: -fno-asynchronous-unwind-tables for gcc
The amd64 build of boot2 was failing with gcc 6.3.0 due to being more
than 1 kB too large. It was apparently generating a .eh_frame section
which was not being removed by objcopy -S. The .eh_frame section seems
to be mandatory per the amd64 ABI, but boot2 is compiled for i386 (uses
-m32), and therefore should be optional in this context. Suppress
generation of .eh_frame with the -fno-asynchronous-unwind-tables flag to
gcc. This saves 1348 bytes (the limit is 7680 bytes).

Reviewed by:	dim, imp
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11928
2017-08-09 20:13:49 +00:00
Andrey V. Elsukov
e54647920b Make user supplied data checks a bit stricter.
key_msg2sp() is used for parsing data from setsockopt(IP[V6]_IPSEC_POLICY)
call. This socket option is usually used to configure IPsec bypass for
socket. Only privileged user can set this socket option.
The message syntax is described here
	http://www.kame.net/newsletter/20021210/

and our libipsec is usually used to create the correct request.
Add additional checks:
* that sadb_x_ipsecrequest_len is not out of bounds of user supplied buffer
* that src/dst's sa_len is the same
* that 2*sa_len is not out of bounds of user supplied buffer
* that 2*sa_len fits into bounds of sadb_x_ipsecrequest

Reported by:	Ilja van Sprundel
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11796
2017-08-09 19:58:38 +00:00
Jung-uk Kim
b5669d0aa8 Split identify_cpu() into two functions for amd64 as we do for i386. This
reduces diff between amd64 and i386.  Also, it fixes a regression introduced
in r322076, i.e., identify_hypervisor() failed to identify some hypervisors.
This function assumes cpu_feature2 is already initialized.

Reported by:	dexuan
Tested by:	dexuan
2017-08-09 18:09:09 +00:00
Gleb Smirnoff
ef3266d58a Plug uninitialized stack variable leak in sendfile(2).
Reported by:	Ilja Van Sprundel <ivansprundel ioactive.com>
Submitted by:	Domagoj Stolfa <domagoj.stolfa gmail.com>
MFC after:	1 week
Security:	uninitialized stack variable leak
2017-08-09 17:48:38 +00:00
Warner Losh
0038725697 Also provide a warning for geom_fox.
Differential Review: https://reviews.freebsd.org/D11935
Requested by: jhb@
MFC After: 3 days
2017-08-09 16:37:37 +00:00
Warner Losh
20995eab57 Mark geom classes as deprecated.
geom_bsd, geom_mbr and geom_sunlabel have been obsolete since Marcel
Moolenaar's geom_part was in FreeBSD 7. They haven't been in GENERIC
since FreeBSD 8. Add warning when used.

geom_vol_ffs has been obsolete since ufs support to geom_label was
committed in FreeBSD 5. It hasn't been in GENERIC since FreeBSD 5.
Add warning when used.

geom_fox has been obsolete since gmultipath was committed in FreeBSD 7.
(no warning added, since this is a very obscure class).

These will all be removed in FreeBSD 12.

MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D11935

Note: Classes will be removed after MFC
2017-08-09 16:15:24 +00:00
Alexander Motin
2b2a6eb95e Missing remanant of 322309.
MFC after:	1 week
2017-08-09 13:46:16 +00:00
Andrey V. Elsukov
95e8b991ca Add to if_enc(4) ability to capture packets via BPF after pfil processing.
New flag 0x4 can be configured in net.enc.[in|out].ipsec_bpf_mask.
When it is set, if_enc(4) additionally captures a packet via BPF after
invoking pfil hook. This may be useful for debugging.

MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D11804
2017-08-09 12:24:07 +00:00
Alexander Motin
6750c3d0fa Use "Ibex Peak" codename for "5 Series/3400 Series" chipsets.
This is shorter and unifies naming with later chipsets.

MFC after:	1 week
2017-08-09 12:21:17 +00:00
Alexander Motin
aaa9b2b3f3 Add new Intel Lewisburg and Union Point chipset PCI IDs.
While there, polish some old AHCI ones, since they are still reused.

MFC after:	1 week
2017-08-09 12:03:12 +00:00
Oleg Bulyzhin
ff21796d25 Fix comment typo. 2017-08-09 10:46:34 +00:00
Hans Petter Selasky
768a720e95 Print maximum MTU when trying to set invalid MTU in the mlx4en(4) driver.
Useful for debugging.

Submitted by:		Sepherosa Ziehau <sephe@dragonflybsd.org>
MFC after:		3 days
Sponsored by:		Mellanox Technologies
2017-08-09 10:32:51 +00:00
Hans Petter Selasky
a3d0173d98 Increment queue drops in the network statistics when transmitted packets
are dropped by the mlx4en(4) driver.

Submitted by:		Sepherosa Ziehau <sephe@dragonflybsd.org>
MFC after:		3 days
Sponsored by:		Mellanox Technologies
2017-08-09 10:30:55 +00:00
Hans Petter Selasky
f7833544f1 Add support for RX and TX statistics when the mlx4en(4) PCI device
is in VF or SRIOV mode typically in a virtual machine environment.

Submitted by:		Sepherosa Ziehau <sephe@dragonflybsd.org>
MFC after:		3 days
Sponsored by:		Mellanox Technologies
2017-08-09 10:27:21 +00:00
Alexander Motin
c901a9b1f1 Do not loose CCB flags after r320493.
There is at least CAM_UNLOCKED that should be kept.

MFC after:	3 days
2017-08-09 09:13:15 +00:00
Dag-Erling Smørgrav
d80fbbee5f Correct sysctl names. 2017-08-09 07:24:58 +00:00
Sepherosa Ziehau
9c6cae2431 hyperv/hn: Implement transparent mode network VF.
How network VF works with hn(4) on Hyper-V in transparent mode:

- Each network VF has a cooresponding hn(4).
- The network VF and the it's cooresponding hn(4) have the same hardware
  address.
- Once the network VF is attached, the cooresponding hn(4) waits several
  seconds to make sure that the network VF attach routing completes, then:
  o  Set the intersection of the network VF's if_capabilities and the
     cooresponding hn(4)'s if_capabilities to the cooresponding hn(4)'s
     if_capabilities.  And adjust the cooresponding hn(4) if_capable and
     if_hwassist accordingly. (*)
  o  Make sure that the cooresponding hn(4)'s TSO parameters meet the
     constraints posed by both the network VF and the cooresponding hn(4).
     (*)
  o  The network VF's if_input is overridden.  The overriding if_input
     changes the input packet's rcvif to the cooreponding hn(4).  The
     network layers are tricked into thinking that all packets are
     neceived by the cooresponding hn(4).
  o  If the cooresponding hn(4) was brought up, bring up the network VF.
     The transmission dispatched to the cooresponding hn(4) are
     redispatched to the network VF.
  o  Bringing down the cooresponding hn(4) also brings down the network
     VF.
  o  All IOCTLs issued to the cooresponding hn(4) are pass-through'ed to
     the network VF; the cooresponding hn(4) changes its internal state
     if necessary.
  o  The media status of the cooresponding hn(4) solely relies on the
     network VF.
  o  If there are multicast filters on the cooresponding hn(4), allmulti
     will be enabled on the network VF. (**)
- Once the network VF is detached.  Undo all damages did to the
  cooresponding hn(4) in the above item.

NOTE:
No operation should be issued directly to the network VF, if the
network VF transparent mode is enabled.  The network VF transparent mode
can be enabled by setting tunable hw.hn.vf_transparent to 1.  The network
VF transparent mode is _not_ enabled by default, as of this commit.

The benefit of the network VF transparent mode is that the network VF
attachment and detachment are transparent to all network layers; e.g. live
migration detaches and reattaches the network VF.

The major drawbacks of the network VF transparent mode:
- The netmap(4) support is lost, even if the VF supports it.
- ALTQ does not work, since if_start method cannot be properly supported.

(*)
These decisions were made so that things will not be messed up too much
during the transition period.

(**)
This does _not_ need to go through the fancy multicast filter management
stuffs like what vlan(4) has, at least currently:
- As of this write, multicast does not work in Azure.
- As of this write, multicast packets go through the cooresponding hn(4).

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11803
2017-08-09 05:59:45 +00:00
Kirk McKusick
77b63aa0fc Since the switch to GPT disk labels, fsck for UFS/FFS has been
unable to automatically find alternate superblocks. This checkin
places the information needed to find alternate superblocks to the
end of the area reserved for the boot block.

Filesystems created with a newfs of this vintage or later will
create the recovery information. If you have a filesystem created
prior to this change and wish to have a recovery block created for
your filesystem, you can do so by running fsck in forground mode
(i.e., do not use the -p or -y options). As it starts, fsck will
ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should
answer yes.

Discussed with: kib, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11589
2017-08-09 05:17:21 +00:00
Alan Cox
5471caf6f1 Introduce vm_page_grab_pages(), which is intended to replace loops calling
vm_page_grab() on consecutive page indices.  Besides simplifying the code
in the caller, vm_page_grab_pages() allows for batching optimizations.
For example, the current implementation replaces calls to vm_page_lookup()
on consecutive page indices by cheaper calls to vm_page_next().

Reviewed by:	kib, markj
Tested by:	pho (an earlier version)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11926
2017-08-09 04:23:04 +00:00