Commit Graph

120463 Commits

Author SHA1 Message Date
Gleb Smirnoff
f4bef67c9c Followup on r302393 by cperciva, improving calculation of boot pages required
for UMA startup.

o Introduce another stage of UMA startup, which is entered after
  vm_page_startup() finishes. After this stage we don't yet enable buckets,
  but we can ask VM for pages. Rename stages to meaningful names while here.
  New list of stages: BOOT_COLD, BOOT_STRAPPED, BOOT_PAGEALLOC, BOOT_BUCKETS,
  BOOT_RUNNING.
  Enabling page alloc earlier allows us to dramatically reduce number of
  boot pages required. What is more important number of zones becomes
  consistent across different machines, as no MD allocations are done before
  the BOOT_PAGEALLOC stage. Now only UMA internal zones actually need to use
  startup_alloc(), however that may change, so vm_page_startup() provides
  its need for early zones as argument.
o Introduce uma_startup_count() function, to avoid code duplication. The
  functions calculates sizes of zones zone and kegs zone, and calculates how
  many pages UMA will need to bootstrap.
  It counts not only of zone structures, but also of kegs, slabs and hashes.
o Hide uma_startup_foo() declarations from public file.
o Provide several DIAGNOSTIC printfs on boot_pages usage.
o Bugfix: when calculating zone of zones size use (mp_maxid + 1) instead of
  mp_ncpus. Use resulting number not only in the size argument to zone_ctor()
  but also as args.size.

Reviewed by:		imp, gallatin (earlier version)
Differential Revision:	https://reviews.freebsd.org/D14054
2018-02-06 04:16:00 +00:00
Kirk McKusick
47806d1b93 Occasional cylinder-group check-hash errors were being reported on
systems running with a heavy filesystem load. Tracking down this
bug was elusive because there were actually two problems. Sometimes
the in-memory check hash was wrong and sometimes the check hash
computed when doing the read was wrong. The occurrence of either
error caused a check-hash mismatch to be reported.

The first error was that the check hash in the in-memory cylinder
group was incorrect. This error was caused by the following
sequence of events:

- We read a cylinder-group buffer and the check hash is valid.
- We update its cg_time and cg_old_time which makes the in-memory
  check-hash value invalid but we do not mark the cylinder group dirty.
- We do not make any other changes to the cylinder group, so we
  never mark it dirty, thus do not write it out, and hence never
  update the incorrect check hash for the in-memory buffer.
- Later, the buffer gets freed, but the page with the old incorrect
  check hash is still in the VM cache.
- Later, we read the cylinder group again, and the first page with
  the old check hash is still in the VM cache, but some other pages
  are not, so we have to do a read.
- The read does not actually get the first page from disk, but rather
  from the VM cache, resulting in the old check hash in the buffer.
- The value computed after doing the read does not match causing the
  error to be printed.

The fix for this problem is to only set cg_time and cg_old_time as
the cylinder group is being written to disk. This keeps the in-memory
check-hash valid unless the cylinder group has had other modifications
which will require it to be written with a new check hash calculated.
It also requires that the check hash be recalculated in the in-memory
cylinder group when it is marked clean after doing a background write.

The second problem was that the check hash computed at the end of the
read was incorrect because the calculation of the check hash on
completion of the read was being done too soon.

- When a read completes we had the following sequence:

  - bufdone()
  -- b_ckhashcalc (calculates check hash)
  -- bufdone_finish()
  --- vfs_vmio_iodone() (replaces bogus pages with the cached ones)

- When we are reading a buffer where one or more pages are already
  in memory (but not all pages, or we wouldn't be doing the read),
  the I/O is done with bogus_page mapped in for the pages that exist
  in the VM cache. This mapping is done to avoid corrupting the
  cached pages if there is any I/O overrun. The vfs_vmio_iodone()
  function is responsible for replacing the bogus_page(s) with the
  cached ones. But we were calculating the check hash before the
  bogus_page(s) were replaced. Hence, when we were calculating the
  check hash, we were partly reading from bogus_page, which means
  we calculated a bad check hash (e.g., because multiple pages have
  been mapped to bogus_page, so its contents are indeterminate).

The second fix is to move the check-hash calculation from bufdone()
to bufdone_finish() after the call to vfs_vmio_iodone() so that it
computes the check hash over the correct set of pages.

With these two changes, the occasional cylinder-group check-hash
errors are gone.

Submitted by: David Pfitzner <dpfitzner@netflix.com>
Reviewed by: kib
Tested by: David Pfitzner
2018-02-06 00:19:46 +00:00
Konstantin Belousov
5370a081a8 Move signal trampolines out of locore.s into separate source file.
Similar to other arches, the move makes the subject of locore.s only
the kernel startup.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-02-06 00:02:30 +00:00
Landon J. Fuller
d177c19903 bwn(4): migrate bwn(4) to the native bhnd(9) interface, and drop siba_bwn.
- Remove the shim interface that allowed bwn(4) to use either siba_bwn or
  bhnd(4), replacing all siba_bwn calls with their bhnd(4) bus equivalents.
- Drop the legay, now-unused siba_bwn bus driver.
- Clean up bhnd(4) board flag defines referenced by bwn(4).

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D13518
2018-02-05 23:38:15 +00:00
John Baldwin
15746ef43a Ignore relocation tables for non-memory-resident sections.
As a followup to r328101, ignore relocation tables for ELF object
sections that are not memory resident.  For modules loaded by the
loader, ignore relocation tables whose associated section was not
loaded by the loader (sh_addr is zero).  For modules loaded at runtime
via kldload(2), ignore relocation tables whose associated section is
not marked with SHF_ALLOC.

Reported by:	Mori Hiroki <yamori813@yahoo.co.jp>, adrian
Tested on:	mips, mips64
MFC after:	1 month
Sponsored by:	DARPA / AFRL
2018-02-05 23:35:33 +00:00
John Baldwin
d722231bca Always give ELF brands a chance to veto a match.
If a brand provides a header_supported hook, check it when trying to
find a brand based on a matching interpreter as well as in the final
loop for the fallback brand. Previously a brand might reject a binary
via a header_supported hook in one of the earlier loops, but still be
chosen by one of these later loops.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	2 weeks
Sponsored by:	DARPA / AFRL
Differential Revision:	https://reviews.freebsd.org/D13945
2018-02-05 23:27:42 +00:00
Adrian Chadd
f6f7bec6f8 [arswitch] disable ARP copy-to-CPU port for AR9340 for now.
I'll have to go double check to see if it does indeed pass ARP frames between
switch ports with this disabled, but it seems required for the CPU port to see
ARP traffic.

I'll dig into this some more.
2018-02-05 20:37:29 +00:00
Adrian Chadd
45a7272a5b [arswitch] fix build breakage.
Apparently the last time I checked building this it didn't pick up that the
header had changed.
2018-02-05 20:30:53 +00:00
Brooks Davis
0a2c60c371 Reduce duplication in extattr_*_(file|link) syscalls.
Reviewed by:	rwatson
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14173
2018-02-05 19:06:34 +00:00
Brooks Davis
02bc058f79 ANSIfy syscall implementations.
Reviewed by:	rwatson
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14172
2018-02-05 18:58:55 +00:00
Ed Maste
8a3b44cfc2 Additional linuxolator whitespace cleanup, missed in r328890 2018-02-05 18:39:06 +00:00
Brooks Davis
7dea788b91 Garbage collect trailing whitespace.
Sponsored by:	DARPA, AFRL
2018-02-05 18:06:54 +00:00
Ed Maste
132f90c660 Linuxolator whitespace cleanup
A version of each of the MD files by necessity exists for each CPU
architecture supported by the Linuxolator.  Clean these up so that new
architectures do not inherit whitespace issues.

Clean up shared Linuxolator files while here.

Sponsored by:	Turing Robotic Industries Inc.
2018-02-05 17:29:12 +00:00
Pedro F. Giffuni
fdc154e44a ext2fs: remove EXT4F_RO_INCOMPAT_SUPP
This was a hack to be able to mount ext4 filesystems read-only while not
supporting all the features. We now support all those features so it
doesn't make sense to keep the undocumented hack.

Discussed with:	fsu
2018-02-05 15:14:01 +00:00
Ed Maste
85059bc4ad Move assym.s to DPSRCS in linux modules
assym.s exists only to be included by other .s files, and should not
actually be assembled by itself.

Sponsored by:	Turing Robotic Industries Inc.
2018-02-05 14:53:18 +00:00
Pedro F. Giffuni
f86f5cd406 ext2fs: Cleanup variable assignments for extents.
Delay the initialization of variables until the are needed.

In the case of ext4_ext_rm_leaf(), make sure 'error' value is not
undefined.

Reported by:		Clang's static analyzer
Differential Revision:	https://reviews.freebsd.org/D14193
2018-02-05 14:30:27 +00:00
Andriy Gapon
be4be0e5ca zfs: move a utility function, ioflags, closer to its consumers
No functional change.

MFC after:	1 week
2018-02-05 14:19:36 +00:00
Konstantin Belousov
20e4afbfbb On munlock(), unwire correct page.
It is possible, for complex fork()/collapse situations, to have
sibling address spaces to partially share shadow chains. If one
sibling performs wiring, it can happen that a transient page, invalid
and busy, is installed into a shadow object which is visible to other
sibling for the duration of vm_fault_hold().  When the backing object
contains the valid page, and the wiring is performed on read-only
entry, the transient page is eventually removed.

But the sibling which observed the transient page might perform the
unwire, executing vm_object_unwire().  There, the first page found in
the shadow chain is considered as the page that was wired for the
mapping.  It is really the page below it which is wired.  So we unwire
the wrong page, either triggering the asserts of breaking the page'
wire counter.

As the fix, wait for the busy state to finish if we find such page
during unwire, and restart the shadow chain walk after the sleep.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D14184
2018-02-05 12:49:20 +00:00
Andrey V. Elsukov
68e0e5a673 Modify ip6_get_prevhdr() to be able use it safely.
Instead of returning pointer to the previous header, return its offset.
In frag6_input() use m_copyback() and determined offset to store next
header instead of accessing to it by pointer and assuming that the memory
is contiguous.

In rip6_input() use offset returned by ip6_get_prevhdr() instead of
calculating it from pointers arithmetic, because IP header can belong
to another mbuf in the chain.

Reported by:	Maxime Villard <max at m00nbsd dot net>
Reviewed by:	kp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14158
2018-02-05 09:22:07 +00:00
Adrian Chadd
f76883d64c [arswitch] Enable ATU dump support for the AR9340.
This indeed uses the same registers as the AR8216 and later chips.

There seems to be an issue with ARP requests being sent out from the CPU
through this switch here, so figuring that out is next.  Learning works fine on
the AR8327 ethernet switch on the /other/ gigabit ethernet port, so I don't
think it's the network stack or ethernet driver.

Tested:

* DB120 - AR9340 SOC + ethernet switch (and other bits.)
2018-02-05 07:05:28 +00:00
Adrian Chadd
7f81a20479 [arswitch] fix mac address field definition.
Whilst here, add some further fields for future experimenting.

Tested:

* AR9340 switch
* AR9330 switch
* AR7240 switch
2018-02-05 07:03:45 +00:00
Adrian Chadd
7ed0831923 [arswitch] Break out of the loop upon any error, not just -1.
This fixes the AR9340 "unimplemented" thingy for now.
2018-02-05 05:51:37 +00:00
Adrian Chadd
286a5a1c7e [ar71xx] Fix DB120 AHB device hints in the new world order.
This allows the on-chip (AHB bus) device to attach correctly as a module.

Tested:

* DB120, AR9344 (SoC + 2x2 2G wifi) + QCA9580 PCI 3x3 5G wifi
2018-02-05 04:48:41 +00:00
Adrian Chadd
431017d066 [ar71xx] AR934x is a MIPS74k board - use the right hwpmc module 2018-02-05 04:47:13 +00:00
Adrian Chadd
ea3c60a1e5 [ar71xx] New world order - don't reference ath_pci here, it's a module now 2018-02-05 04:46:36 +00:00
Vladimir Kondratyev
74a53bd131 psm(4): Fix panic occuring soon after PS/2 packet has been rejected by
synaptics or elantech sanity checker.

After packet has been rejected contents of packet buffer is not cleared
with setting of inputbytes counter to 0. So when this packet buffer is
filled again being an element of circular queue, new data appends to old
data rather than overwrites it. This leads to packet buffer overflow
after 10 rounds.

Fix it with setting of packet's inputbytes counter to 0 after rejection.

While here add extra logging of rejected packets.

PR:		222667 (for reference)
Reported by:	Neel Chauhan <neel@neelc.org>
Tested by:	Neel Chauhan <neel@neelc.org>
MFC after:	1 week
2018-02-04 23:01:48 +00:00
Justin Hibbits
b0d3bb2613 Only look for L2 cache controllers for mpc85xx_cache
The L3 cache controller (Corenet Platform Cache) is listed with one of its
compatible strings as "cache", which this driver can't attach to.  Restrict
to a known list of primary cache controller strings, as found in the l2cache
devicetree binding.
2018-02-04 20:07:08 +00:00
Justin Hibbits
ce2d51972f Start building modules for MPC85XX and MPC85XXSPE
These kernels aren't restricted to development boards anymore, they are
closer in behavior to GENERIC, so build modules.
2018-02-04 15:40:48 +00:00
Justin Hibbits
2c26c98c89 Add sdhci to MPC85XX build 2018-02-04 15:39:15 +00:00
Justin Hibbits
77baa2256d Minimal changes for MPR to build on architectures with physical addresses larger than virtual
Summary:
Some architectures use large (36-bit) physical addresses, with smaller
virtual addresses.  Casting between vm_paddr_t (or bus_addr_t) and void * is
considered illegal, so cast through uintptr_t.  No functional change on existing
platforms.

Reviewed By:	scottl
Differential Revision:	https://reviews.freebsd.org/D14042
2018-02-04 15:37:58 +00:00
Alan Somers
f5b4099e6b geom: don't write stack garbage in disk labels
Most consumers of g_metadata_store were passing in partially unallocated
memory, resulting in stack garbage being written to disk labels. Fix them by
zeroing the memory first.

gvirstor repeated the same mistake, but in the kernel.

Also, glabel's label contained a fixed-size string that wasn't
initialized to zero.

PR:		222077
Reported by:	Maxim Khitrov <max@mxcrypt.com>
Reviewed by:	cem
MFC after:	3 weeks
X-MFC-With:	323314
X-MFC-With:	323338
Differential Revision:	https://reviews.freebsd.org/D14164
2018-02-04 14:49:55 +00:00
Steve Wills
aa3c83c3c6 Create GENERIC64-NODEBUG for powerpc64
Approved by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D14192
2018-02-04 14:27:12 +00:00
Adrian Chadd
c9f70b7b88 [arswitch] fix up issues on the AR8327.
This correctly dumps the ethernet bridge contents on an AR8327 switch.

Tested:

* AP135 - QCA9550 + AR8327 ethernet switch:

# etherswitchcfg atu dump
 [0] c0:3f:d5:7e:6f:45: portmask 0x00000004
 [1] f6:b6:03:96:1e:ba: portmask 0x00000004
 [2] 00:03:7f:11:38:4f: portmask 0x00000040
# arp -na
? (192.168.3.170) at 00:03:7f:11:38:4f on arge0 permanent [ethernet]
? (192.168.3.12) at c0:3f:d5:7e:6f:45 on arge0 expires in 1188 seconds [ethernet]
? (192.168.3.1) at f6:b6:03:96:1e:ba on arge0 expires in 1186 seconds [ethernet]
2018-02-04 08:22:11 +00:00
Hans Petter Selasky
c32d1cce9d Add new USB ID.
PR:		225641
Submitted by:	Ryan <ryanwinter@outlook.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-02-03 09:43:32 +00:00
Xin LI
90a48fba23 After r328426, g_label depends on UFS (option FFS) code to read UFS
superblock, and the kernel will fail to link when UFS is not built
in.  This commit makes it depend on a small portion of FFS bits and
thereby fixes build for this situation.

This is intended as an interim bandaid, and the actual superblock
reading code should probably be made independent of UFS, so we do
not need to depend on it (see kib@'s comment in the review for
details), and we will revisit this once the superblock check hashes
are all in place.

Differential Revision:	https://reviews.freebsd.org/D14092
2018-02-03 09:15:13 +00:00
Adrian Chadd
65d59686e1 [arswitch] add initial functionality for AR8327 ATU management.
* Add the bulk of the ATU table read function
* Correct how the ATU function and WAIT bits work

TODO:

* more testing, figure out how the multi-vlan table stuff works and push that
  up to userspace
2018-02-03 00:59:08 +00:00
Adrian Chadd
84a5558c38 [arswitch] Stub out the ATU table dump in AR9340 switches until I implement
this.
2018-02-02 22:08:03 +00:00
Adrian Chadd
62042c979d [arswitch] begin tidying up the learning and ATU management, introduce ATU APIs.
* Refactor the initial learning configuration (port learning, address expiry,
  handling address moving between ports, etc, etc) into a separate HAL routine
* and ensure that it's consistent between switch chips - the AR8216,8316,724x,9331
  SoCs all share the same switch code.
* .. the AR8327 needs doing - the defaults seem OK for now
* .. the AR9340 is different but it's also programmed now.

* Add support for flushing a single port worth of ATU entries
* Add support for fetching the ATU table from AR8216 and derived chips

Tested:

* AR9344, Carambola 2

TODO:

* Further testing on other chips
* Add AR9340 support
* Add AR8327 support
2018-02-02 22:05:36 +00:00
Brooks Davis
0fd25723bc Add kern.ipc.{msqids,semsegs,sema} sysctls for FreeBSD32.
Stop leaking kernel pointers though theses sysctls and make sure that the
padding in the structures is zeroed on allocation to avoid other leaks.

Reviewed by:	gordon, kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D13459
2018-02-02 18:03:12 +00:00
Andriy Gapon
6c1e03251e ZFS ARC: restore illumos uses of 'needfree' that were removed in r325851
This is purely a cosmetic change to have a more complete copy of
ifdef-ed out illumos code.

MFC after:	1 week
2018-02-02 12:57:33 +00:00
Hans Petter Selasky
64282a1274 Slightly bump the maximum OID path for loading tunable SYSCTLs.
Coming updates to the mlx5en(4) driver will require this.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-02-02 12:42:46 +00:00
Konstantin Belousov
938cdc4264 On pageout, in vnode generic pager, for partially dirty page, only
clear dirty bits for completely invalid blocks.

Otherwise we might not write out the last chunk that is shorter than
512 bytes, if the file end is not aligned on disk block boundary.
This become important after the r324794.

PR:	225586
Reported by:	tris_vern@hotmail.com
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2018-02-02 11:56:30 +00:00
Andrey V. Elsukov
883cd89b05 Merge r1.120 from NetBSD:
Fix a pretty simple, yet pretty tragic typo: we should return IPPROTO_DONE,
  not IPPROTO_NONE. With IPPROTO_NONE we will keep parsing the header chain
  on an mbuf that was already freed.

Reported by:	Maxime Villard <max at m00nbsd dot net>
MFC after:	3 days
2018-02-02 07:39:34 +00:00
Steve Wills
e1782bae5f Correct longjmp
Reviewed by:	nwhitehorn
Differential Revision:	https://reviews.freebsd.org/D14159
2018-02-02 02:28:25 +00:00
Adrian Chadd
877d73ecb4 [etherswitch] add the first pass of a simple API to flush and fetch the L2 address table from the ethernet switch.
This stuff may be a bit fluid during this -HEAD cycle as various other
switch features are added, but the current stuff is enough to drive
initial development and features on the atheros range of integrated
and external switches.

* add a method to flush the whole address table;
* add a method to flush all addresses on a given port;
* add a method to download the address table;
* .. and then a method to fetch entries from the address table.

The table fetch/read methods pass through to the drivers for now since
the drivers may implement different ways of fetching/caching the address
table data.  The atheros devices for example fetch the table by
iterating over the table through a set of registers and so you need
to keep that locked whilst you iterate otherwise you may have the table
flushed half way by a port status change.

This is a no-op until the userland and arswitch code shows up.
2018-02-02 02:05:14 +00:00
Adrian Chadd
a597af9415 [atheros] Update QCA953x support to use the new hints. 2018-02-01 22:01:53 +00:00
Adrian Chadd
1246f4fe1c [atheros] Fix DIR-825C1 to use the new hints.
Tested:

* DIR-825C1
2018-02-01 22:01:11 +00:00
Adrian Chadd
2786bc9951 [atheros] teach these two boards about the new hints location as well. 2018-02-01 22:00:38 +00:00
Adrian Chadd
118c9d516e [atheros] Teach the QCA955x SoC code about the new hints stuff. 2018-02-01 22:00:05 +00:00
Adrian Chadd
7f1a46e2e8 [atheros] Fix-up the base address stuff after I did a drive-by with the calibration data location.
The old way required the data to be present really early and copied it from
memory mapped NOR flash; this only worked during kernel boot but not for
ath/ath_hal modules.

Tested:

* AR9331, Carambola2, ath/hal modules.
2018-02-01 21:58:52 +00:00