104874 Commits

Author SHA1 Message Date
Ed Schouten
bc41a24735 Fix the build after breaking it in r285549.
I performed the commit on a different system than the one where I wrote
the change. After pulling in the change from Phabricator, I didn't notice
that a single chunk did not apply.

Approved by:	secteam (implicit, as intended change was approved)
Pointy hat to:	me
2015-07-14 20:45:24 +00:00
Andrew Turner
f3856d8fcb Also accept "ok" to enable a device; some vendor device trees use this when
they mean "okay".
2015-07-14 19:11:16 +00:00
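A minimal sketch of the status check described above, written in Python rather than kernel C. The helper name and the constant are hypothetical; only the two accepted strings come from the commit message. Treating a missing status property as enabled follows the usual device tree convention and is an assumption about this code path.

```python
# Hypothetical helper for illustration; not the kernel's actual function.
ENABLED_STATUS_VALUES = {"okay", "ok"}


def device_is_enabled(status):
    """Return True when a device tree node should be treated as enabled.

    `status` is the value of the node's "status" property, or None when
    the property is absent (assumed here to mean "enabled").
    """
    if status is None:
        return True
    return status in ENABLED_STATUS_VALUES
```

With this check, a vendor tree that writes `status = "ok"` enables the device just as `status = "okay"` does, while `"disabled"` still does not.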
Ed Schouten
707d98fe2f Implement the CloudABI random_get() system call.
The random_get() system call works similarly to getentropy()/getrandom()
on OpenBSD/Linux. It fills a buffer with random data.

This change introduces a new function, read_random_uio(), that is used
to implement read() on the random devices. We can call into this
function from within the CloudABI compatibility layer.

Approved by:	secteam
Reviewed by:	jmg, markm, wblock
Obtained from:	https://github.com/NuxiNL/freebsd
Differential Revision:	https://reviews.freebsd.org/D3053
2015-07-14 18:45:15 +00:00
Mark Johnston
02d131ad11 Fix some error-handling bugs when core dump compression is enabled:
- Ensure that core dump parameters are initialized in the error path.
- Don't call gzio_fini() on a NULL stream.

Reported by:	rpaulo
2015-07-14 18:24:05 +00:00
Ed Schouten
460ac6370a Regenerate system call table for r285540. 2015-07-14 15:12:24 +00:00
Ed Schouten
1eb7c7cae3 Implement thread_tcb_set() and thread_yield().
The first system call is used to set the user TLS address. Right now
this system call is invoked by the C library for both the initial thread
and additional threads unconditionally, but in the future we'll only
call this if the architecture does not support this. On recent x86-64
CPUs we could use the WRFSBASE instruction.

This system call was erroneously placed in sys/compat/cloudabi64, even
though it does not depend on any pointer-size-dependent data structure.
Move it to the right place.

Obtained from:	https://github.com/NuxiNL/freebsd
2015-07-14 15:11:50 +00:00
Ed Schouten
03744d7c8d Implement {,p}{read,write}{,v}().
Add a routine similar to copyinuio() and freebsd32_copyinuio() that
copies in CloudABI's struct iovecs. These are then translated into
FreeBSD format and placed in a 'struct uio', so we can call into the
kern_*() functions.

Obtained from:	https://github.com/NuxiNL/freebsd
2015-07-14 14:33:21 +00:00
Andrew Turner
b7fbd410ab Set memory to be inner-shareable. This isn't needed on device memory as the
MMU will ignore the attribute there, however it simplifies the code to always
set it.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2015-07-14 12:37:47 +00:00
Ed Schouten
f9675092b8 Let proc_raise() call into pksignal() directly.
Summary:
As discussed with kib@ in response to r285404, don't call into
kern_sigaction() within proc_raise() to reset the signal to the default
action before delivery. We'd better do that during image execution.

Change the code to simply use pksignal(), so we don't waste cycles on
functions like pfind() to look up the currently running process itself.

Test Plan:
This change has also been pushed into the cloudabi branch on GitHub. The
raise() tests still seem to pass.

Reviewers: kib

Reviewed By: kib

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D3076
2015-07-14 12:16:14 +00:00
Zbigniew Bodek
d1be8e59e2 Fix secondary PIC initialization order
Call arm_init_secondary before any other PIC-related functions
are called. This is necessary for GICv3 where PIC_INIT_SECONDARY
allocates resources needed for all further operations.

Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3066
2015-07-14 12:02:56 +00:00
Zbigniew Bodek
b7ac293f44 Fix intr_machdep.c for ARM64
On ARMv8 IPIs are mapped to 0-15. Incrementing the number by 16
is wrong, because it sets a reserved bit in the IPI register.
This patch removes all "+16" to comply with specs.

Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3029
2015-07-14 11:59:43 +00:00
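The effect of the stray "+16" can be illustrated with a small sketch. It assumes a hypothetical register whose IPI number field is 4 bits wide with a reserved bit directly above it; the field layout and all names are assumptions for illustration, not taken from the commit.

```python
IPI_FIELD_BITS = 4                       # assumed width: IPI numbers 0-15
IPI_FIELD_MASK = (1 << IPI_FIELD_BITS) - 1
RESERVED_BIT = 1 << IPI_FIELD_BITS       # assumed reserved bit above the field


def touches_reserved_bit(ipi_number):
    """Return True when the value spills out of the 4-bit field."""
    return bool(ipi_number & RESERVED_BIT)


def fits_in_field(ipi_number):
    """Return True when the value is representable in the 4-bit field."""
    return (ipi_number & IPI_FIELD_MASK) == ipi_number
```

Every IPI number in 0-15 fits in the field, while every value of the form `ipi + 16` sets the assumed reserved bit, which is the failure the commit removes.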
Christian Brueffer
f4c1eac7cd Spell crypto correctly. 2015-07-14 10:47:56 +00:00
Hiren Panchasara
df7b11fa09 Expose the full 32-bit RSS hash from the card regardless of whether RSS is
defined or not. When doing multiqueue, we are all set up to have the full 32-bit
RSS hash from the card. We do not need to hide that under "ifdef RSS" and should
expose it by default so others like lagg(4) can use it and avoid hashing the
traffic by themselves.

While here, delete the FreeBSD version check and use of deprecated M_FLOWID.

Reviewed by:	adrian, erj
MFC after:	1 week
Sponsored by:	Limelight Networks
2015-07-14 09:13:18 +00:00
Navdeep Parhar
c7dbd80213 cxgbe(4): Update T4 and T5 firmwares to 1.14.2.0.
Obtained from:	Chelsio Communications
MFC after:	3 days
2015-07-14 08:02:05 +00:00
John-Mark Gurney
577f7474b0 Fix XTS, and name things a bit better...
Though confusing, it is correct that GCM uses ICM_BLOCK_LEN but ICM does
not...  GCM is built on ICM, but uses a function other than
swcr_encdec...  swcr_encdec cannot handle partial blocks, which is
why it must still use AES_BLOCK_LEN and is why XTS was broken by the
commit...

Thanks to the tests for helping make sure I didn't break GCM w/ an earlier
patch...

I did run the tests w/o this patch, and need to figure out why they
did not fail, clearly more tests are needed...

Prodded by:	peter
2015-07-14 07:45:18 +00:00
John-Mark Gurney
e0b231cbc8 fix typos..
Submitted by:	brueffer
2015-07-14 06:34:57 +00:00
Adrian Chadd
85b543e06d Populate hw.model with the CPU model information.
Now you see something like:

# sysctl hw.model
hw.model: Atheros AR9330 rev 1

Tested:

* Carambola 2, AR9331 SoC
2015-07-14 05:14:10 +00:00
John-Mark Gurney
b65946c631 cryptodev is not needed for TCP_SIGNATURE...
Comment that cryptodev shouldn't be used unless you know what you're
doing...

The various arm/mips and one powerpc configs that have cryptodev in
them need to be addressed, audited if they provide benefit and removed
if they don't...
2015-07-14 05:09:58 +00:00
Conrad Meyer
0c40f3532d Fix cleanup race between unp_dispose and unp_gc
unp_dispose and unp_gc could race to tear down the same mbuf chains, which
can lead to dereferencing freed filedesc pointers.

This patch adds an IGNORE_RIGHTS flag on unpcbs marking the unpcb's RIGHTS
as invalid/freed. The flag is protected by UNP_LIST_LOCK.

To serialize against unp_gc, unp_dispose needs the socket object. Change the
dom_dispose() KPI to take a socket object instead of an mbuf chain directly.

PR:		194264
Differential Revision:	https://reviews.freebsd.org/D3044
Reviewed by:	mjg (earlier version)
Approved by:	markj (mentor)
Obtained from:	mjg
MFC after:	1 month
Sponsored by:	EMC / Isilon Storage Division
2015-07-14 02:00:50 +00:00
Mateusz Guzik
6161705823 exec: textvp -> oldtextvp; binvp -> newtextvp
This makes it consistent with the rest of the naming in do_execve.

No functional changes.
2015-07-14 01:13:37 +00:00
Mateusz Guzik
853be5ffef exec: plug a redundant vref + vrele of the image vnode 2015-07-14 00:43:08 +00:00
Mateusz Guzik
e94e50af1d racct: perform a lockless check for p_throttled
This reduces proc lock contention.

Reviewed by:	trasz
2015-07-13 22:52:11 +00:00
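The general pattern behind a lockless check can be sketched as follows. This is an illustrative Python model, not the racct code: the class, the counter, and the function name are hypothetical, and a real kernel implementation has memory-ordering concerns that this sketch does not model.

```python
import threading


class Proc:
    """Toy stand-in for a process with a lock-protected throttled flag."""

    def __init__(self):
        self.lock = threading.Lock()
        self.throttled = False
        self.lock_acquisitions = 0  # instrumentation for the example only


def check_throttled(proc):
    """Return True if the process is throttled, avoiding the lock if not."""
    # Fast path: unlocked read. Most callers see False and never lock.
    if not proc.throttled:
        return False
    # Slow path: take the lock and recheck, since the flag may have changed.
    with proc.lock:
        proc.lock_acquisitions += 1
        return proc.throttled
```

In the common unthrottled case the lock is never taken, which is where the reduction in contention would come from.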
Alexander Motin
d4f3ad3a26 Switch initiator IDs in target mode to the same address space as target
IDs in initiator mode -- index in port database instead of handlers.

This makes initiator IDs persist across role changes and firmware resets,
when handlers previously assigned by firmware are lost and reused.

Sponsored by:	iXsystems, Inc.
2015-07-13 21:01:24 +00:00
Luiz Otavio O Souza
fb54940587 Bring a few simplifications to a10_gpio:
 o Return the real hardware state in gpio_pin_getflags() instead of keeping
   the last state in an internal table.  Now the driver returns the real
   state of pins (input/output and pull-up/pull-down) at all times.
 o Use a spin mutex.  This is required by interrupts and the 1-wire code.
 o Use better variable names and place parentheses around them in MACROS.
 o Do not lock the driver when returning static data.

Tested with gpioled(4) and DS1820 (1-wire) sensors on banana pi.
2015-07-13 18:19:26 +00:00
Conrad Meyer
c578e0fb48 pipe_direct_write: Fix mismatched pipelock/unlock
If a signal is caught in pipelock, causing it to fail, pipe_direct_write
should not try to pipeunlock.

Reported by:	pho
Differential Revision:	https://reviews.freebsd.org/D3069
Reviewed by:	kib
Approved by:	markj (mentor)
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2015-07-13 17:45:22 +00:00
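The lock/unlock pairing rule from this fix can be sketched in a simplified model. The Python below only borrows the function names from the commit message; the `Pipe` class, the `interrupted` flag, and the control flow are assumptions for illustration and do not reproduce the kernel code.

```python
import errno


class Pipe:
    """Toy pipe whose lock attempt can be interrupted by a signal."""

    def __init__(self, interrupted=False):
        self.locked = False
        self.interrupted = interrupted  # simulates a caught signal


def pipelock(pipe):
    """Return 0 on success, or an errno value if interrupted."""
    if pipe.interrupted:
        return errno.EINTR
    pipe.locked = True
    return 0


def pipeunlock(pipe):
    assert pipe.locked, "pipeunlock without a matching pipelock"
    pipe.locked = False


def pipe_direct_write(pipe):
    error = pipelock(pipe)
    if error != 0:
        # The lock was never acquired, so it must not be released.
        return error
    try:
        pass  # the actual write would happen here
    finally:
        pipeunlock(pipe)
    return 0
```

Returning early on a failed `pipelock()` keeps every `pipeunlock()` paired with a successful lock.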
Alexander Motin
391f03dafb Make role sysctl handling from r284727 less strict. 2015-07-13 15:51:28 +00:00
Alexander Motin
e68eef1442 Unify port database use for target and initiator roles.
Aside from cleaner and more consistent code, this allows ports to be both
target and initiator at the same time, and to switch easily from any role
to any other.

Sponsored by:	iXsystems, Inc.
2015-07-13 15:11:05 +00:00
Luigi Rizzo
5f94000ee4 set the refcount for the structure (dropped by mistake in the last commit). 2015-07-13 10:23:52 +00:00
Mark Murray
b712101cf7 Rework the read routines to keep the PRNG sources happy. These work
in units of crypto blocks, so must have adequate space to write.
This means needing to be careful about buffers and keeping track
of external read request length.

Approved by:	so (/dev/random blanket)
2015-07-13 08:38:21 +00:00
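A minimal sketch of the buffer handling this describes, assuming a 16-byte crypto block and a generator that can only emit whole blocks. The block size, the function names, and the use of `os.urandom` as a stand-in generator are all assumptions for illustration, not the randomdev code.

```python
import os

BLOCK_SIZE = 16  # assumed crypto block size in bytes


def padded_length(request_len):
    """Round a read request up to a whole number of blocks."""
    nblocks = -(-request_len // BLOCK_SIZE)  # ceiling division
    return nblocks * BLOCK_SIZE


def generate_blocks(nbytes):
    """Stand-in generator that only accepts whole-block requests."""
    assert nbytes % BLOCK_SIZE == 0, "generator works in whole blocks"
    return os.urandom(nbytes)


def read_random(request_len):
    """Return exactly request_len bytes from a block-oriented source."""
    buf = generate_blocks(padded_length(request_len))
    # The buffer may be larger than the request; hand back only what the
    # caller asked for, which is why the original length must be tracked.
    return buf[:request_len]
```

The internal buffer is sized for whole blocks, while the externally visible length stays equal to the caller's request.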
Adrian Chadd
d3f9e6a743 Fix the RF switch state polling by comparing with the revision of the
PHY instead of the revision of the RADIO.

This is from DragonFly BSD, commit 202e28d1f65e9f35df6032400df3242a3bafb483

Obtained from:	DragonflyBSD
2015-07-13 05:13:39 +00:00
Ian Lepore
3f3def246a Add PRINTF_BUFR_SIZE=128 to avoid interleaved output. 2015-07-12 19:58:12 +00:00
Ian Lepore
969fc29e0b Use the monotonic (uptime) counter rather than time-of-day to measure elapsed
time between ntp_adjtime() clock offset adjustments.  This eliminates spurious
frequency steering after a large clock step (such as a 1970->2015 step on a
system with no battery-backed clock hardware).

This problem was discovered after the import of ntpd 4.2.8, which does things
in a slightly different (but still correct) order than the 4.2.4 we had
previously.  In particular, 4.2.4 would step the clock then immediately after
use ntp_adjtime() to set the frequency and offset to zero, which captured the
post-step time-of-day as a side effect.  In 4.2.8, ntpd sets frequency and
offset to zero before any initial clock step, capturing the time as 1970-ish,
then when it next calls ntp_adjtime() it's with a non-zero offset measurement.
This non-zero value gets multiplied by the apparent 45-year interval, which
blows up into a completely bogus frequency steer.  That gets clamped to
500ppm, but that's still enough to make the clock drift so fast that ntpd has
to keep stepping it every few minutes to compensate.
2015-07-12 18:38:17 +00:00
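The difference between the two ways of measuring elapsed time can be sketched with a simulated clock. The class and function names below are hypothetical, the 64-second polling interval is an arbitrary choice, and the kernel's actual frequency computation is not reproduced; only the 500 ppm clamp and the 1970->2015 step come from the commit message.

```python
MAX_FREQ_PPM = 500.0                     # clamp mentioned in the commit message
STEP_SECONDS = 45 * 365.25 * 86400       # roughly a 1970 -> 2015 clock step


class SimClock:
    """Toy clock with separate monotonic uptime and steppable wall time."""

    def __init__(self):
        self.uptime = 0.0       # monotonic seconds since boot
        self.wall_offset = 0.0  # wall time = uptime + offset (starts at 1970)

    def advance(self, seconds):
        self.uptime += seconds

    def step(self, seconds):
        self.wall_offset += seconds

    def time_of_day(self):
        return self.uptime + self.wall_offset


def elapsed_across_step(poll_interval=64.0):
    """Return (wall_elapsed, uptime_elapsed) for one interval with a step."""
    clock = SimClock()
    last_wall, last_uptime = clock.time_of_day(), clock.uptime
    clock.step(STEP_SECONDS)     # ntpd steps the clock to the real date
    clock.advance(poll_interval)
    return clock.time_of_day() - last_wall, clock.uptime - last_uptime


def clamp_ppm(ppm):
    """Limit a frequency adjustment to +/- MAX_FREQ_PPM."""
    return max(-MAX_FREQ_PPM, min(MAX_FREQ_PPM, ppm))
```

Measured by time-of-day, the interval appears to be about 45 years; measured by uptime, it is the 64 seconds that actually passed. Any adjustment scaled by the former would hit the clamp, which matches the behaviour described above.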
Zbigniew Bodek
686836faca Add ARM64TODO comments to ACPI PCI stubs
This will make searching for missing functionalities easier.
2015-07-12 18:32:16 +00:00
Mark Murray
3aa77530ca * Address review (and add a bit myself).
 - Tweak man page.
 - Remove all mention of RANDOM_FORTUNA. If the system owner wants YARROW or DUMMY, they ask for it, otherwise they get FORTUNA.
 - Tidy up headers a bit.
 - Tidy up declarations a bit.
 - Make static in a couple of places where needed.
 - Move Yarrow/Fortuna SYSINIT/SYSUNINIT to randomdev.c, moving us towards a single file where the algorithm context is used.
 - Get rid of random_*_process_buffer() functions. They were only used in one place each, and are better subsumed into those places.
 - Remove *_post_read() functions as they are stubs everywhere.
 - Assert against buffer size illegalities.
 - Clean up some silly code in the randomdev_read() routine.
 - Make the harvesting more consistent.
 - Make some requested argument name changes.
 - Tidy up and clarify a few comments.
 - Make some requested comment changes.
 - Make some requested macro changes.

* NOTE: the thing calling itself a 'unit test' is not yet a proper
  unit test, but it helps me ensure things work. It may be a proper
  unit test at some time in the future, but for now please don't make
  any assumptions or hold any expectations.

Differential Revision:	https://reviews.freebsd.org/D2025
Approved by:	so (/dev/random blanket)
2015-07-12 18:14:38 +00:00
Zbigniew Bodek
e7c14c38ba Implement stubs for ACPI PCI routines
ACPI driver requires special functions to be provided by machdep code.
Add temporary stubs to satisfy the compiler when both "pci" and "acpi"
are enabled in the kernel configuration file.

Reviewed by:   andrew
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3028
2015-07-12 17:28:31 +00:00
Bjoern A. Zeeb
97fc027722 Try to unbreak the build after r285390 removing the obsolete static
declaration.
2015-07-12 00:26:22 +00:00
Luiz Otavio O Souza
a8921b867f Return the FDT node of the GPIO controller to gpiobus. It is used by the
children of gpiobus.
2015-07-11 21:09:43 +00:00
Ed Schouten
4f1905177a Implement normal and abnormal process termination.
CloudABI does not provide an explicit kill() system call, for the reason
that there is no access to the global process namespace. Instead, it
offers a raise() system call that can at least be used to terminate the
process abnormally.

CloudABI does not support installing signal handlers. CloudABI's raise()
system call should behave as if the default policy is set up. Call into
kern_sigaction(SIG_DFL) before calling sys_kill() to force this.

Obtained from:	https://github.com/NuxiNL/freebsd
2015-07-11 19:41:31 +00:00
Ed Schouten
a4001f4cb9 Use FDDUP_NORMAL instead of hardcoding value 0.
Proposed by:	mjg
2015-07-11 18:53:30 +00:00
Ed Schouten
329d1bca7f Add missing function parameter.
A function parameter got added in r285356, meaning that the call to
kern_dup() needs to be patched up.
2015-07-11 18:39:16 +00:00
Justin Hibbits
20b6ee617f cpu_number and cpu_swapout are never used, and only defined in powerpc. 2015-07-11 17:33:50 +00:00
Mateusz Guzik
b34be824a0 linprocfs: vref the vnode passed to vn_fullpath 2015-07-11 16:44:28 +00:00
Mateusz Guzik
c634b75204 vfs: always clear VI_OWEINACT in consumers bumping v_usecount
Previously vputx would detect the condition and clear the flag.

With this change it is invalid to have both v_usecount > 0 and the flag
set. Assert the condition is met in all relevant places.

Reviewed by:	kib
2015-07-11 16:28:55 +00:00
Mateusz Guzik
2d1ca3cdff vfs: move si_usecount manipulation to dedicated functions
Reviewed by:	kib
2015-07-11 16:28:12 +00:00
Mateusz Guzik
8a08cec166 Create a dedicated function for ensuring that cdir and rdir are populated.
Previously several places were doing it on their own, partially
incorrectly (e.g. without the filedesc locked) or even in an actively
harmful way, by populating jdir or assigning rootvnode without vrefing it.

Reviewed by:	kib
2015-07-11 16:22:48 +00:00
Mateusz Guzik
f0725a8e1e Move chdir/chroot-related fdp manipulation to kern_descrip.c
Prefix exported functions with pwd_.

Deduplicate some code by adding a helper for setting fd_cdir.

Reviewed by:	kib
2015-07-11 16:19:11 +00:00
Andrew Turner
70915d1289 Always send a SIGSEGV on a map failure. Use the code to tell the reason
for the signal.

Sponsored by:	ABT Systems Ltd
2015-07-11 16:02:06 +00:00
Adrian Chadd
871ef8b0d8 Regenerate syscalls. 2015-07-11 15:22:11 +00:00
Adrian Chadd
6520495abc Add an initial NUMA affinity/policy configuration for threads and processes.
This is based on work done by jeff@ and jhb@, as well as the numa.diff
patch that has been circulating when someone asks for first-touch NUMA
on -10 or -11.

* Introduce a simple set of VM policy and iterator types.
* tie the policy types into the vm_phys path for now, mirroring how
  the initial first-touch allocation work was enabled.
* add syscalls to control changing thread and process defaults.
* add a global NUMA VM domain policy.
* implement a simple cascade policy order - if a thread policy exists, use it;
  if a process policy exists, use it; use the default policy.
* processes inherit policies from their parent processes, threads inherit
  policies from their parent threads.
* add a simple tool (numactl) to query and modify default thread/process
  policies.
* add documentation for the new syscalls, for numa and for numactl.
* re-enable first touch NUMA again by default, as now policies can be
  set in a variety of methods.
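The cascade order in the list above can be sketched as a small lookup. This is an illustrative Python model only: the function name, the use of `None` for "no policy set", and the policy strings are assumptions, not the kernel's types.

```python
DEFAULT_POLICY = "first-touch"  # hypothetical value for the global default


def effective_policy(thread_policy, process_policy,
                     default_policy=DEFAULT_POLICY):
    """Pick the policy to apply, following the cascade described above.

    None means "no policy set at this level".
    """
    if thread_policy is not None:
        return thread_policy
    if process_policy is not None:
        return process_policy
    return default_policy
```

A thread-level setting always wins, a process-level setting applies only when the thread has none, and the global default is the fallback.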

This is only relevant for very specific workloads.

This doesn't pretend to be a final NUMA solution.

The previous defaults in -HEAD (with MAXMEMDOM set) can be achieved by
'sysctl vm.default_policy=rr'.

This is only relevant if MAXMEMDOM is set to something other than 1.
Ie, if you're using GENERIC or a modified kernel with non-NUMA, then
this is a glorified no-op for you.

Thank you to Norse Corp for giving me access to rather large
(for FreeBSD!) NUMA machines in order to develop and verify this.

Thank you to Dell for providing me with dual socket sandybridge
and westmere v3 hardware to do NUMA development with.

Thank you to Scott Long at Netflix for providing me with access
to the two-socket, four-domain haswell v3 hardware.

Thank you to Peter Holm for running the stress testing suite
against the NUMA branch during various stages of development!

Tested:

* MIPS (regression testing; non-NUMA)
* i386 (regression testing; non-NUMA GENERIC)
* amd64 (regression testing; non-NUMA GENERIC)
* westmere, 2 socket (thank you norse!)
* sandy bridge, 2 socket (thank you dell!)
* ivy bridge, 2 socket (thank you norse!)
* westmere-EX, 4 socket / 1TB RAM (thank you norse!)
* haswell, 2 socket (thank you norse!)
* haswell v3, 2 socket (thank you dell)
* haswell v3, 2x18 core (thank you scott long / netflix!)

* Peter Holm ran a stress test suite on this work and found one
  issue, but has not been able to verify it (it doesn't look NUMA
  related, and he only saw it once over many testing runs.)

* I've tested bhyve instances running in fixed NUMA domains and cpusets;
  all seems to work correctly.

Verified:

* intel-pcm - pcm-numa.x and pcm-memory.x, whilst selecting different
  NUMA policies for processes under test.

Review:

This was reviewed through phabricator (https://reviews.freebsd.org/D2559)
as well as privately and via emails to freebsd-arch@.  The git history
with specific attributes is available at https://github.com/erikarn/freebsd/
in the NUMA branch (https://github.com/erikarn/freebsd/compare/local/adrian_numa_policy).

This has been reviewed by a number of people (stas, rpaulo, kib, ngie,
wblock) but has not achieved a clear consensus.  My hope is that with further
exposure and testing more functionality can be implemented and evaluated.

Notes:

* The VM doesn't handle unbalanced domains very well, and if you have an overly
  unbalanced memory setup whilst under high memory pressure, VM page allocation
  may fail leading to a kernel panic.  This was a problem in the past, but it's
  much more easily triggered now with these tools.

* This work only controls the path through vm_phys; it doesn't yet strongly/predictably
  affect contigmalloc, KVA placement, UMA, etc.  So, driver placement of memory
  isn't really guaranteed in any way.  That's next on my plate.

Sponsored by:	Norse Corp, Inc.; Dell
2015-07-11 15:21:37 +00:00
Konstantin Belousov
cf88021ab1 Do not allow creation of dirty buffers for dead buffer
objects, i.e. for buffer objects whose vnode was reclaimed.  The buffer
cache cannot write such buffers.  Return the error and discard the
buffer immediately on a write attempt.

BO_DIRTY is now always set during vnode reclamation, since it is used not
only for the INVARIANTS checks.  Do allow placement of clean
buffers on the dead bufobj list, otherwise filesystems cannot use bufcache
at all after the devvp reclaim.

Reported and tested by:	trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-07-11 11:21:56 +00:00