11356 Commits

Author SHA1 Message Date
Ruslan Ermilov
f47552e770 MFC r198295:
Random number generator initialization cleanup:

- Introduce new SI_SUB_RANDOM point in boot sequence to make it
clear from where one may start using random(9).  It should be as
early as possible, so place it just after SI_SUB_CPU where we
have some randomness on most platforms via get_cyclecount().

- Move stack protector initialization to be after SI_SUB_RANDOM
as before this point we have no randomness at all.  This fixes
stack protector to actually protect stack with some random guard
value instead of a well-known one.

Note that this patch doesn't try to address arc4random(9) issues.
With current code, it will be implicitly seeded by stack protector
and hence will get the same entropy as random(9).  It will be
securely reseeded once /dev/random is fed some entropy from
userland.

Submitted by:	Maxim Dounin <mdounin@mdounin.ru>
Approved by:	re (kib)
2009-10-24 04:55:14 +00:00
Konstantin Belousov
dc68cec603 MFC r197934:
Map PIE binaries at non-zero base address.

MFC r198202:
Honour non-zero mapbase for PIE binaries. Inform interpreter-less PIE
binary about its relocbase.

Approved by:	re (kensmith)
2009-10-20 13:34:41 +00:00
Konstantin Belousov
5b15472fe9 MFC r197932:
Do not map elf segments of zero length.

Approved by:	re (kensmith)
2009-10-20 13:30:06 +00:00
John Baldwin
74fb2c91c6 MFC 198126:
Fix a sign bug in the handling of nice priorities when computing the
interactive score for a thread.

Approved by:	re (kib)
2009-10-19 19:40:05 +00:00
Attilio Rao
be0ac16015 MFC r197476:
In function do_rw_wrlock, when a writer gets an error, before returning,
check whether any readers are blocked by us via the URWLOCK_WRITE_WAITERS
flag, and resume them. The error must be EAGAIN; otherwise there must be a
memory problem, and nobody can rescue the buggy application.

Approved by:	re (kib), davidxu
2009-10-13 13:03:31 +00:00
Konstantin Belousov
aba70b5e59 MFC r197942:
Refine r195509, instead of checking that vnode type is VBAD, that is
set quite late in the revocation path, properly verify that vnode is
not doomed before calling VOP.

Approved by:	re (bz)
2009-10-13 09:24:51 +00:00
Attilio Rao
3f4609ac69 MFC r197643, r197735:
When releasing a read/shared lock we need to use a write memory barrier
in order to avoid CPU instruction reordering on architectures which
do not have strongly ordered writes.

Approved by:	re (kib)
2009-10-12 15:32:00 +00:00
Konstantin Belousov
88c45ef724 MFC r197662:
Do not dereference vp->v_mount without holding vnode lock and checking
that the vnode is not reclaimed.

Approved by:	re (bz)
2009-10-08 11:28:32 +00:00
Konstantin Belousov
68ee1aac0a MFC r197660:
Fix typo.

Approved by:	re (bz, kensmith)
2009-10-04 12:11:44 +00:00
Simon L. B. Nielsen
bc7f0010f1 MFC r197711:
Add no zero mapping feature.

NOTE: Unlike in the other branches where this change will be "merged"
to, the 'no zero mapping' is enabled by default in stable/8.

Errata:		FreeBSD-EN-09:05.null
Approved by:	re (kib)
2009-10-02 17:58:47 +00:00
Jamie Gritton
a301d3226a MFC r197581, r197583, r197584:
Set the prison in NFS anon and GSS SVC creds.

Reviewed by:	marcel
Approved by:	re (kib)
2009-10-01 13:11:45 +00:00
Konstantin Belousov
b57f5ce0bd MFC r197390:
Remove forward_roundrobin().

Approved by:	re (kensmith)
2009-09-28 11:31:21 +00:00
Alexander Motin
2adf464fea MFC rev. 197462:
Do not call BUS_DRIVER_ADDED() for detached buses (attach failed) on
driver load. This fixes a crash on atapicam module load on systems where
some ata channels (usually ata1) were probed but failed to attach.

Reviewed by:    jhb, imp
Tested by:      many
Approved by:    re (kib)
2009-09-25 18:04:55 +00:00
Konstantin Belousov
9f1fab5064 MFC r197049:
Calculate the amount of bytes to copy for select filedescriptor masks
taking into account size of fd_set for the current process ABI.

Approved by:	re (kensmith)
2009-09-16 13:24:37 +00:00
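The size calculation described in r197049 can be illustrated with a small sketch. This is not the kernel's code: `fd_mask_bytes` and its parameters are hypothetical names, and the sketch only assumes that an fd_set is an array of mask words whose width depends on the process ABI (for example 4 bytes for a 32-bit process and 8 bytes for a 64-bit one).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the kernel's implementation: number of bytes
 * needed to hold descriptor bits 0..nd-1 when the ABI's fd_set is built
 * from mask words of mask_size bytes each.
 */
static size_t
fd_mask_bytes(size_t nd, size_t mask_size)
{
    size_t bits_per_mask, nmasks;

    assert(mask_size != 0);
    bits_per_mask = mask_size * 8;
    /* Round the descriptor count up to a whole number of mask words. */
    nmasks = (nd + bits_per_mask - 1) / bits_per_mask;
    return (nmasks * mask_size);
}
```

With one descriptor, a 4-byte mask word gives 4 bytes to copy while an 8-byte mask word gives 8, which is why the copy length has to follow the ABI of the calling process rather than that of the kernel.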
Attilio Rao
9cede8fb41 MFC r197223:
Fix sched_switch_migrate() by assuming locks cannot be shared and a
deadlock between 3 different threads by acquiring both runqueue locks
when doing the migration.

Please note that this is a special condition as we want this fix in
before RC1 as we assume it is critical and so it has been handled
as an instant-merge.  For the STABLE_7 branch, 1 week before the MFC
is assumed.

Approved by:	re (kib)
2009-09-15 19:14:25 +00:00
Konstantin Belousov
d4c8e5ac7b MFC r197031:
Unlock the image vnode around the call of pmc PMC_FN_PROCESS_EXEC hook.
The hook calls vn_fullpath(9), that should not be executed with a vnode
lock held.

Approved by:	re (kensmith)
2009-09-12 18:05:57 +00:00
Konstantin Belousov
3c9d279b1d MFC r197030:
In vfs_mark_atime(9), be resistant against reclaimed vnodes.
Assert that necessary locks are taken, since vop might not be called.

Approved by:	re (kensmith)
2009-09-12 18:02:57 +00:00
Konstantin Belousov
93566d2a83 MFC r196887:
In fhopen, vfs_ref() the mount point while vnode is unlocked, to prevent
vn_start_write(NULL, &mp) from operating on potentially freed or reused
struct mount *.

Remove unmatched vfs_rel() in cleanup.

Approved by:	re (kensmith)
2009-09-09 13:28:18 +00:00
Attilio Rao
c90c9ddddb Adaptive spinning for locking primitives, in read mode, has some tuning
SYSCTLs which are inappropriate for daily use of the machine (mostly
useful only to a developer who wants to run benchmarks on it).
Remove them before the release, as we do not want to ship with
them in.

Now that the SYSCTLs are gone, instead of using static storage for some
constants, use real numeric constants in order to avoid possible compiler
dumbness and the risk of sharing storage (and thus a cache line) among
CPUs when doing adaptive spinning together.

Please note that the sys/linker_set.h inclusion in lockmgr and sx lock
support could have been removed, but re@ preferred to keep it in order to
minimize the risk of problems on future merging.

Please note that this patch is not an MFC, but an 'edge case' committed
directly to stable/8, which creates a divergence from HEAD.

Tested by:      Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Approved by:	re (kib)
2009-09-09 09:34:13 +00:00
Attilio Rao
db0c92ce82 MFC r196772:
Fix adaptive spinning in lockmgr by correctly using GIANT_RESTORE and
the continue statement, and improve adaptive spinning for sx locks by
doing GIANT_SAVE just once.

Approved by:	re (kib)
2009-09-09 09:17:31 +00:00
Jamie Gritton
3c7562c77e MFC r196835:
Allow a jail's name to be the same as its jid (which is the default if
  no name is specified), and let a numeric name specify the jid for a new
  jail when the jid isn't otherwise set.  Still disallow other numeric
  names.

Reviewed by:	zec
Approved by:	re (kib), bz (mentor)
2009-09-08 19:18:02 +00:00
Konstantin Belousov
2af00decb8 MFC r196730:
Remove the altkstacks, instead instantiate threads with kernel stack
allocated with the right size from the start. For the thread that has
kernel stack cached, verify that requested stack size is equal to the
actual, and reallocate the stack if sizes differ.

Introduce separate kernel stack cache that keeps some limited amount of
preallocated kernel stacks to lower the latency of thread allocation.

Not a merge: instead of removing td_altkstack* members of struct thread,
replace them with placeholders to keep struct thread layout on the
stable branch.

Also, record r196640, r196644 and r196648 as merged.

Approved by:	re (kensmith)
2009-09-08 15:31:23 +00:00
Konstantin Belousov
c02280f542 MFC r196692:
Make the mnt_writeopcount and mnt_secondary_writes counters,
used by the suspension code, not greater than mnt_ref reference
counter value.

MFC r196733:
Fix mount reference leak when V_XSLEEP is specified to vn_start_write().

Approved by:	re (kensmith)
2009-09-08 14:43:42 +00:00
Warner Losh
45f395006c MFC r196529:
Rather than having enabled/disabled, implement a max queue depth.
  While usually not an issue, this firewalls bugs in the code that may
  run us out of memory.

  Fix a memory exhaustion in the case where devctl was disabled, but the
  link was bouncing.  The check to queue was in the wrong place.

  Implement a new sysctl hw.bus.devctl_queue to control the depth.  Make
  compatibility hacks for hw.bus.devctl_disable to ease transition.

  Reviewed by:	emaste@
  Approved by:	re@ (kib)
  MFC after:	asap
2009-09-05 08:03:29 +00:00
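The idea behind r196529, a depth limit that is checked before anything is queued, can be sketched as follows. This is a minimal illustration, not devctl's code: `struct evq`, `evq_put` and `EVQ_SLOTS` are hypothetical names, and the policy of rejecting a new event when the limit is reached is an assumption made for the sketch (the commit message does not say which event is discarded).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EVQ_SLOTS 8     /* hypothetical fixed backing store for the sketch */

struct evq {
    int    slots[EVQ_SLOTS];
    size_t head;        /* index of the oldest queued event */
    size_t count;       /* number of events currently queued */
    size_t max_depth;   /* tunable limit; 0 means "queue nothing" */
};

/* Try to queue an event; returns false if the event was dropped. */
static bool
evq_put(struct evq *q, int ev)
{
    assert(q != NULL && q->max_depth <= EVQ_SLOTS);
    /*
     * Check the limit before touching the queue, so that a depth of 0
     * (the "disabled" case) never consumes any storage.
     */
    if (q->count >= q->max_depth)
        return (false);
    q->slots[(q->head + q->count) % EVQ_SLOTS] = ev;
    q->count++;
    return (true);
}
```

Treating "disabled" as a depth of zero means a single check covers both the disabled case and the overflow case, which is the kind of misplaced check the commit message describes fixing.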
Bjoern A. Zeeb
914e5afefd MFC r196653:
Make sure FreeBSD binaries without .note.ABI-tag section work
  correctly and do not match a colliding Debian GNU/kFreeBSD
  brandinfo statements.
  For this mark the Debian GNU/kFreeBSD brandinfo that it must have
  an .note.ABI-tag section and ignore the old EI_OSABI brandinfo
  when comparing a possibly colliding set of options.

  Due to SYSINIT we add the brandinfo in a non-deterministic order,
  so native FreeBSD is not always first. We may want to consider
  to force native FreeBSD to come first as well.

  The only way a problem could currently be noticed is when running an
  i386 binary without the .note.ABI-tag on amd64 and the Debian GNU/kFreeBSD
  brandinfo  was matched first,  as the fallback to ld-elf32.so.1 does
  not exist in that case.

Reported and tested by:	ticso
In collaboration with:	kib
MFC after:		3 days
Approved by:		re (rwatson)
2009-09-02 10:39:46 +00:00
Jilles Tjoelker
33656b91cd MFC r196460
Fix the conformance of poll(2) for sockets after r195423 by
  returning POLLHUP instead of POLLIN for several cases. Now, the
  tools/regression/poll results for FreeBSD are closer to those of
  Solaris and Linux.

  Also, improve the POSIX conformance by explicitly clearing POLLOUT
  when POLLHUP is reported in pollscan(), making the fix global.

  Submitted by:	bde
  Reviewed by:	rwatson

MFC r196556

  Fix poll() on half-closed sockets, while retaining POLLHUP for fifos.

  This reverts part of r196460, so that sockets only return POLLHUP if both
  directions are closed/error. Fifos get POLLHUP by closing the unused
  direction immediately after creating the sockets.

  The tools/regression/poll/*poll.c tests now pass except for two other
  things:
  - if POLLHUP is returned, POLLIN is always returned as well instead of
    only when there is data left in the buffer to be read
  - fifo old/new reader distinction does not work the way POSIX specs it

  Reviewed by:	kib, bde

MFC r196554

  Add some tests for poll(2)/shutdown(2) interaction.

Approved by:	re (kensmith)
2009-09-01 20:58:41 +00:00
Marius Strobl
e4e5ed252d Add a temporary workaround which just lets init die instead of
causing a panic if it is killed due to an unsolved stack overflow
seen very late during shutdown on sparc64 when the gmirror worker
process exits, which is a regression introduced in 8.0.

Reviewed by:	kib
Approved by:	re (rwatson)
2009-08-31 19:16:58 +00:00
Jamie Gritton
f37b0a3db5 MFC r196592:
Fix a LOR between allprison_lock and vnode locks by releasing
  allprison_lock before releasing a prison's root vnode.

PR:		kern/138004
Reviewed by:	kib
Approved by:	re (rwatson), bz (mentor)
2009-08-31 14:13:45 +00:00
Konstantin Belousov
e179d138ba MFC r196560:
Honor the vfs.timestamp_precision sysctl settings for utimes(path, NULL)
and similar calls.

Approved by:	re (rwatson)
2009-08-31 09:08:14 +00:00
Robert Watson
3ef94f2b72 Merge r196481 from head to stable/8:
Rework global locks for interface list and index management, correcting
  several critical bugs, including race conditions and lock order issues:

  Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an
  sxlock.  Either can be held to stabilize the lists and indexes, but both
  are required to write.  This allows the list to be held stable in both
  network interrupt contexts and sleepable user threads across sleeping
  memory allocations or device driver interactions.  As before, writes to
  the interface list must occur from sleepable contexts.

  Reviewed by:  bz, julian

Approved by:	re (kib)
2009-08-28 20:06:02 +00:00
Marko Zec
61268392e1 MFC r196505:
When "jail -c vnet" request fails, the current code actually creates and
  leaves behind an orphaned vnet.  This change ensures that such vnets get
  released.

  This change affects only options VIMAGE builds.

  Submitted by: jamie
  Discussed with:       bz
  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-28 19:15:17 +00:00
Marko Zec
939af5009a MFC r196501:
When registering a protocol to an existing protocol domain via
  pf_proto_register(), iterate over all existing vnets to call protosw_init()
  and thus the appropriate .pr_init() handler in the context of each vnet.
  NB in the future we probably want to separate pr_init() handlers into
  two, i.e. per-vnet and global, functions.

  This change has no impact on nooptions VIMAGE builds.

  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-28 19:08:56 +00:00
Bjoern A. Zeeb
ac63e409c2 MFC r196512:
Fix handling of .note.ABI-tag section for GNU systems [1].
  Handle GNU/Linux according to LSB Core Specification 4.0,
  Chapter 11. Object Format, 11.8. ABI note tag.

  Also check the first word of desc, not only name, according to
  glibc abi-tags specification to distinguish between Linux and
  kFreeBSD.

  Add explicit handling for Debian GNU/kFreeBSD, which runs
  on our kernels as well [2].

  In {amd64,i386}/trap.c, when checking osrel of the current process,
  also check the ABI to not change the signal behaviour for Linux
  binary processes, now that we save an osrel version for all three
  from the lists above in struct proc [2].

  These changes make it possible to run FreeBSD, Debian GNU/kFreeBSD
  and Linux binaries on the same machine again for at least i386 and
  amd64, and no longer break kFreeBSD which was detected as GNU(/Linux).

PR:		kern/135468
Submitted by:	dchagin [1] (initial patch)
Suggested by:	kib [2]
Tested by:	Petr Salinger (Petr.Salinger seznam.cz) for kFreeBSD
Reviewed by:	kib
Approved by:	re (kensmith)
2009-08-27 17:34:13 +00:00
John Baldwin
18fb1e9a44 MFC 196417:
This patch fixes two bugs in sglist(9) and improves robustness of the API via
better semantics if a request to append an address range to an existing list
fails.
- When cloning an sglist, properly set the length in the new sglist instead of
  leaving the new list empty.
- Properly compute the amount of data added to an sglist via
  _sglist_append_buf().  This allows sglist_consume_uio() to properly update
  uio_resid.
- When a request to append an address range to a scatter/gather list fails,
  restore the sglist to the state it had at the start of the function call
  instead of resetting it to an empty list.

Approved by:	re (kib)
2009-08-21 03:14:39 +00:00
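The third fix in r196417, leaving the list as it was when an append fails, can be sketched like this. The types and names (`struct seglist`, `seglist_append`, `SEG_MAX`) are hypothetical and much simpler than sglist(9); in particular the sketch snapshots the whole list, which is only reasonable because the list here is tiny.

```c
#include <assert.h>
#include <stddef.h>

#define SEG_MAX 4       /* hypothetical capacity for the sketch */

struct seg {
    size_t addr;
    size_t len;
};

struct seglist {
    struct seg segs[SEG_MAX];
    size_t     nseg;
};

/*
 * Append the range [addr, addr + len), split at multiples of chunk.
 * Returns 0 on success.  On failure returns -1 and leaves the list
 * exactly as it was on entry, rather than emptying it.
 */
static int
seglist_append(struct seglist *sl, size_t addr, size_t len, size_t chunk)
{
    struct seglist save;

    assert(sl != NULL && chunk != 0);
    save = *sl;                         /* state to restore on failure */
    while (len > 0) {
        size_t n = chunk - (addr % chunk);

        if (n > len)
            n = len;
        if (sl->nseg == SEG_MAX) {
            *sl = save;                 /* roll back partial progress */
            return (-1);
        }
        sl->segs[sl->nseg].addr = addr;
        sl->segs[sl->nseg].len = n;
        sl->nseg++;
        addr += n;
        len -= n;
    }
    return (0);
}
```

A caller that gets an error can then retry with a smaller range or hand the existing segments on unchanged, instead of having to rebuild the list from scratch.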
Robert Watson
708b471c4b Merge r196267 from head to stable/8:
Rather than fix questionable ifnet list locking in the implementation of
  the kern.polling.enable sysctl, remove the sysctl.  It has been deprecated
  since FreeBSD 6 in favour of per-ifnet polling flags.

  Reviewed by:	luigi

Approved by:	re (kib)
2009-08-20 21:29:49 +00:00
John Baldwin
5c91164df2 MFC 196404:
Change the 'resid' parameter to sglist_consume_uio() from an int to a
size_t to match the recent type change of the uio_resid member of struct
uio.

Approved by:	re (kib)
2009-08-20 20:53:36 +00:00
John Baldwin
247db0748a MFC 196403: Temporarily revert the new-bus locking for 8.0 release.
Approved by:	re (kib)
2009-08-20 20:23:28 +00:00
Ed Schouten
e047c5fbb6 MFC r196378:
Small changes to the warning message generated by pty(4):

  - Only print the warning once, instead of filling up the screen.
  - Use the word "legacy" for the pty_warningcnt description, to prevent
    confusion.
  - Use log() instead of printf().

  Discussed with: rwatson, jhb
  Approved by:    re (kib)
2009-08-19 14:38:43 +00:00
Pawel Jakub Dawidek
65536ad653 MFC r196358:
Remove unused taskqueue_find() function.

Reviewed by:	dfr
Approved by:	re (kib)
2009-08-18 14:00:25 +00:00
Attilio Rao
7e2d0af9e0 MFC r196334:
* Change the scope of the ASSERT_ATOMIC_LOAD() from a generic check to
  a pointer-fetching specific operation check. Consequently, rename the
  operation ASSERT_ATOMIC_LOAD_PTR().
* Fix the implementation of ASSERT_ATOMIC_LOAD_PTR() by checking
  directly alignment on the word boundary, for all the given specific
  architectures. That's a bit too strict for some common cases, but it
  assures safety.
* Add a comment explaining the scope of the macro
* Add a new stub in the lockmgr specific implementation

Tested by: marcel (initial version), marius
Reviewed by: rwatson, jhb (comment specific review)
Approved by: re (kib)
2009-08-17 16:33:53 +00:00
Pawel Jakub Dawidek
e43f173602 MFC r196295:
Remove OpenSolaris taskq port (it performs very poorly in our kernel) and
replace it with wrappers around our taskqueue(9).
To make this possible, implement a taskqueue_member() function which returns 1
if the given thread was created by the given taskqueue.

Approved by:	re (kib)
2009-08-17 09:03:47 +00:00
Pawel Jakub Dawidek
ea5f504fed MFC r196293:
Because taskqueue_run() can drop tq_mutex, we need to check if the
TQ_FLAGS_ACTIVE flag wasn't removed in the meantime, which means we missed a
wakeup.

Approved by:	re (kib)
2009-08-17 08:46:47 +00:00
Ed Schouten
336b627671 MFC r196276:
Fix small style regression introduced by the MPSAFE newbus code.

Approved by:	re (rwatson)
2009-08-16 20:33:16 +00:00
Bjoern A. Zeeb
21845eba48 MFC r196226:
Add a new macro to test that a variable could be loaded atomically.
  Check that the given variable is at most uintptr_t in size and that
  it is aligned.

  Note: ASSERT_ATOMIC_LOAD() uses ALIGN() to check for adequate
        alignment -- however, the function of ALIGN() is to guarantee
        alignment, and therefore may lead to stronger alignment
        enforcement than necessary for types that are smaller than
        sizeof(uintptr_t).

  Add checks to mtx, rw and sx locks init functions to detect possible
  breakage. This was used during debugging of the problem fixed with
  r196118 where a pointer was on an un-aligned address in the dpcpu area.

  In collaboration with:  rwatson
  Reviewed by:            rwatson

Approved by:	re (kib)
2009-08-14 21:50:47 +00:00
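The kind of check r196226 describes (size at most that of uintptr_t, plus alignment) might look like the sketch below. `load_may_be_atomic` is a hypothetical function, not the ASSERT_ATOMIC_LOAD() macro itself, and it assumes the stricter rule mentioned in the note above: the address must be aligned to sizeof(uintptr_t) even for smaller types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check, not the kernel macro: true if an object of the
 * given size at address p is no larger than a machine word and sits on
 * a word boundary.
 */
static bool
load_may_be_atomic(const void *p, size_t size)
{
    assert(p != NULL);
    return (size <= sizeof(uintptr_t) &&
        ((uintptr_t)p & (sizeof(uintptr_t) - 1)) == 0);
}
```

In a lock init function such a predicate would typically sit inside an assertion, so that a misaligned lock word is reported when the lock is created rather than when a torn read happens later.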
Konstantin Belousov
106c3802ff MFC r196203:
Correctly handle unlock for !MAKEENTRY case.

Approved by:	re (rwatson)
2009-08-14 11:06:58 +00:00
Attilio Rao
be1057174e MFC r196196:
* Completely remove the option STOP_NMI from the kernel.  This option
  has proven to have a good effect when entering KDB by using an NMI,
  but it completely violates all the good rules about interrupts
  disabled while holding a spinlock in other occasions.  This can be the
  cause of deadlocks on events where a normal IPI_STOP is expected.
* Add a new IPI called IPI_STOP_HARD on all the supported architectures.
  This IPI is responsible for sending a stop message among CPUs using a
  privileged channel when available. In other cases it just matches a
  normal IPI_STOP.
  Right now the IPI_STOP_HARD functionality uses an NMI on ia32 and amd64
  architectures, while on the others it has a normal IPI_STOP effect. It is
  the responsibility of maintainers to eventually implement a hard stop
  when necessary and possible.
* Use the new IPI facility in order to implement a new userend SMP kernel
  function called stop_cpus_hard(). It mirrors stop_cpu() but
  uses the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
  stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and added comments

This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.

Please don't forget to update your config file with the STOP_NMI
option removal

Reviewed by:  jhb
Tested by:    pho, bz, rink
Approved by:  re (kib)
2009-08-13 17:54:11 +00:00
Bjoern A. Zeeb
da2a30fca1 MFC r196176:
Make it possible to change the vnet sysctl variables on jails
  with their own virtual network stack. Jails only inheriting a
  network stack cannot change anything that cannot be changed from
  within a prison.

  Reviewed by:  rwatson, zec

Approved by:	re (kib)
2009-08-13 10:31:02 +00:00
Bjoern A. Zeeb
abff5b8ad9 MFC r196135:
Make the kernel compile without IP networking by moving
  a variable under a proper #ifdef.

Approved by:	re (rwatson)
2009-08-12 12:14:30 +00:00
Bjoern A. Zeeb
537791e584 MFC r196132:
Add ddb show dpcpu_off command to ease dpcpu memory debugging.
  While show pcpu prints pc_dynamic this also prints the original
  memory address as well as the maths.

  Once dpcpu goes NUMA this is considered to help debugging as well.

  Reviewed by:  rwatson

Approved by:	re
2009-08-12 12:10:28 +00:00
Julian Elischer
9734411552 Stop uuidgen(2) from crashing in vimage kernels.
Make curvnet valid when needed.

Reviewed by:	bz@
Approved by:	re (kib)
2009-08-02 16:59:02 +00:00