13664 Commits

Author SHA1 Message Date
delphij
86df2c268f Fix rtsold(8) remote buffer overflow vulnerability. [SA-14:20]
Fix memory leak in sandboxed namei lookup. [SA-14:22]
2014-10-21 20:20:07 +00:00
mav
7f2e56c17d MFC r273143: Remove setting BIO_DONE flag for BIOs that have done() method.
This fixes use-after-free, caused by geom_disk, completing same BIO twice
to save extra allocation, and getting BIO_DONE set after the first.
2014-10-19 08:47:27 +00:00
kib
08803fec5e MFC r272534:
Add IO_RANGELOCKED flag for vn_rdwr(9), which specifies that vnode is
not locked, but range is.
2014-10-18 15:28:01 +00:00
jhb
c8f54eb474 MFC 272449:
Require p_cansched() for changing a process' protection status via
procctl() rather than p_cansee().
2014-10-17 19:28:21 +00:00
jhb
b183e8b3c4 MFC 272182:
Don't panic if a resource is allocated twice.  Instead, print a warning and
fail the allocation request.  Allocations of "reserved" resources such as
PCI BARs already fail the request instead of panic'ing in this case.
2014-10-17 15:29:47 +00:00
jkim
d37b3c75ad MFC: r272718
Make kern.nswbuf tunable from loader.
2014-10-15 20:04:21 +00:00
mjg
6eb4db1c33 MFC r269023,r272503,r272505,r272523,r272567,r272569,r272574
Prepare fget_unlocked for reading fd table only once.

Some capsicum functions accept fdp + fd and lookup fde based on that.
Add variants which accept fde.

===============================

Add sequence counters with memory barriers.

Current implementation is somewhat simplistic and hackish,
will be improved later after possible memory barrier overhaul.

===============================

Plug capability races.

fp and appropriate capability lookups were not atomic, which could result in
improper capabilities being checked.

This could result either in protection bypass or in a spurious ENOTCAPABLE.

Make fp + capability check atomic with the help of sequence counters.

===============================

Put an #ifdef _KERNEL around the #include for opt_capsicum.h to
hopefully allow the build to finish after r272505.

===============================

filedesc: fix up breakage introduced in 272505

Include sequence counter support unconditionally [1]. This fixes reported build
problems with e.g. nvidia driver due to missing opt_capsicum.h.

Replace fishy looking sizeof with offsetof. Make fde_seq the last member in
order to simplify calculations.

===============================

Keep struct filedescent comments within 80-char limit.

===============================

seq_t needs to be visible to userspace
2014-10-14 21:19:23 +00:00
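The "Plug capability races" part of the commit above makes the file pointer and capability lookup atomic with the help of sequence counters. The following is a minimal userspace sketch of that idea in C11, not the kernel code: the `entry` structure, its fields, and the function names are hypothetical, and a single writer is assumed. The writer makes the counter odd while it updates the pair; a reader retries if it saw an odd counter or if the counter changed while it was reading.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical entry: two values that must be observed as a consistent pair. */
struct entry {
    atomic_uint seq;     /* even: stable, odd: update in progress */
    atomic_int  file;    /* stands in for the file pointer */
    atomic_int  rights;  /* stands in for the capability rights */
};

/* Writer side. Assumes one writer at a time (e.g. serialized by a lock). */
void entry_write(struct entry *e, int file, int rights)
{
    unsigned int s = atomic_load_explicit(&e->seq, memory_order_relaxed);

    assert((s & 1) == 0);   /* no update may already be in progress */
    atomic_store_explicit(&e->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&e->file, file, memory_order_relaxed);
    atomic_store_explicit(&e->rights, rights, memory_order_relaxed);
    atomic_store_explicit(&e->seq, s + 2, memory_order_release);
}

/* One read attempt. Returns false if the snapshot may be inconsistent. */
bool entry_read_once(struct entry *e, int *file, int *rights)
{
    unsigned int s = atomic_load_explicit(&e->seq, memory_order_acquire);

    if (s & 1)
        return false;       /* writer is in the middle of an update */
    *file = atomic_load_explicit(&e->file, memory_order_relaxed);
    *rights = atomic_load_explicit(&e->rights, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&e->seq, memory_order_relaxed) == s;
}

/* Lockless read: retry until a consistent pair was observed. */
void entry_read(struct entry *e, int *file, int *rights)
{
    while (!entry_read_once(e, file, rights))
        ;
}
```

With this scheme a reader never checks the rights of one update against the file of another, which is the kind of mismatch the commit message describes as leading to a protection bypass or a spurious ENOTCAPABLE.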
kib
82d8b580fd MFC r272538:
Slightly reword comment.  Move code, which is described by the
comment, after it.
2014-10-11 18:01:09 +00:00
kib
149982a012 MFC r272536:
Add kernel option KSTACK_USAGE_PROF.
2014-10-11 17:49:51 +00:00
neel
a70300211f MFC r272270:
tty_rel_free() can be called more than once for the same tty so make sure
that the tty is dequeued from 'tty_list' only the first time.
2014-10-08 04:35:09 +00:00
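The commit above makes removal from 'tty_list' happen only on the first call to tty_rel_free(). One common way to get that property is to track list membership explicitly, sketched below in userspace C. This is an assumption about the general technique, not a description of how the actual fix is implemented; the `item` type and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list element that remembers whether it is still linked. */
struct item {
    struct item  *next;
    struct item **prevp;   /* address of the pointer that points at us */
    bool          on_list;
};

struct item_list {
    struct item *first;
};

/* Insert at the head of the list. */
void list_insert(struct item_list *l, struct item *it)
{
    assert(!it->on_list);
    it->next = l->first;
    if (l->first != NULL)
        l->first->prevp = &it->next;
    it->prevp = &l->first;
    l->first = it;
    it->on_list = true;
}

/* Unlink the element; safe to call repeatedly. Returns true only the first time. */
bool list_remove_once(struct item *it)
{
    if (!it->on_list)
        return false;      /* already dequeued by an earlier call */
    if (it->next != NULL)
        it->next->prevp = it->prevp;
    *it->prevp = it->next;
    it->on_list = false;
    return true;
}
```

A second call becomes a no-op instead of unlinking through stale pointers.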
kib
a1a38225f0 MFC r272130:
In kern_linkat() and kern_renameat(), do not call namei(9) while
holding a write reference on the filesystem.  Try to get write
reference in unblocked way after all vnodes are resolved; if failed,
drop all locks and retry after waiting for suspension end.
2014-10-04 19:37:44 +00:00
sbruno
3e8c118a14 MFC r271141: Allow multiple image activators to run on the same
execution by changing imgp->interpreted to a bitmask instead of,
functionally, a bool.

Approved by:	re (gjb)
2014-10-02 21:19:13 +00:00
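The commit above turns imgp->interpreted from what was effectively a boolean into a bitmask so that more than one image activator can run during the same execution. A minimal sketch of the difference, with hypothetical flag and function names (the real flag names are not given in the commit message):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-activator bits; each activator owns one bit. */
#define ACT_SHELL   0x1u
#define ACT_BINMISC 0x2u

/* Hypothetical stand-in for the image parameters. */
struct image {
    unsigned int interpreted;  /* bitmask of activators that already ran */
};

/*
 * Returns true if the activator identified by 'flag' may run, and records
 * that it did. With a plain bool, the first activator would block all others;
 * with a bitmask, each activator only blocks a second run of itself.
 */
bool activator_try(struct image *img, unsigned int flag)
{
    assert(flag != 0);
    if (img->interpreted & flag)
        return false;
    img->interpreted |= flag;
    return true;
}
```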
kib
2a73c68cd0 MFC r272132:
Fix fcntl(2) compat32 after r270691.

Approved by:	re (glebius)
2014-09-28 11:08:32 +00:00
mjg
c60179fc3c MFC r270993:
Fix up proc_realparent to always return correct process.

Prior to the change it would always return initproc for non-traced processes.

This fixes a regression in inferior().

Approved by:	re (marius)
2014-09-26 20:05:28 +00:00
grehan
45208196a6 MFC tty fixes, r259549 and r259663
Keep tty_makedev as a function to preserve the KBI on 10-stable
(it is a macro in CURRENT). The changes for this are direct
commits to 10-stable.

r259549 (glebius):
  - Rename tty_makedev() into tty_makedevf() and make it capable
    to fail and return error.
  - Use make_dev_p() in tty_makedevf() instead of make_dev_cred().
  - Always pass MAKEDEV_CHECKNAME flag.
  - Optionally pass MAKEDEV_REF flag.
  - Provide macro for compatibility with old API.

  This fixes races with simultaneous creation and destruction of
  ttys, and makes it possible to call tty_makedevf() from device
  cloners.

  A race in tty_watermarks() still exists, since the latter drops
  lock for M_WAITOK allocation. This will be addressed in separate
  commit.

r259663 (glebius):
  Move list of ttys handling from the allocating procedures, to the
  device creation stage. A device creation can fail, and in that case
  an entry already on the list will be freed.

KBI issue pointed out by:       kib
Reviewed by:    kib (KBI addition)
Approved by:    re (kib)
2014-09-18 14:44:47 +00:00
dumbbell
4ed3581a36 vt(4): Merge several bug fixes and improvements
SVN revisions in this MFC:
  269779 270705 270706 271180 271250 271253 271682 271684

Detailed commit list:

r269779:
  fbd: Fix a bug where vt_fb_attach() success would be considered a failure

  vt_fb_attach() currently always returns 0, but it could return a code
  defined in errno.h. However, it doesn't return a CN_* code. So checking
  its return value against CN_DEAD (which is 0) is incorrect, and in this
  case, a success becomes a failure.

  The consequence was unimportant, because the caller (drm_fb_helper.c)
  would only log an error message in this case. The console would still
  work.

  Approved by:	nwhitehorn

r270705:
  vt(4): Add cngrab() and cnungrab() callbacks

  They are used when a panic occurs or when entering a DDB session for
  instance.

  cngrab() forces a vt-switch to the console window, no matter if the
  original window is another terminal or an X session. However, cnungrab()
  doesn't vt-switch back to the original window currently.

r270706:
  drm: Don't "taskqueue" vt-switch if under DDB/panic situation

  If DDB is active, we can't use a taskqueue thread to switch away from
  the X window, because this thread can't run.

  Reviewed by:	ray@
  Approved by:	ray@

r271180:
  vt_vga: vd_setpixel_t and vd_drawrect_t are noop in text mode

r271250:
  vt(4): Change the terminal and buffer sizes, even without a font

  This fixes a bug where scroll lock would not work for tty #0 when using
  vt_vga's textmode. The reason was that this window is created with a
  static 256x100 buffer, larger than the real size of 80x25.

  Now, in vt_change_font() and vt_compute_drawable_area(), we still
  perform operations even if the window has no font loaded (this is the
  case in textmode, where vw->vw_font == NULL). One of these operations
  resizes the buffer accordingly.

  In vt_compute_drawable_area(), we take the terminal size as is (ie.
  80x25) for the drawable area.

  The font argument to vt_set_border() is removed (it was never used) and
  the code now uses the computed drawable area instead of re-doing its own
  calculation.

  Reported by:	Harald Schmalzbauer <h.schmalzbauer_omnilan.de>
  Tested by:	Harald Schmalzbauer <h.schmalzbauer_omnilan.de>

r271253:
  pause_sbt(): Take the cold path (ie. use DELAY()) if KDB is active

  This fixes a panic in the i915 driver when one uses debug.kdb.enter=1
  under vt(4).

  PR:		193269
  Reported by:	emaste@
  Submitted by:	avg@

r271682:
  vt(4): Fix a LOR which occurs during a call to vt_upgrade()

  Reported by:	kib@
  Review:		https://reviews.freebsd.org/D785
  Reviewed by:	ray@
  Approved by:	ray@

r271684:
  vt(4): Use vt_fb_drawrect() and vt_fb_setpixel() in all vt_fb-derivative

  Review:		https://reviews.freebsd.org/D789
  Reviewed by:	nwhitehorn
  Approved by:	nwhitehorn

Approved by:	re (gjb)
2014-09-18 14:38:18 +00:00
mav
0e425be7bb MFC r271604, r271616:
Add couple memory barriers to order tdq_cpu_idle and tdq_load accesses.

This change fixes transient performance drops in some of my benchmarks,
vanishing as soon as I am trying to collect any stats from the scheduler.
It looks like reordered access to those variables sometimes caused loss of
IPI_PREEMPT, that delayed thread execution until some later interrupt.

Approved by:	re (marius)
2014-09-17 14:06:21 +00:00
trasz
333bfc8652 MFC r271317:
Avoid unlocking unlocked mutex in RCTL jail code.  Specific test case
is attached to PR.

PR:		193457
Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
2014-09-15 13:01:47 +00:00
kib
9b5b98982c MFC r270993 (by mjg):
Fix up proc_realparent to always return correct process.

Approved by:	re (delphij)
2014-09-11 11:25:10 +00:00
jhb
b2f9aa76a4 MFC 270823,270825,270829:
Use a unit number allocator to provide suitable st_dev and st_ino values
for POSIX shared memory descriptors.  The implementation is similar to
that used for pipes.

Approved by:	re (gjb for 10)
2014-09-10 15:45:18 +00:00
kib
f5031098f4 MFC r271000:
Delay the return from thread_single(SINGLE_EXIT) until all threads are
really destroyed by thread_stash() after the last switch out.

MFC r271007:
Retire thread_unthread().

MFC r271008:
Style.

Approved by:	re (marius)
2014-09-10 09:47:16 +00:00
mav
c5202a10e4 MFC r270423:
Restore pre-r239157 handling of sched_yield(), when thread time slice
was aborted, allowing other threads to run.  Without this change thread
is just rescheduled again, that was illustrated by provided test tool.

PR:		192926
Submitted by:	eric@vangyzen.net
Approved by:	re (marius)
2014-09-06 15:26:38 +00:00
kib
78a27e5e59 Add function and wrapper to switch lockmgr and vnode lock back to
auto-promotion of shared to exclusive.

Approved by:	re (gjb)
2014-09-05 13:22:28 +00:00
emaste
63a3fa9dd0 MFC automatic vt(4) selection for UEFI boot
r268158: Prefer vt(4) for UEFI boot

  The UEFI framebuffer driver vt_efifb requires vt(4), so add a
  mechanism for the startup routine to set the preferred console.
  This change is ugly because console init happens very early in the
  boot, making a cleaner interface difficult.  This change is intended
  only to facilitate the sc(4) / vt(4) transition, and can be reverted
  once vt(4) is the default.

r268160: Fix typos in VTY constant names from r268158

r268982: Don't pass null kmdp to preload_search_info

  On Xen PVH guests kmdp == NULL.

Sponsored by:	The FreeBSD Foundation
2014-09-02 22:01:14 +00:00
emaste
1063e140b6 MFC part of r267973: remove redundant "" assignment for string in BSS.
Sponsored by:	The FreeBSD Foundation
2014-09-02 19:48:37 +00:00
trasz
e8d76f86d2 MFC r270096:
Bring in the new automounter, similar to what's provided in most other
UNIX systems, eg. MacOS X and Solaris.  It uses Sun-compatible map format,
has proper kernel support, and LDAP integration.

There are still a few outstanding problems; they will be fixed shortly.

Reviewed by:	allanjude@, emaste@, kib@, wblock@ (earlier versions)
Phabric:	D523
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
2014-08-31 21:18:23 +00:00
delphij
3074ca39fd MFC r269963+269964:
Re-instate UMA cached backend for 4K - 64K allocations.  New consumers
like geli(4) use malloc(9) to allocate temporary buffers that get
freed shortly afterwards, causing frequent TLB shootdowns, as observed in
an hwpmc-supported flame graph.

Add a new loader tunable, vm.kmem_zmax which allows a system administrator
to limit the maximum allocation size that malloc(9) would consider using
the UMA cache allocator as backend.
2014-08-29 13:12:45 +00:00
kib
c679006e55 MFC r270345:
In do_lock_pi(), do not override error from umtxq_sleep_pi() when
doing suspend check.
2014-08-29 08:42:20 +00:00
kib
a1ef6db102 MFC r270321:
Ensure that sigaction flags for a signal whose disposition is reset to
ignored or default do not leak.

MFC r270504:
Revert the handling of all siginfo sa_flags except SA_SIGINFO to the
pre-r270321 state.
2014-08-29 08:38:34 +00:00
kib
3d7b436d95 MFC r270320:
Check the validity of struct sigaction sa_flags value, reject unknown
flags.
2014-08-29 08:33:32 +00:00
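The commit above validates sa_flags and rejects unknown flags. A userspace sketch of that kind of check, assuming the set of accepted flags is the POSIX-defined SA_* set (the kernel's actual accepted set and error path are not shown in the commit message; the mask and function name here are hypothetical, and EINVAL is an assumed error code):

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Assumed set of recognized flags: the SA_* values defined by POSIX. */
#define KNOWN_SA_FLAGS ((unsigned int)(SA_ONSTACK | SA_RESTART | \
    SA_RESETHAND | SA_NOCLDSTOP | SA_NODEFER | SA_NOCLDWAIT | SA_SIGINFO))

/* Returns 0 if every bit in act->sa_flags is recognized, EINVAL otherwise. */
int check_sa_flags(const struct sigaction *act)
{
    assert(act != NULL);
    if (((unsigned int)act->sa_flags & ~KNOWN_SA_FLAGS) != 0)
        return EINVAL;
    return 0;
}
```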
kib
5d15246092 Commit forgotten chunk of r270264. 2014-08-21 12:30:01 +00:00
kib
afeea342d6 MFC r269656:
Implement and use proc_realparent(9).

MFC r270024 (by markj):
Correct the order of arguments passed to LIST_INSERT_AFTER().

For merge, the p_treeflag member of struct proc was moved to the end
of the structure, to keep KBI intact.
2014-08-21 10:46:19 +00:00
davide
4c6c2c8b89 MFC r269502:
Fix an overflow in getsockopt(). optval isn't big enough to hold
sbintime_t.
Re-introduce r255030 behaviour capping socket timeouts to INT_32
if they're too large.
2014-08-20 17:26:05 +00:00
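The commit above caps socket timeouts that are too large for the destination. The general shape of such a cap can be sketched as a saturating conversion from a 64-bit value to a 32-bit one. The function name is hypothetical and the sketch does not model sbintime_t or the actual getsockopt() code path.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Saturating conversion: values that do not fit in 32 bits are capped at
 * INT32_MAX instead of being truncated or overflowing the destination.
 */
int32_t cap_timeout(int64_t t)
{
    assert(t >= 0);         /* timeouts are assumed non-negative here */
    if (t > INT32_MAX)
        return INT32_MAX;
    return (int32_t)t;
}
```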
kib
68885db3a4 MFC r269907:
Fix leaks of unqueued unwired pages.
2014-08-20 08:24:37 +00:00
grehan
855cd37c4f MFC r265098
Bump WITNESS_PENDLIST by MAXCPU to account for the
pmap pvlist locks which are scaled by MAXCPU.
2014-08-19 23:08:47 +00:00
mckusick
4791bac9b4 MFC of 269533 (by mckusick):
Add support for multi-threading of soft updates.

Replace a single soft updates thread with a thread per FFS-filesystem
mount point. The threads are associated with the bufdaemon process.

Reviewed by:  kib
Tested by:    Peter Holm and Scott Long
MFC after:    2 weeks
Sponsored by: Netflix

MFC of 269853 (by kib):

Revision r269457 removed the Giant around mount and unmount code, but
r269533, which was tested before r269457 was committed, implicitly
relied on the Giant to protect the manipulations of the softdepmounts
list.  Use softdep global lock consistently to guarantee the list
structure now.

Insert the new struct mount_softdeps into the softdepmounts only after
it is sufficiently initialized, to prevent softdep_speedup() from
accessing bare memory.  Similarly, remove struct mount_softdeps for
the unmounted filesystem from the tailq before destroying structure
rwlock.

Reported and tested by: pho
Reviewed by:    mckusick
Sponsored by:   The FreeBSD Foundation
2014-08-18 22:53:48 +00:00
kib
94d67906ea MFC r269457:
Remove Giant acquisition from the mount and unmount paths.
2014-08-17 09:07:21 +00:00
mjg
75d88c0d5a MFC r269020:
Cosmetic changes to unp_internalize

Don't throw away the result of fget_unlocked.
Move fdp increment to for loop to make it consistent with similar code
elsewhere.
2014-08-17 07:24:23 +00:00
mjg
fd678d23e5 MFC r268636:
Plug p_pptr null test in do_execve. It is always true.
2014-08-17 07:22:40 +00:00
mjg
46f8d5c454 MFC r268634:
Manage struct sigacts refcnt with atomics instead of a mutex.
2014-08-17 07:20:37 +00:00
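The commit above replaces a mutex-protected reference count with atomic operations. A minimal C11 sketch of an atomically managed reference count follows; the type and function names are hypothetical, and it uses C11 atomics rather than the kernel's own primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared object with an atomically managed reference count. */
struct shared_obj {
    atomic_uint refcnt;
};

/* A new object starts with one reference, owned by its creator. */
void obj_init(struct shared_obj *o)
{
    atomic_init(&o->refcnt, 1);
}

/* Take an additional reference; the caller must already hold one. */
void obj_hold(struct shared_obj *o)
{
    atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/*
 * Drop a reference. Returns true if this was the last one, in which case
 * the caller is responsible for freeing the object.
 */
bool obj_release(struct shared_obj *o)
{
    unsigned int old;

    old = atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_release);
    assert(old > 0);        /* releasing more often than holding is a bug */
    if (old != 1)
        return false;
    /* Make earlier writes by other releasers visible before teardown. */
    atomic_thread_fence(memory_order_acquire);
    return true;
}
```

No lock is needed because each hold or release is a single atomic read-modify-write on the counter.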
mjg
789b35542c MFC r264114, r264310, r268570:
r264114 by davidxu:

Fix SIGIO delivery. Use fsetown() to handle file descriptor owner
ioctl and use pgsigio() to send SIGIO.

r264310 by davidxu:

Add kqueue support for devctl.

r268570:

Clear nonblock and async on devctl close instead of open.

This is a purely cosmetic change.
2014-08-17 07:16:03 +00:00
mjg
9e281fe64e MFC r268514:
Eliminate plim and vtmp local vars in exit1.

No functional changes.
2014-08-17 07:06:55 +00:00
mjg
8a70582e79 MFC r259407:
proc exit: don't take PROC_LOCK while freeing rlimits

Code wishing to check rlimits of some process should check whether it
is exiting first, which current consumers do.
2014-08-17 07:05:30 +00:00
mjg
ce59684e4d MFC r268505, r268507:
Avoid relocking filedesc lock when closing fds during fdp destruction.

Don't call bzero nor fdunused from fdfree for such cases. It would do
unnecessary work and complain that the lock is not taken.

=======

Don't zero fd_nfiles during fdp destruction.

Code trying to take a look has to check fd_refcnt and it is 0 by that time.

This is a follow up to r268505, without this the code would leak memory for
tables bigger than the default.
2014-08-17 07:00:47 +00:00
mjg
8fa92f4d0b MFC r268365:
Don't call crdup nor uifind under vnode lock.

A locked vnode can get in the way of satisfying malloc with M_WAITOK.

This is a fixup to r268087.
2014-08-17 06:58:14 +00:00
mjg
98ac4e9b08 MFC r268136:
Plug gcc warning after r268074 about uninitialized newsigacts
2014-08-17 06:56:22 +00:00
mjg
9cebe2439f MFC r268087:
Don't call crcopysafe or uifind unnecessarily in execve.
2014-08-17 06:54:49 +00:00
mjg
f377401ca1 MFC r268074:
Perform a lockless check in sigacts_shared.

It is used only during execve (i.e. singlethreaded), so there is no fear
of returning 'not shared' which soon becomes 'shared'.

While here reorganize the code a little to avoid proc lock/unlock in
shared case.
2014-08-17 06:52:35 +00:00
bz
b067654da5 MFC r269669:
Split up sys_ktimer_getoverrun() into a sys_ and a kern_ variant
 and export the kern_ version needed by an upcoming linuxolator change.

 Sponsored by:	DARPA,AFRL
2014-08-16 12:59:47 +00:00
markj
57990c0ba4 MFC r266826, r266827
Move some duplicated hook definitions from machine-dependent files to
kern_dtrace.c.
2014-08-09 14:05:01 +00:00