Commit Graph

145724 Commits

Author SHA1 Message Date
Konstantin Belousov
c1d8b5e82c Fix two issues with bufdaemon, often causing the processes to hang in
the "nbufkv" sleep.

First, ffs background cg group block write requests a new buffer for
the shadow copy. When ffs_bufwrite() is called from the bufdaemon due
to buffers shortage, requesting the buffer deadlock bufdaemon.
Introduce a new flag for getnewbuf(), GB_NOWAIT_BD, to request getblk
to not block while allocating the buffer, and return failure
instead. Add a flag argument to the geteblk to allow to pass the flags
to getblk(). Do not repeat the getnewbuf() call from geteblk if buffer
allocation failed and either GB_NOWAIT_BD is specified, or geteblk()
is called from bufdaemon (or its helper, see below). In
ffs_bufwrite(), fall back to synchronous cg block write if shadow
block allocation failed.

Since r107847, buffer write assumes that vnode owning the buffer is
locked. The second problem is that buffer cache may accumulate many
buffers belonging to limited number of vnodes. With such workload,
quite often threads that own the mentioned vnodes locks are trying to
read another block from the vnodes, and, due to buffer cache
exhaustion, are asking bufdaemon for help. Bufdaemon is unable to make
any substantial progress because the vnodes are locked.

Allow the threads owning vnode locks to help the bufdaemon by doing
the flush pass over the buffer cache before getnewbuf() is going to
uninterruptible sleep. Move the flushing code from buf_daemon() to new
helper function buf_do_flush(), that is called from getnewbuf().  The
number of buffers flushed by single call to buf_do_flush() from
getnewbuf() is limited by new sysctl vfs.flushbufqtarget.  Prevent
recursive calls to buf_do_flush() by marking the bufdaemon and threads
that temporarily help bufdaemon by TDP_BUFNEED flag.

In collaboration with:	pho
Reviewed by:	 tegge (previous version)
Tested by:	 glebius, yandex ...
MFC after:	 3 weeks
2009-03-16 15:39:46 +00:00
VANHULLEBUS Yvan
ef39bc9f18 Added DLT_ENC to map list, so it is now possible
to save dumps on enc0

Reviewed by:	gnn(mentor)
Obtained from:	NETASQ
MFC after:	1 week
2009-03-16 15:09:47 +00:00
Alexander Motin
ad7de3aafb Fix spelling in message. 2009-03-16 12:42:23 +00:00
Dag-Erling Smørgrav
ed53753dac cat(1) compiles fine at WARNS level 6. 2009-03-16 12:16:17 +00:00
Weongyo Jeong
d3947a869c use usb2_desc_foreach() to iterate the USB config descriptor instread of
accessing structures directly to check some invalid descriptors.

Pointed by:	hps
2009-03-16 11:19:07 +00:00
Robert Watson
5bb89930e4 Define and use two macros for loopback checksum offload:
LO_CSUM_FEATURES - a bitmask of supported transmit offload features, which
  will be stored in if_hwassist if IFCAP_TXCSUM is enabled, and be cleared
  from mbuf packet header csum flags on transmit. (1)

LO_CSUM_SET - a bitmask of supported receive offload features, which will
  be set on the mbuf packet header csum flags on transmit if IFCAP_RXCSUM
  is enabled.

While here, fix SCTP offload for loopback: offer generation on the
transmit side, don't just skip validation on the receive side.

Obtained from:  DragonflyBSD (1)
MFC after:      1 week
2009-03-16 10:56:50 +00:00
Dmitry Chagin
6465d2d9d2 Chase the k8temp->amdtemp rename in NOTES and loader.conf.
Approved by:	kib (mentor)
2009-03-16 10:36:24 +00:00
Robert Watson
fe89fae166 if_hwassist should be initialized with CSUM, rather than IFCAP, flags.
Submitted by:	yongari
MFC after:	1 week
2009-03-16 09:22:34 +00:00
Robert Noland
910ac6da21 Teach psm about O_ASYNC
This makes Xorg happy if you aren't using moused.

MFC after:	3 days
2009-03-16 08:21:51 +00:00
Robert Noland
85c5cd4b94 Get rid of any remaining PZERO flags in mtx_sleep()
Also, clean up some ifdef mess while I'm here.

MFC after:	3 days
2009-03-16 08:19:11 +00:00
Robert Noland
96deaed545 Fix R600 writeback across suspend/resume.
This is likely a NOOP for us, since I haven't ported the suspend/resume
code yet.

MFC after:	3 days
2009-03-16 08:15:35 +00:00
Dmitry Chagin
731aded86d Sort include files in the alphabetical order.
Approved by:	kib (mentor)
MFC after:	2 weeks
2009-03-16 05:39:37 +00:00
Sean Farley
c035ab1c6f Add the SIOCSIFMTU ioctl handling directly to tap(4) permitting it to
have its MTU set higher than 1500 (ETHERMTU).  Its new limit is now
65535 as enforced by ifhwioctl() in if.c

This allows a tap(4) device to be added to a bridge, which requires all
interface members to have the same MTU, with an interface configured for
jumbo frames.  QEMU may now connect to a network via tap(4) without
requiring the real interface to have its MTU set to 1500 or lower.

Reviewed by:	rpaulo, bms
MFC after:	1 week
2009-03-16 03:11:02 +00:00
Warner Losh
badf7d2584 Restore missing OSREL definition that accidetnally dropped from an
earlier version of this patch.
2009-03-15 23:52:13 +00:00
Jamie Gritton
a9948a71cf Default to AF_LOCAL instead of AF_INET sockets for non-family-specific
operations.  This allows the query operations to work in non-IPv4 jails,
and will be necessary in a future of possible non-INET networking.

Approved by:	bz (mentor)
2009-03-15 22:33:18 +00:00
Robert Watson
3cb73e3d8b Teach the loopback interface about checksum generation and validation
avoidance:

- Enable setting the RXCSUM and TXCSUM flags for loopback interfaces;
  set both by default.
- When RXCSUM is set, flag packets sent over the loopback interface as
  having checked and valid IP, UDP, TCP checksums so that higher
  protocol layers won't check them.
- Always clear CSUM_{IP,UDP_TCP} checksum required flags on transmit,
  as they will have gotten there as a result of TXCSUM being set.

This is done only for packets explicitly sent over the loopback, not
simulated loopback via if_simloop() due to !SIMPLEX interfaces, etc.

Note that enabling TXCSUM but not RXCSUM will lead to unhappiness, as
checksums won't be generated but will be validated.

Kris reports that this leads to significant performance improvements
in loopback benchmarking with TCP and UDP for throughput:

	RXCSUM 	RXCSUM+TXCSUM
TCP	15%	37%
UDP	10%	74%

Update man page.

Reviewed by:	sam
Tested by:	kris
MFC after:	1 week
2009-03-15 20:17:44 +00:00
Dmitry Chagin
b41a7787e1 Ignore FUTEX_FD op, as it is done by linux.
Approved by:	kib (mentor)
MFC after:	2 weeks
2009-03-15 19:38:34 +00:00
Dmitry Chagin
3b8cbbded3 Include linux_futex.h before linux_emul.h
Approved by:	kib (mentor)
MFC after:	6 days
2009-03-15 19:16:12 +00:00
Robert Watson
1df143759b Mention specifically in UPDATING that non-MPSAFE device drivers are no
longer supported.
2009-03-15 16:12:50 +00:00
Robert Watson
3871df9ea1 Bump __FreeBSD_version for the removal of IFF_NEEDSGIANT; network
device drivers that require Giant to be held over calls to the ifnet
interface are no longer supported in the FreeBSD 8.x kernel.
2009-03-15 16:10:25 +00:00
Robert Watson
e5adda3d51 Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced
in FreeBSD 5.x to allow network device drivers to run with Giant
despite the network stack being Giant-free.  This significantly
simplifies calls into ioctl() on network interfaces, especially
in the multicast code, as well as eliminates deferred invocation
of interface if_start routines.

Disable the build on device drivers still depending on
IFF_NEEDSGIANT as they no longer compile.  They will be removed
in a few weeks if they haven't been made MPSAFE in that time.
Disabled drivers:

        if_ar
        if_axe
        if_aue
        if_cdce
        if_cue
        if_kue
        if_ray
        if_rue
        if_rum
        if_sr
        if_udav
        if_ural
        if_zyd

Drivers that were already disabled because of tty changes:

        if_ppp
        if_sl

Discussed on:	arch@
2009-03-15 14:21:05 +00:00
Gabor Kovesdan
8a6a076cb4 - Create the buildworld object directories with mtree instead of various
mkdir calls
- Remove the ugly workaroung from libc NLS, which was to create some of
  these directories
2009-03-15 13:14:06 +00:00
Robert Watson
ad71fe3c35 Correct a number of evolved problems with inp_vflag and inp_flags:
certain flags that should have been in inp_flags ended up in inp_vflag,
meaning that they were inconsistently locked, and in one case,
interpreted.  Move the following flags from inp_vflag to gaps in the
inp_flags space (and clean up the inp_flags constants to make gaps
more obvious to future takers):

  INP_TIMEWAIT
  INP_SOCKREF
  INP_ONESBCAST
  INP_DROPPED

Some aspects of this change have no effect on kernel ABI at all, as these
are UDP/TCP/IP-internal uses; however, netstat and sockstat detect
INP_TIMEWAIT when listing TCP sockets, so any MFC will need to take this
into account.

MFC after:      1 week (or after dependencies are MFC'd)
Reviewed by:    bz
2009-03-15 09:58:31 +00:00
Jeff Roberson
1723a06485 - Wrap lock profiling state variables in #ifdef LOCK_PROFILING blocks. 2009-03-15 08:03:54 +00:00
Jeff Roberson
2e6b8de462 - Implement a new mechanism for resetting lock profiling. We now
guarantee that all cpus have acknowledged the cleared enable int by
   scheduling the resetting thread on each cpu in succession.  Since all
   lock profiling happens within a critical section this guarantees that
   all cpus have left lock profiling before we clear the datastructures.
 - Assert that the per-thread queue of locks lock profiling is aware of
   is clear on thread exit.  There were several cases where this was not
   true that slows lock profiling and leaks information.
 - Remove all objects from all lists before clearing any per-cpu
   information in reset.  Lock profiling objects can migrate between
   per-cpu caches and previously these migrated objects could be zero'd
   before they'd been removed

Discussed with:	attilio
Sponsored by:	Nokia
2009-03-15 06:41:47 +00:00
Warner Losh
9dffe835eb Don't adjust ranges at all for subtractive bridges. The simple-minded
stuff we're doing is too simple-minded, so back it out for now.
2009-03-15 06:40:57 +00:00
Warner Losh
d557a26b80 Generalize the workaround for the Hitachi HT-4840-11. The Contec
C-NET(PC) has a cfe at location 1 that has both an odd irq mask (it
matches pc98 machines, so maybe it was a flag for pc98 operation) as
well as a memory map.  Since this driver doesn't know how to cope, we
start with cfe2, which is purely an I/O space mapped and that seems to
make it work.  I say 'seems' here, because the card I have doesn't
seem to have the right dongle for full testing...
2009-03-15 02:31:34 +00:00
Sam Leffler
eb239d2abd no need to for gnu89 any more 2009-03-15 01:39:16 +00:00
Sam Leffler
1e0f47c327 remove gcc-ism; tsinfo isn't used anyway 2009-03-15 01:38:37 +00:00
Randall Stewart
49633f4b36 Opps.. I missed a file on the commit :-) 2009-03-14 23:13:16 +00:00
David Schultz
14b55f2f95 Fix build breakage due to the interplay between r189801 and r189824.
In particular, vendor sources that aren't ready for gnu99 should
still be compiled with gnu89. (Before r189824, these would have
generated warnings if you tried to compile them in gnu99 mode,
but the warnings went unheeded due to -Wno-error.)
2009-03-14 22:50:03 +00:00
Pawel Jakub Dawidek
a71f46d368 Oops. Correct comment in the LICENSE file. 2009-03-14 21:59:12 +00:00
Pawel Jakub Dawidek
2f9e552de1 Regression tests for mac_portacl(4). 2009-03-14 21:54:19 +00:00
Pawel Jakub Dawidek
a3ce3b6d35 - Correct logic in if statement - we want to allocate temporary buffer
when someone is passing new rules, not when he only want to read them.
  Because of this bug, even if the given rules were incorrect, they
  ended up in rule_string.
- Add missing protection for rule_string when coping it.

Reviewed by:	rwatson
MFC after:	1 week
2009-03-14 20:40:06 +00:00
David Schultz
b3c11b5b91 Namespace: Defining htonl() and friends here instead of arpa/inet.h is
a BSD extension.
2009-03-14 20:16:54 +00:00
David Schultz
48a3f7d9ae Fix the visibility of several prototypes. Also move pthread_kill() and
pthread_sigmask() to signal.h. In principle, this shouldn't break anything,
since they're already in signal.h on other systems, and the FreeBSD
manpage says that both pthread.h and signal.h need to be included to
get these functions.

Add a hack to declare pthread_t in the P1003.1-2008 namespace
in signal.h.
2009-03-14 20:10:14 +00:00
David Schultz
ea3186bda4 Hide dbopen() in the POSIX namespace, and use standard type names
throughout so that this compiles in strict POSIX mode.
2009-03-14 20:05:27 +00:00
David Schultz
1a4cc57833 Hide numerous BSD extensions in the POSIX namespace. 2009-03-14 20:04:28 +00:00
David Schultz
fb3fd8c6e9 Bump __FreeBSD_version to 800071 for gcc patch to add support for C99
inline functions in c99 and gnu99 mode.
2009-03-14 19:44:13 +00:00
David Schultz
d5ed956300 Make gcc use C99 inline semantics in c99 and gnu99 mode. This was the
original intent, but the functionality wasn't implemented until after
gcc 4.2 was released. However, if you compiled a program that would
behave differently before and after this change, gcc 4.2 would have
warned you; hence, everything currently in the base system is
unaffected by this change.  This patch also adds additional warnings
about certain inline function-related bogosity, e.g., using a
static non-const local variable in an inline function.

These changes were merged from a snapshot of gcc mainline from March
2007, prior to the GPLv3 switch. I then ran the regression test suite
from a more recent gcc snapshot and fixed the important bugs it found.
I also squelched the following warning unless -pedantic is specified:

    foo is static but used in inline function bar which is not static

This is consistent with LLVM's behavior, but not consistent with gcc 4.3.

Reviewed by:	arch@
2009-03-14 19:36:13 +00:00
David Schultz
34d3ac5921 Namespace: aio_waitcomplete() is a BSD extension.
Also, don't pollute the namespace by including <sys/time.h>.
2009-03-14 19:17:00 +00:00
David Schultz
df41066d71 Namespace: adjtime(), futimes(), futimesat(), lutimes(), and settimeofday()
are BSD extensions.

Also include <sys/select.h> in user code, since this header is
also supposed to define most of the symbols there.
2009-03-14 19:15:13 +00:00
David Schultz
b38a55d49d Namespace: abort2() is a BSD extension. 2009-03-14 19:13:30 +00:00
David Schultz
32f1a59174 Namespace: endpwent, getpwent, and setpwent are XSI extensions. 2009-03-14 19:13:01 +00:00
David Schultz
44bf9512a8 Namespace: dprintf() and getline() are in P1003.1-2008. 2009-03-14 19:12:11 +00:00
David Schultz
cc4603df21 Various namespace cleanups, including exposing fchmod() and fchmodat()
in the POSIX namespace, and hiding eaccess() and setproctitle().
Also move mknodat() from unistd.h to sys/stat.h where it belongs.
The *at() syscalls are only in CURRENT, so this shouldn't cause
problems.
2009-03-14 19:11:08 +00:00
David Schultz
f4ab27b927 Namespace: preadv() and pwritev() are extensions. 2009-03-14 19:07:58 +00:00
David Schultz
ce76f2cfa7 Namespace: vsyslog() is a BSD extension. 2009-03-14 19:07:25 +00:00
David Schultz
7e005f0136 Namespace: semsys() and shmsys() aren't standard. 2009-03-14 19:06:52 +00:00
David Schultz
472f7812a2 Use namespace visibility macros instead of checking for _POSIX_SOURCE. 2009-03-14 19:06:07 +00:00