62684 Commits

Author SHA1 Message Date
pjd
23ac3fc28a Change:
"... try to use VADMIN in preference to VADMIN ..."
To:
"... try to use VADMIN in preference to VWRITE ..."
2007-03-01 21:44:08 +00:00
sos
83b66ad130 Add support for the 3rd (PATA) channel on the VIA 6421 chip.
HW donated by: Fabian Peters
2007-03-01 21:18:27 +00:00
pjd
e544923453 Rename PRIV_VFS_CLEARSUGID to PRIV_VFS_RETAINSUGID, which seems to better
describe the privilege.

OK'ed by:	rwatson
2007-03-01 20:47:42 +00:00
pjd
9558665f1e Avoid checking for privileges if there is no need to.
Discussed with:	rwatson
2007-03-01 20:38:24 +00:00
bms
62de975b4d Do not dispatch SIGPIPE from the generic write path for a socket; with
this patch the code behaves according to the comment on the line above.

Without this patch, a socket could cause SIGPIPE to be delivered to its
process, once with SO_NOSIGPIPE set, and twice without.

With this patch, the kernel now passes the sigpipe regression test.

Tested by:	Anton Yuzhaninov
MFC after:	1 week
2007-03-01 19:20:25 +00:00
bms
36e1a8127c Introduce a new mbuf flag, M_PROMISC.
An mbuf packet chain with the M_PROMISC flag set contains a unicast packet
received by the link layer, which does not correspond to any configured
link layer address in the local system.

It is copied when copying m_pkthdr. It is not cleared when crossing layers.
As such, it is defined to have a flag value which is outside of the
M_PROTO* range, like M_VLANTAG has.

Reviewed by:	andre
Obtained from:	NetBSD
2007-03-01 14:38:08 +00:00
bms
3562ed06e7 Fix undirected broadcast sends for the case where SO_DONTROUTE has also
been set at the socket layer, in our somewhat convoluted IPv4 source
selection logic in ip_output().

IP_ONESBCAST is actually a special case of SO_DONTROUTE, as 255.255.255.255
must always be delivered on a local link with a TTL of 1.

If IP_ONESBCAST has been set at the socket layer, also perform destination
interface lookup for point-to-point interfaces based on the destination
address of the link; previously it was not possible to use the option with
such interfaces; also, the destination/broadcast address fields map to the
same field within struct ifnet, which doesn't help matters.

One more valid fix going forward for these issues is to treat 255.255.255.255
as a destination in its own right in the forwarding trie. Other
implementations do this. It fits with the use of multiple paths, though
it then becomes necessary to specify interface preference.
This hack will eventually go away when that comes to pass.

Reviewed by:	andre
MFC after:	1 week
2007-03-01 13:29:30 +00:00
andre
ccd57f9789 Prevent TSO mbuf chain from overflowing a few bytes by subtracting the
TCP options size before the TSO total length calculation.

Bug found by:	kmacy
2007-03-01 13:12:09 +00:00
kmacy
6993a10996 Evidently I've overestimated gcc's ability to peek inside inline functions
and optimize away unused stack values. The 48 bytes that the lock_profile_object
adds to the stack evidently have a measurable performance impact on certain workloads.
2007-03-01 09:35:48 +00:00
piso
4ead5c57e5 Update bus_setup_intr().
Pointed out by: Krassimir Slavchev
2007-03-01 09:10:55 +00:00
rwatson
c077671dda Remove two simultaneous acquisitions of multiple unpcb locks from
uipc_send in cases where only a global read lock is held by breaking
them out and avoiding the unpcb lock acquire in the common case.  This
avoids deadlocks which manifested with X11, and should also marginally
further improve performance.

Reported by:	sepotvin, brooks
2007-03-01 09:00:42 +00:00
bms
f9e43f8ad0 Prepare for 802.1p:
Add macro EVL_APPLY_VLID() which may be used to apply an 802.1q VLAN ID
 to the M_VLANTAG field in an mbuf packet header non-destructively.
 This will be used by net80211 to begin with.

 Add macro EVL_APPLY_PRI() which may be used to apply an 802.1p priority
 class to the M_VLANTAG field in an mbuf packet header non-destructively.

 Add other macros for manipulating tags and the CFI bit.

Submitted by:	Boris Kovalenko (EVL_CFIOFTAG(), EVL_MAKETAG())
2007-02-28 22:05:30 +00:00
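The non-destructive tag updates described in f9e43f8ad0 can be sketched as plain bit manipulation. This is a minimal illustration, not the committed macros: the names `TAG_VID_MASK`, `TAG_PRI_SHIFT`, `tag_apply_vlid()` and `tag_apply_pri()` are hypothetical, and the only assumption is the standard 802.1Q tag layout (3 priority bits, 1 CFI bit, 12 VLAN ID bits).

```c
#include <stdint.h>

/*
 * Hypothetical names for illustration only. Layout of the 16-bit
 * 802.1Q tag: bits 15-13 priority, bit 12 CFI, bits 11-0 VLAN ID.
 */
#define TAG_VID_MASK	0x0FFFu
#define TAG_PRI_MASK	0x7u
#define TAG_PRI_SHIFT	13

/* Replace only the VLAN ID bits; priority and CFI are preserved. */
static inline uint16_t
tag_apply_vlid(uint16_t tag, uint16_t vlid)
{
	return (uint16_t)((tag & ~TAG_VID_MASK) | (vlid & TAG_VID_MASK));
}

/* Replace only the priority bits; CFI and VLAN ID are preserved. */
static inline uint16_t
tag_apply_pri(uint16_t tag, uint16_t pri)
{
	uint16_t keep = (uint16_t)(tag & ~(TAG_PRI_MASK << TAG_PRI_SHIFT));

	return (uint16_t)(keep | ((pri & TAG_PRI_MASK) << TAG_PRI_SHIFT));
}
```

Masking before OR-ing is what makes each update non-destructive: a caller can set the priority first and the VLAN ID later, or the other way round, without one overwriting the other.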
bms
cbe2d2bcd1 Add comments about common idioms for cleanup pass at a later date.
2007-02-28 21:58:37 +00:00
mohans
2010e1d527 In the SYN_SENT case, initialize the snd_wnd before the call to tcp_mss().
The TCP hostcache logic in tcp_mss() depends on the snd_wnd being initialized.
2007-02-28 20:48:00 +00:00
bms
ad7a801a07 Remove code which would never be used, vis-a-vis Quality-of-Service;
the token bucket filter got killed in netinet, so it gets killed here
too. Correct comments.
2007-02-28 20:32:25 +00:00
bms
d055e36582 Add a comment about a struct which needs to be global.
Remove an unused global variable.
Staticize variables which do not need to be global.
2007-02-28 20:29:20 +00:00
bms
84976a55f2 Style: Move declaration of subsystem mutex to where other
mutexes are in this file, and use macros for dealing with it.
2007-02-28 20:02:24 +00:00
thomas
eea9c683fe Minor reformatting.
2007-02-28 16:51:52 +00:00
glebius
5c82cce393 Add EHOSTDOWN and ENETUNREACH to the list of soft errors that shouldn't
be returned up to the caller.

PR:		100172
Submitted by:	"Andrew - Supernews" <andrew supernews.net>
Reviewed by:	rwatson, bms
2007-02-28 12:47:49 +00:00
glebius
481c8d8b0b Rework the code that handles errors from ip_output() to make it more
readable:
- Merge two embedded if() into one.
- Introduce switch() block to handle different kinds of errors.

Reviewed by:	rwatson, bms
2007-02-28 12:41:49 +00:00
ru
c9af927098 Revert previous change and take back a pointy hat.
2007-02-28 09:04:46 +00:00
rwatson
9408f478d2 Lock unp2 after checking for a non-NULL unp2 pointer in uipc_send() on
datagram UNIX domain sockets, not before.
2007-02-28 08:08:50 +00:00
ru
54c3432baa Fix panic on boot caused by setting up a NULL interrupt handler.
Submitted by:	Goran Gajic
Pointy hat to:	piso
2007-02-28 05:29:23 +00:00
pjd
a937285704 Add a comment for PRIV_NET_SETLLADDR.
OK'ed by:	rwatson
2007-02-27 23:38:58 +00:00
imp
b8fed42140 Some USB mass storage devices return the number of sectors in response
to a READ_CAPACITY request rather than the maximum sector (off by one
problem).  This causes a huge cascade of errors as the geom tasting
code tries to read the last sector (which isn't really there in the
face of this error).  Automated tools that manipulate disk labels and
such also have issues.

Create a new quirk READ_CAPACITY_OFFBY1 and add a quirk for the
SanDISK ImageMate that I have that suffers from this problem (the
SDDR-31).  It intercepts the READ_CAPACITY response and adjusts it
from number of sectors to max sector for devices with this quirk.

Reading the Linux source suggests that there are a host of
other devices with this issue, including iPods and some popular
cameras.  I've not added quirks for them, since I don't have the
devices in front of me to test.
2007-02-27 22:33:50 +00:00
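The adjustment described in b8fed42140 amounts to subtracting one from the reported value on affected devices. A minimal userland sketch follows, assuming a quirk bitmask; the flag value and the function name `fixup_last_sector()` are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical flag value; the real quirk lives in the USB storage driver. */
#define QUIRK_READ_CAPACITY_OFFBY1	0x01u

/*
 * READ_CAPACITY is expected to return the address of the last sector.
 * Devices with the quirk return the sector count instead, which is one
 * too large, so convert the count back to the last valid sector.
 */
static uint32_t
fixup_last_sector(uint32_t reported, unsigned int quirks)
{
	if ((quirks & QUIRK_READ_CAPACITY_OFFBY1) != 0 && reported > 0)
		return (reported - 1);
	return (reported);
}
```

Keying the correction on a per-device quirk rather than applying it globally keeps well-behaved devices from losing their real last sector.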
imp
0bd2af4cfe Entries sorted by id number, not name
2007-02-27 22:27:53 +00:00
jhb
54e4ea54b6 Use pause() in vm_object_deallocate() to yield the CPU to the lock holder
rather than a tsleep() on &proc0.  The only wakeup on &proc0 is intended
to awaken the swapper, not random threads blocked in
vm_object_deallocate().
2007-02-27 19:40:26 +00:00
jhb
b7c2a59c51 Print tid's rather than thread pointers in KTR_PROC traces.
2007-02-27 18:46:07 +00:00
jhb
19438f4ae6 Use taskqueue_drain() to wait for any pending tasks to complete rather
than just pausing for a second.
2007-02-27 18:45:37 +00:00
jhb
a9d161a0a7 Use pause() instead of tsleep()'s on the softc pointer that have no
corresponding wakeups.  Also, at least some of the comments nearby indicate
that these are fixed-length I/O sleeps.
2007-02-27 17:27:23 +00:00
jhb
9081d44243 Use pause() rather than tsleep() on stack variables and function pointers.
2007-02-27 17:23:29 +00:00
jhb
3a7dab48bd Use pause() rather than tsleep() on explicit global dummy variables.
2007-02-27 17:22:30 +00:00
jhb
ba5df1ca42 Use pause() rather than using tsleep() on a dummy variable.
2007-02-27 17:19:33 +00:00
jhb
cedb987542 Always protect the kthread flags with the lock and close a race with
module unload and kthread_exit().

MFC after:	3 days
2007-02-27 17:16:52 +00:00
jhb
e946f637f6 Use tsleep() rather than msleep() with a NULL mtx.
2007-02-27 17:15:39 +00:00
piso
88a4a229c2 Do not execute filter only handlers in ithread_execute_handlers():
this fixes the panics when filter only and ithread only handlers were
sharing the same irq.
2007-02-27 17:09:20 +00:00
flz
ea12bd93e4 Fix obvious typo (use long name if short name isn't provided).
Reviewed by:	sam
MFC after:	3 days
2007-02-27 16:52:27 +00:00
piso
44fdc89cd9 Add proper return codes to zs_intr() filter, and fix the zs_intr()
prototype accordingly.
2007-02-27 15:31:11 +00:00
bms
833c0dc8bd Add INADDR_ALLRPTS_GROUP define for 224.0.0.22 for future IGMPv3 support.
Obtained from:	OpenSolaris
2007-02-27 14:45:37 +00:00
piso
c0f6ce7abf Correct return code (int) for at91_rtc_intr() prototype.
Approved by: cognet
2007-02-27 13:39:34 +00:00
des
b71867b10d Add GEOM_MULTIPATH so LINT will build.
Pointy hat to:	mjacob
2007-02-27 12:05:25 +00:00
thomas
df5a88905d (cam_rescan): Do not reference ccb->ccb_h.path in CAM_DEBUG call before
it is initialized; use path instead.

This change fixes a panic when using atapicam in conjunction with CAMDEBUG,
which has been described under kern/103602.

Thanks to Josh Carroll <josh.carroll@gmail.com> for providing the traces
that allowed identifying this problem.

PR:		kern/103602
MFC after:	1 week
2007-02-27 09:00:51 +00:00
mckusick
1aafc05142 KASSERT fails when the condition is false, not when it is true.
2007-02-27 07:34:28 +00:00
kmacy
b7672bad26 Further improvements to LOCK_PROFILING:
- Fix missing initialization in kern_rwlock.c causing bogus times to be collected
- Move updates to the lock hash to after the lock is released for spin mutexes,
  sleep mutexes, and sx locks
- Add new kernel build option LOCK_PROFILE_FAST - only update lock profiling
  statistics when an acquisition is contended. This reduces the overhead of
  LOCK_PROFILING to a 20%-25% increase in system time, which on
  "make -j8 kernel-toolchain" on a dual Woodcrest is unmeasurable in terms
  of wall-clock time. Contrast this with enabling lock profiling without
  LOCK_PROFILE_FAST, where I see a 5x-6x slowdown in wall-clock time.
2007-02-27 06:42:05 +00:00
mjacob
05b92097cb First cut at GEOM based multipath. This is an active/passive{/passive...}
arrangement that has no intrinsic internal knowledge of whether devices
it is given are truly multipath devices. As such, this is a simplistic
approach, but still a useful one.

The basic approach is to (at present- this will change soon) use camcontrol
to find likely identical devices and label the trailing sector of the
first one. This label contains both a full UUID and a name. The name is
what is presented in /dev/multipath, but the UUID is used as a true
distinguisher at g_taste time, thus making sure we don't have chaos
on a shared SAN where everyone names their data multipath as "Fred".

The first of N identical devices (and N *may* be 1!) becomes the active
path until a BIO request is failed with EIO or ENXIO. When this occurs,
the active disk is ripped away and the next in a list is picked to
(retry and) continue with.

During g_taste events new disks that meet the match criteria for existing
multipath geoms get added to the tail end of the list.

Thus, this active/passive setup actually does work for devices which
go away and come back, as do (now) mpt(4) and isp(4) SAN based disks.

There is still a lot to do to improve this- like about 5 of the 12
recommendations I've received about it, but it's been functional enough
for a while that it deserves a broader test base.

Reviewed by: pjd
Sponsored by: IronPort Systems
MFC: 2 months
2007-02-27 04:01:58 +00:00
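The active/passive failover described in 05b92097cb can be sketched in userland as an ordered list of paths. This is an illustration under stated assumptions, not the GEOM code: `struct mp_set`, `mp_add()`, `mp_submit()` and the stand-in `fake_io()` are all hypothetical names, and integer ids take the place of real disk consumers.

```c
#include <errno.h>
#include <stddef.h>

#define MP_MAX_PATHS	8

/* Hypothetical path set; paths[0] is always the active path. */
struct mp_set {
	int	paths[MP_MAX_PATHS];
	size_t	npaths;
};

/* Newly tasted disks that match are appended to the tail of the list. */
static int
mp_add(struct mp_set *mp, int id)
{
	if (mp->npaths == MP_MAX_PATHS)
		return (-1);
	mp->paths[mp->npaths++] = id;
	return (0);
}

/* Remove the active path and promote the next one in the list. */
static void
mp_fail_active(struct mp_set *mp)
{
	for (size_t i = 1; i < mp->npaths; i++)
		mp->paths[i - 1] = mp->paths[i];
	mp->npaths--;
}

/*
 * Issue a request on the active path. On EIO or ENXIO the active path
 * is dropped and the request is retried on the next one; any other
 * result is returned to the caller unchanged.
 */
static int
mp_submit(struct mp_set *mp, int (*do_io)(int id))
{
	while (mp->npaths > 0) {
		int error = do_io(mp->paths[0]);

		if (error != EIO && error != ENXIO)
			return (error);
		mp_fail_active(mp);
	}
	return (ENXIO);		/* no paths left */
}

/* Stand-in for a real request: path 1 is dead, every other path works. */
static int
fake_io(int id)
{
	return (id == 1 ? EIO : 0);
}
```

With paths 1 and 2 added in that order, `mp_submit(&mp, fake_io)` fails on path 1, drops it, and succeeds on path 2, which then stays active for later requests.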
jkim
2620bd06da MFP4: 115094
Linux does not check file descriptor when MAP_ANONYMOUS is set.
This should fix recent LTP test regressions.

Reported by:	Scot Hetzel (swhetzel at gmail dot com)
		netchild
2007-02-27 02:08:01 +00:00
pjd
287d98b314 Replace spaces with tabs in some places.
2007-02-27 01:48:58 +00:00
njl
7eb821d885 Rework EC I/O approach. Implement burst mode, including proper handling of
the case where it asynchronously exits burst mode on its own.  Handle different
values of hz in sleep loop.  Provide more debugging options to tune EC
behavior.  These tunables/sysctls may be temporary and are not for user
access if the EC is working properly.  Burst mode is now on by default for
testing and the poll interval has been increased from 100 to 500 us and
total timeout from 100 to 500 ms.

Hopefully this should be the first step of addressing reports of timeout
errors during battery or thermal access, especially on HP/Compaq laptops.
It is reasonably stable and should not cause a loss of functionality or
performance on systems that were previously working.  Testing shows an
increase of responsiveness by ~75% on one system.

PR:		kern/98171
2007-02-27 00:14:20 +00:00
mohans
384aeb29f6 Reap FIN_WAIT_2 connections marked SOCANTRCVMORE faster. This mitigates
potential issues where the peer does not close, potentially leaving
thousands of connections in FIN_WAIT_2. This is controlled by a new sysctl
fast_finwait2_recycle, which is disabled by default.

Reviewed by: gnn, silby.
2007-02-26 22:25:21 +00:00
jkim
2bd7382fdc Add three new ioctl(2) commands for bpf(4).
- BIOCGDIRECTION and BIOCSDIRECTION get or set the setting determining
whether incoming, outgoing, or all packets on the interface should be
returned by BPF.  Set to BPF_D_IN to see only incoming packets on the
interface.  Set to BPF_D_INOUT to see packets originating locally and
remotely on the interface.  Set to BPF_D_OUT to see only outgoing
packets on the interface.  This setting is initialized to BPF_D_INOUT
by default.  BIOCGSEESENT and BIOCSSEESENT are obsoleted by these but
kept for backward compatibility.

- BIOCFEEDBACK sets packet feedback mode.  This allows injected packets
to be fed back as input to the interface when output via the interface is
successful.  When BPF_D_INOUT direction is set, injected outgoing packet
is not returned by BPF to avoid duplication.  This flag is initialized to
zero by default.

Note that libpcap has been modified to support BPF_D_OUT direction for
pcap_setdirection(3) and PCAP_D_OUT direction is functional now.

Reviewed by:	rwatson
2007-02-26 22:24:14 +00:00