63226 Commits

Author SHA1 Message Date
Kip Macy
ba68b814cc suck in more of busdma to enable more efficient mappings
kill redundant INVARIANTS check
2007-04-15 05:46:34 +00:00
Kip Macy
d43f50b93a Add sysctl for disabling/enabling mbuf chain collapsing
remove map creation before calling bus_dmamap_load_mvec_sg
2007-04-15 05:45:10 +00:00
Kip Macy
52c81add3c Implement ZERO_COPY_SOCKETS check in a way that doesn't make LINT unhappy 2007-04-15 04:55:39 +00:00
Pawel Jakub Dawidek
87e89536f1 Fix RAID-Z resilvering.
Obtained from:	OpenSolaris
2007-04-14 20:50:14 +00:00
Kip Macy
51580731ae Add support for mbuf iovec in the TX path 2007-04-14 20:40:22 +00:00
Kip Macy
642046797b add reference count pointer to mbuf iovec
implement robust version of m_collapse
add support for sf_buf
add fix for m_iovappend
add calls to m_sanity under INVARIANTS
fix m_freem_vec to correctly travese the mbuf iovec chain
2007-04-14 20:38:38 +00:00
Kip Macy
21c5f3f383 hide static declaration
remove extra white space
2007-04-14 20:31:05 +00:00
Kip Macy
21ee3e7aff remove now invalid check from m_sanity
panic on m_sanity check failure with INVARIANTS
2007-04-14 20:19:16 +00:00
Kip Macy
f8bbd17f06 Add option for disabling allocation from the packet zone 2007-04-14 20:16:03 +00:00
Kip Macy
38073c4181 pad out m_hdr to make pkthdr word-aligned
shuffle pkthdr.len so that pkthdr.header is aligned without compiler added padding

Reviewed by: rwatson, andre, sam
2007-04-14 19:42:20 +00:00
Max Laier
d0cf96b407 Fix a typeo - unbreak the build. 2007-04-14 18:27:34 +00:00
Maxim Konovalov
461f64fe8c o Add bsm and security to a list of cscope dirs. 2007-04-14 16:29:15 +00:00
Pawel Jakub Dawidek
d48078479c MFp4: Hmm, it seems to work now. 2007-04-14 15:01:50 +00:00
Dag-Erling Smørgrav
f61bc4ea5e Further pseudofs improvements:
The pfs_info mutex is only needed to lock pi_unrhdr.  Everything else
in struct pfs_info is modified only while Giant is held (during
vfs_init() / vfs_uninit()); add assertions to that effect.

Simplify pfs_destroy somewhat.

Remove superfluous arguments from pfs_fileno_{alloc,free}(), and the
assertions which were added in the previous commit to ensure they were
consistent.

Assert that Giant is held while the vnode cache is initialized and
destroyed.  Also assert that the cache is empty when it is destroyed.

Rename the vnode cache mutex for consistency.

Fix a long-standing bug in pfs_getattr(): it would uncritically return
the node's pn_fileno as st_ino.  This would result in st_ino being 0
if the node had not previously been visited by readdir(), and also in
an incorrect st_ino for process directories and any files contained
therein.  Correct this by abstracting the fileno manipulations
previously done in pfs_readdir() into a new function, pfs_fileno(),
which is used by both pfs_getattr() and pfs_readdir().
2007-04-14 14:08:30 +00:00
Pawel Jakub Dawidek
8aff52ca4e MFp4: Use max_ncpus, which is used in other places in the code. 2007-04-14 12:33:47 +00:00
Pawel Jakub Dawidek
8870baf005 MFp4: Add more debug, so we can see if zpool.cache was loaded or why it
wasn't loaded.
2007-04-14 12:23:03 +00:00
Pawel Jakub Dawidek
dbd490e0e2 MFp4: Allow to tune vfs.zfs.debug from loader.conf. 2007-04-14 12:21:06 +00:00
Pawel Jakub Dawidek
c98fbf0418 MFp4: - Allow to tune number of spa_zio_* threads.
- Reduce default number of spa_zio_* threads to N*spa_zio_issue
	  plus N*spa_zio_intr threads per ZIO type, where N is the number
	  of CPUs.
	- Put ZIO type number in thread's name.
2007-04-14 12:20:06 +00:00
Robert Watson
d72a615878 Some Linux applications (ping) pass a non-NULL msg_control argument to
sendmsg() while using a 0-length msg_controllen.  This isn't allowed in
the FreeBSD system call ABI, so detect this case and set msg_control to
NULL.  This allows Linux ping to work.

Submitted by:	rdivacky
2007-04-14 10:35:09 +00:00
Randall Stewart
c105859eee - fix source address selection when picking an acceptable address
- name change of prefered -> preferred
- CMT fast recover code added.
- Comment fixes in CMT.
- We were not giving a reason of cant_start_asoc per socket api
  if we failed to get init/or/cookie to bring up an assoc. Change
  so we don't just give a generic "comm lost" but look at actual
  states of dying assoc.
- change "crc32" arguments to "crc32c" to silence strict/noisy
  compiler warnings when crc32() is also declared
- A few minor tweaks to get the portable stuff truely portable
  for sctp6_usrreq.c :-D
- one-2-one style vrf match problem.
- window recovery would leave chks marked for retran
  during window probes on the sent queue. This would then
  cause an out-of-order problem and assure that the flight
  size "problem" would occur.
- Solves a flight size logging issue that caused rwnd
  overruns, flight size off as well as false retransmissions.g
- Macroize the up and down of flight size.
- Fix a ECNE bug in its counting.
- The strict_sacks options was causing aborts when window probing
  was active, fix to make strict sacks a bit smarter about what
  the next unsent TSN is.
- Fixes a one-2-one wakeup bug found by Martin Kulas.
- If-defed out form, Andre's copy routines pending his
  commit of at least m_last().. need to adjust for 6.2 as
  well.. since m_last won't exist.
Reviewed by:	gnn
2007-04-14 09:44:09 +00:00
Bruce M Simpson
05d91e4363 In member interface detach event handler, do not attempt to free state
which has already been freed by in_ifdetach(). With this cumulative change,
the removal of a member interface will not cause a panic in pfsync(4).

Requested by:	yar
PR:		86848
2007-04-14 01:01:46 +00:00
Pawel Jakub Dawidek
24b0502ee0 Fix jails and jail-friendly file systems handling:
- We need to allow for PRIV_VFS_MOUNT_OWNER inside a jail.
- Move security checks to vfs_suser() and deny unmounting and updating
  for jailed root from different jails, etc.

OK'ed by:	rwatson
2007-04-13 23:54:22 +00:00
Pawel Jakub Dawidek
bd59d85850 Fix overflow, which was causing endless loops when 32bit machine had more
than 2GB of RAM. This was because our physmem is long and 'physmem*PAGESIZE'
can be negative for more than 2GB of memory.

Reported by:	Andrey V. Elsukov <bu7cher@yandex.ru>

It is not yet tested by Andrey, so there can be other problems, but this
was definiately a bug, so I'm committing a fix now.
2007-04-13 18:50:03 +00:00
Maxim Konovalov
b274ce9ef2 o Extend the list of supported CDMA-2000 terminals.
Submitted by:	R.Mahmatkhanov
MFC after:	10 days
2007-04-13 18:15:07 +00:00
Alan Cox
0b76504872 Eliminate the misuse of PG_FRAME to truncate a virtual address to a virtual
page boundary.

Reviewed by: ru@
2007-04-13 16:07:29 +00:00
Christian S.J. Peron
f0cbfcc468 Fix the handling of IPv6 addresses for subject and process BSM audit
tokens. Currently, we do not support the set{get}audit_addr(2) system
calls which allows processes like sshd to set extended or ip6
information for subject tokens.

The approach that was taken was to change the process audit state
slightly to use an extended terminal ID in the kernel. This allows
us to store both IPv4 IPv6 addresses. In the case that an IPv4 address
is in use, we convert the terminal ID from an struct auditinfo_addr to
a struct auditinfo.

If getaudit(2) is called when the subject is bound to an ip6 address,
we return E2BIG.

- Change the internal audit record to store an extended terminal ID
- Introduce ARG_TERMID_ADDR
- Change the kaudit <-> BSM conversion process so that we are using
  the appropriate subject token. If the address associated with the
  subject is IPv4, we use the standard subject32 token. If the subject
  has an IPv6 address associated with them, we use an extended subject32
  token.
- Fix a couple of endian issues where we do a couple of byte swaps when
  we shouldn't be. IP addresses are already in the correct byte order,
  so reading the ip6 address 4 bytes at a time and swapping them results
  in in-correct address data. It should be noted that the same issue was
  found in the openbsm library and it has been changed there too on the
  vendor branch
- Change A_GETPINFO to use the appropriate structures
- Implement A_GETPINFO_ADDR which basically does what A_GETPINFO does,
  but can also handle ip6 addresses
- Adjust get{set}audit(2) syscalls to convert the data
  auditinfo <-> auditinfo_addr
- Fully implement set{get}audit_addr(2)

NOTE: This adds the ability for processes to correctly set extended subject
information. The appropriate userspace utilities still need to be updated.

MFC after:	1 month
Reviewed by:	rwatson
Obtained from:	TrustedBSD
2007-04-13 14:55:19 +00:00
Pawel Jakub Dawidek
f0bc5ac3e1 Fix vnodes starvation caused by DNLC (ZFS name cache):
- Tune number of namecache entires better (based on desiredvnodes).
- Handle vfs_lowvnodes event by releasing requested number of name cache
  entries, but no less than 5%.

Reported by:	simokawa
2007-04-13 08:42:01 +00:00
Pawel Jakub Dawidek
6bc3ab2574 When we are running low on vnodes, there is currently no way to ask other
subsystems to release some vnodes. Implement backpressure based on
vfs_lowvnodes event (similar to vm_lowmem for memory).
2007-04-13 08:38:48 +00:00
Pawel Jakub Dawidek
6704017a15 MFp4: Synchronize with vendor (mostly 'zfs rename -r'). 2007-04-12 23:16:02 +00:00
Pawel Jakub Dawidek
1da61b3665 MFp4: Bring back comments.
Requested by:	jhb
2007-04-12 23:14:25 +00:00
Lukas Ertl
a2237c41fc -) Correct sdcount for a plex when removing or adding subdisks.
-) Set correct sizes for plexes and volumes a subdisk has been removed.

Submitted by:   Ulf Lilleengen <lulf_AT_freebsd.org>
2007-04-12 17:54:35 +00:00
Lukas Ertl
9e357b05da Avoid infinite loop if the device string given for a drive
only consists of "/".

Submitted by:  Ulf Lilleengen <lulf_AT_freebsd.org>
2007-04-12 17:40:44 +00:00
Alan Cox
1434c1f34b MFamd64
Define PGEX_RSV.
2007-04-12 17:00:56 +00:00
Ruslan Ermilov
f76f6cfd25 Fix PAE on SMP by enabling EFER_NXE on all APs.
Reported by:	kris
Diagnosed by:	alc
2007-04-12 11:05:24 +00:00
Kip Macy
aa84193acf restore sense to get_imm_packet
MFC after: 3 days
2007-04-12 04:48:54 +00:00
Kip Macy
98d6fba71d switch over to per-txq dma tag to facilitate parallelism on TX
MFC after: 3 days
2007-04-12 04:31:44 +00:00
Kip Macy
dd782506d8 explicitly check TSO flag
don't clear and then set M_PKTHDR, m_gethdr sets it correctly
improve error handling on m_gethdr failure

MFC after: 3 days
2007-04-12 03:33:30 +00:00
Kip Macy
23ed7b513f Add ETHER_HDR_LEN to hardware accepted mtu
MFC after: 3 days
2007-04-12 03:07:24 +00:00
Andrew Thompson
575156b607 Fix a case where the multicast addresses were not removed from some ports. The
first port to be removed from the trunk would free the multicast list so
subsequent removed ports didnt have their multicast addresses removed.
2007-04-12 01:58:57 +00:00
Andre Oppermann
dfd389bf64 Add m_last() inline function. 2007-04-11 23:13:12 +00:00
Dag-Erling Smørgrav
15bad11fdb Add a flag to struct pfs_vdata to mark the vnode as dead (e.g. process-
specific nodes when the process exits)

Move the vnode-cache-walking loop which was duplicated in pfs_exit() and
pfs_disable() into its own function, pfs_purge(), which looks for vnodes
marked as dead and / or belonging to the specified pfs_node and reclaims
them.  Note that this loop is still extremely inefficient.

Add a comment in pfs_vncache_alloc() explaining why we have to purge the
vnode from the vnode cache before returning, in case anyone should be
tempted to remove the call to cache_purge().

Move the special handling for pfstype_root nodes into pfs_fileno_alloc()
and pfs_fileno_free() (the root node's fileno must always be 2).  This
also fixes a bug where pfs_fileno_free() would reclaim the root node's
fileno, triggering a panic in the unr code, as that fileno was never
allocated from unr to begin with.

When destroying a pfs_node, release its fileno and purge it from the
vnode cache.  I wish we could put off the call to pfs_purge() until
after the entire tree had been destroyed, but then we'd have vnodes
referencing freed pfs nodes.  This probably doesn't matter while we're
still under Giant, but might become an issue later.

When destroying a pseudofs instance, destroy the tree before tearing
down the fileno allocator.

In pfs_mount(), acquire the mountpoint interlock when required.

MFC after:	3 weeks
2007-04-11 22:40:57 +00:00
Robert Watson
949da0d8f8 Remove obsolete comment about privileges: SUSER_ALLOWJAIL is no longer set
in this code.
2007-04-11 16:31:02 +00:00
Robert Watson
94b94b2b49 Remove now-obsolete comment regarding mqueue privileges in jail. 2007-04-11 16:22:59 +00:00
Ruslan Ermilov
7480de4305 Make "struct tcp_timer" visible only to the kernel, and unbreak world. 2007-04-11 14:08:42 +00:00
John Baldwin
e403490aa6 Fix m_freem_vec() to actually traverse the mbuf chain. This avoids
double free's and an infinite loop.

CID:		1834
Found by:	Coverity Prevent (tm)
2007-04-11 13:47:24 +00:00
John Baldwin
4796ce4956 Group the loop to acquire/release Giant with the WITNESS_SAVE/RESTORE under
a single conditional.  The two operations are linked, but since the link
is not very direct, Coverity can't see it.  Humans might also miss the
link as well.  So, this isn't fixing any actual bugs, just improving
readability.

CID:		1787 (likely others as well)
Found by:	Coverity Prevent (tm)
2007-04-11 13:44:55 +00:00
Ruslan Ermilov
9fd6e3d4a4 This commit was generated by cvs2svn to compensate for changes in r168616,
which included commits to RCS files with non-trunk default branches.
2007-04-11 11:09:18 +00:00
Ruslan Ermilov
1859f337c4 Unbreak world build. 2007-04-11 11:09:18 +00:00
Andre Oppermann
b8152ba793 Change the TCP timer system from using the callout system five times
directly to a merged model where only one callout, the next to fire,
is registered.

Instead of callout_reset(9) and callout_stop(9) the new function
tcp_timer_activate() is used which then internally manages the callout.

The single new callout is a mutex callout on inpcb simplifying the
locking a bit.

tcp_timer() is the called function which handles all race conditions
in one place and then dispatches the individual timer functions.

Reviewed by:	rwatson (earlier version)
2007-04-11 09:45:16 +00:00
Nate Lawson
1b96d500fb Put some overly verbose prints under bootverbose. This is on the vendor
branch but we need to work out a different interface with the vendor.
2007-04-11 02:03:36 +00:00