74230 Commits

Author SHA1 Message Date
Ermal Luçi
d8e86c4a5d Fix typo which has survived amazingly long!
Reviewed by:	mlaier(mentor)
Approved by:	re(kib)
2009-10-14 15:32:46 +00:00
Attilio Rao
be0ac16015 MFC r197476:
In function do_rw_wrlock, when a writer got an error and before returning,
check if there are readers blocked by us via URWLOCK_WRITE_WAITERS flag,
and resume the readers. The error must be EAGAIN; otherwise there must
be a memory problem, and nobody can rescue the buggy application.

Approved by:	re (kib), davidxu
2009-10-13 13:03:31 +00:00
Konstantin Belousov
aba70b5e59 MFC r197942:
Refine r195509, instead of checking that vnode type is VBAD, that is
set quite late in the revocation path, properly verify that vnode is
not doomed before calling VOP.

Approved by:	re (bz)
2009-10-13 09:24:51 +00:00
Pawel Jakub Dawidek
69882ff11d MFC r197898:
If provider is open for writing when we taste it, skip it for classes that
depend on on-disk metadata. This way we won't attach to providers that are used
by other classes. For example we don't want to configure partitions on da0 if
it is part of gmirror, what we really want is partitions on mirror/foo.

During regular work it works like this: if provider is open for writing a class
receives the spoiled event from GEOM and detaches, once provider is closed the
taste event is sent again and class can rediscover its metadata if it is still
there.  This doesn't work that way when new class arrives, because GEOM gives
all existing providers for it to taste, also those open for writing. Classes
have to decide on their own if they want to deal with such providers (e.g.
geom_dev) or not (classes modified by this commit).

Reported by:	des, Oliver Lehmann <lehmann@ans-netz.de>
Tested by:	des, Oliver Lehmann <lehmann@ans-netz.de>
Discussed with:	phk, marcel
Reviewed by:	marcel
Approved by:	re (kib)
2009-10-12 21:08:06 +00:00
Pawel Jakub Dawidek
1d93d2aa4f MFC r197896:
Export disk serial numbers for adaX disks.

Reviewed by:	mav
Approved by:	re (kib)
2009-10-12 21:03:07 +00:00
Pawel Jakub Dawidek
d1c95b4a34 MFC r197831,r197842,r197843,r197860,r197861:
r197831:

Fix situation where Mac OS X NFS client creates a file and when it tries
to set ownership and mode in the same setattr operation, the mode was
overwritten by secpolicy_vnode_setattr().

PR:	kern/118320
Submitted by:	Mark Thompson <info-gentoo@mark.thompson.bz>

r197842:

Fix white-spaces.

r197843:

On FreeBSD it is enough to report provider removal when orphan event is
received, we don't have to do it on every ENXIO error in I/O path.
Solaris has no GEOM so they have to handle it in a less clean way.

r197860:

File system owner is when uid matches and jail matches.

r197861:

Allow file system owner to modify system flags if securelevel permits.

Approved by:	re (kib)
2009-10-12 20:36:55 +00:00
Attilio Rao
4dc32a7398 MFC r197803, r197824, r197910:
Per their definition, atomic instructions used in conjunction with
memory barriers should also ensure that the compiler doesn't reorder paths
where they are used.  GCC, however, does that aggressively, even in
presence of volatile operands.  The most reliable way GCC offers to avoid
instruction reordering is clobbering "memory".
Not all our memory barriers, right now, clobber memory for GCC-like
compilers.
Fix these cases.

Approved by:	re (kib)
2009-10-12 16:05:31 +00:00
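Note on 4dc32a7398: the "memory" clobber mentioned in this message can be illustrated with a minimal sketch, assuming a GCC-like compiler. The helper names compiler_barrier() and publish() are hypothetical and are not the macros touched by the commit. The empty asm statement only constrains the compiler; it is not a CPU memory barrier.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the FreeBSD macro): an empty asm statement with a
 * "memory" clobber. GCC-like compilers must assume memory changed here, so
 * they do not move loads or stores across it. It emits no CPU instruction.
 */
static inline void
compiler_barrier(void)
{
	__asm__ __volatile__("" : : : "memory");
}

/* Hypothetical example: store the payload, then raise the flag, in that order. */
static inline void
publish(uint32_t *data, uint32_t *ready, uint32_t value)
{
	*data = value;
	compiler_barrier();	/* keeps the compiler from swapping the two stores */
	*ready = 1;
}
```

On architectures with weakly ordered stores, a hardware barrier instruction would still be needed in addition to the clobber.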
Attilio Rao
3f4609ac69 MFC r197643, r197735:
When releasing a read/shared lock we need to use a write memory barrier
in order to avoid, on architectures which don't have strongly ordered
writes, CPU instruction reordering.

Approved by:	re (kib)
2009-10-12 15:32:00 +00:00
Marcel Moolenaar
879632020a MFC change 197721:
Fix RTS/CTS flow control, broken by the TTY overhaul.  The new TTY
interface is fairly simple WRT dealing with flow control, but
needed 2 new RX buffer functions with "get-char-from-buf" separated
from "advance-buf-pointer" so that the pointer could be advanced
only when ttydisc_rint() succeeded.

Approved by:	re (kib)
2009-10-10 18:24:54 +00:00
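Note on 879632020a: the split between "get-char-from-buf" and "advance-buf-pointer" can be sketched as a small ring buffer with separate peek and advance steps. This is an illustration only. The names rxbuf, sink, rxbuf_peek(), rxbuf_advance() and rxbuf_drain() are hypothetical, and sink_accept() merely stands in for a consumer such as ttydisc_rint() that may refuse input.

```c
#include <stddef.h>
#include <stdint.h>

#define RXBUF_SIZE 8		/* hypothetical capacity */

struct rxbuf {
	uint8_t	data[RXBUF_SIZE];
	size_t	head;		/* index of the oldest queued byte */
	size_t	count;		/* number of queued bytes */
};

/* Hypothetical consumer standing in for the line discipline. */
struct sink {
	size_t	room;		/* bytes it will still accept */
};

/* Queue one byte; returns -1 when the buffer is full. */
static int
rxbuf_put(struct rxbuf *rb, uint8_t c)
{
	if (rb->count == RXBUF_SIZE)
		return (-1);
	rb->data[(rb->head + rb->count) % RXBUF_SIZE] = c;
	rb->count++;
	return (0);
}

/* Return the oldest byte without consuming it, or -1 if the buffer is empty. */
static int
rxbuf_peek(const struct rxbuf *rb)
{
	return (rb->count == 0 ? -1 : rb->data[rb->head]);
}

/* Consume the oldest byte; call only after the consumer accepted it. */
static void
rxbuf_advance(struct rxbuf *rb)
{
	if (rb->count == 0)
		return;
	rb->head = (rb->head + 1) % RXBUF_SIZE;
	rb->count--;
}

/* Accept a byte while there is room; returns -1 to signal back-pressure. */
static int
sink_accept(struct sink *s, uint8_t c)
{
	(void)c;
	if (s->room == 0)
		return (-1);
	s->room--;
	return (0);
}

/* Hand bytes to the sink; stop at the first refusal and leave that byte queued. */
static size_t
rxbuf_drain(struct rxbuf *rb, struct sink *s)
{
	size_t n = 0;
	int c;

	while ((c = rxbuf_peek(rb)) != -1) {
		if (sink_accept(s, (uint8_t)c) != 0)
			break;
		rxbuf_advance(rb);
		n++;
	}
	return (n);
}
```

Because a refused byte stays at the head of the buffer, nothing is lost while the receiver is flow-controlled.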
Robert Watson
3e5cbaa4c7 Merge r197814 from head to stable/8:
Remove tcp_input lock statistics; these are intended for debugging only
  and are not intended to ship in 8.0 as they dirty additional cache
  lines in a performance-critical per-packet path.

Approved by:	re (kib, bz)
2009-10-09 09:18:22 +00:00
Bjoern A. Zeeb
67f0b21fa6 MFC r197727:
Put #ifdef INET around parts of the FLOWTABLE code, to unbreak
  nooptions INET kernel builds.

Approved by:	re (kib)
2009-10-08 20:58:09 +00:00
Konstantin Belousov
88c45ef724 MFC r197662:
Do not dereference vp->v_mount without holding vnode lock and checking
that the vnode is not reclaimed.

Approved by:	re (bz)
2009-10-08 11:28:32 +00:00
Robert Watson
f41dd6dca9 Merge r197795 from head to stable/8:
In tcp_input(), we acquire a global write lock at first only if a
  segment is likely to trigger a TCP state change (i.e., FIN/RST/SYN).
  If we later have to upgrade the lock, we acquire an inpcb reference
  and drop both global/inpcb locks before reacquiring in-order.  In
  that gap, the connection may transition into TIMEWAIT, so we need
  to loop back and reevaluate the inpcb after relocking.

  Reported by:        Kamigishi Rei <spambox at haruhiism.net>
  Reviewed by:        bz

Approved by:	re (kib)
2009-10-08 11:07:15 +00:00
Yoshihiro Takahashi
8fe1fb2047 MFC: revision 197730
unifdef NFSCLIENT because the nlm depends on the nfsclient even if NFSCLIENT
  is not defined.

  Now the nfslockd module works with the nfsclient module.

  Reviewed by:	kib

Approved by:	re (kensmith)
2009-10-07 14:14:05 +00:00
Qing Li
c8c92b5491 MFC r197696
Remove a log message from production code. This log message can be
triggered by a misconfigured host that is sending out gratuitous ARPs.
This log message can also be triggered during a network renumbering
event when multiple prefixes co-exist on a single network segment.

Approved by:	re
2009-10-06 20:33:02 +00:00
Qing Li
7ec99f713d MFC 197695
Previously, if an address alias is configured on an interface, and
this address alias has a prefix matching that of another address
configured on the same interface, then the ARP entry for the alias
is not deleted from the ARP table when that address alias is removed.
This patch fixes the aforementioned issue.

PR:		kern/139113
Reviewed by:	bz
Approved by:	re
2009-10-06 19:44:44 +00:00
Qing Li
e85f0cc52d MFC r197687
The flow-table associates TCP/UDP flows and IP destinations with
specific routes. When the routing table changes, for example,
when a new route with a more specific prefix is inserted into the
routing table, the flow-table is not updated to reflect that change.
As such existing connections cannot take advantage of the new path.
In some cases the path is broken. This patch will update the affected
flow-table entries when a more specific route is added. The route
entry is properly marked when a route is deleted from the table.
In this case, when the flow-table performs a search, the stale
entry is updated automatically. Therefore this patch is not
necessary for route deletion.

Reviewed by:	bz, kmacy
Approved by:	re
2009-10-06 18:47:02 +00:00
Coleman Kane
4718640084 MFC: r197403, r197644, r197654, and r197659
Fix some unexpected potential NULL de-references in kernel mode due to
usage of pre-8.0 wifi operations with the ndis driver wrapping a Win32/64
wifi driver.

Submitted by:	Paul B Mahol <onemda@gmail.com>
Approved by:	re
2009-10-06 16:05:06 +00:00
Pyun YongHyeon
0baf4d9450 MFC r197461:
Use __NO_STRICT_ALIGNMENT to determine whether de(4) has to apply
  alignment fixup code for received frames on strict alignment
  architectures.

MFC r197463:
  Consistently use bus_addr_t.

MFC r197464:
  Destroy dmamap in dma cleanup.

MFC r197465:
  Align Tx/Rx descriptors on 32 bytes boundary instead of PAGE_SIZE.
  Also align setup descriptor on 32 bytes boundary. Tx buffers have no
  alignment limitation so create dmamap without alignment
  restriction[1]. Rx buffer still seems to require 4 bytes alignment
  limitation but we can simply use MCLBYTES for size to map the
  buffer instead of TULIP_DATA_PER_DESC as the buffer is allocated
  with m_getcl(9).
  de(4) supports up to TULIP_MAX_TXSEG segments for Tx buffers,
  increase maximum dma segment size to TULIP_MAX_TXSEG * MCLBYTES.
  While I'm here remove TULIP_DATA_PER_DESC as it is not used anymore.

  This should fix de(4) breakage introduced after r176206.
  Submitted by:	jhb [1]
  Reported by:	WATANABE Kazuhiro < CQG00620 <> nifty dot ne dot jp >
  Tested by:	WATANABE Kazuhiro < CQG00620 <> nifty dot ne dot jp >,
		Takahashi Yoshihiro < nyan <> jp dot freebsd dot org >
Approved by:	re (kib)
2009-10-05 19:29:25 +00:00
Andrew Gallatin
c005a51c4e MFC:197645
Two more mxge watchdog fixes

1) Restore the PCI Express control register after a watchdog
   reset.  This is required because the device will come out
   of watchdog reset with the pectl reg at its default state,
   and important BIOS configuration (like max payload size)
   could be lost.

2) Call mxge_start_locked() for every tx queue before dropping
   the lock in the watchdog handler.   This is required, as
   the queue's buf ring may have filled during the reset.

Approved by:	re (kib)
2009-10-05 14:28:23 +00:00
Yoshihiro Takahashi
ec89a806b8 MFC: revision 197709
Fix build nfscl and/or nfsd.

Approved by:	re (kib)
2009-10-05 14:03:26 +00:00
Andrew Thompson
3a3dfbf8c4 MFC r197682
EHCI Hardware BUG workaround

 The EHCI HW can use the qtd_next field instead of qtd_altnext when a short
 packet is received. This contradicts what is stated in the EHCI datasheet.
 Also the total-bytes field in the status field of the following TD gets
 corrupted upon reception of a short packet!  We work this around in software by
 not queueing more than one job/TD at a time of up to 16Kbytes! The bug has been
 seen on multiple INTEL based EHCI chips.  Other vendors have not been tested
 yet.

 - Applications using /dev/usb/X.Y.Z, where Z is non-zero are affected, but not
   applications using LibUSB v0.1, v1.2 and v2.0.
 - Mass Storage (umass) is affected.

Approved by:	re (kib)
2009-10-04 19:03:32 +00:00
Konstantin Belousov
9072a0309e MFC r197663:
As a workaround, for Intel CPUs, do not use CLFLUSH in
pmap_invalidate_cache_range() when self-snoop is apparently not reported
in cpu features.

Approved by:	re (bz, kensmith)
2009-10-04 12:20:59 +00:00
Konstantin Belousov
6832666db4 MFC r197661:
Move the annotation for vm_map_startup() immediately before the function.

Approved by:	re (bz, kensmith)
2009-10-04 12:14:49 +00:00
Konstantin Belousov
68ee1aac0a MFC r197660:
Fix typo.

Approved by:	re (bz, kensmith)
2009-10-04 12:11:44 +00:00
Xin LI
9516a85cc3 MFC revision 197683:
Return EOPNOTSUPP instead of EINVAL when doing chflags(2) over an old
format ZFS, as defined in the manual page.

Submitted by:	pjd (response of my original patch but bugs are mine)
Approved by:	re (kib)
2009-10-04 09:07:29 +00:00
Yoshihiro Takahashi
485080c4c4 MFC: revision 197657
MFi386: revision 197653

    Improve 802.11s comment.

Approved by:	re (bz)
2009-10-03 14:38:22 +00:00
Marius Strobl
721f5d589f MFC: r197490
Merge r194204 from amd64/i386:

Enable PRINTF_BUFR_SIZE by default.

PR:		139134
Approved by:	re (kib)
2009-10-02 18:33:40 +00:00
Simon L. B. Nielsen
bc7f0010f1 MFC r197711:
Add no zero mapping feature.

NOTE: Unlike in the other branches where this change will be "merged"
to, the 'no zero mapping' is enabled by default in stable/8.

Errata:		FreeBSD-EN-09:05.null
Approved by:	re (kib)
2009-10-02 17:58:47 +00:00
Alan Cox
1040e2e4d1 MFC r197580
Temporarily disable the use of 1GB page mappings by the direct map.

Approved by:	re (kib)
2009-10-02 05:11:46 +00:00
Yoshihiro Takahashi
4d1ed2a5c6 MFC: revision 197535
Add '#define NFSCLIENT' into opt_nfs.h if the NFSCLIENT variable is 1
  (the default is 1).

  This makes the nfslockd module work for NFS client.

  Reviewed by:	dfr

Approved by:	re (kib)
2009-10-01 14:42:55 +00:00
Jamie Gritton
a301d3226a MFC r197581, r197583, r197584:
Set the prison in NFS anon and GSS SVC creds.

Reviewed by:	marcel
Approved by:	re (kib)
2009-10-01 13:11:45 +00:00
Rui Paulo
f24f7ffbd4 MFC r197653:
Improve 802.11s comment.

Approved by:	re (kib)
2009-10-01 10:06:09 +00:00
Rui Paulo
f785216c4f Update 802.11s mesh support to draft 3.03. This includes a revised frame
format for peering and changes to the PERR frames.
Note that this is incompatible with the previous code.

Reviewed by:	sam
Approved by:	re (kib)
2009-09-29 12:18:23 +00:00
Pawel Jakub Dawidek
01985d6884 MFC r197287, r197289, r197351, r197426, r197458, r197459, r197497, r197498,
r197512, r197513, r197514, r197515, r197525:

r197287:

Purge namecache for the file system being rolled back, so it doesn't point at
invalid vnodes after the rollback resulting in EIO errors when trying to access
files which are in the namecache.

Reported by:	des

r197289:

Purge file system namecache when receiving incremental stream and rolling back
to it.

r197351:

Purge namecache in the same place OpenSolaris does.

r197426:

Restore BSD behaviour - when creating new directory entry use parent directory
gid to set group ownership and not process gid.

This was overlooked during v6 -> v13 switch.

PR:	kern/139076
Reported by:	Sean Winn <sean@gothic.net.au>

r197458:

Close race in zfs_zget(). We have to increase usecount first and then
check for VI_DOOMED flag. Before this change vnode could be reclaimed
between checking for the flag and increasing usecount.

r197459:

Before calling vflush(FORCECLOSE) mark file system as unmounted so the
following vnops will fail. This is very important, because without this change
vnode could be reclaimed at any point, even if we increased usecount. The only
way to ensure that vnode won't be reclaimed was to lock it, which would be very
hard to do in ZFS without changing a lot of code. With this change simply
increasing usecount is enough to be sure vnode won't be reclaimed from under
us. To be precise it can still be reclaimed but we won't be able to see it,
because every try to enter ZFS through VFS will result in EIO.

The only function that cannot return EIO, because it is needed for vflush() is
zfs_root(). Introduce ZFS_ENTER_NOERROR() macro that only locks
z_teardown_lock and never returns EIO.

r197497:

Switch to fletcher4 as the default checksum algorithm. Fletcher2 was proven to
be a bit weak and OpenSolaris also switched to fletcher4.

r197498:	head/cddl/contrib/opensolaris

Fletcher4 is now the default checksum algorithm.

r197512:

- Don't depend on value returned by gfs_*_inactive(), it doesn't work
  well with forced unmounts when GFS vnodes are referenced.
- Make other preparations to GFS for forced unmounts.

PR:	kern/139062
Reported by:	trasz

r197513:

Use traverse() function to find and return mount point's vnode instead of
covered vnode when snapshot is already mounted.

r197514:

On lookup error VFS expects *vpp to be set to NULL, be sure to do that.

r197515:

Handle cases where virtual (GFS) vnodes are referenced when doing forced
unmount. In that case we cannot depend on the proper order of invalidating
vnodes, so we have to free resources when we have a chance.

PR:	kern/139062
Reported by:	trasz

r197525:

Ensure that tv_sec is between INT32_MIN and INT32_MAX, so ZFS won't object.
This completes the fix from r185586.

PR:	kern/139059
Reported by:	Daniel Braniss <danny@cs.huji.ac.il>
Submitted by:	Jaakko Heinonen <jh@saunalahti.fi>
Tested by:	Daniel Braniss <danny@cs.huji.ac.il>

Approved by:	re (kib)
2009-09-29 10:53:06 +00:00
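Note on 01985d6884 (r197497): the fletcher4 checksum named there keeps four 64-bit running sums over 32-bit input words. The following is a minimal sketch under that assumption. The struct and function names are hypothetical, the input is taken as already-decoded native words, and byte-order handling is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Four running sums; hypothetical container for the result. */
struct fletcher4 {
	uint64_t a, b, c, d;
};

/* Fold each 32-bit word into a, then cascade a into b, b into c, c into d. */
static struct fletcher4
fletcher4_words(const uint32_t *words, size_t nwords)
{
	struct fletcher4 ck = { 0, 0, 0, 0 };
	size_t i;

	for (i = 0; i < nwords; i++) {
		ck.a += words[i];
		ck.b += ck.a;
		ck.c += ck.b;
		ck.d += ck.c;
	}
	return (ck);
}
```

For the words 1, 2, 3 the sums come out as a = 6, b = 10, c = 15, d = 21. The higher-order sums depend on word position, not only on word values.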
Andrew Gallatin
264d14d30d MFC 197395: Improve mxge watchdog routine's ability to reliably reset a failed NIC
Approved by: re (kib)
2009-09-28 23:48:16 +00:00
Michael Tuexen
fe36e02918 MFC r197341.
Fix errnos.

Approved by: re (bz), rrs (mentor)
2009-09-28 18:32:28 +00:00
Konstantin Belousov
b57f5ce0bd MFC r197390:
Remove forward_roundrobin().

Approved by:	re (kensmith)
2009-09-28 11:31:21 +00:00
Marius Strobl
238bc19306 MFC: r197401
- According to Linux, the ALi M5451 can do 31-bit DMA instead of just
  30-bit like the rest of the controllers supported by this driver.
  Actually ALi M5451 can be setup up to generate 32-bit addresses by
  setting the 31st bit via the accompanying ISA bridge, which allows
  it to work in sparc64 machines whose IOMMU require at least 32-bit
  DMA. Even though other architectures would also benefit from 32-bit
  DMA, enabling this bit is limited to sparc64 as bus_dma(9) doesn't
  generally guarantee that a low address of BUS_SPACE_MAXADDR_32BIT
  results in a buffer in the 32-bit range.
- According to Tatsuo YOKOGAWA's ali(4), the DMA transfer size of
  ALi M5451 is fixed to 64k and in fact using the default size of 4k
  causes the chip to overrun the mapping, triggering uncorrectable
  DMA errors on sparc64.
- The 4DWAVE DX and NX require the recording buffer to be 8-byte
  aligned so adjust the bus_dma_tag_create(9) accordingly.
- Unlike the rest of the controllers supported by this driver, the
  ALi M5451 only has 32 hardware channels instead of 64 so limit the
  loop in tr_intr() accordingly. [1]

Submitted by:	yongari [1]
Reviewed by:	yongari (superset of what is committed)
Approved by:	re (kib)
2009-09-25 19:59:18 +00:00
Alexander Motin
b8b5722c5d Remove constraint, requiring request data to fulfill controller's
alignment requirements. It is busdma task, to manage proper alignment by
loading data to bounce buffers.

PR:		kern/127316
Reviewed by:	current@
Tested by:	Ryan Rogers
Approved by:	re (kib)
2009-09-25 18:07:23 +00:00
Alexander Motin
2adf464fea MFC rev. 197462:
Do not call BUS_DRIVER_ADDED() for detached buses (attach failed) on
driver load. This fixes crash on atapicam module load on systems, where
some ata channels (usually ata1) were probed, but failed to attach.

Reviewed by:    jhb, imp
Tested by:      many
Approved by:    re (kib)
2009-09-25 18:04:55 +00:00
Marcel Moolenaar
3b9790003e MFC rev 197449:
Don't create more partitions than can fit in the table by checking
that the index is within bounds.

Approved by:	re (kib)
2009-09-25 17:48:30 +00:00
Marius Strobl
54577e2314 - Add missing bus_dmamap_sync(9) calls for the work DMA map. Previously
the work area was totally unsynchronized which means this driver only
  had a chance of working on x86 when no bounce buffers were involved,
  which isn't that likely given that support for 64-bit DMA is currently
  broken throughout ata(4).
- Add necessary little-endian conversion of accesses to the work area,
  making this driver work on big-endian hosts. While at it, use the
  alignment-agnostic byte order encoders in order to be on the safe side.
- Clear the reserved member of the SG list entries in order to be on the
  safe side. [1]

Submitted by:	yongari [1]
Reviewed by:	yongari
Approved by:	re (kib)
2009-09-25 16:45:27 +00:00
John Baldwin
424b2e64a2 MFC 197415:
The elements in the component arrays may be direct Package objects rather
than references to objects.  In that case, simply use the Package directly.

Approved by:	re (kib)
2009-09-25 15:14:11 +00:00
John Baldwin
57a0ee4c0b MFC 197410:
- Split the logic to parse an SMAP entry out into a separate function on
  amd64 similar to i386.  This fixes a bug on amd64 where overlapping
  entries would not cause the SMAP parsing to stop.
- Change the SMAP parsing code to do a sorted insertion into physmap[]
  instead of an append to support systems with out-of-order SMAP entries.

Approved by:	re (kib)
2009-09-25 15:08:26 +00:00
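Note on 57a0ee4c0b: a sorted insertion that also rejects overlapping entries can be sketched as below. This is not the kernel's physmap[] code. The struct range type, MAX_RANGES, and range_insert_sorted() are hypothetical, ranges are half-open [base, end), and overflow of base + len is not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RANGES 16		/* hypothetical table capacity */

struct range {
	uint64_t base;		/* first byte of the range */
	uint64_t end;		/* one past the last byte */
};

/*
 * Insert [base, base + len) into a table kept sorted by base address.
 * Returns 1 on success, 0 if the range is empty, overlaps an existing
 * entry, or the table is full.
 */
static int
range_insert_sorted(struct range *tab, size_t *n, uint64_t base, uint64_t len)
{
	uint64_t end = base + len;
	size_t i;

	if (len == 0 || *n >= MAX_RANGES)
		return (0);

	for (i = 0; i < *n; i++) {
		/* Two half-open ranges overlap when each starts before the other ends. */
		if (base < tab[i].end && tab[i].base < end)
			return (0);
		/* First entry that lies entirely above the new range: insert here. */
		if (tab[i].base >= end)
			break;
	}

	/* Shift the tail up by one slot and drop the new entry in place. */
	memmove(&tab[i + 1], &tab[i], (*n - i) * sizeof(tab[0]));
	tab[i].base = base;
	tab[i].end = end;
	(*n)++;
	return (1);
}
```

Entries supplied out of order end up sorted, so a later pass can walk the table in ascending address order.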
John Baldwin
b10d205de2 MFC 197406:
Don't reread the command register to see if enabling I/O or memory
decoding "took".  Other OS's that I checked do not do this and it breaks
some amdpm(4) devices.  Prior to 7.2 we did not honor the error returned
when this failed anyway, so this in effect restores previous behavior.

Approved by:	re (kib)
2009-09-25 14:58:00 +00:00
Brooks Davis
7bd26ba4cc MFC r197269:
Allocate space for the group array in a static credential used in
the quota code.  One case was correctly handled in r194498, but
this one was missed.

PR:		kern/138657
Tested by:	PR submitter
MFC after:	3 days
Approved by:	re@ (kib)
2009-09-24 21:32:56 +00:00
John Baldwin
4e36c32793 MFC 197350:
Re-remove the IBM0057 ID used for PS/2 mouse controllers.  The asl for the
61p includes the hotkey device as IBM0068 and the mouse as IBM0057 similar
to other systems.

Approved by:	re (kensmith)
2009-09-23 15:56:09 +00:00
Konstantin Belousov
2c5f9fbe6a MFC r197348:
For a.out and pre-8 ELF binaries, allow the mmap of zero length.

Approved by:	re (kensmith)
2009-09-23 13:49:41 +00:00
Rui Paulo
d414bb00bb MFC 197190:
Make the sudden motion sensor work on older models and add a bit of
 debugging.

 Submitted by:	Christoph Langguth <christoph at rosenkeller.org>

Approved by:	re (kib)
2009-09-22 20:31:32 +00:00
Qing Li
8cb7f8f861 MFC r197364
A wrong variable is used when setting up the interface
address route, which broke source address selection in
some code paths.

Submitted by:	noted by bz
Reviewed by:	hrs
Approved by:	re (kib)
2009-09-20 17:46:56 +00:00
Andriy Gapon
1a7268649b MFC r197099: pci(4): don't perform maximum register number check
Different sub-kinds of PCI buses may have different rules and
thus it is up for the bus backends to do proper input checks.
For example, PCIe allows configuration register numbers < 0x1000,
while for PCI proper the limit is 0x100.
And, in fact, the buses already do the checks.

Reviewed by:	jhb
Approved by:	re (kib)
2009-09-19 08:13:10 +00:00
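Note on 1a7268649b: the per-bus limits quoted in the message (register numbers below 0x1000 for PCIe, below 0x100 for PCI proper) could be checked by a backend roughly as follows. The enum and function names are hypothetical and do not correspond to the actual bus backends.

```c
#include <stdint.h>

/* Hypothetical tag for the kind of configuration space a backend serves. */
enum cfg_kind {
	CFG_PCI,	/* conventional PCI */
	CFG_PCIE	/* PCI Express extended configuration space */
};

/* First register number that is out of range for the given kind. */
static uint32_t
cfg_reg_limit(enum cfg_kind kind)
{
	return (kind == CFG_PCIE ? 0x1000 : 0x100);
}

/* Nonzero when reg is a valid configuration register number for kind. */
static int
cfg_reg_valid(enum cfg_kind kind, uint32_t reg)
{
	return (reg < cfg_reg_limit(kind));
}
```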
Nathan Whitehorn
3f12b2cdd4 MFC r197080
Add a few SCSI controllers to GENERIC that can be found in Powermacs.
This allows installation onto SCSI disks as shipped, for example,
as an option with the Powermac G3.

PR:		powerpc/138543
Reviewed by:	grehan
Approved by:	re (kib)
Obtained from:	sparc64
2009-09-19 01:49:36 +00:00
Nathan Whitehorn
ec46867105 MFC r196993
Remove some debugging (KTR_VERBOSE) that crept into ppc GENERIC long ago
and is present on no other architectures by default.

Reviewed by:	grehan
Approved by:	re (kib)
2009-09-19 01:48:12 +00:00
Kenneth D. Merry
43eb6aeda0 Merge change r197208 from head to stable/8:
Fix some instances where CAM rescans get hung up or take a long time to
complete.

Also, allow xpt_rescan() to rescan a LUN instead of a full bus.

Sponsored by:	Copan Systems, Inc.
Approved by:	re (kib)
2009-09-18 20:35:05 +00:00
Yoshihiro Takahashi
9febd63ce1 MFC: r197156
MFi386:

  Move the loader's entry point to 0x200000.  This change is also needed
  for pc98.

Approved by:	re (kensmith)
2009-09-17 14:12:21 +00:00
Ken Smith
5895f2dd9f Get ready for 8.0-RC1 builds.
Approved by:	re (implicit)
2009-09-17 14:05:06 +00:00
Bruce M Simpson
bfcfe77605 MFC revs 197129,197130,197132:
Fixes to mcast userland API.
--
  Fix an API issue in leave processing for IPv4 multicast groups.
   * Do not assume that the group lookup performed by imo_match_group()
     is valid when ifp is NULL in this case.
   * Instead, return EADDRNOTAVAIL if the ifp cannot be resolved for the
     membership we are being asked to leave.

  Caveat user:
   * The way IPv4 multicast memberships are implemented in the inpcb layer
     at the moment, has the side-effect that struct ip_moptions will
     still hold the membership, under the old ifp, until ip_freemoptions()
     is called for the parent inpcb.
   * The underlying issue is: the inpcb layer does not get notification
     of ifp being detached going away in a thread-safe manner.
     This is non-trivial to fix.
--
  Fix an obvious logic error in the IPv4 multicast leave processing,
  where the filter mode vector was not updated correctly after the leave.
--
  Tighten input checking in inp_join_group():
   * Don't try to use the source address, when its family is unspecified.
   * If we get a join without a source, on an existing inclusive
     mode group, this is an error, as it would change the filter mode.

  Fix a problem with the handling of in_mfilter for new memberships:
   * Do not rely on imf being NULL; it is explicitly initialized to a
     non-NULL pointer when constructing a membership.
   * Explicitly initialize *imf to EX mode when the source address
     is unspecified.
  This fixes a problem with in_mfilter slot recycling in the join path.
--
  Don't allow joins w/o source on an existing group.
  This is almost always pilot error.

  We don't need to check for group filter UNDEFINED state at t1,
  because we only ever allocate filters with their groups, so we
  unconditionally reject such calls with EINVAL.
  Trying to change the active filter mode w/o going through IP_MSFILTER
  is also disallowed.

  Deals with the case described in PR 137164 upfront, cumulative
  with the fix in svn rev 197132 which only calls imo_match_source()
  if the source address family was not unspecified.
--

Revision 197136 has a text conflict, however it is a comment only change.

PR:		137164, 138689, 138690, 138691
Submitted by:	Stef Walter (with fixups)
Approved by:	re (kib)
2009-09-17 13:41:59 +00:00
Andriy Gapon
04793894d4 MFC r197077: pci: remove definitions of duplicate constants
Suggested by:	jhb
Reviewed by:	jhb
Approved by:	re (kib)
2009-09-17 12:41:27 +00:00
Marko Zec
1fe6ff9267 MFC r197176:
Lock the ifnet list while iterating over it.

  Submitted by: julian
  MFC after:    3 days

Approved by:	re (kensmith)
2009-09-17 11:03:37 +00:00
Scott Long
24048b0cb3 Merge rev 197263:
- Enable MSI support (MSIX support was already present)
- Performance improvements

Approved by:	re
Obtained from:	Yahoo!
2009-09-17 05:30:55 +00:00
Scott Long
053351cec3 Merge r197260, r197261, r197262
- Prevent a panic on modern controllers by increasing CISS_MAX_PHYSTGT to 256
- Fix MSI and PERFORMANT interrupt programming.  Fixes hang on boot.
- Fix locking bugs in ioctl handler

Most of this has been soaking at Yahoo for several months, if not longer.  The
quick MFC is due to the impending 8.0-RC1 build.

Approved by:	re
Obtained from:	Yahoo!
2009-09-17 05:27:32 +00:00
Michael Tuexen
6b3c18a020 MFC 197257:
Fix a bug reported by Daniel Mentz:
When authenticating DATA chunks some DATA chunks
might get stuck when the MTU gets decreased via
an ICMP message.

Approved by: re, rrs (mentor)
2009-09-16 14:47:50 +00:00
Michael Tuexen
04a34c6c34 Fixes two bugs:
1) A lock issue, if we ever had to try again
   we would double lock the INP lock.
2) We were allowing (at wrap) associd 0... which really
   we cannot allow since 0 normally means in most socket
   API calls that we are wishing to effect something on
   the INP not TCB.

Approved by: re, rrs (mentor)
2009-09-16 13:44:12 +00:00
Konstantin Belousov
9f1fab5064 MFC r197049:
Calculate the amount of bytes to copy for select filedescriptor masks
taking into account size of fd_set for the current process ABI.

Approved by:	re (kensmith)
2009-09-16 13:24:37 +00:00
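Note on 9f1fab5064: the dependence of the copy size on the process ABI can be shown with a small calculation. This is a sketch, not the kernel's code. fdmask_bytes() is a hypothetical helper, and nfdbits is assumed to be the width in bits of one fd_set word for the ABI in question (for example 32 for a 32-bit process, 64 for a 64-bit one).

```c
#include <stddef.h>

/*
 * Bytes of descriptor mask to copy for nd descriptors when one mask word
 * holds nfdbits bits. The count is rounded up to whole words.
 */
static size_t
fdmask_bytes(unsigned int nd, unsigned int nfdbits)
{
	size_t words = ((size_t)nd + nfdbits - 1) / nfdbits;

	return (words * (nfdbits / 8));
}
```

With nd = 65, a 32-bit word size gives 12 bytes while a 64-bit word size gives 16, so using the wrong word size copies the wrong number of bytes.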
Rafal Jaworowski
e9667e8ff1 MFC r196531-196534,196536
Clean up Marvell platform code.

Introduce SheevaPlug support.

   - The device is based on Marvell 88F6281 system on chip.
   - More info about the platform at http://www.plugcomputer.org

   - To build the FreeBSD kernel:
     make buildkernel TARGET_ARCH=arm KERNCONF=SHEEVAPLUG

   - Installation notes at: http://wiki.freebsd.org/FreeBSDMarvell

Submitted by:	Michal Hajduk
Approved by:	re (kib)
Obtained from:	Semihalf
2009-09-16 12:07:58 +00:00
Qing Li
553a7dec4b MFC r197227
Self pointing routes are installed for configured interface addresses
and address aliases. After an interface is brought down and brought
back up again, those self pointing routes disappeared. This patch
ensures after an interface is brought back up, the loopback routes
are reinstalled properly.

Reviewed by:	bz
Approved by:	re
2009-09-15 22:46:06 +00:00
Qing Li
bb3b75e86f MFC r197225
This patch enables the node to respond to ARP requests for
configured proxy ARP entries.

Reviewed by:	bz
Approved by:	re
2009-09-15 22:37:17 +00:00
Qing Li
77eb2069ce MFC r197210, 197212, 197235
The bootp code installs an interface address and the nfs client
module tries to install the same address again. This extra code
is removed, which was discovered by the removal of a call to
in_ifscrub() in r196714. This call to in_ifscrub is put back here
because the SIOCAIFADDR command can be used to change the prefix
length of an existing alias.

r197235 reverts file nfs_vfsops.c

Reviewed by:	kmacy
Approved by:	re
2009-09-15 22:25:19 +00:00
Qing Li
6d8337ba49 MFC r196714
This patch fixes the following issues:

- Routing messages are not generated when adding and removing
  interface address aliases.
- Loopback route installed for an interface address alias is
  not deleted from the routing table when that address alias
  is removed from the associated interface.
- Function in_ifscrub() is called extraneously.

Reviewed by:	gnn, kmacy, sam
Approved by:	re
2009-09-15 19:58:33 +00:00
Qing Li
599f45c5dd MFC r197203
Previously local end of point-to-point interface is not reachable
within the system that owns the interface. Packets destined to
the local end point leak to the wire towards the default gateway
if one exists. This behavior is changed as part of the L2/L3
rewrite efforts. The local end point is now reachable within the
system. The inpcb code needs to consider this fact during the
address selection process.

Reviewed by:	bz
Approved by:	re
2009-09-15 19:38:29 +00:00
Attilio Rao
623b4aa57e MFC r197224:
Use explicit int values for the device states in order to allow, if
necessary, in the future, addition of new states without breaking ABI
between revisions.

Please note that this is a special condition as we want this fix in
before RC1 as we assume it is critical and so it has been handled
as an instant-merge.

Approved by:	re (kib)
2009-09-15 19:24:18 +00:00
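Note on 623b4aa57e: giving enumerators explicit, spaced-out values leaves room to add states later without renumbering existing ones. The sketch below is illustrative only. The names and numbers are hypothetical and are not claimed to match the values chosen in the commit.

```c
/* Hypothetical device states with explicit, gapped values. */
enum dev_state {
	DEV_NOTPRESENT = 10,	/* not probed, or probe failed */
	DEV_ALIVE = 20,		/* probe succeeded */
	DEV_ATTACHED = 30,	/* attach method called */
	DEV_BUSY = 40		/* device is open */
};

/* Ordered comparison keeps working if a new state is later added in a gap. */
static int
dev_is_attached(enum dev_state s)
{
	return (s >= DEV_ATTACHED);
}
```

A later revision could add, say, a state with value 25 between DEV_ALIVE and DEV_ATTACHED, and binaries compiled against the old header would still see the same numbers for the existing states.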
Attilio Rao
9cede8fb41 MFC r197223:
Fix sched_switch_migrate() by assuming locks cannot be shared and a
deadlock between 3 different threads by acquiring both runqueue locks
when doing the migration.

Please note that this is a special condition as we want this fix in
before RC1 as we assume it is critical and so it has been handled
as an instant-merge.  For the STABLE_7 branch, 1 week before the MFC
is assumed.

Approved by:	re (kib)
2009-09-15 19:14:25 +00:00
Konstantin Belousov
4dec4ece5a MFC r196888:
The clear_remove() and clear_inodedeps() call vn_start_write(NULL, &mp,
V_NOWAIT) on the non-busied mount point. Unmount might free ufs-specific
mp data, causing ffs_vgetf() to access freed memory.

Busy mountpoint before dropping softdep lk.

Approved by:	re (kensmith)
2009-09-15 12:51:22 +00:00
Pawel Jakub Dawidek
9c0bf68299 MFC r197218:
We believe ZFS is ready for production use. Remove a warning about it being
experimental. :)

Approved by:	re (kib)
2009-09-15 12:21:06 +00:00
Pawel Jakub Dawidek
26e71a6c1b MFC r197219:
Forced unmounts work just fine in my tests under heavy load. There might
still be a problem, but it isn't worth a warning.

Approved by:	re (kib)
2009-09-15 12:19:34 +00:00
Pawel Jakub Dawidek
6bd6f55621 MFC r196822, r196823, r196824:
Remove 'ad:' prefix from disk serial number. We don't want serial number
to change when we reconnect the disk in a way that it is accessible through
CAM for example.

Discussed with:	trasz

Simplify g_disk_ident_adjust() function and allow any printable character
in serial number.

Discussed with:	trasz
Obtained from:	Wheel Sp. z o.o. (http://www.wheel.pl)

Make serial numbers of daX disks visible by GEOM.

No objections from:	scottl
Obtained from:	Wheel Sp. z o.o. (http://www.wheel.pl)

Approved by:	re (kib)
2009-09-15 11:23:59 +00:00
Pawel Jakub Dawidek
055f4e2cf8 MFC r197039, r197040:
Fix usecount leak in mknod(2) on file system exported over NFS.

While I'm here, correct typo in comment.

Reviewed by:	kan, kib
Approved by:	re (bz)
2009-09-15 11:20:23 +00:00
Pawel Jakub Dawidek
18713ab672 MFC r196456,r196457,r196458,r196662,r196702,r196703,r196919,r196927,r196928,
r196943,r196944,r196947,r196950,r196953,r196954,r196965,r196978,r196979,
r196980,r196982,r196985,r196992,r197131,r197133,r197150,r197151,r197152,
r197153,r197167,r197172,r197177,r197200,r197201:

r196456:
- Give minclsyspri and maxclsyspri real values (consulted with kmacy).
- Honour 'pri' argument for thread_create().

r196457:
Set priority of vdev_geom threads and zvol threads to PRIBIO.

r196458:
- Hide ZFS kernel threads under zfskern process.
- Use better (shorter) threads names:
	'zvol:worker zvol/tank/vol00' -> 'zvol tank/vol00'
	'vdev:worker da0' -> 'vdev da0'

r196662:
Add missing mountpoint vnode locking.
This fixes panic on assertion with DEBUG_VFS_LOCKS and vfs.usermount=1 when
regular user tries to mount dataset owned by him.

r196702:
Remove empty directory.

r196703:
Backport the 'dirtying dbuf' panic fix from newer ZFS version.

Reported by:	Thomas Backman <serenity@exscape.org>

r196919:
bzero() on-stack argument, so mutex_init() won't misinterpret that the
lock is already initialized if we have some garbage on the stack.

PR:	kern/135480
Reported by:	Emil Mikulic <emikulic@gmail.com>

r196927:
Changing provider size is not really supported by GEOM, but doing so when
provider is closed should be ok.
When administrator requests to change ZVOL size do it immediately if ZVOL
is closed or do it on last ZVOL close.

PR:	kern/136942
Requested by:	Bernard Buri <bsd@ask-us.at>

r196928:
Teach zdb(8) how to obtain GEOM provider size.

PR:	kern/133134
Reported by:	Philipp Wuensche <cryx-freebsd@h3q.com>

r196943:
- Avoid holding mutex around M_WAITOK allocations.
- Add locking for mnt_opt field.

r196944:
Don't recheck ownership on update mount. This will eliminate LOR between
vfs_busy() and mount mutex. We check ownership in vfs_domount() anyway.

Noticed by:	kib
Reviewed by:	kib

r196947:
Defer thread start until we set priority.

Reviewed by:	kib

r196950:
Fix detection of file system being shared. Now zfs unshare/destroy/rename
command will properly remove exported file systems.

r196953:
When snapshot mount point is busy (for example we are still in it)
we will fail to unmount it, but it won't be removed from the tree,
so in that case there is no need to reinsert it.

Reported by:	trasz

r196954:
If we have to use avl_find(), optimize a bit and use avl_insert() instead of
avl_add() (the latter is actually a wrapper around avl_find() + avl_insert()).
Fix similar case in the code that is currently commented out.

r196965:
Fix reference count leak for a case where snapshot's mount point is updated.

r196978:
Call ZFS_EXIT() after locking the vnode.

r196979:
On FreeBSD we don't have to look for snapshot's mount point,
because fhtovp method is already called with proper mount point.

r196980:
When we automatically mount snapshot we want to return vnode of the mount point
from the lookup and not covered vnode. This is one of the fixes for using .zfs/
over NFS.

r196982:
We don't export individual snapshots, so mnt_export field in snapshot's
mount point is NULL. That's why when we try to access snapshots over NFS
use mnt_export field from the parent file system.

r196985:
Only log successful commands! Without this fix we log even unsuccessful
commands executed by unprivileged users. Action is not really taken, but it is
logged to pool history, which might be confusing.

Reported by:	Denis Ahrens <denis@h3q.com>

r196992:
Implement __assert() for Solaris-specific code. Until now Solaris code was
using Solaris prototype for __assert(), but FreeBSD's implementation.
Both take different arguments, so we were either core-dumping in assert()
or printing garbage.

Reported by:	avg

r197131:
Tighten up the check for race in zfs_zget() - ZTOV(zp) can not only contain
NULL, but also can point to dead vnode, take that into account.

PR:	kern/132068
Reported by:	Edward Fisk <7ogcg7g02@sneakemail.com>, kris
Fix based on patch from:	Jaakko Heinonen <jh@saunalahti.fi>

r197133:
- Protect reclaim with z_teardown_inactive_lock.
- Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if
  z_dbuf field is NULL - this might happen in case of rollback or forced
  unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete().
- On forced unmount wait for all znodes to be destroyed - destruction can be
  done asynchronously via zfs_reclaim_complete().

r197150:
There is a bug where mze_insert() can trigger an assert() of inserting
the same entry twice. This bug is not fixed yet, and it leads to a situation
where the kernel panics when trying to access a corrupted directory.
Until the bug is properly fixed, try to recover from it and log that it
happened.

Reported by:	marck
OpenSolaris bug:	6709336

r197151:
Be sure not to overflow struct fid.

r197152:
Extend scope of the z_teardown_lock lock for consistency and "just in case".

r197153:
When zfs.ko is compiled with debug, make sure that znode and vnode point at
each other.

r197167:
Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories
by just returning EOPNOTSUPP. This will allow NFS server to fall back to
regular READDIR.
Note that converting an inode number to a snapshot's vnode is an expensive operation.
Snapshots are stored in an AVL tree, but based on their names, not inode numbers,
so to convert an inode to a snapshot vnode we have to iterate over all snapshots.
This is not a problem in OpenSolaris, because in their READDIRPLUS
implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on
d_fileno as we do.

PR:	kern/125149
Reported by:	Weldon Godfrey <wgodfrey@ena.com>
Analysis by:	Jaakko Heinonen <jh@saunalahti.fi>

r197172:
Add missing \n.

Reported by:	marck

r197177:
Support both cases: when the snapshot is already mounted and when it is not yet
mounted.

r197200:
Modify mount(8) to skip MNT_IGNORE file systems by default, just like df(1)
does. This is not POLA violation, because there is no single file system in the
base that uses MNT_IGNORE currently, although ZFS snapshots will be mounted with
MNT_IGNORE after next commit.

Reviewed by:	kib

r197201:
- Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular
  df(1) and mount(8) output. This is a bit similar to OpenSolaris and follows
  ZFS route of not listing snapshots by default with 'zfs list' command.
- Add UPDATING entry to note that ZFS snapshots are no longer visible in
  mount(8) and df(1) output by default.

Reviewed by:	kib

Approved by:	re (bz)
2009-09-15 11:13:40 +00:00
John Baldwin
3c31e305c0 MFC 197062:
Don't malloc a buffer while holding the prison0 mutex.  Instead, use a loop
where we figure out the hostname length under the lock, malloc the buffer
with the lock dropped, then recheck the length under the lock and loop again
if the buffer is now too small.

Approved by:	re (kib)
2009-09-14 16:13:12 +00:00
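
The measure / allocate / recheck loop described in the entry above can be sketched in userland C. This is only an illustration under stated assumptions: `host_lock`, `host_name` and `copy_hostname()` are hypothetical stand-ins for the prison0 mutex and hostname field, and a pthread mutex replaces the kernel mutex; it is not the committed kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the prison0 mutex and its hostname field. */
static pthread_mutex_t host_lock = PTHREAD_MUTEX_INITIALIZER;
static char host_name[256] = "example.invalid";

/*
 * Return a malloc()ed copy of host_name without ever allocating while
 * the lock is held.  The caller frees the result.
 */
char *
copy_hostname(void)
{
	char *buf = NULL;
	size_t need, have = 0;

	for (;;) {
		/* Measure (and, if the buffer fits, copy) under the lock. */
		pthread_mutex_lock(&host_lock);
		need = strlen(host_name) + 1;
		assert(need <= sizeof(host_name));
		if (buf != NULL && have >= need) {
			memcpy(buf, host_name, need);
			pthread_mutex_unlock(&host_lock);
			return (buf);
		}
		pthread_mutex_unlock(&host_lock);

		/* Allocate with the lock dropped, then loop to recheck. */
		free(buf);
		buf = malloc(need);
		if (buf == NULL)
			return (NULL);
		have = need;
	}
}
```

The loop normally runs twice: once to learn the length and once to copy. It repeats further only if the name grew while the lock was dropped.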
Rick Macklem
8760cb9afb MFC r197048:
Add LK_NOWITNESS to the vn_lock() calls done on newly created nfs
vnodes, since these nodes are not linked into the mount queue and,
as such, the vn_lock() cannot cause a deadlock so LORs are harmless.

Suggested by: kib
Approved by:	re (kensmith), kib (mentor)
2009-09-14 15:16:17 +00:00
Konstantin Belousov
3f6296aa26 MFC r196921:
Do not decrement pfs_vncache_entries for the vnode that was not in the
pfs_vncache list.

Approved by:	re (bz)
2009-09-14 11:01:15 +00:00
Norikatsu Shigemura
87c120a67a MFC r196889:
Change 'dev.cpu.N.temperature', sysctl I (degC) to IK (Kelvin),
to match acpi_thermal(4) and amdtemp(4).

Approved by:	re (rwatson)
Reviewed by:	rpaulo
Suggested by:	ume
2009-09-13 10:04:08 +00:00
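
As a rough illustration of the unit change in the entry above: the sketch assumes the sysctl "IK" format carries tenths of a kelvin, with 0 degC represented as 2732. The macro and function names are made up for this example and are not taken from the driver.

```c
#include <assert.h>

/* Assumed representation of 0 degC in tenths of a kelvin (273.2 K). */
#define ZERO_C_DECIKELVIN 2732

/* Convert whole degrees Celsius to tenths of a kelvin. */
int
celsius_to_ik(int deg_c)
{
	return (deg_c * 10 + ZERO_C_DECIKELVIN);
}

/* Convert tenths of a kelvin back to degrees Celsius for display. */
double
ik_to_celsius(int ik)
{
	return ((ik - ZERO_C_DECIKELVIN) / 10.0);
}
```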
Konstantin Belousov
72c975651d MFC r197046:
As was done in r196643 for i386 and amd64, swap the start/end virtual
addresses in pmap_invalidate_cache_range().

Approved by:	re (kensmith)
2009-09-12 18:11:48 +00:00
Michael Tuexen
ceda2d70e4 MFC 196610:
Fix a bug where vlan interfaces are not supported by SCTP.

Approved by: re, rrs (mentor)
2009-09-12 18:08:44 +00:00
Konstantin Belousov
d4c8e5ac7b MFC r197031:
Unlock the image vnode around the call of pmc PMC_FN_PROCESS_EXEC hook.
The hook calls vn_fullpath(9), that should not be executed with a vnode
lock held.

Approved by:	re (kensmith)
2009-09-12 18:05:57 +00:00
Konstantin Belousov
3c9d279b1d MFC r197030:
In vfs_mark_atime(9), be resistant against reclaimed vnodes.
Assert that necessary locks are taken, since vop might not be called.

Approved by:	re (kensmith)
2009-09-12 18:02:57 +00:00
Michael Tuexen
f222133ab7 This fixes a bug where the value set by SCTP_PARTIAL_DELIVERY_POINT
was not honored, if the socket buffer size was not 4 times that large.
MFC of 196509.

Approved by: re, rrs (mentor)
2009-09-12 17:58:15 +00:00
Jack F Vogel
b9a65dadc2 This fixes kern/138516, an mbuf leak in both the em
and igb driver, when a transmit fails the packet/mbuf
was not being requeued. Thanks to those that pointed
this problem out.

Approved by:  re
2009-09-11 16:53:12 +00:00
Shteryana Shopova
d51cecd143 MFC r196932:
When joining a multicast group, the inp_lookup_mcast_ifp call
does a KASSERT that the group address is multicast, so the
check if this is indeed true and eventually return a EINVAL if not,
should be done before calling inp_lookup_mcast_ifp. This fixes a kernel
crash when calling setsockopt (sock, IPPROTO_IP, IP_ADD_MEMBERSHIP,...)
with invalid group address.

Reviewed by:	bms
Approved by:	re (kib)
2009-09-11 15:07:36 +00:00
Konstantin Belousov
8f0b752891 MFC r196966:
Lock Giant around vn_open_cred().
Remove innocent unnecessary call to NDFREE().

Approved by:	re (kensmith)
2009-09-11 12:56:13 +00:00
Ken Smith
ac7d4c93c6 Remove extra debugging support that is turned on for head but turned off
for stable branches:

	- shift to MALLOC_PRODUCTION
	- turn off automatic crash dumps
	- Remove kernel debuggers, INVARIANTS*[1], WITNESS* from
	  GENERIC kernel config files[2]

[1] INVARIANTS* left on for ia64 by request marcel
[2] sun4v was left as-is

Reviewed by:	marcel, kib
Approved by:	re (implicit)
2009-09-10 14:04:00 +00:00
Konstantin Belousov
b67ca8999b MFC r196920:
insmntque_stddtr() clears vp->v_data and resets vp->v_op to
dead_vnodeops before calling vgone(). Revert r189706 and corresponding
part of the r186560.

Approved by:	re (kensmith)
2009-09-10 12:42:36 +00:00
Konstantin Belousov
93566d2a83 MFC r196887:
In fhopen, vfs_ref() the mount point while vnode is unlocked, to prevent
vn_start_write(NULL, &mp) from operating on potentially freed or reused
struct mount *.

Remove unmatched vfs_rel() in cleanup.

Approved by:	re (kensmith)
2009-09-09 13:28:18 +00:00
Konstantin Belousov
f68591e407 Use traditional td_unusedX names for the padding members.
Suggested by:	julian
Approved by:	re (kensmith)
2009-09-09 10:31:09 +00:00
Attilio Rao
c90c9ddddb Adaptive spinning for locking primitives, in read-mode, have some tuning
SYSCTLs which are inappropriate for a daily use of the machine (mostly
useful only by a developer which wants to run benchmarks on it).
Remove them before the release as long as we do not want to ship with
them in.

Now that the SYSCTLs are gone, instead of using static storage for some
constants, use real numeric constants in order to avoid possible compiler
dumbness and the risk of sharing storage (and thus a cache line) among
CPUs when doing adaptive spinning together.

Please note that the sys/linker_set.h inclusion in lockmgr and sx lock
support could have been gone, but re@ preferred them to be in order to
minimize the risk of problems on future merging.

Please note that this patch is not a MFC, but an 'edge case' as commit
directly to stable/8, which creates a diverging from HEAD.

Tested by:      Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Approved by:	re (kib)
2009-09-09 09:34:13 +00:00
Attilio Rao
db0c92ce82 MFC r196772:
fix adaptive spinning in lockmgr by using correctly GIANT_RESTORE and
continue statement and improve adaptive spinning for sx lock by just
doing once GIANT_SAVE.

Approved by:	re (kib)
2009-09-09 09:17:31 +00:00
Jack F Vogel
6a89c3ede1 Make LRO turned off uncategorically for devices
attached to the bridge, rather than just in the case
when some device cannot do TSO. Customer tests have
shown that even when all devices can do TSO that LRO
will cause problems when bridging.

Approved by:  re
2009-09-08 23:25:39 +00:00
John Baldwin
bae950f3b3 MFC 196745:
Don't attempt to bind the current thread to the CPU an IRQ is bound to
when removing an interrupt handler from an IRQ during shutdown.  During
shutdown we are already bound to CPU 0 and this was triggering a panic.

Approved by:	re (kib)
2009-09-08 21:50:34 +00:00
Jamie Gritton
3c7562c77e MFC r196835:
Allow a jail's name to be the same as its jid (which is the default if
  no name is specified), and let a numeric name specify the jid for a new
  jail when the jid isn't otherwise set.  Still disallow other numeric
  names.

Reviewed by:	zec
Approved by:	re (kib), bz (mentor)
2009-09-08 19:18:02 +00:00
Konstantin Belousov
2af00decb8 MFC r196730:
Remove the altkstacks, instead instantiate threads with kernel stack
allocated with the right size from the start. For the thread that has
kernel stack cached, verify that requested stack size is equal to the
actual, and reallocate the stack if sizes differ.

Introduce separate kernel stack cache that keeps some limited amount of
preallocated kernel stacks to lower the latency of thread allocation.

Not a merge: instead of removing td_altkstack* members of struct thread,
replace them with placeholders to keep struct thread layout on the
stable branch.

Also, record r196640, r196644 and r196648 as merged.

Approved by:	re (kensmith)
2009-09-08 15:31:23 +00:00
Konstantin Belousov
c02280f542 MFC r196692:
Make the mnt_writeopcount and mnt_secondary_writes counters,
used by the suspension code, not greater than mnt_ref reference
counter value.

MFC r196733:
Fix mount reference leak when V_XSLEEP is specified to vn_start_write().

Approved by:	re (kensmith)
2009-09-08 14:43:42 +00:00
Sam Leffler
d53c6c91b1 MFC r196717:
fix beacon timers on resume in sta mode so roaming works

Approved by:	re (kensmith)
2009-09-07 16:41:18 +00:00
Sam Leffler
f8083274f0 MFC r196785:
correct timeout for doing NOL processing; need a ticks-relative value

Approved by:	re (kensmith)
2009-09-07 16:33:27 +00:00
Pawel Jakub Dawidek
264a8db4b0 MFC r196579:
Fix an obvious topology lock leak.

Approved by:	re (kib)
2009-09-07 16:25:09 +00:00
Qing Li
69406c1632 MFC r196871
The addresses that are assigned to the loopback interface
should be part of the kernel routing table.

Reviewed by:	bz
Approved by:	re
2009-09-05 20:35:18 +00:00
Qing Li
3d2a8d364d MFC r196864
This patch fixes the following issues:
- Interface link-local address is not reachable within the
  node that owns the interface, this is due to the mismatch
  in address scope as the result of the installed interface
  address loopback route. Therefore for each interface
  address loopback route, the rt_gateway field (of AF_LINK
  type) will be used to track which interface a given
  address belongs to. This will aid the address source to
  use the proper interface for address scope/zone validation.
- The loopback address is not reachable. The root cause is
  the same as the above.
- Empty nd6 entries are created for the IPv6 loopback addresses
  only for validation reason. Doing so will eliminate as much
  of the special case (loopback addresses) handling code
  as possible, however, these empty nd6 entries should not
  be returned to the userland applications such as the
  "ndp" command.
Since both of the above issues contain common files, these
files are committed together.

Reviewed by:	bz
Approved by:	re
2009-09-05 17:40:27 +00:00
Qing Li
02642a5729 MFC r196865
This patch fixes an address scope violation. Considering the
scenario where an anycast address is assigned on one interface,
and a global address with the same scope is assigned on another
interface. In other words, the interface owns the anycast
address has only the link-local address as one other address.
Without this patch, "ping6" the anycast address from another
station will observe the source address of the returned ICMP6
echo reply has the link-local address, not the global address
that exists on the other interface in the same node.

Reviewed by:    bz
Approved by:	re
2009-09-05 17:35:31 +00:00
Konstantin Belousov
a0ed3d8546 MFC r196689:
Remove spurious pfs_unlock().

Approved by:	re (rwatson)
2009-09-05 13:10:54 +00:00
Warner Losh
45f395006c MFC r196529:
Rather than having enabled/disabled, implement a max queue depth.
  While usually not an issue, this firewalls bugs in the code that may
  run us out of memory.

  Fix a memory exhaustion in the case where devctl was disabled, but the
  link was bouncing.  The check to queue was in the wrong place.

  Implement a new sysctl hw.bus.devctl_queue to control the depth.  Make
  compatibility hacks for hw.bus.devctl_disable to ease transition.

  Reviewed by:	emaste@
  Approved by:	re@ (kib)
  MFC after:	asap
2009-09-05 08:03:29 +00:00
Alexander Motin
b89c161793 MFC r196777, r196796:
ATI SB600 can't handle 256 sectors transfers with FPDMA (NCQ).

Approved by:	re (ATA-CAM blanket)
2009-09-05 06:24:28 +00:00
Ken Smith
9d4abd5433 Ready for BETA4.
Approved by:	re (implicit)
2009-09-05 00:50:08 +00:00
Jack F Vogel
0bb40ca3e0 This patch separates the control of header split from LRO (which it
was previously dependent on). LRO gets turned off when bridging, but
it's been found that header split is still a performance win in that case.

Secondly, there was some interface specific control in stats code that
has been missing, and a logic error that resulted in bogus reporting.
Thanks to Manish and John of LineRateSystems for the report and help in
this code.

Approved by: re
2009-09-04 22:37:03 +00:00
Pyun YongHyeon
b3d8f9c3fc MFC r196721:
Make sure rx descriptor ring align on 16 bytes. I guess the
  alignment requirement could be multiple of 4 bytes but I think
  using descriptor size would make intention clearer.
  Previously the size of rx descriptor was not power of 2 so it
  caused panic in bus_dmamem_alloc(9).

  Reported by:	Jeff Blank (jb000003 <> mr-happy dot com)
Approved by:	re (kib)
2009-09-04 16:41:17 +00:00
Stanislav Sedov
8b3a7204a7 - MFC r196568:
- Add quirk for Sony DSC digital cameras.  These umass devices fail
    to attach without these quirks applied.

Approved by:	re (kib)
2009-09-04 11:32:05 +00:00
Weongyo Jeong
fa0dc8bb62 MFC r196809:
fix a TX issue on big endian machines like powerpc or sparc64.  Now
  zyd(4) should work on all architectures.

  Obtained from:	OpenBSD

Approved by:	re (kib)
2009-09-04 05:37:49 +00:00
John Baldwin
bf202eb1c6 MFC 196705 and 196707:
- Improve pmap_change_attr() on i386 so that it is able to demote a large
  (2/4MB) page into 4KB pages as needed.  This should be fairly rare in
  practice.
- Simplify pmap_change_attr() a bit:
  - Always calculate the cache bits instead of doing it on-demand.
  - Always set changed to TRUE rather than only doing it if it is false.

Approved by:	re (kib)
2009-09-03 13:54:58 +00:00
Bjoern A. Zeeb
5b628e0c26 MFC r196738:
In case an upper layer protocol tries to send a packet but the
  L2 code does not have the ethernet address for the destination
  within the broadcast domain in the table, we remember the
  original mbuf in `la_hold' in arpresolve() and send out a
  different packet with an arp request.
  In case there will be more upper layer packets to send we will
  free an earlier one held in `la_hold' and queue the new one.

  Once we get a packet in, with which we can perfect our arp table
  entry we send out the original 'on hold' packet, should there
  be any.
  Rather than continuing to process the packet that we received,
  we returned without freeing the packet that came in, which
  basically means that we leaked an mbuf for every arp request
  we sent.

  Rather than freeing the received packet and returning, continue
  to process the incoming arp packet as well.
  This should (a) improve some setups, also proxy-arp, in case it was an
  incoming arp request and (b) resembles the behaviour FreeBSD had
  from day 1, which aligns with RFC826 "Packet reception" (merge case).

  Rename 'm0' to 'hold' to make the code more understandable as
  well as diffable to earlier versions more easily.

  Handle the link-layer entry 'la' lock completely in the block
  where needed and release it as early as possible, rather than
  holding it longer, down to the end of the function.

  Found by:			pointyhat, ns1
  Bug hunting session with:	erwin, simon, rwatson
  Tested by:			simon on cluster machines
  Reviewed by:			ratson, kmacy, julian

Approved by:	re (kib)
2009-09-02 16:35:57 +00:00
Bjoern A. Zeeb
914e5afefd MFC r196653:
Make sure FreeBSD binaries without .note.ABI-tag section work
  correctly and do not match a colliding Debian GNU/kFreeBSD
  brandinfo statements.
  For this mark the Debian GNU/kFreeBSD brandinfo that it must have
  an .note.ABI-tag section and ignore the old EI_OSABI brandinfo
  when comparing a possibly colliding set of options.

  Due to SYSINIT we add the brandinfo in a non-deterministic order,
  so native FreeBSD is not always first. We may want to consider
  to force native FreeBSD to come first as well.

  The only way a problem could currently be noticed is when running an
  i386 binary without the .note.ABI-tag on amd64 and the Debian GNU/kFreeBSD
  brandinfo  was matched first,  as the fallback to ld-elf32.so.1 does
  not exist in that case.

Reported and tested by:	ticso
In collaboration with:	kib
MFC after:		3 days
Approved by:		re (rwatson)
2009-09-02 10:39:46 +00:00
Alfred Perlstein
42a66b539d MFC: r196489,196498
Critical USB bugfixes for 8.0

Approved by:    re
2009-09-02 02:12:07 +00:00
Jilles Tjoelker
33656b91cd MFC r196460
Fix the conformance of poll(2) for sockets after r195423 by
  returning POLLHUP instead of POLLIN for several cases. Now, the
  tools/regression/poll results for FreeBSD are closer to those of
  Solaris and Linux.

  Also, improve the POSIX conformance by explicitly clearing POLLOUT
  when POLLHUP is reported in pollscan(), making the fix global.

  Submitted by:	bde
  Reviewed by:	rwatson

MFC r196556

  Fix poll() on half-closed sockets, while retaining POLLHUP for fifos.

  This reverts part of r196460, so that sockets only return POLLHUP if both
  directions are closed/error. Fifos get POLLHUP by closing the unused
  direction immediately after creating the sockets.

  The tools/regression/poll/*poll.c tests now pass except for two other
  things:
  - if POLLHUP is returned, POLLIN is always returned as well instead of
    only when there is data left in the buffer to be read
  - fifo old/new reader distinction does not work the way POSIX specs it

  Reviewed by:	kib, bde

MFC r196554

  Add some tests for poll(2)/shutdown(2) interaction.

Approved by:	re (kensmith)
2009-09-01 20:58:41 +00:00
Robert Noland
e2ba743711 MFC 196643
Swap the start/end virtual addresses in pmap_invalidate_cache_range().

This fixes the functionality on non SelfSnoop hardware.

Found by:	rnoland
Submitted by:	alc
Reviewed by:	kib
Approved by:	re (rwatson)
2009-09-01 16:41:28 +00:00
John Baldwin
8a503ad2c9 MFC 196637:
Mark the fake pages constructed by the OBJT_SG pager valid.  This was
accidentally lost at one point during the PAT development.  Without this
fix vm_pager_get_pages() was zeroing each of the pages.

Approved by:	re (kib)
2009-09-01 15:50:07 +00:00
Alexander Motin
be5b3af890 MFC r196657:
ATA_FLUSHCACHE is a 28bit format command, not 48.

MFC r196658:
Improve camcontrol ATA support:
 - Tune protocol version reporting,
 - Add supported DMA/PIO modes reporting.
 - Fix IDENTIFY request for ATAPI devices.
 - Remove confusing "-" for NCQ status.

MFC r196659:
Short ATA command format has 28bit address, not 36bit.
Rename ata_36bit_cmd() into ata_28bit_cmd(), while it didn't become legacy.

Approved by:	re (ATA-CAM blanket)
2009-09-01 12:04:43 +00:00
Alexander Motin
84a08f606f MFC r196656, r196660:
Update ahci driver:
 - Add Command Completion Coalescing support.
 - Add SNTF support.
 - Add two more power management modes (4, 5), implemented on driver level.
 - Fix interface mode setting.
 - Reduce interface reset time.
 - Do not report meaningless protocol/transport versions.
 - Report CAP2 register content.
 - Some performance optimizations.

Approved by:	re (ATA-CAM blanket)
2009-09-01 11:44:30 +00:00
Alexander Motin
088705a89e MFC r196655:
Update siis driver:
 - Add SNTF support.
 - Do not report meaningless transport/protocol versions.

Approved by:	re (ATA-CAM blanket)
2009-09-01 11:13:31 +00:00
Marius Strobl
e4e5ed252d Add a temporary workaround which just lets init die instead of
causing a panic if it is killed due to an unsolved stack overflow
seen very late during shutdown on sparc64 when the gmirror worker
process exits, which is a regression introduced in 8.0.

Reviewed by:	kib
Approved by:	re (rwatson)
2009-08-31 19:16:58 +00:00
Jamie Gritton
f37b0a3db5 MFC r196592:
Fix a LOR between allprison_lock and vnode locks by releasing
  allprison_lock before releasing a prison's root vnode.

PR:		kern/138004
Reviewed by:	kib
Approved by:	re (rwatson), bz (mentor)
2009-08-31 14:13:45 +00:00
Rui Paulo
de3a9cf126 MFC r196455:
Make dev.asmc.N.light.control writable by everyone.

Submitted by:	Patrick Lamaiziere <patfbsd at davenulle.org>
Approved by:	re (rwatson)
2009-08-31 12:25:04 +00:00
Marko Zec
962e75a763 MFC r196635:
Fix a few panics in linuxulator + VIMAGE due to curvnet not being set.

  This change affects only options VIMAGE builds.

  Reviewed by:  julian

Approved by:	re (rwatson)
2009-08-31 09:46:09 +00:00
Marko Zec
e9cedda843 MFC r196633:
Introduce a separate sx lock for protecting lists of vnet sysinit
  and sysuninit handlers.

  Previously, sx_vnet, which is a lock designated for protecting
  the vnet list, was (ab)used for protecting vnet sysinit / sysuninit
  handler lists as well.  Holding exclusively the sx_vnet lock while
  invoking sysinit and / or sysuninit handlers turned out to be
  problematic, since some of the handlers may attempt to wake up
  another thread and wait for it to walk over the vnet list, hence
  acquire a shared lock on sx_vnet, which in turn leads to a deadlock.
  Protecting vnet sysinit / sysuninit lists with a separate lock
  mitigates this issue, which was first observed with
  flowtable_flush() / flowtable_cleaner() in sys/net/flowtable.c.

  Reviewed by:  rwatson, jhb
  MFC after:    3 days

Approved by:	re (rwatson)
2009-08-31 09:44:07 +00:00
Konstantin Belousov
e179d138ba MFC r196560:
Honor the vfs.timestamp_precision sysctl settings for utimes(path, NULL)
and similar calls.

Approved by:	re (rwatson)
2009-08-31 09:08:14 +00:00
Qing Li
c7276c59ff As part of r196609, a call to "rtalloc" did not take the fib into
account. So call the appropriate "rtalloc_ign_fib()" instead of
calling "rtalloc_ign()".

Reviewed by:	pointed out by bz
Approved by:	re
2009-08-31 00:18:17 +00:00
Qing Li
87d2d9c556 MFC r196649
Prefix on-link verification is being performed on statically
configured prefixes. Since these statically configured prefixes
do not have any associated advertising routers, these prefixes
are treated as unreachable and those prefix routes are deleted
from the routing table. Therefore bypass prefixes that are not
learned from router advertisements during prefix on-link check.

Reviewed by:	hrs
Approved by:	re
2009-08-30 22:44:12 +00:00
Qing Li
ba3ae75b3c MFC r196609
In ip_output(), the flow-table module must not try to cache L2/L3
information for interface of IFF_POINTOPOINT or IFF_LOOPBACK type.
Since the L2 information (rt_lle) is invalid for these interface
types, accidental caching attempt will trigger panic when the invalid
rt_lle reference is accessed.

When installing a new route, or when updating an existing route, the
user supplied gateway address may be an interface address (this is
particularly true for point-to-point interface related modules such
as ppp, if_tun, if_gif). Currently the routing command handler always
set the RTF_GATEWAY flag if the gateway address is given as part of the
command parameters. Therefore the gateway address must be verified against
interface addresses or else the route would be treated as an indirect
route, thus making that route unusable.

Reviewed by:	kmacy, julian, rwatson
Approved by:	re
2009-08-30 22:42:32 +00:00
Qing Li
d84f95cd4a MFC r196608
Do not try to free the rt_lle entry of the cached route in
ip_output() if the cached route was not initialized from the
flow-table. The rt_lle entry is invalid unless it has been
initialized through the flow-table.

Reviewed by:	kmacy, rwatson
Approved by:	re
2009-08-30 22:39:49 +00:00
Qing Li
4090e9b219 MFC r196569
When multiple interfaces exist in the system, with each interface having
an IPv6 address assigned to it, and if an incoming packet received on
one interface has a packet destination address that belongs to another
interface, the routing table is consulted to determine how to reach this
packet destination. Since the packet destination is an interface address,
the route table will return a host route with the loopback interface as
rt_ifp. The input code must recognize this fact, instead of using the
loopback interface, the input code performs a search to find the right
interface that owns the given IPv6 address.

Reviewed by:	bz, gnn, kmacy
Approved by:	re
2009-08-30 22:36:46 +00:00
Andrew Thompson
59fa5c955f MFC r196547
It is possible for all the kthreads to exit (hci modules unloaded) which in
 turn ends our usb process. This means the proc pointer becomes invalid and will
 panic if a new kthread is added. Count the number of threads and clear the proc
 pointer on the last one.

Approved by:	re (kib)
2009-08-29 15:42:06 +00:00
Robert Watson
d6f7f21cac Merge r196559 from head to stable/8:
Add IFNET_HOLD reserved pointer value for the ifindex ifnet array,
  which allows an index to be reserved for an ifnet without making
  the ifnet available for management operations.  Use this in if_alloc()
  while the ifnet lock is released between initial index allocation and
  completion of ifnet initialization.

  Add ifindex_free() to centralize the implementation of releasing an
  ifindex value.  Use in if_free() and if_vmove(), as well as when
  releasing a held index in if_alloc().

  Reviewed by:  bz

Approved by:	re (kib)
2009-08-28 21:14:04 +00:00
Robert Watson
57d231bba6 Merge r196553 from head to stable/8:
Break out allocation of new ifindex values from if_alloc() and if_vmove(),
  and centralize in a single function ifindex_alloc().  Assert the
  IFNET_WLOCK, and add missing IFNET_WLOCK in if_alloc().  This does not
  close all known races in this code.

  Reviewed by:  bz

Approved by:	re (kib)
2009-08-28 21:12:38 +00:00
Robert Watson
a0021692f2 Merge r196535 from head to stable/8:
Use locks specific to the lltable code, rather than borrow the ifnet
  list/index locks, to protect link layer address tables.  This avoids
  lock order issues during interface teardown, but maintains the bug that
  sysctl copy routines may be called while a non-sleepable lock is held.

  Reviewed by:  bz, kmacy, qingli

Approved by:	re (kib)
2009-08-28 21:10:26 +00:00
Robert Watson
b569420afa Merge r196510 from head to stable/8:
Make if_grow static -- it's not used outside of if.c, and with the
  internals destined to change, it's better if it remains that way.

Approved by:	re (kib)
2009-08-28 21:07:43 +00:00
Max Laier
f2b31d1909 MFC r196551:
Fix argument ordering to memcpy as well as the size of the copy in the
  (theoretical) case that pfi_buffer_cnt should be greater than ~_max.

  Submitted by:	pjd
  Reviewed by:	{krw,sthen,markus}@openbsd.org

Approved by:	re (kib)
2009-08-28 20:26:00 +00:00
Robert Watson
1b257b0e92 Merge r196482 from head to stable/8:
Rather than using IFNET_RLOCK() when iterating over (and modifying) the
  ifnet list during if_ef load, directly acquire the ifnet_sxlock
  exclusively.  That way when if_alloc() recurses the lock, it's a write
  recursion rather than a read->write recursion.

  This code structure is arguably a bug, so add a comment indicating that
  this is the case.  Post-8.0, we should fix this, but this commit
  resolves panic-on-load for if_ef.

  Discussed with:       bz, julian
  Reported by:  phk

Approved by:	re (kib)
2009-08-28 20:07:38 +00:00
Robert Watson
3ef94f2b72 Merge r196481 from head to stable/8:
Rework global locks for interface list and index management, correcting
  several critical bugs, including race conditions and lock order issues:

  Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an
  sxlock.  Either can be held to stabilize the lists and indexes, but both
  are required to write.  This allows the list to be held stable in both
  network interrupt contexts and sleepable user threads across sleeping
  memory allocations or device driver interactions.  As before, writes to
  the interface list must occur from sleepable contexts.

  Reviewed by:  bz, julian

Approved by:	re (kib)
2009-08-28 20:06:02 +00:00
Marko Zec
d6976e0558 MFC r196504:
When moving ifnets from one vnet to another, and the ifnet
  has ifaddresses of AF_LINK type which thus have an embedded
  if_index "backpointer", we must update that if_index backpointer
  to reflect the new if_index that our ifnet just got assigned.

  This change affects only options VIMAGE builds.

  Submitted by: bz
  Reviewed by:  bz
  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-28 19:18:20 +00:00
Marko Zec
61268392e1 MFC r196505:
When "jail -c vnet" request fails, the current code actually creates and
  leaves behind an orphaned vnet.  This change ensures that such vnets get
  released.

  This change affects only options VIMAGE builds.

  Submitted by: jamie
  Discussed with:       bz
  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-28 19:15:17 +00:00
Marko Zec
83864c810e MFC r196503:
Fix NFS panics with options VIMAGE kernels by appropriately setting curvnet
  context inside the RPC code.

  Temporarily set td's cred to mount's cred before calling socreate() via
  __rpc_nconf2socket().

  Submitted by: rmacklem (in part)
  Reviewed by:  rmacklem, rwatson
  Discussed with:       dfr, bz
  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-28 19:12:44 +00:00
Marko Zec
f04e871efc MFC r196502:
Introduce a div_destroy() function which takes over per-vnet cleanup tasks
  from the existing modevent / MOD_UNLOAD handler, and register div_destroy()
  in protosw as per-vnet .pr_destroy() handler for options VIMAGE builds.  In
  nooptions VIMAGE builds, div_destroy() will be invoked from the modevent
  handler, resulting in effectively identical operation as it was prior to this
  change.  div_destroy() also tears down hashtables used by ipdivert, which
  were previously left behind on ipdivert kldunloads.

  For options VIMAGE builds only, temporarily disable kldunloading of ipdivert,
  because without introducing additional locking logic it is impossible to
  atomically check whether all ipdivert instances in all vnets are idle, and
  proceed with cleanup without opening a race window for a vnet to open an
  ipdivert socket while ipdivert tear-down is in progress.

  While here, staticize div_init(), because it is not used outside of
  ip_divert.c.

  In cooperation with:  julian
  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-28 19:10:58 +00:00
Marko Zec
939af5009a MFC r196501:
When registering a protocol to an existing protocol domain via
  pf_proto_register(), iterate over all existing vnets to call protosw_init()
  and thus the appropriate .pr_init() handler in the context of each vnet.
  NB in the future we probably want to separate pr_init() handlers into
  two, i.e. per-vnet and global, functions.

  This change has no impact on nooptions VIMAGE builds.

  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-28 19:08:56 +00:00
Pyun YongHyeon
180e7945c7 MFC r196517:
Don't try to power down PHY when alc(4) failed to map the device.
  This fixes system crash when mapping alc(4) device failed in device
  attach.

  Reported by:	Jim < stapleton.41 <> gmail DOT com >
Approved by:	re (kib)
2009-08-28 18:01:37 +00:00
Pyun YongHyeon
83b5def49a MFC r196516:
Add RTL8168DP/RTL8111DP device id. While I'm here append "8111D" to
  the description of RTL8168D as RL_HWREV_8168D can be either
  RTL8168D or RTL8111D.

  PR:	kern/137672
Approved by:	re (kib)
2009-08-28 17:34:22 +00:00
Bjoern A. Zeeb
ac63e409c2 MFC r196512:
Fix handling of .note.ABI-tag section for GNU systems [1].
  Handle GNU/Linux according to LSB Core Specification 4.0,
  Chapter 11. Object Format, 11.8. ABI note tag.

  Also check the first word of desc, not only name, according to
  glibc abi-tags specification to distinguish between Linux and
  kFreeBSD.

  Add explicit handling for Debian GNU/kFreeBSD, which runs
  on our kernels as well [2].

  In {amd64,i386}/trap.c, when checking osrel of the current process,
  also check the ABI to not change the signal behaviour for Linux
  binary processes, now that we save an osrel version for all three
  from the lists above in struct proc [2].

  These changes make it possible to run FreeBSD, Debian GNU/kFreeBSD
  and Linux binaries on the same machine again for at least i386 and
  amd64, and no longer break kFreeBSD which was detected as GNU(/Linux).

PR:		kern/135468
Submitted by:	dchagin [1] (initial patch)
Suggested by:	kib [2]
Tested by:	Petr Salinger (Petr.Salinger seznam.cz) for kFreeBSD
Reviewed by:	kib
Approved by:	re (kensmith)
2009-08-27 17:34:13 +00:00
John Baldwin
953e1b6c8d MFC 196520:
Tweak the way that the ACPI and ISA bus drivers match hint devices to
BIOS-enumerated devices:
- Assume a device is a match if the memory and I/O ports match even if the
  IRQ or DRQ is wrong or missing.  Some BIOSes don't include an IRQ for
  the atrtc device for example.
- Add a hack to better match floppy controller devices.  Many BIOSes do not
  include the starting port of the floppy controller listed in the hints
  (0x3f0) in the resources for the device.  So far, however, all the BIOS
  variations encountered do include the 'port + 2' resource (0x3f2), so
  adjust the matching for "fdc" devices to look for 'port + 2'.

Approved by:	re (kib)
2009-08-27 16:34:04 +00:00
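
The floppy controller rule described in r196520 above can be illustrated with a short sketch. This is not the ACPI or ISA bus code: hint_port_matches(), its arguments, and the flat array of resource start addresses are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical matcher: return 1 if a BIOS-enumerated device (described
 * here only by the start addresses of its I/O port resources) matches a
 * hinted device.  For "fdc" the hinted port (e.g. 0x3f0) is often absent
 * from the BIOS resources, so the sketch looks for 'port + 2' instead.
 */
static int
hint_port_matches(const char *name, unsigned long hint_port,
    const unsigned long *res_start, size_t nres)
{
	unsigned long want = hint_port;
	size_t i;

	assert(name != NULL && (res_start != NULL || nres == 0));
	if (strcmp(name, "fdc") == 0)
		want = hint_port + 2;
	for (i = 0; i < nres; i++) {
		if (res_start[i] == want)
			return (1);
	}
	return (0);
}
```

With a BIOS resource list that starts at 0x3f2, an "fdc" hint at 0x3f0 matches, while any other device hinted at 0x3f0 does not.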
Doug Barton
818b5b0e2a MFC r196435:
The svnversion string is only relevant when newvers.sh is called
during the kernel build process, the other places that call the
script do not make use of that information. So restrict execution
of the svnversion-related code to the kernel build context.

Approved by:	re (kib)
2009-08-26 22:32:14 +00:00
Ken Smith
bf6ab6cb36 Ready for 8.0-BETA3 builds.
Approved by:	re (implicit)
2009-08-21 17:40:24 +00:00
Julian Elischer
f8f0b70474 MFC r196423
Fix ipfw's initialization functions to get the correct order of evaluation
  to allow vnet and non vnet operation. Move some functions from ip_fw_pfil.c
  to ip_fw2.c and move to mostly using the SYSINIT and VNET_SYSINIT handlers
  instead of the modevent handler. Correct some spelling errors in comments
  in the affected code. Note this bug fixes a crash in NON VIMAGE kernels when
  ipfw is unloaded.

  This patch is a minimal patch for 8.0
  I have a much larger patch that actually fixes the underlying problems
  that will be applied after 8.0

Reviewed by:	zec@, rwatson@, bz@(earlier version)
Approved by:	re (rwatson)
2009-08-21 11:23:29 +00:00
Julian Elischer
1261248008 MFC r196419:
Don't allow access to the internals until it has all been set up.
  Specifically, not until the per-vnet parts have been set up.

Submitted by:	kmacy@
Reviewed by:	julian@, zec@
Approved by:	re(rwatson)
2009-08-21 10:05:26 +00:00
John Baldwin
18fb1e9a44 MFC 196417:
This patch fixes two bugs in sglist(9) and improves robustness of the API via
better semantics if a request to append an address range to an existing list
fails.
- When cloning an sglist, properly set the length in the new sglist instead of
  leaving the new list empty.
- Properly compute the amount of data added to an sglist via
  _sglist_append_buf().  This allows sglist_consume_uio() to properly update
  uio_resid.
- When a request to append an address range to a scatter/gather list fails,
  restore the sglist to the state it had at the start of the function call
  instead of resetting it to an empty list.

Approved by:	re (kib)
2009-08-21 03:14:39 +00:00
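
The third item in r196417 above (restore the list on a failed append rather than emptying it) can be sketched as follows. The types and names here are hypothetical and much simpler than sglist(9): segments are never merged, so saving the segment count at entry is enough to roll back.

```c
#include <assert.h>
#include <stddef.h>

#define SEG_MAX   4	/* hypothetical fixed capacity of the list */
#define SEG_CHUNK 4096	/* hypothetical maximum bytes per segment */

struct seg {
	size_t addr;
	size_t len;
};

struct seglist {
	struct seg segs[SEG_MAX];
	size_t nseg;
};

/*
 * Append the range [addr, addr + len) as chunks of at most SEG_CHUNK
 * bytes.  Returns 0 on success and -1 if the list runs out of room.
 * On failure the list is restored to the state it had on entry, so
 * segments added by earlier successful calls are preserved.
 */
static int
seglist_append(struct seglist *sl, size_t addr, size_t len)
{
	size_t saved = sl->nseg;	/* snapshot for rollback */

	while (len > 0) {
		size_t n = (len > SEG_CHUNK) ? SEG_CHUNK : len;

		if (sl->nseg == SEG_MAX) {
			sl->nseg = saved;	/* undo only this call */
			return (-1);
		}
		sl->segs[sl->nseg].addr = addr;
		sl->segs[sl->nseg].len = n;
		sl->nseg++;
		addr += n;
		len -= n;
	}
	return (0);
}
```

A design that merges a new range into the last existing segment would also need to save that segment's length before modifying it.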
Ken Smith
31b3c66986 MFC r196415:
Fix a boot hang for hptrr(4) caused by changes introduced in r195534.
It is necessary to make sure cpi->transport is set for xpt_scan_bus() to
work properly.

Submitted by: Bernhard Schmidt (scb+freebsd-current <at> techwires
              <dot> net)
Reviewed by:  scottl
Approved by:  re (kib)
2009-08-21 01:12:06 +00:00
Peter Wemm
21f6a3982f MFC rev 196410 - deal with 'ticks' going negative after 24 days of uptime
with the default 1000hz clock in the timewait expiration code.

Approved by:    re (kensmith)
2009-08-20 23:07:53 +00:00
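
The "24 days" figure in r196410 above follows from a signed 32-bit counter incremented 1000 times per second. The sketch below shows the arithmetic and one common wrap-safe way to compare tick values; it is an illustration under those assumptions, not the code from the commit, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Days until a signed 32-bit tick counter that starts at 0 and is
 * incremented hz times per second passes INT32_MAX and turns negative.
 */
static double
days_until_negative(int hz)
{
	assert(hz > 0);
	return (((double)INT32_MAX + 1.0) / hz / 86400.0);
}

/*
 * Wrap-safe expiry test: subtract in unsigned arithmetic, where
 * wraparound is well defined, then interpret the difference as signed.
 * The final conversion assumes the usual two's complement behaviour.
 * Valid while the two values are less than 2^31 ticks apart.
 */
static int
tick_expired(int32_t now, int32_t deadline)
{
	return ((int32_t)((uint32_t)now - (uint32_t)deadline) >= 0);
}
```

With hz set to 1000, days_until_negative() gives a little under 25 days. A plain `now >= deadline` comparison gives the wrong answer once `now` has wrapped negative while `deadline` is still a large positive value; the subtraction form does not.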
Jung-uk Kim
1cc36da966 MFC: r196412
Check whether the SMBIOS reports a reasonable amount of memory.  If it is
less than "avail memory", fall back to Maxmem to avoid user confusion.
We use SMBIOS information to display "real memory" since r190599 but
some broken SMBIOS implementations reported only half of actual memory.

Tested by:	bz
Approved by:	re (kib)
2009-08-20 23:04:21 +00:00
Robert Watson
708b471c4b Merge r196267 from head to stable/8:
Rather than fix questionable ifnet list locking in the implementation of
  the kern.polling.enable sysctl, remove the sysctl.  It has been deprecated
  since FreeBSD 6 in favour of per-ifnet polling flags.

  Reviewed by:	luigi

Approved by:	re (kib)
2009-08-20 21:29:49 +00:00
Robert Watson
aeba9e80ff Merge r196263 from head to stable/8:
Remove unused if_rawoutput() macro; it has been unused since at least
  FreeBSD 2.

Approved by:	re (kib)
2009-08-20 21:14:52 +00:00
John Baldwin
5c91164df2 MFC 196404:
Change the 'resid' parameter to sglist_consume_uio() from an int to a
size_t to match the recent type change of the uio_resid member of struct
uio.

Approved by:	re (kib)
2009-08-20 20:53:36 +00:00
John Baldwin
247db0748a MFC 196403: Temporarily revert the new-bus locking for 8.0 release.
Approved by:	re (kib)
2009-08-20 20:23:28 +00:00
Will Andrews
566abe95b2 MFC r196397 from head:
Fix CARP memory leaks on carp_if's malloc'd using M_CARP.  This occurs when
  CARP tries to free them using M_IFADDR after the last address for a virtual
  host is removed and when detaching from the parent interface.

Approved by:	re (kib), ken (mentor)
2009-08-20 02:49:43 +00:00
Pawel Jakub Dawidek
8844a10730 MFC r196395:
Our libc doesn't implement control method for XDR (only kernel does) and it
will always return failure. Fix this by bringing userland implementation of
xdrmem_control() back. This allows 'zpool import' to work again.

Reported by:	Thomas Backman <serenity@exscape.org>
Reviewed by:	kmacy
Approved by:	re (kib)
2009-08-20 00:08:58 +00:00
Ed Schouten
f0c46a48f7 MFC r196390:
Make the MacBookPro3,1 hardware boot again.

  Tested by:    Patrick Lamaiziere <patfbsd davenulle org>
  Approved by:  re (kib)
2009-08-19 20:44:22 +00:00
Kip Macy
786f829ddb This change fixes a comment and addresses a complaint by kib@ by
moving a frequently executed flowtable syslog statement from being
 conditional on bootverbose to conditional on a per-vnet flowtable
 sysctl.

Approved by:	re@
2009-08-19 20:17:36 +00:00
Xin LI
67a435347b MFC r196386:
Temporarily enhance em(4) and igb(4) hack to take account for IFF_NOARP.
Without this changeset there will be no way to prevent these NICs from
sending ARP, which is harmful in server farms that are configured as
"Direct Server Return" behind a load balancer.

A better fix would remove the whole hack completely but it would be
later than 8.0-RELEASE.

Reviewed by:	jfv, yongari
Approved by:	re (kib)
2009-08-19 18:08:50 +00:00
Rafal Jaworowski
49d96dc6dd MFC r196380
Fix USB cache sync operations for platforms with non-coherent DMA.

- usb_pc_cpu_invalidate() is called between [consecutive] reads from a device,
  so a sequence of BUS_DMASYNC_POSTREAD and _PREREAD should be used. Note we
  cannot use or'ed shorthand ( _POSTREAD | _PREREAD) for BUS_DMASYNC flags, as
  the low level bus dma sync operation is implementation dependent and we
  cannot assume the required order of operations to be guaranteed.

- usb_pc_cpu_flush() is called before writing to a device, so
  BUS_DMASYNC_PREWRITE should be used.

Submitted by:	Grzegorz Bernacki
Reviewed by:	HPS, arm@, usb@ ML
Tested by:	HPS, Mike Tancsa
Approved by:	re (kib)
Obtained from:	Semihalf
2009-08-19 14:48:59 +00:00
Ed Schouten
e047c5fbb6 MFC r196378:
Small changes to the warning message generated by pty(4):

  - Only print the warning once, instead of filling up the screen.
  - Use the word "legacy" for the pty_warningcnt description, to prevent
    confusion.
  - Use log() instead of printf().

  Discussed with: rwatson, jhb
  Approved by:    re (kib)
2009-08-19 14:38:43 +00:00
Michael Tuexen
d51d92a789 Fix a bug in the handling of unreliable messages which
results in stalled associations.

Approved by: re, rrs (mentor)
2009-08-19 12:12:51 +00:00
Max Laier
0e7983d1f6 MFC r196372:
If we cannot immediately get the pf_consistency_lock in the purge thread,
  restart the scan after acquiring the lock the hard way.  Otherwise we
  might end up with a dead reference.

Approved by:	re (kib)
2009-08-19 00:17:00 +00:00
Stanislav Sedov
1daebacca0 - MFC r196370.
Do not try to reevaluate current RX production index on each
  loop iteration as it can be updated by the card while we
  process the RX ring forcing us to process RX descriptors
  for which DMA synchronisation operation has not been
  performed.  This fixes the bug when bge(4) drops packets
  under high load.

Discussed with:	yongari, marius
Approved by:	re (kib)
2009-08-18 21:13:00 +00:00
Kip Macy
670151d0e4 MFC 196368
- change the interface to flowtable_lookup so that we don't rely on
    the mbuf for obtaining the fib index
  - check that a cached flow corresponds to the same fib index as the
    packet for which we are doing the lookup
  - at interface detach time flush any flows referencing stale rtentrys
    associated with the interface that is going away (fixes reported
    panics)
  - reduce the time between cleans, so that if the cleaner is running
    at the time the eventhandler is called and the wakeup is missed,
    less time will elapse before the eventhandler returns
  - separate per-vnet initialization from global initialization
    (pointed out by jeli@)

Reviewed by:	sam@
Approved by:	re@
2009-08-18 20:39:35 +00:00
Pyun YongHyeon
f8fb3cc00e MFC r196366:
Backout r193289. r193289 restored page select bits to previous
  value instead of blindly resetting them to 0. However, it seems the
  page select bits of some 88E1116 PHYs are initialized to an invalid
  value, such that restoring the page select bits after programming
  broke MII register access. The correct solution would be to reset the
  page select bits to 0 in the PHY attach stage, but that would require
  more testing.
  Since we're in BETA stage such a change would be dangerous so just
  back it out.
  This change should fix nfe(4) breakage on NVIDIA MCP55.

  Reported by:	Ryan Rogers < webmaster <> doghouserepair dot com >
		Sam Fourman Jr. < sfourman <> gmail dot com >
  Tested by:	Ryan Rogers < webmaster <> doghouserepair dot com >
		Sam Fourman Jr. < sfourman <> gmail dot com >
  Approved by:	re (kib)
2009-08-18 20:25:02 +00:00
Michael Tuexen
3da1fd00cf Fix a panic when using one-to-one style sockets in non-blocking
mode and there is no listening server.
PR: 137795
Approved by: re, rrs (mentor)
2009-08-18 20:06:00 +00:00
Pawel Jakub Dawidek
65536ad653 MFC r196358:
Remove unused taskqueue_find() function.

Reviewed by:	dfr
Approved by:	re (kib)
2009-08-18 14:00:25 +00:00
Alexander Motin
382c5c0df4 Fix copy/paste bug, that requests data read during ATA device probe sequence
for ATA_SETFEATURES/ATA_SF_SETXFER command which by definition transfers no
data. Most controllers are not affected by this bug, but some nVidia ones
are.

Tested on:      current@
Approved by:    re (kib)
2009-08-18 09:36:25 +00:00
Alexander Motin
a7f9e24d61 MFC r196352:
Fix iSCSI initiator and vpo driver operation, broken by CAM changes.

Reviewed by:	scottl, Danny Braniss
Approved by:	re (rwatson)
2009-08-18 09:31:00 +00:00
Kip Macy
bf4e402b83 fix netboot issue by disabling flowtable lookups until initialization has been run
+ mergeinfo garbage

Reviewed by:	rwatson@
Approved by:	re@
2009-08-17 20:06:00 +00:00
Rick Macklem
8d3f6febcd MFC r196332:
Apply the same patch as r196205 for nfs_upgrade_lock() and
nfs_downgrade_lock() to the experimental nfs client.

Approved by:	re (kensmith), kib (mentor)
2009-08-17 18:11:50 +00:00
Attilio Rao
7e2d0af9e0 MFC r196334:
* Change the scope of the ASSERT_ATOMIC_LOAD() from a generic check to
  a pointer-fetching specific operation check. Consequently, rename the
  operation ASSERT_ATOMIC_LOAD_PTR().
* Fix the implementation of ASSERT_ATOMIC_LOAD_PTR() by checking
  alignment directly on the word boundary, for all the given specific
  architectures. That's a bit too strict for some common cases, but it
  assures safety.
* Add a comment explaining the scope of the macro
* Add a new stub in the lockmgr specific implementation

Tested by: marcel (initial version), marius
Reviewed by: rwatson, jhb (comment specific review)
Approved by: re (kib)
2009-08-17 16:33:53 +00:00
Marcel Moolenaar
87b51e539d MFC rev 196333:
The start of the EFI GPT partition in the PMBR can always be represented
by CHS addressing. Don't define these fields as 0xff, but rather define
them correctly. This prevents boot problems on PCs where GPT is being
used.

PR:             115406
Submitted by:   Kent Hauser <kent@khauser.net>
Approved by:    re (kib)
2009-08-17 16:24:50 +00:00
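
The claim in r196333 above, that the start of the protective partition is always representable in CHS form, can be checked with the standard LBA-to-CHS conversion. The sketch assumes the partition starts at LBA 1 (where the GPT header lives) and a geometry with at least two sectors per track; struct chs and lba_to_chs() are hypothetical names, not the boot code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* The three CHS bytes as stored in an MBR partition entry. */
struct chs {
	uint8_t head;	/* head number */
	uint8_t sect;	/* bits 0-5: sector, bits 6-7: cylinder bits 8-9 */
	uint8_t cyl;	/* low 8 bits of the cylinder */
};

/*
 * Convert an LBA to packed MBR CHS bytes for the given geometry.
 * Returns 0 on success, or -1 if the cylinder does not fit in 10 bits.
 */
static int
lba_to_chs(uint32_t lba, uint32_t heads, uint32_t spt, struct chs *out)
{
	uint32_t c, h, s;

	assert(out != 0 && heads > 0 && heads <= 255 && spt > 0 && spt <= 63);
	c = lba / (heads * spt);
	h = (lba / spt) % heads;
	s = (lba % spt) + 1;		/* sectors are numbered from 1 */
	if (c > 1023)
		return (-1);
	out->head = (uint8_t)h;
	out->sect = (uint8_t)((s & 0x3f) | ((c >> 2) & 0xc0));
	out->cyl = (uint8_t)(c & 0xff);
	return (0);
}
```

For LBA 1 the result is cylinder 0, head 0, sector 2 for any such geometry, so there is no need to fall back to the 0xff "out of range" marker for the starting address.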
John Hay
c49b5baa65 MFC: 196326
Fix parse() so that the partition to boot (load /boot/loader) from can
be set. The syntax as printed in main() is used: 0:ad(0p3)/boot/loader

Reviewed by:	jhb
Approved by:	re (kib)
2009-08-17 15:39:47 +00:00
Konstantin Belousov
1d4dc8543f MFC r196318:
Correct accounting error when allocating a page table page to implement
a user-space demotion.

Approved by:	re (rwatson)
2009-08-17 13:32:56 +00:00
Rui Paulo
f2d3e43377 MFC r196316:
Fix a typo in ifdef mesh support. This would make mesh unworkable if
  TDMA support was compiled out.

Approved by:	re (kib)
2009-08-17 13:00:32 +00:00
Pawel Jakub Dawidek
1aefcd39f0 MFC r196309:
getcwd() (when __getcwd() fails) works by stating current directory, going up
(..), calling readdir and looking for previous directory inode.  In case of
.zfs/ directory this doesn't work, because .zfs/ is hidden by default, so it
won't be visible in readdir output.

Fix this by implementing VPTOCNP for snapshot directories, so __getcwd()
doesn't fail and getcwd() doesn't have to use readdir method.

This fixes /bin/pwd from within .zfs/snapshot/<name>/.

Suggested by:	kib
Approved by:	re (rwatson)
2009-08-17 10:02:31 +00:00
Pawel Jakub Dawidek
5adfc444cf MFC r196307:
Manage asynchronous vnode release just like Solaris.

Discussed with:	kmacy
Approved by:	re (kib)
2009-08-17 09:55:58 +00:00
Pawel Jakub Dawidek
be7e2e42e1 MFC r196303:
- Reduce z_teardown_lock lock scope a bit.
- The error variable is int, not bool.
- Convert spaces to tabs where needed.

Approved by:	re (kib)
2009-08-17 09:30:31 +00:00
Pawel Jakub Dawidek
a64b735739 MFC r196301:
If z_buf is NULL, we should free znode immediately.

Noticed by:	avg
Approved by:	re (kib)
2009-08-17 09:27:10 +00:00
Pawel Jakub Dawidek
93be9449e4 MFC r196299:
- We need to recycle vnode instead of freeing znode.

Submitted by:	avg

- Add missing vnode interlock unlock.
- Remove redundant znode locking.

Approved by:	re (kib)
2009-08-17 09:23:27 +00:00
Pawel Jakub Dawidek
f0fb1d62c7 MFC r196297:
Fix panic in zfs recv code. The last vnode (mountpoint's vnode) can have
0 usecount.

Reported by:	Thomas Backman <serenity@exscape.org>
Approved by:	re (kib)
2009-08-17 09:14:58 +00:00
Pawel Jakub Dawidek
e43f173602 MFC r196295:
Remove OpenSolaris taskq port (it performs very poorly in our kernel) and
replace it with wrappers around our taskqueue(9).
To make this possible, implement a taskqueue_member() function, which returns 1
if the given thread was created by the given taskqueue.

Approved by:	re (kib)
2009-08-17 09:03:47 +00:00
Pawel Jakub Dawidek
ea5f504fed MFC r196293:
Because taskqueue_run() can drop tq_mutex, we need to check if the
TQ_FLAGS_ACTIVE flag wasn't removed in the meantime, which means we missed a
wakeup.

Approved by:	re (kib)
2009-08-17 08:46:47 +00:00
Pawel Jakub Dawidek
4aae820003 MFC r196291:
- Fix a race where /dev/zfs control device is created before ZFS is fully
  initialized. Also destroy /dev/zfs before doing other deinitializations.
- Initialization through taskq is no longer needed and there is a race
  where one of the zpool/zfs command loads zfs.ko and tries to do some work
  immediately, but /dev/zfs is not there yet.

Reported by:	pav
Approved by:	re (kib)
2009-08-17 08:38:41 +00:00
Pawel Jakub Dawidek
9e813d55e3 MFC r196289:
Remove files that are no longer used.

Discussed with:	kmacy
Approved by:	re (kib)
2009-08-17 08:09:46 +00:00
Scott Long
91fa948790 Merge r196200. Add firmware definitions needed by mfiutil
Approved by:	re
2009-08-17 06:21:22 +00:00