Commit Graph

74085 Commits

Author SHA1 Message Date
Robert Watson
7429a3f3d8 Remove vnet_foreach() utility function, which previously allowed
vnet.c to iterate virtual network stacks without being aware of
the implementation details previously hidden in kern_vimage.c.
Now they are in the same file, so remove this added complexity.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 20:24:45 +00:00
Robert Watson
530c006014 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
Matt Jacob
2df76c160b Add 8Gb support (isp_2500). Fix a fair number of configuration and
firmware loading bugs.

Target mode support has received some serious attention to make it
more usable and stable.

Some backward compatible additions to CAM have been made that make
target mode async events easier to deal with have also been put
into place.

Further refinement and better support for NP-IV (N-port Virtualization)
is now in place.

Code for release prior to RELENG_7 has been stripped away for code clarity.

Sponsored by: Copan Systems

Reviewed by:    scottl, ken, jung-uk kim
Approved by:    re
2009-08-01 01:04:26 +00:00
Matt Jacob
b965588786 Add 8Gb card firmware. Update some 2Gb and 4Gb f/w sets.
Split 4Gb and 8Gb into pieces that can be either multi_id
capable or not.

Reviewed by:	scottl, ken
Approved by:	re
2009-08-01 00:57:34 +00:00
Sam Leffler
faf666aa77 fix misplaced #endif that caused tdma handling to be merged with ESS handling
(causing tdma scanning to break)

Approved by:	re (kib)
2009-07-31 19:13:16 +00:00
Sam Leffler
2dfcbb0e2e Filter setting IFF_PROMISC on tdma vaps; we don't want the underyling device
to be in promiscuous mode as we have a h/w bssid.

Approved by:	re (kib)
2009-07-31 19:12:19 +00:00
Weongyo Jeong
6418e06a48 add upgt
Approved by:	re (kib)
2009-07-31 17:57:16 +00:00
Jamie Gritton
42bebd8205 Make the "enforce_statfs" default 2 (most restrictive) in jail_set(2),
instead of whatever the parent/system has (which is generally 0).  This
mirrors the old-style default used for jail(2) in conjunction with the
security.jail.enforce_statfs sysctl.

Approved by:	re (kib), bz (mentor)
2009-07-31 16:00:41 +00:00
John Baldwin
87eca70e0c Fix some LORs between vnode locks and filedescriptor table locks.
- Don't grab the filedesc lock just to read fd_cmask.
- Drop vnode locks earlier when mounting the root filesystem and before
  sanitizing stdin/out/err file descriptors during execve().

Submitted by:	kib
Approved by:	re (rwatson)
MFC after:	1 week
2009-07-31 13:40:06 +00:00
Kevin Lo
6ece67d83f Free allocated Rx ring dma memory/tags.
Reviewed by: yongari@
Approved by: re (kib)
2009-07-31 09:57:42 +00:00
Weongyo Jeong
aa7eaa2fb6 fixes a typo for DWA120 device ID.
Reported by:	Alexander Kuznetsov <skritku at gmail.com>
Approved by:	re (kib)
2009-07-30 18:53:06 +00:00
Xin LI
16e324fc56 Show interface name which received short CARP packet (e.g. a VRRP packet),
in order to match other codepaths nearby.  This makes troubleshooting
easier.

Approved by:	re (kib)
MFC after:	1 month
2009-07-30 17:40:47 +00:00
Jamie Gritton
2b0d6f815e Remove a LOR, where the the sleepable allprison_lock was being obtained
in prison_equal_ip4/6 while an inp mutex was held.  Locking allprison_lock
can be avoided by making a restriction on the IP addresses associated with
jails:

Don't allow the "ip4" and "ip6" parameters to be changed after a jail is
created.  Setting the "ip4.addr" and "ip6.addr" parameters is allowed,
but only if the jail was already created with either ip4/6=new or
ip4/6=disable.  With this restriction, the prison flags in question
(PR_IP4_USER and PR_IP6_USER) become read-only and can be checked
without locking.

This also allows the simplification of a messy code path that was needed
to handle an existing prison gaining an IP address list.

PR:		kern/136899
Reported by:	Dirk Meyer
Approved by:	re (kib), bz (mentor)
2009-07-30 14:28:56 +00:00
Robert Watson
ed3db012fc Reorder and recomment vnet.c and vnet.h on the basis that they are no longer
solely about the virtual network stack memory allocator.

Approved by:	re (vimage blanket)
2009-07-30 12:41:19 +00:00
Robert Watson
4b0026b9fb Add two new privileges for use by OpenAFS, which will be supported for
FreeBSD 8.x.

MFC after:	3 days
Submitted by:	Benjamin Kaduk <kaduk at MIT.EDU>
Approved by:	re (kib)
2009-07-30 08:41:06 +00:00
Alfred Perlstein
9e90dc16ba Missed this file for r195963:
USB core:
  - add support for defragging of written device data.
  - improve handling of alternate settings in device side mode.
  - correct return value from usbd_get_no_alts() function.
  - reported by: HPS
  - P4 ID: 166156, 166168

  - report USB device release information to devd and pnpinfo.
  - reported by: MIHIRA Sanpei Yoshiro
  - P4 ID: 166221

Submitted by:	hps
Approved by:	re
2009-07-30 00:57:54 +00:00
Alfred Perlstein
65c7bc9cbf USB CORE - Improve HID parsing
See PR description for more info. Patch is
implemented differently than suggested, but
having the same result.

PR:     usb/137188

Submitted by:	hps
Approved by:	re
2009-07-30 00:17:08 +00:00
Alfred Perlstein
bf3254f22e USB CORE - compat Linux:
- Patch request from Tim Borgeaud:
- add automatic locking
- add refcount for killing URB's

Submitted by:	hps
Approved by:	re
2009-07-30 00:16:50 +00:00
Alfred Perlstein
a0c61406ad USB controller:
- allow disabling "root_mount_hold()" by setting "hw.usb.no_boot_wait" sysctl

Submitted by:	hps
Approved by:	re
2009-07-30 00:16:32 +00:00
Alfred Perlstein
4743e6da52 ULPT:
- add conditional printer status checking
- P4 ID: 166176

Submitted by:	hps
Approved by:	re
2009-07-30 00:16:06 +00:00
Alfred Perlstein
bd73b18714 USB core:
- add support for defragging of written device data.
- improve handling of alternate settings in device side mode.
- correct return value from usbd_get_no_alts() function.
- reported by: HPS
- P4 ID: 166156, 166168

- report USB device release information to devd and pnpinfo.
- reported by: MIHIRA Sanpei Yoshiro
- P4 ID: 166221

Submitted by:	hps
Approved by:	re
2009-07-30 00:15:50 +00:00
Alfred Perlstein
b27b901cb4 USB serial:
- add new ID for Huawei
- P4 ID: 166150

PR:             usb/136761

Submitted by:	hps
Approved by:	re
2009-07-30 00:15:17 +00:00
Alfred Perlstein
3d11bc19a0 USB audio:
- code factoring patch from "Eygene Ryabinkin"
- P4 ID: 166149

Submitted by:	hps
Approved by:	re
2009-07-30 00:14:56 +00:00
Alfred Perlstein
dddb25f98a USB CORE:
- Add minimum polling support to drive UMASS
  and UKBD in case of panic.
- Add extra check to ukbd probe to fix problem about
  mouse devices attaching like keyboards.
- P4 ID: 166148

Submitted by:	hps
Approved by:	re
2009-07-30 00:14:34 +00:00
Alfred Perlstein
81f8288460 USB input
- add support for setting the UMS polling rate through -F option
           passed to moused.
         - requested by Alexander Best
         - P4 ID: 166075

PR:             usb/125264

Submitted by:	hps
Approved by:	re
2009-07-30 00:13:09 +00:00
Alfred Perlstein
f724bcec31 USB controller:
- patch from Alexander Motin <mav@freebsd.org>
          - add more ID's
          - P4 ID: 165805

Submitted by:	hps
Approved by:	re
2009-07-30 00:12:47 +00:00
Konstantin Belousov
ed190ad06a Fix XEN build breakage, by implementing pmap_invalidate_cache_range()
and using it when appropriate. Merge analogue of the r195836
optimization to XEN.

Approved by:	re (kensmith)
2009-07-29 19:38:33 +00:00
Jamie Gritton
bdfc8cc4cd Don't allow mixing the "vnet" and "ip4/6" jail parameters, since vnet
jails have their own IP stack and don't have access to the parent IP
addresses anyway.  Note that a virtual network stack forms a break
between prisons with regard to the list of allowed IP addresses.

Approved by:	re (kib), bz (mentor)
2009-07-29 16:46:59 +00:00
Jamie Gritton
8986e3a023 Change the default value of the "ip4" and "ip6" jail parameters to
"disable", which only allows access to the parent/physical system's
IP addresses when specifically directed.  Change the default value of
"host" to "new", and don't copy the parent host values, to insulate
jails from the parent hostname et al.

Approved by:	re (kib), bz (mentor)
2009-07-29 16:41:02 +00:00
Rick Macklem
6a795918c1 Fix the experimental nfs client so that it only calls ncl_vinvalbuf()
for NFSv2 and not NFSv4 when nfscl_mustflush() returns 0. Since
nfscl_mustflush() only returns 0 when there is a valid write delegation
issued to the client, it only affects the case of an NFSv4 mount with
callbacks/delegations enabled.

Approved by:	 re (kensmith), kib (mentor)
2009-07-29 14:50:31 +00:00
Konstantin Belousov
8a5ac5d56f As was done in r195820 for amd64, use clflush for flushing cache lines
when memory page caching attributes changed, and CPU does not support
self-snoop, but implemented clflush, for i386.

Take care of possible mappings of the page by sf buffer by utilizing
the mapping for clflush, otherwise map the page transiently. Amd64
used direct map.

Proposed and reviewed by:  alc
Approved by:   re (kensmith)
2009-07-29 08:49:58 +00:00
Robert Watson
791b0ad2bf Eliminate ARG_UPATH[12] arguments to AUDIT_ARG_UPATH() and instead
provide specific macros, AUDIT_ARG_UPATH1() and AUDIT_ARG_UPATH2()
to capture path information for audit records.  This allows us to
move the definitions of ARG_* out of the public audit header file,
as they are an implementation detail of our current kernel-internal
audit record, which may change.

Approved by:	re (kensmith)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-29 07:44:43 +00:00
Robert Watson
a9bcca799e Revise header comments for vnet.h as we now implement VNET_SYSINIT, not
just VNET_DEFINE in vnet.h.

Approved by:	re (vimage blanket)
2009-07-28 22:17:34 +00:00
Robert Watson
b146fc1bf0 Rework vnode argument auditing to follow the same structure, in order
to avoid exposing ARG_ macros/flag values outside of the audit code in
order to name which one of two possible vnodes will be audited for a
system call.

Approved by:	re (kib)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-28 21:52:24 +00:00
Robert Watson
e4b4bbb665 Audit file descriptors passed to fooat(2) system calls, which are used
instead of the root/current working directory as the starting point for
lookups.  Up to two such descriptors can be audited.  Add audit record
BSM encoding for fooat(2).

Note: due to an error in the OpenBSM 1.1p1 configuration file, a
further change is required to that file in order to fix openat(2)
auditing.

Approved by:	re (kib)
Reviewed by:	rdivacky (fooat(2) portions)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-28 21:39:58 +00:00
Julian Elischer
3d1001cb11 Startup the vnet part of initialization a bit after the global part.
Fixes crash on boot if ipfw compiled in.

Submitted by:	tegge@
Reviewed by:	tegge@
Approved by:	re (kib)
2009-07-28 19:58:07 +00:00
Julian Elischer
7973fba3a4 Somewhere along the line accept sockets stopped honoring the
FIB selected for them. Fix this.

Reviewed by:	ambrisko
Approved by:	re (kib)
MFC after:	3 days
2009-07-28 19:43:27 +00:00
Qing Li
9fca4f79c7 The new flow table caches both the routing table entry as well as the
L2 information. For an indirect route the cached L2 entry contains the
MAC address of the gateway. Typically the default route is used to
transmit multicast packets when explicit multicast routes are not
available. The ether_output() function bypasses L2 resolution function
if it verifies the L2 cache is valid, because the cached L2 address
(a unicast MAC address) is copied into the packets as the destination
MAC address. This validation, however, does not apply to broadcast and
multicast packets because the destination MAC address is mapped
according to a standard method instead.

Submitted by:	Xin Li
Reviewed by:	bz
Approved by:	re
2009-07-28 17:16:54 +00:00
Michael Tuexen
bf3d517756 Fix a bug where wrong initialization value
in used for an SCTP specific sysctl variable.

Approved by: re, rrs(mentor).
MFC after: 2 weeks.
2009-07-28 15:07:41 +00:00
Randall Stewart
cfde3ff70b Turns out that when a receiver forwards through its TNS's the
processing code holds the read lock (when processing a
FWD-TSN for pr-sctp). If it finds stranded data that
can be given to the application, it calls sctp_add_to_readq().
The readq function also grabs this lock. So if INVAR is on
we get a double recurse on a non-recursive lock and panic.

This fix will change it so that readq() function gets a
flag to tell if the lock is held, if so then it does not
get the lock.

Approved by:	re@freebsd.org (Kostik Belousov)
MFC after:	1 week
2009-07-28 14:09:06 +00:00
Weongyo Jeong
62dcd8657b adds DLINK2 DWA120 device.
PR:		usb/136950
Reported by:	Alexander Kuznetsov <skritku at gmail.com>
Approved by:	re (kib)
2009-07-27 20:17:20 +00:00
Qing Li
df813b7ea2 This patch does the following:
- Allow loopback route to be installed for address assigned to
      interface of IFF_POINTOPOINT type.
    - Install loopback route for an IPv4 interface addreess when the
      "useloopback" sysctl variable is enabled. Similarly, install
      loopback route for an IPv6 interface address when the sysctl variable
      "nd6_useloopback" is enabled. Deleting loopback routes for interface
      addresses is unconditional in case these sysctl variables were
      disabled after an interface address has been assigned.

Reviewed by:	bz
Approved by:	re
2009-07-27 17:08:06 +00:00
John Baldwin
8e3764c0a6 Fix the freebsd32 versions of semsys(), shmsys(), and msgsys() to use the
old ABI versions of the relevant control system call (e.g.
freebsd7_freebsd32_msgctl() instead of freebsd32_msgctl() for msgsys()).

Approved by:	re (kib)
2009-07-27 16:03:04 +00:00
Pawel Jakub Dawidek
abd8353f5d We don't support ephemeral IDs in FreeBSD and without this fix ZFS can
panic when in zfs_fuid_create_cred() when userid is negative. It is
converted to unsigned value which makes IS_EPHEMERAL() macro to
incorrectly report that this is ephemeral ID. The most reasonable
solution for now is to always report that the given ID is not ephemeral.

PR:		kern/132337
Submitted by:	Matthew West <freebsd@r.zeeb.org>
Tested by:	Thomas Backman <serenity@exscape.org>, Michael Reifenberger <mike@reifenberger.com>
Approved by:	re (kib)
MFC after:	2 weeks
2009-07-27 14:52:34 +00:00
Rui Paulo
3ca80f0dbc Mesh fixes, namely:
* don't clobber proxy entries
* HWMP seq number processing, including discard of old frames
* flush routing table entries based on nexthop
* print route flags in ifconfig
* more debugging messages and comments

Proxy changes submitted by sam.

Approved by:	re (kib)
2009-07-27 14:22:09 +00:00
Rui Paulo
1f93ae9453 Refine the MacBook hack to only match early models that have Intel ICH.
Discussed with:	kjim
Approved by:	re (kib)
2009-07-27 13:51:55 +00:00
Michael Tuexen
8e71b6947a Fix the handling of unordered messages when using
PR-SCTP.

Approved by: re, rrs (mentor)
MFC after: 3 weeks.
2009-07-27 13:41:45 +00:00
Michael Tuexen
4420d9a062 Get rid of unused field. This will also be deleted
in the official speciication of the SCTP socket API.

Approved by:re, rrs (mentor)
2009-07-27 12:09:32 +00:00
Michael Tuexen
47a490cbbc Add a missing unlock for the inp lock when
returning early from sctp_add_to_readq().

Approved by: re, rrs (mentor)
MFC after: 2 weeks.
2009-07-26 15:06:59 +00:00
Alexander Motin
b06555e4fc Restore PATA device probe order, broken by PMP support implementation,
requesting IDENTIFY from slave device first. This order is important
for proper cable type detection by master device.

PR:		kern/136438
Approved by:	re (kib)
2009-07-26 14:04:48 +00:00
Bjoern A. Zeeb
d0ea47437a Update epair(4) to the new netisr implementation and polish
things a bit:
- use dpcpu data to track the ifps with packets queued up,
- per-cpu locking and driver flags
- along with .nh_drainedcpu and NETISR_POLICY_CPU.
- Put the mbufs in flight reference count, preventing interfaces
  from going away, under INVARIANTS as this is a general problem
  of the stack and should be solved in if.c/netisr but still good
  to verify the internal queuing logic.
- Permit changing the MTU to virtually everythinkg like we do for loopback.

Hook epair(4) up to the build.

Approved by:	re (kib)
2009-07-26 12:20:07 +00:00
Bjoern A. Zeeb
be31e5e7b5 Make the in-kernel logic for the SIOCSIFVNET, SIOCSIFRVNET ioctls
(ifconfig ifN (-)vnet <jname|jid>) work correctly.

Move vi_if_move to if.c and split it up into two functions(*),
one for each ioctl.

In the reclaim case, correctly set the vnet before calling if_vmove.

Instead of silently allowing a move of an interface from the current
vnet to the current vnet, return an error. (*)

There is some duplicate interface name checking before actually moving
the interface between network stacks without locking and thus race
prone. Ideally if_vmove will correctly and automagically handle these
in the future.

Suggested by:	rwatson (*)
Approved by:	re (kib)
2009-07-26 11:29:26 +00:00
Alexander Motin
1a00526bec Add note, that ahci(4) and siis(4) supersede ata(4) drivers.
Approved by:	re (implicitly)
2009-07-25 18:45:09 +00:00
Alexander Motin
e19ef875b1 Add ahci and siis drivers to NOTES.
Approved by:	re (implicitly)
2009-07-25 17:40:49 +00:00
Jamie Gritton
7cbf72137f Some jail parameters (in particular, "ip4" and "ip6" for IP address
restrictions) were found to be inadequately described by a boolean.
Define a new parameter type with three values (disable, new, inherit)
to handle these and future cases.

Approved by:	re (kib), bz (mentor)
Discussed with:	rwatson
2009-07-25 14:48:57 +00:00
Julian Elischer
9d85f50ad5 Catch ipfw up to the rest of the vimage code.
It got left behind when it moved to its new location.

Approved by:	re (kensmith)
2009-07-25 06:42:42 +00:00
Jack F Vogel
45289e2ded Improvement on the last change, this gives a precise
way to tell the one and only interface that a vlan
event is for. Thanks to John Baldwin for the patch.

Approved by: re
2009-07-24 21:35:52 +00:00
Brooks Davis
254d03c508 Introduce a new sysctl process mib, kern.proc.groups which adds the
ability to retrieve the group list of each process.

Modify procstat's -s option to query this mib when the kinfo_proc
reports that the field has been truncated.  If the mib does not exist,
fall back to the truncated list.

Reviewed by:	rwatson
Approved by:	re (kib)
MFC after:	2 weeks
2009-07-24 19:12:19 +00:00
John Baldwin
5f60e483c8 Bump __FreeBSD_version for the introduction of OBJT_SG.
Approved by:	re (kensmith)
2009-07-24 18:31:04 +00:00
Jack F Vogel
387424df40 This delta fixes two bugs:
- When a vlan event occurs a check was not made that
    the event was actually for the interface, thus resulting
    in a panic. All three drivers have this vulnerability. Add
    a check for this condition.
  - Secondly, there was a duplicate buf_ring free in the em
    driver resulting in a panic on unload. Remove.

Approved by:  re
2009-07-24 16:57:49 +00:00
Jack F Vogel
8bd0025fdd A small number of systems in the ICH9/10 family have a flash
part that is made up of 8K banks rather than 4K, if these
systems are using bank 1 then the last change in this code
breaks the bank read, resulting in an invalid checksum of
the eeprom during driver load. This change fixes this.

Approved by:  re
2009-07-24 16:54:22 +00:00
Sam Leffler
8ed1835dea revert OACTIVE part of r195845; instead fix the comment so it does not refer
to the old hack removed in r193312

Approved by:	re (implicit)
2009-07-24 15:37:02 +00:00
Sam Leffler
ff5aac8eb7 correct handling of IFF_PROMISC; this should not be pushed to the parent
device except for monitor and ahdemo mode vaps

Reviewed by:	rpaulo
Approved by:	re (kensmith)
2009-07-24 15:28:29 +00:00
Sam Leffler
983a2c892b monitor mode vaps are meant to be read-only so they can operate on any
frequency w/o regulatory issues, do this by hooking if_transmit and
if_output with routines that discard all transmits

Reviewed by:	thompsa, cbzimmer (intent)
Approved by:	re (kensmith)
2009-07-24 15:27:02 +00:00
Sam Leffler
224881b466 o kill old code no longer needed after r193312
o count output packets+errors for frames sent through ieee80211_output

Approved by:	re (kensmith)
2009-07-24 15:22:12 +00:00
John Baldwin
63cbfee7b0 Remove debugging that crept in with previous commit.
Reported by:	nwhitehorn
Approved by:	re (kib)
2009-07-24 15:06:49 +00:00
Brooks Davis
1b5768be71 Revert the changes to struct kinfo_proc in r194498. Instead, fill
in up to 16 (KI_NGROUPS) values and steal a bit from ki_cr_flags
(all bits currently unused) to indicate overflow with the new flag
KI_CRF_GRP_OVERFLOW.

This fixes procstat -s.

Approved by: re (kib)
2009-07-24 15:03:10 +00:00
John Baldwin
013818111a Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
provide aliases to other memory addresses.  The primary difference is that
it uses an sglist(9) to determine the physical addresses for a given offset
into the object instead of invoking the d_mmap() method in a device driver.

Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-24 13:50:29 +00:00
Robert Watson
d0728d7174 Introduce and use a sysinit-based initialization scheme for virtual
network stacks, VNET_SYSINIT:

- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will
  occur each time a network stack is instantiated and destroyed.  In the
  !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT.
  For the VIMAGE case, we instead use SYSINIT's to track their order and
  properties on registration, using them for each vnet when created/
  destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose
  previously, as well as its dependency scheme: we now just use the
  SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that
  they want init functions to be called for each virtual network stack
  rather than just once at boot, compiling down to DOMAIN_SET() in the
  non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead
  of modinfo or DOMAIN_SET() for init/uninit events.  In some cases,
  convert modular components from using modevent to using sysinit (where
  appropriate).  In some cases, do minor rejuggling of SYSINIT ordering
  to make room for or better manage events.

Portions submitted by:	jhb (VNET_SYSINIT), bz (cleanup)
Discussed with:		jhb, bz, julian, zec
Reviewed by:		bz
Approved by:		re (VIMAGE blanket)
2009-07-23 20:46:49 +00:00
Alan Cox
3313145243 Eliminate unnecessary cache and TLB flushes by pmap_change_attr(). (This
optimization was implemented in the amd64 version roughly 1 year ago.)

Approved by:	re (kensmith)
2009-07-23 19:43:23 +00:00
Nathan Whitehorn
ccf6415e82 Fix serial console on Apple Xserve G5 by falling back to input-device-1
if input-device is unavailable. The Xserve G5 defaults to using
screen/keyboard for output-device/input-device even if these are not
installed, and then falls back to serial ports at boot time.

Reviewed by:	marcel
Hardware from:	grehan
Approved by:	re (kib)
2009-07-23 12:51:27 +00:00
Rick Macklem
93a15e2054 When vfs.newnfs.callback_addr is set to an IPv4 address, the
experimental NFSv4 client might try and use it as an IPv6 address,
breaking callbacks. The fix simply initializes the isinet6 variable
for this case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 18:10:44 +00:00
Edward Tomasz Napierala
d2ceff236a Fix extattr_list_file(2) on ZFS in case the attribute directory
doesn't exist and user doesn't have write access to the file.
Without this fix, it returns bogus value instead of 0.  For some
reason this didn't manifest on my kernel compiled with -O0.

PR:		kern/136601
Submitted by:	Jaakko Heinonen <jh at saunalahti dot fi>
Approved by:	re (kib)
2009-07-22 15:15:58 +00:00
Rick Macklem
c79e697621 Add changes to the experimental nfs client to use the PBDRY flag for
msleep(9) when a vnode lock or similar may be held. The changes are
just a clone of the changes applied to the regular nfs client by
r195703.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:37:53 +00:00
Konstantin Belousov
206a336872 When the page caching attributes are changed, after new mapping is
established, OS shall flush the caches on all processors that may have
used the mapping previously. This operation is not needed if processors
support self-snooping. If not, but clflush instruction is implemented
on the CPU, series of the clflush can be used on the mapping region.
Otherwise, we have to flush the whole cache. The later operation is very
expensive, and AMD-made CPUs do not have self-snooping.

Implement cache flush for remapped region by using clflush for amd64,
when supported by CPU.

Proposed and reviewed by:	alc
Approved by:	re (kensmith)
2009-07-22 14:32:38 +00:00
Rick Macklem
7f1968ba10 When using an NFSv4 mount in the experimental nfs client with delegations
being issued from the server, there was a case where an Open issued locally
based on the delegation would be released before the associated vnode
became inactive. If the delegation was recalled after the open was released,
an Open against the server would not have been acquired and subsequent I/O
operations would need to use the special stateid of all zeros. This patch
fixes that case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:32:28 +00:00
Andrew Gallatin
94c7d993a3 mxge's tunable hw.mxge.rss_hash_type cannot be set from the
loader, because it uses a reserved suffix (_type).  Fix
this by removing the "_" and renaming the tunable to
hw.mxge.rss_hashtype.  The old (rss_hash_type) tunable is
still fetched, in case people load the driver via scripts.
When both are present in the kernel environment,
the new value (hw.mxge.rss_hashtype) overrides the old
value.

Approved by:	re (kib)
2009-07-22 11:57:34 +00:00
Bjoern A. Zeeb
a08362ce46 sysctl_msec_to_ticks is used with both virtualized and
non-vrtiualized sysctls so we cannot used one common function.

Add a macro to convert the arg1 in the virtualized case to
vnet.h to not expose the maths to all over the code.

Add a wrapper for the single virtualized call, properly handling
arg1 and call the default implementation from there.

Convert the two over places to use the new macro.

Reviewed by:	rwatson
Approved by:	re (kib)
2009-07-21 21:58:55 +00:00
Sam Leffler
e50821abe7 store mesh timers as ticks and sysctls for changing the defaults
Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:38:22 +00:00
Sam Leffler
411ccf5f63 Correct handling of keys that already have a hardware/device key index:
this was broken in r183248 when the check of wk_keyix was replaced by
a check of IEEE80211_KEY_DEVKEY (because the flag was clobbered).  Define
IEEE80211_KEY_DEVICE to specify flags that are owned by net80211/driver
and use this to preserve IEEE80211_KEY_DEVKEY so we don't ask the driver
for another key index when we already have one.

Testing by:	Daniel Thiele, Wes Morgan
Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:36:32 +00:00
Sam Leffler
bd6c4dd05d correct setup of opt_ddb.h
Submitted by:	jkim
Approved by:	re (kib)
2009-07-21 19:24:53 +00:00
Sam Leffler
82cadd5aa0 Fix handling of AR_RX_FILTER_BSSID: write the shadow value for AR_MISC_MODE
so other register writes preserve the setting of AR_MISC_MODE_BSSID_MATCH_FORCE.

Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:23:34 +00:00
Marius Strobl
fada2a867d Add a MD __PCI_BAR_ZERO_VALID which denotes that BARs containing 0
actually specify valid bases that should be treated just as normal.
The PCI specifications have no indication that 0 would be a magic value
indicating a disabled BAR as commonly used on at least amd64 and i386
but not sparc64. It's unclear what to do in pci_delete_resource()
instead of writing 0 to a BAR though as there's no (other) way do
disable individual BARs so its decoding is left enabled in case of
__PCI_BAR_ZERO_VALID for now.

Approved by:	re (kib), jhb
MFC after:	1 week
2009-07-21 19:06:39 +00:00
Sam Leffler
fe0dd78965 track whether any mesh vaps are present to correctly setup the rx filter
when, for example, an ap vap is created first

Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:01:04 +00:00
Alan Cox
9e367360e3 Catch up with r195249, "Improve the handling of cpuset with interrupts."
Specifically, update the return type of xenpic_assign_cpu() so that this
file compiles again.

Approved by:	re (kib)
2009-07-21 16:54:11 +00:00
Rui Paulo
a6d54dae20 Enable mesh support.
Submitted by:	jkim
Approved by:	re (kib)
2009-07-21 14:23:05 +00:00
Rui Paulo
5b4d5f9ee2 Improve the printf message when a module failed to load. This gives the
user some clue about the possibility of a __FreeBSD_version mismatch.

Discussed with:	rwatson, jhb
Approved by:	re (kib)
2009-07-21 14:18:25 +00:00
Alexander Motin
67b87e4429 Add siis CAM driver for SiliconImage SiI3124/3132/3531 SATA2 controllers.
Driver supports Serial ATA and ATAPI devices, Port Multipliers
(including FIS-based switching), hardware command queues (31 command
per port) and Native Command Queuing. This is probably the second on
popularity, after AHCI, type of SATA2 controllers, that benefits from
using CAM, because of hardware command queuing support.

Approved by:    re (kib)
2009-07-21 12:32:46 +00:00
Rafal Jaworowski
570255d8a5 Do not use OCP85XX_LBC_OFF twice when accessing LBC registers on MPC85XX.
It turns LBC control registers were not programmed correctly on MPC85XX. We
were accessing bogus addresses as the base offset (OCP85XX_LBC_OFF) was
erroneously added during offset calculations.  Effectively the state of LBC
control registers was not altered by the kernel initialization code, but
everything worked as long as we coincided to use the same settings (LBC decode
windows) as firmware has initialized.

Submitted by:	Lukasz Wojcik
Reviewed by:	marcel
Approved by:	re (kensmith)
Obtained from:	Semihalf
2009-07-21 08:38:45 +00:00
Rafal Jaworowski
84a5818564 Make dcache_inv_range() point to the proper routines on ARM9 and ARM9E/ARM10.
On some ARM variations CPU func dispatcher has the D-cache invalidate method
point to write-back invalidate, which is wrong, and can lead to a crash/panic
on affected platforms.

Spotted by:	HPS
Reviewed by:	cognet
Approved by:	re (kib)
2009-07-21 08:29:19 +00:00
Coleman Kane
e0d12c4b94 Fix regression in last set of commits. Submitted via e-mail and then
nagged again via PR. Thank Paul for his persistence and contributions.

PR:		136895
Submitted by:	Paul B. Mahol <onemda@gmail.com>
Reviewed by:	sam (timeout, 10 days), weongyo (timeout, 10 days), me
Approved by:	re (Kostik Belousov <kostikbel@gmail.com>)
2009-07-20 23:21:19 +00:00
Robert Watson
a511354af4 Back out the moving in r195782 of V_ip_id's initialization from the top
back to the bottom of ip_init() as found in 7.x.  I missed the fact that
the bottom half of the init routine only runs in the !VNET case.

Submitted by:	zec
Approved by:	re (vimage blanket)
2009-07-20 19:40:09 +00:00
Edward Tomasz Napierala
65588fd503 Fix permission handling for extended attributes in ZFS. Without
this change, ZFS uses SunOS Alternate Data Streams semantics - each
EA has its own permissions, which are set at EA creation time
and - unlike SunOS - invisible to the user and impossible to change.
From the user point of view, it's just broken: sometimes access
is granted when it shouldn't be, sometimes it's denied when
it shouldn't be.

This patch makes it behave just like UFS, i.e. depend on current
file permissions.  Also, it fixes returned error codes (ENOATTR
instead of ENOENT) and makes listextattr(2) return 0 instead
of EPERM where there is no EA directory (i.e. the file never had
any EA).

Reviewed by:	pjd (idea, not actual code)
Approved by:	re (kib)
2009-07-20 19:16:42 +00:00
Rui Paulo
c104cff26e More mesh bits, namely:
* bridge support (sam)
* handling of errors (sam)
* deletion of inactive routing entries
* more debug msgs (sam)
* fixed some inconsistencies with the spec.
* decap is now specific to mesh (sam)
* print mesh seq. no. on ifconfig list mesh
* small perf. improvements

Reviewed by:	sam
Approved by:	re (kib)
2009-07-20 19:12:08 +00:00
Robert Watson
0a4747d4d0 Garbage collect vnet module registrations that have neither constructors
nor destructors, as there's no actual work to do.

In most cases, the constructors weren't needed because of the existing
protocol initialization functions run by net_init_domain() as part of
VNET_MOD_NET, or they were eliminated when support for static
initialization of virtualized globals was added.

Garbage collect dependency references to modules without constructors or
destructors, notably VNET_MOD_INET and VNET_MOD_INET6.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-07-20 13:55:33 +00:00
Rafal Jaworowski
331f685743 ARM pmap fixes.
a)  nocache-remap problem

   When a page is remapped into a non-cacheable virtual memory region there
   was no associated write-back invalidate operation performed. We remove
   writeback of the original buffer size from bus_dmamem_alloc() and add
   appropriate L1/L2 flush operation.

b) missing write-back invalidate operation

   In pmap_kremove a page is removed so we must do a write-back
   invalidate operation aligned to the page virtual address.

Submitted by:	Michal Hajduk
Reviewed by:	Mark Tinguely, rpaulo, stas
Approved by:	re (kib)
Obtained from:	Semihalf
2009-07-20 07:53:07 +00:00
Robert Watson
17ef1feb8a Add macros VNET_SETNAME and VNET_SYMPREFIX, and expose to userspace if
_WANT_VNET is defined.  This way we don't need separate definitions in
libkvm.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-07-20 07:50:50 +00:00
Scott Long
62fdee1dda Fix an apparently harmless typo.
Approved by:	re
2009-07-20 03:59:00 +00:00
Alan Cox
9861cbc6ca Change the handling of fictitious pages by pmap_page_set_memattr() on
amd64 and i386.  Essentially, fictitious pages provide a mechanism for
creating aliases for either normal or device-backed pages.  Therefore,
pmap_page_set_memattr() on a fictitious page needn't update the direct
map or flush the cache.  Such actions are the responsibility of the
"primary" instance of the page or the device driver that "owns" the
physical address.  For example, these actions are already performed by
pmap_mapdev().

The device pager needn't restore the memory attributes on a fictitious
page before releasing it.  It's now pointless.

Add pmap_page_set_memattr() to the Xen pmap.

Approved by:	re (kib)
2009-07-19 21:40:19 +00:00
Konstantin Belousov
7ac5806bfb When buffer write is failed, it is wrong for brelse() to invalidate
portion of the page that was written. Among other problems, this
page might be picked up by pagedaemon, with failed assertion in
vm_pageout_flush() about validity of the page.

Reported and tested by:	pho
Approved by:	re (kensmith)
MFC after:	3 weeks
2009-07-19 20:25:59 +00:00
Robert Watson
006e9db452 Normalize field naming for struct vnet, fix two debugging printfs that
print them.

Reviewed by:	bz
Approved by:	re (kensmith, kib)
2009-07-19 17:40:45 +00:00
Ken Smith
3ca3047aee Bump the version of all non-symbol-versioned shared libraries in
preparation for 8.0-RELEASE.  Add the previous version of those
libraries to ObsoleteFiles.inc and bump __FreeBSD_Version.

Reviewed by:    kib
Approved by:    re (rwatson)
2009-07-19 17:25:24 +00:00
Sam Leffler
030d10ee97 add urtw
Approved by:	re (kib)
2009-07-19 16:54:24 +00:00
Rick Macklem
80e556956e Fix two bugs in the experimental nfs client:
- When the root vnode was acquired during mounting, mnt_stat.f_iosize was
  still set to 0, so getnewvnode() would set bo_bsize == 0. This would
  confuse getblk(), so that it always returned the first block causing
  the problem when the root directory of the mount point was greater
  than one block in size. It was fixed by setting mnt_stat.f_iosize to
  NFS_DIRBLKSIZ before calling ncl_nget() to acquire the root vnode.
- NFSMNT_INT was being set temporarily while the initial connect to a
  server was being done. This erroneously configured the krpc for
  interruptible RPCs, which caused problems because signals weren't
  being masked off as they would have been for interruptible mounts.
  This code was deleted to fix the problem. Since mount_nfs does an
  NFS null RPC before the mount system call, connections to the server
  should work ok.

Tested by:	swell dot k at gmail dot com
Approved by:	re (kensmith), kib (mentor)
2009-07-19 16:44:26 +00:00
Robert Watson
94da184f68 Expose the definitions of 'struct vnet' and 'VNET_MAGIC_N' to userspace
if _WANT_VNET is defined.  This is required so that libkvm can locate
virtual network stack instances in order to reach their global variables
for monitoring and crashdump analysis.

Reviewed by:	bz
Approved by:	re (kib)
2009-07-19 15:21:42 +00:00
Robert Watson
5ee847d3ac Reimplement and/or implement vnet list locking by replacing a mostly
unused custom mutex/condvar-based sleep locks with two locks: an
rwlock (for non-sleeping use) and sxlock (for sleeping use).  Either
acquired for read is sufficient to stabilize the vnet list, but both
must be acquired for write to modify the list.

Replace previous no-op read locking macros, used in various places
in the stack, with actual locking to prevent race conditions.  Callers
must declare when they may perform unbounded sleeps or not when
selecting how to lock.

Refactor vnet sysinits so that the vnet list and locks are initialized
before kernel modules are linked, as the kernel linker will use them
for modules loaded by the boot loader.

Update various consumers of these KPIs based on whether they may sleep
or not.

Reviewed by:	bz
Approved by:	re (kib)
2009-07-19 14:20:53 +00:00
Sam Leffler
519f677aff Move code that does payload realigment to a new routine, ieee80211_realign,
so it can be reused.  While here rewrite the logic to always use a single mbuf.

Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-18 20:19:53 +00:00
Bruce M Simpson
b36c89e55f Fix a problem, whereby misbehaving IPv6 applications, which don't include
a valid zone ID or interface identifier in a v6 multicast leave, would
trigger a fairly paranoid KASSERT().

Observed with Boost++ regression tests on ref8.freebsd.org.

Approved by:	re (kib)
2009-07-18 17:38:18 +00:00
Ulf Lilleengen
b79cac0f92 - Fix the issue with read access count modification on RAID-5 plexes properly.
If the access counts were not increased and decreased in equal numbers by
  gvinum consumers, the read access count would be inconsistent with the write
  access count. Instead, modify the read access count with the write access
  count directly to prevent any inconsistencies.

Approved by:	re (kib)
2009-07-18 11:12:48 +00:00
Alan Cox
13de722155 An addendum to r195649, "Add support to the virtual memory system for
configuring machine-dependent memory attributes...":

Don't set the memory attribute for a "real" page that is allocated to
a device object in vm_page_alloc().  It is a pointless act, because
the device pager replaces this "real" page with a "fake" page and sets
the memory attribute on that "fake" page.

Eliminate pointless code from pmap_cache_bits() on amd64.

Employ the "Self Snoop" feature supported by some x86 processors to
avoid cache flushes in the pmap.

Approved by:	re (kib)
2009-07-18 01:50:05 +00:00
Alexander Motin
b2da774623 Fix copy-paste bug. Use regular non-polled mode for executing FLUSHCACHE
command on disk close.

Approved by:	re (implicitly)
2009-07-17 21:48:08 +00:00
Rick Macklem
874bb76647 Patch the regular nfs client in a manner analagous to
r195704 for the experimental client. The patch avoids calling vn_lock()
for the case where nfs_nget() has acquired the same vnode as dvp,
since nfs_nget() has already locked the vnode.

Reviewed by:	kib, jhb
Approved by:	re (kensmith), kib (mentor)
2009-07-17 19:38:07 +00:00
Rui Paulo
8b9fde4324 Add IEEE80211_SUPPORT_MESH, following similar change to nanobsd and
other GENERIC kernels.

Approved by:	re (kib)
2009-07-17 18:35:45 +00:00
Jamie Gritton
7afcbc18b3 Remove the interim vimage containers, struct vimage and struct procg,
and the ioctl-based interface that supported them.

Approved by:	re (kib), bz (mentor)
2009-07-17 14:48:21 +00:00
Robert Watson
597df30e62 Import OpenBSM 1.1p1 from vendor branch to 8-CURRENT, populating
contrib/openbsm and a subset also imported into sys/security/audit.
This patch release addresses several minor issues:

- Fixes to AUT_SOCKUNIX token parsing.
- IPv6 support for au_to_me(3).
- Improved robustness in the parsing of audit_control, especially long
  flags/naflags strings and whitespace in all fields.
- Add missing conversion of a number of FreeBSD/Mac OS X errnos to/from BSM
  error number space.

MFC after:	3 weeks
Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
Approved by:	re (kib)
2009-07-17 14:02:20 +00:00
Robert Watson
5d171016e7 Vendor import of OpenBSM 1.1p1, which incorporates the following changes
since the last imported OpenBSM release:

OpenBSM 1.1p1

- Fixes to AUT_SOCKUNIX token parsing.
- IPv6 support for au_to_me(3).
- Improved robustness in the parsing of audit_control, especially long
  flags/naflags strings and whitespace in all fields.
- Add missing conversion of a number of FreeBSD/Mac OS X errnos to/from BSM
  error number space.

Obtained from:  TrustedBSD Project
Sponsored by:   Apple, Inc.
2009-07-17 12:18:39 +00:00
Robert Watson
1e77c1056a Remove unused VNET_SET() and related macros; only VNET_GET() is
ever actually used.  Rename VNET_GET() to VNET() to shorten
variable references.

Discussed with:	bz, julian
Reviewed by:	bz
Approved by:	re (kensmith, kib)
2009-07-16 21:13:04 +00:00
Alexander Motin
d96aeec8ff Limit IOCATAREQUEST ioctl data size to controller's maximum I/O size.
It fixes kernel panic when requested size is too large (0xffffffff),

PR:             kern/136726
Approved by:    re (kib)
MFC after:      2 weeks
2009-07-16 19:48:39 +00:00
Ken Smith
e57e15a182 Prepare for the 8.0-BETA2 builds.
Approved by:	re (implicit)
2009-07-15 17:29:05 +00:00
Andriy Gapon
b064b6d1cd dtrace_gethrtime: improve scaling of TSC ticks to nanoseconds
Currently dtrace_gethrtime uses formula similar to the following for
converting TSC ticks to nanoseconds:
rdtsc() * 10^9 / tsc_freq
The dividend overflows 64-bit type and wraps-around every 2^64/10^9 =
18446744073 ticks which is just a few seconds on modern machines.

Now we instead use precalculated scaling factor of
10^9*2^N/tsc_freq < 2^32 and perform TSC value multiplication separately
for each 32-bit half.  This allows to avoid overflow of the dividend
described above.
The idea is taken from OpenSolaris.
This has an added feature of always scaling TSC with invariant value
regardless of TSC frequency changes. Thus the timestamps will not be
accurate if TSC actually changes, but they are always proportional to
TSC ticks and thus monotonic. This should be much better than current
formula which produces wildly different non-monotonic results on when
tsc_freq changes.

Also drop write-only 'cp' variable from amd64 dtrace_gethrtime_init()
to make it identical to the i386 twin.

PR:		kern/127441
Tested by:	Thomas Backman <serenity@exscape.org>
Reviewed by:	jhb
Discussed with:	current@, bde, gnn
Silence from:	jb
Approved by:	re (gnn)
MFC after:	1 week
2009-07-15 17:07:39 +00:00
Robert Watson
cd2b95a237 r195699 introduced an assertion regarding when progbits data in kernel
modules was present, which turns out to be false in some situations.
Back out the assertion.

Reported by:	Luiz Otavio O Souza <lists.br at gmail.com>,
		Florian Smeets <flo at kasimir.com>
Approved by:	re (kensmith) (implicit)
2009-07-15 09:19:01 +00:00
Robert Watson
c1e200ffcc Add missing license line for vnet.h, correct white space nit.
Approved by:	re (kensmith) (implicit)
2009-07-15 00:56:15 +00:00
Rick Macklem
405229f913 Fix the experimental nfs client so that it does not cause a
"share->excl" panic when doing a lookup of dotdot at the root
of a server's file system. The patch avoids calling vn_lock()
for that case, since nfscl_nget() has already acquired a lock
for the vnode.

Approved by:	re (kensmith), kib (mentor)
2009-07-14 23:10:23 +00:00
Konstantin Belousov
b35687df13 Use PBDRY flag for msleep(9) in NFS and NLM when sleeping thread owns
kernel resources that block other threads, like vnode locks. The SIGSTOP
sent to such thread (process, rather) shall not stop it until thread
releases the resources.

Tested by:	pho
Reviewed by:	jhb
Approved by:	re (kensmith)
2009-07-14 22:54:29 +00:00
Konstantin Belousov
f33a947b56 Add new msleep(9) flag PBDY that shall be specified together with
PCATCH, to indicate that thread shall not be stopped upon receipt of
SIGSTOP until it reaches the kernel->usermode boundary.

Also change thread_single(SINGLE_NO_EXIT) to only stop threads at
the user boundary unconditionally.

Tested by:	pho
Reviewed by:	jhb
Approved by:	re (kensmith)
2009-07-14 22:52:46 +00:00
Konstantin Belousov
79799053a7 Move the repeated code to calculate the number of the threads in the
process that still need to be suspended or exited from thread_single
into the new function calc_remaining().

Tested by:	pho
Reviewed by:	jhb
Approved by:	re (kensmith)
2009-07-14 22:51:31 +00:00
Konstantin Belousov
c8167830f9 When wakeup(9) is going to notify swapper, assert that wait channel is not
equal to &proc0. It shall be not, since proc0 stack is not swappable, and
kick_proc0() is wakeup(&proc0).

Reviewed by:	jhb
Approved by:	re (kensmith)
2009-07-14 22:50:41 +00:00
Robert Watson
eddfbb763d Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
John Baldwin
0fe0ed8bf8 - Change mmap() to fail requests with EINVAL that pass a length of 0. This
behavior is mandated by POSIX.
- Do not fail requests that pass a length greater than SSIZE_MAX
  (such as > 2GB on 32-bit platforms).  The 'len' parameter is actually
  an unsigned 'size_t' so negative values don't really make sense.

Submitted by:	Alexander Best  alexbestms at math.uni-muenster.de
Reviewed by:	alc
Approved by:	re (kib)
MFC after:	1 week
2009-07-14 19:45:36 +00:00
Bjoern A. Zeeb
6c19585325 Re-add opt_inet.h, as we did in r193862 and lost yet again.
Approved by:	re (kib)
2009-07-14 19:32:36 +00:00
Alexander Motin
5487ee5507 Disable MSI by default for nVidia MCP55 chipset.
It is reported to be broken in the same way as MCP51.

PR:		kern/136429
Approved by:	re (kib)
2009-07-14 19:18:31 +00:00
Ariff Abdullah
df41a638f7 - Do aggresive saturation on various polynomial interpolators.
This dramatically pushing 99.9% interpolations and quantizations
  error _below_ -180dB on 32bit dynamic range, resulting extremely
  high quality conversion.
- Use BSPLINE interpolator for filter oversampling factor greater or
  equal than 64 (log2 6).

Approved by:	re (kib)
2009-07-14 18:53:34 +00:00
Ed Maste
c25ebae365 Change xpt_scan_bus to scsi_scan_bus and xpt_scan_lun to scsi_scan_lun
in comments and printfs to match new function names after refacoring.

Approved by:	re
2009-07-14 18:44:17 +00:00
Ed Maste
399831bce6 Fix leaks in probestart, probedone, and scsi_scan_bus. Also free
page_list using the matching malloc type for the allocation.

Approved by:	re
Reviewed by:	scottl [1]
MFC after:	1 week

[1] Original patch was against xpt_cam.c, prior to the cam refactoring.
2009-07-14 17:26:37 +00:00
Lawrence Stewart
5f1ff8136a Fix a buglet that slipped into r195654. My buildworld/buildkernel sanity
check missed this because cxgb's TOM is currently commented out of the build
system.

Submitted by:	Navdeep Parhar <np at FreeBSD dot org>
Approved by:	re (kensmith), kensmith (mentor temporarily unavailable)
2009-07-14 11:53:21 +00:00
Tai-hwa Liang
3d22427cff Adding hardware ID for RTL810x PCIe found on HP Pavilion DV2-1022AX.
Reviewed by:	yongari
Approved by:	re (kib, kensmith)
2009-07-14 04:35:13 +00:00
Jung-uk Kim
e8c4d3e407 Match PCI Express root bridge _HID directly instead of
relying on _CID.

Reviewed by:	jhb
Approved by:	re (kib)
2009-07-13 21:36:31 +00:00
Alexander Motin
3ccda2f312 Fix copy-paste bug, enabling SIM PMP support, when it was not really found.
Approved by:	re (implicitly)
2009-07-13 21:21:30 +00:00
Scott Long
de985f6174 Revert the CISS driver to 64K i/o, the previous change was in error and
missing a lot of needed infrastructure.

Approved by:	re
2009-07-13 20:19:29 +00:00
Rui Paulo
aec0289d59 Fix inline function declaration and prototype.
Approved by:	re (kensmith)
2009-07-13 18:23:58 +00:00
Alan Cox
bc5c48f12f Correct an error of omission in r195649 ("Add support to the virtual memory
system for configuring machine-dependent memory attributes: ...").  In
r195649, the "vm_cache_mode_t/vm_memattr_t" parameter was removed from
vm_phys_alloc_contig().

Approved by:	re (kensmith)
2009-07-13 18:11:59 +00:00
Alexander Motin
45a30a41d2 Fix Marvel SATA controllers operation, broken by rev. 188765,
by using uninitialized variable.

Tested by:	Chris Hedley
Approved by:	re (kensmith)
2009-07-13 18:01:49 +00:00
Lawrence Stewart
91a5ebde45 Fix a race in the manipulation of the V_tcp_sack_globalholes global variable,
which is currently not protected by any type of lock. When triggered, the bug
would sometimes cause a panic when the TCP activity to an affected machine
eventually slowed during a lull. The panic only occurs if INVARIANTS is compiled
into the kernel, and has laid dormant for some time as a result of INVARIANTS
being off by default except in FreeBSD-CURRENT.

Switch to atomic operations in the locations where the variable is changed.
Reads have not been updated to be protected by atomics, so there is a
possibility of accounting errors in any given calculation where the variable is
read. This is considered unlikely to occur in the wild, and will not cause
serious harm on rare occasions where it does.

Thanks to Robert Watson for debugging help.

Reported by:	Kamigishi Rei <spambox at haruhiism dot net>
Tested by:	Kamigishi Rei <spambox at haruhiism dot net>
Reviewed by:	silby
Approved by:	re (rwatson), kensmith (mentor temporarily unavailable)
2009-07-13 11:59:38 +00:00
Lawrence Stewart
237fbe0a1c Replace struct tcpopt with a proxy toeopt struct in the TOE driver interface to
the TCP syncache. This returns struct tcpopt to being private within the TCP
implementation, thus allowing it to be modified without ABI concerns.

The patch breaks the ABI. Bump __FreeBSD_version to 800103 accordingly. The cxgb
driver is the only TOE consumer affected by this change, and needs to be
recompiled along with the kernel.

Suggested by:	rwatson
Reviewed by:	rwatson, kmacy
Approved by:	re (kensmith), kensmith (mentor temporarily unavailable)
2009-07-13 11:51:02 +00:00
Alexander Motin
f09f8e3e0b Rename ATA probe driver to "aprobe" to resolve name conflict with SCSI
and fix loading cam as module.

Approved by:	re (implicitly)
2009-07-13 06:12:21 +00:00
Alan Cox
3153e878dd Add support to the virtual memory system for configuring machine-
dependent memory attributes:

Rename vm_cache_mode_t to vm_memattr_t.  The new name reflects the
fact that there are machine-dependent memory attributes that have
nothing to do with controlling the cache's behavior.

Introduce vm_object_set_memattr() for setting the default memory
attributes that will be given to an object's pages.

Introduce and use pmap_page_{get,set}_memattr() for getting and
setting a page's machine-dependent memory attributes.  Add full
support for these functions on amd64 and i386 and stubs for them on
the other architectures.  The function pmap_page_set_memattr() is also
responsible for any other machine-dependent aspects of changing a
page's memory attributes, such as flushing the cache or updating the
direct map.  The uses include kmem_alloc_contig(), vm_page_alloc(),
and the device pager:

  kmem_alloc_contig() can now be used to allocate kernel memory with
  non-default memory attributes on amd64 and i386.

  vm_page_alloc() and the device pager will set the memory attributes
  for the real or fictitious page according to the object's default
  memory attributes.

Update the various pmap functions on amd64 and i386 that map pages to
incorporate each page's memory attributes in the mapping.

Notes: (1) Inherent to this design are safety features that prevent
the specification of inconsistent memory attributes by different
mappings on amd64 and i386.  In addition, the device pager provides a
warning when a device driver creates a fictitious page with memory
attributes that are inconsistent with the real page that the
fictitious page is an alias for. (2) Storing the machine-dependent
memory attributes for amd64 and i386 as a dedicated "int" in "struct
md_page" represents a compromise between space efficiency and the ease
of MFCing these changes to RELENG_7.

In collaboration with: jhb

Approved by:	re (kib)
2009-07-12 23:31:20 +00:00
Qing Li
05b262e264 This patch adds a host route to an interface address (that is assigned
to a non loopback/ppp link type) through the loopback interface. Prior
to the new L2/L3 rewrite, this host route was explicitly created when
processing the IPv6 address assignment. This loopback host route is
deleted when that IPv6 address is removed from the interface.

Reviewed by:	bz, gnn
Approved by:	re
2009-07-12 19:20:55 +00:00
Rick Macklem
089f366ab0 Add calls to the experimental nfs client for the case of an "intr" mount,
so that signals that aren't supposed to terminate RPCs in progress are
masked off during the RPC.

Approved by:	re (kensmith), kib (mentor)
2009-07-12 17:07:35 +00:00
Rick Macklem
ad86aef9af Fix the handling of dotdot in lookup for the experimental nfs client
in a manner analagous to the change in r195294 for the regular nfs client.

Approved by:	re (kensmith), kib (mentor)
2009-07-12 17:02:17 +00:00
Marcel Moolenaar
c5e2d1885c Isochronous transfers only have 1 frame buffer, but multiple
frame lengths. The frame buffer is at index 0.

Approved by:	re (kensmith)
Obtained from:	HPS
2009-07-12 16:50:32 +00:00
Marcel Moolenaar
58f6d27320 MFp4:
USB CORE: busdma improvement

      For single segment allocations the boundary field
      of the BUSDMA tag should be zero. Currently all
      single segment allocations are less than or equal
      to 4096 bytes, so the limit does not kick in. If
      any single segment USB allocations would be greater
      than 4K, then it would be a problem.

Approved by:	re (kensmith)
Obtained from:	HPS
2009-07-12 16:46:43 +00:00
Konstantin Belousov
529ab57b9a When VM_MAP_WIRE_HOLESOK is not specified and vm_map_wire(9) encounters
non-readable and non-executable map entry, the entry is skipped from
wiring and loop is aborted. But, since MAP_ENTRY_WIRE_SKIPPED was not
set for the map entry, its wired_count is later erronously decremented.
vm_map_delete(9) for such map entry stuck in "vmmaps".

Properly set MAP_ENTRY_WIRE_SKIPPED when aborting the loop.

Reported by:	John Marshall <john.marshall riverwillow com au>
Approved by:	re (kensmith)
2009-07-12 12:37:38 +00:00
Lawrence Stewart
962ebef8c0 Pad the following TCP related structs to allow MFCs of upcoming features/fixes
back to the 8 branch:

tcp_var.h
- struct sackhint
- struct tcpcb
- struct tcpstat

The patch breaks the ABI. Bump __FreeBSD_version to 800102 accordingly. User
space tools that rely on the size of any of these structs (e.g. sockstat) need
to be recompiled.

Reviewed by:	rpaulo, sam, andre, rwatson
Approved by:	re & mentor (gnn)
2009-07-12 09:14:28 +00:00
Marcel Moolenaar
31d22003dd Rename option USBVERBOSE to USB_VERBOSE for 2 reasons:
1.  USB_VERBOSE is more consistent with USB_DEBUG,
2.  sys/dev/usb/usb_device.c uses option USB_VERBOSE and
    not USBVERBOSE.

POLA with the USBVERBOSE option as it's found in 7-STABLE
has been considered but found insignificant in the face
of the USB stack overhaul.

Approved by:	re (kensmith)
2009-07-12 04:48:47 +00:00
Nathan Whitehorn
f136543e41 Increase the size of the page table on 64-bit PowerPC machines as a
bandaid to prevent exhaustion of the primary and secondary hash groups
in the event of extreme stress on the PMAP layer (e.g. a forkbomb). This
wastes memory, and should be revised to properly handle PTEG spills instead.

Suggested by:	grehan
Approved by:	re (kensmith)
2009-07-12 04:07:52 +00:00
Marcel Moolenaar
b54b764c72 Revert rev 192323 (nfs_common.c only):
The D-cache flushing added here was to deal with I-cache
incoherency observed on ia64. However, the problem was
in the implementation of pmap_enter_object() for ia64:
it was missing I-cache coherency logic for prefaulted
pages. After this got added in rev 195625, testing showed
that no D-cache flushing was required.

The SIGILL that was observed on Book-E (see commit log
for rev 192323) ended up not being related to I-cache
incoherency, but was found to be caused by bad memory.
This discovery further undermined the need for D-cache
flushing in the NFS I/O code, triggering the reversal.

Approved by:	re (kensmith)
2009-07-12 03:53:52 +00:00
Marcel Moolenaar
dd8a461e83 In nvpair_native_embedded_array(), meaningless pointers are zeroed.
The programmer was aware that alignment was not guaranteed in the
packed structure and used bzero() to NULL out the pointers.
However, on ia64, the compiler is quite agressive in finding ILP
and calls to bzero() are often replaced by simple assignments (i.e.
stores). Especially when the width or size in question corresponds
with a store instruction (i.e. st1, st2, st4 or st8).

The problem here is not a compiler bug. The address of the memory
to zero-out was given by '&packed->nvl_priv' and given the type of
the 'packed' pointer the compiler could assume proper alignment for
the replacement of bzero() with an 8-byte wide store to be valid.
The problem is with the programmer. The programmer knew that the
address did not have the alignment guarantees needed for a regular
assignment, but failed to inform the compiler of that fact. In
fact, the programmer told the compiler the opposite: alignment is
guaranteed.

The fix is to avoid using a pointer of type "nvlist_t *" and
instead use a "char *" pointer as the basis for calculating the
address. This tells the compiler that only 1-byte alignment can
be assumed and the compiler will either keep the bzero() call
or instead replace it with a sequence of byte-wise stores. Both
are valid.

Approved by:	re (kib)
2009-07-11 22:43:20 +00:00
Colin Percival
7d845dde8d Remove build timestamps from the following files:
/boot/kernel/hptrr.ko
/etc/mail/*.cf
/lib/libcrypto.so.5
/usr/bin/ntpq
/usr/sbin/amd
/usr/sbin/iasl
/usr/sbin/ntpd
/usr/sbin/ntpdate
/usr/sbin/ntpdc

There does not appear to be any purpose to having these timestamps, and
they have the irritating consequence that the aforementioned files will
be different every time they are rebuilt.

After this commit, the only remaining build timestamps are in the kernel,
the boot loaders, /usr/include/osreldate.h (the year in the copyright
notice), and lib*.a (the timestamps on all of the included .o files).

Reviewed by:	scottl (hptrr), gshapiro (sendmail), simon (openssl),
		roberto (ntp), jkim (acpica)
Approved by:	re (kib)
2009-07-11 22:30:37 +00:00
Marcel Moolenaar
1ed01448fb On exec(2), when loading the ELF image, pmap_enter_object() is
called to prefault pages. This is an obvious place for making
sure the I-cache is coherent. It was missing though. As such,
execution over NFS and ZFS file systems was failing. NFS was
fixed the wrong way (by flushing the D-cache as part of the
NFS code) in a previous commit. ZFS problems were encountered
after that and indicated that something else was wrong...

Approved by:	re (kib)
2009-07-11 22:27:20 +00:00
Kip Macy
6a7bff2c31 Re-factoring for adding weighted routes introduced a
fairly irritating bug where the system will panic
when RADIX_MPATH is enabled. This change fixes this.

Approved by:	re@
2009-07-11 21:56:23 +00:00
Rui Paulo
7d26189145 Fix something bogus deletion that got it during mesh commit.
Approved by:	re (implicit)
2009-07-11 16:02:06 +00:00
Rui Paulo
59aa14a91d Implementation of the upcoming Wireless Mesh standard, 802.11s, on the
net80211 wireless stack. This work is based on the March 2009 D3.0 draft
standard. This standard is expected to become final next year.
This includes two main net80211 modules, ieee80211_mesh.c
which deals with peer link management, link metric calculation,
routing table control and mesh configuration and ieee80211_hwmp.c
which deals with the actually routing process on the mesh network.
HWMP is the mandatory routing protocol on by the mesh standard, but
others, such as RA-OLSR, can be implemented.

Authentication and encryption are not implemented.

There are several scripts under tools/tools/net80211/scripts that can be
used to test different mesh network topologies and they also teach you
how to setup a mesh vap (for the impatient: ifconfig wlan0 create
wlandev ... wlanmode mesh).

A new build option is available: IEEE80211_SUPPORT_MESH and it's enabled
by default on GENERIC kernels for i386, amd64, sparc64 and pc98.

Drivers that support mesh networks right now are: ath, ral and mwl.

More information at: http://wiki.freebsd.org/WifiMesh

Please note that this work is experimental. Also, please note that
bridging a mesh vap with another network interface is not yet supported.

Many thanks to the FreeBSD Foundation for sponsoring this project and to
Sam Leffler for his support.
Also, I would like to thank Gateworks Corporation for sending me a
Cambria board which was used during the development of this project.

Reviewed by:	sam
Approved by:	re (kensmith)
Obtained from:	projects/mesh11s
2009-07-11 15:02:45 +00:00
Jung-uk Kim
35ea6959ac Get correct maxio from the controller and drop the tunable.
The default (64K) is too pessimistic for "new comm" hardware.
Also, this is bad because multiple controllers get limited by
the global tunable.

Reviewed by:	scottl
Approved by:	re (kensmith)
2009-07-11 08:10:18 +00:00
Rui Paulo
820e6a1f38 For ic_opmode switch cases, provide a default label with a printf saying
this opmode is not supported.

Approved by:	re (kib)
2009-07-10 15:28:33 +00:00
Sam Leffler
8b10ecb67c mark struct ieee80211req_maclist packed so sizeof works as intended on arm;
fixes "list mac"

Approved by:	re (kensmith)
2009-07-10 15:26:33 +00:00
Konstantin Belousov
d77e2734a1 When amd64 CPU cannot load segment descriptor during trap return to
usermode, it generates GPF, that is mirrored to user mode as SIGSEGV.
The offending register in mcontext should contain the value loading of
which generated the GPF, and it is so on i386. On amd64, we currently
report segment descriptor in tf_err, while segment register contains the
corrected value loaded by trap handler.

Fix the issue by behaving like i386, reloading segment register in trap
frame after signal frame is pushed onto user stack.

Noted and tested by:	pho
Approved by:	re (kensmith)
2009-07-10 10:29:16 +00:00
Scott Long
52c9ce25d8 Separate the parallel scsi knowledge out of the core of the XPT, and
modularize it so that new transports can be created.

Add a transport for SATA

Add a periph+protocol layer for ATA

Add a driver for AHCI-compliant hardware.

Add a maxio field to CAM so that drivers can advertise their max
I/O capability.  Modify various drivers so that they are insulated
from the value of MAXPHYS.

The new ATA/SATA code supports AHCI-compliant hardware, and will override
the classic ATA driver if it is loaded as a module at boot time or compiled
into the kernel.  The stack now support NCQ (tagged queueing) for increased
performance on modern SATA drives.  It also supports port multipliers.

ATA drives are accessed via 'ada' device nodes.  ATAPI drives are
accessed via 'cd' device nodes.  They can all be enumerated and manipulated
via camcontrol, just like SCSI drives.  SCSI commands are not translated to
their ATA equivalents; ATA native commands are used throughout the entire
stack, including camcontrol.  See the camcontrol manpage for further
details.  Testing this code may require that you update your fstab, and
possibly modify your BIOS to enable AHCI functionality, if available.

This code is very experimental at the moment.  The userland ABI/API has
changed, so applications will need to be recompiled.  It may change
further in the near future.  The 'ada' device name may also change as
more infrastructure is completed in this project.  The goal is to
eventually put all CAM busses and devices until newbus, allowing for
interesting topology and management options.

Few functional changes will be seen with existing SCSI/SAS/FC drivers,
though the userland ABI has still changed.  In the future, transports
specific modules for SAS and FC may appear in order to better support
the topologies and capabilities of these technologies.

The modularization of CAM and the addition of the ATA/SATA modules is
meant to break CAM out of the mold of being specific to SCSI, letting it
grow to be a framework for arbitrary transports and protocols.  It also
allows drivers to be written to support discrete hardware without
jeopardizing the stability of non-related hardware.  While only an AHCI
driver is provided now, a Silicon Image driver is also in the works.
Drivers for ICH1-4, ICH5-6, PIIX, classic IDE, and any other hardware
is possible and encouraged.  Help with new transports is also encouraged.

Submitted by:	scottl, mav
Approved by:	re
2009-07-10 08:18:08 +00:00
Sam Leffler
f6c09dd6a8 correctly set the tailq ptr when removing the last item in the q
Approved by:	re (kensmith)
2009-07-10 02:19:57 +00:00
Ariff Abdullah
5c663ce9cf Rearrange shift operation to increase interpolation accuracy,
further reducing conversion artifacts and better worst case SNR.

Approved by:	re (kib)
2009-07-09 22:21:18 +00:00
Navdeep Parhar
adb1423aa6 Fix cxgb(4) panic with jumbo frames.
Reviewed by:	kmacy
Approved by:	re (kib), gnn (mentor)
2009-07-09 19:27:58 +00:00
Rick Macklem
9ca27b565b Since the nfscl_getclose() function both decremented open counts and,
optionally, created a separate list of NFSv4 opens to be closed, it
was possible for the associated OpenOwner to be free'd before the Open
was closed. The problem was that the Open was taken off the OpenOwner
list before the Close RPC was done and OpenOwners can be free'd once the
list is empty. This patch separates out the case of doing the Close RPC
into a separate function called nfscl_doclose() and simplifies nfsrpc_doclose()
so that it closes a single open instead of a list of them. This avoids
removing the Open from the OpenOwner list before doing the Close RPC.

Approved by:	re (kensmith), kib (mentor)
2009-07-09 19:00:29 +00:00
Konstantin Belousov
7c6d401c75 The control terminal revocation at the session leader exit does not
correctly checks for reclaimed vnode, possibly calling VOP_REVOKE for
such vnode. If the terminal is already revoked, or devfs mount was
forcibly unmounted, the revocation of doomed ctty vnode causes panic.

Reported and tested by:	lstewart
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-09 18:54:38 +00:00
Konstantin Belousov
1d587edb3a Extend the cn_flags field of the struct componentname to 64 bits to have
more space for the flags, that is too close to be exhausted. While changing
the KBI for name(9), use unsigned int for symlinks count.

Suggested by:	rwatson
Approved by:	re (kensmith)
2009-07-09 18:49:26 +00:00
Robert Noland
87c73f89a9 Add support for Radeon HD 4770 (RV740) chips.
Approved by:	re@ (kib)
MFC after:	3 days
2009-07-09 16:39:28 +00:00
Konstantin Belousov
a2622e5dc2 Restore the segment registers and segment base MSRs for amd64 syscall
return path only when neither thread was context switched while
executing syscall code nor syscall explicitely modified LDT or MSRs.

Save segment registers in trap handlers before interrupts are enabled,
to not allow context switches to happen before registers are saved.
Use separated byte in pcb for indication of fast/full return, since
pcb_flags are not synchronized with context switches.

The change puts back syscall microbenchmark numbers that were slowed
down after commit of the support for LDT on amd64.

Reviewed by:	jeff
Tested (and tested, and tested ...) by:	pho
Approved by:	re (kensmith)
2009-07-09 09:34:11 +00:00
Pyun YongHyeon
8e95322a35 Make xl(4) build with Tx checksum offload.
PR:		kern/136409
Approved by:	re (kib)
2009-07-09 01:58:59 +00:00
Jamie Gritton
9dce97d788 Remove crcopy call from seteuid now that it calls crcopysafe.
Reviewed by:	brooks
Approved by:	re (kib), bz (mentor)
2009-07-08 21:45:48 +00:00
Edward Tomasz Napierala
51334c8257 Regen the freebsd32 parts.
Approved by:	re (kib)
2009-07-08 16:30:34 +00:00
Edward Tomasz Napierala
92f76afe3a Fix freebsd32 version of lpathconf(2).
Approved by:	re (kib)
2009-07-08 16:26:43 +00:00
Edward Tomasz Napierala
e2b881bf03 Regenerate after lpathconf(2) addition.
Approved by:	re (kib)
2009-07-08 15:25:27 +00:00
Edward Tomasz Napierala
c38898116a There is an optimization in chmod(1), that makes it not to call chmod(2)
if the new file mode is the same as it was before; however, this
optimization must be disabled for filesystems that support NFSv4 ACLs.
Chmod uses pathconf(2) to determine whether this is the case - however,
pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't.

This change adds lpathconf(3) to make it possible to solve that problem
in a clean way.

Reviewed by:	rwatson (earlier version)
Approved by:	re (kib)
2009-07-08 15:23:18 +00:00
Ed Schouten
6b53d5c0e7 Fix regressions in return events of poll() on TTYs.
As pointed out, POLLHUP should be generated, even if it hasn't been
specified on input. It is also not allowed to return both POLLOUT and
POLLHUP at the same time.

Reported by:	jilles
Approved by:	re (kib)
2009-07-08 10:21:52 +00:00
Alexander Motin
d498a2e62b Fix kernel panic, when ataahci driver is used on system with increased
MAXPHYS. Current ataahci driver memory allocation scheme includes only
64 items in DMA S/G table, and so not guarantied to support transactions
with more then 252K data.

Approved by:    re (kensmith)
MFC after:      2 weeks
2009-07-08 06:00:21 +00:00
Marcel Moolenaar
f43b57e32a Revert revisions 188839 and 188868. Use of the ioctl in geom_dev.c
is invalid because the ioctl happens without prior open. The ioctl
got introduced to provide backward compatibility for extended
partitions, but it ended up not being used because it didn't work
as expected. Since there are no consumers of the ioctl and the
implementation is broken, the best fix is to remove the code
entirely.

Spotted by:	phk
Approved by:	re (kensmith)
2009-07-08 05:56:14 +00:00
Mike Silbersack
347e22a68b Increase HZ_VM from 10 to 100. While 10 hz saves cpu time
under VM environments, it's too slow for FreeBSD to work
properly.  For example, ping at 10hz pings about every 600ms
instead of about every second.

Approved by:	re (kib)
2009-07-08 01:09:12 +00:00
Sam Leffler
6552698302 Fix ar5416 and later parts on big-endian platforms: setup the h/w byte
swizzler using the same technique used everywhere else.

Approved by:	re (kib)
2009-07-07 18:11:05 +00:00
Konstantin Belousov
7f5dff5064 Fix poll(2) and select(2) for named pipes to return "ready for read"
when all writers, observed by reader, exited. Use writer generation
counter for fifo, and store the snapshot of the fifo generation in the
f_seqcount field of struct file, that is otherwise unused for fifos.
Set FreeBSD-undocumented POLLINIGNEOF flag only when file f_seqcount is
equal to fifo' fi_wgen, and revert r89376.

Fix POLLINIGNEOF for sockets and pipes, and return POLLHUP for them.
Note that the patch does not fix not returning POLLHUP for fifos.

PR:	kern/94772
Submitted by:	bde (original version)
Reviewed by:	rwatson, jilles
Approved by:	re (kensmith)
MFC after:	6 weeks (might be)
2009-07-07 09:43:44 +00:00
Ken Smith
55b57bd274 Bump for BETA1.
Approved by:	re (implicit)
2009-07-07 00:02:26 +00:00
Sam Leffler
69ad6b3450 Fix AR5416 and later parts when building with AH_DEBUG or similar defined:
always define OS_REG_UNSWAPPED and use it in ath_hal_reg_{read,write}.

Approved by:	re (kib)
2009-07-06 20:51:54 +00:00
Alan Cox
133898afd5 When pmap_change_attr() changes the PAT setting on a kernel mapping, it has
to simultaneously change the PAT setting for the same pages within the
direct map region.  This may require the demotion of a 2MB page mapping and
the allocation of a page table page.  This revision gives the highest
possible priority (VM_ALLOC_INTERRUPT) to this page allocation, so that
pmap_change_attr() is less likely to fail.  (In general, kernel page table
page allocations have the highest priority, so this is not creating a new
precedent.)

(Demotion of 1GB page mappings within the direct map already specifies
VM_ALLOC_INTERRUPT to vm_page_alloc(), so only pmap_demote_pde() must be
changed.)

Approved by:	re (kib)
2009-07-06 18:43:42 +00:00
John Baldwin
0d0a6650d7 After the per-CPU IDT changes, the IDT vector of an interrupt could change
when the interrupt was moved from one CPU to another.  If the interrupt was
enabled, then the old IDT vector needs to be disabled and the new IDT vector
needs to be enabled.  This was mostly masked prior to the recent MSI changes
since in the older code almost all allocated IDT vectors were already enabled
and the enabled vectors on the BSP during boot covered enough of the IDT
range.  However, after the MSI changes, MSI interrupts that were allocated
but not enabled (e.g. DRM with MSI) during boot could result in an allocated
IDT vector that wasn't enabled.  The round-robin at the end of boot could
place another interrupt at the same IDT vector without enabling the IDT
vector causing trap 30 faults.

Fix this by explicitly disabling/enabling the old and new IDT vectors for
enabled interrupt sources when moving an interrupt between CPUs via the
pic_assign_cpu() method.  While here, fix a bug in my earlier changes so
that an I/O APIC interrupt pin is left unchanged if ioapic_assign_cpu()
fails to allocate a new IDT vector and returns ENOSPC.

Approved by:	re (kensmith)
2009-07-06 18:23:00 +00:00
John Baldwin
f7d7cd0c76 MFi386: Add a 'show idt' command to DDB to display the non-default function
pointers in the interrupt descriptor table.

Approved by:	re (kensmith)
2009-07-06 18:10:27 +00:00
Jack F Vogel
38537cb9d3 The new method of reading the mac address from the
RAR(0) register does not work on this old adapter,
provide a local routine that does it the older way.

Approved by:  re
2009-07-06 17:23:48 +00:00
Alan Cox
0e18ab26d0 PAE adds another level to the i386 page table. This level is a small
4-entry table that must be located within the first 4GB of RAM.  This
requirement is met by defining an UMA zone with a custom back-end
allocator function.  This revision makes two changes to this back-end
allocator function: (1) It replaces the use of contigmalloc() with the
use of kmem_alloc_contig().  This eliminates "double accounting", i.e.,
accounting by both the UMA zone and malloc tags.  (I made the same
change for the same reason to the zones supporting jumbo frames a week
ago.) (2) It passes through the "wait" parameter, i.e., M_WAITOK,
M_ZERO, etc. to kmem_alloc_contig() rather than ignoring it.
pmap_init() calls uma_zalloc() with both M_WAITOK and M_ZERO.  At the
moment, this is harmless only because the default behavior of
contigmalloc()/kmem_alloc_contig() is to wait and because pmap_init()
doesn't really depend on the memory being zeroed.

The back-end allocator function in the Xen pmap is dead code.  I am
changing it nonetheless because I don't want to leave any "bad examples"
in the source tree for someone to copy at a later date.

Approved by:	re (kib)
2009-07-05 21:40:21 +00:00
Sam Leffler
24d3677b9d catchup with action+ageq additions
Submitted by:	"Paul B. Mahol" <onemda@gmail.com>
Approved by:	re (implicit)
2009-07-05 21:19:10 +00:00
Sam Leffler
2affcf8d1d add missing bit of r195379
Approved by:	re (kensmith)
2009-07-05 20:44:50 +00:00
Sam Leffler
5b16c28c42 Add ieee80211_ageq; a facility for staging packets that require
long-term work before they can be serviced.  Packets are tagged and
assigned an age (in seconds) at the point they are added to the
queue.  If a packet is not retrieved before it's age expires it is
reclaimed.  Tagging can take two forms: a reference to an ieee80211_node
(as happens in the tx path) or an opaque token in cases where there
is no reference or the node structure is not stable (i.e. it's going
to be destroyed).

o add ic_stageq to replace the per-node wds staging queue used for
  dynamic wds
o add ieee80211_mac_hash for building ageq tokens; this computes a
  32-bit hash from an 802.11 mac address (copied from the bridge)
o while here fix a stray ';' noticed in IEEE80211_PSQ_INIT

Reviewed by:	rpaulo
Approved by:	re (kensmith)
2009-07-05 18:17:37 +00:00
Ariff Abdullah
96831ae59b - Increase dynamic range of filter coefficients from 28bit to 30bit.
This cause dramatic effect in overall precision and conversion quality
  by pushing down most aliasing artifacts around -180 dB.

  Spectrogram analysis/comparison:

  	http://people.freebsd.org/~ariff/z_comparison/z_28vs30/

- Guard against possible 64bit overflow during accumulation process by
  slightly normalize and saturate sample and coefficient multiplication,
  possible during extreme 32bit downsampling (eg. 380KHz -> 8KHz) with
  custom preset that require more than ~7000 taps filter (which is
  overkill).

- Add knobs through FEEDER_RATE_PRESETS to set dynamic range of filter
  coefficients/accumulator and prefered polynomial interpolator:

  	COEFFICIENT_BIT:X
	(where 1 <= X <= 30, default: 30)

	ACCUMULATOR_BIT:X
	(where 32 <= X <=64, default: 58)

	INTERPOLATOR:I
	(where I = ZOH, LINEAR, QUADRATIC, HERMITE, BSPLINE,
 	           OPT32X, OPT16X, OPT8X, OPT4X, OPT2X)

Approved by:	re (kib)
2009-07-05 18:15:06 +00:00
Sam Leffler
7634012302 Revamp 802.11 action frame handling:
o add a new facility for components to register send+recv handlers
o ieee80211_send_action and ieee80211_recv_action now use the registered
  handlers to dispatch operations
o rev ieee80211_send_action api to enable passing arbitrary data
o rev ieee80211_recv_action api to pass the 802.11 frame header as it may
  be difficult to locate
o update existing IEEE80211_ACTION_CAT_BA and IEEE80211_ACTION_CAT_HT handling
o update mwl for api rev

Reviewed by:	rpaulo
Approved by:	re (kensmith)
2009-07-05 17:59:19 +00:00
Sam Leffler
8c393fd1f0 Cleanup ALIGNED_POINTER:
o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v)
o define as "1" on amd64 and i386 where there is no restriction
o make the type returned consistent with ALIGN
o remove _ALIGNED_POINTER
o make associated comments consistent

Reviewed by:	bde, imp, marcel
Approved by:	re (kensmith)
2009-07-05 17:45:48 +00:00
Edward Tomasz Napierala
b16af7f185 When the kernel is configured without "options FFS", build UFS as a module
without requiring any special build flags.

Approved by:	re (kib)
2009-07-05 15:25:02 +00:00
Alexander Motin
6ae5218789 Mark atanvidia depending on ataahci since rev.188846.
Approved by:	re (kib)
2009-07-05 14:50:45 +00:00
Ivan Voras
cd2dd44393 Add missing reference to GPT support.
Submitted by:	Paul B. Mahol onemda at gmail.com
Approved by:	re (kib)
2009-07-05 14:03:41 +00:00
Konstantin Belousov
121fd46175 When forking a vm space that has wired map entries, do not forget to
charge the objects created by vm_fault_copy_entry. The object charge
was set, but reserve not incremented.

Reported by:	Greg Rivers <gcr+freebsd-current tharned org>
Reviewed by:	alc (previous version)
Approved by:	re (kensmith)
2009-07-03 22:17:37 +00:00
Rui Paulo
9e760f2578 acpi_hp.c:
- sysctl dev.acpi_hp.0.verbose to toggle debug output
- A modification so this can deal with different array lengths
  when reading the CMI BIOS - now it works ok on HP Compaq nx7300
  as well.
- Change behaviour to query only max_instance-1 CMI BIOS instances,
  because all HPs seen so far are broken in that respect
  (or there is a fundamental misunderstanding on my side, possible
  as well). This way a disturbing ACPI Error Field exceeds Buffer
  message is avoided.
- New bit to set on dev.acpi_hp.0.cmi_detail (0x8) to
  also query the highest guid instance of CMI bios

acpi_hp.4:
- Document dev.acpi_hp.0.verbose sysctl in man page
- Document new bit for dev.acpi_hp.0.cmi_detail
- Add a section to manpage about hardware that has been reported
  to work ok

Submitted by:	Michael Gmelin <freebsdusb at bindone.de>
Approved by:	re (kib)
MFC after:	2 weeks
2009-07-03 21:12:37 +00:00
Edward Tomasz Napierala
340263c992 Fix fpathconf(3) on fifos, in effect making ls(1) properly
display '+' on them.  Taken from kern/125613, with cosmetic
changes.

PR:		kern/125613
Submitted by:	Jaakko Heinonen <jh at saunalahti dot fi>
Approved by:	re (kib)
2009-07-02 20:05:21 +00:00
Ed Schouten
89fe4c0a2b Enable POSIX semaphores on all non-embedded architectures by default.
More applications (including Firefox) seem to depend on this nowadays,
so not having this enabled by default is a bad idea.

Proposed by:	miwi
Patch by:	Florian Smeets <flo kasimir com>
Approved by:	re (kib)
2009-07-02 18:24:37 +00:00
Konstantin Belousov
f1eccd05ec In vn_vget_ino() and their inline equivalents, mnt_ref() the mount point
around the sequence that drop vnode lock and then busies the mount point.
Not having vlocked node or direct reference to the mp allows for the
forced unmount to proceed, making mp unmounted or reused.

Tested by:	pho
Reviewed by:	jeff
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-02 18:02:55 +00:00
Robert Watson
6196f898bb Create audit records for AUE_POSIX_OPENPT, currently w/o arguments.
Approved by:	re (audit argument blanket)
2009-07-02 16:33:38 +00:00
Jamie Gritton
f0899a3460 Call prison_check from vfs_suser rather than re-implementing it.
Approved by:	re (kib), bz (mentor)
2009-07-02 14:19:33 +00:00
Ariff Abdullah
0162436974 Slightly increase amount of bandwidth of resampling filter for
feeder_rate_quality=3. This have the benefit of reducing aliasing
artifacts due to alias masking.

Spectrogram analysis:

 o Old preset (100:36:0.90)
	http://people.freebsd.org/~ariff/z_comparison/z_q3_old.png

 o New preset (100:36:0.92):
	http://people.freebsd.org/~ariff/z_comparison/z_q3_new.png

Approved by:	re (kib)
2009-07-02 10:02:10 +00:00
Robert Watson
deedc899fd Fix comment misthink.
Submitted by:	b. f. <bf1783 at googlemail.com>
Approved by:	re (audit argument blanket)
MFC after:	1 week
2009-07-02 09:50:13 +00:00
Robert Watson
64c9a4d9ab Audit file descriptor and command arguments to ioctl(2).
Approved by:	re (audit argument blanket)
MFC after:	1 week
2009-07-02 09:16:25 +00:00
Robert Watson
2a5658382a Clean up a number of aspects of token generation from audit arguments to
system calls:

- Centralize generation of argument tokens for VM addresses in a macro,
  ADDR_TOKEN(), and properly encode 64-bit addresses in 64-bit arguments.
- Fix up argument numbers across a large number of syscalls so that they
  match the numeric argument into the system call.
- Don't audit the address argument to ioctl(2) or ptrace(2), but do keep
  generating tokens for mmap(2), minherit(2), since they relate to passing
  object access across execve(2).

Approved by:	re (audit argument blanket)
Obtained from:	TrustedBSD Project
MFC after:	1 week
2009-07-02 09:15:30 +00:00
Xin LI
c4f739ec0c Use MPT_MAX_LUNS as maximium number of LUNs, not 7, for SAS and FC cases.
This matches Linux driver behavior.

Discussed with:	scottl
Approved by:	re (kensmith)
MFC after:	1 month
2009-07-02 00:43:10 +00:00
Xin LI
1635f0499e Change explicit maximium numbers to the defined macro MPT_MAX_LUNS.
Approved by:	re (kensmith)
2009-07-02 00:41:37 +00:00
Robert Watson
03f7b00438 For access(2) and eaccess(2), audit the requested access mode.
Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 22:47:45 +00:00
Edward Tomasz Napierala
4bc61fd4ec Don't panic on attempt to set ACL on a block device file.
This is just a part of kern/125613.

PR:		kern/125613
Submitted by:	Jaakko Heinonen <jh at saunalahti dot fi>
Reviewed by:	rwatson
Approved by:	re (kib)
2009-07-01 22:30:36 +00:00
Jeff Roberson
2141453eb4 - Use fd_lastfile + 1 as the upper bound on nd. This is more correct than
using the size of the descriptor array.
 - A lock is not needed to fetch fd_lastfile.  The results are stale the
   instant it is dropped.
 - Use a private mutex pool for select since the pool mutex is not used
   as a leaf.
 - Fetch the si_mtx pointer first before resorting to hashing to compute
   the mutex address.

Reviewed by:	McKusick
Approved by:	re (kib)
2009-07-01 20:43:46 +00:00
Edward Tomasz Napierala
8edfe76ab5 Fix a panic which (reportedly) can happen when unmounting a filesystem
with I/O requests in flight on kernels compiled with "options INVARIANTS".
Also, make it obvious it's not right to call g_valid_obj() (and macros
using it, e.g. G_VALID_CONSUMER()) without topology lock held.

Approved by:	re (kib)
Reported by:	pho
2009-07-01 20:16:29 +00:00
Rafal Jaworowski
f981547c99 Map DPCPU pages into ARM kernel VA space.
DPCPU area was not properly mapped into kernel VA space, which caused page
fault on the first DPCPU access. This patch fixes the problem by mapping DPCPU
area into kernel VA space.

Submitted by:	Michal Hajduk, Piotr Ziecik
Reviewed by:	cognet, stas
Approved by:	re (kib)
Obtained from:	Semihalf
2009-07-01 20:07:44 +00:00
Robert Watson
15ca46f69d Audit file descriptor numbers for various socket-related system calls.
Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 19:55:11 +00:00
Robert Watson
9e4c1521d5 Define missing audit argument macro AUDIT_ARG_SOCKET(), and
capture the domain, type, and protocol arguments to socket(2)
and socketpair(2).

Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 18:54:49 +00:00
John Baldwin
cebc7fb16c Improve the handling of cpuset with interrupts.
- For x86, change the interrupt source method to assign an interrupt source
  to a specific CPU to return an error value instead of void, thus allowing
  it to fail.
- If moving an interrupt to a CPU fails due to a lack of IDT vectors in the
  destination CPU, fail the request with ENOSPC rather than panicing.
- For MSI interrupts on x86 (but not MSI-X), only allow cpuset to be used
  on the first interrupt in a group.  Moving the first interrupt in a group
  moves the entire group.
- Use the icu_lock to protect intr_next_cpu() on x86 instead of the
  intr_table_lock to fix a LOR introduced in the last set of MSI changes.
- Add a new privilege PRIV_SCHED_CPUSET_INTR for using cpuset with
  interrupts.  Previously, binding an interrupt to a CPU only performed a
  privilege check if the interrupt had an interrupt thread.  Interrupts
  without a thread could be bound by non-root users as a result.
- If an interrupt event's assign_cpu method fails, then restore the original
  cpuset mask for the associated interrupt thread.

Approved by:	re (kib)
2009-07-01 17:20:07 +00:00
Robert Watson
6d5a61563a When auditing unmount(2), capture FSID arguments as regular text strings
rather than as paths, which would lead to them being treated as relative
pathnames and hence confusingly converted into absolute pathnames.

Capture flags to unmount(2) via an argument token.

Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 16:56:56 +00:00
Rick Macklem
a4c5a1c315 When unmounting an NFS mount using sec=krb5[ip], the umount system
call could get hung sleeping on "gsssta" if the credentials for a user
that had been accessing the mount point have expired. This happened
because rpc_gss_destroy_context() would end up calling itself when the
"destroy context" RPC was attempted, trying to refresh the credentials.
This patch just checks for this case in rpc_gss_refresh() and returns
without attempting the refresh, which avoids the recursive call to
rpc_gss_destroy_context() and the subsequent hang.

Reviewed by:	dfr
Approved by:	re (Ken Smith), kib (mentor)
2009-07-01 16:42:03 +00:00
Rick Macklem
b766fabd9c Make sure that cr_error is set to ESHUTDOWN when closing the connection.
This is normally done by a loop in clnt_dg_close(), but requests that aren't
in the pending queue at the time of closing, don't get set. This avoids a
panic in xdrmbuf_create() when it is called with a NULL cr_mrep if
cr_error doesn't get set to ESHUTDOWN while closing.

Reviewed by:	dfr
Approved by:	re (Ken Smith), kib (mentor)
2009-07-01 16:38:18 +00:00
Jack F Vogel
6bdcc991ae Multiqueue RX is not correctly enabled on the new 82599
adapter, the SRRCTL register needs to be setup per queue.

Approved by: re
2009-07-01 16:13:01 +00:00
Robert Watson
422d786676 Audit the file descriptor number passed to lseek(2).
Approved by:	re (kib)
MFC after:	3 days
2009-07-01 15:37:23 +00:00
Robert Watson
c5957d6bba Fix link(2) auditing: use the second audit record path for the new object
name.

Approved by:	re (kib)
MFC after:	3 days
2009-07-01 13:22:08 +00:00
Robert Watson
2ef24dde7c udit the 'options' argument to wait4(2).
Approved by:	re (kib)
MFC after:	3 days
2009-07-01 12:36:10 +00:00
Alexander Motin
505feb8f37 Fix infinite loop in ng_iface, that happens when packet passes out via
two different ng interfaces sequentially due to tunnelling.

PR:		kern/134557
Submitted by:	Mikolaj Golub
Approved by:	re (kensmith)
MFC after:	3 days
2009-07-01 08:08:56 +00:00
Doug Rabson
259d14ed88 Don't include rpcv2.h - it has been removed.
Submitted by: ed@
Approved by: re
2009-07-01 07:34:28 +00:00
Alan Cox
9785747f87 Remove a stale comment. The very same revision (r85511) that introduced
this comment also implemented the proposed change to the code.

Approved by:	re (kib)
2009-06-30 19:39:17 +00:00
Doug Rabson
98c497255b Adjust the internal NFS KPI to avoid the last traces of NFS_LEGACYRPC.
Approved by: re
2009-06-30 19:10:17 +00:00
Doug Rabson
b49a2b39fd Remove the old kernel RPC implementation and the NFS_LEGACYRPC option.
Approved by: re
2009-06-30 19:03:27 +00:00
Edward Tomasz Napierala
fb231f3627 Make gjournal work with kernel compiled with "options DIAGNOSTIC".
Previously, it would panic immediately.

Reviewed by:	pjd
Approved by:	re (kib)
2009-06-30 14:34:06 +00:00
Ed Maste
2dafac3976 Add FIONSPACE from NetBSD. FIONSPACE is provided so that programs may
easily determine how much space is left in the send queue; they do not
need to know the send queue size.

NetBSD revisions:
  sys_socket.c r1.41, 1.42
  filio.h r1.9

Obtained from:	NetBSD
Approved by:	re (kensmith)
2009-06-30 13:38:49 +00:00
Stanislav Sedov
b2d758545b - Add support to atomically set/clear individual bits of a MSR register
via cpuctl(4) driver.  Two new CPUCTL_MSRSBIT and CPUCTL_MSRCBIT ioctl(2)
  calls treat the data field of the argument struct passed as a mask
  and set/clear bits of the MSR register according to the mask value.
- Allow user to perform atomic bitwise AND and OR operaions on MSR registers
  via cpucontrol(8) utility.  Two new operations ("&=" and "|=") have been
  added.  The first one applies bitwise AND operaion between the current
  contents of the MSR register and the mask, and the second performs bitwise
  OR.  The argument can be optionally prefixed with "~" inversion operator.
  This allows one to mimic the "clear bit" behavior by using the command
  like this:
      cpucontrol -m 0x10&=~0x02		# clear the second bit of TSC MSR

  Inversion operator support in all modes (assignment, OR, AND).

Approved by:	re (kib)
MFC after:	1 month
2009-06-30 12:35:47 +00:00
Andriy Gapon
462fab84b8 remove unused/unneeded extern declarations
This should result in no changes to compiled code.

Reviewed by:	alc
Approved by:	re (kib)
MFC after:	1 day
2009-06-30 11:16:32 +00:00
Konstantin Belousov
cfba50c070 For SU mounts, softdep_fsync() might drop vnode lock, allowing other
threads to put dirty buffers on the vnode bufobj list. For regular files
and synchronous fsync requests, check for the condition and restart the
fsync vop if a new dirty buffer arrived.

Tested by:	pho
Approved by:	re (kensmith)
MFC after:	1 month
2009-06-30 10:07:33 +00:00
Konstantin Belousov
a50d1b2a66 Softdep_fsync() may need to lock parent directory of the synced vnode.
Use inlined (due to FFSV_FORCEINSMQ) version of vn_vget_ino() to prevent
mountpoint from being unmounted and freed while no vnodes are locked.

Tested by:	pho
Approved by:	re (kensmith)
MFC after:	1 month
2009-06-30 10:07:00 +00:00
Rui Paulo
0f73b657a9 acpi_wmi_if:
- Document different semantics for ACPI_WMI_PROVIDES_GUID_STRING_METHOD

acpi_wmi.c:
- Modify acpi_wmi_provides_guid_string_method to return absolut number of
  instances known for the given GUID.

acpi_hp.c:
- sysctl dev.acpi_hp.0.verbose to toggle debug output
- A modification so this can deal with different array lengths
  when reading the CMI BIOS - now it works ok on HP Compaq nx7300
  as well.
- Change behaviour to query only max_instance-1 CMI BIOS instances,
  because all HPs seen so far are broken in that respect
  (or there is a fundamental misunderstanding on my side, possible
  as well). This way a disturbing ACPI Error Field exceeds Buffer
  message is avoided.
- New bit to set on dev.acpi_hp.0.cmi_detail (0x8) to
  also query the highest guid instance of CMI bios

acpi_hp.4:
- Document dev.acpi_hp.0.verbose sysctl in man page
- Document new bit for dev.acpi_hp.0.cmi_detail
- Add a section to manpage about hardware that has been reported
  to work ok

Submitted by:	Michael Gmelin, freebsdusb at bindone.de
Approved by:	re (kib)
MFC after:	2 weeks
2009-06-30 09:51:41 +00:00
Bjoern A. Zeeb
ba3b25b35a In case we cannot queue a packet reaching the queue limit, retain the
semantics netisr_queue() always had and free the mbuf along with
returning the error.

Reviewed by:	rwatson
Approved by:	re (kensmith)
2009-06-30 05:21:00 +00:00
John Baldwin
5ed6940d13 Fix build with NFS_LEGACYRPC enabled after the socket upcall locking
changes.

Approved by:	re (kensmith)
2009-06-30 03:18:51 +00:00
Stacey Son
86120afae4 Dynamically allocate the gidset field in audit record.
This fixes a problem created by the recent change that allows a large
number of groups per user.  The gidset field in struct kaudit_record
is now dynamically allocated to the size needed rather than statically
(using NGROUPS).

Approved by:	re@ (kensmith, rwatson), gnn (mentor)
2009-06-29 20:19:19 +00:00
Brooks Davis
6cb7f168db Remove support for the /dev/net/* per-interface devices. They serve
little purpose and are unused in the base system.

The IOCTL functionality is entirely duplicated and routing sockets
provide a richer interface than the kqueue functionality.

Further, it is not practical for these devices to be made sensible in
the face of VIMAGE.

Bump __FreeBSD_version on the off chance that there is any code out
there that actually uses this stuff.

Reviewed by:	rwatson
Discussed with:	bz, zec
Approved by:	re@ (kensmith)
2009-06-29 19:46:29 +00:00
Sam Leffler
7850fa71f5 Update to 3.6.2.2 firmware (latest w/o host-based power save support):
o new tx ack queue (not used right now)
o proxy-sta related changes (no proxy sta in driver)
o explicit dwds ena/dis (needed only with proxy sta)
o cleanup BA policy handling
o new ampdu aggressive mode support
o CFEnd use now controllable

Approved by:	re (kensmith)
2009-06-29 18:42:54 +00:00
Jack F Vogel
562a924de0 Type problem when FreeBSD is in a virtualized environment, the
result was when the RX index wrapped it was converted into some
sort of gibberish and written into the RDT register, effectively
killing the RX side of the thing :)

Approved by: re
2009-06-29 18:17:10 +00:00
Konstantin Belousov
a73034ef7f Free struct ucreds allocated in vfs_hang_addrlist() when deleting
the export element.
While there, remove register storage-class specifiers.

Reported and tested by:	pho
Reviewed by:	kan
Approved by:	re (kensmith)
2009-06-29 18:09:07 +00:00
Warner Losh
ca72c49f42 Fix copyrights to reflect the origin of these files.
Approved by:	re@ (rwatson)
2009-06-29 16:45:50 +00:00
Attilio Rao
5257ff7ee1 Don't assume a default (currently 15) value for preloaded klds when
loading hwpmc, but calculate at runtime and allocate the necessary space.
Also the current logic is wrong as it can lead to an endless loop.

Sponsored by:   Sandvine Incorporated
Reported by:    Ryan Stone <rstone at sandvine dot com>
Tested by:      Giovanni Trematerra
                <giovanni dot trematerra at gmail dot com>
Approved by:	re (kib)
2009-06-29 16:03:18 +00:00
Robert Watson
5f06a81ae9 Fix "options VIMAGE_GLOBALS" build following introduction of
in6_ifaddrhead.

Approved by:	re (kib)
2009-06-29 15:23:50 +00:00
Pyun YongHyeon
ca406a44de Disable Rx checksum offload until I find more clue why it breaks
under certain environments. However give users chance to override
it when he/she surely knows his/her hardware works with Rx checksum
offload.

Reported by:	Ulrich Spoerlein ( uqs <> spoerlein dot net )
MFC after:	1 week
Approved by:	re (kensmith)
2009-06-29 05:12:21 +00:00
Marius Strobl
49c8326a79 - Work around the broken loader behavior of not demapping no longer
used kernel TLB slots when unloading the kernel or modules, which
  results in havoc when loading a kernel and modules which take up
  less TLB slots afterwards as the unused but locked ones aren't
  accounted for in virtual_avail. Eventually this should be fixed
  in the loader which isn't straight forward though and the kernel
  should be robust against this anyway. [1]
- Ensure that the addresses allocated directly from phys_avail[] by
  pmap_bootstrap_alloc() are always colored properly. This implicit
  assumption was broken in r194784 as unlike the other consumers the
  DPCPU area allocated for the BSP isn't a multiple of PAGE_SIZE *
  DCACHE_COLORS. [2]
- Remove the no longer used global msgbuf_phys.
- Remove the redundant ekva parameter of pmap_bootstrap_alloc().
- Correct some outdated function names in ktr(9) invocations.

Requested by:	jhb [1]
Reported by:	gavin [2]
Approved by:	re (kib)
MFC after:	2 weeks
2009-06-28 22:42:51 +00:00
Stanislav Sedov
fe1d3f15f6 - Turn the third (islocked) argument of the knote call into flags parameter.
Introduce the new flag KNF_NOKQLOCK to allow event callers to be called
  without KQ_LOCK mtx held.
- Modify VFS knote calls to always use KNF_NOKQLOCK flag.  This is required
  for ZFS as its getattr implementation may sleep.

Approved by:	re (rwatson)
Reviewed by:	kib
MFC after:	2 weeks
2009-06-28 21:49:43 +00:00
Ed Schouten
f551adb0a9 Don't pick up Giant inside ucom(4).
Giant was only used here to lock down a bit mask of allocated unit
numbers. Change the code to use its own mutex.

Reviewed by:	hselasky
Approved by:	re (kib)
2009-06-28 20:52:11 +00:00
Ed Schouten
f9bb1cf010 Add FIONWRITE support to TTYs.
TTYs already supported TIOCOUTQ, but FIONWRITE seems to be a more
generic name for this.

Approved by:	re (kib)
2009-06-28 12:02:15 +00:00
Poul-Henning Kamp
f5d7126306 Revert a local change that should not have been in the last commit.
Approved by:	re (kib)
2009-06-28 11:32:52 +00:00
Poul-Henning Kamp
bb520069ca There are a number of ways an application can check if there are
inbound data waiting on a filedescriptor, such as a pipe or a socket,
for instance by using select(2), poll(2), kqueue(2), ioctl(FIONREAD)
etc.

But we have no way of finding out if written data have yet to be
disposed of, for instance, transmitted (and ack'ed!) to some remote
host, or read by the applicantion at the far end of the pipe.

The closest we get, is calling shutdown(2) on a TCP socket in
non-blocking mode, but this has the undesirable sideeffect of
preventing future communication.

Add a complement to FIONREAD, called FIONWRITE, which returns the
number of bytes not yet properly disposed of.  Implement it for
all sockets.

Background:

A HTTP server will want to time out connections, if no new request
arrives within a certain period after the last transmitted response
has actually been sent (and ack'ed).

For a busy HTTP server, this timeout can be subsecond duration.

In order to signal to a load-balancer that the connection is truly
dead, TCP_RST will be the preferred method, as this avoids the need
for a RTT delay for FIN handshaking, with a client which, surprisingly
often, no longer at the remote IP number.

If a slow, distant client is being served a response which is big
enough to fill the window, but small enough to fit in the socket
buffer, the write(2) call will return immediately.

If the session timeout is armed at that time, all bytes in the
response may not have been transmitted by the time it fires.

FIONWRITE allows the timeout to check that no data is outstanding
on the connection, before it TCP_RST's it.

Input & Idea from: rwatson
Approved by:	re (kib)
2009-06-28 11:28:14 +00:00
Poul-Henning Kamp
c92033c874 Add ids of Sitecom USB wlan gadget.
Approved by:	re (kib)
2009-06-28 10:30:53 +00:00
Konstantin Belousov
9b4d473a6e Eliminiate code duplication by calling vm_object_destroy()
from vm_object_collapse().

Requested and reviewed by:	alc
Approved by:	re (kensmith)
2009-06-28 08:42:17 +00:00
Alan Cox
de0c3e0895 Correct a long-standing performance bug in cluster_rbuild(). Specifically,
in the case of a file system with a block size that is less than the page
size, cluster_rbuild() looks at too many of the page's valid bits.
Consequently, it may terminate prematurely, resulting in poor performance.

Reported by:	bde
Reviewed by:	tegge
Approved by:	re (kib)
2009-06-27 21:37:36 +00:00
Andrew Thompson
29bd7d7e9a Sync to p4
- Add support for devices that handle set and clear stall in hardware.
 - Add missing get timestamp function
 - Add more xfer flags

Submitted by:	Hans Petter Selasky
Approved by:	re (kib)
2009-06-27 21:23:30 +00:00
Andrew Thompson
7e6e6b6766 Use the correct mutex in umidi_open()
Submitted by:	Hans Petter Selasky
Approved by:	re (kib)
2009-06-27 21:21:11 +00:00
Sam Leffler
3c3e9d336d Add HAL_RX_FILTER_BSSID support (to disable bssid match):
o add HAL_CAP_BSSIDMATCH to identify parts that have the support for
  disabling bssid match
o honor capability for set/get rx filter
o use HAL_CAP_BSSIDMATCH in driver to decide whether to use the bssid
  match disable or fall back to promisc mode

Reviewed by:	rpaulo
Approved by:	re (rwatson)
2009-06-27 20:06:56 +00:00
Robert Watson
ad8dacbb91 Catch missed AUDIT_ARG() -> AUDIT_ARG_CMD() on amd64.
Submitted by:	Florian Smeets <flo at kasimir.com>
Approved by:	re (kib) (implicit)
MFC after:	1 week
2009-06-27 15:03:50 +00:00
Robert Watson
14961ba789 Replace AUDIT_ARG() with variable argument macros with a set more more
specific macros for each audit argument type.  This makes it easier to
follow call-graphs, especially for automated analysis tools (such as
fxr).

In MFC, we should leave the existing AUDIT_ARG() macros as they may be
used by third-party kernel modules.

Suggested by:	brooks
Approved by:	re (kib)
Obtained from:	TrustedBSD Project
MFC after:	1 week
2009-06-27 13:58:44 +00:00
Robert Watson
f291b9cd38 In in6_update_ifa(), jump to 'cleanup' rather than returning directly
in one additional case, avoiding an ifaddr reference leak.

Defer releasing the in6_ifaddr's in6_ifaddrhead reference until the
end of in6_unlink_ifa(), as callers are inconsistent regarding whether
or not they hold a reference across the call.  This avoids using the
ifaddr after it may have been freed.

Reported by:	tegge
Reviewed by:	tegge
Approved by:	re (blanket)
MFC after:	6 weeks
2009-06-27 11:05:53 +00:00
Robert Watson
395cbe82d2 Remove unnecessary include of kdb.h that snuck in during ifaddr refcount
work.

Reported by:	pluknet <pluknet at gmail.com>
Approved by:	re (kib)
2009-06-27 10:30:28 +00:00
Yoshihiro Takahashi
1ef17ae8d4 Add stub vm.h for pc98.
Approved by:	re (kensmith)
2009-06-27 02:20:31 +00:00
Stanislav Sedov
b0f642ad44 - Don't zero data field in case of MSR write operation. Before this change
the value written to MSR register was always 0 regardless of value passed
  by user.
- Use proper data pointer when performing AMD microcode update.  Previously,
  the pointer to user-space data has been provided instead, which is totally
  incorrect.

Approved by:	re (kib)
MFC after:	1 week
2009-06-26 22:13:15 +00:00
Xin LI
8c9d12305b Add quirks for Actions MP4 player.
Submitted by:	John Hixson <john ixsystems com>
Approved by:	re (kib)
MFC after:	2 weeks
2009-06-26 21:47:37 +00:00
Robert Watson
9e6e01ebf6 In light of DPCPU use by netisr, revise various for loops from using
MAXCPU to mp_maxid, and handling and reporting of requests to use more
threads than we have CPUs to run them on.

Reviewed by:	bz
Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 20:39:36 +00:00
John Baldwin
7c020cbbe5 Return ENOSYS instead of EINVAL for invalid function codes to match the
behavior of Linux.

Reported by:	Alexander Best  alexbestms of math.uni-muenster.de
Approved by:	re (kib)
2009-06-26 19:39:33 +00:00
Robert Watson
5157862aa2 Use if_maddr_rlock() instead of IF_ADDR_LOCK() to protect access to
if_multiaddrs in if_cxgb.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 19:04:08 +00:00
Robert Watson
ba16a0fab1 Use if_addr_rlock/if_addr_runlock for if_spp when iterating if_addrhead,
as it is loadable as a module.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 18:50:49 +00:00
John Baldwin
f5e4c1052a Note that as a result of the SYSV IPC changes, COMPAT_FREEBSD[456] now
require COMPAT_FREEBSD7.  Also, explicitly note in NOTES that any version
of COMPAT_FREEBSD<n> effectively requires for newer binaries (i.e.
COMPAT_FREEBSD<n+1>, etc.).  While this has been true in practice
previously, it used to compile ok before the commit earlier this week.

Discussed with:	peter
Approved by:	re (kensmith)
2009-06-26 17:50:52 +00:00
Alan Cox
5797795f5a Correct the #endif comment.
Noticed by:	jmallett
Approved by:	re (kib)
2009-06-26 16:22:24 +00:00
Robert Watson
eb956cd041 Use if_maddr_rlock()/if_maddr_runlock() rather than IF_ADDR_LOCK()/
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs.  This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.

For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 11:45:06 +00:00
Rui Paulo
be80e49a01 Add support for MacBook4,1.
Submitted by:	Christoph Langguth <christoph at rosenkeller.org>
MFC after:	2 weeks
Approved by:	re (kib)
2009-06-26 10:23:17 +00:00
Rui Paulo
1e5fd3f467 On special systems where the MBR and the GPT are in sync (up to the 4th
slicei, Apple EFI hardware), the bootloader will fail to recognize the GPT
if it finds anything else but the EFI partition. Change the check to continue
detecting the GPT by looking at the EFI partition on the MBR but
stopping successfuly after finding it.

PR:		kern/134590
Submitted by:	Christoph Langguth <christoph at rosenkeller.org>
Reviewed by:	jhb
MFC after:	2 weeks
Approved by:	re (kib)
2009-06-26 09:32:31 +00:00
Alan Cox
e999111ae7 This change is the next step in implementing the cache control functionality
required by video card drivers.  Specifically, this change introduces
vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all
architectures.  In addition, this changes adds a vm_cache_mode_t parameter
to kmem_alloc_contig() and vm_phys_alloc_contig().  These will be the
interfaces for allocating mapped kernel memory and physical memory,
respectively, with non-default cache modes.

In collaboration with:	jhb
2009-06-26 04:47:43 +00:00
Weongyo Jeong
58f30773d8 provides a extra write buffer when the NDIS driver want to send a
request whose body has some datas through the default pipe.

Tested by:	Nikos Vassiliadis <nvass9573 at gmx.com>
2009-06-26 01:42:41 +00:00
Robert Watson
c4c96d5ea5 Update Netgraph nodes to use if_addr_rlock()/if_addr_runlock() instead
of IF_ADDR_LOCK()/IF_ADDR_UNLOCK() when iterating ifp->if_addrhead.

MFC after:	6 weeks
2009-06-26 00:49:12 +00:00
Robert Watson
6c8615603b Update various IPFW-related modules to use if_addr_rlock()/
if_addr_runlock() rather than IF_ADDR_LOCK()/IF_ADDR_UNLOCK().

MFC after:	6 weeks
2009-06-26 00:46:50 +00:00
Robert Watson
3893212ddc Update if_stf and if_tun to use if_addr_rlock()/if_addr_runlock() rather
than IF_ADDR_LOCK()/IF_ADDR_UNLOCK() when iterating ifp->if_addrhead.

MFC after:	6 weeks
2009-06-26 00:45:20 +00:00
Robert Watson
f9ef96ca71 Define four wrapper functions for interface address locking,
if_addr_rlock() and if_addr_runlock() for regular address lists, and
if_maddr_rlock() and if_maddr_runlock() for multicast address lists.

We will use these in various kernel modules to avoid encoding specific
type and locking strategy information into modules that currently use
IF_ADDR_LOCK() and IF_ADDR_UNLOCK() directly.

MFC after:	6 weeks
2009-06-26 00:36:47 +00:00
Robert Watson
534027673b Convert netisr to use dynamic per-CPU storage (DPCPU) instead of sizing
arrays to [MAXCPU], offering moderate memory savings.  In some places,
this requires using CPU_ABSENT() to handle less common platforms with
sparse CPU IDs.  In several places, assert that the selected CPUID for
work placement or statistics is not CPU_ABSENT() to be on the safe side.

Discussed with:	bz, jeff
2009-06-26 00:19:25 +00:00
Navdeep Parhar
fd3e790228 mvec routines should have no knowledge of the SG engine.
Reviewed by:	kmacy
Approved by:	gnn (mentor)
2009-06-25 21:50:15 +00:00
Attilio Rao
ca2d94bef7 Fix a LOR between pmc_sx and proctree/allproc when creating a new thread
for the pmclog.

Reported by:	Ryan Stone <rstone at sandvine dot com>
Tested by:	Ryan Stone <rstone at sandvine dot com>
Sponsored by:	Sandvine Incorporated
2009-06-25 20:59:37 +00:00
Sean Nicholas Barkas
5a6dafe63e Fix a bug reported by pho@ where one can induce a panic by decreasing
vfs.ufs.dirhash_maxmem below the current amount of memory used by dirhash. When
ufsdirhash_build() is called with the memory in use greater than dirhash_maxmem,
it attempts to free up memory by calling ufsdirhash_recycle(). If successful in
freeing enough memory, ufsdirhash_recycle() leaves the dirhash list locked. But
at this point in ufsdirhash_build(), the list is not explicitly unlocked after
the call(s) to ufsdirhash_recycle(). When we next attempt to lock the dirhash
list, we will get a "panic: _mtx_lock_sleep: recursed on non-recursive mutex
dirhash list".

Tested by:	pho
Approved by:	dwmalone (mentor)
MFC after:	3 weeks
2009-06-25 20:40:13 +00:00
John Baldwin
4e9dba6322 Fix kernels compiled without SMP support. Make intr_next_cpu() available
for UP kernels but as a stub that always returns the single CPU's local
APIC ID.

Reported by:	kib
2009-06-25 20:35:46 +00:00
Ed Schouten
91292be73f Remove COMPAT_43 from sun4v's GENERIC.
I think it's very unlikely that we have binaries for sun4v that use
features provided by COMPAT_43. Remove it from GENERIC.

Approved by:	kib
2009-06-25 19:26:23 +00:00
Robert Noland
3db57ca311 We shouldn't need to drop and reaquire the lock here.
MFC after:	3 days
2009-06-25 19:23:25 +00:00
Konstantin Belousov
28fe6a3f3e In lf_iteratelocks_vnode, increment state->ls_threads around iterating
of the vnode advisory lock list. This prevents deallocation of state
while inside the loop.

Reported and tested by:	pho
MFC after:	2 weeks
2009-06-25 18:54:56 +00:00
Konstantin Belousov
9f80ce043d Change the type of uio_resid member of struct uio from int to ssize_t.
Note that this does not actually enable full-range i/o requests for
64 architectures, and is done now to update KBI only.

Tested by:	pho
Reviewed by:	jhb, bde (as part of the review of the bigger patch)
2009-06-25 18:46:30 +00:00
John Baldwin
2654ae4dfe Remove the d_spare2_t typedef. The d_spare2 field was replaced by
d_mmap_single().  I considered adding a new round of padding for 8.0.
However, since cdevsw already maintains a version field, new versions
can be handled without requiring the need for explicit padding fields.
2009-06-25 18:44:05 +00:00
Jack F Vogel
29af53f006 Decided to limit the interrupt bind to multiqueue
config as done in igb.
2009-06-25 18:40:27 +00:00
John Baldwin
9bd55acf5e Return errors from intr_event_bind() to the caller of intr_set_affinity().
Specifically, if a non-root user attempts to bind an interrupt the request
will now report failure with EPERM rather than silently failing with a
successful return code.

MFC after:	1 week
2009-06-25 18:35:19 +00:00