Commit Graph

74145 Commits

Author SHA1 Message Date
Marcel Moolenaar
87b51e539d MFC rev 196333:
The start of the EFI GPT partition in the PMBR can always be represented
by CHS addressing. Don't define these fields as 0xff, but rather define
them correctly. This prevents boot problems on PCs where GPT is being
used.

PR:             115406
Submitted by:   Kent Hauser <kent@khauser.net>
Approved by:    re (kib)
2009-08-17 16:24:50 +00:00
John Hay
c49b5baa65 MFC: 196326
Fix parse() so that the partition to boot (load /boot/loader) from can
be set. The syntax as printed in main() is used: 0:ad(0p3)/boot/loader

Reviewed by:	jhb
Approved by:	re (kib)
2009-08-17 15:39:47 +00:00
Konstantin Belousov
1d4dc8543f MFC r196318:
Correct accounting error when allocating a a page table page to implement
a user-space demotion.

Approved by:	re (rwatson)
2009-08-17 13:32:56 +00:00
Rui Paulo
f2d3e43377 MFC r196316:
Fix a typo in ifdef mesh support. This would make mesh unworkable if
  TDMA support was compiled out.

Approved by:	re (kib)
2009-08-17 13:00:32 +00:00
Pawel Jakub Dawidek
1aefcd39f0 MFC r196309:
getcwd() (when __getcwd() fails) works by stating current directory, going up
(..), calling readdir and looking for previous directory inode.  In case of
.zfs/ directory this doesn't work, because .zfs/ is hidden by default, so it
won't be visible in readdir output.

Fix this by implementing VPTOCNP for snapshot directories, so __getcwd()
doesn't fail and getcwd() doesn't have to use readdir method.

This fixes /bin/pwd from within .zfs/snapshot/<name>/.

Suggested by:	kib
Approved by:	re (rwatson)
2009-08-17 10:02:31 +00:00
Pawel Jakub Dawidek
5adfc444cf MFC r196307:
Manage asynchronous vnode release just like Solaris.

Discussed with:	kmacy
Approved by:	re (kib)
2009-08-17 09:55:58 +00:00
Pawel Jakub Dawidek
be7e2e42e1 MFC r196303:
- Reduce z_teardown_lock lock scope a bit.
- The error variable is int, not bool.
- Convert spaces to tabs where needed.

Approved by:	re (kib)
2009-08-17 09:30:31 +00:00
Pawel Jakub Dawidek
a64b735739 MFC r196301:
If z_buf is NULL, we should free znode immediately.

Noticed by:	avg
Approved by:	re (kib)
2009-08-17 09:27:10 +00:00
Pawel Jakub Dawidek
93be9449e4 MFC r196299:
- We need to recycle vnode instead of freeing znode.

Submitted by:	avg

- Add missing vnode interlock unlock.
- Remove redundant znode locking.

Approved by:	re (kib)
2009-08-17 09:23:27 +00:00
Pawel Jakub Dawidek
f0fb1d62c7 MFC r196297:
Fix panic in zfs recv code. The last vnode (mountpoint's vnode) can have
0 usecount.

Reported by:	Thomas Backman <serenity@exscape.org>
Approved by:	re (kib)
2009-08-17 09:14:58 +00:00
Pawel Jakub Dawidek
e43f173602 MFC r196295:
Remove OpenSolaris taskq port (it performs very poorly in our kernel) and
replace it with wrappers around our taskqueue(9).
To make it possible implement taskqueue_member() function which returns 1
if the given thread was created by the given taskqueue.

Approved by:	re (kib)
2009-08-17 09:03:47 +00:00
Pawel Jakub Dawidek
ea5f504fed MFC r196293:
Because taskqueue_run() can drop tq_mutex, we need to check if the
TQ_FLAGS_ACTIVE flag wasn't removed in the meantime, which means we missed a
wakeup.

Approved by:	re (kib)
2009-08-17 08:46:47 +00:00
Pawel Jakub Dawidek
4aae820003 MFC r196291:
- Fix a race where /dev/zfs control device is created before ZFS is fully
  initialized. Also destroy /dev/zfs before doing other deinitializations.
- Initialization through taskq is no longer needed and there is a race
  where one of the zpool/zfs command loads zfs.ko and tries to do some work
  immediately, but /dev/zfs is not there yet.

Reported by:	pav
Approved by:	re (kib)
2009-08-17 08:38:41 +00:00
Pawel Jakub Dawidek
9e813d55e3 MFC r196289:
Remove files that are no longer used.

Discussed with:	kmacy
Approved by:	re (kib)
2009-08-17 08:09:46 +00:00
Scott Long
91fa948790 Merge r196200. Add firmware definitions needed by mfiutil
Approved by:	re
2009-08-17 06:21:22 +00:00
Ed Schouten
336b627671 MFC r196276:
Fix small style regression introduced by the MPSAFE newbus code.

Approved by:	re (rwatson)
2009-08-16 20:33:16 +00:00
Andrew Thompson
a8ff174941 MFC r196274
Change the usb workers from kernel processes to threads, this is mostly a
 cosmetic change to reduce cruft in the proc table.

 Also change the idle wait message to `-` like how taskqueues are.

 Reviewed by:	julian
 Approved by:	re (kib)
2009-08-16 14:17:47 +00:00
Marcel Moolenaar
7ea9d2977c MFC revision 196269:
Fix misalignment in nvpair_native_embedded() caused by the compiler
replacing the bzero().

Approved by:	re (kensmith)
2009-08-16 02:21:24 +00:00
Marcel Moolenaar
fd77c45934 MFC rev 196268:
Decouple ACPI CPU Ids from FreeBSD's cpuid. The ACPI Ids can be
sparse, which causes a kernel assert.

Approved by:	re (kensmith)
2009-08-16 02:12:13 +00:00
Michael Tuexen
ca007251f2 MFC r196260.
* Fix a bug where PR-SCTP settings are ignore when using implicit
   association setup.
 * Fix a bug where message with illegal stream ids are not deleted.
 * Fix a crash when reporting back unsent messages from the send_queue.
 * Fix a bug related to INIT retransmission when the socket is already
   closed.
 * Fix a bug where associations were stalled when partial delivery API
   was enabled.
 * Fix a bug where the receive buffer size was smaller than the
   partial_delivery_point.

Approved by: re, rrs (mentor)
2009-08-15 21:37:16 +00:00
Attilio Rao
168346d043 MFC r196256:
Fixup the Xen support in order to match newly introduced enhacements for
IPIs.

Approved by:	re (kib)
2009-08-15 18:56:56 +00:00
Stanislav Sedov
3a68500f53 - Merge r196246:
Proprely intialize UART parameters at probe stage, so uart(4)
  will initialize the FIFO memory correctly on attach.  Before
  that this values was intialized in only in at91_usart_bus_attach
  which is called after the uart(4) memory allocation happens.

Approved by:	re (kib)
MFC after:	1 week
2009-08-15 15:18:29 +00:00
Qing Li
6e12c67559 MFC 196234
In function ip_output(), the cached route is flushed when there is a
mismatch between the cached entry and the intended destination. The
cached rtentry{} is flushed but the associated llentry{} is not. This
causes the wrong destination MAC address being used in the output
packets. The fix is to flush the llentry{} when rtentry{} is cleared.

Reviewed by:	kmacy, rwatson
Approved by:	re
2009-08-15 00:04:12 +00:00
Marko Zec
d326ff4914 MFC r196230:
Appease VNET_DEBUG - in if_vmove we temporarily switch i.e.
  recurse from one vnet to another which is OK, so no need
  to flood the console with warnings here.

  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-14 23:05:10 +00:00
Marko Zec
0e9c71019b MFC r196229:
SCTP is not yet compatible with options VIMAGE kernels although it compiles
  with VIMAGE defined, so explicitly disallow building such kernels.

  Reviewed by:  rrs
  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-14 23:01:21 +00:00
Marko Zec
857e561523 MFC r196228:
Make VNET_DEBUG a standalone compile-time option, i.e. decouple it from
  INVARIANTS.

  Reviewed by:  bz
  Approved by:  re (rwatson), julian (mentor)

Approved by:	re (rwatson)
2009-08-14 22:55:54 +00:00
Bjoern A. Zeeb
21845eba48 MFC r196226:
Add a new macro to test that a variable could be loaded atomically.
  Check that the given variable is at most uintptr_t in size and that
  it is aligned.

  Note: ASSERT_ATOMIC_LOAD() uses ALIGN() to check for adequate
        alignment -- however, the function of ALIGN() is to guarantee
        alignment, and therefore may lead to stronger alignment
        enforcement than necessary for types that are smaller than
        sizeof(uintptr_t).

  Add checks to mtx, rw and sx locks init functions to detect possible
  breakage. This was used during debugging of the problem fixed with
  r196118 where a pointer was on an un-aligned address in the dpcpu area.

  In collaboration with:  rwatson
  Reviewed by:            rwatson

Approved by:	re (kib)
2009-08-14 21:50:47 +00:00
John Baldwin
7612087747 Adjust the handling of the local APIC PMC interrupt vector:
- Provide lapic_disable_pmc(), lapic_enable_pmc(), and lapic_reenable_pmc()
  routines in the local APIC code that the hwpmc(4) driver can use to
  manage the local APIC PMC interrupt vector.
- Do not enable the local APIC PMC interrupt vector by default when
  HWPMC_HOOKS is enabled.  Instead, the hwpmc(4) driver explicitly
  enables the interrupt when it is succesfully initialized and disables
  the interrupt when it is unloaded.  This avoids enabling the interrupt
  on unsupported CPUs which may result in spurious NMIs.

Reported by:	rnoland
Reviewed by:	jkoshy
Approved by:	re (kib)
MFC after:	2 weeks
2009-08-14 20:57:21 +00:00
Konstantin Belousov
8c5a788279 MFC r196206:
Take the number of allocated freeblks into consideration for
softdep_slowdown(), to prevent kernel memory exhaustioni on
mass-truncation.

Approved by:	re (rwatson)
2009-08-14 11:22:09 +00:00
Konstantin Belousov
b5d2abe26f MFC r196205:
In nfs_upgrade_vnlock(), assert that the vnode is locked.
When downgrading, pass LK_RETRY to the vn_lock(), since otherwise
vn_lock() unlocks the doomed vnode, causing extra unlock.

Approved by:	re (rwatson)
2009-08-14 11:17:34 +00:00
Konstantin Belousov
b60a1fa9a5 MFC r196204:
Add the address of the lock to the KTR_LOCK trace.

Approved by:	re (rwatson)
2009-08-14 11:13:06 +00:00
Konstantin Belousov
106c3802ff MFC r196203:
Correctly handle unlock for !MAKEENTRY case.

Approved by:	re (rwatson)
2009-08-14 11:06:58 +00:00
Julian Elischer
f394882d89 MFC of r196201
URL: http://svn.freebsd.org/changeset/base/196201

  Fix ipfw crash on uid or gid check.
  Receiving any ip packet for which there is no existing socket will
  crash if ipfw has a uid or gid test rule, as the uid/gid
  of the non existent owner of said non existent socket is tested.
  Brooks introduced this error as part of his >16 gids patch.
  It appears to be a cut-n-paste error from similar code a few lines
  before. The old code used the 'pcb' variable here, but in the
  new code that switched the 'inp' variable, which is often NULL
  and what is tested in the code further up. The rest of the multi-gid
  patch for ipfw seems solid (and cleaner than previous code).

p.s. What's up with all the properties changing? It is a fresh checkout.

Reviewed by:	brooks
Approved by:	re (rwatson)
2009-08-14 10:25:14 +00:00
Attilio Rao
be1057174e MFC r196196:
* Completely remove the option STOP_NMI from the kernel.  This option
  has proven to have a good effect when entering KDB by using a NMI,
  but it completely violates all the good rules about interrupts
  disabled while holding a spinlock in other occasions.  This can be the
  cause of deadlocks on events where a normal IPI_STOP is expected.
* Add an new IPI called IPI_STOP_HARD on all the supported architectures.
  This IPI is responsible for sending a stop message among CPUs using a
  privileged channel when disponible. In other cases it just does match a
  normal IPI_STOP.
  Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64
  architectures, while on the other has a normal IPI_STOP effect. It is
  responsibility of maintainers to eventually implement an hard stop
  when necessary and possible.
* Use the new IPI facility in order to implement a new userend SMP kernel
  function called stop_cpus_hard(). That is specular to stop_cpu() but
  it does use the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
  stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and comments adding

This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.

Please don't forget to update your config file with the STOP_NMI
option removal

Reviewed by:  jhb
Tested by:    pho, bz, rink
Approved by:  re (kib)
2009-08-13 17:54:11 +00:00
Rafal Jaworowski
60df4aa103 MFC r196193:
Use correct wbinv operation in pmap_l2cache_wbinv_range().

Submitted by:	Michal Hajduk
Reviewed by:	stas
Approved by:	re (kib)
Obtained from:	Semihalf
2009-08-13 16:01:19 +00:00
Robert Watson
fa1decc4ad Merge r196122 from head to stable/8:
Correctly audit real gids following changes to the audit record argument
  interface.

Approved by:	re (kib)
2009-08-13 15:01:50 +00:00
Robert Watson
7e68640f17 Merge r196121 from head to stable/8:
Reverse misordered unlock and lock in at_control for netatalk phase I
  addresses.

  Submitted by:	Russell Cattelan <cattelan at thebarn.com>
Approved by:	re (kib)
2009-08-13 14:50:39 +00:00
Edward Tomasz Napierala
dba91be244 InstaMFC 196179: Remove CDDL warning.
Approved by:	re (kib), core
2009-08-13 13:56:05 +00:00
Bjoern A. Zeeb
da2a30fca1 MFC r196176:
Make it possible to change the vnet sysctl variables on jails
  with their own virtual network stack. Jails only inheriting a
  network stack cannot change anything that cannot be changed from
  within a prison.

  Reviewed by:  rwatson, zec

Approved by:	re (kib)
2009-08-13 10:31:02 +00:00
Bjoern A. Zeeb
9b36fbe976 MFC r196174:
Put multiple instructions into a block when iterating; unbreaks
  NET_RT_DUMP, which otherwise only returned information of AF_MAX.
  This was broken in r193232 (save your time - my bug, my fix).

  Reported by:  Larry Baird (lab gta.com)
  Tested by:    Larry Baird (lab gta.com)
  Reviewed by:  zec, lstewart, qing

PR:		kern/137700
Approved by:	re (kib)
2009-08-13 09:32:15 +00:00
Matt Jacob
7922e083b4 MFC 196162: Have at least a fallback WWN so cards on sun branded FC cards
can configure.

Approved by:	re
2009-08-13 01:45:26 +00:00
Sam Leffler
4904bb1de0 MFC r196159:
Drain link state event changes posted during vap destroy.  This is a
  band-aid for the general problem that if_link_state_change can be
  called between if_detach and if_free leaving a task queued that has
  been free'd.

Reviewed by:	rwatson
Approved by:	re (rwatson)
2009-08-12 21:34:57 +00:00
Qing Li
747a32aaea MFC r196152
A piece of code was added to install a host route when an IPv6 interface
address is configured with a /128 prefix. This is no longer necessary due
to r192011. In fact that code conflicts with r192011. This patch removes
the host route installation when detecting the /128 prefix, and instead
let the code added by r192011 to install the loopback route for that IPv6
interface address.

Approved by:	re
2009-08-12 20:48:50 +00:00
Rick Macklem
e9200ba65b MFC r196149:
Add a check for a NULL mbuf ptr at the beginning of xdrmbuf_inline()
so that it returns failure instead of crashing when "m->m_len" is
executed and m == NULL. The mbuf ptr can be NULL when a call to
xdrmbuf_getbytes() gets the bytes it needs, but they are at the end
of a short RPC reply. When this happens, xdrmbuf_getbytes() returns
success, but advances the mbuf ptr (xdrs->x_private) to m_next, which
is NULL. If this is followed by a call to xdrmbuf_getlong(), it calls
xdrmbuf_inline(), which would cause a crash by accessing "m->m_len".

Approved by:	re (rwatson), kib (mentor)
2009-08-12 20:30:27 +00:00
Jung-uk Kim
3eced0dd19 MFC: r196150
Always embed pointer to BPF JIT function in BPF descriptor
to avoid inconsistency when opt_bpf.h is not included.

Reviewed by:	rwatson
Approved by:	re (rwatson)
2009-08-12 17:45:55 +00:00
Robert Noland
b7f930cb48 Merge r196142
Add support for radeon RS880 IGP chips to drm.

Approved by:	re (kib)
2009-08-12 13:12:09 +00:00
Robert Noland
0f2cac6379 Merge r196141
Add some additional radeon pci ids to drm.

Approved by:	re (kib)
2009-08-12 13:09:24 +00:00
Bjoern A. Zeeb
abff5b8ad9 MFC r196135:
Make the kernel compile without IP networking by moving
  a variable under a proper #ifdef.

Approved by:	re (rwatson)
2009-08-12 12:14:30 +00:00
Bjoern A. Zeeb
537791e584 MFC r196132:
Add ddb show dpcpu_off command to ease dpcpu memory debugging.
  While show pcpu prints pc_dynamic this also prints the original
  memory address as well as the maths.

  Once dpcpu goes NUMA this is considered to help debugging as well.

  Reviewed by:  rwatson

Approved by:	re
2009-08-12 12:10:28 +00:00
Bjoern A. Zeeb
3baf84f587 MFC r196129:
Update DDB show vnet command to print all used and available information.

  Reviewed by:  rwatson, zec

Approved by:	re
2009-08-12 12:05:07 +00:00
Bjoern A. Zeeb
28a2b3dd57 MFC r196118:
Put minimum alignment on the dpcpu and vnet section so that ld
  when adding the __start_ symbol knows the expected section alignment
  and can place the __start_ symbol correctly.

  These sections will not support symbols with super-cache line alignment
  requirements.

  For full details, see posting to freebsd-current, 2009-08-10,
  Message-ID: <20090810133111.C93661@maildrop.int.zabbadoz.net>.

  Debugging and testing patches by:
                Kamigishi Rei (spambox haruhiism.net),
                np, lstewart, jhb, kib, rwatson
  Tested by:    Kamigishi Rei, lstewart
  Reviewed by:  kib

Approved by:	re
2009-08-12 10:32:20 +00:00
Robert Watson
9d2eb78bcb Add padding to struct inpcb, missed during our padding sweep earlier in
the release cycle.

Approved by:	re (kensmith)
2009-08-02 22:47:08 +00:00
Robert Watson
315e3e38fa Many network stack subsystems use a single global data structure to hold
all pertinent statatistics for the subsystem.  These structures are
sometimes "borrowed" by kernel modules that require a place to store
statistics for similar events.

Add KPI accessor functions for statistics structures referenced by kernel
modules so that they no longer encode certain specifics of how the data
structures are named and stored.  This change is intended to make it
easier to move to per-CPU network stats following 8.0-RELEASE.

The following modules are affected by this change:

      if_bridge
      if_cxgb
      if_gif
      ip_mroute
      ipdivert
      pf

In practice, most of these statistics consumers should, in fact, maintain
their own statistics data structures rather than borrowing structures
from the base network stack.  However, that change is too agressive for
this point in the release cycle.

Reviewed by:	bz
Approved by:	re (kib)
2009-08-02 19:43:32 +00:00
Julian Elischer
9734411552 Stop uuidgen(2) from crashing in vimage kerenels.
make curvnet valid when needed.

Reviewed by:	bz@
Approved by:	re (kib)
2009-08-02 16:59:02 +00:00
Attilio Rao
444b91868b Make the newbus subsystem Giant free by adding the new newbus sxlock.
The newbus lock is responsible for protecting newbus internIal structures,
device states and devclass flags. It is necessary to hold it when all
such datas are accessed. For the other operations, softc locking should
ensure enough protection to avoid races.

Newbus lock is automatically held when virtual operations on the device
and bus are invoked when loading the driver or when the suspend/resume
take place. For other 'spourious' operations trying to access/modify
the newbus topology, newbus lock needs to be automatically acquired and
dropped.

For the moment Giant is also acquired in some key point (modules subsystem)
in order to avoid problems before the 8.0 release as module handlers could
make assumptions about it. This Giant locking should go just after
the release happens.

Please keep in mind that the public interface can be expanded in order
to provide more support, if there are really necessities at some point
and also some bugs could arise as long as the patch needs a bit of
further testing.

Bump __FreeBSD_version in order to reflect the newbus lock introduction.

Reviewed by:    ed, hps, jhb, imp, mav, scottl
No answer by:   ariff, thompsa, yongari
Tested by:      pho,
                G. Trematerra <giovanni dot trematerra at gmail dot com>,
                Brandon Gooch <jamesbrandongooch at gmail dot com>
Sponsored by:   Yahoo! Incorporated
Approved by:	re (ksmith)
2009-08-02 14:28:40 +00:00
Ed Schouten
d40b91cb13 Fix two bugs related to TTY input:
- fix write() on pseudo-terminal masters to return the amount of bytes
  passed to the TTY, not the amount of bytes read from user.

- fix ttydisc_rint_bypass() to set the high watermark when it cannot
  write all input, just like ttydisc_rint() itself.

Approved by:	re (kib)
2009-08-02 14:25:26 +00:00
Ed Schouten
61fb73de41 Make the MacBook3,1 boot again.
Approved by:	re (kib)
2009-08-02 11:26:23 +00:00
Robert Watson
6aad5c1c93 The colour was red as shall be the letters of this warning to people upon
boot if the experimental VIMAGE feature was compiled into the kernel.

Submitted by:	bz
Reviewed by:	zec
Approved by:	re (vimage blanket)
2009-08-01 22:22:45 +00:00
Robert Watson
c8f6a13820 Minor style tweaks.
Approved by:	re (vimage blanket)
2009-08-01 21:58:32 +00:00
Robert Watson
6bc2c7b70c Make the vnet alloc/destroy paths a bit easier to followg by merging
vnet_data_init/vnet_data_destroy into vnet_alloc/vnet_destroy.

Reviewed by:	bz, zec
Approved by:	re (vimage blanket)
2009-08-01 21:54:15 +00:00
Robert Watson
7429a3f3d8 Remove vnet_foreach() utility function, which previously allowed
vnet.c to iterate virtual network stacks without being aware of
the implementation details previously hidden in kern_vimage.c.
Now they are in the same file, so remove this added complexity.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 20:24:45 +00:00
Robert Watson
530c006014 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
Matt Jacob
2df76c160b Add 8Gb support (isp_2500). Fix a fair number of configuration and
firmware loading bugs.

Target mode support has received some serious attention to make it
more usable and stable.

Some backward compatible additions to CAM have been made that make
target mode async events easier to deal with have also been put
into place.

Further refinement and better support for NP-IV (N-port Virtualization)
is now in place.

Code for release prior to RELENG_7 has been stripped away for code clarity.

Sponsored by: Copan Systems

Reviewed by:    scottl, ken, jung-uk kim
Approved by:    re
2009-08-01 01:04:26 +00:00
Matt Jacob
b965588786 Add 8Gb card firmware. Update some 2Gb and 4Gb f/w sets.
Split 4Gb and 8Gb into pieces that can be either multi_id
capable or not.

Reviewed by:	scottl, ken
Approved by:	re
2009-08-01 00:57:34 +00:00
Sam Leffler
faf666aa77 fix misplaced #endif that caused tdma handling to be merged with ESS handling
(causing tdma scanning to break)

Approved by:	re (kib)
2009-07-31 19:13:16 +00:00
Sam Leffler
2dfcbb0e2e Filter setting IFF_PROMISC on tdma vaps; we don't want the underyling device
to be in promiscuous mode as we have a h/w bssid.

Approved by:	re (kib)
2009-07-31 19:12:19 +00:00
Weongyo Jeong
6418e06a48 add upgt
Approved by:	re (kib)
2009-07-31 17:57:16 +00:00
Jamie Gritton
42bebd8205 Make the "enforce_statfs" default 2 (most restrictive) in jail_set(2),
instead of whatever the parent/system has (which is generally 0).  This
mirrors the old-style default used for jail(2) in conjunction with the
security.jail.enforce_statfs sysctl.

Approved by:	re (kib), bz (mentor)
2009-07-31 16:00:41 +00:00
John Baldwin
87eca70e0c Fix some LORs between vnode locks and filedescriptor table locks.
- Don't grab the filedesc lock just to read fd_cmask.
- Drop vnode locks earlier when mounting the root filesystem and before
  sanitizing stdin/out/err file descriptors during execve().

Submitted by:	kib
Approved by:	re (rwatson)
MFC after:	1 week
2009-07-31 13:40:06 +00:00
Kevin Lo
6ece67d83f Free allocated Rx ring dma memory/tags.
Reviewed by: yongari@
Approved by: re (kib)
2009-07-31 09:57:42 +00:00
Weongyo Jeong
aa7eaa2fb6 fixes a typo for DWA120 device ID.
Reported by:	Alexander Kuznetsov <skritku at gmail.com>
Approved by:	re (kib)
2009-07-30 18:53:06 +00:00
Xin LI
16e324fc56 Show interface name which received short CARP packet (e.g. a VRRP packet),
in order to match other codepaths nearby.  This makes troubleshooting
easier.

Approved by:	re (kib)
MFC after:	1 month
2009-07-30 17:40:47 +00:00
Jamie Gritton
2b0d6f815e Remove a LOR, where the the sleepable allprison_lock was being obtained
in prison_equal_ip4/6 while an inp mutex was held.  Locking allprison_lock
can be avoided by making a restriction on the IP addresses associated with
jails:

Don't allow the "ip4" and "ip6" parameters to be changed after a jail is
created.  Setting the "ip4.addr" and "ip6.addr" parameters is allowed,
but only if the jail was already created with either ip4/6=new or
ip4/6=disable.  With this restriction, the prison flags in question
(PR_IP4_USER and PR_IP6_USER) become read-only and can be checked
without locking.

This also allows the simplification of a messy code path that was needed
to handle an existing prison gaining an IP address list.

PR:		kern/136899
Reported by:	Dirk Meyer
Approved by:	re (kib), bz (mentor)
2009-07-30 14:28:56 +00:00
Robert Watson
ed3db012fc Reorder and recomment vnet.c and vnet.h on the basis that they are no longer
solely about the virtual network stack memory allocator.

Approved by:	re (vimage blanket)
2009-07-30 12:41:19 +00:00
Robert Watson
4b0026b9fb Add two new privileges for use by OpenAFS, which will be supported for
FreeBSD 8.x.

MFC after:	3 days
Submitted by:	Benjamin Kaduk <kaduk at MIT.EDU>
Approved by:	re (kib)
2009-07-30 08:41:06 +00:00
Alfred Perlstein
9e90dc16ba Missed this file for r195963:
USB core:
  - add support for defragging of written device data.
  - improve handling of alternate settings in device side mode.
  - correct return value from usbd_get_no_alts() function.
  - reported by: HPS
  - P4 ID: 166156, 166168

  - report USB device release information to devd and pnpinfo.
  - reported by: MIHIRA Sanpei Yoshiro
  - P4 ID: 166221

Submitted by:	hps
Approved by:	re
2009-07-30 00:57:54 +00:00
Alfred Perlstein
65c7bc9cbf USB CORE - Improve HID parsing
See PR description for more info. Patch is
implemented differently than suggested, but
having the same result.

PR:     usb/137188

Submitted by:	hps
Approved by:	re
2009-07-30 00:17:08 +00:00
Alfred Perlstein
bf3254f22e USB CORE - compat Linux:
- Patch request from Tim Borgeaud:
- add automatic locking
- add refcount for killing URB's

Submitted by:	hps
Approved by:	re
2009-07-30 00:16:50 +00:00
Alfred Perlstein
a0c61406ad USB controller:
- allow disabling "root_mount_hold()" by setting "hw.usb.no_boot_wait" sysctl

Submitted by:	hps
Approved by:	re
2009-07-30 00:16:32 +00:00
Alfred Perlstein
4743e6da52 ULPT:
- add conditional printer status checking
- P4 ID: 166176

Submitted by:	hps
Approved by:	re
2009-07-30 00:16:06 +00:00
Alfred Perlstein
bd73b18714 USB core:
- add support for defragging of written device data.
- improve handling of alternate settings in device side mode.
- correct return value from usbd_get_no_alts() function.
- reported by: HPS
- P4 ID: 166156, 166168

- report USB device release information to devd and pnpinfo.
- reported by: MIHIRA Sanpei Yoshiro
- P4 ID: 166221

Submitted by:	hps
Approved by:	re
2009-07-30 00:15:50 +00:00
Alfred Perlstein
b27b901cb4 USB serial:
- add new ID for Huawei
- P4 ID: 166150

PR:             usb/136761

Submitted by:	hps
Approved by:	re
2009-07-30 00:15:17 +00:00
Alfred Perlstein
3d11bc19a0 USB audio:
- code factoring patch from "Eygene Ryabinkin"
- P4 ID: 166149

Submitted by:	hps
Approved by:	re
2009-07-30 00:14:56 +00:00
Alfred Perlstein
dddb25f98a USB CORE:
- Add minimum polling support to drive UMASS
  and UKBD in case of panic.
- Add extra check to ukbd probe to fix problem about
  mouse devices attaching like keyboards.
- P4 ID: 166148

Submitted by:	hps
Approved by:	re
2009-07-30 00:14:34 +00:00
Alfred Perlstein
81f8288460 USB input
- add support for setting the UMS polling rate through -F option
           passed to moused.
         - requested by Alexander Best
         - P4 ID: 166075

PR:             usb/125264

Submitted by:	hps
Approved by:	re
2009-07-30 00:13:09 +00:00
Alfred Perlstein
f724bcec31 USB controller:
- patch from Alexander Motin <mav@freebsd.org>
          - add more ID's
          - P4 ID: 165805

Submitted by:	hps
Approved by:	re
2009-07-30 00:12:47 +00:00
Konstantin Belousov
ed190ad06a Fix XEN build breakage, by implementing pmap_invalidate_cache_range()
and using it when appropriate. Merge analogue of the r195836
optimization to XEN.

Approved by:	re (kensmith)
2009-07-29 19:38:33 +00:00
Jamie Gritton
bdfc8cc4cd Don't allow mixing the "vnet" and "ip4/6" jail parameters, since vnet
jails have their own IP stack and don't have access to the parent IP
addresses anyway.  Note that a virtual network stack forms a break
between prisons with regard to the list of allowed IP addresses.

Approved by:	re (kib), bz (mentor)
2009-07-29 16:46:59 +00:00
Jamie Gritton
8986e3a023 Change the default value of the "ip4" and "ip6" jail parameters to
"disable", which only allows access to the parent/physical system's
IP addresses when specifically directed.  Change the default value of
"host" to "new", and don't copy the parent host values, to insulate
jails from the parent hostname et al.

Approved by:	re (kib), bz (mentor)
2009-07-29 16:41:02 +00:00
Rick Macklem
6a795918c1 Fix the experimental nfs client so that it only calls ncl_vinvalbuf()
for NFSv2 and not NFSv4 when nfscl_mustflush() returns 0. Since
nfscl_mustflush() only returns 0 when there is a valid write delegation
issued to the client, it only affects the case of an NFSv4 mount with
callbacks/delegations enabled.

Approved by:	 re (kensmith), kib (mentor)
2009-07-29 14:50:31 +00:00
Konstantin Belousov
8a5ac5d56f As was done in r195820 for amd64, use clflush for flushing cache lines
when memory page caching attributes changed, and CPU does not support
self-snoop, but implemented clflush, for i386.

Take care of possible mappings of the page by sf buffer by utilizing
the mapping for clflush, otherwise map the page transiently. Amd64
used direct map.

Proposed and reviewed by:  alc
Approved by:   re (kensmith)
2009-07-29 08:49:58 +00:00
Robert Watson
791b0ad2bf Eliminate ARG_UPATH[12] arguments to AUDIT_ARG_UPATH() and instead
provide specific macros, AUDIT_ARG_UPATH1() and AUDIT_ARG_UPATH2()
to capture path information for audit records.  This allows us to
move the definitions of ARG_* out of the public audit header file,
as they are an implementation detail of our current kernel-internal
audit record, which may change.

Approved by:	re (kensmith)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-29 07:44:43 +00:00
Robert Watson
a9bcca799e Revise header comments for vnet.h as we now implement VNET_SYSINIT, not
just VNET_DEFINE in vnet.h.

Approved by:	re (vimage blanket)
2009-07-28 22:17:34 +00:00
Robert Watson
b146fc1bf0 Rework vnode argument auditing to follow the same structure, in order
to avoid exposing ARG_ macros/flag values outside of the audit code in
order to name which one of two possible vnodes will be audited for a
system call.

Approved by:	re (kib)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-28 21:52:24 +00:00
Robert Watson
e4b4bbb665 Audit file descriptors passed to fooat(2) system calls, which are used
instead of the root/current working directory as the starting point for
lookups.  Up to two such descriptors can be audited.  Add audit record
BSM encoding for fooat(2).

Note: due to an error in the OpenBSM 1.1p1 configuration file, a
further change is required to that file in order to fix openat(2)
auditing.

Approved by:	re (kib)
Reviewed by:	rdivacky (fooat(2) portions)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-28 21:39:58 +00:00
Julian Elischer
3d1001cb11 Startup the vnet part of initialization a bit after the global part.
Fixes crash on boot if ipfw compiled in.

Submitted by:	tegge@
Reviewed by:	tegge@
Approved by:	re (kib)
2009-07-28 19:58:07 +00:00
Julian Elischer
7973fba3a4 Somewhere along the line accept sockets stopped honoring the
FIB selected for them. Fix this.

Reviewed by:	ambrisko
Approved by:	re (kib)
MFC after:	3 days
2009-07-28 19:43:27 +00:00
Qing Li
9fca4f79c7 The new flow table caches both the routing table entry as well as the
L2 information. For an indirect route the cached L2 entry contains the
MAC address of the gateway. Typically the default route is used to
transmit multicast packets when explicit multicast routes are not
available. The ether_output() function bypasses L2 resolution function
if it verifies the L2 cache is valid, because the cached L2 address
(a unicast MAC address) is copied into the packets as the destination
MAC address. This validation, however, does not apply to broadcast and
multicast packets because the destination MAC address is mapped
according to a standard method instead.

Submitted by:	Xin Li
Reviewed by:	bz
Approved by:	re
2009-07-28 17:16:54 +00:00
Michael Tuexen
bf3d517756 Fix a bug where wrong initialization value
in used for an SCTP specific sysctl variable.

Approved by: re, rrs(mentor).
MFC after: 2 weeks.
2009-07-28 15:07:41 +00:00
Randall Stewart
cfde3ff70b Turns out that when a receiver forwards through its TNS's the
processing code holds the read lock (when processing a
FWD-TSN for pr-sctp). If it finds stranded data that
can be given to the application, it calls sctp_add_to_readq().
The readq function also grabs this lock. So if INVAR is on
we get a double recurse on a non-recursive lock and panic.

This fix will change it so that readq() function gets a
flag to tell if the lock is held, if so then it does not
get the lock.

Approved by:	re@freebsd.org (Kostik Belousov)
MFC after:	1 week
2009-07-28 14:09:06 +00:00
Weongyo Jeong
62dcd8657b adds DLINK2 DWA120 device.
PR:		usb/136950
Reported by:	Alexander Kuznetsov <skritku at gmail.com>
Approved by:	re (kib)
2009-07-27 20:17:20 +00:00
Qing Li
df813b7ea2 This patch does the following:
- Allow loopback route to be installed for address assigned to
      interface of IFF_POINTOPOINT type.
    - Install loopback route for an IPv4 interface addreess when the
      "useloopback" sysctl variable is enabled. Similarly, install
      loopback route for an IPv6 interface address when the sysctl variable
      "nd6_useloopback" is enabled. Deleting loopback routes for interface
      addresses is unconditional in case these sysctl variables were
      disabled after an interface address has been assigned.

Reviewed by:	bz
Approved by:	re
2009-07-27 17:08:06 +00:00
John Baldwin
8e3764c0a6 Fix the freebsd32 versions of semsys(), shmsys(), and msgsys() to use the
old ABI versions of the relevant control system call (e.g.
freebsd7_freebsd32_msgctl() instead of freebsd32_msgctl() for msgsys()).

Approved by:	re (kib)
2009-07-27 16:03:04 +00:00
Pawel Jakub Dawidek
abd8353f5d We don't support ephemeral IDs in FreeBSD and without this fix ZFS can
panic when in zfs_fuid_create_cred() when userid is negative. It is
converted to unsigned value which makes IS_EPHEMERAL() macro to
incorrectly report that this is ephemeral ID. The most reasonable
solution for now is to always report that the given ID is not ephemeral.

PR:		kern/132337
Submitted by:	Matthew West <freebsd@r.zeeb.org>
Tested by:	Thomas Backman <serenity@exscape.org>, Michael Reifenberger <mike@reifenberger.com>
Approved by:	re (kib)
MFC after:	2 weeks
2009-07-27 14:52:34 +00:00
Rui Paulo
3ca80f0dbc Mesh fixes, namely:
* don't clobber proxy entries
* HWMP seq number processing, including discard of old frames
* flush routing table entries based on nexthop
* print route flags in ifconfig
* more debugging messages and comments

Proxy changes submitted by sam.

Approved by:	re (kib)
2009-07-27 14:22:09 +00:00
Rui Paulo
1f93ae9453 Refine the MacBook hack to only match early models that have Intel ICH.
Discussed with:	kjim
Approved by:	re (kib)
2009-07-27 13:51:55 +00:00
Michael Tuexen
8e71b6947a Fix the handling of unordered messages when using
PR-SCTP.

Approved by: re, rrs (mentor)
MFC after: 3 weeks.
2009-07-27 13:41:45 +00:00
Michael Tuexen
4420d9a062 Get rid of unused field. This will also be deleted
in the official speciication of the SCTP socket API.

Approved by:re, rrs (mentor)
2009-07-27 12:09:32 +00:00
Michael Tuexen
47a490cbbc Add a missing unlock for the inp lock when
returning early from sctp_add_to_readq().

Approved by: re, rrs (mentor)
MFC after: 2 weeks.
2009-07-26 15:06:59 +00:00
Alexander Motin
b06555e4fc Restore PATA device probe order, broken by PMP support implementation,
requesting IDENTIFY from slave device first. This order is important
for proper cable type detection by master device.

PR:		kern/136438
Approved by:	re (kib)
2009-07-26 14:04:48 +00:00
Bjoern A. Zeeb
d0ea47437a Update epair(4) to the new netisr implementation and polish
things a bit:
- use dpcpu data to track the ifps with packets queued up,
- per-cpu locking and driver flags
- along with .nh_drainedcpu and NETISR_POLICY_CPU.
- Put the mbufs in flight reference count, preventing interfaces
  from going away, under INVARIANTS as this is a general problem
  of the stack and should be solved in if.c/netisr but still good
  to verify the internal queuing logic.
- Permit changing the MTU to virtually everythinkg like we do for loopback.

Hook epair(4) up to the build.

Approved by:	re (kib)
2009-07-26 12:20:07 +00:00
Bjoern A. Zeeb
be31e5e7b5 Make the in-kernel logic for the SIOCSIFVNET, SIOCSIFRVNET ioctls
(ifconfig ifN (-)vnet <jname|jid>) work correctly.

Move vi_if_move to if.c and split it up into two functions(*),
one for each ioctl.

In the reclaim case, correctly set the vnet before calling if_vmove.

Instead of silently allowing a move of an interface from the current
vnet to the current vnet, return an error. (*)

There is some duplicate interface name checking before actually moving
the interface between network stacks without locking and thus race
prone. Ideally if_vmove will correctly and automagically handle these
in the future.

Suggested by:	rwatson (*)
Approved by:	re (kib)
2009-07-26 11:29:26 +00:00
Alexander Motin
1a00526bec Add note, that ahci(4) and siis(4) supersede ata(4) drivers.
Approved by:	re (implicitly)
2009-07-25 18:45:09 +00:00
Alexander Motin
e19ef875b1 Add ahci and siis drivers to NOTES.
Approved by:	re (implicitly)
2009-07-25 17:40:49 +00:00
Jamie Gritton
7cbf72137f Some jail parameters (in particular, "ip4" and "ip6" for IP address
restrictions) were found to be inadequately described by a boolean.
Define a new parameter type with three values (disable, new, inherit)
to handle these and future cases.

Approved by:	re (kib), bz (mentor)
Discussed with:	rwatson
2009-07-25 14:48:57 +00:00
Julian Elischer
9d85f50ad5 Catch ipfw up to the rest of the vimage code.
It got left behind when it moved to its new location.

Approved by:	re (kensmith)
2009-07-25 06:42:42 +00:00
Jack F Vogel
45289e2ded Improvement on the last change, this gives a precise
way to tell the one and only interface that a vlan
event is for. Thanks to John Baldwin for the patch.

Approved by: re
2009-07-24 21:35:52 +00:00
Brooks Davis
254d03c508 Introduce a new sysctl process mib, kern.proc.groups which adds the
ability to retrieve the group list of each process.

Modify procstat's -s option to query this mib when the kinfo_proc
reports that the field has been truncated.  If the mib does not exist,
fall back to the truncated list.

Reviewed by:	rwatson
Approved by:	re (kib)
MFC after:	2 weeks
2009-07-24 19:12:19 +00:00
John Baldwin
5f60e483c8 Bump __FreeBSD_version for the introduction of OBJT_SG.
Approved by:	re (kensmith)
2009-07-24 18:31:04 +00:00
Jack F Vogel
387424df40 This delta fixes two bugs:
- When a vlan event occurs a check was not made that
    the event was actually for the interface, thus resulting
    in a panic. All three drivers have this vulnerability. Add
    a check for this condition.
  - Secondly, there was a duplicate buf_ring free in the em
    driver resulting in a panic on unload. Remove.

Approved by:  re
2009-07-24 16:57:49 +00:00
Jack F Vogel
8bd0025fdd A small number of systems in the ICH9/10 family have a flash
part that is made up of 8K banks rather than 4K, if these
systems are using bank 1 then the last change in this code
breaks the bank read, resulting in an invalid checksum of
the eeprom during driver load. This change fixes this.

Approved by:  re
2009-07-24 16:54:22 +00:00
Sam Leffler
8ed1835dea revert OACTIVE part of r195845; instead fix the comment so it does not refer
to the old hack removed in r193312

Approved by:	re (implicit)
2009-07-24 15:37:02 +00:00
Sam Leffler
ff5aac8eb7 correct handling of IFF_PROMISC; this should not be pushed to the parent
device except for monitor and ahdemo mode vaps

Reviewed by:	rpaulo
Approved by:	re (kensmith)
2009-07-24 15:28:29 +00:00
Sam Leffler
983a2c892b monitor mode vaps are meant to be read-only so they can operate on any
frequency w/o regulatory issues, do this by hooking if_transmit and
if_output with routines that discard all transmits

Reviewed by:	thompsa, cbzimmer (intent)
Approved by:	re (kensmith)
2009-07-24 15:27:02 +00:00
Sam Leffler
224881b466 o kill old code no longer needed after r193312
o count output packets+errors for frames sent through ieee80211_output

Approved by:	re (kensmith)
2009-07-24 15:22:12 +00:00
John Baldwin
63cbfee7b0 Remove debugging that crept in with previous commit.
Reported by:	nwhitehorn
Approved by:	re (kib)
2009-07-24 15:06:49 +00:00
Brooks Davis
1b5768be71 Revert the changes to struct kinfo_proc in r194498. Instead, fill
in up to 16 (KI_NGROUPS) values and steal a bit from ki_cr_flags
(all bits currently unused) to indicate overflow with the new flag
KI_CRF_GRP_OVERFLOW.

This fixes procstat -s.

Approved by: re (kib)
2009-07-24 15:03:10 +00:00
John Baldwin
013818111a Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
provide aliases to other memory addresses.  The primary difference is that
it uses an sglist(9) to determine the physical addresses for a given offset
into the object instead of invoking the d_mmap() method in a device driver.

Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-24 13:50:29 +00:00
Robert Watson
d0728d7174 Introduce and use a sysinit-based initialization scheme for virtual
network stacks, VNET_SYSINIT:

- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will
  occur each time a network stack is instantiated and destroyed.  In the
  !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT.
  For the VIMAGE case, we instead use SYSINIT's to track their order and
  properties on registration, using them for each vnet when created/
  destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose
  previously, as well as its dependency scheme: we now just use the
  SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that
  they want init functions to be called for each virtual network stack
  rather than just once at boot, compiling down to DOMAIN_SET() in the
  non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead
  of modinfo or DOMAIN_SET() for init/uninit events.  In some cases,
  convert modular components from using modevent to using sysinit (where
  appropriate).  In some cases, do minor rejuggling of SYSINIT ordering
  to make room for or better manage events.

Portions submitted by:	jhb (VNET_SYSINIT), bz (cleanup)
Discussed with:		jhb, bz, julian, zec
Reviewed by:		bz
Approved by:		re (VIMAGE blanket)
2009-07-23 20:46:49 +00:00
Alan Cox
3313145243 Eliminate unnecessary cache and TLB flushes by pmap_change_attr(). (This
optimization was implemented in the amd64 version roughly 1 year ago.)

Approved by:	re (kensmith)
2009-07-23 19:43:23 +00:00
Nathan Whitehorn
ccf6415e82 Fix serial console on Apple Xserve G5 by falling back to input-device-1
if input-device is unavailable. The Xserve G5 defaults to using
screen/keyboard for output-device/input-device even if these are not
installed, and then falls back to serial ports at boot time.

Reviewed by:	marcel
Hardware from:	grehan
Approved by:	re (kib)
2009-07-23 12:51:27 +00:00
Rick Macklem
93a15e2054 When vfs.newnfs.callback_addr is set to an IPv4 address, the
experimental NFSv4 client might try and use it as an IPv6 address,
breaking callbacks. The fix simply initializes the isinet6 variable
for this case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 18:10:44 +00:00
Edward Tomasz Napierala
d2ceff236a Fix extattr_list_file(2) on ZFS in case the attribute directory
doesn't exist and user doesn't have write access to the file.
Without this fix, it returns bogus value instead of 0.  For some
reason this didn't manifest on my kernel compiled with -O0.

PR:		kern/136601
Submitted by:	Jaakko Heinonen <jh at saunalahti dot fi>
Approved by:	re (kib)
2009-07-22 15:15:58 +00:00
Rick Macklem
c79e697621 Add changes to the experimental nfs client to use the PBDRY flag for
msleep(9) when a vnode lock or similar may be held. The changes are
just a clone of the changes applied to the regular nfs client by
r195703.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:37:53 +00:00
Konstantin Belousov
206a336872 When the page caching attributes are changed, after new mapping is
established, OS shall flush the caches on all processors that may have
used the mapping previously. This operation is not needed if processors
support self-snooping. If not, but clflush instruction is implemented
on the CPU, series of the clflush can be used on the mapping region.
Otherwise, we have to flush the whole cache. The later operation is very
expensive, and AMD-made CPUs do not have self-snooping.

Implement cache flush for remapped region by using clflush for amd64,
when supported by CPU.

Proposed and reviewed by:	alc
Approved by:	re (kensmith)
2009-07-22 14:32:38 +00:00
Rick Macklem
7f1968ba10 When using an NFSv4 mount in the experimental nfs client with delegations
being issued from the server, there was a case where an Open issued locally
based on the delegation would be released before the associated vnode
became inactive. If the delegation was recalled after the open was released,
an Open against the server would not have been acquired and subsequent I/O
operations would need to use the special stateid of all zeros. This patch
fixes that case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:32:28 +00:00
Andrew Gallatin
94c7d993a3 mxge's tunable hw.mxge.rss_hash_type cannot be set from the
loader, because it uses a reserved suffix (_type).  Fix
this by removing the "_" and renaming the tunable to
hw.mxge.rss_hashtype.  The old (rss_hash_type) tunable is
still fetched, in case people load the driver via scripts.
When both are present in the kernel environment,
the new value (hw.mxge.rss_hashtype) overrides the old
value.

Approved by:	re (kib)
2009-07-22 11:57:34 +00:00
Bjoern A. Zeeb
a08362ce46 sysctl_msec_to_ticks is used with both virtualized and
non-vrtiualized sysctls so we cannot used one common function.

Add a macro to convert the arg1 in the virtualized case to
vnet.h to not expose the maths to all over the code.

Add a wrapper for the single virtualized call, properly handling
arg1 and call the default implementation from there.

Convert the two over places to use the new macro.

Reviewed by:	rwatson
Approved by:	re (kib)
2009-07-21 21:58:55 +00:00
Sam Leffler
e50821abe7 store mesh timers as ticks and sysctls for changing the defaults
Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:38:22 +00:00
Sam Leffler
411ccf5f63 Correct handling of keys that already have a hardware/device key index:
this was broken in r183248 when the check of wk_keyix was replaced by
a check of IEEE80211_KEY_DEVKEY (because the flag was clobbered).  Define
IEEE80211_KEY_DEVICE to specify flags that are owned by net80211/driver
and use this to preserve IEEE80211_KEY_DEVKEY so we don't ask the driver
for another key index when we already have one.

Testing by:	Daniel Thiele, Wes Morgan
Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:36:32 +00:00
Sam Leffler
bd6c4dd05d correct setup of opt_ddb.h
Submitted by:	jkim
Approved by:	re (kib)
2009-07-21 19:24:53 +00:00
Sam Leffler
82cadd5aa0 Fix handling of AR_RX_FILTER_BSSID: write the shadow value for AR_MISC_MODE
so other register writes preserve the setting of AR_MISC_MODE_BSSID_MATCH_FORCE.

Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:23:34 +00:00
Marius Strobl
fada2a867d Add a MD __PCI_BAR_ZERO_VALID which denotes that BARs containing 0
actually specify valid bases that should be treated just as normal.
The PCI specifications have no indication that 0 would be a magic value
indicating a disabled BAR as commonly used on at least amd64 and i386
but not sparc64. It's unclear what to do in pci_delete_resource()
instead of writing 0 to a BAR though as there's no (other) way do
disable individual BARs so its decoding is left enabled in case of
__PCI_BAR_ZERO_VALID for now.

Approved by:	re (kib), jhb
MFC after:	1 week
2009-07-21 19:06:39 +00:00
Sam Leffler
fe0dd78965 track whether any mesh vaps are present to correctly setup the rx filter
when, for example, an ap vap is created first

Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-21 19:01:04 +00:00
Alan Cox
9e367360e3 Catch up with r195249, "Improve the handling of cpuset with interrupts."
Specifically, update the return type of xenpic_assign_cpu() so that this
file compiles again.

Approved by:	re (kib)
2009-07-21 16:54:11 +00:00
Rui Paulo
a6d54dae20 Enable mesh support.
Submitted by:	jkim
Approved by:	re (kib)
2009-07-21 14:23:05 +00:00
Rui Paulo
5b4d5f9ee2 Improve the printf message when a module failed to load. This gives the
user some clue about the possibility of a __FreeBSD_version mismatch.

Discussed with:	rwatson, jhb
Approved by:	re (kib)
2009-07-21 14:18:25 +00:00
Alexander Motin
67b87e4429 Add siis CAM driver for SiliconImage SiI3124/3132/3531 SATA2 controllers.
Driver supports Serial ATA and ATAPI devices, Port Multipliers
(including FIS-based switching), hardware command queues (31 command
per port) and Native Command Queuing. This is probably the second on
popularity, after AHCI, type of SATA2 controllers, that benefits from
using CAM, because of hardware command queuing support.

Approved by:    re (kib)
2009-07-21 12:32:46 +00:00
Rafal Jaworowski
570255d8a5 Do not use OCP85XX_LBC_OFF twice when accessing LBC registers on MPC85XX.
It turns LBC control registers were not programmed correctly on MPC85XX. We
were accessing bogus addresses as the base offset (OCP85XX_LBC_OFF) was
erroneously added during offset calculations.  Effectively the state of LBC
control registers was not altered by the kernel initialization code, but
everything worked as long as we coincided to use the same settings (LBC decode
windows) as firmware has initialized.

Submitted by:	Lukasz Wojcik
Reviewed by:	marcel
Approved by:	re (kensmith)
Obtained from:	Semihalf
2009-07-21 08:38:45 +00:00
Rafal Jaworowski
84a5818564 Make dcache_inv_range() point to the proper routines on ARM9 and ARM9E/ARM10.
On some ARM variations CPU func dispatcher has the D-cache invalidate method
point to write-back invalidate, which is wrong, and can lead to a crash/panic
on affected platforms.

Spotted by:	HPS
Reviewed by:	cognet
Approved by:	re (kib)
2009-07-21 08:29:19 +00:00
Coleman Kane
e0d12c4b94 Fix regression in last set of commits. Submitted via e-mail and then
nagged again via PR. Thank Paul for his persistence and contributions.

PR:		136895
Submitted by:	Paul B. Mahol <onemda@gmail.com>
Reviewed by:	sam (timeout, 10 days), weongyo (timeout, 10 days), me
Approved by:	re (Kostik Belousov <kostikbel@gmail.com>)
2009-07-20 23:21:19 +00:00
Robert Watson
a511354af4 Back out the moving in r195782 of V_ip_id's initialization from the top
back to the bottom of ip_init() as found in 7.x.  I missed the fact that
the bottom half of the init routine only runs in the !VNET case.

Submitted by:	zec
Approved by:	re (vimage blanket)
2009-07-20 19:40:09 +00:00
Edward Tomasz Napierala
65588fd503 Fix permission handling for extended attributes in ZFS. Without
this change, ZFS uses SunOS Alternate Data Streams semantics - each
EA has its own permissions, which are set at EA creation time
and - unlike SunOS - invisible to the user and impossible to change.
From the user point of view, it's just broken: sometimes access
is granted when it shouldn't be, sometimes it's denied when
it shouldn't be.

This patch makes it behave just like UFS, i.e. depend on current
file permissions.  Also, it fixes returned error codes (ENOATTR
instead of ENOENT) and makes listextattr(2) return 0 instead
of EPERM where there is no EA directory (i.e. the file never had
any EA).

Reviewed by:	pjd (idea, not actual code)
Approved by:	re (kib)
2009-07-20 19:16:42 +00:00
Rui Paulo
c104cff26e More mesh bits, namely:
* bridge support (sam)
* handling of errors (sam)
* deletion of inactive routing entries
* more debug msgs (sam)
* fixed some inconsistencies with the spec.
* decap is now specific to mesh (sam)
* print mesh seq. no. on ifconfig list mesh
* small perf. improvements

Reviewed by:	sam
Approved by:	re (kib)
2009-07-20 19:12:08 +00:00
Robert Watson
0a4747d4d0 Garbage collect vnet module registrations that have neither constructors
nor destructors, as there's no actual work to do.

In most cases, the constructors weren't needed because of the existing
protocol initialization functions run by net_init_domain() as part of
VNET_MOD_NET, or they were eliminated when support for static
initialization of virtualized globals was added.

Garbage collect dependency references to modules without constructors or
destructors, notably VNET_MOD_INET and VNET_MOD_INET6.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-07-20 13:55:33 +00:00
Rafal Jaworowski
331f685743 ARM pmap fixes.
a)  nocache-remap problem

   When a page is remapped into a non-cacheable virtual memory region there
   was no associated write-back invalidate operation performed. We remove
   writeback of the original buffer size from bus_dmamem_alloc() and add
   appropriate L1/L2 flush operation.

b) missing write-back invalidate operation

   In pmap_kremove a page is removed so we must do a write-back
   invalidate operation aligned to the page virtual address.

Submitted by:	Michal Hajduk
Reviewed by:	Mark Tinguely, rpaulo, stas
Approved by:	re (kib)
Obtained from:	Semihalf
2009-07-20 07:53:07 +00:00
Robert Watson
17ef1feb8a Add macros VNET_SETNAME and VNET_SYMPREFIX, and expose to userspace if
_WANT_VNET is defined.  This way we don't need separate definitions in
libkvm.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-07-20 07:50:50 +00:00
Scott Long
62fdee1dda Fix an apparently harmless typo.
Approved by:	re
2009-07-20 03:59:00 +00:00
Alan Cox
9861cbc6ca Change the handling of fictitious pages by pmap_page_set_memattr() on
amd64 and i386.  Essentially, fictitious pages provide a mechanism for
creating aliases for either normal or device-backed pages.  Therefore,
pmap_page_set_memattr() on a fictitious page needn't update the direct
map or flush the cache.  Such actions are the responsibility of the
"primary" instance of the page or the device driver that "owns" the
physical address.  For example, these actions are already performed by
pmap_mapdev().

The device pager needn't restore the memory attributes on a fictitious
page before releasing it.  It's now pointless.

Add pmap_page_set_memattr() to the Xen pmap.

Approved by:	re (kib)
2009-07-19 21:40:19 +00:00
Konstantin Belousov
7ac5806bfb When buffer write is failed, it is wrong for brelse() to invalidate
portion of the page that was written. Among other problems, this
page might be picked up by pagedaemon, with failed assertion in
vm_pageout_flush() about validity of the page.

Reported and tested by:	pho
Approved by:	re (kensmith)
MFC after:	3 weeks
2009-07-19 20:25:59 +00:00
Robert Watson
006e9db452 Normalize field naming for struct vnet, fix two debugging printfs that
print them.

Reviewed by:	bz
Approved by:	re (kensmith, kib)
2009-07-19 17:40:45 +00:00
Ken Smith
3ca3047aee Bump the version of all non-symbol-versioned shared libraries in
preparation for 8.0-RELEASE.  Add the previous version of those
libraries to ObsoleteFiles.inc and bump __FreeBSD_Version.

Reviewed by:    kib
Approved by:    re (rwatson)
2009-07-19 17:25:24 +00:00
Sam Leffler
030d10ee97 add urtw
Approved by:	re (kib)
2009-07-19 16:54:24 +00:00
Rick Macklem
80e556956e Fix two bugs in the experimental nfs client:
- When the root vnode was acquired during mounting, mnt_stat.f_iosize was
  still set to 0, so getnewvnode() would set bo_bsize == 0. This would
  confuse getblk(), so that it always returned the first block causing
  the problem when the root directory of the mount point was greater
  than one block in size. It was fixed by setting mnt_stat.f_iosize to
  NFS_DIRBLKSIZ before calling ncl_nget() to acquire the root vnode.
- NFSMNT_INT was being set temporarily while the initial connect to a
  server was being done. This erroneously configured the krpc for
  interruptible RPCs, which caused problems because signals weren't
  being masked off as they would have been for interruptible mounts.
  This code was deleted to fix the problem. Since mount_nfs does an
  NFS null RPC before the mount system call, connections to the server
  should work ok.

Tested by:	swell dot k at gmail dot com
Approved by:	re (kensmith), kib (mentor)
2009-07-19 16:44:26 +00:00
Robert Watson
94da184f68 Expose the definitions of 'struct vnet' and 'VNET_MAGIC_N' to userspace
if _WANT_VNET is defined.  This is required so that libkvm can locate
virtual network stack instances in order to reach their global variables
for monitoring and crashdump analysis.

Reviewed by:	bz
Approved by:	re (kib)
2009-07-19 15:21:42 +00:00
Robert Watson
5ee847d3ac Reimplement and/or implement vnet list locking by replacing a mostly
unused custom mutex/condvar-based sleep locks with two locks: an
rwlock (for non-sleeping use) and sxlock (for sleeping use).  Either
acquired for read is sufficient to stabilize the vnet list, but both
must be acquired for write to modify the list.

Replace previous no-op read locking macros, used in various places
in the stack, with actual locking to prevent race conditions.  Callers
must declare when they may perform unbounded sleeps or not when
selecting how to lock.

Refactor vnet sysinits so that the vnet list and locks are initialized
before kernel modules are linked, as the kernel linker will use them
for modules loaded by the boot loader.

Update various consumers of these KPIs based on whether they may sleep
or not.

Reviewed by:	bz
Approved by:	re (kib)
2009-07-19 14:20:53 +00:00
Sam Leffler
519f677aff Move code that does payload realigment to a new routine, ieee80211_realign,
so it can be reused.  While here rewrite the logic to always use a single mbuf.

Reviewed by:	rpaulo
Approved by:	re (kib)
2009-07-18 20:19:53 +00:00
Bruce M Simpson
b36c89e55f Fix a problem, whereby misbehaving IPv6 applications, which don't include
a valid zone ID or interface identifier in a v6 multicast leave, would
trigger a fairly paranoid KASSERT().

Observed with Boost++ regression tests on ref8.freebsd.org.

Approved by:	re (kib)
2009-07-18 17:38:18 +00:00
Ulf Lilleengen
b79cac0f92 - Fix the issue with read access count modification on RAID-5 plexes properly.
If the access counts were not increased and decreased in equal numbers by
  gvinum consumers, the read access count would be inconsistent with the write
  access count. Instead, modify the read access count with the write access
  count directly to prevent any inconsistencies.

Approved by:	re (kib)
2009-07-18 11:12:48 +00:00
Alan Cox
13de722155 An addendum to r195649, "Add support to the virtual memory system for
configuring machine-dependent memory attributes...":

Don't set the memory attribute for a "real" page that is allocated to
a device object in vm_page_alloc().  It is a pointless act, because
the device pager replaces this "real" page with a "fake" page and sets
the memory attribute on that "fake" page.

Eliminate pointless code from pmap_cache_bits() on amd64.

Employ the "Self Snoop" feature supported by some x86 processors to
avoid cache flushes in the pmap.

Approved by:	re (kib)
2009-07-18 01:50:05 +00:00
Alexander Motin
b2da774623 Fix copy-paste bug. Use regular non-polled mode for executing FLUSHCACHE
command on disk close.

Approved by:	re (implicitly)
2009-07-17 21:48:08 +00:00
Rick Macklem
874bb76647 Patch the regular nfs client in a manner analagous to
r195704 for the experimental client. The patch avoids calling vn_lock()
for the case where nfs_nget() has acquired the same vnode as dvp,
since nfs_nget() has already locked the vnode.

Reviewed by:	kib, jhb
Approved by:	re (kensmith), kib (mentor)
2009-07-17 19:38:07 +00:00
Rui Paulo
8b9fde4324 Add IEEE80211_SUPPORT_MESH, following similar change to nanobsd and
other GENERIC kernels.

Approved by:	re (kib)
2009-07-17 18:35:45 +00:00
Jamie Gritton
7afcbc18b3 Remove the interim vimage containers, struct vimage and struct procg,
and the ioctl-based interface that supported them.

Approved by:	re (kib), bz (mentor)
2009-07-17 14:48:21 +00:00
Robert Watson
597df30e62 Import OpenBSM 1.1p1 from vendor branch to 8-CURRENT, populating
contrib/openbsm and a subset also imported into sys/security/audit.
This patch release addresses several minor issues:

- Fixes to AUT_SOCKUNIX token parsing.
- IPv6 support for au_to_me(3).
- Improved robustness in the parsing of audit_control, especially long
  flags/naflags strings and whitespace in all fields.
- Add missing conversion of a number of FreeBSD/Mac OS X errnos to/from BSM
  error number space.

MFC after:	3 weeks
Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
Approved by:	re (kib)
2009-07-17 14:02:20 +00:00
Robert Watson
5d171016e7 Vendor import of OpenBSM 1.1p1, which incorporates the following changes
since the last imported OpenBSM release:

OpenBSM 1.1p1

- Fixes to AUT_SOCKUNIX token parsing.
- IPv6 support for au_to_me(3).
- Improved robustness in the parsing of audit_control, especially long
  flags/naflags strings and whitespace in all fields.
- Add missing conversion of a number of FreeBSD/Mac OS X errnos to/from BSM
  error number space.

Obtained from:  TrustedBSD Project
Sponsored by:   Apple, Inc.
2009-07-17 12:18:39 +00:00
Robert Watson
1e77c1056a Remove unused VNET_SET() and related macros; only VNET_GET() is
ever actually used.  Rename VNET_GET() to VNET() to shorten
variable references.

Discussed with:	bz, julian
Reviewed by:	bz
Approved by:	re (kensmith, kib)
2009-07-16 21:13:04 +00:00
Alexander Motin
d96aeec8ff Limit IOCATAREQUEST ioctl data size to controller's maximum I/O size.
It fixes kernel panic when requested size is too large (0xffffffff),

PR:             kern/136726
Approved by:    re (kib)
MFC after:      2 weeks
2009-07-16 19:48:39 +00:00
Ken Smith
e57e15a182 Prepare for the 8.0-BETA2 builds.
Approved by:	re (implicit)
2009-07-15 17:29:05 +00:00
Andriy Gapon
b064b6d1cd dtrace_gethrtime: improve scaling of TSC ticks to nanoseconds
Currently dtrace_gethrtime uses formula similar to the following for
converting TSC ticks to nanoseconds:
rdtsc() * 10^9 / tsc_freq
The dividend overflows 64-bit type and wraps-around every 2^64/10^9 =
18446744073 ticks which is just a few seconds on modern machines.

Now we instead use precalculated scaling factor of
10^9*2^N/tsc_freq < 2^32 and perform TSC value multiplication separately
for each 32-bit half.  This allows to avoid overflow of the dividend
described above.
The idea is taken from OpenSolaris.
This has an added feature of always scaling TSC with invariant value
regardless of TSC frequency changes. Thus the timestamps will not be
accurate if TSC actually changes, but they are always proportional to
TSC ticks and thus monotonic. This should be much better than current
formula which produces wildly different non-monotonic results on when
tsc_freq changes.

Also drop write-only 'cp' variable from amd64 dtrace_gethrtime_init()
to make it identical to the i386 twin.

PR:		kern/127441
Tested by:	Thomas Backman <serenity@exscape.org>
Reviewed by:	jhb
Discussed with:	current@, bde, gnn
Silence from:	jb
Approved by:	re (gnn)
MFC after:	1 week
2009-07-15 17:07:39 +00:00
Robert Watson
cd2b95a237 r195699 introduced an assertion regarding when progbits data in kernel
modules was present, which turns out to be false in some situations.
Back out the assertion.

Reported by:	Luiz Otavio O Souza <lists.br at gmail.com>,
		Florian Smeets <flo at kasimir.com>
Approved by:	re (kensmith) (implicit)
2009-07-15 09:19:01 +00:00
Robert Watson
c1e200ffcc Add missing license line for vnet.h, correct white space nit.
Approved by:	re (kensmith) (implicit)
2009-07-15 00:56:15 +00:00
Rick Macklem
405229f913 Fix the experimental nfs client so that it does not cause a
"share->excl" panic when doing a lookup of dotdot at the root
of a server's file system. The patch avoids calling vn_lock()
for that case, since nfscl_nget() has already acquired a lock
for the vnode.

Approved by:	re (kensmith), kib (mentor)
2009-07-14 23:10:23 +00:00
Konstantin Belousov
b35687df13 Use PBDRY flag for msleep(9) in NFS and NLM when sleeping thread owns
kernel resources that block other threads, like vnode locks. The SIGSTOP
sent to such thread (process, rather) shall not stop it until thread
releases the resources.

Tested by:	pho
Reviewed by:	jhb
Approved by:	re (kensmith)
2009-07-14 22:54:29 +00:00
Konstantin Belousov
f33a947b56 Add new msleep(9) flag PBDY that shall be specified together with
PCATCH, to indicate that thread shall not be stopped upon receipt of
SIGSTOP until it reaches the kernel->usermode boundary.

Also change thread_single(SINGLE_NO_EXIT) to only stop threads at
the user boundary unconditionally.

Tested by:	pho
Reviewed by:	jhb
Approved by:	re (kensmith)
2009-07-14 22:52:46 +00:00
Konstantin Belousov
79799053a7 Move the repeated code to calculate the number of the threads in the
process that still need to be suspended or exited from thread_single
into the new function calc_remaining().

Tested by:	pho
Reviewed by:	jhb
Approved by:	re (kensmith)
2009-07-14 22:51:31 +00:00
Konstantin Belousov
c8167830f9 When wakeup(9) is going to notify swapper, assert that wait channel is not
equal to &proc0. It shall be not, since proc0 stack is not swappable, and
kick_proc0() is wakeup(&proc0).

Reviewed by:	jhb
Approved by:	re (kensmith)
2009-07-14 22:50:41 +00:00
Robert Watson
eddfbb763d Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
John Baldwin
0fe0ed8bf8 - Change mmap() to fail requests with EINVAL that pass a length of 0. This
behavior is mandated by POSIX.
- Do not fail requests that pass a length greater than SSIZE_MAX
  (such as > 2GB on 32-bit platforms).  The 'len' parameter is actually
  an unsigned 'size_t' so negative values don't really make sense.

Submitted by:	Alexander Best  alexbestms at math.uni-muenster.de
Reviewed by:	alc
Approved by:	re (kib)
MFC after:	1 week
2009-07-14 19:45:36 +00:00
Bjoern A. Zeeb
6c19585325 Re-add opt_inet.h, as we did in r193862 and lost yet again.
Approved by:	re (kib)
2009-07-14 19:32:36 +00:00
Alexander Motin
5487ee5507 Disable MSI by default for nVidia MCP55 chipset.
It is reported to be broken in the same way as MCP51.

PR:		kern/136429
Approved by:	re (kib)
2009-07-14 19:18:31 +00:00
Ariff Abdullah
df41a638f7 - Do aggresive saturation on various polynomial interpolators.
This dramatically pushing 99.9% interpolations and quantizations
  error _below_ -180dB on 32bit dynamic range, resulting extremely
  high quality conversion.
- Use BSPLINE interpolator for filter oversampling factor greater or
  equal than 64 (log2 6).

Approved by:	re (kib)
2009-07-14 18:53:34 +00:00
Ed Maste
c25ebae365 Change xpt_scan_bus to scsi_scan_bus and xpt_scan_lun to scsi_scan_lun
in comments and printfs to match new function names after refacoring.

Approved by:	re
2009-07-14 18:44:17 +00:00
Ed Maste
399831bce6 Fix leaks in probestart, probedone, and scsi_scan_bus. Also free
page_list using the matching malloc type for the allocation.

Approved by:	re
Reviewed by:	scottl [1]
MFC after:	1 week

[1] Original patch was against xpt_cam.c, prior to the cam refactoring.
2009-07-14 17:26:37 +00:00
Lawrence Stewart
5f1ff8136a Fix a buglet that slipped into r195654. My buildworld/buildkernel sanity
check missed this because cxgb's TOM is currently commented out of the build
system.

Submitted by:	Navdeep Parhar <np at FreeBSD dot org>
Approved by:	re (kensmith), kensmith (mentor temporarily unavailable)
2009-07-14 11:53:21 +00:00
Tai-hwa Liang
3d22427cff Adding hardware ID for RTL810x PCIe found on HP Pavilion DV2-1022AX.
Reviewed by:	yongari
Approved by:	re (kib, kensmith)
2009-07-14 04:35:13 +00:00
Jung-uk Kim
e8c4d3e407 Match PCI Express root bridge _HID directly instead of
relying on _CID.

Reviewed by:	jhb
Approved by:	re (kib)
2009-07-13 21:36:31 +00:00
Alexander Motin
3ccda2f312 Fix copy-paste bug, enabling SIM PMP support, when it was not really found.
Approved by:	re (implicitly)
2009-07-13 21:21:30 +00:00
Scott Long
de985f6174 Revert the CISS driver to 64K i/o, the previous change was in error and
missing a lot of needed infrastructure.

Approved by:	re
2009-07-13 20:19:29 +00:00
Rui Paulo
aec0289d59 Fix inline function declaration and prototype.
Approved by:	re (kensmith)
2009-07-13 18:23:58 +00:00
Alan Cox
bc5c48f12f Correct an error of omission in r195649 ("Add support to the virtual memory
system for configuring machine-dependent memory attributes: ...").  In
r195649, the "vm_cache_mode_t/vm_memattr_t" parameter was removed from
vm_phys_alloc_contig().

Approved by:	re (kensmith)
2009-07-13 18:11:59 +00:00
Alexander Motin
45a30a41d2 Fix Marvel SATA controllers operation, broken by rev. 188765,
by using uninitialized variable.

Tested by:	Chris Hedley
Approved by:	re (kensmith)
2009-07-13 18:01:49 +00:00
Lawrence Stewart
91a5ebde45 Fix a race in the manipulation of the V_tcp_sack_globalholes global variable,
which is currently not protected by any type of lock. When triggered, the bug
would sometimes cause a panic when the TCP activity to an affected machine
eventually slowed during a lull. The panic only occurs if INVARIANTS is compiled
into the kernel, and has laid dormant for some time as a result of INVARIANTS
being off by default except in FreeBSD-CURRENT.

Switch to atomic operations in the locations where the variable is changed.
Reads have not been updated to be protected by atomics, so there is a
possibility of accounting errors in any given calculation where the variable is
read. This is considered unlikely to occur in the wild, and will not cause
serious harm on rare occasions where it does.

Thanks to Robert Watson for debugging help.

Reported by:	Kamigishi Rei <spambox at haruhiism dot net>
Tested by:	Kamigishi Rei <spambox at haruhiism dot net>
Reviewed by:	silby
Approved by:	re (rwatson), kensmith (mentor temporarily unavailable)
2009-07-13 11:59:38 +00:00
Lawrence Stewart
237fbe0a1c Replace struct tcpopt with a proxy toeopt struct in the TOE driver interface to
the TCP syncache. This returns struct tcpopt to being private within the TCP
implementation, thus allowing it to be modified without ABI concerns.

The patch breaks the ABI. Bump __FreeBSD_version to 800103 accordingly. The cxgb
driver is the only TOE consumer affected by this change, and needs to be
recompiled along with the kernel.

Suggested by:	rwatson
Reviewed by:	rwatson, kmacy
Approved by:	re (kensmith), kensmith (mentor temporarily unavailable)
2009-07-13 11:51:02 +00:00
Alexander Motin
f09f8e3e0b Rename ATA probe driver to "aprobe" to resolve name conflict with SCSI
and fix loading cam as module.

Approved by:	re (implicitly)
2009-07-13 06:12:21 +00:00
Alan Cox
3153e878dd Add support to the virtual memory system for configuring machine-
dependent memory attributes:

Rename vm_cache_mode_t to vm_memattr_t.  The new name reflects the
fact that there are machine-dependent memory attributes that have
nothing to do with controlling the cache's behavior.

Introduce vm_object_set_memattr() for setting the default memory
attributes that will be given to an object's pages.

Introduce and use pmap_page_{get,set}_memattr() for getting and
setting a page's machine-dependent memory attributes.  Add full
support for these functions on amd64 and i386 and stubs for them on
the other architectures.  The function pmap_page_set_memattr() is also
responsible for any other machine-dependent aspects of changing a
page's memory attributes, such as flushing the cache or updating the
direct map.  The uses include kmem_alloc_contig(), vm_page_alloc(),
and the device pager:

  kmem_alloc_contig() can now be used to allocate kernel memory with
  non-default memory attributes on amd64 and i386.

  vm_page_alloc() and the device pager will set the memory attributes
  for the real or fictitious page according to the object's default
  memory attributes.

Update the various pmap functions on amd64 and i386 that map pages to
incorporate each page's memory attributes in the mapping.

Notes: (1) Inherent to this design are safety features that prevent
the specification of inconsistent memory attributes by different
mappings on amd64 and i386.  In addition, the device pager provides a
warning when a device driver creates a fictitious page with memory
attributes that are inconsistent with the real page that the
fictitious page is an alias for. (2) Storing the machine-dependent
memory attributes for amd64 and i386 as a dedicated "int" in "struct
md_page" represents a compromise between space efficiency and the ease
of MFCing these changes to RELENG_7.

In collaboration with: jhb

Approved by:	re (kib)
2009-07-12 23:31:20 +00:00
Qing Li
05b262e264 This patch adds a host route to an interface address (that is assigned
to a non loopback/ppp link type) through the loopback interface. Prior
to the new L2/L3 rewrite, this host route was explicitly created when
processing the IPv6 address assignment. This loopback host route is
deleted when that IPv6 address is removed from the interface.

Reviewed by:	bz, gnn
Approved by:	re
2009-07-12 19:20:55 +00:00
Rick Macklem
089f366ab0 Add calls to the experimental nfs client for the case of an "intr" mount,
so that signals that aren't supposed to terminate RPCs in progress are
masked off during the RPC.

Approved by:	re (kensmith), kib (mentor)
2009-07-12 17:07:35 +00:00
Rick Macklem
ad86aef9af Fix the handling of dotdot in lookup for the experimental nfs client
in a manner analagous to the change in r195294 for the regular nfs client.

Approved by:	re (kensmith), kib (mentor)
2009-07-12 17:02:17 +00:00
Marcel Moolenaar
c5e2d1885c Isochronous transfers only have 1 frame buffer, but multiple
frame lengths. The frame buffer is at index 0.

Approved by:	re (kensmith)
Obtained from:	HPS
2009-07-12 16:50:32 +00:00
Marcel Moolenaar
58f6d27320 MFp4:
USB CORE: busdma improvement

      For single segment allocations the boundary field
      of the BUSDMA tag should be zero. Currently all
      single segment allocations are less than or equal
      to 4096 bytes, so the limit does not kick in. If
      any single segment USB allocations would be greater
      than 4K, then it would be a problem.

Approved by:	re (kensmith)
Obtained from:	HPS
2009-07-12 16:46:43 +00:00
Konstantin Belousov
529ab57b9a When VM_MAP_WIRE_HOLESOK is not specified and vm_map_wire(9) encounters
non-readable and non-executable map entry, the entry is skipped from
wiring and loop is aborted. But, since MAP_ENTRY_WIRE_SKIPPED was not
set for the map entry, its wired_count is later erronously decremented.
vm_map_delete(9) for such map entry stuck in "vmmaps".

Properly set MAP_ENTRY_WIRE_SKIPPED when aborting the loop.

Reported by:	John Marshall <john.marshall riverwillow com au>
Approved by:	re (kensmith)
2009-07-12 12:37:38 +00:00
Lawrence Stewart
962ebef8c0 Pad the following TCP related structs to allow MFCs of upcoming features/fixes
back to the 8 branch:

tcp_var.h
- struct sackhint
- struct tcpcb
- struct tcpstat

The patch breaks the ABI. Bump __FreeBSD_version to 800102 accordingly. User
space tools that rely on the size of any of these structs (e.g. sockstat) need
to be recompiled.

Reviewed by:	rpaulo, sam, andre, rwatson
Approved by:	re & mentor (gnn)
2009-07-12 09:14:28 +00:00
Marcel Moolenaar
31d22003dd Rename option USBVERBOSE to USB_VERBOSE for 2 reasons:
1.  USB_VERBOSE is more consistent with USB_DEBUG,
2.  sys/dev/usb/usb_device.c uses option USB_VERBOSE and
    not USBVERBOSE.

POLA with the USBVERBOSE option as it's found in 7-STABLE
has been considered but found insignificant in the face
of the USB stack overhaul.

Approved by:	re (kensmith)
2009-07-12 04:48:47 +00:00
Nathan Whitehorn
f136543e41 Increase the size of the page table on 64-bit PowerPC machines as a
bandaid to prevent exhaustion of the primary and secondary hash groups
in the event of extreme stress on the PMAP layer (e.g. a forkbomb). This
wastes memory, and should be revised to properly handle PTEG spills instead.

Suggested by:	grehan
Approved by:	re (kensmith)
2009-07-12 04:07:52 +00:00
Marcel Moolenaar
b54b764c72 Revert rev 192323 (nfs_common.c only):
The D-cache flushing added here was to deal with I-cache
incoherency observed on ia64. However, the problem was
in the implementation of pmap_enter_object() for ia64:
it was missing I-cache coherency logic for prefaulted
pages. After this got added in rev 195625, testing showed
that no D-cache flushing was required.

The SIGILL that was observed on Book-E (see commit log
for rev 192323) ended up not being related to I-cache
incoherency, but was found to be caused by bad memory.
This discovery further undermined the need for D-cache
flushing in the NFS I/O code, triggering the reversal.

Approved by:	re (kensmith)
2009-07-12 03:53:52 +00:00
Marcel Moolenaar
dd8a461e83 In nvpair_native_embedded_array(), meaningless pointers are zeroed.
The programmer was aware that alignment was not guaranteed in the
packed structure and used bzero() to NULL out the pointers.
However, on ia64, the compiler is quite agressive in finding ILP
and calls to bzero() are often replaced by simple assignments (i.e.
stores). Especially when the width or size in question corresponds
with a store instruction (i.e. st1, st2, st4 or st8).

The problem here is not a compiler bug. The address of the memory
to zero-out was given by '&packed->nvl_priv' and given the type of
the 'packed' pointer the compiler could assume proper alignment for
the replacement of bzero() with an 8-byte wide store to be valid.
The problem is with the programmer. The programmer knew that the
address did not have the alignment guarantees needed for a regular
assignment, but failed to inform the compiler of that fact. In
fact, the programmer told the compiler the opposite: alignment is
guaranteed.

The fix is to avoid using a pointer of type "nvlist_t *" and
instead use a "char *" pointer as the basis for calculating the
address. This tells the compiler that only 1-byte alignment can
be assumed and the compiler will either keep the bzero() call
or instead replace it with a sequence of byte-wise stores. Both
are valid.

Approved by:	re (kib)
2009-07-11 22:43:20 +00:00
Colin Percival
7d845dde8d Remove build timestamps from the following files:
/boot/kernel/hptrr.ko
/etc/mail/*.cf
/lib/libcrypto.so.5
/usr/bin/ntpq
/usr/sbin/amd
/usr/sbin/iasl
/usr/sbin/ntpd
/usr/sbin/ntpdate
/usr/sbin/ntpdc

There does not appear to be any purpose to having these timestamps, and
they have the irritating consequence that the aforementioned files will
be different every time they are rebuilt.

After this commit, the only remaining build timestamps are in the kernel,
the boot loaders, /usr/include/osreldate.h (the year in the copyright
notice), and lib*.a (the timestamps on all of the included .o files).

Reviewed by:	scottl (hptrr), gshapiro (sendmail), simon (openssl),
		roberto (ntp), jkim (acpica)
Approved by:	re (kib)
2009-07-11 22:30:37 +00:00
Marcel Moolenaar
1ed01448fb On exec(2), when loading the ELF image, pmap_enter_object() is
called to prefault pages. This is an obvious place for making
sure the I-cache is coherent. It was missing though. As such,
execution over NFS and ZFS file systems was failing. NFS was
fixed the wrong way (by flushing the D-cache as part of the
NFS code) in a previous commit. ZFS problems were encountered
after that and indicated that something else was wrong...

Approved by:	re (kib)
2009-07-11 22:27:20 +00:00
Kip Macy
6a7bff2c31 Re-factoring for adding weighted routes introduced a
fairly irritating bug where the system will panic
when RADIX_MPATH is enabled. This change fixes this.

Approved by:	re@
2009-07-11 21:56:23 +00:00
Rui Paulo
7d26189145 Fix something bogus deletion that got it during mesh commit.
Approved by:	re (implicit)
2009-07-11 16:02:06 +00:00
Rui Paulo
59aa14a91d Implementation of the upcoming Wireless Mesh standard, 802.11s, on the
net80211 wireless stack. This work is based on the March 2009 D3.0 draft
standard. This standard is expected to become final next year.
This includes two main net80211 modules, ieee80211_mesh.c
which deals with peer link management, link metric calculation,
routing table control and mesh configuration and ieee80211_hwmp.c
which deals with the actually routing process on the mesh network.
HWMP is the mandatory routing protocol on by the mesh standard, but
others, such as RA-OLSR, can be implemented.

Authentication and encryption are not implemented.

There are several scripts under tools/tools/net80211/scripts that can be
used to test different mesh network topologies and they also teach you
how to setup a mesh vap (for the impatient: ifconfig wlan0 create
wlandev ... wlanmode mesh).

A new build option is available: IEEE80211_SUPPORT_MESH and it's enabled
by default on GENERIC kernels for i386, amd64, sparc64 and pc98.

Drivers that support mesh networks right now are: ath, ral and mwl.

More information at: http://wiki.freebsd.org/WifiMesh

Please note that this work is experimental. Also, please note that
bridging a mesh vap with another network interface is not yet supported.

Many thanks to the FreeBSD Foundation for sponsoring this project and to
Sam Leffler for his support.
Also, I would like to thank Gateworks Corporation for sending me a
Cambria board which was used during the development of this project.

Reviewed by:	sam
Approved by:	re (kensmith)
Obtained from:	projects/mesh11s
2009-07-11 15:02:45 +00:00
Jung-uk Kim
35ea6959ac Get correct maxio from the controller and drop the tunable.
The default (64K) is too pessimistic for "new comm" hardware.
Also, this is bad because multiple controllers get limited by
the global tunable.

Reviewed by:	scottl
Approved by:	re (kensmith)
2009-07-11 08:10:18 +00:00
Rui Paulo
820e6a1f38 For ic_opmode switch cases, provide a default label with a printf saying
this opmode is not supported.

Approved by:	re (kib)
2009-07-10 15:28:33 +00:00
Sam Leffler
8b10ecb67c mark struct ieee80211req_maclist packed so sizeof works as intended on arm;
fixes "list mac"

Approved by:	re (kensmith)
2009-07-10 15:26:33 +00:00
Konstantin Belousov
d77e2734a1 When amd64 CPU cannot load segment descriptor during trap return to
usermode, it generates GPF, that is mirrored to user mode as SIGSEGV.
The offending register in mcontext should contain the value loading of
which generated the GPF, and it is so on i386. On amd64, we currently
report segment descriptor in tf_err, while segment register contains the
corrected value loaded by trap handler.

Fix the issue by behaving like i386, reloading segment register in trap
frame after signal frame is pushed onto user stack.

Noted and tested by:	pho
Approved by:	re (kensmith)
2009-07-10 10:29:16 +00:00
Scott Long
52c9ce25d8 Separate the parallel scsi knowledge out of the core of the XPT, and
modularize it so that new transports can be created.

Add a transport for SATA

Add a periph+protocol layer for ATA

Add a driver for AHCI-compliant hardware.

Add a maxio field to CAM so that drivers can advertise their max
I/O capability.  Modify various drivers so that they are insulated
from the value of MAXPHYS.

The new ATA/SATA code supports AHCI-compliant hardware, and will override
the classic ATA driver if it is loaded as a module at boot time or compiled
into the kernel.  The stack now support NCQ (tagged queueing) for increased
performance on modern SATA drives.  It also supports port multipliers.

ATA drives are accessed via 'ada' device nodes.  ATAPI drives are
accessed via 'cd' device nodes.  They can all be enumerated and manipulated
via camcontrol, just like SCSI drives.  SCSI commands are not translated to
their ATA equivalents; ATA native commands are used throughout the entire
stack, including camcontrol.  See the camcontrol manpage for further
details.  Testing this code may require that you update your fstab, and
possibly modify your BIOS to enable AHCI functionality, if available.

This code is very experimental at the moment.  The userland ABI/API has
changed, so applications will need to be recompiled.  It may change
further in the near future.  The 'ada' device name may also change as
more infrastructure is completed in this project.  The goal is to
eventually put all CAM busses and devices until newbus, allowing for
interesting topology and management options.

Few functional changes will be seen with existing SCSI/SAS/FC drivers,
though the userland ABI has still changed.  In the future, transports
specific modules for SAS and FC may appear in order to better support
the topologies and capabilities of these technologies.

The modularization of CAM and the addition of the ATA/SATA modules is
meant to break CAM out of the mold of being specific to SCSI, letting it
grow to be a framework for arbitrary transports and protocols.  It also
allows drivers to be written to support discrete hardware without
jeopardizing the stability of non-related hardware.  While only an AHCI
driver is provided now, a Silicon Image driver is also in the works.
Drivers for ICH1-4, ICH5-6, PIIX, classic IDE, and any other hardware
is possible and encouraged.  Help with new transports is also encouraged.

Submitted by:	scottl, mav
Approved by:	re
2009-07-10 08:18:08 +00:00
Sam Leffler
f6c09dd6a8 correctly set the tailq ptr when removing the last item in the q
Approved by:	re (kensmith)
2009-07-10 02:19:57 +00:00
Ariff Abdullah
5c663ce9cf Rearrange shift operation to increase interpolation accuracy,
further reducing conversion artifacts and better worst case SNR.

Approved by:	re (kib)
2009-07-09 22:21:18 +00:00
Navdeep Parhar
adb1423aa6 Fix cxgb(4) panic with jumbo frames.
Reviewed by:	kmacy
Approved by:	re (kib), gnn (mentor)
2009-07-09 19:27:58 +00:00
Rick Macklem
9ca27b565b Since the nfscl_getclose() function both decremented open counts and,
optionally, created a separate list of NFSv4 opens to be closed, it
was possible for the associated OpenOwner to be free'd before the Open
was closed. The problem was that the Open was taken off the OpenOwner
list before the Close RPC was done and OpenOwners can be free'd once the
list is empty. This patch separates out the case of doing the Close RPC
into a separate function called nfscl_doclose() and simplifies nfsrpc_doclose()
so that it closes a single open instead of a list of them. This avoids
removing the Open from the OpenOwner list before doing the Close RPC.

Approved by:	re (kensmith), kib (mentor)
2009-07-09 19:00:29 +00:00
Konstantin Belousov
7c6d401c75 The control terminal revocation at the session leader exit does not
correctly checks for reclaimed vnode, possibly calling VOP_REVOKE for
such vnode. If the terminal is already revoked, or devfs mount was
forcibly unmounted, the revocation of doomed ctty vnode causes panic.

Reported and tested by:	lstewart
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-09 18:54:38 +00:00
Konstantin Belousov
1d587edb3a Extend the cn_flags field of the struct componentname to 64 bits to have
more space for the flags, that is too close to be exhausted. While changing
the KBI for name(9), use unsigned int for symlinks count.

Suggested by:	rwatson
Approved by:	re (kensmith)
2009-07-09 18:49:26 +00:00
Robert Noland
87c73f89a9 Add support for Radeon HD 4770 (RV740) chips.
Approved by:	re@ (kib)
MFC after:	3 days
2009-07-09 16:39:28 +00:00
Konstantin Belousov
a2622e5dc2 Restore the segment registers and segment base MSRs for amd64 syscall
return path only when neither thread was context switched while
executing syscall code nor syscall explicitely modified LDT or MSRs.

Save segment registers in trap handlers before interrupts are enabled,
to not allow context switches to happen before registers are saved.
Use separated byte in pcb for indication of fast/full return, since
pcb_flags are not synchronized with context switches.

The change puts back syscall microbenchmark numbers that were slowed
down after commit of the support for LDT on amd64.

Reviewed by:	jeff
Tested (and tested, and tested ...) by:	pho
Approved by:	re (kensmith)
2009-07-09 09:34:11 +00:00
Pyun YongHyeon
8e95322a35 Make xl(4) build with Tx checksum offload.
PR:		kern/136409
Approved by:	re (kib)
2009-07-09 01:58:59 +00:00
Jamie Gritton
9dce97d788 Remove crcopy call from seteuid now that it calls crcopysafe.
Reviewed by:	brooks
Approved by:	re (kib), bz (mentor)
2009-07-08 21:45:48 +00:00
Edward Tomasz Napierala
51334c8257 Regen the freebsd32 parts.
Approved by:	re (kib)
2009-07-08 16:30:34 +00:00
Edward Tomasz Napierala
92f76afe3a Fix freebsd32 version of lpathconf(2).
Approved by:	re (kib)
2009-07-08 16:26:43 +00:00
Edward Tomasz Napierala
e2b881bf03 Regenerate after lpathconf(2) addition.
Approved by:	re (kib)
2009-07-08 15:25:27 +00:00
Edward Tomasz Napierala
c38898116a There is an optimization in chmod(1), that makes it not to call chmod(2)
if the new file mode is the same as it was before; however, this
optimization must be disabled for filesystems that support NFSv4 ACLs.
Chmod uses pathconf(2) to determine whether this is the case - however,
pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't.

This change adds lpathconf(3) to make it possible to solve that problem
in a clean way.

Reviewed by:	rwatson (earlier version)
Approved by:	re (kib)
2009-07-08 15:23:18 +00:00
Ed Schouten
6b53d5c0e7 Fix regressions in return events of poll() on TTYs.
As pointed out, POLLHUP should be generated, even if it hasn't been
specified on input. It is also not allowed to return both POLLOUT and
POLLHUP at the same time.

Reported by:	jilles
Approved by:	re (kib)
2009-07-08 10:21:52 +00:00
Alexander Motin
d498a2e62b Fix kernel panic, when ataahci driver is used on system with increased
MAXPHYS. Current ataahci driver memory allocation scheme includes only
64 items in DMA S/G table, and so not guarantied to support transactions
with more then 252K data.

Approved by:    re (kensmith)
MFC after:      2 weeks
2009-07-08 06:00:21 +00:00
Marcel Moolenaar
f43b57e32a Revert revisions 188839 and 188868. Use of the ioctl in geom_dev.c
is invalid because the ioctl happens without prior open. The ioctl
got introduced to provide backward compatibility for extended
partitions, but it ended up not being used because it didn't work
as expected. Since there are no consumers of the ioctl and the
implementation is broken, the best fix is to remove the code
entirely.

Spotted by:	phk
Approved by:	re (kensmith)
2009-07-08 05:56:14 +00:00
Mike Silbersack
347e22a68b Increase HZ_VM from 10 to 100. While 10 hz saves cpu time
under VM environments, it's too slow for FreeBSD to work
properly.  For example, ping at 10hz pings about every 600ms
instead of about every second.

Approved by:	re (kib)
2009-07-08 01:09:12 +00:00
Sam Leffler
6552698302 Fix ar5416 and later parts on big-endian platforms: setup the h/w byte
swizzler using the same technique used everywhere else.

Approved by:	re (kib)
2009-07-07 18:11:05 +00:00
Konstantin Belousov
7f5dff5064 Fix poll(2) and select(2) for named pipes to return "ready for read"
when all writers, observed by reader, exited. Use writer generation
counter for fifo, and store the snapshot of the fifo generation in the
f_seqcount field of struct file, that is otherwise unused for fifos.
Set FreeBSD-undocumented POLLINIGNEOF flag only when file f_seqcount is
equal to fifo' fi_wgen, and revert r89376.

Fix POLLINIGNEOF for sockets and pipes, and return POLLHUP for them.
Note that the patch does not fix not returning POLLHUP for fifos.

PR:	kern/94772
Submitted by:	bde (original version)
Reviewed by:	rwatson, jilles
Approved by:	re (kensmith)
MFC after:	6 weeks (might be)
2009-07-07 09:43:44 +00:00
Ken Smith
55b57bd274 Bump for BETA1.
Approved by:	re (implicit)
2009-07-07 00:02:26 +00:00
Sam Leffler
69ad6b3450 Fix AR5416 and later parts when building with AH_DEBUG or similar defined:
always define OS_REG_UNSWAPPED and use it in ath_hal_reg_{read,write}.

Approved by:	re (kib)
2009-07-06 20:51:54 +00:00
Alan Cox
133898afd5 When pmap_change_attr() changes the PAT setting on a kernel mapping, it has
to simultaneously change the PAT setting for the same pages within the
direct map region.  This may require the demotion of a 2MB page mapping and
the allocation of a page table page.  This revision gives the highest
possible priority (VM_ALLOC_INTERRUPT) to this page allocation, so that
pmap_change_attr() is less likely to fail.  (In general, kernel page table
page allocations have the highest priority, so this is not creating a new
precedent.)

(Demotion of 1GB page mappings within the direct map already specifies
VM_ALLOC_INTERRUPT to vm_page_alloc(), so only pmap_demote_pde() must be
changed.)

Approved by:	re (kib)
2009-07-06 18:43:42 +00:00
John Baldwin
0d0a6650d7 After the per-CPU IDT changes, the IDT vector of an interrupt could change
when the interrupt was moved from one CPU to another.  If the interrupt was
enabled, then the old IDT vector needs to be disabled and the new IDT vector
needs to be enabled.  This was mostly masked prior to the recent MSI changes
since in the older code almost all allocated IDT vectors were already enabled
and the enabled vectors on the BSP during boot covered enough of the IDT
range.  However, after the MSI changes, MSI interrupts that were allocated
but not enabled (e.g. DRM with MSI) during boot could result in an allocated
IDT vector that wasn't enabled.  The round-robin at the end of boot could
place another interrupt at the same IDT vector without enabling the IDT
vector causing trap 30 faults.

Fix this by explicitly disabling/enabling the old and new IDT vectors for
enabled interrupt sources when moving an interrupt between CPUs via the
pic_assign_cpu() method.  While here, fix a bug in my earlier changes so
that an I/O APIC interrupt pin is left unchanged if ioapic_assign_cpu()
fails to allocate a new IDT vector and returns ENOSPC.

Approved by:	re (kensmith)
2009-07-06 18:23:00 +00:00
John Baldwin
f7d7cd0c76 MFi386: Add a 'show idt' command to DDB to display the non-default function
pointers in the interrupt descriptor table.

Approved by:	re (kensmith)
2009-07-06 18:10:27 +00:00
Jack F Vogel
38537cb9d3 The new method of reading the mac address from the
RAR(0) register does not work on this old adapter,
provide a local routine that does it the older way.

Approved by:  re
2009-07-06 17:23:48 +00:00
Alan Cox
0e18ab26d0 PAE adds another level to the i386 page table. This level is a small
4-entry table that must be located within the first 4GB of RAM.  This
requirement is met by defining an UMA zone with a custom back-end
allocator function.  This revision makes two changes to this back-end
allocator function: (1) It replaces the use of contigmalloc() with the
use of kmem_alloc_contig().  This eliminates "double accounting", i.e.,
accounting by both the UMA zone and malloc tags.  (I made the same
change for the same reason to the zones supporting jumbo frames a week
ago.) (2) It passes through the "wait" parameter, i.e., M_WAITOK,
M_ZERO, etc. to kmem_alloc_contig() rather than ignoring it.
pmap_init() calls uma_zalloc() with both M_WAITOK and M_ZERO.  At the
moment, this is harmless only because the default behavior of
contigmalloc()/kmem_alloc_contig() is to wait and because pmap_init()
doesn't really depend on the memory being zeroed.

The back-end allocator function in the Xen pmap is dead code.  I am
changing it nonetheless because I don't want to leave any "bad examples"
in the source tree for someone to copy at a later date.

Approved by:	re (kib)
2009-07-05 21:40:21 +00:00
Sam Leffler
24d3677b9d catchup with action+ageq additions
Submitted by:	"Paul B. Mahol" <onemda@gmail.com>
Approved by:	re (implicit)
2009-07-05 21:19:10 +00:00
Sam Leffler
2affcf8d1d add missing bit of r195379
Approved by:	re (kensmith)
2009-07-05 20:44:50 +00:00
Sam Leffler
5b16c28c42 Add ieee80211_ageq; a facility for staging packets that require
long-term work before they can be serviced.  Packets are tagged and
assigned an age (in seconds) at the point they are added to the
queue.  If a packet is not retrieved before it's age expires it is
reclaimed.  Tagging can take two forms: a reference to an ieee80211_node
(as happens in the tx path) or an opaque token in cases where there
is no reference or the node structure is not stable (i.e. it's going
to be destroyed).

o add ic_stageq to replace the per-node wds staging queue used for
  dynamic wds
o add ieee80211_mac_hash for building ageq tokens; this computes a
  32-bit hash from an 802.11 mac address (copied from the bridge)
o while here fix a stray ';' noticed in IEEE80211_PSQ_INIT

Reviewed by:	rpaulo
Approved by:	re (kensmith)
2009-07-05 18:17:37 +00:00
Ariff Abdullah
96831ae59b - Increase dynamic range of filter coefficients from 28bit to 30bit.
This cause dramatic effect in overall precision and conversion quality
  by pushing down most aliasing artifacts around -180 dB.

  Spectrogram analysis/comparison:

  	http://people.freebsd.org/~ariff/z_comparison/z_28vs30/

- Guard against possible 64bit overflow during accumulation process by
  slightly normalize and saturate sample and coefficient multiplication,
  possible during extreme 32bit downsampling (eg. 380KHz -> 8KHz) with
  custom preset that require more than ~7000 taps filter (which is
  overkill).

- Add knobs through FEEDER_RATE_PRESETS to set dynamic range of filter
  coefficients/accumulator and prefered polynomial interpolator:

  	COEFFICIENT_BIT:X
	(where 1 <= X <= 30, default: 30)

	ACCUMULATOR_BIT:X
	(where 32 <= X <=64, default: 58)

	INTERPOLATOR:I
	(where I = ZOH, LINEAR, QUADRATIC, HERMITE, BSPLINE,
 	           OPT32X, OPT16X, OPT8X, OPT4X, OPT2X)

Approved by:	re (kib)
2009-07-05 18:15:06 +00:00
Sam Leffler
7634012302 Revamp 802.11 action frame handling:
o add a new facility for components to register send+recv handlers
o ieee80211_send_action and ieee80211_recv_action now use the registered
  handlers to dispatch operations
o rev ieee80211_send_action api to enable passing arbitrary data
o rev ieee80211_recv_action api to pass the 802.11 frame header as it may
  be difficult to locate
o update existing IEEE80211_ACTION_CAT_BA and IEEE80211_ACTION_CAT_HT handling
o update mwl for api rev

Reviewed by:	rpaulo
Approved by:	re (kensmith)
2009-07-05 17:59:19 +00:00
Sam Leffler
8c393fd1f0 Cleanup ALIGNED_POINTER:
o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v)
o define as "1" on amd64 and i386 where there is no restriction
o make the type returned consistent with ALIGN
o remove _ALIGNED_POINTER
o make associated comments consistent

Reviewed by:	bde, imp, marcel
Approved by:	re (kensmith)
2009-07-05 17:45:48 +00:00
Edward Tomasz Napierala
b16af7f185 When the kernel is configured without "options FFS", build UFS as a module
without requiring any special build flags.

Approved by:	re (kib)
2009-07-05 15:25:02 +00:00
Alexander Motin
6ae5218789 Mark atanvidia depending on ataahci since rev.188846.
Approved by:	re (kib)
2009-07-05 14:50:45 +00:00
Ivan Voras
cd2dd44393 Add missing reference to GPT support.
Submitted by:	Paul B. Mahol onemda at gmail.com
Approved by:	re (kib)
2009-07-05 14:03:41 +00:00
Konstantin Belousov
121fd46175 When forking a vm space that has wired map entries, do not forget to
charge the objects created by vm_fault_copy_entry. The object charge
was set, but reserve not incremented.

Reported by:	Greg Rivers <gcr+freebsd-current tharned org>
Reviewed by:	alc (previous version)
Approved by:	re (kensmith)
2009-07-03 22:17:37 +00:00
Rui Paulo
9e760f2578 acpi_hp.c:
- sysctl dev.acpi_hp.0.verbose to toggle debug output
- A modification so this can deal with different array lengths
  when reading the CMI BIOS - now it works ok on HP Compaq nx7300
  as well.
- Change behaviour to query only max_instance-1 CMI BIOS instances,
  because all HPs seen so far are broken in that respect
  (or there is a fundamental misunderstanding on my side, possible
  as well). This way a disturbing ACPI Error Field exceeds Buffer
  message is avoided.
- New bit to set on dev.acpi_hp.0.cmi_detail (0x8) to
  also query the highest guid instance of CMI bios

acpi_hp.4:
- Document dev.acpi_hp.0.verbose sysctl in man page
- Document new bit for dev.acpi_hp.0.cmi_detail
- Add a section to manpage about hardware that has been reported
  to work ok

Submitted by:	Michael Gmelin <freebsdusb at bindone.de>
Approved by:	re (kib)
MFC after:	2 weeks
2009-07-03 21:12:37 +00:00
Edward Tomasz Napierala
340263c992 Fix fpathconf(3) on fifos, in effect making ls(1) properly
display '+' on them.  Taken from kern/125613, with cosmetic
changes.

PR:		kern/125613
Submitted by:	Jaakko Heinonen <jh at saunalahti dot fi>
Approved by:	re (kib)
2009-07-02 20:05:21 +00:00
Ed Schouten
89fe4c0a2b Enable POSIX semaphores on all non-embedded architectures by default.
More applications (including Firefox) seem to depend on this nowadays,
so not having this enabled by default is a bad idea.

Proposed by:	miwi
Patch by:	Florian Smeets <flo kasimir com>
Approved by:	re (kib)
2009-07-02 18:24:37 +00:00
Konstantin Belousov
f1eccd05ec In vn_vget_ino() and their inline equivalents, mnt_ref() the mount point
around the sequence that drop vnode lock and then busies the mount point.
Not having vlocked node or direct reference to the mp allows for the
forced unmount to proceed, making mp unmounted or reused.

Tested by:	pho
Reviewed by:	jeff
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-02 18:02:55 +00:00
Robert Watson
6196f898bb Create audit records for AUE_POSIX_OPENPT, currently w/o arguments.
Approved by:	re (audit argument blanket)
2009-07-02 16:33:38 +00:00
Jamie Gritton
f0899a3460 Call prison_check from vfs_suser rather than re-implementing it.
Approved by:	re (kib), bz (mentor)
2009-07-02 14:19:33 +00:00
Ariff Abdullah
0162436974 Slightly increase amount of bandwidth of resampling filter for
feeder_rate_quality=3. This have the benefit of reducing aliasing
artifacts due to alias masking.

Spectrogram analysis:

 o Old preset (100:36:0.90)
	http://people.freebsd.org/~ariff/z_comparison/z_q3_old.png

 o New preset (100:36:0.92):
	http://people.freebsd.org/~ariff/z_comparison/z_q3_new.png

Approved by:	re (kib)
2009-07-02 10:02:10 +00:00
Robert Watson
deedc899fd Fix comment misthink.
Submitted by:	b. f. <bf1783 at googlemail.com>
Approved by:	re (audit argument blanket)
MFC after:	1 week
2009-07-02 09:50:13 +00:00
Robert Watson
64c9a4d9ab Audit file descriptor and command arguments to ioctl(2).
Approved by:	re (audit argument blanket)
MFC after:	1 week
2009-07-02 09:16:25 +00:00
Robert Watson
2a5658382a Clean up a number of aspects of token generation from audit arguments to
system calls:

- Centralize generation of argument tokens for VM addresses in a macro,
  ADDR_TOKEN(), and properly encode 64-bit addresses in 64-bit arguments.
- Fix up argument numbers across a large number of syscalls so that they
  match the numeric argument into the system call.
- Don't audit the address argument to ioctl(2) or ptrace(2), but do keep
  generating tokens for mmap(2), minherit(2), since they relate to passing
  object access across execve(2).

Approved by:	re (audit argument blanket)
Obtained from:	TrustedBSD Project
MFC after:	1 week
2009-07-02 09:15:30 +00:00
Xin LI
c4f739ec0c Use MPT_MAX_LUNS as maximium number of LUNs, not 7, for SAS and FC cases.
This matches Linux driver behavior.

Discussed with:	scottl
Approved by:	re (kensmith)
MFC after:	1 month
2009-07-02 00:43:10 +00:00
Xin LI
1635f0499e Change explicit maximium numbers to the defined macro MPT_MAX_LUNS.
Approved by:	re (kensmith)
2009-07-02 00:41:37 +00:00
Robert Watson
03f7b00438 For access(2) and eaccess(2), audit the requested access mode.
Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 22:47:45 +00:00
Edward Tomasz Napierala
4bc61fd4ec Don't panic on attempt to set ACL on a block device file.
This is just a part of kern/125613.

PR:		kern/125613
Submitted by:	Jaakko Heinonen <jh at saunalahti dot fi>
Reviewed by:	rwatson
Approved by:	re (kib)
2009-07-01 22:30:36 +00:00
Jeff Roberson
2141453eb4 - Use fd_lastfile + 1 as the upper bound on nd. This is more correct than
using the size of the descriptor array.
 - A lock is not needed to fetch fd_lastfile.  The results are stale the
   instant it is dropped.
 - Use a private mutex pool for select since the pool mutex is not used
   as a leaf.
 - Fetch the si_mtx pointer first before resorting to hashing to compute
   the mutex address.

Reviewed by:	McKusick
Approved by:	re (kib)
2009-07-01 20:43:46 +00:00
Edward Tomasz Napierala
8edfe76ab5 Fix a panic which (reportedly) can happen when unmounting a filesystem
with I/O requests in flight on kernels compiled with "options INVARIANTS".
Also, make it obvious it's not right to call g_valid_obj() (and macros
using it, e.g. G_VALID_CONSUMER()) without topology lock held.

Approved by:	re (kib)
Reported by:	pho
2009-07-01 20:16:29 +00:00
Rafal Jaworowski
f981547c99 Map DPCPU pages into ARM kernel VA space.
DPCPU area was not properly mapped into kernel VA space, which caused page
fault on the first DPCPU access. This patch fixes the problem by mapping DPCPU
area into kernel VA space.

Submitted by:	Michal Hajduk, Piotr Ziecik
Reviewed by:	cognet, stas
Approved by:	re (kib)
Obtained from:	Semihalf
2009-07-01 20:07:44 +00:00
Robert Watson
15ca46f69d Audit file descriptor numbers for various socket-related system calls.
Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 19:55:11 +00:00
Robert Watson
9e4c1521d5 Define missing audit argument macro AUDIT_ARG_SOCKET(), and
capture the domain, type, and protocol arguments to socket(2)
and socketpair(2).

Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 18:54:49 +00:00
John Baldwin
cebc7fb16c Improve the handling of cpuset with interrupts.
- For x86, change the interrupt source method to assign an interrupt source
  to a specific CPU to return an error value instead of void, thus allowing
  it to fail.
- If moving an interrupt to a CPU fails due to a lack of IDT vectors in the
  destination CPU, fail the request with ENOSPC rather than panicing.
- For MSI interrupts on x86 (but not MSI-X), only allow cpuset to be used
  on the first interrupt in a group.  Moving the first interrupt in a group
  moves the entire group.
- Use the icu_lock to protect intr_next_cpu() on x86 instead of the
  intr_table_lock to fix a LOR introduced in the last set of MSI changes.
- Add a new privilege PRIV_SCHED_CPUSET_INTR for using cpuset with
  interrupts.  Previously, binding an interrupt to a CPU only performed a
  privilege check if the interrupt had an interrupt thread.  Interrupts
  without a thread could be bound by non-root users as a result.
- If an interrupt event's assign_cpu method fails, then restore the original
  cpuset mask for the associated interrupt thread.

Approved by:	re (kib)
2009-07-01 17:20:07 +00:00
Robert Watson
6d5a61563a When auditing unmount(2), capture FSID arguments as regular text strings
rather than as paths, which would lead to them being treated as relative
pathnames and hence confusingly converted into absolute pathnames.

Capture flags to unmount(2) via an argument token.

Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 16:56:56 +00:00
Rick Macklem
a4c5a1c315 When unmounting an NFS mount using sec=krb5[ip], the umount system
call could get hung sleeping on "gsssta" if the credentials for a user
that had been accessing the mount point have expired. This happened
because rpc_gss_destroy_context() would end up calling itself when the
"destroy context" RPC was attempted, trying to refresh the credentials.
This patch just checks for this case in rpc_gss_refresh() and returns
without attempting the refresh, which avoids the recursive call to
rpc_gss_destroy_context() and the subsequent hang.

Reviewed by:	dfr
Approved by:	re (Ken Smith), kib (mentor)
2009-07-01 16:42:03 +00:00
Rick Macklem
b766fabd9c Make sure that cr_error is set to ESHUTDOWN when closing the connection.
This is normally done by a loop in clnt_dg_close(), but requests that aren't
in the pending queue at the time of closing, don't get set. This avoids a
panic in xdrmbuf_create() when it is called with a NULL cr_mrep if
cr_error doesn't get set to ESHUTDOWN while closing.

Reviewed by:	dfr
Approved by:	re (Ken Smith), kib (mentor)
2009-07-01 16:38:18 +00:00
Jack F Vogel
6bdcc991ae Multiqueue RX is not correctly enabled on the new 82599
adapter, the SRRCTL register needs to be setup per queue.

Approved by: re
2009-07-01 16:13:01 +00:00
Robert Watson
422d786676 Audit the file descriptor number passed to lseek(2).
Approved by:	re (kib)
MFC after:	3 days
2009-07-01 15:37:23 +00:00
Robert Watson
c5957d6bba Fix link(2) auditing: use the second audit record path for the new object
name.

Approved by:	re (kib)
MFC after:	3 days
2009-07-01 13:22:08 +00:00
Robert Watson
2ef24dde7c udit the 'options' argument to wait4(2).
Approved by:	re (kib)
MFC after:	3 days
2009-07-01 12:36:10 +00:00
Alexander Motin
505feb8f37 Fix infinite loop in ng_iface, that happens when packet passes out via
two different ng interfaces sequentially due to tunnelling.

PR:		kern/134557
Submitted by:	Mikolaj Golub
Approved by:	re (kensmith)
MFC after:	3 days
2009-07-01 08:08:56 +00:00
Doug Rabson
259d14ed88 Don't include rpcv2.h - it has been removed.
Submitted by: ed@
Approved by: re
2009-07-01 07:34:28 +00:00
Alan Cox
9785747f87 Remove a stale comment. The very same revision (r85511) that introduced
this comment also implemented the proposed change to the code.

Approved by:	re (kib)
2009-06-30 19:39:17 +00:00
Doug Rabson
98c497255b Adjust the internal NFS KPI to avoid the last traces of NFS_LEGACYRPC.
Approved by: re
2009-06-30 19:10:17 +00:00
Doug Rabson
b49a2b39fd Remove the old kernel RPC implementation and the NFS_LEGACYRPC option.
Approved by: re
2009-06-30 19:03:27 +00:00
Edward Tomasz Napierala
fb231f3627 Make gjournal work with kernel compiled with "options DIAGNOSTIC".
Previously, it would panic immediately.

Reviewed by:	pjd
Approved by:	re (kib)
2009-06-30 14:34:06 +00:00
Ed Maste
2dafac3976 Add FIONSPACE from NetBSD. FIONSPACE is provided so that programs may
easily determine how much space is left in the send queue; they do not
need to know the send queue size.

NetBSD revisions:
  sys_socket.c r1.41, 1.42
  filio.h r1.9

Obtained from:	NetBSD
Approved by:	re (kensmith)
2009-06-30 13:38:49 +00:00
Stanislav Sedov
b2d758545b - Add support to atomically set/clear individual bits of a MSR register
via cpuctl(4) driver.  Two new CPUCTL_MSRSBIT and CPUCTL_MSRCBIT ioctl(2)
  calls treat the data field of the argument struct passed as a mask
  and set/clear bits of the MSR register according to the mask value.
- Allow user to perform atomic bitwise AND and OR operaions on MSR registers
  via cpucontrol(8) utility.  Two new operations ("&=" and "|=") have been
  added.  The first one applies bitwise AND operaion between the current
  contents of the MSR register and the mask, and the second performs bitwise
  OR.  The argument can be optionally prefixed with "~" inversion operator.
  This allows one to mimic the "clear bit" behavior by using the command
  like this:
      cpucontrol -m 0x10&=~0x02		# clear the second bit of TSC MSR

  Inversion operator support in all modes (assignment, OR, AND).

Approved by:	re (kib)
MFC after:	1 month
2009-06-30 12:35:47 +00:00
Andriy Gapon
462fab84b8 remove unused/unneeded extern declarations
This should result in no changes to compiled code.

Reviewed by:	alc
Approved by:	re (kib)
MFC after:	1 day
2009-06-30 11:16:32 +00:00