Commit Graph

66683 Commits

Author SHA1 Message Date
Jeff Roberson
241fbd3d13 - At the top of sleepq_catch_signals() lock the thread and check TDF_NEEDSIGCHK
before doing the very expensive cursig() and related locking.  NEEDSIGCHK
   is updated whenever our signal mask change or when a signal is delivered and
   should be sufficient to avoid the more expensive tests.  This eliminates
   another source of PROC_LOCK contention in multithreaded programs.
2008-03-19 07:35:14 +00:00
Jeff Roberson
bd4e153568 - Remove stale comment.
- In the last revision the code was changed to use maxfilesperproc rather than
   the per-process file limit to restrict the size of the poll array.  This
   eliminates a significant source of process lock contention in multithreaded
   programs and is cheaper.  This had been committed with the wrong batch of
   changes.
2008-03-19 07:33:16 +00:00
Pawel Jakub Dawidek
ab35440fa1 Oops. Use atomic_add_long() for atomic_fetchadd_long() (not atomic_add_int())
for sparc64 and sun4v.

Noticed by:	marius
2008-03-19 07:27:24 +00:00
Jeff Roberson
afc5854dbc - Add a facility similar to LOCK_PROFILING under SLEEPQUEUE_PROFILING. Keep
a simple (wmesg, count) tuple in a hash to keep track of how many times
   we sleep at each wait message.  We hash on message and not channel.  No
   line number information is given as typically wait messages are not used in
   more than one place.  Identical strings defined at different addresses will
   show up with seperate counters.
 - Use debug.sleepq.enable to enable, .reset to reset, and .stats dumps stats.
 - Do an unsynchronized check in sleepq_switch() prior to switching before
   calling sleepq_profile() which uses a global lock to synchronize the hash.
   Only sleeps which actually cause a context switch are counted.
2008-03-19 07:22:07 +00:00
Jeff Roberson
fbd762f197 - Fix the last of the threading bugs that were introduced as far back as
1.38 in 2001.  Break out of the FOREACH_THREAD_IN_PROC loop when we've
   discovered a new proc in the chain.
 - Increment i and check for maxlockdepth once per matching process not
   once per thread.  This didn't properly terminate the loop before.
 - Fix a bug which has existed potentially since rev 1.1.  waitblock->lf_next
   can be NULL when a thread has been woken-up but not yet scheduled.  Check
   for this condition rather than blindly dereferencing.

Found by:	libMicro
2008-03-19 07:13:24 +00:00
Jeff Roberson
45aea8de6e - Restore the NULL check for td_cpuset. This can happen if a partially
constructed thread was torn down as is the case when we fail to allocate
   a kernel stack.
2008-03-19 06:20:21 +00:00
Jeff Roberson
374ae2a393 - Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from
requiring the per-process spinlock to only requiring the process lock.
 - Reflect these changes in the proc.h documentation and consumers throughout
   the kernel.  This is a substantial reduction in locking cost for these
   fields and was made possible by recent changes to threading support.
2008-03-19 06:19:01 +00:00
Doug Rabson
999396482a Don't call nfs_realign while holding locks.
Reviewed by: kib
2008-03-18 18:42:59 +00:00
John Baldwin
07f7fccaaf Catch up to intr_event_create() prototype change.
Pointy hat:	jhb
2008-03-18 13:31:45 +00:00
Ulf Lilleengen
1cf9b83c6d - Fix a memory leak when re-discovering a gvinum configuration.
Approved by:	pjd (mentor)
MFC after:	1 week
2008-03-18 08:48:51 +00:00
Adrian Chadd
05e486c71e Sign-extend the 48-bit AMD PMC counter before treating it to a 64-bit
2's compliment.

The 2's compliment transform is done so a "count down" sampling interval
can be converted into a "count up" PMC value. a 2's complimented 'count down'
value is written to the PMC counter; then the read-back counter is reverted
via another 2's compliment.

PR: kern/121660
Reviewed by: jkoshy
Approved by: jkoshy
MFC after: 1 week
2008-03-18 08:39:11 +00:00
Adrian Chadd
4dd3c84f5f Fix the debugging output - the '0x' was duplicated from the %p option. 2008-03-18 08:36:19 +00:00
Alan Cox
1fa94a36b1 Almost seven years ago, vm/vm_page.c was split into three parts:
vm/vm_contig.c, vm/vm_page.c, and vm/vm_pageq.c.  Today, vm/vm_pageq.c
has withered to the point that it contains only four short functions,
two of which are only used by vm/vm_page.c.  Since I can't foresee any
reason for vm/vm_pageq.c to grow, it is time to fold the remaining
contents of vm/vm_pageq.c back into vm/vm_page.c.

Add some comments.  Rename one of the functions, vm_pageq_enqueue(),
that is now static within vm/vm_page.c to vm_page_enqueue().
Eliminate PQ_MAXCOUNT as it no longer serves any purpose.
2008-03-18 06:52:15 +00:00
Kip Macy
19905d6dbd - Integrate 1.133 vendor driver changes
- update some copyrights
- add improved support for delayed ack
- fix issue with fec
2008-03-18 03:55:12 +00:00
Paolo Pisati
8368edc123 Don't cache ptr to nat rule in case of tablearg argument.
Bug spotted by: Dyadchenko Mihail
2008-03-17 23:02:56 +00:00
John Baldwin
6d2d1c044f Simplify the interrupt code a bit:
- Always include the ie_disable and ie_eoi methods in 'struct intr_event'
  and collapse down to one intr_event_create() routine.  The disable and
  eoi hooks simply aren't used currently in the !INTR_FILTER case.
- Expand 'disab' to 'disable' in a few places.
- Use function casts for arm and i386:intr_eoi_src() instead of wrapper
  routines since to trim one extra indirection.

Compiled on:	{arm,amd64,i386,ia64,ppc,sparc64} x {FILTER, !FILTER}
Tested on:	{amd64,i386} x {FILTER, !FILTER}
2008-03-17 22:42:01 +00:00
Paolo Pisati
f6efbc8842 Don't abuse stack space while in kernel land, use heap instead. 2008-03-17 22:08:31 +00:00
Antoine Brodin
afe5acff1b Simplify fcntl(SVR4_F_DUP2FD) code now that FreeBSD has F_DUP2FD.
Approved by:	rwatson (mentor)
2008-03-17 18:27:28 +00:00
Scott Long
ad97d96c40 Locking in the ses_ioctl handler doesn't have to be so strict because
the referenced data is only obtained/changed in the device open handler,
and the ioctl handler can only run after the open handler.  Also fix a
few nearby style issues.

Submitted by: Matt Jacob
2008-03-17 17:18:16 +00:00
Konstantin Belousov
aeeb4202df Fix two races in the handling of the d_gianttrick for the D_NEEDGIANT
drivers.

In the giant_XXX wrappers for the device methods of the D_NEEDGIANT
drivers, do not dereference the cdev->si_devsw. It is racing with
the destroy_devl() clearing of the si_devsw. Instead, use the
dev_refthread() and return ENXIO for the destroyed device. [1]

The check for the D_INIT in the prep_cdevsw() was not synchronized with
the call of the fini_cdevsw() in destroy_devl(), that under rapid device
creation/destruction may result in the use of uninitialized cdevsw [2].
Change the protocol for the prep_cdevsw(), requiring it to be called
under dev_mtx, where the check for D_INIT is done.

Do not free the memory allocated for the gianttrick cdevsw while holding
the dev_mtx, put it into the free list to be freed later. Reuse the
d_gianttrick pointer to keep the size and layout of the struct cdevsw
(requested by phk). Free the memory in the dev_unlock_and_free(), and do
all the free after the dev_mtx is dropped (suggested by jhb).

Reported by:	bsdimp + many [1], pho [2]
Reviewed by:	phk, jhb
Tested by:	pho
MFC after:	1 week
2008-03-17 13:17:10 +00:00
Robert Watson
c2877015a1 Fix indentation for a closing brace in in_pcballoc().
MFC after:	3 days
2008-03-17 13:04:56 +00:00
Pawel Jakub Dawidek
4582cb68b1 - There is no more "uidinfo struct" mutex.
- The "uidinfo hash" lock is now a rwlock.

Reminded by:	kib
2008-03-17 11:48:40 +00:00
Poul-Henning Kamp
72d945abcc Add a "spindown" facility to ata-disks: If no requests have been received
for a configurable number of seconds, spin the disk down.  Spin it back
up on the next request.

Notice that the timeout is only armed by a request, so to spin down a
disk you may have to do:

	atacontrol spindown ad10 5
	dd if=/dev/ad10 of=/dev/null count=1

To disable spindown, set timeout to zero:

	atacontrol spindown ad10 0

In order to debug any trouble caused, this code is somewhat noisy on the
console.

Enabling spindown on a disk containing / or /var/log/messages is not
going to do anything sensible.

Spinning a disk up and down all the time will wear it out, use sensibly.

Approved by:	sos
2008-03-17 10:33:23 +00:00
Poul-Henning Kamp
272870cf7b A cautionary XXX comment about seemingly bogus errata checks. 2008-03-17 09:05:15 +00:00
Poul-Henning Kamp
462302db47 Increase time we wait for things to settle to 1 millisecond,
10 microseconds is too short.

Always set the cpu to the highest frequency so that we get through
boot and don't handicap cpus where powerd(8) is not used.
2008-03-17 09:01:43 +00:00
Poul-Henning Kamp
68b84e73e3 Revert last commit and stop committing before morning tea. 2008-03-17 09:00:59 +00:00
Poul-Henning Kamp
5d306f44cc Increase time we wait for things to settle to 1 millisecond,
10 microseconds is too short.

Always set the cpu to the highest frequency so that we get through
boot and don't handicap cpus where powerd(8) is not used.
2008-03-17 08:38:38 +00:00
Weongyo Jeong
9744c849bf don't set sniffer mode to ON when the driver is running with the
monitor mode.  This solves a problem that sometimes mangled frames
are passed.

Submitted by:	Werner Backes <werner_at_bit-1.de>
Tested by:	Werner Backes <werner_at_bit-1.de>
PR:		kern/121608
Approved by:	thompsa (mentor)
2008-03-17 02:30:13 +00:00
Andrew Thompson
69f04a828c Remove extra semicolons.
Pointed out by:		antoine
2008-03-17 01:26:44 +00:00
Marcel Moolenaar
294800e52d Make remote GDB work for AIM processors. For BookE, the kernel
will have a special section, named .PPC.EMB.apuinfo, which will
tell GDB that a BookE processor is targeted and which will
result in GDB using a different register definition. In order
to support remote GDB for BookE, we need the GDB stub in the
kernel look for that section and use the BookE definitions.
2008-03-17 00:46:52 +00:00
Poul-Henning Kamp
29cc138cdf Use correct bitmask for identifying chip family. 2008-03-17 00:36:16 +00:00
Alexander Motin
e81de8afb0 Remove impossible (hk_peer == NULL) check from ng_address_hook().
Valid hook can't have NULL peer. Even invalid one can't, as it is resets to
deadhook, but not NULL.
2008-03-16 23:12:17 +00:00
Alexander Motin
4e7597635f Add session ID hashing to speedup incoming packets dispatch in case
of many connections working via the same tunnel. For example, in case
of full "client <-> LAC <-> LNS" setup.
2008-03-16 21:33:12 +00:00
Pawel Jakub Dawidek
709446e782 Whitespace cleanups. 2008-03-16 21:32:20 +00:00
Pawel Jakub Dawidek
1b072fbcab - Use wait-free method to manage ui_sbsize and ui_proccnt fields in the
uidinfo structure. This entirely removes contention observed on the
  ui_mtxp mutex (as it is now gone).
- Convert the uihashtbl_mtx mutex to a rwlock, as most of the time we just
  need to read-lock it.

Reviewed by:	jhb, jeff, kris & others
Tested by:	kris
2008-03-16 21:29:02 +00:00
Pawel Jakub Dawidek
6eb4157ffc Implement atomic_fetchadd_long() for all architectures and document it.
Reviewed by:	attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)
2008-03-16 21:20:50 +00:00
Andrew Thompson
3de1800850 Switch the LACP state machine over to its own mutex to protect the internals,
this means that it no longer grabs the lagg rwlock. Use two port table arrays
which list the active ports for Tx and switch between them with an atomic op.
Now the lagg rwlock is only exclusively locked for management (ioctls) and
queuing of lacp control frames isnt needed.
2008-03-16 19:25:30 +00:00
Robert Watson
45fa2c8a87 Consistently use ANSI C declarationsfor all functions in kern_synch.c. 2008-03-16 18:59:21 +00:00
Pawel Jakub Dawidek
e056770745 Style fixes. 2008-03-16 18:26:59 +00:00
Pawel Jakub Dawidek
67e83b07c6 Fix information leak. We can find PIDs of running processes from within
a jail, etc. by simply calling setpriority(PRIO_PROCESS, <PID>, 0) and
checking the return value: 0 means that the process exists and -1 that
it doesn't exist.

Reviewed by:	rwatson
MFC after:	1 week
2008-03-16 17:55:06 +00:00
Alan Cox
ec96dca788 Simplify the inner loop of vm_fault()'s delete-behind heuristic.
Instead of checking each page for PG_UNMANAGED, perform a one-time
check whether the object is OBJT_PHYS.  (PG_UNMANAGED pages only
belong to OBJT_PHYS objects.)
2008-03-16 17:37:19 +00:00
Pawel Jakub Dawidek
b12455f34e Implement soon-to-be-used rw_unlock() macro. 2008-03-16 17:10:52 +00:00
Roman Divacky
d8653dd986 Regen. 2008-03-16 16:29:37 +00:00
Roman Divacky
5dfb688191 Implement sched_setaffinity and get_setaffinity using
real cpu affinity setting primitives.

Reviewed by:	jeff
Approved by:	kib (mentor)
2008-03-16 16:27:44 +00:00
Robert Watson
cc456a74ab Commit SYSINIT() ;-adding patch missed in previous pass.
MFC after:	1 month
Caught by:	tinderbox
2008-03-16 13:02:04 +00:00
Robert Watson
dd3af71f17 Remove trailing ';' from C_SYSINIT() macro definition, in keeping
with style(9) recommendation that macros not contain the
terminating ';', leaving that to the invoker.  All SYSINIT()
consumers must now provide a trailing ';'.

Unlike the change to remove the ';'s from callers, this change
shouldn't be MFC'd unless we don't mind requiring source changes
to third party modules that might still depend on SYSINIT()
providing its own ';'.
2008-03-16 11:01:32 +00:00
Robert Watson
237fdd787b In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
Maxim Sobolev
c9370ff4d0 Properly set size of the file_zone to match kern.maxfiles parameter.
Otherwise the parameter is no-op, since zone by default limits number
of descriptors to some 12K entries. Attempt to allocate more ends up
sleeping on zonelimit.

MFC after:	2 weeks
2008-03-16 06:21:30 +00:00
Pawel Jakub Dawidek
2b1c6615bc Fix mmap(2) on ZFS after some changes in VM subsystem.
Submitted by:	alc
Reported by:	kris (originally) and many others
Tested with:	fsx
MFC after:	1 week
2008-03-15 23:23:04 +00:00
Ruslan Ermilov
1f49b573e1 Fix panic on e.g. "kldload /dev/null".
PR:		kern/121427
Reviewed by:	sem
MFC after:	3 days
2008-03-15 17:40:18 +00:00
Warner Losh
dffa4a85ac BUS_DMA_ISA is left over from Alpha, and is not used in the tree at
all.  The reference in ia64 code is due to cutNpaste in its history
and can safely be removed.

Revired by: cognet, raj, marcel, jhb and maybe one other whom I'm forgetting
2008-03-15 06:44:45 +00:00
Ed Maste
4109ba516e Change spelling and eliminate a typo in comments to reduce diffs with
Adaptec's vendor driver.  I have some fixes to bring in and this makes
ongoing review of the FreeBSD-Adaptec driver diffs easier.
2008-03-14 21:59:11 +00:00
John Baldwin
eaf86d1678 Add preliminary support for binding interrupts to CPUs:
- Add a new intr_event method ie_assign_cpu() that is invoked when the MI
  code wishes to bind an interrupt source to an individual CPU.  The MD
  code may reject the binding with an error.  If an assign_cpu function
  is not provided, then the kernel assumes the platform does not support
  binding interrupts to CPUs and fails all requests to do so.
- Bind ithreads to CPUs on their next execution loop once an interrupt
  event is bound to a CPU.  Only shared ithreads are bound.  We currently
  leave private ithreads for drivers using filters + ithreads in the
  INTR_FILTER case unbound.
- A new intr_event_bind() routine is used to bind an interrupt event to
  a CPU.
- Implement binding on amd64 and i386 by way of the existing pic_assign_cpu
  PIC method.
- For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up
  an interrupt source and binds its interrupt event to the specified CPU.
  MI code can currently (ab)use this by doing:

	intr_bind(rman_get_start(irq_res), cpu);

  however, I plan to add a truly MI interface (probably a bus_bind_intr(9))
  where the implementation in the x86 nexus(4) driver would end up calling
  intr_bind() internally.

Requested by:	kmacy, gallatin, jeff
Tested on:	{amd64, i386} x {regular, INTR_FILTER}
2008-03-14 19:41:48 +00:00
Bjoern A. Zeeb
9e3bdede0f Correct IPsec behaviour with a 'use' level in SP but no SA available.
In that case return an continue processing the packet without IPsec.

PR:		121384
MFC after:	5 days
Reported by:	Cyrus Rahman (crahman gmail.com)
Tested by:	Cyrus Rahman (crahman gmail.com) [slightly older version]
2008-03-14 16:38:11 +00:00
Bjoern A. Zeeb
4e8a7c9ae1 Remove the "Fast " from the
"Fast IPsec: Initialized Security Association Processing." printf.
People kept asking questions about this after the IPsec shuffle.

This still is the Fast IPsec implementation so no worries that it would
be any slower now. There are no functional changes.

Discussed with:	sam
MFC after:	4 days
2008-03-14 16:25:40 +00:00
Jung-uk Kim
0a84733d04 Add a quirk to ignore ASUS LCM display found on some ASUS laptops. 2008-03-14 15:59:30 +00:00
John Baldwin
d628fbfa98 Make the function prototype for cpu_search() match the declaration so that
this still compiles with gcc3.
2008-03-14 15:22:38 +00:00
Bjoern A. Zeeb
8cfbd2995b Correct reference counting on the SP for outgoing IPv6 IPsec connections.
PR:		121374
Reported by:	Cyrus Rahman (crahman gmail.com)
Tested by:	Cyrus Rahman (crahman gmail.com)
MFC after:	5 days
2008-03-14 11:55:04 +00:00
Bjoern A. Zeeb
39d8cf90cb #if 0 out a currently unsued (and incomplete) function: ip6_ipsec_mtu().
No need to compile 'dead' code.
I am leaving it in because we will have to review the concept and
should use the common function in various places.

MFC after:	5 days
2008-03-14 11:44:30 +00:00
Bjoern A. Zeeb
41aa71dd3e Replace the function name in two identical printfs
by __func__, __LINE__ so we can distinguish them
when people report a problem.

PR:		121373
MFC after:	5 days
2008-03-14 11:09:11 +00:00
Yoshihiro Takahashi
8ab7c8f322 Add stub for pc98. 2008-03-14 09:00:04 +00:00
Joseph Koshy
a3d7db55cc Correct a typo. 2008-03-14 06:16:18 +00:00
John Baldwin
c9107e85d9 Fix a silly bogon which prevented all the CPUs that are tagged as interrupt
receivers from being given interrupts if any CPUs in the system were not
tagged as interrupt receivers that I introduced when switching the x86
interrupt code to track CPUs via FreeBSD CPU IDs rather than local APIC
IDs.  In practice this only affects systems with Hyperthreading (though
disabling HTT in the BIOS would workaround the issue) as that is the only
case currently where one can have CPUs that aren't tagged as interrupt
receivers.  On a Dell SC1425 test box with 2 x Xeon w/ HTT (so 4 logical
CPUs of which 2 were interrupt receivers) the result was that all
device interrupts were sent to CPU 0.

MFC after:	1 week
Pointy hat to:	jhb
2008-03-14 03:44:42 +00:00
John Baldwin
5217af301c Rework how the nexus(4) device works on x86 to better handle the idea of
different "platforms" on x86 machines.  The existing code already handles
having two platforms: ACPI and legacy.  However, the existing approach was
rather hardcoded and difficult to extend.  These changes take the approach
that each x86 hardware platform should provide its own nexus(4) driver (it
can inherit most of its behavior from the default legacy nexus(4) driver)
which is responsible for probing for the platform and performing
appropriate platform-specific setup during attach (such as adding a
platform-specific bus device).  This does mean changing the x86 platform
busses to no longer use an identify routine for probing, but to move that
logic into their matching nexus(4) driver instead.
- Make the default nexus(4) driver in nexus.c on i386 and amd64 handle the
  legacy platform.  It's probe routine now returns BUS_PROBE_GENERIC so it
  can be overriden.
- Expose a nexus_init_resources() routine which initializes the various
  resource managers so that subclassed nexus(4) drivers can invoke it from
  their attach routine.
- The legacy nexus(4) driver explicitly adds a legacy0 device in its
  attach routine.
- The ACPI driver no longer contains an new-bus identify method.  Instead
  it exposes a public function (acpi_identify()) which is a probe routine
  that the MD nexus(4) drivers can use to probe for ACPI.  All of the
  probe logic in acpi_probe() is now moved into acpi_identify() and
  acpi_probe() is just a stub.
- On i386 and amd64, an ACPI-specific nexus(4) driver checks for ACPI via
  acpi_identify() and claims the nexus0 device if the probe succeeds.  It
  then explicitly adds an acpi0 device in its attach routine.
- The legacy(4) driver no longer knows anything about the acpi0 device.
- On ia64 if acpi_identify() fails you basically end up with no devices.
  This matches the previous behavior where the old acpi_identify() would
  fail to add an acpi0 device again leaving you with no devices.

Discussed with:	imp
Silence on:	arch@
2008-03-13 20:39:04 +00:00
Coleman Kane
6c62df7e49 Replace the non-MPSAFE timeout(9) API in ffs_softdep.c with the MPSAFE
callout_* API (e.g. callout_init_mtx(9)). This was one of the numerous
items on the http://wiki.freebsd.org/SMPTODO list.

Reviewed by:	imp, obrien, jhb
MFC after:	1 week
2008-03-13 20:15:48 +00:00
John Baldwin
d0234f752f Use the SMAP data from the loader if it is provided instead of using
virtual 86 mode to query the BIOS directly.  This is needed for certain
HP machines whose BIOS only provide an SMAP when invoked from real mode.
On such machines the loader will be able to query the SMAP successfully
due to the recent BTX changes, but the kernel will not.

One thing I'm not sure of is if we can skip the INT 12h probe altogether
if we have the SMAP from the loader as it seems that we do the INT 12h
probe to setup enough state so we can use vm86 to call the BIOS.

MFC after:	1 week
2008-03-13 18:56:53 +00:00
David E. O'Brien
149c7c86d2 style(9) & style.Makefile(9)
Reviewed by:	raj
2008-03-13 17:54:21 +00:00
Coleman Kane
e42e0b8669 Add the module dependency on the mem(4) module. This will fix the module
failing to load on a kernel that has "nodevice mem" in the config. It will
now properly bring in the mem(4) module.

Submitted by:	antoine
Reviewed by:	imp
MFC after:	1 week
2008-03-13 14:08:41 +00:00
Konstantin Belousov
22eca0bf45 Since version 4.3, gcc changed its behaviour concerning the i386/amd64
ABI and the direction flag, that is it now assumes that the direction
flag is cleared at the entry of a function and it doesn't clear once
more if needed. This new behaviour conforms to the i386/amd64 ABI.

Modify the signal handler frame setup code to clear the DF {e,r}flags
bit on the amd64/i386 for the signal handlers.

jhb@ noted that it might break old apps if they assumed DF == 1 would be
preserved in the signal handlers, but that such apps should be rare and
that older versions of gcc would not generate such apps.

Submitted by:	Aurelien Jarno <aurelien aurel32 net>
PR:	121422
Reviewed by:	jhb
MFC after:	2 weeks
2008-03-13 10:54:38 +00:00
Konstantin Belousov
ea39de9f93 Add missed parentheses 2008-03-13 09:52:48 +00:00
David Xu
83660cf974 Add const qualifier to cpuset mask's pointer, since the cpuset mask should
be not changed by the system call.
2008-03-13 02:56:11 +00:00
Jeff Roberson
f4d77e9e54 PR 117603
- Close a sleepqueue signal race by interlocking with the per-process
   spinlock.  This was mistakenly omitted from the thread_lock patch and
   has been a race since.

MFC After:	1 week
PR:		bin/117603
Reported by:	Danny Braniss <danny@cs.huji.ac.il>
2008-03-13 00:46:12 +00:00
Jeff Roberson
66257bc8d9 - The P_SA flag has been removed. Don't reference it in a KASSERT. 2008-03-12 22:17:06 +00:00
Jeff Roberson
eab82b2ebe - Fix build breakage; there was a reference to a removed syscall in
a KASSERT().  Attempt to cleanup the comment to reflect reality.
2008-03-12 22:14:14 +00:00
John Baldwin
391664b110 The variable MTRR registers actually have variable-sized PhysBase and
PhysMask fields based on the number of physical address bits supported
by the current CPU.  The old code assumed 36 bits on i386 and 40 bits on
amd64.  In truth, all Intel CPUs up until recently used 36 bits (a newer
Intel CPU uses 38 bits) and all the Opteron CPUs used 40 bits.

In at least one case (the new Intel CPU) having the size of the mask field
wrong resulted in writing questionable values into the MTRR registers on
the application processors (BSP as well if you modify the MTRRs via
memcontrol or running X, etc.).  The result of the questionable physmask
was that all of memory was apparently treated as uncached rather than
write-back resulting in a very significant performance hit.

Fix this by constructing a run-time mask for the PhysBase and PhysMask
fields based on the number of physical address bits supported by the CPU.
All 64-bit capable CPUs provide a count of PA bits supported via the
0x80000008 extended CPUID feature, so use that if it is available.  If that
feature is not available, then assume 36 PA bits.

While I'm here, expand the (now-unused) macros for the PhysBase and
PhysMask fields to the current largest possible value (52 PA bits).

MFC after:	1 week
PR:		i386/120516
Reported by:	Nokia
2008-03-12 22:09:19 +00:00
John Baldwin
4cbd0e8984 MFamd64: Break up the probe logic in the mem_drvinit routines so it's
a bit easier to parse.
2008-03-12 21:44:46 +00:00
John Baldwin
f15a9cd288 Minimize diffs with i686_mem.c:
- A few whitespace changes I missed in the style(9) changes.
- Move M_MEMDESC to mem.c.
2008-03-12 21:43:50 +00:00
John Baldwin
e249f70262 Relax the BIOS/OS sempahore handoff code to workaround different hard
hangs (one at boot, one at shutdown) in recent machines.  First, only try
to take ownership of the EHCI controller if the BIOS currently owns the
controller.  On a HP DL160 G5, the machine hangs when we try to take
ownership.  Second, don't bother trying to give up ownership of the
controller during shutdown.  It's not strictly required and a Dell DCS S29
hangs on shutdown after the config write.

Both of these changes match the behavior of the Linux EHCI driver.  I also
think both of these hangs are caused by bugs in the BIOS' SMM handler
causing it to get stuck in an infinite loop in SMM.

MFC after:	1 week
2008-03-12 20:57:17 +00:00
John Baldwin
4c134f3e80 Partially revert 1.95. It changed the probe for a mouse device to only
accept a mouse using the boot subclass.  Instead, restore the original
hid_is_collection() test and fallback to testing the interface class,
subclass, and protocol if that fails.

MFC after:	1 week
PR:		usb/118670
2008-03-12 20:20:36 +00:00
Sam Leffler
810df80181 fix inverted test that disabled ACK's on xmit 2008-03-12 20:03:31 +00:00
Sam Leffler
823c77d78b add device hints to control the rx FIFO interrupt level on 16550A parts
PR:		kern/121421
Submitted by:	UEMURA Tetsuya
Reviewed by:	marcel
MFC after:	2 weeks
2008-03-12 19:09:20 +00:00
Remko Lodder
16630f3430 Add missing comma.
PR:		bin/121645
Submitted by:	OISHI Masakuni <yamasa at bsdhouse dot org>
Approved by:	imp (mentor, implicit for trivial changes)
MFC after:	3 days
2008-03-12 18:25:47 +00:00
Remko Lodder
4ee0aeea8a Add resume support to the agp_i810 family.
Submitted by:	"Robert Noland" <rnoland at 2hip dot net>
Reviewed by:	anholt
Approved by:	anholt, imp (mentor)
MFC after:	1 week
2008-03-12 18:23:39 +00:00
Rafal Jaworowski
772619e186 Convert TSEC watchdog to the new scheme.
Reviewed by:	imp, marcel
Approved by:	cognet (mentor)
2008-03-12 16:35:25 +00:00
Rafal Jaworowski
ecb1ab1761 Obtain TSEC h/w address from the parent bus (OCP) and not rely blindly on what
might be currently programmed into the registers.

Underlying firmware (U-Boot) would typically program MAC address into the
first unit only, and others are left uninitialized. It is now possible to
retrieve and program MAC address for all units properly, provided they were
passed on in the bootinfo metadata.

Reviewed by:	imp, marcel
Approved by:	cognet (mentor)
2008-03-12 16:32:08 +00:00
Rafal Jaworowski
7cc9e5030e Improve handling U-Boot's "eth%daddr" while PowerPC metadata preparation.
We're now more robust against cases of non-sorted and/or non-continuous
numbering of those entries.

Reviewed by:	imp, marcel
Approved by:	cognet (mentor)
2008-03-12 16:12:48 +00:00
Rafal Jaworowski
7572ed5a08 Eliminate artificial increasing of 'netdev_opens' counter in loader's net_open().
This was introduced as a workaround long time ago for some Alpha firmware
(which is now gone), and actually prevented net_close() to ever be
called.

Certain firmwares (U-Boot) need local shutdown operations to be performed on a
network controller upon transaction end: such platform-specific hooks are
supposed to be called via netif_close() (from within net_close()).

This change effectively reverts the following CVS commit:

    sys/boot/common/dev_net.c

    revision 1.7
    date: 2000/05/13 15:40:46;  author: dfr;  state: Exp;  lines: +2 -1
    Only probe network settings on the first open of the network device.
    The alpha firmware takes a seriously long time to open the network device
    the first time.

Also suppress excessive output while netbooting via loader, unless debugging.

While there, make sys/boot/uboot more style(9) compliant.

Reviewed by:	imp
Approved by:	cognet (mentor)
2008-03-12 16:01:34 +00:00
Rafal Jaworowski
507ea268f2 Respect RF_SHAREABLE flag in ARM nexus_setup_intr()
Reviewed by:	imp
Approved by:	cognet (mentor)
2008-03-12 15:46:25 +00:00
Andrew Gallatin
47c2e9879b Remove dead code which makes a call to mem_range_attr_set().
This fixes a bug where mxge did not declare a dependancy on
mem(4), and failed to load with options nomem.

Pointed out by: antoine
2008-03-12 15:36:00 +00:00
Rafal Jaworowski
1397332d85 Improve ARM bus_dmamap_load_buffer() error handling.
Reviewed by:	imp
Approved by:	cognet (mentor)
Spotted by:	Grzegorz Bernacki gjb AT semihalf DOT com
2008-03-12 15:31:37 +00:00
Paolo Pisati
ab0fcfd00a -Don't pass down the entire pkt to ProtoAliasIn, ProtoAliasOut, FragmentIn
and FragmentOut.
-Axe the old PacketAlias API: it has been deprecated since 5.x.
2008-03-12 11:58:29 +00:00
Jeff Roberson
6617724c5f Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
Jeff Roberson
1581606f9c - Bump __FreeBSD_version for sleepq/cv_* api changes. 2008-03-12 06:33:36 +00:00
Jeff Roberson
c5aa6b581d - Pass the priority argument from *sleep() into sleepq and down into
sched_sleep().  This removes extra thread_lock() acquisition and
   allows the scheduler to decide what to do with the static boost.
 - Change the priority arguments to cv_* to match sleepq/msleep/etc.
   where 0 means no priority change.  Catch -1 in cv_broadcastpri() and
   convert it to 0 for now.
 - Set a flag when sleeping in a way that is compatible with swapping
   since direct priority comparisons are meaningless now.
 - Add a sysctl to ule, kern.sched.static_boost, that defaults to on which
   controls the boost behavior.  Turning it off gives better performance
   in some workloads but needs more investigation.
 - While we're modifying sleepq, change signal and broadcast to both
   return with the lock held as the lock was held on enter.

Reviewed by:	jhb, peter
2008-03-12 06:31:06 +00:00
Jeff Roberson
bdb5bdf0b7 - KSE may free a thread that was never actually forked. This will leave
td_cpuset NULL.  Check for this condition before dereferencing the
   cpuset.

Reported by:	david@catwhisker.org, miwi@freebsd.org
Sponsored by:	Nokia
2008-03-12 05:01:14 +00:00
Alexander Motin
10e873189c Improve apply callback error reporting:
Before this patch callback returned result of the last finished call chain.
Now it returns last nonzero result from all call chain results in this request.

As soon as this improvement gives reliable error reporting, it is now possible
to remove dirty workaround in ng_socket, made to return ENOBUFS error statuses
of request-response operations. That workaround was responsible for returning
ENOBUFS errors to completely unrelated requests working at the same time
on socket.
2008-03-11 21:58:48 +00:00
John Baldwin
1b085fde87 Style(9) these files. No changes in the compiled code. (Verified by
diff'ing objdump -d output).
2008-03-11 21:41:36 +00:00
John Baldwin
336d8e5536 Add constants for the various fields in MTRR registers.
MFC after:	1 week
Verified by:	md5(1)
2008-03-11 20:10:37 +00:00
Marcel Moolenaar
1c25a4fc75 In intr_lookup(), when adding an IRQ to powerpc_intrs[], also
set a default name. If the IRQ is added as a consequence of
configurating the IRQ without there ever being a handler
assigned to it, we will not have a name. This breaks the
fragile intrcnt/intrnames logic.
2008-03-11 19:58:52 +00:00
John Baldwin
4fcf220b00 Don't enable the workaround for the jitter bug on the 5722.
Obtained from:	Linux tg3 driver
2008-03-11 15:05:54 +00:00
Pyun YongHyeon
44858c36f8 Uncomment vr(4), vr(4) should work on all architectures. 2008-03-11 05:09:03 +00:00
Pyun YongHyeon
de126af331 Teach vr(4) to use bus_dma(9) and major overhauling to handle link
state change and reliable error recovery.
 o Moved vr_softc structure and relevant macros to header file.
 o Use PCIR_BAR macro to get BARs.
 o Implemented suspend/resume methods.
 o Implemented automatic Tx threshold configuration which will be
   activated when it suffers from Tx underrun. Also Tx underrun
   will try to restart only Tx path and resort to previous
   full-reset(both Rx/Tx) operation if restarting Tx path have failed.
 o Removed old bit-banging MII interface. Rhine provides simple and
   efficient MII interface. While I'm here show PHY address and PHY
   register number when its read/write operation was failed.
 o Define VR_MII_TIMEOUT constant and use it in MII access routines.
 o Always honor link up/down state reported by mii layers. The link
   state information is used in vr_start() to determine whether we
   got a valid link.
 o Removed vr_setcfg() which is now handled in vr_link_task(), link
   state taskqueue handler. When mii layer reports link state changes
   the taskqueue handler reprograms MAC to reflect negotiated duplex
   settings. Flow-control changes are not handled yet and it should
   be revisited when mii layer knows the notion of flow-control.
 o Added a new sysctl interface to get statistics of an instance of
   the driver.(sysctl dev.vr.0.stats=1)
 o Chip name was renamed to reflect the official name of the chips
   described in VIA Rhine I/II/III datasheet.
	REV_ID_3065_A -> REV_ID_VT6102_A
	REV_ID_3065_B -> REV_ID_VT6102_B
	REV_ID_3065_C -> REV_ID_VT6102_C
	REV_ID_3106_J -> REV_ID_VT6105_A0
	REV_ID_3106_S -> REV_ID_VT6105M_A0
   The following chip revisions were added.
	#define REV_ID_VT6105_B0	0x83
	#define REV_ID_VT6105_LOM	0x8A
	#define REV_ID_VT6107_A0	0x8C
	#define REV_ID_VT6107_A1	0x8D
	#define REV_ID_VT6105M_B1	0x94
 o Always show chip revision number in device attach. This shall help
   identifying revision specific issues.
 o Check whether EEPROM reloading is complete by inspecting the state
   of VR_EECSR_LOAD bit. This bit is self-cleared after the EEPROM
   reloading. Previously vr(4) blindly spins for 200us which may/may
   not enough to complete the EEPROM reload.
 o Removed if_mtu setup. It's done in ether_ifattach().
 o Use our own callout to drive watchdog timer.
 o In vr_attach disable further interrupts after reset. For VT6102 or
   newer hardwares, diable MII state change interrupt as well because
   mii state handling is done by mii layer.
 o Add more sane register initialization for VT6102 or newer chips.
    - Have NIC report error instead of retrying forever.
    - Let hardware detect MII coding error.
    - Enable MODE10T mode.
    - Enable memory-read-multiple for VT6107.
 o PHY address for VT6105 or newer chips is located at fixed address 1.
   For older chips the PHY address is stored in VR_PHYADDR register.
   Armed with these information, there is no need to re-read
   VR_PHYADDR register in miibus handler to get PHY address. This
   saves one register access cycle for each MII access.
 o Don't reprogram VR_PHYADDR register whenever access to a register
   located at a PHY address is made. Rhine fmaily allows reprogramming
   PHY address location via VR_PHYADDR register depending on
   VR_MIISTAT_PHYOPT bit of VR_MIISTAT register. This used to lead
   numerous phantom PHYs attached to miibus during phy probe phase and
   driver used to limit allowable PHY address in mii register accessors
   for certain chip revisions. This removes one more register access
   cycle for each MII access.
 o Correctly set VLAN header length.
 o bus_dma(9) conversion.
    - Limit DMA access to be in range of 32bit address space. Hardware
      doesn't support DAC.
    - Apply descriptor ring alignment requirements(16 bytes alignment)
    - Apply Rx buffer address alignment requirements(4 bytes alignment)
    - Apply Tx buffer address alignment requirements(4 bytes alignment)
      for Rhine I chip. Rhine II or III has no Tx buffer address
      alignment restrictions, though.
    - Reduce number of allowable number of DMA segments to 8.
    - Removed the atomic(9) used in descriptor ownership managements
      as it's job of bus_dmamap_sync(9).
    With these change vr(4) should work on all platforms.
 o Rhine uses two separated 8bits command registers to control Tx/Rx
   MAC. So don't access it as a single 16bit register.
 o For non-strict alignment architectures vr(4) no longer require
   time-consuming copy operation for received frames to align IP
   header. This greatly improves Rx performance on i386/amd64
   platforms. However the alignment is still necessary for
   strict-alignment platforms(e.g. sparc64). The alignment is handled
   in new fuction vr_fixup_rx().
 o vr_rxeof() now rejects multiple-segmented(fragmented) frames as
   vr(4) is not ready to handle this situation. Datasheet said nothing
   about the reason when/why it happens.
 o In vr_newbuf() don't set VR_RXSTAT_FIRSTFRAG/VR_RXSTAT_LASTFRAG
   bits as it's set by hardware.
 o Don't pass checksum offload information to upper layer for
   fragmented frames. The hardware assisted checksum is valid only
   when the frame is non-fragmented IP frames. Also mark the checksum
   is valid for corrupted frames such that upper layers doesn't need
   to recompute the checksum with software routine.
 o Removed vr_rxeoc(). RxDMA doesn't seem to need to be idle before
   sending VR_CMD_RX_GO command. Previously it used to stop RxDMA
   first which in turn resulted in long delays in Rx error recovery.
 o Rewrote Tx completion handler.
    - Always check VR_TXSTAT_OWN bit in status word prior to
      inspecting other status bits in the status word.
    - Collision counter updates were corrected as VT3071 or newer
      ones use different bits to notify collisions.
    - Unlike other chip revisions, VT86C100A uses different bit to
      indicate Tx underrun. For VT3071 or newer ones, check both
      VR_TXSTAT_TBUFF and VR_TXSTAT_UDF bits to see whether Tx
      underrun was happend. In case of Tx underrun requeue the failed
      frame and restart stalled Tx SM. Also double Tx DMA threshold
      size on each failure to mitigate future Tx underruns.
    - Disarm watchdog timer only if we have no queued packets,
      otherwise don't touch watchdog timer.
 o Rewrote interrupt handler.
    - status word in Tx/Rx descriptors indicates more detailed error
      state required to recover from the specific error. There is no
      need to rely on interrupt status word to recover from Tx/Rx
      error except PCI bus error. Other event notifications like
      statistics counter overflows or link state events will be
      handled in main interrupt handler.
    - Don't touch VR_IMR register if we are in suspend mode. Touching
      the register may hang the hardware if we are in suspended state.
      Previously it seems that touching VR_IMR register in interrupt
      handler was to work-around panic occurred in system shutdown
      stage on SMP systems. I think that work-around would hide
      root-cause of the panic and I couldn't reproduce the panic
      with multiple attempts on my box.
 o While padding space to meet minimum frame size, zero the pad data
   in order to avoid possibly leaking sensitive data.
 o Rewrote vr_start_locked().
    - Don't try to queue packets if number of available Tx descriptors
      are short than that of required one.
 o Don't reinitialize hardware whenever media configuration is
   changed. Media/link state changes are reported from mii layer if
   this happens and vr_link_task() will perform necessary changes.
 o Don't reinitialize hardware if only PROMISC bit was changed. Just
   toggle the PROMISC bit in hardware is sufficient to reflect the
   request.
 o Rearrganed the IFCAP_POLLING/IFCAP_HWCSUM handling in vr_ioctl().
 o Generate Tx completion interrupts for every VR_TX_INTR_THRESH-th
   frames. This reduces Tx completion interrupts under heavy network
   loads.
 o Since vr(4) doesn't request Tx interrupts for every queued frames,
   reclaim any pending descriptors not handled in Tx completion
   handler before actually firing up watchdog timeouts.
 o Added vr_tx_stop()/vr_rx_stop() to wait for the end of active
   TxDMA/RxDMA cycles(draining). These routines are used in vr_stop()
   to ensure sane state of MAC before releasing allocated Tx/Rx
   buffers. vr_link_task() also takes advantage of these functions to
   get to idle state prior to restarting Tx/Rx.
 o Added vr_tx_start()/vr_rx_start() to restart Rx/Tx. By separating
   Rx operation from Tx operation vr(4) no longer need to full-reset
   the hardware in case of Tx/Rx error recovery.
 o Implemented WOL.
 o Added VT6105M specific register definitions. VT6105M has the
   following hardware capabilities.
    - Tx/Rx IP/TCP/UDP checksum offload.
    - VLAN hardware tag insertion/extraction. Due to lack of information
       for getting extracted VLAN tag in Rx path, VLAN hardware support
       was not implemented yet.
    - CAM(Content Addressable Memory) based 32 entry perfect multicast/
      VLAN filtering.
    - 8 priority queues.
 o Implemented CAM based 32 entry perfect multicast filtering for
   VT6105M. If number of multicast entry is greater than 32, vr(4)
   uses traditional hash based filtering.
 o Reflect real Tx/Rx descriptor structure. Previously vr(4) used to
   embed other driver (private) data into these structure. This type
   of embedding make it hard to work on LP64 systems.
 o Removed unused vr_mii_frame structure and MII bit-baning
   definitions.
 o Added new PCI configuration registers that controls mii operation
   and mode selection.
 o Reduced number of Tx/Rx descriptors to 128 from 256. From my
   testing, increasing number of descriptors above than 64 didn't help
   increasing performance at all. Experimentations show 128 Rx
   descriptors seems to help a lot reducing Rx FIFO overruns under
   high system loads. It seems the poor Tx performance of Rhine
   hardwares comes from the limitation of hardware. You wouldn't
   satuarte the link with vr(4) no matter how fast CPU/large number of
   descriptors are used.
 o Added vr_statistics structure to hold various counter values.

No regression was reported but one variant of Rhine III(VT6105M)
found on RouterBOARD 44 does not work yet(Reported by Milan Obuch).
I hope this would be resolved in near future.

I'd like to say big thanks to Mike Tancsa who kindly donated a Rhine
hardware to me. Without his enthusiastic testing and feedbacks
overhauling vr(4) never have been possible. Also thanks to Masayuki
Murayama who provided some good comments on the hardware's internals.
This driver is result of combined effort of many users who provided
many feedbacks so I'd like to say special thanks to them.

Hardware donated by:	Mike Tancsa (mike AT sentex dot net)
Reviewed by:		remko (initial version)
Tested by:		Mike Tancsa(x86), JoaoBR ( joao AT matik DOT com DOT br )
			Marcin Wisnicki ( mwisnicki+freebsd AT gmail DOT com )
			Stefan Ehmann ( shoesoft AT gmx DOT net )
			Florian Smeets ( flo AT kasimir DOT com )
			Phil Oleson ( oz AT nixil DOT net )
			Larry Baird ( lab AT gta DOT com )
			Milan Obuch ( freebsd-current AT dino DOT sk )
			remko (initial version)
2008-03-11 04:51:22 +00:00
Pyun YongHyeon
59cf2cdf02 vr(4) was repocopied to src/sys/dev/vr. 2008-03-11 03:53:53 +00:00
Pyun YongHyeon
daeba9bdc6 Update file list and Makefile after repocopying vr(4) from
src/sys/pci to src/sys/dev.
2008-03-11 03:50:57 +00:00
Pyun YongHyeon
ea7d6fcdcd Forced commit to note that vr(4) was repocopied from sys/pci
and modified for its new location.
2008-03-11 03:44:46 +00:00
Pyun YongHyeon
2b71cf8696 Move comments block 1 line up to remark on the setting
if_capabilities. This would make comments clear.

Suggested by:	yar
2008-03-11 02:39:52 +00:00
Andrew Thompson
82f1b132a4 Update wpi(4) with stability fixes
- remove second taskqueue
 - busdma 16k alignment workaround
 - use busdma instead of external mbuf storage on Rx
 - locking fixes
 - net80211 state change fixes
 - improve scanning reliability
 - improve radio hw switch interaction
 - consolidate callouts

Parts obtained from:	benjsc, sam
Tested by:		many
2008-03-10 23:16:48 +00:00
Jeff Roberson
c143ac21af - Fix the invalid priority panics people are seeing by forcing
tdq_runq_add to select the runq rather than hoping we set it properly
   when we adjusted the priority.  This involves the same number of
   branches as before so should perform identically without the extra
   fragility.

Tested by:	bz
Reviewed by:	bz
2008-03-10 22:48:27 +00:00
John Baldwin
463e0f91cb Probe CPUs after the PCI hierarchy on i386, amd64, and ia64. This allows
the cpufreq drivers to reliably use properties of PCI devices for quirks,
etc.
- For the legacy drivers, add CPU devices via an identify routine in the
  CPU driver itself rather than in the legacy driver's attach routine.
- Add CPU devices after Host-PCI bridges in the acpi bus driver.
- Change the ichss(4) driver to use pci_find_bsf() to locate the ICH and
  check its device ID rather than having a bogus PCI attachment that only
  checked for the ID in probe and always failed.  As a side effect, you
  can now kldload ichss after boot.
- Fix the ichss(4) driver to use the correct device_t for the ICH (and not
  for ichss0) when doing PCI config space operations to enable SpeedStep.

MFC after:	2 weeks
Reviewed by:	njl, Andriy Gapon  avg of icyb.net.ua
2008-03-10 22:18:07 +00:00
John Baldwin
c3cefed5eb - Don't execute cpuid to fetch the features. We already have the features
present in cpu_feature2.  Also, use CPUID2_EST rather than a magic
  number.
- Don't free the ACPI settings list in detach if we are going to fail the
  request.  Otherwise an attempt to kldunload est would free the array
  but the driver would keep trying to use it.

MFC after:	1 week
2008-03-10 22:00:35 +00:00
John Baldwin
4937cb2d30 Change the BTX kernel to drop all the way out to real mode to invoke BIOS
routines (V86 requests from the client and hardware interrupt handlers):
- Install trampoline real mode interrupt handlers at IDT vectors 0x20-0x2f
  to handle hardware interrupts by invoking the appropriate vector (0x8-0xf
  or 0x70-0x78).  This allows the 8259As to use vectors 0x20-0x2f in real
  mode as well as protected mode will ensuring that the master 8259A
  doesn't share IDT space with CPU exceptions in protected mode.
- Since we don't need to reserve space for page tables and a page directory
  anymore since dropping paging support, move the TSS and protected mode
  IDT up by 16k.  Grow the ring 1 link stack by 16k as a result.
- Repurpose the ring 1 link stack to be used as a real mode stack when
  invoking real mode routines either via a V86 request or a hardware
  interrupts.  This simplifies a few things as we avoid disturbing the
  original user stack.
- Add some more block comments to explain how the code interacts with the
  V86 structure as this wasn't immediately obvious from the prior comments
  (e.g. that we explicitly copy the seg regs for real mode out of the V86
  struct onto the stack to be popped off when going into real mode, etc.).
  Also, document some of the stack frames we create going to real mode and
  back.
- Remove all of the virtual 86 related code including having to simulate
  various instructions and BIOS calls on a trap from virtual 86 mode.
- Explicitly panic if a user client attempts to perform a V86 CALL
  request that isn't a far call.
- Bump version to 1.2.

Assuming this works ok this should fix some of the long standing issues
with USB booting as well as etherboot.

MFC after:	2 weeks
Submitted by:	kib (some parts from his original real mode patch)
2008-03-10 21:43:31 +00:00
Ed Maste
3eb8098d2b Remove include of opt_quota.h; as of revision 1.205 there is no longer
any #ifdef QUOTA conditional code.
2008-03-10 18:44:07 +00:00
Robert Watson
d4cafc74ae Remove XXX to remind me to check the free space calculation, which to my
eyes appears right following a check.

MFC after:	3 days
2008-03-10 18:15:02 +00:00
Robert Watson
b525186851 Remove unused vc_tnode field from struct smb_vc.
MFC after:	3 days
2008-03-10 14:55:34 +00:00
Yoshihiro Takahashi
0236301720 MFi386: revision 1.482.
Import uslcom(4) from OpenBSD - this is a driver for Silicon Laboratories
  CP2101/CP2102 based USB serial adapters.
2008-03-10 12:25:04 +00:00
Jeff Roberson
7217d8d1ee - Don't rely on a side effect of sched_prio() to set the initial ts_runq
for thread0.  Set it directly in sched_setup().  This fixes traps on boot
   seen on some machines.

Reported by:	phk
2008-03-10 09:50:29 +00:00
Jeff Roberson
8f93d79d05 - Handle kdb switch panics outside of mi_switch() to remove some instructions
from the common path and make the code more clear.  Whether this has any
   impact on performance may depend on optimization levels.

Sponsored by:	Nokia
2008-03-10 03:16:51 +00:00
Jeff Roberson
73daf66f41 Reduce ULE context switch time by over 25%.
- Only calculate timeshare priorities once per tick or when a thread is woken
   from sleeping.
 - Keep the ts_runq pointer valid after all priority changes.
 - Call tdq_runq_add() directly from sched_switch() without passing in via
   tdq_add().  We don't need to adjust loads or runqs anymore.
 - Sort tdq and ts_sched according to utilization to improve cache behavior.

Sponsored by:	Nokia
2008-03-10 03:15:19 +00:00
Warner Losh
9ab8f3544a Tiny bit of KNF to make bus_setup_intr() look like the rest of this
function.
2008-03-10 01:48:25 +00:00
Jeff Roberson
1bf6461e98 - Add the missing '2' case to the switch table for kern.smp.topology and
assign it to create the flat 'none' topology where all cpus are scheduled
   as if they are equal and unrelated.
2008-03-10 01:38:53 +00:00
Jeff Roberson
32c9d3a767 - Rather than repeating the same preemption code everywhere call the scheduler
specific sched_preempt() routine.
2008-03-10 01:32:48 +00:00
Jeff Roberson
ff256d9c47 - Add an implementation of sched_preempt() that avoids excessive IPIs.
- Normalize the preemption/ipi setting code by introducing sched_shouldpreempt()
   so the logical is identical and not repeated between tdq_notify() and
   sched_setpreempt().
 - In tdq_notify() don't set NEEDRESCHED as we may not actually own the thread lock
   this could have caused us to lose td_flags settings.
 - Garbage collect some tunables that are no longer relevant.
2008-03-10 01:32:01 +00:00
Jeff Roberson
1e24c28f46 - Add a sched_preempt() routine to be called by md code after IPI_PREEMPT is
delivered.
 - Add a simple implementation to 4bsd.
2008-03-10 01:30:35 +00:00
Robert Watson
23a0c23034 Improve convergence of bpf_filter.c toward style(9).
MFC after:	3 weeks
Submitted by:	csjp
2008-03-09 21:13:43 +00:00
Marius Strobl
801772ec32 - Fix some style bugs and remove another banal comment missed in
rev. 1.46.
- Move the KASSERT on gem_add_rxbuf() to the right spot and add an
  equivalent one to gem_disable_tx().
2008-03-09 17:55:19 +00:00
Marius Strobl
d8ef604544 - Fix some style bugs.
- Replace hard-coded functions names missed in rev. 1.44 with __func__.

MFC after:	1 week
2008-03-09 17:09:15 +00:00
Marius Strobl
d5295d0b09 - Do as the comment in pmap_bootstrap() suggests and flush all non-locked
TLB entries possibly left over by the firmware and also do so while
  bootstrapping APs.
- Use __FBSDID.

MFC after:	1 month
2008-03-09 15:53:34 +00:00
Bjoern A. Zeeb
413deb1262 Padding after EOL option must be zeros according to RFC793 but
the NOPs used are 0x01.
While we could simply pad with EOLs (which are 0x00), rather use an
explicit 0x00 constant there to not confuse poeple with 'EOL padding'.
Put in a comment saying just that.

Problem discussed on:	src-committers with andre, silby, dwhite as
			follow up to the rev. 1.161 commit of tcp_var.h.
MFC after:		11 days
2008-03-09 13:26:50 +00:00
Robert Watson
358f8d822b HZ now defaults to 1000 on many architectures, so update NOTES to reflect
that.

MFC after:	3 days
PR:		113670
Submitted by:	Ighighi <ighighi at gmail.com>
2008-03-09 11:29:59 +00:00
Rui Paulo
8a000acaa9 Some PIIX4 chipsets need to be told to generate Stop Breaks by setting
the appropriate bit in the DEVACTB register.
This change allows the C2 state on those systems to work as expected.

Reviewed by:	njl
Submitted by:	Andriy Gapon <avg at icyb.net.ua>
MFC after:	1 week
2008-03-09 11:19:03 +00:00
Alexander Motin
395adfbe34 Addition to the previous commit. Release inproc in case of memory error. 2008-03-09 11:17:00 +00:00
Alan Cox
593e717ec9 Eliminate an unnecessary test from vm_fault's delete-behind heuristic.
Specifically, since the delete-behind heuristic is never applied to a
device-backed object, there is no point in checking whether each of the
object's pages is fictitious.  (Only device-backed objects have
fictitious pages.)
2008-03-09 06:08:58 +00:00
Warner Losh
908e1e5df5 Any driver that relies on its parent to set the devclass has no way to
know if has siblings that need an actual probe.  Introduce a specail
return value called BUS_PROBE_NOOWILDCARD.  If the driver returns
this, the probe is only successful for devices that have had a
specific devclass set for them.

Reviewed by: current@, jhb@, grehan@
2008-03-09 05:10:22 +00:00
Marcel Moolenaar
27080415a2 Don't use in32() and out32() when writing to the CCSRBAR. The
in*() and out*() primitives should not be used, other than by
ISA drivers. In this case they were used for memory-mapped I/O
and were not even used in the spirit of the primitives.
2008-03-09 02:29:19 +00:00
Alexander Motin
af63939c67 To avoid control data losses do not acknowledge recieving of control packet
if netgraph reported error while delivering to destination.
Reset 'next send' counter to the last requested by peer on ack timeout
to resend all subsequest packets after lost one again without additional hints.
2008-03-08 23:55:29 +00:00
Antoine Brodin
4fd1b794e2 Bump __FreeBSD_version for F_DUP2FD command to fcntl(2)
Requested by:	Craig Rodrigues
Approved by:	rwatson (mentor)
2008-03-08 22:17:14 +00:00
Antoine Brodin
e3ad7f6626 Introduce a new F_DUP2FD command to fcntl(2), for compatibility with
Solaris and AIX.
fcntl(fd, F_DUP2FD, arg) and dup2(fd, arg) are functionnaly equivalent.
Document it.
Add some regression tests (identical to the dup2(2) regression tests).

PR:		120233
Submitted by:	Jukka Ukkonen
Approved by:	rwaston (mentor)
MFC after:	1 month
2008-03-08 22:02:21 +00:00
David E. O'Brien
0a3374af71 "root" the include path so there is less duplication. 2008-03-08 19:14:43 +00:00
Scott Long
9d6a74eb84 Fix a mistake made during the import of the driver. Previous versions of
HPT drivers would sometimes test the value of a preprocessor definition but
not always make sure that the definition existed in the first place, leading
to warnings on newer compilers.  I blindly assumed the same with this driver,
and it turned out to be wrong and to enable some code that doesn't work.
2008-03-08 18:06:48 +00:00
Robert Watson
36b208e008 Use sbuf routines to construct core dump filenames rather than custom
string buffer handling, making the code both easier to read and more
robust against string-handling bugs.

MFC after:	1 week
2008-03-08 16:31:29 +00:00
Robert Watson
eeccc36738 Unlock the process lock when expand_name() fails, or we may leak the
process lock leading to a hang.  This bug was introduced in
kern_sig.c:1.351, when the call to expand_name() was moved earlier
bit this particular error case was not updated.
2008-03-08 15:48:06 +00:00
Marcel Moolenaar
704bb9b36f Enable the D-cache and I-cache when not already enabled.
It so happens that U-Boot disables the D-cache when booting
an ELF image, so this change makes sure we run with the
D-cache enabled from now on. It shows too...

While here, remove the duplicate definition of the hw.model
sysctl.
2008-03-08 05:36:25 +00:00
Marcel Moolenaar
8a109fa3d8 For AIM, have cpu_idle() set MSR_POW when the powerpc_pow_enabled
variable is set. On my Mac Mini this puts the CPU in NAP mode when
the kernel is idle and, any technical or environmental reasons
aside, avoids that I have to listen to the fan all day :-)
2008-03-07 22:27:06 +00:00
Marcel Moolenaar
d6f5929710 Add support for the BUS_CONFIG_INTR() method to the platform and to
openpic(4). Make use of it in ocpbus(4). On the MPC85xxCDS, IRQ0:4
are active-low.
2008-03-07 22:08:43 +00:00
Alexander Motin
6e7ed93017 Send only one incoming notification at a time to reduce queue
trashing and improve performance.
Remove waitflag argument from ng_ksocket_incoming2(), it means nothing
as function call was queued by netgraph.
Remove node validity check, as node validity guarantied by netgraph.
Update comments.
2008-03-07 21:12:56 +00:00
Robert Watson
7c7b7f8e1b Add a /S mode to DDB "ex" command, which interprets and prints the
value at the requested address as a symbol.  For example, "ex /S
aio_swake" prints the name of the function currently registered in
via aio_swake hook.

The change as committed differs slightly from the patch in the PR,
as I force the size of the retrieved value (and the automatic
address increment) to be sizeof(void *).  This seems to provide
the most useful auto-increment behavior, and defaults using the
default size (4), which is not sizeof(void *) on 64-bit platforms.

MFC after:	3 days
PR:		57976
Submitted by:	Dan Strick <strick at covad.net>
2008-03-07 18:09:07 +00:00
Marcel Moolenaar
6630c534aa Apply le*toh() or htole*() to the variables of which we use the address
as the buffer pointer in the call to axe_cmd(). This is needed to make
the code work on big-endian machines.

Ok'd: imp@
2008-03-07 16:55:24 +00:00
Robert Watson
b9175c4556 Move IFF_NEEDSGIANT warning from if_ethersubr.c to if.c so it is displayed
for all network interfaces, not just ethernet-like ones.

Upgrade it to a louder WARNING and be explicit that the flag is obsolete.
Support for IFF_NEEDSGIANT will be removed in a few months (see arch@ for
details) and will not appear in 8.0.

Upgrade if_watchdog to a WARNING.
2008-03-07 16:00:44 +00:00
Robert Watson
b916b56b5a Add __FBSDID() tag.
MFC after:	3 days
Pointed out by:	antoine
2008-03-07 15:27:08 +00:00
Robert Watson
3755dbd805 When killing a user process from DDB, check that the requested signal is
> 0 rather than >= 0, or we will panic when trying to deliver the signal.

MFC after:	3 days
PR:		100802
Submitted by:	Valerio Daelli <valerio.daelli at gmail.com>
2008-03-07 14:26:30 +00:00