Commit Graph

67028 Commits

Author SHA1 Message Date
bz
e72139652f In some situations we were not clearing pending link state attentions.
Because of this we were not getting further interrupts for link state
changes, thus never went into iface UP state and thus could not transmit.

The only way out of this was an incoming packet generating an rx interrupt
and making us call into bge_link_upd.

Up to rev. 1.101, in bge_start_locked, we only returned instantly
if there was 'no link AND nothing queued for tx'. So with a packet queued
for tx, we hit the register scrubbing at the end of bge_start_locked
and were out fine. We simply lost a packet or two but got the interrupts
need to get into UP state.
With rev. 1.102 this was turned into 'if there is no link OR there is
nothing to send' (correct behaviour) and as long as there is no link
we never hit the register scrubbing and consequently never got the link UP.

What we do now is force an interrupt at the end of bge_ifmedia_upd_locked
so we will call bge_link_upd, clear the link state attention and get
further interrupts.
This helps to get the iface UP on an idle network or at least to get
it UP faster not depending on an rx intr anymore.
In case you could not get a DHCP lease or it took very long,
it was because of this.

It is unknown which chips are affected by this. ASIC rev. 0x2003 was the
most popular trouble candidate.
At least the fiber cards should have been working fine.

Which register to scrub is currently under discussion. The comitted
solution was tested and found to work for a lot of setups. It might
not help with MSI.
The reason why we end up in such a situation is entirely unknown.

PR:		kern/111804
Tested by:	phk, scottl at Y!
MFC after:	14 days
2008-04-08 11:51:17 +00:00
kevlo
bb2376b552 Remove some long-dead code
Reviewed by: cognet
2008-04-08 10:24:42 +00:00
kib
133f8f7798 Regenerate 2008-04-08 09:51:19 +00:00
kib
eb77b477b4 Implement the linux syscalls
openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat,
    renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat.

Submitted by:	rdivacky
Sponsored by:	Google Summer of Code 2007
Tested by:	pho
2008-04-08 09:45:49 +00:00
weongyo
07140f6c56 Add a couple of missing wireless NIC driver modules.
Approved by:	thompsa (mentor)
2008-04-08 01:47:33 +00:00
jhb
348561beef Add PCI ID's for ICH8 USB controllers.
MFC after:	1 week
PR:		usb/116574
Submitted by:	Dave Grochowski  malus.x of gmail
2008-04-07 19:12:22 +00:00
andre
03c2dba129 Remove TCP options ordering assumptions in tcp_addoptions(). Ordering
was changed in rev. 1.161 of tcp_var.h.  All option now test for sufficient
space in TCP header before getting added.

Reported by:	Mark Atkinson <atkin901-at-yahoo.com>
Tested by:	Mark Atkinson <atkin901-at-yahoo.com>
MFC after:	1 week
2008-04-07 19:09:23 +00:00
andre
2f48bcbeeb Remove now unnecessary comment. 2008-04-07 18:50:05 +00:00
andre
6028637ec1 Use #defines for TCP options padding after EOL to be consistent.
Reviewed by:	bz
2008-04-07 18:43:59 +00:00
jhb
589d19f263 Revert back to probing Host-PCI bridges in the order we encounter them in
the tree rather than sorting them by their address on PCI bus 0.

Reported by:	kan
2008-04-07 18:35:11 +00:00
pjd
e83dd39a00 Correct function name in panic().
Reported by:	kensmith
2008-04-07 18:12:37 +00:00
attilio
d5dbd84790 - Use a different encoding for lockmgr options: make them encoded by
bit in order to allow per-bit checks on the options flag, in particular
  in the consumers code [1]
- Re-enable the check against TDP_DEADLKTREAT as the anti-waiters
  starvation patch allows exclusive waiters to override new shared
  requests.

[1] Requested by:	pjd, jeff
2008-04-07 14:46:38 +00:00
rpaulo
be69e8bcc6 Actually, I was looking at the wrong Linux .c file. Set INIT2 to its
previous value.
While there, lower the delay for the misterious key.
2008-04-07 12:58:43 +00:00
rwatson
cb53c63d17 Add further TCP inpcb locking assertions to some TCP input code paths.
MFC after:	1 month
2008-04-07 12:41:45 +00:00
rpaulo
9c842efae3 * Add missing #else in the #ifdef DEBUG section.
* Fix the login in asmc_init().
* Change the INIT2 constant to reflect the same change in the Linux driver.
2008-04-07 12:09:59 +00:00
rpaulo
aef67c0d53 "Prettyfy" numbers in hexadecimal. No functional change. 2008-04-07 11:38:42 +00:00
rpaulo
44e0ecaea4 Remove isa_if.h. 2008-04-07 11:26:13 +00:00
rpaulo
e654fd2681 The SMC is represented on the acpi tables, so we can completely remove
dependency on isa. We are now an acpi child.

Also:
	* Add compile time debugging activation
	* Increase the delay for the SMS init flag.
2008-04-07 11:22:12 +00:00
rpaulo
0db35b0777 Add opt_intr_filter.h. 2008-04-07 11:08:45 +00:00
alc
f2ea5d4883 Update pmap_page_wired_mappings() so that it counts 2/4MB page mappings. 2008-04-07 07:38:02 +00:00
rwatson
6d3db5778b Maintain and observe a ZBUF_FLAG_IMMUTABLE flag on zero-copy BPF
buffer kernel descriptors, which is used to allow the buffer
currently in the BPF "store" position to be assigned to userspace
when it fills, even if userspace hasn't acknowledged the buffer
in the "hold" position yet.  To implement this, notify the buffer
model when a buffer becomes full, and check that the store buffer
is writable, not just for it being full, before trying to append
new packet data.  Shared memory buffers will be assigned to
userspace at most once per fill, be it in the store or in the
hold position.

This removes the restriction that at most one shared memory can
by owned by userspace, reducing the chances that userspace will
need to call select() after acknowledging one buffer in order to
wait for the next buffer when under high load.  This more fully
realizes the goal of zero system calls in order to process a
high-speed packet stream from BPF.

Update bpf.4 to reflect that both buffers may be owned by userspace
at once; caution against assuming this.
2008-04-07 02:51:00 +00:00
rwatson
d566093a40 Coerce if_loop.c in the general direction of style(9):
- Use ANSI function declarations
- Remove use of 'register' keyword
- Prefer style(9) return parens, white space

MFC after:	1 month
2008-04-07 01:43:30 +00:00
truckman
3ab6955dbc vfs_syscalls.c 1.452 mistakenly swapped the behavior of chown() and lchown(). 2008-04-07 00:29:32 +00:00
rwatson
00684a83a1 In in_pcbnotifyall() and in6_pcbnotify(), use LIST_FOREACH_SAFE() and
eliminate unnecessary local variable caching of the list head pointer,
making the code a bit easier to read.

MFC after:	3 weeks
2008-04-06 21:20:56 +00:00
attilio
b841937e98 Bump __FreeBSD_version in order to reflect lockmgr_rw() and
lockmgr_args_rw() introduction.
2008-04-06 20:27:54 +00:00
attilio
07441f19e1 Optimize lockmgr in order to get rid of the pool mutex interlock, of the
state transitioning flags and of msleep(9) callings.
Use, instead, an algorithm very similar to what sx(9) and rwlock(9)
alredy do and direct accesses to the sleepqueue(9) primitive.

In order to avoid writer starvation a mechanism very similar to what
rwlock(9) uses now is implemented, with the correspective per-thread
shared lockmgrs counter.

This patch also adds 2 new functions to lockmgr KPI: lockmgr_rw() and
lockmgr_args_rw().  These two are like the 2 "normal" versions, but they
both accept a rwlock as interlock.  In order to realize this, the general
lockmgr manager function "__lockmgr_args()" has been implemented through
the generic lock layer. It supports all the blocking primitives, but
currently only these 2 mappers live.

The patch drops the support for WITNESS atm, but it will be probabilly
added soon. Also, there is a little race in the draining code which is
also present in the current CVS stock implementation: if some sharers,
once they wakeup, are in the runqueue they can contend the lock with
the exclusive drainer.  This is hard to be fixed but the now committed
code mitigate this issue a lot better than the (past) CVS version.
In addition assertive KA_HELD and KA_UNHELD have been made mute
assertions because they are dangerous and they will be nomore supported
soon.

In order to avoid namespace pollution, stack.h is splitted into two
parts: one which includes only the "struct stack" definition (_stack.h)
and one defining the KPI.  In this way, newly added _lockmgr.h can
just include _stack.h.

Kernel ABI results heavilly changed by this commit (the now committed
version of "struct lock" is a lot smaller than the previous one) and
KPI results broken by lockmgr_rw() / lockmgr_args_rw() introduction,
so manpages and __FreeBSD_version will be updated accordingly.

Tested by:      kris, pho, jeff, danger
Reviewed by:    jeff
Sponsored by:   Google, Summer of Code program 2007
2008-04-06 20:08:51 +00:00
alc
2f4904816f Introduce vm_reserv_reclaim_contig(). This function is used by
contigmalloc(9) as a last resort to steal pages from an inactive,
partially-used superpage reservation.

Rename vm_reserv_reclaim() to vm_reserv_reclaim_inactive() and
refactor it so that a separate subroutine is responsible for breaking
the selected reservation.  This subroutine is also used by
vm_reserv_reclaim_contig().
2008-04-06 18:09:28 +00:00
mav
254de061de Rewrite node's r/w/q-lock semantics using only atomics instead of mutex
and atomics combination. Mutex is now used only for queue protection.
Also avoid unneded extra swi scheduling calls.
2008-04-06 15:26:32 +00:00
jeff
d1c199f415 - Correct a major error introduced in the per-cpu timeout commit. Sleep
and wakeup require the same wait channel to function properly.

Found by:	kris
Pointy hat:	me
2008-04-06 11:08:49 +00:00
cognet
9fe523332b Remove bus_space_generic.c from the per-plarform files. Having it in the
per-cpu files should be enough.
2008-04-05 21:57:11 +00:00
cognet
7bb418c25a Add bus_space_generic.c for the i81342 as well. 2008-04-05 21:51:11 +00:00
jhb
68917b32fc Move INTR_FILTER from opt_global.h to its own header. 2008-04-05 20:13:15 +00:00
jhb
79918c45a6 Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This
allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt
code.
- Rename the intr_event 'eoi', 'disable', and 'enable' hooks to
  'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric.
  Also, add a comment describe what the MI code expects them to do.
- On amd64, i386, and powerpc this is effectively a NOP.
- On arm, don't bother masking the interrupt unless the ithread is
  scheduled in the non-INTR_FILTER case to match what INTR_FILTER did.
  Also, don't bother unmasking the interrupt in the post_filter case if
  we never masked it.  The INTR_FILTER case had been doing this by having
  arm_unmask_irq for the post_filter (formerly 'eoi') hook.
- On ia64, stray interrupts are now masked for the non-INTR_FILTER case.
  They were already masked in the INTR_FILTER case.
- On sparc64, use the a NULL pre_ithread hook and use intr_enable_eoi() for
  both the 'post_filter' and 'post_ithread' hooks to match what the
  non-INTR_FILTER code did.
- On sun4v, retire the ithread wrapper hack by using an appropriate
  'post_ithread' hook instead (it's what 'post_ithread'/'enable' was
  designed to do even in 5.x).

Glanced at by:	piso
Reviewed by:	marius
Requested by:	marius [1], [5]
Tested on:	amd64, i386, arm, sparc64
2008-04-05 19:58:30 +00:00
jhb
5279835925 During attach on some de(4) adapters the driver sends out a test packet as
part of detecting the media.  Explicitly ensure that we don't send it to
bpf(4) as bpf(4) isn't setup yet.  This worked by accident before the bpf
interface stuff was reworked to avoid other races (bpf_peers_present, etc.)
but now it needs an explicit check to avoid a panic.

MFC after:	3 days
PR:		kern/120915
2008-04-05 17:24:44 +00:00
takawata
4d340bc45f GPE lock may recurse on resume path. 2008-04-05 14:21:01 +00:00
alc
871b77b7f6 Eliminate an unnecessary test from vm_phys_unfree_page(). 2008-04-05 05:02:53 +00:00
yongari
47012d1725 Add support for IC Plus IP1001 PHY.
Tested by:	Stuart Fraser < stuart AT stuartfraser DOT net >
2008-04-05 00:52:07 +00:00
imp
64685606b6 If you build a compiler with TARGET_BIG_ENDIAN, and then try to build
a little endian kernel, things break.  Be explicit about the endian
choice by setting it in the little endian case as well.
2008-04-04 19:33:09 +00:00
alc
927cf7f228 Update a comment to vm_map_pmap_enter(). 2008-04-04 19:14:58 +00:00
alc
067dba5f97 Reintroduce UMA_SLAB_KMAP; however, change its spelling to
UMA_SLAB_KERNEL for consistency with its sibling UMA_SLAB_KMEM.
(UMA_SLAB_KMAP met its original demise in revision 1.30 of
vm/uma_core.c.)  UMA_SLAB_KERNEL is now required by the jumbo frame
allocators.  Without it, UMA cannot correctly return pages from the
jumbo frame zones to the VM system because it resets the pages' object
field to NULL instead of the kernel object.  In more detail, the jumbo
frame zones are created with the option UMA_ZONE_REFCNT.  This causes
UMA to overwrite the pages' object field with the address of the slab.
However, when UMA wants to release these pages, it doesn't know how to
restore the object field, so it sets it to NULL.  This change teaches
UMA how to reset the object field to the kernel object.

Crashes reported by: kris
Fix tested by: kris
Fix discussed with: jeff
MFC after: 6 weeks
2008-04-04 18:41:12 +00:00
imp
8353654e96 Fix stupid typo 2008-04-04 18:22:16 +00:00
alc
b1ed68eb9c Eliminate an unnecessary test and its misleading comment from pmap_enter(). 2008-04-04 18:00:22 +00:00
raj
cd6e8c4dc8 Make kernel.tramp build properly on ARM9E.
Reviewed by:	imp
Approved by:	cognet (mentor)
2008-04-04 17:35:24 +00:00
jeff
d6dabc3153 - Add sysctls at debug.rwlock to control the behavior of the speculative
spinning when readers hold a lock.  This spinning is speculative because,
   unlike the write case, we can not test whether the owners are running.
 - Add speculative read spinning for readers who are blocked by pending
   writers while a read lock is still held.  This allows the thread to
   spin until the write lock succeeds after which it may spin until the
   writer has released the lock.  This prevents excessive context switches
   when readers and writers both hold the lock for brief periods.

Sponsored by:	Nokia
2008-04-04 10:00:46 +00:00
kib
dae02901d4 The temporary workaround for the call to the vget() without lock type in
the fdesc_allocvp(). The caller of the fdesc_allocvp() expects that the
returned vnode is not reclaimed. Do lock the vnode exclusive and drop
the lock after.

Reported by:	pho
Reviewed by:	jeff
2008-04-04 09:37:57 +00:00
jeff
73796e923c - Add a Nokia copyright to cpuset to reflect their generous
contribution to this work.
2008-04-04 01:22:04 +00:00
jeff
85d3ffe23c - Allow static_boost to specify no boost with '0', traditional kernel
fixed pri boost with '1' or any priority less than the current thread's
   priority with a value greater than two.  Default the boost to
   PRI_MIN_TIMESHARE to prevent regular user-space threads from starving
   threads in the kernel.  This prevents these user-threads from also
   being scheduled as if they are high fixed-priority kernel threads.
 - Restore the setting of lowpri in tdq_choose().  It has to be either here
   or in sched_switch().  I accidentally removed it from both places.

Tested by:	kris
2008-04-04 01:16:18 +00:00
jeff
c50de590cc - Don't check for the ITHD pri class in tdq_load_add and rem. 4BSD doesn't
do this either.  Simply check P_NOLOAD.  It'd be nice if this was
   in a thread flag so we didn't have an extra cache miss every time we
   add and remove a thread from the run-queue.
2008-04-04 01:04:43 +00:00
jeff
7d635b683d - Fix a mis-merge that crept in during the softclock changes.
Spotted by:	jhb
2008-04-04 01:03:23 +00:00
emaste
3e2864c15b Allow crashdumps on machines with >4GB of RAM as long as the adapter can
do 64-bit S/G.

Submitted by:	Alex Bencz
Reviewed by:	scottl
2008-04-03 23:29:31 +00:00
jfv
bb6c26f00e Fix the build breakage, need the | between dependencies, I didn't
realize that :(
2008-04-03 20:58:18 +00:00
imp
4211d46f1c Always build kernel.tramp. This should be helpful for a lot of
people, as well making sure it doesn't break.
2008-04-03 20:42:36 +00:00
raj
d7131d4a77 Now really add the bus_space_generic.c file...
Reviewed by:	sam
Approved by:	cognet (mentor)
2008-04-03 18:28:34 +00:00
raj
87f84373e2 Refactor certain ARM bus space methods: instead of having multiple copies of
the same code introduce sys/arm/arm/bus_space_generic.c for a shared set of
routines.

Reviewed by:	sam
Approved by:	cognet (mentor)
2008-04-03 18:22:08 +00:00
raj
6eb81d8e9b Fix AVILA build.
Reviewed by:	sam
Approved by:	cognet(mentor)
2008-04-03 18:20:39 +00:00
marcel
7dc245de4b Align functions to 16-byte boundaries due to profiling granularity. 2008-04-03 17:40:20 +00:00
marcel
d06d18ff96 Set sc_psim so that the openpic core can correct the off-by-one
error in the number of IRQs that PSIM gives us.
2008-04-03 17:38:27 +00:00
imp
13cdc14d99 Take the first baby step towards unifying and cleaning up arminit():
- Pull all the code to deal with the trampoline stuff into one
	  centeralized place and use it from everywhere.
	- Some minor style tidiness

Reviewed by: tinguely
2008-04-03 16:44:50 +00:00
scottl
253d6ed01c Don't force a reset at driver attach time. It doesn't work on some
adapters, apparently.
2008-04-03 14:39:48 +00:00
davidxu
6e5250730e let umtxq_busy() only spin on mp machine. make function name
do_rwlock_unlock to be consistent with others.
2008-04-03 11:49:20 +00:00
jfv
520f6a782b Another build fix 2008-04-03 06:45:38 +00:00
jfv
dd0d32f82b Fix a lint issue in the build. 2008-04-03 06:17:16 +00:00
imp
24b0dea1b1 KERNBASE + 0x00200000 is the same thing as KERNVIRTADDR on this
platform, so use the latter in preference to the former.  This makes
the fake_preload setup be the same between kb920x_machdep.c and
avila_machdep.c....
2008-04-03 06:14:23 +00:00
imp
01156d94e4 Remove unnecessary #define. 2008-04-03 06:07:45 +00:00
jfv
31df7a30d1 Fix minor bug in last checkin, NO_STRICT_ALIGNMENT code. 2008-04-03 00:25:09 +00:00
jfv
9d514d84b9 This update primarily addresses the ability to have both the em
and the igb driver static in the kernel. But it also reflects
some other bug fixes in my development stream at Intel.
PR 122373 is also fixed in this code.
2008-04-02 22:00:36 +00:00
imp
a8d57d03bd Add zyd, ural, and rum. They were missing. 2008-04-02 16:17:19 +00:00
gallatin
f4c98f7455 Initialize if_baudrate using IF_Gbps() macro.
Note that if_baudrate is a long, and 32-bits isn't enough to properly
represent 10Gb/s.

Pointed out by: dwhite
2008-04-02 13:59:43 +00:00
jeff
ca28ca664f - Convert two timeout users to the new callout_reset_curcpu() api.
Sponsored by:	Nokia
2008-04-02 11:21:42 +00:00
jeff
b065517935 Implement per-cpu callout threads, wheels, and locks.
- Move callout thread creation from kern_intr.c to kern_timeout.c
 - Call callout_tick() on every processor via hardclock_cpu() rather than
   inspecting callout internal details in kern_clock.c.
 - Remove callout implementation details from callout.h
 - Package up all of the global variables into a per-cpu callout structure.
 - Start one thread per-cpu.  Threads are not strictly bound.  They prefer
   to execute on the native cpu but may migrate temporarily if interrupts
   are starving callout processing.
 - Run all callouts by default in the thread for cpu0 to maintain current
   ordering and concurrency guarantees.  Many consumers may not properly
   handle concurrent execution.
 - The new callout_reset_on() api allows specifying a particular cpu to
   execute the callout on.  This may migrate a callout to a new cpu.
   callout_reset() schedules on the last assigned cpu while
   callout_reset_curcpu() schedules on the current cpu.

Reviewed by:	phk
Sponsored by:	Nokia
2008-04-02 11:20:30 +00:00
kib
c951adca24 Add two missed chunks from the rev. 1.210, for the giant_read() and
giant_ioctl().

PR:	kern/122287
MFC after:	3 days
2008-04-02 11:11:58 +00:00
jeff
639ca8f21b - Destroy the bo mtx when the vnode is destroyed. 2008-04-02 10:40:03 +00:00
davidxu
aefa44f0cc Fix compiling problem for amd64. 2008-04-02 05:54:41 +00:00
alc
18c266c7a7 Optimize pmap_pml4e() and pmap_pdpe() based upon two observations: The
given pmap is never NULL, and therefore pmap_pml4e() can never return
NULL.  The pervasive use of these inline functions throughout the pmap
makes these simple changes worthwhile.
2008-04-02 04:39:47 +00:00
davidxu
f4f495d3ed Er, don't restart a timeout version. 2008-04-02 04:26:59 +00:00
davidxu
ebdf401288 Introduce kernel based userland rwlock. Each umtx chain now has two lists,
one for readers and one for writers, other types of synchronization
object just use first list.

Asked by: jeff
2008-04-02 04:08:37 +00:00
emaste
40129db550 Calling RequestSupplementAdapterInfo before RequestAdapterInfo appears
to trip a bug causing the latter to return a zeroed struct
aac_adapter_info.  This causes two issues.  One is cosmetic only --
a verbose boot prints information about the controller, and shows all
zero:

aac0: Unknown processor 0MHz, 0MB memory (0MB cache, 0MB execution),
  unknown battery platform

The second problem is that the firmware version information is stored
away for aac_rev_check, for userland tools (like aaccli) to query via
the FSACTL_MINIPORT_REV_CHECK and FSACTL_LNX_MINIPORT_REV_CHECK ioctls.
When aaccli encounters this issue it prints

Command Error: <The current AFAAPI.DLL is too old to work with the
  current controller software.>

Move the RequestSupplementAdapterInfo call after RequestAdapterInfo,
which seems to fix both problems.
2008-04-01 20:53:32 +00:00
attilio
e00250d54d Bump __FreeBSD_version in order to reflect rw_try_rlock() and
rw_try_wlock() functions introduction.
2008-04-01 20:33:06 +00:00
attilio
672b78c87b Add rw_try_rlock() and rw_try_wlock() to rwlocks.
These functions try the specified operation (rlocking and wlocking) and
true is returned if the operation completes, false otherwise.

The KPI is enriched by this commit, so __FreeBSD_version bumping and
manpage updating will happen soon.

Requested by:	jeff, kris
2008-04-01 20:31:55 +00:00
dfr
60db59bdb1 Don't try to use an SX lock while holding the vnode interlock.
Sponsored by:	Isilon Systems
2008-04-01 16:07:01 +00:00
weongyo
426badf1fa Add malo driver to the build
Approved by:	thompsa (mentor)
2008-04-01 01:55:19 +00:00
weongyo
e31d8fea3c remove warnings for 64bit aware platforms.
Approved by:	thompsa (mentor)
2008-04-01 01:48:08 +00:00
scottl
f69578dd1e The MPT driver treats the "core" module with the same importance and
abstraction as the RAID and CAM modules, making it nearly impossible
for enough initialization to be done in time for the RAID module to
know whether to attach.  On top of this, no reset was being done on
the controller on attach, in violation of the spec.  Additionally,
the port enable step was being deferred to the end of the attach
process, long after it should have been done to ensure reliable
operation from the controller.  Fix all of these with a few hacks
to force the "attach" and "enable" steps of the core module early
on, and ensure that a reset and port enable also happens early on.
In the future, the driver needs to be refactored to eliminate the
core module abstraction, clean up withe reset/enable steps, and
defer event messages until all of the modules are available to
recieve them.
2008-03-31 21:54:05 +00:00
kmacy
15067326c7 reduce the size of the jumbo ring on i386 and disable pcpu cluster caching 2008-03-31 21:02:27 +00:00
sam
f6ff14fe1e add include path required to find ah_osdep.h
PR:		kern/122145
MFC after:	3 days
2008-03-31 18:49:09 +00:00
kib
2ad0eb2d91 Add the libc glue and headers definitions for the *at() syscalls.
Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:14:04 +00:00
kib
5c017b360f Regen 2008-03-31 12:12:27 +00:00
kib
7a1c49c4b3 Add the freebsd32 compatibility shims for the *at() syscalls.
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:08:30 +00:00
kib
6687cc3940 Add the openat(), fexecve() and other *at() syscalls to the table.
Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:06:55 +00:00
kib
78facfe99b Implement the fexecve(2) syscall.
Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:05:52 +00:00
kib
e0eb2de7e9 Implement the
openat(2), faccessat(2), fchmodat(2), fchownat(2), fstatat(2),
	futimesat(2), linkat(2), mkdirat(2), mkfifoat(2), mknodat(2),
	readlinkat(2), renameat(2), symlinkat(2)
syscalls.

Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:04:20 +00:00
kib
eff8c6d35e Add the support for the AT_FDCWD and fd-relative name lookups to the
namei(9).

Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:01:21 +00:00
kib
fb67926ebb Add the support for the O_EXEC open(2) mode, as specified by the
POSIX Extended API Set Part 2 extension specification.

Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 11:57:18 +00:00
kib
9ab7de877c Add the constant definition needed by the implementation of the
openat() and the related syscalls.

Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 11:55:10 +00:00
kib
fe816f911f Add the utility function vn_commname() to retrieve the command name
from the vfs namecache, when available.

Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 11:53:03 +00:00
jeff
1968343329 - Since rev 1.142 of ffs_snapshot.c the interlock has not been required
to protect the v_lock pointer.  Removing the interlock acquisition
   here allows vn_lock() to proceed without requiring the interlock
   at all.
 - If the lock mutated while we were sleeping on it the interlock has
   been dropped.  It is conceivable that the upper layer code was
   relying on the interlock and LK_NOWAIT to protect the identity or
   state of the vnode while acquiring the lock.  In this case return
   EBUSY rather than trying the new lock to prevent potential races.

Reviewed by:	tegge
2008-03-31 07:55:45 +00:00
jeff
5e4f326d87 - Don't free snapdata structures when they are no longer in use.
Keeping the lockmgr lock valid allows us to switch the v_lock pointer
   in snapshot vnodes between the embedded lockmgr lock and snapdata
   lock without needing the vnode interlock to protect against races
 - Keep unused snapdata structures in a list.
 - Add a function to lock the devvp and allocate a snapdata to it or
   acquire a new one without races.  The old function was safe from
   creation races because we set the mount flag when creating snapshots
   and thus serializing them.  However, it might have been subject to
   destroying races.

Reviewed by:	tegge
2008-03-31 07:47:08 +00:00
yongari
55186b0e36 Padding more bytes than necessary one broke another variants of
PCIe RealTek chips. Only pad IP packets if the payload is less than
28 bytes.

Obtained from:	NetBSD
PR:		kern/122221
2008-03-31 04:03:14 +00:00
marcel
f4a93f0828 Better implement I-cache invalidation. The previous implementation
was a kluge. This implementation matches the behaviour on powerpc
and sparc64.
While on the subject, make sure to invalidate the I-cache after
loading a kernel module.

MFC after: 2 weeks
2008-03-30 23:09:14 +00:00
alc
a4bcbb3771 Eliminate an unnecessary printf() from kmem_suballoc(). The subsequent
panic() can be extended to convey the same information.
2008-03-30 20:08:59 +00:00
attilio
2e7c0d4d53 lockmgrs need to be released before to be destroyed and draining doesn't
make an exception.
Add correct stub for it.

Reviewed by:	rwatson
2008-03-30 18:16:33 +00:00
jeff
c2c4476b86 - Consistently return EDEADLK when presented with a new set that is
incompatible with existing bindings.
 - Try to copyout the setid in cpuset() before migrating the proc to the
   setid in case the user has supplied a bad buffer.
 - Rename cpuset_root() and cpuset_base() to cpuset_ref{root,base} to
   be more descriptive and free cpuset_root to be used as a different
   type of symbol.
 - Make cpuset_root the cpuset_t set of all cpus in the system.  This
   should contain the same bitmask as all_cpus presently.
 - Add a CPU_CMP() macro to compare two sets.
2008-03-30 11:31:14 +00:00
mav
19447c2a3e - Account all node stats at the shape mode.
- Do not check destination hook presence, it will be done by netgraph.
- Use u_int instead of int in some places to simplify type conversions.
- Use NG_SEND_DATA_ONLY() macro instead of selfmade equivalent.
2008-03-30 07:53:51 +00:00
mav
3847b38de2 Use new atomic_fetchadd() primitive instead of looping atomic_cmpset(). 2008-03-30 00:27:48 +00:00
jeff
ef1b5135a9 - Don't allow calls to vn_lock() with no lock type requested. Callers
which simply want a reference should use vref().  Callers which want
   to check validity need to hold a lock while performing any action
   based on that validity.  vn_lock() would always release the interlock
   before returning making any action synchronous with the validity check
   impossible.
2008-03-29 23:36:26 +00:00
jeff
26629f5add - Use vget() to lock the vnode rather than refing without a lock and
locking in separate steps.
2008-03-29 23:30:40 +00:00
jeff
30e699e885 - Simplify null_hashget() and null_hashins() by using vref() rather
than a complex series of steps involving vget() without a lock type
   to emulate the same thing.
2008-03-29 23:24:54 +00:00
mav
dd8be7d6b9 There is no need to erase hook->hk_node before freing hook. 2008-03-29 22:53:58 +00:00
marcel
5b21e055a6 Change the order from SI_ORDER_FIRST to SI_ORDER_ANY (within
SI_SUB_DRIVERS) to avoid loading schemes before all the GEOM
classes have been loaded and initialized. Otherwise we may
end up using mutexes that haven't been initialized (due to
g_retaste() posting an event).
2008-03-29 17:33:29 +00:00
jeff
43963c5cfa - Use vm_object_reference_locked() directly from
vm_object_reference().  This is intended to get rid of vget()
   consumers who don't wish to acquire a lock.  This is functionally
   the same as calling vref(). vm_object_reference_locked() already
   uses vref.

Discussed with:	alc
2008-03-29 07:06:13 +00:00
alc
fe959556c5 Eliminate an #if 0/#endif that was unintentionally introduced
by the previous revision.
2008-03-29 04:29:50 +00:00
mlaier
5cb64aae63 Make ALTQ cope with disappearing interfaces (particularly common with mpd
and netgraph in gernal).  This also allows to add queues for an interface
that is not yet existing (you have to provide the bandwidth for the
interface, however).

PR:		kern/106400, kern/117827
MFC after:	2 weeks
2008-03-29 00:24:36 +00:00
emaste
41514132ee Implement FSACTL_LNX_GET_FEATURES and FSACTL_GET_FEATURES ioctls. RAID
tools (e.g. arcconf) need this to be able to create arrays larger than 2TB.

Submitted by: Adaptec, via driver build 15317
2008-03-28 19:07:25 +00:00
brueffer
d549733458 Add a couple of missing NIC driver modules.
Approved by:	rwatson (mentor)
MFC after:	3 days
2008-03-28 18:13:09 +00:00
marcel
4545b45cdc Add support for PC-9800 partition tables. 2008-03-28 17:58:55 +00:00
emaste
664c6db74b If we're returning successfully from bus_dmamem_alloc, don't record a KTR
of error = ENOMEM.
2008-03-28 15:28:20 +00:00
rpaulo
4e702aed16 Add Qualcomm ZTE CMDMA MSM modem to the list of supported modems.
MFC after:   1 week
2008-03-28 14:20:06 +00:00
attilio
b6fa73a781 Bump __FreeBSD_version in order to reflect BUF_LOCKWAITERS() reintegration
and lockmgr_waiters() introduction.
2008-03-28 12:31:26 +00:00
attilio
7e107a0c8c b_waiters cannot be adequately protected by the interlock because it is
dropped after the call to lockmgr() so just revert this approach using
something similar to the precedent one:
BUF_LOCKWAITERS() just checks if there are waiters (not the actual number
of them) and it is based on newly introduced lockmgr_waiters() which
returns if the lockmgr has waiters or not. The name has been choosen
differently by old lockwaiters() in order to not confuse them.

KPI results enriched by this commit so __FreeBSD_version bumping and
manpage update will be happening soon.
'struct buf' also changes, so kernel ABI is disturbed.

Bug found by:	jeff
Approved by:	jeff, kib
2008-03-28 12:30:12 +00:00
dfr
3d7f9d1b14 Minor changes to improve compatibility with older FreeBSD releases. 2008-03-28 09:50:32 +00:00
brooks
5e4993bfe3 Use ; instead of : to end a line.
Submitted by:	Niclas Zeising <niclas dot zeising at gmail dot com>
2008-03-28 08:19:03 +00:00
marcel
dd866faa70 When retasting, wither any existing GEOMs of the same class. This
allows the class to create a different GEOM for the same provider
as well as avoid that we end up with multiple GEOMs of the same
class with the same name.

For example, when a disk contains a PC98 partition table but
only MBR is supported, then the partition table can be treated
as a MBR. If support for PC98 is later loaded as a module, the
MBR scheme is pre-empted for the PC98 scheme as expected.
2008-03-28 06:31:12 +00:00
ps
41d5b26ff8 Add support to mincore for detecting whether a page is part of a
"super" page or not.

Reviewed by:	alc, ups
2008-03-28 04:29:27 +00:00
attilio
f6c880cb2d _lockmgr_args() accepts a 'char *' string as file, so modify _BUF_LOCK()
and _BUF_TIMELOCK() prototypes accordingly with this.
2008-03-28 02:48:16 +00:00
yongari
e8dec714c1 In revision 1.70, 1.71 and 1.84 re(4) tried to workaround checksum
offload bugs by manual padding for short IP/UDP frames. Unfortunately
it seems that these workaround does not work reliably on newer PCIe
variants of RealTek chips.

To workaround the hardware bug, always pad short frames if Tx IP
checksum offload is requested. It seems that the hardware has a
bug in IP checksum offload handling. NetBSD manually pads short
frames only when the length of IP frame is less than 28 bytes but I
chose 60 bytes to safety. Also unconditionally set IP checksum
offload bit in Tx descriptor if any TCP or UDP checksum offload is
requested. This is the same way as Linux does but it's not
mentioned in data sheet.

Obtained from:	NetBSD
Tested by:	remko, danger
2008-03-28 01:21:21 +00:00
jb
cee2462540 Remove the last 3 files I missed. These have been repo copied to the new
location under a cddl part of the tree following the core@ license review.
2008-03-28 00:28:45 +00:00
attilio
67b6d2277a Instruments buffer lock objects in order to track correctly consumers
consumers in locking operations.
While here, operates some style(9) cleanups.
2008-03-28 00:14:33 +00:00
jb
291b24b755 Remove files that have been repo copied to their new location
in cddl-specific parts of the source tree.
2008-03-28 00:08:47 +00:00
jb
5794ada908 The sources covered by Sun's CDDL have been repo copied below the
src/cddl and src/sys/cddl directories per the core@ decision following
the license review.

This change modifies the affected Makefiles to reference the sources
in their new location.
2008-03-27 23:21:25 +00:00
mav
586e0246eb Remove ng_setisr() call from ng_dequeue(). It is useless as we any way
will never exit ngintr(), while there is some ready requests on the queue.
It was made years ago with hope of parallel queue processing by several
net threads. But even if we have several threads sometimes, we have no
rights to process queue in parallel as it will break original requests
serialization that is critically important for some setups.
2008-03-27 23:02:30 +00:00
antoine
162471f2fb Remove option headers that do not exist and are not used
from the Makefiles in sys/modules.
(opt_devfs.h, opt_bdg.h, opt_emu10kx.h and opt_uslcom.h)

Approved by:	rwatson (mentor)
2008-03-27 20:38:03 +00:00
mav
fd0bef772c Switch from timeval to bintime, to use 1/(2^20) of seconds instead of
microseconds. It allows to use bit shifts instead of some heavy 64bit
mul/div math operations.
2008-03-27 20:04:20 +00:00
iedowse
8b81d719e1 Add IFF_NEEDSGIANT to IFF_CANTCHANGE, to prevent user-level code
from clearing the IFF_NEEDSGIANT flag on Giant-locked interfaces.
In particular, wpa_supplicant was doing this on USB interfaces,
causing panics when Giant-locked code was then called without Giant.

Submitted by:	Alexey Popov
Reviewed by:	rwatson
MFC after:	3 days
2008-03-27 18:02:30 +00:00
dfr
c26f064bd4 Add nfslockd and krpc modules. 2008-03-27 11:55:03 +00:00
dfr
dc98ee4196 Add kernel module support for nfslockd and krpc. Use the module system
to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k'
option to rpc.lockd and make kernel NLM the default. A user can still
force the use of the old user NLM by building a kernel without NFSLOCKD
and/or removing the nfslockd.ko module.
2008-03-27 11:54:20 +00:00
jb
34e730ca27 When building a kernel module, define MAXCPU the same as SMP so
that modules work with and without SMP.
2008-03-27 05:03:26 +00:00
alc
2a244be094 MFamd64 with few changes:
1. Add support for automatic promotion of 4KB page mappings to 2MB page
   mappings.  Automatic promotion can be enabled by setting the tunable
   "vm.pmap.pg_ps_enabled" to a non-zero value.  By default, automatic
   promotion is disabled.  Tested by: kris

2. To date, we have assumed that the TLB will only set the PG_M bit in a
   PTE if that PTE has the PG_RW bit set.  However, this assumption does
   not hold on recent processors from Intel.  For example, consider a PTE
   that has the PG_RW bit set but the PG_M bit clear.  Suppose this PTE
   is cached in the TLB and later the PG_RW bit is cleared in the PTE,
   but the corresponding TLB entry is not (yet) invalidated.
   Historically, upon a write access using this (stale) TLB entry, the
   TLB would observe that the PG_RW bit had been cleared and initiate a
   page fault, aborting the setting of the PG_M bit in the PTE.  Now,
   however, P4- and Core2-family processors will set the PG_M bit before
   observing that the PG_RW bit is clear and initiating a page fault.  In
   other words, the write does not occur but the PG_M bit is still set.

   The real impact of this difference is not that great.  Specifically,
   we should no longer assert that any PTE with the PG_M bit set must
   also have the PG_RW bit set, and we should ignore the state of the
   PG_M bit unless the PG_RW bit is set.
2008-03-27 04:34:17 +00:00
jb
735643909e Regen after makesyscalls.sh change. 2008-03-27 01:55:06 +00:00
jb
4a6deb0614 Generate another function for the DTrace syscall provider to specify
the syscall argument types.

This code is only compiled into the systrace kernel modul and has no
effect otherwise.
2008-03-27 01:53:44 +00:00
attilio
b94ea1b2e3 Really, smb_iod_main() is not totally MPSAFE, so just acquire and drop
Giant around it in order to assume MPSAFETY.

Reported by:	jhb, rwatson
Pointy hat to:	attilio
2008-03-27 01:23:59 +00:00
phk
c763b22a79 Back in the good old days, PC's had random pieces of rock for
frequency generation and what frequency the generated was anyones
guess.

In general the 32.768kHz RTC clock x-tal was the best, because that
was a regular wrist-watch Xtal, whereas the X-tal generating the
ISA bus frequency was much lower quality, often costing as much as
several cents a piece, so it made good sense to check the ISA bus
frequency against the RTC clock.

The other relevant property of those machines, is that they
typically had no more than 16MB RAM.

These days, CPU chips croak if their clocks are not tightly within
specs and all necessary frequencies are derived from the master
crystal by means if PLL's.

Considering that it takes on average 1.5 second to calibrate the
frequency of the i8254 counter, that more likely than not, we will
not actually use the result of the calibration, and as the final
clincher, we seldom use the i8254 for anything besides BEL in
syscons anyway, it has become time to drop the calibration code.

If you need to tell the system what frequency your i8254 runs,
you can do so from the loader using hw.i8254.freq or using the
sysctl kern.timecounter.tc.i8254.frequency.
2008-03-26 22:12:00 +00:00
phk
f5d8b74690 Further cleanup of sound generation in syscons:
The timer_spkr_*() functions take care of the enabling/disabling
of the speaker.

Test on the existence of timer_spkr_*() functions, rather than
architectures.
2008-03-26 22:02:51 +00:00
phk
3cbe36127b Make speaker a pseudo device driver instead of attaching to a PnP id.
If somebody cleaned this code up to proper style(9), it could become
a great educational starting point for aspiring kernel hackers.
2008-03-26 21:33:41 +00:00
rwatson
61a4ef5ea0 Add a comment explaining that we initialize the 'a' buffer for
zero-copy to the store buffer position on the BPF descriptor,
and the 'b' buffer as the free buffer in order to fill them in
the order documented in bpf(4).

MFC after:	4 months
Suggested by:	csjp
2008-03-26 21:29:13 +00:00
mav
5b9ac353f2 Some minor code and math optimizations. 2008-03-26 21:19:03 +00:00
jhb
20cadd93f0 Fix a nit with the 'nofoo' options where 'foo' is mapped to 'nonofoo'
(such as 'atime' vs 'noatime').  The filesystems will always see either
'nofoo' or 'nonofoo', never plain 'foo'.  As such, their list of valid
mount options should include 'nofoo' instead of 'foo'.  With this fix,
you can do 'mount -u -o atime' on a FFS filesystem that isn't marked as
noatime without getting an error.  You can also update a noatime FFS
filesystem mounted via mount(2) (e.g. 6.x /sbin/mount binary) to 'atime'
using nmount(2) (e.g. 7.x /sbin/mount binary).

MFC after:	1 week
Reviewed by:	crodig
2008-03-26 20:48:07 +00:00
phk
259c5e1579 Remove two variables which are handled MI now. 2008-03-26 20:28:52 +00:00
phk
168398fe50 Eliminate unnecessary #includes 2008-03-26 20:26:12 +00:00
phk
fa71439e44 The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.

The new (optional) MD functions are:
	timer_spkr_acquire()
	timer_spkr_release()
and
	timer_spkr_setfreq()

the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.

Drop entirely timer2 on pc98, it is not used anywhere at all.

Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.

Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.

This eliminate the need for the speaker driver to know about
i8254frequency at all.  In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.

Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.

In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that.  It's probably not important.

Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.

This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].
2008-03-26 20:09:21 +00:00
dfr
7ce50c542d Bump __FreeBSD_version for the addition of 'l_sysid' to the flock structure. 2008-03-26 15:41:00 +00:00