clearing MSI enable bit for MSI capable hardwares resulted in Tx
problems. MSI enable bit is set only when MSI is requested from
user.
Tested by: remko
(i.e. fixed delivery) to SAPIC_DELMODE_LOWPRI. While the commit
log doesn't mention the change in behaviour, it is believed to be
deliberate. In the last 5.5 years this hasn't been a problem. Nor
do I think did it make any difference, but who knows. However, I
do know that it break SMP support for Montecito-based machines.
Switch back to fixed-CPU delivery so that SMP works again. This
gives me some time to look more closely at the problem, as well
as make sure the I-cache validation as it's implemented currently
is sufficient in SMP configurations...
mips32r2 and mips64r2 (and close relatives) processors. There
presently is support for ADMtek ADM5120, A mips 4Kc in a malta board,
the RB533 routerboard (based on IDT RC32434) and some preliminary
support for sibtye/broadcom designs. Other hardware support will be
forthcomcing.
This port boots multiuser under gxemul emulating the malta board and
also bootstraps on the hardware whose support is forthcoming...
Oleksandr Tymoshenko, Wojciech Koszek, Warner Losh, Olivier Houchard,
Randall Stewert and others that have contributed to the mips2 and/or
mips2-jnpr perforce branches. Juniper contirbuted a generic mips port
late in the life cycle of the misp2 branch. Warner Losh merged the
mips2 and Juniper code bases, and others list above have worked for
the past several months to get to multiuser.
In addition, the mips2 work owe a debt to the trail blazing efforts of
the original mips branch in perforce done by Juli Mallett.
mips32r2 and mips64r2 (and close relatives) processors. There
presently is support for ADMtek ADM5120, A mips 4Kc in a malta board,
the RB533 routerboard (based on IDT RC32434) and some preliminary
support for sibtye/broadcom designs. Other hardware support will be
forthcomcing.
This port boots multiuser under gxemul emulating the malta board and
also bootstraps on the hardware whose support is forthcoming...
Oleksandr Tymoshenko, Wojciech Koszek, Warner Losh, Olivier Houchard,
Randall Stewert and others that have contributed to the mips2 and/or
mips2-jnpr perforce branches. Juniper contirbuted a generic mips port
late in the life cycle of the misp2 branch. Warner Losh merged the
mips2 and Juniper code bases, and others list above have worked for
the past several months to get to multiuser.
In addition, the mips2 work owe a debt to the trail blazing efforts of
the original mips branch in perforce done by Juli Mallett.
mips32r2 and mips64r2 (and close relatives) processors. There
presently is support for ADMtek ADM5120, A mips 4Kc in a malta board,
the RB533 routerboard (based on IDT RC32434) and some preliminary
support for sibtye/broadcom designs. Other hardware support will be
forthcomcing.
This port boots multiuser under gxemul emulating the malta board and
also bootstraps on the hardware whose support is forthcoming...
Oleksandr Tymoshenko, Wojciech Koszek, Warner Losh, Olivier Houchard,
Randall Stewert and others that have contributed to the mips2 and/or
mips2-jnpr perforce branches. Juniper contirbuted a generic mips port
late in the life cycle of the misp2 branch. Warner Losh merged the
mips2 and Juniper code bases, and others list above have worked for
the past several months to get to multiuser.
In addition, the mips2 work owe a debt to the trail blazing efforts of
the original mips branch in perforce done by Juli Mallett.
merged juniper and mips2 code base. This represents the work of
Juniper Engineers, plus Oleksandr Tymoshenko, Wojciech Koszek, Warner
Losh, Olivier Houchard, Randall Stewert and others that have
contributed to the mips2 and/or mips2-jnpr perforce branches.
The original code from KAME did not take care of address
aliases or multiple ip addresses that have the same
prefix.
Reviewed by: rwatson, gnn, sam, kmacy, julian
(ECMP) for both IPv4 and IPv6. Previously, multipath route insertion
is disallowed. For example,
route add -net 192.103.54.0/24 10.9.44.1
route add -net 192.103.54.0/24 10.9.44.2
The second route insertion will trigger an error message of
"add net 192.103.54.0/24: gateway 10.2.5.2: route already in table"
Multiple default routes can also be inserted. Here is the netstat
output:
default 10.2.5.1 UGS 0 3074 bge0 =>
default 10.2.5.2 UGS 0 0 bge0
When multipath routes exist, the "route delete" command requires
a specific gateway to be specified or else an error message would
be displayed. For example,
route delete default
would fail and trigger the following error message:
"route: writing to routing socket: No such process"
"delete net default: not in table"
On the other hand,
route delete default 10.2.5.2
would be successful: "delete net default: gateway 10.2.5.2"
One does not have to specify a gateway if there is only a single
route for a particular destination.
I need to perform more testings on address aliases and multiple
interfaces that have the same IP prefixes. This patch as it
stands today is not yet ready for prime time. Therefore, the ECMP
code fragments are fully guarded by the RADIX_MPATH macro.
Include the "options RADIX_MPATH" in the kernel configuration
to enable this feature.
Reviewed by: robert, sam, gnn, julian, kmacy
public namespace for WITNESS as they are only used internally so just
move them in the private namespace for the subsystem (with all related
supporting definitions).
Make clock_if.m and subr_rtc.c standard on i386
Add hints for "atrtc" driver, for non-PnP, non-ACPI systems.
NB: Make sure to install GENERIC.hints into /boot/device.hints in these!
Nuke MD inittodr(), resettodr() functions.
Don't attach to PHP0B00 in the "attimer" dummy driver any more, and remove
comments that no longer apply for that reason.
Add new "atrtc" device driver, which handles IBM PC AT Real Time
Clock compatible devices using subr_rtc and clock_if.
This driver is not entirely clean: other code still fondles the
hardware to get a statclock interrupt on non-ACPI timer systems.
Wrap some overly long lines.
After it has settled in -current, this will be ported to amd64.
Technically this is MFC'able, but I fail to see a good reason.
under bootverbose.
Struct ct is used for setting/reading real time clocks and I'm about
to Do Things to some of those, so a bit of preemptive debugging is
in order.
Remove a pointless __inline.
the only one difference is that lockmgr*() functions now accept
LK_NOWITNESS flag which skips ordering for the instanced calling.
- Remove an unuseful stub in witness_checkorder() (because the above check
doesn't allow ever happening) and allow witness_upgrade() to accept
non-try operation too.
- Fix speaker issues with Dell Vostro 1500 (GPIO0)
Tested by: John Wright <jwright.gmail.com>
- Apply ridiculous quirk on Asus A8X series (A8JC, A8M, A8xx, etc). These
different laptop series share simmilar pci id, hardware codecs, etc.
but works differently. A slight difference in connection type for
widget #26 is used to differentiate it.
Tested by: eric baumbach <embaumbach.gmail.com>
- Apply GPIO0 quirk for ASUS G2K laptop
- Sort ASUS ids accordingly.
Submitted by: jkim
MFC after: 3 days
TX traffic to sit in the send chain until a received packet kick
started the interrupt handler. This would cause extremely slow
performance when used with NFS over UDP.
- Removed untested polling code.
- Updated copyright year in the file header.
- Removed inadvertent ^M's created by DOS text editor.
MFC after: 2 weeks
be handled by chn_abort() and chn_start() alone. This should fix
few issues with single duplex hardware (mostly) or pre virtual record
(RELENG 6) under WINE emulation and possibly others that using
SNDCTL_DSP_SETTRIGGER.
MFC after: 3 days
The problem is that the PM support is part of a much larger WIP here, but due to popular demand I decided to get some of it imported.
Also I forgot the mention:
HW sponsored by: Vitsch Electronics / VEHosting
may be held for the duration of the various dirhash operations which
avoids many complex unlock/lock/revalidate sequences.
- Permit shared locks on lookup. To protect the ip->i_dirhash pointer we
use the vnode interlock in the shared case. Callers holding the
exclusive vnode lock can run without fear of concurrent modification to
i_dirhash.
- Hold an exclusive dirhash lock when creating the dirhash structure for
the first time or when re-creating a dirhash structure which has been
recycled.
Tested by: kris, pho
indexes so directory lookup becomes shared lock safe. In the modifying
cases an exclusive lock is held here so the commit routine may
rely on the state of i_offset.
- Similarly handle i_diroff by fetching at the start and setting only once
the operation is complete. Without the exclusive lock these are only
considered hints.
- Assert that an exclusive lock is held when we're preparing for a commit
routine.
- Honor the lock type request from lookup instead of always using exclusive
locking.
Tested by: pho, kris
I've taken a slightly different approach than is used with the ICH8 controllers
in that each controller is not identified individually (eg USB A, USB B, etc).
Instead I've given then same description to each one even though the device ID
differs. This can easily be changed if desired, or ICH8 (and any others using
that approach) can be made to work as this does.
lookup hard interrupt events by number. Ignore the irq# for soft intrs.
- Add support to cpuset for binding hardware interrupts. This has the
side effect of binding any ithread associated with the hard interrupt.
As per restrictions imposed by MD code we can only bind interrupts to
a single cpu presently. Interrupts can be 'unbound' by binding them
to all cpus.
Reviewed by: jhb
Sponsored by: Nokia
2/4MB page from a PDE. Specifically, change it to use PG_PS_FRAME,
not PG_FRAME, to extract the physical address of a 2/4MB page from a
PDE.
Change the last argument passed to pmap_pv_insert_pde() from a
vm_page_t representing the first 4KB page of a 2/4MB page to the
vm_paddr_t of the 2/4MB page. This avoids an otherwise unnecessary
conversion from a vm_paddr_t to a vm_page_t in pmap_copy().
Support is working on the Silicon Image SiI3124/3132.
Support is working on some AHCI chips but far from all.
Remember this is WIP, so test reports and (constructive) suggestions are welcome!
received frame under certain conditions. wpaul said the length
0xfff0 is special meaning that indicates hardware is in the
process of copying a packet into host memory. But it seems
there are other cases that hardware is busy or stuck in bad
situation even if the received frame length is not 0xfff0.
To work-around this condition, add a check that verifys that
recevied frame length is in valid range. If received length is out
of range reinitialize hardware to recover from stuck condition.
Reported by: Mike Tancsa ( mike AT sentex DOT net )
Tested by: Mike Tancsa
Obtained from: OpenBSD
MFC after: 1 week
no longer needed, but for now we still want to be consistent with other
similar checks in the tree.
- Call ASSERT_VOP_ELOCKED() only when vget() returns 0.
Reviewed by: jeff
o create a private task queue thread that sets up root and current
directories (hooking mountroot event as needed); this is necessary
because task queue threads are parented from proc0 and it does not
have a reference to rootvnode (lost when / mounting moved to init)
o bounce image load + unload requests through the private task q so
we can load images even when the request is made from a thread that
does not have sufficient context (e.g. task q thread)
o add a check in the task q thread to fail requests before root is
mounted (just in case)
Reviewed by: jhb, mlaier, luigi (glance)
MFC after: 1 month
and linux_openat(). Instead just pass AT_FDCWD into linux_common_open()
for the linux_open() case. This prevents passing -1 as a dirfd to
openat() from succeeding which is wrong.
Suggested by: rwatson, kib
Approved by: kib (mentor)
ICMP unreach, frag needed. Up to now we only looked at the
interface MTU. Make sure to only use the minimum of the two.
In case IPSEC is compiled in, loop the mtu through ip_ipsec_mtu()
to avoid any further conditional maths.
Without this, PMTU was broken in those cases when there was a
route with a lower MTU than the MTU of the outgoing interface.
PR: kern/122338
Tested by: Mark Cammidge mark peralex.com
Reviewed by: silence on net@
MFC after: 2 weeks
so that all implemented variants have proper prototypes. The 8-bit,
16-bit and 64-bit variants are not implemented.
This really fixes the current build breakages caused by type casting
and struct aliasing rules.
commands can be written to /dev/psm%d and status can be read back from it.
- Reflect the change in psm(4) and bump version for ports.
MFC after: 1 week
Because of this we were not getting further interrupts for link state
changes, thus never went into iface UP state and thus could not transmit.
The only way out of this was an incoming packet generating an rx interrupt
and making us call into bge_link_upd.
Up to rev. 1.101, in bge_start_locked, we only returned instantly
if there was 'no link AND nothing queued for tx'. So with a packet queued
for tx, we hit the register scrubbing at the end of bge_start_locked
and were out fine. We simply lost a packet or two but got the interrupts
need to get into UP state.
With rev. 1.102 this was turned into 'if there is no link OR there is
nothing to send' (correct behaviour) and as long as there is no link
we never hit the register scrubbing and consequently never got the link UP.
What we do now is force an interrupt at the end of bge_ifmedia_upd_locked
so we will call bge_link_upd, clear the link state attention and get
further interrupts.
This helps to get the iface UP on an idle network or at least to get
it UP faster not depending on an rx intr anymore.
In case you could not get a DHCP lease or it took very long,
it was because of this.
It is unknown which chips are affected by this. ASIC rev. 0x2003 was the
most popular trouble candidate.
At least the fiber cards should have been working fine.
Which register to scrub is currently under discussion. The comitted
solution was tested and found to work for a lot of setups. It might
not help with MSI.
The reason why we end up in such a situation is entirely unknown.
PR: kern/111804
Tested by: phk, scottl at Y!
MFC after: 14 days
was changed in rev. 1.161 of tcp_var.h. All option now test for sufficient
space in TCP header before getting added.
Reported by: Mark Atkinson <atkin901-at-yahoo.com>
Tested by: Mark Atkinson <atkin901-at-yahoo.com>
MFC after: 1 week
bit in order to allow per-bit checks on the options flag, in particular
in the consumers code [1]
- Re-enable the check against TDP_DEADLKTREAT as the anti-waiters
starvation patch allows exclusive waiters to override new shared
requests.
[1] Requested by: pjd, jeff
buffer kernel descriptors, which is used to allow the buffer
currently in the BPF "store" position to be assigned to userspace
when it fills, even if userspace hasn't acknowledged the buffer
in the "hold" position yet. To implement this, notify the buffer
model when a buffer becomes full, and check that the store buffer
is writable, not just for it being full, before trying to append
new packet data. Shared memory buffers will be assigned to
userspace at most once per fill, be it in the store or in the
hold position.
This removes the restriction that at most one shared memory can
by owned by userspace, reducing the chances that userspace will
need to call select() after acknowledging one buffer in order to
wait for the next buffer when under high load. This more fully
realizes the goal of zero system calls in order to process a
high-speed packet stream from BPF.
Update bpf.4 to reflect that both buffers may be owned by userspace
at once; caution against assuming this.
state transitioning flags and of msleep(9) callings.
Use, instead, an algorithm very similar to what sx(9) and rwlock(9)
alredy do and direct accesses to the sleepqueue(9) primitive.
In order to avoid writer starvation a mechanism very similar to what
rwlock(9) uses now is implemented, with the correspective per-thread
shared lockmgrs counter.
This patch also adds 2 new functions to lockmgr KPI: lockmgr_rw() and
lockmgr_args_rw(). These two are like the 2 "normal" versions, but they
both accept a rwlock as interlock. In order to realize this, the general
lockmgr manager function "__lockmgr_args()" has been implemented through
the generic lock layer. It supports all the blocking primitives, but
currently only these 2 mappers live.
The patch drops the support for WITNESS atm, but it will be probabilly
added soon. Also, there is a little race in the draining code which is
also present in the current CVS stock implementation: if some sharers,
once they wakeup, are in the runqueue they can contend the lock with
the exclusive drainer. This is hard to be fixed but the now committed
code mitigate this issue a lot better than the (past) CVS version.
In addition assertive KA_HELD and KA_UNHELD have been made mute
assertions because they are dangerous and they will be nomore supported
soon.
In order to avoid namespace pollution, stack.h is splitted into two
parts: one which includes only the "struct stack" definition (_stack.h)
and one defining the KPI. In this way, newly added _lockmgr.h can
just include _stack.h.
Kernel ABI results heavilly changed by this commit (the now committed
version of "struct lock" is a lot smaller than the previous one) and
KPI results broken by lockmgr_rw() / lockmgr_args_rw() introduction,
so manpages and __FreeBSD_version will be updated accordingly.
Tested by: kris, pho, jeff, danger
Reviewed by: jeff
Sponsored by: Google, Summer of Code program 2007
contigmalloc(9) as a last resort to steal pages from an inactive,
partially-used superpage reservation.
Rename vm_reserv_reclaim() to vm_reserv_reclaim_inactive() and
refactor it so that a separate subroutine is responsible for breaking
the selected reservation. This subroutine is also used by
vm_reserv_reclaim_contig().
allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt
code.
- Rename the intr_event 'eoi', 'disable', and 'enable' hooks to
'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric.
Also, add a comment describe what the MI code expects them to do.
- On amd64, i386, and powerpc this is effectively a NOP.
- On arm, don't bother masking the interrupt unless the ithread is
scheduled in the non-INTR_FILTER case to match what INTR_FILTER did.
Also, don't bother unmasking the interrupt in the post_filter case if
we never masked it. The INTR_FILTER case had been doing this by having
arm_unmask_irq for the post_filter (formerly 'eoi') hook.
- On ia64, stray interrupts are now masked for the non-INTR_FILTER case.
They were already masked in the INTR_FILTER case.
- On sparc64, use the a NULL pre_ithread hook and use intr_enable_eoi() for
both the 'post_filter' and 'post_ithread' hooks to match what the
non-INTR_FILTER code did.
- On sun4v, retire the ithread wrapper hack by using an appropriate
'post_ithread' hook instead (it's what 'post_ithread'/'enable' was
designed to do even in 5.x).
Glanced at by: piso
Reviewed by: marius
Requested by: marius [1], [5]
Tested on: amd64, i386, arm, sparc64
part of detecting the media. Explicitly ensure that we don't send it to
bpf(4) as bpf(4) isn't setup yet. This worked by accident before the bpf
interface stuff was reworked to avoid other races (bpf_peers_present, etc.)
but now it needs an explicit check to avoid a panic.
MFC after: 3 days
PR: kern/120915
UMA_SLAB_KERNEL for consistency with its sibling UMA_SLAB_KMEM.
(UMA_SLAB_KMAP met its original demise in revision 1.30 of
vm/uma_core.c.) UMA_SLAB_KERNEL is now required by the jumbo frame
allocators. Without it, UMA cannot correctly return pages from the
jumbo frame zones to the VM system because it resets the pages' object
field to NULL instead of the kernel object. In more detail, the jumbo
frame zones are created with the option UMA_ZONE_REFCNT. This causes
UMA to overwrite the pages' object field with the address of the slab.
However, when UMA wants to release these pages, it doesn't know how to
restore the object field, so it sets it to NULL. This change teaches
UMA how to reset the object field to the kernel object.
Crashes reported by: kris
Fix tested by: kris
Fix discussed with: jeff
MFC after: 6 weeks
spinning when readers hold a lock. This spinning is speculative because,
unlike the write case, we can not test whether the owners are running.
- Add speculative read spinning for readers who are blocked by pending
writers while a read lock is still held. This allows the thread to
spin until the write lock succeeds after which it may spin until the
writer has released the lock. This prevents excessive context switches
when readers and writers both hold the lock for brief periods.
Sponsored by: Nokia
the fdesc_allocvp(). The caller of the fdesc_allocvp() expects that the
returned vnode is not reclaimed. Do lock the vnode exclusive and drop
the lock after.
Reported by: pho
Reviewed by: jeff
fixed pri boost with '1' or any priority less than the current thread's
priority with a value greater than two. Default the boost to
PRI_MIN_TIMESHARE to prevent regular user-space threads from starving
threads in the kernel. This prevents these user-threads from also
being scheduled as if they are high fixed-priority kernel threads.
- Restore the setting of lowpri in tdq_choose(). It has to be either here
or in sched_switch(). I accidentally removed it from both places.
Tested by: kris
do this either. Simply check P_NOLOAD. It'd be nice if this was
in a thread flag so we didn't have an extra cache miss every time we
add and remove a thread from the run-queue.
- Pull all the code to deal with the trampoline stuff into one
centeralized place and use it from everywhere.
- Some minor style tidiness
Reviewed by: tinguely
platform, so use the latter in preference to the former. This makes
the fake_preload setup be the same between kb920x_machdep.c and
avila_machdep.c....
and the igb driver static in the kernel. But it also reflects
some other bug fixes in my development stream at Intel.
PR 122373 is also fixed in this code.
- Move callout thread creation from kern_intr.c to kern_timeout.c
- Call callout_tick() on every processor via hardclock_cpu() rather than
inspecting callout internal details in kern_clock.c.
- Remove callout implementation details from callout.h
- Package up all of the global variables into a per-cpu callout structure.
- Start one thread per-cpu. Threads are not strictly bound. They prefer
to execute on the native cpu but may migrate temporarily if interrupts
are starving callout processing.
- Run all callouts by default in the thread for cpu0 to maintain current
ordering and concurrency guarantees. Many consumers may not properly
handle concurrent execution.
- The new callout_reset_on() api allows specifying a particular cpu to
execute the callout on. This may migrate a callout to a new cpu.
callout_reset() schedules on the last assigned cpu while
callout_reset_curcpu() schedules on the current cpu.
Reviewed by: phk
Sponsored by: Nokia
given pmap is never NULL, and therefore pmap_pml4e() can never return
NULL. The pervasive use of these inline functions throughout the pmap
makes these simple changes worthwhile.
to trip a bug causing the latter to return a zeroed struct
aac_adapter_info. This causes two issues. One is cosmetic only --
a verbose boot prints information about the controller, and shows all
zero:
aac0: Unknown processor 0MHz, 0MB memory (0MB cache, 0MB execution),
unknown battery platform
The second problem is that the firmware version information is stored
away for aac_rev_check, for userland tools (like aaccli) to query via
the FSACTL_MINIPORT_REV_CHECK and FSACTL_LNX_MINIPORT_REV_CHECK ioctls.
When aaccli encounters this issue it prints
Command Error: <The current AFAAPI.DLL is too old to work with the
current controller software.>
Move the RequestSupplementAdapterInfo call after RequestAdapterInfo,
which seems to fix both problems.
These functions try the specified operation (rlocking and wlocking) and
true is returned if the operation completes, false otherwise.
The KPI is enriched by this commit, so __FreeBSD_version bumping and
manpage updating will happen soon.
Requested by: jeff, kris
abstraction as the RAID and CAM modules, making it nearly impossible
for enough initialization to be done in time for the RAID module to
know whether to attach. On top of this, no reset was being done on
the controller on attach, in violation of the spec. Additionally,
the port enable step was being deferred to the end of the attach
process, long after it should have been done to ensure reliable
operation from the controller. Fix all of these with a few hacks
to force the "attach" and "enable" steps of the core module early
on, and ensure that a reset and port enable also happens early on.
In the future, the driver needs to be refactored to eliminate the
core module abstraction, clean up withe reset/enable steps, and
defer event messages until all of the modules are available to
recieve them.
openat(2), faccessat(2), fchmodat(2), fchownat(2), fstatat(2),
futimesat(2), linkat(2), mkdirat(2), mkfifoat(2), mknodat(2),
readlinkat(2), renameat(2), symlinkat(2)
syscalls.
Based on the submission by rdivacky,
sponsored by Google Summer of Code 2007
Reviewed by: rwatson, rdivacky
Tested by: pho
openat() and the related syscalls.
Based on the submission by rdivacky,
sponsored by Google Summer of Code 2007
Reviewed by: rwatson, rdivacky
Tested by: pho
to protect the v_lock pointer. Removing the interlock acquisition
here allows vn_lock() to proceed without requiring the interlock
at all.
- If the lock mutated while we were sleeping on it the interlock has
been dropped. It is conceivable that the upper layer code was
relying on the interlock and LK_NOWAIT to protect the identity or
state of the vnode while acquiring the lock. In this case return
EBUSY rather than trying the new lock to prevent potential races.
Reviewed by: tegge
Keeping the lockmgr lock valid allows us to switch the v_lock pointer
in snapshot vnodes between the embedded lockmgr lock and snapdata
lock without needing the vnode interlock to protect against races
- Keep unused snapdata structures in a list.
- Add a function to lock the devvp and allocate a snapdata to it or
acquire a new one without races. The old function was safe from
creation races because we set the mount flag when creating snapshots
and thus serializing them. However, it might have been subject to
destroying races.
Reviewed by: tegge
was a kluge. This implementation matches the behaviour on powerpc
and sparc64.
While on the subject, make sure to invalidate the I-cache after
loading a kernel module.
MFC after: 2 weeks
incompatible with existing bindings.
- Try to copyout the setid in cpuset() before migrating the proc to the
setid in case the user has supplied a bad buffer.
- Rename cpuset_root() and cpuset_base() to cpuset_ref{root,base} to
be more descriptive and free cpuset_root to be used as a different
type of symbol.
- Make cpuset_root the cpuset_t set of all cpus in the system. This
should contain the same bitmask as all_cpus presently.
- Add a CPU_CMP() macro to compare two sets.
- Do not check destination hook presence, it will be done by netgraph.
- Use u_int instead of int in some places to simplify type conversions.
- Use NG_SEND_DATA_ONLY() macro instead of selfmade equivalent.
which simply want a reference should use vref(). Callers which want
to check validity need to hold a lock while performing any action
based on that validity. vn_lock() would always release the interlock
before returning making any action synchronous with the validity check
impossible.
SI_SUB_DRIVERS) to avoid loading schemes before all the GEOM
classes have been loaded and initialized. Otherwise we may
end up using mutexes that haven't been initialized (due to
g_retaste() posting an event).
vm_object_reference(). This is intended to get rid of vget()
consumers who don't wish to acquire a lock. This is functionally
the same as calling vref(). vm_object_reference_locked() already
uses vref.
Discussed with: alc
and netgraph in gernal). This also allows to add queues for an interface
that is not yet existing (you have to provide the bandwidth for the
interface, however).
PR: kern/106400, kern/117827
MFC after: 2 weeks
dropped after the call to lockmgr() so just revert this approach using
something similar to the precedent one:
BUF_LOCKWAITERS() just checks if there are waiters (not the actual number
of them) and it is based on newly introduced lockmgr_waiters() which
returns if the lockmgr has waiters or not. The name has been choosen
differently by old lockwaiters() in order to not confuse them.
KPI results enriched by this commit so __FreeBSD_version bumping and
manpage update will be happening soon.
'struct buf' also changes, so kernel ABI is disturbed.
Bug found by: jeff
Approved by: jeff, kib
allows the class to create a different GEOM for the same provider
as well as avoid that we end up with multiple GEOMs of the same
class with the same name.
For example, when a disk contains a PC98 partition table but
only MBR is supported, then the partition table can be treated
as a MBR. If support for PC98 is later loaded as a module, the
MBR scheme is pre-empted for the PC98 scheme as expected.
offload bugs by manual padding for short IP/UDP frames. Unfortunately
it seems that these workaround does not work reliably on newer PCIe
variants of RealTek chips.
To workaround the hardware bug, always pad short frames if Tx IP
checksum offload is requested. It seems that the hardware has a
bug in IP checksum offload handling. NetBSD manually pads short
frames only when the length of IP frame is less than 28 bytes but I
chose 60 bytes to safety. Also unconditionally set IP checksum
offload bit in Tx descriptor if any TCP or UDP checksum offload is
requested. This is the same way as Linux does but it's not
mentioned in data sheet.
Obtained from: NetBSD
Tested by: remko, danger
src/cddl and src/sys/cddl directories per the core@ decision following
the license review.
This change modifies the affected Makefiles to reference the sources
in their new location.
will never exit ngintr(), while there is some ready requests on the queue.
It was made years ago with hope of parallel queue processing by several
net threads. But even if we have several threads sometimes, we have no
rights to process queue in parallel as it will break original requests
serialization that is critically important for some setups.
from clearing the IFF_NEEDSGIANT flag on Giant-locked interfaces.
In particular, wpa_supplicant was doing this on USB interfaces,
causing panics when Giant-locked code was then called without Giant.
Submitted by: Alexey Popov
Reviewed by: rwatson
MFC after: 3 days
to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k'
option to rpc.lockd and make kernel NLM the default. A user can still
force the use of the old user NLM by building a kernel without NFSLOCKD
and/or removing the nfslockd.ko module.
1. Add support for automatic promotion of 4KB page mappings to 2MB page
mappings. Automatic promotion can be enabled by setting the tunable
"vm.pmap.pg_ps_enabled" to a non-zero value. By default, automatic
promotion is disabled. Tested by: kris
2. To date, we have assumed that the TLB will only set the PG_M bit in a
PTE if that PTE has the PG_RW bit set. However, this assumption does
not hold on recent processors from Intel. For example, consider a PTE
that has the PG_RW bit set but the PG_M bit clear. Suppose this PTE
is cached in the TLB and later the PG_RW bit is cleared in the PTE,
but the corresponding TLB entry is not (yet) invalidated.
Historically, upon a write access using this (stale) TLB entry, the
TLB would observe that the PG_RW bit had been cleared and initiate a
page fault, aborting the setting of the PG_M bit in the PTE. Now,
however, P4- and Core2-family processors will set the PG_M bit before
observing that the PG_RW bit is clear and initiating a page fault. In
other words, the write does not occur but the PG_M bit is still set.
The real impact of this difference is not that great. Specifically,
we should no longer assert that any PTE with the PG_M bit set must
also have the PG_RW bit set, and we should ignore the state of the
PG_M bit unless the PG_RW bit is set.
frequency generation and what frequency the generated was anyones
guess.
In general the 32.768kHz RTC clock x-tal was the best, because that
was a regular wrist-watch Xtal, whereas the X-tal generating the
ISA bus frequency was much lower quality, often costing as much as
several cents a piece, so it made good sense to check the ISA bus
frequency against the RTC clock.
The other relevant property of those machines, is that they
typically had no more than 16MB RAM.
These days, CPU chips croak if their clocks are not tightly within
specs and all necessary frequencies are derived from the master
crystal by means if PLL's.
Considering that it takes on average 1.5 second to calibrate the
frequency of the i8254 counter, that more likely than not, we will
not actually use the result of the calibration, and as the final
clincher, we seldom use the i8254 for anything besides BEL in
syscons anyway, it has become time to drop the calibration code.
If you need to tell the system what frequency your i8254 runs,
you can do so from the loader using hw.i8254.freq or using the
sysctl kern.timecounter.tc.i8254.frequency.
The timer_spkr_*() functions take care of the enabling/disabling
of the speaker.
Test on the existence of timer_spkr_*() functions, rather than
architectures.
zero-copy to the store buffer position on the BPF descriptor,
and the 'b' buffer as the free buffer in order to fill them in
the order documented in bpf(4).
MFC after: 4 months
Suggested by: csjp
(such as 'atime' vs 'noatime'). The filesystems will always see either
'nofoo' or 'nonofoo', never plain 'foo'. As such, their list of valid
mount options should include 'nofoo' instead of 'foo'. With this fix,
you can do 'mount -u -o atime' on a FFS filesystem that isn't marked as
noatime without getting an error. You can also update a noatime FFS
filesystem mounted via mount(2) (e.g. 6.x /sbin/mount binary) to 'atime'
using nmount(2) (e.g. 7.x /sbin/mount binary).
MFC after: 1 week
Reviewed by: crodig
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.
The new (optional) MD functions are:
timer_spkr_acquire()
timer_spkr_release()
and
timer_spkr_setfreq()
the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.
Drop entirely timer2 on pc98, it is not used anywhere at all.
Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.
Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.
This eliminate the need for the speaker driver to know about
i8254frequency at all. In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.
Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.
In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that. It's probably not important.
Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.
This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.
Highlights include:
* Thread-safe kernel RPC client - many threads can use the same RPC
client handle safely with replies being de-multiplexed at the socket
upcall (typically driven directly by the NIC interrupt) and handed
off to whichever thread matches the reply. For UDP sockets, many RPC
clients can share the same socket. This allows the use of a single
privileged UDP port number to talk to an arbitrary number of remote
hosts.
* Single-threaded kernel RPC server. Adding support for multi-threaded
server would be relatively straightforward and would follow
approximately the Solaris KPI. A single thread should be sufficient
for the NLM since it should rarely block in normal operation.
* Kernel mode NLM server supporting cancel requests and granted
callbacks. I've tested the NLM server reasonably extensively - it
passes both my own tests and the NFS Connectathon locking tests
running on Solaris, Mac OS X and Ubuntu Linux.
* Userland NLM client supported. While the NLM server doesn't have
support for the local NFS client's locking needs, it does have to
field async replies and granted callbacks from remote NLMs that the
local client has contacted. We relay these replies to the userland
rpc.lockd over a local domain RPC socket.
* Robust deadlock detection for the local lock manager. In particular
it will detect deadlocks caused by a lock request that covers more
than one blocking request. As required by the NLM protocol, all
deadlock detection happens synchronously - a user is guaranteed that
if a lock request isn't rejected immediately, the lock will
eventually be granted. The old system allowed for a 'deferred
deadlock' condition where a blocked lock request could wake up and
find that some other deadlock-causing lock owner had beaten them to
the lock.
* Since both local and remote locks are managed by the same kernel
locking code, local and remote processes can safely use file locks
for mutual exclusion. Local processes have no fairness advantage
compared to remote processes when contending to lock a region that
has just been unlocked - the local lock manager enforces a strict
first-come first-served model for both local and remote lockers.
Sponsored by: Isilon Systems
PR: 95247 107555 115524 116679
MFC after: 2 weeks