Fix message ring send path:
- define msgrng_access_enable() which disables local interrupts
and enables message ring access. Also define msgrng_restore() which
restores interrupts
- remove all other msgrng enable/disable macros, no need of critical_enter
and other locking here.
- message_send() fixup: re-read status until pending bit clears
- message_send_retry() fixup: retry only few times with interrupts disabled
- Fix up message_send/message_send_retry callers - call
msgrng_access_enable() and msgrng_restore() correctly so that interrupts
are not disabled for long.
- removed unused and obsolete code from sys/mips/rmi/msgring.h
- some style fixes - more later
rge.c (XLR GMAC driver):
- updated for the message ring changes
- remove unused message_send_block()
- retry on credit failure, this is not a permanent failure when credits
are configured correctly. Add panic if credits are not available to
send for a long time.
PowerPC CPUs running a 32-bit kernel. This bug could cause in-use VSIDs
to be allocated again to another process, causing memory space overlaps
and corruption.
Reported by: linimon
discrepencies from the igb version which was the target.
Change the message when neither MSI or MSIX are enabled
and a fallback to Legacy interrupts happen, the existing
message was confusing.
directories for purposes of validating name cache entries. This
closes races where two updates to a file or directory within the same
second could result in stale entries in the name cache. While here,
remove the 'n_expiry' field as it is no longer used.
Reviewed by: rmacklem
MFC after: 1 week
"uncore" devices (such as the memory controller) in that socket. Stop
hardcoding support for two busses, but instead start probing buses at
domain 0, bus 255 and walk down until a bus probe fails. Also, do not probe
a bus if it has already been enumerated elsewhere (e.g. if ACPI ever
enumerates these buses in the future).
Fix interrupt routing so that the irq returned is correct for XLR and
XLS. This also updates the MSI hack we had earlier - we still don't
really support MSI, but we support some drivers that use MSI, by providing
support for allocating one MSI per pci link - this MSI is directly
mapped to the link IRQ.
lock on the pmc-sx lock. This prevents a deadlock with
pmc_log_process_mappings, which has an exclusive lock on pmc-sx and tries
to get a read lock on a vm_map. Downgrading the vm_map_lock in munmap
allows pmc_log_process_mappings to continue, preventing the deadlock.
Without this change I could cause a deadlock on a multicore 8.1-RELEASE
system by having one thread constantly mmap'ing and then munmap'ing a
PROT_EXEC mapping in a loop while I repeatedly invoked and stopped pmcstat
in system-wide sampling mode.
Reviewed by: fabient
Approved by: emaste (mentor)
MFC after: 2 weeks
using ipproto_{un,}register() and the newly created ip6proto_{un,}register()
so that it can again receive IPPROTO_CARP packets allowing its state machine
to work.
Reviewed by: bz
Approved by: ken (mentor)
- add identify method to create driver's own device_t
- successfully probe only driver's own device_t instead of any device_t
- (ab)use device order to hopefully be probed/attached after acpi_wmi
PR: kern/147858
Tested by: Maciej Suszko <maciej@suszko.eu>
MFC after: 1 week
- Add special check for case when time expires before being programmed.
This fixes interrupt loss and respectively timer death on attempt to
program very short interval. Increase minimal supported period to more
realistic value.
- Add support for hint.hpet.X.allowed_irqs tunable, allowing manually
specify which interrupts driver allowed to use. Unluckily, many BIOSes
program wrong allowed interrupts mask, so driver tries to stay on safe
side by not using unshareable ISA IRQs. This option gives control over
this limitation, allowing more per-CPU timers to be provided, when FSB
interrupts are not supported. Value of this tunable is bitmask.
- Do not use regular interrupts on virtual machines. QEMU and VirtualBox
do not support them properly, that may cause problems. Stay safe by default.
Same time both QEMU and VirtualBox work fine in legacy_route mode.
VirtualBox also works fine if manually specify allowed ISA IRQs with above.
upper layer. Until now, unionfs prevents to use that kind of
file system as upper layer. This time, I changed to allow
that kind of file system as upper layer. By this change, you
can use whiteout not supporting file system (e.g., especially
for tmpfs) as upper layer. It's very useful for combination of
tmpfs as upper layer and read only file system as lower layer.
By difinition, without whiteout support from the file system
backing the upper layer, there is no way that delete and rename
operations on lower layer objects can be done. EOPNOTSUPP is
returned for this kind of operations as generated by VOP_WHITEOUT()
along with any others which would make modifica tions to the
lower layer, such as chmod(1).
This change is suggested by ed.
Submitted by: ed
client to return an error when rabp is not set, so it
behaves the same way as the regular NFS client for this
case. It does not affect NFSv4, since nfs_getcacheblk()
only fails for "intr" mounts and NFSv4 can't use the
"intr" mount option.
MFC after: 2 weeks
Also change int -> u_int for order parameter in device_add_child_ordered.
There should not be any ABI change as struct device is private to subr_bus.c
and the API change should be compatible.
To do: change int -> u_int for order parameter of bus_add_child method
and its implementations. The change should also be API compatible, but
is a bit more churn.
Suggested by: imp, jhb
MFC after: 1 week
Both deadline and current_time are time_seconds (+ utc_offset())
casted to unsigned long long. No need to cast to or print as pointers.
MFC after: 4 days
vm_page_startup uses MSGBUF_SIZE value for adding msgbuf pages to minidump.
If opt_msgbuf.h is not included and MSGBUF_SIZE is overriden in kernel
config, then not all msgbuf pages will be dumped. And most importantly,
struct msgbuf itself will not be included. Thus the dump would look
corrupted/incomplete to tools like kgdb, dmesg, etc that try to access
struct msgbuf as one of the first things they do when working on a crash
dump.
MFC after: 5 days
A make buildkernel -j4 uses ~360% CPU.
- Bracket the AP spinup printf with a mutex to avoid garbled output.
- Enable SMP by default on powerpc64.
Reviewed by: nwhitehorn
Add the BIO_ORDERED flag for struct bio and update bio clients to use it.
The barrier semantics of bioq_insert_tail() were broken in two ways:
o In bioq_disksort(), an added bio could be inserted at the head of
the queue, even when a barrier was present, if the sort key for
the new entry was less than that of the last queued barrier bio.
o The last_offset used to generate the sort key for newly queued bios
did not stay at the position of the barrier until either the
barrier was de-queued, or a new barrier (which updates last_offset)
was queued. When a barrier is in effect, we know that the disk
will pass through the barrier position just before the
"blocked bios" are released, so using the barrier's offset for
last_offset is the optimal choice.
sys/geom/sched/subr_disk.c:
sys/kern/subr_disk.c:
o Update last_offset in bioq_insert_tail().
o Only update last_offset in bioq_remove() if the removed bio is
at the head of the queue (typically due to a call via
bioq_takefirst()) and no barrier is active.
o In bioq_disksort(), if we have a barrier (insert_point is non-NULL),
set prev to the barrier and cur to it's next element. Now that
last_offset is kept at the barrier position, this change isn't
strictly necessary, but since we have to take a decision branch
anyway, it does avoid one, no-op, loop iteration in the while
loop that immediately follows.
o In bioq_disksort(), bypass the normal sort for bios with the
BIO_ORDERED attribute and instead insert them into the queue
with bioq_insert_tail(). bioq_insert_tail() not only gives
the desired command order during insertion, but also provides
barrier semantics so that commands disksorted in the future
cannot pass the just enqueued transaction.
sys/sys/bio.h:
Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.
sys/cam/ata/ata_da.c:
sys/cam/scsi/scsi_da.c
Use an ordered command for SCSI/ATA-NCQ commands issued in
response to bios with the BIO_ORDERED flag set.
sys/cam/scsi/scsi_da.c
Use an ordered tag when issuing a synchronize cache command.
Wrap some lines to 80 columns.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
sys/geom/geom_io.c
Mark bios with the BIO_FLUSH command as BIO_ORDERED.
Sponsored by: Spectra Logic Corporation
MFC after: 1 month
to pad with 0xFF when it encounter short frames. According to RFC
1042 the pad bytes should be 0x00.
Because manual padding consumes extra CPU cycles, introduce a new
tunable which controls the padding behavior. Turning this tunable
on will have driver pad manually but it's disabled by default. Users
can enable software padding by setting the following tunable to
non-zero value.
dev.sis.%d.manual_pad="1"
PR: kern/35422 (patch not used)
prevented driver from working on big-endian machines. Also rewrite
station address programming to make it work on strict-alignment
architectures. With this change, sis(4) now works on sparc64 and
performance number looks good even though sis(4) have to apply
fixup code to align received frames on 2 bytes boundary on sparc64.
In protosw we define pr_protocol as short, while on the wire
it is an uint8_t. That way we can have "internal" protocols
like DIVERT, SEND or gaps for modules (PROTO_SPACER).
Switch ipproto_{un,}register to accept a short protocol number(*)
and do an upfront check for valid boundries. With this we
also consistently report EPROTONOSUPPORT for out of bounds
protocols, as we did for proto == 0. This allows a caller
to not error for this case, which is especially important
if we want to automatically call these from domain handling.
(*) the functions have been without any in-tree consumer
since the initial introducation, so this is considered save.
Implement ip6proto_{un,}register() similarly to their legacy IP
counter parts to allow modules to hook up dynamically.
Reviewed by: philip, will
MFC after: 1 week
response to DMA activate FIS under certain circumstances. This is
recommended fix from chip datasheet. If triggered, this bug most likely
cause write command timeout.
MFC after: 2 weeks
value 0xff. On hot-plug this value confuses ata_generic_reset() device
presence detection logic. As soon as we already know drive presence from
SATA hard reset, hint ata_generic_reset() to wait for device signature
until success or full timeout.
greater than 65535 bytes then the CDC driver might not work as expected, which
is not likely with the existing USB speeds.
Submitted by: Hans Petter Selasky
the file handle's size and was recently committed to
lib/libstand/nfs.c. This allows pxeboot to use NFSv3 and work
correcty for non-FreeBSD as well as FreeBSD NFS servers.
If built with OLD_NFSV2 defined, the old
code that predated this patch will be used.
Tested by: danny at cs.huji.ac.il
boot.nfsroot.nfshandlelen and set the diskless root fs to
use NFSv3 and this file handle length when it is set. If
this environment variable is not set, the diskless root fs
will use NFSv2 and the same defaults as before. This fixes
the problem where the diskless nfs root fs had to be on a
FreeBSD server for NFSv3 to work, because it did not know
the correct file handle length and assumed the size used
by FreeBSD. Until pxeboot and loader are replaced by ones
built from commits coming soon, boot.nfsroot.nfshandlelen
will not be set by them and the diskless root fs will use
NFSv2 unless the /etc/fstab entry has the "nfsv3" option
specified.
Tested by: danny at cs.huji.ac.il
MFC after: 2 weeks
allmulti is toggled. Controller does not require reinitialization.
This removes unnecessary controller reinitialization whenever
tcpdump is used.
While I'm here remove unnecessary variable reinitialization.
configured TX/RX MACs before getting a valid link. After that, when
link state change callback is called, it called device
initialization again to reconfigure TX/RX MACs depending on
resolved link state. This hack created several bad side effects and
it required more hacks to not collide with sis_tick callback as
well as disabling switching to currently selected media in device
initialization. Also it seems sis(4) was used to be a template
driver for long time so other drivers which was modeled after
sis(4) also should be changed.
TX/RX MACs are now reconfigured after getting a valid link. Fix for
short cable error is also applied after getting a link because it's
only valid when the resolved speed is 100Mbps.
While I'm here slightly reorganize interrupt handler such that
sis(4) always read SIS_ISR register to see whether the interrupt is
ours or not. This change removes another hack and make it possible
to nuke sis_stopped variable in softc.
thread in a racy manner, which can lead to attempting to migrate a
thread that is pinned to a CPU. Instead, have sched_switch() determine
which CPU a thread should run on if the current one is not allowed.
KASSERT in sched_bind() that the thread is not yet pinned to a CPU.
KASSERT in sched_switch() that only migratable threads or those moving
due to a sched_bind() are changing CPUs.
sched_affinity code came from jhb@.
MFC after: 2 weeks
- add rm_try_rlock().
- add RM_SLEEPABLE to use sx(9) as the back-end lock in order to sleep while
holding the write lock.
- change rm_noreadtoken to a cpu bitmask to indicate which CPUs need to go
through the lock/unlock in order to synchronize. As a side effect, this
also avoids IPI to CPUs without any readers during rm_wlock.
Discussed with: ups@, rwatson@ on arch@
Sponsored by: Isilon Systems, Inc.
o Enforce TX/RX descriptor ring alignment. NS data sheet says the
controller needs 4 bytes alignment but use 16 to cover both SiS
and NS controllers. I don't have SiS data sheet so I'm not sure
what is alignment restriction of SiS controller but 16 would be
enough because it's larger than the size of a TX/RX descriptor.
Previously sis(4) ignored the alignment restriction.
o Enforce RX buffer alignment, 4.
Previously sis(4) ignored RX buffer alignment restriction.
o Limit number of TX DMA segment to be used to 16. It seems
controller has no restriction on number of DMA segments but
using more than 16 looks resource waste.
o Collapse long mbuf chains with m_collapse(9) instead of calling
expensive m_defrag(9).
o TX/RX side bus_dmamap_load_mbuf_sg(9) support and remove
unnecessary callbacks.
o Initial endianness support.
o Prefer local alignment fixup code to m_devget(9).
o Pre-allocate TX/RX mbuf DMA maps instead of creating/destroying
these maps in fast TX/RX path. On non-x86 architectures, this is
very expensive operation and there is no need to do that.
o Add missing bus_dmamap_sync(9) in TX/RX path.
o watchdog is now unarmed only when there are no pending frames
on controller. Previously sis(4) blindly unarmed watchdog
without checking the number of queued frames.
o For efficiency, loaded DMA map is reused for error frames.
o DMA map loading failure is now gracefully handled. Previously
sis(4) ignored any DMA map loading errors.
o Nuke unused macros which are not appropriate for endianness
operation.
o Stop embedding driver maintained structures into descriptor
rings. Because TX/RX descriptor structures are shared between
host and controller, frequent bus_dmamap_sync(9) operations are
required in fast path. Embedding driver structures will increase
the size of DMA map which in turn will slow down performance.
- set cache_coherent_dma flag in cpuinfo for XLR, this will make sure that
BUS_DMA_COHERENT flag is handled correctly in busdma_machdep.c
- iodi.c, call device_get_name() just once
- clear RMI specific EIRR while intializing CPUs
- remove debug print in intr_machdep.c
which also avoids NULL pointer arithmetic, as suggested by jhb. The
available space goes from 11 bytes to 7.
Reviewed by: nyan
Approved by: rpaulo (mentor)
programs that refuse to run as root (pgsql) to install probes when their
user is part of the wheel group.
Sponsored by: The FreeBSD Foundation
> Description of fields to fill in above: 76 columns --|
> PR: If a GNATS PR is affected by the change.
> Submitted by: If someone else sent in the change.
> Reviewed by: If someone else reviewed your modification.
> Approved by: If you needed approval for this commit.
> Obtained from: If the change is from a third party.
> MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email.
> Security: Vulnerability reference (one per line) or description.
> Empty fields above will be automatically removed.
M dev/dtrace/dtrace_load.c
not change interrupt vector if it is not pointing the ROM itself. Actually,
we just fail shadowing altogether if that is the case because the shadowed
copy will be useless for sure and POST may not be relocatable or useful.
While I'm here, fix a debugging message under bootverbose, really. r211829
fixed one case but broke another. Mea Culpa.
or not by comparing reported TX consumer index with saved index. So
remove unnecessary check done after freeing transmitted mbufs.
While I'm here nuke unnecessary variable initializations.