add a FreeBSD_version check. It should work fine for compiling
on -HEAD, 9.x and 8.x.
* Conditionally compile the 11n options only when 11n is enabled.
The above changes allow the ath(4) driver to compile and run on
8.1-RELEASE (Hi old PC-BSD!) but with the 11n stuff disabled.
I've done a test against the net80211 and tools in 8.1-RELEASE.
The NIC used in testing is the AR2427 in an EEEPC.
Just to be clear - this change is to allow the -HEAD ath/hal/rate
code to run on 9.x _and_ 8.x with no source changes. However,
when running on earlier kernels, it should only be used for legacy
mode. (Don't define ATH_ENABLE_11N.)
this commit is not enough to enable CARP operation on
if_bridge(4), because the latter doesn't handle or even
initialize its ifp->if_link_state.
Reported by: Alexander Lunev <sol289 gmail.com>
in set_apic_interrupt_ids(). Besides, set_apic_interrupts_ids() is not
called in the !SMP case too.
Fix this by:
- Adding the BSP as an interrupt target directly in cpu_startup().
- Remove an obsolete optimization where the BSP are skipped in
set_apic_interrupt_ids().
Reported by: jh
Reviewed by: jhb
MFC after: 3 days
X-MFC: r233961
Pointy hat to: me
The SA_PROC signal property indicated whether each signal number is directed
at a specific thread or at the process in general. However, that depends on
how the signal was generated and not on the signal number. SA_PROC was not
used.
accesses of the cache member of vm_object objects.
- Use novel vm_page_is_cached() for checks outside of the vm subsystem.
Reviewed by: alc
MFC after: 2 weeks
X-MFC: r234039
They do not have compatible configuration registers in PCI configuration
space. Instead their configuration resides in AMD "PM I/O" space
(accessed via a pair of I/O space registers).
MFC after: 5 days
that it will be freed to the cache pool rather than the default pool.
Otherwise, the cached pages within the reservation may be recycled sooner
than necessary.
Reported by: Andrey Zonov
policy configuration, avoid leaking resources following failed calls
to get and set MAC labels by file descriptor.
Reported by: Mateusz Guzik <mjguzik at gmail.com> + clang scan-build
MFC after: 3 days
accounting for I/O counts at completion of I/O operation. Also switch
from using global devmtx to vnode mutex to reduce contention.
Suggested and reviewed by: kib
damage which I committed when I had less clue about such things.
Don't ever put normal data frames on the mcast software queue.
Just put mcast frames there if needed.
Pass the txq decision into ath_tx_normal_setup(), as we've already made
the decision. Don't re-do it.
Whilst i'm here, add another random debugging statement.
used in the code which needs to implement some specific
behaviour when being run under QEMU.
- Make PXA UART probe code to work under QEMU gumstix, which
doesn't emulate all the ports properly.
allocator.
Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in
memory is not likely to become possible in foreseeable feature and would
require new unit number allocator.
Discussed with: delphij
MFC after: 2 weeks
This fixes bootp on if_smc, as bootp code perform SIOCSIFADDR
ioctl call immediately after sending the request (which causes
if_init being called) which causes the adapter to drop all the
packets received in the meantime.
call these after rate control selection is done.
The duration/protection code wasn't working - it expected the rix to
be valid. Unfortunately after I moved the rate control selection into
late in the process, the rix value isn't valid and thus the protection/
duration code would get things wrong.
HT frames are now correctly protected with an RTS and for the AR5416,
this involves having the aggregate frames be limited to 8K.
TODO:
* Fix up the DMA sync to occur just before the frame is queued to the
hardware. I'm adjusting the duration here but not doing the DMA
flush.
* Doubly/triply ensure that the aggregate frames are being limited to
the correct size, or the AR5416 will get unhappy when TXing RTS-protected
aggregates.
if any subframes in an aggregate have different protection from the
first frame in the formed aggregate, don't add that frame to the
aggregate.
This is likely a suboptimal method (I think we'll mostly be OK marking
frames that have seqno's with the same protection as normal data frames)
but I'll just be cautious for now.
This will be used by some upcoming code to ensure that aggregates
are enforced to be a certain size. The AR5416 has a limitation on
RTS protected aggregates (8KiB).
that don't exist.
Anecdotal evidence indicates that it is better to return 011b (bad LUN)
than 001b (LUN offline). However, this change also gives the user a
sysctl/tunable, kern.cam.ctl.inquiry_pq_no_lun, to override the change
and return to the previous behavior. (The previous behavior was to
return 001b, or LUN offline.)
ctl.c: Change the default inquiry peripheral qualifier to 011b,
and add a sysctl and tunable to allow the user to change
it back to 001b if needed.
Don't insert a Copan copyright statement in the inquiry
data. The copyright statements on the files are
sufficient.
ctl_private.h: Add sysctl variable context to the CTL softc.
ctl_cmd_table.c,
ctl_frontend_internal.c,
ctl_frontend.c,
ctl_backend.c,
ctl_error.c: Include sys/sysctl.h.
MFC after: 3 days
222813, that left all un-pinned interrupts assigned to CPU 0.
sys/x86/x86/intr_machdep.c:
In intr_shuffle_irqs(), remove CPU_SETOF() call that initialized
the "intr_cpus" cpuset to only contain CPU0.
This initialization is too late and nullifies the results of calls
the intr_add_cpu() that occur much earlier in the boot process.
Since "intr_cpus" is statically initialized to the empty set, and
all processors, including the BSP, already add themselves to
"intr_cpus" no special initialization for the BSP is necessary.
MFC after: 3 days
(slightly) different semantics and renaming it prevents a (harmless)
WITNESS warning during bootup for 32-bit kernels on 64-bit CPUs.
MFC after: 5 days
The menu item is now made completely independent with the ACPI item - most
modern systems seem to require ACPI and become even more "unsafe"
without it.
Safe Mode no longer disables APIC for the same reason.
kbdmux is not disabled as this feature has proven itself stable.
New actions:
- SMP is disabled in the Safe Mode now
- eventtimers are forced to periodic mode (some real and virtual systems
seem to have problems otherwise)
- geom extra vigorous integrity checking is disabled, this is to
facilitate migration from previous versions
Possible short term to do:
- make SMP switch a separate menu item
- restore APIC switch as a separate menu item
Longer term to do:
- turn various tweaks into separate menu items in a Safe Mode sub-menu
Please consider adding a safety tweak to Safe Mode when introducing
new major features or changes that may cause instabilities.
Discussed with: jhb, scottl, Devin Teske
MFC after: 3 weeks (stable/9 only)
Linux and Solaris (at least OpenSolaris) has PF_PACKET socket families to send
raw ethernet frames. The only FreeBSD interface that can be used to send raw frames
is BPF. As a result, many programs like cdpd, lldpd, various dhcp stuff uses
BPF only to send data. This leads us to the situation when software like cdpd,
being run on high-traffic-volume interface significantly reduces overall performance
since we have to acquire additional locks for every packet.
Here we add sysctl that changes BPF behavior in the following way:
If program came and opens BPF socket without explicitly specifyin read filter we
assume it to be write-only and add it to special writer-only per-interface list.
This makes bpf_peers_present() return 0, so no additional overhead is introduced.
After filter is supplied, descriptor is added to original per-interface list permitting
packets to be captured.
Unfortunately, pcap_open_live() sets catch-all filter itself for the purpose of
setting snap length.
Fortunately, most programs explicitly sets (event catch-all) filter after that.
tcpdump(1) is a good example.
So a bit hackis approach is taken: we upgrade description only after second
BIOCSETF is received.
Sysctl is named net.bpf.optimize_writers and is turned off by default.
- While here, document all sysctl variables in bpf.4
Sponsored by Yandex LLC
Reviewed by: glebius (previous version)
Reviewed by: silence on -net@
Approved by: (mentor)
MFC after: 4 weeks
Interface locks and descriptor locks are converted from mutex(9) to rwlock(9).
This greately improves performance: in most common case we need to acquire 1
reader lock instead of 2 mutexes.
- Remove filter(descriptor) (reader) lock in bpf_mtap[2]
This was suggested by glebius@. We protect filter by requesting interface
writer lock on filter change.
- Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h
without including rwlock stuff. However, this is is temporary solution,
struct bpf_if should be made opaque for any external caller.
Found by: Dmitrij Tejblum <tejblum@yandex-team.ru>
Sponsored by: Yandex LLC
Reviewed by: glebius (previous version)
Reviewed by: silence on -net@
Approved by: (mentor)
MFC after: 3 weeks
a pair of records similar to syscall entry and return that a user can
use to determine how long page faults take. The new ktrace records are
enabled via the 'p' trace type, and are enabled in the default set of
trace points.
Reviewed by: kib
MFC after: 2 weeks
On FreeBSD the direct ioctl argument is automatically copied in/out
as necesary by the kernel ioctl entry point.
PR: kern/164445
Submitted by: Luis Garces-Erice <lge@ieee.org>
Tested by: Attila Nagy <bra@fsn.hu>
MFC after: 5 days
application destroys semaphore after sem_wait returns. Just enter
kernel to wake up sleeping threads, only update _has_waiters if
it is safe. While here, check if the value exceed SEM_VALUE_MAX and
return EOVERFLOW if this is true.
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily. On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
A BAR frame must be transmitted when an frame in an A-MPDU session fails
to transmit - it's retried too often, or it can't be cloned for
re-transmission. The BAR frame tells the remote side to advance the
left edge of the block-ack window (BAW) to a new value.
In order to do this:
* TX for that particular node/TID must be paused;
* The existing frames in the hardware queue needs to be completed, whether
they're TXed successfully or otherwise;
* The new left edge of the BAW is then communicated to the remote side
via a BAR frame;
* Once the BAR frame has been sucessfully TXed, aggregation can resume;
* If the BAR frame can't be successfully TXed, the aggregation session
is torn down.
This is a first pass that implements the above. What needs to be done/
tested:
* What happens during say, a channel reset / stuck beacon _and_ BAR
TX. It _should_ be correctly buffered and retried once the
reset has completed. But if a bgscan occurs (and they shouldn't,
grr) the BAR frame will be forcibly failed and the aggregation session
will be torn down.
Yes, another reason to disable bgscan until I've figured this out.
* There's way too much locking going on here. I'm going to do a couple
of further passes of sanitising and refactoring so the (re) locking
isn't so heavy. Right now I'm going for correctness, not speed.
* The BAR TX can fail if the hardware TX queue is full. Since there's
no "free" space kept for management frames, a full TX queue (from eg
an iperf test) can race with your ability to allocate ath_buf/mbufs
and cause issues. I'll knock this on the head with a subsequent
commit.
* I need to do some _much_ more thorough testing in hostap mode to ensure
that many concurrent traffic streams to different end nodes are correctly
handled. I'll find and squish whichever bugs show up here.
But, this is an important step to being able to flip on 802.11n by default.
The last issue (besides bug fixes, of course) is HT frame protection and
I'll address that in a subsequent commit.
Linux ath9k doesn't have this issue as it doesn't try queuing multi-
descriptor frames to the hardware.
Before, I was only setting the first and last descriptor in the final
frame correctly - and that was done by accident. The first descriptor in
the last sub-frame was being correctly updated by ath_tx_setds_11n();
the last descriptor in the last sub-frame was being correctly updated
by ath_buf_set_rate(). But both of those are "incorrect".
The correct behaviour is:
* AR_IsAggr is set for all descriptors for all subframes in an aggregate.
* AR_MoreAggr is set for all descriptors for all non-final sub-frames
in an aggregate.
Ie, all descriptors in the last sub-frame of an aggregate must have this
field set to 0.
I still need to do a couple of extra passes to ensure the pad delimiter
field is being correctly handled in all descriptors in the last sub-frame.
can be upgraded to MegaRAID mode, in which case mfi(4) should attach to
these based on the sub-vendor and -device ID instead (not currently done).
Therefore, let mpt_pci_probe() return BUS_PROBE_LOW_PRIORITY.
While it, let mpt_pci_probe() return BUS_PROBE_DEFAULT instead of 0 in
the default case.
MFC after: 3 days
revision 1.173
date: 2011/11/09 12:36:03; author: camield; state: Exp; lines: +11 -12
State expire time is a baseline time ("last active") for expiry
calculations, and does _not_ denote the time when to expire. So
it should never be added to (set into the future).
Try to reconstruct it with an educated guess on state import and
just set it to the current time on state updates.
This fixes a problem on pfsync listeners where the expiry time
could be double the expected value and cause a lot more states
to linger.
forwarding a packet, that creates state, until
pfsync(4) peer acks state addition (or 10 msec
timeout passes).
This is needed for active-active CARP configurations,
which are poorly supported in FreeBSD and arguably
a good idea at all.
Unfortunately by the time of import this feature in
OpenBSD was turned on, and did not have a switch to
turn it off. This leaked to FreeBSD.
This change make it possible to turn this feature
off via ioctl() and turns it off by default.
Obtained from: OpenBSD
and it is no longer referenced by a user process. The inode for a
file whose name has been removed, but is still referenced at the
time of a crash will still be allocated in the filesystem, but will
have no references (e.g., they will have no names referencing them
from any directory).
With traditional soft updates these unreferenced inodes will be
found and reclaimed when the background fsck is run. When using
journaled soft updates, the kernel must keep track of these inodes
so that it can find and reclaim them during the cleanup process.
Their existence cannot be stored in the journal as the journal only
handles short-term events, and they may persist for days. So, they
are tracked by keeping them in a linked list whose head pointer is
stored in the superblock. The journal tracks them only until their
linked list pointers have been commited to disk. Part of the cleanup
process involves traversing the list of unreferenced inodes and
reclaiming them.
This bug was triggered when confusion arose in the commit steps
of keeping the unreferenced-inode linked list coherent on disk.
Notably, a race between the link() system call adding a link-count
to a file and the unlink() system call removing a link-count to
the file. Here if the unlink() ran after link() had looked up
the file but before link() had incremented the link-count of the
file, the file's link-count would drop to zero before the link()
incremented it back up to one. If the file was referenced by a
user process, the first transition through zero made it appear
that it should be added to the unreferenced-inode list when in
fact it should not have been added. If the new name created by
link() was deleted within a few seconds (with the file still
referenced by a user process) it would legitimately be a candidate
for addition to the unreferenced-inode list. The result was that
there were two attempts to add the same inode to the unreferenced-inode
list which scrambled the unreferenced-inode list's pointers leading
to a panic. The fix is to detect and avoid the false attempt at
adding it to the unreferenced-inode list by having the link()
system call check to see if the link count is zero before it
increments it. If it is, the link() fails with ENOENT (showing that
it has failed the link()/unlink() race).
While tracking down this bug, we have added additional assertions
to detect the problem sooner and also simplified some of the code.
Reported by: Kirk Russell
Fix submitted by: Jeff Roberson
Tested by: Peter Holm
PR: kern/159971
MFC (to 9 only): 2 weeks
sleeping from a swi handler (even though in this case it would be ok), so
switch the refill and scanning SWI handlers to being tasks on a fast
taskqueue. Also, only schedule the refill task for a CMCI as an MC# can
fire at any time, so it should do the minimal amount of work needed and
avoid opportunities to deadlock before it panics (such as scheduling a
task it won't ever need in practice). To handle the case of an MC# only
finding recoverable errors (which should never happen), always try to
refill the event free list when the periodic scan executes.
MFC after: 2 weeks
flags check.
- Add a comment for the immutable/append check done after handling of
the flags.
- Style improvements.
No functional change intended.
Submitted by: bde
MFC after: 2 weeks
an uncorrected ECC error tends to fire on all CPUs in a package
simultaneously and the current printf hacks are not sufficient to make
the messages legible. Instead, use the existing mca_lock spinlock to
serialize calls to mca_log() and change the machine check code to panic
directly when an unrecoverable error is encoutered rather than falling
back to a trap_fatal() call in trap() (which adds nearly a screen-full of
logging messages that aren't useful for machine checks).
MFC after: 2 weeks
via procstat(1) and fstat(1):
- Change shm file descriptors to track the pathname they are associated
with and add a shm_path() method to copy the path out to a caller-supplied
buffer.
- Use the fo_stat() method of shared memory objects and shm_path() to
export the path, mode, and size of a shared memory object via
struct kinfo_file.
- Add a struct shmstat to the libprocstat(3) interface along with a
procstat_get_shm_info() to export the mode and size of a shared memory
object.
- Change procstat to always print out the path for a given object if it
is valid.
- Teach fstat about shared memory objects and to display their path,
mode, and size.
MFC after: 2 weeks
checked PROTECT bit in INQUIRY data for all SPC devices, while it is defined
only since SPC-3. But there are some SPC-2 USB devices were reported, that
have PROTECT bit set, return no error for READ CAPACITY(16) command, but
return wrong sector count value in response.
MFC after: 3 days
First cut of new HW support from LSI and merge into FreeBSD.
Supports Drake Skinny and ThunderBolt cards.
MFhead_mfi r227574
Style
MFhead_mfi r227579
Use bus_addr_t instead of uintXX_t.
MFhead_mfi r227580
MSI support
MFhead_mfi r227612
More bus_addr_t and remove "#ifdef __amd64__".
MFhead_mfi r227905
Improved timeout support from Scott.
MFhead_mfi r228108
Make file.
MFhead_mfi r228208
Fixed botched merge of Skinny support and enhanced handling
in call back routine.
MFhead_mfi r228279
Remove superfluous !TAILQ_EMPTY() checks before TAILQ_FOREACH().
MFhead_mfi r228310
Move mfi_decode_evt() to taskqueue.
MFhead_mfi r228320
Implement MFI_DEBUG for 64bit S/G lists.
MFhead_mfi r231988
Restore structure layout by reverting the array header to
use [0] instead of [1].
MFhead_mfi r232412
Put wildcard pattern later in the match table.
MFhead_mfi r232413
Use lower case for hexadecimal numbers to match surrounding
style.
MFhead_mfi r232414
Add more Thunderbolt variants.
MFhead_mfi r232888
Don't act on events prior to boot or when shutting down.
Add hw.mfi.detect_jbod_change to enable or disable acting
on JBOD type of disks being added on insert and removed on
removing. Switch hw.mfi.msi to 1 by default since it works
better on newer cards.
MFhead_mfi r233016
Release driver lock before taking Giant when deleting children.
Use TAILQ_FOREACH_SAFE when items can be deleted. Make code a
little simplier to follow. Fix a couple more style issues.
MFhead_mfi r233620
Update mfi_spare/mfi_array with the actual number of elements
for array_ref and pd. Change these max. #define names to avoid
name space collisions. This will require an update to mfiutil
It avoids mfiutil having to do a magic calculation.
Add a note and #define to state that a "SYSTEM" disk is really
what the firmware calls a "JBOD" drive.
Thanks to the many that helped, LSI for the initial code drop,
mav, delphij, jhb, sbruno that all helped with code and testing.
sys/dev/isci/isci_task_request.c:198:7: error: case value not in enumerated type 'SCI_TASK_STATUS' (aka 'enum _SCI_TASK_STATUS') [-Werror,-Wswitch]
case SCI_FAILURE_TIMEOUT:
^
This is because the switch is done on a SCI_TASK_STATUS enum type, but
the SCI_FAILURE_TIMEOUT value belongs to SCI_STATUS instead.
Because the list of SCI_TASK_STATUS values cannot be modified at this
time, use the simplest way to get rid of this warning, which is to cast
the switch argument to int. No functional change.
Reviewed by: jimharris
MFC after: 3 days
- Don't malloc() new MCA records for machine checks logged due to a
CMCI or MC# exception. Instead, use a pre-allocated pool of records.
When a CMCI or MC# exception fires, schedule a swi to refill the pool.
The pool is sized to hold at least one record per available machine
bank, and one record per CPU. This should handle the case of all CPUs
triggering a single bank at once as well as the case a single CPU
triggering all of its banks. The periodic scans still use malloc()
since they are run from a safe context.
- Since we have to create an swi to handle refills, make the periodic scan
a second swi for the same thread instead of having a separate taskqueue
thread for the scans.
Suggested by: mdf (avoiding malloc())
MFC after: 2 weeks
- Do not define the foo_start() methods or set if_start in the ifnet if
multiq transmit is enabled. Also, set if_transmit and if_qflush before
ether_ifattach rather than after when multiq transmit is enabled. This
helps to ensure that the drivers never try to mix different transmit
methods.
- Properly restart transmit during resume. igb(4) was not restarting it
at all, and em(4) was restarting even if the link was down and was
calling the wrong method if multiq transmit was enabled.
- Remove all the 'more' handling for transmit completions. Transmit
completion processing does not have a processing limit, so it always
runs to completion and never has more work to do when it returns.
Instead, the previous code was returning 'true' anytime there were
packets in the queue that weren't still in the process of being
transmitted. The effect was that the driver would continuously
reschedule a task to process TX completions in effect running at 100%
CPU polling the hardware until it finished transmitting all of the
packets in the ring. Now it will just wait for the next TX completion
interrupt.
- Restart packet transmission when the link becomes active.
- Fix the MSI-X queue interrupt handlers to restart packet transmission if
there are pending packets in the relevant software queue (IFQ or buf_ring)
after processing TX completions. This is the root cause for the OACTIVE
hangs as if the MSI-X queue handler drained all the pending packets from
the TX ring, nothing would ever restart it. As such, remove some
previously-added workarounds to reschedule a task to poll the TX ring
anytime OACTIVE was set.
Tested by: sbruno
Reviewed by: jfv
MFC after: 1 week
"Under a highly specific and detailed set of internal timing conditions,
the processor may incorrectly update the stack pointer after a long series
of push and/or near-call instructions, or a long series of pop and/or
near-return instructions. The processor must be in 64-bit mode for this
erratum to occur."
MFC after: 3 days
- Correctly determine the maximum payload size for setting the TX link
frequent NACK latency and replay timer thresholds.
Submitted by: stefanf [1]
MFC after: 3 days
This change also workarounds dhclient's link state handling bug by
not giving current link status.
Unlike other controllers, ale(4)'s PHY hibernation perfectly works
such that driver does not see a valid link if the controller is not
brought up. If dhclient(8) runs on ale(4) it will blindly waits
until link UP and then gives up after 10 seconds. Because
dhclient(8) still thinks interface got a valid link when IFM_AVALID
is not set for selected media, this change makes dhclient initiate
DHCP without waiting for link UP.
needs to defer link state handling.
While I'm here, mark IFF_DRV_RUNNING before changing media. If
link is established without any delay, that link state change
handling could be lost.
that revision, the bswapXX_const() macros were renamed to bswapXX_gen().
Also, bswap64_gen() was implemented as two calls to bswap32(), and
similarly, bswap32_gen() as two calls to bswap16(). This mainly helps
our base gcc to produce more efficient assembly.
However, the arguments are not properly masked, which results in the
wrong value being calculated in some instances. For example,
bswap32(0x12345678) returns 0x7c563412, and bswap64(0x123456789abcdef0)
returns 0xfcdefc9a7c563412.
Fix this by appropriately masking the arguments to bswap16() in
bswap32_gen(), and to bswap32() in bswap64_gen(). This should also
silence warnings from clang.
Submitted by: jh
revision has two problems:
- It can produce worse code with both clang and gcc.
- It doesn't fix the actual issue introduced in r232721, which will be
fixed in the next commit.
Submitted by: bde, tijl and jh
Pointy hat to: dim
not disable it and it is even harmful as hselasky found out. Historically,
this code was originated from (OLDCARD) CardBus driver and later leaked into
PCI driver when CardBus was newbus'ified and refactored with PCI driver.
However, it is not really necessary even for CardBus.
Reviewed by: hselasky, imp, jhb
bridges. Rather than blindly enabling the windows on all of them, only
enable the window when an MSI interrupt is enabled for a device behind
the bridge, similar to what already happens for HT PCI-PCI bridges.
To implement this, each x86 Host-PCI bridge driver has to be able to
locate it's actual backing device on bus 0. For ACPI, use the _ADR
method to find the slot and function of the device. For the non-ACPI
case, the legacy(4) driver already scans bus 0 looking for Host-PCI
bridge devices. Now it saves the slot and function of each bridge that
it finds as ivars that the Host-PCI bridge driver can then use in its
pcib_map_msi() method.
This fixes machines where non-MSI interrupts were broken by the previous
round of HT MSI changes.
Tested by: bapt
MFC after: 1 week
added, the call to pmap_kextract() was moved up, and as a result the
code never updated the physical address to use for DMA if a bounce
buffer was used. Restore the earlier location of pmap_kextract() so
it takes bounce buffers into account.
Tested by: kargl
MFC after: 1 week
Right now ath_txq_sched() is mainly called from the TX ath_tx_processq()
routine, which is (mostly) done as part of the taskqueue. It shouldn't
be called outside the taskqueue.
But now that I'm about to flip back on BAR TX, I'm going to start
stressing the ath_tx_tid_pause() and ath_tx_tid_resume() paths.
What I don't want to have happen is a reschedule of the TID traffic
_during_ the completion of TX frames.
Ideally I'd like to have a way to flag back up to the processing code
that the current hardware queue should be rechecked for software TID
queue frames. But for now, this should suffice for the BAR TX case.
I may eventually delete this code once I've brought some further
sanity to the general TX queue/completion path.
be less ambiguous and more clearly identify what it means. This
attribute is what Intel refers to as UC-, and it's only difference
relative to normal UC memory is that a WC MTRR will override a UC-
PAT entry causing the memory to be treated as WC, whereas a UC PAT
entry will always override the MTRR.
- Remove the VM_MEMATTR_UNCACHED alias from powerpc.
Don't disable BARs on any PCI display devices, because doing that can
sometimes cause the main memory bus to stop working, causing all
memory reads to return nothing but 0xFFFFFFFF, even though the memory
location was previously written. After a while a privileged
instruction fault will appear and then nothing more can be debugged.
The reason for this behaviour is unknown.
MFC after: 1 week
This makes our naming scheme more closely match other systems and the
expectations of much third-party software. MIPS builds which are little-endian
should require and exhibit no changes. Big-endian TARGET_ARCHes must be
changed:
From: To:
mipseb mips
mipsn32eb mipsn32
mips64eb mips64
An entry has been added to UPDATING and some foot-shooting protection (complete
with warnings which should become errors in the near future) to the top-level
base system Makefile.
While we have a snapshot vnode unlocked to avoid a deadlock with another
inode in the same inode block being updated, the filesystem containing
it may be forcibly unmounted. When that happens the snapshot vnode is
revoked. We need to check for that condition and fail appropriately.
This change will be included along with 232351 when it is MFC'ed to 9.
Spotted by: kib
Reviewed by: kib
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
to enable the collection of counts of synchronous and asynchronous
reads and writes for its associated filesystem. The counts are
displayed using `mount -v'.
Ensure that buffers used for paging indicate the vnode from
which they are operating so that counts of paging I/O operations
from the filesystem are collected.
This checkin only adds the setting of the mount point for the
UFS/FFS filesystem, but it would be trivial to add the setting
and clearing of the mount point at filesystem mount/unmount
time for other filesystems too.
Reviewed by: kib
are are not mapped during ranged operations and reduce the scope of the
tlbie lock only to the actual tlbie instruction instead of the entire
sequence. There are a few more optimization possibilities here as well.
Revert r233555 and apply a fix for the reference counting regressions.
Tested by: andreast, lme, nwhitehorn,
Sevan / Venture37 (venture37 at gmail dot com)
Submitted by: Robert Moore (robert dot moore at intel dot com)
violation should be activated unless the system is cold-booted
after updating EEPROM.
The PCI protocol violation happens only when established link is
10Mbps so the workaround should be updated whenever link state
change is detected. Previously the workaround was activated only
when user checks current media status with ifconfig(8).
whether the checksum of EEPROM is valid or not. Because driver
heavily relies on EEPROM information when it selectively enables
features/workarounds, it would be helpful to know whether driver
sees valid EEPROM.
While I'm here remove all other EEPROM accesses since the entire
EEPROM is loaded at device attach time.
MFC after: 2 weeks
for 82550C. For 82550 controllers this change restores CPUSaver
microcode loading. Due to silicon bug on 82550 and 82550C with
server extension, these controllers seem to require CPUSaver
microcode to receive fragmented UDP datagrams. However the
microcode shouldn't be used on client featured 82550C as it locks
up the controller. In addition, client featured 82550C does not
have the silicon bug. Also clear temporary memory used for
microcode loading since the same memory area is used for other
commands.
While I'm here use 82550C in probe message instead of generic
82550.
Reported by: Andreas Longwitz <longwitz <> incore de>
Tested by: Andreas Longwitz <longwitz <> incore de>
MFC after: 2 weeks
- Make INITAFTERSUSPEND flag independent of HOOKRESUME flag.
- Automatically set INITAFTERSUSPEND flag when ALPS GlidePoint is detected.
- Always probe Synaptics Touchpad. Allow MOUSE_SYN_GETHWINFO ioctl and
automatically set INITAFTERSUSPEND flag when a supported device is detected,
regardless of "hw.psm.synaptics_support" tunable setting.
- Update psm(4) to reflect the above changes.
- Remove long-time defunct SYNCHACK flag while I am in the neighborhood.
MFC after: 1 month
The 'make depend' rules have to use custom -I paths for the special compat
includes for the opensolaris/zfs headers.
This option will pull in the couple of files that are shared with dtrace,
but they appear to correctly use the MODULE_VERSION/MODULE_DEPEND rules
so loader should do the right thing, as should kldload.
Reviewed by: pjd (glanced at)
- Do not cover error returned by pmc_core_initialize with the
result of pmc_uncore_initialize, fail right away.
- Give a user something to report instead failing silently
Reported by: Alexandr Kovalenko <never@nevermind.kiev.ua>
The on-chip SD slots do not have PCI BARs corresponding to them, so
this has to be handled in the custom SoC memory allocation.
Provide memory resource for rids corresponding to BAR 0 and 1 in
the custom allocation code.
The XLP on-chip devices have PCI configuration headers, but some of the
devices need custom resource allocation code.
- devices with no MEM/IO BARs with registers in PCIe extended reg
space have to be handled in memory resource allocation
- devices without INTPIN/INTLINE in PCI header can be supported
by having these faked with a shadow register.
- Some devices does not allow 8/16 bit access to the register space,
he default bus space cannot be used for these.
Subclass pci and override attach and resource allocation methods to
take care of this.
Remove earlier code which did this partially.
Temporarily revert an upstream commit. This change caused regressions for
too many laptop users. Especially, automatic repair for broken _BIF caused
strange reference counting issues and kernal panics. This reverts:
c995fed15a
SCTP will only do IPv4 UDP checksum calculation as defined by the host
policy. When tunneling SCTP always calculates the inner checksum already
so not doing the outer UDP can save cycles.
While here virtualize the variable.
Requested by: tuexen
MFC after: 2 weeks
The flash commands and responses are little-endian and have to be
byte swapped on big-endian systems. However the raw read of data
need not be swapped.
Make the cfi_read and cfi_write do the swapping, and provide a
cfi_read_raw which does not byte swap for reading data from
flash.
loaded and unloaded, also have sdt.ko register callbacks with kern_sdt.c
that will be called when a newly loaded KLD module adds more probes or
a module with probes is unloaded.
This fixes two issues: first, if a module with SDT probes was loaded after
sdt.ko was loaded, those new probes would not be available in DTrace.
Second, if a module with SDT probes was unloaded while sdt.ko was loaded,
the kernel would panic the next time DTrace had cause to try and do
anything with the no-longer-existent probes.
This makes it possible to create SDT probes in KLD modules, although there
are still two caveats: first, any SDT probes in a KLD module must be part
of a DTrace provider that is defined in that module. At present DTrace
only destroys probes when the provider is destroyed, so you can still
panic the system if a KLD module creates new probes in a provider from a
different module(including the kernel) and then unload the the first module.
Second, the system will panic if you unload a module containing SDT probes
while there is an active D script that has enabled those probes.
MFC after: 1 month
Move XLP PCI UART device to sys/mips/nlm/dev/ directory. Other
drivers for the XLP SoC devices will be added here as well.
Update uart_cpu_xlp.c and uart_pci_xlp.c use macros for uart port,
speed and IO frequency.
Features:
- network driver for the four 10G interfaces and two management ports
on XLP 8xx.
- Support 4xx and 3xx variants of the processor.
- Source code and firmware building for the 16 mips32r2 micro-code engines
in the Network Accelerator.
- Basic initialization code for Packet ordering Engine.
Submitted by: Prabhath Raman (prabhath at netlogicmicro com)
[refactored and fixed up for style by jchandra]
On XLP evaluation platform, the board information is stored
in an I2C eeprom and the network block configuration is available
from a CPLD connected to the GBU (NOR flash bus). Add support
for both of these.
Support for the Security and RSA blocks on XLP SoC. Even though
the XLP supports many more algorithms, only the ones supported
in OCF have been added.
Submitted by: Venkatesh J. V. (venkatesh at netlogicmicro com)
Add a Simple polled driver iicoc for the OpenCores I2C controller. This
is used in Netlogic XLP processors.
Submitted by: Sreekanth M. S. (kanthms at netlogicmicro com)
The earlier version of the driver is sys/mips/rmi/dev/iic/ds1374u.c
Convert all references to ds1374u to ds1374, and use DEVMETHOD_END.
Also update the license header as Netlogic is now Broadcom.
- XLP supports hardware swap for PCIe IO/MEM accesses. Since we
are in big-endian mode, enable hardware swap and use the normal
bus space.
- move some printfs to bootverbose, and remove others.
- fix SoC device resource allocation code
- Do not use '|' while updating PCIE_BRIDGE_MSI_ADDRL
- some style fixes
In collaboration with: Venkatesh J. V. (venkatesh at netlogicmicro com)
Because the code lacks all the GNU extensions to printf() format stuff,
the compiler doesn't helpfully tell us that I messed up in a previous
commit.
Pointy hat to: adrian, who likely only cares about this because he's the
only one who bothers flipping on net80211 debugging.
uses of the page queues mutex with a new rwlock that protects the page
table and the PV lists. This reduces system time during a parallel
buildworld by 35%.
Reviewed by: alc
within the BAW.
This regression was introduced in ane earlier commit by me to fix the
BAW seqno allocation-but-not-insertion-into-BAW race. Since it was only
ever using the to-be allocated sequence number, any frame retries
with the first frame in the BAW still in the software queue would
have constantly failed, as ni_txseqs[tid] would always be outside
the BAW.
TODO:
* Extract out the mostly common code here in the agg and non-agg ADDBA
case and stuff it into a single function.
PR: kern/166357
Function acquired reader lock if needed.
Assert check for reader or writer lock (RA_LOCKED / RA_UNLOCKED)
- While here, add knlist_init_mtx.9 to MLINKS and fix some style(9) issues
Reviewed by: glebius
Approved by: ae(mentor)
MFC after: 2 weeks
I see traffic stalls.
It turns out that the bug isn't because the first and last frame in the
BAW is in the software queue. It is more likely that it's because
the first frame in the BAW is still in the software queue and thus there's
no more room to allocate and do subsequent TX.
PR: kern/166357
net.inet.ip.fw.tables_max is now read-write.
- Bump IPFW_TABLES_MAX to 65535
Default number of tables is still 128
- Remove IPFW_TABLES_MAX from ipfw(8) code.
Sponsored by Yandex LLC
Approved by: kib(mentor)
MFC after: 2 weeks
XenServer configurations that advertise the multi-page ring extension,
but only allow a single page of ring space.
sys/dev/xen/blkfront/blkfront.c:
If only one page of ring space is being used, do not publish
in the XenStore the number of pages in use (1), via either
of the supported multi-page ring extension schemes.
Single page operation is the same with or without the
ring-page extension being negotiated. Relying on the
legacy behavior avoids an incompatible difference in how
the two ring-page extension schemes that are out in the
wild, deal with the base case of a single page. The
Amazon/Red Hat drivers use the same XenStore variable as
if the extension was not negotiated. The Citrix drivers
assume the new ring reference XenStore variables will be
available
Reported by: Oliver Schonefeld <schonefeld@ids-mannheim.de>
MFC after: 3 days
This is not entirely correct as it simply resets the channel, flushing
whatever is in the TX/RX queue. This can and will break aggregation
BAW tracking. But the alternative (HT40 frames being sent with the hardware
in HT20 mode) is even worse.
There's still a small window between the htinfo being received (and the ni_chw
field being updated) which could cause problems. I'll look at fleshing this
out in follow-up commits.
PR: kern/166286
Currently, a channel width change updates the 802.11n HT info data in
net80211 but it doesn't trigger any device changes. So the device
driver may decide that HT40 frames can be transmitted but the last
device channel set only had HT20 set.
Now, a task is scheduled so a hardware reset or change isn't done
during any active ongoing RX. It also means that it's serialised
with the other task operations (eg channel change.)
This isn't the final incantation of this work, see below.
For now, any unmodified drivers will simply receive a channel
change log entry. A subsequent patch to ath(4) will introduce
some basic channel change handling (by resetting the NIC.)
Other NICs may need to update their rate control information.
TODO:
* There's still a small window at the present moment where the
channel width has been updated but the task hasn't been fired.
The final version of this should likely pass in a channel width
field to the driver and let the driver atomically do whatever
it needs to before changing the channel.
PR: kern/166286
<20120222095239.Horde.0hpYHJjmRSRPRKzXsoFRbYk@webmail.leidinger.net>.
According to some private emails received, it apparently is not unpopular
to use at least Quad GigaSwift cards driven by cas(4) in x86 machines.
MFC after: 1 week
order to avoid otherwise harmless witness warnings when these are acquired
at the same time and due to both using MTX_NETWORK_LOCK as their type.
The right fix actually would be to use different, descriptive types for
these. However, the latter would require undesirable changes to the shared
code base. Another approach would be to just supply NULL as the type, which
was deemed as less desirable though as it would cause the unique but cryptic
name also to be used for the type and to diverge from the type used by other
network device drivers.
MFC after: 1 week
DMA tag with a 4 GB boundary as required by PCI-Express. With r232403 in
place this actually is redundant. However, the host-PCI-Express bridge
driver is the more appropriate place for implementing this restriction.
MFC after: 3 days
recent changes in sys/x86/include/endian.h:
sys/dev/dcons/dcons.c:190:15: error: implicit conversion from '__uint32_t' (aka 'unsigned int') to '__uint16_t' (aka 'unsigned short') changes value from 1684238190 to 28526 [-Werror,-Wconstant-conversion]
buf->magic = ntohl(DCONS_MAGIC);
^~~~~~~~~~~~~~~~~~
sys/sys/param.h:306:18: note: expanded from:
#define ntohl(x) __ntohl(x)
^
./x86/endian.h:128:20: note: expanded from:
#define __ntohl(x) __bswap32(x)
^
./x86/endian.h:78:20: note: expanded from:
__bswap32_gen((__uint32_t)(x)) : __bswap32_var(x))
^
./x86/endian.h:68:26: note: expanded from:
(((__uint32_t)__bswap16(x) << 16) | __bswap16((x) >> 16))
^
./x86/endian.h:75:53: note: expanded from:
__bswap16_gen((__uint16_t)(x)) : __bswap16_var(x)))
~~~~~~~~~~~~~ ^
This is because the __bswapXX_gen() macros (for x86) call the regular
__bswapXX() macros. Since the __bswapXX_gen() variants are only called
when their arguments are constant, there is no need to do that constancy
check recursively. Also, it causes the above error with clang.
Fix it by calling __bswap16_gen() from __bswap32_gen(), and similarly,
__bswap32_gen() from __bswap64_gen().
While here, add extra parentheses around the __bswap16_gen() macro
expansion, to prevent unexpected side effects.
- header and stub .c file for fasttrap module. It's not supported on
MIPS yet, but there is no way to disable support completely
- Do as amd64 trying to limit allocated memory
Note that this driver additionally probes some device IDs for the most
part not know to other MPT drivers, if at all. So rename the macros not
present in mpi_cnfg.h to match the naming scheme in the latter and but
suffix them with a _FB in order to not cause conflicts.
- Like mpt_set_config_regs(), comment out mpt_read_config_regs() as the
content of the registers read isn't actually used and both functions
aren't exactly up to date regarding the possible layouts of the BARs
(these function might be helpful for debugging though, so don't remove
them completely).
- Use DEVMETHOD_END.
- Use NULL rather than 0 for pointers.
- Remove an unusual check for the softc being NULL.
- Remove redundant zeroing of the softc.
- Remove an overly banal and actually partly incorrect as well as partly
outdated comment regarding the allocation of the memory resource.
MFC after: 3 days
appropriate state handling takes place, not doing so results in the
device doing nothing until manual intervention.
Reviewed by: iwasaki
Tested by: iwasaki (iwi)
MFC after: 4 weeks
until domain discovery is complete. This fixes an isci(4) bug on FreeBSD 7.x
where devices weren't always appearing after boot without an explicit rescan.
Sponsored by: Intel
Reported and tested by: <rpokala at panasas dot com>
Reviewed by: scottl
Approved by: scottl
sys/dev/mps/mps_sas.c:861:1: error: function 'mpssas_discovery_timeout' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
mpssas_discovery_timeout(void *data)
^
Because the driver is obtained from upstream, we don't want to modify
it; just silence the warning instead, it is harmless.
MFC after: 3 days
KLD is preloaded with loader(8) and leads to infinity loops.
Also do not return EEXIST error code from MOD_LOAD handler, because
we have undocumented(?) ability replace kernel's module with preloaded one.
And if we have so, then preloaded module will be initialized first.
Thus error in MOD_LOAD handler will be triggered for the kernel.
PR: kern/165573
MFC after: 3 weeks
o Fix buffer overflows when using a long property body in node paths.
o Fix loop end condition when iterating through the symbol table.
o Better error handling during node modification, better problem reporting.
o Eliminate build time warnings.
Submitted by: Lukasz Wojcik
Obtained from: Semihalf
MFC after: 1 week
- Replace MIPS24K-specific code with more generic framework that will
make adding new CPU support easier
- Add MIPS24K support for new framework
- Limit backtrace depth to 1 for stability reasons and add option
HWPMC_MIPS_BACKTRACE to override this limitation
Sycall argument is pointer to array of register_t values. Casting it to
pointer to structure with fields of size smaller then register_t we rely
on compiler-dependent memory layout of structure.
Tested on: mips64 and amd64 systems
The GPL infected parts which were blocking the inclusion of snd_csa
and snd_emu10kx in GENERIC have recently been removed from the tree.
I'm also adding snd_cmi to GENERIC, which I originally intended to
add when we enabled sound support by default.
Discussed with: jhb, pfg, Yuriy Tsibizov <yuriy.tsibizov@gfk.ru>
Approved by: jhb
private to this file. The 'lapics' array was actually shadowing a
completely different 'lapics' array that is private to local_apic.c.
Reported by: bde
MFC after: 2 weeks
kernel.
When access restrictions are added to a page table entry, we flush the
corresponding virtual address mapping from the TLB. In contrast, when
access restrictions are removed from a page table entry, we do not
flush the virtual address mapping from the TLB. This is exactly as
recommended in AMD's documentation. In effect, when access
restrictions are removed from a page table entry, AMD's MMUs will
transparently refresh a stale TLB entry. In short, this saves us from
having to perform potentially costly TLB flushes. In contrast,
Intel's MMUs are allowed to generate a spurious page fault based upon
the stale TLB entry. Usually, such spurious page faults are handled
by vm_fault() without incident. However, when we are executing
no-fault sections of the kernel, we are not allowed to execute
vm_fault(). This change introduces special-case handling for spurious
page faults that occur in no-fault sections of the kernel.
In collaboration with: kib
Tested by: gibbs (an earlier version)
I would also like to acknowledge Hiroki Sato's assistance in
diagnosing this problem.
MFC after: 1 week
along with functions, SYSCTLs and tunables that are not used with
ATA_CAM in #ifndef ATA_CAM, similar to the existing #ifdef'ed ATA_CAM
code for the other way around. This makes it easier to understand
which parts of ata(4) actually are used in the new world order and
to later on remove the !ATA_CAM bits. It also makes it obvious that
there is something fishy with the C-bus front-end as well as in the
ATP850 support, as these used ATA_LOCKING which is defunct in the
ATA_CAM case. When fixing the former, ATA_LOCKING probably needs to
be brought back in some form or other.
Reviewed by: mav
MFC after: 1 week
inp_socket. To avoid panic, do not dereference inp_socket,
but obtain reuse port option from inp_flags2, like this
is done after next call to in_pcblookup_local() a few lines
down below.
Submitted by: rwatson
As of FreeBSD 8, this driver should not be used. Applications that use
posix_openpt(2) and openpty(3) use the pts(4) that is built into the
kernel unconditionally. If it turns out high profile depend on the
pty(4) module anyway, I'd rather get those fixed. So please report any
issues to me.
The pty(4) module is still available as a kernel module of course, so a
simple `kldload pty' can be used to run old-style pseudo-terminals.
longer serve any purpose. Prior to r157446, they served a purpose
because there was a fixed amount of kernel virtual address space
reserved for pv entries at boot time. However, since that change pv
entries are accessed through the direct map, and so there is no limit
imposed by a fixed amount of kernel virtual address space.
Fix a couple of nearby style issues.
Reviewed by: jhb, kib
MFC after: 1 week
AcpiEnterSleepState() executes (optional) _GTS method since ACPICA 20120215
(r231844). To evaluate the method, we need malloc(9), which may sleep.
Reported by: bschmidt
MFC after: 3 days
Enable using the statically embedded blob from the kernel, if present. The KLD
loaded DTB takes precedence, but they are both recognized and handled in the
same way.
Submitted by: Lukasz Wojcik
Obtained from: Semihalf
MFC after: 1 week
is queued to the hardware.
Because multiple concurrent paths can execute ath_start(), multiple
concurrent paths can push frames into the software/hardware TX queue
and since preemption/interrupting can occur, there's the possibility
that a gap in time will occur between allocating the sequence number
and queuing it to the hardware.
Because of this, it's possible that a thread will have allocated a
sequence number and then be preempted by another thread doing the same.
If the second thread sneaks the frame into the BAW, the (earlier) sequence
number of the first frame will be now outside the BAW and will result
in the frame being constantly re-added to the tail of the queue.
There it will live until the sequence numbers cycle around again.
This also creates a hole in the RX BAW tracking which can also cause
issues.
This patch delays the sequence number allocation to occur only just before
the frame is going to be added to the BAW. I've been wanting to do this
anyway as part of a general code tidyup but I've not gotten around to it.
This fixes the PR.
However, it still makes it quite difficult to try and ensure in-order
queuing and dequeuing of frames. Since multiple copies of ath_start()
can be run at the same time (eg one TXing process thread, one TX completion
task/one RX task) the driver may end up having frames dequeued and pushed
into the hardware slightly/occasionally out of order.
And, to make matters more annoying, net80211 may have the same behaviour -
in the non-aggregation case, the TX code allocates sequence numbers
before it's thrown to the driver. I'll open another PR to investigate
this and potentially introduce some kind of final-pass TX serialisation
before frames are thrown to the hardware. It's also very likely worthwhile
adding some debugging code into ath(4) and net80211 to catch when/if this
does occur.
PR: kern/166190
segments.h to a new x86 segments.h.
Add __packed attribute to some structs (just to be sure).
Also make it clear that i386 GDT and LDT entries are used in ia64 code.
than 4GB. Specifically, the inlined version of 'ptoa' of the the 'int'
count of pages overflowed on 64-bit platforms. While here, change
vm_object_madvise() to accept two vm_pindex_t parameters (start and end)
rather than a (start, count) tuple to match other VM APIs as suggested
by alc@.
scheme. The LDM is a logical volume manager for MS Windows NT and it
is also known as dynamic volumes. It supports about 2000 partitions
and also provides the capability for software RAID implementations.
This version implements only partitioning scheme capability and based
on the linux-ntfs project documentation and several publications across
the Web. NOTE: JBOD, RAID0 and RAID5 volumes aren't supported.
An access to the LDM metadata is read-only. When LDM is on the disk
partitioned with MBR we can also destroy metadata. For the GPT
partitioned disks destroy action is not supported.
Reviewed by: ivoras (previous version)
MFC after: 1 month
excluded from superpage promotions. At least one of the reason is
that pv_table is sized for non-fictitious pages only.
Consistently check for the page to be non-fictitious before accesing
superpage pv list.
Sponsored by: The FreeBSD Foundation
Reviewed by: alc
MFC after: 2 weeks
driver is running driver would have already completed flow control
configuration. This change removes unnecessary media changes in
controller reconfiguration cases such that it does not trigger link
reestablishment for configuration change requests like promiscuous
mode change.
Reported by: Many
Tested by: Mike Tancsa <mike <> sentex dot net>
MFC after: 1 week
inserted after the priority token thus cleaning up the output.
- Remove the needless double internal do_add_char function.
- Resolve a possible deadlock if interrupts are
disabled and getnanotime is called
Reviewed by: bde kmacy, avg, sbruno (various versions)
Approved by: cperciva
MFC after: 2 weeks
reg.h with stubs.
The tREGISTER macros are only made visible on i386. These macros are
deprecated and should not be available on amd64.
The i386 and amd64 versions of struct reg have been renamed to struct
__reg32 and struct __reg64. During compilation either __reg32 or __reg64
is defined as reg depending on the machine architecture. On amd64 the i386
struct is also available as struct reg32 which is used in COMPAT_FREEBSD32
code.
Most of compat/ia32/ia32_reg.h is now IA64 only.
Reviewed by: kib (previous version)
manipulation of the pvo_vlink and pvo_olink entries is already protected
by the table lock, so most remaining instances of the acquisition of the
page queues lock can likely be replaced with the table lock, or removed
if the table lock is already held.
Reviewed by: alc
rather than the header file. With this also move RT_MAXFIBS and
RT_NUMFIBS into the implemantion to avoid further usage in other
code. rt_numfibs is all that should be needed.
This allows users to change the number of FIBs from 1..RT_MAXFIBS(16)
dynamically using the tunable without the need to change the kernel
config for the maximum anymore. This means that thet multi-FIB
feature is now fully available with GENERIC kernels.
The kernel option ROUTETABLES can still be used to set the default
numbers of FIBs in absence of the tunable.
Ok.ed by: julian, hrs, melifaro
MFC after: 2 weeks
behaviour on error from write RPC back to behaviour of old nfs client.
When set to not zero, the pages for which write failed are kept dirty.
PR: kern/165927
Reviewed by: alc
MFC after: 2 weeks
if the filesystem performed short write and we are skipping the page
due to this.
Propogate write error from the pager back to the callers of
vm_pageout_flush(). Report the failure to write a page from the
requested range as the FALSE return value from vm_object_page_clean(),
and propagate it back to msync(2) to return EIO to usermode.
While there, convert the clearobjflags variable in the
vm_object_page_clean() and arguments of the helper functions to
boolean.
PR: kern/165927
Reviewed by: alc
MFC after: 2 weeks
field are synchronized, there is no need for pmap_protect() to acquire
the page queues lock unless it is going to access the pv lists.
Reviewed by: kib
These are needed for some particular port configurations where the default
speed isn't suitable for all link speed types. (Ie, changing 10/100/1000MBit
PLL rate requires a similar MII clock rate, rather than a fixed MII rate.)
This is:
* only currently implemented for the ar71xx;
* isn't used anywhere (yet), as the final interface for this hasn't yet
been determined.
* printf -> device_printf
* print the buffer pointer and sequence number for any buffer that wasn't
correctly tidied up before it was freed. This is to aid in some
current SMP TX debugging stalls.
PR: kern/166190
Remove FPU types from compat/ia32/ia32_reg.h that are no longer needed.
Create machine/npx.h on amd64 to allow compiling i386 code that uses
this header.
The original npx.h and fpu.h define struct envxmm differently. Both
definitions have been included in the new x86 header as struct __envxmm32
and struct __envxmm64. During compilation either __envxmm32 or __envxmm64
is defined as envxmm depending on machine architecture. On amd64 the i386
struct is also available as struct envxmm32.
Reviewed by: kib
- Merge r232744 changes to pc98.
(Allow a kernel to be built with 'nodevice atpic'.)
- Move ICU related defines from x86/isa/atpic.c to x86/isa/icu.h and
use them in x86/x86/intr_machdep.c.
Reviewed by: jhb
or look them up individually in pmap_remove() and apply the same logic
in the other ranged operation (pmap_protect). This speeds up make
installworld by a factor of 2 on powerpc64.
MFC after: 1 week
For example, some BIOS for AMD SB600 south bridge may map HPET MMIO base
address as a memory BAR for SMBus controller depending on a PM register
configuration. Before r231161 (and r232086, subsequent MFC to stable/9),
it was not fatal but hpet(4) just failed to attach. Since we probe and
attach HPET earlier than PCI devices now, it caused unfortunate hard lockup.
With this patch, it does not hang any more and HPET works at the same time.
Clean up some style nits while I am in the neighborhood.
PR: kern/165647
Reviewed by: jhb
MFC after: 3 days
Doomed vnode is hardly of any use here, besides all callers handle error
case. vfs_hash_get() does the same.
Don't mess with vnode holdcount, vget() takes care of it already.
Approved by: mdf (mentor)
to match reality: clang does _not_ disable SSE automatically when
-mno-mmx is used, you have to specify -mno-sse explicitly.
Note this was the case even before r232894, which only makes a change in
the 'positive' flag case; e.g. when you specify -msse, MMX gets enabled
too.
MFC after: 1 week
hardclock() tick should be run on every active CPU, or on only one.
On my tests, avoiding extra interrupts because of this on 8-CPU Core i7
system with HZ=10000 saves about 2% of performance. At this moment option
implemented only for global timers, as reprogramming per-CPU timers is
too expensive now to be compensated by this benefit, especially since we
still have to regularly run hardclock() on at least one active CPU to
update system uptime. For global timer it is quite trivial: timer runs
always, but we just skip IPIs to other CPUs when possible.
Option is enabled by default now, keeping previous behavior, as periodic
hardclock() calls are still used at least to implement setitimer(2) with
ITIMER_VIRTUAL and ITIMER_PROF arguments. But since default schedulers don't
depend on it since r232917, we are much more free to experiment with it.
MFC after: 1 month
with HZ rate through the sched_tick() calls from hardclock().
Potentially it can be used to improve precision, but now it is just minus
one more reason to call hardclock() for every HZ tick on every active CPU.
SCHED_4BSD never used sched_tick(), but keep it in place for now, as at
least SCHED_FBFS existing in patches out of the tree depends on it.
MFC after: 1 month
function.
From the submitter:
This patch fixes an issue I encountered using an NFS root with an
ar71xx-based MikroTik RouterBoard 450G on -current where the kernel fails
to contact a DHCP/BOOTP server via if_arge when it otherwise should be able
to. This may be the same issue that Monthadar Al Jaberi reported against
an RSPRO on 6 March, as the signature is the same:
%%%
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255
.
.
.
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255
arge0: initialization failed: no memory for rx buffers
DHCP/BOOTP timeout for server 255.255.255.255
arge0: initialization failed: no memory for rx buffers
%%%
The primary issue that I found is that the DHCP/BOOTP message that
bootpc_call() is sending never makes it onto the wire, which I believe is
due to the following:
- Last December, a change was made to the ifioctl that bootpc_call() uses
to adjust the netmask around the sosend().
- The new ioctl (SIOCAIFADDR) performs an if_init when invoked, whereas the
old one (SIOCSIFNETMASK) did not.
- if_arge maintains its own sense of link state in sc->arge_link_status.
- On a single-phy interface, sc->arge_link_status is initialized to 0 in
arge_init_locked().
- sc->arge_link_status remains 0 until a phy state change notification
causes arge_link_task to run, notice the link is up, and set it to 1.
- The inits caused by the ifioctls in bootpc_call are reinitializing the
interface, but not the phy, so sc->arge_link_status goes to 0 and remains
there.
- arge_start_locked() always sees sc->arge_link_status == 0 and returns
without queuing anything.
The attached patch changes arge_init_locked() such that in the single-phy
case, instead of initializing sc->arge_link_status to 0, it runs
arge_link_task() to set it according to the current phy state. This change
has allowed my setup to mount an NFS root successfully.
Submitted by: Patrick Kelsey <kelsey@ieee.org>
Reviewed by: juli
I had some interesting hangs until I realised I should try flushing the
DDR FIFO register and lo and behold, hangs stopped occuring.
I've put in a few DDR flushes here and there in case people decide to
reuse some of these functions. It's very very likely they're almost
all superflous.
To test:
* Connect to a network with a _lot_ of broadcast traffic
* Do this:
# while true; do ifconfig arge0 down; ifconfig arge0 up; done
This fixes the mbuf exhaustion that has been reported when the interface
state flaps up/down.
required for the ABI the kernel is being built for.
XXX This is implemented in a kind-of nasty way that involves including source
files, but it's still an improvement.
o) Retire ISA_* options since they're unused and were always wrong.
* enable ALQ and net80211/ath ALQ logging by default, to make it possible
to get debug register traces.
* Update some comments
* Enable HWPMC for testing.
remaining drivers that haven't been converted have various problems or
complexities that will be dealt with later. This list includes:
hptrr, hptmv, hpt27xx - device aggregation across multiple parents
drm - want to talk to the maintainer first
tsec, sec - Openfirmware devices, not sure if changes are warranted
fatm - Done except for unused testing code
usb - want to talk to the maintainer first
ce, cp, ctau, cx - Significant driver changes needed to convey parent info
There are also devices tucked into architecture subtrees that I'll leave
for the respective maintainers to deal with.
add in the netgraph interface to the list of
acceptable interfaces. A todo at the next
IETF code blitz, though is we need to review
why we screen interfaces, there was a reason ;-).
PR: 165210
MFC after: 1 week
- Add support for IPv6 and interface extended tables
- Make number of tables to be loader tunable in range 0..65534.
- Use IP_FW3 opcode for all new extended table cmds
No ABI changes are introduced. Old userland will see valid tables for
IPv4 tables and no entries otherwise. Flush works for any table.
IP_FW3 socket option is used to encapsulate all new opcodes:
/* IP_FW3 header/opcodes */
typedef struct _ip_fw3_opheader {
uint16_t opcode; /* Operation opcode */
uint16_t reserved[3]; /* Align to 64-bit boundary */
} ip_fw3_opheader;
New opcodes added:
IP_FW_TABLE_XADD, IP_FW_TABLE_XDEL, IP_FW_TABLE_XGETSIZE, IP_FW_TABLE_XLIST
ipfw(8) table argument parsing behavior is changed:
'ipfw table 999 add host' now assumes 'host' to be interface name instead of
hostname.
New tunable:
net.inet.ip.fw.tables_max controls number of table supported by ipfw in given
VNET instance. 128 is still the default value.
New syntax:
ipfw add skipto tablearg ip from any to any via table(42) in
ipfw add skipto tablearg ip from any to any via table(4242) out
This is a bit hackish, special interface name '\1' is used to signal interface
table number is passed in p.glob field.
Sponsored by Yandex LLC
Reviewed by: ae
Approved by: ae (mentor)
MFC after: 4 weeks
implementations or no implementation on all platforms.
Some of these functions might be good ideas, but their semantics were unclear
given the lack of implementation, and an unlucky porter could be fooled into
trying to implement them or, worse, being baffled when something like
platform_trap_enter() failed to be called.
machine word. For example, it turns CPU_SET() into expected shift and OR,
removing two extra shifts and additional index on memory access.
Generated code checked for kernel (optimized) and user-level (unoptimized)
cases with GCC and CLANG.
Reviewed by: attilio
MFC after: 2 weeks
updated.
o Number of times NIC ran out of RX buffer descriptors
o Number of inbound packet errors
o Number of inbound packets that were chosen to be discarded
Previously only the discarded packet counter was used to update
if_ierrors. This change fixes wrong if_ierrors counter on
BCM570[0-4] controllers. For BCM5705 and later controllers bge(4)
already correctly counted it.
Reported by: Eugene Grosbein <egrosbein <> rdtc dot ru>
device in device attach. This would help to narrow down issue to a
specific controller and operating mode of the controller.
While I'm here rename BGE_MISCCFG_BOARD_ID with
BGE_MISCCFG_BOARD_ID_MASK.
AMD-8131 PCI-X bridge. The bridge seems to reorder write access to
mailbox registers such that it caused watchdog timeouts by
out-of-order TX completions.
Tested by: Michael L. Squires <mikes <> siralan dot org >
Reviewed by: jhb
- Pass interrupt trapframe for handlers dow the chain
- Add PMC interrupt handler
PMC interrupt is a special case, so we want handle it as soon as possible
with minimum overhead. So we handle it apb filter routine.
bootable image.
The kernel has to fit inside an 896KiB area in a 4MB SPI flash.
So a bunch of stuff can't be included (and more is to come), including
(unfortunately) IPv6.
TODO:
* GPIO modules need to be created
* Shrink the image a bit more by removing some of the CAM layer debugging
strings.
While there, make some style adjustments, like missed () around
return values.
Submitted by: bde
Reviewed by: mckusick
Tested by: pho
MFC after: 2 weeks
The bawrite() schedules the write to happen immediately, and its use
frees the current thread to do more cleanups.
Submitted by: bde
Reviewed by: mckusick
Tested by: pho
MFC after: 2 weeks
Synchronous inode block update is not needed for MNT_LAZY callers (syncer),
and since waitfor values are not zero, code did unneccessary synchronous
update.
Submitted by: bde
Reviewed by: mckusick
Tested by: pho
MFC after: 2 weeks
PCP and CFI fields.
* Ethernet_type for VLAN encapsulation is tunable, default is 0x8100;
* PCP (Priority code point) and CFI (canonical format indicator) is
tunable per VID;
* Tunable encapsulation to support 802.1q
* Encapsulation/Decapsulation code improvements
New messages have been added for this netgraph node to support the
new features.
However, the legacy "vlan" id is still supported and compiled in by
default. It can be disabled in a future release.
TODO:
* Documentation
* Examples
PR: kern/161908
Submitted by: Ivan <rozhuk.im@gmail.com>
- add the macro NETMAP_RING_FIRST_RESERVED() which returns
the index of the first non-released buffer in the ring
(this is useful for code that retains buffers for some time
instead of processing them immediately)