to simply switch rather than lowering priority and switching. This allows
threads of equal priority to run but not lesser priority.
Discussed with: davidxu
Reported by: NIIMI Satoshi <sa2c@sa2c.net>
Approved by: re
critical_exit() owepreempt check. ULE will always use owepreempt to
preempt the idle thread. This change does not effect 4BSD since it will
never set owepreempt without PREEMPTION enabled.
- Remove some unused code from choosethread().
Discussed with: jhb
Approved by: re
it must first ensure that the page is no longer mapped. This is
trivially accomplished by calling pmap_remove_all() a little earlier
in vm_page_cache(). While I'm in the neighborbood, make a related
panic message a little more useful.
Approved by: re (kensmith)
Reported by: Peter Holm and Konstantin Belousov
Reviewed by: Konstantin Belousov
a consequence of sparc64/sparc64/vm_machdep.c revision 1.76. It occurs
when uma_small_free() frees a page. The solution has two parts: (1) Mark
pages allocated with VM_ALLOC_NOOBJ as PG_UNMANAGED. (2) Defer the lock
assertion in pmap_page_is_mapped() until after PG_UNMANAGED is tested.
This is safe because both PG_UNMANAGED and PG_FICTITIOUS are immutable
flags, i.e., they do not change state between the time that a page is
allocated and freed.
Approved by: re (kensmith)
PR: 116794
TCP: [X.X.X.X]:X to [X.X.X.X]:X tcpflags 0x18<PUSH,ACK>; tcp_do_segment: FIN_WAIT_2: Received data after socket was closed, sending RST and removing tcpcb
So that it also includes how many bytes of data were received. It now looks
like this:
TCP: [X.X.X.X]:X to [X.X.X.X]:X tcpflags 0x18<PUSH,ACK>; tcp_do_segment: FIN_WAIT_2: Received X bytes of data after socket was closed, sending RST and removing tcpcb
Approved by: re (gnn)
not being independently freeable. This allows one to embed an mbuf in
the cluster itself. This confers the benefits of the packet zone on
all cluster sizes. Embedded mbufs currently suffer from the same
limitation that packet zone mbufs do in that one cannot disconnect
them and pass them around independently of the cluster. It would
likely be possible to eliminate this limitation in the future by
adding a second reference for the mbuf itself.
Approved by: re(gnn)
problems with the syncache, it produces a lot of console noise and has led
to quite a few false positive bug reports. It can be selectively
re-enabled when debugging specific problems by frobbing the same sysctl.
Discussed with: silby
Approved by: re (gnn)
directory itself (rather than any of its contents) is visible to the
current thread.
MFC after: 1 week
PR: kern/90063
Submitted by: john of 8192.net
Approved by: re (kensmith)
with all functions supported. This is done adding usb device IDs
to the table of recognised devices (because there is no standard
'scanner' class, so no other way to recognise them), and with
a small change to the uscanner attach routine that prevents
reconfiguring the whole USB device while we are dealing only with
one of its USB interfaces.
The latter part has been suggested by Steinar Hamre in
http://www.freebsd.org/cgi/query-pr.cgi?pr=107665 , i have
only added a bit of explaination to the code.
I have personally tried this on the Epson DX-5050 and DX-6000
devices (on the US market they have different names, CX-something).
I have good reasons to think that, possibly with the mere addition
of more USB ids to the table in uscanner.c, this should work with
all Epson multifunction devices in that family (from DX-3800 to
DX-7000 - these units are in the 50-120$ price range).
More details on related topics (SANE configuration, OCR, etc.)
at http://info.iet.unipi.it/~luigi/FreeBSD/dx5050.html
Manpage updates coming soon.
Approved by: re, imp
MFC after: 3 days
UDMA modes.
Please notice that Soekris NET5501 bios versions before 1.32f has a bug
that prevents this from working.
Approved by: re (gnn)
MFC: 2 weeks
Before that fix, it was possible for the function to fail if number
of sharers changes between 'x = sx->sx_lock' step and atomic_cmpset_acq_ptr()
call.
This fixes ZFS problem when ZFS returns strange EIO errors under load.
In ZFS there is a code that depends on the fact that sx_try_slock() can
only fail if there is an exclusive owner.
Discussed with: attilio
Reviewed by: jhb
Approved by: re (kensmith)
after the switch leads to a race where the outgoing thread still owns
the local queue lock while another cpu may switch it in. This race
is only possible on machines where cpu_switch can take significantly
longer on different cpus which in practice means HTT machines with
unfair thread scheduling algorithms.
Found by: kris (of course)
Approved by: re
starvation caused by unbalanced interrupt loads.
- Change the rebalancer to work on stathz ticks but retain randomization.
- Simplify locking in tdq_idled() to use the tdq_lock_pair() rather than
complex sequences of locks to avoid deadlock.
Reported by: kris
Approved by: re
retransmittion by handover event (fast mobility code)
- Fixed problem of mobility code which is caused by remaining
parameters in the deleted primary destination.
- Add a missing lock. When a peer sends an INIT, and while we
are processing it to send an INIT-ACK the socket is closed,
we did not hold a lock to keep the socket from going away.
Add protection for this case.
- Fix so that arwnd is alway uses the minimal rwnd if the user
has set the socket buffer smaller. Found this when the test
org decided to see what happens when you set in a rwnd of 10
bytes (which is not allowed per RFC .. 4k is minimum).
- Fixes so a cookie-echo ootb will NOT cause an abort to
be sent. This was happening in a MPI collision case.
- Examined all panics and unless there was no recovery, moved
any that were not already to INVARANTS.
Approved by: re@freebsd.org (gnn)
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.
Suggested by: jhb
Reviewed by: grehan, jhb, marcel
Approved by: re (kensmith), jhb (PCI maintainer hat)
After discussion with Sam, switch back to use firmware(9) instead of
having the firmware in hex format.
Put the binary firmware uuencoded into sys/contrib/dev/npe, and slap a
LICENSE file, as found on the Intel website.
Approved by: re (blanket), mux (mentor)
MFC After: 1 week
Without this change the following situation was possible:
1. Provider is orphaned from within class' access() method on last write
close - orphan provider event is send.
2. GEOM detects last write close on a provider and sends new provider event.
3. g_orphan_register() is called, and calls all orphan methods of attached
consumers.
4. New provider event is executed on orphaned provider, all classes can
taste already orphaned provider, and some may attach consumers to it.
Those consumers will never go away, because the g_orphan_register()
was already called.
We end up with a zombie provider.
With this change, at step 3, we will cancel new provider event.
How to repeat this problem:
# mdconfig -a -t malloc -s 10m
# geli init -i 0 md0
# geli attach md0
# newfs -L test /dev/md0.eli
# mount /dev/ufs/test /mnt/tmp
# geli detach -l md0.eli
# umount /mnt/tmp
# glabel status
Name Status Components
ufs/test N/A N/A
Reviewed by: phk
Approved by: re (kensmith)
value for kern.sched.preempt_thresh appropriately. It can still by
adjusted at runtime. ULE will still use IPI_PREEMPT in certain
migration situations.
- Assert that we're not trying to compile ULE on an unsupported
architecture. To date, I believe only i386 and amd64 have implemented
the third cpu switch argument required.
Approved by: re
cache: vm_object_page_remove() should convert any cached pages that
fall with the specified range to free pages. Otherwise, there could
be a problem if a file is first truncated and then regrown.
Specifically, some old data from prior to the truncation might reappear.
Generalize vm_page_cache_free() to support the conversion of either a
subset or the entirety of an object's cached pages.
Reported by: tegge
Reviewed by: tegge
Approved by: re (kensmith)
to gem_attach() as the former access softc members not yet initialized
at that time and gem_reset() actually is enough to stop the chip. [1]
o Revise the use of gem_bitwait(); add bus_barrier() calls before calling
gem_bitwait() to ensure the respective bit has been written before we
starting polling on it and poll for the right bits to change, f.e. even
though we only reset RX we have to actually wait for both GEM_RESET_RX
and GEM_RESET_TX to clear. Add some additional gem_bitwait() calls in
places we've been missing them according to the GEM documentation.
Along with this some excessive DELAYs, which probably only were added
because of bugs in gem_bitwait() and its use in the first place, as
well as as have of an gem_bitwait() reimplementation in gem_reset_tx()
were removed.
o Add gem_reset_rxdma() and use it to deal with GEM_MAC_RX_OVERFLOW errors
more gracefully as unlike gem_init_locked() it resets the RX DMA engine
only, causing no link loss and the FIFOs not to be cleared. Also use it
deal with GEM_INTR_RX_TAG_ERR errors, with previously were unhandled.
This was based on information obtained from the Linux GEM and OpenSolaris
ERI drivers.
o Turn on workarounds for silicon bugs in the Apple GMAC variants.
This was based on information obtained from the Darwin GMAC and Linux GEM
drivers.
o Turn on "infinite" (i.e. maximum 31 * 64 bytes in length) DMA bursts.
This greatly improves especially RX performance.
o Optimize the RX path, this consists of:
- kicking the receiver as soon as we've a spare descriptor in gem_rint()
again instead of just once after all the ready ones have been handled;
- kicking the receiver the right way, i.e. as outlined in the GEM
documentation in batches of 4 and by pointing it to the descriptor
after the last valid one;
- calling gem_rint() before gem_tint() in gem_intr() as gem_tint() may
take quite a while;
- doubling the size of the RX ring to 256 descriptors.
Overall the RX performance of a GEM in a 1GHz Sun Fire V210 was improved
from ~100Mbit/s to ~850Mbit/s.
o In gem_add_rxbuf() don't assign the newly allocated mbuf to rxs_mbuf
before calling bus_dmamap_load_mbuf_sg(), if bus_dmamap_load_mbuf_sg()
fails we'll free the newly allocated mbuf, unable to recycle the
previous one but a NULL pointer dereference instead.
o In gem_init_locked() honor the return value of gem_meminit().
o Simplify gem_ringsize() and dont' return garbage in the default case.
Based on OpenBSD.
o Don't turn on MAC control, MIF and PCS interrupts unless GEM_DEBUG is
defined as we don't need/use these interrupts for operation.
o In gem_start_locked() sync the DMA maps of the descriptor rings before
every kick of the transmitter and not just once after enqueuing all
packets as the NIC might instantly start transmitting after we kicked
it the first time.
o Keep state of the link state and use it to enable or disable the MAC
in gem_mii_statchg() accordingly as well as to return early from
gem_start_locked() in case the link is down. [3]
o Initialize the maximum frame size to a sane value.
o In gem_mii_statchg() enable carrier extension if appropriate.
o Increment if_ierrors in case of an GEM_MAC_RX_OVERFLOW error and in
gem_eint(). [3]
o Handle IFF_ALLMULTI correctly; don't set it if we've turned promiscuous
group mode on and don't clear the flag if we've disabled promiscuous
group mode (these were mostly NOPs though). [2]
o Let gem_eint() also report GEM_INTR_PERR errors.
o Move setting sc_variant from gem_pci_probe() to gem_pci_attach() as
device probe methods are not supposed to touch the softc.
o Collapse sc_inited and sc_pci into bits for sc_flags.
o Add CTASSERTs ensuring that GEM_NRXDESC and GEM_NTXDESC are set to
legal values.
o Correctly set up for 802.3x flow control, though #ifdef out the code
that actually enables it as this needs more testing and mainly a proper
framework to support it.
o Correct and add some conversions from hard-coded functions names to
__func__ which were borked or forgotten in if_gem.c rev. 1.42.
o Use PCIR_BAR instead of a homegrown macro.
o Replace sc_enaddr[6] with sc_enaddr[ETHER_ADDR_LEN].
o In gem_pci_attach() in case attaching fails release the resources in
the opposite order they were allocated.
o Make gem_reset() static to if_gem.c as it's not needed outside that
module.
o Remove the GEM_GIGABIT flag and the associated code; GEM_GIGABIT was
never set and the associated code was in the wrong place.
o Remove sc_mif_config; it was only used to cache the contents of the
respective register within gem_attach().
o Remove the #ifdef'ed out NetBSD/OpenBSD code for establishing a suspend
hook as it will never be used on FreeBSD.
o Also probe Apple Intrepid 2 GMAC and Apple Shasta GMAC, add support for
Apple K2 GMAC. Based on OpenBSD.
o Add support for Sun GBE/P cards, or in other words actually add support
for cards based on GEM to gem(4). This mainly consists of adding support
for the TBI of these chips. Along with this the PHY selection code was
rewritten to hardcode the PHY number for certain configurations as for
example the PHY of the on-board ERI of Blade 1000 shows up twice causing
no link as the second incarnation is isolated.
These changes were ported from OpenBSD with some additional improvements
and modulo some bugs.
o Add code to if_gem_pci.c allowing to read the MAC-address from the VPD on
systems without Open Firmware.
This is an improved version of my variant of the respective code in
if_hme_pci.c
o Now that gem(4) is MI enable it for all archs.
Pointed out by: yongari [1]
Suggested by: rwatson [2], yongari [3]
Tested on: i386 (GEM), powerpc (GMACs by marcel and yongari),
sparc64 (ERI and GEM)
Reviewed by: yongari
Approved by: re (kensmith)
33MHz for calculating the latency timer values for its children.
Inspired by NetBSD doing the same and Linux as well as OpenSolaris
using a similar approach.
While at it rename a variable and change its type to be more
appropriate fuer values of PCI properties so the variable can be
more easily reused.
- Initialize the cache line size register of PCI devices to a
legal value; the cache line size is limited to 64 bytes by the
Fireplane/Safari, JBus and UPA interconnection busses. Setting
it to an unsupported value caused bad performance at least with
GEM as it causes them to not do cache line bursts and to not
issue cache line commands on the PCI bus.
Approved by: re (kensmith)
MFC after: 1 week
timeout occurring at exactly the same time. If this happens, the nfsiod
exits although there may be a queued async IO request for it.
Found by : Kris Kennaway
Approved by: re
ways:
(1) Cached pages are no longer kept in the object's resident page
splay tree and memq. Instead, they are kept in a separate per-object
splay tree of cached pages. However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock. Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.
This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held. Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.
Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case. Cached pages
are reclaimed far, far more often than they are reactivated. Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.
(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.
Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated. Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page. Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.
Discussed with: many over the course of the summer, including jeff@,
Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
This patch was part of ACPI-CA 20070508 release and the
following is excerpt from its change log:
Fixed a problem where the Global Lock handle was not properly
updated if a thread that acquired the Global Lock via executing
AML code then attempted to acquire the lock via the
AcpiAcquireGlobalLock interface. Reported by Joe Liu.
Approved by: re (kensmith)
Tested by: ambrisko
Obtained from: Intel
polling/interrupt-driven fallback and instead use polling only during
boot and pure interrupt-driven mode after boot. Polled mode could be
relegated completely to a legacy role if we could enable interrupts
during boot. Polled mode can be forced after boot by setting
debug.acpi.ec.polled="1", i.e. if there are timeouts.
- Use polling only during boot, shutdown, or if requested by the user.
Otherwise, use a generation count of GPEs, incremented atomically. This
prevents an old status value from being used if the EC is really slow
and the same condition (i.e. multiple IBEs for a write transaction) is
being checked.
- Check for and run the query handler directly if the SCI bit is set in
the status register during boot. Previously, the query handler wouldn't
run until interrupts were finally enabled late in boot.
- During boot and after starting a command, check if the event appears
to already have occurred before we even start waiting. If so, it's
possible the EC is very slow and we might accept an old status value.
Print a warning in this case. Once we've booted, interrupt-driven mode
should work just fine but polled mode could be unreliable. There's not
much more we can do about this until interrupts are enabled during boot.
- In the above case, we also do one final check if the interrupt-driven
mode gets a timeout. If the status is complete, it will force the
system back into polled mode since interrupt mode doesn't work. For
polled mode during boot, if the status appears to be already complete
before beginning the check loop, it waits 10 us before actually checking
the status, just in case the EC is really slow and hasn't gotten to work
on the new request yet.
- Use upper-case hex for the _Qxx method
- Use device_printf for errors, don't hide them under verbose
- Increase default total timeout to 750 ms and decrease polling interval
to 5 us.
- Don't pass the status value via the softc. Just read it directly.
- Remove the mutex. We use the sx lock for transaction serialization
with the query handler.
- Remove the Intel copyright notice as no code of theirs was ever
present in this file (verified against rev 1.1)
- Allow KTR module-only builds for ease of testing
Thanks to jkim and Alexey Starikovskiy for helpful discussions and testing.
Approved by: re
MFC after: 2 weeks
- Reintegrate the ANSI C function declaration change
from tcp_timer.c rev 1.92
- Reorganize the tcpcb structure so that it has a single
pointer to the "tcp_timer" structure which contains all
of the tcp timer callouts. This change means that when
the single tcp timer change is reintegrated, tcpcb will
not change in size, and therefore the ABI between
netstat and the kernel will not change.
Neither of these changes should have any functional
impact.
Reviewed by: bmah, rrs
Approved by: re (bmah)