stopped before adjusting their priority and setting them on the run
q so they cannot race for resources (pointed out by njl).
While here add a console printf on thread create fails; otherwise
noone may notice (e.g. return value is always 0 and caller has no
way to verify).
Reviewed by: jhb, scottl
MFC after: 2 weeks
client into the kernel by default, and many users won't use NFS,
don't start an extra 4 kernel threads that are unused. Once NFS
becomes active, it will start nfsiod's as it needs them.
We might consider mandating a minimum iod's equal to the number of
active NFS mounts (truncated to some value), which would force some
to remain available without having to create a new one if the file
system is mostly inactive.
PR: 70880
MFC after: 2 weeks
Prodded by: cel
Head nod: peter
Pointed out by: Joe <fbsd_user at a1poweruser dot com>
was done, I believe, to work around some cards having issues in the
suspend case. I think that this helped my Sony VAIO TS505 work better
when it had certain wireless cards in it and I did a apm -z. I've not
tested suspend/resume on other laptops in a long time, so I hope this
doesn't cause greif. Please let me know if it does.
of cases where we didn't take out the lock before setting or clearing
a bit. This apparently can lead to a race at kldunload time (at least
on my Turion64 laptop, never saw it on my Sony Vaio).
This version of scsi_target.c removes all SMP locking until
we have a lock-aware CAM stack. This allows us to use KNOTE
without a panic at least.
It's not yet clear whether target mode is working yet or not.
Discussed with: Scott, Ken, Nate, Justin
return to user space w/o waiting for I/O to complete.
I tried to get several folks who know this code better than me to review it
with no luck. I *do* know that w/o this code, using the SCSI target driver
panics in userret (if it doesn't panic in knote first).
if a process's uid or gid has changed, but the /proc/<PID> directory
itself was also set to mode 0. Assuming this doesn't open any
security holes, open access to the /proc/<PID> directory for users
other than root to read or search the directory.
Reviewed by: des (back in February)
MFC after: 3 weeks
Since tags are kept while packet resides in kernelspace, it's possible to
use other kernel facilities (like netgraph nodes) for altering those tags.
Submitted by: Andrey Elsukov <bu7cher at yandex dot ru>
Submitted by: Vadim Goncharov <vadimnuclight at tpu dot ru>
Approved by: glebius (mentor)
Idea from: OpenBSD PF
MFC after: 1 month
EHCI spec for linking in new qTDs into an asynchronous QH. This
requires that there is a qTD marked as not active and not halted
at the start of the QH's list, and the hardware will know to re-fetch
the qTD on each pass rather than just looking at the overlay qTD:
"The host controller must be able to advance the queue from the
Fetch QH state in order to avoid all hardware/software race
conditions. This simple mechanism allows software to simply link
qTDs to the queue head and activate them, then the host controller
will always find them if/when they are reachable."
This is achieved by keeping an "inactivesqtd" entry on the QH list,
and re-using it each time as the start of the next transfer, and
allocating a new qTD to become the next inactivesqtd. Then a new
transfer can be activated by just setting its "active" flag, which
avoids all the previous messing with overlay qTD state in
ehci_set_qh_qtd().
mimicing the NFS reference implementation.
NFS over TCP does not need fast retransmit timeouts, since network loss
and congestion are managed by the transport (TCP), unlike with NFS over
UDP. A long timeout prevents the unnecessary retransmission of non-
idempotent NFS requests.
Reviewed by: mohans, silby, rees?
Sponsored by: Network Appliance, Incorporated
the estimator to be more easily tuned and maintained.
There should be no functional change except there is now a lower limit
on the retransmit timeout to prevent the client from retransmitting
faster than the server's disks can fill requests, and an upper limit
to prevent the estimator from taking to long to retransmit during a
server outage.
Reviewed by: mohan, kris, silby
Sponsored by: Network Appliance, Incorporated
before starting exploring (4 seconds), and extend the wait period
if new USB buses are attached while waiting.
This works around a problem seen when there is more than one EHCI
controller in the system and you kldload usb.ko after the system
has booted. The problem is that usb.ko contains 3 separate PCI
drivers which get initialised one by one (uhci, ohci, ehci), and
when each driver is initialised, all PCI buses are re-probed after
just the addition of that driver. This means that there can be a
significant delay between the attaching of a companion controller
and the subsequent EHCI attach, so it is possible for the companion
controller's USB 1.x bus to be scanned before the EHCI driver gets
a chance to check if there is really a USB 2.x device connected.
- Rename REG_DL to REG_DLL and REG_DLH.
- Always treat DLL and DLH as two separate 8-bit registers instead of one
16-bit register.
Additionally, remove the probe for the high 4 bits of IER being 0 and don't
assume we can always read/write 0 to/from those bits.
These changes allow uart(4) to drive the UARTs on the Intel XScale PXA255.
Reviewed by: marcel
- Skip PnP devices as some wedge when trying to probe them as C-NET(98)S.
This fix makes le(4) actually work with the C-NET(98)S.
Reviewed by: marius
Tested by: Watanabe Kazuhiro < CQG00620 at nifty dot ne dot jp >
Checking if the queues are empty is not enough for the crypto_proc thread
(it is enough for the crypto_ret_thread), because drivers can be marked
as blocked. In a situation where we have operations related to different
crypto drivers in the queue, it is possible that one driver is marked as
blocked. In this case, the queue will not be empty and we won't wakeup
the crypto_proc thread to execute operations for the others drivers.
Simply setting a global variable to 1 when we goes to sleep and setting
it back to 0 when we wake up is sufficient. The variable is protected
with the queue lock.
Before the change if the thread was working on symmetric operation, we
would send unnecessary wakeup after adding asymmetric operation (when
asym queue was empty) and vice versa.
twice if we call crypto_kinvoke() from crypto_proc thread.
This change also removes unprotected access to cc_kqblocked field
(CRYPTO_Q_LOCK() should be used for protection).
where crypto_invoke() returns ERESTART and before we set cc_qblocked to 1,
crypto_unblock() is called and sets it to 0. This way we mark device as
blocked forever.
Fix it by not setting cc_qblocked in the fast path and by protecting
crypto_invoke() in the crypto_proc thread with CRYPTO_Q_LOCK().
This won't slow things down, because there is no contention - we have
only one crypto thread. Actually it can be slightly faster, because we
save two atomic ops per crypto request.
The fast code path remains lock-less.
Be cognizant as to whether we're running 2KLogin f/w in target mode and
do the appropriate loopid load based upon that.
Do a first cut (seems to work, at least for amd64) at 64 bit target
mode for fibre channel cards. We could probably also do it for SPI
cards, but that's not supported right now.
report this as an allocation failure for the item type. The failure
will be separately recorded with the bucket type. This my eliminate
high mbuf allocation failure counts under some circumstances, which
can be alarming in appearance, but not actually a problem in
practice.
MFC after: 2 weeks
Reported by: ps, Peter J. Blok <pblok at bsd4all dot org>,
OxY <oxy at field dot hu>,
Gabor MICSKO <gmicskoa at szintezis dot hu>
"Why didn't he use SECASVAR_LOCK()/SECASVAR_UNLOCK() macros to
synchronize access to the secasvar structure's fields?" one may ask.
There were two reasons:
1. refcount(9) is faster then mutex(9) synchronization (one atomic
operation instead of two).
2. Those macros are not used now at all, so at some point we may decide
to remove them entirely.
OK'ed by: gnn
MFC after: 2 weeks
- Linux ioctl support, with the other Linux changes MegaCli
will run if you mount linprocfs & linsysfs then set
sysctl compat.linux.osrelease=2.6.12 or similar. This works
on i386. It should work on amd64 but not well tested yet.
StoreLib may or may not work. Remember to kldload mfi_linux.
- Add in AEN (Async Event Notification) support so we can
get messages from the firmware when something happens.
Not all messages are in defined in event detail. Use
event_log to try to figure out what happened.
- Try to implement something like SIGIO for StoreLib. Since
mrmonitor doesn't work right I can't fully test it. StoreLib
works best with the rh9 base. In theory mrmonitor isn't
needed due to native driver support of AEN :-)
Now we can configure and monitor the RAID better.
Submitted by: IronPort Systems.
full, kick the binary blob to force it to complete any pending tx
completions.
- In the watchdog routine, only reset the chip if the blob doesn't complete
any pending tx completions rather than requiring it to complete all of
the pending tx completions.
Submitted by: Nathan Whitehorn <nathanw@uchicago.edu>
MFC after: 2 weeks
a defensive programming measure.
Note that whilst these members are not used by the ip_output()
path, we are passing an instance of struct ip_moptions here
which is declared on the stack (which could be considered a
bad thing).
ip_output() does not consume struct ip_moptions, but in case it
does in future, declare an in_multi vector on the stack too to
behave more like ip_findmoptions() does.
lnc(4) on PC98 and i386. The ISA front-end supports the same non-PNP
network cards as lnc(4) did and additionally a couple of PNP ones.
Like lnc(4), the C-bus front-end of le(4) only supports C-NET(98)S
and is untested due to lack of such hardware, but given that's it's
based on the respective lnc(4) and not too different from the ISA
front-end it should be highly likely to work.
- Remove the descriptions of le(4), which where converted from lnc(4),
from sys/i386/conf/NOTES and sys/pc98/conf/NOTES as there's a common
one in sys/conf/NOTES.
entry to the PCI NICs section so it's in the same spot in all GENERIC
config files.
- Add a note to the description of pcn(4) informing that is has precedence
over le(4).
or SHA512, the blocksize is 128 bytes, not 64 bytes as anywhere else.
The bug also exists in NetBSD, OpenBSD and various other independed
implementations I look at.
- We cannot decide which hash function to use for HMAC based on the key
length, because any HMAC function can use any key length.
To fix it split CRYPTO_SHA2_HMAC into three algorithm:
CRYPTO_SHA2_256_HMAC, CRYPTO_SHA2_384_HMAC and CRYPTO_SHA2_512_HMAC.
Those names are consistent with OpenBSD's naming.
- Remove authsize field from auth_hash structure.
- Allow consumer to define size of hash he wants to receive.
This allows to use HMAC not only for IPsec, where 96 bits MAC is requested.
The size of requested MAC is defined at newsession time in the cri_mlen
field - when 0, entire MAC will be returned.
- Add swcr_authprepare() function which prepares authentication key.
- Allow to provide key for every authentication operation, not only at
newsession time by honoring CRD_F_KEY_EXPLICIT flag.
- Make giving key at newsession time optional - don't try to operate on it
if its NULL.
- Extend COPYBACK()/COPYDATA() macros to handle CRYPTO_BUF_CONTIG buffer
type as well.
- Accept CRYPTO_BUF_IOV buffer type in swcr_authcompute() as we have
cuio_apply() now.
- 16 bits for key length (SW_klen) is more than enough.
Reviewed by: sam
crypto_invoke(). This allows to serve multiple crypto requests in
parallel and not bached requests are served lock-less.
Drivers should not depend on the queue lock beeing held around
crypto_invoke() and if they do, that's an error in the driver - it
should do its own synchronization.
- Don't forget to wakeup the crypto thread when new requests is
queued and only if both symmetric and asymmetric queues are empty.
- Symmetric requests use sessions and there is no way driver can
disappear when there is an active session, so we don't need to check
this, but assert this. This is also safe to not use the driver lock
in this case.
- Assymetric requests don't use sessions, so don't check the driver
in crypto_kinvoke().
- Protect assymetric operation with the driver lock, because if there
is no symmetric session, driver can disappear.
- Don't send assymetric request to the driver if it is marked as
blocked.
- Add an XXX comment, because I don't think migration to another driver
is safe when there are pending requests using freed session.
- Remove 'hint' argument from crypto_kinvoke(), as it serves no purpose.
- Don't hold the driver lock around kprocess method call, instead use
cc_koperations to track number of in-progress requests.
- Cleanup register/unregister code a bit.
- Other small simplifications and cleanups.
Reviewed by: sam
- Implement CUIO_SKIP() macro which is only responsible for skipping the given
number of bytes from iovec list. This allows to avoid duplicating the same
code in three functions.
Reviewed by: sam
actually destroyed. Also move call to knlist_init() into tapcreate(). This
should fix panic described in kern/95357.
PR: kern/95357
No response from: freebsd-current@
MFC after: 3 days
use this ioctl to obtain the list of HCI nodes. User-space application
is expected to preallocate 'ng_btsocket_hci_raw_node_list_names' structure
and set limit in 'num_nodes' field. The 'nodes' field should be allocated
as well and it should have space for at least 'num_nodes' elements.
The SIOC_HCI_RAW_NODE_LIST_NAMES should be issued on bound raw HCI socket.
It does not really really matter what HCI name the socket is bound to, as
long as it is not empty.
MFC after: 1 week
still should return BUS_PROBE_LOW_PRIORITY instead of BUS_PROBE_DEFAULT
in order to give pcn(4) a chance to attach in case it probes after le(4).
- Rearrange the code related to RX interrupt handling so that ownership of
RX descriptors is immediately returned to the NIC after we have copied
the data of the hardware, allowing the NIC to already reuse the descriptor
while we are processing the data in ifp->if_input(). This results in a
small but measurable increase in RX throughput.
As a side-effect, this moves the workaround for the LANCE revision C bug
to am7900.c (still off by default as I doubt we will actually encounter
such an old chip in a machine running FreeBSD) and the workaround for the
bug in the VMware PCnet-PCI emulation to am79000.c, which is now also
only compiled on i386 (resulting in a small increase in RX throughput on
the other platforms).
- Change the RX interrupt handlers so that the descriptor error bits are
only check once in case there was no error instead of twice (inspired
by the NetBSD pcn(4), which additionally predicts the error branch as
false).
- Fix the debugging output of the RX and TX interrupt handlers; while
looping through the descriptors print info about the currently processed
one instead of always the previously last used one; remove pointless
printing of info about the RX descriptor bits after their values were
reset.
- Create the DMA tags used to allocate the memory for the init block,
descriptors and packet buffers with the alignment the respective NIC
actually requires rather than using PAGE_SIZE unconditionally. This might
as well fix the alignment of the memory as it seems we do not inherit
the alignment constraint from the parent DMA tag.
- For the PCI variants double the number of RX descriptors and buffers
from 8 to 16 as this minimizes the number of RX overflows im seeing with
one NIC-mainboard combination. Nevertheless move reporting of overflows
under debugging as they seem unavoidable with some crappy hardware.
- Set the software style of the PCI variants to ILACC rather than PCnet-PCI
as the former is was am79000.c actually implements. Should not make a
difference for this driver though.
- Fix the driver name part in the MODULE_DEPEND of the PCI front-end for
ether.
- Use different device descriptions for PCnet-Home and PCnet-PCI.
- Fix some 0/NULL confusion in lance_get().
- Use bus_addr_t for sc_addr and bus_size_t for sc_memsize as these are
more appropriate than u_long for these.
- Remove the unused LE_DRIVER_NAME macro.
- Add a comment describing why we are taking the LE_HTOLE* etc approach
instead of using byteorder(9) functions directly.
- Improve some comments and fix some wording.
MFC after: 2 weeks
gateways which are unreachable except through the default router. For
example, assuming there is a default route configured, and inserting
a route
"route add 64.102.54.0/24 60.80.1.1"
is currently allowed even when 60.80.1.1 is only reachable through
the default route. However, an error is thrown when this route is
utilized, say,
"ping 64.102.54.1" will return an error
This type of route insertion should be disallowed becasue:
1) Let's say that somehow our code allowed this packet to flow to
the default router, and the default router knows the next hop is
60.80.1.1, then the question is why bother inserting this route in
the 1st place, just simply use the default route.
2) Since we're not talking about source routing here, the default
router could very well choose a different path than using 60.80.1.1
for the next hop, again it defeats the purpose of adding this route.
Reviewed by: ru, gnn, bz
Approved by: andre
Current code does not report link loss correctly - when link goes down,
mii_phy_tick() will notice that with up to mii_anegticks delay.
If link goes up during this delay then link flapping will be unnoticed
by driver.
2) mii_phy_add_media(): initialize sc->mii_anegticks for 10/100 media
3) Use MII_ANEGTICKS/MII_ANEGTICKS_GIGE defines instead of hardcoded values.
Approved by: glebius (mentor)
MFC after: 1 month
mount(2) system call:
* Add cmount hook to fdescfs and pseudofs (and, by extension, procfs and
linprocfs). This (mostly) restores the ability to mount these
filesystems using the old mount(2) system call (see below for the
rest of the fix).
* Remove not-NULL check for the data argument from the mount(2) entry
point. Per the mount(2) man page, it is up to the individual
filesystem being mounted to verify data. Or, in the case of procfs,
etc. the filesystem is free to ignore the data parameter if it does
not use it. Enforcing data to be not-NULL in the mount(2) system call
entry point prevented passing NULL to filesystems which ignored the
data pointer value. Apparently, passing NULL was common practice
in such cases, as even our own mount_std(8) used to do it in the
pre-nmount(2) world.
All userland programs in the tree were converted to nmount(2) long ago,
but I've found at least one external program which broke due to this
(presumably unintentional) mount(2) API change. One could argue that
external programs should also be converted to nmount(2), but then there
isn't much point in keeping the mount(2) interface for backward
compatibility if it isn't backward compatible.
as not connected. In soclose() case rip_detach() will kill inpcb for
us later.
It makes rawconnect regression test do not panic a system.
Reviewed by: rwatson
X-MFC after: with all 1th April inpcb changes
connections and get rid of the flow_id as it is not guaranteed to be stable
some (most?) current implementations seem to just zero it out.
PR: kern/88664
Reported by: jylefort
Submitted by: Joost Bekkers (w/ changes)
Tested by "regisr" <regisrApoboxDcom>
By making the imo_membership array a dynamically allocated vector,
this minimizes disruption to existing IPv4 multicast code. This
change breaks the ABI for the kernel module ip_mroute.ko, and may
cause a small amount of churn for folks working on the IGMPv3 merge.
Previously, sockets were subject to a compile-time limitation on
the number of IPv4 group memberships, which was hard-coded to 20.
The imo_membership relationship, however, is 1:1 with regards to
a tuple of multicast group address and interface address. Users who
ran routing protocols such as OSPF ran into this limitation on machines
with a large system interface tree.
Add a new option, SKYEYE_WORKAROUNDS, which as the name suggests adds
workarounds for things skyeye doesn't simulate. Specifically :
- Use USART0 instead of DBGU as the console, make it not use DMA, and manually provoke an interrupt when we're done in the transmit function.
- Skyeye maintains an internal counter for clock, but apparently there's
no way to access it, so hack the timecounter code to return a value which
is increased at every clock interrupts. This is gross, but I didn't find a
better way to implement timecounters without hacking Skyeye to get the
counter value.
- Force the write-back of PTEs once we're done writing them, even if they
are supposed to be write-through. I don't know why I have to do that.
for nfsclient and nfs4client in order to prevent local root users
from panicing the system.
PR: kern/77463
Submitted by: Wojciech A. Koszek
Reviewed by: cel, rees
MFC after: 2 weeks
Security: Local root users can panic the system at will
divisor. This allows us to set the line speed to the maximum
of 1/4 of the device clock.
o Disable the baudrate generator before programming the line
settings, including baudrate, and enable it afterwards.
was not checked at all. There is only one case when sc_clean_up()
can fail, because of wait_scrn_saver_stop(), but it doesn't hurt
to check anyway.
Reviewed by: rodrigc
Found by: Coverity Prevent
seperately. Also use pfil hook/unhook instead of keeping the check
functions in pfil just to return there based on the sysctl. While here fix
some whitespace on a nearby SYSCTL_ macro.
When porting FreeBSD to a new platform, one of the more useful things to do is
get mi_startup() to let you know which SYSINIT it's up to. Most people tend to
whack a printf in the SYSINIT loop to print the address of the function it's
about to call. Going one better, jhb made a version that uses DDB to look up
the name of the function and print that instead. This version is essentially
his with the addition of some ifdeffery to make it optional and to allow it to
work (although using only the function address, not the symbol) if you forgot
to enable DDB.
All the cool bits by: jhb
Approved by: scottl, rink, cognet, imp
Remove an unnecessary check of the table's bus clock. CPUs that
support this feature export only the high/low settings via the MSR,
packed into 32 bits.
Hardware from: Centaur Technologies
MFC after: 1 week
vm_ksubmap_init() calls pmap_copy_page(), which uses the mini data cache
to do the copy, but we're running uncaching before cpu_setup().
For some reason it hasn't been a problem so far, but it is for the
PXA255.
Spotted out by: benno
the linux module, since it is not cross-platform
- move linprocfs from "files" and "options" to architecture specific files,
since it only makes sense to build this for those architectures, where we
also have a linuxolator
- disable the build of the linuxolator on our tier-2 architecture "Alpha":
* we don't have a linux_base port which supports Alpha and at the
same time is not outdated/obsoleted upstream/in a good condition/
currently working
* the upcomming new default linux base port is based upon Fedora
Core 3 (security support via http://www.fedoralegacy.org), which
isn't available for Alpha (like the current default linux base
port which is based upon Red Hat 8)
* nobody answered my request for testing it ~1 month ago on
current@ and alpha@ (it doesn't surprises me, see above)
* a SoC student wouldn't have to waste time on something which
nobody is willing to test
This does not remove the alpha specific MD files of the linuxolator yet.
Discussed on: arch (mostly silence)
Spiritual support by: scottl
If the embedded controller exists before the sysresource devices, for
example, it will be attached first. Instead, let the normal device
order function work as we first desired. [1]
There still remained a problem where we couldn't allocate resources in
acpi0 that were passed up by the sysresource pseudo-devices. These
devices had to probe/attach first to give their resources to acpi, then
acpi would allocate them before probing/attaching other devices. To
work around this, we attach them from acpi_sysres_alloc(). A better
approach would be to implement multi-pass probe/attach in newbus but
that's a much bigger task.
Suggested by: jhb [1]
Hardware from: Centaur Technologies
MFC after: 1 week
to ensure we match the type signature; we cannot assume HAL_BUS_TAG
and HAL_BUS_HANDLE correspond to bus_space_tag_t and bus_space_handle_t
(should probably do this for HAL_SOFTC too but leave that for now)
MFC after: 1 month
assuming them to be inflight write buffers. This is not always the case.
bufdaemon might hold the buffer lock and give up writing the buffer due to it
having dependencies, the file system being suspended or the vnode lock being
held by another thread. When bufdaemon decides to write the buffer there is
still a window before bufobj_wref() has been called, allowing other threads to
believe that the vnode has no dirty buffers or inflight writes.
Try harder to flush first block of new subdirectory to get rid of MKDIR_BODY
dependency.