As soon as geom_raid3 reports it's own stripe as sector size, report largest
underlying provider's stripe, multiplied by number of data disks in array,
due to transformation done, as array stripe.
Make sure the multicast forwarding cache entry's stall queue is properly
initialized before trying to insert an entry into it.
PR: kern/142052
Reviewed by: bms
File flags handling fixes for ext2fs:
- Disallow setting of flags not supported by ext2fs.
- Map EXT2_APPEND_FL to SF_APPEND.
- Map EXT2_IMMUTABLE_FL to SF_IMMUTABLE.
- Map EXT2_NODUMP_FL to UF_NODUMP.
Note that ext2fs doesn't support user settable append and immutable flags.
EXT2_NODUMP_FL is an user settable flag also on Linux.
PR: kern/122047
Approved by: trasz (mentor)
Apply fix for Solaris bug 6764159: restore_object() makes a call
that can block while having a tx open but not yet committed
(onnv revision 7994)
Submitted by: mm
Approved by: pjd
Obtained from: OpenSolaris
Print backspaces after echoing an EOF.
Applications like shells expect EOF to give no graphical output, while
our implementation prints ^D by default (tunable with stty echoctl).
Make the new implementation behave like the old TTY code. Print two
backspaces afterwards.
I totally forgot to MFC this, because the 8.0 freeze took a little
longer than I expected.
Reminded by: koitsu
Unloading of the nfscl module is unsupported because newnfslock doesn't
support unloading. It's not trivial to implement newnfslock unloading so
for now just admit that unloading is unsupported and refuse to attempt
unload in all nfscl module event handlers.
Approved by: trasz (mentor)
Fix ordering of nfscl_modevent() and ncl_uninit(). nfscl_modevent() must
be called after ncl_uninit() when unloading the nfscl module because
ncl_uninit() uses ncl_iod_mutex which is destroyed in nfscl_modevent().
Approved by: trasz (mentor)
Use the EVENTHANDLER system to hook into the usb device configuration and
perform a function such as ejecting a 3G autoinstaller disk. The eventhandler
system properly tracks threads and is safe to unload, remove the
setting/clearing of a function pointer in the kernel by u3g(4) which included a
tsleep for safety.
If the runcount is non-zero in eventhandler_deregister() then one or more
threads are executing the eventhandler, sleep in this case to make it safe for
module unload. If the runcount was up then an entry would have been marked
EHE_DEAD_PRIORITY so use this as a trigger to do the wakeup in
eventhandler_prune_list().
Add a quirk for the Curitel UM175 where setting multiplexing for call
management over the data endpoint causes communication to die.
Take this one step further and model it on the existing NetBSD quirk and import
other device IDs from them.
Obtained from: NetBSD
Fix hardware issue with FTDI chips: avoid sending a zero length packet due to
hardware sending garbage on ZLPs.
Reported by: Corey Smith
Submitted by: HPS
Throughout the network stack we have a few places of
if (jailed(cred))
left. If you are running with a vnet (virtual network stack) those will
return true and defer you to classic IP-jails handling and thus things
will be "denied" or returned with an error.
Work around this problem by introducing another "jailed()" function,
jailed_without_vnet(), that also takes vnets into account, and permits
the calls, should the jail from the given cred have its own virtual
network stack.
We cannot change the classic jailed() call to do that, as it is used
outside the network stack as well.
Discussed with: julian, zec, jamie, rwatson (back in Sept)
Add a few more V_hacks to nfsclient to allow machines with a VIMAGE
kernel to boot from NFS. [1]
Note: this is not a full virtualization of nfsclient. It is only does
what advertised above and nothing more.
Remove duplicate devstat_start_transaction_bio() call. It is already called
from geom_disk. Dulicate call causes wrong queue depth and busy accounting.
- Cleanup kernel messages, mostly PMP.
- Took references on devices, while PMP reinitializes them, to not let them
go and distort freeze reference counting.
r199563:
Fix copy & paste error and remove extra space before colon.
r199608:
Remove unnecessary structure packing.
r199609:
Add initial endianness support. It seems the controller supports
both big-endian and little-endian format in descriptors for Rx path
but I couldn't find equivalent feature in Tx path. So just stick to
little-endian for now.
r199610:
Because we know received bytes including CRC there is no reason to
call m_adj(9). The controller also seems to have a capability to
strip CRC bytes but I failed to activate this feature except for
loopback traffic.
r199611:
Add IPv4/TCP/UDP Tx checksum offloading support. It seems the
controller also has support for IP/TCP checksum offloading for Rx
path. But I failed to find to way to enable Rx MAC to compute the
checksum of received frames.
r199612:
Add __FBSDID.
r199613:
Only Tx checksum offloading is supported now. Remove experimental
code sneaked in r199611.
r199558:
Use bus_{read,write}_4 rather than bus_space_{read,write}_4.
r199561:
Use capability pointer to access PCIe registers rather than
directly access them at fixed address. Frequently the register
offset could be changed if additional PCI capabilities are added to
controller.
One odd thing is ET_PCIR_L0S_L1_LATENCY register. I think it's PCIe
link capabilities register but the location of the register does
not match with PCIe capability pointer + offset. I'm not sure it's
shadow register of PCIe link capabilities register.
Remove complex macros that were used to compute bits values.
Although these macros may have its own strength, its complex
definition make hard to read the code.
Approved by: delphij
o Properly support M5229 revision 0xc7 and 0xc8:
- These revisions no longer have cable detection capability.
- The UDMA support bit of register 0x4b has been dropped without an
replacement.
- According to Linux it's crucial for working ATAPI DMA support to
also set the reserved bit 1 of regsiter 0x53 with these revisions.
o Only set ATA_CHECKS_CABLE for chip versions that actually support
cable detection, i.e. neither for ALI_OLD nor for ALI_NEW revisions
>= 0xc7.
Specify the capability and media bits of the capabilities page in
native, i.e. big-endian, format and convert as appropriate like we
also do with the multibyte fields of the other pages. This fixes
the output of acd_describe() to match reality on big-endian machines
without breaking it on little-endian ones. While at it, also convert
the remaining multibyte fields of the pages read although they are
currently unused for consistency and in order to prevent possible
similar bugs in the future.
Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
reschedule newly blocked signals.
MFC r198590:
Trapsignal() calls kern_sigprocmask() when delivering catched signal
with proc lock held.
MFC r198670:
For trapsignal() and postsig(), kern_sigprocmask() is called with
both process lock and curproc->p_sigacts->ps_mtx locked. Prevent lock
recursion on ps_mtx in reschedule_signals().
In kern_sigsuspend(), manipulate thread signal mask using
kern_sigprocmask(). Also, do cursig/postsig loop immediately after
waiting for signal, repeating the wait if wakeup was spurious due to
race with other thread fetching signal from the process queue before us.
MFC r199136:
Use cpu_set_syscall_retval(9) to set syscall result, and return
EJUSTRETURN from kern_sigsuspend() to prevent syscall return code from
modifying wrong frame.
Take care of possibility that pending SIGCONT might be cancelled by
SIGSTOP, causing postsig() not to deliver any catched signal.
Note that r200091 completely overrides r200053 and the merge of the
former is recorded for bookkeeping only.
r200091 won't be merged to 'more stable' branche(s) because of the POLA.
Put process-directed signals to the process queue unconditionally,
selecting the thread to deliver the signal only by the thread returning
to usermode.
Change cursig() and postsig() to look both into the thread and process
signal queues.
MFC r197976:
Fix typo.
MFC r200082:
Remove wrong assertion. Debugee is allowed to lose a signal
Don't warn about an RSDP with a corrupt checksum. The kernel does a better
job about warning about these things later and this message can be
confusing.
Fix a confusing typo in the EDD packet structure used in gptboot and
gptzfsboot. I got the segment and offset fields reversed in the structure,
but I also succeeded in crossing the assignments so the actual EDD packet
ended up correct.
- Port bios_getmem() from libi386 to {gpt,}zfsboot() and use it to
safely allocate a heap region above 1MB. This enables {gpt,}zfsboot()
to allocate much larger buffers than before.
- Use a larger buffer (1MB instead of 128K) for temporary ZFS buffers. This
allows more reliable reading of compressed files in a raidz/raidz2 pool.
- Various small whitespace and style fixes.
- Improve the algorithm the loader uses to choose a memory range for its
heap when using a range above 1MB.
Properly return an error reply if an NFS remove or link operation fails.
Previously the failing operation would allocate an mbuf and construct an
error reply, but because the function did not return 0, the NFS server
assumed it had failed to generate a reply and would leak the reply mbuf as
well as not sending the reply to the NFS client.
ndis_scan_results() can sleep if the scan results are not ready when
ndis_scan() is called. However, ndis_scan() is invoked from softclock()
and cannot sleep. Move ndis_scan_results() to the ndis driver's scan_end
hook instead.
Implement rtld part of the support for -z nodlopen (see ld(1)).
MFC r199877:
Allow to load not-openable dso when tracing. This fixes ldd on such dso or
dso linked to non-openable object.
Remove '\n' at the end of error message.
End comments with dot.
Unbreak the ata_atapi() usage. Since r200171 (MFC'ed in r200432) the
mode setting functions get a ata_device type device passed instead of
a ata_channel one, thus ata_atapi() has to be adjusted accordingly.
Reviewed by: mav
Large I/Os on Promise controllers reported to cause UDMA ICRC errors and
subsequent timeouts. Restore previous limit for now, at least until
I will have hardware to experiment.
PR: kern/141438
- On entrance to the rx_eof sync RX rings maps with POSTWRITE flag
instead of POSTREAD: the hardware do not touch this memory (CPU
updates it). It is already synchronized as PREWRITE after the
processing is done.
- Add support for new BGE chips (5761, 5784 and 57780). These chips uses new
BGE_PCI_PRODID_ASICREV register to store the chip identifier and its revision.
- Add new grouping macro for 7575+ chips (BGE_IS_5755_PLUS).
- Add IDs for Fujitsu-branded Broadcom adapters.
Add a new errno, ENOTCAPABLE, to be returned when a process requests an
operation on a file descriptor that is not authorized by the descriptor's
capability flags.
Sponsored by: Google
Improve grammar in ip_input comment while attempting to maintain what
might be its meaning.
(Note, merge of the revision correcting a spelling error in this commit
will follow as well!)
Reserve system call numbers for Capsicum security framework capabilities,
capability mode, and process descriptors: cap_new, cap_getrights, cap_enter,
cap_getmode, pdfork, pdkill, pdgetpid, and pdwait.
Obtained from: TrustedBSD Project
Sponsored by: Google
Add audit events for process descriptor system calls, which will appear in
a future OpenBSM release.
Sponsored by: Google
Obtained from: TrustedBSD Project
Some general cleanup of scatter/gather memory allocation
- We don't need to check malloc return values with M_WAITOK
- remove variables that we don't really need
- cleanup the error paths by just calling drm_sg_cleanup()
- fix drm_sg_cleanup() to be safe to call at any time
Introduce ATA_CAM kernel option, turning ata(4) controller drivers into
cam(4) interface modules. When enabled, this option deprecates all ata(4)
peripheral drivers (ad, acd, ...) and interfaces and allows cam(4) drivers
(ada, cd, ...) and interfaces to be natively used instead.
As side effect of this, ata(4) mode setting code was completely rewritten
to make controller API more strict and permit above change. While doing
this, SATA revision was separated from PATA mode. It allows DMA-incapable
SATA devices to operate and makes hw.ata.(ata|atapi)_dma tunable work again.
Also allow ata(4) controller drivers (except some specific or broken ones)
to handle larger data transfers. Previous constraint of 64K was artificial
and is not really required by PCI ATA BM specification or hardware.
Submitted by: nwitehorn (powerpc part)
Limit maximum I/O size, depending on command set supported by device.
It is required to suppot non-LBA48 devices with MAXPHYS above 128K.
Same is done in ada(4).
Add counters for the i7 architecture which were accidentally left
out of the original commit of i7 support. These are all the counters
on pages A-32 and A-33 of the _Intel(R) 64 and IA32 Architectures
Software Developer's Manual Vol 3B_, June 2009. Almost all
of these counters relate to operations on the L2 cache.
r200124:
Avoid using additional variable for storing an error if we are not going
to do anything with it.
r200126:
Fix deadlock when ZVOLs are present and we are replacing dead component or
calling scrub when pool is in a degraded state. It will try to taste ZVOLs,
which will lead to deadlock, as ZVOL will try to acquire the same locks as
replace/scrub is holding already.
We can't simply skip provider based on their GEOM class, because ZVOL can have
providers build on top of it and we need to skip those as well.
We do it by asking for ZFS::iszvol attribute. Any ZVOL-based provider will give
us positive answer and we have to skip those providers.
This way we remove possibility to create ZFS pools on top of ZVOLs, but it is
not very useful anyway.
I believe deadlock is still possible in some very complex situations like when
we have MD provider on top of UFS file on top of ZVOL. When we try to replace
dead component in the pool mentioned ZVOL is based on, there might be a
deadlock when ZFS will try to taste MD provider. There is no easy way to detect
that, but it isn't very common.
r200125,r200158:
Fix order of looking for providers.
Before r200125 the order of looking for providers was wrong. It was:
1. Find provider by name.
2. Find provider by guid.
3. Find provider by name and guid.
Where it should have been:
1. Find provider by name and guid.
2. Find provider by guid.
3. Find provider by name.
Improve support for High-speed USB audio devices.
- fix issues regarding the mixer, where the interface number was not set in
time.
- fix wrong use of resolution parameter.
Submitted by: Hans Petter Selasky
Remove overuse of exclamation marks in kernel printfs, there mere fact a
message has been printed is enough to get someones attention. Also remove the
line number for DPRINTF/DPRINTFN, it already prints the funtion name and a
unique message.
Disable interrupts after doing early takeover of the usb controller in case usb
isnt actually compiled in (or kldloaded) as the controller could cause spurious
interrupts.
Improve High Speed slot allocation mechanism by moving the computation to the
endpoint rather than per xfer and provide functions around get/free of resources.
Submitted by: Hans Petter Selasky
ehci_init() will do reset and set the usbrev flag. Fix problem where
ehci_reset() was called before ehci_init().
PR: usb/140242
Submitted by: Sebastian Huber
- Add usb_fill_bulk_urb() and usb_bulk_msg() linux compat functions [1]
- Don't write actual length if the actual length pointer is NULL [2]
- correct Linux Compatibility error codes for short isochronous IN transfers
and make status field signed.
Submitted by: Leunam Elebek [1], Manuel Gebele [2]
updates device entries supported with the product name not magic numbers
and sorts entries. WUSB54GCV2 is added.
overhauls urtw(4) for supporting RTL8187B devices properly that there
was major changes to initialize RF chipset and set H/W registers and
removed a lot of magic numbers on code.
Reduce probe priority of USB input devices to BUS_PROBE_GENERIC from
BUS_PROBE_SPECIFIC. This allows device-specific drivers like atp to
attach reliably.
Do not ignore device interrupt if bus mastering is still active. It is
normal in case of media read error and some ATAPI cases, when transfer size
is unknown beforehand. PCI ATA BM specification tells that in case of such
underrun driver should just manually stop DMA engine. DMA engine should
same time guarantie that all bus mastering transfers completed at the moment
of driver reads interrupt flag asserted.
This change fixes interrupt storms and command timeouts in many cases.
PR: kern/103602, sparc64/121539, kern/133122, kern/139654
On Soft Reset, read device signature from FIS receive area, instead of
PxSIG register. It works better for NVidia chipsets. ahci(4) does the same.
PR: kern/140472, i386/138668
Explicitly acknowledge MSI completion, as required by SiI3124 datasheet.
It makes MSI working there. Later (and cheaper) PCIe chips (3132/3531)
still randomly crashing system in few seconds of high MSI rates, generating
something inaporopriate, like NMI or "Fatal trap 30".
Add Asynchronous Notification support for controllers without SNTF
capability by snooping SDB FIS receive area. It should be even faster
then regular way, but less reliable.
Change 'load' balancing mode algorithm:
- Instead of measuring last request execution time for each drive and
choosing one with smallest time, use averaged number of requests, running
on each drive. This information is more accurate and timely. It allows to
distribute load between drives in more even and predictable way.
- For each drive track offset of the last submitted request. If new request
offset matches previous one or close for some drive, prefer that drive.
It allows to significantly speedup simultaneous sequential reads.
PR: kern/113885
Modify the experimental nfs server so that it falls back to
using VOP_LOOKUP() when VFS_VGET() returns EOPNOTSUPP in the
ReaddirPlus RPC. This patch is based upon one by pjd@ for the
regular nfs server which has not yet been committed. It is needed
when a ZFS volume is exported and ReaddirPlus (which almost
always happens for NFSv4) is performed by a client. The patch
also simplifies vnode lock handling somewhat.
Tested by: gerrit at pmp.uni-hannover.de
Patch the experimental NFS server is a manner analagous to
r197525, so that the creation verifier is handled correctly
in va_atime for 64bit architectures. There were two problems.
One was that the code incorrectly assumed that
sizeof (struct timespec) == 8 and the other was that the tv_sec
field needs to be assigned from a signed 32bit integer, so that
sign extension occurs on 64bit architectures. This is required
for correct operation when exporting ZFS volumes.
Tested by: gerrit at pmp.uni-hannover.de
Reviewed by: pjd
Add a CPU features framework on PowerPC and simplify CPU setup a little
more. This provides three new sysctls to user space:
hw.cpu_features - A bitmask of available CPU features
hw.floatingpoint - Whether or not there is hardware FP support
hw.altivec - Whether or not Altivec is available
PR: powerpc/139154
Turn on NAP mode on G5 systems, and refactor the HID0 setup code a little.
This makes my G5 Xserve sound slightly less like it is filled with
howling banshees.
MFC r198968:
Unbreak E500 builds. The inline assembly for the 970 CPUs
is invalid when compiling for BookE.
MFC r199533:
Fix cpuid output on E500 core.
Add two new fcntls to enable/disable read-ahead:
- F_READAHEAD: specify the amount for sequential access. The amount is
specified in bytes and is rounded up to nearest block size.
- F_RDAHEAD: Darwin compatible version that use 128KB as the sequential
access size.
A third argument of zero disables the read-ahead behavior.
Please note that the read-ahead amount is also constrainted by sysctl
variable, vfs.read_max, which may need to be raised in order to better
utilize this feature.
Thanks Igor Sysoev for proposing the feature and submitting the original
version, and kib@ for his valuable comments.
Remove extra parantheses from usb_ethernet.c and usb_serial.c lines.
config(8) doesn't parse parantheses and instead treated them as being
part of the device driver name (e.g. '(u3g' vs 'u3g'). While here, fix the
style of these long lines to match the wrapping used for other long lines
in this file.
Create a seperate ZFS enabled loader.
This adds zfsloader which will be called by zfsboot/gptzfsboot code rather
than the tradional loader. This eliminates the need to set the
LOADER_ZFS_SUPPORT variable in order to get a ZFS enabled loader.
Note however, that you must reinstall your bootcode (zfsboot/gptzfsboot)
in order for the boot process to use the new loader.
New installations will no longer be required to build a ZFS enabled
loader for a working ZFS boot system. Installing zfsboot/gptzfsboot is
sufficient for acknowledging the use of CDDL code and therefore the ZFS
enabled loader.
Unconditionally call the setsockopt for IPV6_V6ONLY for v6 linux sockets
no matter whether we are compiled as module or if our default of the
net.inet6.ip6.v6only sysctl already matches what we would set.
This avoids unnecessary complications with modules, VIMAGES, INET6 and
the sysctl value, especially considering that most users will use
linux compat as a module.
Discussed with: kib, rwatson (weeks ago)
Reviewed by: rwatson
r199237:
sc->rev and is_offload(sc) will always be 0 during probe. Wait till
attach to get correct values.
r199238:
Make sure *some* edc is setup even for an unknown transceiver (assume
it is optical).
r199239:
The 10GBASE-T card should use an IPG of 1. Also enable the check
for low power startup on this card.
r199240:
Don't disable the XGMAC's tx on ifconfig down. It is unnecessary
and can cause false backpressure in the chip. Fix a us/ms mixup
while here.
r200003:
T3 firmware 7.8.0 for cxgb(4)
Make sure that the primary native brandinfo always gets added
first and the native ia32 compat as middle (before other things).
o(ld)brandinfo as well as third party like linux, kfreebsd, etc.
stays on SI_ORDER_ANY coming last.
The reason for this is only to make sure that even in case we would
overflow the MAX_BRANDS sized array, the native FreeBSD brandinfo
would still be there and the system would be operational.
Reviewed by: kib
lindev(4) [1] is supposed to be a collection of linux-specific pseudo
devices that we also support, just not by default (thus only LINT or
module builds by default).
While currently there is only "/dev/full" [2], we are planning to see more
in the future. We may decide to change the module/dependency logic in the
future should the list grow too long.
This is not part of linux.ko as also non-linux binaries like kFreeBSD
userland or ports can make use of this as well.
Suggested by: rwatson [1] (name)
Submitted by: ed [2]
Discussed with: markm, ed, rwatson, kib (weeks ago)
Reviewed by: rwatson, brueffer (prev. version)
PR: kern/68961
Add more statistics variables for IPcomp.
Try to version the struct in a backward compatible way.
People asked for the versioning of the stats structs in general before.
Note: old netstat binaries, as only consumer, continue to work as they are
still using kvm but will not display the new stats. [1]
Discussed with: rwatson [1]
In case the compression result is the same size as the orignal version,
the compression was useless as well. Make sure to not update the data
and return, else we would waste resources when decompressing.
This also avoids the copyback() changing data other consumers like
xform_ipcomp.c would have ignored because of no win and sent out without
noting that compression was used, resulting in invalid packets at the
receiver.
Only add the IPcomp header if crypto reported success and we have a lower
payload size. Before we had always added the header, no matter if we
actually send out compressed data or not.
With this, after the opencrypto/deflate changes, IPcomp starts to work
apart from edge cases. Leave it disabled by default until those are
fixed as well.
PR: kern/123587