Use zfs_read() instead of xfsread() to read /boot.config. xfsread() fails
short read requests, so the result was that a /boot.config smaller than 512
bytes was ignored. boot2 uses fsread() instead of xfsread() to read
/boot.config already, so this makes zfsboot more like boot2.
Approved by: re (kib)
In function do_rw_wrlock, when a writer got an error and before returning,
check if there are readers blocked by us via URWLOCK_WRITE_WAITERS flag,
and resume the readers. The error must be EAGAIN, otherwise there must
have memory problem, and nobody can rescue the buggy application.
Approved by: re (kib), davidxu
Refine r195509, instead of checking that vnode type is VBAD, that is
set quite late in the revocation path, properly verify that vnode is
not doomed before calling VOP.
Approved by: re (bz)
If provider is open for writing when we taste it, skip it for classes that
depend on on-disk metadata. This was we won't attach to providers that are used
by other classes. For example we don't want to configure partitions on da0 if
it is part of gmirror, what we really want is partitions on mirror/foo.
During regular work it works like this: if provider is open for writing a class
receives the spoiled event from GEOM and detaches, once provider is closed the
taste event is send again and class can rediscover its metadata if it is still
there. This doesn't work that way when new class arrives, because GEOM gives
all existing providers for it to taste, also those open for writing. Classes
have to decided on their own if they want to deal with such providers (eg.
geom_dev) or not (classes modified by this commit).
Reported by: des, Oliver Lehmann <lehmann@ans-netz.de>
Tested by: des, Oliver Lehmann <lehmann@ans-netz.de>
Discussed with: phk, marcel
Reviewed by: marcel
Approved by: re (kib)
r197831:
Fix situation where Mac OS X NFS client creates a file and when it tries
to set ownership and mode in the same setattr operation, the mode was
overwritten by secpolicy_vnode_setattr().
PR: kern/118320
Submitted by: Mark Thompson <info-gentoo@mark.thompson.bz>
r197842:
Fix white-spaces.
r197843:
On FreeBSD it is enough to report provider removal when orphan event is
received, we don't have to do it on every ENXIO error in I/O path.
Solaris has no GEOM so they have to handle it in a less clean way.
r197860:
File system owner is when uid matches and jail matches.
r197861:
Allow file system owner to modify system flags if securelevel permits.
Approved by: re (kib)
Per their definition, atomic instructions used in conjuction with
memory barriers should also ensure that the compiler doesn't reorder paths
where they are used. GCC, however, does that aggressively, even in
presence of volatile operands. The most reliable way GCC offers for avoid
instructions reordering is clobbering "memory".
Not all our memory barriers, right now, clobber memory for GCC-like
compilers.
Fix these cases.
Approved by: re (kib)
When releasing a read/shared lock we need to use a write memory barrier
in order to avoid, on architectures which doesn't have strong ordered
writes, CPU instructions reordering.
Approved by: re (kib)
Fix RTS/CTS flow control, broken by the TTY overhaul. The new TTY
interface is fairly simple WRT dealing with flow control, but
needed 2 new RX buffer functions with "get-char-from-buf" separated
from "advance-buf-pointer" so that the pointer could be advanced
only when ttydisc_rint() succeeded.
Approved by: re (kib)
Remove tcp_input lock statistics; these are intended for debugging only
and are not intended to ship in 8.0 as they dirty additional cache
lines in a performance-critical per-packet path.
Approved by: re (kib, bz)
In tcp_input(), we acquire a global write lock at first only if a
segment is likely to trigger a TCP state change (i.e., FIN/RST/SYN).
If we later have to upgrade the lock, we acquire an inpcb reference
and drop both global/inpcb locks before reacquiring in-order. In
that gap, the connection may transition into TIMEWAIT, so we need
to loop back and reevaluate the inpcb after relocking.
Reported by: Kamigishi Rei <spambox at haruhiism.net>
Reviewed by: bz
Approved by: re (kib)
unifdef NFSCLIENT because the nlm depends on the nfsclient even if NFSCLIENT
is not defined.
Now the nfslockd module works with the nfsclient module.
Reviewed by: kib
Approved by: re (kensmith)
Remove a log message from production code. This log message can be
triggered by a misconfigured host that is sending out gratuious ARPs.
This log message can also be triggered during a network renumbering
event when multiple prefixes co-exist on a single network segment.
Approved by: re
Previously, if an address alias is configured on an interface, and
this address alias has a prefix matching that of another address
configured on the same interface, then the ARP entry for the alias
is not deleted from the ARP table when that address alias is removed.
This patch fixes the aforementioned issue.
PR: kern/139113
Reviewed by: bz
Approved by: re
The flow-table associates TCP/UDP flows and IP destinations with
specific routes. When the routing table changes, for example,
when a new route with a more specific prefix is inserted into the
routing table, the flow-table is not updated to reflect that change.
As such existing connections cannot take advantage of the new path.
In some cases the path is broken. This patch will update the affected
flow-table entries when a more specific route is added. The route
entry is properly marked when a route is deleted from the table.
In this case, when the flow-table performs a search, the stale
entry is updated automatically. Therefore this patch is not
necessary for route deletion.
Reviewed by: bz, kmacy
Approved by: re
Fix some unexpected potential NULL de-references in kernel mode due to
usage of pre-8.0 wifi operations with the ndis driver wrapping a Win32/64
wifi driver.
Submitted by: Paul B Mahol <onemda@gmail.com>
Approved by: re
Use __NO_STRICT_ALIGNMENT to determine whether de(4) have to apply
alignment fixup code for received frames on strict alignment
architectures.
MFC r197463:
Consistently use bus_addr_t.
MFC r197464:
Destroy dmamap in dma cleanup.
MFC r197465:
Align Tx/Rx descriptors on 32 bytes boundary instead of PAGE_SIZE.
Also align setup descriptor on 32 bytes boundary. Tx buffer have no
alignment limitation so create dmamap without alignment
restriction[1]. Rx buffer still seems to require 4 bytes alignment
limitation but we can simply use MCLBYTES for size to map the
buffer instead of TULIP_DATA_PER_DESC as the buffer is allocated
with m_getcl(9).
de(4) supports up to TULIP_MAX_TXSEG segments for Tx buffers,
increase maximum dma segment size to TULIP_MAX_TXSEG * MCLBYTES.
While I'm here remove TULIP_DATA_PER_DESC as it is not used anymore.
This should fix de(4) breakage introduced after r176206.
Submitted by: jhb [1]
Reported by: WATANABE Kazuhiro < CQG00620 <> nifty dot ne dot jp >
Tested by: WATANABE Kazuhiro < CQG00620 <> nifty dot ne dot jp >,
Takahashi Yoshihiro < nyan <> jp dot freebsd dot org >
Approved by: re (kib)
Two more mxge watchdog fixes
1) Restore the PCI Express control register after a watchdog
reset. This is required because the device will come out
of watchdog reset with the pectl reg at its default state,
and important BIOS configuration (like max payload size)
could be lost.
2) Call mxge_start_locked() for every tx queue before dropping
the lock in the watchdog handler. This is required, as
the queue's buf ring may have filled during the reset.
Approved by: re (kib)
EHCI Hardware BUG workaround
The EHCI HW can use the qtd_next field instead of qtd_altnext when a short
packet is received. This contradicts what is stated in the EHCI datasheet.
Also the total-bytes field in the status field of the following TD gets
corrupted upon reception of a short packet! We work this around in software by
not queueing more than one job/TD at a time of up to 16Kbytes! The bug has been
seen on multiple INTEL based EHCI chips. Other vendors have not been tested
yet.
- Applications using /dev/usb/X.Y.Z, where Z is non-zero are affected, but not
applications using LibUSB v0.1, v1.2 and v2.0.
- Mass Storage (umass) is affected.
Approved by: re (kib)
As a workaround, for Intel CPUs, do not use CLFLUSH in
pmap_invalidate_cache_range() when self-snoop is apparently not reported
in cpu features.
Approved by: re (bz, kensmith)
Return EOPNOTSUPP instead of EINVAL when doing chflags(2) over an old
format ZFS, as defined in the manual page.
Submitted by: pjd (response of my original patch but bugs are mine)
Approved by: re (kib)
Add no zero mapping feature.
NOTE: Unlike in the other branches where this change will be "merged"
to, the 'no zero mapping' is enabled by default in stable/8.
Errata: FreeBSD-EN-09:05.null
Approved by: re (kib)
Add '#define NFSCLIENT' into opt_nfs.h if the NFSCLIENT variable is 1
(the default is 1).
This makes the nfslockd module works for NFS client.
Reviewed by: dfr
Approved by: re (kib)
r197512, r197513, r197514, r197515, r197525:
r197287:
Purge namecache for the file system being rolled back, so it doesn't point at
invalid vnodes after the rollback resulting in EIO errors when trying to access
files which are in the namecache.
Reported by: des
r197289:
Purge file system namecache when receiving incremental stream and rolling back
to it.
r197351:
Purge namecache in the same place OpenSolaris does.
r197426:
Restore BSD behaviour - when creating new directory entry use parent directory
gid to set group ownership and not process gid.
This was overlooked during v6 -> v13 switch.
PR: kern/139076
Reported by: Sean Winn <sean@gothic.net.au>
r197458:
Close race in zfs_zget(). We have to increase usecount first and then
check for VI_DOOMED flag. Before this change vnode could be reclaimed
between checking for the flag and increasing usecount.
r197459:
Before calling vflush(FORCECLOSE) mark file system as unmounted so the
following vnops will fail. This is very important, because without this change
vnode could be reclaimed at any point, even if we increased usecount. The only
way to ensure that vnode won't be reclaimed was to lock it, which would be very
hard to do in ZFS without changing a lot of code. With this change simply
increasing usecount is enough to be sure vnode won't be reclaimed from under
us. To be precise it can still be reclaimed but we won't be able to see it,
because every try to enter ZFS through VFS will result in EIO.
The only function that cannot return EIO, because it is needed for vflush() is
zfs_root(). Introduce ZFS_ENTER_NOERROR() macro that only locks
z_teardown_lock and never returns EIO.
r197497:
Switch to fletcher4 as the default checksum algorithm. Fletcher2 was proven to
be a bit weak and OpenSolaris also switched to fletcher4.
r197498: head/cddl/contrib/opensolaris
Fletcher4 is not the default checksum algorithm.
r197512:
- Don't depend on value returned by gfs_*_inactive(), it doesn't work
well with forced unmounts when GFS vnodes are referenced.
- Make other preparations to GFS for forced unmounts.
PR: kern/139062
Reported by: trasz
r197513:
Use traverse() function to find and return mount point's vnode instead of
covered vnode when snapshot is already mounted.
r197514:
On lookup error VFS expects *vpp to be set to NULL, be sure to do that.
r197515:
Handle cases where virtual (GFS) vnodes are referenced when doing forced
unmount. In that case we cannot depend on the proper order of invalidating
vnodes, so we have to free resources when we have a chance.
PR: kern/139062
Reported by: trasz
r197525:
Ensure that tv_sec is between INT32_MIN and INT32_MAX, so ZFS won't object.
This completes the fix from r185586.
PR: kern/139059
Reported by: Daniel Braniss <danny@cs.huji.ac.il>
Submitted by: Jaakko Heinonen <jh@saunalahti.fi>
Tested by: Daniel Braniss <danny@cs.huji.ac.il>
Approved by: re (kib)
- According to Linux, the ALi M5451 can do 31-bit DMA instead of just
30-bit like the reset of the controllers supported by this driver.
Actually ALi M5451 can be setup up to generate 32-bit addresses by
setting the 31st bit via the accompanying ISA bridge, which allows
it to work in sparc64 machines whose IOMMU require at least 32-bit
DMA. Even though other architectures would also benefit from 32-bit
DMA, enabling this bit is limited to sparc64 as bus_dma(9) doesn't
generally guarantee that a low address of BUS_SPACE_MAXADDR_32BIT
results in a buffer in the 32-bit range.
- According to Tatsuo YOKOGAWA's ali(4), the the DMA transfer size of
ALi M5451 is fixed to 64k and in fact using the default size of 4k
causes the chip to overrun the mapping, triggering uncorrectable
DMA errors on sparc64.
- The 4DWAVE DX and NX require the recording buffer to be 8-byte
aligned so adjust the bus_dma_tag_create(9) accordingly.
- Unlike the rest of the controllers supported by this driver, the
ALi M5451 only has 32 hardware channels instead of 64 so limit the
loop in tr_intr() accordingly. [1]
Submitted by: yongari [1]
Reviewed by: yongari (superset of what is committed)
Approved by: re (kib)
alignment requirements. It is busdma task, to manage proper alignment by
loading data to bounce buffers.
PR: kern/127316
Reviewed by: current@
Tested by: Ryan Rogers
Approved by: re (kib)
Do not call BUS_DRIVER_ADDED() for detached buses (attach failed) on
driver load. This fixes crash on atapicam module load on systems, where
some ata channels (usually ata1) was probed, but failed to attach.
Reviewed by: jhb, imp
Tested by: many
Approved by: re (kib)
the work area was totally unsynchronized which means this driver only
had a chance of working on x86 when no bounce buffers were involved,
which isn't that likely given that support for 64-bit DMA is currently
broken throughout ata(4).
- Add necessary little-endian conversion of accesses to the work area,
making this driver work on big-endian hosts. While at it, use the
alignment-agnostic byte order encoders in order to be on the safe side.
- Clear the reserved member of the SG list entries in order to be on the
safe side. [1]
Submitted by: yongari [1]
Reviewed by: yongari
Approved by: re (kib)
The elements in the component arrays may be direct Package objects rather
than references to objects. In that case, simply use the Package directly.
Approved by: re (kib)
- Split the logic to parse an SMAP entry out into a separate function on
amd64 similar to i386. This fixes a bug on amd64 where overlapping
entries would not cause the SMAP parsing to stop.
- Change the SMAP parsing code to do a sorted insertion into physmap[]
instead of an append to support systems with out-of-order SMAP entries.
Approved by: re (kib)