coherent with the data caches. Implement a quick fix to allow
us to boot on Montecito, while I'm working on a better fix in
the mean time.
Commit made on Montecito-based Itanium...
is to be requested via a "ro" option. At the same time, MNT_RDONLY
is gradually becoming an indicator of the current state of the FS
instead of a command flag. Today passing MNT_RDONLY alone to the
kernel's mount machinery will lead to various glitches. (See the
PRs for examples.)
Therefore mount the root FS with a "ro" option instead of the
MNT_RDONLY flag. (Note that MNT_RDONLY still is added to the mount
flags internally, by vfs_donmount(), if "ro" was specified.)
To be able to pass "ro" cleanly to kernel_vmount(), teach the latter
function to accept options with NULL values.
Also correct the comment explaining how mount_arg() handles length
of -1.
PR: bin/106636 kern/120319
Submitted by: Jaakko Heinonen <see PR kern/120319 for email> (originally)
legacy interrupts rather than MSI as a special case. Prior to this
commit, the interrupt handler was doing the slow handshaking with
the device to ensure the legacy interrupt was lowered in both
the legacy and MSI-X case. This handshaking was not
required for MSI-X.
allocator for jumbo frame.
o Removed unneeded jlist lock which was used to manage jumbo
buffers.
o Don't reinitialize hardware if MTU was not changed.
o Added additional check for minimal MTU size.
o Added a new tunable hw.skc.jumbo_disable to disable jumbo frame
support for the driver. The tunable could be set for systems that
do not need to use jumbo frames and it would save
(9K * number of Rx descriptors) bytes kernel memory.
o Jumbo buffer allocation failure is no longer critical error for
the operation of sk(4). If sk(4) encounter the allocation failure
it just disables jumbo frame support and continues to work without
user intervention.
With these changes jumbo frame performance of sk(4) was slightly
increased and users should not encounter jumbo buffer allocation
failure. Previously sk(4) tried to allocate physically contiguous
memory, 3388KB for 256 Rx descriptors. Sometimes that amount of
contiguous memory region could not be available for running systems
which in turn resulted in failure of loading the driver.
Tested by: Cy Schubert < Cy.Schubert () komquats dot com >
modules using invalid ABI versions (e.g. a 7.x module with an 8.x kernel)
for a given kernel:
- Add a 'kernel' module version whose value is __FreeBSD_version.
- Add a version dependency on 'kernel' in every module that has an
acceptable version range of __FreeBSD_version up to the end of the
branch __FreeBSD_version is part of. E.g. a module compiled on 701000
would work on kernels with versions between 701000 and 799999 inclusive.
Discussed on: arch@
MFC after: 1 week
A couple of notes for this:
* WITNESS support, when enabled, is only used for shared locks in order
to avoid problems with the "disowned" locks
* KA_HELD and KA_UNHELD only exists in the lockmgr namespace in order
to assert for a generic thread (not curthread) owning or not the
lock. Really, this kind of check is bogus but it seems very
widespread in the consumers code. So, for the moment, we cater this
untrusted behaviour, until the consumers are not fixed and the
options could be removed (hopefully during 8.0-CURRENT lifecycle)
* Implementing KA_HELD and KA_UNHELD (not surported natively by
WITNESS) made necessary the introduction of LA_MASKASSERT which
specifies the range for default lock assertion flags
* About other aspects, lockmgr_assert() follows exactly what other
locking primitives offer about this operation.
- Build real assertions for buffer cache locks on the top of
lockmgr_assert(). They can be used with the BUF_ASSERT_*(bp)
paradigm.
- Add checks at lock destruction time and use a cookie for verifying
lock integrity at any operation.
- Redefine BUF_LOCKFREE() in order to not use a direct assert but
let it rely on the aforementioned destruction time check.
KPI results evidently broken, so __FreeBSD_version bumping and
manpage update result necessary and will be committed soon.
Side note: lockmgr_assert() will be used soon in order to implement
real assertions in the vnode namespace replacing the legacy and still
bogus "VOP_ISLOCKED()" way.
Tested by: kris (earlier version)
Reviewed by: jhb
access cache improvements:
- Flush just access control state on CODA_PURGEUSER, not the full
namecache for /coda.
- When replacing a fid on a cnode as a result of, e.g.,
reintegration after offline operation, we no longer need to
purge the namecache entries associated with its vnode.
MFC after: 1 month
modeled on the access cache found in NFS, smbfs, and the Linux coda
module. This is a positive access cache of a single entry per file,
tracking recently granted rights, but unlike NFS and smbfs,
supporting explicit invalidation by the distributed file system.
For each cnode, maintain a C_ACCCACHE flag indicating the validity
of the cache, and a cached uid and mode tracking recently granted
positive access control decisions.
Prefer the cache to venus_access() in VOP_ACCESS() if it is valid,
and when we must fall back to venus_access(), update the cache.
Allow Venus to clear the access cache, either the whole cache on
CODA_FLUSH, or just entries for a specific uid on CODA_PURGEUSER.
Unlike the Coda module on Linux, we don't flush all entries on a
user purge using a generation number, we instead walk present
cnodes and clear only entries for the specific user, meaning it is
somewhat more expensive but won't hit all users.
Since the Coda module is agressive about not keeping around
unopened cnodes, the utility of the cache is somewhat limited for
files, but works will for directories. We should make Coda less
agressive about GCing cnodes in VOP_INACTIVE() in order to improve
the effectiveness of in-kernel caching of attributes and access
rights.
MFC after: 1 month
VFS namecache, as is done by the Coda module on Linux. Unlike the Coda
namecache, the global VFS namecache isn't tagged by credential, so use
ore conservative flushing behavior (for now) when CODA_PURGEUSER is
issued by Venus.
This improves overall integration with the FreeBSD VFS, including
allowing __getcwd() to work better, procfs/procstat monitoring, and so
on. This improves shell behavior in many cases, and improves ".."
handling. It may lead to some slowdown until we've implemented a
specific access cache, which should net improve performance, but in the
mean time, lookup access control now always goes to Venus, whereas
previously it didn't.
MFC after: 1 month
When ntfs_ntput() reaches 0 in the refcount the inode lockmgr is not
released and directly destroyed. Fix this by unlocking the lockmgr() even
in the case of zero-refcount.
Reported by: dougb, yar, Scot Hetzel <swhetzel at gmail dot com>
Submitted by: yar
nfs_xid_gen() function instead of duplicating the logic in both
nfsm_rpchead() and the NFS3ERR_JUKEBOX handling in nfs_request().
MFC after: 1 week
Submitted by: mohans (a long while ago)
through the FreeBSD ABI. IPC_INFO, SHM_INFO, SHM_STAT were added
specifically for Linux binary support. They are not documented
as being a part of the FreeBSD ABI, also, the structures necessary
for them have been hidden away from the users for a long time.
Also, the Linux ABI layer uses it's own structures to populate the
responses back to the user to ensure that the ABI is consistent.
I think there is a bit more separation work that needs to happen.
Reviewed by: jhb
Discussed with: jhb
Discussed on: freebsd-arch@ (very briefly)
MFC after: 1 month
the PIC also informs the platform at which IRQ level it can start
assigning IPIs, since this can depend on the number of IRQs
supported for external interrupts.
PAGE_SIZE or less, the bounce page counting logic was flawed and wouldn't
reserve any pages. Adjust to be correct. Review of other architectures is
forthcoming.
Submitted by: Joseph Golio
With write-allocate cache we get into the following scenario:
1. data has been updated in the memory by the USB HC, but
2. D-cache holds an un-flushed value of it
3. when affected cache line is being replaced, the old (un-flushed) value is
flushed and overwrites the newly arrived
This is possible due to how write-allocate works with virtual caches (ARM for
example).
In case of USB transfers it leads to fatal tags discrepancies in umass(4)
operation, which look like the following:
umass0: Invalid CSW: tag 1 should be 2
(probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe0:umass-sim0:0:0:0): Retrying Command
umass0: Invalid CSW: tag 1 should be 3
(probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe0:umass-sim0:0:0:0): Retrying Command
umass0: Invalid CSW: tag 1 should be 4
(probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe0:umass-sim0:0:0:0): Retrying Command
umass0: Invalid CSW: tag 1 should be 5
(probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe0:umass-sim0:0:0:0): Retrying Command
umass0: Invalid CSW: tag 1 should be 6
(probe0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
(probe0:umass-sim0:0:0:0): error 5
(probe0:umass-sim0:0:0:0): Retries Exausted
To eliminate this, a BUS_DMASYNC_PREREAD sync operation is required in
usbd_start_transfer().
Credits for nailing this down go to Grzegorz Bernacki gjb AT semihalf DOT com.
Reviewed by: imp
Approved by: cognet (mentor)
historical relic, and are no longer appropriate for either LAN or WAN
mounting. At modern (gigabit and 10 gigabit) LAN speeds packet loss
from socket buffer fill events is common, and sequence numbers wrap
quickly enough that data corruption is possible. TCP solves both of
these problems without imposing significant overhead.
MFC after: 1 month
sectors so the geometry of large IDE disks has to be adjusted. This
corresponds to what the OpenSolaris dad(7D) driver does except that
the latter only tweaks sectors and effectively limits the mediasize
to 128GB so the cylinders and heads fields won't ever overflow. Not
limiting the mediasize is a compromise between allowing to use Sun
disk label as far as possible and being able to use the entire disk
with another disk label.
This allows to use the full capacity of large IDE disks if they were
not labeled under (Open)Solaris (in both ways of the meaning).
MFC after: 2 weeks
Turn off TFTP support by default: when both TFTP and NFS are enabled in the
loader, strange interactions occur in the pure netbooting scenario (i.e.
loader is TFTP-ed, kernel+world mounted over NFS), leading to very slow access
to the NFS-exported files.
Reviewed by: grehan
Approved by: cognet (mentor)
The logical disks will appear as /dev/lvm/<vol group>-<logical vol>, for
instance /dev/lvm/vg0-home. GLVM currently supports linear stripes with
segments on multiple physical disks. The metadata is read only, logical
volumes can not be allocated or resized.
Reviewed by: Ivan Voras
- Include lock.h in lockmgr.h as nested header in order to safely use
LOCK_FILE and LOCK_LINE. As long as this code will be replaced soon
we can tollerate for a while this namespace pollution even if the real
fix would be to let lockmgr() depend by lock.h as a separate header.