long for specifying a boundary constraint.
- Change bus_dma tags to use bus_addr_t instead of bus_size_t for boundary
constraints.
These allow boundary constraints to be fully expressed for cases where
sizeof(bus_addr_t) != sizeof(bus_size_t). Specifically, it allows a
driver to properly specify a 4GB boundary in a PAE kernel.
Note that this cannot be safely MFC'd without a lot of compat shims due
to KBI changes, so I do not intend to merge it.
Reviewed by: scottl
snapshots on UFS filesystems running with journaled soft updates.
This is the first of several bugs that need to be fixed before
removing the restriction added in -r230250 to prevent the use
of snapshots on filesystems running with journaled soft updates.
The deadlock occurs when holding the snapshot lock (snaplk)
and then trying to flush an inode via ffs_update(). We become
blocked by another process trying to flush a different inode
contained in the same inode block that we need. It holds the
inode block for which we are waiting locked. When it tries to
write the inode block, it gets blocked waiting for the our
snaplk when it calls ffs_copyonwrite() to see if the inode
block needs to be copied in our snapshot.
The most obvious place that this deadlock arises is in the
ffs_copyonwrite() routine when it updates critical metadata
in a snapshot and tries to write it out before proceeding.
The fix here is to write the data and indirect block pointer
for the snapshot, but to skip the call to ffs_update() to
write the snapshot inode. To ensure that we will never have
to update a pointer in the inode itself, the ffs_snapshot()
routine that creates the snapshot has to ensure that all the
direct blocks are allocated as part of the creation of the
snapshot.
A less obvious place that this deadlock occurs is when we hold
the snaplk because we are deleting a snapshot. In the course of
doing the deletion, we need to allocate various soft update
dependency structures and allocate some journal space. If we
hit a resource limit while doing this we decrease the resources
in use by flushing out an existing dirty file to get it to give
up the soft dependency resources that it holds. The flush can
cause an ffs_update() to be done on the inode for the file that
we have selected to flush resulting in the same deadlock as
described above when the inode that we have chosen to flush
resides in the same inode block as the snapshot inode that we hold.
The fix is to defer cleaning up any time that the inode on which
we are operating is a snapshot.
Help and review by: Jeff Roberson
Tested by: Peter Holm
MFC (to 9 only) after: 2 weeks
installs clang as /usr/bin/cc, /usr/bin/c++ and /usr/bin/cpp.
Note this does *not* disable building and installing gcc, which will
still be available as /usr/bin/gcc, /usr/bin/g++ and /usr/bin/gcpp. If
you want to disable gcc completely, you must use WITHOUT_GCC.
MFC after: 2 weeks
operations for setting and accessing vnode's v_socket field.
The operations are necessary to implement proper unix socket handling
on layered file systems like nullfs(5).
This change fixes the long standing issue with nullfs(5) being in that
unix sockets did not work between lower and upper layers: if we bound
to a socket on the lower layer we could connect only to the lower
path; if we bound to the upper layer we could connect only to the
upper path. The new behavior is one can connect to both the lower and
the upper paths regardless what layer path one binds to.
PR: kern/51583, kern/159663
Suggested by: kib
Reviewed by: arch
MFC after: 2 weeks
sys/xen/interface/io/blkif.h:
o Insert space in "Red Hat".
o Fix typo "discard-aligment" -> "discard-alignment"
o Fix typo "unamp" -> "unmap"
o Fix typo "formated" -> "formatted"
o Clarify the text for "params".
o Clarify the text for "sector-size".
o Clarify the text for "max-requests" in the backend section.
instead of accepting half-constructed vnode. Previous code cannot decide
what to do with such vnode anyway, and although processing it for hash
removal, paniced later when getting rid of nullfs reference on lowervp.
While there, remove initializations from the declaration block.
Tested by: pho
MFC after: 1 week
function null_destroy_proto() from null_insmntque_dtr(). Also
apply null_destroy_proto() in null_nodeget() when we raced and a vnode
is found in the hash, so the currently allocated protonode shall be
destroyed.
Lock the vnode interlock around reassigning the v_vnlock.
In fact, this path will not be exercised after several later commits,
since null_nodeget() cannot take shared-locked lowervp at all due to
insmntque() requirements.
Reported by: rea
Tested by: pho
MFC after: 1 week
- Reserver respective number of addresses for managment port
- octm uses base address directly
- other drivers get MACs on "first come first served" basis
Reviewed by: juli
external pagers in Mach. FreeBSD doesn't implement external pagers.
Moreover, it don't pageout the kernel object. So, the reasons for
having code don't hold.
Reviewed by: kib
MFC after: 6 weeks
following clang warning:
sys/kern/sys_pipe.c:1556:10: error: promoted type 'int' of K&R function parameter is not compatible with the parameter type 'mode_t'
(aka 'unsigned short') declared in a previous prototype [-Werror]
mode_t mode;
^
sys/kern/sys_pipe.c:155:19: note: previous declaration is here
static fo_chmod_t pipe_chmod;
^
Enforce a boundary of no more than 4GB - transfers crossing a 4GB
boundary can lead to data corruption due to PCIe limitations. This
change is a less-intrusive workaround that can be quickly merged back
to older branches; a cleaner implementation will arrive in HEAD later
but may require KPI changes.
This change is based on a suggestion by jhb@.
Reviewed by: scottl, jhb
Sponsored by: Sandvine Incorporated
MFC after: 3 days
amd64/i386/pc98 endian.h with stubs.
In __bswap64_const(x) the conflict between 0xffUL and 0xffULL has been
resolved by reimplementing the macro in terms of __bswap32(x). As a side
effect __bswap64_var(x) is now implemented using two bswap instructions on
i386 and should be much faster. __bswap32_const(x) has been reimplemented
in terms of __bswap16(x) for consistency.
get rid of testing explicitly for clang (using ${CC:T:Mclang}) in
individual Makefiles.
Instead, use the following extra macros, for use with clang:
- NO_WERROR.clang (disables -Werror)
- NO_WCAST_ALIGN.clang (disables -Wcast-align)
- NO_WFORMAT.clang (disables -Wformat and friends)
- CLANG_NO_IAS (disables integrated assembler)
- CLANG_OPT_SMALL (adds flags for extra small size optimizations)
As a side effect, this enables setting CC/CXX/CPP in src.conf instead of
make.conf! For clang, use the following:
CC=clang
CXX=clang++
CPP=clang-cpp
MFC after: 2 weeks
corruption. Thanks to scottl@ for the suggestion.
This change will likely be revised after consideration of a general
method to address this type of issue for other drivers.
Sponsored by: Sandvine Incorporated
MFC after: 3 days
extract a link status of PHY when parent driver is re(4).
RGEPHY_MII_SSR register does not seem to report correct PHY status
on some integrated PHYs used with re(4).
Unfortunately, RealTek PHYs have no additional information to
differentiate integrated PHYs from external ones so relying on PHY
model number is not enough to know that. However, it seems
RGEPHY_MII_SSR register exists for external RealTek PHYs so
checking parent driver would be good indication to know which PHY
was used. In other words, for non-re(4) controllers, the PHY is
external one and its revision number is greater than or equal to 2.
This change fixes intermittent link UP/DOWN messages reported on
RTL8169 controller.
Also, mii_attach(9) is tried after setting interface name since
rgephy(4) have to know parent driver name.
PR: kern/165509
not get syscall exit notification until the child performed exec of
exit. Swap the order of doing ptracestop() and waiting for P_PPWAIT
clearing, by postponing the wait into syscallret after ptracestop()
notification is done.
Reported, tested and reviewed by: Dmitry Mikulin <dmitrym juniper net>
MFC after: 2 weeks
USERSPACE:
1. add support for devices with different number of rx and tx queues;
2. add better support for zero-copy operation, adding an extra field
to the netmap ring to indicate how many buffers we have already processed
but not yet released (with help from Eddie Kohler);
3. The two changes above unfortunately require an API change, so while
at it add a version field and some spares to the ioctl() argument
to help detect mismatches.
4. update the manual page for the two changes above;
5. update sample applications in tools/tools/netmap
KERNEL:
1. simplify the internal structures moving the global wait queues
to the 'struct netmap_adapter';
2. simplify the functions that map kring<->nic ring indexes
3. normalize device-specific code, helps mainteinance;
4. start exploring the impact of micro-optimizations (prefetch etc.)
in the ixgbe driver.
Use 'legacy' descriptors on the tx ring and prefetch slots gives
about 20% speedup at 900 MHz. Another 7-10% would come from removing
the explict calls to bus_dmamap* in the core (they are effectively
NOPs in this case, but it takes expensive load of the per-buffer
dma maps to figure out that they are all NULL.
Rx performance not investigated.
I am postponing the MFC so i can import a few more improvements
before merging.
APIC is not found.
- Don't panic if lapic_enable_cmc() is called and the APIC is not enabled.
This can happen due to booting a kernel with APIC disabled on a CPU that
supports CMCI.
- Wrap a long line.
Descriptions are specific to drivers and we don't change drivers on attached
devices. This fixes a few places where we were not clearing the description
when detaching a driver (e.g. with device_attach() failed). While here, fix
a few other nits:
- Remove spurious call to remove a device's driver from
devclass_driver_deleted(). device_detach() removes it already.
- Fix a typo.
- In sched_pickcpu() be more careful taking previous CPU on SMT systems.
Do it only if all other logical CPUs of that physical one are idle to avoid
extra resource sharing.
- In sched_pickcpu() change general logic of CPU selection. First
look for idle CPU, sharing last level cache with previously used one,
skipping SMT CPU groups. If none found, search all CPUs for the least loaded
one, where the thread with its priority can run now. If none found, search
just for the least loaded CPU.
- Make cpu_search() compare lowest/highest CPU load when comparing CPU
groups with equal load. That allows to differentiate 1+1 and 2+0 loads.
- Make cpu_search() to prefer specified (previous) CPU or group if load
is equal. This improves cache affinity for more complicated topologies.
- Randomize CPU selection if above factors are equal. Previous code tend
to prefer CPUs with lower IDs, causing unneeded collisions.
- Rework periodic balancer in sched_balance_group(). With cpu_search()
more intelligent now, make balansing process flat, removing recursion
over the topology tree. That fixes double swap problem and makes load
distribution more even and predictable.
All together this gives 10-15% performance improvement in many tests on
CPUs with SMT, such as Core i7, for number of threads is less then number
of logical CPUs. In some tests it also gives positive effect to systems
without SMT.
Reviewed by: jeff
Tested by: flo, hackers@
MFC after: 1 month
Sponsored by: iXsystems, Inc.
Uftdi(4) examines (c_iflag & (IXON|IXOFF)) to control hw XON-XOFF support.
This is obviously no good, if changes to those bits are not communicated
down the stack.
allow.mount.zfs:
allow mounting the zfs filesystem inside a jail
This way the permssions for mounting all current VFCF_JAIL filesystems
inside a jail are controlled wia allow.mount.* jail parameters.
Update sysctl descriptions.
Update jail(8) and zfs(8) manpages.
TODO: document the connection of allow.mount.* and VFCF_JAIL for kernel
developers
MFC after: 10 days
socket protocol number. This is useful since the socket type can
be implemented by different protocols in the same protocol family,
e.g. SOCK_STREAM may be provided by both TCP and SCTP.
Submitted by: Jukka A. Ukkonen <jau iki fi>
PR: kern/162352
Discussed with: bz
Reviewed by: glebius
MFC after: 2 weeks
been bait-and-switched from the rate control code.
This will avoid the panic that I saw and will avoid sending invalid rates
(eg 11a/11g OFDM rates when in 11b, on 11b-only NICs (AR5211)) where the
rate table is not "big".
It also will point out situations where this occurs for the 11n NICs
which will have sufficiently large rate tables that "invalid rix" doesn't
occur.
I'll try to follow this up with a commit that adds a current operating mode
check. The "rix" is only relevant to the current operating mode and rate
table.
PR: kern/165475
* ath_reset() is being called in softclock context, which may have the
thing sleep on a lock. To avoid this, since we really _shouldn't_
be sleeping on any locks, break out the no-loss reset path into a tasklet
and call that from:
+ ath_calibrate()
+ ath_watchdog()
This has the added advantage that it'll end up also doing the frame
RX cleanup from within the taskqueue context, rather than the softclock
context.
* Shuffle around the taskqueue_block() call to be before we grab the lock
and disable interrupts.
The trouble here is that taskqueue_block() doesn't block currently
queued (but not yet running) tasks so calling it doesn't guarantee
no further tasks (that weren't running on _A_ CPU at the time of this
call) will complete. Calling taskqueue_drain() on these tasks won't
work because if any _other_ thread calls taskqueue_enqueue() for whatever
reason, everything gets very angry and stops working.
This slightly changes the race condition enough to let ath_rx_tasklet()
run before we try disabling it, and thus quietens the warnings a bit.
The (more) true solution will be doing something like the following:
* having a taskqueue_blocked mask in ath_softc;
* having an interrupt_blocked mask in ath_softc;
* only calling taskqueue_drain() on each individual task _after_ the
lock has been acquired - that way no further tasklet scheduling
is going to occur.
* Then once the tasks have been blocked _and_ the interrupt has been
disabled, call taskqueue_drain() on each, ensuring that anything
that _was_ scheduled or running is removed.
The trouble is if something calls taskqueue_enqueue() on a task
after taskqueue_blocked() has been called but BEFORE taskqueue_drain()
has been called, ta_pending will be set to 1 and taskqueue_drain()
will sit there stuck in msleep() until you hard-kill the machine.
PR: kern/165382
PR: kern/165220
unp->unp_vnode pointer to detect if there is a vnode associated with
(binded to) this socket and does necessary cleanup if there is.
The issue is that after forced unmount this check may be too late as
the unp_vnode is reclaimed and the reference is stale.
To fix this provide a helper function that is called on a socket vnode
reclamation to do necessary cleanup.
Pointed by: kib
Reviewed by: kib
MFC after: 2 weeks
RTL810x family , RTL8139 has different register map for Config
registers.
While here, follow the lead of re(4) in WOL configuration.
- Disable WOL_UCAST and WOL_MCAST capabilities by default.
- Config5 register write does not need to unlock EEPROM access
on RTL8139 family but unlocking EEPROM access does not affect
its operation and make it consistent with re(4).
Reported by: Matt Renzelmann mjr <> cs dot wisc dot edu
according to POSIX document, the clock ID may be dynamically allocated,
it unlikely will be in 64K forever. To make it future compatible, we
pack all timeout information into a new structure called _umtx_time, and
use fourth argument as a size indication, a zero means it is old code
using timespec as timeout value, but the new structure also includes flags
and a clock ID, so the size argument is different than before, and it is
non-zero. With this change, it is possible that a thread can sleep
on any supported clock, though current kernel code does not have such a
POSIX clock driver system.
call in an #if 0 section.
In in6_selecthlim() optimize a case where in6p cannot be NULL due to an
earlier check.
More consistently use u_int instead of int for fibnum function arguments.
Sponsored by: Cisco Systems, Inc.
MFC after: 3 days
bridge, this allows us to have more than one independent bridge in the same
STP domain.
PR: kern/164369
Submitted by: Nikos Vassiliadis (earlier version)
MFC after: 2 weeks
It doesn't _really_ help all that much, I'll commit something to
sys/net/if.c at some point explaining why, but the lock should be held
when checking/manipulating/branching because of said lock.
comlock, I'd like to find and analyse these cases to see if they
really are valid.
So, throw in a lock here and wait for the (hopefully!) inevitable
complaints.
- Remove all attempts to guess physical temperature using DiodeOffset.
There are too many reports that it varies wildly depending on motherboard.
Instead, if it is known to scale well and its offset is known from other
temperature sensors on board, the user may set "dev.amdtemp.0.sensor_offset"
tunable to compensate the difference. Document the caveats in amdtemp(4).
- Add a quirk for Socket AM2 Revision G processors. These processors are
known to have a different offset according to Linux k8temp driver.
- Warn about Family 10h Erratum 319. These processors have broken sensors.
- Report temperature in more logical orders under dev.amdtemp node. For
example, "dev.amdtemp.0.sensor0.core0" is now "dev.amdtemp.0.core0.sensor0".
- Replace K8, K10 and K11 with official processor names in amdtemp(4).
v_writecount. Keep the amount of the virtual address space used by
the mappings in the new vm_object un_pager.vnp.writemappings
counter. The vnode v_writecount is incremented when writemappings gets
non-zero value, and decremented when writemappings is returned to
zero.
Writeable shared vnode-backed mappings are accounted for in vm_mmap(),
and vm_map_insert() is instructed to set MAP_ENTRY_VN_WRITECNT flag on
the created map entry. During deferred map entry deallocation,
vm_map_process_deferred() checks for MAP_ENTRY_VN_WRITECOUNT and
decrements writemappings for the vm object.
Now, the writeable mount cannot be demoted to read-only while
writeable shared mappings of the vnodes from the mount point
exist. Also, execve(2) fails for such files with ETXTBUSY, as it
should be.
Noted by: tegge
Reviewed by: tegge (long time ago, early version), alc
Tested by: pho
MFC after: 3 weeks
a new jail parameter node with the following parameters:
allow.mount.devfs:
allow mounting the devfs filesystem inside a jail
allow.mount.nullfs:
allow mounting the nullfs filesystem inside a jail
Both parameters are disabled by default (equals the behavior before
devfs and nullfs in jails). Administrators have to explicitly allow
mounting devfs and nullfs for each jail. The value "-1" of the
devfs_ruleset parameter is removed in favor of the new allow setting.
Reviewed by: jamie
Suggested by: pjd
MFC after: 2 weeks
at which the lle_tbl pointer points to freed memory and the llt_free pointer is no longer
valid.
Move the free pointer in to the llentry itself and update the initalization sites.
MFC after: 2 weeks
"panic in 8.3-PRERELEASE" on Feb. 22, 2012. This panic was caused
by use of a mix of tsleep() and msleep() calls on the same event
in the new NFS server DRC code. It did "mtx_unlock(); tsleep();"
in two places, which kib@ noted introduced a slight risk that the
wakeup() would occur before the tsleep(), resulting in a 10sec
delay before waking up. This patch fixes the problem by replacing
"mtx_unlock(); tsleep();" with mtx_sleep(..PDROP..). It also
changes a nfsmsleep() call to mtx_sleep() so that the code uses
mtx_sleep() consistently within the file.
Tested by: hrs (in progress)
Reviewed by: jhb
MFC after: 5 days
to the debugger. When reparenting for debugging, keep the child in
the new orphan list of old parent. When looping over the children in
kern_wait(), iterate over both children list and orphan list to search
for the process by pid.
Submitted by: Dmitry Mikulin <dmitrym juniper.net>
MFC after: 2 weeks
I'm not sure _why_ the ic is NULL here, but I've seen it occasionally do
this after I've been tinkering with things for a while. It ends up
crashing in a call to ath_chan_set() via the net80211 scan code and scan
task.
don't give RX path more priority than TX path.
Also remove infinite loop in interrupt handler and limit number of
iteration to 32. This change addresses system load fluctuations
under high network load.
found on Adaptec AIC-6915 Starfire ethernet controller.
While here, use status register to know resolved speed/duplex.
With this change, sf(4) correctly reports speed/duplex of
established link.
Reviewed by: marius
the traffic flow, this may not be the case giving poor traffic distribution.
Add a sysctl which allows us to fall back to our own flow hash code.
PR: kern/164901
Submitted by: Eugene Grosbein
MFC after: 1 week
UMTX_OP_WAIT. Upper 16bits is enough to hold a clock id, and lower
16bits is used to pass flags. The change saves a clock_gettime() syscall
from libthr.
- Centralize address assignment
- Make sure managment ports get first MAC address in pool
- Properly propagate fail if address allocation failed
Submitted by: Andrew Duane <aduane@juniper.net>
sys/dev/hpt27xx/osm_bsd.c, since it gets the following warnings:
sys/dev/hpt27xx/osm_bsd.c:1180:25: error: format string is not a string literal (potentially insecure) [-Werror,-Wformat-security]
S_IRUSR | S_IWUSR, driver_name);
^~~~~~~~~~~
@/dev/hpt27xx/hpt27xx_config.h:46:21: note: expanded from:
#define driver_name hpt27xx_driver_name
^~~~~~~~~~~~~~~~~~~
Since 'hpt27xx_driver_name' is a constant string symbol (coming from the
proprietary hpt27xx_lib.o file), there is no security problem.
Because this driver is provided by the vendor, and applying changes
requires re-certification and other bureaucratic exercises, just disable
the warning for now.
MFC after: 1 week
several sys/cam/ctl files, since these get the following warnings:
In file included from sys/cam/ctl/ctl_backend.c:60:
sys/cam/ctl/ctl_private.h:300:30: error: variable 'page_index_template' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
static struct ctl_page_index page_index_template[] = {
^
These warnings are tricky to fix without a lot of overhaul, and they are
harmless, so disable them for now.
MFC after: 1 week
Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from
the usermode.
Discussed with: bde, das (previous versions)
MFC after: 1 month
types 0x05 and 0x0f, but 0x05 is preferred and used when partition is
created with "gpart add -t ebr ...".
This should keep EBR partitions accessible after r231754 for those,
who have EBR on the partition with type 0x0f.
- In case the parent is bge(4), don't set the Jumbo frame settings unless
the MAC actually is Jumbo capable as otherwise the PHY might not have the
corresponding registers implemented. This is also in line with what the
Linux tg3 driver does.
PR: 165032
Submitted by: Alexander Milanov
Obtained from: OpenBSD
MFC after: 3 days
hold the lock.
This is part of my series of work to try and capture when net80211
locking isn't.
ObNote: it'd be nice to be able to mark a lock as "assert if the lock
is dropped", so I could capture functions which decide that dropping
and reacquiring the lock is a good idea (without re-checking the
sanity of the state protected by the lock.)
Vnode-backed mappings cannot be put into the kernel map, since it is a
system map.
Use exec_map for transient mappings, and remove the mappings with
kmem_free_wakeup() to notify the waiters on available map space.
Do not map the whole executable into KVA at all to copy it out into
usermode. Directly use vn_rdwr() for the case of not page aligned
binary.
There is one place left where the potentially unbounded amount of data
is mapped into exec_map, namely, in the COFF image activator
enumeration of the needed shared libraries.
Reviewed by: alc
MFC after: 2 weeks
devices that are unplugged via QEMU.
sys/dev/xen/blkback/blkback.c:
Toolstack initiated closures change the frontend's state
to Closing. The backend must change to Closing as well,
even if we can't actually close yet, in order for the
frontend to notice and start the closing process.
MFC after: 3 days
- remove the KEVENT code, which was incomplete and not compiled anyways;
- change some while() loops into for()
- adjust indentation
- remove extra whitespace
MFC after: 1 week
struct ccb_pathinq from sys/cam/cam_ccb.h wasn't added to stable/7 at all
and didn't appear in stable/8 until svn R195534. Since __FreeBSD_version
did not get bumped until svn R195634, assume that maxio is valid at 800102
or higher.
Obtained from: Yahoo! Inc.
MFC after: 0 days
with RX/TX halting.
* Always disable/enable interrupts during a channel change, just to simply
things.
* Ensure that the ath taskqueue has completed and is paused before
continuing.
This dramatically reduces the instances of overlapping RX and reset
conditions.
PR: kern/165220
The previous code did not limit the I/O request size based on
the maximum number of segments supported by the back-end. In
current practice, since the only back-end supporting chained
requests is the FreeBSD implementation, this limit was never
exceeded.
sys/dev/xen/blkfront/block.h:
Add two macros, XBF_SEGS_TO_SIZE() and XBF_SIZE_TO_SEGS(),
to centralize the logic of reserving a segment to deal with
non-page-aligned I/Os.
sys/dev/xen/blkfront/blkfront.c:
o When negotiating transfer parameters, limit the
max_request_size we use and publish, if it is greater
than the maximum, unaligned, I/O we can support with
the number of segments advertised by the backend.
o Don't unilaterally reduce the I/O size published to
the disk layer by a single page. max_request_size
is already properly limited in the transfer parameter
negotiation code.
o Fix typos in printf strings:
"max_requests_segments" -> "max_request_segments"
"specificed" -> "specified"
MFC after: 1 day
- Make hash sizes growable, to satisfy users running large mpd
installations, having thousands of nodes.
- NG_NAMEHASH() proved to give a very bad distribution in real life
name sets, while generic hash32_str(name, HASHINIT) proved to give
an even one, so you the latter for name hash.
- Do not store unnamed nodes in slot 0 of name hash, no reason for that.
- Use the ID hash in cases when we need to run through all nodes: the
NGM_LISTNODES command and in the vnet_netgraph_uninit().
- Implement NGM_LISTNODES and NGM_LISTNAMES as separate code, the former
iterates through the ID hash, and the latter through the name hash.
- Keep count of all nodes and of named nodes, so that we don't need
to count nodes in NGM_LISTNODES and NGM_LISTNAMES. The counters are
also used to estimate whether we need to grow hashes.
- Close a race between two threads running ng_name_node() assigning same
name to different nodes.
messages were printed.
This can be enabled with the kern.msgbuf_show_timestamp sysctl
PR: kern/161553
Reviewed by: avg
Submitted by: Arnaud Lacombe <lacombar@gmail.com>
Approved by: cperciva
MFC after: 1 month
spinlock_enter()/spinlock_exit() to save/restore RFLAGS. We know interrupt
is disabled when returning from S3. For AP, we do not have to save/restore
it because IRET will do it for us any way. Do not save CR3 locally because
savectx() does it and BSP does not have to switch to kernel map for amd64.
Change contigmalloc(9) flag while I am in the neighborhood.
Introduce some functions to map NIC ring indexes into netmap ring
indexes and vice versa. This way we can implement the bound
checks only in one place (and hopefully in a correct way).
On passing, make the code and comments more uniform across the
various drivers.
Mask off the first 16 pages unless we appear to be running in a VM. This
address may be overridden by 'hw.physmem.start' tunable from loader.
Note Linux used to have a BIOS quirk table for this issue but it seems they
made it default recently.
hz >> 1000 and thus getting outside the timestamp clock frequenceny of
1ms < x < 1s per tick as mandated by RFC1323, leading to connection
resets on idle connections.
Always use a granularity of 1ms using getmicrouptime() making all but
relevant callouts independent of hz.
Use getmicrouptime(), not getmicrotime() as the latter may make a jump
possibly breaking TCP nfsroot mounts having our timestamps move forward
for more than 24.8 days in a second without having been idle for that
long.
PR: kern/61404
Reviewed by: jhb, mav, rrs
Discussed with: silby, lstewart
Sponsored by: Sandvine Incorporated (originally in 2011)
MFC after: 6 weeks
don't try probe and create EBR scheme when parent partition type
is not "ebr". This fixes error messages about corrupted EBR for
some partitions where is actually another partition scheme.
NOTE: if you have EBR on the partition with different than "ebr"
(0x05) type, then you will lost access to partitions until it will be
changed.
MFC after: 2 weeks
FreeBSD's front and back Xen blkif interface drivers.
sys/dev/xen/blkfront/block.h:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkback/blkback.c:
Replace FreeBSD specific multi-page ring impelementation with
support for both the Citrix and Amazon/RedHat versions of this
extension.
sys/dev/xen/blkfront/blkfront.c:
o Add a per-instance sysctl tree that exposes all negotiated
transport parameters (ring pages, max number of requests,
max request size, max number of segments).
o In blkfront_vdevice_to_unit() add a missing return statement
so that we properly identify the unit number for high numbered
xvd devices.
sys/dev/xen/blkback/blkback.c:
o Add static dtrace probes for several events in this driver.
o Defer connection shutdown processing until the front-end
enters the closed state. This avoids prematurely tearing
down the connection when buggy front-ends transition to the
closing state, even though the device is open and they
veto the close request from the tool stack.
o Add nodes for maximum request size and the number of active
ring pages to the exising, per-instance, sysctl tree.
o Miscelaneous style cleanup.
sys/xen/interface/io/blkif.h:
o Add extensive documentation of the XenStore nodes used to
implement the blkif interface.
o Document the startup sequence between a front and back driver.
o Add structures and documenatation for the "discard" feature
(AKA Trim).
o Cleanup some definitions related to FreeBSD's request
number/size/segment-limit extension.
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkback/blkback.c:
sys/xen/xenbus/xenbusvar.h:
Add the convenience function xenbus_get_otherend_state() and
use it to simplify some logic in both block-front and block-back.
MFC after: 1 day
This allows LUNs greater than 0 to be probed. It can be increased later if
need be.
This brings back SVN rev 224973, which was inadvertently removed with the
import of the LSI driver.
Reported by: dwhite
MFC after: 3 days