temporarily stores characters if the TTY buffer is full when
used a as a console. This can happen when a console is suspended.
Also properly do the flow stop signalling when this happens and
flow start when the condition changes back to normal again.
Bump __FreeBSD_version to force external kernel modules
to be recompiled. No kernel API changes.
MFC after: 1 week
Suggested by: ed @
give rwlock(9) the ability to crunch different type of structures, with
the only constraint that they have a lock cookie named rw_lock.
This name, then, becames reserved from the struct that wants to use
the rwlock(9) KPI and other locking primitives cannot reuse it for
their members.
Namely such structs are the current struct rwlock and the new struct
rwlock_padalign. The new structure will define an object which has the
same layout of a struct rwlock but will be allocated in areas aligned
to the cache line size and will be as big as a cache line.
For further details check comments on above mentioned revisions.
Reviewed by: jimharris, jeff
only constraint that they have a lock cookie named mtx_lock.
This name, then, becames reserved from the struct that wants to use the
mtx(9) KPI and other locking primitives cannot reuse it for their
members.
Namely such structs are the current struct mtx and the new
struct mtx_padalign. The new structure will define an object which is
the same as the same layout of a struct mtx but will be allocated in
areas aligned to the cache line size and will be as big as a cache line.
This is supposed to give higher performance for highly contented mutexes
both spin or sleep (because of the adaptive spinning), where the cache
line contention results in too much traffic on the system bus.
The struct mtx_padalign can be used in a completely transparent way
with the mtx(9) KPI.
At the moment, a possibility to MFC the patch should be carefully
evaluated because this patch breaks the low level KPI
(not its representation though).
Discussed with: jhb
Reviewed by: jeff, andre
Reviewed by: mdf (earlier version)
Tested by: jimharris
sched_unpin() as they are functions static and inline. This way it
can do two dangerous things:
- Reorder instructions around both of them, taking out from the safe
path operations that are supposed to be (ie. per-cpu accesses)
- Cache the value of td_pinned in CPU registers not making visible
in kernel context to the scheduler once it is scanning the runqueue,
as td_pinned is not marked volatile.
In order to avoid both possible bugs explicitly, protect the safe path
with compiler memory barriers. This will prevent reordering and caching
by the compiler about td_pinned operations.
Generally this could lead to suboptimal code traversing the pinnings
but this is not the case as can be easilly verified:
http://lists.freebsd.org/pipermail/svn-src-projects/2012-October/005797.html
Discussed with: jeff, jhb
MFC after: 2 weeks
with softupdates went away. Note that this does not fix the problem
entirely; I'm committing it now to make it easier for someone to pick
up the work.
Reviewed by: mckusick
In the old TTY layer, SIGTTIN was correctly handled like this:
while (data should be read) {
send SIGTTIN if not foreground process group
read data
}
In the new TTY layer, however, this behaviour was changed, based on a
false interpretation of the standard:
send SIGTTIN if not foreground process group
while (data should be read) {
read data
}
Correct this by pushing tty_wait_background() into the ttydisc_read_*()
functions.
Reported by: koitsu
PR: kern/173010
MFC after: 2 weeks
in network byte order. Any host byte order processing is
done in local variables and host byte order values are
never[1] written to a packet.
After this change a packet processed by the stack isn't
modified at all[2] except for TTL.
After this change a network stack hacker doesn't need to
scratch his head trying to figure out what is the byte order
at the given place in the stack.
[1] One exception still remains. The raw sockets convert host
byte order before pass a packet to an application. Probably
this would remain for ages for compatibility.
[2] The ip_input() still subtructs header len from ip->ip_len,
but this is planned to be fixed soon.
Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru>
Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
(Model 0x2D /* Per Intel document 253669-044US 08/2012. */)
Add manpage to document all the goodness that is available in this
processor model.
No support for uncore events at this time.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com>
Reviewed by: jimharris@ fabient@
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
GIANT from VFS. In addition, disconnect also netsmb, which is a base
requirement for SMBFS.
In the while SMBFS regular users can use FUSE interface and smbnetfs
port to work with their SMBFS partitions.
Also, there are ongoing efforts by vendor to support in-kernel smbfs,
so there are good chances that it will get relinked once properly locked.
This is not targeted for MFC.
now use function calls:
if_clone_simple()
if_clone_advanced()
to initialize a cloner, instead of macros that initialize if_clone
structure.
Discussed with: brooks, bz, 1 year ago
counter, without actually allocating the vnodes. The supposed use of
the getnewvnode_reserve(9) is to reclaim enough free vnodes while the
code still does not hold any resources that might be needed during the
reclamation, and to consume the slack later for getnewvnode() calls
made from the innards. After the critical block is finished, the
caller shall free any reserve left, by getnewvnode_drop_reserve(9).
Reviewed by: avg
Tested by: pho
MFC after: 1 week
instruction loads/stores at its will.
The macro __compiler_membar() is currently supported for both gcc and
clang, but kernel compilation will fail otherwise.
Reviewed by: bde, kib
Discussed with: dim, theraven
MFC after: 2 weeks
- All packets in NETISR_IP queue are in net byte order.
- ip_input() is entered in net byte order and converts packet
to host byte order right _after_ processing pfil(9) hooks.
- ip_output() is entered in host byte order and converts packet
to net byte order right _before_ processing pfil(9) hooks.
- ip_fragment() accepts and emits packet in net byte order.
- ip_forward(), ip_mloopback() use host byte order (untouched actually).
- ip_fastforward() no longer modifies packet at all (except ip_ttl).
- Swapping of byte order there and back removed from the following modules:
pf(4), ipfw(4), enc(4), if_bridge(4).
- Swapping of byte order added to ipfilter(4), based on __FreeBSD_version
- __FreeBSD_version bumped.
- pfil(9) manual page updated.
Reviewed by: ray, luigi, eri, melifaro
Tested by: glebius (LE), ray (BE)
tree used it incorrectly, which lead to inaccurate overrated
if_obytes accounting. The drbr(9) used to update ifnet stats on
drbr_enqueue(), which is not accurate since enqueuing doesn't
imply successful processing by driver. Dequeuing neither mean
that. Most drivers also called drbr_stats_update() which did
accounting again, leading to doubled if_obytes statistics. And
in case of severe transmitting, when a packet could be several
times enqueued and dequeued it could have been accounted several
times.
o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between
ALTQ queueing or buf_ring(9) queueing.
- It doesn't touch the buf_ring stats any more.
- It doesn't touch ifnet stats anymore.
- drbr_stats_update() no longer exists.
o buf_ring(9) handles its stats itself:
- It handles br_drops itself.
- br_prod_bytes stats are dropped. Rationale: no one ever
reads them but update of a common counter on every packet
negatively affects performance due to excessive cache
invalidation.
- buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since
we no longer account bytes.
o Drivers handle their stats theirselves: if_obytes, if_omcasts.
o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer
use drbr_stats_update(), and update ifnet stats theirselves.
o bxe(4) was the most correct driver, it didn't call
drbr_stats_update(), thus it was the only driver accurate under
moderate load. Now it also maintains stats itself.
o ixgbe(4) had already taken stats from hardware, so just
- drop software stats updating.
- take multicast packet count from hardware as well.
o mxge(4) just no longer needs NO_SLOW_STATS define.
o cxgb(4), cxgbe(4) need no change, since they obtain stats
from hardware.
Reviewed by: jfv, gnn
directly in _rmlock.h and then including it (and its dependencies)
in pcpu.h. This leads to few _*.h headers to be included in pcpu.h
but this is not considered a big deal.
Really pc_rm_queue should be implemented as a dynamic member with
DPCPU interface, but we really want to keep the read acquisition as
fast as possible, so even the further pc_dynamic indirection should be
avoided, and the pollution is dealt like this.
Discussed with: jhb
MFC after: 1 week
Compared to __member2struct(), this macro has the following advantages:
- It ensures that the type of the pointer is compatible with the member
field of the structure (or a void pointer).
- It works properly in combination with volatile and const, though
unfortunately it drops these qualifiers from the returned value.
mdf@ proposed to add the container_of() macro, just like Linux has.
Eventually I decided against this, as <sys/param.h> is included all over
the place. It seems container_of() on Linux is specific to the kernel,
not userspace. I'd rather not pollute userspace with this.
I also thought about adding __container_of(), but this would have two
advantages. Xorg seems to already have a __container_of(), which is not
compatible with this version. Also, the underscore in the middle
conflicts with our existing macros (__offsetof, __rangeof, etc).
I'm changing member2struct() to use its old code, as the extra
strictness of this new macro conflicts with existing code (read: cxgb).
MFC after: 1 month
The prev-pointers point to the next-pointers of the previous element --
not the ENTRY structure. The next-pointers are stored in the ENTRY
structures first, so the code would already work correctly. Still, it is
more accurate to use the next-fields.
To prevent misuse of __member2struct() in the future, I've got a patch
that requires the pointer to be passed to this macro to be compatible
with the member of the structure. I'll commit this patch after I've
tested it properly.
MFC after: 1 month.
Regular LISTs have been implemented in such a way that the prev-pointer
does not point to the previous element, but to the next-pointer stored
in the previous element. This is done to simplify LIST_REMOVE(). This
macro can be implemented without knowing the address of the list head.
Unfortunately this makes it harder to implement LIST_PREV(), which is
why this macro was never here. Still, it is possible to implement this
macro. If the prev-pointer points to the list head, we return NULL.
Otherwise we simply subtract the offset of the prev-pointer within the
structure.
It's not as efficient as traversing forward of course, but in practice
it shouldn't be that bad. In almost all use cases, people will want to
compare the value returned by LIST_PREV() against NULL, so an optimizing
compiler will not emit code that does more branching than TAILQs.
While there, make the code a bit more readable by introducing
__member2struct(). This makes STAILQ_LAST() far more readable.
MFC after: 1 month
about vnode reclamation. Typical use is for the bypass mounts like
nullfs to get a notification about lower vnode going away.
Now, vgone() calls new VFS op vfs_reclaim_lowervp() with an argument
lowervp which is reclaimed. It is possible to register several
reclamation event listeners, to correctly handle the case of several
nullfs mounts over the same directory.
For the filesystem not having nullfs mounts over it, the overhead
added is a single mount interlock lock/unlock in the vnode reclamation
path.
In collaboration with: pho
MFC after: 3 weeks
lookup code that dotdot lookups shall override any shared lock
requests with the exclusive one. The flag is useful for filesystems
which sometimes need to upgrade shared lock to exclusive inside the
VOP_LOOKUP or later, which cannot be done safely for dotdot, due to
dvp also locked and causing LOR.
In collaboration with: pho
MFC after: 3 weeks
NetBSD/pc98 was never merged into the main NetBSD tree and is no longer
developed. Adding locking to these drivers would have made the compat
shims hard to impossible to maintain, so remove the shims to ease
future changes.
These changes were verified by md5. Some additional shims can be removed
that do affect the compiled results that I will probably do in another
round.
Approved by: nyan (tentatively)
- Provide missing function that can do hashing of arbitrary sized buffer.
- Refetch lookup3.c and do only minimal edits to it, so that diff between
our jenkins_hash.c and lookup3.c is minimal.
- Add declarations for jenkins_hash(), jenkins_hash32() to sys/hash.h.
- Document these functions in hash(9)
Obtained from: http://burtleburtle.net/bob/c/lookup3.c
handler and not more statically.
Unfortunately, it seems that this is not ideal for new platform bringup
and boot low level development (which needs ktr_cpumask to be effective
before tunables can be setup).
Because of this, add a way to statically initialize cpusets, by passing
an list of initializers, divided by commas. Also, provide a way to enforce
an all-set mask, for above mentioned initializers.
This imposes some differences on how KTR_CPUMASK is setup now as a
kernel option, and in particular this makes the words specifications
backward wrt. what is currently in -CURRENT. In order to avoid mismatches
between KTR_CPUMASK definition and other way to setup the mask
(tunable, sysctl) and to print it, change the ordering how
cpusetobj_print() and cpusetobj_scan() acquire the words belonging
to the set.
Please give a look to sys/conf/NOTES in order to understand how the
new format is supposed to work.
Also, ktr manpages will be updated shortly by gjb which volountereed
for this.
This patch won't be merged because it changes a POLA (at least
from the theoretical standpoint) and this is however a patch that
proves to be effective only in development environments.
Requested by: rpaulo
Reviewed by: jeff, rpaulo
PMC_PROC_IS_USING_PMCS macro.
Invocations of this macro are not synchronized with setting/clearing
of P_HWPMC flag, so the atomic operation here isn't needed. Removing
the atomic operation provides noticeable improvement (5-6%) on
some scheduler-intensive workloads with HWPMC_HOOKS enabled on an
8C Sandy Bridge Xeon system.
Sponsored by: Intel
Reviewed by: jhb
MFC after: 1 week
"device_free_softc()" and "device_claim_softc()",
to allow USB serial drivers refcounting the softc.
These functions are used to grab the softc from
auto-free and to free the softc back to the correct
malloc type, respectivly.
Discussed with: jhb
MFC after: 2 weeks