also fix up handling and proding of the tx, _OACTIVE is now handled
better...
Submitted by: Peter Edwards (sk_jfree)
Obtained from: OpenBSD and/or NetBSD (tx prod)
instead of a vnode for it.
The vnode_pager does not and should not have any interest in what
the filesystem uses for backend.
(vfs_cluster doesn't use the backing store argument.)
i386 to dev/acpi_support. In theory, these devices could be found
other than in i386 machines only as amd64 becomes more popular. These
drivers don't appear to do anything i386 specific, so move them to
dev/acpi_support. Move config lines to files so that those
architectures that don't support kernel modules can build them into
the kernel. At the same time, rename acpi_snc to acpi_sony to follow
the lead of all the other specialty devices.
connects to the keyboard and mouse and needs some special treatment.
Until this is fully understood, implemented and tested, simply avoid
probing the second Z8530. This is also what the zs(4) driver does.
table with console settings, we now only need to know at which
address the UART lives. Leaving the baudrate unspecified results
in us using the baudrate at which the UART operates. This removes
one parameter that can interfere with a successful installation
out of the box.
current baudrate setting. Use this ioctl() when we don't know the
baudrate of the sysdev (as represented by a 0 value). When the
ioctl() fails, e.g. when the backend hasn't implemented it or the
hardware doesn't provide the means to determine its current baudrate
setting, we invalidate the baudrate setting by setting it to -1.
None of the backends currently implement the new ioctl().
A baudrate we consider insane is silently replaced with 0. When the
baudrate is 0, we will not try to program the hardware. Instead we
leave the communication speed unaltered, maximizing the chance to
have a working console. Obviously this means we allow specifying a
0 baudrate for exactly that purpose.
- Because em_encap() can now fail in a way that leaves us without an
mbuf chain, potentially set *m_headp to NULL if that happens, so that
the caller can do the right thing. This case can occur when we try
to prepend the vlan header mbuf but can't allocate additional memory.
- Modify the caller of em_encap() to detect a NULL m_head and not try
to queue the mbuf if that happens.
- When em_encap() fails, make sure to call bus_dmamap_destroy() to
clean up.
but sk(4) is so prevalent on AMD64 motherboards we need to reduce the number
of round trips in the mailing lists trying to get sufficient information to
make sure we've got a handle on all the problems and are working towards
making sk(4) solid.
Submitted by: bz
isn't worth adding to the modules lists that we have to hard code for
this to work. Since we print PID right away, we have a trace point
already.
Minor knf while I'm here.
to a cdev and a devsw, doing all the relevant checks along the way.
Add the check to see if fp->f_vnode->v_rdev differs from our cached
fp->f_data copy of our cdev. If it does the device was revoked and
we return ENXIO.
Use this in all the places where sleeping with the lock held is not
an issue.
The distinction will become significant once we finalize the exact
lock-type to use for this kind of case.
return EINVAL rather than setting error, and don't free sops
unconditionally. The first change was merged accidentally as part of
the larger set of changes to introduce MAC labels and access control,
and potentially lead to continued processing of a request even after
it was determined to be invalid. The second change was due to changes
in the semaphore code since the original work was performed.
Pointed out by: truckman
models of laptops, which are essentially the same as the normal
ones, as far as acpi_asus is concerned[1]
o Use the above as an excuse to reshuffle the mess I made of the
probe function when I originally wrote it.
Reported by: Soeren Larsen <soeren@whiteswan.dk>
and has been broken twice:
- in the beginning of div_output() replace KASSERT with assignment, as
it was in rev. 1.83. [1] [to be MFCed]
- refactor changes introduced in rev. 1.100: do not prepend a new tag
unconditionally. Before doing this check whether we have one. [2]
A small note for all hacking in this area:
when divert socket is not a real userland, but ng_ksocket(4), we receive
_the same_ mbufs, that we transmitted to socket. These mbufs have rcvif,
the tags we've put on them. And we should treat them correctly.
Discussed with: mlaier [1]
Silence from: green [2]
Reviewed by: maxim
Approved by: julian (mentor)
MFC after: 1 week
This makes it possible to have more than one address with the same prefix.
The first address added is used for the route. On deletion of an address
with IFA_ROUTE set, we try to find a "fallback" address and hand over the
route if possible.
I plan to MFC this in 4 weeks, hence I keep the - now obsolete - argument to
in_ifscrub as it must be considered KAPI as it is not static in in.c. I will
clean this after the MFC.
Discussed on: arch, net
Tested by: many testers of the CARP patches
Nits from: ru, Andrea Campi <andrea+freebsd_arch webcom it>
Obtained from: WIDE via OpenBSD
MFC after: 1 month
to be modified and extended without breaking the user space ABI:
Use _kernel variants on _ds structures for System V sempahores, message
queues, and shared memory. When interfacing with userspace, export
only the _ds subsets of the _kernel data structures. A lot of search
and replace.
Define the message structure in the _KERNEL portion of msg.h so that it
can be used by other kernel consumers, but not exposed to user space.
Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, SPAWAR, McAfee Research
to be modified and extended without breaking the user space ABI:
Define _kernel wrapper data structures for the user-exposed data
structures that current server as the internal data structures for
the implementation:
- struct msqid_kernel wraps struct msqid_ds.
- struct semid_kernel wraps truct semid_ds.
- struct shmid_kernel wraps struct shmid_ds.
- Don't expose extern definition 'shmsegs' outside of sysv_shm.c.
Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, SPAWAR, McAfee Research
promiscuous mode introduced in 1.45, which programs the em card not
to strip or prepend tags when in promiscuous mode without also
modifying behavior to manually prepend a vlan header in the event
that the card isn't doing it on transmit. Due to a feature of card
operation, if the global VLAN prepend/strip register isn't set,
setting the VLAN tag flag on individual packet descriptors will
cause the packet to be transmitted using ISL encapsulation rather
than 802.1Q VLAN encapsulation.
This fix causes em_encap() to prepend the header by tracking whether
the card is configured to temporarily disable prepending/stripping
due to promiscuous mode. As a result, entering promiscuous mode on
the parent em interface no longer causes vlans to appear to "wedge"
or transmit ISL-encapsulated frames, which typically will not be
configured/spoken by the other endpoints on the VLAN trunk. This
bug may also exist in other drivers, and the additional vlan
encapsulation logic should be abstracted and centralized in
if_vlan.c if so.
RELENG_5_3 candidate.
MFC after: 1 week
Tested by: pjd, rwatson
Reported by: astesin at ukrtelecom dot net
Reported by: Mike Tancsa <mike at sentex dot net>
Reported by: Iasen Kostov <tbyte at OTEL dot net>
reports of problems. The bug is probably that there are cases where
`xfer->timeout && !sc->sc_bus.use_polling' is not a suitable test
for an active timeout callout, so an explicit flag will be necessary.
Apologies for the breakage.
the tree. Small tweaks were made by myself to eliminate unnecessary
includes and some other minor issues. Last time I asked takawata-san
about this driver, he suggested I commit it.
Submitted by: takawata
of atomic_store_rel().
- Use the 80386 versions of atomic_load_acq() and atomic_store_rel() that
do not use serializing instructions on all UP kernels since a UP machine
does need to synchronize with other CPUs. This trims lots of cycles from
spin locks on UP kernels among other things.
Benchmarked by: rwatson
a bridge without a _PRT were a _PRT was needed. Instead, the warning in
dmesg is a false warning and only serves to cause unnecessary concern.
MFC after: 1 week
be made holding the NFS server mutex. To clean this up, introduce a
version of the function, nfsrv_access_withgiant(), that expects the
NFS server mutex to already have been dropped and Giant acquired.
Wrap nfsrv_access() around this. This permits callers to more
efficiently check access if they're in a code block performing VFS
operations, and can be substitited for the nfsrv_access() call that
triggered this bug.
PR: 73807, 73208
MFC after: 1 week
that the events queue is empty. In other case we're able to hit the race
where for example da0s1 is tasted by some other class, which means that
da0 is open with exclusive bit set, which means that we can't open da0
for writing if it is our component.
Reported by: Attila Nagy <bra@fsn.hu> (and somebody else sometime ago,
but I cannot find who it was)
transfer timeouts that typically cause a transfer to be completed
twice, resulting in panics and page faults:
o A transfer completion interrupt could arrive while an abort_task
event was set up, so the transfer would be aborted after it had
completed. This is very easy to reproduce. Fix this by setting
the transfer status to USBD_TIMEOUT before scheduling the
abort_task so that the transfer completion code will ignore it.
o The transfer completion code could execute concurrently with the
timeout callout, leaving the callout blocked (e.g. waiting for
Giant) while the transfer completion code runs. In this case,
callout_stop() does not prevent the callout from running, so
again the timeout code would run after the transfer was complete.
Handle this case by checking the return value from callout_stop(),
and ignoring the transfer if the callout could not be removed.
o Finally, protect against a timeout callout occurring while a
transfer is being aborted by another process. Here we arrange
for the timeout processing to ignore the transfer, and use
callout_drain() to ensure that the callout has really gone before
completing the transfer.
This was tested by repeatedly performing USB transfers with a timeout
set to approximately the same as the normal transfer completion
time. In the PR below, apparently this occurred by accident with a
particular printer and the default timeout.
PR: kern/71491
changes associated with adding System V IPC support. This will prevent
old modules from being used with the new kernel, and new modules from
being used with the old kernel.
appropriate for different tag requirements. With the former global pool,
bounce pages might get allocated that are appropriate for one tag, but not
appropriate for another, but the system had no way to distinguish between them.
Now zones with distinct attributes are created to hold pages, and each tag
that requires bouncing is associated with a zone. New zones are created as
needed if no existing zones can meet the requirements of the tag. Stats for
each zone are tracked via the hw.busdma sysctl node.
This should help drivers that are failing with mysterious data corruption.
MFC After: 1 week
includes the latter, but also declares variables which are defined
in kern/subr_param.c).
Change som VM parameters from quad_t to unsigned long. They refer to
quantities (size limits for text, heap and stack segments) which must
necessarily be smaller than the size of the address space, so long is
adequate on all platforms.
MFC after: 1 week
sensitive, but less excercised location (the watchdog). While here use the
*_start_locked function directly to avoid drop, grab, drop lock.
I have to be very careful with future ALTQ patches!
Found & reviewed by: rwatson
MFC after: 3 days
The tunable vfs.devfs.fops controls this feature and defaults to off.
When enabled (vfs.devfs.fops=1 in loader), device vnodes opened
through a filedescriptor gets a special fops vector which instead
of the detour through the vnode layer goes directly to DEVFS.
Amongst other things this allows us to run Giant free read/write to
device drivers which have been weaned off D_NEEDGIANT.
Currently this means /dev/null, /dev/zero, disks, (and maybe the
random stuff ?)
On a 700MHz K7 machine this doubles the speed of
dd if=/dev/zero of=/dev/null bs=1 count=1000000
This roughly translates to shaving 2usec of each read/write syscall.
The poll/kqfilter paths need more work before they are giant free,
this work is ongoing in p4::phk_bufwork
Please test this and report any problems, LORs etc.
Change the spelling of the "catch" option to be consistent with the new
options. Implement the "no wait" option. An implementation of the "CPU
private" for i386 will be committed at a later date.
This generates a KTR event for each critical section entered and exited.
It would be desirable to also log the filename and line number of the
source entering or exiting the critical section, but this requires
hacking up the critical section API, so I've not done that yet.
retain the pcbinfo lock until we're done using a pcb in the in-bound
path, as the pcbinfo lock acts as a pseuo-reference to prevent the pcb
from potentially being recycled. Clean up assertions and make sure to
assert that the pcbinfo is locked at the head of code subsections where
it is needed. Free the mbuf at the end of tcp_input after releasing
any held locks to reduce the time the locks are held.
MFC after: 3 weeks
number of entries into bucket_zone_lookup(), which helps make more
clear the logic of consumers of bucket zones.
Annotate the behavior of bucket_init() with a comment indicating
how the various data structures, including the bucket lookup tables,
are initialized.
swapoff: failed to locate %d swap blocks
The race occurred because putpages() can block between the time it
allocates swap space and the time it updates the swap metadata to
associate that space with a vm_object, so swapoff() would complain
about the temporary inconsistency. I hoped to fix this by making
swp_pager_getswapspace() and swp_pager_meta_build() a single atomic
operation, but that proved to be inconvenient. With this change,
swapoff() simply doesn't attempt to be so clever about detecting when
all the pageout activity to the target device should have drained.
because this call is only needed to wake threads that slept when they
discovered a dead object connected to a vnode. To eliminate unnecessary
calls to wakeup() by vnode_pager_dealloc(), introduce a new flag,
OBJ_DISCONNECTWNT.
Reviewed by: tegge@
Expose some of the amd64-specific sysarch functions to allow alternative
implementations of the %fs/%gs code for TLS, threads, etc. USER_LDT does
not exist on the amd64 kernel, so we have to implement things other ways.
priority. The sleep queues don't get updated when the priority of
threads changes, so sleepq_signal() might not always wakeup the
highest priority thread. Updating the queues when thread priorities
change cannot be easily done due to lock orders, so instead we do an
O(n) walk of the queue for a sleepq_signal() operation instead of O(1).
On the other hand, adding a thread to a sleep queue now goes from O(n)
to O(1) so it ends up as an even tradeoff. The correctness here with
regards to priorities is actually fairly important. msleep() gives
interactive threads their priority "boost" after they are placed on the
queue, but before this fix that "boost" wasn't used to determine the
highest priority thread that sleepq_signal() awoke.
- Fix up some comments.
Inspired by: ups, bde
associated with each processor. This ID is inferred from the index
of the pcs structure in the hwprb.
- Give Alpha CPUs FreeBSD CPU IDs more like other architectures where the
boot processor is always CPU 0 and the other processors are numbered
1 ... N. List active CPUs in the system in cpu_mp_announce() as well.
Silence on: alpha@
thread is created rather than adjusting the priority in the main
function. (kthread_create() should probably take the initial priority
as an argument.)
- Only yield the CPU in the !PREEMPTION case if there are any other
runnable threads. Yielding when there isn't anything else better to do
just wastes time in pointless context switches (albeit while the system
is idle.)
- Tweak the updating of the ithread name in ithread_update() so that the
'+' and '*' characters for device names that were too short only get
added at the end after as many device names as possible were fit into
the allocated space. Prior to this, some long devices would result
in '+' chars showing up between two different devices rather than at the
end.
seconds of idling, DRITY flags are removed.
- If mirror is in idle state or is not open for writing, sleep without
timeout when waiting for I/O requests.
- Don't use atomic operations, for now sysctls are protected by Giant.
- Update debugs.
- Fix for good (I hope) force-stopping mirrors and some filure cases
(e.g. the last good component dies when synchronization is in progress).
Don't use ->nstart/->nend consumer's fields, as this could be racy,
because those fields are used in g_down/g_up, use ->index consumer's
field instead for tracking number of not finished requests.
Reported by: marcel
- After 5 seconds of idle time (this should be configurable) mark all
dirty providers as clean, so when mirror is not used in 5 seconds
and there will be power failure, no synchronization on boot is needed.
Idea from: sorry, I can't find who suggested this
- When there are no ACTIVE components and no NEW components destroy whole
mirror, not only provider.
- Fix one debug to show information about I/O request, before we change
its command.
- Eliminate an initialized but unused variable.
- Eliminate an unnecessary call to clear the page's PG_BUSY flag. (The
call to vm_page_rename() already clears the page's PG_BUSY flag through
its call to vm_page_remove().)
In order to avoid livelock, swapoff() skips over objects with a
nonzero pip count and makes another pass if necessary. Since it is
impossible to know which objects we care about, it would choose an
arbitrary object with a nonzero pip count and wait for it before
making another pass, the theory being that this object would finish
paging about as quickly as the ones we care about. Unfortunately,
we may have slept since we acquired a reference to this object.
Hack around this problem by tsleep()ing on the pointer anyway, but
timeout after a fixed interval. More elegant solutions are possible,
but the ones I considered unnecessarily complicate this rare case.
Also, kill some nits that seem to have crept into the swapoff() code
in the last 75 revisions or so:
- Don't pass both sp and sp->sw_used to swap_pager_swapoff(), since
the latter can be derived from the former.
- Replace swp_pager_find_dev() with something simpler. There's no
need to iterate over the entire list of swap devices just to determine
if a given block is assigned to the one we're interested in.
- Expand the scope of the swhash_mtx in a couple of places so that it
isn't released and reacquired once for every hash bucket.
- Don't drop the swhash_mtx while holding a reference to an object.
We need to lock the object first. Unfortunately, doing so would
violate the established lock order, so use VM_OBJECT_TRYLOCK() and
try again on a subsequent pass if the object is already locked.
- Refactor swp_pager_force_pagein() and swap_pager_swapoff() a bit.
syslog(3) if we are a priveleged program (sshd, su, etc.).
- Make syslogd open an additional socket /var/run/logpriv, with 0600
permissions.
- In libc, try to use this socket.
- Do not loop forever if we are using this socket (partial backout of 1.31)
Reviewed by: dwmalone, Andrea Campi <andrea webcom it>
Approved by: julian (mentor)
MFC after: 1 month
out c->c_func, we can't take it after callout_stop(). To take it before
we need to acquire callout_lock, to avoid race. This commit narrows
down area where lock is held, but hack is still present.
This should be redesigned.
Approved by: julian (mentor)