Only set sticks (and acquire sched_lock) on entry from user mode.
Add handlers for all kinds of mmu misses, and for interrupts from
user mode.
Acquire Giant before calling into the vm system so this runs with
invariants.
Try to get the restrictions for page faults on user memory from
kernel mode right.
Only set pcb_onfault and return to the alternate return code if
this is actually a fault on user memory from kernel mode.
2. Use the upcoming "tick" interface.
3. Save a call frame as well as a trap frame on proc0's initial stack.
4. Setup a pointer to the per-cpu interrupt queue.
5. Install the per-cpu pointer in interrupt and alternate globals as well.
6. Flush out setregs so exec works.
Submitted by: tmm (3, 5, 6)
2. Add spill and fill handlers for spills to the user stack on entry
to the kernel.
3. Add code to handle instruction mmu misses from user mode.
4. Add code to handle level interrupts from kernel mode and vectored
interrupt traps from either.
5. Save the pil in the trapframe on entry from kernel mode and restore
it on return.
Submitted by: tmm (1, 2)
actual end of the section. The new gas (binutils) puts in additional padding
which was misaligning the concatenated btx loader.
Reported by: Oliver Hartmann <ohartman@klima.physik.uni-mainz.de>,
Harti Brandt <brandt@fokus.gmd.de>
Tested by: Oliver Hartmann <ohartman@klima.physik.uni-mainz.de>,
David Wolfskill <dhw@whistle.com>, ps
Reviewed by: jhb
MFC after: 1 day
"./foo.ko". Use "/full/path/foo.ko" instead so that when the path is
reported as being an absolute path to the "shared library", at least
it's not really a relative path.
Obtained from: LOMAC/FreeBSD project
are a really nasty interface that should have been killed long ago
when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they
operate on (struct user) will not be around much longer since it
is part-per-process and part-per-thread in a post-KSE world.
gdb does not actually use this except for the obscure 'info udot'
command which does a hexdump of as much of the child's 'struct user'
as it can get. It carries its own #defines so it doesn't break
compiles.
bypass some extra anti-foot-shooting measures. Currently, its only
effect is to allow detaching a device while it's still open (e.g.,
mounted). This is useful for testing how the system reacts to a disk
suddenly going away, which can happen with some removeable media.
At this point, the force option is only checked on detach, so it
would've been possible to allow the option to be passed with the
MDIOCDETACH operation. This was not done to allow the possibility of
having the force flag influence other tests in the future, which may
not necessarily deal with detaching the device.
Reviewed by: sobomax
Approved by: phk
Avoid using parenthesis enclosure macros (.Pq and .Po/.Pc) with plain text.
Not only this slows down the mdoc(7) processing significantly, but it also
has an undesired (in this case) effect of disabling hyphenation within the
entire enclosed block.
into sadb_x_sa2_sequence from sadb_x_sa2_reserved3 in the sadb_x_sa2
structure. Also the output of setkey is changed. sequence number
of the sadb is replaced to the end of the output.
Obtained from: KAME
filename passed in via the module loader functions in the GDB
"sharedlibrary" support structures. This isn't good, since the pointer
would become stale in almost every case (not the pre-loaded case, of
course).
Change this to malloc()ed copy of the string and finally fix the reason
that gdb -k's "sharedlibrary" command stopped working.
Obtained from: LOMAC/FreeBSD (cf. NAI Labs)
It didn't implement the proper /dev/fd functionality (which would be to
include in the directory listing /dev/fd/n if the process has fd n open)
anyway.
Anything needing access to /dev/fd/n where n > 2 can use the optional
fdescfs module, which implements this properly and does not cause any
trouble with devfs.
Discussed with: phk
definitions of all of the ioctls, plus round out all ioctl definitions to
match what exists for linux. Allow ioctls to be called through either the
native or linux interface, though no apps exist (yet) that can take
advantage of native calling.
just before the memory hole to 4 megs. Special case building exception.s
like locore.s, it needs to at the beginning so the branches out from the
trap table don't overflow.
Correct an off by one in our critical section handling.
SEQADDR always reads the next instruction to execute,
so we must subtract one from its value before making
comparisons with entries in the critical section table.
Print a few additional registers whenever we dump
card state.
Show the SCB_CONTROL and SCB_TAG values for all pending
SCBs in card SCB ram when dumping card state.
aic7xxx.seq:
Fix a bug introduced while optimizing the SDPTR path.
We would ack the SDPTR message twice on Ultra2 or better
chips if it occurred after all data had been transferred
for a transaction.
Change our workaround for the PCI2.1 retry bug on some
chips. Although the previous workaround was logically
correct, its faster method of draining the FIFO seemed
to occassionally confuse the FIFO state. We now drain
the FIFO at half the speed which avoids the problem.
aic7xxx_pci.c:
Chips with the PCI 2.1 retry bug can't handle a 16byte
cachesize. If the cachesize is set to 16bytes, drop
it to 0.
Although it can go higher, it is not safe to so do on arrays with many
members. Compromise by adding a tunable, "hw.aac.iosize_max" that can be
set at boottime. Also document in the aac(4) manpage.
MFC after: 4 weeks
can't blindly write zero into it to disable the card. We must
preserve this bit. This changes pcic_disable to only clear the bits
we know we need to clear on card disable, thus preserving the magic
bit for many TI bridges.
This appears to have fixed the problems that people are reporting
about the system failing to recognize cards being inserted or removed
(or both). Greg: This may fix your problem too :-).
interrupt handler from the upper half, etc. This fixes some serious stability
problems that we were seeing on our production server. These patches have
been tested for almost 6 months and are a highly recommended MFC candidate.
Reviewed by: gibbs, merry, msmith
MFC after: 4 days
bind() call on IPv4 sockets:
Currently, if one tries to bind a socket using INADDR_LOOPBACK inside a
jail, it will fail because prison_ip() does not take this possibility
into account. On the other hand, when one tries to connect(), for
example, to localhost, prison_remote_ip() will silently convert
INADDR_LOOPBACK to the jail's IP address. Therefore, it is desirable to
make bind() to do this implicit conversion as well.
Apart from this, the patch also replaces 0x7f000001 in
prison_remote_ip() to a more correct INADDR_LOOPBACK.
This is a 4.4-RELEASE "during the freeze, thanks" MFC candidate.
Submitted by: Anton Berezin <tobez@FreeBSD.org>
Discussed with at some point: phk
MFC after: 3 days
as there are now "unusual" protection properties to Pmem that differ
from the other files. While I'm at it, introduce proc locking for
the other files, which was previously present only in the Pmem case.
Obtained from: TrustedBSD Project
and so special-casing was introduced to provide extra procfs privilege
to the kmem group. With the advent of non-setgid kmem ps, this code
is no longer required, and in fact, can is potentially harmful as it
allocates privilege to a gid that is increasingly less meaningful.
Knowledge of specific gid's in kernel is also generally bad precedent,
as the kernel security policy doesn't distinguish gid's specifically,
only uid 0.
This commit removes reference to kmem in procfs, both in terms of
access control decisions, and the applying of gid kmem to the
/proc/*/mem file, simplifying the associated code considerably.
Processes are still permitted to access the mem file based on
the debugging policy, so ps -e still works fine for normal
processes and use.
Reviewed by: tmm
Obtained from: TrustedBSD Project
particularly nice that IPSEC inserts a zero-length mbuf into the
chain, and that bug should be fixed too, but interfaces should be
robust to bad input.
Print the interface name when TUNDEBUG()ing about dropping an mbuf.
order to avoid namespace collision with subr_mchain.c's mb_init(). This
wasn't "fatal" as the mbuf initialization routine mb_init() was local to
subr_mbuf.c which in turn didn't pull in subr_mchain.c's mb_init()
declaration, but it should deffinately be changed now before it creates
headache.
addresses. It helps to use the physical address that the virtual address
actually maps to (doh!). Comment out some code that crashes.
Found independently by: tmm
commented out in the entire life of the 2.x+ branch and given the amount
of gcc-specific code we have and the warning checks that gcc does I'm not
sure that it is going to get us much for some time.
to hang or panic kernel by detaching disk from which fs is mounted;
- replace "md" with MD_NAME in yet another place.
Reviewed by: phk
Approved by: phk
debugging support as well. Debugging module support is handled
identically to kernel debugging support, right down to poor
choice of make variable names.
1) allocate fewer buckets
2) when failing to allocate swap zone, keep reducing the zone by
a third rather than a half in order to reduce the chance of
allocating way too little.
I also moved around some code for readability.
Suggested by: dillon
Reviewed by: dillon
This paniced my one of my machines one time too many :-( and there is
no sign of a solution in the pipeline. The deltas are still easily
available in cvs. The problem is that if the parent has been swapped
out, the child process cannot grope around in the parent's UPAGES to
see the sigact[] array or it will fault. This probably is a showstopper
for this implementation anyway.
is the diagnostics register at offset 0x93. When bit 5 is set in this
register, bits 4-7 in ExCA register 0x5 being 0000 are required for
pci interrupt routing. When it is clear, then bit 4 of ExCA register
0x3 is used to enable it.
The only other issue is that when you route interrupts this way, you
must read ExCA register 0x4 in order to clear the interrupt, else you
get an interrupt storm.
Deal with this requirement by setting things up. It is believed that
this won't hurt other chipsets, but other chipsets may require their
own work arounds.
a temporary array to store struct buf pointers if the list doesn't
fit in a local array. Usually it frees the array when finished,
but if it jumps to the 'again' label and the new list does fit in
the local array then it can forget to free a previously malloc'd
M_TEMP memory.
Move the free() up a line so that it frees any previously allocated
memory whether or not it needs to malloc a new array.
Reviewed by: dillon
defined to 0 in the non-SMP case, which very much makes sense as it
permits its usage in per-CPU initialization loops (for an example, check
out subr_mbuf.c).
Further, on a UP system, make mb_alloc always use the first per-CPU
container, regardless of cpuid (i.e. remove reliability on cpuid in the
UP case).
Requested by: alfred
asleep() and await() functions split the functionality of msleep() up into
two halves. Only the asleep() half (which is what puts the process on the
sleep queue) actually needs the lock usually passed to msleep() held to
prevent lost wakeups. await() does not need the lock held, so the lock
can be released prior to calling await() and does not need to be passed in
to the await() function. Typical usage of these functions would be as
follows:
mtx_lock(&foo_mtx);
... do stuff ...
asleep(&foo_cond, PRIxx, "foowt", hz);
...
mtx_unlock&foo_mtx);
...
await(-1, -1);
Inspired by: dillon on the couch at Usenix
the first sector of the emulated floppy to contain a valid MS-DOS BPB that
it can modify. Since boot1 is the first sector of boot.flp, this resulted
in the BIOS overwriting part of boot1: specifically the function used to
read in sectors from the disk.
Submitted by: Mark Peek <mark@whistle.com>
Submitted by: Doug Ambrisko <ambrisko@ambrisko.com>
PR: i386/26382
Obtained from: NetBSD, OpenBSD (the example BPB)
MFC after: 1 month
of debugging the current process when that is in conflict with other
restrictions (such as jail, unprivileged_procdebug_permitted, etc).
o This corrects anomolies in the behavior of
kern.security.unprivileged_procdebug_permitted when using truss and
ktrace. The theory goes that this is now safe to use.
Obtained from: TrustedBSD Project
MIB entries.
o Relocate kern.suser_permitted to kern.security.suser_permitted.
o Introduce new kern.security.unprivileged_procdebug_permitted, which
(when set to 0) prevents processes without privilege from performing
a variety of inter-process debugging activities. The default is 1,
to provide current behavior.
This feature allows "hardened" systems to disable access to debugging
facilities, which have been associated with a number of past security
vulnerabilities. Previously, while procfs could be unmounted, other
in-kernel facilities (such as ptrace()) were still available. This
setting should not be modified on normal development systems, as it
will result in frustration. Some utilities respond poorly to
failing to get the debugging access they require, and error response
by these utilities may be improved in the future in the name of
beautification.
Note that there are currently some odd interactions with some
facilities, which will need to be resolved before this should be used
in production, including odd interactions with truss and ktrace.
Note also that currently, tracing is permitted on the current process
regardless of this flag, for compatibility with previous
authorization code in various facilities, but that will probably
change (and resolve the odd interactions).
Obtained from: TrustedBSD Project
This is to be friendly with non-IPv6 peer (If the peer complains due to
lack of IPv6CP, drop IPv6CP). This basically implements "RXJ+" state
transition in the RFC.
Obtained from: NetBSD
o Move PIOCSRESOURCE from pccard to pcic so the kernel can give pccardd
better hints as to what resources to use.
o Implement an undocumented hw.pcic.interrupt_route to allow people that
need to do so to route their interrupts in a non-standard way.
o Only preallocate a resource in probe if we're routing via pci.
o If we aren't routing via pci, then set the irq to use explicitly
to defeat the automatic IRQ routing of the pci layer.
This, with the pccardd code should be close to what can be committed
to -stable.
- mostly complete kernel pmap support, and tested but currently turned
off userland pmap support
- low level assembly language trap, context switching and support code
- fully implemented atomic.h and supporting cpufunc.h
- some support for kernel debugging with ddb
- various header tweaks and filling out of machine dependent structures
to a new architecture. This is the base of the sparc64 port, but contains
limited machine dependent code, and can be used a base for ports. Included
are:
- standard machine dependent headers, tweaked for a 64 bit, big endian
architecture, including empty versions of all the machine dependent
structures
- a machine independent atomic.h, which can be used until a port has
support for interrupts and the operations really need to be atomic
- stub versions of all the machine dependent functions, which panic
when called and print out the name of the function that needs to
be implemented. functions which are normally in assembly files are
not included, but this should reduce the number of different undefined
references on the first few compiles from hundreds to 5 or 6
Given minimal startup code and console support it should be trivial to
make this compile and run the first few sysinits on almost any architecture.
Requested by: alfred, imp, jhb
dynamic symbol table buckets and chains. The sparc64 toolchain uses 32
bit .hash entries, unlike other 64 bits architectures (alpha), which use
64 bit entries.
Discussed with: dfr, jdp
a standard cell_t type for the fields of all argument structs. Also
make ihandle_t and phandle_t unsigned to avoid sign extension problems.
Approved by: benno
FreeBSD _does_ define ENOMSG, so no need for checking if we support it.
Inspired by PR: 22470
Which was submitted by: Bjorn Tornqvist <bjorn@west.se>
MFC after: 1 week
in the case where there are no interrupts routed for it does not
contain enough space to use it to route an interrupt. In the case
where we need to route an interrupt, throw away the returned buffer
and create a new one containing the interrupt we want.
boot time. Loading as a module once the system is up and running
doesn't make any sense.
- Fix acpi_FindIndexedResource (it would only check the first resource),
changes the calling interface.
- Add a new helper function (acpi_AppendBufferResource) to help building
buffers containing resources.
- Remove the beer-ware license (reqested by phk)
- Reorganise so that the PIIX4 workaround code is kept together, and
switch the workaround function via the timecounter struct, saving
a compare in the read-timecounter codepath. Also indicate that
the workaround is active by changing the timecounter hardware string.
either what's in NVRAM or what the safe defaults would be if we lack NVRAM.
Then we rename cur_XXXX to actv_XXXX (these are the currently active settings)
and the dev_XXX settings to goal_XXXX (these are the settings which we want
cur_XXXX to converge to).
This probably isn't entirely final as yet- but it's a lot closer to now
being what it should be, including allowing camcontrol to actually set
specific settings.
either what's in NVRAM or what the safe defaults would be if we lack NVRAM.
Then we rename cur_XXXX to actv_XXXX (these are the currently active settings)
and the dev_XXX settings to goal_XXXX (these are the settings which we want
cur_XXXX to converge to).
Roll core minor.
either what's in NVRAM or what the safe defaults would be if we lack NVRAM.
Then we rename cur_XXXX to actv_XXXX (these are the currently active settings)
and the dev_XXX settings to goal_XXXX (these are the settings which we want
cur_XXXX to converge to).
Handle both old and new TARGIOALLOCUNIT/TARGIOFREEUNIT cases- the new
one allows us to specify inquiry data we want to use.
Handle more of the CAM_DIS_DISCONNECT case.
Move TARGCTLIOALLOCUNIT to OTARGCTLIOALLOCUNIT, TARGCTLIOFREEUNIT
to OTARGCTLIOFREEUNIT and redefine old associated structure to be
old_ioc_alloc_unit- deprecation but preservation of binaries.
Add new structure for same- but this one contains a pointer to
user defined INQUIRY data so you can define what the target
device looks like to the outside world.
1. If we get frozen, unfreeze for disable disconnects.
2. Put CAM_DIS_DISCONNECT commands at the head of the work queue
(we have a target still connected and we can't run anything else
until this command completes).
If we had an error sending the last CTIO, unfreeze the queue anyway.
resources it is attempting to assign to a child object. This should
help people track down mysterious resource allocation problems more
easily.
# Unfortunately, it is harder to do the conflict check and report which
# resource failed if the driver itself doesn't.
because it shares ufs code. In ufs_fhtovp(), the test on i_effnlink
is invalid because ext2fs does not maintain this field. In ufs_close(),
i_effnlink is also tested, to determines whether or not to call
vn_start_write(). The ufs_fhtovp issue breaks NFS exporting of
ext2fs filesystems; I believe the other is harmless.
Fix both cases by checking um_i_effnlink_valid in the ufsmount
struct, and use i_nlink if necessary.
Noticed by: bde
Reviewed by: mckusick, bde
size (previously, the transfer size would be rounded up to a multiple of
the block size, which would overflow the buffer).
This fixes panics when doing things like trying to mount audio CD's.
PR: kern/21946
Review Timeout: sos
already allow this for NFS swap configured via BOOTP, so it is
known to work fine.
For many diskless configurations is is more flexible to have the
client set up swapping itself; it can recreate a sparse swap file
to save on server space for example, and it works with a non-NFS
root filesystem such as an in-kernel filesystem image.
strictly necessary on current, but having it in here makes the diffs with
stable smaller and doesn't hurt anything except for phk's redundant include
finder.
never completed" message. The RX reset takes longer complete than it
used to, a lot longer in fact than xl_wait() is prepared to wait.
When we do the RX reset in xl_reset(), this cases xl_wait() to time out
and whine. We wait a little extra time now after the RX reset, which
should silence the warning.
Thanks to obrien for finally getting me a box with a NIC that
causes this problem for me to tinker with.
entry (d_ino == 0) is found in a position that is not the start of
a DIRBLKSIZ block.
While such entries cannot occur normally (ufs always extends the
previous entry to cover the free space instead), they do not cause
problems and fsck does not fix them, so panicking is bad.
hw.pcic.irq Globally set the IRQ for all pcic devices' management
interrupt (aka card status change or CSC interrupt)
This is what used to be known as
machdep.pccard.pcic_irq (which has been retained for
now for compatibility).
hw.pcic.ignore_fuction_1 Ignores function 1 for all PCIC bridges by not
attaching to them. Lucent released a huge batch
of cards that were imporperly manufactuered (lacking
the 0 ohm resister to disable slot 1). This is
a big hammer to keep those cards from causing problems
(I've had 4 people contact me saying my patches
worked great once they added a kludge to always ignore
function 1, or until they soldered these resistors
in place!).
No clue where to document these. They act as both boot loader environment
variables, as well as read-only sysctls after boot.
At the same time, sort sys/systm.h in its proper order after sys/sysctl.h.
VM caching of disks through mmap() and stopping syncing of open files
that had their last reference in the fs removed (ie: their unsync'ed
pages get discarded on close already, so I made it stop syncing too).
o kill blank line that I introduced in cardinfo.h
o Delete unused variable wasinactive.
o return 0 from pccard_resume.
o Set the state and lastsate initially to be empty.
o move comment above code for interrupt dispatching.
o Powerstate interface is now available as of 430002, not 500000 (note that
this change will be not 100% correct since the power state stuff didn't
enter current until well after 500000, but it is good enough for the two
branche we have going now).
power x 0.
pccardc power x 0 used to disable the slot. But a suspend/resume
would reactivate the pccard. It no longer does that. Now the
disabling of the slot is sticy until it is reset with power x 1 or the
card is ejected. This seems closer to correct behavior to me.
o Process all card state changes the same using pccard_do_stat_change().
o Cleanup disabling the card so that we can preserve the state after
the change. Basically, don't set it to empty as often as we do.
o On suspend, the new state is "empty" and the laststate is "suspend"
o Document state machine with a diagram of states and edges. The
edges are labeld to tell the reader what event causes the external
state changes.
o "machdep.pccard.pcic_resume_reset" may be obsolete now. We always
call the bridge driver's resume method on resume now. Otherwise cards
won't automatically show up. If it needs to stay, I'll add it back.
In file included from ../../../dev/vinum/vinumhdr.h:77,
from ../../../dev/vinum/vinum.c:44:
../../../dev/vinum/vinumext.h:165: warning: redundant redeclaration of `setjmp' in same scope
../../../sys/systm.h:96: warning: previous declaration of `setjmp'
../../../dev/vinum/vinummemory.c:44: warning: redundant redeclaration of `longjmp' in same scope
../../../sys/systm.h:97: warning: previous declaration of `longjmp'
longer have a pccard in the slot. This fixes the problem where pccard
would say that a card had been inserted on resume. This also appears
to make the insert/remove events more reliable after a resume as well,
but that may be a different bug I need to hunt down.
initialize in the right order to make derivative settings work right.
eg: at compile time, nmbufs was double nmbclusters. For POLA this should
work the same at runtime.
Tunables are now derived at boot time from maxusers. ie: change maxusers
via a tunable and all the derivative settings change. You can change
the other tunables individually as well. Even hz etc is tunable.
making pcbs available to the outside world. otherwise, we will see
inpcb without ipsec security policy attached (-> panic() in ipsec.c).
Obtained from: KAME
MFC after: 3 days
were indices in a dense array. The cpuids are a sparse set and treat
them as such, setting up containers only for CPUs activated during
mb_init().
- Fix netstat(1) and systat(1) to treat the per-CPU stats area as a sparse
map, in accordance with the above.
This allows us to properly boot with certain CPUs disactivated. However, if
we later decide to re-activate said CPUs, we will barf until we decide to
implement CPU spinon/spinoff callback hooks to allow for said CPUs' per-CPU
containers to get configured on their activation.
Reported by: mjacob
Partially (sys/ diffs) Submitted by: mjacob
This has supposedly been incorporated into the Intel code already, so this
will get cleanly replaced with the "official" version when it is next
imported and will not cause any conflicts or hiccups.
- Use sysctl to export stats
- Use ip_encap.c's encapsulation support
- Update lkm to kld (is 6 years a record for a broken module?)
- Remove some unused cruft
effect, which would cause unnecessary route deletion:
* Unfortunately, this has the obnoxious
* property of also triggering for insertion /above/ a pre-existing network
* route and clones. Sigh. This may be fixed some day.
The effect has been even worse, because recent versions of route.c set
the parent rtentry for cloned routes from an interface-direct route.
For example, suppose that we have an interface "ne0" that has an IPv4
subnet "10.0.0.0/24". Then we may have a cloned route like 10.0.0.1
on the interface, whose parent route is 10.0.0.0/24 (to the interface
ne0). Now, when we add the default route (i.e. 0.0.0.0/0),
rt_fixchange() will remove the cloned route 10.0.0.1. The (bad) effect
also prevents rt_setgate from configuring rt_gwroute, which would not
be an intended behavior.
As suggested in the comments to rt_fixchange(), we need stricter check
in the function, to prevent unintentional route deletion.
This fix also solve the "IPV6 panic?" problem in nd6_timer().
Submitted by: JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp>
MFC after: 4 days
static entries with oid's over 100, and defining enough dynamic entries
causes an overlap.
Move the "magic" value 0x100 into <sys/sysctl.h> where it belongs.
PR: 29131
Submitted by: "Alexander N. Kabaev" <kabaev@mail.ru>
Reviewed by: -arch, -audit
MFC after: 2 weeks
Sometimes, when pccardd is restarted, it fails to realize that the
device is already attached and tries to attach it again. This leads
to bad mojo since the pccard code isn't setup to handle that, so the
panic was put in. Now it appears that it is triggering too easily, so
I'm backing it off to a non-fatal error.
Correctly reintroduce loop_seen_once semantics- that is, if we've never
seen good link, start bouncing commands with CAM_SEL_TIMEOUT. But we
have to be careful to have let ourselves try (in isp_kthread) to check
for loop up at least once.
PR: 28992
MFC after: 1 week
LISTENed for, return EEXISTS.
Only match the magic "*" service tag if no other LISTEN service tags
match.
Require an explicit LISTEN for an empty service tag in order to match
empty service requests.
Approved by: julian
MFC after: 3 days