interfaces. The original resource_find() returned a pointer to an internal
resource table entry. resource_find_hard() dereferences the actual
passed in value (oops!) - effectively trashing random memory due to
the pointer being passed in with a random initial value.
Submitted by: bde
and remove sysctl oids at will during runtime - they don't rely on
linker sets. Also, the node oids can be referenced by more than
one kernel user, which means that it's possible to create partially
overlapping trees.
Add sysctl contexts to help programmers manage multiple dynamic
oids in a convenient way.
Please see the manpages for detailed discussion, and example module
for typical use.
This work is based on ideas and code snippets coming from many
people, among them: Arun Sharma, Jonathan Lemon, Doug Rabson,
Brian Feldman, Kelly Yancey, Poul-Henning Kamp and others. I'd like
to specially thank Brian Feldman for detailed review and style
fixes.
PR: kern/16928
Reviewed by: dfr, green, phk
an NMI occurred, you could type continue in DDB and the kernel would
not attempt to detect what type of NMI was received. Now we check
for the type of NMI first and then go to DDB if it is enabled.
This will solve the problem with having DDB enabled and getting an
NMI due to some possibly bad error and being able to continue the
operation of the kernel when you really want to panic and know
what happened.
Submitted by: jhb
never expire if poll() or select() was called before the system had been
in multiuser for 1 second. This was caused by only checking to see if
tv_sec was zero rather than checking both tv_sec and tv_usec.
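
A minimal sketch of the corrected test, assuming a fully zeroed struct
timeval is what marks "no timeout". timeout_is_unset() is a hypothetical
name used for illustration, not the kernel code:

```c
#include <stdbool.h>
#include <sys/time.h>

/*
 * Hypothetical helper: report whether a timeval carries no timeout.
 * Testing tv_sec alone would misread any value below one second
 * (tv_sec == 0, tv_usec != 0) as "no timeout".
 */
bool
timeout_is_unset(const struct timeval *tv)
{
	return (tv->tv_sec == 0 && tv->tv_usec == 0);
}
```

With this form, a value of { 0, 500000 } counts as a real half-second
timeout rather than as "unset".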
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then syncs the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.
Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot, so for brevity and timeliness
they are not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
SYSCTL_LONG macro to be consistent with other integer sysctl variables
and require an initial value instead of assuming 0. Update several
sysctl variables to use the unsigned types.
PR: 15251
Submitted by: Kelly Yancey <kbyanc@posi.net>
insertion of a CF card, for random values of N > 1. With these fixes,
I've been able to do 100 insert/remove of the cards w/o a crash with
lots of system activity going on that in the past would help trigger
the crash.
The problem:
FreeBSD creates dev_t's on the fly as they are needed and never
destroys them. These dev_t's point to a struct disk that is used for
housekeeping on the disk. When a device goes away, the struct disk
pointer becomes a dangling pointer. Sometimes when the device comes
back, the pointer will point to the new struct disk (in which case the
insertion will work). Other times it won't (especially if any length
of time has passed, since it is dependent on memory returned from
malloc).
The Fix:
There is one of these dev_t's that is always correct. The
device for the WHOLE_DISK_SLICE is always right. It gets set at
create_disk() time. So, the fix is to spend a little CPU time and
lookup the WHOLE_DISK_SLICE dev_t and use the si_disk from that in
preference to the one that's in the device asking to do the I/O. In
addition, we change the test of si_disk == NULL meaning that the dev
needed to inherit properties from the pdev to dev->si_disk !=
pdev->si_disk. This test is a little stronger than the previous test,
but can sometimes be fooled into not inheriting. However, the results
of this fooling are that the old values will be used, which will
generally be the same as before. si_drv[12] are the only
values that are copied that might pose a problem. They tend to change
as the si_disk field would change, so it is a hole, but it is a small
hole.
One could correctly argue that one should replace much of this code
with something much much better. I would be on the pro side of that
argument.
Reviewed by: phk (who also ported the original patch to current)
Sponsored by: Timing Solutions
after the acquisition of any advisory locks. This fix corrects a case
in which a process tries to open a file with a non-blocking exclusive
lock. Previously, if it failed to get the lock, the file was still
truncated even though the open failed. With this change, the truncation
is done only after the lock is successfully acquired.
Obtained from: BSD/OS
o Set access mode to -r--r--r-- if SS_CANTRCVMORE is set and the receive
buffer is empty.
o Set access mode to --w--w--w- if SS_CANTSENDMORE is set.
Discussed with: alfred
instead of a struct iovec * array and int len. Get rid of stupidly trying
to allocate all of the memory and copyin()ing the entire iovec[], and
instead just do the proper VOP_WRITE() in ktrwrite() using a copy of
the struct uio that the syscall originally used.
This solves the DoS which could easily be performed; to work around the
DoS, one could also remove "options KTRACE" from the kernel. This is
a very strong MFC candidate for 4.1.
Found by: art@OpenBSD.org
instead of requiring every caller of linker_load_file() to perform the
check itself. This avoids netgraph loading KLD's when securelevel > 0,
not to mention any future code that may call linker_load_file().
Reviewed by: dfr
On unload, remove references from freelist to memory type defined by module.
Print a warning if a module defines and allocates its own memory type, but
does not free it all on unload.
Reviewed by: peter
Use strtoul(), not strtol() in the hints decoder so that
'flags 0xa0ffa0ff' is not truncated to 0x7fffffff.
Use a stack buffer instead of a static 100 byte bss buffer.
Use \0 for the NUL character.
Remove some ``excessive'' parens.
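
The strtoul() point can be illustrated with a small userland sketch.
parse_flags() is a hypothetical helper, not the hints decoder itself.
Where long is 32 bits, strtol() clamps 0xa0ffa0ff to LONG_MAX
(0x7fffffff), while strtoul() keeps all 32 bits:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an unsigned flags value such as
 * "0xa0ffa0ff". Returns 0 on success, -1 on malformed or
 * out-of-range input.
 */
int
parse_flags(const char *text, unsigned long *out)
{
	char *end;

	errno = 0;
	*out = strtoul(text, &end, 0);	/* base 0 accepts the 0x prefix */
	if (end == text || *end != '\0' || errno == ERANGE)
		return (-1);
	return (0);
}
```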
1) While allocating a uidinfo struct, malloc is called with M_WAITOK.
It's possible that while it is asleep another process by the same user
could have woken up earlier and inserted an entry into the uid
hash table. Having redundant entries causes inconsistencies that
we can't handle.
fix: do a non-waiting malloc, and if that fails then do a blocking
malloc, after waking up check that no one else has inserted an entry
for us already.
2) Because many checks for sbsize were done as "test then set" in a
non-atomic manner, it was possible to exceed the limits via races.
fix: instead of querying the count then setting, we just attempt to
set the count and leave it up to the function to return success or
failure.
3) The uidinfo code was inlining and repeating lookups, insertions
and deletions; these needed to be in their own functions for clarity.
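
The "attempt to set and report success or failure" approach from item 2
can be sketched with C11 atomics. This is an illustration only: struct
usage and usage_charge() are hypothetical names, and the kernel code
does not necessarily use a compare-and-swap loop:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-user counter; not the kernel's struct uidinfo. */
struct usage {
	atomic_long	bytes;
};

/*
 * Try to change the charged total by delta, refusing growth past limit.
 * The limit check and the update form one compare-and-swap loop, so two
 * racing callers cannot both slip under the limit.
 */
bool
usage_charge(struct usage *u, long delta, long limit)
{
	long old, new;

	old = atomic_load(&u->bytes);
	do {
		new = old + delta;
		if (delta > 0 && new > limit)
			return (false);
		/* On failure the CAS reloads old, and the check is redone. */
	} while (!atomic_compare_exchange_weak(&u->bytes, &old, new));
	return (true);
}
```

Callers no longer read the count and then set it; they call the
function and act on its return value.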
Reviewed by: green
When re-adding an event, do not reset the event state. If the event was
pending, it will remain pending. This allows the user to change the udata
field after the event was registered, while not losing any events which
have already occurred.
Reported by: jmg
accept filters are now loadable as well as able to be compiled into
the kernel.
Two accept filters are provided: one returns sockets when data
arrives, the other when an HTTP request is completed (it doesn't work
with 0.9 requests).
Reviewed by: jmg