doesn't change functionality, but makes code more logical.
Obtained from: DrafonFlyBSD
o Use VOP_GETATTR() to obtain actual size of file and parse no more than that.
Previously, we parsed MAXSHELLCMDLEN characters regardless of the actual file
size. This makes the following working:
$ printf '#!/bin/echo' > /tmp/test.sh
$ chmod 755 /tmp/test.sh
$ /tmp/test.sh
Previously, attempts to execve() that shell script has been failing with bogus
ENAMETOOLONG.
PR: kern/64196
Submitted by: Magnus B.ckstr.m <b@etek.chalmers.se>
- Simplify CARP_LOG() and making it working (we don't have addlog in FreeBSD).
- Introduce CARP_DEBUG() which logs with LOG_DEBUG severity when
net.inet.carp.log > 1
- Use CARP_DEBUG to log state changes of carp interfaces.
After CARP_LOG() cleanup it appeared that carp_input_c() does not need sc
argument. Remove it.
Sponsored by: Rambler
script. Otherwise it's possible to panic kernel by constructing a shell
script with first line not ending in '\n'.
Also, treat '\0' as line terminating character, which may me useful in
some situations.
Submitted by: gad
- store assigned PCI addresses at cninit time for later mmap range
check
- implement set_border to scrub X remnants when switching back to VTYs
- implement mmap, only allowing addresses within the range of the
console adapter.
for a given device in some circumstances, so move the PDO creation
to the attach routine so we don't end up creating two PDOs.
Also, when we skip the call to ndis_convert_res() in if_ndis.c:ndis_attach(),
initialize sc->ndis_block->nmb_rlist to NULL. We don't explicitly zero
the miniport block, so this will make sure ndis_unload_driver() does
the right thing.
when we create a PDO, the driver_object associated with it is that
of the parent driver, not the driver we're trying to attach. For
example, if we attach a PCI device, the PDO we pass to the NdisAddDevice()
function should contain a pointer to fake_pci_driver, not to the NDIS
driver itself. For PCI or PCMCIA devices this doesn't matter because
the child never needs to talk to the parent bus driver, but for USB,
the child needs to be able to send IRPs to the parent USB bus driver, and
for that to work the parent USB bus driver has to be hung off the PDO.
This involves modifying windrv_lookup() so that we can search for
bus drivers by name, if necessary. Our fake bus drivers attach themselves
as "PCI Bus," "PCCARD Bus" and "USB Bus," so we can search for them
using those names.
The individual attachment stubs now create and attach PDOs to the
parent bus drivers instead of hanging them off the NDIS driver's
object, and in if_ndis.c, we now search for the correct driver
object depending on the bus type, and use that to find the correct PDO.
With this fix, I can get my sample USB ethernet driver to deliver
an IRP to my fake parent USB bus driver's dispatch routines.
- Add stub modules for USB support: subr_usbd.c, usbd_var.h and
if_ndis_usb.c. The subr_usbd.c module is hooked up the build
but currently doesn't do very much. It provides the stub USB
parent driver object and a dispatch routine for
IRM_MJ_INTERNAL_DEVICE_CONTROL. The only exported function at
the moment is USBD_GetUSBDIVersion(). The if_ndis_usb.c stub
compiles, but is not hooked up to the build yet. I'm putting
these here so I can keep them under source code control as I
flesh them out.
right for certain MAP_FIXED mappings on ia64 but it will work fine for all
other mappings and works fine on amd64.
Requested by: ps, Christian Zander
MFC after: 1 week
- In kern_ndis.c:ndis_unload_driver(), test that ndis_block->nmb_rlist
is not NULL before trying to free() it.
- In subr_pe.c:pe_get_import_descriptor(), do a case-insensitive
match on the import module name. Most drivers I have encountered
link against "ntoskrnl.exe" but the ASIX USB ethernet driver I'm
testing with wants "NTOSKRNL.EXE."
- In subr_ntoskrnl.c:IoAllocateIrp(), return a pointer to the IRP
instead of NULL. (Stub code leftover.)
- Also in subr_ntoskrnl.c, add ExAllocatePoolWithTag() and ExFreePool()
to the function table list so they'll get exported to drivers properly.
panic at kmem_alloc() via malloc(9).
PR: kern/77748
Submitted by: Wojciech A. Koszek
OK'ed by: brooks
Security: local DoS, a sample code in the PR.
MFC after: 3 days
mastering on all other interfaces:
- call carp_carpdev_state() on initialize instead of just setting to INIT
- in carp_carpdev_state() check that interface is UP, instead of checking
that it is not DOWN, because a rebooted machine may have interface in
UNKNOWN state.
Sponsored by: Rambler
Obtained from: OpenBSD (partially)
when trim'ing space off the back of a chain; this is indirect
solution to a potential null ptr deref
Noticed by: Coverity Prevent analysis tool (null ptr deref)
Reviewed by: dg, rwatson
vn_extattr_rm. This is meant to catch conditions where IO_NODELOCKED
has been specified without the vnode being locked.
Discussed with: rwatson
MFC after: 1 week
object on to the zone allocator. It should be noted that uma_zalloc(9)
uses bzero to zero out the object so there probably wont be any
real performance benefit. If UMA grows the ability to supply
zeroed zones more efficiently in the future, we will not have to
modify all the existing consumers.
Discussed with: rwatson,julian
MFC after: 1 week
and a machine-independent though inefficient InterlockedExchange().
In Windows, InterlockedExchange() appears to be implemented in header
files via inline assembly. I would prefer using an atomic.h macro for
this, but there doesn't seem to be one that just does a plain old
atomic exchange (as opposed to compare and exchange). Also implement
IoSetCancelRoutine(), which is just a macro that uses InterlockedExchange().
Fill in IoBuildSynchronousFsdRequest(), IoBuildAsynchronousFsdRequest()
and IoBuildDeviceIoControlRequest() so that they do something useful,
and add a bunch of #defines to ntoskrnl_var.h to help make these work.
These may require some tweaks later.
a bugfix of clearing the On-Demand flag when going back to 100%. It
has been tested and works on an IBM R32. Note original work done by
Ted Unangst and sobomax@.
the previous one failed and there are more than one plex in the volume.
This could have led to a flood of error messages on the console and
probably a deadlock in certain situations.
has been set. Assert that this is the case so that we catch filesystems
who are using naked VOP_LOCKs in illegal cases.
Sponsored by: Isilon Systems, Inc.
considered to be as good as an exclusive lock, although there is still a
possibility of someone acquiring a VOP LOCK while xlock is held.
Sponsored by: Isilon Systems, Inc.
chipset. Add support for this card. Office Max has them on sale and
I was surprised that we didn't have it in our supported list when I
plugged it in...
IRQ 0 and not an ExtINT pin. The MADT enumerators ignore the PC-AT flag
and ignore overrides that map IRQ 0 to pin 2 when this quirk is present.
- Add a block comment above the quirks to document each quirk so that we
can use more verbose descriptions quirks.
MFC after: 2 weeks
ACPI MADT only does if the PC-AT flag is set), then don't assume that pin 0
on the first I/O APIC is an ExtINT pin. Instead, assume that it is ISA
IRQ 0.
with the kernel compile time option:
options IPFIREWALL_FORWARD_EXTENDED
This option has to be specified in addition to IPFIRWALL_FORWARD.
With this option even packets targeted for an IP address local
to the host can be redirected. All restrictions to ensure proper
behaviour for locally generated packets are turned off. Firewall
rules have to be carefully crafted to make sure that things like
PMTU discovery do not break.
Document the two kernel options.
PR: kern/71910
PR: kern/73129
MFC after: 1 week
so that parent interface is not left in promiscous mode after carp
interface is destroyed.
This is not perfect, since promisc counter is added when carp
interface is assigned an IP address. However, when address is removed
parent interface is still in promiscuous mode. Only removal of
carp interface removes promisc from parent. Same way in OpenBSD.
Sponsored by: Rambler
List devfs_dirents rather than vnodes off their shared struct cdev, this
saves a pointer field in the vnode at the expense of a field in the
devfs_dirent. There are often 100 times more vnodes so this is bargain.
In addition it makes it harder for people to try to do stypid things like
"finding the vnode from cdev".
Since DEVFS handles all VCHR nodes now, we can do the vnode related
cleanup in devfs_reclaim() instead of in dev_rel() and vgonel().
Similarly, we can do the struct cdev related cleanup in dev_rel()
instead of devfs_reclaim().
rename idestroy_dev() to destroy_devl() for consistency.
Add LIST_ENTRY de_alias to struct devfs_dirent.
Remove v_specnext from struct vnode.
Change si_hlist to si_alist in struct cdev.
String new devfs vnodes' devfs_dirent on si_alist when
we create them and take them off in devfs_reclaim().
Fix devfs_revoke() accordingly. Also don't clear fields
devfs_reclaim() will clear when called from vgone();
Let devfs_reclaim() call dev_rel() instead of vgonel().
Move the usecount tracking from dev_rel() to devfs_reclaim(),
and let dev_rel() take a struct cdev argument instead of vnode.
Destroy SI_CHEAPCLONE devices in dev_rel() (instead of
devfs_reclaim()) when they are no longer used. (This
should maybe happen in devfs_close() instead.)
that NFS ever started using it. Long time ago I added the necessary
vhold()/vdrop() calls to replace it, but forgot to remove the v_id code.
Do it now.
It is _never_ OK to find a vnode from a struct cdev because you have
no way of telling if you get the right one. You might be in jail or
chroot for instance.
The fundamental problem is that we get only the lower 8 bits of the
minor device number so there is no guarantee that we can actually
find the disk device in question at all.
This was probably a bigger issue pre-GEOM where the upper bits
signaled which slice were in use.
The secondary problem is how we get from (partial) dev_t to vnode.
The correct implementation will involve traversing the mount list
looking for a perfect match or a possible match (for truncated
minor).
hosts to share an IP address, providing high availability and load
balancing.
Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.
FreeBSD port done solely by Max Laier.
Patch by: mlaier
Obtained from: OpenBSD (mickey, mcbride)
address is not supplied, then jail IP is choosed and in_pcbbind() is called.
Since udp_output() does not save local addr after call to in_pcbconnect_setup(),
in_pcbbind() is called for each packet, and this is incorrect.
So, we shall treat jailed sockets specially in udp_output(), we will save
their local address.
This fixes a long standing bug with broken sendto() system call in jails.
PR: kern/26506
Reviewed by: rwatson
MFC after: 2 weeks
loopback interface. Nobody have explained me sense of this check.
It breaks connect() system call to a destination address which is
loopback routed (e.g. blackholed).
Reviewed by: silence on net@
MFC after: 2 weeks
modes, systems may take longer. If the status values don't match, try
matching just the lowest 8 bits if no bits above 8 are set in the desired
value. The IBM R32 has other bits set in the status register that are
irrelevant to the expected value.
the switch. Other interim tests (i.e., for minimum runtime) could
invalidate the start time. This fixes transitions to cooler states in that
now they go to the next active state (_AC0 -> _AC1) instead of going
straight to off (_AC0 -> off).
Submitted by: Alexandre "Sunny" Kovalenko (Alex.Kovalenko / verizon.net)
locks held, specify the ACPI_ISR flag to keep it from acquiring any more
mutexes (which could potentially sleep.) This should fix "could sleep"
warning messages on the following path:
msleep()
AcpiOsWaitSemaphore()
AcpiUtAcquireMutex()
AcpiDisableGpe()
EcGpeHandler()
AcpiEvGpeDispatch()
AcpiEvGpeDetect()
AcpiEvGpeDetect()
AcpiEvSciXruptHandler()
a socket from a regular socket to a listening socket able to accept new
connections. As part of this state transition, solisten() calls into the
protocol to update protocol-layer state. There were several bugs in this
implementation that could result in a race wherein a TCP SYN received
in the interval between the protocol state transition and the shortly
following socket layer transition would result in a panic in the TCP code,
as the socket would be in the TCPS_LISTEN state, but the socket would not
have the SO_ACCEPTCONN flag set.
This change does the following:
- Pushes the socket state transition from the socket layer solisten() to
to socket "library" routines called from the protocol. This permits
the socket routines to be called while holding the protocol mutexes,
preventing a race exposing the incomplete socket state transition to TCP
after the TCP state transition has completed. The check for a socket
layer state transition is performed by solisten_proto_check(), and the
actual transition is performed by solisten_proto().
- Holds the socket lock for the duration of the socket state test and set,
and over the protocol layer state transition, which is now possible as
the socket lock is acquired by the protocol layer, rather than vice
versa. This prevents additional state related races in the socket
layer.
This permits the dual transition of socket layer and protocol layer state
to occur while holding locks for both layers, making the two changes
atomic with respect to one another. Similar changes are likely require
elsewhere in the socket/protocol code.
Reported by: Peter Holm <peter@holm.cc>
Review and fixes from: emax, Antoine Brodin <antoine.brodin@laposte.net>
Philosophical head nod: gnn
possible that the same packet would show up multiple times. This poses some
constraints on the TBD locking for snc(4) (see comment).
Obtained from: DragonFlyBSD
Submitted by: Joerg Sonnenberger
Reviewed by: rwatson
was a bad idea, but since it is done like this in the vendor source we keep
it around for older versions. As a safe guard against future misuse we don't
even define CALLOUT_INITIALIZER anymore.
This fixes ALTQ after callout_init_mtx() and takes altq_var.h off the vendor
branch.
Submitted by: Divacky Roman <xdivac02NOstud.fit.vutbrSPAMcz> (w/ changes)
on the previous generation of Pentium-M processors (Banias). Support for
Dothan and later processors involves working with acpi_perf(4) to extract
information about supported states. This driver should work on MP systems
including HTT. It is experimental and may have a few bugs but has been
tested to not crash at least.
Thanks to Colin Percival for his initial work on this driver.
only call the protocol's pru_rcvd() if the protocol has the flag
PR_WANTRCVD set. This brings that instance of pru_rcvd() into line with
the rest, which do check the flag.
MFC after: 3 days
very slow process, especially for large file systems that is just
recovered from a crash.
Since the summary is already re-sync'ed every 30 second, we will
not lag behind too much after a crash. With this consideration
in mind, it is more reasonable to transfer the responsibility to
background fsck, to reduce the delay after a crash.
Add a new sysctl variable, vfs.ffs.compute_summary_at_mount, to
control this behavior. When set to nonzero, we will get the
"old" behavior, that the summary is computed immediately at mount
time.
Add five new sysctl variables to adjust ndir, nbfree, nifree,
nffree and numclusters respectively. Teach fsck_ffs about these
API, however, intentionally not to check the existence, since
kernels without these sysctls must have recomputed the summary
and hence no adjustments are necessary.
This change has eliminated the usual tens of minutes of delay of
mounting large dirty volumes.
Reviewed by: mckusick
MFC After: 1 week
of the global UNIX domain socket mutex: no protection is needed that
early in the setup of the UNIX domain socket and socket structures.
MFC after: 3 days
preliminary support for using the GCC-compatibility of ICC was committed
but couldn't be tested at that time due to problems with ICC itself. Since
ICC 8.1 it's possible to use its GCC-compatibility under FreeBSD and it
turned out that a typedef for __gnuc_va_list is required in that case.
Revert the part of rev. 1.8 which #ifdef'ed out __gnuc_va_list for ICC.
MFC after: 1 week
patch from kan@).
Pull bufobj_invalbuf() out of vinvalbuf() and make g_vfs call it on
close. This is not yet a generally safe function, but for this very
specific use it is safe. This solves the problem with buffers not
being flushed by unmount or after failed mount attempts.
a packet has VLAN mbuf tag attached. This is faster to check than
m_tag_locate(), and allows us to use the tags in non-vlan(4) VLAN
producers.
The first argument to VLAN_OUTPUT_TAG() is now unused but retained
for backward compatibility.
While here, embellish a fix in rev. 1.174 of if_ethersubr.c -- it
now checks for packets with VLAN (mbuf) tags, and it should now
be possible to bridge(4) on vlan(4)'s whose parent interfaces
support VLAN decapsulation in hardware.
Reviewed by: sam
pointers in argv and envv in userland and use that together with
kern_execve() and exec_free_args() to implement freebsd32_execve()
without using the stackgap.
- Fix freebsd32_adjtime() to call adjtime() rather than utimes(). Still
uses stackgap for now.
- Use kern_setitimer(), kern_getitimer(), kern_select(), kern_utimes(),
kern_statfs(), kern_fstatfs(), kern_fhstatfs(), kern_stat(),
kern_fstat(), and kern_lstat().
Tested by: cokane (amd64)
Silence on: amd64, ia64
pointers in argv and envv in userland and use that together with
kern_execve() and exec_free_args() to implement linux_execve() for the
amd64/linux32 ABI without using the stackgap.
- Implement linux_nanosleep() using the recently added kern_nanosleep().
- Use linux_emul_convpath() instead of linux_emul_find() in
exec_linux_imgact_try().
Tested by: cokane
Silence on: amd64
the semantics in that the returned filename to use is now a kernel
pointer rather than a user space pointer. This required changing the
arguments to the CHECKALT*() macros some and changing the various system
calls that used pathnames to use the kern_foo() functions that can accept
kernel space filename pointers instead of calling the system call
directly.
- Use kern_open(), kern_stat(), kern_lstat(), kern_fstat(), kern_access(),
kern_truncate(), kern_pathconf(), kern_execve(), kern_select(),
kern_setitimer(), kern_getitimer(), kern_statfs(), and kern_fstatfs().
Silence on: alpha@
like a valid range. We already do this in the memory case (although
the code there is somewhat different than the I/o case because we have
to deal with different kinds of memory). Since most laptops don't
have non-subtractive bridges, this wasn't seen in practice.
Evidentally the Compaq R3000 hits this problem with PC Cards.
Some minor style fixes while I'm here.
Submitted by: Jung-uk Kim
mounted (is it Joliet, RockRidge, High Sierra) based on bootverbose.
Most file systems don't generate log messages based on details of the
file system superblock, and these log messages disrupt sysinstall output
during a new install from CD. We may want to explore exposing this
status information using nmount() at some point.
MFC after: 3 days
Note that this results in more kernel virtual memory being reserved for
temporary storage of the args. The args temporary space is allocated out
of exec_map (a submap of kernel_map). This will use roughly 4MB of KVM.
OK'ed by: dg
copy op to shift arguments on the stack instead of transfering each
argument one by one through a register. Probably doesn't affect overall
operation, but makes the code a little less grotty and easier to update
later if I choose to make the wrapper handle more args. Also add
comments.
so->so_options when solisten() will succeed, rather than setting it
conditionally based on there not being queued sockets in the completed
socket queue. Otherwise, if the protocol exposes new sockets via the
completed queue before solisten() completes, the listen() system call
will succeed, but the socket and protocol state will be out of sync.
For TCP, this didn't happen in practice, as the TCP code will panic if
a new connection comes in after the tcpcb has been transitioned to a
listening state but the socket doesn't have SO_ACCEPTCONN set.
This is historical behavior resulting from bitrot since 4.3BSD, in which
that line of code was associated with the conditional NULL'ing of the
connection queue pointers (one-time initialization to be performed
during the transition to a listening socket), which are now initialized
separately.
Discussed with: fenner, gnn
MFC after: 3 days
driver. This used to be handled by cpufreq_drv_settings() but it's
useful to get the type/flags separately from getting the settings.
(For example, you don't have to pass an array of cf_setting just to find
the driver type.)
Use this new method in our in-tree drivers to detect reliably if acpi_perf
is present and owns the hardware. This simplifies logic in drivers as well
as fixing a bug introduced in my last commit where too many drivers attached.
are equal to PCCARD_TPCE_FS_MEMSPACE_NONE, memspace will be zero, so
testing for this case inside of the if statement results in dead code.
We'd fail to set a value to zero that's already zero (since it is
initialized to 0 indirectly) with this code being there. Well, except
in the very rare case that we have a card that has a defualt entry
that includes a memory space followed by one that has no memory space
(these are extremely rare, I don't recall ever having seen one :-).
Fix this by setting num_memspace to 0 in a more appropriate place.
Submitted by: Coverity Prevent analysis tool
than the generic ne-2000 string. This should have no effect on the
actual support of the parts, just reporting what the part was.
Also, rename a few functins and symbols to reflect a more generic
part support that grew out of the early specific support.
soref() to also covering the update of so_state. While no other user
threads can update the socket state here as it's not yet hooked up to
the file descriptor array yet, the protocol could also frob the
socket state here, leading to a lost update to the so_state field.
No reported instances of this bug (as yet).
MFC after: 3 days
connection status before inserting the new socket into the listen
socket's accept queue, or there might be a race in which another thread
wakes up when the accept lock is released, and sees the socket before its
state is set correctly. The wakeup still occurs after the accept lock is
released. There have been no diagnoses of this bug in real-world systems
(as yet).
MFC after: 3 days
or just offering info. In the former case, we don't probe/attach to allow
the ACPI driver precedence. A refinement of this would be to actually
use the info provided by acpi_perf(4) to get the real CPU clock rates
instead of estimating them but since all systems that support both
acpi_perf(4) and ichss(4) export the control registers to acpi_perf(4),
it can just handle the registers on its own.
loaded, the tick interrupt enabled and a handler that resets the tick
counter on every tick interrupt. While this isn't documented this can
cause DELAY() to wait for a value the tick counter will not reach when
used in early boot, i.e. before cpu_initclocks() is called, depending
on when in the cycle DELAY() is called, the delay value and the value
the tick compare register is set to. The excessive use of DELAY() in
uart(4) when probing Sun keyboards seems to always manage to trigger
this, resulting in a hang during boot.
Disable the tick interrupt in tick_init(), which is called early in
sparc64_init(), until the interrupt is enabled again in tick_start(),
called by cpu_initclocks(), with our own handler. This fixes the hang
during probing Sun keyboards on AXi boards and Ultra 10, with other
machines like Ultra 5 probably being affected but not tested.
Additional testing by: Matthias Muthmann
MFC after: 1 week