2) Alter packet flow inside dummynet: allow certain packets to bypass
dummynet scheduler. Benefits are:
- lower latency: if packet flow does not exceed pipe bandwidth, packets
will not be delayed by up to one tick (due to dummynet's scheduler granularity).
- lower overhead: if a packet avoids the dummynet scheduler, it does not have
to reenter the IP stack later. Such packets can be fast-forwarded.
- recursion (which can lead to kernel stack exhaustion) is eliminated. This fixes
a long-standing panic, which can be triggered this way:
kldload dummynet
sysctl net.inet.ip.fw.one_pass=0
ipfw pipe 1 config bw 0
for i in `jot 30`; do ipfw add 1 pipe 1 icmp from any to any; done
ping -c 1 localhost
3) Three new sysctl nodes are added:
net.inet.ip.dummynet.io_pkt - packets passed to dummynet
net.inet.ip.dummynet.io_pkt_fast - packets that bypassed the dummynet scheduler
net.inet.ip.dummynet.io_pkt_drop - packets dropped by dummynet
P.S. The above comments apply only to layer 3 packets. Layer 2 packet flow
is not changed yet.
MFC after: 3 months
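
The bypass decision and the three counters described above can be pictured with a small userland model. This is only a sketch under stated assumptions: `struct dn_pipe`, `struct dn_stats`, and `dn_io()` are hypothetical names invented for illustration, and the per-tick byte budget is a simplification, not dummynet's actual scheduler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters mirroring the three new sysctl nodes. */
struct dn_stats {
	uint64_t io_pkt;	/* packets passed to dummynet */
	uint64_t io_pkt_fast;	/* packets that bypassed the scheduler */
	uint64_t io_pkt_drop;	/* packets dropped by dummynet */
};

/* Hypothetical, simplified pipe: a byte budget per tick and a bounded queue. */
struct dn_pipe {
	uint32_t credit;	/* bytes that may still be sent this tick */
	uint32_t qlen;		/* packets currently queued */
	uint32_t qlimit;	/* maximum queue length */
};

enum dn_verdict { DN_FAST, DN_QUEUED, DN_DROPPED };

static enum dn_verdict
dn_io(struct dn_pipe *p, struct dn_stats *st, uint32_t len)
{
	assert(p != NULL && st != NULL);

	st->io_pkt++;
	/* Fast path: nothing is waiting and the packet fits the budget. */
	if (p->qlen == 0 && len <= p->credit) {
		p->credit -= len;
		st->io_pkt_fast++;
		return (DN_FAST);
	}
	/* Slow path: queue for the scheduler, or drop if the queue is full. */
	if (p->qlen >= p->qlimit) {
		st->io_pkt_drop++;
		return (DN_DROPPED);
	}
	p->qlen++;
	return (DN_QUEUED);
}
```

In this model a packet that takes the fast path never touches the queue, which corresponds to the "lower latency" and "lower overhead" points above.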
while other variants have an in-order ethernet address for the same
chipset. Override ethernet address ordering if we already know how
it was stored. This fixes the use of a reversed ethernet address on
MCP67.
Submitted by: ariff
MFC after: 3 days
Allocate space in the keyboard state structure instead, to prevent a random
byte from a possibly overwritten stack location from being shoved into the
USB device when the transfer actually takes place.
This fixes at least one instance of LEDs not working with USB keyboards.
characters (mostly "&"). Because top(1) shows only first six characters of
wait channel, without this change we saw only one meaningful character.
Requested by: kris & others
MFC after: 1 week
must be globally performed before calling any of the TLB invalidation
functions.
With one exception, on amd64, this requirement was already met. Fix this
one case. Also, as a clarification, change an existing atomic op into a
release. (Suggested by: jhb)
Reported and reviewed by: ups
MFC after: 3 days
o do not override the home channel recorded for the sta when the frame is
received off-channel; this fixes a problem where we might think the sta
was operating on the channel the frame was received on causing association
requests to be ignored/rejected (likely cause of kern/99036)
o don't include rssi of off-channel frames in the avg rssi used to select
a bss; this gives us a better estimate of the signal we will see for the
station when on-channel
PR: kern/99036
Found by: Yubin Gong
Reviewed by: sephe
MFC after: 1 week
This import includes:
o wpi Wireless driver for the Intel 3945 Wireless Lan Controller (802.11abg) (sys/dev/wpi)
o Intel firmware revision 2.14.4 & associated LICENSE (sys/dev/contrib/wpi, sys/contrib/dev/wpi/LICENSE)
o wpifw Firmware driver (sys/modules/wpifw)
Approved by: mlaier, sam (co-mentors)
silent NULL pointer dereference in the i386 and sparc64 pmap_pinit()
when the kmem_alloc_nofault() failed to allocate address space. Both
functions now return an error instead of panicking or dereferencing NULL.
As a consequence, vmspace_exec() and vmspace_unshare() now return an errno
int. struct vmspace arg was added to vm_forkproc() to avoid dealing
with failed allocation when most of the fork1() job is already done.
The kernel stack for the thread is now set up in thread_alloc(),
which itself may return NULL. Also, allocation of the first process
thread is performed in the fork1() to properly deal with stack
allocation failure. proc_linkup() is separated into proc_linkup()
called from fork1(), and proc_linkup0(), that is used to set up the
kernel process (was known as swapper).
In collaboration with: Peter Holm
Reviewed by: jhb
default object rather than cache it was to have
vm_pager_has_page(object, pindex, ...) == FALSE to imply that there is
no cached page in object at pindex. This allows us to avoid explicit
checks for cached pages in vm_object_backing_scan().
For now, we need the same bandaid for the swap object, otherwise both
the vm_page_lookup() and the pager can report that there is no page at
offset, while page is stored in the cache. Also, this fixes another
instance of the KASSERT("object type is incompatible") failure in the
vm_page_cache_transfer().
Reported and tested by: Peter Holm
Reviewed by: alc
MFC after: 3 days
interface. Once the limit is reached packets with unknown source addresses are
dropped until an existing host cache entry expires or is removed. Useful to
use with the STICKY cache option.
Sponsored by: miniSuperHappyDevHouse NZ
reset problem when we reboot the system with the zyd device inserted.
Submitted by: Weongyo Jeong
Reported by: Ted Lindgreen (ted@tednet.nl)
MFC after: 1 week
it's been printing out scary messages about "Unhandled Event Notify Frame"
that are needlessly worrisome to users. Change this warning to only print
out at an elevated debugging level.
warnings. Specifically, whenever vm_page_alloc(9) returned NULL to
get_pv_entry(), we issued a warning regardless of the number of pv
entries in use. (Note: The older pv entry allocator in RELENG_6 does
not have this problem.)
Reported by: Jeremy Chadwick
Eliminate the direct call to pagedaemon_wakeup() by get_pv_entry().
This was a holdover from earlier times when the page daemon was
responsible for the reclamation of pv entries.
MFC after: 5 days
Put in a little comment explaining why it went away.
Re-enable it in the case where an existing process is just splitting
off its address space and file descriptors.
(I don't think anything uses that code but it needs some sort of locking
and this does the job.)
Reviewed by: Davidxu, alc, others
MFC after: 3 days
CPUs to make sure idle threads are evicted from the softc before returning
from acpi_cpu_shutdown(). However, this is unnecessary since stop_cpus()
handles this for itself and at this point it's possible that our IPI will be
blocked (interrupts disabled).
Thanks to: Glen Leeder <glen.leeder / nokia.com>
MFC after: 3 days
don't do this right; instead go to the scan cache so we pass through
auth state (if the cache is warm we can do this w/o an actual scan)
MFC after: 1 week
(BIO_WRITE and BIO_FLUSH) as it is done in Solaris. The difference is
that Solaris calls it only for sync requests, but we can't tell in GEOM
whether the request is sync or async, so we do it for every request.
MFC after: 1 week
to change the freq before the other CPUs are active. The current code
always attempts to change all CPUs to match each other, and the requisite
sched_bind() call won't work before APs are launched.
/dev/agpgart and agp_free_res() frees resources like the BAR for the
aperture. Splitting this up lets chipset-specific detach routines
manipulate the aperture during their detach routines without panicking.
MFC after: 1 week
Reviewed by: anholt
* Do not hold any locks over calls to copyin/copyout.
* Clean up some #ifdefs
* fix a possible mbuf leak when NAT fails on policy routed packets
PR: 117216
- Select a tag gains ability to optionally save new tags
off in the timewait system.
- When looking up associations do not give back a stcb that
is in the about-to-be-freed state, and instead continue
looking for other candidates.
- New function to query to see if value is in time-wait.
- Timewait had a time comparison error that caused very
few vtags to actually stay in time-wait.
- When setting tags in time-wait, we now use the time
requested NOT a fixed constant value.
- sstat now gets the proper associd when we do the query.
- When we process an association, we expect the tag chosen
(if we have one from a cookie) to be in time-wait. Before
we would NOT allow the assoc up by checking if it's good.
In theory this should have caused almost all assoc not
to come up except for the time-comparison bug above (this
bug was hidden by the time comparison bug :-D).
- Don't save tags for nonce values in the time-wait cache
since these are used only during cookie collisions and do
not matter if they are unique or not.
MFC after: 1 week
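
The time-wait behaviour described in the list above (store each tag for the requested time, and query whether a tag is still in time-wait) can be sketched as follows. All names here (`struct tw_cache`, `tw_add()`, `tw_contains()`, `TW_SLOTS`) are hypothetical and are not the SCTP stack's identifiers; the point is only the direction of the time comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	TW_SLOTS	8	/* hypothetical cache size */

struct tw_entry {
	uint32_t tag;		/* verification tag; 0 marks an unused slot */
	uint32_t expires;	/* absolute time (seconds) when the tag is released */
};

struct tw_cache {
	struct tw_entry slot[TW_SLOTS];
};

/* Record a tag for the caller-requested duration, not a fixed constant. */
static bool
tw_add(struct tw_cache *c, uint32_t tag, uint32_t now, uint32_t duration)
{
	assert(tag != 0);

	for (size_t i = 0; i < TW_SLOTS; i++) {
		struct tw_entry *e = &c->slot[i];

		/* Reuse free or already expired slots. */
		if (e->tag == 0 || e->expires <= now) {
			e->tag = tag;
			e->expires = now + duration;
			return (true);
		}
	}
	return (false);
}

/* A tag is in time-wait while its expiry time lies in the future. */
static bool
tw_contains(const struct tw_cache *c, uint32_t tag, uint32_t now)
{
	for (size_t i = 0; i < TW_SLOTS; i++) {
		const struct tw_entry *e = &c->slot[i];

		if (e->tag == tag && e->expires > now)
			return (true);
	}
	return (false);
}
```

Inverting the comparison in `tw_contains()` would make almost every tag look expired, which is the kind of error that leaves very few tags in time-wait.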
set this flag and it was more or less just copied and pasted from
another FreeBSD driver while porting this driver from NetBSD, whose
gentbi(4) doesn't set MIIF_NOISOLATE either.
- Fix spelling in a comment.
OK'ed by: yongari
MFC after: 3 months
zero (0). Actual RFCOMM channel will be assigned after listen(2)
call is done on a RFCOMM socket bound to a ''wildcard'' RFCOMM
channel zero (0).
Address locking issues in ng_btsocket_rfcomm_bind()
Submitted by: Heiko Wundram (Beenic) < wundram at beenic dot net >
MFC after: 1 week
- Remove AU_.* hard-coded audit class constants, as audit classes are now
entirely dynamically configured using /etc/security/audit_class.
Obtained from: TrustedBSD Project
supports the removal of hard-coded audit class constants in OpenBSM
1.0. All audit classes are now dynamically configured via the
audit_class database.
Obtained from: TrustedBSD Project
changes:
01 - Enhanced LRO:
LRO feature is extended to support multi-buffer mode. Previously,
Ethernet frames received in contiguous buffers were offloaded.
Now, frames received in multiple non-contiguous buffers can be
offloaded, as well. The driver now supports LRO for jumbo frames.
02 - Locks Optimization:
The driver code was re-organized to limit the use of locks.
Moreover, lock contention was reduced by replacing wait locks
with try locks.
03 - Code Optimization:
The driver code was re-factored to eliminate some memcpy
operations. Fast path loops were optimized.
04 - Tag Creations:
Physical Buffer Tags are now optimized based upon frame size.
For better performance, Physical Memory Maps are now re-used.
05 - Configuration:
Features such as TSO, LRO, and Interrupt Mode can be configured
either at load or at run time. Rx buffer mode (mode 1 or mode 2)
can be configured at load time through kenv.
06 - Driver Statistics:
Run time statistics are enhanced to provide better visibility
into the driver performance.
07 - Bug Fixes:
The driver contains fixes for the problems discovered and
reported since last submission.
08 - MSI support:
Added Message Signaled Interrupt feature which currently uses 1
message.
09 - Removed feature:
Rx 3 buffer mode feature has been removed. Driver now supports 1,
2 and 5 buffer modes of which 2 and 5 buffer modes can be used
for header separation.
10 - Compiler warning:
Fixed compiler warning when compiled for 32 bit system.
11 - Copyright notice:
Source files are updated with the proper copyright notice.
MFC after: 3 days
Submitted by: Alicia Pena <Alicia dot Pena at neterion dot com>,
Muhammad Shafiq <Muhammad dot Shafiq at neterion dot com>
made by Michael Eisele and the patch was slightly modified by me.
With this change several NVIDIA ethernet controllers (e.g. MCP61)
work.
RTL8211B(L) is RealTek's new gigabit PHY. The PHY has several
features including crossover correction, polarity correction as
well as supporting triple speed (10/100/1000Mbps). Data transfer
between MAC and PHY is via RGMII for 1000baseT, MII for
10baseT/100baseTX.
Unfortunately, RealTek used the same model number for RTL8211B(L)
PHY so there is no way to discriminate between RTL8211B(L) and its
predecessors. ATM RTL8211B uses revision number 2 so checking the
revision number seems to be only way to identify it.
Obtained from: Michael Eisele [1]
Tested by: clemens fischer < ino-qc AT spotteswoode DOT de DOT eu DOT org >
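
The revision check described above can be sketched in a few lines. The bit layout of PHY identifier register 2 (model in bits 9:4, revision in bits 3:0) follows IEEE 802.3; everything else is an assumption for illustration: `EXAMPLE_MODEL_RTL8211` is a placeholder rather than the real model number, and `phy_idr2_pack()` and `phy_is_rtl8211b()` are hypothetical helpers, not the driver's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PHY identifier register 2: model in bits 9:4, revision in bits 3:0. */
#define	PHY_IDR2_MODEL(id2)	(((id2) & 0x03f0) >> 4)
#define	PHY_IDR2_REV(id2)	((id2) & 0x000f)

#define	EXAMPLE_MODEL_RTL8211	0x11	/* placeholder, not the real model number */
#define	RTL8211B_REV		2	/* per the text: RTL8211B uses revision 2 */

/* Build the low ten bits of an ID register value (used to make test inputs). */
static uint16_t
phy_idr2_pack(unsigned int model, unsigned int rev)
{
	assert(model <= 0x3f && rev <= 0xf);
	return ((uint16_t)((model << 4) | rev));
}

/* Same model number for old and new parts, so the revision decides. */
static bool
phy_is_rtl8211b(uint16_t id2)
{
	return (PHY_IDR2_MODEL(id2) == EXAMPLE_MODEL_RTL8211 &&
	    PHY_IDR2_REV(id2) == RTL8211B_REV);
}
```

Because the check keys on a single revision value, a future silicon revision would need its own entry; that limitation follows from the shared model number mentioned above.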
mii_anegticks to MII_ANEGTICKS_GIGE and use it. Previously it used
MII_ANEGTICKS, which may not be long enough to wait before retrying
the autonegotiation process at 1000Mbps.
o Reset the autonegotiation timer if the media option is not IFM_AUTO or we
got a valid link.
o Announce link loss right after it happens.
o Autonegotiation is retried every mii_anegticks seconds.
o Report link state changes right after setting autonegotiation.
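
The timer rules in the list above can be modelled with a once-per-second tick function. This is a minimal sketch, not the PHY driver's code: `struct phy_state` and `phy_tick()` are hypothetical, and the retry interval is passed in as a plain integer rather than taken from MII_ANEGTICKS_GIGE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-PHY state for the example. */
struct phy_state {
	bool	media_auto;	/* media option is IFM_AUTO */
	bool	link_up;	/* PHY reports a valid link */
	int	ticks;		/* seconds since autonegotiation last started */
	int	anegticks;	/* retry interval in seconds */
	int	restarts;	/* autonegotiation restarts so far */
};

/* Called once per second; returns true if autonegotiation was restarted. */
static bool
phy_tick(struct phy_state *sc)
{
	assert(sc->anegticks > 0);

	/* Reset the timer if media is not auto or the link is valid. */
	if (!sc->media_auto || sc->link_up) {
		sc->ticks = 0;
		return (false);
	}
	/* Otherwise retry only once every anegticks seconds. */
	if (++sc->ticks < sc->anegticks)
		return (false);
	sc->ticks = 0;
	sc->restarts++;
	return (true);
}
```

A longer interval for gigabit media simply means a larger `anegticks` value, so a slow negotiation is not interrupted by a premature retry.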
Blade 1500/SX1500 boards have inherited the firmware bug of the
AX1105 mainboards to not include an interrupt map entry for the
parallel port controller (for the AX1105 the heuristic code for
E450s probably erroneously kicks in and guesses an interrupt).
- Take advantage of bus_generic_setup_intr(9).
- Fix some whitespace bugs.
entry point, which is no longer required now that we don't support
old-style multicast tunnels. This removes the last mbuf object class
entry point that isn't init/copy/destroy.
Obtained from: TrustedBSD Project
Framework by moving from mac_mbuf_create_netlayer() to more specific
entry points for specific network services:
- mac_netinet_firewall_reply() to be used when replying to in-bound TCP
segments in pf and ipfw (etc).
- Rename mac_netinet_icmp_reply() to mac_netinet_icmp_replyinplace() and
add mac_netinet_icmp_reply(), reflecting that in some cases we overwrite
a label in place, but in others we apply the label to a new mbuf.
Obtained from: TrustedBSD Project
in the TrustedBSD MAC Framework:
- Add mac_atalk.c and add explicit entry point mac_netatalk_aarp_send()
for AARP packet labeling, rather than using a generic link layer
entry point.
- Add mac_inet6.c and add explicit entry point mac_netinet6_nd6_send()
for ND6 packet labeling, rather than using a generic link layer entry
point.
- Add explicit entry point mac_netinet_arp_send() for ARP packet
labeling, and mac_netinet_igmp_send() for IGMP packet labeling,
rather than using a generic link layer entry point.
- Remove previous generic link layer entry point,
mac_mbuf_create_linklayer() as it is no longer used.
- Add implementations of new entry points to various policies, largely
by replicating the existing link layer entry point for them; remove
old link layer entry point implementation.
- Make MAC_IFNET_LOCK(), MAC_IFNET_UNLOCK(), and mac_ifnet_mtx global
to the MAC Framework rather than static to mac_net.c as it is now
needed outside of mac_net.c.
Obtained from: TrustedBSD Project
reason (not all BIOSen have _DIS methods for all link devices for example).
This matches the behavior of attach() with respect to _DIS as well.
Submitted by: njl
userland preemption directly from hardclock() via sched_clock() when a
thread uses up a full quantum instead of using a periodic timeout to cause
a userland preemption every so often. This fixes a potential deadlock
when IPI_PREEMPTION isn't enabled where softclock blocks on a lock held
by a thread pinned or bound to another CPU. The current thread on that
CPU will never be preempted while softclock is blocked.
Note that ULE already drives its round-robin userland preemption from
sched_clock() as well and always enables IPI_PREEMPT.
MFC after: 1 week
a private softc list is needed neither for tracking clones in general
nor for destroying all clones before the module unload -- if_clone
takes care of all that. (Note that some other interface drivers do
need a softc list to be able to scan it for their private purposes.)
noatime, noexec, suiddir, nosuid, nosymfollow, union,
noclusterr, noclusterw, multilabel, acls, force, update,
async. These options correspond to MOPT_STDOPTS, MOPT_FORCE, MOPT_UPDATE,
and MOPT_ASYNC.
Currently, mount_nfs converts these "-o" options from strings
to MNT_ flags via getmntopts(),
and passes the flags from userspace to the kernel.
This change will allow us in future to pass these mount options
as strings directly to the kernel via nmount() when doing NFS mounts.
out instead of returning an error.
(1) This makes the behavior consistent with mount(2).
(2) This makes update mounts on the root file system work properly.
(3) The explicit checks for MNT_ROOTFS in src/sbin/fsck_ffs/main.c
and src/usr.sbin/mountd/mountd.c which were put in to
eliminate errors during update mounts on the root file system
can be removed.
The only place where MNT_ROOTFS can be validly set
is inside the kernel, i.e. with vfs_mountroot_try().
Reviewed by: phk
MFC after: 3 days
handle to the PCI device_t if the ACPI device_t is already attached to a
driver. This happens on the Tablet TC1000 which for some reason includes
two PCI-ISA bridges and treats the second bridge as an ACPI system resource
device.
Reviewed by: njl (a while ago)
MFC after: 3 days
that would have an offset beyond the end of the target object. Such
pages should remain in the source object.
MFC after: 3 days
Diagnosed and reviewed by: Kostik Belousov
Reported and tested by: Peter Holm
defined. This lets each boot program choose which version of cgbase() it
wants to use rather than forcing ufsread.c to have that knowledge.
MFC after: 1 week
Discussed with: imp
saves about 500 bytes in the boot code. While the AT91RM9200 has 12k
of space for the boot loader, which is more than i386's 8k, the code
generated by gcc is a bit bigger.
I've had this in p4 for about two years now.
we move towards netinet as a pseudo-object for the MAC Framework.
Rename 'mac_create_mbuf_linklayer' to 'mac_mbuf_create_linklayer' to
reflect general object-first ordering preference.
Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer
kthread_add() takes the same parameters as the old kthread_create()
plus a pointer to a process structure, and adds a kernel thread
to that process.
kproc_kthread_add() takes the parameters for kthread_add,
plus a process name and a pointer to a pointer to a process instead of just
a pointer, and if the proc * is NULL, it creates the process to the
specifications required, before adding the thread to it.
All other old kthread_xxx() calls return, but act on (struct thread *)
instead of (struct proc *). One reason to change the name is so that
any old kernel modules that are lying around and expect kthread_create()
to make a process will not just accidentally link.
fix top to show kernel threads by their thread name in -SH mode
add a tdnam formatting option to ps to show thread names.
make all idle threads actual kthreads and put them into their own idled process.
make all interrupt threads kthreads and put them in an interd process
(mainly for aesthetic and accounting reasons)
rename proc 0 to be 'kernel' and its swapper thread is now 'swapper'
man page fixes to follow.
refactored it to be a generic device.
Instead of being part of the standard kernel, there is now a 'nvram' device
for i386/amd64. It is in DEFAULTS like io and mem, and can be turned off
with 'nodevice nvram'. This matches the previous behavior when it was
first committed.
This change introduces audit_proc_coredump() which is called by coredump(9)
to create an audit record for the coredump event. When a process
dumps a core, it could be security relevant. It could be an indicator that
a stack within the process has been overflowed with an incorrectly constructed
malicious payload or a number of other events.
The record that is generated looks like this:
header,111,10,process dumped core,0,Thu Oct 25 19:36:29 2007, + 179 msec
argument,0,0xb,signal
path,/usr/home/csjp/test.core
subject,csjp,csjp,staff,csjp,staff,1101,1095,50457,10.37.129.2
return,success,1
trailer,111
- We allocate a completely new record to make sure we aren't clobbering
the audit data associated with the syscall that produced the core
(assuming the core is being generated in response to SIGABRT and not
an invalid memory access).
- Shuffle around expand_name() so we can use the coredump name at the very
beginning of the coredump call. Make sure we free the storage referenced
by "name" if we need to bail out early.
- Audit both successful and failed coredump creation efforts
Obtained from: TrustedBSD Project
Reviewed by: rwatson
MFC after: 1 month
primary object type, and then secondarily by method name. This sorts
entry points relating to particular objects, such as pipes, sockets, and
vnodes together.
Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer
the PS/2 mouse controller. Thus, when acpi_ibm(4) claimed the mouse
device, the mouse would stop working. The one ACPI dump of an R40 that
I've looked at includes an HKEY device with the proper "IBM0068" ID, so
I'm not sure how the "IBM0057" ID could have helped at all.
MFC after: 1 week
Approved by: njl
Rework the read/write support in the bios disk driver somewhat to cut down
on duplicated code.
- All of the bounce buffer and retry logic duplicated in bd_read() and
bd_write() is merged into a single bd_io() routine that takes an
extra direction argument. bd_read() and bd_write() are now simple
wrappers around bd_io().
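
The shape of that refactoring can be sketched in userland. This is an illustration under stated assumptions, not the loader's code: the signatures are simplified and hypothetical, `bd_xfer()` stands in for the BIOS call, and the "disk" is a small in-memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SECTOR_SIZE	512
#define	DISK_SECTORS	16	/* hypothetical in-memory disk */
#define	MAX_RETRIES	3

enum bd_dir { BD_READ, BD_WRITE };

static uint8_t disk[DISK_SECTORS * SECTOR_SIZE];

/* Stand-in for the BIOS call; moves one sector via the bounce buffer. */
static int
bd_xfer(enum bd_dir dir, uint64_t lba, uint8_t *bounce)
{
	if (lba >= DISK_SECTORS)
		return (-1);
	if (dir == BD_READ)
		memcpy(bounce, &disk[lba * SECTOR_SIZE], SECTOR_SIZE);
	else
		memcpy(&disk[lba * SECTOR_SIZE], bounce, SECTOR_SIZE);
	return (0);
}

/* Shared bounce-buffer and retry logic for both directions. */
static int
bd_io(enum bd_dir dir, uint64_t lba, size_t nsect, uint8_t *buf)
{
	uint8_t bounce[SECTOR_SIZE];

	assert(buf != NULL);
	for (size_t i = 0; i < nsect; i++) {
		uint8_t *p = buf + i * SECTOR_SIZE;
		int error = -1;

		if (dir == BD_WRITE)
			memcpy(bounce, p, SECTOR_SIZE);
		for (int retry = 0; retry < MAX_RETRIES && error != 0; retry++)
			error = bd_xfer(dir, lba + i, bounce);
		if (error != 0)
			return (error);
		if (dir == BD_READ)
			memcpy(p, bounce, SECTOR_SIZE);
	}
	return (0);
}

/* The two public entry points are now thin wrappers. */
static int
bd_read(uint64_t lba, size_t nsect, void *buf)
{
	return (bd_io(BD_READ, lba, nsect, buf));
}

static int
bd_write(uint64_t lba, size_t nsect, const void *buf)
{
	/* The write path never modifies buf; the cast only drops const. */
	return (bd_io(BD_WRITE, lba, nsect, (uint8_t *)(uintptr_t)buf));
}
```

With the direction passed as an argument, a fix to the retry or bounce-buffer handling only has to be made in one place.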
from mac_vfs.c to mac_process.c to join other functions that set up
process labels for specific purposes. Unlike the two proc create calls,
this call is intended to run after creation when a process registers as
the NFS daemon, so remains an _associate_ call.
Obtained from: TrustedBSD Project
than mac_<policy>_whatever, as this shortens the names and makes the code
a bit easier to read.
When dealing with label structures, name variables 'mb', 'ml', 'mm' rather
than the longer 'mac_biba', 'mac_lomac', and 'mac_mls', likewise making
the code a little easier to read.
Obtained from: TrustedBSD Project
order. The kernel used to shuffle them around to get things right,
but that was recently fixed. This makes our boot loader match the
behavior of most other boot loaders for the atmel parts. This bug was
inherited from the Kwikbyte loader that we started from.
This bug was discovered by Björn König back in June, but fell on the
floor. He provided patches to the kernel, including backwards
compatibility options that were similar to Olivier's if_ate.c commit.
in the same order as it's set in ate_set_mac.
I remember a discussion about this on -arm, but apparently nothing was done.
Warner, is this wrong?
X-MFC After: proper review
on i386 and amd64 machines. The overall process is that /boot/pmbr lives
in the PMBR (similar to /boot/mbr for MBR disks) and is responsible for
locating and loading /boot/gptboot. /boot/gptboot is similar to /boot/boot
except that it groks GPT rather than MBR + bsdlabel. Unlike /boot/boot,
/boot/gptboot lives in its own dedicated GPT partition with a new
"FreeBSD boot" type. This partition does not have a fixed size in that
/boot/pmbr will load the entire partition into the lower 640k. However,
it is limited in that it can only be 545k. That's still a lot better than
the current 7.5k limit for boot2 on MBR. gptboot mostly acts just like
boot2 in that it reads /boot.config and loads up /boot/loader. Some more
details:
- Include uuid_equal() and uuid_is_nil() in libstand.
- Add a new 'boot' command to gpt(8) which makes a GPT disk bootable using
/boot/pmbr and /boot/gptboot. Note that the disk must have some free
space for the boot partition.
- This required exposing the backend of the 'add' function as a
gpt_add_part() function to the rest of gpt(8). 'boot' uses this to
create a boot partition if needed.
- Don't cripple cgbase() in the UFS boot code for /boot/gptboot so that
it can handle a filesystem > 1.5 TB.
- /boot/gptboot has a simple loader (gptldr) that doesn't do any I/O
unlike boot1 since /boot/pmbr loads all of gptboot up front. The
C portion of gptboot (gptboot.c) has been repocopied from boot2.c.
The primary changes are to parse the GPT to find a root filesystem
and to use 64-bit disk addresses. Currently gptboot assumes that the
first UFS partition on the disk is the / filesystem, but this algorithm
will likely be improved in the future.
- Teach the biosdisk driver in /boot/loader to understand GPT tables.
GPT partitions are identified as 'disk0pX:' (e.g. disk0p2:) which is
similar to the /dev names the kernel uses (e.g. /dev/ad0p2).
- Add a new "freebsd-boot" alias to g_part() for the new boot UUID.
MFC after: 1 month
Discussed with: marcel (some things might still change, but am committing
what I have so far)
the PCIOCGETCONF, PCIOCREAD and PCIOCWRITE IOCTLs, which was broken
with the introduction of PCI domain support.
As the size of struct pci_conf_io wasn't changed with that commit,
this unfortunately requires the ABI of PCIOCGETCONF to be broken
again in order to be able to provide backwards compatibility to
the old version of that IOCTL.
Requested by: imp
Discussed with: re (kensmith)
Reviewed by: PCI maintainers (imp, jhb)
MFC after: 5 days
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:
mac_<object>_<method/action>
mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.
All MAC policy modules will need to be recompiled, and modules
not updated as part of this commit will need to be modified to
conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer
on duplicated code and support 64-bit LBAs for GPT.
- The code to manage EDD and C/H/S I/O requests is now in separate
routines. The EDD routine now handles a full 64-bit LBA instead of
truncating LBAs to the lower 32-bits. (MBRs and BSD labels only
have 32-bit LBAs anyway, so the only LBAs ever passed down were 32-bit).
- All of the bounce buffer and retry logic duplicated in bd_read() and
bd_write() is merged into a single bd_io() routine that takes an
extra direction argument. bd_read() and bd_write() are now simple
wrappers around bd_io().
- If a disk supports EDD then always use it rather than only using it if
the cylinder is > 1023. Other parts of the boot code already do
something similar to this. Also, GPT just uses LBAs, so for a GPT disk
it's probably best to ignore C/H/S completely. Always using EDD when
it is supported by a disk is an easy way to accomplish this.
MFC after: 1 week
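
To make the 64-bit LBA point concrete, the sketch below fills an EDD disk address packet with a full 64-bit block number and contrasts it with truncation to the lower 32 bits. The field layout follows the EDD specification as commonly documented (a 16-byte packet ending in a 64-bit LBA); the struct and function names are hypothetical and are not the loader's identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EDD disk address packet for INT 13h extended read/write (16 bytes). */
struct edd_packet {
	uint8_t		len;		/* size of this packet */
	uint8_t		reserved;
	uint16_t	count;		/* number of sectors to transfer */
	uint16_t	off;		/* buffer offset */
	uint16_t	seg;		/* buffer segment */
	uint64_t	lba;		/* 64-bit starting block */
};

_Static_assert(sizeof(struct edd_packet) == 16, "unexpected packet size");

/* Fill the packet, passing the LBA through without truncation. */
static void
edd_fill(struct edd_packet *pkt, uint64_t lba, uint16_t count,
    uint16_t seg, uint16_t off)
{
	assert(pkt != NULL);

	pkt->len = sizeof(*pkt);
	pkt->reserved = 0;
	pkt->count = count;
	pkt->off = off;
	pkt->seg = seg;
	pkt->lba = lba;
}

/* The old behaviour, shown for contrast: keep only the lower 32 bits. */
static uint64_t
lba_truncate32(uint64_t lba)
{
	return (lba & 0xffffffffULL);
}
```

With 512-byte sectors a 32-bit LBA reaches only 2 TiB, so a block beyond that point would silently alias to a low block number if truncated.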