designed to help detect tamper-after-free scenarios, a problem more
and more common and likely with multithreaded kernels where race
conditions are more prevalent.
Currently MemGuard can only take over malloc()/realloc()/free() for
particular (a) malloc type(s) and the code brought in with this
change manually instruments it to take over M_SUBPROC allocations
as an example. If you are planning to use it, for now you must:
1) Put "options DEBUG_MEMGUARD" in your kernel config.
2) Edit src/sys/kern/kern_malloc.c manually, look for
"XXX CHANGEME" and replace the M_SUBPROC comparison with
the appropriate malloc type (this might require additional
but small/simple code modification if, say, the malloc type
is declared out of scope).
3) Build and install your kernel. Tune vm.memguard_divisor
boot-time tunable which is used to scale how much of kmem_map
you want to allott for MemGuard's use. The default is 10,
so kmem_size/10.
ToDo:
1) Bring in a memguard(9) man page.
2) Better instrumentation (e.g., boot-time) of MemGuard taking
over malloc types.
3) Teach UMA about MemGuard to allow MemGuard to override zone
allocations too.
4) Improve MemGuard if necessary.
This work is partly based on some old patches from Ian Dowse.
provides truer debugger stack traces. For those that want to stick with
-O2 kernel builds, one should probably add -fno-optimize-sibling-calls
so that each stack frame as a frame pointer.
It is semi-promissed by the Release Engineers that when RELENG_6 is
created we go back to -O2.
Desired by: scottl, jhb
Silence on: net@, current@, hackers@.
No objections: joerg
Requested by: by many (mostly Cronyx) users for a long long time.
MFC after: 10 days
PR: kern/21771, kern/66348
the ISA and CBUS (called isa on pc98) attachments. Eliminate all PC98
ifdefs in the process (the driver in pc98/pc98/mse.c was a copy of the one
in i386/isa/mse.c with PC98 ifdefs). Create a module for this driver.
I've tested this my PC-9821RaS40 with moused. I've not tested this on i386
because I have no InPort cards, or similar such things. NEC standardized
on bus mice very early, long before ps/2 mice ports apeared, so all PC-98
machines supported by FreeBSD/pc98 have bus mice, I believe.
Reviewed by: nyan-san
zero-copy receive of jumbo frames. This eliminates the need for the
jumbo frame allocator implemented in kern/uipc_jumbo.c and sys/jumbo.h.
Remove it.
Note: Zero-copy receive of jumbo frames did not work without these changes;
I believe there was insufficient locking on the jumbo vm object.
Tested by: ken@
Discussed with: gallatin@
initializations but we did have lofty goals and big ideals.
Adjust to more contemporary circumstances and gain type checking.
Replace the entire vop_t frobbing thing with properly typed
structures. The only casualty is that we can not add a new
VOP_ method with a loadable module. History has not given
us reason to belive this would ever be feasible in the the
first place.
Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.
Give coda correct prototypes and function definitions for
all vop_()s.
Generate a bit more data from the vnode_if.src file: a
struct vop_vector and protype typedefs for all vop methods.
Add a new vop_bypass() and make vop_default be a pointer
to another struct vop_vector.
Remove a lot of vfs_init since vop_vector is ready to use
from the compiler.
Cast various vop_mumble() to void * with uppercase name,
for instance VOP_PANIC, VOP_NULL etc.
Implement VCALL() by making vdesc_offset the offsetof() the
relevant function pointer in vop_vector. This is disgusting
but since the code is generated by a script comparatively
safe. The alternative for nullfs etc. would be much worse.
Fix up all vnode method vectors to remove casts so they
become typesafe. (The bulk of this is generated by scripts)
without ever being changed to actually work with an i8251. Nobody is
working on this either at the moment, so it's not about to change
soon.
When the code necessary to support the i8251 is committed, this can
be reverted again.
clock found on the ISA bus (some USIIe, USIIi and USIIIi models) and
EBus (USIII models) instead of a MK48Txx clock.
Testet by: Matthew T. Lager" <freebsd@trinetworks.com> on Sun Fire V100,
Xavier Beaudouin <kiwi@oav.net> on Netra X1 (initial version)
respective NetBSD driver for use with the genclock interface.
It's first use will be on sparc64 but it was also tested on alpha with
a preliminary patch to switch alpha to use the genclock code together
with this driver instead of the respective code in alpha/alpha/clock.c
and the rather MD mcclock(4). Using it on i386 and amd64 won't be that
hard but some changes/extensions to improve the genclock code in general
should be done first, e.g. add locking and make it easier to access the
NVRAM usually coupled with RTCs.
i386 to dev/acpi_support. In theory, these devices could be found
other than in i386 machines only as amd64 becomes more popular. These
drivers don't appear to do anything i386 specific, so move them to
dev/acpi_support. Move config lines to files so that those
architectures that don't support kernel modules can build them into
the kernel. At the same time, rename acpi_snc to acpi_sony to follow
the lead of all the other specialty devices.
the tree. Small tweaks were made by myself to eliminate unnecessary
includes and some other minor issues. Last time I asked takawata-san
about this driver, he suggested I commit it.
Submitted by: takawata
an inordinate amount of synchronous console output that is fairly
undesirable on slower serial console. It's easily hit by accident
when frobbing other sysctls late at night.
on UltraSPARC workstations. The driver is based on OpenBSD's SBus
cs4231 driver and heavily modified to incorporate into sound(4)
infrastructure. Due to the lack of APCDMA documentation, the DMA
code of SBus cs4231 came from OpenBSD's driver.
The driver runs without Giant lock and supports both SBus and EBus
based CS4231 audio controller. Special thanks to marius for providing
feedbacks during the driver writing. His feedback made it possible
to write hiccup free playback code under high system loads.
Approved by: jake (mentor)
Reviewed by: marius (initial version)
Tested by: marius, kwm, Julian C. Dunn(jdunn AT opentrend DOT net)
jest, of most excellent fancy: he hath taught me lessons a thousand
times; and now, how abhorred in my imagination it is! my gorge rises
at it. Here were those hacks that I have curs'd I know not how
oft. Where be your kludges now? your workarounds? your layering
violations, that were wont to set the table on a roar?
Move the skeleton of specfs into devfs where it now belongs and
bury the rest.
followed by "make depend" shouldn't do anything. It doesn't
seem to be a problem anymore, and if someone finds it to break
again, please contact me so we can work on a real fix.
Reviewed by: bde
- Sort kmod.mk knobs in the documentation section.
- Fixed misuses of the word "KLD" which stands for
"kernel ld", or "kernel linker", where kernel
module is meant.
- Removed redundant uses of ${.OBJDIR}.
- Whitespace and indentation fixes.
- CLEANFILES cleanup.
- Target redefinition protection (install.debug).
Submitted by: bde, ru
Reviewed by: ru, bde
that was fixed by this should not normally happen, and since I did not
record the traces of my failed build attempt that had been solved with
that change, it's not entirely clear whether it hadn't been a pilot
error on my end. In dubio pro reo. :-)
be used to announce various system activity.
The auxio device provides auxiliary I/O functions and is found on various
SBus/EBus UltraSPARC models. At present, only front panel LED is
controlled by this driver.
Approved by: jake (mentor)
Reviewed by: joerg
Tested by: joerg
problem I had but it's happening in code that is messing around with
register windows - I'm willing to live with that piece being sensitive
to this and it looks like the other problems we had reported lately
are not fixed by using -O instead of -O2.
Sorry for the churn. Looks like I need a second pointy hat. Someone
tells me they stack well. :-))))
-O2 on kernel compiles after all. While working on adding a KASSERT
to sparc64/sparc64/rwindow.c I found that it was "position sensitive",
putting it above a call to flushw() instead of below caused corruption
of processes on the system. jake and jhb have both confirmed there is
no obvious explanation for that. The exact same kernel code does not
have the process corruption problem if compiled with -O instead of -O2.
There have been signs of similar issues floated on the sparc64@ mailing
list, lets see if this helps make them go away.
Note this isn't an optimal fix as far as the file format goes, if this
disgusts too many people I'll fix it the right way. Since compiling
with something other than -O is a known problem this format would prevent
a change to the default causing grief. And this may also help motivate
finding out what the compiler is doing wrong so we can shift back to
using -O2. :-)
My turn for the pointy hat... One of the florescent ones...
MFC after: 2 days
Allocation is always lowest free unit number.
A mixed range/bitmap strategy for maximum memory efficiency. In
the typical case where no unit numbers are freed total memory usage
is 56 bytes on i386.
malloc is called M_WAITOK but no locking is provided (yet). A bit of
experience will be necessary to determine the best strategy. Hopefully
a "caller provides locking" strategy can be maintained, but that may
require use of M_NOWAIT allocation and failure handling.
A userland test driver is included.
because it was mostly irrelevant - except for the silly BIOS_PADDRTOVADDR
etc macros. Along the way of working around this, I missed a few things.
* Make syscons properly inherit the bios capslock/shiftlock/etc state like
i386 does. Note that we cannot inherit the bios key repeat rate because
that requires a bios call (which is impossible for us).
* Give syscons the ability to beep on amd64. Oops.
While here, make bios.c compile and add it to files.amd64.
PHYSADDR : Address of the physical memory
KERNPHYSADDR : Physical address where the kernel starts
KERNVIRTADDR : Virtual address of the kernel
STARTUP_PAGETABLE_ADDR : Where to put the page table at bootstrap
+ Xscale specific options
Users should move to the new geom_vinum implementation instead.
The refcount logic which is being added to devices to enable safe module
unloading and the buf/vm work also in progress would require a major rework
of the (old)-vinum code to comply with the new semantics.
The actual source files will not be removed until I have coordinated with
the geomvinum people if they need any bits repo-copied etc.
This is necessary so source upgrades use the correct binary.
MFC after: 3 days
For the record: Problem spotted by Scott Long, who mentioned
that source upgrades from 4.7 to recent 5.x and 6.0 are broken.
Detailed analysis shows that 4.7 has a broken make(1) binary.
A breakage was fixed in RELENG_4 in make/main.c,v 1.35.2.7 by
imp@, though the commit log erroneously stated "MFC 1.68"
while in fact it should have been spelled as "MFC 1.67".
VT6122 gigabit ethernet chip and integrated 10/100/1000 copper PHY.
The vge driver has been added to GENERIC for i386, pc98 and amd64,
but not to sparc or ia64 since I don't have the ability to test
it there. The vge(4) driver supports VLANs, checksum offload and
jumbo frames.
Also added the lge(4) and nge(4) drivers to GENERIC for i386 and
pc98 since I was in the neighborhood. There's no reason to leave them
out anymore.
to RS232 bridges, such as the one found in the DeLorme Earthmate USB GPS
receiver (which is the only device currently supported by this driver).
While other USB to serial drivers in the tree rely heavily on ucom, this
one is self-contained. The reason for that is that ucom assumes that
the bridge uses bulk pipes for I/O, while the Cypress parts actually
register as human interface devices and use HID reports for configuration
and I/O.
The driver is not entirely complete: there is no support yet for flow
control, and output doesn't seem to work, though I don't know if that is
because of a bug in the code, or simply because the Earthmate is a read-
only device.
but with slightly cleaned up interfaces.
The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.
The KSE (or td_sched) structure is now allocated per thread and has no
allocation code of its own.
Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.
Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.
The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.
A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.
Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.
Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by: scottl, peter
MFC after: 1 week
FULL_PREEMPTION is defined. Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c). This
is a possible MT5 candidate.
remaining consumers to have the count passed as an option. This is
i4b, pc98/wdc, and coda.
Bump configvers.h from 500013 to 600000.
Remove heuristics that tried to parse "device ed5" as 5 units of the ed
device. This broke things like the snd_emu10k1 device, which required
quotes to make it parse right. The no-longer-needed quotes have been
removed from NOTES, GENERIC etc. eg, I've removed the quotes from:
device snd_maestro
device "snd_maestro3"
device snd_mss
I believe everything will still compile and work after this.
will cause the network stack to operate without the Giant lock by
default. This change has the potential to improve performance by
increasing parallelism and decreasing latency in network processing.
Due to the potential exposure of existing or new bugs, the following
compatibility functionality is maintained:
- It is still possible to disable Giant-free operation by setting
debug.mpsafenet to 0 in loader.conf.
- Add "options NET_WITH_GIANT", which will restore the default value of
debug.mpsafenet to 0, and is intended for use on systems compiled with
known unsafe components, or where a more conservative configuration is
desired.
- Add a new declaration, NET_NEEDS_GIANT("componentname"), which permits
kernel components to declare dependence on Giant over the network
stack. If the declaration is made by a preloaded module or a compiled
in component, the disposition of debug.mpsafenet will be set to 0 and
a warning concerning performance degraded operation printed to the
console. If it is declared by a loadable kernel module after boot, a
warning is displayed but the disposition cannot be changed. This is
implemented by defining a new SYSINIT() value, SI_SUB_SETTINGS, which
is intended for the processing of configuration choices after tunables
are read in and the console is available to generate errors, but
before much else gets going.
This compatibility behavior will go away when we've finished the last
of the locking work and are confident that operation is correct.
compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.
If no hooks are connected the entire packet filter hooks section and related
activities are jumped over. This removes any performance impact if no hooks
are active.
Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
The prefix management code currently resides in nd6, leaving only the
unused router renumbering capability in the in6_prefix files. Removing
it will make it easier for us to provide locking for the remainder of
IPv6 by reducing the number of objects requiring synchronized access.
This functionality has also been removed from NetBSD and OpenBSD.
Submitted by: George Neville-Neil <gnn at neville-neil.com>
Discussed with/approved by: suz, keiichi at kame.net, core at kame.net
amd64 agp option here in order to let the pc98 kernel build
complete. This doesn't seem right, since there probably aren't
plans to build a pc98 amd64 box; however, it's not clear to me
how to get config to generate an opt_agp.h without an option
defined.
and preserves the ipfw ABI. The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.
However there are many changes how ipfw is and its add-on's are handled:
In general ipfw is now called through the PFIL_HOOKS and most associated
magic, that was in ip_input() or ip_output() previously, is now done in
ipfw_check_[in|out]() in the ipfw PFIL handler.
IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to
be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
reassembly. If not, or all fragments arrived and the packet is complete,
divert_packet is called directly. For 'tee' no reassembly attempt is made
and a copy of the packet is sent to the divert socket unmodified. The
original packet continues its way through ip_input/output().
ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet
with the new destination sockaddr_in. A check if the new destination is a
local IP address is made and the m_flags are set appropriately. ip_input()
and ip_output() have some more work to do here. For ip_input() the m_flags
are checked and a packet for us is directly sent to the 'ours' section for
further processing. Destination changes on the input path are only tagged
and the 'srcrt' flag to ip_forward() is set to disable destination checks
and ICMP replies at this stage. The tag is going to be handled on output.
ip_output() again checks for m_flags and the 'ours' tag. If found, the
packet will be dropped back to the IP netisr where it is going to be picked
up by ip_input() again and the directly sent to the 'ours' section. When
only the destination changes, the route's 'dst' is overwritten with the
new destination from the forward m_tag. Then it jumps back at the route
lookup again and skips the firewall check because it has been marked with
M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with
'option IPFIREWALL_FORWARD' to enable it.
DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for
a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will
then inject it back into ip_input/ip_output() after it has served its time.
Dummynet packets are tagged and will continue from the next rule when they
hit the ipfw PFIL handlers again after re-injection.
BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS.
More detailed changes to the code:
conf/files
Add netinet/ip_fw_pfil.c.
conf/options
Add IPFIREWALL_FORWARD option.
modules/ipfw/Makefile
Add ip_fw_pfil.c.
net/bridge.c
Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw
is still directly invoked to handle layer2 headers and packets would
get a double ipfw when run through PFIL_HOOKS as well.
netinet/ip_divert.c
Removed divert_clone() function. It is no longer used.
netinet/ip_dummynet.[ch]
Neither the route 'ro' nor the destination 'dst' need to be stored
while in dummynet transit. Structure members and associated macros
are removed.
netinet/ip_fastfwd.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code.
netinet/ip_fw.h
Removed 'ro' and 'dst' from struct ip_fw_args.
netinet/ip_fw2.c
(Re)moved some global variables and the module handling.
netinet/ip_fw_pfil.c
New file containing the ipfw PFIL handlers and module initialization.
netinet/ip_input.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code. ip_forward() does not longer require
the 'next_hop' struct sockaddr_in argument. Disable early checks
if 'srcrt' is set.
netinet/ip_output.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code.
netinet/ip_var.h
Add ip_reass() as general function. (Used from ipfw PFIL handlers
for IPDIVERT.)
netinet/raw_ip.c
Directly check if ipfw and dummynet control pointers are active.
netinet/tcp_input.c
Rework the 'ipfw forward' to local code to work with the new way of
forward tags.
netinet/tcp_sack.c
Remove include 'opt_ipfw.h' which is not needed here.
sys/mbuf.h
Remove m_claim_next() macro which was exclusively for ipfw 'forward'
and is no longer needed.
Approved by: re (scottl)
The ISA probe uses an identify routine to probe all slot locations from
1 to 14 that do not conflict with other allocated resources. This required
making aic7770.c part of the driver core when compiled as a module.
aic7xxx.c:
aic79xx.c:
aic_osm_lib.c:
Use aic_scb_timer_start() consistently to start the watchdog timer.
This removes a few places that verbatum copied the code in
aic_scb_timer_start().
During recovery processing, allow commands to still be queued to
the controller. The only requirement we have is that our recovery
command be queued first - something the code already guaranteed.
The only other change required to make this work is to prevent
timers from being started for these newly queued commands.
Approved by: re
have been rush hour...
While here, move COMPAT_IA32 from opt_global.h to opt_compat.h like on
amd64. Consequently, it's unsafe to use the option in pcb.h. We now
unconditionally have the ia32 specific registers in the PCB.
This commit is untested.
with the COMPAT_LINUX32 option. This is largely based on the i386 MD Linux
emulations bits, but also builds on the 32-bit FreeBSD and generic IA-32
binary emulation work.
Some of this is still a little rough around the edges, and will need to be
revisited before 32-bit and 64-bit Linux emulation support can coexist in
the same kernel.
for EBus, ISA and PCI, by compiling ofw_isa.c and ofw_pci_if.m unconditio-
nally. The correct way is to rewrite OF_decode_addr() in ofw_machdep.c in
a bus-neutral way. That's certainly possible but we unfortunately didn't
make it for FreeBSD 5.3.
Approved by: tmm
logical CPUs on a system to be used as a dedicated watchdog to cause a
drop to the debugger and/or generate an NMI to the boot processor if
the kernel ceases to respond. A sysctl enables the watchdog running
out of the processor's idle thread; a callout is launched to reset a
timer in the watchdog. If the callout fails to reset the timer for ten
seconds, the watchdog will fire. The sysctl allows you to select which
CPU will run the watchdog.
A sample "debug.leak_schedlock" is included, which causes a sysctl to
spin holding sched_lock in order to trigger the watchdog. On my Xeons,
the watchdog is able to detect this failure mode and break into the
debugger, which cannot otherwise be done without an NMI button.
This option does not currently work with sched_ule due to ule's push
notion of scheduling, similar to machdep.hlt_logical_cpus failing to
work with that scheduler.
On face value, this might seem somewhat inefficient, but there are a
lot of dual-processor Xeons with HTT around, so using one as a watchdog
for testing is not as inefficient as one might fear.
have already done this, so I have styled the patch on their work:
1) introduce a ip_newid() static inline function that checks
the sysctl and then decides if it should return a sequential
or random IP ID.
2) named the sysctl net.inet.ip.random_id
3) IPv6 flow IDs and fragment IDs are now always random.
Flow IDs and frag IDs are significantly less common in the
IPv6 world (ie. rarely generated per-packet), so there should
be smaller performance concerns.
The sysctl defaults to 0 (sequential IP IDs).
Reviewed by: andre, silby, mlaier, ume
Based on: NetBSD
MFC after: 2 months