timer since irq0 isn't being driven at hz in that case and we don't need to
try to handle edge cases with rollover, etc. that require irq0 to be firing
for the timecounter to actually work.
Submitted by: phk
Tested by: schweikh
Approved by: re (scottl)
and stop trying to play cute games so that sccs[] shares space with
version[].
Reported by: Jilles Tjoelker jilles at stack dot nl
Discussed with: bde, "R. Imura" imura at ryu16 dot org
Idea from: NetBSD (via bde)
Approved by: re (scottl)
MFC after: 1 week
a cosmetic change. m_uiotombuf() produces a packet header mbuf, while
original implementation did not. When kernel is compiled with MAC
support, headerless mbuf will cause panic.
Reported by: Alexander Nikiforenko <asn rambler-co.ru>
Approved by: re (scottl)
MFC After: 2 weeks
module-specific malloc types. These should help us to pinpoint the
possible memory leakage in the future.
- Implementing xpt_alloc_ccb_nowait() and replacing all malloc/free based
CCB memory management with xpt_alloc_ccb[_nowait]/xpt_free_ccb. Hopefully
this would be helpful if someday we move the CCB allocator to use UMA
instead of malloc().
Encouraged by: jeffr, rwatson
Reviewed by: gibbs, scottl
Approved by: re (scottl)
starts with an ifatm which in turns has an ifnet. Remove also a couple
of unneccessary casts that could hide such things in the future.
Approved by: re
so residue of division for all hosts on net is the same, and thus only
one VHID answers. Change source IP in host byte order.
Reviewed by: mlaier
Approved by: re (scottl)
o Grab the MAC address out of the CIS if the card has the special
3Com 0x88 tuple. Most 3Com cards don't have this tuple, but we
prefer it to the eeprom since it only appears to be present when
the eeprom doesn't have the info. So far, I've only observed this
on my 3C362 and 3C362B cards, but the NetBSD driver implies that
the 3C362C also has this tuple, and that some 3C574 cards do too (none
of mine do). ep_pccard_mac was written after looking at the NetBSD
code.
o Store the enet addr in the softc for this device, so we can use the
overridden MAC to set the station address.
o Create a routine to set the station address and use it where we need it.
o setup the cmd shitfs and such before we call ep_alloc(), and remove
setting up the cmd shift value there. It initializes to 0, and those
attachments that need to frob it do so before calling ep_alloc.
o Remove some obsolete comments
o No longer a need to export ep_get_macaddr, so make it static
o ep_alloc already grabs the EEPROM id, so we don't need to grab it again
in ep_pccard_attach.
o eliminate unit, it isn't needed, fix some printfs to be device_printf
instead.
# All my pccards except the 3C1 work now. Didn't test ISA or cbus cards
# that I have: 3C509B-TP or 3C569B-J-TPO
Tested on: 3C589B, 3C589C, 3C589D, 3C589D-TP, 3C562, 3C562B/3C563B,
3C562D/3C563D, 3CCFE574BT, 3CXEM556, 3CCSH572BT, 3C574-TX,
3CCE589EC, 3CXE589EC, 3CCFEM556, 3C1
Approved by: re (scottl)
scan the CIS for interesting tuples. 95% of what can be obtained from
the CIS is harvested by the pccard layer and presented to the user in
standard function calls. However, there are special needs at times
where the standard stuff doesn't suffice. This is for those special
cases.
CARD_SCAN_CIS(device_get_parent(dev), function, argp)
scans the CIS of the card, passing each tuple to function with
the tuple and argp as its arguments. Returning 0 continues the scan,
while returning 1 terminates the scan. The value of the last
invocation of function is returned from this function.
int (*pccard_scan_t)(struct pccard_tuple *tuple, void *argp)
function called for each tuple. Elements of the CIS tuple can be
read with pccard_tuple_read_{1,2,3,4,n}(). You are reading
the actual tuple memory each time, in case your card has
registers in the CIS.
# I suppose these things should be documented in pccard(4) or something like
# that.
# I plan on unifying cardbus CIS support in a similar way.
Approved by: re (scottl)
- pmcstat(8) gprof output mode fixes:
lib/libpmc/pmclog.{c,h}, sys/sys/pmclog.h:
+ Add a 'is_usermode' field to the PMCLOG_PCSAMPLE event
+ Add an 'entryaddr' field to the PMCLOG_PROCEXEC event,
so that pmcstat(8) can determine where the runtime loader
/libexec/ld-elf.so.1 is getting loaded.
sys/kern/kern_exec.c:
+ Use a local struct to group the entry address of the image being
exec()'ed and the process credential changed flag to the exec
handling hook inside hwpmc(4).
usr.sbin/pmcstat/*:
+ Support "-k kernelpath", "-D sampledir".
+ Implement the ELF bits of 'gmon.out' profile generation in a new
file "pmcstat_log.c". Move all log related functions to this
file.
+ Move local definitions and prototypes to "pmcstat.h"
- Other bug fixes:
+ lib/libpmc/pmclog.c: correctly handle EOF in pmclog_read().
+ sys/dev/hwpmc_mod.c: unconditionally log a PROCEXIT event to all
attached PMCs when a process exits.
+ sys/sys/pmc.h: correct a function prototype.
+ Improve usage checks in pmcstat(8).
Approved by: re (blanket hwpmc)
This is good enough to be able to run a RELENG_4 gdb binary against
a RELENG_4 application, along with various other tools (eg: 4.x gcore).
We use this at work.
ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace,
procfs and core dumps.
procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client
and target application.
procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their
sscanf fails. They expect an unsigned long.
imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps.
sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note
that 64 bit consumers can still debug 32 bit targets.
IA64 has got stubs for ia32_reg.c.
Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't
implemented in the 32/64 wrapper yet. We also make a tiny patch to
gdb pacify it over conflicting formats of ld-elf.so.1.
Approved by: re
that newer Intel cpu hardware implements them too. This includes things
like the NX (pte no-execute) flag for execute protection. We'll need to
reference this for implementing no-exec in pmap.c at some point.
Some feature flags are duplicated in both the Intel-orignated bits and
the AMD bits. Suppress the the duplicates correctly - the old code
assumed they were a 1:1 mapping which is not correct. We can't just mask
off the bits present in cpu_feature.
Converge with amd64 where this originated from.
Intel cpu's that implement any AMD features will report them in dmesg now.
Approved by: re
* Add ichwd (The Intel EM64T folks have an ICH)
* Cosmetic comment syncs
* Merge cpufreq change over to NOTES
* add pbio (it compiles, but isn't useful since no boxes have ISA slots)
* copy ath settings (note: wlan disabled here since its in global NOTES)
* copy profiling, including fixing a previous i386->amd64 merge typo.
Approved by: re (blanket i386 <-> amd64 sync/convergence)
be possible to get the swapgs state reversed if doreti traps during
the iretq. Attempt to handle this. load_gs() might need special
handling too. Running the kernel with the user's TLS and the
kernel's PCPU space interchanged would be bad(TM).
Discovered as a result of a conversation with: bde
Approved by: re
ioctl numbers in backwards compatability mode. eg: an IOC_IN ioctl with
a size of zero. Traditionally this was what you did before IOC_VOID
existed, and we had some established users of this in the tree, namely
procfs. Certain 3rd party drivers with binary userland components also
have this too.
This is necessary to have 4.x and 5.x binaries use these ioctl's. We
found this at work when trying to run 4.x binaries.
Approved by: re
dump format. The key reason to do this is so that we can dump sparse
address space. For example, we need to be able to skip the PCI hole
just below the 4GB boundary. Trying to destructively dump MMIO device
registers is Really Bad(TM). The frequent result of trying to do a
crash dump on a machine with 4GB or more ram was ugly (lockup or reboot).
This code has been taken directly from the IA64 dump_machdep.c code,
with just a few (mostly minor) mods.
Introduce a dump_avail[] array in the machdep.c code so that we have a
source of truth for what memory is present in a machine that needs to be
dumped. We can't use phys_avail[] because all sorts of things slice
memory out of it that we really need to dump. eg: the vm page array
and the dmesg buffer. dump_avail[] is pretty much an unmolested version
of phys_avail[]. It does have Maxmem correction.
Bump the i386 and amd64 dump format to version 2, but nothing actually
uses this. amd64 was actually using the i386 dump version number.
libkvm support to follow.
Approved by: re
The ipfw tables lookup code caches the result of the last query. The
kernel may process multiple packets concurrently, performing several
concurrent table lookups. Due to an insufficient locking, a cached
result can become corrupted that could cause some addresses to be
incorrectly matched against a lookup table.
Submitted by: ru
Reviewed by: csjp, mlaier
Security: CAN-2005-2019
Security: FreeBSD-SA-05:13.ipfw
Correct bzip2 permission race condition vulnerability.
Obtained from: Steve Grubb via RedHat
Security: CAN-2005-0953
Security: FreeBSD-SA-05:14.bzip2
Approved by: obrien
Correct TCP connection stall denial of service vulnerability.
A TCP packets with the SYN flag set is accepted for established
connections, allowing an attacker to overwrite certain TCP options.
Submitted by: Noritoshi Demizu
Reviewed by: andre, Mohan Srinivasan
Security: CAN-2005-2068
Security: FreeBSD-SA-05:15.tcp
Approved by: re (security blanket), cperciva
was written in the old fragmented mbuf chain instead of the defragmented
one. Thus, the duration field of outgoing frames was incorrect.
o Only call m_defrag() if the mbuf fragmentation threshold is greater
than what is currently supported by the driver.
Reviewed by: silby (mentor)
Approved by: re (scottl)
fields for each system call, I missed two system call files because
they weren't named syscalls.master. Catch up with this last two,
mapping the system calls to the NULL event for now.
Spotted by: jhb
Approved by: re (scottl)
with a single copyin() + translate and translate + copyout() rather than
using the stackgap.
- Remove implementation of the stackgap for freebsd32 since it is no longer
used for that compat ABI.
Approved by: re (scottl)
reporting - in my previous change, I missed the case where a mbuf
from the packet zone was freed back to the mbuf/packet keg, where
it was subsequently put into the mbuf zone and found not to contain
the expected trash. This change adds the necessary trash_dtor call inside
mb_fini_pack so that everything is correct.
Thanks for Bosko for finding the bug and showing me how secondary zones
work.
Approved by: re (dwhite)
route itself.
It fixes a bug where an IPv4 route for example has an IPv6 gateway
specified:
route add 10.1.1.1 -inet6 fe80::1%fxp0
Destination Gateway Flags Refs Use Netif Expire
10.1.1.1 fe80::1%fxp0 UGHS 0 0 fxp0
The fix rejects these illegal combinations:
route: writing to routing socket: Invalid argument
add host 10.1.1.1: gateway fe80::1%fxp0: Invalid argument
Reviewed by: KAME jinmei@isl.rdc.toshiba.co.jp
Reviewed by: andre (mentor)
Approved by: re
MFC after: 5
which command to use to read the eeprom and which devices have an MII.
Simplify code by no longer using the OLDCARD compat rouintes (I don't
know if this breaks OLDCARD on pc98 or not, but OLDCARD on pc98 days
are numbered, I hope). This also removes a number of kludges that we
had before because they are OBE. Add a convenience routine to lookup
the device to avoid many casts in many places.
Tested with: 3C589D-TP, 3CCSH572BT
Approved by: re (scottl, blanket ep)
handling of pci resources, and mapping framebuffer leading to panics on X
startup. The proper solution involves use of bus_alloc_resource without
RF_ACTIVE, but this code is being rewritten in DRM CVS currently, and disabling
for now doesn't remove any features, so take the easy route.
PR: kern/80718
Approved by: re (scottl)
immediate is not saved by the architecture. Any of the break.{mifx}
instructions have their immediate saved in cr.iim on interruption.
Consequently, when we handle the break interrupt, we end up with a
break value of 0 when it was a break.b. The immediate is important
because it distinguishes between different uses of the break and
which are defined by the runtime specification.
The bottomline is that when the GNU debugger replaces a B-unit
instruction with a break instruction in the inferior, we would not
send the process a SIGTRAP when we encounter it, because the value
is not one we recognize as a debugger breakpoint.
This change adds logic to decode the bundle in which the break
instruction lives whenever the break value is 0. The assumption
being that it's a break.b and we fetch the immediate directly out
of the instruction. If the break instruction was not a break.b,
but any of break.{mifx} with an immediate of 0, we would be doing
unnecessary work. But since a break 0 is invalid, this is not a
problem and it will still result in a SIGILL being sent to the
process.
Approved by: re (scottl)
processing is now done in the ACK processing case.
- Merge tcp_sack_option() and tcp_del_sackholes() into a new function
called tcp_sack_doack().
- Test (SEG.ACK < SND.MAX) before processing the ACK.
Submitted by: Noritoshi Demizu
Reveiewed by: Mohan Srinivasan, Raja Mukerji
Approved by: re
pointer to a softc which is no longer valid since the ifnet struct was split
out from the softc.
Approved by: mlaier (mentor)
Approved by: re (blanket)
Dont try to enable read/write caching on devices that doesn't support it,
this reduces the noise from ATA on flash devices and the like.
Approved by: re@ (scottl)
kernel module. LibAlias is not aware about checksum offloading,
so the caller should provide checksum calculation. (The only
current consumer is ng_nat(4)). When TCP packet internals has
been changed and it requires checksum recalculation, a cookie
is set in th_x2 field of TCP packet, to inform caller that it
needs to recalculate checksum. This ugly hack would be removed
when LibAlias is made more kernel friendly.
Incremental checksum updates are left as is, since they don't
conflict with offloading.
Approved by: re (scottl)
actually work. Also use the right semantics for IF_HANDOFF to get correct
stats.
Reported and tested by: Sascha Luck <sascha at c4inet dot net>
Approved by: re (blanket)
a DLT_NULL interface. In particular:
1) Consistently use type u_int32_t for the header of a
DLT_NULL device - it continues to represent the address
family as always.
2) In the DLT_NULL case get bpf_movein to store the u_int32_t
in a sockaddr rather than in the mbuf, to be consistent
with all the DLT types.
3) Consequently fix a bug in bpf_movein/bpfwrite which
only permitted packets up to 4 bytes less than the MTU
to be written.
4) Fix all DLT_NULL devices to have the code required to
allow writing to their bpf devices.
5) Move the code to allow writing to if_lo from if_simloop
to looutput, because it only applies to DLT_NULL devices
but was being applied to other devices that use if_simloop
possibly incorrectly.
PR: 82157
Submitted by: Matthew Luckie <mjl@luckie.org.nz>
Approved by: re (scottl)
that says why we do this (or rather, explains that it is some voodoo magic
that's poorly understood). The local buffer fixes the crash on attach.
o Rename get_e() to ep_get_e() to avoid namespace pollution.
Submitted by: mux
Approved by: re (scottl)
opening a device, devfs_open needs the file descriptor to install its
own fileops. Failing to pass the file descriptor causes the vnode to
be returned with the regular vnops, which will cause a panic on the
first read or write because devfs_specops is not meant to support
those operations.
This bug caused a panic after exec'ing any set[ug]id program with
fds 0..2 closed (i.e., if any action had to be taken by fdcheckstd, we
would panic if the exec'd program ever tried to use any of those
descriptors).
Reviewed by: phk
Approved by: re (scottl)
allowed writing to the registers by any user that can open the DRI device, and
therefore ability to initiate DMA. This came in with the merge from DRI CVS on
2005-04-15.
Approved by: re (scottl)
Obtained from: DRM CVS
a problem with one particular switch module. Create a kernel option
BGE_FAKE_AUTONEG that restores the 5.4 behavior, which should make the DNLK
switch module work. IBM/Intel blades with Intel or AD switch modules should
work without patching or kernel options with this commit.
Hardware for testing provided by several folks, including
Danny Braniss <danny@cs.huji.ac.il>, Achim Patzner <ap@bnc.net>,
and OffMyServer.
Approved by: re
exec_copyin_strings() to catch up to rev 1.266 of kern_exec.c. This fixes
panics on amd64 with compat binaries since exec_free_args() was freeing
more memory than these functions were allocating and the mismatch could
cause memory to be freed out from under other concurrent execs.
Approved by: re (scottl)
Provide a backwards compatible way to have the extra macro by defining
PCCARD_API_LEVEL 5 before including pccarddevs for driver writers that
want/need to have the same driver on 5 and 6 with pccard attachments.
Approved by: re (dwhite)
(depends on how many memory you have) observed through "tar -tvf /dev/sa0."
Without this patch, RELENG_5 and HEAD panics with something like:
kmem_malloc(4096): kmem_map too small: 42258432 total allocated
RELENG_4 doesn't panic but spews following errors:
camq_init: - cannot malloc array!
Reviewed by: gibbs, scottl
Approved by: re (scottl)
MFC after: 3 days
to the statement in ip_mroute.h, as well as being the same as what
OpenBSD has done with this file. It matches the copyright in NetBSD's
1.1 through 1.14 versions of the file as well, which they subsequently
added back.
It appears to have been lost in the 4.4-lite1 import for FreeBSD 2.0,
but where and why I've not investigated further. OpenBSD had the same
problem. NetBSD had a copyright notice until Multicast 3.5 was
integrated verbatim back in 1995. This appears to be the version that
made it into 4.4-lite1.
Approved by: re (scottl)
MFC after: 3 days
(1) "ipf -T" is broken for fetching single entries and
(2) loading rules with numbered collections does not order insertion right.
(3) stats aren't accumulated for hash table memory failures
Approved by: re (dwhite)
the UMA "trash" allocator is used - this ensures that any writes to a freed
mbuf should provoke a panic.
Only enabled under INVARIANTS, of course.
Approved by: re (scottl)
portability issues. Also note that for amd64, a hack is used to work
around gcc optimization (thanks to cognet@).
Reviewed by: mux (mentor)
Approved by: re (dougb)
#!-line had multiple whitespace characters after the interpreter name, and
it did not have any options, then the code would do nasty things trying to
process a (non-existent) option-string which "ended before it began"...
Submitted by: Morten Johansen
Approved by: re (dwhite)
are actually caused by a buf with both VNCLEAN and VNDIRTY set. In
the traces it is clear that the buf is removed from the dirty queue while
it is actually on the clean queue which leaves the tail pointer set.
Assert that both flags are not set in buf_vlist_add and buf_vlist_remove.
Sponsored by: Isilon Systems, Inc.
Approved by: re (blanket vfs)
cache_zap() to clear the v_dd pointers when a directory vnode is forcibly
discarded. For this to work, all vnodes with v_dd pointers to a directory
must also have name cache entries linked via v_cache_dst to that dvp
otherwise we could not find them at cache_purge() time. The following
code snipit could break this guarantee by unlinking a directory before
fetching it's dotdot. The dotdot lookup would initialize the v_dd field
of the unlinked directory which could never be cleared. To fix this
we don't initialize v_dd for orphaned vnodes.
printf("rmdir: %d\n", rmdir("../foo")); /* foo is cwd */
printf("chdir: %d\n", chdir(".."));
printf("%s\n", getwd(NULL));
Sponsored by: Isilon Systems, Inc.
Discovered by: kkenn
Approved by: re (blanket vfs)
not only means that it's possible (though unlikely) that we hand out
differing tags for the same bus space, it also means that the tags
we handed out are not used during bus enumeration. Both affect our
ability to compare tags. Fix the first by initializing our tags only
once. Fix the second by testing if one of the tags to compare is our
tag and the other is a busspace_isa_{io|mem} tag and declare them
equal if so.
This fixes using uart(4) as the serial console on a ds10. That is,
the low-level console worked, but we could not match the resources
to one of the UARTs found during bus enumeration, which prevented
uart(4) from becoming the console in single- or multi-user mode.
Approved by: re (kensmith)
MFC after: 2 days
Thanks to: all involved in getting a ds10 to me; directly or indirectly.
Special thanks to: Dave Knight, ISC (for not scratching my Porsche :-)
pending discussion of how implementation would proceed. Applications
like -lc_r expect select(3) to match the EAGAIN-status of IO
functions.
Approved by: re
- do not use static memory as we are under a shared lock only
- properly rtfree routes allocated with rtalloc
- rename to verify_path6()
- implement the full functionality of the IPv4 version
Also make O_ANTISPOOF work with IPv6.
Reviewed by: gnn
Approved by: re (blanket)
kernel mode, always use the curthread pmap instead. There are valid cases
were we can fault on a user address from the kernel without pcb_onfault
being set.
Approved by: re (blanket)
contaminated with the GPL code. While this information was present in
the COPYRIGHT.INFO file, it is FreeBSD's standard practice to, where
possible, include explicit license information in files.
Approved by: release engineer (scottl)
ref while we're calling vgone(). This prevents transient refs from
re-adding us to the free list. Previously, a vfree() triggered via
vinvalbuf() getting rid of all of a vnode's pages could place a partially
destructed vnode on the free list where vtryrecycle() could find it. The
first call to vtryrecycle would hang up on the vnode lock, but when it
failed it would place a now dead vnode onto the free list, and another
call to vtryrecycle() would free an already free vnode. There were many
complications of having a zero ref count while freeing which can now go
away.
- Change vdropl() to release the interlock before returning. All callers
now respect this, so vdropl() directly frees VI_DOOMED vnodes once the
last ref is dropped. This means that we'll never have VI_DOOMED vnodes
on the free list.
- Seperate v_incr_usecount() into v_incr_usecount(), v_decr_usecount() and
v_decr_useonly(). The incr/decr split is so that incr usecount can
return with the interlock still held while decr drops the interlock so
it can call vdropl() which will potentially free the vnode. The calling
function can't drop the lock of an already free'd node. v_decr_useonly()
drops a usecount without droping the hold count. This is done so the
usecount reaches zero in vput() before we recycle, however the holdcount
is still 1 which prevents any new references from placing the vnode
back on the free list.
- Fix vnlrureclaim() to vhold the vnode since it doesn't do a vget(). We
wouldn't want vnlrureclaim() to bump the usecount since this has
different semantics. Also change vnlrureclaim() to do a NOWAIT on the
vn_lock. When this function runs we're usually in a desperate situation
and we wouldn't want to wait for any specific vnode to be released.
- Fix a bunch of misc comments to reflect the new behavior.
- Add vhold() and vdrop() to vflush() for the same reasons that we do in
vlrureclaim(). Previously we held no reference and a vnode could have
been freed while we were waiting on the lock.
- Get rid of vlruvp() and vfreehead(). Neither are used. vlruvp() should
really be rethought before it's reintroduced.
- vgonel() always returns with the vnode locked now and never puts the
vnode back on a free list. The vnode will be freed as soon as the last
reference is released.
Sponsored by: Isilon Systems, Inc.
Debugging help from: Kris Kennaway, Peter Holm
Approved by: re (blanket vfs)
Hopefully this fixes ed(4) under qemu. I'm shocked that real hardware
is apparently working with these bugs.
Approved by: re (ifnet blanket)
Pointy hat: brooks
of the clean and dirty lists. This is in an attempt to catch the wrong
bufobj problem sooner.
- In vgonel() don't acquire an extra reference in the active case, the
vnode lock and VI_DOOMED protect us from recursively cleaning.
- Also in vgonel() clean up some stale comments.
Sponsored by: Isilon Systems, Inc.
Approved by: re (blanket vfs)
early. I've moved it all the way to the top rather than part way up as
the submitter did.
Submitted by: Jung-uk Kim <jkim at niksun dot com>
Reported by: submitter, le, dougb
Approved by: re (ifnet blanket)
function pointer to the vga render dispatch table and initialized it with
vga_nop. The problem is that vga_nop() is a varargs function, and the
table declares a non-varargs function pointer. On amd64 (and I think ppc),
mixing varargs and non-varargs function pointers is fatal.
Change vga_nop() and gfb_nop() from varargs to non-varargs do-nothing
functions. This stops the stack corruption that only happened on amd64.
Approved by: re (scottl)
used to ensure that we weren't exiting the syscall with a lock still
held. This wasn't safe, however, because we'd already executed a vput()
and on a loaded system the vnode may have been free'd by the time we
assert. This functionality is also handled by the td_locks assert in
userret, which doesn't tell you what the syscall was, but will at least
panic before you deadlock.
Sponsored by: Isilon Systems, Inc.
Discovred by: Peter Holm
Approved by: re (blanket vfs)
anyway and it's not used outside of vfs_subr.c.
- Change vgonel() to accept a parameter which determines whether or not
we'll put the vnode on the free list when we're done.
- Use the new vgonel() parameter rather than VI_DOOMED to signal our
intentions in vtryrecycle().
- In vgonel() return if VI_DOOMED is already set, this vnode has already
been reclaimed.
Sponsored by: Isilon Systems, Inc.
this is happening at the moment and sometimes causing panics later on the
package cluster when we bremfree() a buf whose delayed bremfree() did not
previously happen.
Sponsored by: Isilon Systems, Inc.
most of the code to deal with them has been dead for sometime. Simplify
the code by doing an insert sort hinted by the current head position.
Met with apathy by: arch@
on an IPv4 packet as these variables are uninitialized if not. This used to
allow arbitrary IPv6 packets depending on the value in the uninitialized
variables.
Some opcodes (most noteably O_REJECT) do not support IPv6 at all right now.
Reviewed by: brooks, glebius
Security: IPFW might pass IPv6 packets depending on stack contents.
Approved by: re (blanket)
I introduce a very small race here (some file system can be mounted or
unmounted between 'count' calculation and file systems list creation),
but it is harmless.
Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/
Reported by: Peter Holm <peter@holm.cc>
fails.
Move detaching the ifnet from the ifindex_table into if_free so we can
both keep the sanity checks and actually delete the ifnets. [0]
Reported by: gallatin [0]
Approved by: re (blanket)
It can be used to panic the kernel by giving too big value.
Fix it by moving allocation and size verification into kern_getfsstat().
This even simplifies kern_getfsstat() consumers, but destroys symmetry -
memory is allocated inside kern_getfsstat(), but has to be freed by the
caller.
Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/
Reported by: Peter Holm <peter@holm.cc>
o getsockopt(SO_ACCEPTFILTER) always returns success on listen socket
even we didn't install accept filter on the socket.
o Fix these bugs and add regression tests for them.
Submitted by: Igor Sysoev [1]
Reviewed by: alfred
MFC after: 2 weeks
to initialize the buffer array in ata_raid_attach() by removing the
initializer. There's no memset(?) in the kernel. Instead, assign
'\0' to the first element. The buffer array holds strings only, so
this is functionally equivalent.
Applies to: ia64
Tripped over by: tinderbox
events could be added to cover other interesting details.
- Add some VNASSERTs to discover places where we access vnodes after
they have been uma_zfree'd before we try to free them again.
- Add a few more VNASSERTs to vdestroy() to be certain that the vnode is
really unused.
Sponsored by: Isilon Systems, Inc.
in that KTR_VFS will be hand placed, while KTR_VOP traces the individual
vnode operations and is generated by vnode_if.awk.
- Add a comment describing KTR_VOP.
I missed. Since I did no rearrange any softcs, casting the result of
device_get_softc() to (struct ifnet **) and derefrencing it yeilds a
pointer to the ifp. This makes at least vr(4) nics work.
many regions checked again and again despite knowing the pages
contained were not usable and only satisfied the alignment constraints
This case was compounded, especially for large allocations, by the
practice of looping from the top of memory so as to keep out of the
important low-memory regions. While the old contigmalloc(9) has the
same problem, it is not as noticeable due to looping from the low
memory to high.
This degenerate case is fixed, as well as reversing the sense of the
rest of the loops within it, to provide a tremendous speed increase.
This makes the best case O(n * VM overhead) much more likely than the
worst case O(4 * VM overhead). For comparison, the worst case for old
contigmalloc would be O(5 * VM overhead) in addition to its strategy
of turning used memory into free being highly pessimal.
Also, fix a bug that in practice most likely couldn't have been triggered,
int the new contigmalloc(9): it walked backwards from the end of memory
without accounting for how many pages it needed. Potentially, nonexistant
pages could have been mapped. This hasn't occurred because the kernel
generally requests as its first contigmalloc(9) a single page.
Reported by: Nicolas Dehaine <nicko@stbernard.com>, wes
MFC After: 1 month
More testing by: Nicolas Dehaine <nicko@stbernard.com>, wes
atomic write request, it can fill the buffer cache with the entirety
of that write in order to handle retries. However, it never drops
the vnode lock, or else it wouldn't be atomic, so it ends up waiting
indefinitely for more buf memory that cannot be gotten as it has it
all, and it waits in an uncancellable state.
To fix this, hibufspace is exported and scaled to a reasonable
fraction. This is used as the limit of how much of an atomic write
request by the NFS client will be handled asynchronously. If the
request is larger than this, it will be turned into a synchronous
request which won't deadlock the system. It's possible this value is
far off from what is required by some, so it shall be tunable as soon
as mount_nfs(8) learns of the new field.
The slowdown between an asynchronous and a synchronous write on NFS
appears to be on the order of 2x-4x.
General nod by: gad
MFC after: 2 weeks
More testing: wes
PR: kern/79208
well worth the bloat.
- Change the formatting of 'show ktr' slightly to accommodate the
additional field. Remove a tab from the verbose output and place the
actual trace data after a : so it is more easy to understand which
part is the event and which is part of the record.
psm(4), ukbd(4), ums(4) and usb(4) on by default. Modulo some nits with
the most annoying one probably being USB keyboards no longer working at
the OFW boot prompt after halting FreeBSD these drivers work fine on
sparc64 including X and there's nothing left that I'd consider a show-
stopper. I.e. graphical consoles on sun4u machines should either work
out of the box or by plugging in a card that is supported by either
creator(4) or machfb(4). The exception obviously are SBus-only machines
without UPA slots like some Ultra 1 (but which also still lack support
in other areas) and certain Exx0 (but which probably are mainly used
with serial consoles anyway). I'll try to add a cgsix(4) for these later
as Sun CG6 cards are probably the most common SBus framebuffer cards in
sun4u machines. I however don't see much sense in adding drivers for the
dozen of SBus framebuffers that were destined for sparc v8 machines.
The rest of the USB drivers aren't enabled as I'm only aware of ukbd(4)
and ums(4) as well as ohci(4) working with the on-board ALI M5237 and
Sun PCIO-2 controllers. Aue(4) definitely doesn't work on sparc64, yet.
Thanks to:
- Jake for the initial work on syscons(4) on sparc64 and creator(4).
- Marcel for uart(4) and especially for its support for the SCCs which
are only used on sparc64 so far. In various regards it wouldn't have
been possible to enable syscons(4) by default on sparc64, yet, without
uart(4).
- All that tested patches.
Ok'ed by: scottl (RE hat), tmm
files after they were repo-copied to sys/dev/atkbdc. The sources of
atkbdc(4) and its children were moved to the new location in preparation
for adding an EBus front-end to atkbdc(4) for use on sparc64; i.e. in
order to not further scatter them over the whole tree which would have
been the result of adding atkbdc_ebus.c in e.g. sys/sparc64/ebus. Another
reason for the repo-copies was that some of the sources were misfiled,
e.g. sys/isa/atkbd_isa.c wasn't ISA-specific at all but for hanging
atkbd(4) off of atkbdc(4) and was renamed to atkbd_atkbdc.c accordingly.
Most of sys/isa/psm.c, i.e. expect for its PSMC PNP part, also isn't
ISA-specific.
- Separate the parts of atkbdc_isa.c which aren't actually ISA-specific
but are shareable between different atkbdc(4) bus front-ends into
atkbdc_subr.c (repo-copied from atkbdc_isa.c). While here use
bus_generic_rl_alloc_resource() and bus_generic_rl_release_resource()
respectively in atkbdc_isa.c instead of rolling own versions.
- Add sparc64 MD bits to atkbdc(4) and atkbd(4) and an EBus front-end for
atkbdc(4). PS/2 controllers and input devices are used on a couple of
Sun OEM boards and occur on either the EBus or the ISA bus. Depending on
the board it's either the only on-board mean to connect a keyboard and
mouse or an alternative to either RS232 or USB devices.
- Wrap the PSMC PNP part of psm.c in #ifdef DEV_ISA so it can be compiled
without isa(4) (e.g. for EBus-only machines). This ISA-specific part
isn't separated into its own source file, yet, as it requires more work
than was feasible for 6.0 in order to do it in a clean way. Actually
philip@ is working on a rewrite of psm(4) so a more comprehensive
clean-up and separation of hardware dependent and independent parts is
expected to happen after 6.0.
Tested on: i386, sparc64 (AX1105, AXe and AXi boards)
Reviewed by: philip
of just dropping the lock around the ip_output call. This used to cause
corrupted state tree walks for some call-paths.
In a second stage all callouts will be marked MPSAFE according to the
setting of mpsafenet.
Reported and tested by: Matthew Grooms <mgrooms at seton dot org>
MFC after: 3 days
X-MFC after: Marking callouts MPSAFE + 1 week
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.
This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.
Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.
Reviewed by: sobomax, sam
it; instead pass the space occupied by the header down into the
crypto modules (except in the demic case which needs it only when
doing int in s/w)
o while here fix defrag to strip the header from 2nd and later frames
o teach decap code how to handle 4-address frames
- Do not edit pullup_len outside M_CHECK macro.
- Do not reimplement NG_FWD_NEW_DATA().
- Remove redundant check for item being not NULL.
Submitted by: ru
do the subsequent ip_output() in IPFW. In ipfw_tick(), the keep-alive
packets must be generated from the data that resides under the
stateful lock, but they must not be sent at that time, as this would
cause a lock order reversal with the normal ordering (interface's
lock, then locks belonging to the pfil hooks).
In practice, this caused deadlocks when using IPFW and if_bridge(4)
together to do stateful transparent filtering.
MFC after: 1 week
- Initialize val_ec with the content of the volume EC register
for ACPI_IBM_METHOD_VOLUME and ACPI_IBM_METHOD_MUTE in acpi_ibm_sysctl_set()
if there is no CMOS handle present. This fixes setting volume and mute on
such models.
Submitted by: ru
Approved by: philip
hack MSS of packets outgoing via interface with small MTU, to workaround
path MTU discovery problems.
Written by Alexey Popov, with some cleanups from me. There are also plans
to improve mpd port, so that it uses this node, instead of doing MSS
hacking in userland, when 'enable tcpmssfix' option is on.
Submitted by: Alexey Popov <lollypop@flexuser.ru>