packets coming out of a GIF tunnel are re-processed by ipfw, et. al.
By default they are not reprocessed. With the option they are.
This reverts 1.214. Prior to that change packets were not re-processed.
After they were which caused problems because packets do not have
distinguishing characteristics (like a special network if) that allows
them to be filtered specially.
This is really a stopgap measure designed for immediate MFC so that
4.8 has consistent handling to what was in 4.7.
PR: 48159
Reviewed by: Guido van Rooij <guido@gvr.org>
MFC after: 1 day
o Make DXS3 the primary playback channel. It may be the only
universally supported channel with the assorted revisions of this
chipset.
o Add sysctl and handler for enabling s/pdif output from DXS3.
as opposed to one after the other. This is faster in both -CURRENT
and -STABLE. Additionally, there is less code duplication for
error-checking.
One thing to note is that this code seems to return(1) when no buffers
are available; perhaps ENOBUFS should be the correct return value?
Partially submitted & tested by: Hiten Pandya <hiten@unixdaemons.com>
MFC after: 1 week
and enable it by default, with a limit of 16.
At the same time, tweak maxfragpackets downward so that in the worst
possible case, IP reassembly can use only 1/2 of all mbuf clusters.
MFC after: 3 days
Reviewed by: hsu
Liked by: bmah
a snapshot. As part of taking a snapshot of a filesystem, the kernel
builds up a list of the filesystem metadata (such as the cylinder
group bitmaps) that are contained in the snapshot. When doing a
copy-on-write check, the list is first consulted. If the block being
written is found on the list, then the full snapshot lookup can be
avoided. Besides providing an important performance speedup this
check also avoids a potential deadlock between the code creating
the snapshot and the bufdaemon trying to cleanup snapshot related
buffers. This fix creates a temporary list containing the key
metadata blocks that can cause the deadlock. This temporary list
is used between the time that the snapshot is first enabled and the
time that the fully complete list is built.
Reported by: Attila Nagy <bra@fsn.hu>
Sponsored by: DARPA & NAI Labs.
is being taken from panicing with either "freeing free block" or
"freeing free inode". The problem arises when the snapshot code
is scanning the filesystem looking for inodes with a reference
count of zero (e.g., unlinked but still open) so that it can
expunge them from its view. If it encounters a reclaimed vnode
and has to restart its scan, then it will panic if it encounters
and tries to free an inode that it has already processed. The fix
is to check each candidate inode to see if it has already been
processed before trying to delete it from the snapshot image.
Sponsored by: DARPA & NAI Labs.
that they convert to 64-bit values before shifting rather than
afterwards. Once fixed, they can be used rather than inline expanded.
Sponsored by: DARPA & NAI Labs.
Retire the "d_dump_t" and use the "dumper_t" type instead.
Dumper_t takes a void * as first arg which is more general than the
dev_t taken by d_dump_t. (Remember: we could have net-dumpers if
somebody wrote us one!)
Define the convention for GEOM controlled disk devices to be that the
first argument to the dumper function is the struct disk pointer.
Change device drivers accordingly.
Change the argument to disk_destroy() to be the same struct disk * as
disk_create() takes.
This enables drivers to ignore the (now) bogus dev_t which disk_create()
returns.
a number of related problems along the way.
- Automatically detect CDROM drives that can't handle 6 byte mode
sense and mode select, and adjust our command size accordingly.
We have to handle this in the cd(4) driver (where the buffers are
allocated), since the parameter list length is different for the
6 and 10 byte mode sense commands.
- Remove MODE_SENSE and MODE_SELECT translation removed in ATAPICAM
and in the umass(4) driver, since there's no way for that to work
properly.
- Add a quirk entry for CDROM drives that just hang when they get a 6
byte mode sense or mode select. The reason for the quirk must be
documented in a PR, and all quirks must be approved by
ken@FreeBSD.org. This is to make sure that we fully understand why
each quirk is needed. Once the CAM_NEW_TRAN_CODE is finished, we
should be able to remove any such quirks, since we'll know what
protocol the drive speaks (SCSI, ATAPI, etc.) and therefore whether
we should use 6 or 10 byte mode sense/select commands.
- Change the way the da(4) handles the no_6_byte sysctl. There is
now a per-drive sysctl to set the minimum command size for that
particular disk. (Since you could have multiple disks with
multiple requirements in one system.)
- Loader tunable support for all the sysctls in the da(4) and cd(4)
drivers.
- Add a CDIOCCLOSE ioctl for cd(4) (bde pointed this out a long
time ago).
- Add a media validation routine (cdcheckmedia()) to the cd(4)
driver, to fix some problems bde pointed out a long time ago. We
now allow open() to succeed no matter what, but if we don't detect
valid media, the user can only issue CDIOCCLOSE or CDIOCEJECT
ioctls.
- The media validation routine also reads the table of contents off
the drive. We use the table of contents to implement the
CDIOCPLAYTRACKS ioctl using the PLAY AUDIO MSF command. The
PLAY AUDIO TRACK INDEX command that we previously used was
deprecated after SCSI-2. It works in every SCSI CDROM I've tried,
but doesn't seem to work on ATAPI CDROM drives. We still use the
play audio track index command if we don't have a valid TOC, but
I suppose it'll fail anyway in that case.
- Add _len() versions of scsi_mode_sense() and scsi_mode_select() so
that we can specify the minimum command length.
- Fix a couple of formatting problems in the sense printing code.
MFC after: 4 weeks
OSes has probably caused more problems than it ever solved. Allow the
user to retire the old behavior by specifying their own privileged
range with,
net.inet.ip.portrange.reservedhigh default = IPPORT_RESERVED - 1
net.inet.ip.portrange.reservedlo default = 0
Now you can run that webserver without ever needing root at all. Or
just imagine, an ftpd that can really drop privileges, rather than
just set the euid, and still do PORT data transfers from 20/tcp.
Two edge cases to note,
# sysctl net.inet.ip.portrange.reservedhigh=0
Opens all ports to everyone, and,
# sysctl net.inet.ip.portrange.reservedhigh=65535
Locks all network activity to root only (which could actually have
been achieved before with ipfw(8), but is somewhat more
complicated).
For those who stick to the old religion that 0-1023 belong to root and
root alone, don't touch the knobs (or even lock them by raising
securelevel(8)), and nothing changes.
Some drives seem to be confused by simultaneous probes.
Tested by: marcel
As a side effect, logical units whose lun is greater than 0 might not be
probed correctly if the lun of 0 doesn't exist in the target because
CAM doesn't scan such luns.
I have a SCSI-FireWire bridge which maps SCSI-ID to LUN and it is an
example of such targets.
dev_t to the method functions.
The dev_t can still be found at struct consdev *->cn_dev.
Add a void *cn_arg element to struct consdev which the drivers can use
for retrieving their softc.
This moves all chipset specific code to a new file 'ata-chipset.c'.
Extensive use of tables and pointers to avoid having the same switch
on chipset type in several places, and to allow substituting various
functions for different HW arch needs.
Added PIO mode setup and all DMA modes.
Support for all known SiS chipsets. Thanks to Christoph Kukulies for
sponsoring a nice ASUS P4S8X SiS648 based board for this work!
Tested on: i386, PC98, alpha and sparc64
This moves all chipset specific code to a new file 'ata-chipset.c'.
Extensive use of tables and pointers to avoid having the same switch
on chipset type in several places, and to allow substituting various
functions for different HW arch needs.
Added PIO mode setup and all DMA modes.
Support for all known SiS chipsets. Thanks to Christoph Kukulies for
sponsoring a nice ASUS P4S8X SiS648 based board for this work!
Tested on: i386, PC98, alpha and sparc64
In devsw() return dead_cdevsw instead of NULL in case the dev_t does not
have a si_devsw.
This may improve our survival chances with devices which go away unexpectedly.
compile-time constants). That is, a "bucket" now is not necessarily
a page-worth of mbufs or clusters, but it is MBUF_BUCK_SZ, CLUS_BUCK_SZ
worth of mbufs, clusters.
o Rename {mbuf,clust}_limit to {mbuf,clust}_hiwm and introduce
{mbuf,clust}_lowm, which currently has no effect but will be used
to set the low watermarks.
o Fix netstat so that it can deal with the differently-sized buckets
and teach it about the low watermarks too.
o Make sure the per-cpu stats for an absent CPU has mb_active set to 0,
explicitly.
o Get rid of the allocate refcounts from mbuf map mess. Instead,
just malloc() the refcounts in one shot from mbuf_init()
o Clean up / update comments in subr_mbuf.c
used to share resource limits between rfork threads, but never was.
Removing it makes resource limit locking much simpler -- only the current
process can change the contents of the structure that p_limit points to.
support for Tekram DC395U2W cards.
Add a fix submitted by joerg@ to correctly report some errors to CAM.
Use bus_dma instead of the remaining vtophys().
reference counter array for mbuf clusters. I don't know
how this got past early testing nor how it survived so long
without getting caught. If anyone was seeing really really
bizarre memory corruption in a few mbufs this would be why.
field for holding driver-dependant data. Instead of putting the pointer
to the driver command struct in there, take advantage of these structs
being a (virtually) contiguous array and just put the array index in the
field.
control block. Allow the socket and tcpcb structures to be freed
earlier than inpcb. Update code to understand an inp w/o a socket.
Reviewed by: hsu, silby, jayanth
Sponsored by: DARPA, NAI Labs
routine does not require a tcpcb to operate. Since we no longer keep
template mbufs around, move pseudo checksum out of this routine, and
merge it with the length update.
Sponsored by: DARPA, NAI Labs
that a command completion happened, all further processing is deferred to
a taskqueue. The taskqueue itself runs implicetely under Giant, but we
already used a taskqueue for the biodone() processing, so this at least
saves the contesting of Giant in the interrupt handler.
retain symetry with aac_alloc_commans(). Since aac_alloc_commands()
allocates fib maps and places them onto the fib lists, aac_free_commands()
should reverse those operations.
o Combine two ifs with the same body with an ||.
o Switch from uintptr_t to uint32_t for fib map load operations.
The target is a uint32_t so using this type for the map load call
avoids an extra cast. uintptr_t should only be used when you need
an "int sized the same as the machine's poitner size" which is not
the case here.
o Removed the commented out M_WAITOK flag in the allocation in
aac_alloc_commands(). The kernel will only block in the allocator
if it can grow the size of the kernel. This usually results in a
page-out which could involve this aac device. Thus, sleeping here
could deadlock the machine. Assuming this operation is occurring outside
of attach time, we have enough fibs to operate anyway, so waiting for
fibs to free up is okay if not optimal.
o In aac_alloc_commands(), if we cannot dmamem_alloc additional fib
space, free the fib map.
o In aac_alloc_commands(), if we cannot create per-command dmamaps, don't
lose track of the fib map that is mapping all of the commands that we
have already released into the free pool. Instead, just cut out of
the loop and modify aac_free_commands to not attempt to free maps that
have not been allocated.
o Don't use a magic number when pre-allocating fibs.
o Use PAGE_SIZE to allocate in page sized chunks instead of an
architecture specific constant.
Submitted by: gibbs
- delay acks for T/TCP regardless of delack setting
- fix bug where a single pass through tcp_input might not delay acks
- use callout_active() instead of callout_pending()
Sponsored by: DARPA, NAI Labs
wish the busdma APIs were more consistent accross architectures.
We should probably move all the other DMA map creations in
xl_attach() where we can really handle them failing, since
xl_init() is void and shouldn't fail.
Pointy hat to: mux
Tested by: Anders Andersson <anders@hack.org>
One of the vnodes is on different mount and is possibly on a different
kind of filesystem; treating it as an smbfs vnode then writing to it
will probably corrupt it.
PR: 48381
MFC after: 1 month
group number properly based on the board id. Perform dummy reads of
registers after writing to flush the hardware write buffers.
This gets the soon to be committed zs attachment working.
Minor CIS resource allocation code cleanup
Remove some fairly useless debug writes.
This finishes the work to move as much cardbus code as possible into
pci. We wind up removing 800-odd lines from cardbus.c: we go from
1285 to 400 lines.
Reviewed by: mdodd
bus_dmamap_load() it.
- Make it so reusing mbufs when we can't allocate (or map) new ones
actually works. We were previously trying to reuse a mbuf which
was already bus_dmamap_unload()'ed.
Reviewed by: silby
- Fix memory leak in detaching.
- Initialize fc->status to other than FWBUSREST.
* fwohci.c
- Ignore BUS reset events while BUS reset phase. We can't clear that flag
during bus reset phase.
UltraSPARCs, and an eeprom attachment for fhc, which allows the date
to be set properly on these machines. Central is a wierd bus which
seems to only ever have 1 fhc attached to it. FHC (FireHose Controller)
is another wierd bus with various things on it depending where its attached.
The fhc attached to central has eeprom and zs, and the fhcs which attach
directly to nexus have simm-status, environment and other nodes, none of
which I'll probably ever have documentation for.
Thanks to Ade Lovett for providing access to an 8 cpu e4500.
#if'ed out for a while. Complete the deed and tidy up some other bits.
We need to be able to call this stuff from outer edges of interrupt
handlers for devices that have the ISR bits in pci config space. Making
the bios code mpsafe was just too hairy. We had also stubbed it out some
time ago due to there simply being too much brokenness in too many systems.
This adds a leaf lock so that it is safe to use pci_read_config() and
pci_write_config() from interrupt handlers. We still will use pcibios
to do interrupt routing if there is no acpi.. [yes, I tested this]
Briefly glanced at by: imp
is encoded in the PCI BAR. The latter is more reliable.
This allows the sio/modem function of the Xircom RealPort ethernet+modem
card to work. Note that there still seem to be issues with sio_pci not
releasing resources on detach.
pci busses implement this.
Also minor comment smithing in cardbus. Fix copyright to this year
with my name on it since I've been doing a lot to this file.
Reviewed by: jhb
on the fly and read into userland one at a time, this costs very
little total memory. The pnpinfo sizes of pccard is more than 64
bytes due to the length of the strings that man cards have in their
CIS.
- Don't initiate bus reset even if probe failed for some nodes to prevent
infinite bus reset loop.
Problem Reported by: Pierre Beyssac <pb@fasterix.frmug.org>
- Protect timeout routine with splfw() for 4-stable.
* sbp.c
- Make sure to release devq when start request.
cr_uid.
Note: we do not have socheckuid() in RELENG_4, ip_fw2.c uses its
own macro for a similar purpose that is why ipfw2 in RELENG_4 processes
uid rules correctly. I will MFC the diff for code consistency.
Reported by: Oleg Baranov <ol@csa.ru>
Reviewed by: luigi
MFC after: 1 month
sched_lock around accesses to p_stats->p_timer[] to avoid a potential
race with hardclock. getitimer(), setitimer() and the realitexpire()
callout are now Giant-free.
add a signal to a mailbox's pending set.
- Add a new function, thread_signal_upcall(), this causes the current thread
to upcall so that we can deliver pending signals.
Reviewed by: mini
I was in two minds as to where to put them in the first case..
I should have listenned to the other mind.
Submitted by: parts by davidxu@
Reviewed by: jeff@ mini@
Wearing said pointy hat, correct the oversight and hope nobody
notices.
# this should make xircom modems happier to detach once other bugs with
# the cardbus layer are fixed.
Noticed by: scottl
Conical Hat to: imp
queue lock already held.
- In getblk() and flushbufqueues() use bremfreel() while we still have the
buf queue lock held to keep the lists consistent.
- Add LK_NOWAIT to two cases where we're essentially asserting that the bufs
are not locked while acquiring the locks. This will make sure that we get
the appropriate panic() and not another one for sleeping with a lock held.
o Use the common pci_* routines in preference to the copied and hacked
routines from an ancient pci.c.
This saves 509 lines in cardbus.c. More savings to follow when I
convert the resource code over. In the past when I've done this the
resource code conversion breaks cardbus in subtle ways so I'm doing a
1/2 way checkpoint this time. cardbus still works for me the same as
it did before.
It also looks like cardbus devices now show up as pci bus devices to
pciconf -l, but maybe that was happening before.
Inspired by a patch from Justin Gibbs many moons ago. When he
finishes his kobj multiple inheritance work, we can transition the
finished version of this work to that fairly easily.
- Mark the process leader as having an advisory lock
- Check if process leader is marked as having advisory lock when
closing file
- Check that file is still open after lock has been obtained
- Don't allow file descriptor table sharing between processes
with different leaders
PR: 10265
Reviewed by: alfred
is already in pages, so we should not convert from bytes to pages.
The result of this bug was bad scaling of the VHPT relative to the
available memory.
Submitted by: Arun Sharma <arun@sharma-home.net>
It's unnecessary for two reasons: (1) Giant is at present already held in
such cases and (2) our various implementations of pmap_growkernel() look to
be MP safe. (For example, for sparc64 the proof of (2) is trivial.)
freebsd4_sigaction() and osigaction() instead of around the whole
body of those functions. They now no longer hold Giant around calls
to copyin() and copyout(), and it is slightly more obvious what
Giant is protecting.
barrier between free'ing filedesc structures. Basically if you want to
access another process's filedesc, you want to hold this mutex over the
entire operation.
values for the initial inode generation numbers in newfs and for
newly allocated inode generation numbers in the kernel.
Submitted by: Theo de Raadt <deraadt@cvs.openbsd.org>
Sponsored by: DARPA & NAI Labs.
opposed to returning the top of the old chain when there was one and
the top of the newly allocated chain if there was no old chain.
Actually, it should be noted that prior to this fix, although the
comment above m_getm() advertised that m_getm() would return the
top of the old chain (if an old chain was being passed in) it
actually [wrongly] was returning the tail mbuf in the old chain
instead. This is a bug but since the one use of m_getm() in
the tree luckily did not depend on the behavior, it happened
to work out without notice.
Harti Brandt pointed out that the advertised behavior was actually
not the real behavior and so this change makes m_getm() ALWAYS
return the newly allocated chain (and fixes the comment). This
is less confusing and is the best course of action as then the
caller is always able to have both a reference to the top of
the original chain (because it's passing it in in the call) and
a reference to the newly attached chain. Although the API is
slightly modified, I don't think that any third-party code uses
m_getm() and if it does, it surely can't be working properly
because the old behavior was bogus.
API bug pointed out by: Harti Brandt <brandt@fokus.fraunhofer.de>
To fix scsi, don't wait for ithreads if we're dumping, it makes the
debugger sad.
To fix ata, use what appears to be a polling method if we're dumping,
I stole this from tmm but added code to ensure that this change is
only in effect while dumping.
Tested by: des
for the agp module, and add agp to the list of modules to compile for alpha.
Add an alpha_mb() to agp_flush_cache for alpha -- it's not correct but may
improve the situation, and it's what linux and NetBSD do.
we can have additional different types of bridges.
o remove now bogus comment.
o Don't clear CARD_OK when we can't attach a card.
o minor style nits
# this make kldload of cardbus drivers work for me when the card is
# present on boot.
o chip_name arrays ifdef'd out.
o use the OLDCARD-like get/put functions so we can support differnt types
of mappings.
o Write the beggings of is this a valid exca device and introduce more
chipset support.
# this is partially a wip, but also needed because some other cahnges I've
# made require some of these changes.
previous revision fixed the panic, I found the problem exits in
another part of the function by investigating the crom dump sent by him.
The search was started in the middle of bus info block and the
routine misunderstood the EUI64 as a crom entry. This problem is fixed.
PR: kern/48129
Fix incorrect type mask included in a logical unit number and check
the validity of the lun.
Introdice RTLD_SELF special handle and properly process it within
dlsym() and dlinfo() functions.
The intention is to improve our compatibility with Solaris and
to make a Java port easier.
Partially submitted by: phantom
- Drain fwohci TX queue first then drain xfer queue which has not started.
- Check validity of the received packet length.
- Don't allocate too large buffer for xfer receive buf.
sbp
- Fix panic for some CROM which doesn't have a text leaf.
This could fix the PR kern/48129 but no feedback has been gotten from
the originator yet.
- Put back some M_NOWAIT flags into malloc which could be called
in interrupt context for 4-stable.
warning which breaks builds.
cc1: warnings being treated as errors
src/sys/net/bridge.c: In function `bdg_forward':
sys/net/bridge.c:931: warning: suggest parentheses around assignment used as truth value
*** Error code 1
lower extremities.
Setting bit 4 in debugflags (sysctl kern.geom.debugflags=16) will
allow any open to succeed on rank#1 providers. This will generally
correspond to the physical disk devices: ad0, da0, md0 etc.
This fundamentally violates the mechanics of GEOMs autoconfiguration,
and is only provided as a debugging facility, so obviously error
reports on GEOM where this bit is or has been set will not be
accepted.
Kill the slightly bogus #define for DECODE_PROTOTYPE
Be less verbose. Hide most (all I hope) of the CIS
parsing behind cardbus_debug_cis (which is set with
hw.cardbus.debug_cis=1).
This doesn't fix problems with parsing, but should make cardbus
less chatty. There appears to be some issues still with the
parsing of the CIS, but this won't fix them.
Prompted by: scottl
Second part of the kldload patches for cardbus. This makes
kldload of a driver for a device that's inserted now appears
to work. To make it work, we only do a power cycle of the card
if there's no children drivers attached.
This likely is papering over bogosities in the power system. The
power sequence needs to be re-written, so I'll not worry about
the papering over until the re-write.
disk I/O processing.
The intent is that the disk driver in its hardware interrupt
routine will simply schedule the bio on the task queue with
a routine to finish off whatever needs done.
The g_up thread will then schedule this routine, the likely
outcome of which is a biodone() which queues the bio on
g_up's regular queue where it will be picked up and processed.
Compared to the using the regular taskqueue, this saves one
contextswitch.
Change our scheduling of the g_up and g_down queues to be water-tight,
at the cost of breaking the userland regression test-shims.
Input and ideas from: scottl
Cut up requests into smaller bits if they are longer than the drivers
disk->d_maxsize or dev->si_iosize_max.
Properly handle the race condition when using g_clone_bio() is used
without having the single-threadedness of g_down/g_up secure locking.
and d_stripesisze;
Introduce si_stripesize and si_stripeoffset in struct cdev so we
can make the visible to clustering code.
Add stripesize and stripeoffset to providers.
DTRT with stripesize and stripeoffset in various places in GEOM.
The locking here needs to be revisited, but this ought to get rid of the
LOR messages that people are complaining about for now. I imagine either
I or someone else interested with smp will eventually clear this up.
unconditionally. kldloading a cardbus driver was shooting down other
attached devices because most drivers assume that one cannot
power-cycle cards w/o the driver knowning about it.
Submitted by: simokawa-san
- Use the ratio of kg_runtime / kg_slptime to determine our dynamic priority.
- Scale kg_runtime and kg_slptime back when the sum of the two exceeds
SCHED_SLP_RUN_MAX. This allows us to slowly forget old behavior.
- Scale back the runtime and slptime in fork so that the new process has the
same ratio but much less accumulated time. This causes new behavior to be
noticed more quickly.
blocks now, which should eliminate problems with the driver failing to
attach due to insufficient contiguous RAM. Allow the FIB pool to grow
from the default of 128 to the max of 512 as demand grows. Also pad the
adapter init struct to work around the 2120/2200 DMA bug now that there
is no longer a FIB slab.
idle time.
Statistics now default to "on" and can be turned off with
sysctl kern.geom.collectstats=0
Performance impact of statistics collection is on the order of
800 nsec per consumer/provider set on a 700MHz Athlon.
that is protected by the vnode lock.
- Move B_SCANNED into b_vflags and call it BV_SCANNED.
- Create a vop_stdfsync() modeled after spec's sync.
- Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some
fs specific processing. This gives all of these filesystems proper
behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
- Annotate the locking in buf.h