to map. If the checksum fails, the table is unmapped and a NULL pointer
returned.
- For ACPI version >= 2.0, check the extended checksum of the RSDP.
AcpiOsGetRootPointer() already checks the version 1.0 checksum.
- Remap the full MADT table at the end of madt_probe() so that we verify
its checksum before saying it is really there.
Requested by: njl
the swizzle method for routing PCI interrupts across the bridge. This
fixes problems with motherboards (typically laptops) whose BIOS doesn't
provide a PRT for the AGP bridge even though there is a device entry for
the bridge in the ACPI namespace.
Tested by: Kenneth Culver culverk at sweetdreamsracing dot biz
it back to userspace, so it does not break bind(2) on raw sockets in jails.
Currently some processes, like traceroute(8) construct a routing request
to determine its source address based on the destination. This sockaddr
data is fed directly to bind(2). When bind calls ifa_ifwithaddr(9) to
make sure the address exists on the interface, the comparison will
fail causing bind(2) to return EADDRNOTAVAIL if the data wasnt zero'ed
before initialization.
Approved by: bmilekic (mentor)
"options OFW_NEWPCI").
This is a bit overdue, the new sparc64 OFW PCI code which is
meant to replace the old one is in place for 10 months and
enabled by default in GENERIC for 8 months. FreeBSD 5.2 and
5.2.1 also shipped with the new code enabled by default.
- Some minor clean-up, e.g. remove functions that encapsulated
the #ifdefs for OFW_NEWPCI, remove unused resp. no longer
required includes, etc.
Approved by: tmm, no objections on freebsd-sparc64
It's not quite correct from a posix Point Of view, but it is a lot better
than what was there before. This will be revisited later
when we decide what form our priority extensions will take. Posix doesn't
specify how a system scope thread can change its priority so you need to
add non-standard extensions to be able to do it..
For now make this slightly non standard to allow it to be done.
Submitted by: Dan Eischen originally, changed by myself.
there's not dependencies on pccard symboles, such a dependency is not
necessary. This means that drivers that have multiple attachments can
not drag bogus devices into the kernel at load time.
We can't (yet) do this with pci and isa. Drivers written for them
actually do seem to have symbols that depend on these busses'
implementation code.
ndis not touched until other things can be tested.
the kernel. We can guarantee this by resetting the FP status register.
This masks all FP traps. The reason we did get FP traps was that we
didn't reset the FP status register in all cases.
Make sure to reset the FP status register in syscall(). This is one of
the places where it was forgotten.
While on the subject, reset the FP status register only when we trapped
from user space.
Previously, mlockall(2) usage would leak MAP_FUTUREWIRE of the process's
vmspace::vm_map and subsequent processes would wire all of their memory.
Coupled with a wired-page leak in vm_fault_unwire(), this would run the
system out of free pages and cause programs to randomly SIGBUS when
faulting in new pages.
(Note that this is not the fix for the latter part; pages are still
leaked when a wired area is unmapped in some cases.)
Reviewed by: alc
PR kern/62930
of IP options.
net.inet.ip.process_options=0 Ignore IP options and pass packets unmodified.
net.inet.ip.process_options=1 Process all IP options (default).
net.inet.ip.process_options=2 Reject all packets with IP options with ICMP
filter prohibited message.
This sysctl affects packets destined for the local host as well as those
only transiting through the host (routing).
IP options do not have any legitimate purpose anymore and are only used
to circumvent firewalls or to exploit certain behaviours or bugs in TCP/IP
stacks.
Reviewed by: sam (mentor)
devices it cannot attach to. This gets rid of extraneous but harmless
device_probe_and_attach() errors. While I'm here, make the device
description more useful. The !acpi case for cpu is handled by legacy0.
and Rx frames up to 8191 octets, so it is perfectly capable of supporting
vlan(4)-style VLAN natively.
Thus, make it support VLAN `oversize' frames.
Reviewed by: tmm
that the OHCI driver uses. Broken OHCI devices (like the controller
in my laptop, apparently) like to set this bit at times. Research
through google shows that this problem has shown up on other systems
as well.
As the scheduling overrun handler doesn't actually do anything, and
the only effect is console spamming, disabling the interrupt seems
to be the right thing to do. (And it is also what linux 2.6 does.)
allocation and deallocation. This flag's principal use is shortly after
allocation. For such cases, clearing the flag is pointless. The only
unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never
requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed
page.
Reviewed by: tegge@
individual asm versions. The global lock is shared between the BIOS and
OS and thus cannot use our mutexes. It is defined in section 5.2.9.1 of
the ACPI specification.
Reviewed by: marcel, bde, jhb
o The ieee80211_media_status() function updates the ifi_link_state field
and calls rt_ifmsg() to notify listeners on the routing socket.
Approved by: sam
2. Note that ct device uses ctau name as driver name (due to name conflict
with ct driver) and also mark it as a driver inside the CVS tree.
MFC after: 10 days
segment, remove the groping around in the Option ROM segments, remove the
bogus tests for bcopy vs. copyout. There really is no reason for a
management app to know these things other than to create l33t info tables
for the user.
a significant functional change, but it further cleans up the code and
brings it closer to being portable. Thanks to Don Bowman for helping to
test this.
and type for printing info about the device that didn't probe from child, not
parent.
This fixes a panic on systems where not yet supported devices hang off of the
nexus, e.g. on E450.
Reported by: joerg
host-PCI bridge device and find a valid $PIR.
- Make pci_pir_parse() private to pci_pir.c and have pir0's attach routine
call it instead of having legacy_pcib_attach() call it.
- Implement suspend/resume support for the $PIR by giving pir0 a resume
method that calls the BIOS to reroute each link that was already routed
before the machine was suspended.
- Dump the state of the routed flag in the links display code.
- If a link's IRQ is set by a tunable, then force that link to be re-routed
the first time it is used.
- Move the 'Found $PIR' message under bootverbose as the pir0 description
line lists the number of entries already. The pir0 line also only shows
up if we are actually using the $PIR which is a bonus.
- Use BUS_CONFIG_INTR() to ensure that any IRQs used by a PCI link are
set to level/low trigger/polarity.
active low polarity when using the PIC interrupt model. This should fix
broken SCI interrupts on machines when not using the APIC where the BIOS
doesn't program the ELCR to level trigger for the ACPI SCI.
Requested by: njl
polarity for a specified IRQ. The intr_config_intr() function wraps
this pic method hiding the IRQ to interrupt source lookup.
- Add a config_intr() method to the atpic(4) driver that reconfigures
the interrupt using the ELCR if possible and returns an error otherwise.
- Add a config_intr() method to the apic(4) driver that just logs any
requests that would change the existing programming under bootverbose.
Currently, the only changes the apic(4) driver receives are due to bugs
in the acpi(4) driver and its handling of link devices, hence the reason
for such requests currently being ignored.
- Have the nexus(4) driver on i386 implement the bus_config_intr() function
by calling intr_config_intr().
and intr_polarity enums for passing around interrupt trigger modes and
polarity rather than using the magic numbers 0 for level/low and 1 for
edge/high.
- Convert the mptable parsing code to use the new ELCR wrapper code rather
than reading the ELCR directly. Also, use the ELCR settings to control
both the trigger and polarity of EISA IRQs instead of just the trigger
mode.
- Rework the MADT's handling of the ACPI SCI again:
- If no override entry for the SCI exists at all, use level/low trigger
instead of the default edge/high used for ISA IRQs.
- For the ACPI SCI, use level/low values for conforming trigger and
polarity rather than the edge/high values we use for all other ISA
IRQs.
- Rework the tunables available to override the MADT. The
hw.acpi.force_sci_lo tunable is no longer supported. Instead, there
are now two tunables that can independently override the trigger mode
and/or polarity of the SCI. The hw.acpi.sci.trigger tunable can be
set to either "edge" or "level", and the hw.acpi.sci.polarity tunable
can be set to either "high" or "low". To simulate hw.acpi.force_sci_lo,
set hw.acpi.sci.trigger to "level" and hw.acpi.sci.polarity to "low".
If you are having problems with ACPI either causing an interrupt storm
or not working at all (e.g., the power button doesn't turn invoke a
shutdown -p now), you can try tweaking these two tunables to find the
combination that works.
IRQ is edge triggered or level triggered. For ISA interrupts, we assume
that edge triggered interrupts are always active high and that level
triggered interrupts are always active low.
- Don't disable an edge triggered interrupt in the PIC. This avoids
outb instructions to the actual PIC for traditional ISA IRQs such as
IRQ 1, 6, 14, and 15. (Fast interrupts such as IRQs 0 and 8 don't mask
their source, so this doesn't change anything for them.)
- For MCA systems we assume that all interrupts are level triggered and
thus need masking. Otherwise, we probe the ELCR. If it exists we trust
what it tells us regarding which interrupts are level triggered. If it
does not exist, we assume that IRQs 0, 1, 2, and 8 are edge triggered
and that all other IRQs are level triggered and need masking.
- Instruct the ELCR mini-driver to restore its saved state during resume.
register controlled the trigger mode and polarity of EISA interrupts.
However, it appears that most (all?) PCI systems use the ELCR to manage
the trigger mode and polarity of ISA interrupts as well since ISA IRQs used
to route PCI interrupts need to be level triggered with active low
polarity. We check to see if the ELCR exists by sanity checking the value
we get back ensuring that IRQS 0 (8254), 1 (atkbd), 2 (the link from the
slave PIC), and 8 (RTC) are all clear indicating edge trigger and active
high polarity.
This mini-driver will be used by the atpic driver to manage the trigger and
polarity of ISA IRQs. Also, the mptable parsing code will use this mini
driver rather than examining the ELCR directly.
not as a pending interrupt status, but as a matter of status quo.
Consequently, when there's no data to be transmitted the condition
is not cleared and uart_intr() is stuck in an infinite loop trying
to clear the UART_IPEND_TXIDLE status.
The z8530_bus_ipend() function is changed to return idle only once
after having sent any data.
The root cause for this problem is that we cannot use the interrupt
status bits of the SCC itself. The register that holds the interrupt
status can only be accessed by channel A and holds the status for
both channels. Using the interrupt status register would complicate
the driver because we need to synchronize access to the SCC between
the channels.
Elementary testing: marius
labeling new mbufs created from sockets/inpcbs in IPv4. This helps avoid
the need for socket layer locking in the lower level network paths
where inpcb locks are already frequently held where needed. In
particular:
- Use the inpcb for label instead of socket in raw_append().
- Use the inpcb for label instead of socket in tcp_output().
- Use the inpcb for label instead of socket in tcp_respond().
- Use the inpcb for label instead of socket in tcp_twrespond().
- Use the inpcb for label instead of socket in syncache_respond().
While here, modify tcp_respond() to avoid assigning NULL to a stack
variable and centralize assertions about the inpcb when inp is
assigned.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
lookup for the label tag fails, return NULL rather than something close
to NULL. This scenario occurs if mbuf header labeling is optional and
a policy requiring labeling is loaded, resulting in some mbufs having
labels and others not. Previously, 0x14 would be returned because the
NULL from m_tag_find() was not treated specially.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
returns okay when HW probe fails. This happens when comconsole flag is
set but VGA console is used instead.
Back out requested by: bde (He will be looking at other solutions from scratch)
test the label pointer for NULL before testing the label slot for
permitted values. When loading mac_test dynamically with conditional
mbuf labels, the label pointer may be NULL if the mbuf was
instantiated while labels were not required on mbufs by any policy.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
synchronization protecting against dynamic load and unload of MAC
policies, and instead simply blocks load and unload. In a static
configuration, this allows you to avoid the synchronization costs
associated with introducing dynamicism.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
interrupt source.
- Only do an outb() to the PIC to clear a bit in imen if the bit is set.
- Add a NUM_ISA_IRQS macro to replace uglier
'sizeof(array) / sizeof(member)' expressions along with a CTASSERT() to
ensure that the macro is correct.
than using legacy_pcib_attach(). The MP Table drivers don't use the $PIR,
and the legacy_pcib_attach() function probes and parses the $PIR in
addition to adding the pci bus child device.
o New function ip_findroute() to reduce code duplication for the
route lookup cases. (luigi)
o Store ip_len in host byte order on the stack instead of using
it via indirection from the mbuf. This allows to defer the host
byte conversion to a later point and makes a quicker fallback to
normal ip_input() processing. (luigi)
o Check if route is dampned with RTF_REJECT flag and drop packet
already here when ARP is unable to resolve destination address.
An ICMP unreachable is sent to inform the sender.
o Check if interface output queue is full and drop packet already
here. No ICMP notification is sent because signalling source quench
is depreciated.
o Check if media_state is down (used for ethernet type interfaces)
and drop the packet already here. An ICMP unreachable is sent to
inform the sender.
o Do not account sent packets to the interface address counters. They
are only for packets with that 'ia' as source address.
o Update and clarify some comments.
Submitted by: luigi (most of it)
o Extend the if_data structure with an ifi_link_state field and
provide the corresponding defines for the valid states.
o The mii_linkchg() callback updates the ifi_link_state field
and calls rt_ifmsg() to notify listeners on the routing socket
in addition to the kqueue KNOTE.
o If vlans are configured on a physical interface notify and update
all vlan pseudo devices as well with the vlan_link_state() callback.
No objections by: sam, wpaul, ru, bms
Brucification by: bde
the falling edge of a media state change.
This is in preparation for media state change notification to the
routing socket.
No objections by: sam, wpaul, ru, bms
Brucification by: bde
o Fix and improve comments and references,
o Add PFIL_HOOKS, UFS_ACL and UFS_DIRHASH,
o Switch from SCHED_4BSD to SCHED_ULE,
o Remove SCSI_DELAY (there's no SCSI support),
- change pf_get_pool() argument rule_number type from u_int32_t
to u_int8_t, fixes corruption of address pools with large
rulesets (mcbride@)
- prevent endless loops with route-to (dhartmei@)
- limit option length to 2 octets max (frantzen@)
Obtained from: OpenBSD
Approved by: mlaier(mentor), bms(mentor)
uncommitted):
Rename ip_claim_next_hop() to m_claim_next_hop(), give it an extra arg
(the type of tag to claim) and push it out of ip_var.h into mbuf.h
alongside all of the other macros that work ok mbuf's and tag's.
parameter).
Keep using it only in the i386 NOTES for now. It is fairly MI, but it
doesn't use bus-space and has a couple of i386 i/o instructions in pci
intitialization.
intent was to make sure that message structs allocated off of the stack were
4-byte aligned. However, the macros as defined did absolutely nothing.
And since I2O forces you to manually copy messages down to the hardware, there
really is no point of enforced alignment anyways.
rename the control device from rasr%d to asr%d. This starts us down the
path of divorcing ourselves from a very bogus design in the management
apps. Since the apps are open source now, they will likely be updated
and fixed before 5.3.
Adjust staticness and a variable name for the split of cy.c into cy.c and
cy_isa.c. Use the new header required for the split to avoid repeating
declarations in cy_pci.c.
Put @includes in a better spot. Fix many cases of 2 space indents and spaces
between a function name and the parens. Use KASSERT instead of a home-rolled
ASSERT. Remove some undeeded caddr casts.
- Define option FORCECONSPEED to force the serial console to
be CONSPEED. I've run into a lot of boards in which
the detect for prior speed doesn't work and ends up with
broken console since it is at the wrong speed.
- If a serial port is marked as a console, but console=vidconsole
and if the serial ports doesn't exist it will be probed and
attached at a 8250 chip. Then writes to that will freeze the
system.
- Add an option flags 0x400000 to mark this as a potential
comconsole in-case the one flaged with 0x10 does not exist
in the system.
This makes it easier to deploy on systems with one or two serial ports.
Obtained from: IronPort
- Use the dh_inserted member of the dispatch header in the Windows
timer structure to indicate that the timer has been "inserted into
the timer queue" (i.e. armed via timeout()). Use this as the value
to return to the caller in KeCancelTimer(). Previously, I was using
callout_pending(), but you can't use that with timeout()/untimeout()
without creating a potential race condition.
- Make ntoskrnl_init_timer() just a wrapper around ntoskrnl_init_timer_ex()
(reduces some code duplication).
- Drop Giant when entering if_ndis.c:ndis_tick() and
subr_ntorkrnl.c:ntoskrnl_timercall(). At the moment, I'm forced to
use system callwheel via timeout()/untimeout() to handle timers rather
than the callout API (struct callout is too big to fit inside the
Windows struct KTIMER, so I'm kind of hosed). Unfortunately, all
the callouts in the callwhere are not marked as MPSAFE, so when
one of them fires, it implicitly acquires Giant before invoking the
callback routine (and releases it when it returns). I don't need to
hold Giant, but there's no way to stop the callout code from acquiring
it as long as I'm using timeout()/untimeout(), so for now we cheat
by just dropping Giant right away (and re-acquiring it right before
the routine returns so keep the callout code happy). At some point,
I will need to solve this better, but for now this should be a suitable
workaround.
- Remove second license, the first was not that different and should be
fine.
- Add nexus_attach(), and do not perform its task in nexus_probe() any
more.
- Remove nexus_write_ivar(), since it was quite pointless.
- Remove superfluous devinfo members.
- Clean up some comments, minor style issues and prototypes.
as dependent on binutils features/quirks as the current one. This one
loads plain .o files without having to mess with shared object mode.
This happens to be essential on amd64, because binutils hasn't implemented
all the quirks/features that we need for producing the hack non-PIC shared
objects. As it turned out, .o format isn't all that inconvenient after
all. It looks like the ability to use the same .o files for linking
directly into a static kernel or loading as a module might be worth it.
It is still very much a work-in-progress, but it is almost usable. Other
changes are still needed in order to use it though, these have not been
committed yet. There is still a memory corruption/overrun bug somewhere.
For example, test modules load and work, but the machine explodes a few
minutes later in vm_forkproc() or the like. Notable missing things
include kldxref support, and loader(8) support. I wanted to figure out
a working baseline set of code first.
things which compare /etc/fstab entries to results from
getfsstat(). The real way to fix this is to make 'ufs2'
a recognized filesystem (for real, no beating around the
bush).
This should fix things like 'umount -a -t ufs' now.
Appologies for the previous breakage.
boot0sio.s was repo-copied to boot0.S.
- Rename boot0ext.s to boot0ext.S, to stay consistent
with other preprocessed asm files around here, and
for better portability.
Repocopied by: joe
condition where kse_wakeup() doesn't yet see them in (interruptible)
sleep queues. Also add an upcall check to sleepqueue_catch_signals()
suggested by jhb.
This commit should fix recent mysql hangs.
Reviewed by: jhb, davidxu
Mysql'd by: Robin P. Blanchard <robin.blanchard at gactr uga edu>
switch to using C99-style comments everywhere in preprocessed
assembler. The reason is that lines starting with the regexp
'^[[:space:]]#' are treated as preprocessing directives, and
while it seems to work now with GCC, it's not necessarily has
to work. Use C99 comments `//' for the trailing comments to
save whitespace.
bridges, the EBus bridge has resource ranges it claims exclusively to
map its children into in its BARs. Hence, we need to allocate these
completely and manage them for the children, instead of just passing
allocations through to the PCI layer as we did before.
While being there, split ebus_probe(), which did also contain code
normally belonging into the attach method, into ebus_probe() and
ebus_attach(), and perform some minor cleanups.
correct interrupt source.
- Cache a pointer to the i8254_intsrc's pending method to avoid several
pointer indirections in i8254_get_timecount().
Reported by: bde
Merge boot0.s and boot0sio.s into boot0_512.s controlled by "#ifdef SIO".
Add Makefile magic to generate boot0.s and boot0sio.s from boot0_512.s.
The compile boot0 and boot0sio have unchanged MD5 checksums.
channels. This also work when PCI native mode has been selected
(patch for /sys/dev/pci/pci.c needed for that) since pci_get_progif
uses the saved value for progif, not the one stored after we may have
changed from legacy mode to native PCI mode.
jail, which is less restrictive but allows for more flexible
jail usage (for those who are willing to make the sacrifice).
The default is off, but allowing raw sockets within jails can
now be accomplished by tuning security.jail.allow_raw_sockets
to 1.
Turning this on will allow you to use things like ping(8)
or traceroute(8) from within a jail.
The patch being committed is not identical to the patch
in the PR. The committed version is more friendly to
APIs which pjd is working on, so it should integrate
into his work quite nicely. This change has also been
presented and addressed on the freebsd-hackers mailing
list.
Submitted by: Christian S.J. Peron <maneo@bsdpro.com>
PR: kern/65800
libufs, which only works for Charlie root.
This change reverts the introduction of libufs and moves the
check into the kernel. Since the f_fstypename is the same
for both ufs and ufs2, we check fs_magic for presence of
ufs2 and copy "ufs2" explicitly instead.
Submitted by: Christian S.J. Peron <maneo@bsdpro.com>
possible while maintaining compatibility with the widest range of TCP stacks.
The algorithm is as follows:
---
For connections in the ESTABLISHED state, only resets with
sequence numbers exactly matching last_ack_sent will cause a reset,
all other segments will be silently dropped.
For connections in all other states, a reset anywhere in the window
will cause the connection to be reset. All other segments will be
silently dropped.
---
The necessity of accepting all in-window resets was discovered
by jayanth and jlemon, both of whom have seen TCP stacks that
will respond to FIN-ACK packets with resets not meeting the
strict last_ack_sent check.
Idea by: Darren Reed
Reviewed by: truckman, jlemon, others(?)
1) In pci.c, we need to check the child device's state, not the parent
device's state.
2) In acpi_pci.c, we have to run the power state change after the acpi
method when the old_state is > new state, not the other way around.
Submitted by: Dmitry Remesov
PR: 65694
1. rt_check() cleanup:
rt_check() is only necessary for some address families to gain access
to the corresponding arp entry, so call it only in/near the *resolve()
routines where it is actually used -- at the moment this is
arpresolve(), nd6_storelladdr() (the call is embedded here),
and atmresolve() (the call is just before atmresolve to reduce
the number of changes).
This change will make it a lot easier to decouple the arp table
from the routing table.
There is an extra call to rt_check() in if_iso88025subr.c to
determine the routing info length. I have left it alone for
the time being.
The interface of arpresolve() and nd6_storelladdr() now changes slightly:
+ the 'rtentry' parameter (really a hint from the upper level layer)
is now passed unchanged from *_output(), so it becomes the route
to the final destination and not to the gateway.
+ the routines will return 0 if resolution is possible, non-zero
otherwise.
+ arpresolve() returns EWOULDBLOCK in case the mbuf is being held
waiting for an arp reply -- in this case the error code is masked
in the caller so the upper layer protocol will not see a failure.
2. arpcom untangling
Where possible, use 'struct ifnet' instead of 'struct arpcom' variables,
and use the IFP2AC macro to access arpcom fields.
This mostly affects the netatalk code.
=== Detailed changes: ===
net/if_arcsubr.c
rt_check() cleanup, remove a useless variable
net/if_atmsubr.c
rt_check() cleanup
net/if_ethersubr.c
rt_check() cleanup, arpcom untangling
net/if_fddisubr.c
rt_check() cleanup, arpcom untangling
net/if_iso88025subr.c
rt_check() cleanup
netatalk/aarp.c
arpcom untangling, remove a block of duplicated code
netatalk/at_extern.h
arpcom untangling
netinet/if_ether.c
rt_check() cleanup (change arpresolve)
netinet6/nd6.c
rt_check() cleanup (change nd6_storelladdr)
- Fix some comments; remove numerous superfluous or outdated ones.
- Correctly pass on the requesting device when handing requests up
to the parent bus.
- Use the complete device name, including unit number, to build the
IOMMU instance name.
- Inline a function that was only used once, and was trivial.
consistently with the rest of the code, use IFP2AC(ifp) to access
the arpcom structure given the ifp.
In this case also fix a difference in assumptions WRT the rest of
the net/ sources: it is not the 'struct *softc' that starts with a
'struct arpcom', but a 'struct arpcom' that starts with a
'struct ifnet'
- use ifp instead if &ac->ac_if in a couple of nd6* calls;
this removes a useless dependency.
- use IFP2AC(ifp) instead of an extra variable to point to the struct arpcom;
this does not remove the nesting dependency between arpcom and ifnet but
makes it more evident.
caller to vm_page_grab(). Although this gives VM_ALLOC_ZERO a
different meaning for vm_page_grab() than for vm_page_alloc(), I feel
such change is necessary to accomplish other goals. Specifically, I
want to make the PG_ZERO flag immutable between the time it is
allocated by vm_page_alloc() and freed by vm_page_free() or
vm_page_free_zero() to avoid locking overheads. Once we gave up on
the ability to automatically recognize a zeroed page upon entry to
vm_page_free(), the ability to mutate the PG_ZERO flag became useless.
Instead, I would like to say that "Once a page becomes valid, its
PG_ZERO flag must be ignored."
added an arbitrary delay to our readings, causing us to use the ACPI-safe
read method when not necessary. Submitted by: bde
Old:
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks BAD min = 3, max = 19, width = 16
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks BAD min = 3, max = 19, width = 16
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks GOOD min = 3, max = 4, width = 1
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
New:
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
Also, reduce unnecesary overhead in ACPI-fast by remove the barrier for
reads. The timer in the ACPI-fast case is known to increase monotonically
so there is no need to serialize access to it.
misplaced forward declarations of structs). This also reduces namespace
pollution (the misplaced declarations were declared in the !_KERNEL case
when they are not used).
would actually map the file with read access enabled. According to
http://www.opengroup.org/onlinepubs/007904975/functions/mmap.html this is
an error. Similarly, an madvise(..., MADV_WILLNEED) would enable read
access on a virtual address range that was PROT_NONE.
The solution implemented herein is (1) to pass a vm_prot_t to
vm_map_pmap_enter() describing the allowed access and (2) to make
vm_map_pmap_enter() responsible for understanding the limitations of
pmap_enter_quick().
Submitted by: "Mark W. Krentel" <krentel@dreamscape.com>
PR: kern/64573
from tcp_hostcache would have overridden a (now) lower MTU of
an interface or route that changed since first PMTU discovery.
The bug would have caused TCP to redo the PMTU discovery when
not strictly necessary.
Make a comment about already pre-initialized default values
more clear.
Reviewed by: sam
state. Apparently it happens when both devices try to disconnect RFCOMM
multiplexor channel at the same time.
The scenario is as follows:
- local device initiates RFCOMM connection to the remote device. This
creates both RFCOMM multiplexor channel and data channel;
- remote device terminates RFCOMM data channel (inactivity timeout);
- local device acknowledges RFCOMM data channel termination. Because
there is no more active data channels and local device has initiated
connection it terminates RFCOMM multiplexor channel;
- remote device does not acknowledges RFCOMM multiplexor channel
termination. Instead it sends its own request to terminate RFCOMM
multiplexor channel. Even though local device acknowledges RFCOMM
multiplexor channel termination the remote device still keeps
L2CAP connection open.
Because of hanging RFCOMM multiplexor channel subsequent RFCOMM
connections between local and remote devices will fail.
Reported by: Johann Hugo <jhugo@icomtek.csir.co.za>
modules is a very nice way to produce hard-to-find panics. Who would look for
a bug in a Makefile anyway?
Has anyone seen the pointy hat? :-o
Approved by: njl (mentor)
ip_id again. ip_id is already set to the ip_id of the encapsulated packet.
Make a comment about mbuf allocation failures more realistic.
Reviewed by: sobomax
resource pre-allocation. The problem is that the BARs of the EBus bridges
contain the ranges for the resources for the EBus devices beyond the bridge.
So when the EBus code tries to allocate the resource for an EBus device
it's already allocated by the PCI code.
To be removed again as soon as we have a proper solution in the EBus Code.
Reviewed by: tmm
Approved by: marcel (mentor)
source address of a packet exists in the routing table. The
default route is ignored because it would match everything and
render the check pointless.
This option is very useful for routers with a complete view of
the Internet (BGP) in the routing table to reject packets with
spoofed or unrouteable source addresses.
Example:
ipfw add 1000 deny ip from any to any not versrcreach
also known in Cisco-speak as:
ip verify unicast source reachable-via any
Reviewed by: luigi
secondary bus is 0, we program the primary bus, the secondary bus and
the suborindate bus. This isn't ideal, since we start at parent_bus +
1 and store this in a static.
Ideally, we'd walk the tree and assign bus numbers. However, that's
harder to accomplish without some help from the bus layer which we're
not planning on doing that until 6.
This fixes my CardBus problems on my Sony PCG-Z1WA, and might fix the
Dells that have had problems.
sched_ule, in January 2004. Looking at this, "pagezero" is (one of) the
culprit(s). We had no provision for processes with P_NOLOAD set. With
pagezero not running at PRI_ITHD, kseq_load_{add,rem} count pagezero as
another-normal-process, thus the "expected-plus-one" load reported in
the above thread.
Submitted by: Nikos Ntarmos <ntarmos@ceid.upatras.gr>
gadgets (hotkeys, lcd, ...) on Asus laptops. I aim to closely track the
acpi4asus project which implements these features in the Linux kernel.
If this breaks your laptop, please let me know how it does it :-)
Approved by: njl (mentor)
there are a lot of other dependencies that preclude the kernel from
working). Instead, have a more generic note that isa should not be
removed. This should be less confusing for users.
(I hope.)
My original instinct to make ndis_return_packet() asynchronous was correct.
Making ndis_rxeof() submit packets to the stack asynchronously fixes
one recursive spinlock acquisition, but it's also possible for it to
happen via the ndis_txeof() path too. So:
- In if_ndis.c, revert ndis_rxeof() to its old behavior (and don't bother
putting ndis_rxeof_serial() back since we don't need it anymore).
- In kern_ndis.c, make ndis_return_packet() submit the call to the
MiniportReturnPacket() function to the "ndis swi" thread so that
it always happens in another context no matter who calls it.
workaround was for hardware where the clock was not latched, not for
hardware that was too slow. Also, make variable names more specific for ddb
printing.
While I would have prefered to have a solution that didn't move
knowledge of this into the pci layer. However, this is literally the
only exception that's listed in the PCI standard to the usual way of
decoding BARs. atapci devices in legacy mode now ignore the first 4
bars and hard code the values to the legacy ide values (well, for each
of the controllers that are in legacy mode). The 5th bar is handled
normally.
Remove the zero bar handling. zero bars should be ignored at all
other times, and since we handle that specially, we don't need the
older workaround.
what the ACPI-safe workaround is intended to fix. Requested by phk.
Set the bushandle and tag when attaching the timer, don't do it each time
in read_counter(). Pointed out by bde.
Move test_counter() to the end. Staticize acpi_timer_reg.
Clearly comment the assumptions on the structure of keys (addresses)
and masks, and introduce a macro, LEN(p), to extract the size of these
objects instead of using *(u_char *)p which might be confusing.
Comment the confusion in the types used to pass around pointers
to keys and masks, as a reminder to fix that at some point.
Add a few comments on what some functions do.
Comment a probably inefficient (but still correct) section of code
in rn_walktree_from()
The object code generated after this commit is the same as before.
At some point we should also change same variable identifiers such
as "t, tt, ttt" to fancier names such as "root, left, right" (just
in case someone wants to understand the code!), replace misspelling
of NULL as 0, remove 'register' declarations that make little sense
these days.
like "the foo(4) manual page" to "foo(4)". Uniformized the remaining
instances of "manual page" and "manpage" to "man page". Uniformized
some nearby sentence breaks. Reformatted the whole paragraph containing
these changes only for DUMMYNET.
of you with other cards, please do review and test the drivers for
MP-safety and disable Giant in the interrupt routines when you are
sure of proper functionality.
wireless ever since I added the new spinlock code. Previously, I added
a special ndis_rxeof_serial() function to insure that when we receive
a packet, we never end up calling the MiniportReturnPacket() routine
until after the receive handler has finished. I set things up so that
ndis_rxeof_serial() would only be used for serialized miniports since
they depend on this property. Well, it turns out deserialized miniports
depend on a similar property: you can't let MiniportReturnPacket() be
called from the same context as the receive handler at all. The 2100B
driver happens to use a single spinlock for all of its synchronization,
and it tries to acquire it both while in MiniportHandleInterrupt() and
in MiniportReturnPacket(), so if we call MiniportReturnPacket() from
the MiniportHandleInterrupt() context, we will end up trying to acquire
the spinlock recursively, which you can't do.
To fix this, I made the ndis_rxeof_serial() handler the default. An
alternate solution would be to make ndis_return_packet() submit
the call to MiniportReturnPacket() to the NDIS task queue thread.
I may do that in the future, after I've tested things a bit more.
supported. Symptoms of this bug included unnecessary use of ACPI-safe
and a dmesg that has deltas of about 2^24:
ACPI timer looks BAD min = 2, max = 16777206, width = 16777204
ACPI timer looks BAD min = 2, max = 7, width = 5
ACPI timer looks GOOD min = 4, max = 5, width = 1
ACPI timer looks BAD min = 2, max = 16777206, width = 16777204
ACPI timer looks BAD min = 2, max = 7, width = 5
ACPI timer looks BAD min = 2, max = 16777210, width = 16777208
ACPI timer looks BAD min = 4, max = 16777189, width = 16777185
ACPI timer looks GOOD min = 4, max = 5, width = 1
ACPI timer looks BAD min = 2, max = 7, width = 5
ACPI timer looks BAD min = 4, max = 16777189, width = 16777185
To fix this:
* Use a 32 bit timecounter mask when the timer is 32 bits.
* In test_counter(), use the acpi_TimerDelta function which handles 24/32
bit timers and wraparound.
Miscellaneous fixes:
* Use C99 initializers for timecounter struct.
* Use u_int and uint32_t where appropriate instead of unsigned.
* Remove whitespace-only lines
* Remove the old PIIX4 PCI workaround. The timecounter testing code has
been in use for long enough to prove it's functional.
globally available. acpi_TimerDelta() subtracts two readings from the
ACPI PM timer and returns the difference. It properly distinguishes between
24-bit and 32-bit timers and handles wraparound.
2. Document that this means that kernel modules must be rebuilt.
3. While I'm here, fix my sorting error in callout.h
Requested by: many [1], scottl [2], bde [3]
it checked for rt == NULL after dereferencing the pointer).
We never check for those events elsewhere, so probably these checks
might go away here as well.
Slightly simplify (and document) the logic for memory allocation
in rt_setgate().
The rest is mostly style changes -- replace 0 with NULL where appropriate,
remove the macro SA() that was only used once, remove some useless
debugging code in rt_fixchange, explain some odd-looking casts.
implementation taken directly from OpenBSD.
I've resisted committing this for quite some time because of concern over
TIME_WAIT recycling breakage (sequential allocation ensures that there is a
long time before ports are recycled), but recent testing has shown me that
my fears were unwarranted.
TIME_WAIT recycling cases I was able to generate with http testing tools.
In short, as the old algorithm relied on ticks to create the time offset
component of an ISN, two connections with the exact same host, port pair
that were generated between timer ticks would have the exact same sequence
number. As a result, the second connection would fail to pass the TIME_WAIT
check on the server side, and the SYN would never be acknowledged.
I've "fixed" this by adding random positive increments to the time component
between clock ticks so that ISNs will *always* be increasing, no matter how
quickly the port is recycled.
Except in such contrived benchmarking situations, this problem should never
come up in normal usage... until networks get faster.
No MFC planned, 4.x is missing other optimizations that are needed to even
create the situation in which such quick port recycling will occur.
a NULL crsbuf pointer. This shouldn't happen if it returns AE_OK. We'll
figure out why this is happening later.
Submitted by: Bruno Ducrot <ducrot@poupinou.org>
routine since the error will be reported back to the user buffer.
This will quiet down the bootverbose case when using an ACU which
does brute force discovery of the physical and logical devices.
the same process as the current thread it makes absolutely
no sense to lock the parent process through the pointer in
said thread.
Submitted by: pho (with minor correction)
Pointy Hat To: mtm
this patch were submitted by Maurycy Pawlowski-Wieronski. In addition
to Maurycy's change, break out softc tear down from ppp_clone_destroy()
into ppp_destroy() rather than performing a convoluted series of
extraction casts and indirections during tear down at mod unload.
Submitted by: Maurycy Pawlowski-Wieronski <maurycy@fouk.org>
of the struct, so that a placeholder for it (or unportable C99
initializers) are not needed for entries that don't use it. Use a C99
initializer for the 1 entry that uses it. Removed 91 placeholders.
This also restores API compatibility with NetBSD and RELENG_4 for most
entries.
+ remove useless wrappers around bcmp(), bcopy(), bzero().
The code assumes that bcmp() returns 0 if the size is 0, but
this is true for both the libc and the libkern versions.
+ nuke Bcmp, Bzero, Bcopy from radix.h now that nobody uses them anymore.
Removed the requirement for a particular subvendor/subproduct in
rev.1.26 (VScom PCI-800L card). While the BARs, etc., may depend on
the sub-ids, this is not known to be so, and I think it is better to
guess that they don't. The decision to check sub-id checks in this
file is apparently random; for VScom cards they were checked in 3 of
8 cases.
Reviewed by: timeout by committer (joerg) after 6 months
there so there are no ABI changes);
+ replace 5 redefinitions of the IPF2AC macro with one in if_arp.h
Eventually (but before freezing the ABI) we need to get rid of
struct arpcom (initially with the help of some smart #defines
to avoid having to touch each and every driver, see below).
Apart from the struct ifnet, struct arpcom now only stores a copy
of the MAC address (ac_enaddr, but we already have another copy in
the struct ifnet -- if_addrhead), and a netgraph-specific field
which is _always_ accessed through the ifp, so it might well go
into the struct ifnet too (where, besides, there is already an entry
for AF_NETGRAPH data...)
Too bad ac_enaddr is widely referenced by all drivers. But
this can be fixed as follows:
#define ac_enaddr ac_if.the_original_ac_enaddr_in_struct_ifnet
(note that the right hand side would likely be a pointer rather than
the base address of an array.)
+ replace 0 with NULL where appropriate (not complete)
+ remove register declaration while there
+ add argument names to function prototypes to have a better idea of
what they are used for
+ add 'const' qualifiers in 3 places
Nehemiah chip, but the work is all done in hardware.
There are three opportunities to add other entropy; the Data
Buffer, the Cipher's IV and the Cipher's key. A future commit
will exploit these opportunities.
+ remove a partly incorrect comment that i introduced in the last commit;
+ deal with the correct part of the above comment by cleaning up the
updates of 'info' -- rti_addrs needd not to be updated,
rti_info[RTAX_IFP] can be set once outside the loop.
While at it, correct a few misspelling of NULL as 0, but there are
way too many in this file, and i did not want to clutter the
important part of this commit.
Logical volumes on these devices show up as LUNs behind another
controller (also known as proxy controller). In order to issue
firmware commands for a volume on a proxy controller, they must be
targeted at the address of the proxy controller it is attached to,
not the Host/PCI controller.
A proxy controller is defined as a device listed in the INQUIRY
PHYSICAL LUNS command who's L2 and L3 SCSI addresses are zero. The
corresponding address returned defines which "bus" the controller
lives on and we use this to create a virtual CAM bus.
A logical volume's addresses first byte defines the logical drive
number. The second byte defines the bus that it is attached to
which corresponds to the BUS of the proxy controller's found or the
Host/PCI controller.
Change event notification to be handled in its own kernel thread.
This is needed since some events may require the driver to sleep
on some operations and this cannot be done during interrupt context.
With this change, it is now possible to create and destroy logical
volumes from FreeBSD, but it requires a native application to
construct the proper firmware commands which is not publicly
available.
Special thanks to John Cagle @ HP for providing remote access to
all the hardware and beating on the storage engineers at HP to
answer my questions.
Specifically, we used to enable the source after locking sched_lock
and just before we had already decided to do a context switch.
This meant that an ithread could never process more than one interrupt
per context switch. Enabling earlier in the loop before sched_lock is
acquired allows an ithread to handle multiple interrupts per context
switch if interrupts fire very rapidly. For the case of heavy interrupt
load this can reduce the number of context switches (and thus overhead)
as well as reduce interrupt latency.
- Now that we can handle multiple interrupts per context switch, add simple
interrupt storm protection to threaded interrupts. If X number of
consecutive interrupts are triggered before the itherad voluntarily
yields to another thread, then the interrupt thread will sleep with the
associated interrupt source disabled (masked) for 1/10th of a second.
The default value of X is 500, but it can be tweaked via the tunable/
sysctl hw.intr_storm_threshold. If an interrupt storm is detected, then
a message is output to the kernel console on the first occurrence per
interrupt thread. Interrupt storm protection can be disabled completely
by setting this value to 0. There is no scientific reasoning for the
1/10th of a second or 500 interrupts values, so they may require tweaking
at some point in the future.
Tested by: rwatson (an earlier version w/o the storm protection)
Tested by: mux (reportedly made a machine with two PCI interrupts
storming usable rather than hard locked)
Reviewed by: imp
different BIOSs use the same exact settings to mean two very different and
incompatible things for the SCI. Thus, if the SCI is remapped to a PCI
interrupt, we now trust the trigger/polarity that the MADT provides by
default. However, the SCI can be forced to level/lo as 1.10 did by setting
the tunable "hw.acpi.force_sci_lo" to a non-zero value from the loader.
Thus, if rev 1.10 caused an interrupt storm, it should nwo fix your
machine. If rev 1.10 fixed an interrupt storm on your machine, you
probably need to set the aforementioned tunable in /boot/loader.conf to
prevent the interrupt storm.
The more general problem of getting the SCI's trigger/polarity programmed
"correctly" (for some value of correctly meaning several workarounds for
broken BIOSs and inconsistent "implementations" of the ACPI standard) is
going to require more work, but this band-aid should improve the current
situation somewhat.
Requested by: njl
uiomove(9) is not properly locked. So, return to NEEDGIANT
mode. Later, when uiomove is finely locked, I'll revisit.
While I'm here, provide some temporary debugging output to
help catch blocking startups.
unconditionally initialize the mbuf header even if cluster allocation
failed, which could result in a NULL pointer dereference in low-memory
conditions.
PR: kern/65548
Submitted by: Stephan Uphoff <ups@tree.com>
the TAILQ_FOREACH() form.
Comment the need to store the same info (mac address for ethernet-type
devices) in two different places.
No functional changes. Even the compiler output should be unmodified
by this change.
of an interface. No functional change.
On passing, comment an useless invocation of TAILQ_INIT(&ifp->if_addrhead)
which could probably be removed in the interest of clarity.
of an interface. No functional change.
On passing, comment a likely bug in net/rtsock.c:sysctl_ifmalist()
which, if confirmed, would deserve to be fixed and MFC'ed
ntoskrnl_unlocl_dpc().
- hal_raise_irql(), hal_lower_irql() and hal_irql() didn't work right
on SMP (priority inheritance makes things... interesting). For now,
use only two states: DISPATCH_LEVEL (PI_REALTIME) and PASSIVE_LEVEL
(everything else). Tested on a dual PIII box.
- Use ndis_thsuspend() in ndis_sleep() instead of tsleep(). (I added
ndis_thsuspend() and ndis_thresume() to replace kthread_suspend()
and kthread_resume(); the former will preserve a thread's priority
when it wakes up, the latter will not.)
- Change use of tsleep() in ndis_stop_thread() to prevent priority
change on wakeup.
if the link-level address has been initialized already.
The majority of modern drivers never does this and works fine, which
makes me think that the check is totally unnecessary and a residue
of cut&paste from other drivers.
This change is done to simplify locking because now almost none of the
drivers uses this field. The exceptions are "ct" "ctau" and "cx"
where i am not sure if i can remove that part.
This avoids presenting invalid data to the client's applications
when the file is modified, and then extended within the window of
the resolution of the modifcation timestamp.
Reviewed By: iedowse
PR: kern/64091