- Fix a couple of LORs and panics;
- Temporarily remove the code that tries to cleanup sockets that stuck
on accepting queues (both complete and incomplete). I'm taking an ostrich
approach here until I find a better way to deal with sockets that were
disconnected before accepting (i.e. while socket was on complete or
incomplete accept queue).
we can do the stuff we need to do with linux processes at fork and
don't panic the kernel at exit of the child.
Submitted by: rdivacky
Tested with: tst-vfork* (glibc regression tests)
Tested by: netchild
progress the kernel audit code in CVS is considered authoritative.
This will ease $P4$-related merging issues during the CVS loopback.
Obtained from: TrustedBSD Project
LCDs to blink in the V_DISPLAY_ON case, at least in combination with
some 13W3-VGA-adaptors (what's exactly going on is unclear though,
as it happens when all of H-sync, V-sync and video output are enabled
and not touching the sync bits from the preset fixes it). Thus
creator_blank_display() now is reduced to turning the video output
on/off.
Although that DPMS code did what the XFree86/Xorg sunffb(4x) does,
it was questionable in the first place, as both implementations
also turn(ed) off the video output on standby and suspend, thus most
likely causing the monitor to turn off instead of entering standby
or suspend as intended (at least my monitors don't).
Reported and tested by: Patrick Reich
MFC after: 3 days
copyout(9) instead of copystr(9) for copying the errmsg from
kernel- to user-space. This fixes a panic on sparc64 when
using the nmount(2)-converted mountd(8).
While at it, use bcopy(3) instead of strncpy(3) in the kernel-
to kernel-space case for consistency with vfs_buildopts() and
between kernel- to user-space and kernel- to kernel-space case.
Following issues should be resolved:
- random watchdog timeouts (caused by concurrent phy access)
- some link state issues
- non working TX if media type was set explicitly
PR: kern/98738
Approved by: glebius (mentor)
MFC after: 2 weeks
returned by md_reserve(). Space reserved by mb_reserve() is
byte aligned and need to be used in conjunction with le16enc()
and le32enc().
Tested on: ia64
conditions. The cause of missing Tx completion interrupts comes from
Tx interrupt moderation mechanism(delayed interrupts) or chipset bug.
If Tx interrupt moderation mechanism is the cause of false watchdog
timeout error we should have to fix all device drivers that have Tx
interrupt moderation capability. We may need more investigation
for this issue. Anyway, the fix is the same for both cases.
This should fix occasional watchdog timeout errors seen on a few
systems.
Reported by: -net, Patrick M. Hausen < hausen AT punkt DOT de >
Tested by: Patrick M. Hausen < hausen AT punkt DOT de >
misc. control registers correctly and it is inconsistent with north bridge.
In fact, there are too many broken BIOS implementations out there and we
cannot fix every possible combination but at least it is consistent with
what we advertise with ioctl(2).
first filter out metadata update. Otherwise, devfs vnode could be
erronously interpreted as ufs one, causing further check of i_flags
to use random memory.
PR: kern/100365
Debugged and fix described by: tegge
Approved by: pjd (mentor)
MFC after: 2 weeks
REPORT LUNS command to a device.
camcontrol.[c8]: Implement reportluns. This tries to print the LUNs
out in a reasonable format. Only the periph
addressing method has been tested, since very little
hardware that I know of supports the other methods.
scsi_all.[ch]: Revamp the report luns CDB structure and helper
functions. This constitutes a little bit of an API
change, but since the old CDB length was 10 bytes,
and the REPORT LUNS CDB length is actually 12 bytes,
it's clear that no one was using this API in the
first place.
MFC After: 1 week
it ended up defaulting to ISP_ROLE_NONE. My testing hadn't caught it
because I was deliberatly setting role via ioctl.
Thanks to user Toni for lending me an alpha to test this on.
MFC after: 0 days
handling for amd64 in the common code. The MD parts for amd64 are still
outstanding, but at least this fixes some panics on amd64.
Sponsored by: Google SoC 2006
Submitted by: rdivacky
Tested by: bsam
Originally it wasn't enabled since the hardware wasn't commonplace,
but as 10GE hardware is becoming more widely used, building the module
by default should be beneficial.
Approved by: rwatson (mentor)
MFC after: 2 weeks
for example:
fwd tablearg ip from any to table(1)
where table 1 has entries of the form:
1.1.1.0/24 10.2.3.4
208.23.2.0/24 router2
This allows trivial implementation of a secondary routing table implemented
in the firewall layer.
I expect more work (under discussion with Glebius) to follow this to clean
up some of the messy parts of ipfw related to tables.
Reviewed by: Glebius
MFC after: 1 month
- protect td->td_proc->p_pid with the proc lock in linux_getpid
in the amd64 (= non i386) case [1]
Sponsored by: Google SoC 2006
Submitted by: rdivacky
Noticed by: netchild [1]
in older versions of FreeBSD. This option is pointless as it is needed in just
about every interesting usage of forward that I have ever seen. It doesn't make
the system any safer and just wastes huge amounts of develper time
when the system doesn't behave as expected when code is moved from
4.x to 6.x It doesn't make
the system any safer and just wastes huge amounts of develper time
when the system doesn't behave as expected when code is moved from
4.x to 6.x or 7.x
Reviewed by: glebius
MFC after: 1 week
has in its procfs (do a readlink of /proc/self/fd/<nn> to find the pathname
that corresponds to a given file descriptor). Valgrind-3.x needs this
functionality. This is a placeholder only at this time.
match up with reality and the prototype definitions.
Register the sem_exechook as the "process_exec" event handler, not
sem_exithook.
Submitted by: rdivacky
Sponsored by: SoC 2006
that it operates on lockmgr and sx locks. This can be useful for tracking
down vnode deadlocks in VFS for example. Note that this command is a bit
more fragile than 'show lockchain' as we have to poke around at the
wait channel of a thread to see if it points to either a struct lock or
a condition variable inside of a struct sx. If td_wchan points to
something unmapped, then this command will terminate early due to a fault,
but no harm will be done.
panic, go ahead and do the longer DELAY(1) spin wait.
- If we panic due to spinning too long, print out a few more details
including the pointer to the mutex in question and the tid of the owning
thread.
for syscalls in kld's, even when compiled into the kernel statically.
Note that since this hardcodes the SYS_ prefix SYSCALL_MODULE_HELPER() now
only works for native ABI system calls. Those are the only ones that
used the macro anyway, and I chose to not require a second argument to the
macro to specify the prefix or audit event directly.
- Send the systrace_args files for all the compat ABIs to /dev/null for
now. Right now makesyscalls.sh generates a file with a hardcoded
function name, so it wouldn't work for any of the ABIs anyway. Probably
the function name should be configurable via a 'systracename' variable
and the functions should be stored in a function pointer in the sysvec
structure.
by restoring the ifv_proto field in the vlan softc and putting it to use
this time. It's a good companion for ifv_encaplen, which has already been
used throughout this driver.
'show lockchain'. The churn is because I'm about to add a new
'show sleepchain' similar to 'show lockchain' for sleep locks (lockmgr and
sx) and 'show threadchain' was a bit ambiguous as both commands show
a chain of thread dependencies, 'lockchain' is for non-sleepable locks
(mtx and rw) and 'sleepchain' is for sleepable locks.
KERNBASE and VM_MAXUSER_ADDRESS.
Remove the useless include of opt_global.h, as noticed by netchild@ (the one
in arm/elf_trampoline.c is legit, because this file is compiled outside the
kernel, and doesn't use the standard CFLAGS).
line switch. Other files which may make the same mistake (according to
fxr.watson.org) but aren't fixed in this commit (people with more clue
about those files should fix this):
- i386/xbox/xbox.c
- arm/arm/elf_trampoline.c
- arm/arm/mem.c
Noticed by: cognet
- Prepare the modules for build on amd64, but don't build them there as
part of the kernel build yet. The code for the missing symbols on amd64
isn't committed and it may be solved differently.
Sponsored by: Google SoC 2006
Submitted by: rdivacky
- TLS - complete
- pid/tid mangling - complete
- thread area - complete
- futexes - complete with issues
- clone() extension - complete with some possible minor issues
- mq*/timer*/clock* stuff - complete but untested and the mq* stuff is
disabled when not build as part of the kernel with native FreeBSD mq*
support (module support for this will come later)
Tested with:
- linux-firefox - works, tested
- linux-opera - works, tested
- linux-realplay - doesnt work, issue with futexes
- linux-skype - doesnt work, issue with futexes
- linux-rt2-demo - works, tested
- linux-acroread - doesnt work, unknown reason (coredump) and sometimes
issue with futexes
- various unix utilities in linux-base-gentoo3 and linux-base-fc4:
everything tried worked
On amd64 not everything is supported like on i386, the catchup is planned for
later when the remaining bugs in the new functions are fixed.
To test this new stuff, you have to run
sysctl compat.linux.osrelease=2.6.16
to switch back use
sysctl compat.linux.osrelease=2.4.2
Don't switch while running a linux program, strange things may or may not
happen.
Sponsored by: Google SoC 2006
Submitted by: rdivacky
Some suggestions/help by: jhb, kib, manu@NetBSD.org, netchild
compat.linux.osrelease is changed to "2.6.16" or similar).
On amd64 not everything is supported like on i386, the catchup is planned for
later when the remaining bugs in the new functions are fixed.
Sponsored by: Google SoC 2006
Submitted by: rdivacky
Please don't style(9) the NetBSD code, we want to stay in sync. Not imported
on a vendor branch since we need local changes.
Sponsored by: Google SoC 2006
Submitted by: rdivacky
With help from: manu@NetBSD.org
Obtained from: NetBSD (linux_{futex,time}.*)
image_params arg.
- Change struct image_params to include struct sysentvec pointer and
initialize it.
- Change all consumers of process_exit/process_exec eventhandlers to
new prototypes (includes splitting up into distinct exec/exit functions).
- Add eventhandler to userret.
Sponsored by: Google SoC 2006
Submitted by: rdivacky
Parts suggested by: jhb (on hackers@)
Reported by: Nick Withers < nick AT nickwithers DOT com >
Tested by: Nick Withers < nick AT nickwithers DOT com >
No objection from: ariff
MFC after: 1 week
serial line usb drivers that depend on it. Instead, let the compile
fail rather than silently not including the driver. This is more in
line with how we handle things like mii.
# I'll note: a better system for coping with missing depends is needed,
# but this dependency is clearly backwards given our current flawed
# depend system.
aren't mapped via pmap_enter() (KVA). We will eventually support PAT bits
on user pages, but those will require some sort of MI caching mode stored
in the vm_page.
Reviewed by: alc
i386 (I don't know) but on amd64 at hand here, it paniced early at
boot.
(I'm pretty sure that PAGE_SIZE here was miscopied from another place
during porting, where in OpenBSD bus_dmamem_alloc() is used, but there
PAGE_SIZE means completely different thing.)
options field in register 10 will be deterministic, not random.
Correct the number of input bits for EXECUTE_FIRMWARE 0..1 to
0..2- the 2322 and 24XX cards use mailbox register 2 to specify
whether the f/w being executed is freshly loaded or not.
Correct the number of input bits for {READ,WRITE}_RAM_WORD_EXTENDED
so that register 8 gets picked up.
Fix the indexing and offset for the 2322 f/w download so that it
correctly puts the different code segments where they belong.
Move VERIFY_CHECKSUM to be the 'else' clause to 2322 f/w downloads-
the EXECUTE_FIRMWARE command for 2322 and 24XX cards will tell you
if the f/w checksum is incorrect and VERIFY_CHECKSUM only works for
RISC SRAM address < 64K so you can only do a VERIFY_CHECKSUM on the
first of the 3 f/w segments for the 2322.
Shorten the delay for the continuation mailbox commands- 1ms is
ridiculous (100us is more likely).
All of the more or less is really only for the 2322/6322 cards.
Previously em(4) requeued the failed mbuf chains from
bus_dmamap_load_mbuf_sg(9) failure to resend it later. However,
bus_dmamap_load_mbuf_sg(9) may never complete its request as the
fragmented frames can have more than EM_MAX_SCATTER segments.
To handle the above EFBIG case, defragment the frame with m_defrag(9)
and free the mbuf chain if it can't deframent the chain due to
resource shortage.
Reviewed by glebius (with improvements)
o Create one more spare DMA map for Rx handler to recover from
bus_dmamap_load_mbuf_sg(9) failure.
o Make sure to update status bit in Rx descriptors even if we failed
to allocate a new buffer. Previously it resulted in stuck condition
and em_handle_rxtx task took up all available CPU cycles.
o Don't blindly unload DMA map. Reuse loaded DMA map if received
packet has errors. This would speed up Rx processing a bit under
heavy load as it does not need to reload DMA map in case of error.
(bus_dmamap_load_mbuf_sg(9) is the most expensive call in driver
context.)
o Update if_iqdrops counter if it can't allocate a mbuf cluster.
With this change it's now possible to see queue dropped packets
with netstat(1).
o Update mbuf_cluster_failed counter if fixup code failed to
allocate mbuf header.
o Return ENOBUFS instead of ENOMEM in case of Rx fixup failure.
o Make adapter->lmp NULL in case of Rx fixup failure. Strictly
specking it's not necessary for correct operation but it makes
the intention clear.
o Remove now unused dropped_pkts member in softc.
With these changes em(4) should survive mbuf cluster allocation
failure on Rx path.
Reviewed by: pdeuskar, glebius (with improvements)
MCLBYTES - ETHER_ALIGN. Previously it applied the alignment fixup code
for oversized frames which would result in reduced performance on
strict alignment archs.
page queues-synchronized flag. Reduce the scope of the page queues lock in
vm_fault() accordingly.
Move vm_fault()'s call to vm_object_set_writeable_dirty() outside of the
scope of the page queues lock. Reviewed by: tegge
Additionally, eliminate an unnecessary dereference in computing the
argument that is passed to vm_object_set_writeable_dirty().
created on Windows XP (and others maybe) were not detected.
We detected only those created with newfs_msdos(8).
Submitted by: Tobias Reifenberger <treif@mayn.de>
style(9)ified by: pjd
o when turning off the socket for a 16-bit card, write 0 to INTR register
rather than just tying to just clear the rest bit. this seems to fix
card insert detection after an eject on TI bridges (ricoh bridges work
either way, apparently). This is a MFp4.
o Cope better with TOPIC95 bridges on powerup. According to NetBSD driver,
these bridges don't set POWER_STATE, so cope accordingly in our power
code. They also need a little extra time to settle, so do that as well.
o It appears that we need to turn on/off one of the clocks to the card
when we power up/down that socket on a TOPIC97, also from NetBSD.
o TOPIC97 bridges need to specifically enable LV card support. Unconditionally
do this in the hopes that all laptops that have these chips support LV
voltages (they should, since they are required for CardBus).
o TOPIC register name regularization. Registers specific to models of TOPIC
are now called out as such.
# I need a machine with a TOPIC95 for testing.
space that enables low voltage operation (and maybe other stuff).
Enable the bits in this register so low voltage 16-bit cards may work.
Existance noticed in NetBSD driver.
Instead the threshould is initialized in device attach. Later the
threshold could be increased in Tx underrun error and the new
threshold should be used in xl_init_locked().
think the RealTek PHY needs driver to set RGEPHY_BMCR_AUTOEN bit of
RGEPHY_MII_BMCR register and proper ANAR register setting for manual
media type selection.
This fixes long standing manual media type selection bug in rgephy(4).
Reported by: Jelte Jansen <jelte AT NLnetLabs DOT nl>
Tested by: Jelte Jansen <jelte AT NLnetLabs DOT nl>
Use proper pointer dereference to inform modified mbuf chains to
caller.
While I'm here perform checksum offload setup after loading DMA
maps.
In collaboration with: glebius
Use proper pointer dereference to inform modified mbuf chains to
caller.
While I'm here perform checksum offload setup after loading DMA
maps as m_defrag(9) can return new mbuf chains.
In collaboration with: glebius
WB (write-back) on x86 via control bits in PTEs and PDEs (including making
use of the PAT MSR). Changes include:
- A new pmap_mapdev_attr() function for amd64 and i386 which takes an
additional parameter (relative to pmap_mapdev()) specifying the cache
mode for this mapping. Note that on amd64 only WB mappings are done with
the direct map, all other modes result in a private mapping.
- pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached)
mappings rather than WB. Previously we relied on the BIOS setting up
MTRR's to enforce memio regions being treated as UC. This might make
hw.cbb_start_memory unnecessary in some cases now for example.
- A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places
that used pmap_mapdev() to map non-device memory (such as ACPI tables)
to do so using WB as before.
- A new pmap_change_attr() function for amd64 and i386 that changes the
caching mode for a range of KVA.
Reviewed by: alc
This way one will be able to use provider encrypted on eg. i386 on
eg. sparc64. This doesn't really buy us much today, because UFS isn't
endian agnostic.
We retain backward compatibility by setting G_ELI_FLAG_NATIVE_BYTE_ORDER
flag on devices with version number less than 2 and not converting the
offset.
before tagging them. This can help to work around brain-damage in some
switches that fail to pad a frame after untagging it if its length drops
below the minimum. This option is blessed by IEEE Std 802.1Q (2003 Ed.),
paragraph C.4.4.3.b. It's controlled by sysctl net.link.vlan.soft_pad.
Idea by: az
MFC after: 1 week
82571EB quad port copper NIC and has few minor fixes.
Details:
- if_em.c. Merged manually, viewing diff between new vendor
driver and previous one.
- if_em_hw.c. Dropped in from vendor, and then restored
revision 1.15.
8k boundary with this program still.
text data bss dec hex filename
7925 4 4476 12405 3075 bootiic.out
so we have like 293 bytes left before we have to play games. There
may be ways to reduce that somewhat, but they start to be very board
specific.
changes in the future. This helps with getting started and to
overcome the really sucky level of granuality this timeout has in
getc. A timeout of 1 means 'wait until top of next second' rather
than 'wait for at least a second'.
o include current tx rate in stats so athstats gets a consistent
snapshot and doesn't have to make an extra ioctl
o record tx rate for raw frames
MFC after: 3 weeks
o change rssi to be signed in ieee80211_nodestats
o add noise floor in ieee80211_nodestats (use an implicit hole to
preserve layout); return it as zero until we can update the api's
so the driver can provide noise floor data
o add a bandaid so IEEE80211_IOC_STA_STATS works for sta mode; when
all nodes are in the station table this will no longer be needed
o fix braino in IEEE80211_IOC_STA_INFO implementation; was supposed
to take a mac address and return info for that sta or all stations
if ff:ff:ff:ff:ff was supplied--but somehow this didn't get implemented;
implement the intended semantics and leave a compat shim at the old
ioctl number for the previous api
Reviewed by: mlaier
MFC after: 3 weeks
o add some missing stats to the global stat structure
o move accounting work for data frame rx into ieee80211_deliver_data
o add per-sta stats for rx ucast/mcast frames
o set rcvif in ieee80211_deliver_data so callers don't need to
MFC after: 2 weeks
o PMBR partitions count to the number of partitions on the disk, which
means that if a PMBR entry is invalid we will not treat the MBR as a
PMBR by virtue of it not describing any partitions.
Previously the checks were inconsistent in that an invalid PMBR entry
would be harmless when no other partitions exist (we would treat the
MBR as a PMBR by virtue of it being empty), but it would be fatal when
there is at least one other partition.
o The partition size of a PMBR partition is one less than the media size
because the GPT starts at the second sector (LBA 1) and extends to
the end of the media. For backward bug-compatibility we accept a size
that's exactly the media size (FreeBSD bug).
Also, when the partition size can not be represented in a 32-bit
integral, the partition size in the MBR is to be set to 0xFFFFFFFF.
Accept this as a valid size, even if the size can be represented.
from the actual geometry. This enables support of disks larger than
~120GB on pc98 boxes. They make great little network appliances.
I've been using these changes for the past year or so on my network
storage pc98 box :-).
of geometry. However, some platforms have a more complicated mapping
of the firmware values to the actual values. pc98 is the only
platform that currently does this. This mapping is necessary for
large disks connected to pc98 boxes, as the firmware labels require do
special hacks to the actual geometry for interoperability. We cannot
do this all in the geom layer because of initialization issues (geom
looks for an already initialized pc98 label, but we need the geometry
information prior to initialization, classic chicken and egg problem).
We pass the disk and the device_t to this function because the
geometry mapping depends on what kind of controller is used.
This hook allows platforms that want to override things to do so, and
has 0 overhead on all other platforms. These patches have been in use
locally for a long time, and received good feedback from the pc98
community and sos@ at various times during their development.
MFC After: 1 week
synchronized by the lock on the object containing the page.
Transition PG_WANTED and PG_SWAPINPROG to use the new field,
eliminating the need for holding the page queues lock when setting
or clearing these flags. Rename PG_WANTED and PG_SWAPINPROG to
VPO_WANTED and VPO_SWAPINPROG, respectively.
Eliminate the assertion that the page queues lock is held in
vm_page_io_finish().
Eliminate the acquisition and release of the page queues lock
around calls to vm_page_io_finish() in kern_sendfile() and
vfs_unbusy_pages().
a file system, but need to obtain a vnode. We may not be able to do it, because
all vnodes could be already in use and other processes cannot release them,
because they are waiting in "suspfs" state.
In such situation, we allow to allocate a vnode anyway.
This is a temporary fix - there is no backpressure to free vnodes allocated in
those circumstances.
MFC after: 1 week
Reviewed by: tegge
- Store the Ethernet header in node softc.
- Initialize header with dst addr and ethertype in node
constructor method.
- In node connect method send NGM_ETHER_GET_ENADDR message
downwards.
- If received reply from ng_ether(4) store the src addr
in softc.
- Add NGM_PPPOE_SETENDADDR message that allows user to
override the address with whatever he/she wants.
set the MTU prior to mounting root via NFS. This is required if the
server supports a higher than default MTU because the client will not
see the responses otherwise.
MFC after: 3 weeks
cards stopped working. Specifically the AVM B1 PCMCIA Card no longer
detected. Its CIS chain read back as all FF's. Putting the delay
back solves those problems. I've opted to put in a much shorter delay
because as far as I can tell, no delay is really needed here. We'll
see how well this works in practice.
we obtained access. It is possible that GPT gets to taste a disk
first, which means the disk has not been opened before and it will
not get opened until after we checked the mediasize and sectorsize.
However, since the mediasize and sectorsize are determined at open
and that happens when access is optained, checking the mediasize
and sectorsize before obtaining access may result in GPT rejecting
the disk.
whole the physical memory, cached, using 1MB section mappings. This reduces
the address space available for user processes a bit, but given the amount of
memory a typical arm machine has, it is not (yet) a big issue.
It then provides a uma_small_alloc() that works as it does for architectures
which have a direct mapping.
Two almost identical patches based on the if_tap work were submitted
via GNATS; I started out with the patch in 100796 from David Gilbert,
but could have easily started with the patch from Vilmos Nebehaj which
I found only later.
MFC after: 1 week
PR: 93976, 100796
and vn_fullpath (that call malloc(..., M_WAITOK)) from under the
vm object lock, since sleep is not allowed while holding the mutex.
Being there, wrap VOP_GETATTR call with conditional Giant aquire.
Currently this is (almost) noop because pseudofs is Giant-locked.
Tested by: kris
Approved by: pjd (mentor)
MFC after: 2 weeks
function independently. This change is not only load-tested since I don't
have hardware that supports acpi_dock. Clean up comments and a name a
few constants.
is interaction between in-kernel sound buffer handling and hardware.
With small buffer, there are times when both harwdare reads and
kernel writes to the same buffer (it is only visible on slow machines, i
think). I'm digging in channel.c and buffer.c to find a solution that
allow use of large hardware buffers without sound lags - hardware can
handle buffers up to 32Mb."
Submitted by: Yuriy Tsibizov <Yuriy.Tsibizov@gfk.ru>
that aren't listed as valid in the link device's set of possible IRQs.
This allows the hints to be used to work around broken BIOSes that don't
specify the correct ste of possible IRQs. A warning is issued in the
dmesg in this case to be consistent with the $PIR handling code.
MFC after: 1 week
uipc_proto.c to uipc_usrreq.c, making localdomain static. Remove
uipc_proto.c as it's no longer used. With this change, UNIX domain
sockets are entirely encapsulated in uipc_usrreq.c.
- Print node ID, where possible.
- Prepend log messages with function name, or at least with "ng_pppoe".
Reviewed by: julian
Tested by: Joao Barros <joao.barros gmail.com>
This enables the scanner function on these devices to be detected
and probed by uscanner(4), but only when ulpt is not loaded.
PR: usb/92462
Submitted by: Friedrich Volkmann
MFC after: 30 days
_SOLARIS_C_SOURCE is defined.
The _OpenSolaris_version is set to match the last import of the OpenSolaris
tar ball and is based on the date in that file name.
are only visible if _SOLARIS_C_SOURCE is defined.
Note thar FreeBSD stat() and fstat() are 64-bit functions now and Solaris
still persists with both 32- and 64-bit versions. When I query this, I am
referred to: <http://www.unix.org/version2/whatsnew/lfs20mar.html>.
But when you look at the main page of unix.org you will see that the
Single Unix Specification <http://www.unix.org/version3/> is the most
recent standard they are pushing. And there are no stat64() fstat64()
functions defined there. I guess this just goes to prove that there are so
many standards, you can take your pick.
The cyclic timer is a high-resolution timer allows timeouts at nanosecond
intervals where hardware support is available. Typically on i386 there
is no HPET (high performance event timer) like the one Intel started
specifying some time in 2004, so the best that tye cyclic timer subsystem
can do is run at Hz.
The cyclic timer code itself is ported from OpenSolaris and is covered
by the CDDL, so it is only loaded as a module. This function type definition
is used in machine-dependent code to provide a hook for the module to
register it's callback function.
if _SOLARIS_C_SOURCE is defined.
Add two function prototypes which are required to feed high-resolution
times to DTrace. DTrace requires it's own functions with the dtrace_
prefix so that it knows not to try and trace them. This is a rule that
code executed from the DTrace probe context must obey.
The two functions are only be compiled if the KDTRACE option is defined
to compile in kernel support for loading the DTrace modules.
These are only defined if _SOLARIS_C_SOURCE is defined, so they don't
polute the FreeBSD compile environment.
They are used all over the OpenSolaris source, so defining them here
removes the need to continually resolve differences in FreeBSD system
haeder files from Solaris header files.
were unused or already in if_var.h so add if_name() to if_var.h and
remove net_osdep.h along with all references to it.
Longer term we may want to kill off if_name() entierly since all modern
BSDs have if_xname variables rendering it unnecessicary.
the notify structs. Fix messages in isp_got_msg_fc to print out the
loop id of the sender- not the wwpn which will be synthesized later,
if possible, in the outer layers. Put in debug printouts to pair
a notify ack to a notify so one can see the start/close of an
immediate notify event. Put in spsace for TASK MANAGEMENT response
flags (which we don't do yet).
on output frames.
Many people were confused with not working CARP, ng_bridge(4)
and other subsystems, because ng_ether(4) overwritten source
MAC address.
We have to adjust curthread's state enough so that it appears to be
in a poll(2) or select(2) call so that selrecord() will work and then
teardown that state after calling sopoll().
- Fix some minor nits in nearby ncp_sock_rselect() and in the identical
nbssn_rselect() function in the netsmb code:
- Don't call nb_poll()/ncp_poll() now that ncp_poll() already fakes up
poll(2) state since the rselect() functions already do that. Just
invoke sopoll() directly.
- To make things slightly more intuitive, store the results of sopoll()
in a new 'revents' variable rather than 'error' since that's what
sopoll() actually returns.
- If the requested timeout time has been exceeded by the time we get
ready to block, then return EWOULDBLOCK rather than 0 to signal a
timeout as this is what the calling code expects.
Tested by: Eric Christeson <eric.j.christeson AT gmail> (1)
MFC after: 1 week
interface, do not just assign -1 to tag because it breaks the logic of
the code to follow. The better way is to handle this case as an unsupported
protocol and return unless INVARIANTS is in effect and we can panic.
Panic is good there because the scenario can happen only because of a
coding error elsewhere.
We also should show the interface name in the panic message for easier
debugging of the problem, should it ever emerge.
Submitted by: qingli (initially)
as it tried to solve:
- it smuggled hidden 802.1q details into otherwise protocol-neutral code;
- it put an important code consistency check under DEBUG, which was never
defined by anyone but a developer hacking this file for the moment;
- lastly, the former bcopy() call had been correct as long as the "dead"
code was there.
(A new version of the fix for tag of -1 to come in the next commit.)
Agreed by: qingli
80003 NICs and NICs found on ICH8 mobos, and improves support for
already known chips.
Details:
- if_em.c. Merged manually, viewing diff between new vendor
driver and previous one. This was an easy task, because
most changes between 5.1.5 and 6.0.5 are bugfixes taken
from FreeBSD.
- if_em_hw.h. Dropped in from vendor, and then restored
revisions 1.16, 1.17, 1.18.
- if_em_hw.c. Dropped in from vendor, and then restored
revision 1.15.
- if_em_osdep.h. Added new required macros from vendor file
and add a hack against define namespace mangling in
if_em_hw.h. Intel made another hack, but I prefer mine.
systrace.
Another file called systrace_args.c is generated. This will be compiled
into systrace and is used to map the syscall arguments into the 64-bit
parameter array.
sofree(), as a number of protocols expect to be able to call
soisdisconnected() during detach. That may not be a good assumption,
but until I'm sure if it's a good assumption or not, allow it.
alloc'ing mbufs so that there is less error handling required.
- Go ahead and account for the data space in the first mbuf before entering
the loop to alloc more mbuf's. This simplifies the loop logic and avoids
confusing Coverity.
CID: 817
Reviewed by: sam
Tested by: pjd
Found by: Coverity Prevent (tm)
tcp_twstart(), but not to the other, tcp_detach(), as the socket is
already being torn down and therefore there are no listeners. This avoids
a panic if kqueue state is registered on the socket at close(), and
eliminates to XXX comments. There is one case remaining in which
tcp_discardcb() reaches up to the socket layer as part of the TCP host
cache, which would be good to avoid.
Reported by: Goran Gajic <ggajic at afrodita dot rcub dot bg dot ac dot yu>
vfs_rel() on the mountpoint if the MAC checks fail in kern_statfs() and
kern_fstatfs(). Similarly, don't perform an extra vfs_rel() if we get
a doomed vnode in kern_fstatfs(), and handle the case of mp being NULL
(for some doomed vnodes) by conditionalizing the vfs_rel() in
kern_fstatfs() on mp != NULL.
CID: 1517
Found by: Coverity Prevent (tm) (kern_fstatfs())
Pointy hat to: jhb
eliminating a second set of identical mutex operations at the bottom.
This allows brief exceeding of the max sockets limit, but only by
sockets in the last stages of being torn down.
PowerPC-based Apple's machines and small utility to do it from
userland modelled after the similar utility in Darwin/OSX.
Only tested on 1.25GHz G4 Mac Mini.
MFC after: 1 month
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write(). However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.
The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm
trying to clean up and fix our support for execute access.
Reviewed by: marcel@ (ia64)
vlan tag processing, the code will use bcopy() to remove the vlan
tag field but the code copies 2 bytes too many, which essentially
overwrites the protocol type field.
Also, a tag value of -1 is generated for unrecognized interface type,
which would cause an invalid memory access in the vlans[] array.
In addition, removed a line of dead code and its associated comments.
Reviewed by: sam
- Change the workaround for the autopad/checksum offload bug so that
instead of lying about the map size, we actually create a properly
padded mbuf and map it as usual. The other trick works, but is ugly.
This approach also gives us a chance to zero the pad space to avoid
possibly leaking data.
- With the PCIe devices, it looks issuing a TX command while there's
already a transmission in progress doesn't have any effect. In other
words, if you send two packets in rapid succession, the second one may
end up sitting in the TX DMA ring until another transmit command is
issued later in the future. Basically, if re_txeof() sees that there
are still descriptors outstanding, it needs to manually resume the
TX DMA channel by issuing another TX command to make sure all
transmissions are flushed out. (The PCI devices seem to keep the
TX channel moving until all descriptors have been consumed. I'm not
sure why the PCIe devices behave differently.)
(You can see this issue if you do the following test: plug an re(4)
interface into another host via crossover cable, and from the other
host do 'ping -c 2 <host with re(4) NIC>' to prime the ARP cache,
then do 'ping -c 1 -s 1473 <host with re(4) NIC>'. You're supposed
to see two packets sent in response, but you may only see one. If
you do 'ping -c 1 -s 1473 <host with re(4) NIC>' again, you'll
see two packets, but one will be the missing fragment from the last
ping, followed by one of the fragments from this ping.)
- Add the PCI ID for the US Robotics 997902 NIC, which is based on
the RTL8169S.
- Add a tsleep() of 1 second in re_detach() after the interrupt handler
is disconnected. This should allow any tasks queued up by the ISR
to drain. Now, I know you're supposed to use taskqueue_drain() for
this, but something about the way taskqueue_drain() works with
taskqueue_fast queues doesn't seem quite right, and I refuse to be
tricked into fixing it.
- If we fail to register the system call during MOD_LOAD, then note that
so that we don't try to deregister it or invoke the chained event handler
during the subsequent MOD_UNLOAD event. Doing the deregister when the
register failed could result in trashing system call entries.
- Add a SI_SUB_SYSCALLS just before starting up init and use that to
register syscall modules instead of SI_SUB_DRIVERS. Registering system
calls as late as possible increases the chances that any other module
event handlers or SYSINITs in a module are executed to initialize the
data in a kld before a syscall dependent on that data is able to be
invoked.
MFC after: 3 days
cache when unloading the nfsserver module. This fixes a memory leak and
a stale pointer.
- Use callout_drain() rather than callout_stop() when unloading the
nfsserver module.
MFC after: 3 days
- Right justify 'pid' label.
- Move the uid column to the right 2 columns so that the 3 process id
columns (pid, ppid, pgrp) are grouped together.
- Expand the uid column to 5 chars.
- Don't indent the tid for multithreaded processes.
Requested by: bde (1, 2, 4)
longer referenced by other threads (hence our freeing it), we don't need
to set the can't send and can't receive flags, wake up the consumers,
perform two levels of locking, etc. Implement a fast-path teardown,
sbdestroy(), which flushes and releases each socket buffer. A manual
dom_dispose of the receive buffer is still required explicitly to GC
any in-flight file descriptors, etc, before flushing the buffer.
This results in a 9% UP performance improvement and 16% SMP performance
improvement on a tight loop of socket();close(); in micro-benchmarking,
but will likely also affect CPU-bound macro-benchmark performance.
UNIX domain socket at the same time as the remote host is closing the
new connections as quickly as they open. Since the connect() and
send() paths are non-atomic with respect to another, it is possible
for the second thread's close() call to disconnect the two sockets
as connect() returns, leading to the consumer (which plans to send())
with a NULL kernel pointer to its proposed peer. As a result, after
acquiring the UNIX domain socket subsystem lock, we need to revalidate
the connection pointers even though connect() has technically succeed,
and reurn an error to say that there's no connection on which to
perform the send.
We might want to rethink the specific errno number, perhaps ECONNRESET
would be better.
PR: 100940
Reported by: Young Hyun <youngh at caida dot org>
MFC after: 2 weeks
MFC note: Some adaptation will be required
- Correct the PCI ID for the 8169SC/8110SC in the device list (I added
the macro for it to if_rlreg.h before, but forgot to use it.)
- Remove the extra interrupt spinlock I added previously. After giving it
some more thought, it's not really needed.
- Work around a hardware bug in some versions of the 8169. When sending
very small IP datagrams with checksum offload enabled, a conflict can
occur between the TX autopadding feature and the hardware checksumming
that can corrupt the outbound packet. This is the reason that checksum
offload sometimes breaks NFS: if you're using NFS over UDP, and you're
very unlucky, you might find yourself doing a fragmented NFS write where
the last fragment is smaller than the minimum ethernet frame size (60
bytes). (It's rare, but if you keep NFS running long enough it'll
happen.) If checksum offload is enabled, the chip will have to both
autopad the fragment and calculate its checksum header. This confuses
some revs of the 8169, causing the packet that appears on the wire
to be corrupted. (The IP addresses and the checksum field are mangled.)
This will cause the NFS write to fail. Unfortunately, when NFS retries,
it sends the same write request over and over again, and it keeps
failing, so NFS stays wedged.
(A simple way to provoke the failure is to connect the failing system
to a network with a known good machine and do "ping -s 1473 <badhost>"
from the good system. The ping will fail.)
Someone had previously worked around this using the heavy-handed
approahch of just disabling checksum offload. The correct fix is to
manually pad short frames where the TCP/IP stack has requested
checksum offloading. This allows us to have checksum offload turned
on by default but still let NFS work right.
- Not a bug, but change the ID strings for devices with hardware rev
0x30000000 and 0x38000000 to both be 8168B/8111B. According to RealTek,
they're both the same device, but 0x30000000 is an earlier silicon spin.
perform the reboot action via the reset register instead of our legacy
method. Default is 0 (use legacy). This is needed because some systems
hang on reboot even though they claim to support the reset register.
MFC after: 2 days
and pc98 MD files. Remove nodevice and nooption lines specific
to sio(4) from ia64, powerpc and sparc64 NOTES. There were no
such lines for arm yet.
sio(4) is usable on less than half the platforms, not counting
a future mips platform. Its presence in MI files is therefore
increasingly becoming a burden.
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
- fix "No sound in KDE":
The problem is related to the implementation of Envy24(1712) hardware
mixer support in the driver. Envy24(1712) has very precise 36bit wide
hardware mixer, which is superior that vchans (software sound mixer in
the kernel). The driver supports Envy24(1712) hardware mixer, so up to
10 channels (5 stereo pairs) can be playback simultaneously.
However, there are problems with the implementation of Envy24(1712)
hardware mixer support in the driver, one of them is the problem with
"no sound in KDE":
When playing back several channels simultaneously and
stoping one of the channels, sound starts to stutter and
plays at very low speed.
Another problem is:
Playing back simultaneously more than one 24bit/32bit
sound file or 16bit sound file and 24bit/32bit sound
file doesn't work as expected.
Submitted by: "Konstantin Dimitrov" <kosio.dimitrov@gmail.com>
except for s_family (which is read-only once after it is set when the
structure is created).
- Mark svr4_sys_ioctl(), svr4_sys_getmsg(), and svr4_sys_putmsg() MPSAFE.
implementations and adjust some of the checks while I'm here:
- Add a new check to make sure we don't return from a syscall in a critical
section.
- Add a new explicit check before userret() to make sure we don't return
with any locks held. The advantage here is that we can include the
syscall number and name in syscall() whereas that info is not available
in userret().
- Drop the mtx_assert()'s of sched_lock and Giant. They are replaced by
the more general checks just added.
MFC after: 2 weeks
a count of all non-spin locks, not just lockmgr locks. This can give us a
much cheaper way to see if we have any locks held (such as when returning
to userland via userret()) without requiring WITNESS.
MFC after: 1 week
poll (i.e. call read_char() method) slave keyboards.
This workaround should fix problem with kbdmux(4) and
atkbd(4) not working in ddb(4) and mid-boot.
MFC after: 1 week
kern_fstatfs() so that it is still held when prison_enforce_statfs() is
called (since that function likes to poke and prod at the mountpoint
structure).
MFC after: 3 days
all other mtx_lock() operations to block. Previously, when the mutex was
destroyed, it would still have a valid value in mtx_lock(): either the
unowned cookie, which would allow a subsequent mtx_lock() to succeed, or a
pointer to the thread who destroyed the mutex if the mutex was locked when
it was destroyed.
MFC after: 3 days
kern_accept() and accept1(). If another thread closed the new file
descriptor and the first thread later got an error trying to copyout the
socket address, then it would attempt to close the wrong file object. To
fix, add a struct file ** argument to kern_accept(). If it is non-NULL,
then on success kern_accept() will store a pointer to the new file object
there and not release any of the references. It is up to the calling code
to drop the references appropriately (including a call to fdclose() in case
of error to safely handle the aforementioned race). While I'm at it, go
ahead and fix the svr4 streams code to not leak the accept fd if it gets an
error trying to copyout the streams structures.
map was obtained from the SMAP. SMAP is trustworthy, and the memory
extending feature is a band-aid for older systems where FreeBSD's methods
of detecting memory were not always trustworthy. This fixes the issue
where using hw.physmem could result in the ACPI tables getting trashed
breaking ACPI.
MFC after: 3 days
Tested on: i386
by remembering a map used in bus_dmamap_load_mbuf_sg(9). I have
no idea how it could ever worked before.
This fixes a warning generated by a diagnostic check in sun4v
iommu driver.
Reported by: jb
Tested by: jb(sun4v)
have been invoked by uipc_close() or uipc_abort(), and the socket is in a
state of being torn down by the time we get to this point, so kqueue
state frobbed by soisdisconnected() is not available, so frobbing it will
result in a panic.
Reported by: Munehiro Matsuda <haro at h4 dot dion dot ne dot jp>
This avoids that mem.c has to include ofw_machdep.h, including
all OFW related headers.
o Provide a stub for OF_decode_addr(), which is used by low-level
console drivers to obtain a tag and handle given a OFW phandle.
This is different from sparc64, where a fake bus tag needs to be
created explicitly.
space range per channel, but rather the unshifted range. The
shifting depends on the bus. The hardcoded shift was specific
to the SBus on sparc64. The shifted range is now determined at
run-time. This fixes the mac-io attachment.
Such an address can be used directly in padlock's AES.
This improves speed of geli(8) significantly:
# sysctl kern.geom.zero.clear=0
# geli onetime -s 4096 gzero
# dd if=/dev/gzero.eli of=/dev/null bs=1m count=1000
Before: 113MB/s
After: 203MB/s
BTW. If sector size is set to 128kB, I can read at 276MB/s :)
skip the actual type 1 length (6 bytes). With this change, it is now possible
to correctly spot the VAT partition map in certain discs.
Submitted by: Pedro Martelletto <pedro@ambientworks.net>
Prevent casual modification by requiring hw.acpi.thermal.user_override to
be set first. Fix printing of negative temperatures in the K->C conversion.
Document the remaining thermal sysctls.
MFC after: 3 days
desired role configuration instead of existing role. This gets
us out of the mess where we configured a role of NONE (or were
LAN only, for example), but didn't continue to attach the CAM
module (because we had neither initiator nor target role
set). Unfortunately, the code that rewrites NVRAM to match
actual to desired role only works if the CAM module attaches.
MFC after: 2 weeks
controller ported from NetBSD. It supports the following Gigabit
Ethernet adapters.
o Antares Microsystems Gigabit Ethernet
o ASUS NX1101 Gigabit Ethernet
o D-Link DL-4000 Gigabit Ethernet
o IC Plus IP1000A Gigabit Ethernet
o Sundance ST-2021 Gigabit Ethernet
o Sundance ST-2023 Gigabit Ethernet
o Sundance TC9021 Gigabit Ethernet
o Tamarack TC9021 Gigabit Ethernet
The IP1000A Gigabit Ethernet is also found on some motherboards
(LOM) from ABIT.
Unlike NetBSD stge(4) it does not require promiscuous mode operation
to revice packet and it supports all hardware features(TCP/UDP/IP
checksum offload, VLAN tag stripping/insertion features and JUMBO
frame) and polling(4).
Due to lack of hardware, hardwares that have TBI trantransceivers
were not tested at all.
Special thanks to wpaul who provided valauble datasheet for the
controller and helped to debug jumbo frame related issues. Whitout
his datasheet I would have spent many hours to debug this chip.
Tested on: i386, sparc64