87328 Commits

Author SHA1 Message Date
gibbs
ae14155f37 Update netfront so that it queries and honors published
back-end features.

sys/dev/xen/netfront/netfront.c:
	o Add xn_query_features() which reads the XenStore and
	  records the TSO, LRO, and chained ring-request support
	  of the backend.
	o Rename xn_configure_lro() to xn_configure_features() and
	  use this routine to manage the setup of TSO, LRO, and
	  checksum offload.
	o In create_netdev(), initialize if_capabilities and
	  if_hwassist to the capabilities found on all backends.
	  Delegate configuration of if_capenable and the TSO flag
	  in if_hwassist to xn_configure_features().

Reported by:	Hugo Silva (fix inspired by patch provided)
Approved by:	re
MFC after:	1 week
2011-09-21 00:15:29 +00:00
gibbs
b72e5b17b4 Modify the netfront driver so it can successfully attach to
PV devices with the ioemu attribute set.

sys/dev/xen/netfront/netfront.c:
	o If a mac address for the interface cannot be found
	  in the front-side XenStore tree, look for an entry
	  in the back-side tree.  With ioemu devices, the
	  emulator does not populate the front side tree and
	  neither does Xend.
	o Return an error rather than panic when an attach
	  attempt fails.

Reported by:	Janne Snabb (fix inspired by patch provided)
PR:		kern/154302
Approved by:	re
2011-09-21 00:13:04 +00:00
gibbs
1f5bdef0e4 Correct suspend/resume support in the Netfront driver.
Sponsored by: BQ Internet

sys/dev/xen/netfront/netfront.c:
	o Implement netfront_suspend(), a specialized suspend
	  handler for the netfront driver.  This routine simply
	  disables the carrier so the driver is idle during
	  system suspend processing.
	o Fix a leak when re-initializing LRO during a link reset.
	o In netif_release_tx_bufs(), when cleaning up the grant
	  references for our TX ring, use gnttab_end_foreign_access_ref
	  instead of attempting to grant the page again.
	o In netif_release_tx_bufs(), we do not track mbufs associated
	  with mbuf chains, but instead just free each mbuf directly.
	  Use m_free(), not m_freem(), to avoid double frees of mbufs.
	o Refactor some code to enhance clarity.

Approved by:	re
MFC after:	1 week
2011-09-21 00:08:25 +00:00
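
The m_free()/m_freem() point in 1f5bdef0e4 can be illustrated with a small userland sketch. This is not the netfront code: `struct mbuf_sketch`, `free_one()`, `free_chain()`, and `release_slots()` are hypothetical stand-ins, and a counter replaces the real free so that a double free is visible instead of fatal. The assumption is that every buffer, including chained ones, has its own slot in the tracking array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf; only the chain link matters here. */
struct mbuf_sketch {
	struct mbuf_sketch *m_next;	/* next buffer in the chain */
	int free_count;			/* how many times it was "freed" */
};

/* Analogue of m_free(): releases exactly one buffer. */
static void
free_one(struct mbuf_sketch *m)
{
	assert(m != NULL);
	m->free_count++;
}

/* Analogue of m_freem(): releases the buffer and everything chained to it. */
static void
free_chain(struct mbuf_sketch *m)
{
	struct mbuf_sketch *next;

	for (; m != NULL; m = next) {
		next = m->m_next;
		free_one(m);
	}
}

/*
 * Every buffer has its own slot, so each slot must release only its
 * own buffer.  Walking the chain from each slot would release the
 * chained buffers a second time.
 */
static void
release_slots(struct mbuf_sketch **slots, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (slots[i] != NULL) {
			free_one(slots[i]);
			slots[i] = NULL;
		}
	}
}
```

With two chained buffers that both sit in the array, release_slots() touches each one once, whereas calling free_chain() on the head afterwards would bump the tail's counter a second time.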
gibbs
5ed6108620 Add suspend/resume support to the Xen blkfront driver.
Sponsored by: BQ Internet

sys/dev/xen/blkfront/block.h:
sys/dev/xen/blkfront/blkfront.c:
	Remove now unused blkif_vdev_t from the blkfront softc.

sys/dev/xen/blkfront/blkfront.c:
	o In blkfront_suspend(), indicate the desire to suspend
	  by changing the softc connected state to SUSPENDED, and
	  then wait for any I/O pending on the remote peer to
	  drain.  Cancel suspend processing if I/O does not
	  drain within 30 seconds.
	o Enable and update blkfront_resume().  Since I/O is
	  drained prior to the suspension of the VM, the complicated
	  recovery process performed by other Xen blkfront
	  implementations is avoided.  We simply tear down the
	  connection to our old peer, and then re-connect.
	o In blkif_initialize(), fix a resource leak and botched
	  return if we cannot allocate shadow memory for our
	  requests.
	o In blkfront_backend_changed(), correct our response to
	  the XenbusStateInitialised state.  This state indicates
	  that our backend peer has published sufficient data for
	  blkfront to publish ring information and other XenStore
	  data, not that a connection can occur.  Blkfront now
	  will only perform connection processing in response to
	  the XenbusStateConnected state.  This corrects an issue
	  where blkfront connected before the backend was ready
	  during resume processing.

Approved by:	re
MFC after:	1 week
2011-09-21 00:02:44 +00:00
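
The state handling described for blkfront_backend_changed() in 5ed6108620 can be sketched as a small decision function. The enum and function names below are illustrative stand-ins, not the driver's identifiers, and the real handler deals with more states and side effects than this shows.

```c
#include <assert.h>

/* Simplified subset of backend states; names are hypothetical. */
enum backend_state_sketch {
	BACKEND_INITIALISING,
	BACKEND_INITIALISED,
	BACKEND_CONNECTED
};

/* What the front end should do in response; also hypothetical. */
enum front_action_sketch {
	FRONT_WAIT,
	FRONT_PUBLISH_RINGS,
	FRONT_CONNECT
};

static enum front_action_sketch
front_on_backend_changed(enum backend_state_sketch s)
{
	assert(s <= BACKEND_CONNECTED);
	switch (s) {
	case BACKEND_INITIALISED:
		/* Backend data is available: publish ring info only. */
		return (FRONT_PUBLISH_RINGS);
	case BACKEND_CONNECTED:
		/* Only now is it safe to run connection processing. */
		return (FRONT_CONNECT);
	default:
		return (FRONT_WAIT);
	}
}
```

Keeping the "publish" and "connect" actions on separate states is what prevents the front end from connecting before the backend is ready.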
gibbs
44b315b7fb Properly handle suspend/resume events in the Xen device
framework.

Sponsored by:	BQ Internet

sys/xen/xenbus/xenbusb.c:
	o In xenbusb_resume(), publish the state transition of the
	  resuming device into XenbusStateInitialising so that the
	  remote peer can see it.  Recording the state locally is
	  not sufficient to trigger a re-connect sequence.
	o In xenbusb_resume(), defer new-bus resume processing until
	  after the remote peer's XenStore address has been updated.
	  The drivers may need to refer to this information during
	  resume processing.

sys/xen/xenbus/xenbusb_back.c:
sys/xen/xenbus/xenbusb_front.c:
	Register xenbusb_resume() rather than bus_generic_resume()
	as the handler for device_resume events.

sys/xen/xenstore/xenstore.c:
	o Fix grammar in a comment.
	o In xs_suspend(), pass suspend events on to the child
	  devices (e.g. xenbusb_front/back) that are attached
	  to the XenStore.

Approved by:	re
MFC after:	1 week
2011-09-20 23:44:34 +00:00
kib
100c4e4864 Use nowait sync request for a vnode when doing softdep cleanup. We possibly
own an unrelated vnode lock, so doing a waiting sync causes deadlocks.

Reported and tested by:	pho
Approved by:	re (bz)
2011-09-20 21:53:26 +00:00
kmacy
e3079e1350 Make KBI changes required for future MFCing of inpcb rtentry / llentry caching.
Reviewed by:	rwatson, bz
Approved by:	re (kib)
2011-09-20 20:27:26 +00:00
hselasky
8139b983c9 Avoid starting the USB transfer if an error is already pending.
This change fixes a race in device side mode during clear-stall from
host, which can cause data to be sent too early on the given
endpoint.

Approved by:	re (kib)
MFC after:	1 week
2011-09-20 14:17:58 +00:00
adrian
8bc7d7cad3 Manually set the channel when using monitor mode - the firmware
doesn't select it automatically.

Submitted by:	nox
Reviewed by:	bschmidt
Approved by:	re
PR:		kern/160815
2011-09-20 04:30:23 +00:00
hrs
45d40bbb1a Copy ip6po_minmtu and ip6po_prefer_tempaddr in ip6_copypktopts(). This fixes
inconsistency when options are specified by both setsockopt() and ancillary
data types.

PR:		kern/158307
Approved by:	re (bz)
2011-09-20 00:29:17 +00:00
tuexen
9fb650bb7b Cleanup the iterator code, remove code that is never executed.
Approved by: re
MFC after: 1 month.
2011-09-19 21:47:20 +00:00
attilio
420a3a3f8b It is safe to initialize locks even on early boot (and it is the same
thing all the other architectures already do) thus just initialize
kernel_pmap in pmap_bootstrap().

Reported by:	alc
Reviewed by:	alc, marius
Tested by:	flo, marius
Approved by:	re (kib)
MFC after:	1 week
2011-09-19 18:29:15 +00:00
attilio
17bc15383e The #PROCHOT assertion is sticky after reading the MSR (according to the
Intel manuals), so it must be cleared by writing a 0.
Fix that.

Sponsored by:	Sandvine Incorporated
Reported by:	rstone
Reviewed by:	delphij, emaste, rstone
Approved by:	re (kib)
MFC after:	1 week
2011-09-19 10:58:30 +00:00
trasz
18eab55455 Fix error handling bug that would prevent MAC structures from getting
freed properly if resource limit got exceeded.

Approved by:	re (kib)
2011-09-17 20:48:49 +00:00
trasz
885c397429 Fix long-standing thinko regarding maxproc accounting. Basically,
we were accounting the newly created process to its parent instead
of the child itself.  This caused problems later, when the child
changed its credentials - the per-uid, per-jail etc counters were
not properly updated, because the maxproc counter in the child
process was 0.

Approved by:	re (kib)
2011-09-17 19:55:32 +00:00
rstone
71b5744230 Clear transmit checksum offload context state upon lem(4) interface
initialization.  Prior to this change packets may be transmitted with an
incorrect checksum.

Em(4) already has an equivalent change in r213234.

Obtained From:  Sandvine
MFC After:      1 week
Approved by:    re (bz)
2011-09-17 13:48:09 +00:00
tuexen
680b9f90a2 Fix the enabling/disabling of Heartbeats and path MTU
discovery when using the SCTP_PEER_ADDR_PARAMS socket option.
Approved by: re
MFC after: 1 month.
2011-09-17 08:50:29 +00:00
kmacy
8bc0044e86 Auto-generated code from sys_ prefixing makesyscalls.sh change
Approved by:	re(bz)
2011-09-16 14:04:14 +00:00
kmacy
99851f359e In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal to kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by:	rwatson
Approved by:	re (bz)
2011-09-16 13:58:51 +00:00
avg
ff769d30aa zfstest: rename to zfsboottest and move to tools
Approved by:	re (kib)
MFC after:	1 week
2011-09-16 08:22:48 +00:00
ae
ef85f238b0 Add IPv6 support to ng_ipfw(4) [1]. Also add ifdefs to be able to
build it with and without INET/INET6 support.

Submitted by:	Alexander V. Chernikov <melifaro at yandex-team.ru> [1]
Tested by:	Alexander V. Chernikov <melifaro at yandex-team.ru> [1]
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-15 12:28:17 +00:00
tuexen
cc85bd26ed Fix a typo introduced in
http://svn.freebsd.org/changeset/base/225571
Reported by Ilya A. Arkhipov.

Approved by: re
MFC after: 1 month.
2011-09-15 12:20:52 +00:00
kib
fdfe4f8a66 Put amd64_syscall() prototype in md_var.h.
Requested by:	jhb
Reviewed by:	alc, jhb
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-15 09:54:07 +00:00
kib
747c6e1d12 Microoptimize the return path for the fast syscalls on amd64. Arrange
the code to have the fall-through path to follow the likely target.
Do not use intermediate register to reload user %rsp.

Proposed by:	alc
Reviewed by:	alc, jhb
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-15 09:53:04 +00:00
tuexen
15bb2c985f Make sure that SCTP rejects broadcast, multicast and wildcard addresses
as remote addresses.

Approved by: re
MFC after: 1 month.
2011-09-15 08:49:54 +00:00
adrian
f23b1f625d Ensure that ta_pending doesn't overflow u_short by capping its value at USHRT_MAX.
If it overflows before the taskqueue can run, the task will be
re-added to the taskqueue and cause a loop in the task list.

Reported by:	Arnaud Lacombe <lacombar@gmail.com>
Submitted by:	Ryan Stone <rysto32@gmail.com>
Reviewed by:	jhb
Approved by:	re (kib)
MFC after:	1 day
2011-09-15 08:42:06 +00:00
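
The capping described in f23b1f625d amounts to a saturating increment. A minimal sketch, assuming a stand-in structure (`struct task_sketch` and `task_note_pending()` are hypothetical names; only the `ta_pending` field name and the u_short width come from the commit message):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's task structure. */
struct task_sketch {
	unsigned short ta_pending;	/* enqueue requests since last run */
};

/*
 * Record one more pending request.  The counter saturates at
 * USHRT_MAX; without the check it would wrap to 0 and the task
 * would look as if it had never been queued.
 */
static void
task_note_pending(struct task_sketch *t)
{
	assert(t != NULL);
	if (t->ta_pending < USHRT_MAX)
		t->ta_pending++;
}
```

Saturating loses the exact count past USHRT_MAX, which is presumably acceptable here because the important property is that a queued task never appears unqueued.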
tuexen
0e8ff918fb Ensure that 1-to-1 style SCTP sockets can only be connected once.
Allow implicit setup also for 1-to-1 style sockets as described
in the latest version of the socket API ID.

Approved by: re
MFC after: 1 month
2011-09-14 19:10:13 +00:00
hselasky
79dd2afc0b Reduce USB memory usage during enumeration.
We are allocating some kilobytes of extra memory during USB device enumeration.
This does not change a lot under FreeBSD, but makes sense for various embedded
operating systems using the FreeBSD USB stack, which have less memory
resources available.

Approved by:	re (kib)
MFC after:	1 week
2011-09-14 15:16:53 +00:00
tuexen
eab7de0c8f Fix the handling of the flowlabel and DSCP value in the SCTP_PEER_ADDR_PARAMS
socket option.
Honor the net.inet6.ip6.auto_flowlabel sysctl setting.

Approved by: re (bz)
MFC after: 1 month.
2011-09-14 08:15:21 +00:00
rmacklem
99f390a4e8 Modify vfs_register() to use a hash calculation
on vfc_name to set vfc_typenum, so that vfc_typenum doesn't
change when file systems are loaded in different orders. This
keeps NFS file handles from changing, for file systems that
use vfc_typenum in their fsid. This change is controlled via
a loader.conf variable called vfs.typenumhash, since vfc_typenum
will change once when this is enabled. It defaults to 1 for
9.0, but will default to 0 when MFC'd to stable/8.

Tested by:	hrs
Reviewed by:	jhb, pjd (earlier version)
Approved by:	re (kib)
MFC after:	1 month
2011-09-13 21:01:26 +00:00
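
The idea behind 99f390a4e8 is that a number derived from the file system name is the same regardless of load order. The sketch below uses a 32-bit FNV-1 string hash purely as an example. The commit message does not say which hash vfs_register() uses, and the 256-value range, `fnv1_32_str()`, and `typenum_from_name()` are assumptions made for illustration. Collision handling is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TYPENUM_RANGE	256u	/* assumed range, for the sketch only */

/* 32-bit FNV-1 hash of a NUL-terminated string. */
static uint32_t
fnv1_32_str(const char *s)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	assert(s != NULL);
	for (; *s != '\0'; s++) {
		h *= 16777619u;		/* FNV prime */
		h ^= (uint8_t)*s;
	}
	return (h);
}

/*
 * Derive a type number from the name alone, so it does not depend
 * on how many file systems were registered earlier.
 */
static int
typenum_from_name(const char *name)
{
	return ((int)(fnv1_32_str(name) % TYPENUM_RANGE));
}
```

A counter-based assignment would give a file system a different number depending on what was loaded before it. A name-derived number does not, which is what keeps the fsid, and therefore the NFS file handles, stable.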
brueffer
c9926ff95a Improve the sleep_delay sysctl description by specifying which unit
the number is in.

PR:		159975
Submitted by:	gcooper
Approved by:	re (kib)
MFC after:	1 week
2011-09-13 15:57:29 +00:00
davidch
764ac0f2e8 - Fix compiler warning in ADD_64() macro.
Approved by:	re
Obtained from:	dimitry@andic.com
MFC after:	One week
2011-09-13 15:49:28 +00:00
avg
94836c37a8 zfs boot subroutines: correctly specify type of an integer literal
Found by adding more warning flags to zfs boot blocks build.

Approved by:	re (kib)
MFC after:	1 week
2011-09-13 14:07:05 +00:00
avg
85867a4f6c gpt/zfs boot blocks: reduce optimizing CFLAGS to -O1
gpt and zfs boot blocks are not nearly as size-constrained as boot2
from which they inherited their current optimization and anti-optimization
options.  As such the current options do not provide any benefit, but
make debugging of the code much harder.
Also, it has been demonstrated that combination of -mrtd and
-fno-unit-at-a-time may result in mis-compilation of the boot code
with the current base gcc.

Additionally, intermediate assembly file filtering is removed for
zfsboot.

The new boot blocks are all compile- and boot- tested using qemu.
gptzfsboot is tested with real hardware.

Reported by:	Peter Jeremy <peterjeremy@acm.org> [miscompilation]
Discussed with:	bde, jhb
Tested by:	Sebastian Chmielewski <chmielsster@gmail.com> [gptzfsboot]
Approved by:	re (kib)
MFC after:	3 weeks
2011-09-13 14:03:55 +00:00
avg
dab0468c87 zfstest: cleanup the code, improve functionality and diagnostics
The utility is not connected to the build, so it should be safe
to update it.
To do: move the utility to tools/.
Some code is provided by Peter Jeremy <peterjeremy@acm.org>

Tested by:	Sebastian Chmielewski <chmielsster@gmail.com>,
		Peter Jeremy <peterjeremy@acm.org> (earlier versions)
Approved by:	re (kib)
MFC after:	4 days
2011-09-13 14:01:35 +00:00
hrs
08320280c6 Add $ipv6_cpe_wanif to enable functionality required for IPv6 CPE
(r225485).  When setting an interface name to it, the following
configurations will be enabled:

 1. "no_radr" is set to all IPv6 interfaces automatically.

 2. "-no_radr accept_rtadv" will be set only for $ipv6_cpe_wanif.  This is
    done just before evaluating $ifconfig_IF_ipv6 in the rc.d scripts (this
    means you can manually supersede this configuration if necessary).

 3. The node will add RA-sending routers to the default router list
    even if net.inet6.ip6.forwarding=1.

This mode is added to conform to RFC 6204 (a router which connects
the end-user network to a service provider network).  To enable
packet forwarding, you still need to set ipv6_gateway_enable=YES.

Note that accepting router entries into the default router list when
packet forwarding capability and a routing daemon are enabled can
result in messing up the routing table.  To minimize such unexpected
behaviors, "no_radr" is set on all interfaces but $ipv6_cpe_wanif.

Approved by:	re (bz)
2011-09-13 00:06:11 +00:00
jhb
ebd93e5aff Allow the ipfw.ko module built with a kernel to honor any IPFIREWALL_*
options defined in the kernel config.  This more closely matches the
behavior of other modules which inherit configuration settings from the
kernel configuration during a kernel + modules build.

Reviewed by:	luigi
Approved by:	re (kib)
MFC after:	1 week
2011-09-12 21:09:56 +00:00
brueffer
f6a3e0b0cf Connect the vxge(4) module to the i386/amd64 build.
Catcher of stupid errors: kib
Approved by:	re (kib)
2011-09-12 20:57:22 +00:00
attilio
2267bc9430 dump_write() returns ENXIO if the dump would be written outside
of the device boundary.
While this is generally ok, the problem is that all the consumers
handle (and expect to catch) ENOSPC for similar cases (for
reference, look at the minidumpsys() and dumpsys() implementations). As a
result, consumers do not recognize the issue and amd64 fails to
retry if the number of pages grows during minidump.
Fix this by returning ENOSPC in dump_write() and, while here, add some
more diagnostics on the values involved.

Sponsored by:	Sandvine Incorporated
In collaboration with:	emaste
Approved by:	re (kib)
MFC after:	10 days
2011-09-12 20:39:31 +00:00
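
The bounds check behind 2267bc9430 can be sketched as follows. This is an illustration, not the dump_write() source: `struct dump_dev_sketch` and `dump_range_check()` are hypothetical, and the device is assumed to be described by a starting offset and a size in bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the dump area on the device. */
struct dump_dev_sketch {
	uint64_t mediaoffset;	/* first usable byte */
	uint64_t mediasize;	/* usable bytes from mediaoffset */
};

/*
 * Return 0 if [offset, offset + length) lies inside the dump area,
 * ENOSPC otherwise.  ENOSPC is the value callers test for when they
 * decide whether to retry with a different layout.
 */
static int
dump_range_check(const struct dump_dev_sketch *d, uint64_t offset,
    size_t length)
{
	uint64_t rel;

	assert(d != NULL);
	if (length == 0)
		return (0);
	if (offset < d->mediaoffset)
		return (ENOSPC);
	rel = offset - d->mediaoffset;
	/* Written this way to avoid overflow in rel + length. */
	if (rel > d->mediasize || length > d->mediasize - rel)
		return (ENOSPC);
	return (0);
}
```

A caller that only checks for ENOSPC would treat any other error code as fatal, which is the mismatch the commit message describes.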
jhb
c73be1deff Partially revert 222753: If a CardBus card stores its CIS in a BAR, delete
the BAR after parsing the CIS.  This forces the resource range to be
reallocated if the BAR is reused by the device.

Submitted by:	deischen
Reviewed by:	imp
Approved by:	re (kib)
2011-09-12 15:21:52 +00:00
ed
5ccb03a60f Fix error return codes for ioctls on init/lock state devices.
In revision 223722 we introduced support for driver ioctls on init/lock
state devices. Unfortunately the call to ttydevsw_cioctl() clobbers the
value of the error variable, meaning that in many cases ioctl() will now
return ENOTTY, even though the ioctl() was processed properly.

Reported by:	Boris Samorodov <bsam ipt ru>
Patch by:	jilles@
Approved by:	re@ (kib@)
2011-09-12 10:07:21 +00:00
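
The class of bug fixed in 5ccb03a60f, where one call's return value overwrites a result that should have been kept, is commonly avoided with the pattern below. This is a generic sketch and not the tty code: `SKETCH_ENOIOCTL`, `driver_hook()`, `generic_handler()`, and `dev_ioctl_sketch()` are all hypothetical, and the command numbers are arbitrary.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical sentinel meaning "the driver does not handle this". */
#define SKETCH_ENOIOCTL	(-3)

/* Stand-in driver hook: handles command 1 only. */
static int
driver_hook(int cmd)
{
	return (cmd == 1 ? 0 : SKETCH_ENOIOCTL);
}

/* Stand-in generic code: handles command 2 only. */
static int
generic_handler(int cmd)
{
	return (cmd == 2 ? 0 : ENOTTY);
}

/*
 * Ask the driver first.  Its result is final unless it returned the
 * sentinel; only then does the generic handler decide the outcome.
 * Neither result is allowed to overwrite the other afterwards.
 */
static int
dev_ioctl_sketch(int cmd)
{
	int error;

	assert(cmd > 0);	/* commands are positive in this sketch */
	error = driver_hook(cmd);
	if (error != SKETCH_ENOIOCTL)
		return (error);
	return (generic_handler(cmd));
}
```

Returning as soon as a handler has produced a definitive answer makes it harder for a later assignment to the shared error variable to turn a success into ENOTTY.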
avg
3174e1aa42 dsp_ioctl: fix type of variable used to store ioctl request
PR:		kern/156433
Submitted by:	Grigori Goronzy <greg@chown.ath.cx>
Reviewed by:	hselasky
Approved by:	re (kib)
MFC after:	1 week
2011-09-12 08:38:21 +00:00
kib
91ab853ec4 The jump target shall be after the padding, not into it.
Reported by:	alc
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-11 18:00:46 +00:00
brueffer
129f307eb8 Fix a zyd(4) comment typo that was copy+pasted into most kernel config files.
PR:		160276
Submitted by:	MATSUMIYA Ryo <matsumiya@mma.club.uec.ac.jp>
Approved by:	re (kib)
MFC after:	1 week
2011-09-11 17:39:51 +00:00
kib
44839b3d92 Perform amd64-specific microoptimizations for native syscall entry
sequence. The effect is ~1% on the microbenchmark.

In particular, do not restore registers which are preserved by the
C calling sequence. Align the jump target. Avoid unneeded memory
accesses by calculating some data in syscall entry trampoline.

Reviewed by:	jhb
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-11 16:08:10 +00:00
kib
55d0a85118 Inline the syscallenter() and syscallret(). This reduces the time measured
by the syscall entry speed microbenchmarks by ~10% on amd64.

Submitted by:	jhb
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-11 16:05:09 +00:00
adrian
b7f3e89486 Fix the order of parameters passed to the HT frame duration calculation.
Approved by:	re (kib)
2011-09-11 09:43:13 +00:00
hselasky
51e6bf127c Refactor auto-quirk solution so that we break as few external
drivers as possible.

PR:		usb/160299
Approved by:	re (kib)
Suggested by:	rwatson
MFC after:	0 days
2011-09-10 15:55:36 +00:00
tuexen
0d8130b65d Improve implementation of the Nagle algorithm for SCTP:
Don't delay the final fragment of a fragmented user message.

Approved by: re
MFC after: 4 weeks
2011-09-09 13:52:37 +00:00
attilio
5494aebd97 Improve the information reported in case of busy buffers during shutdown:
- Axe the SHOW_BUSYBUFS option and use a tunable to selectively
enable/disable it. It defaults to not printing anything (value 0)
but can be changed to print (value 1) or to be verbose (value
2).
- Improve the information output: until now, the actual struct buf
object or vnode referenced by the shutdown process was not tracked;
instead the related struct bufobj object was printed,
which is not really helpful.
- Add more verbosity about the state of the struct buf lock and the
vnode information, with the latter to be activated separately by the
sysctl.

Sponsored by:	Sandvine Incorporated
Reviewed by:	emaste, kib
Approved by:	re (ksmith)
MFC after:	10 days
2011-09-08 12:56:26 +00:00