Commit Graph

100830 Commits

Author SHA1 Message Date
melifaro
881c9e28bf Simplify filling sockaddr_dl structure for if_resolvemulti()
callback providers. link_init_sdl() function can be used to
fill most of the parameters. Use caller stack instead of
allocating / freeing memory for each request. Do not drop support
for extra-long (probably non-existing) link-layer protocols by
introducing link_alloc_sdl() (used by if_resolvemulti() callback)
and link_free_sdl() (used by caller).
Since this change breaks KBI, MFC requires slightly different approach
(link_init_sdl() auto-allocating buffer if necessary to handle cases
 with unmodified if_resolvemulti() callers).

MFC after:	2 weeks
2014-01-18 23:24:51 +00:00
neel
e74998b34d Some processors don't allow NMI injection if the STI_BLOCKING bit is set in
the Guest Interruptibility-state field. However, there isn't any way to
figure out which processors have this requirement.

So, inject a pending NMI only if NMI_BLOCKING, MOVSS_BLOCKING, STI_BLOCKING
are all clear. If any of these bits are set then enable "NMI window exiting"
and inject the NMI in the VM-exit handler.
2014-01-18 21:47:12 +00:00
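
The injection rule described in e74998b34d can be sketched in C roughly as follows. This is a minimal illustration and not the bhyve code: the function name, the enum, and the bit positions (assumed here to follow the interruptibility-state layout in the Intel SDM, with bit 0 for blocking by STI, bit 1 for blocking by MOV SS, and bit 3 for blocking by NMI) are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed bit positions in the Guest Interruptibility-state field. */
#define STI_BLOCKING    (1u << 0)
#define MOVSS_BLOCKING  (1u << 1)
#define NMI_BLOCKING    (1u << 3)

/* Hypothetical result type for this sketch. */
enum nmi_action {
	NMI_INJECT_NOW,         /* inject the pending NMI on this VM-entry */
	NMI_ENABLE_WINDOW_EXIT  /* enable "NMI window exiting", inject later */
};

/* Inject only when all three blocking bits are clear. */
enum nmi_action
nmi_decide(uint32_t interruptibility)
{
	const uint32_t blocking = NMI_BLOCKING | MOVSS_BLOCKING | STI_BLOCKING;

	if ((interruptibility & blocking) == 0)
		return (NMI_INJECT_NOW);
	return (NMI_ENABLE_WINDOW_EXIT);
}
```

Treating STI blocking conservatively on every processor avoids having to detect which processors actually impose the restriction, which the commit message notes is not possible.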
melifaro
421d2fc5eb Use in6_localip() instead of hand-rolled cycle.
MFC after:	2 weeks
2014-01-18 20:54:55 +00:00
melifaro
4e82296063 Add in6_prepare_ifra() function to ease preparing in-kernel IPv6
address requests.

MFC after:	2 weeks
2014-01-18 20:32:59 +00:00
alc
d98c6ca3a1 Style changes in vm_pageout_scan():
1. Be consistent in the style of "act_delta" manipulations between the
   inactive and active queue scans.

2. Explicitly compare to zero.

3. The deactivation of a page is based on its recent history
   and not just the current call to vm_pageout_scan().  The variable
   "act_delta" represents the current state of the page, and not its
   history.  Avoid possible confusion by not (ab)using "act_delta" for
   making the deactivation decision.

Submitted by:	kib [1]
Reviewed by:	kib [2,3]
2014-01-18 20:02:59 +00:00
melifaro
9f1142ff95 Do some style(9) not done in r260851 to improve readability.
MFC after:	2 weeks
2014-01-18 15:57:43 +00:00
melifaro
9b02dc0fae Split in6_update_ifa() into smaller pieces leaving functionality intact.
Discussed with:	ae
MFC after:	2 weeks
2014-01-18 15:52:52 +00:00
bryanv
31dc7c36ca Add very simple virtio_random(4) driver to harvest entropy from host
Reviewed by:	markm (random bits only)
2014-01-18 06:14:38 +00:00
neel
4dec904890 If the guest exits due to a fault while it is executing IRET then restore
the state of "Virtual NMI blocking" in the guest's interruptibility-state
field before resuming the guest.
2014-01-18 02:20:10 +00:00
delphij
4b064bf9ac MFV r260834:
Fix memory leak of compressed buffers in l2arc_write_done (Illumos
#3995).
2014-01-18 01:45:39 +00:00
mav
7cf333c5cb Add ID for one more ASMedia AHCI-compatible controller.
Reported by:	ignace.peeters@gmail.com
MFC after:	2 weeks
2014-01-17 17:16:49 +00:00
glebius
49f165e9e9 Fix comment. 2014-01-17 11:09:05 +00:00
hselasky
b44a9daf13 Fix a possible memory use after free and leak situation associated
with USB device detach when using character device handles. This also
includes LibUSB. It turns out that "usb_close()" cannot always get a
reference to clean up its USB transfers and such, if called during the
kernel USB device detach.

Analysis by:	hselasky @
Reported by:	Juergen Lock <nox@jelal.kn-bremen.de>
MFC after:	1 week
2014-01-17 10:35:18 +00:00
avg
6b143ee35a traverse_visitbp: visit DMU_GROUPUSED_OBJECT before DMU_USERUSED_OBJECT
This is done to ensure that visited object IDs are always increasing.
Also, pass correct object ID to prefetch_dnode_metadata for
os_groupused_dnode.

Without this change we would hit an assert if traversal was paused on
a GROUPUSED object, which is unlikely but possible.

Apparently the same change was independently developed by Delphix.

Reviewed by:	Matthew Ahrens <mahrens@delphix.com>
MFC after:	10 days
Sponsored by:	HybridCluster
2014-01-17 10:23:46 +00:00
hselasky
28503bcf6e Close a minor deadlock.
MFC after:	1 week
2014-01-17 08:21:09 +00:00
adrian
c46f73c7ae Implement a kqueue notification path for sendfile.
This fires off a kqueue note (of type sendfile) to the configured kqfd
when the sendfile transaction has completed and the relevant memory
backing the transaction is no longer in use by this transaction.
This is analogous to SF_SYNC waiting for the mbufs to complete -
except now you don't have to wait.

Both SF_SYNC and SF_KQUEUE should work together, even if it
doesn't necessarily make any practical sense.

This is designed for use by applications which use backing cache/store
files (eg Varnish) or POSIX shared memory (not sure anything is using
it yet!) to know when a region of memory is free for re-use.  Note
it doesn't mark the region as free overall - only free from this
transaction.  The application developer still needs to track which
ranges are in the process of being recycled and wait until all
pending transactions are completed.

TODO:

* documentation, as always

Sponsored by:	Netflix, Inc.
2014-01-17 05:26:55 +00:00
adrian
9d7a5637e2 Add in a default initialiser for the EVOPS_SENDFILE kqueue filterops.
Sponsored by:	Netflix, Inc.
2014-01-17 05:15:44 +00:00
adrian
103d7982d2 Implement the extension api for sendfile to allow for kqueue notifications.
This is still under a bit of flux, as the final API hasn't been nailed
down.  It's also unclear whether we should define the two new types in the
header or not - it may allow bad code to compile that shouldn't (ie,
since uintX's are defined, the developer may not include sys/types.h.)

Reviewed by:	peter, imp, bde
Sponsored by:	Netflix, Inc.
2014-01-17 05:13:08 +00:00
luigi
ba56cd1e18 forgot to update this file in 2607000 2014-01-17 04:38:58 +00:00
neel
77ef1cf997 If a VM-exit happens during an NMI injection then clear the "NMI Blocking" bit
in the Guest Interruptibility-state VMCS field.

If we fail to do this then a subsequent VM-entry will fail because it is an
error to inject an NMI into the guest while "NMI Blocking" is turned on. This
is described in "Checks on Guest Non-Register State" in the Intel SDM.

Submitted by:	David Reed (david.reed@tidalscale.com)
2014-01-17 04:21:39 +00:00
gnn
61b1174206 Fix various places where we don't properly release a lock
PR:		185043
Submitted by:	Michael Bentkofsky
MFC after:	2 weeks
2014-01-16 22:14:54 +00:00
imp
14cf77d37c Remove two redundantly repetitive assignments. 2014-01-16 20:40:02 +00:00
ray
0d4d14c84b Fix build after FDT changes.
Sponsored by:	The FreeBSD Foundation
2014-01-16 14:48:23 +00:00
glebius
63cf6debb3 Simplify wait/nowait code, eventually killing last remnant of
historical mbuf(9) allocator flag.

Sponsored by:	Nginx, Inc.
2014-01-16 13:45:41 +00:00
glebius
529a68392a Another round of removing historical mbuf(9) allocator flags.
They are breeding! New ones arose since last round.

Sponsored by:	Nginx, Inc.
2014-01-16 13:44:47 +00:00
avg
d107399017 fix a build problem with INVARIANTS enabled introduced in r260704
Reported by:	glebius
MFC after:	5 days
X-MFC with:	r260704
2014-01-16 13:44:37 +00:00
glebius
8efec8e7d2 Remove historical macro.
Sponsored by:	Nginx, Inc.
2014-01-16 13:42:50 +00:00
glebius
750ebc2942 Substitute flags from historical mbuf(9) allocator with modern ones.
Sponsored by:	Nginx, Inc.
2014-01-16 13:42:14 +00:00
avg
e186f564bc fix a bug in ZFS mirror code for handling multiple DVAs
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt.  Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds.  But because retrying is going on indefinitely
the checksum reports accumulation will effectively be a memory leak.

Reviewed by:	gibbs
MFC after:	13 days
Sponsored by:	HybridCluster
2014-01-16 13:24:10 +00:00
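
The selection bug described in e186f564bc can be illustrated with a small C sketch. It is not the ZFS code: the struct, the field names, and the "lowest load wins" criterion are hypothetical stand-ins. The point it shows is that candidates which were already tried must be skipped, so that the caller eventually runs out of candidates instead of retrying a failing one forever.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-child state; not the ZFS mirror child structure. */
struct child {
	bool tried;   /* a read was already attempted on this child */
	int  load;    /* assumed selection metric, lower is better */
};

/*
 * Return the index of the least-loaded child that has not been tried,
 * or -1 when every child has been tried so the caller can give up.
 */
int
select_untried_child(const struct child *c, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (c[i].tried)
			continue;   /* never reconsider a tried child */
		if (best == -1 || c[i].load < c[best].load)
			best = (int)i;
	}
	return (best);
}
```

Because the function returns -1 once all children are marked as tried, a retry loop built on it terminates, and any per-attempt error reports stop accumulating.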
avg
d1329f5a22 Revert r260705: wrong patch committed by accident
An earlier, less efficient version was committed by accident.
2014-01-16 13:20:20 +00:00
glebius
abd19c8039 Cleanup comments and whitespace. No functional changes. 2014-01-16 12:58:03 +00:00
melifaro
0092f95bcc Fix refcount leak on netinet ifa.
Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2014-01-16 12:35:18 +00:00
avg
113f9a4f53 zfs_deleteextattr: name buffer from namei is needed by zfs_rename
If we prematurely free the name buffer and it gets quickly recycled,
then zfs_rename may see data from another lookup or even unmapped memory
via cn_nameptr.

MFC after:	6 days
Sponsored by:	HybridCluster
2014-01-16 12:31:27 +00:00
avg
31b7f68d80 fix a bug in ZFS mirror code for handling multiple DVAs
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt.  Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds.  But because retrying is going on indefinitely
the checksum reports accumulation will effectively be a memory leak.

Reviewed by:	gibbs
MFC after:	13 days
Sponsored by:	HybridCluster
2014-01-16 12:26:54 +00:00
avg
97986ccb0b zfs: getnewvnode_reserve must be called outside of a zfs transaction
Otherwise we could run into the following deadlock.
A thread has a transaction open and assigned to a transaction group.
That would prevent the transaction group from being quiesced and synced.
The thread is blocked in getnewvnode_reserve waiting for a vnode to
be reclaimed.  vnlru thread is blocked trying to enter ZFS VOP because
a filesystem is suspended by an ongoing rollback or receive operation.
In its turn the operation is waiting for the current transaction group
to be synced.

zfs_zget is always used outside of active transactions, but zfs_mknode
is always used in a transaction context.  Thus, we hoist
getnewvnode_reserve from zfs_mknode to its callers.

While there, assert that ZFS always calls getnewvnode while having
a vnode reserved.

Reported by:	adrian
Tested by:	adrian
MFC after:	17 days
Sponsored by:	HybridCluster
2014-01-16 12:22:46 +00:00
melifaro
479797d59e Fix ipfw fwd for IPv4 traffic broken by r249894.
Problem case:
Original lookup returns route with GW set, so gw points to
rte->rt_gateway.
After that we're changing dst and performing lookup another time.
Since fwd host is most probably directly reachable, resulting
rte does not contain rt_gateway, so gw is not set. Finally, we
end with packet transmitted to proper interface but wrong
link-layer address.

Found by:	lstewart
Discussed with:	ae,lstewart
MFC after:	2 weeks
Sponsored by:	Yandex LLC
2014-01-16 11:50:00 +00:00
luigi
f11710f126 netmap_user.h:
   add separate rx/tx ring indexes
   add ring specifier in nm_open device name

netmap.c, netmap_vale.c
   more consistent errno numbers

netmap_generic.c
   correctly handle failure in registering interfaces.

tools/tools/netmap/
   massive cleanup of the example programs
   (a lot of common code is now in netmap_user.h.)

nm_util.[ch] are going away soon.
pcap.c will also go when i commit the native netmap support for libpcap.
2014-01-16 00:20:42 +00:00
imp
868757e5ac Add data so we can convert a PIO unit number into a base address. 2014-01-15 19:53:36 +00:00
imp
74fda9ff76 Provide a simplified way to specify GPIO pins for the Atmel port. 2014-01-15 19:49:12 +00:00
ray
10e896d479 Update xboxfb driver to current state.
NOTE: Not tested.

Sponsored by:	The FreeBSD Foundation
2014-01-15 12:35:28 +00:00
marcel
4d43158298 In the nested TLB fault handler, for a direct-mapped address, make
sure to clear the lower 12 bits. We're adding the translation
attributes to the physical address and non-zero bits in the first
12 bits would give us something unexpected, including invalid bit
values. Those trigger nested general protection faults.
We do not have to clear the region bits, because they are ignored
anyway, so we can replace an existing dep instruction with the one
we need.

This fixes GP faults for the swapper thread, as it's the only thread
that has a direct-mapped stack. Since the bug is in the nested TLB
fault handler, the frequency of hitting the GP is in the order of
hours/days under load.
2014-01-15 03:57:41 +00:00
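
The masking step described in 4d43158298 is done with a single ia64 dep instruction in the real handler. The C sketch below only illustrates the arithmetic; the function and macro names are hypothetical, and the attribute value used in the usage note is arbitrary.

```c
#include <stdint.h>

/* The lower 12 bits of the address, which must be cleared. */
#define LOW12_MASK  UINT64_C(0xfff)

/*
 * Combine a physical address with translation attribute bits.
 * Clearing the low 12 bits first keeps stray address bits from
 * being merged into the attribute field.
 */
uint64_t
make_translation(uint64_t pa, uint64_t attrs)
{
	return ((pa & ~LOW12_MASK) | attrs);
}
```

For example, make_translation(0x12345678, 0x661) yields 0x12345661, whereas OR-ing the attributes into the unmasked address would yield 0x12345679, an unintended attribute value.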
mav
f63cb2f402 Fix lock leak in purely hypothetical case of TCP connection without SVC_ACK
method.  This change should be NOP now, but it is better to be future safe.

Reported by:	rmacklem
2014-01-14 20:18:38 +00:00
hselasky
502882e081 Don't output any modifier keys before we see a valid
non-modifier key press. This prevents so-called "ghost
keyboards" keeping modifier keys pressed while not
actually seen as a real keyboard.

MFC after:	2 weeks
2014-01-14 08:43:38 +00:00
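
The filtering idea in 502882e081 can be sketched as a small state machine in C. This is an illustration under assumed names, not the USB keyboard driver code: the struct, the flag, and the function signature are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-keyboard state for this sketch. */
struct kbd_state {
	bool seen_real_key;   /* set once a valid non-modifier press arrives */
};

/*
 * Decide whether a key event should be passed on.  Modifier events
 * are suppressed until the first valid non-modifier key press has
 * been seen on this device.
 */
bool
kbd_should_output(struct kbd_state *st, bool is_modifier, bool is_press)
{
	if (!is_modifier && is_press)
		st->seen_real_key = true;
	return (st->seen_real_key || !is_modifier);
}
```

A device that only ever reports modifier bits never sets the flag, so its stuck modifiers are never forwarded.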
neel
0bd53a85fb Add an API to rendezvous all active vcpus in a virtual machine. The rendezvous
can be initiated in the context of a vcpu thread or from the bhyve(8) control
process.

The first use of this functionality is to update the vlapic trigger-mode
register when the IOAPIC pin configuration is changed.

Prior to this change we would update the TMR in the virtual-APIC page at
the time of interrupt delivery. But this doesn't work with Posted Interrupts
because there is no way to program the EOI_exit_bitmap[] in the VMCS of
the target at the time of interrupt delivery.

Discussed with:	grehan@
2014-01-14 01:55:58 +00:00
andreast
32f32eb6de Described in the man page but not implemented. Here it comes,
atomic_swap_32/64. The latter only for powerpc64.

MFC after:	1 month
2014-01-13 22:21:29 +00:00
andreast
f7da21fb45 The onyx codec works also as module, so add it.
MFC after:	1 month
2014-01-13 21:44:17 +00:00
hselasky
c4e62298f9 Implement better error recovery for Transaction Translators, TTs,
found in High Speed USB HUBs which translate from High Speed USB into
FULL or LOW speed USB. In some rare cases SPLIT transactions might get
lost, which might leave the TT in an unknown state. Whenever we detect
such an error try to issue either a clear TT buffer request, or if
that is not possible reset the whole TT.

MFC after:	1 week
2014-01-13 15:21:11 +00:00
hselasky
94e0872859 Separate I/O errors from reception of STALL PID.
MFC after:	1 week
2014-01-13 15:06:03 +00:00
bryanv
794929e7d2 Add unmapped IO support to virtio_scsi(4) 2014-01-13 04:46:48 +00:00
bryanv
841c25608b Add unmapped IO support to virtio_blk(4) 2014-01-13 04:43:01 +00:00
bryanv
c35dbac21a Add sglist_append_bio(9) to append a struct bio's data to a sglist
Reviewed by:	jhb, kib (long ago)
2014-01-13 04:41:08 +00:00
alc
ed1e11749f Correctly update the count of stuck pages, "addl_page_shortage", in
vm_pageout_scan().  There were missing increments in two less common cases.

Don't conflate the count of stuck pages and the pageout deficit provided by
vm_page_alloc{,_contig}().  (A proposed fix to the OOM code depends on this.)

Handle held pages consistently in the inactive queue scan.  In the more
common case, we did not move the page to the tail of the queue.  Whereas, in
the less common case, we did.  There's no particular reason to move the page
in the less common case, so remove it.

Perform the calculation of the page shortage for the active queue scan a
little earlier, before the active queue lock is acquired.  The correctness
of this calculation doesn't depend on the active queue lock being held.

Eliminate a redundant variable, "pcount".  Use the more descriptive
variable, "maxscan", in its place.

Apply a few nearby style fixes, e.g., eliminate stray whitespace and excess
parentheses.

Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2014-01-12 19:04:20 +00:00
bryanv
5710e98625 Remove incorrect bit shift when assigning the LUN request field
This caused duplicate targets appearing on Google Compute Engine
instances.

PR:		kern/185626
Submitted by:	Venkatesh Srinivas <venkateshs@google.com>
MFC after:	3 days
2014-01-12 17:40:47 +00:00
hselasky
d9c8f41d9c Make sure reserved fields of the EHCI DMA descriptors are not dirty
after previous transfers.

MFC after:	1 week
2014-01-12 13:16:25 +00:00
hselasky
c2567bf636 Don't do synchronous USB requests inside USB transfer callbacks. It is
technically OK, but not recommended.

MFC after:	1 week
2014-01-12 11:44:28 +00:00
gavin
6265cc52b1 Remove spaces from boot messages when we print the CPU ID/Family/Stepping
to match the rest of the CPU identification lines, and once again fit
into 80 columns in the usual case.
2014-01-11 22:41:10 +00:00
gavin
f586c22ed6 Add firmware for Intel Centrino Wireless-N 105 devices.
Committed from:	Centrino 105 device
2014-01-11 18:56:48 +00:00
melifaro
104ab6ec12 Revert r260548. We really should not use IPFW_WLOCK() here
but this requires some more playing with IPFW_UH_WLOCK(). Leave till later.
2014-01-11 18:27:34 +00:00
mav
7aa4414d31 Move xpt_run_devq() call before request completion callback where it was
originally.

I am not sure why exactly I moved it during one of the many refactorings
during the camlock project, but it obviously opens a race window that may cause
use after free panics during SIM (in reported cases umass(4)) detach.

MFC after:	2 weeks
2014-01-11 16:52:09 +00:00
melifaro
9f930faa0d We don't need chain write lock since we're not modifying its contents.
LibAliasSetAddress() uses its own mutex to serialize changes.

While here, convert ifp->if_xname access to if_name() function.

MFC after:	2 weeks
Sponsored by:	Yandex LLC
2014-01-11 16:50:41 +00:00
mav
ccc6bb570f Fix for r260541: do not drop periph reference when request is restarted.
CAM_DEV_QFREEZE flag is still there and it will freeze device again.
2014-01-11 16:37:20 +00:00
pfg
e3a716491d ext2fs: fix inode flag conversion.
After r252890 we are naively attempting to pass through the
inode flags.  This is technically incorrect as the ext2
inode flags don't match the UFS/system values used in
FreeBSD and a clean conversion is needed.

Some filtering was left in place so the change didn't cause
significant changes in FreeBSD but some of the garbage passed
is likely to be the cause for warning messages in Linux.

Fix the issue by resetting the flags before conversion as was
done previously. This also means we will not pass the EXT4_*
inode flags into FreeBSD's inode.

PR:		kern/185448
MFC after:	3 days
2014-01-11 15:19:04 +00:00
kevlo
8a56b3adbe Fix a logic error when checking if Tx power entries are greater than 31. 2014-01-11 14:48:16 +00:00
mav
d2c6e514d8 Take additional reference on SCSI probe periph to cover its freeze count.
Otherwise periph may be invalidated and freed before single-stepping freeze
is dropped, causing use after free panic.
2014-01-11 13:35:36 +00:00
hselasky
9b1e1eed73 Optimise interrupt logic. Technically writing a zero to the XHCI USB
status register has no effect. Can happen when the interrupt vector is
shared.

MFC after:	1 week
2014-01-11 08:16:31 +00:00
hselasky
6f6c390b96 Force clearing of event ring interrupts. The "Intel Lynx Point" XHCI
controller found in the MBP2013 has been observed to not work properly
unless this operation is performed.

MFC after:	1 week
Tested by:	Huang Wen Hui <huanghwh@gmail.com>
2014-01-11 08:10:01 +00:00
hselasky
feae7ea0b2 Move USB ID from u3g driver to uhso driver.
Submitted by:	Lundberg, Johannes <johannes@brilliantservice.co.jp>
MFC after:	1 week
2014-01-11 07:53:03 +00:00
jhibbits
6965758ceb Save and restore the GPIOs on the macio for suspend/resume. 2014-01-11 06:35:29 +00:00
neel
ec09639132 Enable "Posted Interrupt Processing" if supported by the CPU. This lets us
inject interrupts into the guest without causing a VM-exit.

This feature can be disabled by setting the tunable "hw.vmm.vmx.use_apic_pir"
to "0".

The following sysctls provide information about this feature:
- hw.vmm.vmx.posted_interrupts (0 if disabled, 1 if enabled)
- hw.vmm.vmx.posted_interrupt_vector (vector number used for vcpu notification)

Tested on an Intel Xeon E5-2620v2 courtesy of Allan Jude at ScaleEngine.
2014-01-11 04:22:00 +00:00
neel
41814122f1 Enable the "Acknowledge Interrupt on VM exit" VM-exit control.
This control is needed to enable "Posted Interrupts" and is present in all
the Intel VT-x implementations supported by bhyve so enable it as the default.

With this VM-exit control enabled the processor will acknowledge the APIC and
store the vector number in the "VM-Exit Interruption Information" field. We
now call the interrupt handler "by hand" through the IDT entry associated
with the vector.
2014-01-11 03:14:05 +00:00
luigi
651494a5f1 use explicit casts with void* to compile when included by C++ code 2014-01-11 00:00:11 +00:00
loos
74ff5d934c Build the geom_uncompress(4) module by default.
Fix geom_uncompress(4) module loading.  Don't link zlib.c (which is a module
itself) directly.

The built module was verified and used to read a few mkulzma(8) images on
amd64 to validate some of the information on the manual page.

While here, don't overwrite CFLAGS.

Reviewed by:	ray
Approved by:	adrian (mentor)
2014-01-10 20:29:46 +00:00
mav
840d33804e Remove inapplicable PI_SDTR_ABLE and PI_WIDE_16 hba_inquiry flags so
that CAM does not try to negotiate unsupported settings, and to suppress warnings.

While there, enable command queuing on pass-through devices, announced
in hba_inquiry, but disabled.  Even though the queue size is very small, it
seems to work well enough.

Reviewed by:	scottl
MFC after:	2 weeks
2014-01-10 19:21:46 +00:00
luigi
fc0f950d52 Fix netmap emulation when NICs attached to a VALE switch have a different
number of tx and rx rings

Submitted by:	Vincenzo Maffione
2014-01-10 16:01:44 +00:00
luigi
634487fef4 sync with our internal repo - small change in debugging messages 2014-01-10 16:00:27 +00:00
kevlo
9dec29fa6d Use m_get2() instead of m_getcl().
Spotted by:	glebius
2014-01-10 14:47:20 +00:00
ae
308e5129f6 Mechanically replace direct access to if_xname with the if_name() macro. 2014-01-10 12:33:28 +00:00
mav
86afc7b769 Replace several instances of -1 with appropriate CAM_*_WILDCARD and types.
It was equal before r259397, but for good or bad, not any more for LUNs.

This change fixes at least CAM debugging.
2014-01-10 12:18:05 +00:00
melifaro
cd97f8bba8 Simplify inet alias handling code: if we're adding/removing alias which
has the same prefix as some other alias on the same interface, use
newly-added rt_addrmsg() instead of hand-rolled in_addralias_rtmsg().

This eliminates the following rtsock messages:

Pinned RTM_ADD for prefix (for alias addition).
Pinned RTM_DELETE for prefix (for alias withdrawal).

Example (got 10.0.0.1/24 on vlan4, playing with 10.0.0.2/24):

before commit, addition:

  got message of size 116 on Fri Jan 10 14:13:15 2014
  RTM_NEWADDR: address being added to iface: len 116, metric 0, flags:
  sockaddrs: <NETMASK,IFP,IFA,BRD>
   255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255

  got message of size 192 on Fri Jan 10 14:13:15 2014
  RTM_ADD: Add Route: len 192, pid: 0, seq 0, errno 0, flags:<UP,PINNED>
  locks:  inits:
  sockaddrs: <DST,GATEWAY,NETMASK>
   10.0.0.0 10.0.0.2 (255) ffff ffff ff

after commit, addition:

  got message of size 116 on Fri Jan 10 13:56:26 2014
  RTM_NEWADDR: address being added to iface: len 116, metric 0, flags:
  sockaddrs: <NETMASK,IFP,IFA,BRD>
   255.255.255.0 vlan4:8.0.27.c5.29.d4 14.0.0.2 14.0.0.255

before commit, withdrawal:

  got message of size 192 on Fri Jan 10 13:58:59 2014
  RTM_DELETE: Delete Route: len 192, pid: 0, seq 0, errno 0, flags:<UP,PINNED>
  locks:  inits:
  sockaddrs: <DST,GATEWAY,NETMASK>
   10.0.0.0 10.0.0.2 (255) ffff ffff ff

  got message of size 116 on Fri Jan 10 13:58:59 2014
  RTM_DELADDR: address being removed from iface: len 116, metric 0, flags:
  sockaddrs: <NETMASK,IFP,IFA,BRD>
   255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255

after commit, withdrawal:

  got message of size 116 on Fri Jan 10 14:14:11 2014
  RTM_DELADDR: address being removed from iface: len 116, metric 0, flags:
  sockaddrs: <NETMASK,IFP,IFA,BRD>
   255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255

Sending both RTM_ADD/RTM_DELETE messages to rtsock is completely wrong
(and requires some hacks to keep prefix in route table on RTM_DELETE).

I've tested this change with quagga (no change) and bird (*).

bird alias handling is already broken in *BSD sysdep code, so nothing
changes here, too.

I'm going to MFC this change if there are no complaints about the behavior
change.

While here, fix some style(9) bugs introduced by r260488
(pointed out by glebius and bde).

Sponsored by:	Yandex LLC
MFC after:	4 weeks
2014-01-10 12:13:55 +00:00
kevlo
4b3cba7a66 Use m_getcl() instead of MGETHDR/MCLGET macros.
Suggested by:	glebius
2014-01-10 02:47:20 +00:00
jmg
016d28765a revert part of r260485 which changes how part of the header gets
included..  netstat uses -DKERNEL=1 to get these parts and breaks the
build w/o it...

melifaro@ says that ae@ is probably asleep, and the PR doesn't have
this part of the patch...  Probably a local change got in by accident..

PR:		185148
Pointy hat to:	ae@
2014-01-09 22:41:18 +00:00
dim
b0a91599c9 Fix a braino with r259730: we cannot currently use CFLAGS.gcc or
CFLAGS.clang in sys/conf/Makefile.arm, since the main kernel build does
not use <bsd.sys.mk>.  So revert that particular change for now.

Pointy hat to:	me
Noticed by:	zbb
MFC after:	3 days
X-MFC-With:	r259730
2014-01-09 22:16:30 +00:00
ian
0e1ecc8c57 Add a prototype for the new arm_devmap_print_table(). This should have
been part of r260490.
2014-01-09 20:57:19 +00:00
ian
cf9be8affd Add a function to print the contents of the static device mapping table,
and invoke it for bootverbose logging, and also from a new DDB command,
"show devmap".  Also tweak the format string for the bootverbose output
of physical memory chunks to get the leading zeros in the hex values.
2014-01-09 18:51:57 +00:00
melifaro
dfba7fd9ef Split rt_newaddrmsg_fib() into two different functions.
Adding/deleting interface addresses involves access to 3 different subsystems,
in different parts of code. Each call can fail, so reporting successful
operation by rtsock in the middle of the process is error-prone.

Further split routing notification API and actual rtsock calls via creating
public-available rt_addrmsg() / rt_routemsg() functions with "private"
rtsock_* backend.

MFC after:	2 weeks
2014-01-09 18:13:25 +00:00
ae
15b36ec523 Remove extra nesting from X_ip6_mforward() function.
Also remove disabled definitions from ip6_mroute.h.

PR:		185148
Sponsored by:	Yandex LLC
2014-01-09 15:38:28 +00:00
adrian
b66c9c2304 Be much more specific (and correct) about the device id matching.
These device IDs have an AR3012 bluetooth device that shows up with
bcdDevice=1 when it doesn't have the firmware loaded, and bcdDevice=2
when it's ready to speak full HCI.

Tested:

* AR5B225 PCIe - AR9485 + AR3012
2014-01-09 15:31:44 +00:00
ae
1e65346e1d Add MRT6_DLOG() macro for debugging.
Reduce number of MRT6DEBUG ifdefs and fix some broken format strings.

MFC after:	1 week
Sponsored by:	Yandex LLC
2014-01-09 14:58:06 +00:00
neel
00a86f71de Don't expose 'vmm_ipinum' as a global. 2014-01-09 03:25:54 +00:00
kevlo
1fd048d35f Replace deprecated M_DONTWAIT with M_NOWAIT. 2014-01-09 01:48:33 +00:00
glebius
1cebfc36ae Fix build with VIMAGE. 2014-01-09 00:59:03 +00:00
adrian
19f7055283 Refactor out the common sendfile code from the do_sendfile() and the
compat32 sendfile syscall.

Sponsored by:	Netflix, Inc.
2014-01-09 00:11:14 +00:00
melifaro
6e726b4922 Consistently use RT_ALL_FIBS everywhere instead of -1.
MFC after:	2 weeks
2014-01-08 23:09:02 +00:00
peter
3244f6064b Don't expose svc_loss_reg / _unreg to userland as they're kernel-only
additions from r260229 and the SVCPOOL type doesn't exist in userland.
2014-01-08 22:37:18 +00:00
melifaro
db2be6a793 Introduce IN6_MASK_ADDR() macro to unify various hand-rolled code
to do IPv6 addr & mask in different places.

MFC after:	2 weeks
2014-01-08 22:13:32 +00:00
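
The operation that db2be6a793 unifies, an IPv6 address ANDed with a prefix mask, can be sketched in portable C. This is an illustration only and not the kernel macro: the struct and function names are hypothetical, and a plain 16-byte array stands in for the real address type.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 128-bit IPv6 address; hypothetical type for this sketch. */
struct addr6 {
	uint8_t b[16];
};

/* Mask the address in place: a = a & m, byte by byte. */
void
addr6_mask(struct addr6 *a, const struct addr6 *m)
{
	for (size_t i = 0; i < sizeof(a->b); i++)
		a->b[i] &= m->b[i];
}
```

With a /64 mask, the first eight bytes of the address are preserved and the last eight become zero, which leaves the network prefix.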
jhb
35bc581adc The changes in r233781 attempted to make logging during a machine check
exception more readable.  In practice they prevented all logging during
a machine check exception on at least some systems.  Specifically, when
an uncorrected ECC error is detected in a DIMM on a Nehalem/Westmere
class machine, all CPUs receive a machine check exception, but only
CPUs on the same package as the memory controller for the erroring DIMM
log an error.  The CPUs on the other package would complete the scan of
their machine check banks and panic before the first set of CPUs could
log an error.  The end result was a clearer display during the panic
(no interleaved messages), but a crashdump without any useful info about
the error that occurred.

To handle this case, make all CPUs spin in the machine check handler
once they have completed their scan of their machine check banks until
at least one machine check error is logged.  I tried using a DELAY()
instead so that the CPUs would not potentially hang forever, but that
was not reliable in testing.

While here, don't clear MCIP from MSR_MCG_STATUS before invoking panic.
Only clear it if the machine check handler does not panic and returns
to the interrupted thread.
2014-01-08 21:04:12 +00:00
ray
c23c0c9f00 Restore VGA mode on vt switch. It fixes the VESA mode left by Xorg on exit.
Sponsored by:	The FreeBSD Foundation
2014-01-08 14:42:26 +00:00
rmh
cf5ac2bc67 Fix build of vt_xboxfb. 2014-01-08 14:36:35 +00:00
gavin
9632094f87 Add support for the Intel Centrino Wireless-N 135 chipset.
MFC after:	2 weeks
2014-01-08 13:59:33 +00:00
ganbold
dcd59d86f3 Update dts files of Cubieboard1,2 to use 1GB memory.
Whilst there, fix cpu config register address for Cubieboard2.

Approved by: stas (mentor)
2014-01-08 09:33:16 +00:00
kevlo
2d30c961db Rename definition of IEEE80211_FC1_WEP to IEEE80211_FC1_PROTECTED.
The origin of WEP comes from IEEE Std 802.11-1997 where it defines
whether the frame body of MAC frame has been encrypted using WEP
algorithm or not.
IEEE Std. 802.11-2007 changes WEP to Protected Frame, indicates
whether the frame is protected by a cryptographic encapsulation
algorithm.

Reviewed by:	adrian, rpaulo
2014-01-08 08:06:56 +00:00
ian
bb4969cec5 Add option USB_HOST_ALIGN to configs that contain 'device usb'. Setting
this to the cache line size is required to avoid data corruption on armv4
and armv5, and improves performance on armv6, in both cases by avoiding
partial cacheline flushes for USB IO.
2014-01-08 03:42:09 +00:00
ian
8424dea313 Add option USB_HOST_ALIGN to configs that contain 'device usb'. Setting
this to the cache line size is required to avoid data corruption on armv4
and armv5, and improves performance on armv6, in both cases by avoiding
partial cacheline flushes for USB IO.

All these configs already exist in 10-stable.  A few that don't (and
thus can't be MFC'd yet) will be committed separately.
2014-01-08 03:40:18 +00:00
yongari
d98502cf61 m_defrag(9) does not touch original mbuf chain when it can't
allocate new mbuf.  Free original mbuf chain when driver is not
able to send the packet.
2014-01-08 01:06:32 +00:00
edavis
74e1a8fb08 defragment mbuf chains longer than hw segment limit before dropping
Approved by:	davidch
2014-01-07 22:26:20 +00:00
luigi
07f442b39d fix use after free when releasing a netmap adapter.
Submitted by:	Giuseppe Lettieri
2014-01-07 21:14:28 +00:00
neel
ab2de99290 Use the 'Virtual Interrupt Delivery' feature of Intel VT-x if supported by
hardware. It is possible to turn this feature off and fall back to software
emulation of the APIC by setting the tunable hw.vmm.vmx.use_apic_vid to 0.

We now start handling two new types of VM-exits:

APIC-access: This is a fault-like VM-exit and is triggered when the APIC
register access is not accelerated (e.g. apic timer CCR). In response to
this we emulate the instruction that triggered the APIC-access exit.

APIC-write: This is a trap-like VM-exit which does not require any instruction
emulation but it does require the hypervisor to emulate the access to the
specified register (e.g. icrlo register).

Introduce 'vlapic_ops' which are function pointers to vector the various
vlapic operations into processor-dependent code. The 'Virtual Interrupt
Delivery' feature installs 'ops' for setting the IRR bits in the virtual
APIC page and to return whether any interrupts are pending for this vcpu.

Tested on an "Intel Xeon E5-2620 v2" courtesy of Allan Jude at ScaleEngine.
2014-01-07 21:04:49 +00:00
adrian
57b2f48ff1 Reserve an event type for the upcoming EVENT_SENDFILE and
extend the event struct pointer union to allow for 'other' types.

Sponsored by:	Netflix, Inc.
2014-01-07 20:24:25 +00:00
mav
2fd0db3dfd Allow delete_method sysctl to be set to "DISABLE".
2014-01-07 20:12:10 +00:00
scottl
207475f6fd Remove aicasm as a build dependency. It made sense when the ahc and ahd
drivers and their firmware were under active development, but those days
have passed.  The firmware now exists in pre-compiled form, no longer
dependent on its sources or on aicasm.  If you wish to rebuild the
firmware from source, the glue still exists under the 'make firmware'
target in sys/modules/aic7xxx.

This also fixes the problem introduced with r257777 et al with building
kernels the old fashioned way in sys/$arch/compile/$CONFIG when the
ahc/ahd drivers were included.
2014-01-07 19:33:17 +00:00
melifaro
58f7b15da9 Remove dead code.
Reported by:	Coverity
Coverity CID:	1018057
MFC after:	2 weeks
2014-01-07 19:00:40 +00:00
neel
23ea3a1c59 Fix a bug introduced in r260167 related to VM-exit tracing.
Keep a copy of the 'rip' and the 'exit_reason' and use that when calling
vmx_exit_trace(). This is because both the 'rip' and 'exit_reason' can
be changed by 'vmx_exit_process()' and can lead to very misleading traces.
2014-01-07 18:53:14 +00:00
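
The pattern this fix describes, snapshotting values before a callee can rewrite them, can be sketched as follows. The structure and function names are simplified, hypothetical stand-ins and do not reproduce the real vmm code; `exit_process()` changes both fields only to model the behaviour the message mentions.

```c
#include <stdint.h>

/* Hypothetical, simplified exit record; not the real bhyve structure. */
struct vm_exit {
	uint64_t rip;
	uint32_t exit_reason;
};

/* Last values handed to the (hypothetical) trace hook. */
static uint64_t traced_rip;
static uint32_t traced_reason;

static void
exit_trace(uint64_t rip, uint32_t reason)
{
	traced_rip = rip;
	traced_reason = reason;
}

/* Stand-in for a handler that rewrites both fields while processing. */
static int
exit_process(struct vm_exit *vme)
{
	vme->rip += 2;
	vme->exit_reason = 0;
	return (1);
}

static int
handle_exit(struct vm_exit *vme)
{
	/* Snapshot the values before the handler can change them. */
	uint64_t rip = vme->rip;
	uint32_t reason = vme->exit_reason;
	int handled = exit_process(vme);

	exit_trace(rip, reason);
	return (handled);
}
```

Passing `vme->rip` and `vme->exit_reason` to `exit_trace()` after `exit_process()` has run would record the rewritten values instead of the ones that caused the exit.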
melifaro
860ae05c24 Teach every SIOCGIFSTATUS provider to fill in ifs->ascii anyway.
Remove old bits of data concat for 'ascii' field.
Remove special SIOCGIFSTATUS handling from if.c (which Coverity yells at).

Reported by:	Coverity
Coverity CID:	1147174
MFC after:	2 weeks
2014-01-07 15:59:33 +00:00
attilio
23d2536d96 Use __predict_false() on sensitive lock paths, since the check is
false most of the time, i.e. whenever the PMC-soft feature is not in use.

Sponsored by:	EMC / Isilon storage division
Submitted by:	Anton Rang <anton.rang@isilon.com>
2014-01-07 14:03:42 +00:00
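
A minimal sketch of a branch hint on a hot path. On GCC-compatible compilers `__predict_false()` is a wrapper around `__builtin_expect()`; it is redefined here only so the example stands alone. The flag, the counter and `lock_fast_path()` are hypothetical and are not taken from the commit.

```c
#include <stdbool.h>

/*
 * FreeBSD's <sys/cdefs.h> provides __predict_false() along these lines
 * for GCC-compatible compilers; the fallback keeps the sketch portable.
 */
#ifndef __predict_false
#define __predict_false(exp)	__builtin_expect((exp), 0)
#endif

/* Hypothetical flag standing in for "a soft PMC is in use". */
static bool pmc_soft_enabled = false;
static unsigned long pmc_events;

/* Hypothetical lock fast path: the profiling hook is the unlikely branch. */
static int
lock_fast_path(int *lock_word)
{
	if (__predict_false(pmc_soft_enabled))
		pmc_events++;	/* cold path, rarely taken */
	*lock_word = 1;		/* hot path: take the lock */
	return (*lock_word);
}
```

The hint does not change what the code computes; it only tells the compiler which side of the branch to lay out as the fall-through path.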
loos
c74f1326f9 Fix the geom mappings for WR1043ND.
The uboot mapping is only 128KiB (0x20000) and not 2MiB (0x200000).

Dynamically adjust kernel and rootfs mappings based on the
geom_uncompress(4) magic.

This makes the built images more reliable by accepting changes in kernel
size transparently and matches the images built with zrouter and
freebsd-wifi-build.

Tested by:	gjb
Approved by:	adrian (mentor)
Obtained from:	Zrouter
2014-01-07 13:09:35 +00:00
mav
c5a69c8307 Fix off-by-one error in r260229.
Coverity CID:	1148955
2014-01-07 11:43:51 +00:00
trasz
d0cf88a92b Fix a rare "truncated checksums" problem, which manifested like this:
WARNING: icl_pdu_check_data_digest: data digest check failed; got 0xf23b,
    should be 0xdb7f23b

Tested by:	Darcy Birkbeck
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2014-01-07 11:03:57 +00:00
hselasky
9784536f3c Check the XHCI event ring regardless of the XHCI status register
value. The "Intel Lynx Point" XHCI controller found in the MBP2013 has
been observed to not always set the event interrupt bit while there
are events to consume in the event ring.

MFC after:	1 week
Tested by:	Huang Wen Hui <huanghwh@gmail.com>
2014-01-07 09:52:26 +00:00
adrian
7f18e0f33d Add a compile-time control over the size of KN_HASHSIZE.
This is needed for applications that use a lot of non-filedescriptor
knotes.

MFC after:	1 week
Sponsored by:	Netflix, Inc.
2014-01-07 01:17:27 +00:00
neel
b47601c298 Allow vlapic_set_intr_ready() to return a value that indicates whether or not
the vcpu should be kicked to process a pending interrupt. This will be useful
in the implementation of the Posted Interrupt APICv feature.

Change the return value of 'vlapic_pending_intr()' to indicate whether or not
an interrupt is available to be delivered to the vcpu depending on the value
of the PPR.

Add KTR tracepoints to debug guest IPI delivery.
2014-01-07 00:38:22 +00:00
jimharris
e31eb3d992 For IDENTIFY passthrough commands to Chatham prototype controllers, copy
the spoofed identify data into the user buffer rather than issuing the
command to the controller, since Chatham IDENTIFY data is always spoofed.

While here, fix a bug in the spoofed data for Chatham submission and
completion queue entry sizes.

Sponsored by:	Intel
MFC after:	3 days
2014-01-06 23:51:26 +00:00
neel
35066c7a68 Split the VMCS setup between 'vmcs_init()' that does initialization and
'vmx_vminit()' that does customization.

This makes it easier to turn on optional features (e.g. APICv) without
having to keep adding new parameters to 'vmcs_set_defaults()'.

Reviewed by:	grehan@
2014-01-06 23:16:39 +00:00
melifaro
9f8536f282 Partially fix IPv4 interface routes deletion in RADIX_MPATH.
Noticed by:	Nikolay Denev <ndenev at gmail.com>
MFC after:	1 month
2014-01-06 22:36:20 +00:00
glebius
353906d3d2 When pf_get_translation() fails, it should leave *sn pointer pristine,
otherwise we will panic in pf_test_rule().

PR:		182557
2014-01-06 19:05:04 +00:00
schweikh
916cce9b6e Correct a grammo in a comment; remove white space at EOL.
2014-01-06 17:23:22 +00:00
andreast
370e258309 Fix arm build.
Reviewed by:	ian, zbb
2014-01-06 17:16:27 +00:00
ian
5d95c195b3 Switch to using arm_devmap_add_entry() to set up static device mapping.
This eliminates the hard-coded max kva and roughly doubles the available
kva space.
2014-01-06 16:57:22 +00:00
ian
5ffa5b0b13 Don't try to find a static mapping before calling pmap_mapdev(), that logic
is now part of pmap_mapdev() and doesn't need to be duplicated here.
Likewise for unmapping.
2014-01-06 16:33:16 +00:00
ian
51e107617a Allow 'no static device mappings' to potentially work. It's not clear that
every arm system must have some static mappings to work correctly (although
currently they all do), so remove some panic() calls (which would never
be seen anyway, because they would happen before a console is available).
2014-01-06 16:07:27 +00:00
ian
c6f3eb52c2 Switch to using arm_devmap_add_entry() to set up static device mapping.
This eliminates the hard-coded max kva and roughly doubles the available
kva space.
2014-01-06 15:48:16 +00:00
luigi
41068e3dad It is 2014 and we have a new version of netmap.
Most relevant features:

- netmap emulation on any NIC, even those without native netmap support.

  On the ixgbe we have measured about 4Mpps/core/queue in this mode,
  which is still a lot more than with sockets/bpf.

- seamless interconnection of VALE switch, NICs and host stack.

  If you disable accelerations on your NIC (say em0)

        ifconfig em0 -txcsum -rxcsum

  you can use the VALE switch to connect the NIC and the host stack:

        vale-ctl -h valeXX:em0

  allowing sharing the NIC with other netmap clients.

- THE USER API HAS SLIGHTLY CHANGED (head/cur/tail pointers
  instead of pointers/count as before). This was unavoidable to support,
  in the future, multiple threads operating on the same rings.
  Netmap clients require very small source code changes to compile again.
  On the plus side, the new API should be easier to understand
  and the internals are a lot simpler.

The manual page has been updated extensively to reflect the current
features and give some examples.

This is the result of work of several people including Giuseppe Lettieri,
Vincenzo Maffione, Michio Honda and myself, and has been financially
supported by the EU projects CHANGE and OPENLAB, the NetApp University
Research Fund, NEC, and of course the Università di Pisa.
2014-01-06 12:53:15 +00:00
mav
b421f931ee Fix NULL dereference panic on UDP requests introduced in r260229.
2014-01-06 12:40:46 +00:00
marcel
5e2984b1f1 In atomic_or_8_nv() load 1 and not 8 bytes from the address
given. Note that atomic_or_8_nv() is not used at this time.
2014-01-06 05:00:58 +00:00
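
Why the access width matters can be sketched in portable C. This does not reproduce the code the commit changed; `or_8_nv()` is a hypothetical equivalent written with the GCC/Clang `__atomic_or_fetch()` builtin, where the `uint8_t` pointer type guarantees that exactly one byte is loaded and stored.

```c
#include <stdint.h>

/*
 * Hypothetical C equivalent of an "or, return new value" primitive on an
 * 8-bit quantity.  The "8" refers to bits, so the operation must touch a
 * single byte; a wider load would pull in neighbouring data.
 */
static uint8_t
or_8_nv(volatile uint8_t *target, uint8_t bits)
{
	return (__atomic_or_fetch(target, bits, __ATOMIC_SEQ_CST));
}
```

Applied to the first element of a byte array, the function returns the new value of that element and leaves the following bytes unchanged.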
adrian
f5fdc16f4b Correctly remove entries from the relevant receive ath_buf list before
freeing them.

The current code would walk the list and call the buffer free, which
didn't remove it from any lists before pushing it back on the free list.

Tested:		AR9485, STA mode

Noticed by:	dillon@apollo.dragonflybsd.org
2014-01-06 03:48:32 +00:00
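
The list handling described above can be sketched with the `<sys/queue.h>` TAILQ macros. `struct rxbuf` and `rx_flush()` are hypothetical stand-ins for the driver's buffer type and flush routine; the point illustrated is only the ordering: unlink from the receive list first, then hand the buffer to the free list.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct ath_buf. */
struct rxbuf {
	TAILQ_ENTRY(rxbuf) link;
	int id;
};
TAILQ_HEAD(rxbuf_list, rxbuf);

/*
 * Move every buffer from the receive list to the free list.  Each entry
 * is unlinked from 'rx' before it is inserted into 'freelist'; inserting
 * it while it is still linked into 'rx' would leave both lists sharing
 * the same link pointers.
 */
static int
rx_flush(struct rxbuf_list *rx, struct rxbuf_list *freelist)
{
	struct rxbuf *bf;
	int moved = 0;

	while ((bf = TAILQ_FIRST(rx)) != NULL) {
		TAILQ_REMOVE(rx, bf, link);
		TAILQ_INSERT_TAIL(freelist, bf, link);
		moved++;
	}
	return (moved);
}
```

Draining from the head with `TAILQ_FIRST()` avoids iterating over a list while entries are being removed from it.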
ian
420aa503c6 Remove dev/fdt/fdt_pci.c, which was code specific to Marvell ARM SoCs,
related to setting up static device mappings.  Since it was only used by
arm/mv/mv_pci.c, it's now just static functions within that file, plus
one public function that gets called only from arm/mv/mv_machdep.c.
2014-01-05 22:36:34 +00:00
gavin
775cb4ad60 Wrap SUBDIRs over several lines.
2014-01-05 21:35:07 +00:00
dim
3c9bc33d22 Split the last gcc-specific flags off into CFLAGS.gcc. This also
removes the need to use -Qunused-arguments for clang throughout the
tree.

MFC after:	3 days
2014-01-05 21:03:49 +00:00
ian
1b0fae4d63 Enable the cesa security/crypto device by providing the required property
in the dts source, and adding the right devices to the kernel config. Also
generally bring the kernel config into line with what we have for other
Marvell/Kirkwood systems (add lots of useful devices and options).

One particularly notable addition amongst the kernel config changes is
USB_HOST_ALIGN=32, which may help eliminate data corruption on USB drives.

PR:		kern/181975 arm/162159
2014-01-05 20:44:10 +00:00
ian
e0c5d7ff01 Add #include <machine/fdt.h> to a few files that used to get it via
pollution from other headers.
2014-01-05 20:09:51 +00:00
mav
0b0d3d9762 Fix build after r260234 by converting ddi_get_lbolt64() from inline into
a macro.  Otherwise the compiler complains that the hz variable used there is
either undefined or defined twice, thanks to the header mess caused by compat shims.
2014-01-05 19:07:42 +00:00
nwhitehorn
f06ffda243 Retire machine/fdt.h as a header used by MI code, as its function is now
obsolete. This involves the following pieces:
- Remove it entirely on PowerPC, where it is not used by MD code either
- Remove all references to machine/fdt.h in non-architecture-specific code
  (aside from uart_cpu_fdt.c, shared by ARM and MIPS, and so is somewhat
  non-arch-specific).
- Fix code relying on header pollution from machine/fdt.h includes
- Legacy fdtbus.c (still used on x86 FDT systems) now passes resource
  requests to its parent (nexus). This allows x86 FDT devices to allocate
  both memory and IO requests and removes the last notionally MI use of
  fdtbus_bs_tag.
- On those architectures that retain a machine/fdt.h, unused bits like
  FDT_MAP_IRQ and FDT_INTR_MAX have been removed.
2014-01-05 18:46:58 +00:00
ian
80568041a3 Convert from using fdt_immr style to arm_devmap_add_entry() to make
static device mappings.

This SoC relied heavily on the fact that all devices were static-mapped
at a fixed address, and it (rather bogusly) used bus_space read and write
calls passing hard-coded virtual addresses instead of proper bus handles,
relying on the fact that the virtual addresses of the mappings were known
at compile time, and relying on the implementation details of arm
bus_space never changing.  All such usage was replaced with calls to
bus_space_map() to obtain a proper bus handle for the read/write calls.

This required adjusting some of the #define values that map out hardware
registers, and some of them were renamed in the process to make it clear
which were defining absolute physical addresses and which were defining
offsets.  (The ones that just define offsets don't appear to be referenced
and probably serve no value other than perhaps documentation.)
2014-01-05 18:40:06 +00:00
ian
666e77d06d Eliminate use of fdt_immr_addr(), it's not needed for this SoC. Convert
to the newer arm_devmap_add_entry() routine for creating device mappings.
2014-01-05 16:45:34 +00:00
ian
0e63ab546c Use the common armv6 fdt_bus_tag definition instead of an essentially
identical local copy of it.
2014-01-05 15:33:33 +00:00
gavin
c0bedb8ffd Add firmware version 18.168.6.1 (API version 6) for Intel Centrino
Wireless-N 135 wireless adapters, soon to be supported by iwn(4).

Committed using:	Laptop with Centrino 135 chipset
Obtained from:	wireless.kernel.org firmware downloads
2014-01-05 01:07:14 +00:00
adrian
01b37d373d Move the retune notification print to a debug print.
Yes, I still have to do the retune.  But I'm giving in to many people
pestering me (very gently!) about this.

Tested:

* Intel Centrino 6205
2014-01-05 00:46:31 +00:00
imp
8da5238e28 More NAND IDs of some really old Samsung parts, also list the part
number that we're matching...
2014-01-04 22:30:18 +00:00
melifaro
f85abe9555 Change semantics for rnh_lookup() function: now
it performs exact match search, regardless of netmask existence.
This simplifies most of rnh_lookup() consumers.

Fix panic triggered by deleting non-existent host route.

PR:		kern/185092
Submitted by:	Nikolay Denev <ndenev at gmail.com>
MFC after:	1 month
2014-01-04 22:25:26 +00:00
ian
b03757c0c6 Doh! Use C comments, not C++.
2014-01-04 22:14:59 +00:00
ian
877c685bda Convert static device mapping to use the new arm_devmap_add_entry(),
and add static mappings that cover most of the on-chip peripherals with
1MB section mappings.  This adds about 220MB of available kva space
by not using a hard-coded 0xF0000000 as the mapping address.
2014-01-04 22:09:53 +00:00