freebsd-skq

Author	SHA1	Message	Date
David Christensen	4e4007688c	Substantial rewrite of bxe(4) to add support for the BCM57712 and BCM578XX controllers. Approved by: re MFC after: 4 weeks	2013-09-20 20:18:49 +00:00
Edward Tomasz Napierala	009ea47eb2	Bring in the new iSCSI target and initiator. Reviewed by: ken (parts) Approved by: re (delphij) Sponsored by: FreeBSD Foundation	2013-09-14 15:29:06 +00:00
Mark Murray	9d32fc31c7	MFC	2013-09-07 07:58:29 +00:00
Cy Schubert	bfc88dcbf7	Update ipfilter 4.1.28 --> 5.1.2. Approved by: glebius (mentor) BSD Licensed by: Darren Reed <darrenr@reed.wattle.id.au> (author)	2013-09-06 23:11:19 +00:00
Mark Murray	0fbf163e60	MFC	2013-09-06 17:42:12 +00:00
Pawel Jakub Dawidek	7008be5bd7	Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD \| CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t cap_rights_init(cap_rights_t rights, ...); void cap_rights_set(cap_rights_t rights, ...); void cap_rights_clear(cap_rights_t rights, ...); bool cap_rights_is_set(const cap_rights_t rights, ...); bool cap_rights_is_valid(const cap_rights_t rights); void cap_rights_merge(cap_rights_t dst, const cap_rights_t src); void cap_rights_remove(cap_rights_t dst, const cap_rights_t src); bool cap_rights_contains(const cap_rights_t big, const cap_rights_t little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP \| CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation	2013-09-05 00:09:56 +00:00
Mark Murray	77de2c3f58	Separate out the Software RNG entropy harvesting queue and thread into its own files. Submitted by: Arthur Mesh <arthurmesh@gmail.com>	2013-08-30 11:42:57 +00:00
Mark Murray	f27c28dc6e	MFC	2013-08-30 11:38:34 +00:00
Justin T. Gibbs	9f40021f28	Introduce a new, HVM compatible, paravirtualized timer driver for Xen. Use this new driver for both PV and HVM instances. This driver requires a Xen hypervisor that supports vector callbacks, VCPUOP hypercalls, and reports that it has a "safe PV clock". New timer driver: Submitted by: will Sponsored by: Spectra Logic Corporation PV port to new driver, and bug fixes: Submitted by: Roger Pau Monné Sponsored by: Citrix Systems R&D sys/dev/xen/timer/timer.c: - Register a PV timer device driver which (currently) implements device_{identify,probe,attach} and stubs device_detach. The detach routine requires functionality not provided by timecounters(4). The suspend and resume routines need additional work (due to Xen requiring that the hypercalls be executed on the target VCPU), and aren't needed for our purposes. - Make sure there can only be one device instance of this driver, and that it only registers one eventtimers(4) and one timecounters(4) device interface. Make both interfaces use PCPU data as needed. - Match, with a few style cleanups & API differences, the Xen versions of the "fetch time" functions. - Document the magic scale_delta() better for the i386 version. - When registering the event timer, bind a separate event channel for the timer VIRQ to the device's event timer interrupt handler for each active VCPU. Describe each interrupt as "xen_et:c%d", so they can be identified per CPU in "vmstat -i" or "show intrcnt" in KDB. - When scheduling a timer into the hypervisor, try up to 60 times if the hypervisor rejects the time as being in the past. In the common case, this retry shouldn't happen, and if it does, it should only happen once. This is because the event timer advertises a minimum period of 100usec, which is only less than the usual hypercall round trip time about 1 out of every 100 tries. (Unlike other similar drivers, this one actually checks whether the hypervisor accepted the singleshot timer set hypercall.) - Implement a RTC PV clock based on the hypervisor wallclock. sys/conf/files: - Add dev/xen/timer/timer.c if the kernel configuration includes either the XEN or XENHVM options. sys/conf/files.i386: sys/i386/include/xen/xen_clock_util.h: sys/i386/xen/clock.c: sys/i386/xen/xen_clock_util.c: sys/i386/xen/mp_machdep.c: sys/i386/xen/xen_rtc.c: - Remove previous PV timer used in i386 XEN PV kernels, the new timer introduced in this change is used instead (so we share the same code between PVHVM and PV). MFC after: 2 weeks	2013-08-29 23:11:58 +00:00
Justin T. Gibbs	76acc41fb7	Implement vector callback for PVHVM and unify event channel implementations Re-structure Xen HVM support so that: - Xen is detected and hypercalls can be performed very early in system startup. - Xen interrupt services are implemented using FreeBSD's native interrupt delivery infrastructure. - the Xen interrupt service implementation is shared between PV and HVM guests. - Xen interrupt handlers can optionally use a filter handler in order to avoid the overhead of dispatch to an interrupt thread. - interrupt load can be distributed among all available CPUs. - the overhead of accessing the emulated local and I/O apics on HVM is removed for event channel port events. - a similar optimization can eventually, and fairly easily, be used to optimize MSI. Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure, and misc Xen cleanups: Sponsored by: Spectra Logic Corporation Unification of PV & HVM interrupt infrastructure, bug fixes, and misc Xen cleanups: Submitted by: Roger Pau Monné Sponsored by: Citrix Systems R&D sys/x86/x86/local_apic.c: sys/amd64/include/apicvar.h: sys/i386/include/apicvar.h: sys/amd64/amd64/apic_vector.S: sys/i386/i386/apic_vector.s: sys/amd64/amd64/machdep.c: sys/i386/i386/machdep.c: sys/i386/xen/exception.s: sys/x86/include/segments.h: Reserve IDT vector 0x93 for the Xen event channel upcall interrupt handler. On Hypervisors that support the direct vector callback feature, we can request that this vector be called directly by an injected HVM interrupt event, instead of a simulated PCI interrupt on the Xen platform PCI device. This avoids all of the overhead of dealing with the emulated I/O APIC and local APIC. It also means that the Hypervisor can inject these events on any CPU, allowing upcalls for different ports to be handled in parallel. sys/amd64/amd64/mp_machdep.c: sys/i386/i386/mp_machdep.c: Map Xen per-vcpu area during AP startup. sys/amd64/include/intr_machdep.h: sys/i386/include/intr_machdep.h: Increase the FreeBSD IRQ vector table to include space for event channel interrupt sources. sys/amd64/include/pcpu.h: sys/i386/include/pcpu.h: Remove Xen HVM per-cpu variable data. These fields are now allocated via the dynamic per-cpu scheme. See xen_intr.c for details. sys/amd64/include/xen/hypercall.h: sys/dev/xen/blkback/blkback.c: sys/i386/include/xen/xenvar.h: sys/i386/xen/clock.c: sys/i386/xen/xen_machdep.c: sys/xen/gnttab.c: Prefer FreeBSD primatives to Linux ones in Xen support code. sys/amd64/include/xen/xen-os.h: sys/i386/include/xen/xen-os.h: sys/xen/xen-os.h: sys/dev/xen/balloon/balloon.c: sys/dev/xen/blkback/blkback.c: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/console/xencons_ring.c: sys/dev/xen/control/control.c: sys/dev/xen/netback/netback.c: sys/dev/xen/netfront/netfront.c: sys/dev/xen/xenpci/xenpci.c: sys/i386/i386/machdep.c: sys/i386/include/pmap.h: sys/i386/include/xen/xenfunc.h: sys/i386/isa/npx.c: sys/i386/xen/clock.c: sys/i386/xen/mp_machdep.c: sys/i386/xen/mptable.c: sys/i386/xen/xen_clock_util.c: sys/i386/xen/xen_machdep.c: sys/i386/xen/xen_rtc.c: sys/xen/evtchn/evtchn_dev.c: sys/xen/features.c: sys/xen/gnttab.c: sys/xen/gnttab.h: sys/xen/hvm.h: sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbus_if.m: sys/xen/xenbus/xenbusb_front.c: sys/xen/xenbus/xenbusvar.h: sys/xen/xenstore/xenstore.c: sys/xen/xenstore/xenstore_dev.c: sys/xen/xenstore/xenstorevar.h: Pull common Xen OS support functions/settings into xen/xen-os.h. sys/amd64/include/xen/xen-os.h: sys/i386/include/xen/xen-os.h: sys/xen/xen-os.h: Remove constants, macros, and functions unused in FreeBSD's Xen support. sys/xen/xen-os.h: sys/i386/xen/xen_machdep.c: sys/x86/xen/hvm.c: Introduce new functions xen_domain(), xen_pv_domain(), and xen_hvm_domain(). These are used in favor of #ifdefs so that FreeBSD can dynamically detect and adapt to the presence of a hypervisor. The goal is to have an HVM optimized GENERIC, but more is necessary before this is possible. sys/amd64/amd64/machdep.c: sys/dev/xen/xenpci/xenpcivar.h: sys/dev/xen/xenpci/xenpci.c: sys/x86/xen/hvm.c: sys/sys/kernel.h: Refactor magic ioport, Hypercall table and Hypervisor shared information page setup, and move it to a dedicated HVM support module. HVM mode initialization is now triggered during the SI_SUB_HYPERVISOR phase of system startup. This currently occurs just after the kernel VM is fully setup which is just enough infrastructure to allow the hypercall table and shared info page to be properly mapped. sys/xen/hvm.h: sys/x86/xen/hvm.c: Add definitions and a method for configuring Hypervisor event delievery via a direct vector callback. sys/amd64/include/xen/xen-os.h: sys/x86/xen/hvm.c: sys/conf/files: sys/conf/files.amd64: sys/conf/files.i386: Adjust kernel build to reflect the refactoring of early Xen startup code and Xen interrupt services. sys/dev/xen/blkback/blkback.c: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkfront/block.h: sys/dev/xen/control/control.c: sys/dev/xen/evtchn/evtchn_dev.c: sys/dev/xen/netback/netback.c: sys/dev/xen/netfront/netfront.c: sys/xen/xenstore/xenstore.c: sys/xen/evtchn/evtchn_dev.c: sys/dev/xen/console/console.c: sys/dev/xen/console/xencons_ring.c Adjust drivers to use new xen_intr_*() API. sys/dev/xen/blkback/blkback.c: Since blkback defers all event handling to a taskqueue, convert this task queue to a "fast" taskqueue, and schedule it via an interrupt filter. This avoids an unnecessary ithread context switch. sys/xen/xenstore/xenstore.c: The xenstore driver is MPSAFE. Indicate as much when registering its interrupt handler. sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbusvar.h: Remove unused event channel APIs. sys/xen/evtchn.h: Remove all kernel Xen interrupt service API definitions from this file. It is now only used for structure and ioctl definitions related to the event channel userland device driver. Update the definitions in this file to match those from NetBSD. Implementing this interface will be necessary for Dom0 support. sys/xen/evtchn/evtchnvar.h: Add a header file for implemenation internal APIs related to managing event channels event delivery. This is used to allow, for example, the event channel userland device driver to access low-level routines that typical kernel consumers of event channel services should never access. sys/xen/interface/event_channel.h: sys/xen/xen_intr.h: Standardize on the evtchn_port_t type for referring to an event channel port id. In order to prevent low-level event channel APIs from leaking to kernel consumers who should not have access to this data, the type is defined twice: Once in the Xen provided event_channel.h, and again in xen/xen_intr.h. The double declaration is protected by __XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared twice within a given compilation unit. sys/xen/xen_intr.h: sys/xen/evtchn/evtchn.c: sys/x86/xen/xen_intr.c: sys/dev/xen/xenpci/evtchn.c: sys/dev/xen/xenpci/xenpcivar.h: New implementation of Xen interrupt services. This is similar in many respects to the i386 PV implementation with the exception that events for bound to event channel ports (i.e. not IPI, virtual IRQ, or physical IRQ) are further optimized to avoid mask/unmask operations that aren't necessary for these edge triggered events. Stubs exist for supporting physical IRQ binding, but will need additional work before this implementation can be fully shared between PV and HVM. sys/amd64/amd64/mp_machdep.c: sys/i386/i386/mp_machdep.c: sys/i386/xen/mp_machdep.c sys/x86/xen/hvm.c: Add support for placing vcpu_info into an arbritary memory page instead of using HYPERVISOR_shared_info->vcpu_info. This allows the creation of domains with more than 32 vcpus. sys/i386/i386/machdep.c: sys/i386/xen/clock.c: sys/i386/xen/xen_machdep.c: sys/i386/xen/exception.s: Add support for new event channle implementation.	2013-08-29 19:52:18 +00:00
Mark Murray	12278dbb57	MFC	2013-08-26 10:40:25 +00:00
Mark Johnston	57f6086735	Implement the ip, tcp, and udp DTrace providers. The probe definitions use dynamic translation so that their arguments match the definitions for these providers in Solaris and illumos. Thus, existing scripts for these providers should work unmodified on FreeBSD. Tested by: gnn, hiren MFC after: 1 month	2013-08-25 21:54:41 +00:00
Mark Murray	ddbfa6b19e	1) example (partially humorous random_adaptor, that I call "EXAMPLE") * It's not meant to be used in a real system, it's there to show how the basics of how to create interfaces for random_adaptors. Perhaps it should belong in a manual page 2) Move probe.c's functionality in to random_adaptors.c * rename random_ident_hardware() to random_adaptor_choose() 3) Introduce a new way to choose (or select) random_adaptors via tunable "rngs_want" It's a list of comma separated names of adaptors, ordered by preferences. I.e.: rngs_want="yarrow,rdrand" Such setting would cause yarrow to be preferred to rdrand. If neither of them are available (or registered), then system will default to something reasonable (currently yarrow). If yarrow is not present, then we fall back to the adaptor that's first on the list of registered adaptors. 4) Introduce a way where RNGs can play a role of entropy source. This is mostly useful for HW rngs. The way I envision this is that every HW RNG will use this functionality by default. Functionality to disable this is also present. I have an example of how to use this in random_adaptor_example.c (see modload event, and init function) 5) fix kern.random.adaptors from kern.random.adaptors: yarrowpanicblock to kern.random.adaptors: yarrow,panic,block 6) add kern.random.active_adaptor to indicate currently selected adaptor: root@freebsd04:~ # sysctl kern.random.active_adaptor kern.random.active_adaptor: yarrow Submitted by: Arthur Mesh <arthurmesh@gmail.com>	2013-08-24 13:54:56 +00:00
Edward Tomasz Napierala	9732e4fd92	Move the old iSCSI initiator source to a more appropriate place (sys/dev/iscsi_initiator/ instead of sys/dev/iscsi/initiator/), to make room for the new one. This is also more logical location (kernel module being named iscsi_initiator.ko, for example). There is no ongoing work on this I know of, so it shouldn't make life harder for anyone. There are no functional changes, apart from "svn mv" and adjusting paths.	2013-08-22 14:02:34 +00:00
Pawel Jakub Dawidek	0dac22d8ea	Implement 32bit versions of the cap_ioctls_limit(2) and cap_ioctls_get(2) system calls as unsigned longs have different size on i386 and amd64. Reported by: jilles Sponsored by: The FreeBSD Foundation	2013-08-18 10:30:41 +00:00
Pedro F. Giffuni	d7511a40a7	Add read-only support for extents in ext2fs. Basic support for extents was implemented by Zheng Liu as part of his Google Summer of Code in 2010. This support is read-only at this time. In addition to extents we also support the huge_file extension for read-only purposes. This works nicely with the additional support for birthtime/nanosec timestamps and dir_index that have been added lately. The implementation may not work for all ext4 filesystems as it doesn't support some features that are being enabled by default on recent linux like flex_bg. Nevertheless, the feature should be very useful for migration or simple access in filesystems that have been converted from ext2/3 or don't use incompatible features. Special thanks to Zheng Liu for his dedication and continued work to support ext2 in FreeBSD. Submitted by: Zheng Liu (lz@) Reviewed by: Mike Ma, Christoph Mallon (previous version) Sponsored by: Google Inc. MFC after: 3 weeks	2013-08-12 21:34:48 +00:00
David E. O'Brien	5711939b63	* Add random_adaptors.[ch] which is basically a store of random_adaptor's. random_adaptor is basically an adapter that plugs in to random(4). random_adaptor can only be plugged in to random(4) very early in bootup. Unplugging random_adaptor from random(4) is not supported, and is probably a bad idea anyway, due to potential loss of entropy pools. We currently have 3 random_adaptors: + yarrow + rdrand (ivy.c) + nehemeiah * Remove platform dependent logic from probe.c, and move it into corresponding registration routines of each random_adaptor provider. probe.c doesn't do anything other than picking a specific random_adaptor from a list of registered ones. * If the kernel doesn't have any random_adaptor adapters present then the creation of /dev/random is postponed until next random_adaptor is kldload'ed. * Fix randomdev_soft.c to refer to its own random_adaptor, instead of a system wide one. Submitted by: arthurmesh@gmail.com, obrien Obtained from: Juniper Networks Reviewed by: so (des)	2013-08-09 15:31:50 +00:00
David E. O'Brien	0e6a0799a9	Back out r253779 & r253786.	2013-07-31 17:21:18 +00:00
Rui Paulo	31d9867769	Import OpenBSD's rsu(4) WLAN driver. Support chipsets are the Realtek RTL8188SU, RTL8191SU, and RTL8192SU. Many thanks to Idwer Vollering for porting/writing the man page and for testing. Reviewed by: adrian, hselasky Obtained from: OpenBSD Tested by: kevlo, Idwer Vollering <vidwer at gmail.com>	2013-07-30 02:07:57 +00:00
David E. O'Brien	99ff83da74	Decouple yarrow from random(4) device. * Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option. The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow. * random(4) device doesn't really depend on rijndael-. Yarrow, however, does. Add random_adaptors.[ch] which is basically a store of random_adaptor's. random_adaptor is basically an adapter that plugs in to random(4). random_adaptor can only be plugged in to random(4) very early in bootup. Unplugging random_adaptor from random(4) is not supported, and is probably a bad idea anyway, due to potential loss of entropy pools. We currently have 3 random_adaptors: + yarrow + rdrand (ivy.c) + nehemeiah * Remove platform dependent logic from probe.c, and move it into corresponding registration routines of each random_adaptor provider. probe.c doesn't do anything other than picking a specific random_adaptor from a list of registered ones. * If the kernel doesn't have any random_adaptor adapters present then the creation of /dev/random is postponed until next random_adaptor is kldload'ed. * Fix randomdev_soft.c to refer to its own random_adaptor, instead of a system wide one. Submitted by: arthurmesh@gmail.com, obrien Obtained from: Juniper Networks Reviewed by: obrien	2013-07-29 20:26:27 +00:00
Alfred Perlstein	3d30404f83	Fix watchdog pretimeout. The original API calls for pow2ns, however the new APIs from Linux call for seconds. We need to be able to convert to/from 2^Nns to seconds in both userland and kernel to fix this and properly compare units.	2013-07-27 20:47:01 +00:00
Navdeep Parhar	caf20efcde	Add support for packet-sniffing tracers to cxgbe(4). This works with all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and for general purpose monitoring without tapping any cxgbe or cxl ifnet directly. Tracers on the T4/T5 chips provide access to Ethernet frames exactly as they were received from or transmitted on the wire. On transmit, a tracer will capture a frame after TSO segmentation, hw VLAN tag insertion, hw L3 & L4 checksum insertion, etc. It will also capture frames generated by the TCP offload engine (TOE traffic is normally invisible to the kernel). On receive, a tracer will capture a frame before hw VLAN extraction, runt filtering, other badness filtering, before the steering/drop/L2-rewrite filters or the TOE have had a go at it, and of course before sw LRO in the driver. There are 4 tracers on a chip. A tracer can trace only in one direction (tx or rx). For now cxgbetool will set up tracers to capture the first 128B of every transmitted or received frame on a given port. This is a small subset of what the hardware can do. A pseudo ifnet with the same name as the nexus driver (t4nex0 or t5nex0) will be created for tracing. The data delivered to this ifnet is an additional copy made inside the chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual. /* watch cxl0, which is the first port hanging off t5nex0. / # cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting) # cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving) # cxgbetool t5nex0 tracer list # tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K "frames" with no L3/L4 checksum but this will show you the frames that were actually transmitted. / all done */ # cxgbetool t5nex0 tracer 0 disable # cxgbetool t5nex0 tracer 1 disable # cxgbetool t5nex0 tracer list # ifconfig t5nex0 destroy	2013-07-26 22:04:11 +00:00
Luiz Otavio O Souza	b9f07b864b	Add the support for 802.1q and port based vlans for arswitch. Tested on: RB450G (standalone ar8316), RSPRO (standalone ar8316) and TPLink MR-3220 (ar724x integrated switch). Approved by: adrian (mentor) Obtained from: zrouter	2013-07-23 14:24:22 +00:00
Rui Paulo	ae230e21f9	Fix the urtwnfw definitions. We can now use urtwnfw in kernel config files.	2013-07-13 07:17:18 +00:00
Andre Oppermann	81d392a09d	Improve SYN cookies by encoding the MSS, WSCALE (window scaling) and SACK information into the ISN (initial sequence number) without the additional use of timestamp bits and switching to the very fast and cryptographically strong SipHash-2-4 MAC hash algorithm to protect the SYN cookie against forgeries. The purpose of SYN cookies is to encode all necessary session state in the 32 bits of our initial sequence number to avoid storing any information locally in memory. This is especially important when under heavy spoofed SYN attacks where we would either run out of memory or the syncache would fill with bogus connection attempts swamping out legitimate connections. The original SYN cookies method only stored an indexed MSS values in the cookie. This isn't sufficient anymore and breaks down in the presence of WSCALE information which is only exchanged during SYN and SYN-ACK. If we can't keep track of it then we may severely underestimate the available send or receive window. This is compounded with large windows whose size information on the TCP segment header is even lower numerically. A number of years back SYN cookies were extended to store the additional state in the TCP timestamp fields, if available on a connection. While timestamps are common among the BSD, Linux and other *nix systems Windows never enabled them by default and thus are not present for the vast majority of clients seen on the Internet. The common parameters used on TCP sessions have changed quite a bit since SYN cookies very invented some 17 years ago. Today we have a lot more bandwidth available making the use window scaling almost mandatory. Also SACK has become standard making recovering from packet loss much more efficient. This change moves all necessary information into the ISS removing the need for timestamps. Both the MSS (16 bits) and send WSCALE (4 bits) are stored in 3 bit indexed form together with a single bit for SACK. While this is significantly less than the original range, it is sufficient to encode all common values with minimal rounding. The MSS depends on the MTU of the path and with the dominance of ethernet the main value seen is around 1460 bytes. Encapsulations for DSL lines and some other overheads reduce it by a few more bytes for many connections seen. Rounding down to the next lower value in some cases isn't a problem as we send only slightly more packets for the same amount of data. The send WSCALE index is bit more tricky as rounding down under-estimates the available send space available towards the remote host, however a small number values dominate and are carefully selected again. The receive WSCALE isn't encoded at all but recalculated based on the local receive socket buffer size when a valid SYN cookie returns. A listen socket buffer size is unlikely to change while active. The index values for MSS and WSCALE are selected for minimal rounding errors based on large traffic surveys. These values have to be periodically validated against newer traffic surveys adjusting the arrays tcp_sc_msstab[] and tcp_sc_wstab[] if necessary. In addition the hash MAC to protect the SYN cookies is changed from MD5 to SipHash-2-4, a much faster and cryptographically secure algorithm. Reviewed by: dwmalone Tested by: Fabian Keil <fk@fabiankeil.de>	2013-07-11 15:29:25 +00:00
Hiren Panchasara	3c9d5a037d	Adding urtwn(4) firmware and related changes. Reviewed by: rpaulo Approved by: sbruno (mentor)	2013-07-10 08:21:09 +00:00
Pedro F. Giffuni	5f5458322a	Add files related to ext2 HTree implementation These should've been added along with r252890 Reported by: gonzo PointyHat: pfg MFC after: 1 week	2013-07-07 01:12:29 +00:00
Navdeep Parhar	f72b68a1bf	- Include the T5 firmware with the driver. - Update the T4 firmware to the latest. - Minor reorganization and updates to the version macros, etc. Obtained from: Chelsio MFC after: 1 day	2013-07-03 23:52:15 +00:00
Peter Wemm	6c6f1f0185	Add an entry for filemon.	2013-07-03 20:22:12 +00:00
Davide Italiano	237abf0c56	- Trim an unused and bogus Makefile for mount_smbfs. - Reconnect with some minor modifications, in particular now selsocket() internals are adapted to use sbintime units after recent'ish calloutng switch.	2013-06-28 21:00:08 +00:00
Jeff Roberson	5f51836645	- Add a general purpose resource allocator, vmem, from NetBSD. It was originally inspired by the Solaris vmem detailed in the proceedings of usenix 2001. The NetBSD version was heavily refactored for bugs and simplicity. - Use this resource allocator to allocate the buffer and transient maps. Buffer cache defrags are reduced by 25% when used by filesystems with mixed block sizes. Ultimately this may permit dynamic buffer cache sizing on low KVA machines. Discussed with: alc, kib, attilio Tested by: pho Sponsored by: EMC / Isilon Storage Division	2013-06-28 03:51:20 +00:00
Oleksandr Tymoshenko	db5815641c	Rename run(4) firmware file from runfw to run.fw. Previous name was the same as top-level target name for "device runfw" kernel option and caused cyclic dependancy that lead to kernel build breakage Module change is not strictly required and done for name unification sake PR: conf/175751 Submitted by: Issei <i10a at herbmint.jp>	2013-06-21 18:16:54 +00:00
Jack F Vogel	fd75b91d13	Add quad port probe support, this gives the admin proper information about the slot (which should be a PCIE Gen 3 slot for this adapter) by looking back thru the PCI parent devices to the slot device. The fix above also corrects the bandwidth display to GT/s rather than the incorrect Gb/s Next, allow the use of ALTQ if you select the compile option IXGBE_LEGACY_TX. Allow the use of 'unsupported' optic modules by a compile option as well. Add a phy reset capability into the stop code, this is so a static configured driver will still behave properly when taken down (not being able to unload it). This revision synchronizes the shared code with Intel internal current code, and note that it now includes DCB supporting code, this was necessitated by some internal changes with the code, but it also will provide the opportunity to develop this feature in the core driver down the road. I have edited the README to get rid of some of the worse anachronisms in it as well, its by no means as robust as I might wish at this point however. Oh, I also have included some conditional stuff in the code so it will be compatible in both the 9.X and 10 environments. Performance has been a focus in recent changes and I believe this revision driver will perform very well in most workloads. MFC after: 2 weeks	2013-06-18 21:28:19 +00:00
Scott Long	c8789c34fd	This is an addendum to r251837. Missed adding the new references to cam_compat.c to the various makefiles. Obtained from: Netflix	2013-06-17 10:21:38 +00:00
Adrian Chadd	216ca2346f	Migrate the LNA mixing diversity machinery from the AR9285 HAL to the driver. The AR9485 chip and AR933x SoC both implement LNA diversity. There are a few extra things that need to happen before this can be flipped on for those chips (mostly to do with setting up the different bias values and LNA1/LNA2 RSSI differences) but the first stage is putting this code into the driver layer so it can be reused. This has the added benefit of making it easier to expose configuration options and diagnostic information via the ioctl API. That's not yet being done but it sure would be nice to do so. Tested: * AR9285, with LNA diversity enabled * AR9285, with LNA diversity disabled in EEPROM	2013-06-12 14:52:57 +00:00
Rui Paulo	c2c2fc4d86	Import Kevin Lo's port of urtwn(4) from OpenBSD. urtwn(4) is a driver for the Realtek RTL8188CU/RTL8192CU USB IEEE 802.11b/g/n wireless cards. This driver requires microcode which is available in FreeBSD ports: net/urtwn-firmware-kmod. Hiren ported the urtwn(4) man page from OpenBSD and Glen just commited a port for the firmware. TODO: - 802.11n support - Stability fixes - the driver can sustain lots of traffic but has trouble coping with simultaneous iperf sessions. - fix debugging MFC after: 2 months Tested by: kevlo, hiren, gjb	2013-06-08 16:02:31 +00:00
Adrian Chadd	b70f530bc7	Bring over the initial static bluetooth coexistence configuration for the WB195 combo NIC - an AR9285 w/ an AR3011 USB bluetooth NIC. The AR3011 is wired up using a 3-wire coexistence scheme to the AR9285. The code in if_ath_btcoex.c sets up the initial hardware mapping and coexistence configuration. There's nothing special about it - it's static; it doesn't try to configure bluetooth / MAC traffic priorities or try to figure out what's actually going on. It's enough to stop basic bluetooth traffic from causing traffic stalls and diassociation from the wireless network. To use this code, you must have the above NIC. No, it won't work for the AR9287+AR3012, nor the AR9485, AR9462 or AR955x combo cards. Then you set a kernel hint before boot or before kldload, where 'X' is the unit number of your AR9285 NIC: # kenv hint.ath.X.btcoex_profile=wb195 This will then appear in your boot messages: [100482] athX: Enabling WB195 BTCOEX This code is going to evolve pretty quickly (well, depending upon my spare time) so don't assume the btcoex API is going to stay stable. In order to use the bluetooth side, you must also load in firmware using ath3kfw and the binary firmware file (ath3k-1.fw in my case.) Tested: * AR9280, no interference * WB195 - AR9285 + AR3011 combo; STA mode; basic bluetooth inquiries were enough to cause traffic stalls and disassociations. This has stopped with the btcoex profile code. TODO: * Importantly - the AR9285 needs ASPM disabled if bluetooth coexistence is enabled. No, I don't know why. It's likely some kind of bug to do with the AR3011 sending bluetooth coexistence signals whilst the device is asleep. Since we don't actually sleep the MAC just yet, it shouldn't be a problem. That said, to be totally correct: + ASPM should be disabled - upon attach and wakeup + The PCIe powersave HAL code should never be called Look at what the ath9k driver does for inspiration. * Add WB197 (AR9287+AR3012) support * Add support for the AR9485, which is another combo like the AR9285 * The later NICs have a different signaling mechanism between the MAC and the bluetooth device; I haven't even begun to experiment with making that HAL code work. But it should be a lot more automatic. * The hardware can do much more interesting traffic weighting with bluetooth and wifi traffic. None of this is currently used. Ideally someone would code up something to watch the bluetooth traffic GPIO (via an interrupt) and then watch it go high/low; then figure out what the bluetooth traffic is and adjust things appropriately. * If I get the time I may add in some code to at least track this stuff and expose statistics. But it's up to someone else to experiment with the bluetooth coexistence support and add the interesting stuff (like "real" detection of bulk, audio, etc bluetooth traffic patterns and change wifi parameters appropriately - eg, maximum aggregate length, transmit power, using quiet time to control TX duty cycle, etc.)	2013-06-07 09:02:02 +00:00
Achim Leubner	dce93cd06d	Driver 'aacraid' added. Supports Adaptec by PMC RAID controller families Series 6, 7, 8 and upcoming products. Older Adaptec RAID controller families are supported by the 'aac' driver. Approved by: scottl (mentor)	2013-05-24 09:22:43 +00:00
Marcel Moolenaar	cb34ed4434	Add basic support for FDT to i386 & amd64. This change includes: 1. Common headers for fdt.h and ofw_machdep.h under x86/include with indirections under i386/include and amd64/include. 2. New modinfo for loader provided FDT blob. 3. Common x86_init_fdt() called from hammer_time() on amd64 and init386() on i386. 4. Split-off FDT specific low-level console functions from FDT bus methods for the uart(4) driver. The low-level console logic has been moved to uart_cpu_fdt.c and is used for arm, mips & powerpc only. The FDT bus methods are shared across all architectures. 5. Add dev/fdt/fdt_x86.c to hold the fdt_fixup_table[] and the fdt_pic_table[] arrays. Both are empty right now. FDT addresses are I/O ports on x86. Since the core FDT code does not handle different address spaces, adding support for both I/O ports and memory addresses requires some thought and discussion. It may be better to use a compile-time option that controls this. Obtained from: Juniper Networks, Inc.	2013-05-21 03:05:49 +00:00
Jung-uk Kim	a9d8d09c46	Merge ACPICA 20130517.	2013-05-20 23:52:49 +00:00
Jeff Roberson	f2cc1285c2	- Add a new general purpose path-compressed radix trie which can be used with any structure containing a uint64_t index. The tree code auto-generates type safe wrappers. - Eliminate the buf splay and replace it with pctrie. This is not only significantly faster with large files but also allows for the possibility of shared locking. Reviewed by: alc, attilio Sponsored by: EMC / Isilon Storage Division	2013-05-12 04:05:01 +00:00
Adrian Chadd	248dd6039d	Bring in a basic ethernet switch driver for the IP17x series of switches. These are notably found on some AR71xx based Mikrotik boards. Submitted by: Luiz Otavio O Souza <loos.br@gmail.com> Reviewed by: ray	2013-05-08 20:58:41 +00:00
Adrian Chadd	0e468be195	Add the AR9300 HAL into the kernel and module builds. Tested: * make universe (honest!)	2013-05-02 07:05:34 +00:00
Brooks Davis	b3caab66ad	MFP4 changes 222065 and 222068: Add a simplebus attachment for cfi(4)'s FDT support and move cfi_bus_fdt.c to sys/conf/files so non-ppc architectures are supported. Sponsored by: DARPA, AFRL	2013-04-30 18:33:29 +00:00
Jung-uk Kim	895f26a936	Merge ACPICA 20130418.	2013-04-19 23:49:34 +00:00
Adrian Chadd	7b796c4039	Implement a very basic multi-PHY aware switch device. This is intended to be used as a stop-gap for switch devices which expose multiple ethernet PHYs but we don't have a driver for - here, etherswitchcfg and the general switch configuration API can be used to interface to said PHYs. Submitted by: Luiz Otavio O Souza <loos.br@gmail.com>	2013-04-19 17:50:38 +00:00
Kenneth D. Merry	adb974068b	Move the NFS FHA (File Handle Affinity) code from sys/nfsserver to sys/nfs, since it is now shared by the two NFS servers. Suggested by: rmacklem Sponsored by: Spectra Logic MFC after: 2 weeks	2013-04-17 22:42:43 +00:00
Kenneth D. Merry	d96b98a360	Revamp the old NFS server's File Handle Affinity (FHA) code so that it will work with either the old or new server. The FHA code keeps a cache of currently active file handles for NFSv2 and v3 requests, so that read and write requests for the same file are directed to the same group of threads (reads) or thread (writes). It does not currently work for NFSv4 requests. They are more complex, and will take more work to support. This improves read-ahead performance, especially with ZFS, if the FHA tuning parameters are configured appropriately. Without the FHA code, concurrent reads that are part of a sequential read from a file will be directed to separate NFS threads. This has the effect of confusing the ZFS zfetch (prefetch) code and makes sequential reads significantly slower with clients like Linux that do a lot of prefetching. The FHA code has also been updated to direct write requests to nearby file offsets to the same thread in the same way it batches reads, and the FHA code will now also send writes to multiple threads when needed. This improves sequential write performance in ZFS, because writes to a file are now more ordered. Since NFS writes (generally less than 64K) are smaller than the typical ZFS record size (usually 128K), out of order NFS writes to the same block can trigger a read in ZFS. Sending them down the same thread increases the odds of their being in order. In order for multiple write threads per file in the FHA code to be useful, writes in the NFS server have been changed to use a LK_SHARED vnode lock, and upgrade that to LK_EXCLUSIVE if the filesystem doesn't allow multiple writers to a file at once. ZFS is currently the only filesystem that allows multiple writers to a file, because it has internal file range locking. This change does not affect the NFSv4 code. This improves random write performance to a single file in ZFS, since we can now have multiple writers inside ZFS at one time. I have changed the default tuning parameters to a 22 bit (4MB) window size (from 256K) and unlimited commands per thread as a result of my benchmarking with ZFS. The FHA code has been updated to allow configuring the tuning parameters from loader tunable variables in addition to sysctl variables. The read offset window calculation has been slightly modified as well. Instead of having separate bins, each file handle has a rolling window of bin_shift size. This minimizes glitches in throughput when shifting from one bin to another. sys/conf/files: Add nfs_fha_new.c and nfs_fha_old.c. Compile nfs_fha.c when either the old or the new NFS server is built. sys/fs/nfs/nfsport.h, sys/fs/nfs/nfs_commonport.c: Bring in changes from Rick Macklem to newnfs_realign that allow it to operate in blocking (M_WAITOK) or non-blocking (M_NOWAIT) mode. sys/fs/nfs/nfs_commonsubs.c, sys/fs/nfs/nfs_var.h: Bring in a change from Rick Macklem to allow telling nfsm_dissect() whether or not to wait for mallocs. sys/fs/nfs/nfsm_subs.h: Bring in changes from Rick Macklem to create a new nfsm_dissect_nonblock() inline function and NFSM_DISSECT_NONBLOCK() macro. sys/fs/nfs/nfs_commonkrpc.c, sys/fs/nfsclient/nfs_clkrpc.c: Add the malloc wait flag to a newnfs_realign() call. sys/fs/nfsserver/nfs_nfsdkrpc.c: Setup the new NFS server's RPC thread pool so that it will call the FHA code. Add the malloc flag argument to newnfs_realign(). Unstaticize newnfs_nfsv3_procid[] so that we can use it in the FHA code. sys/fs/nfsserver/nfs_nfsdsocket.c: In nfsrvd_dorpc(), add NFSPROC_WRITE to the list of RPC types that use the LK_SHARED lock type. sys/fs/nfsserver/nfs_nfsdport.c: In nfsd_fhtovp(), if we're starting a write, check to see whether the underlying filesystem supports shared writes. If not, upgrade the lock type from LK_SHARED to LK_EXCLUSIVE. sys/nfsserver/nfs_fha.c: Remove all code that is specific to the NFS server implementation. Anything that is server-specific is now accessed through a callback supplied by that server's FHA shim in the new softc. There are now separate sysctls and tunables for the FHA implementations for the old and new NFS servers. The new NFS server has its tunables under vfs.nfsd.fha, the old NFS server's tunables are under vfs.nfsrv.fha as before. In fha_extract_info(), use callouts for all server-specific code. Getting file handles and offsets is now done in the individual server's shim module. In fha_hash_entry_choose_thread(), change the way we decide whether two reads are in proximity to each other. Previously, the calculation was a simple shift operation to see whether the offsets were in the same power of 2 bucket. The issue was that there would be a bucket (and therefore thread) transition, even if the reads were in close proximity. When there is a thread transition, reads wind up going somewhat out of order, and ZFS gets confused. The new calculation simply tries to see whether the offsets are within 1 << bin_shift of each other. If they are, the reads will be sent to the same thread. The effect of this change is that for sequential reads, if the client doesn't exceed the max_reqs_per_nfsd parameter and the bin_shift is set to a reasonable value (22, or 4MB works well in my tests), the reads in any sequential stream will largely be confined to a single thread. Change fha_assign() so that it takes a softc argument. It is now called from the individual server's shim code, which will pass in the softc. Change fhe_stats_sysctl() so that it takes a softc parameter. It is now called from the individual server's shim code. Add the current offset to the list of things printed out about each active thread. Change the num_reads and num_writes counters in the fha_hash_entry structure to 32-bit values, and rename them num_rw and num_exclusive, respectively, to reflect their changed usage. Add an enable sysctl and tunable that allows the user to disable the FHA code (when vfs.XXX.fha.enable = 0). This is useful for before/after performance comparisons. nfs_fha.h: Move most structure definitions out of nfs_fha.c and into the header file, so that the individual server shims can see them. Change the default bin_shift to 22 (4MB) instead of 18 (256K). Allow unlimited commands per thread. sys/nfsserver/nfs_fha_old.c, sys/nfsserver/nfs_fha_old.h, sys/fs/nfsserver/nfs_fha_new.c, sys/fs/nfsserver/nfs_fha_new.h: Add shims for the old and new NFS servers to interface with the FHA code, and callbacks for the The shims contain all of the code and definitions that are specific to the NFS servers. They setup the server-specific callbacks and set the server name for the sysctl and loader tunable variables. sys/nfsserver/nfs_srvkrpc.c: Configure the RPC code to call fhaold_assign() instead of fha_assign(). sys/modules/nfsd/Makefile: Add nfs_fha.c and nfs_fha_new.c. sys/modules/nfsserver/Makefile: Add nfs_fha_old.c. Reviewed by: rmacklem Sponsored by: Spectra Logic MFC after: 2 weeks	2013-04-17 21:00:22 +00:00
Ivan Voras	c072011223	Introduce glabel labels based on GEOM ident attributes. In this initial implementation, error on the side of conservatism and only create labels for GEOMs of classes DISK and MULTIPATH. Discussed with: trasz Approved by: silence from freebsd-geom@	2013-04-15 16:09:24 +00:00
Gleb Smirnoff	4e76af6a41	Merge from projects/counters: counter(9). Introduce counter(9) API, that implements fast and raceless counters, provided (but not limited to) for gathering of statistical data. See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html for more details. In collaboration with: kib Reviewed by: luigi Tested by: ae, ray Sponsored by: Nginx, Inc.	2013-04-08 19:40:53 +00:00

1 2 3 4 5 ...

1809 Commits