PCPU fields for curthread, by doing the same to Book-E. This closes
some potential races switching between CPUs. As a side effect, it turns out
the AIM and Book-E swtch.S implementations were the same to within a few
registers, so move that to powerpc/powerpc.
MFC after: 3 months
add actual platform probing based on PVR. Still needs a little more work:
in particular, the CCRS setup should move here.
Also turn "bare" into a truly bare platform that doesn't pretend to know how
to do anything except get the memory map. This should also be enhanced to
process the FDT reserved memory list, but that is for another day.
assumed that the MDIO bus was a direct child of the Ethernet interface. It
may not be and indeed on many device trees is not. While here, add proper
locking for MII transactions, which may be on a bus shared by several MACs.
Hardware donated by: Benjamin Perrault
fdtbus in most cases. This brings ARM and MIPS more in line with existing
Open Firmware platforms like sparc64 and powerpc, as well as preventing
double-enumeration of the OF tree on embedded PowerPC (first through nexus,
then through fdtbus).
This change is also designed to simplify resource management on FDT platforms
by letting there exist a platform-defined root bus resource_activate() call
instead of replying on fdtbus to do the right thing through fdt_bs_tag.
The OFW_BUS_MAP_INTR() and OFW_BUS_CONFIG_INTR() kobj methods are also
available to implement for similar purposes.
Discussed on: -arm, -mips
Tested by: zbb, brooks, imp, and others
MFC after: 6 weeks
components instead of with the kernel and/or modules. This ensures that it
gets built with the host compiler, not the compiler in obj/... used to build
the target components (which may be a cross-compiler outputting code for a
different architecture and using header files with types and options set up
for the wrong architecture).
Reviewed by: imp
This includes the following:
- use separate memory regions for VALE ports
- locking fixes
- some simplifications in the NIC-specific routines
- performance improvements for the VALE switch
- some new features in the pkt-gen test program
- documentation updates
There are small API changes that require programs to be recompiled
(NETMAP_API has been bumped so you will detect old binaries at runtime).
In particular:
- struct netmap_slot now is 16 bytes to support an extra pointer,
which may save one data copy when using VALE ports or VMs;
- the struct netmap_if has two extra fields;
MFC after: 3 days
Right now, the semaphore write is scheduled after each batch, which is
not optimal and must be tuned.
Discussed with: alc
Tested by: pho
MFC after: 1 month
Will fix RPI-B kernel build failure since it adds missing
armv6_idcache_wbinv_all which was previously taken from cpufunc_asm_pj4b.S.
Reviewed by: gber
1.3 of Intelб╝ Virtualization Technology for Directed I/O Architecture
Specification. The Extended Context and PASIDs from the rev. 2.2 are
not supported, but I am not aware of any released hardware which
implements them. Code does not use queued invalidation, see comments
for the reason, and does not provide interrupt remapping services.
Code implements the management of the guest address space per domain
and allows to establish and tear down arbitrary mappings, but not
partial unmapping. The superpages are created as needed, but not
promoted. Faults are recorded, fault records could be obtained
programmatically, and printed on the console.
Implement the busdma(9) using DMARs. This busdma backend avoids
bouncing and provides security against misbehaving hardware and driver
bad programming, preventing leaks and corruption of the memory by wild
DMA accesses.
By default, the implementation is compiled into amd64 GENERIC kernel
but disabled; to enable, set hw.dmar.enable=1 loader tunable. Code is
written to work on i386, but testing there was low priority, and
driver is not enabled in GENERIC. Even with the DMAR turned on,
individual devices could be directed to use the bounce busdma with the
hw.busdma.pci<domain>:<bus>:<device>:<function>.bounce=1 tunable. If
DMARs are capable of the pass-through translations, it is used,
otherwise, an identity-mapping page table is constructed.
The driver was tested on Xeon 5400/5500 chipset legacy machine,
Haswell desktop and E5 SandyBridge dual-socket boxes, with ahci(4),
ata(4), bce(4), ehci(4), mfi(4), uhci(4), xhci(4) devices. It also
works with em(4) and igb(4), but there some fixes are needed for
drivers, which are not committed yet. Intel GPUs do not work with
DMAR (yet).
Many thanks to John Baldwin, who explained me the newbus integration;
Peter Holm, who did all testing and helped me to discover and
understand several incredible bugs; and to Jim Harris for the access
to the EDS and BWG and for listening when I have to explain my
findings to somebody.
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
busdma implementations to coexist. Copy busdma_machdep.c to
busdma_bounce.c, which is still a single implementation of the busdma
interface on x86 for now. The busdma_machdep.c only contains common
and dispatch code.
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
negative diff that should improve reliability somewhat. There should be
no differences in behavior -- please report any that crop up. This has been
tested on ARM and PPC systems.
Tested by: ray
BUS_PROBE_GENERIC and not BUS_PROBE_SPECIFIC (0) so the OFW SPI bus can
attach when enabled. Export the spibus devclass_t and driver_t
declarations.
Submitted by: ray
Approved by: adrian (mentor)
Change 231031 by brooks@brooks_zenith on 2013/07/11 16:22:08
Turn the unused and uncompilable MIPS_DISABLE_L1_CACHE define in
cache.c into an option and when set force I- and D-cache line
sizes to 0 (the latter part might be better as a tunable).
Fix some casts in an #if 0'd bit of code which attempts to
disable L1 cache ops when the cache is coherent.
Sponsored by: DARPA/AFRL
Change 228019 by bz@bz_zenith on 2013/04/23 13:55:30
Add kernel side support for large TLB on BERI/CHERI.
Modelled similar to NLM
MFC after: 3 days
Sponsored by: DAPRA/AFRL
- Use bus reference phandles in place of FDT offsets as IRQ domain keys
- Unify the identical macio/fdt/mambo OpenPIC drivers into one
- Be more forgiving (following ePAPR) about what we need from the device
tree to identify an OpenPIC
- Correctly map all IRQs into an interrupt domain
- Set IRQ_*_CONFORM for interrupts on an unknown PIC type instead of
failing attachment for that device.
the cfi(4) driver. It remained in the tree longer than would be ideal
due to the time required to bring cfi(4) to feature parity.
Sponsored by: DARPA/AFRL
MFC after: 3 days
Implement support for interrupt-parent nodes in simplebus. The current
implementation requires that device declarations have an interrupt-parent
node and that it point to a device that has registered itself as a
interrupt controller in fdt_ic_list_head and implements the fdt_ic
interface.
Sponsored by: DARPA/AFRL
from the kernel build until it is ready.
sys/conf/files:
Remove the entry for xen/evtchn/evtchn_dev.c so it is not included
in any kernel builds.
Noticed by: smh
differed only with respect to the AIM version not following style(9) and
some additional features for 64-bit systems and machines with direct maps
in the AIM implementation that are no-ops on Book-E (at least for now).
Add an option ATSE_CFI_HACK to allow memory mapped CFI devices to have
their address range allocated sharable so that atse(4) can find it's
Ethernet address in the expected location.
We intend to remove this hack once the BERI platform has a loader.
Change 231100 by brooks@brooks_zenith on 2013/07/12 21:01:31
Add a new option ALTERA_SDCARD_FAST_SIM which checks immediatly
for success of I/O operations rather than queuing a task.
MFC after: 3 days
Sponsored by: DARPA/AFRL
Refactor of /dev/random device. Main points include:
* Userland seeding is no longer used. This auto-seeds at boot time
on PC/Desktop setups; this may need some tweeking and intelligence
from those folks setting up embedded boxes, but the work is believed
to be minimal.
* An entropy cache is written to /entropy (even during installation)
and the kernel uses this at next boot.
* An entropy file written to /boot/entropy can be loaded by loader(8)
* Hardware sources such as rdrand are fed into Yarrow, and are no
longer available raw.
------------------------------------------------------------------------
r256240 | des | 2013-10-09 21:14:16 +0100 (Wed, 09 Oct 2013) | 4 lines
Add a RANDOM_RWFILE option and hide the entropy cache code behind it.
Rename YARROW_RNG and FORTUNA_RNG to RANDOM_YARROW and RANDOM_FORTUNA.
Add the RANDOM_* options to LINT.
------------------------------------------------------------------------
r256239 | des | 2013-10-09 21:12:59 +0100 (Wed, 09 Oct 2013) | 2 lines
Define RANDOM_PURE_RNDTEST for rndtest(4).
------------------------------------------------------------------------
r256204 | des | 2013-10-09 18:51:38 +0100 (Wed, 09 Oct 2013) | 2 lines
staticize struct random_hardware_source
------------------------------------------------------------------------
r256203 | markm | 2013-10-09 18:50:36 +0100 (Wed, 09 Oct 2013) | 2 lines
Wrap some policy-rich code in 'if NOTYET' until we can thresh out
what it really needs to do.
------------------------------------------------------------------------
r256184 | des | 2013-10-09 10:13:12 +0100 (Wed, 09 Oct 2013) | 2 lines
Re-add /dev/urandom for compatibility purposes.
------------------------------------------------------------------------
r256182 | des | 2013-10-09 10:11:14 +0100 (Wed, 09 Oct 2013) | 3 lines
Add missing include guards and move the existing ones out of the
implementation namespace.
------------------------------------------------------------------------
r256168 | markm | 2013-10-08 23:14:07 +0100 (Tue, 08 Oct 2013) | 10 lines
Fix some just-noticed problems:
o Allow this to work with "nodevice random" by fixing where the
MALLOC pool is defined.
o Fix the explicit reseed code. This was correct as submitted, but
in the project branch doesn't need to set the "seeded" bit as this
is done correctly in the "unblock" function.
o Remove some debug ifdeffing.
o Adjust comments.
------------------------------------------------------------------------
r256159 | markm | 2013-10-08 19:48:11 +0100 (Tue, 08 Oct 2013) | 6 lines
Time to eat crow for me.
I replaced the sx_* locks that Arthur used with regular mutexes;
this turned out the be the wrong thing to do as the locks need to
be sleepable. Revert this folly.
# Submitted by: Arthur Mesh <arthurmesh@gmail.com> (In original diff)
------------------------------------------------------------------------
r256138 | des | 2013-10-08 12:05:26 +0100 (Tue, 08 Oct 2013) | 10 lines
Add YARROW_RNG and FORTUNA_RNG to sys/conf/options.
Add a SYSINIT that forces a reseed during proc0 setup, which happens
fairly late in the boot process.
Add a RANDOM_DEBUG option which enables some debugging printf()s.
Add a new RANDOM_ATTACH entropy source which harvests entropy from the
get_cyclecount() delta across each call to a device attach method.
------------------------------------------------------------------------
r256135 | markm | 2013-10-08 07:54:52 +0100 (Tue, 08 Oct 2013) | 8 lines
Debugging. My attempt at EVENTHANDLER(multiuser) was a failure; use
EVENTHANDLER(mountroot) instead.
This means we can't count on /var being present, so something will
need to be done about harvesting /var/db/entropy/... .
Some policy now needs to be sorted out, and a pre-sync cache needs
to be written, but apart from that we are now ready to go.
Over to review.
------------------------------------------------------------------------
r256094 | markm | 2013-10-06 23:45:02 +0100 (Sun, 06 Oct 2013) | 8 lines
Snapshot.
Looking pretty good; this mostly works now. New code includes:
* Read cached entropy at startup, both from files and from loader(8)
preloaded entropy. Failures are soft, but announced. Untested.
* Use EVENTHANDLER to do above just before we go multiuser. Untested.
------------------------------------------------------------------------
r256088 | markm | 2013-10-06 14:01:42 +0100 (Sun, 06 Oct 2013) | 2 lines
Fix up the man page for random(4). This mainly removes no-longer-relevant
details about HW RNGs, reseeding explicitly and user-supplied
entropy.
------------------------------------------------------------------------
r256087 | markm | 2013-10-06 13:43:42 +0100 (Sun, 06 Oct 2013) | 6 lines
As userland writing to /dev/random is no more, remove the "better
than nothing" bootstrap mode.
Add SWI harvesting to the mix.
My box seeds Yarrow by itself in a few seconds! YMMV; more to follow.
------------------------------------------------------------------------
r256086 | markm | 2013-10-06 13:40:32 +0100 (Sun, 06 Oct 2013) | 11 lines
Debug run. This now works, except that the "live" sources haven't
been tested. With all sources turned on, this unlocks itself in
a couple of seconds! That is no my box, and there is no guarantee
that this will be the case everywhere.
* Cut debug prints.
* Use the same locks/mutexes all the way through.
* Be a tad more conservative about entropy estimates.
------------------------------------------------------------------------
r256084 | markm | 2013-10-06 13:35:29 +0100 (Sun, 06 Oct 2013) | 5 lines
Don't use the "real" assembler mnemonics; older compilers may not
understand them (like when building CURRENT on 9.x).
# Submitted by: Konstantin Belousov <kostikbel@gmail.com>
------------------------------------------------------------------------
r256081 | markm | 2013-10-06 10:55:28 +0100 (Sun, 06 Oct 2013) | 12 lines
SNAPSHOT.
Simplify the malloc pools; We only need one for this device.
Simplify the harvest queue.
Marginally improve the entropy pool hashing, making it a bit faster
in the process.
Connect up the hardware "live" source harvesting. This is simplistic
for now, and will need to be made rate-adaptive.
All of the above passes a compile test but needs to be debugged.
------------------------------------------------------------------------
r256042 | markm | 2013-10-04 07:55:06 +0100 (Fri, 04 Oct 2013) | 25 lines
Snapshot. This passes the build test, but has not yet been finished or debugged.
Contains:
* Refactor the hardware RNG CPU instruction sources to feed into
the software mixer. This is unfinished. The actual harvesting needs
to be sorted out. Modified by me (see below).
* Remove 'frac' parameter from random_harvest(). This was never
used and adds extra code for no good reason.
* Remove device write entropy harvesting. This provided a weak
attack vector, was not very good at bootstrapping the device. To
follow will be a replacement explicit reseed knob.
* Separate out all the RANDOM_PURE sources into separate harvest
entities. This adds some secuity in the case where more than one
is present.
* Review all the code and fix anything obviously messy or inconsistent.
Address som review concerns while I'm here, like rename the pseudo-rng
to 'dummy'.
# Submitted by: Arthur Mesh <arthurmesh@gmail.com> (the first item)
------------------------------------------------------------------------
r255319 | markm | 2013-09-06 18:51:52 +0100 (Fri, 06 Sep 2013) | 4 lines
Yarrow wants entropy estimations to be conservative; the usual idea
is that if you are certain you have N bits of entropy, you declare
N/2.
------------------------------------------------------------------------
r255075 | markm | 2013-08-30 18:47:53 +0100 (Fri, 30 Aug 2013) | 4 lines
Remove short-lived idea; thread to harvest (eg) RDRAND enropy into the
usual harvest queues. It was a nifty idea, but too heavyweight.
# Submitted by: Arthur Mesh <arthurmesh@gmail.com>
------------------------------------------------------------------------
r255071 | markm | 2013-08-30 12:42:57 +0100 (Fri, 30 Aug 2013) | 4 lines
Separate out the Software RNG entropy harvesting queue and thread
into its own files.
# Submitted by: Arthur Mesh <arthurmesh@gmail.com>
------------------------------------------------------------------------
r254934 | markm | 2013-08-26 20:07:03 +0100 (Mon, 26 Aug 2013) | 2 lines
Remove the short-lived namei experiment.
------------------------------------------------------------------------
r254928 | markm | 2013-08-26 19:35:21 +0100 (Mon, 26 Aug 2013) | 2 lines
Snapshot; Do some running repairs on entropy harvesting. More needs
to follow.
------------------------------------------------------------------------
r254927 | markm | 2013-08-26 19:29:51 +0100 (Mon, 26 Aug 2013) | 15 lines
Snapshot of current work;
1) Clean up namespace; only use "Yarrow" where it is Yarrow-specific
or close enough to the Yarrow algorithm. For the rest use a neutral
name.
2) Tidy up headers; put private stuff in private places. More could
be done here.
3) Streamline the hashing/encryption; no need for a 256-bit counter;
128 bits will last for long enough.
There are bits of debug code lying around; these will be removed
at a later stage.
------------------------------------------------------------------------
r254784 | markm | 2013-08-24 14:54:56 +0100 (Sat, 24 Aug 2013) | 39 lines
1) example (partially humorous random_adaptor, that I call "EXAMPLE")
* It's not meant to be used in a real system, it's there to show how
the basics of how to create interfaces for random_adaptors. Perhaps
it should belong in a manual page
2) Move probe.c's functionality in to random_adaptors.c
* rename random_ident_hardware() to random_adaptor_choose()
3) Introduce a new way to choose (or select) random_adaptors via tunable
"rngs_want" It's a list of comma separated names of adaptors, ordered
by preferences. I.e.:
rngs_want="yarrow,rdrand"
Such setting would cause yarrow to be preferred to rdrand. If neither of
them are available (or registered), then system will default to
something reasonable (currently yarrow). If yarrow is not present, then
we fall back to the adaptor that's first on the list of registered
adaptors.
4) Introduce a way where RNGs can play a role of entropy source. This is
mostly useful for HW rngs.
The way I envision this is that every HW RNG will use this
functionality by default. Functionality to disable this is also present.
I have an example of how to use this in random_adaptor_example.c (see
modload event, and init function)
5) fix kern.random.adaptors from
kern.random.adaptors: yarrowpanicblock
to
kern.random.adaptors: yarrow,panic,block
6) add kern.random.active_adaptor to indicate currently selected
adaptor:
root@freebsd04:~ # sysctl kern.random.active_adaptor
kern.random.active_adaptor: yarrow
# Submitted by: Arthur Mesh <arthurmesh@gmail.com>
Submitted by: Dag-Erling Smørgrav <des@FreeBSD.org>, Arthur Mesh <arthurmesh@gmail.com>
Reviewed by: des@FreeBSD.org
Approved by: re (delphij)
Approved by: secteam (des,delphij)
- Update FreeBSD version in:
- UPDATING
- sys/conf/newvers.sh
- Add 11.0 FreeBSD version for manual pages
- Bump __FreeBSD_version to 1100000
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation
They're both different cores:
* mips24k is an 8-stage pipeline, mips32r1 ABI, non-superscalar core.
* mips74k is a dual-issue 15-stage superscalar design, mips32r2 ABI.
They have different sets of quirks and bugs; these #define entries
will be used to work around these.
Now, strictly speaking, we should have CPU ABI families (mips32r1, mips32r2,
etc) and CPU core types (mips4k, mips24k, mips74k, etc.) But this is the
starting point of that particular tidy-up.
Reviewed by: imp@
Approved by: re@ (gjb)
Add a SYSINIT that forces a reseed during proc0 setup, which happens
fairly late in the boot process.
Add a RANDOM_DEBUG option which enables some debugging printf()s.
Add a new RANDOM_ATTACH entropy source which harvests entropy from the
get_cyclecount() delta across each call to a device attach method.
Looking pretty good; this mostly works now. New code includes:
* Read cached entropy at startup, both from files and from loader(8) preloaded entropy. Failures are soft, but announced. Untested.
* Use EVENTHANDLER to do above just before we go multiuser. Untested.
Contains:
* Refactor the hardware RNG CPU instruction sources to feed into
the software mixer. This is unfinished. The actual harvesting needs
to be sorted out. Modified by me (see below).
* Remove 'frac' parameter from random_harvest(). This was never
used and adds extra code for no good reason.
* Remove device write entropy harvesting. This provided a weak
attack vector, was not very good at bootstrapping the device. To
follow will be a replacement explicit reseed knob.
* Separate out all the RANDOM_PURE sources into separate harvest
entities. This adds some secuity in the case where more than one
is present.
* Review all the code and fix anything obviously messy or inconsistent.
Address som review concerns while I'm here, like rename the pseudo-rng
to 'dummy'.
Submitted by: Arthur Mesh <arthurmesh@gmail.com> (the first item)
Update the OFED Infiniband core to the version supplied in Linux
version 3.7.
The update to OFED is nearly all additional defines and functions
with the exception of the addition of additional parameters to
ib_register_device() and the reg_user_mr callback.
In addition the ibcore (Infiniband core) and ipoib (IP over Infiniband)
have both been made into completely loadable modules to facilitate
testing of the OFED stack in FreeBSD.
Finally the Mellanox Infiniband drivers are now updated to the
latest version shipping with Linux 3.7.
Submitted by: Mellanox FreeBSD driver team:
Oded Shanoon (odeds mellanox.com),
Meny Yossefi (menyy mellanox.com),
Orit Moskovich (oritm mellanox.com)
Approved by: re
newvers.sh. Pass it in from include/Makefile. If it isn't passed in,
fall back to the old logic of using dirname $0.
Using dirname $0 does not yield the path to the script if it was
sourced in from another script in another directory; you end up with
the parent script's path. That was causing newvers.sh to look one
level below the FreeBSD src/ directory when building osreldate.h and it
may find something like a git or svn repo there that has nothing to do
with FreeBSD.
PR: 174422
Approved by: re ()
MFC after: 2 weeks
install directly into standard POWER LPARs, as found for example in
QEMU. The core of this device is the SCSI RDMA protocol as also found in
Infiniband. The SRP portions of the driver will be factored out and placed
/sys/cam in the future to allow them to be used for IB storage. Thanks to
Scott Long for a great deal of implementation help.
Reviewed by: scottl
Approved by: re (kib)
This commit marks the point the final KBI change was made as part of the
10.0-RELEASE cycle.
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation
to implement epoll subset of functionality. The kqueue user data are 32bit
on i386 which is not enough for epoll user data so this patch overrides
kqueue fileops to maintain enough space in struct file.
Initial patch developed by me in 2007 and then extended and finished
by Yuri Victorovich.
Approved by: re (delphij)
Sponsored by: Google Summer of Code
Submitted by: Yuri Victorovich <yuri at rawbw dot com>
Tested by: Yuri Victorovich <yuri at rawbw dot com>
Requirements) systems from the projects/pseries branch. This in principle
includes all IBM POWER hardware released in the last 15 years with the
exception of POWER3-based systems when run in 64-bit mode. The main
development target, however, has been the PAPR logical partition support
that is the default target in KVM on POWER and QEMU -- mileage may vary
on actual hardware at present. Much of the heavy lifting here was done
by Andreas Tobler.
Approved by: re (kib)
and the equivalent functionality is now provided by sendfile(2) over
posix shared memory filedescriptor.
Remove the cow member of struct vm_page, and rearrange the remaining
members. While there, make hold_count unsigned.
Requested and reviewed by: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
Approved by: re (delphij)
Use a variant of mips libc memcpy for kernel. This implementation uses
64-bit operations when compiled for 64-bit, and is significantly faster
in that case.
Submitted by: Tanmay Jagdale <tanmayj@broadcom.com>
in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights {
uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.
The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.
To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:
#define cap_rights_set(rights, ...) \
__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);
Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.
This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
performance... Use SSE2 instructions for calculating the XTS tweek
factor... Let the compiler do more work and handle register allocation
by using intrinsics, now only the key schedule is in assembly...
Replace .byte hard coded instructions w/ the proper instructions now
that both clang and gcc support them...
On my machine, pulling the code to userland I saw performance go from
~150MB/sec to 2GB/sec in XTS mode. GELI on GNOP saw a more modest
increase of about 3x due to other system overhead (geom and
opencrypto)...
These changes allow almost full disk io rate w/ geli...
Reviewed by: -current, -security
Thanks to: Mike Hamburg for the XTS tweek algorithm
Use this new driver for both PV and HVM instances.
This driver requires a Xen hypervisor that supports vector callbacks,
VCPUOP hypercalls, and reports that it has a "safe PV clock".
New timer driver:
Submitted by: will
Sponsored by: Spectra Logic Corporation
PV port to new driver, and bug fixes:
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
sys/dev/xen/timer/timer.c:
- Register a PV timer device driver which (currently)
implements device_{identify,probe,attach} and stubs
device_detach. The detach routine requires functionality
not provided by timecounters(4). The suspend and resume
routines need additional work (due to Xen requiring that
the hypercalls be executed on the target VCPU), and aren't
needed for our purposes.
- Make sure there can only be one device instance of this
driver, and that it only registers one eventtimers(4) and
one timecounters(4) device interface. Make both interfaces
use PCPU data as needed.
- Match, with a few style cleanups & API differences, the
Xen versions of the "fetch time" functions.
- Document the magic scale_delta() better for the i386 version.
- When registering the event timer, bind a separate event
channel for the timer VIRQ to the device's event timer
interrupt handler for each active VCPU. Describe each
interrupt as "xen_et:c%d", so they can be identified per
CPU in "vmstat -i" or "show intrcnt" in KDB.
- When scheduling a timer into the hypervisor, try up to
60 times if the hypervisor rejects the time as being in
the past. In the common case, this retry shouldn't happen,
and if it does, it should only happen once. This is
because the event timer advertises a minimum period of
100usec, which is only less than the usual hypercall round
trip time about 1 out of every 100 tries. (Unlike other
similar drivers, this one actually checks whether the
hypervisor accepted the singleshot timer set hypercall.)
- Implement a RTC PV clock based on the hypervisor wallclock.
sys/conf/files:
- Add dev/xen/timer/timer.c if the kernel configuration
includes either the XEN or XENHVM options.
sys/conf/files.i386:
sys/i386/include/xen/xen_clock_util.h:
sys/i386/xen/clock.c:
sys/i386/xen/xen_clock_util.c:
sys/i386/xen/mp_machdep.c:
sys/i386/xen/xen_rtc.c:
- Remove previous PV timer used in i386 XEN PV kernels, the
new timer introduced in this change is used instead (so
we share the same code between PVHVM and PV).
MFC after: 2 weeks
Re-structure Xen HVM support so that:
- Xen is detected and hypercalls can be performed very
early in system startup.
- Xen interrupt services are implemented using FreeBSD's native
interrupt delivery infrastructure.
- the Xen interrupt service implementation is shared between PV
and HVM guests.
- Xen interrupt handlers can optionally use a filter handler
in order to avoid the overhead of dispatch to an interrupt
thread.
- interrupt load can be distributed among all available CPUs.
- the overhead of accessing the emulated local and I/O apics
on HVM is removed for event channel port events.
- a similar optimization can eventually, and fairly easily,
be used to optimize MSI.
Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure,
and misc Xen cleanups:
Sponsored by: Spectra Logic Corporation
Unification of PV & HVM interrupt infrastructure, bug fixes,
and misc Xen cleanups:
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
sys/x86/x86/local_apic.c:
sys/amd64/include/apicvar.h:
sys/i386/include/apicvar.h:
sys/amd64/amd64/apic_vector.S:
sys/i386/i386/apic_vector.s:
sys/amd64/amd64/machdep.c:
sys/i386/i386/machdep.c:
sys/i386/xen/exception.s:
sys/x86/include/segments.h:
Reserve IDT vector 0x93 for the Xen event channel upcall
interrupt handler. On Hypervisors that support the direct
vector callback feature, we can request that this vector be
called directly by an injected HVM interrupt event, instead
of a simulated PCI interrupt on the Xen platform PCI device.
This avoids all of the overhead of dealing with the emulated
I/O APIC and local APIC. It also means that the Hypervisor
can inject these events on any CPU, allowing upcalls for
different ports to be handled in parallel.
sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
Map Xen per-vcpu area during AP startup.
sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
Increase the FreeBSD IRQ vector table to include space
for event channel interrupt sources.
sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
Remove Xen HVM per-cpu variable data. These fields are now
allocated via the dynamic per-cpu scheme. See xen_intr.c
for details.
sys/amd64/include/xen/hypercall.h:
sys/dev/xen/blkback/blkback.c:
sys/i386/include/xen/xenvar.h:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/xen/gnttab.c:
Prefer FreeBSD primatives to Linux ones in Xen support code.
sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
sys/dev/xen/balloon/balloon.c:
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/console/xencons_ring.c:
sys/dev/xen/control/control.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/xenpci/xenpci.c:
sys/i386/i386/machdep.c:
sys/i386/include/pmap.h:
sys/i386/include/xen/xenfunc.h:
sys/i386/isa/npx.c:
sys/i386/xen/clock.c:
sys/i386/xen/mp_machdep.c:
sys/i386/xen/mptable.c:
sys/i386/xen/xen_clock_util.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/xen_rtc.c:
sys/xen/evtchn/evtchn_dev.c:
sys/xen/features.c:
sys/xen/gnttab.c:
sys/xen/gnttab.h:
sys/xen/hvm.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstore.c:
sys/xen/xenstore/xenstore_dev.c:
sys/xen/xenstore/xenstorevar.h:
Pull common Xen OS support functions/settings into xen/xen-os.h.
sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
Remove constants, macros, and functions unused in FreeBSD's Xen
support.
sys/xen/xen-os.h:
sys/i386/xen/xen_machdep.c:
sys/x86/xen/hvm.c:
Introduce new functions xen_domain(), xen_pv_domain(), and
xen_hvm_domain(). These are used in favor of #ifdefs so that
FreeBSD can dynamically detect and adapt to the presence of
a hypervisor. The goal is to have an HVM optimized GENERIC,
but more is necessary before this is possible.
sys/amd64/amd64/machdep.c:
sys/dev/xen/xenpci/xenpcivar.h:
sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/sys/kernel.h:
Refactor magic ioport, Hypercall table and Hypervisor shared
information page setup, and move it to a dedicated HVM support
module.
HVM mode initialization is now triggered during the
SI_SUB_HYPERVISOR phase of system startup. This currently
occurs just after the kernel VM is fully setup which is
just enough infrastructure to allow the hypercall table
and shared info page to be properly mapped.
sys/xen/hvm.h:
sys/x86/xen/hvm.c:
Add definitions and a method for configuring Hypervisor event
delievery via a direct vector callback.
sys/amd64/include/xen/xen-os.h:
sys/x86/xen/hvm.c:
sys/conf/files:
sys/conf/files.amd64:
sys/conf/files.i386:
Adjust kernel build to reflect the refactoring of early
Xen startup code and Xen interrupt services.
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
sys/dev/xen/control/control.c:
sys/dev/xen/evtchn/evtchn_dev.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/xen/xenstore/xenstore.c:
sys/xen/evtchn/evtchn_dev.c:
sys/dev/xen/console/console.c:
sys/dev/xen/console/xencons_ring.c
Adjust drivers to use new xen_intr_*() API.
sys/dev/xen/blkback/blkback.c:
Since blkback defers all event handling to a taskqueue,
convert this task queue to a "fast" taskqueue, and schedule
it via an interrupt filter. This avoids an unnecessary
ithread context switch.
sys/xen/xenstore/xenstore.c:
The xenstore driver is MPSAFE. Indicate as much when
registering its interrupt handler.
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusvar.h:
Remove unused event channel APIs.
sys/xen/evtchn.h:
Remove all kernel Xen interrupt service API definitions
from this file. It is now only used for structure and
ioctl definitions related to the event channel userland
device driver.
Update the definitions in this file to match those from
NetBSD. Implementing this interface will be necessary for
Dom0 support.
sys/xen/evtchn/evtchnvar.h:
Add a header file for implemenation internal APIs related
to managing event channels event delivery. This is used
to allow, for example, the event channel userland device
driver to access low-level routines that typical kernel
consumers of event channel services should never access.
sys/xen/interface/event_channel.h:
sys/xen/xen_intr.h:
Standardize on the evtchn_port_t type for referring to
an event channel port id. In order to prevent low-level
event channel APIs from leaking to kernel consumers who
should not have access to this data, the type is defined
twice: Once in the Xen provided event_channel.h, and again
in xen/xen_intr.h. The double declaration is protected by
__XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared
twice within a given compilation unit.
sys/xen/xen_intr.h:
sys/xen/evtchn/evtchn.c:
sys/x86/xen/xen_intr.c:
sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/xenpci/xenpcivar.h:
New implementation of Xen interrupt services. This is
similar in many respects to the i386 PV implementation with
the exception that events for bound to event channel ports
(i.e. not IPI, virtual IRQ, or physical IRQ) are further
optimized to avoid mask/unmask operations that aren't
necessary for these edge triggered events.
Stubs exist for supporting physical IRQ binding, but will
need additional work before this implementation can be
fully shared between PV and HVM.
sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
sys/i386/xen/mp_machdep.c
sys/x86/xen/hvm.c:
Add support for placing vcpu_info into an arbritary memory
page instead of using HYPERVISOR_shared_info->vcpu_info.
This allows the creation of domains with more than 32 vcpus.
sys/i386/i386/machdep.c:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/exception.s:
Add support for new event channle implementation.
dynamic translation so that their arguments match the definitions for
these providers in Solaris and illumos. Thus, existing scripts for these
providers should work unmodified on FreeBSD.
Tested by: gnn, hiren
MFC after: 1 month
* It's not meant to be used in a real system, it's there to show how
the basics of how to create interfaces for random_adaptors. Perhaps
it should belong in a manual page
2) Move probe.c's functionality in to random_adaptors.c
* rename random_ident_hardware() to random_adaptor_choose()
3) Introduce a new way to choose (or select) random_adaptors via tunable
"rngs_want" It's a list of comma separated names of adaptors, ordered
by preferences. I.e.:
rngs_want="yarrow,rdrand"
Such setting would cause yarrow to be preferred to rdrand. If neither of
them are available (or registered), then system will default to
something reasonable (currently yarrow). If yarrow is not present, then
we fall back to the adaptor that's first on the list of registered
adaptors.
4) Introduce a way where RNGs can play a role of entropy source. This is
mostly useful for HW rngs.
The way I envision this is that every HW RNG will use this
functionality by default. Functionality to disable this is also present.
I have an example of how to use this in random_adaptor_example.c (see
modload event, and init function)
5) fix kern.random.adaptors from
kern.random.adaptors: yarrowpanicblock
to
kern.random.adaptors: yarrow,panic,block
6) add kern.random.active_adaptor to indicate currently selected
adaptor:
root@freebsd04:~ # sysctl kern.random.active_adaptor
kern.random.active_adaptor: yarrow
Submitted by: Arthur Mesh <arthurmesh@gmail.com>
(sys/dev/iscsi_initiator/ instead of sys/dev/iscsi/initiator/), to make
room for the new one. This is also more logical location (kernel module
being named iscsi_initiator.ko, for example). There is no ongoing work
on this I know of, so it shouldn't make life harder for anyone.
There are no functional changes, apart from "svn mv" and adjusting paths.
Basic support for extents was implemented by Zheng Liu as part
of his Google Summer of Code in 2010. This support is read-only
at this time.
In addition to extents we also support the huge_file extension
for read-only purposes. This works nicely with the additional
support for birthtime/nanosec timestamps and dir_index that
have been added lately.
The implementation may not work for all ext4 filesystems as
it doesn't support some features that are being enabled by
default on recent linux like flex_bg. Nevertheless, the feature
should be very useful for migration or simple access in
filesystems that have been converted from ext2/3 or don't use
incompatible features.
Special thanks to Zheng Liu for his dedication and continued
work to support ext2 in FreeBSD.
Submitted by: Zheng Liu (lz@)
Reviewed by: Mike Ma, Christoph Mallon (previous version)
Sponsored by: Google Inc.
MFC after: 3 weeks
random_adaptor is basically an adapter that plugs in to random(4).
random_adaptor can only be plugged in to random(4) very early in bootup.
Unplugging random_adaptor from random(4) is not supported, and is probably a
bad idea anyway, due to potential loss of entropy pools.
We currently have 3 random_adaptors:
+ yarrow
+ rdrand (ivy.c)
+ nehemeiah
* Remove platform dependent logic from probe.c, and move it into
corresponding registration routines of each random_adaptor provider.
probe.c doesn't do anything other than picking a specific random_adaptor
from a list of registered ones.
* If the kernel doesn't have any random_adaptor adapters present then the
creation of /dev/random is postponed until next random_adaptor is kldload'ed.
* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a
system wide one.
Submitted by: arthurmesh@gmail.com, obrien
Obtained from: Juniper Networks
Reviewed by: so (des)
(or svnliteversion) in the current lookup path is not what
was used to check out the tree. If an incompatible version
is used, the svn revision number is not reported in uname(1).
Run ${svnversion} on newvers.sh itself when evaluating if the
svn(1) in use is compatible with the tree. Fallback to an
empty ${svnversion} if necessary.
With this change, svnliteversion from base is only used
if no compatible svnversion is found, so with this change,
the version of svn(1) from the ports tree is evaluated first.
Requested by: many
MFC after: 3 days
X-MFC-To: stable/9, releng/9.2 only
Do this by forcing inclusion of
sys/cddl/compat/opensolaris/sys/debug_compat.h
via -include option into all source files from OpenSolaris.
Note that this -include option must always be after -include opt_global.h.
Additionally, remove forced definition of DEBUG for some modules and fix
their build without DEBUG.
Also, meaning of DEBUG was overloaded to enable WITNESS support for some
OpenSolaris (primarily ZFS) locks. Now this overloading is removed and
that use of DEBUG is replaced with a new option OPENSOLARIS_WITNESS.
MFC after: 17 days
the tree version, for example if the tree is checked out with an
outdated svn from ports, but the base system svnlite is built.
Approved by: kib (mentor)
- change the SI_SUB_RUN_SCHEDULER sysinits in hv_utilc and
hv_netvsc_drv_freebsd.c to SI_SUB_KTHREAD_IDLE, since the
former is no longer in FreeBSD.
The use of these SYSINITs can probably be removed.
Support chipsets are the Realtek RTL8188SU, RTL8191SU, and RTL8192SU.
Many thanks to Idwer Vollering for porting/writing the man page and for
testing.
Reviewed by: adrian, hselasky
Obtained from: OpenBSD
Tested by: kevlo, Idwer Vollering <vidwer at gmail.com>
* Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option.
The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow.
* random(4) device doesn't really depend on rijndael-*. Yarrow, however, does.
* Add random_adaptors.[ch] which is basically a store of random_adaptor's.
random_adaptor is basically an adapter that plugs in to random(4).
random_adaptor can only be plugged in to random(4) very early in bootup.
Unplugging random_adaptor from random(4) is not supported, and is probably a
bad idea anyway, due to potential loss of entropy pools.
We currently have 3 random_adaptors:
+ yarrow
+ rdrand (ivy.c)
+ nehemeiah
* Remove platform dependent logic from probe.c, and move it into
corresponding registration routines of each random_adaptor provider.
probe.c doesn't do anything other than picking a specific random_adaptor
from a list of registered ones.
* If the kernel doesn't have any random_adaptor adapters present then the
creation of /dev/random is postponed until next random_adaptor is kldload'ed.
* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a
system wide one.
Submitted by: arthurmesh@gmail.com, obrien
Obtained from: Juniper Networks
Reviewed by: obrien
The original API calls for pow2ns, however the new APIs from
Linux call for seconds.
We need to be able to convert to/from 2^Nns to seconds in both
userland and kernel to fix this and properly compare units.
all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and
for general purpose monitoring without tapping any cxgbe or cxl ifnet
directly.
Tracers on the T4/T5 chips provide access to Ethernet frames exactly as
they were received from or transmitted on the wire. On transmit, a
tracer will capture a frame after TSO segmentation, hw VLAN tag
insertion, hw L3 & L4 checksum insertion, etc. It will also capture
frames generated by the TCP offload engine (TOE traffic is normally
invisible to the kernel). On receive, a tracer will capture a frame
before hw VLAN extraction, runt filtering, other badness filtering,
before the steering/drop/L2-rewrite filters or the TOE have had a go at
it, and of course before sw LRO in the driver.
There are 4 tracers on a chip. A tracer can trace only in one direction
(tx or rx). For now cxgbetool will set up tracers to capture the first
128B of every transmitted or received frame on a given port. This is a
small subset of what the hardware can do. A pseudo ifnet with the same
name as the nexus driver (t4nex0 or t5nex0) will be created for tracing.
The data delivered to this ifnet is an additional copy made inside the
chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual.
/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire
If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K
"frames" with no L3/L4 checksum but this will show you the frames that
were actually transmitted.
/* all done */
# cxgbetool t5nex0 tracer 0 disable
# cxgbetool t5nex0 tracer 1 disable
# cxgbetool t5nex0 tracer list
# ifconfig t5nex0 destroy
This time it is for a git mirror that stores svn revisions as
git notes, e.g. https://github.com/freebsd/freebsd
MFC after: 10 days
Sponsored by: HybridCluster
As part of this commit, add an nvme_strvis() function which borrows
heavily from cam_strvis(). This will allow stripping of
leading/trailing whitespace and also handle unprintable characters
in model/serial numbers. This function goes into a new nvme_util.c
file which is used by both the driver and nvmecontrol.
Sponsored by: Intel
Reviewed by: carl
MFC after: 3 days
make the ARM EABI the default ABI on arm, armeb, armv6 and armv6eb.
This is intended to be the default ABI from now on with the old ABI to be
retired. Because of this all users are strongly suggested to upgrade to the
ARM EABI.
As the two ABIs are incompatible it is unlikely upgrading in place will
work. Users should perform a full backup and either use an external machine
to upgrade, or install to an alternative location on their media. They
should also reinstall all ports or packages when these are available.
The only known issues are:
- pkg incorrectly detects the ABI. This is fixed upstream, and will a
patch will be made to the port.
- GDB can have issues with executables built with clang.
__FreeBSD_version has been bumped.
information into the ISN (initial sequence number) without the additional
use of timestamp bits and switching to the very fast and cryptographically
strong SipHash-2-4 MAC hash algorithm to protect the SYN cookie against
forgeries.
The purpose of SYN cookies is to encode all necessary session state in
the 32 bits of our initial sequence number to avoid storing any information
locally in memory. This is especially important when under heavy spoofed
SYN attacks where we would either run out of memory or the syncache would
fill with bogus connection attempts swamping out legitimate connections.
The original SYN cookies method only stored an indexed MSS values in the
cookie. This isn't sufficient anymore and breaks down in the presence of
WSCALE information which is only exchanged during SYN and SYN-ACK. If we
can't keep track of it then we may severely underestimate the available
send or receive window. This is compounded with large windows whose size
information on the TCP segment header is even lower numerically. A number
of years back SYN cookies were extended to store the additional state in
the TCP timestamp fields, if available on a connection. While timestamps
are common among the BSD, Linux and other *nix systems Windows never enabled
them by default and thus are not present for the vast majority of clients
seen on the Internet.
The common parameters used on TCP sessions have changed quite a bit since
SYN cookies very invented some 17 years ago. Today we have a lot more
bandwidth available making the use window scaling almost mandatory. Also
SACK has become standard making recovering from packet loss much more
efficient.
This change moves all necessary information into the ISS removing the need
for timestamps. Both the MSS (16 bits) and send WSCALE (4 bits) are stored
in 3 bit indexed form together with a single bit for SACK. While this is
significantly less than the original range, it is sufficient to encode all
common values with minimal rounding.
The MSS depends on the MTU of the path and with the dominance of ethernet
the main value seen is around 1460 bytes. Encapsulations for DSL lines
and some other overheads reduce it by a few more bytes for many connections
seen. Rounding down to the next lower value in some cases isn't a problem
as we send only slightly more packets for the same amount of data.
The send WSCALE index is bit more tricky as rounding down under-estimates
the available send space available towards the remote host, however a small
number values dominate and are carefully selected again.
The receive WSCALE isn't encoded at all but recalculated based on the local
receive socket buffer size when a valid SYN cookie returns. A listen socket
buffer size is unlikely to change while active.
The index values for MSS and WSCALE are selected for minimal rounding errors
based on large traffic surveys. These values have to be periodically
validated against newer traffic surveys adjusting the arrays tcp_sc_msstab[]
and tcp_sc_wstab[] if necessary.
In addition the hash MAC to protect the SYN cookies is changed from MD5
to SipHash-2-4, a much faster and cryptographically secure algorithm.
Reviewed by: dwmalone
Tested by: Fabian Keil <fk@fabiankeil.de>
system has svnliteversion.
- If svnliteversion is not found, look for svnversion in /usr/bin
and /usr/local/bin, since svnlite can be installed as svn if
WITH_SVN is set.[1]
- Remove /bin from binary search paths.[1]
Discussed with: kib [1]
MFC after: 3 days
Approved by: kib (mentor)
- Reconnect with some minor modifications, in particular now selsocket()
internals are adapted to use sbintime units after recent'ish calloutng
switch.
originally inspired by the Solaris vmem detailed in the proceedings
of usenix 2001. The NetBSD version was heavily refactored for bugs
and simplicity.
- Use this resource allocator to allocate the buffer and transient maps.
Buffer cache defrags are reduced by 25% when used by filesystems with
mixed block sizes. Ultimately this may permit dynamic buffer cache
sizing on low KVA machines.
Discussed with: alc, kib, attilio
Tested by: pho
Sponsored by: EMC / Isilon Storage Division
same as top-level target name for "device runfw" kernel option and
caused cyclic dependancy that lead to kernel build breakage
Module change is not strictly required and done for name unification sake
PR: conf/175751
Submitted by: Issei <i10a at herbmint.jp>
(which should be a PCIE Gen 3 slot for this adapter) by looking back thru the PCI
parent devices to the slot device.
The fix above also corrects the bandwidth display to GT/s rather than the
incorrect Gb/s
Next, allow the use of ALTQ if you select the compile option IXGBE_LEGACY_TX.
Allow the use of 'unsupported' optic modules by a compile option as well.
Add a phy reset capability into the stop code, this is so a static configured
driver will still behave properly when taken down (not being able to unload it).
This revision synchronizes the shared code with Intel internal current code,
and note that it now includes DCB supporting code, this was necessitated by
some internal changes with the code, but it also will provide the opportunity
to develop this feature in the core driver down the road.
I have edited the README to get rid of some of the worse anachronisms in it
as well, its by no means as robust as I might wish at this point however.
Oh, I also have included some conditional stuff in the code so it will be
compatible in both the 9.X and 10 environments.
Performance has been a focus in recent changes and I believe this revision
driver will perform very well in most workloads.
MFC after: 2 weeks
Basically the situation is as follows:
- When using Clang + armv6, we should not need any intrinsics. It should
support it, even though due to a target misconfiguration it does not.
We should fix this in Clang.
- When using Clang + noarmv6, provide __atomic_* functions that disable
interrupts.
- When using GCC + armv6, we can provide __sync_* intrinsics, similar to
what we did for MIPS. As ARM and MIPS are quite similar, simply base
this implementation on the one I did for MIPS.
- When using GCC + noarmv6, disable the interrupts, like we do for
Clang.
This implementation still lacks functions for noarmv6 userspace. To be
done.
The AR9485 chip and AR933x SoC both implement LNA diversity.
There are a few extra things that need to happen before this can be
flipped on for those chips (mostly to do with setting up the different
bias values and LNA1/LNA2 RSSI differences) but the first stage is
putting this code into the driver layer so it can be reused.
This has the added benefit of making it easier to expose configuration
options and diagnostic information via the ioctl API. That's not yet
being done but it sure would be nice to do so.
Tested:
* AR9285, with LNA diversity enabled
* AR9285, with LNA diversity disabled in EEPROM
Realtek RTL8188CU/RTL8192CU USB IEEE 802.11b/g/n wireless cards.
This driver requires microcode which is available in FreeBSD ports:
net/urtwn-firmware-kmod.
Hiren ported the urtwn(4) man page from OpenBSD and Glen just commited a port
for the firmware.
TODO:
- 802.11n support
- Stability fixes - the driver can sustain lots of traffic but has trouble
coping with simultaneous iperf sessions.
- fix debugging
MFC after: 2 months
Tested by: kevlo, hiren, gjb
To make <stdatomic.h> work on MIPS (and ARM) using GCC, we need to
provide implementations of the __sync_*() functions. I already added
these functions for 4 and 8 byte types to libcompiler-rt some time ago,
based on top of <machine/atomic.h>.
Unfortunately, <machine/atomic.h> only provides a subset of the features
needed to implement <stdatomic.h>. This means that in some cases we had
to do compare-and-exchange calls in loops, where a simple ll/sc would
suffice.
Also implement these functions for 1 and 2 byte types. MIPS only
provides ll/sc instructions for 4 and 8 byte types, but this is of
course no limitation. We can simply load 4 bytes and use some bitmask
tricks to modify only the bytes affected.
Discussed on: mips, arch
Tested with: QEMU
for the WB195 combo NIC - an AR9285 w/ an AR3011 USB bluetooth NIC.
The AR3011 is wired up using a 3-wire coexistence scheme to the AR9285.
The code in if_ath_btcoex.c sets up the initial hardware mapping
and coexistence configuration. There's nothing special about it -
it's static; it doesn't try to configure bluetooth / MAC traffic priorities
or try to figure out what's actually going on. It's enough to stop basic
bluetooth traffic from causing traffic stalls and diassociation from
the wireless network.
To use this code, you must have the above NIC. No, it won't work
for the AR9287+AR3012, nor the AR9485, AR9462 or AR955x combo cards.
Then you set a kernel hint before boot or before kldload, where 'X'
is the unit number of your AR9285 NIC:
# kenv hint.ath.X.btcoex_profile=wb195
This will then appear in your boot messages:
[100482] athX: Enabling WB195 BTCOEX
This code is going to evolve pretty quickly (well, depending upon my
spare time) so don't assume the btcoex API is going to stay stable.
In order to use the bluetooth side, you must also load in firmware using
ath3kfw and the binary firmware file (ath3k-1.fw in my case.)
Tested:
* AR9280, no interference
* WB195 - AR9285 + AR3011 combo; STA mode; basic bluetooth inquiries
were enough to cause traffic stalls and disassociations. This has
stopped with the btcoex profile code.
TODO:
* Importantly - the AR9285 needs ASPM disabled if bluetooth coexistence
is enabled. No, I don't know why. It's likely some kind of bug to do
with the AR3011 sending bluetooth coexistence signals whilst the device
is asleep. Since we don't actually sleep the MAC just yet, it shouldn't
be a problem. That said, to be totally correct:
+ ASPM should be disabled - upon attach and wakeup
+ The PCIe powersave HAL code should never be called
Look at what the ath9k driver does for inspiration.
* Add WB197 (AR9287+AR3012) support
* Add support for the AR9485, which is another combo like the AR9285
* The later NICs have a different signaling mechanism between the MAC
and the bluetooth device; I haven't even begun to experiment with
making that HAL code work. But it should be a lot more automatic.
* The hardware can do much more interesting traffic weighting with
bluetooth and wifi traffic. None of this is currently used.
Ideally someone would code up something to watch the bluetooth traffic
GPIO (via an interrupt) and then watch it go high/low; then figure out
what the bluetooth traffic is and adjust things appropriately.
* If I get the time I may add in some code to at least track this stuff
and expose statistics. But it's up to someone else to experiment with
the bluetooth coexistence support and add the interesting stuff (like
"real" detection of bulk, audio, etc bluetooth traffic patterns and
change wifi parameters appropriately - eg, maximum aggregate length,
transmit power, using quiet time to control TX duty cycle, etc.)
1. Common headers for fdt.h and ofw_machdep.h under x86/include
with indirections under i386/include and amd64/include.
2. New modinfo for loader provided FDT blob.
3. Common x86_init_fdt() called from hammer_time() on amd64 and
init386() on i386.
4. Split-off FDT specific low-level console functions from FDT
bus methods for the uart(4) driver. The low-level console
logic has been moved to uart_cpu_fdt.c and is used for arm,
mips & powerpc only. The FDT bus methods are shared across
all architectures.
5. Add dev/fdt/fdt_x86.c to hold the fdt_fixup_table[] and the
fdt_pic_table[] arrays. Both are empty right now.
FDT addresses are I/O ports on x86. Since the core FDT code does
not handle different address spaces, adding support for both I/O
ports and memory addresses requires some thought and discussion.
It may be better to use a compile-time option that controls this.
Obtained from: Juniper Networks, Inc.
QLogic 8300 Series Adapters
Submitted by: David C Somayajulu (davidcs@freebsd.org) QLogic Corporation
Approved by: George Neville-Neil (gnn@freebsd.org)
PV entries are now roughly half the size.
Instead of using a shared UMA zone for 28 byte pv entries
(two 8-byte tailq nodes, a 4 byte pointer, a 4 byte address and 4 byte
flags), we allocate a page at a time per process.
This provides 252 pv entries per process (actually, per pmap address space)
and eliminates one of the 8-byte tailq entries since we now can track
per-process pv entries implicitly.
The pointer to the pmap can be eliminated by doing address arithmetic to
find the metadata on the page headers to find a single pointer shared by
all 252 entries. There is an 8-int bitmap for the freelist of those 252
entries.
When in serious low memory condition, allocation of another pv_chunk is
possible by freeing some pages in pmap_pv_reclaim().
Added pv_entry/pv_chunk related statistics to pmap.
pv_entry/pv_chunk statistics can be accessed via sysctl vm.pmap.
Ported PTE freelist of KVA allocation and maintenance from i386.
Using an idea from Stephan Uphoff, use the empty pte's that correspond
to the unused kva in the pv memory block to thread a freelist through.
This allows us to free pages that used to be used for pv entry chunks
since we can now track holes in the kva memory block.
As both ARM pmap.c and pmap-v6.c use the same header and pv_entry, pmap and
md_page structures are different, it was needed to separate code designed
for ARMv6/7 from the one for other ARMs.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: alc
Sponsored by: The FreeBSD Foundation, Semihalf
with any structure containing a uint64_t index. The tree code
auto-generates type safe wrappers.
- Eliminate the buf splay and replace it with pctrie. This is not only
significantly faster with large files but also allows for the possibility
of shared locking.
Reviewed by: alc, attilio
Sponsored by: EMC / Isilon Storage Division
locks. To support this, VNODE locks are created with the LK_IS_VNODE
flag. This flag is propagated down using the LO_IS_VNODE flag.
Note that WITNESS still records the LOR. Only the printing and the
optional entering into the kernel debugger is bypassed with the
WITNESS_NO_VNODE option.