all the clocks that they provide.
Each clocks are exported under the node 'clock.<clkname>' and have the following
children nodes :
- frequency
- parent (The selected parent, if any)
- parents (The list of parents, if any)
- childrens (The list of childrens, if any)
- enable_cnt (The enabled counter)
This give us the possibility to examine clocks at runtime and make graph of
the clock flow.
Reviewed by: mmel
MFC after: 2 month
Differential Revision: https://reviews.freebsd.org/D9833
This interface has no in-tree consumers and has been more or less
non-functional for several releases.
Remove manpage note that the procfs special file 'mem' is grouped to
kmem. This hasn't been true since r81107.
Remove procfs' README file. It is an out of date duplication of the manpage
(quoth the README: "since the bsd kernel is single-processor...").
Reviewed by: vangyzen, bcr (manpage)
Approved by: des (procfs maintainer), vangyzen (mentor)
Differential Revision: https://reviews.freebsd.org/D9802
For 4965 just extract 'is_chan_5ghz' flag from the RXON structure
(like it was done in r281287); for others it was never used.
Tested with Intel 6205, STA mode.
have been in the code all along, but were masked by having a fifo depth of
one byte at the hardware level, so everything kinda worked by accident.
The hardware interrupts when the TX fifo is half empty, so set
sc->sc_txfifosz to 8 bytes (half the hardware fifo size) to match. This
eliminates dropped characters on output.
Restructure the read loop to consume all the bytes in the fifo by using
the "rx fifo empty" bit of the flags register rather than the "rx ready"
bit of the interrupt status register. The rx-ready interrupt is cleared
when the number of bytes in the fifo fall below the interrupt trigger
level, leaving the fifo half full every time receive routine was called.
Now it loops until the fifo is completely empty every time (including
when the function is called due to a receive timeout as well as for
fifo-full).
RPi3 cpufreq is more like that on RPi2. Setting arm frequency
above min (say, "sysctl hw.cpufreq.arm_freq=600000001") turns on
turbo mode, and the firmware automatically raises voltage, sets
frequency to max 1200MHz, and throttle when overheat, etc.
Swap if/else parts and use SOC_BCM2835 def so RPi3 can share the
same cpufreq logic as RPi2, instead of falling to that for RPi.
Submitted by: Jia-Shiun Li <jiashiun@gmail.com>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9640
80486 production was stopped by Intel on September 2007. Dropping the 486
configuration option from the GENERIC kernel improves performance
slightly.
Removing I486_CPU is consistent at this time: we don't support any
processor without a FPU and the PC-98 arch, which frequently involved i486
CPUs, is also gone so we don't test such platforms anymore.
Relnotes: yes
MFC after: 2 weeks
https://reviews.freebsd.org/D9879
Modern GCC and Clang simply ignore the qualifier, while the old base GCC
produces a warning (treated as an error in the kernel build).
Approved by: cem
MFC after: 5 days
Linuxulator is x86 only.
The only notable differences in algnment for an LP64 64-bit system
when compared to a 32-bit system is an eight or large byte types
alignment.
MFC after: 1 month
it in emergency in sc_cnputc().
Locking fixes in sc_cnputc() previously turned off normal output in
near-deadlock conditions and added deferred output which might never
be completed. Emergency output goes to the frame buffer using
sufficiently atomic non-blocking writes if the console is in text
mode (in graphics mode, nothing is done, modulo races setting the
graphics mode bit). Screen updates overwrite the emergency output
if the emergency condition clears enough to reach them.
ec_putc() also works for "early" console output in normal x86 text
mode as soon as this mode is initialized (if ever). This uses a
hard-coded x86 frame buffer address before cninit() and a hopefully
MI address after cninit(). But non-x86 is more likely to not support
text mode, when ec_putc() will be null. ec_putc() has no dependencies
of syscons before cninit(), and only has them later to track syscons'
mode changes. This commit doesn't attach ec_putc() for early use.
To test emergency use, put a breakpoint in central syscons output code
like sc_puts() and do some user output. The system used to race or
deadlock in ddb output soon after entry to ddb. The locking fixes
deferred the output until after leaving ddb, so ddb was unusable and
you had to try typing c[ontinue] blindly until it exited, or better use
a serial console in parallel. Now the output goes to a window in the
middle 2/3 of the screen. Scrolling is circular and there is no cursor,
but otherwise ec_putc() provides full dumb terminal functionality and
very fast output that hides artificates from dumb overwrites.
by the CPU number.
This was originally for debugging near-deadlock conditions where
multiple CPUs either deadlock or scramble each other's output trying
to report the problem, but I found it interesting and sometimes
useful for ordinary kernel messages. Ordinary kernel messages
shouldn't be interleaved, but if they are then the colorization
makes them readable even if the interleaving is for every character
(provided the CPU printing each message doesn't change).
The default colors are 8-15 starting at 15 (bright white on black)
for CPU 0 and repeating every 8 CPUs. This works best with 8 CPUs.
Non-bright colors and nonzero background colors need special
configuration to avoid unreadable and ugly combinations so are not
configured by default. The next bright color after 15 is 8 (bright
black = dark gray) is not very readable but is the only other color
used with 2 CPUs. After that the next bright color is 9 (bright
blue) which is not much brighter than bright black, but is used with
3+ CPUs. Other bright colors are brighter.
Colorization is configured by default so that it gets tested. It can
only be turned off by configuring SC_KERNEL_CONS_ATTR to anything other
than FG_WHITE. After booting, all colors can be changed using the
syscons.kattr sysctl. This is a SYSCTL_OPAQUE, and no utility is
provided to change it (sysctl only displays it).
The default colors work in all VGA modes that I could test. In 2-color
graphics modes, all 8 bright colors are displayed as bright white, so
the colorization has no effect, but anything with a nonzero background
gives white on white unless the foreground is zero. I don't have an
mono or VGA grayscale hardware to test on. Support for mono mode seems
to have never worked right in syscons (I think bright white gives white
underline with either bold or bright), but VGA grayscale should work
better than 2-color graphics.
I imagine that the module would be useful only to a very limited number
of developers, so that's my excuse for not writing any documentation.
On a more serious note, please see DRAM Error Injection section of BKDGs
for families 10h - 16h. E.g. section 2.13.3.1 of BKDG for AMD Family 15h
Models 00h-0Fh Processors.
Many thanks to kib for his suggestions and comments.
Discussed with: kib
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D9824
Currently the feature is implemented only for a subset of errors
reported via Bank 4. The subset includes only DRAM-related errors.
The new code builds upon and reuses the Intel CMC (Correctable MCE
Counters) support code. However, the AMD feature is quite different
and, unfortunately, much less regular.
For references please see AMD BKDGs for models 10h - 16h.
Specifically, see MSR0000_0413 NB Machine Check Misc (Thresholding)
Register (MC4_MISC0).
http://developer.amd.com/resources/developer-guides-manuals/
Reviewed by: jhb
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D9613
We may fail to reset the %CPU tracking window if a thread does not run
for over half of the ticks rollover period, resulting in a bogus %CPU
value for the thread until ticks fully rolls over. Handle this by comparing
the unsigned difference ticks - ts_ltick with SCHED_TICK_TARG instead.
Reviewed by: cem, jeff
MFC after: 1 week
Sponsored by: Dell EMC Isilon
CAM_UNLOCKED is internal flag and cannot correctly be set by userland.
Return EINVAL from CAMIOCOMMAND and CAMIOQUEUE if it is set.
Also fix leaks in some of the error paths for CAMIOQUEUE.
PR: 215356
Reviewed by: ken, mav
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9869
in valuestate array.
When opcode has size equal to ipfw_insn_u32, this means that it should
additionally match value specified in d[0] with table entry value.
ipfw_table_lookup() returns table value index, use TARG_VAL() macro to
convert it to its value. The actual 32-bit value stored in the tag field
of table_value structure, where all unspecified u32 values are kept.
PR: 217262
Reviewed by: melifaro
MFC after: 1 week
Sponsored by: Yandex LLC
- Optimise the RCU implementation to not allocate and free
ck_epoch_records during runtime. Instead allocate two sets of
ck_epoch_records per CPU for general purpose use. The first set is
only used for reader locks and the second set is only used for
synchronization and barriers and is protected with a regular mutex to
prevent simultaneous issues.
- Move the task structure away from the rcu_head structure and into
the per-CPU structures. This allows the size of the rcu_head structure
to be reduced down to the size of two pointers.
- Fix a bug where the linux_rcu_barrier() function only waited for one
per-CPU epoch record to be completed instead of all.
- Use a critical section or a mutex to protect ck_epoch_begin() and
ck_epoch_end() depending on RCU or SRCU type. All the ck_epoch_xxx()
functions, except ck_epoch_register(), ck_epoch_unregister() and
ck_epoch_recycle() are not re-entrant and needs a critical section or
a mutex to operate in the LinuxKPI, after inspecting the CK
implementation of the above mentioned functions. The simultaneous
issues arise from per-CPU epoch records being shared between multiple
threads depending on the amount of taskswitching and how many threads
are involved with the RCU and SRCU operations.
- Properly free all epoch records by using safe list traversal at
LinuxKPI module unload. It turns out the ck_epoch_recycle() always
have the records on an internal list and use a flag in the epoch
record to track allocated and free entries. This would lead to use
after free during module unload.
- Remove redundant synchronize_rcu() call from the
linux_compat_uninit() function. Let the linux_rcu_runtime_uninit()
function do the final rcu_barrier() instead.
MFC after: 1 week
Sponsored by: Mellanox Technologies
ULPs can set a qp's state to ERROR and then post a work request on the
sq and/or rq. When the reply for that work request comes back it is
guaranteed that all previous work requests posted on that queue have
been drained.
Obtained from: Chelsio Communications
MFC after: 3 days
Sponsored by: Chelsio Communications
ucast/mcast/mgmt HT rate.
- Init global ieee80211_htrateset only once; neither ic_htcaps nor
ic_txstream is changed when device is attached;
- Move global ieee80211_htrateset structure to ieee80211com;
there was a possible data race when more than 1 wireless device is
used simultaneously;
- Discard unsupported rates in ieee80211_ioctl_settxparams(); otherwise,
an unsupported value may break connectivity (actually,
'ifconfig wlan0 ucastrate 8' for RTL8188EU results in immediate
disconnect + infinite 'device timeout's after it).
Tested with:
- Intel 6205, STA mode.
- RTL8821AU, STA mode.
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D9871
4.0.0 (branches/release_40 296509). The release will follow soon.
Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.
Also note that as of 4.0.0, lld should be able to link the base system
on amd64 and aarch64. See the WITH_LLD_IS_LLD setting in src.conf(5).
Though please be aware that this is work in progress.
Release notes for llvm, clang and lld will be available here:
<http://releases.llvm.org/4.0.0/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/lld/docs/ReleaseNotes.html>
Thanks to Ed Maste, Jan Beich, Antoine Brodin and Eric Fiselier for
their help.
Relnotes: yes
Exp-run: antoine
PR: 215969, 216008
MFC after: 1 month
For short transactions overhead of context switch can be too large.
Skipping it gives significant latency reduction. For large ones,
including multiple ZIOs, latency is less critical, while throughput
there may become limited by checksumming speed of single CPU core.
To get best of both cases, execute last ZIO directly from calling
thread context to save latency, while all others (if there are any)
enqueue to taskqueues in traditional way.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
The loader assumes physical memory in [2MB, 2MB + EFI_STAGING_SIZE)
is Conventional Memory, but actually it may not, e.g. in the case
of Hyper-V Generation-2 VM (i.e. UEFI VM) running on Windows
Server 2012 R2 host, there is a BootServiceData memory block at
the address 47.449MB and the memory is not writable.
Without the patch, the loader will crash in efi_copy_finish():
see PR 211746.
The patch verifies the end of the staging area, and reduces its
size if necessary. This way, the loader will not try to write into
the BootServiceData memory any longer.
Thank Marcel Moolenaar for helping me on this issue!
The patch also allocates the staging area in the first 1GB memory.
See the comment in the patch for this.
PR: 211746
Reviewed by: marcel, kib, sephe
Approved by: sephe (mentor)
MFC after: 2 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D9686
Add opt_platform.h and opt_acpi.h to the dependencies so modules can be
built as a part of buildworld when MODULES_WITH_WORLD is set
Reported by: Andre Albsmeier (for 11-stable)
MFC after: 1 day
registered. T4/5/6 have no internal limit on this size. This is
probably a copy paste from the T3 iw_cxgb driver.
MFC after: 3 days
Sponsored by: Chelsio Communications
Consider the rule matching when both @done and @retval values
returned from ipfw_run_eaction() are zero. And modify ipfw_nptv6()
to return IP_FW_DENY and @done=0 when addresses do not match.
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
This is a minimal attempt to keep consistency in the Makefiles so that
moving ficl to somwehere like contrib will be less error prone.
MFC after: 1 week
Elf_map_insert() needs to create mapping at the known fixed address.
Usage of vm_map_find() assumes, on the other hand, that any suitable
address space range above or equal the specified hint, is acceptable.
Due to operating on the fresh or cleared address space, vm_map_find()
usually creates mapping starting exactly at hint.
Switch to vm_map_insert() use to clearly request fixed mapping from
the VM.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
vm_map_insert() failure, drop the vnode lock around the call to
vm_object_deallocate().
Since the deallocated object is the vm object of the vnode, we might
get the vnode lock recursion there. In fact, it is almost impossible
to make vm_map_insert() failing there on stock kernel.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
that this shouldn't go in. I was unaware when I merged the pull
request. I don't wish to upset the status quo, so backout per
project practice.
Pull Request: https://github.com/freebsd/freebsd/pull/92
Noted by: hrs@
Unclear how, but the locking routine for mutexes was using the *release*
barrier instead of acquire. This must have been either a copy-pasto or bad
completion.
Going through other uses of atomics shows no barriers in:
- upgrade routines (addressed in this patch)
- sections protected with turnstile locks - this should be fine as necessary
barriers are in the worst case provided by turnstile unlock
I would like to thank Mark Millard and andreast@ for reporting the problem and
testing previous patches before the issue got identified.
ps.
.-'---`-.
,' `.
| \
| \
\ _ \
,\ _ ,'-,/-)\
( * \ \,' ,' ,'-)
`._,) -',-')
\/ ''/
) / /
/ ,'-'
Hardware provided by: IBM LTC
TW_CL_MAX_NUM_LUNS should not be 16 but I presume 255. I have a 3ware
controller with more than 16 volumes (LUN's) and otherwise all LUN's
above the 16'th are not working.
Submitted by: jcatrysse <j.catrysse@proximedia.be>
Pull Request: https://github.com/freebsd/freebsd/pull/100
left-click event. It can be disabled setting the new
hw.usb.wsp.enable_single_tap_clicks sysctl to 0.
Submitted by: K Staring <qdk@quickdekay.net>
Pull Request: https://github.com/freebsd/freebsd/pull/97
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
and csum_flags using information from all fragments. This fixes
dropping of reassembled packets due to wrong checksum when the IPv6
checksum offloading is enabled on a network card.
Obtained from: Yandex LLC
MFC after: 1 week
Sponsored by: Yandex LLC
Otherwise kernel traps on NULL dereference if fpu_kern(9) is used from the
thread0 context.
Reported by: cem
Reviewed by: cem, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Make the random number generator work so we can do WPA encryption on the AP's.
Submitted by: Michael Vale <m.vale@live.com.au>
Pull Request: https://github.com/freebsd/freebsd/pull/16
Sockets representing the TCP endpoints for iWARP connections are
allocated by the ibcore module. Before this revision they were closed
either by the ibcore module or the iw_cxgbe hardware driver depending on
the state transitions during connection teardown. This is error prone
and there were cases where both iw_cxgbe and ibcore closed the socket
leading to double-free panics. The fix is to let ibcore close the
sockets it creates and never do it in the driver.
- Use sodisconnect instead of soclose (preceded by solinger = 0) in the
driver to tear down an RDMA connection abruptly. This does what's
intended without releasing the socket's fd reference.
- Close the socket in ibcore when the iWARP iw_cm_id is destroyed. This
works for all kinds of sockets: clients that initiate connections,
listeners, and sockets accepted off of listeners.
Reviewed by: Steve Wise @ Open Grid Computing, hselasky@
MFC after: 3 days
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D9796
The extended LVT entries can be used to configure interrupt delivery
for various events that are internal to a processor and can use this
feature.
All current processors that support the feature have four of such entries.
The entries are all masked upon the processor reset, but it's possible
that firmware may use some of them.
BIOS and Kernel Developer's Guides for some processor models do not assign
any particular names to the extended LVTs, while other BKDGs provide names
and suggested usage for them.
However, there is no fixed mapping between the LVTs and the processor
events in any processor model that supports the feature. Any entry can be
assigned to any event. The assignment is done by programming an offset
of an entry into configuration bits corresponding to an event.
This change does not expose the flexibility that the feature offers.
The change adds just a single method to configure a hardcoded extended LVT
entry to deliver APIC_CMC_INT. The method is designed to be used with
Machine Check Error Thresholding mechanism on supported processor models.
For references please see BKDGs for families 10h - 16h and specifically
descriptions of APIC30, APIC400, APIC[530:500] registers.
For a description of the Error Thresholding mechanism see, for example,
BKDG for family 10h, section 2.12.1.6.
http://developer.amd.com/resources/developer-guides-manuals/
Thanks to jhb and kib for their suggestions.
Reviewed by: kib
Discussed with: jhb
MFC after: 5 weeks
Relnotes: maybe
Differential Revision: https://reviews.freebsd.org/D9612
to stdout in the non-kernel case and to the console+log
in the kernel case. For the kernel case it hooks the
putbuf() machinery underneath printf(9) so that the buffer
is written completely atomically and without a copy into
another temporary buffer. This is useful for fixing
compound console/log messages that become broken and
interleaved when multiple threads are competing for the
console.
Reviewed by: ken, imp
Sponsored by: Netflix
This adds clocks support for the aw_ccung on the A31 SoC.
Newer DTS files require this.
All the clocks except two CSI are defined and exported on the clock domain.
The PLL_DDR clock have an update bit which need to be set after changing
the value, add the possibility to define one for NKMP clocks.
This allow us to add the missing clocks.
We now have the full list of clocks created under the clock domain.
SBP-2 specification defined maximum CDB length as 12 bytes. Newer SBP-3
specification allows CDB of any size, but this driver is too old. Proper
solution would be to look on maximal ORB size supported by the target.
MFC after: 1 week
"all" in ports currently means "stage the ports", which requires root today,
and brings to light other potential issues, like ENAMETOOLONG with staged
directories (bug 161481, etc).
This fixes buildkernel for me when run as a non-root user, assuming all
of the prerequisites have been installed beforehand and are up-to-date.
MFC after: 1 month
Discussed with: swills (IRC)
Sponsored by: Dell EMC Isilon
Add the necessary bits to enable kernel breakpoints for Book-E. The entrypoint
for program exception is very trivial, so rather than expand it to be similar to
AIM, add it into the standard trap handler.
This wasn't blocked out as Book-E specific because it is only a minor redundancy
over AIM, which should have already called db_trap_glue() at this point. If
it's going to panic with a fatal trap anywya, it doesn't matter if it goes
through this path again.
When committing DTrace in 2012/2013 era I inadvertently broke breakpoints, by
setting EXC_DTRACE to the same value as BKPT_INST. Change EXC_DTRACE to a
different, yet logically identical, trap (tw <all>,31,31).
MFC after: 2 weeks
RSS hash type will be used to identify the CPU on to which, a receive packet
will be queued. This patch extracts the "RSS hash type" from the receive
completion and sends it to the stack.
Submitted by: Venkatkumar Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed by: shurd
Approved by: sbruno
MFC after: 1 week
Sponsored by: Broadcom Limited
Differential Revision: https://reviews.freebsd.org/D9685
2. add sysctl to set pause frame parameters
3. increase max segs for TSO packets to BXE_TSO_MAX_SEGMENTS (32)
4. add debug messages for PHY
5. HW LRO support restricted to FreeBSD versions 8.x and above.
Submitted by:Vaishali.Kulkarni@cavium.com
MFC after:5 days
This is required for FDT's standard "reg-io-width" property
(similar to "reg-shift" property) found in many DTS files.
This fixes operation on Altera Arria 10 SOC Development Kit,
where standard ns8250 uart allows 4-byte access only.
Reviewed by: kan, marcel
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9785
The fixed is used only to fix up buggy MPTable information and the
trigger mode is probably ignored for the relevant interrupt types
anyway. Still, it's better to be standards compliant and have the code
do what it says it does.
Discussed with: jhb
MFC after: 5 days
vm_map_lookup_done should only be called when the gntdev has finished poking at
the entry.
Reported by: alc
Reviewed by: alc
MFC after: 1 week
Sponsored by: Citrix Systems R&D
The biggest change is that ctl_remove_initiator() now generates I_T NEXUS
LOSS event, cleaning part of LUs state related to the initiator.
MFC after: 2 weeks
During acpi_cmbat_attach() the acpi_cmbat_init_battery() notification
handler is registered. It has been observed this notification handler
can be called instantly, before the attach routine has returned. In
the notification handler there is a call to device_is_attached() which
returns false. Because the softc is set we know an attach is in
progress and the fix is simply to wait and try again in this case.
Reviewed by: avg @
MFC after: 1 week
The pl011 UART has a 16 entry Tx FIFO and a 16 entry Rx FIFO that
have not been used so far. Update the driver to enable the FIFOs
and use them in transmit and receive.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D8819
directly from the node.
- Use ni_txparms directly instead of calculating them manually every time
- Move M_EAPOL flag check upper; otherwise it may be skipped due to
'ucastrate' / 'mcastrate' check
- Use 'mgtrate' for control frames too (see ifconfig(8), mgtrate parameter)
- Add few more M_EAPOL checks where it was missing (zyd(4), ural(4),
urtw(4))
- Few unrelated cleanups
Tested with:
- Intel 6205 (iwn(4)), STA mode;
- WUSB54GC (rum(4)), HOSTAP mode + RTL8188EU (rtwn(4)), STA mode.
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D9811
PG_PROMOTED, that indicates whether lingering 4KB page mappings might
need to be flushed on a PDE change that restricts or destroys a 2MB
page mapping. This flag allows the pmap to avoid range invalidations
that are both unnecessary and costly.
Reviewed by: kib, markj
MFC after: 6 weeks
Differential Revision: https://reviews.freebsd.org/D9665
If we asked to send sense data by setting CAM_SEND_SENSE, but SIM didn't
confirm transmission by setting CAM_SENT_SENSE, assume it was not sent.
Queue the I/O back to CTL for later REQUEST SENSE with ctl_queue_sense().
This is needed for error reporting on SPI HBAs like ahc(4)/ahd(4).
MFC after: 2 weeks
Since Linux 4.9-4.10 DTS doesn't have clocks under /clocks but only a ccu node.
Currently only H3 is supported with almost the same state as HEAD.
(video pll aren't supported for now but we don't support video).
This driver and clocks will also be used for other SoC (A64, A31, H5, H2 etc ...)
Reviewed by: jmcneill
Differential Revision: https://reviews.freebsd.org/D9517
Its more important for SPI HBAs, as they don't support CDBs above 12 bytes.
The new error code makes CAM to fall back to alternative commands.
MFC after: 2 weeks
The ifdefs were '#if !defined(__i386__) || !defined(PC98)' previously,
so cpu_idle_acpi was enabled both i386 and amd64 except PC98.
I was obfuscated by '#if !defined(__i386__)' condition.
Submitted by: bde
Reported by: bde
This allows to properly handle cases when target wants to receive or send
more data then initiator wants to send or receive. Previously in such
cases isp(4) returned CAM_DATA_RUN_ERR, while now it returns resid > 0.
MFC after: 2 weeks
This change removes limitation of single S/G list entry and limitation on
maximal I/O size, using multiple data transfers per I/O if needed. Also
it removes code duplication between send and receive paths, which are now
completely equal.
In particular, don't set the synchronized bit for the peer unless it truly
appears to be synchronized to us. Also, don't set our own synchronized bit
unless we have actually seen a remote system.
Prior to this change, we were seeing some strange behavior, such as:
1. We send an advertisement with the Activity, Aggregation, and Default
flags, followed by an advertisement with the Activity, Aggregation,
Synchronization, and Default flags. However, we hadn't seen an
advertisement from another peer and were still advertising the default
(NULL) peer. A closer examination of the in-kernel data structures (using
kgdb) showed that the system had added the default (NULL) peer as a valid
aggregator for the segment.
2. We were receiving an advertisement from a peer that included the
default (NULL) peer instead of including our system information. However,
we responded with an advertisement that included the Synchronization flag
for both our system and the peer. (Since the peer's advertisement did not
include our system information, we shouldn't add the synchronization bit
for the peer.)
This patch corrects those two items.
Reviewed by: smh
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D9485
7570 tunable to allow zvol SCSI unmap to return on commit of txn to ZIL
illumos/illumos-gate@1c9272b8611c9272b861https://www.illumos.org/issues/7570
Based on the discovery that every unmap waits for the commit of the txn to the ZIL,
introducing a very high latency to unmap commands, this behavior was made into a
tunable zvol_unmap_sync_enabled and set to false. The net impact of this change is
that by default SCSI unmap commands will result in space being freed within the zvol
(today they are ignored and returned with good status). However, unlike the code
today, instead of 18+ms per unmap, they take about 30us.
With the testing done on NTFS against a Win2k12 target, the new behavior should work
seamlessly. Files on the zvol that have already been set with the zfree application
will continue to write 0's when deleted, and any new files created since zvol
creation will send unmap commands when deleted. This behavior exists today, but with
this change the unmap commands will be processed and result in reclaim of space.
Author: Stephen Blinick <stephen.blinick@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Robert Mustacchi <rm@joyent.com>
When we was compering it to code from boot2 it also looks like
this code is buggy and boot2 was never updated to use this code.
USE_XREAD flag is unused in boot2, and common/drv.c was never
build with that flag.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D9780
While there, make a change to not evict a first buffer outside the
requested eviciton range.
To do:
- give more consistent names to the size variables
- upstream to OpenZFS
PR: 216178
Reported by: lev
Tested by: lev
MFC after: 2 weeks
callout(9) prohibits callout functions from sleeping.
illumos mutexes are emulated using sx(9).
spa_deadman() calls vdev_deadman() and the latter acquires vq_lock.
As a result we can get a more confusing panic instead of a specific
panic or no panic:
sleepq_add: td 0xfffff80019669960 to sleep on wchan 0xfffff8001cff4d88 with sleeping prohibited
This change adds another level of indirection where the deadman
callout schedules spa_deadman() to be executed on taskqueue_thread.
While there, use callout_schedule(0 instead of callout_reset()
in spa_sync().
Discussed with: mav
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9762
A comment near kmem_reclaim() implies that we already did that.
Calling the hook is useful, because some handlers, e.g. ARC,
might be able to release significant amounts of KVA.
Now that we have more than one place where vm_lowmem hook is called,
use this change as an opportunity to introduce flags that describe
a reason for calling the hook. No handler makes use of the flags yet.
Reviewed by: markj, kib
MFC after: 1 week
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D9764
6676 Race between unique_insert() and unique_remove() causes ZFS fsid change
illumos/illumos-gate@40510e8eba40510e8ebahttps://www.illumos.org/issues/6676
The fsid of zfs filesystems might change after reboot or remount. The problem seems to
be caused by a race between unique_insert() and unique_remove(). The unique_remove()
is called from dsl_dataset_evict() which is now an asynchronous thread. In a case the
dsl_dataset_evict() thread is very slow and calls unique_remove() too late we will end
up with changed fsid on zfs mount.
This problem is very likely caused by #5056.
Steps to Reproduce
Note: I'm able to reproduce this always on a single core (virtual) machine. On multicore
machines it is not so easy to reproduce.
# uname -a
SunOS openindiana 5.11 illumos-633aa80 i86pc i386 i86pc Solaris
# zfs create rpool/TEST
# FS=$(echo ::fsinfo | mdb -k | grep TEST | awk '{print $1}')
# echo $FS::print vfs_t vfs_fsid | mdb -k
vfs_fsid = {
vfs_fsid.val = [ 0x54d7028a, 0x70311508 ]
}
# zfs umount rpool/TEST
# zfs mount rpool/TEST
# FS=$(echo ::fsinfo | mdb -k | grep TEST | awk '{print $1}')
# echo $FS::print vfs_t vfs_fsid | mdb -k
vfs_fsid = {
vfs_fsid.val = [ 0xd9454e49, 0x6b36d08 ]
}
#
Impact
The persistent fsid (filesystem id) is essential for proper NFS functionality.
If the fsid of a filesystem changes on remount (or after reboot) the NFS
clients might not be able to automatically recover from such event and the
manual remount of the NFS filesystems on every NFS client might be needed.
Author: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Dan Vatca <dan.vatca@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
This code was disabled due to its high memory usage. But now we need this
functionality for cfumass(4) frontend, since USB MS BBB transport does not
support autosense.
MFC after: 2 weeks
Thread might create a condition for delayed SU cleanup, which creates
a reference to the mount point in td_su, but exit without returning
through userret(), e.g. when terminating due to single-threading or
process exit. In this case, td_su reference is not dropped and mount
point cannot be freed.
Handle the situation by clearing td_su also in the thread destructor
and in exit1(). softdep_ast_cleanup() has to receive the thread as
argument, since e.g. thread destructor is executed in different
context.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Convert PCIe hot plug support over to asking the firmware, if any, for
permission to use the HotPlug hardware. Implement pci_request_feature
for ACPI. All other host pci connections to allowing all valid feature
requests.
Sponsored by: Netflix
pcib_request_feature allows drivers to request the firmware (ACPI)
release certain features it may be using. ACPI normally manages things
like hot plug, advanced error reporting and other features until the
OS requests ACPI to relenquish control since it is taking over.
Sponsored by: Netflix
- Check return code from initialization path; otherwise, vap state
may be wrong after an error.
- Do not try to run iwn_stop() / iwn_init() multiple times.
- Merge iwn_radio_on/off() and move RFKILL bit check into the task.
- Try to handle possible RF switch state change in S3 state (PR 181694).
PR: 181694
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D9797
We have an original panic. Then, instead of writing the core to the dump
device, the kernel has a second panic: "smp_targeted_tlb_shootdown:
interrupts disabled". This change is an attempt to fix that second panic.
When the other CPUs are stopped, we can't notify them of the TLB shootdown,
so we skip that operation. However, when the CPUs come back up, we
invalidate the TLB to ensure they correctly observe any changes to the
page mappings.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D9786
pwgets() is based on ngets() from libstand, which includes a feature
that is not wanted in a very of the function designed for password
handling.
Pressing control+r echos out the entered string
This commit removes that feature from pwgets()
PR: 217298
Reported by: ehaupt
Reviewed by: kristof, tsoome, ehaupt
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D9782
On Core2 and older Intel CPUs, where TSC stops in C2, system does not
allow C2 entrance if timecounter hardware is TSC. This is done by
tc_windup() which tests for TC_FLAGS_C2STOP flag of the new
timecounter and increases cpu_disable_c2_sleep if flag is set. Right
now init_TSC_tc() only sets the flag if cpu_deepest_sleep >= 2, but
TSC is initialized too early for this variable to be set by
acpi_cpu.c.
There is no reason to require that ACPI reported C2 and deeper states
to set TC_FLAGS_C2STOP, so remove cpu_deepest_sleep test from
init_TSC_tc() condition. And since this is the only use of the
variable, remove it at all.
Reported and submitted by: Jia-Shiun Li <jiashiun@gmail.com>
Suggested by: jhb
MFC after: 2 weeks
In vm_fault_prefault(), if backward count causes underflow in
calculation of
starta = addra - backward * PAGE_SIZE;
then starta must be clipped to entry->start, instead of zero.
Clipping to zero allowed mapping outside of the map entries address
ranges, in particular, map at zero.
Submitted by: Yanko Yankulov <yanko.yankulov@gmail.com>
Reviewed by: alc
MFC after: 1 week
* Uses the IWM_FW_PAGING_BLOCK_CMD firmware command to tell the firmware
what memory ranges to use for paging.
Obtained from: dragonflybsd.git 8a5b199964f8e7bdb00039f0b48817a01b402f18
Stop building BERI_DE4_BASE and BERI_SIM_BASE, they aren't particularly
valid as they don't have a root dev. Do build BERI_DE4_SDROOT which
does so devices get coverage.
Remove ident from BERI_DE4_BASE for the reasons above which will cause
it to fail to build. BERI_SIM_BASE was already this way and broke
universe.[0]
Reported by: rpokala
Sponsored by: DARPA, AFRL
This function allows the caller to specify the reference clock
and choose between absolute and relative mode. In relative mode,
the remaining time can be returned.
The API is similar to clock_nanosleep(3). Thanks to Ed Schouten
for that suggestion.
While I'm here, reduce the sleep time in the semaphore "child"
test to greatly reduce its runtime. Also add a reasonable timeout.
Reviewed by: ed (userland)
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9656
VXGE_DEFAULT_TTI_RTIMER_VAL and VXGE_DEFAULT_RTI_RTIMER_VAL have value
zero but nevertheless we should use the right value on each.
Pointed by: jhb
X-MFC with: r314145
least 2 * MSS. However, if the receive buffer size is small, this might
be impossible. Add back a criterion to send a TCP window update if
the window can be increased by at least half of the receive buffer size.
This condition was removed in r242252. This patch simply brings it back.
PR: 211003
Reviewed by: gnn
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D9475
Add if_iwm_7000.c/if_iwm_8000.c to SRCS to match similar additions made
to sys/conf/files after refactoring done in the commit noted.
PR: 217308
Pointyhat to: adrian
Submitted by: Andreas Nilsson <andrnils@gmail.com>
Reported by: Jakob Alvermark <jakob@alvermark.net>, Juan Ramómon Molina Menor <listjm@club.fr>
Sponsored by: Dell EMC Isilon
One test was inadvertently expecting a bug in the kernel's sscanf
implementation circa 2012. I don't know when that bug got fixed.
Reported by: royger
Reviewed by: royger
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D9766
Most of these are null pointer dereferences or missing error checks in the
unit tests. One is a missing error check in xnb_attach_failed. None can
cause real problems in running systems.
Reported by: Coverity
CIDs: 1092469 1092468 1092467 2092466 1092465 1092512 1092511 1092510
CIDs: 1092510 1092509 1092508 1092507
Reviewed by: royger
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D9234
The code is not operational right now so just comment away an obviously
useless assignment. Fix some typos while here.
Found with: coccinelle (da.cocci)
The Xen grant table device treats the mmap offset parameter as an unsigned
type, and as so it must use the newly introduced UOFF_TO_IDX.
Sponsored by: Citrix Systems R&D
MFC after: 2 weeks
X-MFC-with: r313690
The clang compiler will optimise these functions down to three AMD64
instructions if the bit argument is a constant during compilation.
MFC after: 1 week
Sponsored by: Mellanox Technologies
data structures.
vt_change_font() calls vtbuf_grow() to change some vt driver data
structures. It uses TF_MUTE to prevent the console from trying to use those
data structures while it changes them.
During the early stage of the boot process, the vt driver's tc_done routine
uses those data structures; however, it is currently called outside the
TF_MUTE check.
Move the tc_done routine inside the locked TF_MUTE check.
PR: 217282
Reviewed by: ed, ray
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D9709
As the current zfs file system is providing symlink via system attributes, need
to update the code accordingly.
Note, as the zfsboot code does not free the memory at this time, the
object list will put some stress on the boot2 heap, eventually we should
address the issue.
Reviewed by: allanjude, smh
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9706
When allocating unmapped pages, take advantage of the direct map on
AMD64 to get the virtual address corresponding to a page. Else all
pages allocated must be mapped because sometimes the virtual address
of a page is requested.
Move all page allocation and deallocation code into an own C-file.
Add support for GFP_DMA32, GFP_KERNEL, GFP_ATOMIC and __GFP_ZERO
allocation flags.
Make a clear separation between mapped and unmapped allocations.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
The i915kms driver in Linux 4.9 reimplement parts of the scatter list
functions with regards to performance. In other words there is not so
much room for changing structure layouts and functionality if the
i915kms should be built AS-IS. This patch aligns the scatter list
support to what is expected by the i915kms driver. Remove some
comments not needed while at it.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
For example, the FreeBSD GCC (4.2.1) has a spotty support for that
feature. If the static keyword is used with an unnamed array parameter
in a function declaration, then the compilation fails with:
error: static or type qualifiers in abstract declarator
The feature does work if the parameter is named.
So, the restriction introduced in this commit can be removed when all
affected function prototypes have the workaround.
MFC after: 1 week
Sponsored by: Panzura
with geom_flashmap(4) and teach it about MMC for slicing enhanced
user data area partitions. The FDT slicer still is the default for
CFI, NAND and SPI flash on FDT-enabled platforms.
- In addition to a device_t, also pass the name of the GEOM provider
in question to the slicers as a single device may provide more than
provider.
- Build a geom_flashmap.ko.
- Use MODULE_VERSION() so other modules can depend on geom_flashmap(4).
- Remove redundant/superfluous GEOM routines that either do nothing
or provide/just call default GEOM (slice) functionality.
- Trim/adjust includes
Submitted by: jhibbits (RouterBoard bits)
Reviewed by: jhibbits
Note that the timer itself fully supports suspension, but due to the lack of
ordering during the resume process FreeBSD cannot guarantee that the timer is
resumed before any device attempts to use it.
Submitted by: Liuyingdong <liuyingdong@huawei.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D9639
- Move private data about ATIOs/INOTs from per-LUN to per-channel data.
This allows active commands to continue operation after LUN destruction.
This also simplifies lookup of the data by tag in some situations.
- Unify three restart_queue processing implementations.
- Complete all ATIOs from restart_queue on LUN disable.
- Delete ATIO private data when command completed or aborted, not depending
on the ATIO being requeued, that was ugly hack and could never happen. CAM
should always call ether XPT_CONT_TARGET_IO with status or XPT_ABORT.
- Implement XPT_ABORT for queued ATIOs/INOTs to allow CAM do graceful
shutdown, not depending on LUN disable, as it is done in ahd(4)/targ(4).
- Unify isp_endcmd() arguments to make it more usable in generic code.
- Remove never really used LUN state reference counter.
MFC after: 2 weeks
* This is more similar to how code/definitions are distributed in
Linux's iwlwifi.
* This should make recognizing new chipset variants, and adding additional
flags from the Linux iwlwifi code easier, without blowing up if_iwm.c
Obtained from: dragonflybsd.git 27d11320e707d2c41424efc1983762f6799941d6
* Just add the struct iwm_cfg pointers to the iwm_devices array, to get
rid of the large switch clause.
Obtained from: dragonflybsd.git 35f0e6c86c1654323d6b19f7a077f4ab8ac85868
if the fdt data doesn't provide a gpio pin for reading the write protect
switch and also doesn't contain a "wp-disable" property.
In r311735 the long-bitrotted code in this driver for using the non-
standard fdt "mmchs-wp-gpio-pin" property was replaced with new common
support code for handling write-protect and card-detect gpio pins. The
old code never found a property with that name, and the logic was to
assume that no gpio pin meant that the card was not write protected.
The new common code behaves differently. If there is no fdt data saying
what to do about sensing write protect, the value in the standard SDHCI
PRESENT_STATE register is used. On this hardware, if there is no signal
for write protect muxed into the sd controller then that bit in the
register indicates write protect.
The real problem here is the fdt data, which should contain "wp-disable"
properties for eMMC and micro-sd slots where write protect is not even
an option in the hardware, but we are not in control of that data, it
comes from linux. So we have to make the same flawed assumption in our
driver that the corresponding linux driver has: no info means no protect.
Reported by: several users on the arm@ list
Pointy hat: me, for not testing enough before committing r311735
* The sc->sc_uc.uc_error_event_table value is now at sc->error_event_table,
and not sc->umac_error_event_table.
Obtained from: dragonflybsd.git 612855b1a8c321ec9ba34f63edf913e7ecff8363
* Use the notification wait API, like it's done in the Linux iwlwifi code,
to wait for the IWM_MVM_ALIVE notification.
* This also should fix some firmware load interrupt issues, and errors
in the nic lock using.
Tested:
* (adrian) Intel 7260, STA mode
Obtained from: dragonflybsd.git a7697ea01c11fd493aec52260a02f31df680eb91
* While there, rename some functions to match the names and functionality
of the similarly named functions in Linux iwlwifi.
Obtained from: dragonflybsd.git e98ee77a816bfd8b4912047b93dfb2c560788f24
The difference of one was insignificant because zio_write_issue threads
ended up on the same run queues as other zio threads.
See sys/priority.h and sys/runq.h for more details.
Add a comment describing FreeBSD priority considerations and restore
the illumos variant of the code for comparison.
Obtained from: Panzura
MFC after: 2 weeks
Sponsored by: Panzura
show pte from the pmap of the process of the current DDB thread, instead
of necessarily the PCPU pmap.
Submitted by: Ryan Libby <rlibby@gmail.com>
Reviewed by: kib
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D9645
The build process generates *assym.h using nm from *genassym.o (which is
in turn created from *genassym.c).
When compiling with link-time optimization (LTO) using -flto, .o files
are LLVM bitcode, not ELF objects. This is not usable by genassym.sh,
so remove -flto from those ${CC} invocations.
Submitted by: George Rimar
Reviewed by: dim
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D9659
then return EAGAIN. The current code just returns that if the LAST buf
failed.
Reviewed by: kib@, trasz@
Differential Revision: https://reviews.freebsd.org/D9677