Previous code by default setting pre-timeout interval to 120 seconds
made impossible to set timeout interval below that, resulting in error
0xcc (Invalid data field in Request) at least on Supermicro boards.
To fix that limit maximum pre-timeout interval to ~1/4 of the timeout
interval, that sounds like a reasonable default: not too short to fire
too late, but also not too long to give many false reports.
MFC after: 2 weeks
In case we want to use other WD than IPMI-provided, add
sysctl to disable initialization.
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D31548
Add request submission status checks before checking req->ir_compcode,
otherwise it may be zero just because of initialization.
Add checks for req->ir_compcode errors in ipmi_reset_watchdog() and
ipmi_set_watchdog(). In first case explicitly check for 0x80, which
means timer was not previously set, that I found happening after BMC
cold reset. This change makes watchdog timer to recover instead of
permanently ignoring reset errors after BMC reset or upgraded.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
The original implementation only supports getting the address from legacy
BIOS (by searching for the SMBIOS_SIG pattern in a fixed address space).
Try to get the SMBIOS table from EFI through efirt (EFI Runtime Services)
firstly. Continue to search in the legacy BIOS if a NULL address is
returned from EFI.
By this way the ipmi function supports both legacy BIOS and UEFI systems.
Reviewed by: dab, vangyzen
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D30007
This function will be used for exposing DMI info as sysctls in the
smbios module (in an upcoming review).
While here, add __packed to the structs.
Reviewed by: dab
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D29270
Add it to the x86 GENERIC and MINIMAL kernels
Sponsored by: Ampere Computing LLC
Submitted by: Klara Inc.
Reviewed by: rpokala
Differential Revision: https://reviews.freebsd.org/D28738
Copy the CP, PTRIN, etc macros from freebsd32.h into a sys/abi_compat.h
and replace existing definitation with includes where required. This
eliminates duplicate code and allows Linux and FreeBSD compatability
headers to be included in the same files.
Input from: cem, jhb
Obtained from: CheriBSD
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24275
This change fixes a couple of issues with OPAL IPMI driver and
implements a mechanism to detect timeouts and discard old messages left
in receive queue, to avoid old messages from being confused with the
reply of new ones.
Reviewed by: jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D24185
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
between each byte either sent or received). However, most transitions
actually complete in 2-3 microseconds.
By polling the status register with a delay of 4us with exponential
backoff, the performance of most IPMI operations is significantly
improved:
- A BMC update on a Supermicro x9 or x11 motherboard goes from ~1 hour
to ~6-8 minutes.
- An ipmitool sensor list time improves by a factor of 4.
Testing showed no significant improvements on a modern server by using
a lower delay.
The changes should also generally reduce the total amount of CPU or
I/O bandwidth used for a given IPMI operation.
Submitted by: Loic Prylli <lprylli@netflix.com>
Reviewed by: jhb
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20527
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.
EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).
As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions. The remainder of the patch addresses
adding appropriate includes to fix those files.
LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).
No functional change (intended). Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.
* Crank the OPAL state machine during the receive loop, to make sure the
pollers are executed
* Add a proper detach function, so the module can be unloaded and reloaded
at runtime.
It still doesn't reliably work 100% of the time on POWER9, and it appears
timing and/or cache related. It may work on POWER8 now.
MFC after: 2 weeks
PR: maybe related to 233998 (inconclusive at this time)
Submitted by: byuu <byuu AT tutanota.com> (previous version)
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D18506
Fix a NULL dereference that would occur any time an ioctl() was done, due to a
missing ipmi_enqueue_request callback. Just use the default for now, until we
decide to properly enable IPMI interrupts.
Reported by: kbowling
enough rate, the IPMI code can print large numbers of messages to the
console, such as:
ipmi0: KCS: Failed to read completion code
ipmi0: KCS error: ff
ipmi0: KCS: Failed to read completion code
ipmi0: KCS error: ff
These seem to be innocuous from a system standpoint, and the user-
space code can deal with the failures. Therefore, suppress printing
these messages to the console unless bootverbose is enabled.
Obtained from: Netflix, Inc.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Otherwise an orderly shutdown will initiate a watchdog that will cause
a 7 minute delayed reboot *by default*, In the freebsd.org cluster's case
this often worked out be a surprise reboot a minute or two after the
machine came back up.
hw.ipmi.cycle_time is the time to wait for the power down phase of the
ipmi power cycle before falling back to either reboot or halt.
Sponsored by: Netflix
o Make hw.ipmi.on a tuneable
o Changes to keep shutdown from hanging indefinitately after the wd
would normally have been disabled.
o Add support for setting pretimeout (which fires an interrupt
some time before the actual watchdog expires)
o Allow refinement of the actions to take when the watchdog expires
o Allow special startup timeout to keep us from hanging in boot
before watchdogd is started, but after we've loaded the kernel.
Obtained From: Netflix OCA Firmware
Some BMCs support power cycling the chassis via the chassis control
command 2 subcommand 2 (ipmitool called it 'chassis power cycle'). If
the BMC supports the chassis device, register a shutdown_final handler
that sends the power cycle command if request and waits up to 10s for
it to take effect. To minimize stack strain, we preallocate a ipmi
request in the softc. At the moment, we're verbose about what we're
doing.
Sponsored by: Netflix
Set watchdog timer parameters only when they really need to be changed.
In other cases just restart the timer with single Reset command instead
of two (Set and Reset).
From one side this visually reduces amount of CPU time burned in tight
loop waiting while some slow BMC configures its watchdog hardware, that
seems to be much more complicated task then just resetting the timer.
From another side on some BMCs those slow Set commands sometimes tend to
timeout, that leads to noisy log messages and even more CPU time burned,
so avoiding them can provide even bigger bonuses.
MFC after: 2 weeks
are not permitted to sleep. Only use the IPMI watchdog with backends
which poll driver-initiated requests to meet this requirement.
In practice this means that watchdogs will no longer be used on systems
that use the SSIF backend.
Differential Revision: https://reviews.freebsd.org/D2062
MFC after: 2 weeks
particular, updates to the watchdog should no longer sleep.
- Add a new IPMI_IO_LOCK for low-level I/O access. Use this for
kcs_polled_request() and smic_polled_request().
- Add a new backend callback "ipmi_driver_request" to handle a driver
request. The new callback performs the request sychronously for KCS
and SMIC. SSIF still defers the work to the worker thread since the
worker thread sleeps during request processing anyway.
- Allocate driver requests on the stack rather than using malloc().
Differential Revision: https://reviews.freebsd.org/D1723
Tested by: scottl
MFC after: 2 weeks
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.
MFC after: 3 weeks
in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights {
uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.
The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.
To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:
#define cap_rights_set(rights, ...) \
__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);
Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.
This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
ipmi_isa_attach. This keeps unintended but harmless noise about "ipmi1"
from appearing in the boot up sequence.
Submitted by: jbh@ (suggested by)
Sponsored by: Yahoo! Inc.
to return on newer Dell hardware. Bump to 6 second timeouts until someone
has a better idea on how to handle this
Reviewed by: jhb@
MFC after: 2 weeks
Sponsored by: Yahoo! Inc.
no longer have the parent in the device tree. This causes the identify
function in ipmi_isa.c to attempt to probe and poke at the ISA IPMI interface
Move the check for ipmi_attached out of the ipmi_isa_attach function and into
the ipmi_isa_identify function. Remove the check of the device tree for
ipmi devices attached.
This probing appears to make Broadcom management firmware on Dell machines
crash and emit NMI EISA warnings at various times requiring power cycles
of the machines to restore.
Bump MAX_TIMEOUT to 6 seconds as a hack for super slow IPMI interfaces that
need longer to respond to our intial probes on startup.
Tested on Dell R410, R510, R815, HP DL160G6
This is MFC candidate for 9.2R
Reviewed by: peter
MFC after: 2 weeks
Sponsored by: Yahoo! Inc.
It is already done in SSIF interface code.
This reduces contention/spinning reported by many users.
PR: kern/172166
Submitted by: Eric van Gyzen <eric at vangyzen.net>
MFC after: 2 weeks
bits under #ifdef _KERNEL but leave definitions for various structures
defined by standards ($PIR table, SMAP entries, etc.) available to
userland.
- Consolidate duplicate SMBIOS table structure definitions in ipmi(4)
and smbios(4) in <machine/pc/bios.h> and make them available to
userland.
MFC after: 2 weeks
Starting or stopping the IPMI watchdog is rather expensive with the
current implementation as all IPMI requests are bounced via thread.
This is not viable during shutdown or dumps, and this avoids headache
in the common case that the watchdog is not enabled. The IPMI watchdog
should probably be reworked to not use a separate thread to fix this
in the case when the watchdog timer is enabled.
MFC after: 2 weeks
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *. With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.
Approved by: re (bz)
Submitted by: jonathan
Sponsored by: Google Inc