Commit Graph

64 Commits

Author SHA1 Message Date
Jonathan T. Looney
1524298754 The current IPMI KCS code is waiting 100us for all transitions (roughly
between each byte either sent or received). However, most transitions
actually complete in 2-3 microseconds.

By polling the status register with a delay of 4us with exponential
backoff, the performance of most IPMI operations is significantly
improved:
  - A BMC update on a Supermicro x9 or x11 motherboard goes from ~1 hour
    to ~6-8 minutes.
  - An ipmitool sensor list time improves by a factor of 4.

Testing showed no significant improvements on a modern server by using
a lower delay.

The changes should also generally reduce the total amount of CPU or
I/O bandwidth used for a given IPMI operation.

Submitted by:	Loic Prylli <lprylli@netflix.com>
Reviewed by:	jhb
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20527
2019-06-12 16:06:31 +00:00
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Justin Hibbits
95a1f0e81c ipmi: Fixes for ipmi_opal(powernv)
* Crank the OPAL state machine during the receive loop, to make sure the
  pollers are executed
* Add a proper detach function, so the module can be unloaded and reloaded
  at runtime.

It still doesn't reliably work 100% of the time on POWER9, and it appears
timing and/or cache related.  It may work on POWER8 now.

MFC after:	2 weeks
2019-04-02 04:12:06 +00:00
Conrad Meyer
26649bb5e8 efirt: When present, attempt to use EFI runtime services to shutdown
PR:		maybe related to 233998 (inconclusive at this time)
Submitted by:	byuu <byuu AT tutanota.com> (previous version)
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D18506
2018-12-15 05:46:04 +00:00
Takanori Watanabe
5efca36fbd Distinguish _CID match and _HID match and make lower priority probe
when _CID match.

Reviewed by: jhb, imp
Differential Revision:https://reviews.freebsd.org/D16468
2018-10-26 00:05:46 +00:00
Doug Ambrisko
3991dbf3fa Fix a module Makefile error on amd64 so the IPMI HW interfaces are built.
When the module is being unloaded and no HW interfaces were created don't
clean up.  This was exposed by the amd64 module build issue.
2018-08-16 15:59:02 +00:00
Justin Hibbits
54318d2a6a ipmi/opal: Enable polled mode and proper callback
Fix a NULL dereference that would occur any time an ioctl() was done, due to a
missing ipmi_enqueue_request callback.  Just use the default for now, until we
decide to properly enable IPMI interrupts.

Reported by:	kbowling
2018-08-12 20:33:55 +00:00
Justin Hibbits
0bf0bb832f Support building IPMI as a module on powerpc64
This still only supports IPMI via OPAL on powerpc64, but now it can be tested
with a GENERIC kernel.
2018-07-25 18:58:57 +00:00
Jonathan T. Looney
74800c5a08 In cases where an application issues certain IPMI commands at a high
enough rate, the IPMI code can print large numbers of messages to the
console, such as:
  ipmi0: KCS: Failed to read completion code
  ipmi0: KCS error: ff
  ipmi0: KCS: Failed to read completion code
  ipmi0: KCS error: ff

These seem to be innocuous from a system standpoint, and the user-
space code can deal with the failures. Therefore, suppress printing
these messages to the console unless bootverbose is enabled.

Obtained from:	Netflix, Inc.
2018-04-06 15:15:21 +00:00
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
Peter Wemm
9ee3ea71b3 As a follow-on to r325378, make the shutdown timer default to 0 as well.
Otherwise an orderly shutdown will initiate a watchdog that will cause
a 7 minute delayed reboot *by default*,  In the freebsd.org cluster's case
this often worked out be a surprise reboot a minute or two after the
machine came back up.
2017-11-05 05:05:18 +00:00
Warner Losh
c154763db1 Make the startup timeout 0 seconds by default rathern than 420s. This
makes the default fail safe when watchdogd is disabled (which is also
the default).

Sponsored by
2017-11-04 03:01:58 +00:00
Warner Losh
16f0063e99 Make time we wait for a power cycle tunable.
hw.ipmi.cycle_time is the time to wait for the power down phase of the
ipmi power cycle before falling back to either reboot or halt.

Sponsored by: Netflix
2017-10-26 22:53:02 +00:00
Warner Losh
14d004507e Various IPMI watchdog timer improvements
o Make hw.ipmi.on a tuneable
o Changes to keep shutdown from hanging indefinitately after the wd
  would normally have been disabled.
o Add support for setting pretimeout (which fires an interrupt
  some time before the actual watchdog expires)
o Allow refinement of the actions to take when the watchdog expires
o Allow special startup timeout to keep us from hanging in boot
  before watchdogd is started, but after we've loaded the kernel.

Obtained From: Netflix OCA Firmware
2017-10-26 22:52:51 +00:00
Warner Losh
1170c2fecc Implement IPMI support for RB_POWRECYCLE
Some BMCs support power cycling the chassis via the chassis control
command 2 subcommand 2 (ipmitool called it 'chassis power cycle').  If
the BMC supports the chassis device, register a shutdown_final handler
that sends the power cycle command if request and waits up to 10s for
it to take effect. To minimize stack strain, we preallocate a ipmi
request in the softc. At the moment, we're verbose about what we're
doing.

Sponsored by: Netflix
2017-10-25 15:30:53 +00:00
Alexander Motin
ea2ef9931c Optimize IPMI watchdog patting.
Set watchdog timer parameters only when they really need to be changed.
In other cases just restart the timer with single Reset command instead
of two (Set and Reset).

From one side this visually reduces amount of CPU time burned in tight
loop waiting while some slow BMC configures its watchdog hardware, that
seems to be much more complicated task then just resetting the timer.

From another side on some BMCs those slow Set commands sometimes tend to
timeout, that leads to noisy log messages and even more CPU time burned,
so avoiding them can provide even bigger bonuses.

MFC after:	2 weeks
2016-03-22 06:24:52 +00:00
Xin LI
42404113e6 Remove support for FreeBSD < 602110. 2015-08-30 08:48:31 +00:00
John Baldwin
9662eef57a Watchdog drivers need to support rearming the watchdog in contexts which
are not permitted to sleep.  Only use the IPMI watchdog with backends
which poll driver-initiated requests to meet this requirement.

In practice this means that watchdogs will no longer be used on systems
that use the SSIF backend.

Differential Revision:	https://reviews.freebsd.org/D2062
MFC after:	2 weeks
2015-04-24 16:56:23 +00:00
John Baldwin
c869aa71f0 Use direct hardware access for internal requests for KCS and SMIC. In
particular, updates to the watchdog should no longer sleep.
- Add a new IPMI_IO_LOCK for low-level I/O access.  Use this for
  kcs_polled_request() and smic_polled_request().
- Add a new backend callback "ipmi_driver_request" to handle a driver
  request.  The new callback performs the request sychronously for KCS
  and SMIC.  SSIF still defers the work to the worker thread since the
  worker thread sleeps during request processing anyway.
- Allocate driver requests on the stack rather than using malloc().

Differential Revision:	https://reviews.freebsd.org/D1723
Tested by:	scottl
MFC after:	2 weeks
2015-02-06 16:45:10 +00:00
John Baldwin
51f2504057 Explicitly treat timeouts when waiting for IBF or OBF to change state as an
error.  This fixes occasional hangs in the IPMI kcs thread when using
ipmitool locally.

MFC after:	1 week
2014-12-22 16:53:04 +00:00
Robert Watson
4a14441044 Update kernel inclusions of capability.h to use capsicum.h instead; some
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.

MFC after:	3 weeks
2014-03-16 10:55:57 +00:00
Gleb Smirnoff
a9b3c1bf05 Provide a crutch that prevents watchdog to interrupt dumping
on a box with IPMI enabled.

Okay from:	jhb
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-31 05:13:53 +00:00
Pawel Jakub Dawidek
7008be5bd7 Change the cap_rights_t type from uint64_t to a structure that we can extend
in the future in a backward compatible (API and ABI) way.

The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.

The structure definition looks like this:

	struct cap_rights {
		uint64_t	cr_rights[CAP_RIGHTS_VERSION + 2];
	};

The initial CAP_RIGHTS_VERSION is 0.

The top two bits in the first element of the cr_rights[] array contain total
number of elements in the array - 2. This means if those two bits are equal to
0, we have 2 array elements.

The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain array index. Only one bit is
used and bit position in this five-bits range defines array index. This means
there can be at most five array elements in the future.

To define new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, eg.

	#define	CAP_PDKILL	CAPRIGHT(1, 0x0000000000000800ULL)

We still support aliases that combine few rights, but the rights have to belong
to the same array element, eg:

	#define	CAP_LOOKUP	CAPRIGHT(0, 0x0000000000000400ULL)
	#define	CAP_FCHMOD	CAPRIGHT(0, 0x0000000000002000ULL)

	#define	CAP_FCHMODAT	(CAP_FCHMOD | CAP_LOOKUP)

There is new API to manage the new cap_rights_t structure:

	cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
	void cap_rights_set(cap_rights_t *rights, ...);
	void cap_rights_clear(cap_rights_t *rights, ...);
	bool cap_rights_is_set(const cap_rights_t *rights, ...);

	bool cap_rights_is_valid(const cap_rights_t *rights);
	void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
	void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
	bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);

Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, eg:

	cap_rights_t rights;

	cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);

There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, eg:

	#define	cap_rights_set(rights, ...)				\
		__cap_rights_set((rights), __VA_ARGS__, 0ULL)
	void __cap_rights_set(cap_rights_t *rights, ...);

Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:

	cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);

Providing several rights that belongs to the same array's element this way is
correct, but is not advised. It should only be used for aliases definition.

This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.

Sponsored by:	The FreeBSD Foundation
2013-09-05 00:09:56 +00:00
Sean Bruno
ef3103bba8 Check for ipmi_attached in ipmi_isa_probe as a suggested alternative to
ipmi_isa_attach.  This keeps unintended but harmless noise about "ipmi1"
from appearing in the boot up sequence.

Submitted by:	jbh@ (suggested by)
Sponsored by:	Yahoo! Inc.
2013-07-30 18:54:24 +00:00
Sean Bruno
de3e1e6590 empirical testing showed that 3 seconds is just too slow for GET_DEVICE_ID
to return on newer Dell hardware.  Bump to 6 second timeouts until someone
has a better idea on how to handle this

Reviewed by:	jhb@
MFC after:	2 weeks
Sponsored by:	Yahoo! Inc.
2013-07-30 18:44:29 +00:00
Sean Bruno
9b2566742d After discussions, revert svn r253708.
Changelog for 253708 was completely wrong and the code implemented something
non-standard for the wrong reasons.

Sponsored by:	Yahoo! Inc.
2013-07-30 18:41:36 +00:00
Sean Bruno
d3221f8556 At some point after stable/7 the ACPI and ISA interfaces to the IPMI controller
no longer have the parent in the device tree.  This causes the identify
function in ipmi_isa.c to attempt to probe and poke at the ISA IPMI interface

Move the check for ipmi_attached out of the ipmi_isa_attach function and into
the ipmi_isa_identify function.  Remove the check of the device tree for
ipmi devices attached.

This probing appears to make Broadcom management firmware on Dell machines
crash and emit NMI EISA warnings at various times requiring power cycles
of the machines to restore.

Bump MAX_TIMEOUT to 6 seconds as a hack for super slow IPMI interfaces that
need longer to respond to our intial probes on startup.

Tested on Dell R410, R510, R815, HP DL160G6

This is MFC candidate for 9.2R

Reviewed by:	peter
MFC after:	2 weeks
Sponsored by:	Yahoo! Inc.
2013-07-27 16:32:34 +00:00
Alexander V. Chernikov
58e8e6e6bb Unlock IPMI sc while performing requests via KCS and SMIC interfaces.
It is already done in SSIF interface code.
This reduces contention/spinning reported by many users.

PR:		kern/172166
Submitted by:	Eric van Gyzen <eric at vangyzen.net>
MFC after:	2 weeks
2013-03-25 14:30:34 +00:00
John Baldwin
960b5a7080 - Re-shuffle the <machine/pc/bios.h> headers to move all kernel-specific
bits under #ifdef _KERNEL but leave definitions for various structures
  defined by standards ($PIR table, SMAP entries, etc.) available to
  userland.
- Consolidate duplicate SMBIOS table structure definitions in ipmi(4)
  and smbios(4) in <machine/pc/bios.h> and make them available to
  userland.

MFC after:	2 weeks
2012-09-28 11:59:32 +00:00
John Baldwin
1710852ebd Don't try to stop the IPMI watchdog timer if it is not running.
Starting or stopping the IPMI watchdog is rather expensive with the
current implementation as all IPMI requests are bounced via thread.
This is not viable during shutdown or dumps, and this avoids headache
in the common case that the watchdog is not enabled.  The IPMI watchdog
should probably be reworked to not use a separate thread to fix this
in the case when the watchdog timer is enabled.

MFC after:	2 weeks
2012-08-07 12:40:31 +00:00
Ed Schouten
6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
Ed Schouten
d745c852be Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
2011-11-07 06:44:47 +00:00
Robert Watson
a9d2f8d84f Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:

Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *.  With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.

Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.

In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.

Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.

Approved by:	re (bz)
Submitted by:	jonathan
Sponsored by:	Google Inc
2011-08-11 12:30:23 +00:00
Ruslan Ermilov
14689886a5 Fixed firmware revision decoding:
- the major is 7-bit binary encoded
- the minor is BCD encoded

PR:		kern/151586
MFC after:	3 days
2011-04-14 07:14:22 +00:00
John Baldwin
c280c7d482 Fix test for double-nul characters that terminate the string table at
the end of each SMBIOS/DMI structure.

Submitted by:	Dmitrij Tejblum @ yandex.ru
MFC after:	3 days
2010-07-29 13:46:37 +00:00
John Baldwin
020e9fc33b Rework the SMBIOS table walker to make it operate like other table walkers
and remove a buffer overflow:
- Remove the array of per-type dispatch functions.  Instead, pass each
  structure to a single callback.  The callback should check the type of
  each table entry to take appropriate action.  This matches the behavior
  of other table walkers such as for the MP Table and MADT.
- Don't attempt to save an array of string pointers for each structure
  entry.  Instead, just skip the strings.  If this code is reused to
  provide a generic SMBIOS table walker in the future we could provide
  a method that looks up a specific string N for a given structure record
  instead of pre-populating an array of pointers.  This fixes a buffer
  overflow for structure entries with more than 20 strings.

PR:		kern/148546
Reported by:	Spencer Minear @ McAfee
MFC after:	3 days
2010-07-14 18:06:21 +00:00
Ruslan Ermilov
a46a1e767d - Fixed incorrect watchdog timeout setting: MSB of a 2-byte
value is obtained by dividing it by 256, not by 2550; also,
  one second is 10^9 nanoseconds, not 1800000000 nanoseconds.

- Due to rounding error, setting watchdog to a really small
  timeout (<1 sec) was turning the watchdog off.  It should
  set the watchdog to a small timeout instead.

- Implemented error checking in ipmi_wd_event(), as required
  by watchdog(9).

PR:		kern/130512
Submitted by:	Dmitrij Tejblum

- Additionally, check that the timeout value is within the
  supported range, and if it's too large, act as required by
  watchdog(9).

MFC after:	3 days
2009-12-18 12:10:42 +00:00
Jung-uk Kim
129d3046ef Import ACPICA 20090521. 2009-06-05 18:44:36 +00:00
Doug Ambrisko
d2b2128a28 Add stuff to support upcoming BMC/IPMI flashing of newer Dell machine
via the Linux tool.
     -  Add Linux shim to ipmi(4)
     -  Create a partitions file to linprocfs to make Linux fdisk see
        disks.  This file is dynamic so we can see disks come and go.
     -  Convert msdosfs to vfat in mtab since Linux uses that for
        msdosfs.
     -  In the Linux mount path convert vfat passed in to msdosfs
        so Linux mount works on FreeBSD.  Note that tasting works
        so that if da0 is a msdos file system
                /compat/linux/bin/mount /dev/da0 /mnt
        works.
     -  fix a 64it bug for l_off_t.
Grabing sh, mount, fdisk, df from Linux, creating a symlink of mtab to
/compat/linux/etc/mtab and then some careful unpacking of the Linux bmc
update tool and hacking makes it work on newer Dell boxes.  Note, probably
if you can't figure out how to do this, then you probably shouldn't be
doing it :-)
2009-03-26 17:14:22 +00:00
John Baldwin
3619ea6a08 Don't right-adjust the SMBus slave address for SSIF IPMI BMCs enumerated
via ACPI either.  This is somewhat academic since we don't currently
support such devices though.
2009-02-03 16:39:51 +00:00
John Baldwin
bb6bb7fe1b - Change ichsmb(4) to follow the format of all the other smbus controllers
for slave addressing by using left-adjusted slave addresses (i.e.
  xxxxxxx0b).
- Require the low bit of the slave address to always be zero in smb(4) to
  help catch broken applications.
- Adjust some code in the IPMI driver to not convert the slave address for
  SSIF to a right-adjusted address.  I (or possibly ambrisko@) added this in
  the past to (unknowingly) work around the bug in ichsmb(4).

Submitted by:	 Andriy Gapon <avg of icyb.net.ua> (1,2)
MFC after:	1 month
2009-02-03 16:14:37 +00:00
David E. O'Brien
c12dbd1d96 Fix typo where the code was missing the "IPMICTL_RECEIVE_MSG_32" condition
test.
2008-11-14 01:53:10 +00:00
John Baldwin
943bebd2c9 Remove hack attempt at using devfs cloning for per-file descriptor storage.
Use the much simpler cdevpriv for per-fd state and enable it.  This allows
multiple opens of /dev/ipmi0 (e.g. using ipmitool while ipmievd is running
in the background).

MFC after:	1 week
2008-08-28 02:13:53 +00:00
John Baldwin
93269b7123 - Tweak an error message.
- Fix a buglet where && was used instead of & to test if OBF was set in
  a couple of places.

MFC after:	1 week
2008-08-28 02:11:04 +00:00
Julian Elischer
3745c395ec Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0  so that we can eventually MFC the
new kthread_xxx() calls.
2007-10-20 23:23:23 +00:00
Doug Ambrisko
72d7331539 Add support to the ipmi, isa attachment to attempt to read ipmi
config info. from device.hints.  Some machines have ipmi controllers
that do not have attachment info in either PCI, SMBIOS or ACPI.
This idea was hacked together by me and then done properly by
jhb.

Submitted by:	jhb
Reviewed by:	jhb (man page)
Approved by:	re (Ken Smith)
MFC after:	1 week
2007-07-16 17:03:48 +00:00
John Baldwin
9310f22692 Update __FreeBSD_version check for MFC of pmap_mapbios(). 2007-05-02 18:43:51 +00:00
John Baldwin
4dc5078f81 Add constants for the fields in a BAR. Also, add two new macros
PCI_BAR_(IO|MEM)() that return true if the passed in value from a BAR
is for an IO or memory BAR, respectively.

Reviewed by:	imp
2007-03-31 21:39:02 +00:00
John Baldwin
657d9f9f55 - Add missing constants for subclasses.
- Add a few progif constants as well.
2007-03-31 20:41:00 +00:00
Nick Hibma
f29fa1dfa4 Revisit the watchdogs: Resetting the error to EINVAL after failing to set the
watchdog might hide the succesful arming of an earlier one. Accept that on
failing to arm any watchdog (because of non-supported timeouts) EOPNOTSUPP is
returned instead of the more appropriate EINVAL.

MFC after:	3 days
2007-03-27 21:03:37 +00:00