Commit Graph

1159 Commits

Author SHA1 Message Date
Konstantin Belousov
cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Alexander Motin
cd500da924 Fix sbuf_finish() error code check in user-space.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-10-13 23:29:06 +00:00
Warner Losh
863f967f95 cam: Add quirk for Samsung MZ7* behind a SATA-to-SAS interposer
Sometimes, this drive will be present in the system such that the the
firmware identification string doesn't start with ATA, such as when
it's behind a SATA-to-SAS interposer. Add another quirk for that.

Submitted by: github user mr44er
Github PR: 423
2020-10-07 05:44:35 +00:00
John Baldwin
83a277830f Revert most of r360179.
I had failed to notice that sgsendccb() was using cam_periph_mapmem()
and thus was not passing down user pointers directly to drivers.  In
practice this broke requests submitted from userland.

PR:		249395
Reported by:	Trenton Schulz <trueos@norwegianrockcat.com>
Reviewed by:	scottl
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D26550
2020-09-25 21:19:56 +00:00
Mateusz Guzik
27dcd3d90b cam: clean up empty lines in .c and .h files 2020-09-01 22:13:48 +00:00
Mateusz Guzik
f1f3f7cf0b cam: add missing MAKEDEV_NOWAIT in passregister
Reported by:	dhw
2020-08-18 16:06:16 +00:00
Alexander Motin
c119ccdbfd Extend EIIOE field handling according to ses4r5 draft.
It should not affect any existing systems.

MFC after:	2 weeks
2020-08-17 15:11:46 +00:00
Alexander Motin
8bdf81e4d1 Add CTL support for REPORT IDENTIFYING INFORMATION command.
It allows to report to initiator LU identifying information, preset via
"ident_info" and "text_ident_info" options.

Unfortunately it is impossible to implement SET IDENTIFYING INFORMATION,
since we have no persistent storage it requires, so the information is
read-only for initiator and has to be set out-of-band.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-08-06 19:16:11 +00:00
Conrad Meyer
9982b3ee29 cam: ANSIfy 0-argument function definitions
No functional change.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D24854
2020-05-16 14:33:08 +00:00
John Baldwin
f721fcbccb Fix a memory leak for ENCIOC_GETSTRING I introduced in r360171.
MFC after:	1 week
Sponsored by:	DARPA
2020-05-08 16:41:23 +00:00
Warner Losh
72f8ed61a7 Change the flags back to an enum
This was changed in the review process for the flags sysctl. The
reasons for the change are no longer valid as the code changed after
that. Cast the one place where it might make a difference (but I don't
think it does).  This restores the ability to see flags for softc in
gdb.
2020-04-27 23:39:32 +00:00
John Baldwin
81b62a7663 Don't pass a user buffer pointer as the data pointer in a CCB.
Allocate a temporary buffer in the kernel to serve as the CCB data
pointer for a pass-through transaction and use copyin/copyout to
shuffle the data to/from the user buffer.

Reviewed by:	scottl, brooks
Obtained from:	CheriBSD
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24489
2020-04-21 23:38:54 +00:00
John Baldwin
47e735aded Don't access a user buffer directly from the kernel.
The handle_string callback for the ENCIOC_SETSTRING ioctl was passing
a user pointer to memcpy().  Fix by using copyin() instead.

For ENCIOC_GETSTRING ioctls, the handler was storing the user pointer
in a CCB's data_ptr field where it was indirected by other code.  Fix
this by allocating a temporary buffer (which ENCIOC_SETSTRING already
did) and copying the result out to the user buffer after the CCB has
been processed.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24487
2020-04-21 17:47:05 +00:00
John Baldwin
5326289d93 Don't try to copyout() to a kernel buffer.
The handle_string callback for the ENCIOC_GET_ENCNAME and
ENCIOC_GETENCID ioctls tries to copy the size of the generated string
out to userland.  However, the callback only has access to the kernel
copy of the structure populated by copyin().  The copyout() call
simply overwrites the value in the kernel's copy preventing the
subsequent overflow prevention logic from working.

Fix this by instead doing a copyout() of the updated length in the
caller after the callback returns.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24456
2020-04-17 18:19:13 +00:00
Warner Losh
ab485b018a Checks here against useracc are not useful and are racy.
copyin/copyout are sufficient to guard against bad addresses. They will return
EFAULT if the user is up to no good (by choice or ignorance). There's no point
in checking, since it doesn't even improve the error messages.

Noticed by: jhb
Reviewed by: brooks, jhb
2020-04-13 21:04:33 +00:00
Alexander Motin
12fec4c35d Relax too strict SES element descriptors check in r355430.
SES specifications allows the string to be NULL-terminated, while previous
code was considering it as invalid due to incorrectly ordered conditions.

MFC after:	 1 week
Sponsored by:	iXsystem, Inc.
2020-04-06 18:42:01 +00:00
Ed Maste
3709674072 sys/cam: remove doubled ;s 2020-03-20 16:15:45 +00:00
Alexander Motin
e4ee6c5444 Fix SES on device slots without phys after r349321.
Broadcom 9400-8i8e HBAs report virtual SES device, where slots representing
external connectors are reported having no phys.  Since sasdev_phys is NULL
there and proto_hdr is a union, ses_paths_iter() misinterpreted them as ATA.
Add explicit protocol check to properly differentiate them.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-03-19 17:20:50 +00:00
Warner Losh
e6db0f0409 Remove redundantly repetitive static __inline forward function
declarations.

We typically don't use them elsewhere in the kernel, and they aren't
needed here: the actual functions are a few lines away and aren't
mutually recursive.
2020-03-11 15:12:31 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Warner Losh
1731d53036 We pass a pointer to the flags to dabitsysctl, not an integer. Adjust the
handler to accept a poitner to a u_int. To make the type of the softc flags
stable and defined, make it a u_int. Cast the enum types to u_int for arg2 so
when passing to dabitsysctl it's a u_int.

Noticed by: emax@
Differential Revision: https://reviews.freebsd.org/D23785
2020-02-21 22:44:22 +00:00
Scott Long
1353215314 Add rudamentary support for UFS to probe whether a block device supports the
BIO_SPEEDUP command.  Add complimentary support to the CAM periphs that
support it.  This is a redo of r357710.
2020-02-16 23:10:59 +00:00
Warner Losh
efb17c5a54 Use INT instead of string for the ints. Because the string "I" was right, the
old code appeared to work. This was a cut and paste error.

Noticed by: rpokala@
2020-02-13 03:37:11 +00:00
Warner Losh
0c8ea9e567 Convert rotating and unmapped_io to a DA flag
Rotating and unmapped_io are really da flags. Convert them to a flag so it will
be reported with the other flags for the device. Deprecate the .rotating and
.unmapped_io sysctls in FreeBSD 14 and remove the softc ints.

Differential Revision: https://reviews.freebsd.org/D23417
2020-02-13 01:23:44 +00:00
Warner Losh
a8d238cd6d Export the current da flags as bitfield
Export the current flags. They can be useful to other programs wanting to do
special thigns for removable or similar devices.

Differential Revision: https://reviews.freebsd.org/D23417
2020-02-13 01:23:32 +00:00
Scott Long
85eb41f751 Revert r357710 and 357711 until they can be debugged 2020-02-10 14:27:28 +00:00
Scott Long
7d99bda79e Add rudamentary support for UFS to probe whether a block device supports the
BIO_SPEEDUP command.  Add complimentary support to the CAM periphs that
support it.
2020-02-10 00:23:20 +00:00
Scott Long
d176b8039e Ever since the block layer expanded its command syntax beyond just
BIO_READ and BIO_WRITE, we've handled this expanded syntax poorly in
drivers when the driver doesn't support a particular command.  Do a
sweep and fix that.

Reported by:	imp
2020-02-07 09:22:08 +00:00
Warner Losh
827bea2645 Fix spelling of removable 2020-01-29 00:28:50 +00:00
Gleb Popov
dfead4180e Fix typo: MANGAEMENT_PROTOCOL_OUT -> MANAGEMENT_PROTOCOL_OUT.
Approved by:	allanjude
2020-01-09 15:21:42 +00:00
Warner Losh
83b75bb3cc Revert r355813
It was extracted from a larger tree and is incomplete. Will resubmit after
reworking.
2019-12-16 19:16:26 +00:00
Warner Losh
68e1c49a96 Implement a system-wide limit or da and ada devices for delete.
Excesively large TRIMs can result in timeouts, which cause big
problems. Limit trims to 1GB to mititgate these issues.

Reviewed by: scottl
Differential Revision: https://reviews.freebsd.org/D22809
2019-12-16 18:16:44 +00:00
John Baldwin
5773ac113c Use callout_func_t instead of the deprecated timeout_t.
Reviewed by:	kib, imp
Differential Revision:	https://reviews.freebsd.org/D22752
2019-12-10 22:06:53 +00:00
Alan Somers
e083fb08b9 ses: sanitize illegal strings in SES element descriptors
The SES4r3 standard requires that element descriptors may only contain ASCII
characters in the range 0x20 to 0x7e.  Some SuperMicro expanders violate
that rule.  This patch adds a sanity check to ses(4).  Descriptors in
violation will be replaced by "<invalid>".

This patch fixes "sesutil --libxo xml" on such systems.  Previously it would
generate non-well-formed XML output.

PR:		241929
Reviewed by:	allanjude
MFC after:	2 weeks
Sponsored by:	Axcient
2019-12-06 00:06:05 +00:00
Kenneth D. Merry
0c8f059c29 Fix a hang introduced in r351599.
My changes in 351599 (kindly committed by avg) made the cd(4) media check
asynchronous to avoid a sleep while holding a mutex.

There was a difficult to reproduce bug with those changes that caused a
hang on boot on some single processor machines/VMs.  Leandro Lupori
managed to reproduce the bug, diagnose it, and supplied a patch!  Here is
his analysis, from the PR:

======
I was able to reproduce the problem described in comment#14.

Actually, I wasn't trying to reproduce it, I just started seeing it a few
weeks ago, in CURRENT.

I can reproduce it consistently, by using QEMU to run a PowerPC64 VM with a
single core/thread (-smp 1).

It happens only when there is no media in the emulated CD-ROM, a device
that QEMU adds by default, unless -nodefaults is specified in command line.

I've debugged it and this is what I've found:

1- After the CD probe is successful, GEOM will try to open the device,
which will end up calling cdcheckmedia(), that sets CD state to
CD_STATE_MEDIA_PREVENT.
2- Next, scsi_prevent() is executed and succeeds, the CD_FLAG_DISC_LOCKED
flag is set and CD state moves to CD_STATE_MEDIA_SIZE.
3- Next, scsi_read_capacity() is executed and fails, state is set to
CD_STATE_MEDIA_ALLOW, cdmediaprobedone() is called and wakes up
cdcheckmedia().
4- Then, when cdstart() is invoked to process CD_STATE_MEDIA_ALLOW, it
first checks if CD_FLAG_DISC_LOCKED is set, and if so skips directly to
CD_STATE_MEDIA_SIZE state. This will repeat the steps of bullet 3, entering
an infinite MEDIA_SIZE command loop.

When there is a least another core/thread, the GEOM thread that performed
the initial cdopen() will get scheduled again, closing the CD device, that
will call cdprevent(PR_ALLOW) that clears the CD_FLAG_DISC_LOCKED flag and
breaks the loop.

So, apparently, the problem is CD_STATE_MEDIA_ALLOW being skipped when
CD_FLAG_DISC_LOCKED is set. If I understand correctly, in this case, the
state should be advanced to CD_STATE_MEDIA size only when the current state
is CD_STATE_MEDIA_PREVENT.
=====

PR:		kern/219857
Submitted by:	Leandro Lupori <leandro.lupori@gmail.com>
MFC after:	1 week
2019-12-02 19:57:39 +00:00
Alexander Motin
a4876fbfc3 Make CAM use root_mount_hold_token() to delay boot.
Before this change CAM used config_intrhook_establish() for this purpose,
but that approach does not allow to delay it again after releasing once.

USB stack uses root_mount_hold() to delay boot until bus scan is complete.
But once it is, CAM had no time to scan SCSI bus, registered by umass(4),
if it already done other scans and called config_intrhook_disestablish().
The new approach makes it work smooth, assuming the USB device is found
during the initial bus scan.  Devices appearing on USB bus later may still
require setting kern.cam.boot_delay, but hopefully those are minority.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-22 18:39:51 +00:00
Scott Long
f0d6f5774a Remove NEEDGIANT from the scsi_sg /dev node. It likely has not been
needed for many years.

Reported by:	imp
2019-11-22 18:18:36 +00:00
Alexander Motin
cc453b2272 Set handling for some "Logical unit not ready" errors.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-20 20:00:03 +00:00
Warner Losh
02fa548cde Fix a race between daopen and damediapoll
When we do a daopen, we call dareprobe and wait for the results. The repoll runs
the da state machine up through the DA_STATE_RC* and then exits.

For removable media, we poll the device every 3 seconds with a TUR to see if it
has disappeared. This introduces a race. If the removable device has lots of
partitions, and if it's a little slow (like say a USB2 connected USB stick),
then we can have a fair amount of time that this reporbe is going on for. If,
during that time, damediapoll fires, it calls daschedule which changes the
scheduling priority from NONE to NORMAL. When that happens, the careful single
stepping in the da state machine is disrupted and we wind up sceduling multiple
read capacity calls. The first one succeeds and releases the reference. The
second one succeeds and releases the reference (and panics if the right code is
compiled into the da driver).

To avoid the race, only do the TUR calls while in state normal, otherwise just
reschedule damediapoll. This prevents the race from happening.
2019-11-13 01:58:43 +00:00
Warner Losh
45fceedf87 Add asserts for some state transitions
For the PROBEWP and PROBERC* states, add assertiosn that both the da device
state is in the right state, as well as the ccb state is the right one when we
enter dadone_probe{wp,rc}. This will ensure that we don't sneak through when
we're re-probing the size and write protection status of the device and thereby
leak a reference which can later lead to an invalidated peripheral going away
before all references are released (and resulting panic).

Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295
2019-11-11 17:36:57 +00:00
Warner Losh
dc1c17691e Update the softc state of the da driver before releasing the CCB.
There are contexts where releasing the ccb triggers dastart() to be run
inline. When da was written, there was always a deferral, so it didn't matter
much. Now, with direct dispatch, we can call dastart from the dadone*
routines. If the probe state isn't updated, then dastart will redo things with
stale information. This normally isn't a problem, because we run the probe state
machine once at boot... Except that we also run it for each open of the device,
which means we can have multiple threads racing each other to try to kick off
the probe. However, if we update the state before we release the CCB, we can
avoid the race. While it's needed only for the probewp and proberc* states, do
it everywhere because it won't hurt the other places.

The race here happens because we reprobe dozens of times on boot when drives
have lots of partitions.  We should consider caching this info for 1-2 seconds
to avoid this thundering hurd.

Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295
2019-11-11 17:36:52 +00:00
Warner Losh
fe95666bab Require and enforce that dareprobe() has to be called with the periph lock held.
Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295
2019-11-11 17:36:47 +00:00
Warner Losh
fb6ea34a3a Fix panic message to indicate right action that was improper.
Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295
2019-11-11 17:36:42 +00:00
Edward Tomasz Napierala
b5961be1ab Add GEOM attribute to report physical device name, and report it
via 'diskinfo -v'.  This avoids the need to track it down via CAM,
and should also work for disks that don't use CAM.  And since it's
inherited thru the GEOM hierarchy, in most cases one doesn't need
to walk the GEOM graph either, eg you can use it on a partition
instead of disk itself.

Reviewed by:	allanjude, imp
Sponsored by:	Klara Inc
Differential Revision:	https://reviews.freebsd.org/D22249
2019-11-09 17:30:19 +00:00
Alexander Motin
34a5c41c43 Add kern.cam.da.X.quirks tunable, similar existing for ada.
Submitted by:	Michael Lass
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20677
2019-09-26 14:48:39 +00:00
Alexander Motin
07f7e4c8b0 Fix assumptions of only one device per SES slot.
It is typical to have one, but no longer true for multi-actuator HDDs
with separate LUN for each actuator.

MFC after:	4 days
Sponsored by:	iXsystems, Inc.
2019-09-11 03:25:30 +00:00
Alexander Motin
16614d3518 Supply SAT layer with valid transfer sizes.
This is a rework of r344701, that noticed that number of bytes passes to
8 bit sector count field gets truncated.  First decision was to not pass
anything, since ATA specs define the field as N/A.  But it appeared to be a
problem for some SAT devices, that require information about data transfer
to operate properly.  Some additional investigation shown that it is quite
a common practice to set unused fields of ATA commands (fortunately ATA
specs formally allow it) to supply the information to SAT layer.  I have
found SAS-SATA interposer that does not allow pass-through without it.

As side effect, reduce code duplication by removing ata_do_28bit_cmd()
function, replacing it with more universal ata_do_cmd().

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-09-07 15:56:00 +00:00
Alexander Motin
6a216c0bb5 Take proper lock in ses_setphyspath_callback().
XPT_DEV_ADVINFO call should be protected by the lock of the specific
device it is addressed to, not the lock of SES device.  In some weird
case, probably with hardware violating standards, it sometimes caused
NULL dereference due to race.

To protect from it further, add lock assertion to *_dev_advinfo().

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-08-29 17:02:02 +00:00
Andriy Gapon
0093b755a7 scsi_cd: whitespace cleanup
Remove trailing whitespace and fix mixed indentation.

MFC after:	3 weeks
2019-08-29 08:26:40 +00:00
Andriy Gapon
c9f2918e69 scsi_cd: ifdef out cdsize()
It was used only by the old cdcheckmedia().

MFC after:	3 weeks
2019-08-29 08:19:11 +00:00