r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
BIO_READ and BIO_WRITE, we've handled this expanded syntax poorly in
drivers when the driver doesn't support a particular command. Do a
sweep and fix that.
Reported by: imp
o Remove All Rights Reserved from my notices
o imp@FreeBSD.org everywhere
o regularize punctiation, eliminate date ranges
o Make sure that it's clear that I don't claim All Rights reserved by listing
All Rights Reserved on same line as other copyright holders (but not
me). Other such holders are also listed last where it's clear.
later devices. These caches work akin to the ones found in HDDs/SSDs
that ada(4)/da(4) also enable if existent, but likewise increase the
likelihood of data loss in case of a sudden power outage etc. On the
other hand, write performance is up to twice as high for e. g. 1 GiB
files depending on the actual chip and transfer mode employed.
For maximum data integrity, the usage of eMMC caches can be disabled
via the hw.mmcsd.cache tunable.
- Get rid of the NOP mmcsd_open().
that userland doesn't switch partitions on its own, compare against
the partition mmcsd_ioctl_cmd() is going to switch to (based on the
device node used) rather than the currently selected partition.
to recover from that, especially when something goes wrong.
- When userland changes EXT_CSD, update the kernel copy before using
relevant EXT_CSD bits in mmcsd_switch_part().
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.
Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
This also involves adding a quirk table as TRIM is broken for some
Kingston eMMC devices, though. Compared to ERASE (declared "legacy"
in the eMMC specification v5.1), TRIM has the advantage of operating
on write sectors rather than on erase sectors, which typically are
of a much larger size. Thus, employing TRIM, we don't need to fiddle
with coalescing BIO_DELETE requests that are also of (write) sector
units into erase sectors, which might not even add up in all cases.
- For some SanDisk iNAND devices, the CMD38 argument, e. g. ERASE,
TRIM etc., has to be specified via EXT_CSD[113], which now is also
handled via a quirk.
- My initial understanding was that for eMMC partitions, the granularity
should be used as erase sector size, e. g. 128 KB for boot partitions.
However, rereading the relevant parts of the eMMC specification v5.1,
this isn't actually correct. So drop the code which used partition
granularities for delmaxsize and stripesize. For the most part, this
change is a NOP, though, because a) for ERASE, mmcsd_delete() used
the erase sector size unconditionally for all partitions anyway and
b) g_disk_limit() doesn't actually take the stripesize into account.
- Take some more advantage of mmcsd_errmsg() in mmcsd(4) for making
error codes human readable.
"br" or "bridge" where - according to the terminology outlined in
comments of bridge.h and mmcbr_if.m around since their addition in
r163516 - the bus is meant and used instead. Some of these instances
are also rather old, while those in e. g. mmc_subr.c are as new as
r315430 and were caused by choosing mmc_wait_for_request(), i. e. the
one pre-r315430 outliner existing in mmc.c, as template for function
parameters in mmc_subr.c inadvertently. This correction translates to
renaming "brdev" to "busdev" and "mmcbr" to "mmcbus" respectively as
appropriate.
While at it, also rename "reqdev" to just "dev" in mmc_subr.[c,h]
for consistency with was already used in mmm.c pre-r315430, again
modulo mmc_wait_for_request() that is.
- Remove comment lines from bridge.h incorrectly suggesting that there
would be a MMC bridge base class driver.
- Update comments in bridge.h regarding the star topology of SD and SDIO;
since version 3.00 of the SDHCI specification, for eSD and eSDIO bus
topologies are actually possible in form of so called "shared buses"
(in some subcontext later on renamed to "embedded" buses).
sdhci(4), mmc(4) and mmcsd(4). For the most part, this consists of:
- Correcting and extending the infrastructure for negotiating and
enabling post-DDR52 modes already added as part of r315598. In
fact, HS400ES now should work as well but hasn't been activated
due to lack of corresponding hardware.
- Adding support executing standard SDHCI initial tuning as well
as re-tuning as required for eMMC HS200/HS400 and the fast UHS-I
SD card modes. Currently, corresponding methods are only hooked
up to the ACPI and PCI front-ends of sdhci(4), though. Moreover,
sdhci(4) won't offer any modes requiring (re-)tuning to the MMC/SD
layer in order to not break operations with other sdhci(4) front-
ends. Likewise, sdhci(4) now no longer offers modes requiring the
set_uhs_timing method introduced in r315598 to be implemented/
hooked up (previously, this method was used with DDR52 only, which
in turn is only available with Intel controllers so far, i. e. no
such limitation was necessary before). Similarly for 1.2/1.8 V VCCQ
support and the switch_vccq method.
- Addition of locking to the IOCTL half of mmcsd(4) to prevent races
with detachment and suspension, especially since it's required to
immediately switch away from RPMB partitions again after an access
to these (so re-tuning can take place anew, given that the current
eMMC specification v5.1 doesn't allow tuning commands to be issued
with a RPMB partition selected). Therefore, the existing part_mtx
lock in the mmcsd(4) softc is additionally renamed to disk_mtx in
order to denote that it only refers to the disk(9) half, likewise
for corresponding macros.
On the system where the addition of DDR52 support increased the read
throughput to ~80 MB/s (from ~45 MB/s at high speed), HS200 yields
~154 MB/s and HS400 ~187 MB/s, i. e. performance now has more than
quadrupled compared to pre-r315598.
Also, with the advent of (re-)tuning support, most infrastructure
necessary for SD card UHS-I modes up to SDR104 now is also in place.
Note, though, that the standard SDHCI way of (re-)tuning is special
in several ways, which also is why sending the actual tuning requests
to the device is part of sdhci(4). SDHCI implementations not following
the specification, MMC and non-SDHCI SD card controllers likely will
use a generic implementation in the MMC/SD layer for executing tuning,
which hasn't been written so far, though.
However, in fact this isn't a feature-only change; there are boards
based on Intel Bay Trail where DDR52 is problematic and the suggested
workaround is to use HS200 mode instead. So far exact details are
unknown, however, i. e. whether that's due to a defect in these SoCs
or on the boards.
Moreover, due to the above changes requiring to be aware of possible
MMC siblings in the fast path of mmc(4), corresponding information
now is cached in mmc_softc. As a side-effect, mmc_calculate_clock(),
mmc_delete_cards(), mmc_discover_cards() and mmc_rescan_cards() now
all are guaranteed to operate on the same set of devices as there no
longer is any use of device_get_children(9), which can fail in low
memory situations. Likewise, mmc_calculate_clock() now longer will
trigger a panic due to the latter.
o Fix a bug in the failure reporting of mmcsd_delete(); in case of an
error when the starting block of a previously stored erase request
is used (in order to be able to erase a full erase sector worth of
data), the starting block of the newly supplied bio_pblkno has to be
returned for indicating no progress. Otherwise, upper layers might
be told that a negative number of BIOs have been completed, leading
to a panic.
o Fix 2 bugs on resume:
- Things done in fork1(9) like the acquisition of an SX lock or the
sleepable memory allocation are incompatible with a MTX_DEF taken.
Thus, mmcsd_resume() must not call kproc_create(9), which in turn
uses fork1(9), with the disk_mtx (formerly part_mtx) held.
- In mmc_suspend(), the bus is powered down, which in the typical
case of a device being selected at the time of suspension, causes
the device deselection as part of the bus acquisition by mmc(4) in
mmc_scan() to fail as the bus isn't powered up again before later
in mmc_go_discovery(). Thus, power down with the bus acquired in
mmc_suspend(), which will trigger the deselection up-front.
o Fix a memory leak in mmcsd_ioctl() in case copyin(9) fails. [1]
o Fix missing variable initialization in mmc_switch_status(). [2]
o Fix R1_SWITCH_ERROR detection in mmc_switch_status(). [3]
o Handle the case of device_add_child(9) failing, for example due to
a memory shortage, gracefully in mmc(4) and sdhci(4), including not
leaking memory for the instance variables in case of mmc(4) (which
might or might not fix [4] as the latter problem has been discovered
independently).
o Handle the case of an unknown SD CSD version in mmc_decode_csd_sd()
gracefully instead of calling panic(9).
o Again, check and handle the return values of some additional function
calls in mmc(4) instead of assuming that everything went right or mark
non-fatal errors by casting the return value to void.
o Correct a typo in the Linux IOCTL compatibility; it should have been
MMC_IOC_MULTI_CMD rather than MMC_IOC_CMD_MULTI.
o Now that we are reaching ever faster speeds (more improvement in this
regard is to be expected when adding ADMA support to sdhci(4)), apply
a few micro-optimizations like predicting mmc(4) and sdhci(4) debugging
to be off or caching erase sector and maximum data sizes as well support
of block addressing in mmsd(4) (instead of doing 2 indirections on every
read/write request for determining the maximum data size for example).
Reported by: Coverity
CID: 1372612 [1], 1372624 [2], 1372594 [3], 1007069 [4]
the default partition, eMMC v4.41 and later devices can additionally
provide up to:
1 enhanced user data area partition
2 boot partitions
1 RPMB (Replay Protected Memory Block) partition
4 general purpose partitions (optionally with a enhanced or extended
attribute)
Of these "partitions", only the enhanced user data area one actually
slices the user data area partition and, thus, gets handled with the
help of geom_flashmap(4). The other types of partitions have address
space independent from the default partition and need to be switched
to via CMD6 (SWITCH), i. e. constitute a set of additional "disks".
The second kind of these "partitions" doesn't fit that well into the
design of mmc(4) and mmcsd(4). I've decided to let mmcsd(4) hook all
of these "partitions" up as disk(9)'s (except for the RPMB partition
as it didn't seem to make much sense to be able to put a file-system
there and may require authentication; therefore, RPMB partitions are
solely accessible via the newly added IOCTL interface currently; see
also below). This approach for one resulted in cleaner code. Second,
it retains the notion of mmcsd(4) children corresponding to a single
physical device each. With the addition of some layering violations,
it also would have been possible for mmc(4) to add separate mmcsd(4)
instances with one disk each for all of these "partitions", however.
Still, both mmc(4) and mmcsd(4) share some common code now e. g. for
issuing CMD6, which has been factored out into mmc_subr.c.
Besides simply subdividing eMMC devices, some Intel NUCs having UEFI
code in the boot partitions etc., another use case for the partition
support is the activation of pseudo-SLC mode, which manufacturers of
eMMC chips typically associate with the enhanced user data area and/
or the enhanced attribute of general purpose partitions.
CAVEAT EMPTOR: Partitioning eMMC devices is a one-time operation.
- Now that properly issuing CMD6 is crucial (so data isn't written to
the wrong partition for example), make a step into the direction of
correctly handling the timeout for these commands in the MMC layer.
Also, do a SEND_STATUS when CMD6 is invoked with an R1B response as
recommended by relevant specifications. However, quite some work is
left to be done in this regard; all other R1B-type commands done by
the MMC layer also should be followed by a SEND_STATUS (CMD13), the
erase timeout calculations/handling as documented in specifications
are entirely ignored so far, the MMC layer doesn't provide timeouts
applicable up to the bridge drivers and at least sdhci(4) currently
is hardcoding 1 s as timeout for all command types unconditionally.
Let alone already available return codes often not being checked in
the MMC layer ...
- Add an IOCTL interface to mmcsd(4); this is sufficiently compatible
with Linux so that the GNU mmc-utils can be ported to and used with
FreeBSD (note that due to the remaining deficiencies outlined above
SANITIZE operations issued by/with `mmc` currently most likely will
fail). These latter will be added to ports as sysutils/mmc-utils in
a bit. Among others, the `mmc` tool of the GNU mmc-utils allows for
partitioning eMMC devices (tested working).
- For devices following the eMMC specification v4.41 or later, year 0
is 2013 rather than 1997; so correct this for assembling the device
ID string properly.
- Let mmcsd.ko depend on mmc.ko. Additionally, bump MMC_VERSION as at
least for some of the above a matching pair is required.
- In the ACPI front-end of sdhci(4) describe the Intel eMMC and SDXC
controllers as such in order to match the PCI one.
Additionally, in the entry for the 80860F14 SDXC controller remove
the eMMC-only SDHCI_QUIRK_INTEL_POWER_UP_RESET.
OKed by: imp
Submitted by: ian (mmc_switch_status() implementation)
comments, marking unused parameters as such, style(9), whitespace,
etc.
o In the mmc(4) bridges and sdhci(4) (bus) front-ends:
- Remove redundant assignments of the default bus_generic_print_child
device method (I've whipped these out of the tree as part of r227843
once, but they keep coming back ...),
- use DEVMETHOD_END,
- use NULL instead of 0 for pointers.
o Trim/adjust includes.
or write, resulting in random short-read and short-write returns for
requests. Fixing this fixes nominal block I/O via mmcsd(4).
Obtained from: DragonFlyBSD (fd4b97583be1a1e57234713c25f6e81bc0411cb0)
MFC after: 5 days
for all struct bio you get back from g_{new,alloc}_bio. Temporary
bios that you create on the stack or elsewhere should use this before
first use of the bio, and between uses of the bio. At the moment, it
is nothing more than a wrapper around bzero, but that may change in
the future. The wrapper also removes one place where we encode the
size of struct bio in the KBI.
tend to be invalid. On a Beaglebone Black, we get 8192 sectors per
track and that causes major breakages.
Differential Revision: D2646
Reviewed by: ian@ imp@
used to align partitions in gpart. We also try to align partitions by
stripe size when creating new media. Align these two concepts by
making fwsectors the same as the stripe size. Select a sensible number
of heads so we wind up with about 20 cylinders. This number was
selected to keep the rounding effects to a few percent while keeping
the number of cylinder groups low.
Sadly, it is not possible to make these numbers match the numbers used
by SD card readers. There apperas to be much variation between brands
so there's no one universal number. These numbers are also not aligned
to the stripe size, so some performance problems may still be present
when SD cards are created this way.
Also, these numbers will differ from the far less common SD to ATA
adapters, which present a different, but more uniform, set of numbers
that also happened to match the old defaults.
Nothing should change for current users. Any suboptimal performance
caused by misalignment will still be there. gpart will honor the
partitions that aren't on proper boudnaries, but editing the partition
tables may result in different alignments being used than before when
editing things natively.
Ideally, there'd be some way to override these values in the disk
subsystem by the user for the USB adapter use case where all "native"
notions of geometry disappear. This does not implement that.
In the mmcsd layer use this value to populate disk->d_ident. Also set
disk->d_descr to the full set of card identification info (includes vendor,
model, manufacturing date, etc).
Report errors indicated by the transport. If this is too chatty, I'll
throw it behind a debug write.
Remove commented out debugs that are no longer useful.
- When switching to 4-bit operation, send a SET_CLR_CARD_DETECT command
to disconnect the card-detect pull-up resistor from the DAT3 line before
sending the SET_BUS_WIDTH command.
- Add the missing "reserved" zero entry to the mantissa table used to
decode various CSD fields. This was causing SD cards to report that they
could run at 30 MHz instead of the maximum 25 MHz mandated in the spec.
o Enhancements:
- At the MMC layer, format various info from the CID into a string that
uniquely identifies the card instance (manufacturer number, serial
number, product name and revision, etc). Export it as an instance
variable.
- At the MMCSD layer, display the formatted card ID string, and also
report the clock speed of the hardware (not the card's max speed), and
the number of bits and number of blocks per transfer. It comes out like
this now:
mmcsd0: 968MB <SD SD01G 8.0 SN 276886905 MFG 08/2008 by 3 SD> at mmc0
22.5MHz/4bit/128-block
o Use DEVMETHOD_END.
o Use NULL instead of 0 for pointers.
PR: 156496
Submitted by: Ian Lepore
MFC after: 1 week
Now it is possible to suspend/resume with inserted and active card.
To reinitialize card on resume and to detect card change while suspended,
implement bus rescan routines. It can also be used by controllers without
card presence detection signals or with multiple cards per slot support.
While there, cleanup msleep() usage. We have no any rights to exit without
"request done" signal from driver as it could lead to modify after free.
sdhci supports up to 65535 blocks transfers, at91_mci - one block.
Enable multiblock operations disabled before to follow at91_mci driver
limitations.
Reviewed by: imp@
Erase operation gives card's logic information about unused areas to help it
implement wear-leveling with lower overhead comparing to usual writing.
Erase is much faster then write and does not depends on data bus speed.
Also as result of hitting in-card write logic optimizations I have measured
up to 50% performance boost on writing undersized blocks into preerased areas.
At the same time there are strict limitations on size and allignment of erase
operations. We can erase only blocks aligned to the erase sector size and
with size multiple of it. Different cards has different erase sector size
which usually varies from 64KB to 4MB. SD cards actually allow to erase
smaller blocks, but it is much more expensive as it is implemented via
read-erase-write sequence and so not sutable for the BIO_DELETE purposes.
Reviewed by: imp@