On some architectures, u_long isn't large enough for resource definitions.
Particularly, powerpc and arm allow 36-bit (or larger) physical addresses, but
type `long' is only 32-bit. This extends rman's resources to uintmax_t. With
this change, any resource can feasibly be placed anywhere in physical memory
(within the constraints of the driver).
Why uintmax_t and not something machine dependent, or uint64_t? Though it's
possible for uintmax_t to grow, it's highly unlikely it will become 128-bit on
32-bit architectures. 64-bit architectures should have plenty of RAM to absorb
the increase on resource sizes if and when this occurs, and the number of
resources on memory-constrained systems should be sufficiently small as to not
pose a drastic overhead. That being said, uintmax_t was chosen for source
clarity. If it's specified as uint64_t, all printf()-like calls would either
need casts to uintmax_t, or be littered with PRI*64 macros. Casts to uintmax_t
aren't horrible, but it would also bake into the API for
resource_list_print_type() either a hidden assumption that entries get cast to
uintmax_t for printing, or these calls would need the PRI*64 macros. Since
source code is meant to be read more often than written, I chose the clearest
path of simply using uintmax_t.
Tested on a PowerPC p5020-based board, which places all device resources in
0xfxxxxxxxx, and has 8GB RAM.
Regression tested on qemu-system-i386
Regression tested on qemu-system-mips (malta profile)
Tested PAE and devinfo on virtualbox (live CD)
Special thanks to bz for his testing on ARM.
Reviewed By: bz, jhb (previous)
Relnotes: Yes
Sponsored by: Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D4544
- Add a per-device mutex to the softc and use it for bus_dma tags,
CAM SIMs, callouts, and interrupt handler.
- Switch from timeout(9) to callout(9).
- Add a separate global mutex to protect the global event buffer ring.
- Return completed index from iir_intr_locked() and remove the global
gdt_wait_* variables.
- Remove global list of gdt softcs and replace its use with
devclass_get_device().
- Use si_drv1 to store softc pointer in the SDEV_PER_HBA case instead
of minor numbers.
- Do math on osreldate instead of dubious char math on osrelease[]
that didn't work on 10.0+.
- Use bus_*() instead of bus_space_*().
- Use device_printf() instead of printf() with a unit number.
Tested by: no one
now takes a device_t to be the parent of the bus that is being created.
Most SIMs have been updated with a reasonable argument, but a few exceptions
just pass NULL for now. This argument isn't used yet and the newbus
integration likely won't be ready until after 7.0-RELEASE.
- Don't use a common buffer in the softc to store per-command data. Reserve
a buffer in the command itself.
- Don't allocate DMA memory for the kernel command structures when all you
really need is DMA memory for the scratch buffer embedded in them. Instead
allocate a slab for the scratch buffers and divide it up as needed.
- Call bus_dmamap_unload() at the completion of commands.
- Preserve and clear the CAM CCB status flags at completion.
- Reorder some low-level command operations to try to close races.
- Limit the simq to 32 commands for now. There are some serious problems
with the driver under load that are not well understood, so keeping the
simq lower helps avoid this. It has been tested at a higher value, but
this is a safe value that doesn't show much performance degredation.
These changes allow the driver to work reliably with >4GB of memory on i386
and amd64, and also work around deadlocks seen under very high load in
certain situations. The work-around is far from ideal, but without and
documentation it is hard to know what the right fix is.
MFC candidate
Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma. At the moment, this is used for the
asynchronous busdma_swi and callback mechanism. Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg. dftl_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create(). The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.
sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms. The busdma_swi on is64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.
If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.
Reviewed by: tmm, gibbs
driver at long last!
Many thanks to vaidas.damosevicius@if.lt for keeping this issue alive
and pursuing Intel for a fix, Intel/ICP for working on the driver, and
Sergey Osokin for bringing the original patches up to 5-CURRENT.
prior ICP Vortex models. This driver was developed by Achim Leubner
of Intel (previously with ICP Vortex) and Boji Kannanthanam of Intel.
Submitted by: "Kannanthanam, Boji T" <boji.t.kannanthanam@intel.com>
MFC after: 2 weeks