88591e04af
old/broken hardware. Unfortunately, it adds cache pressure and possible mispredicted branches to the fast path of the bus_dmamap_load collection of functions. Since it's meant for slow path exception processing, de-inline it and allow its conditions to be pre-computed at tag_create time and thus short-circuited at runtime. While here, cut down on the size of _bus_dmamap_load_buffer() by pushing the bounce page logic into a non-inlined function. Again, this helps with cache pressure and mispredicted branches. According to the TSC, this shaves off a few cycles on average. Unfortunately, the data varies quite a bit due to interrupts and preemption, so it's hard to get a good measurement. Real-world measurements of network PPS are welcome. A merge to amd64 and other arches is pending more testing. |
||
---|---|---|
.. | ||
acpica | ||
bios | ||
compile | ||
conf | ||
cpufreq | ||
i386 | ||
ibcs2 | ||
include | ||
isa | ||
linux | ||
pci | ||
svr4 | ||
xbox | ||
Makefile |
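
The commit message above describes two ideas: deciding once, at tag creation, whether the slow-path checks can ever apply, and keeping the slow-path code out of line so the fast path stays small. The following is only a rough sketch of that pattern, not the actual busdma code. `struct dma_tag`, `TAG_COULD_BOUNCE`, `tag_create()`, `needs_bounce()` and `needs_bounce_slow()` are hypothetical names invented for illustration, and `__attribute__((noinline))` assumes a GCC or Clang compiler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ADDR UINT64_MAX

/* Hypothetical flag: set once if any slow-path condition could ever apply. */
#define TAG_COULD_BOUNCE 0x1u

/* Hypothetical quirk callback; returns nonzero if the address must bounce. */
typedef int (*filter_fn)(void *arg, uint64_t paddr);

struct dma_tag {
    uint64_t  lowaddr;     /* highest address the device can reach */
    uint64_t  alignment;   /* required alignment, a power of two */
    filter_fn filter;      /* optional quirk filter, usually NULL */
    void     *filter_arg;
    unsigned  flags;       /* precomputed by tag_create() */
};

/* Precompute, at creation time, whether the slow path can ever be needed. */
static void
tag_create(struct dma_tag *tag, uint64_t lowaddr, uint64_t alignment,
    filter_fn filter, void *filter_arg)
{
    tag->lowaddr = lowaddr;
    tag->alignment = alignment;
    tag->filter = filter;
    tag->filter_arg = filter_arg;
    tag->flags = 0;
    if (lowaddr < MAX_ADDR || alignment > 1 || filter != NULL)
        tag->flags |= TAG_COULD_BOUNCE;
}

/* Slow path: kept out of line so it does not bloat the callers. */
static __attribute__((noinline)) bool
needs_bounce_slow(const struct dma_tag *tag, uint64_t paddr)
{
    bool candidate = paddr > tag->lowaddr ||
        (paddr & (tag->alignment - 1)) != 0;

    if (!candidate)
        return false;
    /* With no filter installed, every candidate address bounces. */
    return tag->filter == NULL || tag->filter(tag->filter_arg, paddr) != 0;
}

/* Fast path: a single flag test short-circuits the common case. */
static inline bool
needs_bounce(const struct dma_tag *tag, uint64_t paddr)
{
    if ((tag->flags & TAG_COULD_BOUNCE) == 0)
        return false;
    return needs_bounce_slow(tag, paddr);
}
```

In this sketch, a tag created with `lowaddr` equal to `MAX_ADDR`, an alignment of 1 and no filter never calls `needs_bounce_slow()`; every lookup costs one flag test. A tag with a 32-bit `lowaddr` takes the out-of-line call on each lookup and bounces only addresses above that limit.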