21d95a8778
underlying unaligned bcopy) on incoming packets that are already available (albeit unaligned) in a buffer.

The performance improvement varies, depending on CPU and memory speed, but can be quite large, especially on slow CPUs. I have seen an increase of over 50% in forwarding speed with the sis driver, which does exactly the same thing, on the 486/133 (embedded systems).

The behaviour is controlled by a sysctl variable, hw.dc_quick, which defaults to 1. Set it to 0 to restore the old behaviour.

After running a few experiments (in userland, though) I am convinced that doing the m_devget() is detrimental to performance in almost all cases. Even if your CPU has degraded performance with misaligned data, the bcopy() in the driver has the same overhead due to misalignment as the one that you save in the uiomove(), plus you do one extra copy and pollute the cache. More often than not, though, you do not even have to touch the payload, e.g. when you are forwarding packets, and even in the often-cited case of NFS you frequently end up passing a pointer to the payload to the disk controller.

In any case, you can toggle the sysctl variable to switch between the two behaviours and see if it makes a difference.

MFC-after: 3 days