5ed480f8c9
The "Duplicate mbuf free" panic is caused by a programming error in
hme_load_txmbuf(). The code path leading to the panic is as follows.

1. For an unknown reason the DMA engine froze. The TX descriptors of
   the HME therefore filled up, and the last failed attempt to transmit
   a packet left its mbuf address stored in the hme_txdesc structure.
   The failed packet was also requeued on the interface queue so that
   it could be retransmitted once more TX descriptors became available.
2. Since the DMA engine is frozen, if_timer starts counting down. When
   if_timer expires, the driver tries to reset the HME. During the
   reset, hme_meminit() is called and frees every mbuf associated with
   a descriptor. The last failed mbuf is freed here as well.
3. After the reset completes, the HME starts retransmitting packets by
   dequeuing the first packet from the interface queue. (Note: that
   packet was already freed in hme_meminit()!)
4. When the HME posts a TX completion interrupt, the driver tries to
   free the successfully transmitted mbuf. Since the mbuf was already
   freed in step 2, we get the "Duplicate mbuf free" panic.

However, the real cause is the DMA engine freeze. Since no fatal errors
are reported via interrupts, the freeze may have some other cause. I
tried hard to understand why the DMA engine freezes but couldn't find
any clues. It seems that the freeze happens under very high network
load (e.g. 7.5-8.0 MB/s TX speed). Though this fix is not enough to
eliminate the DMA engine freeze, it's better than a panic.

Reported by: jhb via sparc64 ML
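
The four steps of the panic path can be replayed in a small standalone
model. This is only a sketch, not the driver code: every toy_* name is
hypothetical, the interface queue is reduced to a single slot, and the
clear_on_failure flag is an assumed way of avoiding the stale pointer,
not necessarily what this change does in hme_load_txmbuf().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mbuf and hme_txdesc structures. */
struct toy_mbuf {
    int free_count;              /* how many times the mbuf was "freed" */
};

struct toy_txdesc {
    struct toy_mbuf *m;          /* mbuf recorded for this descriptor */
};

static void toy_free(struct toy_mbuf *m)
{
    m->free_count++;
}

/*
 * Models a failed load on a full TX ring. The buggy variant leaves the
 * mbuf recorded in the descriptor even though the load failed.
 */
static int toy_load_txmbuf(struct toy_txdesc *td, struct toy_mbuf *m,
    int clear_on_failure)
{
    td->m = m;                   /* recorded before the failure is known */
    if (clear_on_failure)
        td->m = NULL;            /* assumed fix: drop the stale pointer */
    return -1;                   /* ring is full in this model */
}

/* Models hme_meminit(): free whatever the descriptor still points at. */
static void toy_meminit(struct toy_txdesc *td)
{
    if (td->m != NULL) {
        toy_free(td->m);
        td->m = NULL;
    }
}

/* Replays steps 1-4 and returns how often the packet was freed. */
int toy_replay(int clear_on_failure)
{
    struct toy_mbuf pkt = { 0 };
    struct toy_txdesc td = { NULL };
    struct toy_mbuf *ifq = NULL; /* one-slot interface queue */

    /* Step 1: the load fails and the packet is requeued. */
    if (toy_load_txmbuf(&td, &pkt, clear_on_failure) != 0)
        ifq = &pkt;

    /* Step 2: the watchdog reset frees mbufs held by descriptors. */
    toy_meminit(&td);

    /* Step 3: after the reset the requeued packet is dequeued and sent. */
    td.m = ifq;
    ifq = NULL;
    assert(td.m != NULL);

    /* Step 4: the TX completion path frees the transmitted mbuf. */
    toy_free(td.m);
    td.m = NULL;

    return pkt.free_count;
}
```

In this model toy_replay(0) counts two frees of the same packet, which
corresponds to the panic, while toy_replay(1) counts one.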