years for head. However, it is continuously misused as the mpsafe argument
for callout_init(9). Deprecate the flag and clean up callout_init() calls
to make them more consistent.
Differential Revision: https://reviews.freebsd.org/D2613
Reviewed by: jhb
MFC after: 2 weeks
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.
This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.
"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.
Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.
MFC after: 1 month
Sponsored by: Mellanox Technologies
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.
Reviewed by: adrian, rmacklem
Sponsored by: Mellanox Technologies
MFC after: 1 week
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.
MFC after: 1 week
Sponsored by: Mellanox Technologies
1. oce_multiq_start(): make sure the buffer is consumed even on ENXIO
2. oce_multiq_transmit(): there is an extra call to drbr_enqueue()
causing the mbuf to be enqueued twice when the NIC's queue is full,
and potential panics
3. oce_multiq_transmit(): same problem fixed recently in ixgbe (r267187)
and other drivers: if the mbuf is enqueued, the proper return value is 0
Submitted by: Stefano Garzarella
MFC after: 3 days
interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.
o Remove the if_baudrate_pf crutch.
o Make all fields of struct if_data fixed machine independent size. The
notion of data (packet counters, etc) are by no means MD. And it is a
bug that on amd64 we've got a 64-bit counters, while on i386 32-bit,
which at modern speeds overflow within a second.
This also removes quite a lot of COMPAT_FREEBSD32 code.
o Give 16 bit for the ifi_datalen field. This field was provided to
make future changes to if_data less ABI breaking. Unfortunately the
8 bit size of it had effectively limited sizeof if_data to 256 bytes.
o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.
__FreeBSD_version bumped.
Discussed with: emax
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
- Add support to 40Gbps devices;
- Add support to control adaptive interrupt coalescing (AIC)
via sysctl;
- Improve support of BE3 devices;
Many thanks to Emulex for their continued support of FreeBSD.
Submitted by: Venkata Duvvuru <VenkatKumar.Duvvuru Emulex Com>
MFC after: 3 days
of Skyhawk adapters.
Many thanks to Emulex for their continued support of FreeBSD.
Submitted by: "Duvvuru,Venkat Kumar" <VenkatKumar.Duvvuru Emulex.Com>
MFC after: 1 day
of the newer drivers. The basic problem was
that the driver was pulling the mbuf off the
drbr ring and then when sending with xmit(), encounting
a full transmit ring. Thus the lower layer
xmit() function would return an error, and the
drivers would then append the data back on to the ring.
For TCP this is a horrible scenario sure to bring
on a fast-retransmit.
The fix is to use drbr_peek() to pull the data pointer
but not remove it from the ring. If it fails then
we either call the new drbr_putback or drbr_advance
method. Advance moves it forward (we do this sometimes
when the xmit() function frees the mbuf). When
we succeed we always call advance. The
putback will always copy the mbuf back to the top
of the ring. Note that the putback *cannot* be used
with a drbr_dequeue() only with drbr_peek(). We most
of the time, in putback, would not need to copy it
back since most likey the mbuf is still the same, but
sometimes xmit() functions will change the mbuf via
a pullup or other call. So the optimial case for
the single consumer is to always copy it back. If
we ever do a multiple_consumer (for lagg?) we
will need a test and atomic in the put back possibly
a seperate putback_mc() in the ring buf.
Reviewed by: jhb@freebsd.org, jlv@freebsd.org
tree used it incorrectly, which lead to inaccurate overrated
if_obytes accounting. The drbr(9) used to update ifnet stats on
drbr_enqueue(), which is not accurate since enqueuing doesn't
imply successful processing by driver. Dequeuing neither mean
that. Most drivers also called drbr_stats_update() which did
accounting again, leading to doubled if_obytes statistics. And
in case of severe transmitting, when a packet could be several
times enqueued and dequeued it could have been accounted several
times.
o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between
ALTQ queueing or buf_ring(9) queueing.
- It doesn't touch the buf_ring stats any more.
- It doesn't touch ifnet stats anymore.
- drbr_stats_update() no longer exists.
o buf_ring(9) handles its stats itself:
- It handles br_drops itself.
- br_prod_bytes stats are dropped. Rationale: no one ever
reads them but update of a common counter on every packet
negatively affects performance due to excessive cache
invalidation.
- buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since
we no longer account bytes.
o Drivers handle their stats theirselves: if_obytes, if_omcasts.
o mlx4(4), igb(4), em(4), vxge(4), oce(4) and ixv(4) no longer
use drbr_stats_update(), and update ifnet stats theirselves.
o bxe(4) was the most correct driver, it didn't call
drbr_stats_update(), thus it was the only driver accurate under
moderate load. Now it also maintains stats itself.
o ixgbe(4) had already taken stats from hardware, so just
- drop software stats updating.
- take multicast packet count from hardware as well.
o mxge(4) just no longer needs NO_SLOW_STATS define.
o cxgb(4), cxgbe(4) need no change, since they obtain stats
from hardware.
Reviewed by: jfv, gnn