132 Commits

Author SHA1 Message Date
Gleb Smirnoff
0e229f343f Hide struct socket and struct unpcb from the userland.
Violators may define _WANT_SOCKET and _WANT_UNPCB respectively and
are not guaranteed for stability of the structures.  The violators
list is the the usual one: libprocstat(3) and netstat(1) internally
and lsof in ports.

In struct xunpcb remove the inclusion of kernel structure and add
a bunch of spare fields.  The xsocket already has socket not included,
but add there spares as well.  Embed xsockbuf into xsocket.

Sort declarations in sys/socketvar.h to separate kernel only from
userland available ones.

PR:		221820 (exp-run)
2017-10-02 23:29:56 +00:00
Sean Bruno
32a04bb81d Use counter(9) for PLPMTUD counters.
Remove unused PLPMTUD sysctl counters.

Bump UPDATING and FreeBSD Version to indicate a rebuild is required.

Submitted by:	kevin.bowling@kev009.com
Reviewed by:	jtl
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12003
2017-08-25 19:41:38 +00:00
Gleb Smirnoff
779f106aa1 Listening sockets improvements.
o Separate fields of struct socket that belong to listening from
  fields that belong to normal dataflow, and unionize them.  This
  shrinks the structure a bit.
  - Take out selinfo's from the socket buffers into the socket. The
    first reason is to support braindamaged scenario when a socket is
    added to kevent(2) and then listen(2) is cast on it. The second
    reason is that there is future plan to make socket buffers pluggable,
    so that for a dataflow socket a socket buffer can be changed, and
    in this case we also want to keep same selinfos through the lifetime
    of a socket.
  - Remove struct struct so_accf. Since now listening stuff no longer
    affects struct socket size, just move its fields into listening part
    of the union.
  - Provide sol_upcall field and enforce that so_upcall_set() may be called
    only on a dataflow socket, which has buffers, and for listening sockets
    provide solisten_upcall_set().

o Remove ACCEPT_LOCK() global.
  - Add a mutex to socket, to be used instead of socket buffer lock to lock
    fields of struct socket that don't belong to a socket buffer.
  - Allow to acquire two socket locks, but the first one must belong to a
    listening socket.
  - Make soref()/sorele() to use atomic(9).  This allows in some situations
    to do soref() without owning socket lock.  There is place for improvement
    here, it is possible to make sorele() also to lock optionally.
  - Most protocols aren't touched by this change, except UNIX local sockets.
    See below for more information.

o Reduce copy-and-paste in kernel modules that accept connections from
  listening sockets: provide function solisten_dequeue(), and use it in
  the following modules: ctl(4), iscsi(4), ng_btsocket(4), ng_ksocket(4),
  infiniband, rpc.

o UNIX local sockets.
  - Removal of ACCEPT_LOCK() global uncovered several races in the UNIX
    local sockets.  Most races exist around spawning a new socket, when we
    are connecting to a local listening socket.  To cover them, we need to
    hold locks on both PCBs when spawning a third one.  This means holding
    them across sonewconn().  This creates a LOR between pcb locks and
    unp_list_lock.
  - To fix the new LOR, abandon the global unp_list_lock in favor of global
    unp_link_lock.  Indeed, separating these two locks didn't provide us any
    extra parralelism in the UNIX sockets.
  - Now call into uipc_attach() may happen with unp_link_lock hold if, we
    are accepting, or without unp_link_lock in case if we are just creating
    a socket.
  - Another problem in UNIX sockets is that uipc_close() basicly did nothing
    for a listening socket.  The vnode remained opened for connections.  This
    is fixed by removing vnode in uipc_close().  Maybe the right way would be
    to do it for all sockets (not only listening), simply move the vnode
    teardown from uipc_detach() to uipc_close()?

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D9770
2017-06-08 21:30:34 +00:00
Gleb Smirnoff
cc65eb4e79 Hide struct inpcb, struct tcpcb from the userland.
This is a painful change, but it is needed.  On the one hand, we avoid
modifying them, and this slows down some ideas, on the other hand we still
eventually modify them and tools like netstat(1) never work on next version of
FreeBSD.  We maintain a ton of spares in them, and we already got some ifdef
hell at the end of tcpcb.

Details:
- Hide struct inpcb, struct tcpcb under _KERNEL || _WANT_FOO.
- Make struct xinpcb, struct xtcpcb pure API structures, not including
  kernel structures inpcb and tcpcb inside.  Export into these structures
  the fields from inpcb and tcpcb that are known to be used, and put there
  a ton of spare space.
- Make kernel and userland utilities compilable after these changes.
- Bump __FreeBSD_version.

Reviewed by:	rrs, gnn
Differential Revision:	D10018
2017-03-21 06:39:49 +00:00
Gleb Smirnoff
d9646465e3 Typo. 2017-03-10 19:08:31 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Andrey V. Elsukov
fcf596178b Merge projects/ipsec into head/.
Small summary
 -------------

o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
  option IPSEC_SUPPORT added. It enables support for loading
  and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
  default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
  support was removed. Added TCP/UDP checksum handling for
  inbound packets that were decapsulated by transport mode SAs.
  setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
  build as part of ipsec.ko module (or with IPSEC kernel).
  It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
  methods. The only one header file <netipsec/ipsec_support.h>
  should be included to declare all the needed things to work
  with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
  Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
  - now all security associations stored in the single SPI namespace,
    and all SAs MUST have unique SPI.
  - several hash tables added to speed up lookups in SADB.
  - SADB now uses rmlock to protect access, and concurrent threads
    can do SA lookups in the same time.
  - many PF_KEY message handlers were reworked to reflect changes
    in SADB.
  - SADB_UPDATE message was extended to support new PF_KEY headers:
    SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
    can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
  avoid locking protection for ipsecrequest. Now we support
  only limited number (4) of bundled SAs, but they are supported
  for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
  used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
  check for full history of applied IPsec transforms.
o References counting rules for security policies and security
  associations were changed. The proper SA locking added into xform
  code.
o xform code was also changed. Now it is possible to unregister xforms.
  tdb_xxx structures were changed and renamed to reflect changes in
  SADB/SPDB, and changed rules for locking and refcounting.

Reviewed by:	gnn, wblock
Obtained from:	Yandex LLC
Relnotes:	yes
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9352
2017-02-06 08:49:57 +00:00
Xin LI
f193c8ce0d Use strlcpy and snprintf in netstat(1).
Expand inet6name() line buffer to NI_MAXHOST and use strlcpy/snprintf
in various places.

Reported by:	Anton Yuzhaninov <citrin citrin ru>
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D8916
2017-01-05 09:23:54 +00:00
Marcelo Araujo
e9d2c20108 Print hostcache usage counts with TCP statistics.
PR:		196252
Submitted by:	Anton Yuzhaninov <citrin+pr@citrin.ru>
MFC after:	3 weeks.
2016-12-28 13:11:22 +00:00
Michael Tuexen
db2627f4b1 When calling netstat -Laptcp the local address values are not aligned
with the corresponding entry in the table header.
r295136 increased the value width from 14 to 32 without the corresponding
change to the table header. This commit adds the change to the table
header width.

MFC after:	3 days
2016-07-15 17:40:34 +00:00
Marcelo Araujo
ef1cb629d8 Use NULL instead of 0 for pointers.
Also malloc will return NULL if it cannot allocate memory.

MFC after:	2 weeks.
2016-04-18 05:46:18 +00:00
Gleb Smirnoff
dbfd87087b Print running TCP connection counts with TCP statistics. 2016-03-15 00:19:30 +00:00
Alfred Perlstein
7325dfbb59 Increase max allowed backlog for listen sockets
from short to int.

PR: 203922
Submitted by: White Knight <white_knight@2ch.net>
MFC After: 4 weeks
2016-02-02 05:57:59 +00:00
Hajimu UMEMOTO
0eaa116e1e Fix udp entry of `netstat -TW'. 2015-11-25 11:20:54 +00:00
Hajimu UMEMOTO
046aad399a Correct alignment of the addresses in the `netstat -aW' output. 2015-11-24 14:25:40 +00:00
Mark Johnston
9eddb899d9 Use a common subroutine to fetch and zero protocol stats instead of
duplicating roughly similar code for each protocol.

MFC after:	2 weeks
2015-09-11 04:37:01 +00:00
Hiroki Sato
10d5269ff9 - Add -W flag support for network column in intpr() (-i flag) and
routepr() (-r flag).  It is too narrow to show an IPv6 prefix
  in most cases.

- Accept "local" as a synonym of "unix" in protocol family name.

- Show a prefix length in CIDR notation when name resolution failed in
  netname().

- Make routename() and netname() AF-independent and remove
  unnecessary typecasting from struct sockaddr.

- Use getnameinfo(3) to format L2 addr in intpr().

- Fix a bug which showed "Address" when -A flag is specfied in pr_rthdr().

- Replace cryptic GETSA() macro with SA_SIZE().

- Fix declarations shadowing local variables with the same names.

- Add more static, remove unused header files and variables.

MFC after:	1 week
2015-09-01 08:42:04 +00:00
Marcel Moolenaar
ade9ccfe21 Convert netstat to use libxo.
Obtained from:  Phil Shafer <phil@juniper.net>
Ported to -current by: alfred@ (mostly), Kim Shrier
Formatting: marcel@
Sponsored by:   Juniper Networks, Inc.
2015-02-21 23:47:20 +00:00
Gleb Smirnoff
de5ef1dfee Burn bridges to FreeBSD 7.x IGMP stats. 2015-02-19 19:36:54 +00:00
Gleb Smirnoff
0f9d0a73a4 Merge from projects/sendfile:
o Introduce a notion of "not ready" mbufs in socket buffers.  These
mbufs are now being populated by some I/O in background and are
referenced outside.  This forces following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- An mbuf that is behind a "not ready" in the queue neither.
- If sockbet buffer is flushed, then "not ready" mbufs shouln't be
  freed.

o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
  The sb_ccc stands for ""claimed character count", or "committed
  character count".  And the sb_acc is "available character count".
  Consumers of socket buffer API shouldn't already access them directly,
  but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
  with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
  search.
o New function sbready() is provided to activate certain amount of mbufs
  in a socket buffer.

A special note on SCTP:
  SCTP has its own sockbufs.  Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs.  Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them.  The new
notion of "not ready" data isn't supported by SCTP.  Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
  A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does.  This was discussed with rrs@.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-30 12:52:33 +00:00
Adrian Chadd
85b0f0f325 Add -R to netstat to dump RSS/flow information.
This is intended to help in diagnostics and debugging of NIC and stack
flowid support.

Eventually this will grow another column (RSS CPU ID) but
that currently isn't cached in the inpcb.

There's also no clean flowtype -> flowtype identifier string.  This is
the mbuf M_HASHTYPE_* values for RSS.

Here's some example output:

adrian@adrian-hackbox:~/work/freebsd/head/src % netstat -Rn | more
Active Internet connections
Proto Recv-Q Send-Q Local Address          Foreign Address           flowid ftype
tcp4       0      0 10.11.1.65.22          10.11.1.64.12409        29041942     2
udp4       0      0 127.0.0.1.123          *.*                     00000000     0
udp6       0      0 fe80::1%lo0.123        *.*                     00000000     0
udp6       0      0 ::1.123                *.*                     00000000     0
udp4       0      0 10.11.1.65.123         *.*                     00000000     0

Tested:

* amd64 system w/ igb NIC; local driver changes to expose RSS flowid in if_igb.
2014-05-19 17:11:43 +00:00
Gleb Smirnoff
c669105d17 - Remove net.inet.tcp.reass.overflows sysctl. It counts exactly
same events that tcpstat's tcps_rcvmemdrop counter counts.
- Rename tcps_rcvmemdrop to tcps_rcvreassfull and improve its
  description in netstat(1) output.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-05-06 00:00:07 +00:00
Bjoern A. Zeeb
a4993b252e Print the MD5 signature information introduced in r221023 in the
TCP statistics output.

MFC after:	3 weeks
2014-02-05 20:43:03 +00:00
Andrey V. Elsukov
69edf037d7 Migrate struct carpstats to PCPU counters. 2013-07-09 10:02:51 +00:00
Andrey V. Elsukov
5b7cb97c2b Migrate structs arpstat, icmpstat, mrtstat, pimstat and udpstat to PCPU
counters.
2013-07-09 09:50:15 +00:00
Andrey V. Elsukov
5da0521fce Use new macros to implement ipstat and tcpstat using PCPU counters.
Change interface of kread_counters() similar ot kread() in the netstat(1).
2013-07-09 09:43:03 +00:00
Andrey V. Elsukov
c80211e3cf Prepare network statistics structures for migration to PCPU counters.
Use uint64_t as type for all fields of structures.

Changed structures: ahstat, arpstat, espstat, icmp6_ifstat, icmp6stat,
in6_ifstat, ip6stat, ipcompstat, ipipstat, ipsecstat, mrt6stat, mrtstat,
pfkeystat, pim6stat, pimstat, rip6stat, udpstat.

Discussed with:	arch@
2013-07-09 09:32:06 +00:00
Gleb Smirnoff
29dde48df4 Use kvm_counter_u64_fetch() to fix obtaining ipstat and tcpstat from
kernel core files.

Sponsored by:	Nginx, Inc.
2013-04-10 20:29:23 +00:00
Gleb Smirnoff
5923c29332 Merge from projects/counters: TCP/IP stats.
Convert 'struct ipstat' and 'struct tcpstat' to counter(9).

  This speeds up IP forwarding at extreme packet rates, and
makes accounting more precise.

Sponsored by:	Nginx, Inc.
2013-04-08 19:57:21 +00:00
Philippe Charnier
321ae07f36 WARNS=6 compliance 2013-02-19 13:17:16 +00:00
Gleb Smirnoff
f89bcdb03b Use pluralies() for "entry"/"entries". 2013-01-22 18:07:59 +00:00
Navdeep Parhar
09fe63205c - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00
Xin LI
96c3073f76 Eliminate an unused parameter of static method igmp_stats_live_old().
MFC after:	1 month
2012-04-13 22:35:53 +00:00
Ed Schouten
b3608ae18f Replace index() and rindex() calls with strchr() and strrchr().
The index() and rindex() functions were marked LEGACY in the 2001
revision of POSIX and were subsequently removed from the 2008 revision.
The strchr() and strrchr() functions are part of the C standard.

This makes the source code a lot more consistent, as most of these C
files also call into other str*() routines. In fact, about a dozen
already perform strchr() calls.
2012-01-03 18:51:58 +00:00
Ruslan Ermilov
afc7401525 Fixed sockets display somewhat (-L, -T, -x, -Lx, with and without -A).
(I didn't try to fix negative TCP timers with -x.)

MFC after:	3 days
2011-03-26 19:09:28 +00:00
Jeff Roberson
aa0a1e58f0 - Merge in OFED 1.5.3 from projects/ofed/head 2011-03-21 09:58:24 +00:00
Joel Dahl
da52b4caaf Remove the advertising clause from UCB copyrighted files in usr.bin. This
is in accordance with the information provided at
ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change

Also add $FreeBSD$ to a few files to keep svn happy.

Discussed with:	imp, rwatson
2010-12-11 08:32:16 +00:00
George V. Neville-Neil
b205d03d78 Restore the (state) and \n printout when not using -T.
Pointed out by:	brucec@
MFC after:	3 weeks
2010-11-22 22:55:43 +00:00
George V. Neville-Neil
f5d34df525 Add new, per connection, statistics for TCP, including:
Retransmitted Packets
Zero Window Advertisements
Out of Order Receives

These statistics are available via the -T argument to
netstat(1).
MFC after:	2 weeks
2010-11-17 18:55:12 +00:00
Ruslan Ermilov
699ef999ee Show hostcache statistics.
Submitted by:	Maxim Dounin
2010-10-05 05:15:27 +00:00
Mike Silbersack
aae0914135 In netstat -x, do not try to print out tcp timer status for udp sockets. 2009-09-23 05:32:33 +00:00
Mike Silbersack
b8614722ff Add the ability to see TCP timers via netstat -x. This can be a useful
feature when you have a seemingly stuck socket and want to figure
out why it has not been closed yet.

No plans to MFC this, as it changes the netstat sysctl ABI.

Reviewed by:	andre, rwatson, Eric Van Gyzen
2009-09-16 05:33:15 +00:00
George V. Neville-Neil
54fc657d59 Add ARP statistics to the kernel and netstat.
New counters now exist for:
requests sent
replies sent
requests received
replies received
packets received
total packets dropped due to no ARP entry
entrys timed out
Duplicate IPs seen

The new statistics are seen in the netstat command
when it is given the -s command line switch.

MFC after:	2 weeks
In collaboration with: bz
2009-09-03 21:10:57 +00:00
Robert Watson
ad71fe3c35 Correct a number of evolved problems with inp_vflag and inp_flags:
certain flags that should have been in inp_flags ended up in inp_vflag,
meaning that they were inconsistently locked, and in one case,
interpreted.  Move the following flags from inp_vflag to gaps in the
inp_flags space (and clean up the inp_flags constants to make gaps
more obvious to future takers):

  INP_TIMEWAIT
  INP_SOCKREF
  INP_ONESBCAST
  INP_DROPPED

Some aspects of this change have no effect on kernel ABI at all, as these
are UDP/TCP/IP-internal uses; however, netstat and sockstat detect
INP_TIMEWAIT when listing TCP sockets, so any MFC will need to take this
into account.

MFC after:      1 week (or after dependencies are MFC'd)
Reviewed by:    bz
2009-03-15 09:58:31 +00:00
Bruce M Simpson
d10910e6ce Merge IGMPv3 and Source-Specific Multicast (SSM) to the FreeBSD
IPv4 stack.

Diffs are minimized against p4.
PCS has been used for some protocol verification, more widespread
testing of recorded sources in Group-and-Source queries is needed.
sizeof(struct igmpstat) has changed.

__FreeBSD_version is bumped to 800070.
2009-03-09 17:53:05 +00:00
George V. Neville-Neil
94f138fe60 Fix a printing problem when using the -L flag to netstat caused
by adding the -x flag earlier.

Submitted by:	Anton Yuzhaninov
MFC after:	3 days
2008-11-28 18:35:14 +00:00
Xin LI
1c10962832 Use strlcpy() when we mean it. 2008-10-17 21:14:50 +00:00
David E. O'Brien
dd335a1577 Minimize changes CURRENT<->releng7. 2008-09-01 15:04:38 +00:00
Rui Paulo
4816ba93ac Add ECN stats. 2008-08-26 15:12:29 +00:00
Maksim Yevmenkin
f35a20921e Fix build 2008-07-29 21:20:03 +00:00