freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	247cf5664e	Add SIOCGIFDOWNREASON. The ioctl(2) is intended to provide more details about the cause of the down for the link. Eventually we might define a comprehensive list of codes for the situations. But interface also allows the driver to provide free-form null-terminated ASCII string to provide arbitrary non-formalized information. Sample implementation exists for mlx5(4), where the string is fetched from firmware controlling the port. Reviewed by: hselasky, rrs Sponsored by: Mellanox Technologies MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21527	2019-09-17 18:49:13 +00:00
John Baldwin	b2e60773c6	Add kernel-side support for in-kernel TLS. KTLS adds support for in-kernel framing and encryption of Transport Layer Security (1.0-1.2) data on TCP sockets. KTLS only supports offload of TLS for transmitted data. Key negotiation must still be performed in userland. Once completed, transmit session keys for a connection are provided to the kernel via a new TCP_TXTLS_ENABLE socket option. All subsequent data transmitted on the socket is placed into TLS frames and encrypted using the supplied keys. Any data written to a KTLS-enabled socket via write(2), aio_write(2), or sendfile(2) is assumed to be application data and is encoded in TLS frames with an application data type. Individual records can be sent with a custom type (e.g. handshake messages) via sendmsg(2) with a new control message (TLS_SET_RECORD_TYPE) specifying the record type. At present, rekeying is not supported though the in-kernel framework should support rekeying. KTLS makes use of the recently added unmapped mbufs to store TLS frames in the socket buffer. Each TLS frame is described by a single ext_pgs mbuf. The ext_pgs structure contains the header of the TLS record (and trailer for encrypted records) as well as references to the associated TLS session. KTLS supports two primary methods of encrypting TLS frames: software TLS and ifnet TLS. Software TLS marks mbufs holding socket data as not ready via M_NOTREADY similar to sendfile(2) when TLS framing information is added to an unmapped mbuf in ktls_frame(). ktls_enqueue() is then called to schedule TLS frames for encryption. In the sendfile(2) case, sendfile_iodone() calls ktls_enqueue() instead of pru_ready(), leaving the mbufs marked M_NOTREADY until encryption is completed. For other writes (vn_sendfile when pages are available, write(2), etc.), the PRUS_NOTREADY flag is set when invoking pru_send() along with invoking ktls_enqueue(). A pool of worker threads (the "KTLS" kernel process) encrypts TLS frames queued via ktls_enqueue(). 
Each TLS frame is temporarily mapped using the direct map and passed to a software encryption backend to perform the actual encryption. (Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if someone wished to make this work on architectures without a direct map.) KTLS supports pluggable software encryption backends. Internally, Netflix uses proprietary pure-software backends. This commit includes a simple backend in a new ktls_ocf.ko module that uses the kernel's OpenCrypto framework to provide AES-GCM encryption of TLS frames. As a result, software TLS is now a bit of a misnomer as it can make use of hardware crypto accelerators. Once software encryption has finished, the TLS frame mbufs are marked ready via pru_ready(). At this point, the encrypted data appears as regular payload to the TCP stack stored in unmapped mbufs. ifnet TLS permits a NIC to offload the TLS encryption and TCP segmentation. In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS) is allocated on the interface a socket is routed over and associated with a TLS session. TLS records for a TLS session using ifnet TLS are not marked M_NOTREADY but are passed down the stack unencrypted. The ip_output_send() and ip6_output_send() helper functions that apply send tags to outbound IP packets verify that the send tag of the TLS record matches the outbound interface. If so, the packet is tagged with the TLS send tag and sent to the interface. The NIC device driver must recognize packets with the TLS send tag and schedule them for TLS encryption and TCP segmentation. If the outbound interface does not match the interface in the TLS send tag, the packet is dropped. In addition, a task is scheduled to refresh the TLS send tag for the TLS session. If a new TLS send tag cannot be allocated, the connection is dropped. If a new TLS send tag is allocated, however, subsequent packets will be tagged with the correct TLS send tag. 
(This latter case has been tested by configuring both ports of a Chelsio T6 in a lagg and failing over from one port to another. As the connections migrated to the new port, new TLS send tags were allocated for the new port and connections resumed without being dropped.) ifnet TLS can be enabled and disabled on supported network interfaces via new '[-]txtls[46]' options to ifconfig(8). ifnet TLS is supported across both vlan devices and lagg interfaces using failover, lacp with flowid enabled, or lacp with flowid enabled. Applications may request the current KTLS mode of a connection via a new TCP_TXTLS_MODE socket option. They can also use this socket option to toggle between software and ifnet TLS modes. In addition, a testing tool is available in tools/tools/switch_tls. This is modeled on tcpdrop and uses similar syntax. However, instead of dropping connections, -s is used to force KTLS connections to switch to software TLS and -i is used to switch to ifnet TLS. Various sysctls and counters are available under the kern.ipc.tls sysctl node. The kern.ipc.tls.enable node must be set to true to enable KTLS (it is off by default). The use of unmapped mbufs must also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS. KTLS is enabled via the KERN_TLS kernel option. This patch is the culmination of years of work by several folks including Scott Long and Randall Stewart for the original design and implementation; Drew Gallatin for several optimizations including the use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records awaiting software encryption, and pluggable software crypto backends; and John Baldwin for modifications to support hardware TLS offload. Reviewed by: gallatin, hselasky, rrs Obtained from: Netflix Sponsored by: Netflix, Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21277	2019-08-27 00:01:56 +00:00
John Baldwin	82334850ea	Add an external mbuf buffer type that holds multiple unmapped pages. Unmapped mbufs allow sendfile to carry multiple pages of data in a single mbuf, without mapping those pages. It is a requirement for Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web serving workloads when used by sendfile, due to effectively compressing socket buffers by an order of magnitude, and hence reducing cache misses. For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer now points to a struct mbuf_ext_pgs structure instead of a data buffer. This structure contains an array of physical addresses (this reduces cache misses compared to an earlier version that stored an array of vm_page_t pointers). It also stores additional fields needed for in-kernel TLS such as the TLS header and trailer data that are currently unused. To more easily detect these mbufs, the M_NOMAP flag is set in m_flags in addition to M_EXT. Various functions like m_copydata() have been updated to safely access packet contents (using uiomove_fromphys()), to make things like BPF safe. NIC drivers advertise support for unmapped mbufs on transmit via a new IFCAP_NOMAP capability. This capability can be toggled via the new 'nomap' and '-nomap' ifconfig(8) commands. For NIC drivers that only transmit packet contents via DMA and use bus_dma, adding the capability to if_capabilities and if_capenable should be all that is required. If a NIC does not support unmapped mbufs, they are converted to a chain of mapped mbufs (using sf_bufs to provide the mapping) in ip_output or ip6_output. If an unmapped mbuf requires software checksums, it is also converted to a chain of mapped mbufs before computing the checksum. Submitted by: gallatin (earlier version) Reviewed by: gallatin, hselasky, rrs Discussed with: ae, kp (firewalls) Relnotes: yes Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20616	2019-06-29 00:48:33 +00:00
Mark Johnston	d25f8522be	Plug routing sysctl leaks. Various structures exported by sysctl_rtsock() contain padding fields which were not being zeroed. Reported by: Thomas Barabosch, Fraunhofer FKIE Reviewed by: ae MFC after: 3 days Security: kernel memory disclosure Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18333	2018-11-26 13:42:18 +00:00
Matt Macy	09f6ff4f1a	iflib(9): Add support for cloning pseudo interfaces Part 3 of many ... The VPC framework relies heavily on cloning pseudo interfaces (vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc). This pulls in that piece. Some ancillary changes get pulled in as a side effect. Reviewed by: shurd@ Approved by: sbruno@ Sponsored by: Joyent, Inc. Differential Revision: https://reviews.freebsd.org/D15347	2018-05-11 20:08:28 +00:00
Brooks Davis	756181b8f5	Add 32-bit compat for ioctls that take struct ifgroupreq. Use an accessor to access ifgr_group and ifgr_groups. Use a macro CASE_IOC_IFGROUPREQ(cmd) in place of case statements such as "case SIOCAIFGROUP:". This avoids polluting the switch statements with large numbers of #ifdefs. Reviewed by: kib Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14960	2018-04-05 22:14:55 +00:00
Brooks Davis	541d96aaaf	Use an accessor function to access ifr_data. This fixes 32-bit compat (no ioctl command definitions are required as struct ifreq is the same size). This is believed to be sufficient to fully support ifconfig on 32-bit systems. Reviewed by: kib Obtained from: CheriBSD MFC after: 1 week Relnotes: yes Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14900	2018-03-30 18:50:13 +00:00
Brooks Davis	86d2ef167a	Fix access to ifru_buffer on freebsd32. Make all kernel accesses to ifru_buffer go via access functions which take the process ABI into account and use an appropriate union to access members in the correct place in struct ifreq. Reviewed by: kib Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14846	2018-03-27 18:26:50 +00:00
Konstantin Belousov	f137973487	Allow to specify PCP on packets not belonging to any VLAN. According to 802.1Q-2014, VLAN tagged packets with VLAN id 0 should be considered as untagged, and only PCP and DEI values from the VLAN tag are meaningful. See for instance https://www.cisco.com/c/en/us/td/docs/switches/connectedgrid/cg-switch-sw-master/software/configuration/guide/vlan0/b_vlan_0.html. Make it possible to specify PCP value for outgoing packets on an ethernet interface. When PCP is supplied, the tag is appended, VLAN id set to 0, and PCP is filled by the supplied value. The code to do VLAN tag encapsulation is refactored from the if_vlan.c and moved into if_ethersubr.c. Drivers might have issues with filtering VID 0 packets on receive. This bug should be fixed for each driver. Reviewed by: ae (previous version), hselasky, melifaro Sponsored by: Mellanox Technologies MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D14702	2018-03-27 15:29:32 +00:00
Gleb Smirnoff	17eea3202a	Garbage collect IFCAP_POLLING_NOCOUNT. It wasn't used since very beginning of polling(4). The module always ignored return value from driver polling handler.	2017-12-06 23:03:34 +00:00
Pedro F. Giffuni	51369649b0	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well-known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2017-11-20 19:43:44 +00:00
Konstantin Belousov	3cf8254f1e	Add a place for a driver to report rx timestamps in nanoseconds from boot for the received packets. The rcv_tstmp field overlaps the place of Ln header length indicators, not used by received packets. The basic pkthdr rearrangement change in sys/mbuf.h was provided by gallatin. There are two accompanying M_ flags: M_TSTMP means that there is the timestamp (and it was generated by hardware). Another flag M_TSTMP_HPREC indicates that the timestamp is high-precision. Practically M_TSTMP_HPREC means that hardware provided additional precision compared with the stamps when the flag is not set. E.g., for ConnectX all packets are stamped by hardware when PCIe transaction to write out the completion descriptor is performed, but PTP packets are stamped on port. For Intel cards, when PTP assist is enabled, only PTP packets are stamped in the limited number of registers, so if Intel cards ever start supporting this mechanism, they would always set M_TSTMP | M_TSTMP_HPREC if hardware timestamp is present for the given packet. Add IFCAP_HWRXTSTMP interface capability to indicate the support for hardware rx timestamping, and ifconfig(8) command to toggle it. Based on the patch by: gallatin Reviewed by: gallatin (previous version), hselasky Sponsored by: Mellanox Technologies MFC after: 2 weeks (? mbuf KBI issue) X-Differential revision: https://reviews.freebsd.org/D12638	2017-11-07 09:29:14 +00:00
Sepherosa Ziehau	0f3af0411d	if: Add ioctls to get RSS key and hash type/function. It will be needed by hn(4) to configure its RSS key and hash type/function in the transparent VF mode in order to match VF's RSS settings. The description of the transparent VF mode and the RSS hash value issue are here: https://svnweb.freebsd.org/base?view=revision&revision=322299 https://svnweb.freebsd.org/base?view=revision&revision=322485 These are generic enough to promise two independent IOCs instead of abusing SIOCGDRVSPEC. Setting RSS key and hash type/function is a different story, which probably requires more discussion. Comment about UDP_{IPV4,IPV6,IPV6_EX} were only in the patch in the review request; these hash types are standardized now. Reviewed by: gallatin MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D12174	2017-09-05 05:28:52 +00:00
Warner Losh	fbbd9655e5	Renumber copyright clause 4 Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
Hans Petter Selasky	f3e7afe2d7	Implement kernel support for hardware rate limited sockets. - Add RATELIMIT kernel configuration keyword which must be set to enable the new functionality. - Add support for hardware driven, Receive Side Scaling, RSS aware, rate limited sendqueues and expose the functionality through the already established SO_MAX_PACING_RATE setsockopt(). The API support rates in the range from 1 to 4Gbytes/s which are suitable for regular TCP and UDP streams. The setsockopt(2) manual page has been updated. - Add rate limit function callback API to "struct ifnet" which supports the following operations: if_snd_tag_alloc(), if_snd_tag_modify(), if_snd_tag_query() and if_snd_tag_free(). - Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT flag, which tells if a network driver supports rate limiting or not. - This patch also adds support for rate limiting through VLAN and LAGG intermediate network devices. - How rate limiting works: 1) The userspace application calls setsockopt() after accepting or making a new connection to set the rate which is then stored in the socket structure in the kernel. Later on when packets are transmitted a check is made in the transmit path for rate changes. A rate change implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the destination network interface, which then sets up a custom sendqueue with the given rate limitation parameter. A "struct m_snd_tag" pointer is returned which serves as a "snd_tag" hint in the m_pkthdr for the subsequently transmitted mbufs. 2) When the network driver sees the "m->m_pkthdr.snd_tag" different from NULL, it will move the packets into a designated rate limited sendqueue given by the snd_tag pointer. It is up to the individual drivers how the rate limited traffic will be rate limited. 3) Route changes are detected by the NIC drivers in the ifp->if_transmit() routine when the ifnet pointer in the incoming snd_tag mismatches the one of the network interface. 
The network adapter frees the mbuf and returns EAGAIN which causes the ip_output() to release and clear the send tag. Upon the next ip_output() call, allocation of a new "snd_tag" will be attempted. 4) When the PCB is detached the custom sendqueue will be released by a non-blocking ifp->if_snd_tag_free() call to the currently bound network interface. Reviewed by: wblock (manpages), adrian, gallatin, scottl (network) Differential Revision: https://reviews.freebsd.org/D3687 Sponsored by: Mellanox Technologies MFC after: 3 months	2017-01-18 13:31:17 +00:00
Marcelo Araujo	2ccbbd06d2	Add support for priority code point (PCP), a 3-bit field which refers to IEEE 802.1p class of service and maps to the frame priority level. Values in order of priority are: 1 (Background (lowest)), 0 (Best effort (default)), 2 (Excellent effort), 3 (Critical applications), 4 (Video, < 100ms latency), 5 (Video, < 10ms latency), 6 (Internetwork control) and 7 (Network control (highest)). Example of usage: root# ifconfig em0.1 create root# ifconfig em0.1 vlanpcp 3 Note: The review D801 includes the pf(4) part, but as discussed with kristof, we won't commit the pf(4) bits for now. Credit for the original code goes to rwatson. Differential Revision: https://reviews.freebsd.org/D801 Reviewed by: gnn, adrian, loos Discussed with: rwatson, glebius, kristof Tested by: many including Matthew Grooms <mgrooms__shrew.net> Obtained from: pfSense Relnotes: Yes	2016-06-06 09:51:58 +00:00
Alexander V. Chernikov	ea463f2dc0	* Add SIOCGI2C driver ioctl used to retrieve i2c info. * Convert ixgbe to use this ioctl * Convert ifconfig to use generic i2c handler for "ix" interfaces. Approved by: Eric Joyner (ixgbe part) MFC after: 2 weeks Sponsored by: Yandex LLC	2014-08-29 18:02:58 +00:00
Gleb Smirnoff	9753faf553	Garbage collect couple of unused fields from struct ifaddr: - ifa_claim_addr() unused since removal of NetAtalk - ifa_metric seems to be never utilized, always a copy of if_metric	2014-07-29 15:01:29 +00:00
Gleb Smirnoff	b245f96c44	Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit interface, in the r241616 a crutch was provided. It didn't work well, and finally we decided that it is time to break ABI and simply make if_baudrate a 64-bit value. Meanwhile, the entire struct if_data was reviewed. o Remove the if_baudrate_pf crutch. o Make all fields of struct if_data fixed machine independent size. The notion of data (packet counters, etc) are by no means MD. And it is a bug that on amd64 we've got a 64-bit counters, while on i386 32-bit, which at modern speeds overflow within a second. This also removes quite a lot of COMPAT_FREEBSD32 code. o Give 16 bit for the ifi_datalen field. This field was provided to make future changes to if_data less ABI breaking. Unfortunately the 8 bit size of it had effectively limited sizeof if_data to 256 bytes. o Give 32 bits to ifi_mtu and ifi_metric. o Give 64 bits to the rest of fields, since they are counters. __FreeBSD_version bumped. Discussed with: emax Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-03-13 03:42:24 +00:00
Gleb Smirnoff	555036b5f6	Remove never used ioctls that originate from KAME. The proof of their zero usage was exp-run from misc/183538.	2013-11-11 05:39:42 +00:00
Gleb Smirnoff	77b89ad837	Provide compat layer for OSIOCAIFADDR.	2013-11-06 19:46:20 +00:00
Gleb Smirnoff	af50ea380f	Axe IFF_SMART. Fortunately this layering violating flag was never used, it was just declared.	2013-11-05 12:52:56 +00:00
Gleb Smirnoff	5fb009bda7	Drop support for historic ioctls and also undefine them, so that code that checks their presence via ifdef, won't use them. Bump __FreeBSD_version as safety measure.	2013-11-05 10:29:47 +00:00
Gleb Smirnoff	c29e1ad930	- Make the prophecy from 1997 happen and remove if_var.h inclusion from if.h. - Remove unnecessary includes and declarations from if.h - Remove unnecessary includes and declarations from if_var.h [1] - Mark some declarations that are about to be removed in near future with comments, explaining why this declaration is still necessary. - Protect eventhandler declarations with #ifdef SYS_EVENTHANDLER_H. Obtained from: bdeBSD [1] Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-28 08:03:40 +00:00
Gleb Smirnoff	4cdc1f5421	There are some high performance NICs that count statistics in hardware, and there are ifnets, that do that via counter(9). Provide a flag that would skip cache line trashing '+=' operation in ether_input(). Sponsored by: Netflix Sponsored by: Nginx, Inc. Reviewed by: melifaro, adrian Approved by: re (marius)	2013-10-09 19:04:40 +00:00
Andre Oppermann	1b4381afbb	Restructure the mbuf pkthdr to make it fit for upcoming capabilities and features. The changes in particular are: o Remove rarely used "header" pointer and replace it with a 64bit protocol/ layer specific union PH_loc for local use. Protocols can flexibly overlay their own 8 to 64 bit fields to store information while the packet is worked on. o Mechanically convert IP reassembly, IGMP/MLD and ATM to use pkthdr.PH_loc instead of pkthdr.header. o Extend csum_flags to 64bits to allow for additional future offload information to be carried (e.g. iSCSI, IPsec offload, and others). o Move the RSS hash type enumerator from abusing m_flags to its own 8bit rsstype field. Adjust accessor macros. o Add cosqos field to store Class of Service / Quality of Service information with the packet. It is not yet supported in any drivers but allows us to get on par with Cisco/Juniper in routing applications (plus MPLS QoS) with a modernized ALTQ. o Add four 8 bit fields l[2-5]hlen to store the relative header offsets from the start of the packet. This is important for various offload capabilities and to relieve the drivers from having to parse the packet and protocol headers to find out location of checksums and other information. Header parsing in drivers is a lot of copy-paste and unhandled corner cases which we want to avoid. o Add another flexible 64bit union to map various additional persistent packet information, like ether_vtag, tso_segsz and csum fields. Depending on the csum_flags settings some fields may have different usage making it very flexible and adaptable to future capabilities. o Restructure the CSUM flags to better signify their outbound (down the stack) and inbound (up the stack) use. The CSUM flags used to be a bit chaotic and rather poorly documented leading to incorrect use in many places. Bring clarity into their use through better naming. Compatibility mappings are provided to preserve the API. 
The drivers can be corrected one by one and MFC'd without issue. o The size of pkthdr stays the same at 48/56bytes (32/64bit architectures). Sponsored by: The FreeBSD Foundation	2013-08-24 19:51:18 +00:00
Maksim Yevmenkin	608ae712d3	provide helper if_initbaudrate() to set if_baudrate_pf and if_baudrate_pf. again, use ixgbe(4) as an example of how to use new helper function. Reviewed by: jhb MFC after: 1 week	2012-10-17 19:24:13 +00:00
Maksim Yevmenkin	0fef97fea3	introduce concept of ifi_baudrate power factor. the idea is to work around the problem where high speed interfaces (such as ixgbe(4)) are not able to report real ifi_baudrate. basically, take a spare byte from struct if_data and use it to store ifi_baudrate power factor. in other words, real ifi_baudrate = ifi_baudrate * 10 ^ ifi_baudrate power factor this should be backwards compatible with old binaries. use ixgbe(4) as an example on how drivers would set ifi_baudrate power factor Discussed with: kib, scottl, glebius MFC after: 1 week	2012-10-16 20:18:15 +00:00
John Baldwin	304050dde0	Hold GIF_LOCK() for almost all of gif_start(). It is required to be held across in_gif_output() and in6_gif_output() anyway, and once it is held across those it might as well be held for the entire loop. This simplifies the code and removes the need for the custom IFF_GIF_WANTED flag (which belonged in the softc and not as an IFF_* flag anyway). Tested by: Vincent Hoffman vince unsane co uk	2012-06-29 15:21:34 +00:00
Randall Stewart	6f17e3a31a	Oops, forgot to commit the flag.	2012-06-12 12:40:15 +00:00
Bjoern A. Zeeb	356ab07e2d	It turns out that too many drivers are not only parsing the L2/3/4 headers for TSO but also for generic checksum offloading. Ideally we would only have one common function shared amongst all drivers, and perhaps when updating them for IPv6 we should introduce that. Eventually we should provide the meta information along with mbufs to avoid (re-)parsing entirely. To not break IPv6 (checksums and offload) and to be able to MFC the changes without risking to hurt 3rd party drivers, duplicate the v4 framework, as other OSes have done as well. Introduce interface capability flags for TX/RX checksum offload with IPv6, to allow independent toggling (where possible). Add CSUM_*_IPV6 flags for UDP/TCP over IPv6, and reserve further for SCTP, and IPv6 fragmentation. Define CSUM_DELAY_DATA_IPV6 as we do for legacy IP and add an alias for CSUM_DATA_VALID_IPV6. This pretty much brings IPv6 handling in line with IPv4. TSO is still handled in a different way and not via if_hwassist. Update ifconfig to allow (un)setting of the new capability flags. Update loopback to announce the new capabilities and if_hwassist flags. Individual driver updates will have to follow, as will SCTP. Reported by: gallatin, dim, .. Reviewed by: gallatin (glanced at?) MFC after: 3 days X-MFC with: r235961,235959,235958	2012-05-28 09:30:13 +00:00
Bjoern A. Zeeb	6d076ae8f7	Introduce a new NET_RT_IFLISTL API to query the address list. It works on extended and extensible structs if_msghdrl and ifa_msghdrl. This will allow us to extend both the msghdrl structs and eventually if_data in the future without breaking the ABI. Bump __FreeBSD_version to allow ports to more easily detect the new API. Reviewed by: glebius, brooks MFC after: 3 days	2012-02-11 06:02:16 +00:00
Bjoern A. Zeeb	e82cf13bfb	Backout changes from r228571. Remove if_data from struct ifa_msghdr again. While this breaks carp on HEAD temporary, it restores the upgrade path from stable, and head before 20111215. Reviewed by: glebius, brooks	2012-02-11 05:59:54 +00:00
Gleb Smirnoff	7121247312	Provide ABI compatibility shim to enable configuring of addresses with ifconfig(8) prior to r228571. Requested by: brooks	2011-12-21 12:39:08 +00:00
Gleb Smirnoff	08b68b0e4c	A major overhaul of the CARP implementation. The ip_carp.c was started from scratch, copying needed functionality from the old implementation on demand, with a thorough review of all code. The main change is that interface layer has been removed from the CARP. Now redundant addresses are configured exactly on the interfaces, they run on. The CARP configuration itself is, as before, configured and read via SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or SIOCAIFADDR_IN6 may now be configured to a particular virtual host id, which makes the prefix redundant. ifconfig(8) semantics has been changed too: now one doesn't need to clone carpXX interface, he/she should directly configure a vhid on an Ethernet interface. To supply vhid data from the kernel to an application the getifaddrs(3) function had been changed to pass ifam_data with each address. [1] The new implementation definitely closes all PRs related to carp(4) being an interface, and may close several others. It also allows to run a single redundant IP per interface. Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for idea on using ifam_data and for several rounds of reviewing! PR: kern/117000, kern/126945, kern/126714, kern/120130, kern/117448 Reviewed by: bz Submitted by: bz [1]	2011-12-16 12:16:56 +00:00
Ed Schouten	cf05e311ea	Add missing #includes. According to POSIX, these two header files should be able to be included by themselves, not depending on other headers. The <net/if.h> header uses struct sockaddr when __BSD_VISIBLE=1, while <netinet/tcp.h> uses integer datatypes (u_int32_t, u_short, etc). MFC after: 2 months	2011-10-21 12:58:34 +00:00
Bjoern A. Zeeb	35fd7bc020	Add infrastructure to allow all frames/packets received on an interface to be assigned to a non-default FIB instance. You may need to recompile world or ports due to the change of struct ifnet. Submitted by: cjsp Submitted by: Alexander V. Chernikov (melifaro ipfw.ru) (original versions) Reviewed by: julian Reviewed by: Alexander V. Chernikov (melifaro ipfw.ru) MFC after: 2 weeks X-MFC: use spare in struct ifnet	2011-07-03 12:22:02 +00:00
Luigi Rizzo	c9d658e9f7	Grab one of the ifcap bits for netmap, and enable printing in ifconfig. Document the fact that we might want an IFCAP_CANTCHANGE mask, even though the value is not yet used in sys/net/if.c (asked on -current a week ago, no feedback so i assume no objection).	2011-06-14 12:40:55 +00:00
Weongyo Jeong	c5649739a5	Adds IFF_CANTCONFIG to IFF_CANTCHANGE that it shouldn't happen through ioctl(2).	2010-12-07 20:31:04 +00:00
Weongyo Jeong	6e3cb00068	Introduces IFF_CANTCONFIG interface flag to point that the interface isn't configurable in a meaningful way. This is for ifconfig(8) or other tools not to change code whenever IFT_USB-like interfaces are registered at the interface list. Reviewed by: brooks No objections: gavin, jkim	2010-12-07 20:23:47 +00:00
Sergey Kandaurov	9af74f3d68	Reshuffle SIOCGIFCONF32 handler from r155224. - move all the chunks into one file, which allows to hide SIOCGIFCONF32 global definition as well. - replace __amd64__ with proper COMPAT_FREEBSD32 around. - handle 32bit capacity before going into the handler itself instead of doing internal 32bit specific changes within it (e.g. as it's done for SIOCGDEFIFACE32_IN6). - use explicitly sized types for ABI compat. Approved by: kib (mentor) MFC after: 2 weeks	2010-10-21 16:20:48 +00:00
Qing Li	6b533b5ddb	Verify interface up status using its link state only if the interface has such capability. The interface capability flag indicates whether such capability exists. This approach is much more backward compatible. Physical device driver changes will be part of another commit. Also updated the ifconfig utility to show the LINKSTATE capability if present. Reviewed by: rwatson, imp, juli MFC after: 3 days	2010-03-16 17:59:12 +00:00
Pyun YongHyeon	9b76d9cb3d	Add TSO support on VLANs. Intentionally separated IFCAP_VLAN_HWTSO from IFCAP_VLAN_HWTAGGING. I think some hardwares may be able to TSO over VLAN without VLAN hardware tagging. Driver changes and userland support will follow. Reviewed by: thompsa	2010-02-20 22:47:20 +00:00
Xin LI	215940b3fa	Revised revision 199201 (add interface description capability as inspired by OpenBSD), based on comments from many, including rwatson, jhb, brooks and others. Sponsored by: iXsystems, Inc. MFC after: 1 month	2010-01-27 00:30:07 +00:00
John Baldwin	5428776e2c	Change vlan interfaces to cope more usefully with the parent interface being renamed. Previously the vlan interfaces would lose their configuration as if the parent interface had been physically removed. Now vlan interfaces ignore rename events. - Add a new ifnet flag (IFF_RENAMING) that is set while an ifnet is being renamed. This flag can be checked in ifnet departure/arrival event handlers to treat rename events differently. - Change the ifnet departure event handler in the if_vlan(4) driver to ignore departure events due to a trunk interface being renamed. Reviewed by: brooks, rwatson MFC after: 1 week	2009-12-29 13:35:18 +00:00
Xin LI	1a9d4dda9b	Revert revision 199201 for now as it has introduced a kernel vulnerability and requires more polishing.	2009-11-12 19:02:10 +00:00
Xin LI	41c8c6e876	Add interface description capability as inspired by OpenBSD. MFC after: 3 months	2009-11-11 21:30:58 +00:00
Jamie Gritton	679e13901c	Manage vnets via the jail system. If a jail is given the boolean parameter "vnet" when it is created, a new vnet instance will be created along with the jail. Networks interfaces can be moved between prisons with an ioctl similar to the one that moves them between vimages. For now vnets will co-exist under both jails and vimages, but soon struct vimage will be going away. Reviewed by: zec, julian Approved by: bz (mentor)	2009-06-15 18:59:29 +00:00
Attilio Rao	1abcdbd127	When user_frac in the polling subsystem is low it is going to busy the CPU for too long period than necessary. Additively, interfaces are kept polled (in the tick) even if no more packets are available. In order to avoid such situations a new generic mechanism can be implemented in proactive way, keeping track of the time spent on any packet and fragmenting the time for any tick, stopping the processing as soon as possible. In order to implement such mechanism, the polling handler needs to change, returning the number of packets processed. While the intended logic is not part of this patch, the polling KPI is broken by this commit, adding an int return value and the new flag IFCAP_POLLING_NOCOUNT (which will signal that the return value is meaningless for the installed handler and checking should be skipped). Bump __FreeBSD_version in order to signal such situation. Reviewed by: emaste Sponsored by: Sandvine Incorporated	2009-05-30 15:14:44 +00:00
Robert Watson	242a8e72eb	Add a new interface flag, IFF_DYING, which is set when a device driver calls if_free(), and remains set if the refcount is elevated. IFF_DYING skips the bit in the if_flags bitmask previously used by IFF_NEEDSGIANT, so that an MFC can be done without changing which bit is used, as IFF_NEEDSGIANT is still present in 7.x. ifnet_byindex_ref() checks for IFF_DYING and returns NULL if it is set, preventing new references from being acquired by index, preventing monitoring sysctls from seeing it. Other lookup mechanisms currently do not check IFF_DYING, but may need to in the future. MFC after: 3 weeks	2009-04-23 09:32:30 +00:00
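
Note on the PCP commits above (f137973487 and 2ccbbd06d2): both deal with the 802.1Q tag control field, which packs a 3-bit PCP, a 1-bit DEI and a 12-bit VLAN id into 16 bits. The following is a minimal sketch of that layout, not the code from either commit. The helper names (tci_make, tci_pcp, tci_vid, tci_is_priority_only) are hypothetical and exist only for illustration; the bit positions are the standard 802.1Q ones.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from the FreeBSD tree.
 * 802.1Q tag control information layout (host byte order here):
 *   bits 15..13  PCP (priority code point)
 *   bit  12      DEI (drop eligible indicator)
 *   bits 11..0   VID (VLAN id)
 */
static uint16_t
tci_make(unsigned pcp, unsigned dei, unsigned vid)
{
	return (uint16_t)(((pcp & 0x7u) << 13) | ((dei & 0x1u) << 12) |
	    (vid & 0xfffu));
}

/* Extract the 3-bit priority code point. */
static unsigned
tci_pcp(uint16_t tci)
{
	return (tci >> 13) & 0x7u;
}

/* Extract the 12-bit VLAN id. */
static unsigned
tci_vid(uint16_t tci)
{
	return tci & 0xfffu;
}

/*
 * VID 0 means the frame belongs to no VLAN and only PCP/DEI carry
 * meaning, which is the case commit f137973487 makes configurable.
 */
static int
tci_is_priority_only(uint16_t tci)
{
	return tci_vid(tci) == 0;
}
```

For instance, a frame sent with PCP 3 and no VLAN membership would carry tci_make(3, 0, 0), that is 0x6000, before conversion to network byte order.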

166 Commits