freebsd-nq

Author	SHA1	Message	Date
Andrey V. Elsukov	245c40e879	Add more ifdefs. SIOC*_IN6 are defined only with INET6. MFC after: 1 month Reported by: bz	2014-10-14 14:51:27 +00:00
Andrey V. Elsukov	138d56556c	Move memset under ifdef INET6. MFH: 1 month Reported by: bz	2014-10-14 14:41:06 +00:00
Andrey V. Elsukov	0b9f5f8a5f	Overhaul if_gif(4): o convert to if_transmit; o use rmlock to protect access to gif_softc; o use sx lock to protect from concurrent ioctls; o remove a lot of unneeded and duplicated code; o remove cached route support (it won't work with concurrent io); o style fixes. Reviewed by: melifaro Obtained from: Yandex LLC MFC after: 1 month Sponsored by: Yandex LLC	2014-10-14 13:31:47 +00:00
Hiroki Sato	3c3136b1dd	Virtualize if_epair(4). An if_xname check for both "a" and "b" interfaces is added to return EEXIST when only "b" interface exists---this can happen when epair<N>b is moved to a vnet jail and then "ifconfig epair<N> create" is invoked there.	2014-10-10 06:45:13 +00:00
Andrey V. Elsukov	5b7a43f546	When tunneling interface is going to insert mbuf into netisr queue after stripping outer header, consider it as new packet and clear the protocols flags. This fixes problems when IPSEC traffic goes through various tunnels and router doesn't send ICMP/ICMPv6 errors. PR: 174602 Obtained from: Yandex LLC MFC after: 2 weeks Sponsored by: Yandex LLC	2014-10-08 21:23:34 +00:00
Andrey V. Elsukov	9ef268219a	Our packet filters use mbuf's rcvif pointer to determine incoming interface. Change mbuf's rcvif to enc0 and restore it after pfil processing. PR: 110959 Sponsored by: Yandex LLC	2014-10-07 13:31:04 +00:00
Hiroki Sato	3b4b7de506	Virtualize if_edsc(4).	2014-10-05 21:27:26 +00:00
Hiroki Sato	d6f59204ef	Virtualize if_disc(4) cloner.	2014-10-05 19:46:52 +00:00
Hiroki Sato	c51275260b	Virtualize if_bridge(4) cloner.	2014-10-05 19:43:37 +00:00
Hiroki Sato	7eb756fab1	Use printb() for boolean flags in ro_opts and actor_state for LACP.	2014-10-05 02:37:01 +00:00
Hiroki Sato	6d47816791	- Move L2 addr configuration for the primary port to a taskqueue. This fixes LOR of softc rmlock in iflladdr_event handlers. - Call if_delmulti_ifma() after LACP_UNLOCK(). This fixes another LOR. - Fix a panic in lacp_transit_expire(). - Fix a panic in lagg_input() upon shutting down a port.	2014-10-05 02:34:21 +00:00
Hiroki Sato	9732189ca9	Separate option handling from SIOC[SG]LAGG to SIOC[SG]LAGGOPTS for backward compatibility with old ifconfig(8).	2014-10-02 20:01:13 +00:00
Hiroki Sato	478e052062	Virtualize net.link.vlan.soft_pad.	2014-10-02 05:56:17 +00:00
Hiroki Sato	939a050ad9	Virtualize lagg(4) cloner. This change fixes a panic when tearing down if_lagg(4) interfaces which were cloned in a vnet jail. Sysctl nodes which are dynamically generated for each cloned interface (net.link.lagg.N.*) have been removed, and use_flowid and flowid_shift ifconfig(8) parameters have been added instead. Flags and per-interface statistics counters are displayed in "ifconfig -v". CR: D842	2014-10-01 21:37:32 +00:00
Alexander V. Chernikov	8b1af054e8	Free radix mask entries on main radix destroy. This is temporary commit to be merged to 10. Other approach (like hash table) should be used to store different masks. PR: 194078 Submitted by: Rumen Telbizov MFC after: 3 days	2014-10-01 21:24:58 +00:00
Alexander V. Chernikov	31f0d081d8	Remove lock init from radix.c. Radix has never managed its locking itself. The only consumer using radix with an embedded rwlock is the system routing table. Move per-AF lock inits there.	2014-10-01 14:39:06 +00:00
Gleb Smirnoff	dee826cec0	Fix off by one in lagg_port_destroy(). Reported by: "Max N. Boyarov" <zotrix bsd.by>	2014-10-01 11:23:54 +00:00
Bjoern A. Zeeb	cbaac00901	Move the unconditional #include of net/ifq.h to the very end of file. This seems to allow us to pass a universe with either clang or gcc after r272244 (and r272260) and probably makes it easier to untangle these chained #includes in the future.	2014-09-28 17:09:40 +00:00
Bjoern A. Zeeb	0110795a35	Remove duplicate declaration of the if_inc_counter() function after r272244. if_var.h has the expected one and if_var.h includes ifq.h and thus we get duplicates. It seems only one cavium ethernet file actually includes ifq.h directly which might be another cleanup to be done but need to test first.	2014-09-28 15:38:21 +00:00
Gleb Smirnoff	bd071d4d19	- Remove empty wrappers ether_poll_[de]register_drv(). [1] - Move polling(9) declarations out of ifq.h back to if_var.h they are absolutely unrelated to queues. Submitted by: Mikhail <mp lenta.ru> [1]	2014-09-28 14:05:18 +00:00
Gleb Smirnoff	112f50ffb2	Finally, convert counters in struct ifnet to counter(9). Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-28 08:57:07 +00:00
Gleb Smirnoff	2357543753	Convert to if_inc_counter() the last remnants of bare access to struct ifnet counters.	2014-09-28 07:43:38 +00:00
Alexander V. Chernikov	7d6cc45c9b	Use underlying ports counters to get lagg statistics instead of per-packet accounting. This introduce user-visible changes like aggregating error counters. Reviewed by: asomers (prev.version), glebius CR: D781 MFC after: 2 weeks Sponsored by: Yandex LLC	2014-09-27 13:57:48 +00:00
Gleb Smirnoff	eade13f9d2	Remove macros that hide access to struct ifnet fields.	2014-09-26 13:02:29 +00:00
Gleb Smirnoff	38738d739a	Make all lagg protocol methods live in lagg_protos, not in softc. All interfaces of a same protocol, use the same methods. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-26 12:54:24 +00:00
Andrey V. Elsukov	30e5de489d	Keep list of lagg ports sorted by if_index. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2014-09-26 12:42:06 +00:00
Gleb Smirnoff	6900d0d328	- Whitespace. - Remove caddr_t.	2014-09-26 12:35:58 +00:00
Gleb Smirnoff	16ca790ead	- Provide lagg_proto_attach(), lagg_proto_detach(). - Make detach a protocol method in lagg_protos. - Simplify code to lookup protocols. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-26 11:01:04 +00:00
Gleb Smirnoff	09c7577ef3	- When reconfiguring protocol on a lagg, first set it to LAGG_PROTO_NONE, then drop lock, run the attach routines, and then set it to specific proto. This removes tons of WITNESS warnings. - Make lagg protocol attach handlers not failing and allocate memory with M_WAITOK. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-26 08:42:32 +00:00
Gleb Smirnoff	b5e094cfd7	Make lagg protos a enum.	2014-09-26 08:12:12 +00:00
Gleb Smirnoff	b1bbc5b3d1	Make lagg protocols detach methods returning void. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-26 07:12:40 +00:00
Hans Petter Selasky	9fd573c39d	Improve transmit sending offload, TSO, algorithm in general. The current TSO limitation feature only takes the total number of bytes in an mbuf chain into account and does not limit by the number of mbufs in a chain. Some kinds of hardware is limited by two factors. One is the fragment length and the second is the fragment count. Both of these limits need to be taken into account when doing TSO. Else some kinds of hardware might have to drop completely valid mbuf chains because they cannot loaded into the given hardware's DMA engine. The new way of doing TSO limitation has been made backwards compatible as input from other FreeBSD developers and will use defaults for values not set. Reviewed by: adrian, rmacklem Sponsored by: Mellanox Technologies MFC after: 1 week	2014-09-22 08:27:27 +00:00
Hiroki Sato	9f21b0b8b2	Fix build.	2014-09-21 07:16:51 +00:00
Hiroki Sato	89c58b73e0	- Virtualize interface cloner for gre(4). This fixes a panic when destroying a vnet jail which has a gre(4) interface. - Make net.link.gre.max_nesting vnet-local.	2014-09-21 03:56:06 +00:00
Hiroki Sato	a7f5886ec7	Virtualize interface cloner for gif(4). This fixes a panic when destroying a vnet jail which has a gif(4) interface.	2014-09-21 03:55:04 +00:00
Hiroki Sato	ee0bd4b909	Make net.add_addr_allfibs vnet-local.	2014-09-21 03:48:20 +00:00
Gleb Smirnoff	3751dddb3e	Mechanically convert to if_inc_counter().	2014-09-19 10:39:58 +00:00
Gleb Smirnoff	56b61ca27a	Remove ifq_drops from struct ifqueue. Now queue drops are accounted in struct ifnet if_oqdrops. Some netgraph modules used ifqueue w/o ifnet. Accounting of queue drops is simply removed from them. There were no API to read this statistic. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-19 09:01:19 +00:00
Gleb Smirnoff	a6f2696932	Increase errors, not queue drops, in cases the module is supplied with a bad packet or if mbuf allocation fails.	2014-09-19 05:43:38 +00:00
Gleb Smirnoff	d2a707cdfa	Remove a bunch of methods that are superseded by if_inc_counter(). Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-18 16:17:20 +00:00
Gleb Smirnoff	1b7fb1d93f	While not too late rename 'ifnet_counter' to 'ift_counter'. One of the important moments that we discussed with Marcel and Anuranjan was that a converted driver should return false for 'grep ifnet if_driver.c' :) Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-18 14:47:13 +00:00
Gleb Smirnoff	35853c2c60	Add a function to set if_get_counter method for an ifnet. To be used in the drivers that are already converted to "Juniper drvapi". This can be revisited in future.	2014-09-18 14:38:28 +00:00
Gleb Smirnoff	277e067a58	While not too late rename if_get_counter_compat() to if_get_counter_default(). The compat counters will go away, but the function will remain in its place, and in all places where it is going to be called. Discussed with: melifaro	2014-09-18 10:01:56 +00:00
Gleb Smirnoff	0b7b006c7f	Add if_inc_counter(), a generic method to update ifnet(9) counter w/o dereferencing the struct. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-09-18 09:54:57 +00:00
Marcelo Araujo	5d99eb5926	Revert r271735. The comment is absolutely correct, we do not support 802.1p priority tagging. I got confused with the packet tagged and packet to be tagged. Spotted by: glebius	2014-09-18 05:43:19 +00:00
Marcelo Araujo	397bdf7cd5	Remove old comment, we already do 802.1q tagging. Phabric: D797 Reviewed by: kevlo Approved by: kevlo Sponsored by: QNAP Systems Inc.	2014-09-18 03:09:34 +00:00
Marcelo Araujo	99cdd96163	Add laggproto broadcast, it allows sends frames to all ports of the lagg(4) group and receives frames on any port of the lagg(4). Phabric: D549 Reviewed by: glebius, thompsa Approved by: glebius Obtained from: OpenBSD Sponsored by: QNAP Systems Inc.	2014-09-18 02:12:48 +00:00
Alexander V. Chernikov	6667db3130	* Fix if_omcast handling * Convert if_oerrors to pcpu. Suggested by: glebius MFC after: 2 weeks	2014-09-16 21:48:48 +00:00
Hans Petter Selasky	72f3100047	Revert r271504. A new patch to solve this issue will be made. Suggested by: adrian @	2014-09-13 20:52:01 +00:00
Alexander V. Chernikov	772b000f02	Switch if_vlan(4) to rmlock. MFC after: 2 weeks	2014-09-13 18:41:24 +00:00
Alexander V. Chernikov	299153b570	Switch if_vlan(4) to use counter(9) using new if_get_counter api.	2014-09-13 18:13:08 +00:00
Hans Petter Selasky	eb93b77ae4	Improve transmit sending offload, TSO, algorithm in general. The current TSO limitation feature only takes the total number of bytes in an mbuf chain into account and does not limit by the number of mbufs in a chain. Some kinds of hardware is limited by two factors. One is the fragment length and the second is the fragment count. Both of these limits need to be taken into account when doing TSO. Else some kinds of hardware might have to drop completely valid mbuf chains because they cannot loaded into the given hardware's DMA engine. The new way of doing TSO limitation has been made backwards compatible as input from other FreeBSD developers and will use defaults for values not set. MFC after: 1 week Sponsored by: Mellanox Technologies	2014-09-13 08:26:09 +00:00
Alan Somers	4f8585e021	Revisions 264905 and 266860 added a "int fib" argument to ifa_ifwithnet and ifa_ifwithdstaddr. For the sake of backwards compatibility, the new arguments were added to new functions named ifa_ifwithnet_fib and ifa_ifwithdstaddr_fib, while the old functions became wrappers around the new ones that passed RT_ALL_FIBS for the fib argument. However, the backwards compatibility is not desired for FreeBSD 11, because there are numerous other incompatible changes to the ifnet(9) API. We therefore decided to remove it from head but leave it in place for stable/9 and stable/10. In addition, this commit adds the fib argument to ifa_ifwithbroadaddr for consistency's sake. sys/sys/param.h Increment __FreeBSD_version sys/net/if.c sys/net/if_var.h sys/net/route.c Add fibnum argument to ifa_ifwithbroadaddr, and remove the _fib versions of ifa_ifwithdstaddr, ifa_ifwithnet, and ifa_ifwithroute. sys/net/route.c sys/net/rtsock.c sys/netinet/in_pcb.c sys/netinet/ip_options.c sys/netinet/ip_output.c sys/netinet6/nd6.c Fixup calls of modified functions. share/man/man9/ifnet.9 Document changed API. CR: https://reviews.freebsd.org/D458 MFC after: Never Sponsored by: Spectra Logic	2014-09-11 20:21:03 +00:00
Adrian Chadd	b8bc95cd49	Update the IPv4 input path to handle reassembled frames and incoming frames with no RSS hash. When doing RSS: * Create a new IPv4 netisr which expects the frames to have been verified; it just directly dispatches to the IPv4 input path. * Once IPv4 reassembly is done, re-calculate the RSS hash with the new IP and L3 header; then reinject it as appropriate. * Update the IPv4 netisr to be a CPU affinity netisr with the RSS hash function (rss_soft_m2cpuid) - this will do a software hash if the hardware doesn't provide one. NICs that don't implement hardware RSS hashing will now benefit from RSS distribution - it'll inject into the correct destination netisr. Note: the netisr distribution doesn't work out of the box - netisr doesn't query RSS for how many CPUs and the affinity setup. Yes, netisr likely shouldn't really be doing CPU stuff anymore and should be "some kind of 'thing' that is a workqueue that may or may not have any CPU affinity"; that's for a later commit. Differential Revision: https://reviews.freebsd.org/D527 Reviewed by: grehan	2014-09-09 04:18:20 +00:00
Gleb Smirnoff	bf7dcda366	Clean up unused CSUM_FRAGMENT. Sponsored by: Nginx, Inc.	2014-09-03 08:30:18 +00:00
Gleb Smirnoff	ccbefc2dfa	Toss fields so that no padding field is required to achieve alignment.	2014-08-31 13:30:54 +00:00
Gleb Smirnoff	09a8241fc9	It is actually possible to have if_t a typedef to non-void type, and keep both converted to drvapi and non-converted drivers compilable. o Make if_t typedef to struct ifnet *. o Remove shim functions. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-08-31 12:48:13 +00:00
Gleb Smirnoff	997d2d833f	Provide pointer from struct ifnet to struct netmap_adapter, instead of abusing spare field.	2014-08-31 11:33:19 +00:00
Gleb Smirnoff	e6485f73de	o Remove struct if_data from struct ifnet. Now it is merely API structure for route(4) socket and ifmib(4) sysctl. o Move fields from if_data to ifnet, but keep all statistic counters separate, since they should disappear later. o Provide function if_data_copy() to fill if_data, utilize it in routing socket and ifmib handler. o Provide overridable ifnet(9) method to fetch counters. If no provided, if_get_counters_compat() would be used, that returns old counters. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-08-31 06:46:21 +00:00
Gleb Smirnoff	178b14d674	Remove ability to write to struct if_data residing in struct ifnet via net.link.generic.IFMIB_IFDATA.*.IFDATA_GENERAL sysctl. Reasons for removal are: - No code in tree uses this possibility. - The documentation ifmib(4) doesn't say that such possibility exist. The example provided in manual page only reads data. - On many interfaces the feature simply doesn't work, since they do accounting in hardware, and overwrite if_data on tick. Sponsored by: Nginx, Inc.	2014-08-31 06:23:54 +00:00
Alexander V. Chernikov	ea463f2dc0	* Add SIOCGI2C driver ioctl used to retrieve i2c info. * Convert ixgbe to use this ioctl * Convert ifconfig to use generic i2c handler for "ix" interfaces. Approved by: Eric Joyner (ixgbe part) MFC after: 2 weeks Sponsored by: Yandex LLC	2014-08-29 18:02:58 +00:00
Alexander V. Chernikov	c59adfc6a5	* Add new net/sff8436.h containing constants used to access QSFP+ data via i2c inteface. These constants has been taken from SFF-8436 "QSFP+ 10 Gbs 4X PLUGGABLE TRANSCEIVER" standard rev 4.8. * Add support for printing QSFP+ information from 40G NICs such as Chelsio T5. This commit does not contain ioctl changes necessary for this functionality work, there will be another commit soon. Example: cxl1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=ec07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,.....> ether 00:07:43:28:ad:08 nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> media: Ethernet 40Gbase-LR4 <full-duplex> status: active plugged: QSFP+ 40GBASE-LR4 (MPO Parallel Optic) vendor: OEM PN: OP-QSFP-40G-LR4 SN: 20140318001 DATE: 2014-03-18 module temperature: 64.06 C voltage: 3.26 Volts lane 1: RX: 0.47 mW (-3.21 dBm) TX: 2.78 mW (4.46 dBm) lane 2: RX: 0.20 mW (-6.94 dBm) TX: 2.80 mW (4.47 dBm) lane 3: RX: 0.18 mW (-7.38 dBm) TX: 2.79 mW (4.47 dBm) lane 4: RX: 0.90 mW (-0.45 dBm) TX: 2.80 mW (4.48 dBm) Tested on: Chelsio T5 Tested on: Mellanox/Huawei passive/active cables/transceivers. MFC after: 2 weeks Sponsored by: Yandex LLC	2014-08-21 17:54:42 +00:00
Alexander V. Chernikov	f88c97416e	* Use standard net/sff8472.h header for sff bits and offsets. * Convert sff_8472_id to 'const char *' to please clang. Pointed by: np	2014-08-16 21:53:44 +00:00
Luigi Rizzo	4bf50f18eb	Update to the current version of netmap. Mostly bugfixes or features developed in the past 6 months, so this is a 10.1 candidate. Basically no user API changes (some bugfixes in sys/net/netmap_user.h). In detail: 1. netmap support for virtio-net, including in netmap mode. Under bhyve and with a netmap backend [2] we reach over 1Mpps with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode. 2. (kernel) add support for multiple memory allocators, so we can better partition physical and virtual interfaces giving access to separate users. The most visible effect is one additional argument to the various kernel functions to compute buffer addresses. All netmap-supported drivers are affected, but changes are mechanical and trivial 3. (kernel) simplify the prototype for txsync() and rxsync() driver methods. All netmap drivers affected, changes mostly mechanical. 4. add support for netmap-monitor ports. Think of it as a mirroring port on a physical switch: a netmap monitor port replicates traffic present on the main port. Restrictions apply. Drive carefully. 5. if_lem.c: support for various paravirtualization features, experimental and disabled by default. Most of these are described in our ANCS'13 paper [1]. Paravirtualized support in netmap mode is new, and beats the numbers in the paper by a large factor (under qemu-kvm, we measured gues-host throughput up to 10-12 Mpps). A lot of refactoring and additional documentation in the files in sys/dev/netmap, but apart from #2 and #3 above, almost nothing of this stuff is visible to other kernel parts. Example programs in tools/tools/netmap have been updated with bugfixes and to support more of the existing features. This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline. A lot of this code has been contributed by my colleagues at UNIPI, including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella. MFC after: 3 days.	2014-08-16 15:00:01 +00:00
Roger Pau Monné	af371fc66a	net: move interface removal notification up in if_detach_internal This is needed to prevent having interfaces with ifp->if_addr == NULL on bridge interfaces. Moving the notification event handlers up makes sure the interfaces are removed before doing any more cleanup. Sponsored by: Citrix Systems R&D Reviewed by: melifaro Differential Revision: https://reviews.freebsd.org/D598 net/if.c - Move interface removal notification up in if_detach_internal.	2014-08-16 10:47:24 +00:00
Kevin Lo	73d76e77b6	Change pr_output's prototype to avoid the need for explicit casts. This is a follow up to r269699. Phabric: D564 Reviewed by: jhb	2014-08-15 02:43:02 +00:00
Gleb Smirnoff	a9572d8f02	- Count global pf(4) statistics in counter(9). - Do not count global number of states and of src_nodes, use uma_zone_get_cur() to obtain values. - Struct pf_status becomes merely an ioctl API structure, and moves to netpfil/pf/pf.h with its constants. - V_pf_status is now of type struct pf_kstatus. Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net> Sponsored by: InnoGames GmbH	2014-08-14 18:57:46 +00:00
Marcelo Araujo	133991579d	- Remove unneeded include. Phabric: D563 Reviewed by: kevlo Approved by: kevlo	2014-08-11 03:04:16 +00:00
Kevin Lo	8f5a8818f5	Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have only one protocol switch structure that is shared between ipv4 and ipv6. Phabric: D476 Reviewed by: jhb	2014-08-08 01:57:15 +00:00
Alexander Motin	2d222cb761	Improve locking of multicast addresses in VLAN and LAGG interfaces. This fixes several scenarios of reproducible panics, cause by races between multicast address changes and interface destruction. MFC after: 2 weeks	2014-08-04 00:58:12 +00:00
Gleb Smirnoff	9753faf553	Garbage collect couple of unused fields from struct ifaddr: - ifa_claim_addr() unused since removal of NetAtalk - ifa_metric seems to be never utilized, always a copy of if_metric	2014-07-29 15:01:29 +00:00
Kevin Lo	c29a33213b	Deprecate m_act. Use m_nextpkt always.	2014-07-17 05:21:16 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Attilio Rao	3ae10f7477	- Modify vm_page_unwire() and vm_page_enqueue() to directly accept the queue where to enqueue pages that are going to be unwired. - Add stronger checks to the enqueue/dequeue for the pagequeues when adding and removing pages to them. Of course, for unmanaged pages the queue parameter of vm_page_unwire() will be ignored, just as the active parameter today. This makes adding new pagequeues quicker. This change effectively modifies the KPI. __FreeBSD_version will be, however, bumped just when the full cache of free pages will be evicted. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2014-06-16 18:15:27 +00:00
Alexander V. Chernikov	402000ffa3	Improve logic besides net.bpf.optimize_writers. Direct bpf(4) consumers should now work fine with this tunable turned on. In fact, the only case when optimized_writers can change program behavior is direct bpf(4) consumer setting its read filter to catch-all one. MFC after: 2 weeks Sponsored by: Yandex LLC	2014-06-11 11:27:44 +00:00
Luigi Rizzo	9225c8085b	misc bugfixes: - stdio.h is needed for fprintf() - make memsize uint32_t to avoid errors due to overflow - honor the *XPOLL flag in NIOCREGIF requests - mmap fails with MAP_FAILED, not NULL. MFC after: 3 days	2014-06-06 15:17:19 +00:00
Luigi Rizzo	5c8c100428	whitespace change: fix one comment, remove a stale one.	2014-06-06 15:15:27 +00:00
Luigi Rizzo	43ed1d3c76	whitespace change: remove trailing whitespace	2014-06-05 21:12:41 +00:00
Marcel Moolenaar	62d76917b8	Introduce a procedural interface to the ifnet structure. The new interface allows the ifnet structure to be defined as an opaque type in NIC drivers. This then allows the ifnet structure to be changed without a need to change or recompile NIC drivers. Put differently, NIC drivers can be written and compiled once and be used with different network stack implementations, provided of course that those network stack implementations have an API and ABI compatible interface. This commit introduces the 'if_t' type to replace 'struct ifnet *' as the type of a network interface. The 'if_t' type is defined as 'void *' to enable the compiler to perform type conversion to 'struct ifnet *' and vice versa where needed and without warnings. The functions that implement the API are the only functions that need to have an explicit cast. The MII code has been converted to use the driver API to avoid unnecessary code churn. Code churn comes from having to work with both converted and unconverted drivers in correlation with having callback functions that take an interface. By converting the MII code first, the callback functions can be defined so that the compiler will perform the typecasts automatically. As soon as all drivers have been converted, the if_t type can be redefined as needed and the API functions can be fixed to not need an explicit cast. The immediate benefactors of this change are: 1. Juniper Networks - The network stack implementation in Junos is entirely different from FreeBSD's one and this change allows Juniper to build "stock" NIC drivers that can be used in combination with both the FreeBSD and Junos stacks. 2. FreeBSD - This change opens the door towards changing ifnet and implementing new features and optimizations in the network stack without it requiring a change in the many NIC drivers FreeBSD has. Submitted by: Anuranjan Shukla <anshukla@juniper.net> Reviewed by: glebius@ Obtained from: Juniper Networks, Inc.	2014-06-02 17:54:39 +00:00
Alan Somers	2f308a343f	Fix unintended KBI change from r264905. Add _fib versions of ifa_ifwithnet() and ifa_ifwithdstaddr() The legacy functions will call the _fib() versions with RT_ALL_FIBS, preserving legacy behavior. sys/net/if_var.h sys/net/if.c Add legacy-compatible functions as described above. Ensure legacy behavior when RT_ALL_FIBS is passed as fibnum. sys/netinet/in_pcb.c sys/netinet/ip_output.c sys/netinet/ip_options.c sys/net/route.c sys/net/rtsock.c sys/netinet6/nd6.c Call with _fib() functions if we must use a specific fib, or the legacy functions otherwise. tests/sys/netinet/fibs_test.sh tests/sys/netinet/udp_dontroute.c Improve the udp_dontroute test. The bug that this test exercises is that ifa_ifwithnet() will return the wrong address, if multiple interfaces have addresses on the same subnet but with different fibs. The previous version of the test only considered one possible failure mode: that ifa_ifwithnet_fib() might fail to find any suitable address at all. The new version also checks whether ifa_ifwithnet_fib() finds the correct address by checking where the ARP request goes. Reported by: bz, hrs Reviewed by: hrs MFC after: 1 week X-MFC-with: 264905 Sponsored by: Spectra Logic	2014-05-29 21:03:49 +00:00
Peter Grehan	6902364468	Bump bhyve allocation up to 20 bits to avoid birthday-paradox style address collisions when bhyve VMs are connected to the same broadcast domain and are using pseudo-random allocations. Reviewed by: gnn MFC after: 1 week	2014-05-20 02:59:13 +00:00
Alexander V. Chernikov	6db47af467	Rename rt_msg1() to more handy rtsock_msg_mbuf(). (Just for history purposes: rt_msg2() was renamed to rtsock_msg_buffer() in r265019). Sponsored by: Yandex LLC MFC after: 1 month	2014-05-08 13:54:57 +00:00
Alexander V. Chernikov	3deb3649d5	Fix incorrect netmasks being passed via rtsock. Since radix has been ignoring sa_family in passed sockaddrs, no one ever has bothered filling valid sa_family in netmasks. Additionally, radix adjusts sa_len field in every netmask not to compare zero bytes at all. This leads us to rt_mask with sa_family of AF_UNSPEC (-1) and arbitrary sa_len field (0 for default route, for example). However, rtsock have been passing that rt_mask intact for ages, requiring all rtsock consumers to make ther own local hacks. We even have unfixed on in base: do `route -n monitor` in one window and issue `route -n get addr` for some directly-connected address. You will probably see the following: got message of size 304 on Thu May 8 15:06:06 2014 RTM_GET: Report Metrics: len 304, pid: 30493, seq 1, errno 0, flags:<UP,DONE,PINNED> locks: inits: sockaddrs: <DST,GATEWAY,NETMASK,IFP,IFA> 10.0.0.0 link#1 (255) ffff ffff ff em0:8.0.27.c5.29.d4 10.0.0.92 _________________^^^^^^^^^^^^^^^^^^ after the change: got message of size 312 on Thu May 8 15:44:07 2014 RTM_GET: Report Metrics: len 312, pid: 2895, seq 1, errno 0, flags:<UP,DONE,PINNED> locks: inits: sockaddrs: <DST,GATEWAY,NETMASK,IFP,IFA> 10.0.0.0 link#1 255.255.255.0 em0:8.0.27.c5.29.d4 10.0.0.92 _________________^^^^^^^^^^^^^^^^^^ Sponsored by: Yandex LLC MFC after: 1 month	2014-05-08 11:56:06 +00:00
Alexander V. Chernikov	c9f98940b9	Fix sysctl_ifmalist() broken in r265019. Reported by: Olivier Cochard-Labbé MFC with: r265019	2014-05-03 17:57:06 +00:00
Alexander V. Chernikov	972ed56a33	Remove additional fib checks from rtalloc1_fib. It looks like current consumers are either unaware of MRT (and use RT_DEFAULT_FIB implicitly) or know what they are doing. In the latter case they will either be hit by KASSERT or ESRCH will be returned due to NULL rnh.	2014-05-03 16:38:05 +00:00
Alexander V. Chernikov	b980262e63	Pass radix head ptr along with rte to rtexpunge(). Rename rtexpunge to rt_expunge().	2014-05-03 16:28:54 +00:00
Alan Somers	f544a74870	Fix a panic caused by doing "ifconfig -am" while a lagg is being destroyed. The thread that is destroying the lagg has already set sc->sc_psc=NULL when the "ifconfig -am" thread gets to lacp_req(). It tries to dereference sc->sc_psc and panics. The solution is for lacp_req() to check the value of sc->sc_psc. If NULL, harmlessly return an lacp_opreq structure full of zeros. Full details in GNATS. PR: kern/189003 Reviewed by: timeout on freebsd-net@ MFC after: 3 weeks Sponsored by: Spectra Logic Corporation	2014-05-02 16:24:09 +00:00
Alexander V. Chernikov	32fb15e802	Fix rnh_walktree_from() function (patch from kern/174959). Require valid netmask to be passed since host route is always a leaf. PR: kern/174959 Submitted by: Keith Sklower MFC after: 2 weeks	2014-05-01 15:04:32 +00:00
Alexander V. Chernikov	d9437c0f46	Partially revert r265019 - allocating 512 bytes on stack can be too much for architectures like ARM. Always use rounded malloc instead. Discussed with: jmallett MFC after: 4 weeks	2014-04-29 19:48:11 +00:00
Alexander V. Chernikov	0fb9298db9	Move rt_setmetrics() from rtsock.c to route.c. All rtsock-initiated rte creation/modification are now performed in route.c holding radix tree write lock. This reduces the need for per-rte mutex. Sponsored by: Yandex LLC MFC after: 1 month	2014-04-29 19:14:42 +00:00
Alexander V. Chernikov	a713ee5cf7	Do not use senderr() in rtrequest1_fib_change(). Suggested by: glebius MFC after: 4 weeks	2014-04-29 12:52:36 +00:00
Alexander V. Chernikov	de46b2c650	Fix build Found by: ian Pointyhat to: me	2014-04-27 21:17:54 +00:00
Alexander V. Chernikov	f2e5eb368a	Improve memory allocation model for rt_msg2() rtsock messages: * memory is now allocated as early as possible, without holding locks. * sysctl users are now guaranteed to get a response (M_WAITOK buffer prealloc). * socket users are more likely to use on-stack buffer for replies. * standard kernel malloc/free functions are now used instead of radix wrappers. rt_msg2() has been renamed to rtsock_msg_buffer(). MFC after: 1 month	2014-04-27 17:41:18 +00:00
Alexander V. Chernikov	f1fcb55271	Remove useless zeroing of RTAX_DST on error. Cleanup a bit. MFC after: 1 month	2014-04-27 10:43:48 +00:00
Alexander V. Chernikov	92c227af54	Cleanup route_output() a bit. MFC after: 1 month	2014-04-27 10:20:37 +00:00
Alexander V. Chernikov	2277c5e5e2	Do not delay freeing rtm. Bandaid added in r227061 is not needed since r227061, MFC after: 1 month	2014-04-27 09:49:35 +00:00
Alexander V. Chernikov	f5d9a6964d	Move up fibnum to ensure it is always defined. Found by: ian MFC with: r264987	2014-04-27 02:20:09 +00:00
Alexander V. Chernikov	f59c6cb0fc	Remove useless `register' declarations. MFC after: 1 month	2014-04-26 22:42:21 +00:00
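
The "birthday-paradox style address collisions" mentioned in commit 6902364468 can be illustrated with a short calculation. This is only a sketch: it assumes every VM picks its address suffix uniformly at random from a pool of 2**bits values, which the log does not state, and the 16-bit baseline in the loop is a hypothetical comparison point rather than a figure from the commit. The function name `collision_probability` is likewise made up for this illustration.

```python
def collision_probability(n_hosts: int, bits: int) -> float:
    """Chance that at least two of n_hosts pick the same value when each
    draws uniformly at random from a pool of 2**bits values (assumption)."""
    pool = 2 ** bits
    if n_hosts > pool:
        # More hosts than values: a collision is certain (pigeonhole).
        return 1.0
    p_unique = 1.0
    for i in range(n_hosts):
        # The (i+1)-th host must avoid the i values already taken.
        p_unique *= (pool - i) / pool
    return 1.0 - p_unique


if __name__ == "__main__":
    # 20 bits comes from the commit message; 16 bits is a hypothetical baseline.
    for bits in (16, 20):
        for n in (10, 100, 1000):
            p = collision_probability(n, bits)
            print(f"{bits} bits, {n:4d} hosts: {p:.4f}")
```

Under these assumptions, widening the random part of the address by a few bits lowers the collision probability for a given number of VMs on the same broadcast domain, which matches the motivation given in the commit message.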

3306 Commits