freebsd-dev

Author	SHA1	Message	Date
Alexander V. Chernikov	ff3a85d324	[lltable] Add per-family lltable getters. Introduce a new function, lltable_get(), to retrieve lltable pointer for the specified interface and family. Use it to avoid all-iftable list traversal when adding or deleting ARP/ND records. Differential Revision: https://reviews.freebsd.org/D33660 MFC after: 2 weeks	2021-12-29 20:57:15 +00:00
Gleb Smirnoff	eb8dcdeac2	jail: network epoch protection for IP address lists Now struct prison has two pointers (IPv4 and IPv6) of struct prison_ip type. Each points into epoch context, address count and variable size array of addresses. These structures are freed with network epoch deferred free and are not edited in place, instead a new structure is allocated and set. While here, the change also generalizes a lot (but not enough) of IPv4 and IPv6 processing. E.g. address family agnostic helpers for kern_jail_set() are provided, that reduce v4-v6 copy-paste. The fast-path prison_check_ip[46]_locked() is also generalized into prison_ip_check() that can be executed with network epoch protection only. Reviewed by: jamie Differential revision: https://reviews.freebsd.org/D33339	2021-12-26 10:45:50 -08:00
Mike Karels	2f35e7d9fa	kernel: partially revert e9efb1125a15, default inet mask When no mask is supplied to the ioctl adding an Internet interface address, revert to using the historical class mask rather than a single default. Similarly for the NFS bootp code. MFC after: 3 weeks Reviewed by: melifaro glebius Differential Revision: https://reviews.freebsd.org/D32951	2021-11-14 14:12:25 -06:00
Gleb Smirnoff	9c89392f12	Add in_localip_fib(), in6_localip_fib(). Check if given address/FIB exists locally. Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D32913	2021-11-12 08:59:42 -08:00
Mike Karels	20d5940396	kernel: deprecate Internet Class A/B/C Hide historical Class A/B/C macros unless IN_HISTORICAL_NETS is defined; define it for user level. Define IN_MULTICAST separately from IN_CLASSD, and use it in pf instead of IN_CLASSD. Stop using class for setting default masks when not specified; instead, define new default mask (24 bits). Warn when an Internet address is set without a mask. MFC after: 1 month Reviewed by: cy Differential Revision: https://reviews.freebsd.org/D32708	2021-11-09 09:32:38 -06:00
Gleb Smirnoff	c8ee75f231	Use network epoch to protect local IPv4 addresses hash. The modification to the hash are already naturally locked by in_control_sx. Convert the hash lists to CK lists. Remove the in_ifaddr_rmlock. Assert the network epoch where necessary. Most cases when the hash lookup is done the epoch is already entered. Cover a few cases, that need entering the epoch, which mostly is initial configuration of tunnel interfaces and multicast addresses. Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D32584	2021-10-22 14:40:53 -07:00
Gleb Smirnoff	2144431c11	Remove in_ifaddr_lock acquisiton to access in_ifaddrhead. An IPv4 address is embedded into an ifaddr which is freed via epoch. And the in_ifaddrhead is already a CK list. Use the network epoch to protect against use after free. Next step would be to CK-ify the in_addr hash and get rid of the... Reviewed by: melifaro Differential Revision: https://reviews.freebsd.org/D32434	2021-10-13 10:04:46 -07:00
Mike Karels	fd0765933c	Change lowest address on subnet (host 0) not to broadcast by default. The address with a host part of all zeros was used as a broadcast long ago, but the default has been all ones since 4.3BSD and RFC1122. Until now, we would broadcast the host zero address as well as the configured address. Change to not broadcasting that address by default, but add a sysctl (net.inet.ip.broadcast_lowest) to re-enable it. Note that the correct way to use the zero address for broadcast would be to configure it as the broadcast address for the network. See https:/datatracker.ietf.org/doc/draft-schoen-intarea-lowest-address/ and the discussion in https://reviews.freebsd.org/D19316. Note, Linux now implements this. Reviewed by: rgrimes, tuexen; melifaro (previous version) MFC after: 1 month Relnotes: yes Differential Revision: https://reviews.freebsd.org/D31861	2021-09-16 19:42:20 -05:00
Alexander V. Chernikov	4b631fc832	routing: fix source address selection rules for IPv4 over IPv6. Current logic always selects an IFA of the same family from the outgoing interfaces. In IPv4 over IPv6 setup there can be just single non-127.0.0.1 ifa, attached to the loopback interface. Create a separate rt_getifa_family() to handle entire ifa selection for the IPv4 over IPv6. Differential Revision: https://reviews.freebsd.org/D31868 MFC after: 1 week	2021-09-07 21:41:05 +00:00
Alexander V. Chernikov	936f4a42fa	lltable: do not require prefix lookup when checking lle allocation rules. With the new FIB_ALGO infrastructure, nearly all subsystems use fib[46]_lookup() functions, which provides lockless lookups. A number of places remains that uses old-style lookup functions, that still requires RIB read lock to return the result. One of such places is arp processing code. FIB_ALGO implementation makes some tradeoffs, resulting in (relatively) prolonged periods of holding RIB_WLOCK. If the lock is held and datapath competes for it, the RX ring may get blocked, ending in traffic delays and losses. As currently arp processing is performed directly in the interrupt handler, handling ARP replies triggers the problem descibed above when the amount of ARP replies is high. To be more specific, prior to creating new ARP entry, routing lookup for the entry address in interface fib is executed. The following conditions are the verified: 1. If lookup returns an empty result, or the resulting prefix is non-directly-reachable, failure is returned. The only exception are host routes w/ gateway==address. 2. If the routing lookup returns different interface and non-host route, we want to support the use case of having multiple interfaces with the same prefix. In fact, the current code just checks if the returned prefix covers target address (always true) and effectively allow allocating ARP entries for any directly-reachable prefix, regardless of its interface. Change the code to perform the following: 1) use fib4_lookup() to get the nexthop, instead of requesting exact prefix. 2) Rewrite first condition check using nexthop flags (1:1 match) 3) Rewrite second condition to check for interface addresses matching target address on the input interface. Differential Revision: https://reviews.freebsd.org/D31824 Reviewed by: ae MFC after: 1 week PR: 257965	2021-09-06 21:03:22 +00:00
Artem Khramov	620cf65c2b	netinet: prevent NULL pointer dereference in in_aifaddr_ioctl() It appears that maliciously crafted ifaliasreq can lead to NULL pointer dereference in in_aifaddr_ioctl(). In order to replicate that, one needs to 1. Ensure that carp(4) is not loaded 2. Issue SIOCAIFADDR call setting ifra_vhid field of the request to a negative value. A repro code would look like this. int main() { struct ifaliasreq req; struct sockaddr_in sin, mask; int fd, error; bzero(&sin, sizeof(struct sockaddr_in)); bzero(&mask, sizeof(struct sockaddr_in)); sin.sin_len = sizeof(struct sockaddr_in); sin.sin_family = AF_INET; sin.sin_addr.s_addr = inet_addr("192.168.88.2"); mask.sin_len = sizeof(struct sockaddr_in); mask.sin_family = AF_INET; mask.sin_addr.s_addr = inet_addr("255.255.255.0"); fd = socket(AF_INET, SOCK_DGRAM, 0); if (fd < 0) return (-1); memset(&req, 0, sizeof(struct ifaliasreq)); strlcpy(req.ifra_name, "lo0", sizeof(req.ifra_name)); memcpy(&req.ifra_addr, &sin, sin.sin_len); memcpy(&req.ifra_mask, &mask, mask.sin_len); req.ifra_vhid = -1; return ioctl(fd, SIOCAIFADDR, (char *)&req); } To fix, discard both positive and negative vhid values in in_aifaddr_ioctl, if carp(4) is not loaded. This prevents NULL pointer dereference and kernel panic. Reviewed by: imp@ Pull Request: https://github.com/freebsd/freebsd-src/pull/530	2021-08-26 12:08:03 -06:00
Alexander V. Chernikov	f3a3b06121	[lltable] Unify datapath feedback mechamism. Use newly-create llentry_request_feedback(), llentry_mark_used() and llentry_get_hittime() to request datapatch usage check and fetch the results in the same fashion both in IPv4 and IPv6. While here, simplify llentry_provide_feedback() wrapper by eliminating 1 condition check. MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D31390	2021-08-04 22:52:43 +00:00
Mark Johnston	8e8f1cc9bb	Re-enable network ioctls in capability mode This reverts a portion of `274579831b` ("capsicum: Limit socket operations in capability mode") as at least rtsol and dhcpcd rely on being able to configure network interfaces while in capability mode. Reported by: bapt, Greg V Sponsored by: The FreeBSD Foundation	2021-04-23 09:22:49 -04:00
Mark Johnston	274579831b	capsicum: Limit socket operations in capability mode Capsicum did not prevent certain privileged networking operations, specifically creation of raw sockets and network configuration ioctls. However, these facilities can be used to circumvent some of the restrictions that capability mode is supposed to enforce. Add capability mode checks to disallow network configuration ioctls and creation of sockets other than PF_LOCAL and SOCK_DGRAM/STREAM/SEQPACKET internet sockets. Reviewed by: oshogbo Discussed with: emaste Reported by: manu Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29423	2021-04-07 14:32:56 -04:00
Alexander V. Chernikov	9fdbf7eef5	Make in_localip_more() fib-aware. It fixes loopback route installation for the interfaces in the different fibs using the same prefix. Reviewed By: donner PR: 189088 Differential Revision: https://reviews.freebsd.org/D28673 MFC after: 1 week	2021-02-16 20:00:46 +00:00
Alexander V. Chernikov	130aebbab0	Further refactor IPv4 interface route creation. * Fix bug with /32 aliases introduced in `81728a538d`. * Explicitly document business logic for IPv4 ifa routes. * Remove remnants of rtinit() * Deduplicate ifa->route prefix code by moving it into ia_getrtprefix() * Deduplicate conditional check for ifa_maintain_loopback_route() by moving into ia_need_loopback_route() * Remove now-unused flags argument from in_addprefix(). Reviewed by: donner PR: 252883 Differential Revision: https://reviews.freebsd.org/D28246	2021-01-21 21:48:49 +00:00
Alexander V. Chernikov	81728a538d	Split rtinit() into multiple functions. rtinit[1]() is a function used to add or remove interface address prefix routes, similar to ifa_maintain_loopback_route(). It was intended to be family-agnostic. There is a problem with this approach in reality. 1) IPv6 code does not use it for the ifa routes. There is a separate layer, nd6_prelist_(), providing interface for maintaining interface routes. Its part, responsible for the actual route table interaction, mimics rtenty() code. 2) rtinit tries to combine multiple actions in the same function: constructing proper route attributes and handling iterations over multiple fibs, for the non-zero net.add_addr_allfibs use case. It notably increases the code complexity. 3) dstaddr handling. flags parameter re-uses RTF_ flags. As there is no special flag for p2p connections, host routes and p2p routes are handled in the same way. Additionally, mapping IFA flags to RTF flags makes the interface pretty messy. It make rtinit() to clash with ifa_mainain_loopback_route() for IPV4 interface aliases. 4) rtinit() is the last customer passing non-masked prefixes to rib_action(), complicating rib_action() implementation. 5) rtinit() coupled ifa announce/withdrawal notifications, producing "false positive" ifa messages in certain corner cases. To address all these points, the following has been done: * rtinit() has been split into multiple functions: - Route attribute construction were moved to the per-address-family functions, dealing with (2), (3) and (4). - funnction providing net.add_addr_allfibs handling and route rtsock notificaions is the new routing table inteface. - rtsock ifa notificaion has been moved out as well. resulting set of funcion are only responsible for the actual route notifications. Side effects: * /32 alias does not result in interface routes (/32 route and "host" route) * RTF_PINNED is now set for IPv6 prefixes corresponding to the interface addresses Differential revision: https://reviews.freebsd.org/D28186	2021-01-16 22:42:41 +00:00
Alexander V. Chernikov	d68cf57b7f	Refactor rt_addrmsg() and rt_routemsg(). Summary: * Refactor rt_addrmsg(): make V_rt_add_addr_allfibs decision locally. * Fix rt_routemsg() and multipath by accepting nexthop instead of interface pointer. * Refactor rtsock_routemsg(): avoid accessing rtentry fields directly. * Simplify in_addprefix() by moving prefix search to a separate function. Reviewers: #network Subscribers: imp, ae, bz Differential Revision: https://reviews.freebsd.org/D28011	2021-01-07 19:38:19 +00:00
Andrey V. Elsukov	6952c3e1ac	Implement SIOCGIFALIAS. It is lightweight way to check if an IPv4 address exists. Submitted by: Roy Marples Reviewed by: gnn, melifaro MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26636	2020-10-14 09:22:54 +00:00
Andrey V. Elsukov	3f740d4393	Join to AllHosts multicast group again when adding an existing IPv4 address. When SIOCAIFADDR ioctl configures an IPv4 address that is already exist, it removes old ifaddr. When this IPv4 address is only one configured on the interface, this also leads to leaving from AllHosts multicast group. Then an address is added again, but due to the bug, this doesn't lead to joining to AllHosts multicast group. Submitted by: yannis.planus_alstomgroup.com Reviewed by: gnn MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D26757	2020-10-13 19:34:36 +00:00
Alexander V. Chernikov	fedeb08b6a	Introduce scalable route multipath. This change is based on the nexthop objects landed in D24232. The change introduces the concept of nexthop groups. Each group contains the collection of nexthops with their relative weights and a dataplane-optimized structure to enable efficient nexthop selection. Simular to the nexthops, nexthop groups are immutable. Dataplane part gets compiled during group creation and is basically an array of nexthop pointers, compiled w.r.t their weights. With this change, `rt_nhop` field of `struct rtentry` contains either nexthop or nexthop group. They are distinguished by the presense of NHF_MULTIPATH flag. All dataplane lookup functions returns pointer to the nexthop object, leaving nexhop groups details inside routing subsystem. User-visible changes: The change is intended to be backward-compatible: all non-mpath operations should work as before with ROUTE_MPATH and net.route.multipath=1. All routes now comes with weight, default weight is 1, maximum is 2^24-1. Current maximum multipath group width is statically set to 64. This will become sysctl-tunable in the followup changes. Using functionality: * Recompile kernel with ROUTE_MPATH * set net.route.multipath to 1 route add -6 2001:db8::/32 2001:db8::2 -weight 10 route add -6 2001:db8::/32 2001:db8::3 -weight 20 netstat -6On Nexthop groups data Internet6: GrpIdx NhIdx Weight Slots Gateway Netif Refcnt 1 ------- ------- ------- --------------------------------------- --------- 1 13 10 1 2001:db8::2 vlan2 14 20 2 2001:db8::3 vlan2 Next steps: * Land outbound hashing for locally-originated routes ( D26523 ). * Fix net/bird multipath (net/frr seems to work fine) * Add ROUTE_MPATH to GENERIC * Set net.route.multipath=1 by default Tested by: olivier Reviewed by: glebius Relnotes: yes Differential Revision: https://reviews.freebsd.org/D26449	2020-10-03 10:47:17 +00:00
Mateusz Guzik	662c13053f	net: clean up empty lines in .c and .h files	2020-09-01 21:19:14 +00:00
Hans Petter Selasky	3689652c65	Make sure the multicast release tasks are properly drained when destroying a VNET or a network interface. Else the inm release tasks, both IPv4 and IPv6 may cause a panic accessing a freed VNET or network interface. Reviewed by: jmg@ Discussed with: bz@ Differential Revision: https://reviews.freebsd.org/D24914 MFC after: 1 week Sponsored by: Mellanox Technologies	2020-08-10 10:46:08 +00:00
Randall Stewart	481be5de9d	White space cleanup -- remove trailing tab's or spaces from any line. Sponsored by: Netflix Inc.	2020-02-12 13:31:36 +00:00
Gleb Smirnoff	2a4bd982d0	Introduce NET_EPOCH_CALL() macro and use it everywhere where we free data based on the network epoch. The macro reverses the argument order of epoch_call(9) - first function, then its argument. NFC	2020-01-15 06:05:20 +00:00
Gleb Smirnoff	b8a6e03fac	Widen NET_EPOCH coverage. When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111	2019-10-07 22:40:05 +00:00
Rodney W. Grimes	6c1c6ae537	Use IN_foo() macros from sys/netinet/in.h inplace of handcrafted code There are a few places that use hand crafted versions of the macros from sys/netinet/in.h making it difficult to actually alter the values in use by these macros. Correct that by replacing handcrafted code with proper macro usage. Reviewed by: karels, kristof Approved by: bde (mentor) MFC after: 3 weeks Sponsored by: John Gilmore Differential Revision: https://reviews.freebsd.org/D19317	2019-04-04 19:01:13 +00:00
Mark Johnston	49cf58e559	Style. Reviewed by: bz MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-01-23 22:19:49 +00:00
Mark Johnston	c06cc56e39	Fix an LLE lookup race. After the afdata read lock was converted to epoch(9), readers could observe a linked LLE and block on the LLE while a thread was unlinking the LLE. The writer would then release the lock and schedule the LLE for deferred free, allowing readers to continue and potentially schedule the LLE timer. By the point the timer fires, the structure is freed, typically resulting in a crash in the callout subsystem. Fix the problem by modifying the lookup path to check for the LLE_LINKED flag upon acquiring the LLE lock. If it's not set, the lookup fails. PR: 234296 Reviewed by: bz Tested by: sbruno, Victor <chernov_victor@list.ru>, Mike Andrews <mandrews@bit0.com> MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18906	2019-01-23 22:18:23 +00:00
Gleb Smirnoff	a68cc38879	Mechanical cleanup of epoch(9) usage in network stack. - Remove macros that covertly create epoch_tracker on thread stack. Such macros a quite unsafe, e.g. will produce a buggy code if same macro is used in embedded scopes. Explicitly declare epoch_tracker always. - Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read locking macros to what they actually are - the net_epoch. Keeping them as is is very misleading. They all are named FOO_RLOCK(), while they no longer have lock semantics. Now they allow recursion and what's more important they now no longer guarantee protection against their companion WLOCK macros. Note: INP_HASH_RLOCK() has same problems, but not touched by this commit. This is non functional mechanical change. The only functionally changed functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter epoch recursively. Discussed with: jtl, gallatin	2019-01-09 01:11:19 +00:00
Andrey V. Elsukov	64d63b1e03	Add ifaddr_event_ext event. It is similar to ifaddr_event, but the handler receives the type of event IFADDR_EVENT_ADD/IFADDR_EVENT_DEL, and the pointer to ifaddr. Also ifaddr_event now is implemented using ifaddr_event_ext handler. MFC after: 3 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D17100	2018-10-21 15:02:06 +00:00
Luiz Otavio O Souza	59b2022f94	Late style follow up on r312770. Submitted by: glebius X-MFC with: r312770 MFC after: 3 days	2018-08-15 15:44:30 +00:00
Andrew Turner	5f901c92a8	Use the new VNET_DEFINE_STATIC macro when we are defining static VNET variables. Reviewed by: bz Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16147	2018-07-24 16:35:52 +00:00
Andrey V. Elsukov	acf673edf0	Move invoking of callout_stop(&lle->lle_timer) into llentry_free(). This deduplicates the code a bit, and also implicitly adds missing callout_stop() to in[6]_lltable_delete_entry() functions. PR: 209682, 225927 Submitted by: hselasky (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D4605	2018-07-17 11:33:23 +00:00
Matt Macy	4f6c66cc9c	UDP: further performance improvements on tx Cumulative throughput while running 64 netperf -H $DUT -t UDP_STREAM -- -m 1 on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps Single stream throughput increases from 910kpps to 1.18Mpps Baseline: https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg - Protect read access to global ifnet list with epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg - Protect short lived ifaddr references with epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg - Convert if_afdata read lock path to epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg A fix for the inpcbhash contention is pending sufficient time on a canary at LLNW. Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15409	2018-05-23 21:02:14 +00:00
Matt Macy	f6960e207e	netinet silence warnings	2018-05-19 05:56:21 +00:00
Matt Macy	d7c5a620e2	ifnet: Replace if_addr_lock rwlock with epoch + mutex Run on LLNW canaries and tested by pho@ gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace. When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch: InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32 4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32 4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32 4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32 4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32 4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32 4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32 After the patch InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51 5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51 5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51 5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51 5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52 5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52 Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15366	2018-05-18 20:13:34 +00:00
Brooks Davis	514ef08caf	Unwrap a line that no longer requires wrapping.	2018-05-15 20:14:38 +00:00
Brooks Davis	423349f696	Remove stray tabs from in_lltable_dump_entry().	2018-05-15 20:13:00 +00:00
Matt Macy	b6f6f88018	r333175 introduced deferred deletion of multicast addresses in order to permit the driver ioctl to sleep on commands to the NIC when updating multicast filters. More generally this permitted driver's to use an sx as a softc lock. Unfortunately this change introduced a race whereby a a multicast update would still be queued for deletion when ifconfig deleted the interface thus calling down in to _purgemaddrs and synchronously deleting _all_ of the multicast addresses on the interface. Synchronously remove all external references to a multicast address before enqueueing for delete. Reported by: lwhsu Approved by: sbruno	2018-05-06 20:34:13 +00:00
Stephen Hurd	f3e1324b41	Separate list manipulation locking from state change in multicast Multicast incorrectly calls in to drivers with a mutex held causing drivers to have to go through all manner of contortions to use a non sleepable lock. Serialize multicast updates instead. Submitted by: mmacy <mmacy@mattmacy.io> Reviewed by: shurd, sbruno Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14969	2018-05-02 19:36:29 +00:00
Alexander V. Chernikov	1435dcd94f	Fix outgoing TCP/UDP packet drop on arp/ndp entry expiration. Current arp/nd code relies on the feedback from the datapath indicating that the entry is still used. This mechanism is incorporated into the arpresolve()/nd6_resolve() routines. After the inpcb route cache introduction, the packet path for the locally-originated packets changed, passing cached lle pointer to the ether_output() directly. This resulted in the arp/ndp entry expire each time exactly after the configured max_age interval. During the small window between the ARP/NDP request and reply from the router, most of the packets got lost. Fix this behaviour by plugging datapath notification code to the packet path used by route cache. Unify the notification code by using single inlined function with the per-AF callbacks. Reported by: sthaug at nethelp.no Reviewed by: ae MFC after: 2 weeks	2018-03-17 17:05:48 +00:00
Pedro F. Giffuni	51369649b0	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2017-11-20 19:43:44 +00:00
Ed Maste	3e85b721d6	Remove register keyword from sys/ and ANSIfy prototypes A long long time ago the register keyword told the compiler to store the corresponding variable in a CPU register, but it is not relevant for any compiler used in the FreeBSD world today. ANSIfy related prototypes while here. Reviewed by: cem, jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D10193	2017-05-17 00:34:34 +00:00
Warner Losh	fbbd9655e5	Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
Eric van Gyzen	8144690af4	Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel inet_ntoa() cannot be used safely in a multithreaded environment because it uses a static local buffer. Instead, use inet_ntoa_r() with a buffer on the caller's stack. Suggested by: glebius, emaste Reviewed by: gnn MFC after: 2 weeks Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D9625	2017-02-16 20:47:41 +00:00
Luiz Otavio O Souza	338e227ac0	After the in_control() changes in r257692, an existing address is (intentionally) deleted first and then completely added again (so all the events, announces and hooks are given a chance to run). This cause an issue with CARP where the existing CARP data structure is removed together with the last address for a given VHID, which will cause a subsequent fail when the address is later re-added. This change fixes this issue by adding a new flag to keep the CARP data structure when an address is not being removed. There was an additional issue with IPv6 CARP addresses, where the CARP data structure would never be removed after a change and lead to VHIDs which cannot be destroyed. Reviewed by: glebius Obtained from: pfSense MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC (Netgate)	2017-01-25 19:04:08 +00:00
Eric van Gyzen	2d9db0bc63	Add GARP retransmit capability A single gratuitous ARP (GARP) is always transmitted when an IPv4 address is added to an interface, and that is usually sufficient. However, in some circumstances, such as when a shared address is passed between cluster nodes, this single GARP may occasionally be dropped or lost. This can lead to neighbors on the network link working with a stale ARP cache and sending packets destined for that address to the node that previously owned the address, which may not respond. To avoid this situation, GARP retransmissions can be enabled by setting the net.link.ether.inet.garp_rexmit_count sysctl to a value greater than zero. The setting represents the maximum number of retransmissions. The interval between retransmissions is calculated using an exponential backoff algorithm, doubling each time, so the retransmission intervals are: {1, 2, 4, 8, 16, ...} (seconds). Due to the exponential backoff algorithm used for the interval between GARP retransmissions, the maximum number of retransmissions is limited to 16 for sanity. This limit corresponds to a maximum interval between retransmissions of 2^16 seconds ~= 18 hours. Increasing this limit is possible, but sending out GARPs spaced days apart would be of little use. Submitted by: David A. Bright <david.a.bright@dell.com> MFC after: 1 month Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D7695	2016-10-02 01:42:45 +00:00
Ryan Stone	11f2a7cd67	Fix unlocked access to ifnet address list in_broadcast() was iterating over the ifnet address list without first taking an IF_ADDR_RLOCK. This could cause a panic if a concurrent operation modified the list. Reviewed by: bz MFC after: 2 months Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7227	2016-08-18 22:59:10 +00:00
Ryan Stone	90cc51a1ab	Don't iterate over the ifnet addr list in ip_output() For almost every packet that is transmitted through ip_output(), a call to in_broadcast() was made to decide if the destination IP was a broadcast address. in_broadcast() iterates over the ifnet's address to find a source IP matching the subnet of the destination IP, and then checks if the IP is a broadcast in that subnet. This is completely redundant as we have already performed the route lookup, so the source IP is already known. Just use that address to directly check whether the destination IP is a broadcast address or not. MFC after: 2 months Sponsored By: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7266	2016-08-18 22:59:00 +00:00

1 2 3 4 5 ...

338 Commits