freebsd-nq

Author	SHA1	Message	Date
Gleb Smirnoff	4937a6561f	Simplify ip_stripoptions() reducing number of intermediate variables.	2012-10-23 10:29:31 +00:00
Gleb Smirnoff	8ad458a471	Do not reduce ip_len by size of IP header in the ip_input() before passing a packet to protocol input routines. For several protocols this mean that now protocol needs to do subtraction itself, and for another half this means that we do not need to add header length back to the packet. Make ip_stripoptions() to adjust ip_len, since now we enter this function with a packet header whose ip_len does represent length of entire packet, not payload only.	2012-10-23 08:33:13 +00:00
Xin LI	6f56329a25	Remove __P. Submitted by: kevlo Reviewed by: md5(1) MFC after: 2 months	2012-10-22 21:49:56 +00:00
Gleb Smirnoff	8f134647ca	Switch the entire IPv4 stack to keep the IP packet header in network byte order. Any host byte order processing is done in local variables and host byte order values are never[1] written to a packet. After this change a packet processed by the stack isn't modified at all[2] except for TTL. After this change a network stack hacker doesn't need to scratch his head trying to figure out what is the byte order at the given place in the stack. [1] One exception still remains. The raw sockets convert host byte order before pass a packet to an application. Probably this would remain for ages for compatibility. [2] The ip_input() still subtracts header len from ip->ip_len, but this is planned to be fixed soon. Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru> Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>	2012-10-22 21:09:03 +00:00
Andrey Zonov	32fe38f123	- Update cachelimit after hashsize and bucketlimit were set. Reported by: az Reviewed by: melifaro Approved by: kib (mentor) MFC after: 1 week	2012-10-19 14:00:03 +00:00
Andre Oppermann	c9b652e3e8	Mechanically remove the last stray remains of spl* calls from net/. They have been Noop's for a long time now.	2012-10-18 13:57:24 +00:00
Ed Maste	983731268c	Avoid potential bad pointer dereference. Previously RuleAdd would leave entry->la unset for the first entry in the proxyList. Sponsored by: ADARA Networks MFC After: 1 week	2012-10-17 20:23:07 +00:00
Gleb Smirnoff	e76163a539	We don't need to convert ip6_len to host byte order before ip6_output(), the IPv6 stack is working in net byte order. The reason this code worked before is that ip6_output() doesn't look at ip6_plen at all and recalculates it based on mbuf length.	2012-10-15 07:57:55 +00:00
Gleb Smirnoff	347d90acff	Fix a miss from r241344: in ip_mloopback() we need to go to net byte order prior to calling in_delayed_cksum(). Reported by: Olivier Cochard-Labbe <olivier cochard.me>	2012-10-14 15:08:07 +00:00
Alexander V. Chernikov	3bff27cd67	Cleanup documentation: cloning route support has been removed in r186119. MFC after: 2 weeks	2012-10-13 09:31:01 +00:00
Gleb Smirnoff	86b61e4748	Revert fixup of ip_len from r241480. Now stack isn't yet ready for that change.	2012-10-12 09:32:38 +00:00
Gleb Smirnoff	105bd2113b	In ip_stripoptions(): - Remove unused argument and incorrect comment. - Fixup ip_len after stripping.	2012-10-12 09:24:24 +00:00
Alexander V. Chernikov	3c2824b9ef	Do not check if found IPv4 rte is dynamic if net.inet.icmp.drop_redirect is enabled. This eliminates one mtx_lock() per each routing lookup thus improving performance in several cases (routing to directly connected interface or routing to default gateway). Icmp redirects should not be used to provide routing direction nowadays, even for end hosts. Routers should not use them too (and this is explicitly restricted in IPv6, see RFC 4861, clause 8.2). Current commit changes rnh_machaddr function to 'stock' rn_match (and back) for every AF_INET routing table in given VNET instance on drop_redirect sysctl change. This change is part of bigger patch eliminating rte locking. Sponsored by: Yandex LLC MFC after: 2 weeks	2012-10-10 19:06:11 +00:00
Kevin Lo	9823d52705	Revert previous commit... Pointyhat to: kevlo (myself)	2012-10-10 08:36:38 +00:00
Kevin Lo	a10cee30c9	Prefer NULL over 0 for pointers	2012-10-09 08:27:40 +00:00
Gleb Smirnoff	23e9c6dc1e	After r241245 it appeared that in_delayed_cksum(), which still expects host byte order, was sometimes called with net byte order. Since we are moving towards net byte order throughout the stack, the function was converted to expect net byte order, and its consumers fixed appropriately: - ip_output(), ipfilter(4) not changed, since already call in_delayed_cksum() with header in net byte order. - divert(4), ng_nat(4), ipfw_nat(4) now don't need to swap byte order there and back. - mrouting code and IPv6 ipsec now need to switch byte order there and back, but I hope, this is temporary solution. - In ipsec(4) shifted switch to net byte order prior to in_delayed_cksum(). - pf_route() catches up on r241245 changes to ip_output().	2012-10-08 08:03:58 +00:00
Gleb Smirnoff	b7fb54d8ae	No reason to play with IP header before calling sctp_delayed_cksum() with offset beyond the IP header.	2012-10-08 07:21:32 +00:00
Gleb Smirnoff	21d172a3f1	A step in resolving mess with byte ordering for AF_INET. After this change: - All packets in NETISR_IP queue are in net byte order. - ip_input() is entered in net byte order and converts packet to host byte order right _after_ processing pfil(9) hooks. - ip_output() is entered in host byte order and converts packet to net byte order right _before_ processing pfil(9) hooks. - ip_fragment() accepts and emits packet in net byte order. - ip_forward(), ip_mloopback() use host byte order (untouched actually). - ip_fastforward() no longer modifies packet at all (except ip_ttl). - Swapping of byte order there and back removed from the following modules: pf(4), ipfw(4), enc(4), if_bridge(4). - Swapping of byte order added to ipfilter(4), based on __FreeBSD_version - __FreeBSD_version bumped. - pfil(9) manual page updated. Reviewed by: ray, luigi, eri, melifaro Tested by: glebius (LE), ray (BE)	2012-10-06 10:02:11 +00:00
Gleb Smirnoff	df4e91d386	There is a complex race in in_pcblookup_hash() and in_pcblookup_group(). Both functions need to obtain lock on the found PCB, and they can't do classic inter-lock with the PCB hash lock, due to lock order reversal. To keep the PCB stable, these functions put a reference on it and after PCB lock is acquired drop it. If the reference was the last one, this means we've raced with in_pcbfree() and the PCB is no longer valid. This approach works okay only if we are acquiring writer-lock on the PCB. In case of reader-lock, the following scenario can happen: - 2 threads locate pcb, and do in_pcbref() on it. - These 2 threads drop the inp hash lock. - Another thread comes to delete pcb via in_pcbfree(), it obtains hash lock, does in_pcbremlists(), drops hash lock, and runs in_pcbrele_wlocked(), which doesn't free the pcb due to two references on it. Then it unlocks the pcb. - 2 aforementioned threads acquire reader lock on the pcb and run in_pcbrele_rlocked(). One gets 1 from in_pcbrele_rlocked() and continues, second gets 0 and considers pcb freed, returns. - The thread that got 1 continues working with detached pcb, which later leads to panic in the underlying protocol level. To plumb that problem an additional INPCB flag introduced - INP_FREED. We check for that flag in the in_pcbrele_rlocked() and if it is set, we pretend that that was the last reference. Discussed with: rwatson, jhb Reported by: Vladimir Medvedkin <medved rambler-co.ru>	2012-10-02 12:03:02 +00:00
Gleb Smirnoff	891122d180	carp_send_ad() should never return without rescheduling next run.	2012-09-29 05:52:19 +00:00
Gleb Smirnoff	85c05144f1	Fix bug in TCP_KEEPCNT setting, which slipped in in the last round of reviewing of r231025. Unlike other options from this family TCP_KEEPCNT doesn't specify time interval, but a count, thus parameter supplied doesn't need to be multiplied by hz. Reported & tested by: amdmi3	2012-09-27 07:13:21 +00:00
Michael Tuexen	e06f3469e0	Whitespace change. MFC after: 3 days	2012-09-23 07:43:10 +00:00
Michael Tuexen	a98809db78	Declare a static function as such. MFC after: 3 days	2012-09-23 07:23:18 +00:00
Michael Tuexen	efb0814c24	Fix a bug related to handling Re-config chunks. It is not true that the association can be removed if the socket is gone. MFC after: 3 days	2012-09-22 22:04:17 +00:00
Michael Tuexen	2089750009	Small cleanups. No functional change. MFC after: 10 days	2012-09-22 14:39:20 +00:00
Kevin Lo	b7e1113e8f	Fix typo: s/pakcet/packet	2012-09-20 03:29:43 +00:00
Eitan Adler	582212fa04	s/teh/the/g Approved by: cperciva MFC after: 3 days	2012-09-14 21:59:55 +00:00
Michael Tuexen	dcb68fba2d	Small cleanups. No functional change. MFC after: 10 days	2012-09-14 18:32:20 +00:00
Gleb Smirnoff	3b3a8eb937	o Create directory sys/netpfil, where all packet filters should reside, and move there ipfw(4) and pf(4). o Move most modified parts of pf out of contrib. Actual movements: sys/contrib/pf/net/.c -> sys/netpfil/pf/ sys/contrib/pf/net/.h -> sys/net/ contrib/pf/pfctl/.c -> sbin/pfctl contrib/pf/pfctl/.h -> sbin/pfctl contrib/pf/pfctl/pfctl.8 -> sbin/pfctl contrib/pf/pfctl/.4 -> share/man/man4 contrib/pf/pfctl/.5 -> share/man/man5 sys/netinet/ipfw -> sys/netpfil/ipfw The arguable movement is pf/net/*.h -> sys/net. There are future plans to refactor pf includes, so I decided not to break things twice. Not modified bits of pf left in contrib: authpf, ftp-proxy, tftp-proxy, pflogd. The ipfw(4) movement is planned to be merged to stable/9, to make head and stable match. Discussed with: bz, luigi	2012-09-14 11:51:49 +00:00
Michael Tuexen	8225a9bc85	Whitespace changes. MFC after: 10 days	2012-09-09 08:14:04 +00:00
Michael Tuexen	fe6bb0a788	Whitespace cleanup. MFC after: 10 days	2012-09-08 20:54:54 +00:00
Gleb Smirnoff	d6d3f01e0a	Merge the projects/pf/head branch, that was worked on for last six months, into head. The most significant achievements in the new code: o Fine grained locking, thus much better performance. o Fixes to many problems in pf, that were specific to FreeBSD port. New code doesn't have that many ifdefs and much less OpenBSDisms, thus is more attractive to our developers. Those interested in details, can browse through SVN log of the projects/pf/head branch. And for reference, here is exact list of revisions merged: r232043, r232044, r232062, r232148, r232149, r232150, r232298, r232330, r232332, r232340, r232386, r232390, r232391, r232605, r232655, r232656, r232661, r232662, r232663, r232664, r232673, r232691, r233309, r233782, r233829, r233830, r233834, r233835, r233836, r233865, r233866, r233868, r233873, r234056, r234096, r234100, r234108, r234175, r234187, r234223, r234271, r234272, r234282, r234307, r234309, r234382, r234384, r234456, r234486, r234606, r234640, r234641, r234642, r234644, r234651, r235505, r235506, r235535, r235605, r235606, r235826, r235991, r235993, r236168, r236173, r236179, r236180, r236181, r236186, r236223, r236227, r236230, r236252, r236254, r236298, r236299, r236300, r236301, r236397, r236398, r236399, r236499, r236512, r236513, r236525, r236526, r236545, r236548, r236553, r236554, r236556, r236557, r236561, r236570, r236630, r236672, r236673, r236679, r236706, r236710, r236718, r237154, r237155, r237169, r237314, r237363, r237364, r237368, r237369, r237376, r237440, r237442, r237751, r237783, r237784, r237785, r237788, r237791, r238421, r238522, r238523, r238524, r238525, r239173, r239186, r239644, r239652, r239661, r239773, r240125, r240130, r240131, r240136, r240186, r240196, r240212. I'd like to thank people who participated in early testing: Tested by: Florian Smeets <flo freebsd.org> Tested by: Chekaluk Vitaly <artemrts ukr.net> Tested by: Ben Wilber <ben desync.com> Tested by: Ian FREISLICH <ianf cloudseed.co.za>	2012-09-08 06:41:54 +00:00
Michael Tuexen	a169d6ec2b	Don't include a structure containing a flexible array in another structure. MFC after: 10 days	2012-09-07 13:36:42 +00:00
Michael Tuexen	12780a595e	Get rid of a gcc'ism. MFC after: 10 days	2012-09-06 07:03:56 +00:00
Michael Tuexen	dd294dcec6	Using %p in a format string requires a void *. MFC after: 10 days	2012-09-05 18:52:01 +00:00
Michael Tuexen	2899aa8f65	Consistently use the size of a variable. This helps to keep the code simpler for the userland implementation. MFC after: 3 days	2012-09-04 22:45:00 +00:00
Michael Tuexen	c6328f940e	Whitespace change. MFC after: 3 days	2012-09-04 22:40:49 +00:00
Alexander V. Chernikov	7d4317bd40	Introduce new link-layer PFIL hook V_link_pfil_hook. Merge ether_ipfw_chk() and part of bridge_pfil() into unified ipfw_check_frame() function called by PFIL. This change was suggested by rwatson? @ DevSummit. Remove ipfw headers from ether/bridge code since they are unneeded now. Note this change introduces some (temporary) performance penalty since PFIL read lock has to be acquired for every link-level packet. MFC after: 3 weeks	2012-09-04 19:43:26 +00:00
Gleb Smirnoff	478df1d534	Provide a sysctl switch that allows to install ARP entries with multicast bit set. FreeBSD refuses to install such entries since 9.0, and this broke installations running Microsoft NLB, which are violating standards. Tested by: Tarasov Oleg <oleg_tarasov sg-tea.com>	2012-09-03 14:29:28 +00:00
Michael Tuexen	81eb4e6351	Fix a typo which results in RTT to be off by a factor of 10, if the RTT is larger than 1 second. MFC after: 3 days	2012-09-02 12:37:30 +00:00
Eitan Adler	64baf9fbe0	Mark the ipfw interface type as not being ether. This fixes an issue where uuidgen tried to obtain a ipfw device's mac address which was always zero. PR: 170460 Submitted by: wxs Reviewed by: bdrewery Reviewed by: delphij Approved by: cperciva MFC after: 1 week	2012-09-01 23:33:49 +00:00
Randall Stewart	ec03d5433f	This small change takes care of a race condition that can occur when both sides close at the same time. If that occurs, without this fix the connection enters FIN1 on both sides and they will forever send FIN|ACK at each other until the connection times out. This is because we stopped processing the FIN|ACK and thus did not advance the sequence and so never ACK'd each others FIN. This fix adjusts it so we do process the FIN properly and the race goes away ;-) MFC after: 1 month	2012-08-25 09:26:37 +00:00
Navdeep Parhar	06fd9875aa	Correctly handle the case where an inp has already been dropped by the time the TOE driver reports that an active open failed. toe_connect_failed is supposed to handle this but it should be provided the inpcb instead of the tcpcb which may no longer be around.	2012-08-21 18:09:33 +00:00
Randall Stewart	7db496de2c	Though I disagree, I concede to jhb & Rui. Note that we still have a problem with this whole structure of locks and in_input.c [it does not lock which it should not, but this can lead to crashes]. (I have seen it in our SQA testbed.. besides the one with a refcnt issue that I will have SQA work on next week ;-)	2012-08-19 11:54:02 +00:00
Randall Stewart	9424879158	Ok jhb, lets move the ifa_free() down to the bottom to assure that all tables and such are removed before we start to free. This won't protect the Hash in ip_input.c but in theory should protect any other uses that do use locks. MFC after: 1 week (or more)	2012-08-17 05:51:46 +00:00
Lawrence Stewart	ee24d3b840	The TCP PAWS fix for kernels with fast tick rates (r231767) changed the TCP timestamp related stack variables to reference ms directly instead of ticks. The h_ertt(4) Khelp module relies on TCP timestamp information in order to calculate its enhanced RTT estimates, but was not updated as part of r231767. Consequently, h_ertt has not been calculating correct RTT estimates since r231767 was committed, which in turn broke all delay-based congestion control algorithms because they rely on the h_ertt RTT estimates. Fix the breakage by switching h_ertt to use tcp_ts_getticks() in place of all previous uses of the ticks variable. This ensures all timestamp related variables in h_ertt use the same units as the TCP stack and therefore results in meaningful comparisons and RTT estimate calculations. Reported & tested by: Naeem Khademi (naeemk at ifi uio no) Discussed with: bz MFC after: 3 days	2012-08-17 01:49:51 +00:00
Randall Stewart	184749821f	Its never a good idea to double free the same address. MFC after: 1 week (after the other commits ahead of this gets MFC'd)	2012-08-16 17:55:16 +00:00
Luigi Rizzo	e5813a3bce	s/lenght/length/ in comments	2012-08-07 07:52:25 +00:00
Luigi Rizzo	17369272e4	move functions outside the SYSBEGIN/SYSEND block (SYSBEGIN/SYSEND are specific to ipfw/dummynet and are used to emulate sysctl on platforms that do not have them, and they work by creating an array which contains all the sysctl-ed symbols.)	2012-08-06 11:02:23 +00:00
Luigi Rizzo	00c4633285	use FREE_PKT instead of m_freem to free an mbuf. The former is the standard form used in ipfw/dummynet, so that it is easier to remap it to different memory managers depending on the platform.	2012-08-06 10:50:43 +00:00


4624 Commits