Commit Graph

1649 Commits

Author SHA1 Message Date
Matthew Dillon
d65bf08af3 Add the tcps_sndrexmitbad statistic, keep track of late acks that caused
unnecessary retransmissions.
2002-07-19 18:29:38 +00:00
Matthew Dillon
701bec5a38 Introduce two new sysctl's:
net.inet.tcp.rexmit_min (default 3 ticks equiv)

    This sysctl is the retransmit timer RTO minimum,
    specified in milliseconds.  This value is
    designed for algorithmic stability only.

net.inet.tcp.rexmit_slop (default 200ms)

    This sysctl is the retransmit timer RTO slop
    which is added to every retransmit timeout and
    is designed to handle protocol stack overheads
    and delayed ack issues.

Note that the *original* code applied a 1-second
RTO minimum but never applied real slop to the RTO
calculation, so any RTO calculation over one second
would have no slop and thus not account for
protocol stack overheads (TCP timestamps are not
a measure of protocol turnaround!).  Essentially,
the original code made the RTO calculation almost
completely irrelevant.

Please note that the 200ms slop is debateable.
This commit is not meant to be a line in the sand,
and if the community winds up deciding that increasing
it is the correct solution then it's easy to do.
Note that larger values will destroy performance
on lossy networks while smaller values may result in
a greater number of unnecessary retransmits.
2002-07-18 19:06:12 +00:00
Luigi Rizzo
90780c4b05 Move IPFW2 definition before including ip_fw.h
Make indentation of new parts consistent with the style used for this file.
2002-07-18 05:18:41 +00:00
Matthew Dillon
22fd54d461 I don't know how the minimum retransmit timeout managed to get set to
one second but it badly breaks throughput on networks with minor packet
loss.

Complaints by: at least two people tracked down to this.
MFC after:	3 days
2002-07-17 23:32:03 +00:00
Luigi Rizzo
318aa87b59 Fix a panic when doing "ipfw add pipe 1 log ..."
Also synchronize ip_dummynet.c with the version in RELENG_4 to
ease MFC's.
2002-07-17 07:21:42 +00:00
Luigi Rizzo
a8c102a2ec Implement keepalives for dynamic rules, so they will not expire
just because you leave your session idle.

Also, put in a fix for 64-bit architectures (to be revised).

In detail:

ip_fw.h

  * Reorder fields in struct ip_fw to avoid alignment problems on
    64-bit machines. This only masks the problem, I am still not
    sure whether I am doing something wrong in the code or there
    is a problem elsewhere (e.g. different aligmnent of structures
    between userland and kernel because of pragmas etc.)

  * added fields in dyn_rule to store ack numbers, so we can
    generate keepalives when the dynamic rule is about to expire

ip_fw2.c

  * use a local function, send_pkt(), to generate TCP RST for Reset rules;

  * save about 250 bytes by cleaning up the various snprintf()
    in ipfw_log() ...

  * ... and use twice as many bytes to implement keepalives
    (this seems to be working, but i have not tested it extensively).

Keepalives are generated once every 5 seconds for the last 20 seconds
of the lifetime of a dynamic rule for an established TCP flow.  The
packets are sent to both sides, so if at least one of the endpoints
is responding, the timeout is refreshed and the rule will not expire.

You can disable this feature with

        sysctl net.inet.ip.fw.dyn_keepalive=0

(the default is 1, to have them enabled).

MFC after: 1 day

(just kidding... I will supply an updated version of ipfw2 for
RELENG_4 tomorrow).
2002-07-14 23:47:18 +00:00
Luigi Rizzo
3956b02345 Avoid dereferencing a null pointer in ro_rt.
This was always broken in HEAD (the offending statement was introduced
in rev. 1.123 for HEAD, while RELENG_4 included this fix (in rev.
1.99.2.12 for RELENG_4) and I inadvertently deleted it in 1.99.2.30.

So I am also restoring these two lines in RELENG_4 now.
We might need another few things from 1.99.2.30.
2002-07-12 22:08:47 +00:00
Don Lewis
2d20c83f93 Back out the previous change, since it looks like locking udbinfo provides
sufficient protection.
2002-07-12 09:55:48 +00:00
Don Lewis
bb1dd7a45a Lock inp while we're accessing it. 2002-07-12 08:05:22 +00:00
Don Lewis
0e1eebb846 Defer calling SYSCTL_OUT() until after the locks have been released. 2002-07-11 23:18:43 +00:00
Don Lewis
142b2bd644 Reduce the nesting level of a code block that doesn't need to be in
an else clause.
2002-07-11 23:13:31 +00:00
Luigi Rizzo
c7ea683135 Change one variable to make it easier to switch between ipfw and ipfw2 2002-07-09 06:53:38 +00:00
Luigi Rizzo
b3063f064c Fix a bug caused by dereferencing an invalid pointer when
no punch_fw was used.
Fix another couple of bugs which prevented rules from being
installed properly.

On passing, use IPFW2 instead of NEW_IPFW to compile the new code,
and slightly simplify the instruction generation code.
2002-07-08 22:57:35 +00:00
Luigi Rizzo
d63b346ab1 No functional changes, but:
Following Darren's suggestion, make Dijkstra happy and rewrite the
ipfw_chk() main loop removing a lot of goto's and using instead a
variable to store match status.

Add a lot of comments to explain what instructions are supposed to
do and how -- this should ease auditing of the code and make people
more confident with it.

In terms of code size: the entire file takes about 12700 bytes of text,
about 3K of which are for the main function, ipfw_chk(), and 2K (ouch!)
for ipfw_log().
2002-07-08 22:46:01 +00:00
Luigi Rizzo
7d4d3e9051 Remove one unused command name. 2002-07-08 22:39:19 +00:00
Luigi Rizzo
5185195169 Forgot to update one field name in one of the latest commits. 2002-07-08 22:37:55 +00:00
Luigi Rizzo
5e43aef891 Implement the last 2-3 missing instructions for ipfw,
now it should support all the instructions of the old ipfw.

Fix some bugs in the user interface, /sbin/ipfw.

Please check this code against your rulesets, so i can fix the
remaining bugs (if any, i think they will be mostly in /sbin/ipfw).

Once we have done a bit of testing, this code is ready to be MFC'ed,
together with a bunch of other changes (glue to ipfw, and also the
removal of some global variables) which have been in -current for
a couple of weeks now.

MFC after: 7 days
2002-07-05 22:43:06 +00:00
Brian Somers
27cc91fbf8 Remove trailing whitespace 2002-07-01 11:19:40 +00:00
Jesper Skriver
eb538bfd64 Extend the effect of the sysctl net.inet.tcp.icmp_may_rst
so that, if we recieve a ICMP "time to live exceeded in transit",
(type 11, code 0) for a TCP connection on SYN-SENT state, close
the connection.

MFC after:	2 weeks
2002-06-30 20:07:21 +00:00
Jonathan Lemon
0080a004d7 One possible code path for syncache_respond() is:
syncache_respond(A), ip_output(), ip_input(), tcp_input(), syncache_badack(B)

Which winds up deleting a different entry from the syncache.  Handle
this by not utilizing the next entry in the timer chain until after
syncache_respond() completes.  The case of A == B should not be possible.

Problem found by: Don Bowman <don@sandvine.com>
2002-06-28 19:12:38 +00:00
Doug Rabson
24f8fd9fd1 Fix warning.
Reviewed by: luigi
2002-06-28 08:36:26 +00:00
Luigi Rizzo
9758b77ff1 The new ipfw code.
This code makes use of variable-size kernel representation of rules
(exactly the same concept of BPF instructions, as used in the BSDI's
firewall), which makes firewall operation a lot faster, and the
code more readable and easier to extend and debug.

The interface with the rest of the system is unchanged, as witnessed
by this commit. The only extra kernel files that I am touching
are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In
userland I only had to touch those programs which manipulate the
internal representation of firewall rules).

The code is almost entirely new (and I believe I have written the
vast majority of those sections which were taken from the former
ip_fw.c), so rather than modifying the old ip_fw.c I decided to
create a new file, sys/netinet/ip_fw2.c .  Same for the user
interface, which is in sbin/ipfw/ipfw2.c (it still compiles to
/sbin/ipfw).  The old files are still there, and will be removed
in due time.

I have not renamed the header file because it would have required
touching a one-line change to a number of kernel files.

In terms of user interface, the new "ipfw" is supposed to accepts
the old syntax for ipfw rules (and produce the same output with
"ipfw show". Only a couple of the old options (out of some 30 of
them) has not been implemented, but they will be soon.

On the other hand, the new code has some very powerful extensions.
First, you can put "or" connectives between match fields (and soon
also between options), and write things like

ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any

This should make rulesets slightly more compact (and lines longer!),
by condensing 2 or more of the old rules into single ones.

Also, as an example of how easy the rules can be extended, I have
implemented an 'address set' match pattern, where you can specify
an IP address in a format like this:

        10.20.30.0/26{18,44,33,22,9}

which will match the set of hosts listed in braces belonging to the
subnet 10.20.30.0/26 . The match is done using a bitmap, so it is
essentially a constant time operation requiring a handful of CPU
instructions (and a very small amount of memmory -- for a full /24
subnet, the instruction only consumes 40 bytes).

Again, in this commit I have focused on functionality and tried
to minimize changes to the other parts of the system. Some performance
improvement can be achieved with minor changes to the interface of
ip_fw_chk_t. This will be done later when this code is settled.

The code is meant to compile unmodified on RELENG_4 (once the
PACKET_TAG_* changes have been merged), for this reason
you will see #ifdef __FreeBSD_version in a couple of places.
This should minimize errors when (hopefully soon) it will be time
to do the MFC.
2002-06-27 23:02:18 +00:00
Maxime Henrion
7627c6cbcc Warning fixes for 64 bits platforms. With this last fix,
I can build a GENERIC sparc64 kernel with -Werror.

Reviewed by:	luigi
2002-06-27 11:02:06 +00:00
Luigi Rizzo
713a6ea063 Just a comment on some additional consistency checks that could
be added here.
2002-06-26 21:00:53 +00:00
Kenneth D. Merry
98cb733c67 At long last, commit the zero copy sockets code.
MAKEDEV:	Add MAKEDEV glue for the ti(4) device nodes.

ti.4:		Update the ti(4) man page to include information on the
		TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options,
		and also include information about the new character
		device interface and the associated ioctls.

man9/Makefile:	Add jumbo.9 and zero_copy.9 man pages and associated
		links.

jumbo.9:	New man page describing the jumbo buffer allocator
		interface and operation.

zero_copy.9:	New man page describing the general characteristics of
		the zero copy send and receive code, and what an
		application author should do to take advantage of the
		zero copy functionality.

NOTES:		Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS,
		TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.

conf/files:	Add uipc_jumbo.c and uipc_cow.c.

conf/options:	Add the 5 options mentioned above.

kern_subr.c:	Receive side zero copy implementation.  This takes
		"disposable" pages attached to an mbuf, gives them to
		a user process, and then recycles the user's page.
		This is only active when ZERO_COPY_SOCKETS is turned on
		and the kern.ipc.zero_copy.receive sysctl variable is
		set to 1.

uipc_cow.c:	Send side zero copy functions.  Takes a page written
		by the user and maps it copy on write and assigns it
		kernel virtual address space.  Removes copy on write
		mapping once the buffer has been freed by the network
		stack.

uipc_jumbo.c:	Jumbo disposable page allocator code.  This allocates
		(optionally) disposable pages for network drivers that
		want to give the user the option of doing zero copy
		receive.

uipc_socket.c:	Add kern.ipc.zero_copy.{send,receive} sysctls that are
		enabled if ZERO_COPY_SOCKETS is turned on.

		Add zero copy send support to sosend() -- pages get
		mapped into the kernel instead of getting copied if
		they meet size and alignment restrictions.

uipc_syscalls.c:Un-staticize some of the sf* functions so that they
		can be used elsewhere.  (uipc_cow.c)

if_media.c:	In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid
		calling malloc() with M_WAITOK.  Return an error if
		the M_NOWAIT malloc fails.

		The ti(4) driver and the wi(4) driver, at least, call
		this with a mutex held.  This causes witness warnings
		for 'ifconfig -a' with a wi(4) or ti(4) board in the
		system.  (I've only verified for ti(4)).

ip_output.c:	Fragment large datagrams so that each segment contains
		a multiple of PAGE_SIZE amount of data plus headers.
		This allows the receiver to potentially do page
		flipping on receives.

if_ti.c:	Add zero copy receive support to the ti(4) driver.  If
		TI_PRIVATE_JUMBOS is not defined, it now uses the
		jumbo(9) buffer allocator for jumbo receive buffers.

		Add a new character device interface for the ti(4)
		driver for the new debugging interface.  This allows
		(a patched version of) gdb to talk to the Tigon board
		and debug the firmware.  There are also a few additional
		debugging ioctls available through this interface.

		Add header splitting support to the ti(4) driver.

		Tweak some of the default interrupt coalescing
		parameters to more useful defaults.

		Add hooks for supporting transmit flow control, but
		leave it turned off with a comment describing why it
		is turned off.

if_tireg.h:	Change the firmware rev to 12.4.11, since we're really
		at 12.4.11 plus fixes from 12.4.13.

		Add defines needed for debugging.

		Remove the ti_stats structure, it is now defined in
		sys/tiio.h.

ti_fw.h:	12.4.11 firmware.

ti_fw2.h:	12.4.11 firmware, plus selected fixes from 12.4.13,
		and my header splitting patches.  Revision 12.4.13
		doesn't handle 10/100 negotiation properly.  (This
		firmware is the same as what was in the tree previously,
		with the addition of header splitting support.)

sys/jumbo.h:	Jumbo buffer allocator interface.

sys/mbuf.h:	Add a new external mbuf type, EXT_DISPOSABLE, to
		indicate that the payload buffer can be thrown away /
		flipped to a userland process.

socketvar.h:	Add prototype for socow_setup.

tiio.h:		ioctl interface to the character portion of the ti(4)
		driver, plus associated structure/type definitions.

uio.h:		Change prototype for uiomoveco() so that we'll know
		whether the source page is disposable.

ufs_readwrite.c:Update for new prototype of uiomoveco().

vm_fault.c:	In vm_fault(), check to see whether we need to do a page
		based copy on write fault.

vm_object.c:	Add a new function, vm_object_allocate_wait().  This
		does the same thing that vm_object allocate does, except
		that it gives the caller the opportunity to specify whether
		it should wait on the uma_zalloc() of the object structre.

		This allows vm objects to be allocated while holding a
		mutex.  (Without generating WITNESS warnings.)

		vm_object_allocate() is implemented as a call to
		vm_object_allocate_wait() with the malloc flag set to
		M_WAITOK.

vm_object.h:	Add prototype for vm_object_allocate_wait().

vm_page.c:	Add page-based copy on write setup, clear and fault
		routines.

vm_page.h:	Add page based COW function prototypes and variable in
		the vm_page structure.

Many thanks to Drew Gallatin, who wrote the zero copy send and receive
code, and to all the other folks who have tested and reviewed this code
over the years.
2002-06-26 03:37:47 +00:00
Jeffrey Hsu
6fd22caf91 Avoid unlocking the inp twice if badport_bandlim() returns -1.
Reported by:	jlemon
2002-06-24 22:25:00 +00:00
Jeffrey Hsu
f14e4cfe33 Style bug: fix 4 space indentations that should have been tabs.
Submitted by:	jlemon
2002-06-24 16:47:02 +00:00
Luigi Rizzo
f10e85d797 Slightly restructure the #ifdef INET6 sections to make the code
more readable.

Remove the six "register" attributes from variables tcp_output(), the
compiler surely knows well how to allocate them.
2002-06-23 21:25:36 +00:00
Luigi Rizzo
410bb1bfe2 Move two global variables to automatic variables within the
only function where they are used (they are used with TCPDEBUG only).
2002-06-23 21:22:56 +00:00
Luigi Rizzo
4d2e36928d Move some global variables in more appropriate places.
Add XXX comments to mark places which need to be taken care of
if we want to remove this part of the kernel from Giant.

Add a comment on a potential performance problem with ip_forward()
2002-06-23 20:48:26 +00:00
Luigi Rizzo
51aed12e52 fix bad indentation and whitespace resulting from cut&paste 2002-06-23 09:15:43 +00:00
Luigi Rizzo
dfd1ae2f86 fix indentation of a comment 2002-06-23 09:14:24 +00:00
Luigi Rizzo
a5924d6100 fix a typo in a comment 2002-06-23 09:13:46 +00:00
Luigi Rizzo
ec3057db9e Remove ip_fw_fwd_addr (forgotten in previous commit)
remove some extra whitespace.
2002-06-23 09:03:42 +00:00
Luigi Rizzo
2b25acc158 Remove (almost all) global variables that were used to hold
packet forwarding state ("annotations") during ip processing.
The code is considerably cleaner now.

The variables removed by this change are:

        ip_divert_cookie        used by divert sockets
        ip_fw_fwd_addr          used for transparent ip redirection
        last_pkt                used by dynamic pipes in dummynet

Removal of the first two has been done by carrying the annotations
into volatile structs prepended to the mbuf chains, and adding
appropriate code to add/remove annotations in the routines which
make use of them, i.e. ip_input(), ip_output(), tcp_input(),
bdg_forward(), ether_demux(), ether_output_frame(), div_output().

On passing, remove a bug in divert handling of fragmented packet.
Now it is the fragment at offset 0 which sets the divert status of
the whole packet, whereas formerly it was the last incoming fragment
to decide.

Removal of last_pkt required a change in the interface of ip_fw_chk()
and dummynet_io(). On passing, use the same mechanism for dummynet
annotations and for divert/forward annotations.

option IPFIREWALL_FORWARD is effectively useless, the code to
implement it is very small and is now in by default to avoid the
obfuscation of conditionally compiled code.

NOTES:
 * there is at least one global variable left, sro_fwd, in ip_output().
   I am not sure if/how this can be removed.

 * I have deliberately avoided gratuitous style changes in this commit
   to avoid cluttering the diffs. Minor stule cleanup will likely be
   necessary

 * this commit only focused on the IP layer. I am sure there is a
   number of global variables used in the TCP and maybe UDP stack.

 * despite the number of files touched, there are absolutely no API's
   or data structures changed by this commit (except the interfaces of
   ip_fw_chk() and dummynet_io(), which are internal anyways), so
   an MFC is quite safe and unintrusive (and desirable, given the
   improved readability of the code).

MFC after: 10 days
2002-06-22 11:51:02 +00:00
Jeffrey Hsu
2ded288c88 Fix logic which resulted in missing a call to INP_UNLOCK().
Submitted by:	jlemon, mux
2002-06-21 22:54:16 +00:00
Jeffrey Hsu
2d40081d1f TCP notify functions can change the pcb list. 2002-06-21 22:52:48 +00:00
Peter Wemm
532cf61bcf Solve the 'unregistered netisr 18' information notice with a sledgehammer.
Register the ISR early, but do not actually kick off the timer until we
see some activity.  This still saves us from running the arp timers on
a system with no network cards.
2002-06-20 01:27:40 +00:00
Seigo Tanimura
03e4918190 Remove so*_locked(), which were backed out by mistake. 2002-06-18 07:42:02 +00:00
Jeffrey Hsu
3ce144ea88 Notify functions can destroy the pcb, so they have to return an
indication of whether this happenned so the calling function
knows whether or not to unlock the pcb.

Submitted by:	Jennifer Yang (yangjihui@yahoo.com)
Bug reported by:  Sid Carter (sidcarter@symonds.net)
2002-06-14 08:35:21 +00:00
Mike Silbersack
eb5afeba22 Re-commit w/fix:
Ensure that the syn cache's syn-ack packets contain the same
  ip_tos, ip_ttl, and DF bits as all other tcp packets.

  PR:             39141
  MFC after:      2 weeks

This time, make sure that ipv4 specific code (aka all of the above)
is only run in the ipv4 case.
2002-06-14 03:08:05 +00:00
Mike Silbersack
70d2b17029 Back out ip_tos/ip_ttl/DF "fix", it just panic'd my box. :)
Pointy-hat to:	silby
2002-06-14 02:43:20 +00:00
Mike Silbersack
21c3b2fc69 Ensure that the syn cache's syn-ack packets contain the same
ip_tos, ip_ttl, and DF bits as all other tcp packets.

PR:		39141
MFC after:	2 weeks
2002-06-14 02:36:34 +00:00
Jeffrey Hsu
9c68f33a9d Because we're holding an exclusive write lock on the head, references to
the new inp cannot leak out even though it has been placed on the head list.
2002-06-13 23:14:58 +00:00
Jeffrey Hsu
61ffc0b1a6 The UDP head was unlocked too early in one unicast case.
Submitted by:	bug reported by arr
2002-06-12 15:21:41 +00:00
Jeffrey Hsu
73dca2078d Fix logic which resulted in missing a call to INP_UNLOCK(). 2002-06-12 03:11:06 +00:00
Jeffrey Hsu
3cfcc388ea Fix typo where INP_INFO_RLOCK should be INP_INFO_RUNLOCK.
Submitted by: tegge, jlemon

Prefer LIST_FOREACH macro.
  Submitted by: jlemon
2002-06-12 03:08:08 +00:00
Jeffrey Hsu
7a9378e7f5 Remember to initialize the control block head mutex. 2002-06-11 10:58:57 +00:00
Jeffrey Hsu
3d9baf34c0 Fix typo.
Submitted by:	Kyunghwan Kim <redjade@atropos.snu.ac.kr>
2002-06-11 10:56:49 +00:00
Jeffrey Hsu
e98d6424af Every array elt is initialized in the following loop, so remove
unnecessary M_ZERO.
2002-06-10 23:48:37 +00:00
Jeffrey Hsu
f76fcf6d4c Lock up inpcb.
Submitted by:	Jennifer Yang <yangjihui@yahoo.com>
2002-06-10 20:05:46 +00:00
Seigo Tanimura
4cc20ab1f0 Back out my lats commit of locking down a socket, it conflicts with hsu's work.
Requested by:	hsu
2002-05-31 11:52:35 +00:00
Garrett Wollman
c7c5d95d56 Avoid unintentional trigraph. 2002-05-30 20:53:45 +00:00
Andrew R. Reiter
db40007d42 - Change the newly turned INVARIANTS #ifdef blocks (they were changed from
DIAGNOSTIC yesterday) into KASSERT()'s as these help to increase code
  readability.
2002-05-21 18:52:24 +00:00
Andrew R. Reiter
4cb674c960 - Turn a few DIAGNOSTIC into INVARIANTS since they are really sanity
checks.
2002-05-20 22:05:13 +00:00
Andrew R. Reiter
1e404e4e86 - Turn a DIAGNOSTIC into an INVARIANTS since it's a sanity check. Use
proper ``if'' statement style.
2002-05-20 22:04:19 +00:00
Andrew R. Reiter
e16f6e6200 - Turn a #ifdef DIAGNOSTIC to #ifdef INVARIANTS as the code from this line
through the #endif is really a sanity check.

Reviewed by: jake
2002-05-20 21:50:39 +00:00
Seigo Tanimura
243917fe3b Lock down a socket, milestone 1.
o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a
  socket buffer. The mutex in the receive buffer also protects the data
  in struct socket.

o Determine the lock strategy for each members in struct socket.

o Lock down the following members:

  - so_count
  - so_options
  - so_linger
  - so_state

o Remove *_locked() socket APIs.  Make the following socket APIs
  touching the members above now require a locked socket:

 - sodisconnect()
 - soisconnected()
 - soisconnecting()
 - soisdisconnected()
 - soisdisconnecting()
 - sofree()
 - soref()
 - sorele()
 - sorwakeup()
 - sotryfree()
 - sowakeup()
 - sowwakeup()

Reviewed by:	alfred
2002-05-20 05:41:09 +00:00
Kelly Yancey
c3a2190cdc Reset token-ring source routing control field on receipt of ethernet frame
without source routing information.  This restores the behaviour in this
scenario to that of prior to my last commit.
2002-05-15 01:03:32 +00:00
Robert Watson
f83c7ad731 Modify the arguments to syncache_socket() to include the mbuf (m) that
results in the syncache entry being turned into a socket.  While it's
not used in the main tree, this is required in the MAC tree so that
labels can be propagated from the mbuf to the socket.  This is also
useful if you're doing things like transparent IP connection hijacking
and you want to use the syncache/cookie mechanism, but we won't go
there.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-05-14 18:57:55 +00:00
Luigi Rizzo
4b9840932d Add ipfw hooks to ether_demux() and ether_output_frame().
Ipfw processing of frames at layer 2 can be enabled by the sysctl variable

	net.link.ether.ipfw=1

Consider this feature experimental, because right now, the firewall
is invoked in the places indicated below, and controlled by the
sysctl variables listed on the right.  As a consequence, a packet
can be filtered from 1 to 4 times depending on the path it follows,
which might make a ruleset a bit hard to follow.

I will add an ipfw option to tell if we want a given rule to apply
to ether_demux() and ether_output_frame(), but we have run out of
flags in the struct ip_fw so i need to think a bit on how to implement
this.

		to upper layers
	     |			     |
	     +----------->-----------+
	     ^			     V
	[ip_input]		[ip_output]	net.inet.ip.fw.enable=1
	     |			     |
	     ^			     V
	[ether_demux]      [ether_output_frame]	net.link.ether.ipfw=1
	     |			     |
	     +->- [bdg_forward]-->---+		net.link.ether.bridge_ipfw=1
	     ^			     V
	     |			     |
		 to devices
2002-05-13 10:37:19 +00:00
Luigi Rizzo
2f8707ca5d Remove custom definitions (IP_FW_TCPF_SYN etc.) of TCP header flags
which are the same as the original ones (TH_SYN etc.)
2002-05-13 10:21:13 +00:00
Luigi Rizzo
201efb1913 Add code to match MAC header fields (at the moment supported on
bridged packets only, soon to come also for packets on ordinary
ether_input() and ether_output() paths. The syntax is

    ipfw add <action> MAC dst src type

where dst and src can be "any" or a MAC address optionallyfollowed
by a mask, e.g.

	10:20:30:40:50
	10:20:30:40:50/32
	10:20:30:40:50&ff:ff:ff:f0:ff:0f

and type can be a single ethernet type, a range, or a type followed by
a mask (values are always in hexadecimal) e.g.

	0800
	0800-0806
	0800/8
	0800&03ff

Note, I am still uncertain on what is the best format for inputting
these values, having the values in hexadecimal is convenient in most
cases but can be confusing sometimes. Suggestions welcome.

Implement suggestion from PR 37778 to allow "not me" on destination
and source IP. The code in the PR was slightly wrong and interfered
with the normal handling of IP addresses. This version hopefully is
correct.

Minor cleanup of the code, in some places moving the indentation to 4
spaces because the code was becoming too deep. Eventually, in a
separate commit, I will move the whole file to 4 space indent.
2002-05-12 20:43:50 +00:00
Dima Dorfman
11612afabe s/demon/daemon/ 2002-05-12 00:22:38 +00:00
Mike Barcroft
9828cf071d Remove some duplicate types that should have been removed as part of
the rearranging in the previous revision.

Pointy hat to:	cvs update (merging), mike (for not noticing)
2002-05-11 23:28:51 +00:00
Luigi Rizzo
d60315bef5 Cleanup the interface to ip_fw_chk, two of the input arguments
were totally useless and have been removed.

ip_input.c, ip_output.c:
    Properly initialize the "ip" pointer in case the firewall does an
    m_pullup() on the packet.

    Remove some debugging code forgotten long ago.

ip_fw.[ch], bridge.c:
    Prepare the grounds for matching MAC header fields in bridged packets,
    so we can have 'etherfw' functionality without a lot of kernel and
    userland bloat.
2002-05-09 10:34:57 +00:00
Kelly Yancey
42fdfc126a Move ISO88025 source routing information into sockaddr_dl's sdl_data
field.  This returns the sdl_data field to a variable-length field.  More
importantly, this prevents a easily-reproduceable data-corruption bug when
the interface name plus the hardware address exceed the sdl_data field's
original 12 byte limit.  However, token-ring interfaces may still overflow
the new sdl_data field's 46 byte limit if the interface name exceeds 6
characters (since 6 characters for interface name plus 6 for hardware
address plus 34 for source routing = the size of sdl_data).  Further
refinements could overcome this limitation but would break binary
compatibility; this commit only addresses fixing the bug for
commonly-occuring cases without breaking binary compatibility with the
intention that the functionality can be MFC'ed to -stable.

  See message ID's (both send to -arch):
	20020421013332.F87395-100000@gateway.posi.net
	20020430181359.G11009-300000@gateway.posi.net
  for a more thorough description of the bug addressed and how to
reproduce it.

Approved by:	silence on -arch and -net
Sponsored by:	NTT Multimedia Communications Labs
MFC after:	1 week
2002-05-07 22:14:06 +00:00
Hajimu UMEMOTO
8117063142 Revised MLD-related definitions
- Used mld_xxx and MLD_xxx instead of mld6_xxx and MLD6_xxx according
  to the official defintions in rfc2292bis
  (macro definitions for backward compatibility were provided)
- Changed the first member of mld_hdr{} from mld_hdr to mld_icmp6_hdr
  to avoid name space conflict in C++

This change makes ports/net/pchar compilable again under -CURRENT.

Obtained from:	KAME
2002-05-06 16:28:25 +00:00
Luigi Rizzo
43d11e8453 Indentation and comments cleanup, no functional change.
MFC after: 3 days
2002-05-05 21:27:47 +00:00
Alfred Perlstein
f132072368 Redo the sigio locking.
Turn the sigio sx into a mutex.

Sigio lock is really only needed to protect interrupts from dereferencing
the sigio pointer in an object when the sigio itself is being destroyed.

In order to do this in the most unintrusive manner change pgsigio's
sigio * argument into a **, that way we can lock internally to the
function.
2002-05-01 20:44:46 +00:00
Alfred Perlstein
59017610b2 Fix some edge cases where bad string handling could occur.
Submitted by: ps
2002-05-01 08:29:41 +00:00
Alfred Perlstein
ef1047305e cleanup:
fix line wraps, add some comments, fix macro definitions, fix for(;;) loops.
2002-05-01 08:08:24 +00:00
Crist J. Clark
0f56b10c4b Enlighten those who read the FINE POINTS of the documentation a bit
more on how ipfw(8) deals with tiny fragments. While we're at it, add
a quick log message to even let people know we dropped a packet. (Note
that the second FINE POINT is somewhat redundant given the first, but
since the code is there, leave the docs for it.)

MFC after:	1 day
2002-05-01 06:29:16 +00:00
Seigo Tanimura
960ed29c4b Revert the change of #includes in sys/filedesc.h and sys/socketvar.h.
Requested by:	bde

Since locking sigio_lock is usually followed by calling pgsigio(),
move the declaration of sigio_lock and the definitions of SIGIO_*() to
sys/signalvar.h.

While I am here, sort include files alphabetically, where possible.
2002-04-30 01:54:54 +00:00
Seigo Tanimura
d48d4b2501 Add a global sx sigio_lock to protect the pointer to the sigio object
of a socket.  This avoids lock order reversal caused by locking a
process in pgsigio().

sowakeup() and the callers of it (sowwakeup, soisconnected, etc.) now
require sigio_lock to be locked.  Provide sowwakeup_locked(),
soisconnected_locked(), and so on in case where we have to modify a
socket and wake up a process atomically.
2002-04-27 08:24:29 +00:00
Mike Barcroft
58631bbe0e Rearrange <netinet/in.h> so that it is easier to conditionalize
sections for various standards.  Conditionalize sections for various
standards.  Use standards conforming spelling for types in the
sockaddr_in structure.
2002-04-24 01:26:11 +00:00
Mike Barcroft
c6e43821cd Add sa_family_t type to <sys/_types.h> and typedefs to <netinet/in.h>
and <sys/socket.h>.  Previously, sa_family_t was only typedef'd in
<sys/socket.h>.
2002-04-20 02:24:35 +00:00
SUZUKI Shinsuke
88ff5695c1 just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD.
(based on freebsd4-snap-20020128)

Reviewed by:	ume
MFC after:	1 week
2002-04-19 04:46:24 +00:00
SUZUKI Shinsuke
f361efa0be initialize local variable explicitly
Reviewed by:	ume
Obtained from:	Fujitsu guys
MFC after:	1 week
2002-04-11 02:14:21 +00:00
Mike Silbersack
898568d8ab Remove some ISN generation code which has been unused since the
syncache went in.

MFC after:	3 days
2002-04-10 22:12:01 +00:00
Mike Silbersack
f2697d4d75 Totally nuke IPPORT_USERRESERVED, it is no longer used anywhere, update
remaining comments to reflect new ephemeral port range.

Reminded by:	Maxim Konovalov <maxim@macomnet.ru>
MFC after:	3 days
2002-04-10 19:30:58 +00:00
Mike Barcroft
13c3fcc238 Unconditionalize the definition of INET_ADDRSTRLEN and
INET6_ADDRSTRLEN.  Doing this helps expose bogus redefinitions in 3rd
party software.
2002-04-10 11:59:02 +00:00
Brian Somers
6ce6e2be71 Remove the code that masks an EEXIST returned from rtinit() when
calling ioctl(SIOC[AS]IFADDR).

This allows the following:

  ifconfig xx0 inet 1.2.3.1 netmask 0xffffff00
  ifconfig xx0 inet 1.2.3.17 netmask 0xfffffff0 alias
  ifconfig xx0 inet 1.2.3.25 netmask 0xfffffff8 alias
  ifconfig xx0 inet 1.2.3.26 netmask 0xffffffff alias

but would (given the above) reject this:

  ifconfig xx0 inet 1.2.3.27 netmask 0xfffffff8 alias

due to the conflicting netmasks.  I would assert that it's wrong
to mask the EEXIST returned from rtinit() as in the above scenario, the
deletion of the 1.2.3.25 address will leave the 1.2.3.27 address
as unroutable as it was in the first place.

Offered for review on: -arch, -net
Discussed with: stephen macmanus <stephenm@bayarea.net>
MFC after: 3 weeks
2002-04-10 01:42:44 +00:00
Brian Somers
5a43847d1c Don't add host routes for interface addresses of 0.0.0.0/8 -> 0.255.255.255.
This change allows bootp to work with more than one interface, at the
expense of some rather ``wrong'' looking code.  I plan to MFC this in
place of luigi's recent #ifdef BOOTP stuff that was committed to this
file in -stable, as that's slightly more wrong that this is.

Offered for review on: -arch, -net
MFC after: 2 weeks
2002-04-10 01:42:32 +00:00
John Baldwin
ad278afdf0 Change the first argument of prison_xinpcb() to be a thread pointer instead
of a proc pointer so that prison_xinpcb() can use td_ucred.
2002-04-09 20:04:10 +00:00
Mike Silbersack
c3b2fe55ba Update comments to reflect the recent ephemeral port range
change.

Noticed by:	ru
MFC After:	1 day
2002-04-09 18:01:26 +00:00
Matthew N. Dodd
b0c570df3f Retire this copy; it now lives in sys/net/fddi.h. 2002-04-05 19:24:38 +00:00
John Baldwin
6008862bc2 Change callers of mtx_init() to pass in an appropriate lock type name. In
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.

Tested on:	i386, alpha, sparc64
2002-04-04 21:03:38 +00:00
John Baldwin
44731cab3b Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API.  The entire API now consists of two functions
similar to the pre-KSE API.  The suser() function takes a thread pointer
as its only argument.  The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0.  The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on:	smp@
2002-04-01 21:31:13 +00:00
Mike Barcroft
8822d3fb83 o Implement <sys/_types.h>, a new header for storing types that are
MI, not required to be a fixed size, and used in multiple headers.
  This will grow in time, as more things move here from <sys/types.h>
  and <machine/ansi.h>.
o Add missing type definitions (uint16_t and uint32_t) to
  <arpa/inet.h> and <netinet/in.h>.
o Reduce pollution in <sys/types.h> by using `#if _FOO_T_DECLARED'
  widgets to avoid including <sys/stdint.h>.
o Add some missing type definitions to <unistd.h> and note the ones
  that still need to be added.
o Make use of <sys/_types.h> primitives in <grp.h> and <sys/types.h>.

Reviewed by:	bde
2002-04-01 08:12:25 +00:00
Bruce Evans
c1cd65bae8 Fixed some style bugs in the removal of __P(()). Continuation lines
were not outdented to preserve non-KNF lining up of code with parentheses.
Switch to KNF formatting.
2002-03-24 10:19:10 +00:00
Robert Watson
29dc1288b0 Merge from TrustedBSD MAC branch:
Move the network code from using cr_cansee() to check whether a
    socket is visible to a requesting credential to using a new
    function, cr_canseesocket(), which accepts a subject credential
    and object socket.  Implement cr_canseesocket() so that it does a
    prison check, a uid check, and add a comment where shortly a MAC
    hook will go.  This will allow MAC policies to seperately
    instrument the visibility of sockets from the visibility of
    processes.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-22 19:57:41 +00:00
Ruslan Ermilov
e3f406b3c1 Prevent icmp_reflect() from calling ip_output() with a NULL route
pointer which will then result in the allocated route's reference
count never being decremented.  Just flood ping the localhost and
watch refcnt of the 127.0.0.1 route with netstat(1).

Submitted by:	jayanth

Back out ip_output.c,v 1.143 and ip_mroute.c,v 1.69 that allowed
ip_output() to be called with a NULL route pointer.  The previous
paragraph shows why this was a bad idea in the first place.

MFC after:	0 days
2002-03-22 16:45:54 +00:00
Mike Silbersack
9e5a5ed4c5 Change the ephemeral port range from 1024-5000 to 49152-65535.
This increases the number of concurrent outgoing connections from ~4000
to ~16000.  Other OSes (Solaris, OS X, NetBSD) and many other NAT
products have already made this change without ill effects, so we
should not run into any problems.

MFC after:	1 week
2002-03-22 03:28:11 +00:00
Orion Hodson
f0f3379ed5 Send periodic ARP requests when ARP entries for hosts we are sending
to are about to expire.  This prevents high packet rate flows from
experiencing packet drops at the sender following ARP cache entry
timeout.

PR:		kern/25517
Reviewed by:	luigi
MFC after:	7 days
2002-03-20 15:56:36 +00:00
Jeff Roberson
69c2d429c1 Switch vm_zone.h with uma.h. Change over to uma interfaces. 2002-03-20 05:48:55 +00:00
Alfred Perlstein
4d77a549fe Remove __P. 2002-03-19 21:25:46 +00:00
Jeff Roberson
8355f576a9 This is the first part of the new kernel memory allocator. This replaces
malloc(9) and vm_zone with a slab like allocator.

Reviewed by:	arch@
2002-03-19 09:11:49 +00:00
Robert Watson
16aae019a0 NAI DBA update 2002-03-14 16:53:39 +00:00
Mike Barcroft
6a6230d2f6 o Add INET_ADDRSTRLEN and INET6_ADDRSTRLEN defines to <arpa/inet.h>
for POSIX.1-2001 conformance.
o Add magic to <netinet/in.h> and <netinet6/in6.h> to prevent
  redefining INET_ADDRSTRLEN and INET6_ADDRSTRLEN.
o Add a note about missing typedefs in <arpa/inet.h>.
2002-03-10 06:42:27 +00:00
Mike Barcroft
d846855da8 o Don't require long long support in bswap64() functions.
o In i386's <machine/endian.h>, macros have some advantages over
  inlines, so change some inlines to macros.
o In i386's <machine/endian.h>, ungarbage collect word_swap_int()
  (previously __uint16_swap_uint32), it has some uses on i386's with
  PDP endianness.

Submitted by:	bde

o Move a comment up in <machine/endian.h> that was accidentially moved
  down a few revisions ago.
o Reenable userland's use of optimized inline-asm versions of
  byteorder(3) functions.
o Fix ordering of prototypes vs. redefinition of byteorder(3)
  functions, so that the non-GCC (libc asm) case has proper
  prototypes.
o Add proper prototypes for byteorder(3) functions in <sys/param.h>.
o Prevent redundant duplicate prototypes by making use of the
  _BYTEORDER_PROTOTYPED define.
o Move the bswap16(), bswap32(), bswap64() C functions into MD space
  for platforms in which asm versions don't exist.  This significantly
  reduces the complexity of some things at the cost of duplicate code.

Reviewed by:	bde
2002-03-09 21:02:16 +00:00
Hajimu UMEMOTO
b7d6d9526c - Set inc_isipv6 in tcp6_usr_connect().
- When making a pcb from a sync cache, do not forget to copy inc_isipv6.

Obtained from:	KAME
MFC After:	1 week
2002-02-28 17:11:10 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
Crist J. Clark
93ec91ba6d Change the wording of the inline comments from the previous commit.
Objection from:	ru
2002-02-27 13:52:06 +00:00
Alfred Perlstein
60d9b32e44 More IPV6 const fixes. 2002-02-27 05:11:50 +00:00
Dima Dorfman
76183f3453 Introduce a version field to `struct xucred' in place of one of the
spares (the size of the field was changed from u_short to u_int to
reflect what it really ends up being).  Accordingly, change users of
xucred to set and check this field as appropriate.  In the kernel,
this is being done inside the new cru2x() routine which takes a
`struct ucred' and fills out a `struct xucred' according to the
former.  This also has the pleasant sideaffect of removing some
duplicate code.

Reviewed by:	rwatson
2002-02-27 04:45:37 +00:00
Brooks Davis
673623b376 Staticize an extern that no one else used. 2002-02-26 18:24:00 +00:00
Chris D. Faulhaber
546f251b29 Enforce inbound IPsec SPD
Reviewed by:	fenner
2002-02-26 02:11:13 +00:00
Alfred Perlstein
6d53e16389 Document what inpcb->inp_vflag is for.
Submitted by: Marco Molteni <molter@tin.it>
2002-02-25 09:41:43 +00:00
Crist J. Clark
2ca2159f22 The TCP code did not do sufficient checks on whether incoming packets
were destined for a broadcast IP address. All TCP packets with a
broadcast destination must be ignored. The system only ignored packets
that were _link-layer_ broadcasts or multicast. We need to check the
IP address too since it is quite possible for a broadcast IP address
to come in with a unicast link-layer address.

Note that the check existed prior to CSRG revision 7.35, but was
removed. This commit effectively backs out that nine-year-old change.

PR:		misc/35022
2002-02-25 08:29:21 +00:00
Luigi Rizzo
ca462bcfcb BUGFIX: make use of the pointer to the target of skipto rules,
so that after the first time we can follow the pointer instead
of having to scan the list.
This was the intended behaviour from day one.

PR: 34639
MFC-after: 3 days
2002-02-20 17:15:57 +00:00
Jonathan Lemon
6b33ceb8e3 When expanding a syncache entry into a socket, inherit the socket options
from the current listen socket instead of the cached (and possibly stale)
TCB pointer.
2002-02-20 16:47:11 +00:00
Mike Barcroft
fd8e4ebc8c o Move NTOHL() and associated macros into <sys/param.h>. These are
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
  source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
  Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
  POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
  and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
  complexities associated with having MD (asm and inline) versions, and
  having to prevent exposure of these functions in other headers that
  happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
  third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.

Tested on:	alpha, i386
Reviewed by:	bde, jake, tmm
2002-02-18 20:35:27 +00:00
Ruslan Ermilov
51c8ec4a3d Moved the 127/8 check below so that IPF redirects have a chance of working.
MFC after:	1 day
2002-02-15 12:19:03 +00:00
Jonathan Lemon
0cab7c4b08 When a duplicate SYN arrives which matches an entry in the syncache,
update our lazy reference to the inpcb structure, as it may have changed.

Found by: dima
2002-02-12 02:03:50 +00:00
Dima Dorfman
364efeccb0 Silence unused variable warning in the !KLD_MODULE case.
Submitted by:	archie
2002-02-10 22:22:05 +00:00
Julian Elischer
079b7badea Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
2002-02-07 20:58:47 +00:00
Hajimu UMEMOTO
c5993889d8 In tcp_respond(), correctly reset returned IPv6 header. This is essential
when the original packet contains an IPv6 extension header.

Obtained from:	KAME
MFC after:	1 week
2002-02-04 17:37:06 +00:00
Mark Murray
6ebd73f0f6 WARNS=n and lint(1) silencer. Declare an array of (const) strings
as const char.
2002-02-03 11:57:32 +00:00
Crist J. Clark
56962689cc The ipfw(8) 'tee' action simply hasn't worked on incoming packets for
some time. _All_ packets, regardless of destination, were accepted by
the machine as if addressed to it.

Jump back to 'pass' processing for a teed packet instead of falling
through as if it was ours.

PR:		kern/31130
Reviewed by:	-net, luigi
MFC after:	2 weeks
2002-01-26 10:14:08 +00:00
Jonathan Lemon
d9b7cc1c8d The ENDPTS_EQ macro was comparing the one of the fports to itself. Fix.
Submitted by: emy@boostworks.com
2002-01-22 17:54:28 +00:00
Hajimu UMEMOTO
a4a6e77341 - Check the address family of the destination cached in a PCB.
- Clear the cached destination before getting another cached route.
  Otherwise, garbage in the padding space (which might be filled in if it was
  used for IPv4) could annoy rtalloc.

Obtained from:	KAME
2002-01-21 20:04:22 +00:00
Ruslan Ermilov
8c3f5566ae RFC1122 requires that addresses of the form { 127, <any> } MUST NOT
appear outside a host.

PR:		30792, 33996
Obtained from:	ip_input.c
MFC after:	1 week
2002-01-21 13:59:42 +00:00
Ruslan Ermilov
84caf00847 Fix a panic condition in icmp_reflect() introduced in rev. 1.61.
(We should be able to handle locally originated IP packets, and
these do not have m_pkthdr.rcvif set.)

PR:		kern/32806, kern/33766
Reviewed by:	luigi
Fix tested by:	Maxim Konovalov <maxim@macomnet.ru>,
		Erwin Lansing <erwin@lansing.dk>
2002-01-11 12:13:57 +00:00
Mike Smith
bedbd47e6a Initialise the intrq_present fields at runtime, not link time. This allows
us to load protocols at runtime, and avoids the use of common variables.

Also fix the ip6_intrq assignment so that it works at all.
2002-01-08 10:34:03 +00:00
Crist J. Clark
db38d34abb Fix a missing "ipfw:" in a syslog message.
MFC after:	1 day
2002-01-07 07:12:09 +00:00
Bill Fenner
92bdb2fa39 Pre-calculate the checksum for multicast packets sourced on a
multicast router.  This is overkill; it should be possible to
delay to hardware interfaces and only pre-calculate when forwarding
to a tunnel.
2002-01-05 18:23:53 +00:00
Robert Watson
e6658b129e o Spelling fix in comment: tcp_ouput -> tcp_output 2002-01-04 17:21:27 +00:00
Yaroslav Tykhiy
d0ebc0d2f1 Don't reveal a router in the IPSTEALTH mode through IP options.
The following steps are involved:
a) the IP options related to routing (LSRR and SSRR) are processed
   as though the router were a host,
b) the other IP options are processed as usual only if the packet
   is destined for the router; otherwise they are ignored.

PR:		kern/23123
Discussed in:	freebsd-hackers
2001-12-29 09:24:18 +00:00
Julian Elischer
3efc30142c Fix ipfw fwd so that it acts as the docs say
when forwarding an incoming packet to another machine.

Obtained from:	Vicor Production tree
MFC after: 3 weeks
2001-12-28 21:21:57 +00:00
Yaroslav Tykhiy
37b5d6e33d Implement matching IP precedence in ipfw(4).
Submitted by:	Igor Timkin <ivt@gamma.ru>
2001-12-21 18:43:02 +00:00
Jonathan Lemon
6c19b85f43 Remove a change that snuck in from my private tree. 2001-12-21 05:07:39 +00:00
Jonathan Lemon
45a0329051 If syncookies are disabled (net.inet.tcp.syncookies) then use the faster
arc4random() routine to generate ISNs instead of creating them with MD5().

Suggested by: silby
2001-12-21 04:41:08 +00:00
Jonathan Lemon
e579ba1aea When storing an int value in a void *, use intptr_t as the cast type
(instead of int) to keep the 64 bit platforms happy.
2001-12-19 15:57:43 +00:00
Yaroslav Tykhiy
3f9e31220b Don't try to free a NULL route when doing IPFIREWALL_FORWARD.
An old route will be NULL at that point if a packet were initially
routed to an interface (using the IP_ROUTETOIF flag.)

Submitted by:	Igor Timkin <ivt@gamma.ru>
2001-12-19 14:54:13 +00:00
Jonathan Lemon
a9c9684163 Extend the SYN DoS defense by adding syncookies to the syncache.
All TCP ISNs that are sent out are valid cookies, which allows entries
in the syncache to be dropped and still have the ACK accepted later.
As all entries pass through the syncache, there is no sudden switchover
from cache -> cookies when the cache is full; instead, syncache entries
simply have a reduced lifetime.  More details may be found in the
"Resisting DoS attacks with a SYN cache" paper in the Usenix BSDCon 2002
conference proceedings.

Sponsored by: DARPA, NAI Labs
2001-12-19 06:12:14 +00:00
Ruslan Ermilov
4aa5d00e3d Fixed the bug in transparent TCP proxying with the "encode_ip_hdr"
option -- TcpAliasOut() did not catch the IP header length change.

Submitted by:	Stepachev Andrey <aka50@mail.ru>
2001-12-18 16:13:45 +00:00
Robert Watson
365979cdac o Add IPOPT_ESO for the 'Extended Security' IP option (RFC1108)
Obtained from:	TrustedBSD Project
2001-12-14 19:37:32 +00:00
Robert Watson
18e2b6a995 o Add definition for IPOPT_CIPSO, the commercial security IP option
number.

Submitted by:	Ilmar S. Habibulin <ilmar@watson.org>
Obtained from:	TrustedBSD Project
2001-12-14 19:34:42 +00:00
Jonathan Lemon
aa1f5daa31 whitespace and style fixes recovered from -stable. 2001-12-14 19:34:11 +00:00
Jonathan Lemon
6f00486cfd minor style and whitespace fixes. 2001-12-14 19:33:29 +00:00
Jonathan Lemon
effa274e9e whitespace fixes. 2001-12-14 19:32:47 +00:00
Jonathan Lemon
f8b6a631a2 minor whitespace fixes. 2001-12-14 19:32:00 +00:00
Mike Silbersack
3260eb18f3 Reduce the local network slowstart flightsize from infinity to 4 packets.
Now that we've increased the size of our send / receive buffers, bursting
an entire window onto the network may cause congestion.  As a result,
we will slow start beginning with a flightsize of 4 packets.

Problem reported by: Thomas Zenker <thz@Lennartz-electronic.de>

MFC after:	3 days
2001-12-14 18:26:52 +00:00
Jonathan Lemon
04cad5adb1 Undo one of my last minute changes; move sc_iss up earlier so it
is initialized in case we take the T/TCP path.
2001-12-13 04:05:26 +00:00
Jonathan Lemon
7c183182bf Fix up tabs from cut&n&paste. 2001-12-13 04:02:31 +00:00
Jonathan Lemon
0ef3206bf5 Fix up tabs in comments. 2001-12-13 04:02:09 +00:00
Jonathan Lemon
eaa6d8efe5 Minor style fixes. 2001-12-13 04:01:23 +00:00
Jonathan Lemon
c448c89c59 Minor style fix. 2001-12-13 04:01:01 +00:00
David E. O'Brien
6e551fb628 Update to C99, s/__FUNCTION__/__func__/,
also don't use ANSI string concatenation.
2001-12-10 08:09:49 +00:00
Robert Watson
c39a614e0d o Our currenty userland boot code (due to rc.conf and rc.network) always
enables TCP keepalives using the net.inet.tcp.always_keepalive by default.
  Synchronize the kernel default with the userland default.
2001-12-07 17:01:28 +00:00
Ruslan Ermilov
47891de1a5 Fixed remotely exploitable DoS in arpresolve().
Easily exploitable by flood pinging the target
host over an interface with the IFF_NOARP flag
set (all you need to know is the target host's
MAC address).

MFC after:	0 days
2001-12-05 18:13:34 +00:00
Robert Watson
011376308f o Introduce pr_mtx into struct prison, providing protection for the
mutable contents of struct prison (hostname, securelevel, refcount,
  pr_linux, ...)
o Generally introduce mtx_lock()/mtx_unlock() calls throughout kern/
  so as to enforce these protections, in particular, in kern_mib.c
  protection sysctl access to the hostname and securelevel, as well as
  kern_prot.c access to the securelevel for access control purposes.
o Rewrite linux emulator abstractions for accessing per-jail linux
  mib entries (osname, osrelease, osversion) so that they don't return
  a pointer to the text in the struct linux_prison, rather, a copy
  to an array passed into the calls.  Likewise, update linprocfs to
  use these primitives.
o Update in_pcb.c to always use prison_getip() rather than directly
  accessing struct prison.

Reviewed by:	jhb
2001-12-03 16:12:27 +00:00
Matthew Dillon
262c1c1a4e Fix a bug with transmitter restart after receiving a 0 window. The
receiver was not sending an immediate ack with delayed acks turned on
when the input buffer is drained, preventing the transmitter from
restarting immediately.

Propogate the TCP_NODELAY option to accept()ed sockets.  (Helps tbench and
is a good idea anyway).

Some cleanup.  Identify additonal issues in comments.

MFC after:	1 day
2001-12-02 08:49:29 +00:00
Ruslan Ermilov
04d59553b2 Allow for ip_output() to be called with a NULL route pointer.
This fixes a panic I introduced yesterday in ip_icmp.c,v 1.64.
2001-12-01 13:48:16 +00:00
Mike Barcroft
de2656d0ed o Stop abusing MD headers with non-MD types.
o Hide nonstandard functions and types in <netinet/in.h> when
  _POSIX_SOURCE is defined.
o Add some missing types (required by POSIX.1-200x) to <netinet/in.h>.
o Restore vendor ID from Rev 1.1 in <netinet/in.h> and make use of new
  __FBSDID() macro.
o Fix some miscellaneous issues in <arpa/inet.h>.
o Correct final argument for the inet_ntop() function (POSIX.1-200x).
o Get rid of the namespace pollution from <sys/types.h> in
  <arpa/inet.h>.

Reviewed by:		fenner
Partially submitted by:	bde
2001-12-01 03:43:01 +00:00
Matthew Dillon
d912c694ee The transmit burst limit for newreno completely breaks TCP's performance
if the receive side is using delayed acks.  Temporarily remove it.

MFC after:	0 days
2001-11-30 21:33:39 +00:00
Brian Somers
0f02fdac67 During SIOCAIFADDR, if in_ifinit() fails and we've already added an
interface address, blow the address away again before returning the
error.

In in_ifinit(), if we get an error from rtinit() and we've also got
a destination address, return the error rather than masking EEXISTS.
Failing to create a host route when configuring an interface should
be treated as an error.
2001-11-30 14:00:55 +00:00
Ruslan Ermilov
bd7142087b - Make ip_rtaddr() global, and use it to look up the correct source
address in icmp_reflect().
- Two new "struct icmpstat" members: icps_badaddr and icps_noroute.

PR:		kern/31575
Obtained from:	BSD/OS
MFC after:	1 week
2001-11-30 10:40:28 +00:00
Dima Dorfman
3a33b1b3b7 ipfw_modevent(): Don't use an unnatural block to define a variable
(fcp) that's already defined in the outer block and isn't used
anywhere else.  This silences -Wunused.

Reviewed by:	md5(1)
2001-11-27 20:32:47 +00:00
Dima Dorfman
e8d41815df Remove debugging printfs that weren't conditional on any debugging
options in handling MOD_{UN,}LOAD (they weren't very useful, anyway).
2001-11-27 20:28:48 +00:00
Dima Dorfman
0d4bef5dd4 In icmp_reflect(): If the packet was not addressed to us and was
received on an interface without an IP address, try to find a
non-loopback AF_INET address to use.  If that fails, drop it.
Previously, we used the address at the top of the in_ifaddrhead list,
which didn't make much sense, and would cause a panic if there were no
AF_INET addresses configured on the system.

PR:		29337, 30524
Reviewed by:	ru, jlemon
Obtained from:	NetBSD
2001-11-27 19:58:09 +00:00
Robert Watson
38e04732fc Add include of net/route.h, as structures moved around due to the
syncache rely on 'struct route' being defined.  This fixes the
LINT build some.
2001-11-27 17:36:39 +00:00
Seigo Tanimura
df89626872 Clear a new syncache entry first, followed by filling in values. This
fixes route breakage due to uncleared gabage on my box.
2001-11-27 11:55:28 +00:00
Ruslan Ermilov
8573e68110 When servicing an internal FTP server, punch ipfirewall(4) holes
for passive mode data connections (PASV/EPSV -> 227/229).  Well,
the actual punching happens a bit later, when the aliasing link
becomes fully specified.

Prodded by:	Danny Carroll <dannycarroll@hotmail.com>
MFC after:	1 week
2001-11-27 10:50:23 +00:00
Ruslan Ermilov
8ba0396688 Restore the ability to use IP_FW_ADD with setsockopt(2) that got
broken in revision 1.86.  This broke natd(8)'s -punch_fw option.

Reported by:	Daniel Rock <D.Rock@t-online.de>,
		setantae <setantae@submonkey.net>
2001-11-26 10:05:58 +00:00
Bruce Evans
419d3454b1 Fixed a buffer overrun. In my kernel configuration, tcp_syncache happens
to be followed by nfsnodehashtbl, so bzeroing callouts beyond the end of
tcp_syncache soon caused a null pointer panic when nfsnodehashtbl was
accessed.
2001-11-23 12:31:27 +00:00
Jonathan Lemon
be2ac88c59 Introduce a syncache, which enables FreeBSD to withstand a SYN flood
DoS in an improved fashion over the existing code.

Reviewed by: silby  (in a previous iteration)
Sponsored by: DARPA, NAI Labs
2001-11-22 04:50:44 +00:00
Jonathan Lemon
d00fd2011d Move initialization of snd_recover into tcp_sendseqinit(). 2001-11-21 18:45:51 +00:00
Matthew Dillon
b1e4abd246 Give struct socket structures a ref counting interface similar to
vnodes.  This will hopefully serve as a base from which we can
expand the MP code.  We currently do not attempt to obtain any
mutex or SX locks, but the door is open to add them when we nail
down exactly how that part of it is going to work.
2001-11-17 03:07:11 +00:00
Robert Watson
ce17880650 o Replace reference to 'struct proc' with 'struct thread' in 'struct
sysctl_req', which describes in-progress sysctl requests.  This permits
  sysctl handlers to have access to the current thread, permitting work
  on implementing td->td_ucred, migration of suser() to using struct
  thread to derive the appropriate ucred, and allowing struct thread to be
  passed down to other code, such as network code where td is not currently
  available (and curproc is used).

o Note: netncp and netsmb are not updated to reflect this change, as they
  are not currently KSE-adapted.

Reviewed by:		julian
Obtained from:	TrustedBSD Project
2001-11-08 02:13:18 +00:00
Andrew R. Reiter
83103a7397 - Fixes non-zero'd out sin_zero field problem so that the padding
is used as it is supposed to be.

Inspired by: PR #31704
Approved by: jdp
Reviewed by: jhb, -net@
2001-11-06 00:48:01 +00:00
Poul-Henning Kamp
d3c64689d8 3.5 years ago Wollman wrote:
"[...] and removes the hostcache code from standard kernels---the
   code that depends on it is not going to happen any time soon,
   I'm afraid."
Time to clean up.
2001-11-05 21:25:02 +00:00
Luigi Rizzo
7b109fa404 MFS: sync the ipfw/dummynet/bridge code with the one recently merged
into stable (mostly , but not only, formatting and comments changes).
2001-11-04 22:56:25 +00:00
Luigi Rizzo
09b2ca212b s/FREE/free/ 2001-11-04 17:35:31 +00:00
Brian Somers
e83aaae350 cmott@scientech.com -> cm@linktel.net
Requested by:	Charles Mott <cmott@scientech.com>
2001-11-03 11:34:09 +00:00
Bill Paul
3528d68f71 Fix a (long standing?) bug in ip_output(): if ip_insertoptions() is
called and ip_output() encounters an error and bails (i.e. host
unreachable), we will leak an mbuf. This is because the code calls
m_freem(m0) after jumping to the bad: label at the end of the function,
when it should be calling m_freem(m). (m0 is the original mbuf list
_without_ the options mbuf prepended.)

Obtained from:	NetBSD
2001-10-30 18:15:48 +00:00
Dag-Erling Smørgrav
bc183b3fe8 Make sure the netmask always has an address family. This fixes Linux
ifconfig, which expects the address returned by the SIOCGIFNETMASK ioctl
to have a valid sa_family.  Similar changes may be necessary for IPv6.

While we're here, get rid of an unnecessary temp variable.

MFC after:	2 weeks
2001-10-30 15:57:20 +00:00
Jonathan Lemon
35609d458d When dropping a packet because there is no room in the queue (which itself
is somewhat bogus), update the statistics to indicate something was dropped.

PR: 13740
2001-10-30 14:58:27 +00:00
Josef Karthauser
06dae58b17 A few more style changes picked up whilst working on an MFC to -stable. 2001-10-29 15:09:07 +00:00
Josef Karthauser
f227535cd8 Fix some whitespace, and a comment that I missed in the last commit. 2001-10-29 14:08:51 +00:00
Josef Karthauser
25549c009a Clean up the style of this header file. 2001-10-29 04:41:28 +00:00
Matthew Dillon
2326da5db5 fix int argument used in printf w/ %ld (cast to long) 2001-10-29 02:19:19 +00:00
Jonathan Lemon
0751407193 Don't use the ip_timestamp structure to access timestamp options, as the
compiler may cause an unaligned access to be generated in some cases.

PR: 30982
2001-10-25 06:27:51 +00:00
Jonathan Lemon
ec691a10e6 If we are bridging, fall back to using any inet address in the system,
irrespective of receive interface, as a last resort.

Submitted by: ru
2001-10-25 06:14:21 +00:00
Jonathan Lemon
807b8338ba Relocate the KASSERT for a null recvif to a location where it will
actually do some good.

Pointed out by: ru
2001-10-25 05:56:30 +00:00
Hajimu UMEMOTO
cb34210012 restore the data of the ip header when extended udp header and data checksum
is calculated.  this caused some trouble in the code which the ip header
is not modified.  for example, inbound policy lookup failed.

Obtained from:	KAME
MFC after:	1 week
2001-10-22 12:43:30 +00:00
Jonathan Lemon
d8b84d9e07 Only examine inet addresses of the interface. This was broken in r1.83,
with the result that the system would reply to an ARP request of 0.0.0.0
2001-10-20 05:14:06 +00:00
Ruslan Ermilov
8071913df2 Pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2.
Have sys/net/route.c:rtrequest1(), which takes ``rt_addrinfo *''
as the argument.  Pass rt_addrinfo all the way down to rtrequest1
and ifa->ifa_rtrequest.  3rd argument of ifa->ifa_rtrequest is now
``rt_addrinfo *'' instead of ``sockaddr *'' (almost noone is
using it anyways).

Benefit: the following command now works.  Previously we needed
two route(8) invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

Remove unsafe typecast in rtrequest(), from ``rtentry *'' to
``sockaddr *''.  It was introduced by 4.3BSD-Reno and never
corrected.

Obtained from:	BSD/OS, NetBSD
MFC after:	1 month
PR:		kern/28360
2001-10-17 18:07:05 +00:00
Max Khon
322dcb8d3d bring in ARP support for variable length link level addresses
Reviewed by:	jdp
Approved by:	jdp
Obtained from:	NetBSD
MFC after:	6 weeks
2001-10-14 20:17:53 +00:00
Robert Watson
8a7d8cc675 - Combine kern.ps_showallprocs and kern.ipc.showallsockets into
a single kern.security.seeotheruids_permitted, describes as:
  "Unprivileged processes may see subjects/objects with different real uid"
  NOTE: kern.ps_showallprocs exists in -STABLE, and therefore there is
  an API change.  kern.ipc.showallsockets does not.
- Check kern.security.seeotheruids_permitted in cr_cansee().
- Replace visibility calls to socheckuid() with cr_cansee() (retain
  the change to socheckuid() in ipfw, where it is used for rule-matching).
- Remove prison_unpcb() and make use of cr_cansee() against the UNIX
  domain socket credential instead of comparing root vnodes for the
  UDS and the process.  This allows multiple jails to share the same
  chroot() and not see each others UNIX domain sockets.
- Remove unused socheckproc().

Now that cr_cansee() is used universally for socket visibility, a variety
of policies are more consistently enforced, including uid-based
restrictions and jail-based restrictions.  This also better-supports
the introduction of additional MAC models.

Reviewed by:	ps, billf
Obtained from:	TrustedBSD Project
2001-10-09 21:40:30 +00:00
Jayanth Vijayaraghavan
c24d5dae7a Add a flag TF_LASTIDLE, that forces a previously idle connection
to send all its data, especially when the data is less than one MSS.
This fixes an issue where the stack was delaying the sending
of data, eventhough there was enough window to send all the data and
the sending of data was emptying the socket buffer.

Problem found by Yoshihiro Tsuchiya (tsuchiya@flab.fujitsu.co.jp)

Submitted by: Jayanth Vijayaraghavan
2001-10-05 21:33:38 +00:00
Paul Saab
4787fd37af Only allow users to see their own socket connections if
kern.ipc.showallsockets is set to 0.

Submitted by:	billf (with modifications by me)
Inspired by:	Dave McKay (aka pm aka Packet Magnet)
Reviewed by:	peter
MFC after:	2 weeks
2001-10-05 07:06:32 +00:00
Paul Saab
db69a05dce Make it so dummynet and bridge can be loaded as modules.
Submitted by:	billf
2001-10-05 05:45:27 +00:00
Jonathan Lemon
22c819a73a in_ifinit apparently can be used to rewrite an ip address; recalculate
the correct hash bucket for the entry.

Submitted by: iedowse  (with some munging by me)
2001-10-01 18:07:08 +00:00
Luigi Rizzo
cc33247e33 Fix a problem with unnumbered rules introduced in latest commit.
Reported by: des
2001-10-01 17:35:54 +00:00
Ruslan Ermilov
32eef9aeb1 mdoc(7) police: Use the new .In macro for #include statements. 2001-10-01 16:09:29 +00:00
Matthew Dillon
e2505aa676 Add __FBSDID's to libalias 2001-09-30 21:03:33 +00:00
Jonathan Lemon
01a5f19070 Nuke unused (and incorrect) #define of INADDR_HMASK.
Spotted by: ru
2001-09-29 14:59:20 +00:00
Jonathan Lemon
a931d7ed29 Make the INADDR_TO_IFP macro use the IP address hash lookup instead of
walking the entire list of IP addresses.

Pointed out by: bfumerola
2001-09-29 06:16:02 +00:00
Jonathan Lemon
ca925d9c17 Add a hash table that contains the list of internet addresses, and use
this in place of the in_ifaddr list when appropriate.  This improves
performance on hosts which have a large number of IP aliases.
2001-09-29 04:34:11 +00:00
Jonathan Lemon
9a10980e2a Centralize satosin(), sintosa() and ifatoia() macros in <netinet/in.h>
Remove local definitions.
2001-09-29 03:23:44 +00:00
Luigi Rizzo
830cc17841 Two main changes here:
+ implement "limit" rules, which permit to limit the number of sessions
   between certain host pairs (according to masks). These are a special
   type of stateful rules, which might be of interest in some cases.
   See the ipfw manpage for details.

 + merge the list pointers and ipfw rule descriptors in the kernel, so
   the code is smaller, faster and more readable. This patch basically
   consists in replacing "foo->rule->bar" with "rule->bar" all over
   the place.
   I have been willing to do this for ages!

MFC after: 1 week
2001-09-27 23:44:27 +00:00
Luigi Rizzo
832d32eb5b Remove unused (and duplicate) struct ip_opts which is never used,
not referenced in Stevens, and does not compile with g++.
There is an equivalent structure, struct ipoption in ip_var.h
which is actually used in various parts of the kernel, and also referenced
in Stevens.

Bill Fenner also says:
... if you want the trivia, struct ip_opts was introduced
in in.h SCCS revision 7.9, on 6/28/1990, by Mike Karels.
struct ipoption was introduced in ip_var.h SCCS revision 6.5,
on 9/16/1985, by... Mike Karels.

MFC-after: 3 days
2001-09-27 11:53:22 +00:00
Brooks Davis
49c024e373 Include sys/proc.h for the definition of securelevel_ge().
Submitted by:	LINT
2001-09-26 21:53:20 +00:00
Robert Watson
785f9ffca3 o Modify IPFW and DUMMYNET administrative setsockopt() calls to use
securelevel_gt() to check the securelevel, rather than direct access
  to the securelevel variable.

Obtained from:	TrustedBSD Project
2001-09-26 19:58:29 +00:00
Brooks Davis
9494d5968f Make faith loadable, unloadable, and clonable. 2001-09-25 18:40:52 +00:00
Luigi Rizzo
078156d09d Fix a null pointer dereference introduced in the last commit, plus
remove a useless assignment and move a comment.

Submitted by: Thomas Moestl
2001-09-24 05:24:19 +00:00
Ruslan Ermilov
c1dd00f75c Fixed the bug that prevented communication with FTP servers behind
NAT in extended passive mode if the server's public IP address was
different from the main NAT address.  This caused a wrong aliasing
link to be created that did not route the incoming packets back to
the original IP address of the server.

	natd -v -n pub0 -redirect_address localFTP publicFTP

Note that even if localFTP == publicFTP, one still needs to supply
the -redirect_address directive.  It is needed as a helper because
extended passive mode's 229 reply does not contain the IP address.

MFC after:	1 week
2001-09-21 14:38:36 +00:00
Robert Watson
94088977c9 o Rename u_cansee() to cr_cansee(), making the name more comprehensible
in the face of a rename of ucred to cred, and possibly generally.

Obtained from:	TrustedBSD Project
2001-09-20 21:45:31 +00:00
Luigi Rizzo
32f967a3c0 A bunch of minor changes to the code (see below) for readability, code size
and speed. No new functionality added (yet) apart from a bugfix.
MFC will occur in due time and probably in stages.

BUGFIX: fix a problem in old code which prevented reallocation of
the hash table for dynamic rules (there is a PR on this).

OTHER CHANGES: minor changes to the internal struct for static and dynamic rules.
Requires rebuild of ipfw binary.

Add comments to show how data structures are linked together.
(It probably makes no sense to keep the chain pointers separate
from actual rule descriptors. They will be hopefully merged soon.

keep a (sysctl-readable) counter for the number of static rules,
to speed up IP_FW_GET operations

initial support for a "grace time" for expired connections, so we
can set timeouts for closing connections to much shorter times.

merge zero_entry() and resetlog_entry(), they use basically the
same code.

clean up and reduce replication of code for removing rules,
both for readability and code size.

introduce a separate lifetime for dynamic UDP rules.

fix a problem in old code which prevented reallocation of
the hash table for dynamic rules (PR ...)

restructure dynamic rule descriptors

introduce some local variables to avoid multiple dereferencing of
pointer chains (reduces code size and hopefully increases speed).
2001-09-20 13:52:49 +00:00
Munechika SUMIKAWA
862e52ea61 Fixed comment: ipip_input -> mroute_encapcheck.
Reported by:	bde
2001-09-20 07:59:45 +00:00
Munechika SUMIKAWA
33ae84b7c6 Removed ipip_input(). No codes calls it anymore due to ip_encap.c's
encapsulation support.
2001-09-18 14:52:20 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Julian Elischer
aa1489d4fa Remove some un-needed code that was accidentally included in
the 2nd previous KAME patch.

Submitted by:	SUMIKAWA Munechika <sumikawa@ebina.hitachi.co.jp>
2001-09-07 07:24:28 +00:00
Julian Elischer
ff265614c1 Patches from KAME to remove usage of Varargs in existing
IPV4 code. For now they will still have some in the developing stuff (IPv6)

Submitted by:	Keiichi SHIMA / <keiichi@iij.ad.jp>
Obtained from:	KAME
2001-09-07 07:19:12 +00:00
Jonathan Lemon
f9132cebdc Wrap array accesses in macros, which also happen to be lvalues:
ifnet_addrs[i - 1]  -> ifaddr_byindex(i)
        ifindex2ifnet[i]    -> ifnet_byindex(i)

This is intended to ease the conversion to SMPng.
2001-09-06 02:40:43 +00:00
Alfred Perlstein
75ce322136 Fix sysctl comment field, s/the the/then the
Pointed out by: ru
2001-09-04 15:25:23 +00:00
Alfred Perlstein
e3d123d63d Allow disabling of "arp moved" messages.
Submitted by: Stephen Hurd <deuce@lordlegacy.org>
2001-09-03 21:53:15 +00:00
Julian Elischer
4d2c57188f I really hope this is the right answer.
call ip_input directly but take the offset off the
packet first if it's an IPV4 packet encapsulated.
2001-09-03 21:07:31 +00:00
Julian Elischer
7dd66b4ad8 Call ip_input() instead of ipip_input()
when decoding encapsulated ipv4 packets.
(allows line to compile again)
2001-09-03 20:55:35 +00:00
Julian Elischer
2e4f1ee934 One caller of rip_input failed to be converted in the last commit. 2001-09-03 20:40:35 +00:00
Julian Elischer
f0ffb944d2 Patches from Keiichi SHIMA <keiichi@iij.ad.jp>
to make ip use the standard protosw structure again.

Obtained from: Well, KAME I guess.
2001-09-03 20:03:55 +00:00
Jayanth Vijayaraghavan
e7e2b80184 when newreno is turned on, if dupacks = 1 or dupacks = 2 and
new data is acknowledged, reset the dupacks to 0.
The problem was spotted when a connection had its send buffer full
because the congestion window was only 1 MSS and was not being incremented
because dupacks was not reset to 0.

Obtained from:		Yahoo!
2001-08-29 23:54:13 +00:00
Jesper Skriver
3b8123b72c When net.inet.tcp.icmp_may_rst is enabled, report ECONNREFUSED not ENETRESET
to the application as a RST would, this way we're compatible with the most
applications.

MFC candidate.

Submitted by:	Scott Renfro <scott@renfro.org>
Reviewed by:	Mike Silbersack <silby@silby.com>
2001-08-27 22:10:07 +00:00
Bill Fumerola
52cf11d8a1 the IP_FW_GET code in ip_fw_ctl() sizes a buffer to hold information
about rules and dynamic rules. it later fills this buffer with these
rules.

it also takes the opporunity to compare the expiration of the dynamic
rules with the current time and either marks them for deletion or simply
charges the countdown.

unfortunatly it does this all (the sizing, the buffer copying, and the
expiration GC) with no spl protection whatsoever. it was possible for
the dynamic rule(s) to be ripped out from under the request before it
had completed, resulting in corrupt memory dereferencing.

Reviewed by:	ps
MFC before:	4.4-RELEASE, hopefully.
2001-08-26 10:09:47 +00:00
Dima Dorfman
745bab7f84 Correct a typo in a comment: FIN_WAIT2 -> FIN_WAIT_2
PR:		29970
Submitted by:	Joseph Mallett <jmallett@xMach.org>
2001-08-23 22:34:29 +00:00
Mike Silbersack
b0e3ad758b Much delayed but now present: RFC 1948 style sequence numbers
In order to ensure security and functionality, RFC 1948 style
initial sequence number generation has been implemented.  Barring
any major crypographic breakthroughs, this algorithm should be
unbreakable.  In addition, the problems with TIME_WAIT recycling
which affect our currently used algorithm are not present.

Reviewed by: jesper
2001-08-22 00:58:16 +00:00
Ruslan Ermilov
d86293dbea Added TFTP support.
Submitted by:	Joe Clarke <marcus@marcuscom.com>
MFC after:	2 weeks
2001-08-21 16:25:38 +00:00
Ruslan Ermilov
04c3e33949 Close the "IRC DCC" security breach reported recently on Bugtraq.
Submitted by:	Makoto MATSUSHITA <matusita@jp.FreeBSD.org>
2001-08-21 11:21:08 +00:00
Brian Somers
f68e0a68d8 Make the copyright consistent.
Previously approved by:	Charles Mott <cmott@scientech.com>
2001-08-20 22:57:33 +00:00
Brian Somers
7806546c39 Handle snprintf() returning -1
MFC after: 2 weeks
2001-08-20 12:06:42 +00:00
Julian Elischer
2b6a0c4fcd Make the protoswitch definitiosn checkable in the same way that
cdevsw entries have been for a long time.
Discover that we now have two version sof the same structure.
I will shoot one of them shortly when I figure out why someone thinks
they need it. (And I can prove they don't)
(netinet/ipprotosw.h should GO AWAY)
2001-08-10 23:17:22 +00:00
Ruslan Ermilov
c4d9468ea0 mdoc(7) police:
Avoid using parenthesis enclosure macros (.Pq and .Po/.Pc) with plain text.
Not only this slows down the mdoc(7) processing significantly, but it also
has an undesired (in this case) effect of disabling hyphenation within the
entire enclosed block.
2001-08-07 15:48:51 +00:00
Hajimu UMEMOTO
e43cc4ae36 When running aplication joined multicast address,
removing network card, and kill aplication.
imo_membership[].inm_ifp refer interface pointer
after removing interface.
When kill aplication, release socket,and imo_membership.
imo_membership use already not exist interface pointer.
Then, kernel panic.

PR:		29345
Submitted by:	Inoue Yuichi <inoue@nd.net.fujitsu.co.jp>
Obtained from:	KAME
MFC after:	3 days
2001-08-04 17:10:14 +00:00
Daniel C. Sobral
07203494d2 MFS: Avoid dropping fragments in the absence of an interface address.
Noticed by:	fenner
Submitted by:	iedowse
Not committed to current by:	iedowse ;-)
2001-08-03 17:36:06 +00:00
Peter Wemm
57e119f6f2 Fix a warning. 2001-07-27 00:04:39 +00:00
Peter Wemm
016517247f Patch up some style(9) stuff in tcp_new_isn() 2001-07-27 00:03:49 +00:00
Peter Wemm
92971bd3f1 s/OpemBSD/OpenBSD/ 2001-07-27 00:01:48 +00:00
Hajimu UMEMOTO
13cf67f317 move ipsec security policy allocation into in_pcballoc, before
making pcbs available to the outside world.  otherwise, we will see
inpcb without ipsec security policy attached (-> panic() in ipsec.c).

Obtained from:	KAME
MFC after:	3 days
2001-07-26 19:19:49 +00:00
Bill Fenner
3f2e902a15 Somewhat modernize ip_mroute.c:
- Use sysctl to export stats
- Use ip_encap.c's encapsulation support
- Update lkm to kld (is 6 years a record for a broken module?)
- Remove some unused cruft
2001-07-25 20:15:49 +00:00
Ruslan Ermilov
38c1bc358b Avoid a NULL pointer derefence introduced in rev. 1.129.
Problem noticed by:	bde, gcc(1)
Panic caught by:	mjacob
Patch tested by:	mjacob
2001-07-23 16:50:01 +00:00
Ruslan Ermilov
f2c2962ee5 Backout non-functional changes from revision 1.128.
Not objected to by:	dcs
2001-07-19 07:10:30 +00:00
Daniel C. Sobral
3afefa3924 Skip the route checking in the case of multicast packets with known
interfaces.

Reviewed by:	people at that channel
Approved by:	silence on -net
2001-07-17 18:47:48 +00:00
Ruslan Ermilov
9f81cc840b Backout damage to the INADDR_TO_IFP() macro in revision 1.7.
This macro was supposed to only match local IP addresses of
interfaces, and all consumers of this macro assume this as
well.  (See IP_MULTICAST_IF and IP_ADD_MEMBERSHIP socket
options in the ip(4) manpage.)

This fixes a major security breach in IPFW-based firewalls
where the `me' keyword would match the other end of a P2P
link.

PR:		kern/28567
2001-07-17 10:30:21 +00:00
David E. O'Brien
81e561cdf2 Bump net.inet.tcp.sendspace to 32k and net.inet.tcp.recvspace to 65k.
This should help us in nieve benchmark "tests".

It seems a wide number of people think 32k buffers would not cause major
issues, and is in fact in use by many other OS's at this time.  The
receive buffers can be bumped higher as buffers are hardly used and several
research papers indicate that receive buffers rarely use much space at all.

Submitted by:			Leo Bicknell <bicknell@ufp.org>
				<20010713101107.B9559@ussenterprise.ufp.org>
Agreed to in principle by:	dillon (at the 32k level)
2001-07-13 18:38:04 +00:00
Ruslan Ermilov
a307d59838 mdoc(7) police: removed HISTORY info from the .Os call. 2001-07-10 13:41:46 +00:00
Mike Silbersack
2d610a5028 Temporary feature: Runtime tuneable tcp initial sequence number
generation scheme.  Users may now select between the currently used
OpenBSD algorithm and the older random positive increment method.

While the OpenBSD algorithm is more secure, it also breaks TIME_WAIT
handling; this is causing trouble for an increasing number of folks.

To switch between generation schemes, one sets the sysctl
net.inet.tcp.tcp_seq_genscheme.  0 = random positive increments,
1 = the OpenBSD algorithm.  1 is still the default.

Once a secure _and_ compatible algorithm is implemented, this sysctl
will be removed.

Reviewed by: jlemon
Tested by: numerous subscribers of -net
2001-07-08 02:20:47 +00:00
Brooks Davis
53dab5fe7b gif(4) and stf(4) modernization:
- Remove gif dependencies from stf.
 - Make gif and stf into modules
 - Make gif cloneable.

PR:		kern/27983
Reviewed by:	ru, ume
Obtained from:	NetBSD
MFC after:	1 week
2001-07-02 21:02:09 +00:00
Crist J. Clark
92a99815a8 While in there fixing a fragment logging bug, fix it so we log
fragments "right." Log fragment information tcpdump(8)-style,

   Jul  1 19:38:45 bubbles /boot/kernel/kernel: ipfw: 1000 Accept ICMP:8.0 192.168.64.60 192.168.64.20 in via ep0 (frag 53113:1480@0+)

That is, instead of the old,

  ... Fragment = <offset/8>

Do,

  ... (frag <IP ID>:<data len>@<offset>[+])

PR:		kern/23446
Approved by:	ru
MFC after:	1 week
2001-07-02 15:50:31 +00:00
Ruslan Ermilov
8bf82a92d5 Backout CSRG revision 7.22 to this file (if in_losing notices an
RTF_DYNAMIC route, it got freed twice).  I am not sure what was
the actual problem in 1992, but the current behavior is memory
leak if PCB holds a reference to a dynamically created/modified
routing table entry.  (rt_refcnt>0 and we don't call rtfree().)

My test bed was:

1.  Set net.inet.tcp.msl to a low value (for test purposes), e.g.,
    5 seconds, to speed up the transition of TCP connection to a
    "closed" state.
2.  Add a network route which causes ICMP redirect from the gateway.
3.  ping(8) host H that matches this route; this creates RTF_DYNAMIC
    RTF_HOST route to H.  (I was forced to use ICMP to cause gateway
    to generate ICMP host redirect, because gateway in question is a
    4.2-STABLE system vulnerable to a problem that was fixed later in
    ip_icmp.c,v 1.39.2.6, and TCP packets with DF bit set were
    triggering this bug.)
4.  telnet(1) to H
5.  Block access to H with ipfw(8)
6.  Send something in telnet(1) session; this causes EPERM, followed
    by an in_losing() call in a few seconds.
7.  Delete ipfw(8) rule blocking access to H, and wait for TCP
    connection moving to a CLOSED state; PCB is freed.
8.  Delete host route to H.
9.  Watch with netstat(1) that `rttrash' increased.
10. Repeat steps 3-9, and watch `rttrash' increases.

PR:		kern/25421
MFC after:	2 weeks
2001-06-29 12:07:29 +00:00
Ruslan Ermilov
3277d1c498 Fixed the brain-o in rev. 1.10: the logic check was reversed.
Reported by:	Bernd Fuerwitt <bf@fuerwitt.de>
2001-06-27 14:11:25 +00:00
Ruslan Ermilov
a447a5ae06 Bring in fix from NetBSD's revision 1.16:
Pass the correct destination address for the route-to-gateway case.

PR:		kern/10607
MFC after:	2 weeks
2001-06-26 09:00:50 +00:00
David Malone
7ce87f1205 Allow getcred sysctl to work in jailed root processes. Processes can
only do getcred calls for sockets which were created in the same jail.
This should allow the ident to work in a reasonable way within jails.

PR:		28107
Approved by:	des, rwatson
2001-06-24 12:18:27 +00:00
Jonathan Lemon
f962cba5c3 Replace bzero() of struct ip with explicit zeroing of structure members,
which is faster.
2001-06-23 17:44:27 +00:00
Ruslan Ermilov
c73d99b567 Add netstat(1) knob to reset net.inet.{ip|icmp|tcp|udp|igmp}.stats.
For example, ``netstat -s -p ip -z'' will show and reset IP stats.

PR:		bin/17338
2001-06-23 17:17:59 +00:00
Mike Silbersack
08517d530e Eliminate the allocation of a tcp template structure for each
connection.  The information contained in a tcptemp can be
reconstructed from a tcpcb when needed.

Previously, tcp templates required the allocation of one
mbuf per connection.  On large systems, this change should
free up a large number of mbufs.

Reviewed by:	bmilekic, jlemon, ru
MFC after: 2 weeks
2001-06-23 03:21:46 +00:00
Munechika SUMIKAWA
a96c00661a - Renumber KAME local ICMP types and NDP options numberes beacaues they
are duplicated by newly defined types/options in RFC3121
- We have no backward compatibility issue. There is no apps in our
  distribution which use the above types/options.

Obtained from:	KAME
MFC after:	2 weeks
2001-06-21 07:08:43 +00:00
Hajimu UMEMOTO
ff2428299f made sure to use the correct sa_len for rtalloc().
sizeof(ro_dst) is not necessarily the correct one.
this change would also fix the recent path MTU discovery problem for the
destination of an incoming TCP connection.

Submitted by:	JINMEI Tatuya <jinmei@kame.net>
Obtained from:	KAME
MFC after:	2 weeks
2001-06-20 12:32:48 +00:00
Jonathan Lemon
08aadfbb98 Do not perform arp send/resolve on an interface marked NOARP.
PR: 25006
MFC after: 2 weeks
2001-06-15 21:00:32 +00:00
Peter Wemm
215db1379e Fix a stack of KAME netinet6/in6.h warnings:
592: warning: `struct mbuf' declared inside parameter list
595: warning: `struct ifnet' declared inside parameter list
2001-06-15 00:37:27 +00:00
Hajimu UMEMOTO
3384154590 Sync with recent KAME.
This work was based on kame-20010528-freebsd43-snap.tgz and some
critical problem after the snap was out were fixed.
There are many many changes since last KAME merge.

TODO:
  - The definitions of SADB_* in sys/net/pfkeyv2.h are still different
    from RFC2407/IANA assignment because of binary compatibility
    issue.  It should be fixed under 5-CURRENT.
  - ip6po_m member of struct ip6_pktopts is no longer used.  But, it
    is still there because of binary compatibility issue.  It should
    be removed under 5-CURRENT.

Reviewed by:	itojun
Obtained from:	KAME
MFC after:	3 weeks
2001-06-11 12:39:29 +00:00
Jesper Skriver
96c2b04290 Make the default value of net.inet.ip.maxfragpackets and
net.inet6.ip6.maxfragpackets dependent on nmbclusters,
defaulting to nmbclusters / 4

Reviewed by:	bde
MFC after:	1 week
2001-06-10 11:04:10 +00:00
Peter Wemm
0978669829 "Fix" the previous initial attempt at fixing TUNABLE_INT(). This time
around, use a common function for looking up and extracting the tunables
from the kernel environment.  This saves duplicating the same function
over and over again.  This way typically has an overhead of 8 bytes + the
path string, versus about 26 bytes + the path string.
2001-06-08 05:24:21 +00:00
Jonathan Lemon
0a52f59c36 Move IPFilter into contrib. 2001-06-07 05:13:35 +00:00
Peter Wemm
4422746fdf Back out part of my previous commit. This was a last minute change
and I botched testing.  This is a perfect example of how NOT to do
this sort of thing. :-(
2001-06-07 03:17:26 +00:00
Peter Wemm
81930014ef Make the TUNABLE_*() macros look and behave more consistantly like the
SYSCTL_*() macros.  TUNABLE_INT_DECL() was an odd name because it didn't
actually declare the int, which is what the name suggests it would do.
2001-06-06 22:17:08 +00:00
Jesper Skriver
65f28919b3 Silby's take one on increasing FreeBSD's resistance to SYN floods:
One way we can reduce the amount of traffic we send in response to a SYN
flood is to eliminate the RST we send when removing a connection from
the listen queue.  Since we are being flooded, we can assume that the
majority of connections in the queue are bogus.  Our RST is unwanted
by these hosts, just as our SYN-ACK was.  Genuine connection attempts
will result in hosts responding to our SYN-ACK with an ACK packet.  We
will automatically return a RST response to their ACK when it gets to us
if the connection has been dropped, so the early RST doesn't serve the
genuine class of connections much.  In summary, we can reduce the number
of packets we send by a factor of two without any loss in functionality
by ensuring that RST packets are not sent when dropping a connection
from the listen queue.

Submitted by:	Mike Silbersack <silby@silby.com>
Reviewed by:	jesper
MFC after:	2 weeks
2001-06-06 19:41:51 +00:00
Brian Somers
f987e1bd0f Add BSD-style copyright headers
Approved by: Charles Mott <cmott@scientech.com>
2001-06-04 15:09:51 +00:00
Brian Somers
888b1a7aa5 Change to a standard BSD-style copyright
Approved by:	Atsushi Murai <amurai@spec.co.jp>
2001-06-04 14:52:17 +00:00
Jesper Skriver
690a6055ff Prevent denial of service using bogus fragmented IPv4 packets.
A attacker sending a lot of bogus fragmented packets to the target
(with different IPv4 identification field - ip_id), may be able
to put the target machine into mbuf starvation state.

By setting a upper limit on the number of reassembly queues we
prevent this situation.

This upper limit is controlled by the new sysctl
net.inet.ip.maxfragpackets which defaults to 200,
as the IPv6 case, this should be sufficient for most
systmes, but you might want to increase it if you have
lots of TCP sessions.
I'm working on making the default value dependent on
nmbclusters.

If you want old behaviour (no upper limit) set this sysctl
to a negative value.

If you don't want to accept any fragments (not recommended)
set the sysctl to 0 (zero).

Obtained from:	NetBSD
MFC after:	1 week
2001-06-03 23:33:23 +00:00
Kris Kennaway
64dddc1872 Add ``options RANDOM_IP_ID'' which randomizes the ID field of IP packets.
This closes a minor information leak which allows a remote observer to
determine the rate at which the machine is generating packets, since the
default behaviour is to increment a counter for each packet sent.

Reviewed by:    -net
Obtained from:  OpenBSD
2001-06-01 10:02:28 +00:00
David E. O'Brien
240ef84277 Back out jesper's 2001/05/31 14:58:11 PDT commit. It does not compile. 2001-06-01 09:51:14 +00:00
Jesper Skriver
2b1a209a17 Prevent denial of service using bogus fragmented IPv4 packets.
A attacker sending a lot of bogus fragmented packets to the target
(with different IPv4 identification field - ip_id), may be able
to put the target machine into mbuf starvation state.

By setting a upper limit on the number of reassembly queues we
prevent this situation.

This upper limit is controlled by the new sysctl
net.inet.ip.maxfragpackets which defaults to NMBCLUSTERS/4

If you want old behaviour (no upper limit) set this sysctl
to a negative value.

If you don't want to accept any fragments (not recommended)
set the sysctl to 0 (zero)

Obtained from:	NetBSD (partially)
MFC after:	1 week
2001-05-31 21:57:29 +00:00
Jesper Skriver
7ceb778366 Disable rfc1323 and rfc1644 TCP extensions if we havn't got
any response to our third SYN to work-around some broken
terminal servers (most of which have hopefully been retired)
that have bad VJ header compression code which trashes TCP
segments containing unknown-to-them TCP options.

PR:		kern/1689
Submitted by:	jesper
Reviewed by:	wollman
MFC after:	2 weeks
2001-05-31 19:24:49 +00:00
Ruslan Ermilov
79ec1c507a Add an integer field to keep protocol-specific flags with links.
For FTP control connection, keep the CRLF end-of-line termination
status in there.

Fixed the bug when the first FTP command in a session was ignored.

PR:		24048
MFC after:	1 week
2001-05-30 14:24:35 +00:00
Jesper Skriver
e4b6428171 Inline TCP_REASS() in the single location where it's used,
just as OpenBSD and NetBSD has done.

No functional difference.

MFC after:	2 weeks
2001-05-29 19:54:45 +00:00
Jesper Skriver
853be1226e properly delay acks in half-closed TCP connections
PR:	24962
Submitted by:	Tony Finch <dot@dotat.at>
MFC after:	2 weeks
2001-05-29 19:51:45 +00:00
Ruslan Ermilov
9185426827 In in_ifadown(), differentiate between whether the interface goes
down or interface address is deleted.  Only delete static routes
in the latter case.

Reported by:	Alexander Leidinger <Alexander@leidinger.net>
2001-05-11 14:37:34 +00:00
Mark Murray
fb919e4d5a Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
Jesper Skriver
d1745f454d Say goodbye to TCP_COMPAT_42
Reviewed by:	wollman
Requested by:	wollman
2001-04-20 11:58:56 +00:00
Kris Kennaway
f0a04f3f51 Randomize the TCP initial sequence numbers more thoroughly.
Obtained from:	OpenBSD
Reviewed by:	jesper, peter, -developers
2001-04-17 18:08:01 +00:00
Darren Reed
454a43c1f1 fix security hole created by fragment cache 2001-04-06 15:52:28 +00:00
Bill Fumerola
0901f62e11 pipe/queue are the only consumers of flow_id, so only set it in those cases 2001-04-06 06:52:25 +00:00
Jesper Skriver
b77d155dd3 MFC candidate.
Change code from PRC_UNREACH_ADMIN_PROHIB to PRC_UNREACH_PORT for
ICMP_UNREACH_PROTOCOL and ICMP_UNREACH_PORT

And let TCP treat PRC_UNREACH_PORT like PRC_UNREACH_ADMIN_PROHIB

This should fix the case where port unreachables for udp returned
ENETRESET instead of ECONNREFUSED

Problem found by:	Bill Fenner <fenner@research.att.com>
Reviewed by:		jlemon
2001-03-28 14:13:19 +00:00
Ruslan Ermilov
4a558355e5 MAN[1-9] -> MAN. 2001-03-27 17:27:19 +00:00
Yaroslav Tykhiy
4cbc8ad1bb Add a missing m_pullup() before a mtod() in in_arpinput().
PR: kern/22177
Reviewed by: wollman
2001-03-27 12:34:58 +00:00
Hidetoshi Shimokawa
110a013333 Replace dyn_fin_lifetime with dyn_ack_lifetime for half-closed state.
Half-closed state could last long for some connections and fin_lifetime
(default 20sec) is too short for that.

OK'ed by: luigi
2001-03-27 05:28:30 +00:00
Poul-Henning Kamp
f83880518b Send the remains (such as I have located) of "block major numbers" to
the bit-bucket.
2001-03-26 12:41:29 +00:00
Brian Somers
71593f95e0 Make header files conform to style(9).
Reviewed by (*): bde

(*) alias_local.h only got a cursory glance.
2001-03-25 12:05:10 +00:00
Brian Somers
adad9908fa Remove an extraneous declaration. 2001-03-25 03:34:29 +00:00
Hajimu UMEMOTO
2da24fa6e9 IPv4 address is not unsigned int. This change introduces in_addr_t.
PR:		9982
Adviced by:	des
Reviewed by:	-alpha and -net (no objection)
Obtained from:	OpenBSD
2001-03-23 18:59:31 +00:00
Brian Somers
30fcf11451 Remove (non-protected) variable names from function prototypes. 2001-03-22 11:55:26 +00:00
Paul Richards
1789d85615 Only flush rules that have a rule number above that set by a new
sysctl, net.inet.ip.fw.permanent_rules.

This allows you to install rules that are persistent across flushes,
which is very useful if you want a default set of rules that
maintains your access to remote machines while you're reconfiguring
the other rules.

Reviewed by:	Mark Murray <markm@FreeBSD.org>
2001-03-21 08:19:31 +00:00
Dag-Erling Smørgrav
c59319bf1a Axe TCP_RESTRICT_RST. It was never a particularly good idea except for a few
very specific scenarios, and now that we have had net.inet.tcp.blackhole for
quite some time there is really no reason to use it any more.

(last of three commits)
2001-03-19 22:09:00 +00:00
Ruslan Ermilov
1e3d5af041 Invalidate cached forwarding route (ipforward_rt) whenever a new route
is added to the routing table, otherwise we may end up using the wrong
route when forwarding.

PR:		kern/10778
Reviewed by:	silence on -net
2001-03-19 09:16:16 +00:00
Ruslan Ermilov
4078ffb154 Make sure the cached forwarding route (ipforward_rt) is still up before
using it.  Not checking this may have caused the wrong IP address to be
used when processing certain IP options (see example below).  This also
caused the wrong route to be passed to ip_output() when forwarding, but
fortunately ip_output() is smart enough to detect this.

This example demonstrates the wrong behavior of the Record Route option
observed with this bug.  Host ``freebsd'' is acting as the gateway for
the ``sysv''.

1. On the gateway, we add the route to the destination.  The new route
   will use the primary address of the loopback interface, 127.0.0.1:

:  freebsd# route add 10.0.0.66 -iface lo0 -reject
:  add host 10.0.0.66: gateway lo0

2. From the client, we ping the destination.  We see the correct replies.
   Please note that this also causes the relevant route on the ``freebsd''
   gateway to be cached in ipforward_rt variable:

:  sysv# ping -snv 10.0.0.66
:  PING 10.0.0.66: 56 data bytes
:  ICMP Host Unreachable from gateway 192.168.0.115
:  ICMP Host Unreachable from gateway 192.168.0.115
:  ICMP Host Unreachable from gateway 192.168.0.115
:
:  ----10.0.0.66 PING Statistics----
:  3 packets transmitted, 0 packets received, 100% packet loss

3. On the gateway, we delete the route to the destination, thus making
   the destination reachable through the `default' route:

:  freebsd# route delete 10.0.0.66
:  delete host 10.0.0.66

4. From the client, we ping destination again, now with the RR option
   turned on.  The surprise here is the 127.0.0.1 in the first reply.
   This is caused by the bug in ip_rtaddr() not checking the cached
   route is still up befor use.  The debug code also shows that the
   wrong (down) route is further passed to ip_output().  The latter
   detects that the route is down, and replaces the bogus route with
   the valid one, so we see the correct replies (192.168.0.115) on
   further probes:

:  sysv# ping -snRv 10.0.0.66
:  PING 10.0.0.66: 56 data bytes
:  64 bytes from 10.0.0.66: icmp_seq=0. time=10. ms
:    IP options:  <record route> 127.0.0.1, 10.0.0.65, 10.0.0.66,
:                                192.168.0.65, 192.168.0.115, 192.168.0.120,
:                                0.0.0.0(Current), 0.0.0.0, 0.0.0.0
:  64 bytes from 10.0.0.66: icmp_seq=1. time=0. ms
:    IP options:  <record route> 192.168.0.115, 10.0.0.65, 10.0.0.66,
:                                192.168.0.65, 192.168.0.115, 192.168.0.120,
:                                0.0.0.0(Current), 0.0.0.0, 0.0.0.0
:  64 bytes from 10.0.0.66: icmp_seq=2. time=0. ms
:    IP options:  <record route> 192.168.0.115, 10.0.0.65, 10.0.0.66,
:                                192.168.0.65, 192.168.0.115, 192.168.0.120,
:                                0.0.0.0(Current), 0.0.0.0, 0.0.0.0
:
:  ----10.0.0.66 PING Statistics----
:  3 packets transmitted, 3 packets received, 0% packet loss
:  round-trip (ms)  min/avg/max = 0/3/10
2001-03-18 13:04:07 +00:00
Poul-Henning Kamp
462b86fe91 <sys/queue.h> makeover. 2001-03-16 20:00:53 +00:00
Poul-Henning Kamp
ccd6f42dc9 Fix a style(9) nit. 2001-03-16 19:36:23 +00:00
Ruslan Ermilov
089cdfad78 net/route.c:
A route generated from an RTF_CLONING route had the RTF_WASCLONED flag
  set but did not have a reference to the parent route, as documented in
  the rtentry(9) manpage.  This prevented such routes from being deleted
  when their parent route is deleted.

  Now, for example, if you delete an IP address from a network interface,
  all ARP entries that were cloned from this interface route are flushed.

  This also has an impact on netstat(1) output.  Previously, dynamically
  created ARP cache entries (RTF_STATIC flag is unset) were displayed as
  part of the routing table display (-r).  Now, they are only printed if
  the -a option is given.

netinet/in.c, netinet/in_rmx.c:

  When address is removed from an interface, also delete all routes that
  point to this interface and address.  Previously, for example, if you
  changed the address on an interface, outgoing IP datagrams might still
  use the old address.  The only solution was to delete and re-add some
  routes.  (The problem is easily observed with the route(8) command.)

  Note, that if the socket was already bound to the local address before
  this address is removed, new datagrams generated from this socket will
  still be sent from the old address.

PR:		kern/20785, kern/21914
Reviewed by:	wollman (the idea)
2001-03-15 14:52:12 +00:00