Commit Graph

2501 Commits

Author SHA1 Message Date
Qing Li
c1fd993af9 Redo the previous fix by setting the UMA_ZONE_ZINIT bit in the syncache
zone, eliminating the need to call bzero() after each syncache entry
allocation.

Suggested by:	glebius
Reviewed by:	andre
MFC after:	3 days
2006-02-08 23:32:57 +00:00
Qing Li
737b12e98f Fixes a crash: the memory of the newly allocated syncache entry
in syncache_lookup() is not cleared, which may lead to an arbitrary and
bogus rtentry pointer that later gets free'd.

Reviewed by: andre
MFC after: 3 days
2006-02-07 19:59:46 +00:00
Oleg Bulyzhin
6edb555dbc Fix a five-year-old bug in ip_reass(): if we are using 'full' (i.e. including
pseudo header) hardware rx checksum offloading, ip_reass() fails to calculate
the TCP/UDP checksum for the reassembled packet correctly.  This should also fix the
recent 'NFS over UDP over bge' issue exposed by if_bge.c rev. 1.123.

Reviewed by:	sam (earlier version), bde
Approved by:	glebius (mentor)
MFC after:	2 weeks
2006-02-07 11:48:10 +00:00
Hajimu UMEMOTO
d5e8a67ee9 Never select the PCB that has INP_IPV6 flag and is bound to :: if
we have another PCB which is bound to 0.0.0.0.  If a PCB has the
INP_IPV6 flag, then we set its cost higher than IPv4 only PCBs.

Submitted by:	Keiichi SHIMA <keiichi__at__iijlab.net>
Obtained from:	KAME
MFC after:	1 week
2006-02-04 07:59:17 +00:00
Gleb Smirnoff
a7908db153 Dropping the lock in the transmit_event() is not safe, because we
store some pipe pointers on stack. If user reconfigures dummynet
in the interlock gap, we can work with freed pipes after relock.

To fix this, we decided not to send packets in transmit_event(),
but fill a queue. At the end of dummynet() and dummynet_io(),
after the lock is dropped, if there is something in the queue
we run dummynet_send() to process the queue.

In collaboration with:	ru
2006-02-03 11:38:19 +00:00
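
The fix described in a7908db153 (collect packets while the lock is held, send them only after it is dropped) can be sketched roughly as follows. This is a userland illustration, not the dummynet code: the lock is modelled as a plain flag, and every name here (struct pkt, transmit_event_sketch, send_queue_sketch, run_tick_sketch) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical packet and queue types, for illustration only. */
struct pkt {
    int id;
    struct pkt *next;
};

struct pkt_queue {
    struct pkt *head;
    struct pkt *tail;
};

static int lock_held;   /* stands in for the dummynet lock */
static int sent_count;  /* packets handed to the "network" */
static int sent_locked; /* packets sent while the lock was held; should stay 0 */

static void queue_append(struct pkt_queue *q, struct pkt *p)
{
    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

/* Called with the lock held: only collects work, never sends. */
static void transmit_event_sketch(struct pkt_queue *q, struct pkt *ready, int n)
{
    for (int i = 0; i < n; i++)
        queue_append(q, &ready[i]);
}

/* Called after the lock is dropped: drains the queue. */
static void send_queue_sketch(struct pkt_queue *q)
{
    struct pkt *p;

    while ((p = q->head) != NULL) {
        q->head = p->next;
        if (lock_held)
            sent_locked++;
        sent_count++;
    }
    q->tail = NULL;
}

/* Overall shape: lock, collect, unlock, then send if anything was queued. */
static int run_tick_sketch(struct pkt *ready, int n)
{
    struct pkt_queue q = { NULL, NULL };

    lock_held = 1;
    transmit_event_sketch(&q, ready, n);
    lock_held = 0;

    if (q.head != NULL)
        send_queue_sketch(&q);
    return sent_count;
}
```

Because the queue lives on the caller's stack and holds only packets, nothing that a reconfiguration could free is touched after the lock is released.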
Gleb Smirnoff
ce62866023 Axe unused function. 2006-02-03 10:42:28 +00:00
Christian S.J. Peron
f5cdbcf14c Use PFIL_HOOKED macros in if_bridge and pass the right argument to
rw_assert. This un-breaks the build.

Submitted by:	Kostik Belousov
Pointy hat to:	csjp
2006-02-02 16:41:20 +00:00
Christian S.J. Peron
604afec496 Somewhat re-factor the read/write locking mechanism associated with the packet
filtering mechanisms to use the new rwlock(9) locking API:

- Drop the variables stored in the pfil_head structure which were specific to
  conditions and the home rolled read/write locking mechanism.
- Drop some includes which were used for condition variables
- Drop the inline functions, and convert them to macros. Also, move these
  macros into pfil.h
- Move pfil list locking macros into pfil.h as well
- Rename ph_busy_count to ph_nhooks. This variable will represent the number
  of IN/OUT hooks registered with the pfil head structure
- Define PFIL_HOOKED macro which evaluates to true if there are any
  hooks to be run by pfil_run_hooks
- In the IP/IP6 stacks, change the ph_busy_count comparison to use the new
  PFIL_HOOKED macro.
- Drop optimization in pfil_run_hooks which checks to see if there are any
  hooks to be run, and returns if not. This check is already performed by the
  IP stacks when they call:

        if (!PFIL_HOOKED(ph))
                goto skip_hooks;

- Drop in an assertion which makes sure that the number of hooks never drops
  below 0 for good measure. This in theory should never happen, and if it
  does then there are problems somewhere
- Drop special logic around PFIL_WAITOK because rw_wlock(9) does not sleep
- Drop variables which support home rolled read/write locking mechanism from
  the IPFW firewall chain structure.
- Swap out the read/write firewall chain lock internal to use the rwlock(9)
  API instead of our home rolled version
- Convert the inlined functions to macros

Reviewed by:	mlaier, andre, glebius
Thanks to:	jhb for the new locking API
2006-02-02 03:13:16 +00:00
Andre Oppermann
1dfcf0d2a3 Move the IPSEC related code blocks to their own file to unclutter
and significantly improve the readability of ip_input() and
ip_output() again.

The resulting IPSEC hooks in ip_input() and ip_output() may be
used later on for making IPSEC loadable.

This move is mostly mechanical and should preserve current IPSEC
behaviour as-is.  Nothing shall prevent improvements in the way
IPSEC interacts with the IPv4 stack.

Discussed with:	bz, gnn, rwatson; (earlier version)
2006-02-01 13:55:03 +00:00
Ruslan Ermilov
e46c3da737 Brain-o (use standard int types now). 2006-02-01 06:15:37 +00:00
Ruslan Ermilov
bc7eeed4c9 Fix multicast routing on 64-bit platforms.
Tested on:	amd64
MFC after:	3 days
2006-01-31 22:39:35 +00:00
Andrew Thompson
235073f4c0 Now that the bridge also processes Ethernet frames as itself, two arp replies
will be sent if there is an address on the bridge. Exclude the bridge from the
special arp handling.

This has been tested with all combinations of addresses on the bridge and members.

Pointed out by:	Michal Mertl
2006-01-31 21:29:41 +00:00
Gleb Smirnoff
25af0bb50e Add some initial locking to gif(4). It doesn't cover the whole driver,
however IPv4-in-IPv4 tunnels are now stable on SMP. Details:

- Add per-softc mutex.
- Hold the mutex on output.

The main problem was the rtentry, placed in softc. It could be
freed by ip_output(). Meanwhile, another thread being in
in_gif_output() can read and write this rtentry.

Reported by:	many
Tested by:	Alexander Shiryaev <aixp mail.ru>
2006-01-30 08:39:09 +00:00
Andrew Thompson
74948aa6f3 Back out of r1.148, it causes two arp replies to be sent with different mac
addresses. One for the bridged interface with the IP address assigned but then
another with the mac for the bridge itself.
2006-01-29 23:21:01 +00:00
Andre Oppermann
ab48768b20 When doing IP forwarding with [FAST_]IPSEC compiled into the kernel
ip_forward() would report back a zero MTU in ICMP needfrag messages
because on a IPSEC SP lookup failure no MTU got computed.

Fix this by changing the logic to compute a new MTU in any case if
IPSEC didn't do it.

Change MTU computation logic to use egress interface MTU if available
or the next smaller MTU compared to the current packet size instead
of falling back to a very small fixed MTU.

Fix associated comment.

PR:		kern/91412
MFC after:	3 days
2006-01-24 17:57:19 +00:00
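
The MTU selection described in ab48768b20 (use the egress interface MTU if available, otherwise the next smaller MTU relative to the current packet size) might look roughly like this. It is a sketch only: the plateau values are taken from the table in RFC 1191 and are an assumption here, since the kernel's own table may differ, and pick_mtu() and next_smaller_mtu() are hypothetical names.

```c
#include <stddef.h>

/* Plateau values from RFC 1191 (assumed; the kernel table may differ). */
static const int mtu_plateaus[] = {
    65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68
};

/* Returns the largest plateau strictly below cur_size, or 0 if there is none. */
static int next_smaller_mtu(int cur_size)
{
    size_t n = sizeof(mtu_plateaus) / sizeof(mtu_plateaus[0]);

    for (size_t i = 0; i < n; i++) {
        if (mtu_plateaus[i] < cur_size)
            return mtu_plateaus[i];
    }
    return 0;
}

/* Prefer the egress interface MTU (0 means unknown), else step down. */
static int pick_mtu(int ifp_mtu, int pkt_size)
{
    if (ifp_mtu > 0)
        return ifp_mtu;
    return next_smaller_mtu(pkt_size);
}
```

For example, pick_mtu(0, 1500) steps down to 1492 in this sketch, rather than falling back to a very small fixed value.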
Andre Oppermann
1dec73a153 In ip_mdq() compute the TV_DELTA the correct way around.
PR:		kern/91851
Submitted by:	SAKAI Hiroaki <sakai.hiroaki-at-jp.fujitsu.com>
MFC after:	3 days
2006-01-24 17:09:12 +00:00
Andre Oppermann
31343a3da2 In in_control() remove the temporary in_ifaddr structure from the
ia_hash only if it actually is an AF_INET address.  All other places
test for sa_family == AF_INET but this one.

PR:		kern/92091
Submitted by:	Seth Kingsley <sethk-at-meowfishies.com>
MFC after:	3 days
2006-01-24 16:19:31 +00:00
Oleg Bulyzhin
44a515834f Fix minor bug in uRPF:
If net.link.ether.inet.useloopback=1 and we send broadcast packet using our
  own source ip address it may be rejected by uRPF rules.

  Same bug was fixed for IPv6 in rev. 1.115 by suz.

PR:		kern/76971
Approved by:	glebius (mentor)
MFC after:	3 days
2006-01-24 13:38:06 +00:00
Gleb Smirnoff
0b4ae859ac Implement 'ipfw fwd laddr,port' feature for UDP. According to ipfw(8)
it should work, however it never did. People expect it to work.

PR:		kern/90834
2006-01-24 09:08:54 +00:00
Gleb Smirnoff
1c0b0f523d Fix build. 2006-01-23 20:10:49 +00:00
Andre Oppermann
06003a1e7c Simplify ip_next_mtu() and make its logic easier to follow while
silencing code analysis tools.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID341
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-01-23 17:06:32 +00:00
Robert Watson
136d4f1cf2 Convert remaining functions to ANSI C function declarations; remove
'register' where present.

MFC after:	1 week
2006-01-22 01:16:25 +00:00
Robert Watson
d0c75d36b9 Convert last remaining function in ip_gre.c to ANSI C function
declaration.

MFC after:	1 week
2006-01-22 01:08:30 +00:00
Bjoern A. Zeeb
3f2e28fe9f Fix stack corruptions on amd64.
Vararg functions have a different calling convention than regular
functions on amd64. Casting a vararg function to a regular one to
match the function pointer declaration will hide the varargs from
the caller and we will end up with an incorrectly setup stack.

Entirely remove the varargs from these functions and change the
functions to match the declaration of the function pointers.
Remove the now unnecessary casts.

Lots of explanations and help from:     peter
Reviewed by:                            peter
PR:                                     amd64/89261
MFC after:                              6 days
2006-01-21 10:44:34 +00:00
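
The approach taken in 3f2e28fe9f (give each function exactly the prototype of the function pointer it is stored in, so that no cast is needed) can be illustrated in plain C. The handler type and function names below are hypothetical and only loosely modelled on a protocol input hook.

```c
#include <stddef.h>

/* Hypothetical handler signature: fixed arguments, no varargs. */
typedef int (*input_fn)(const char *buf, int off);

/* Both handlers match input_fn exactly, so the table needs no casts. */
static int tcp_like_input(const char *buf, int off)
{
    return buf[off];
}

static int udp_like_input(const char *buf, int off)
{
    return buf[off] + 1;
}

static const input_fn handlers[] = { tcp_like_input, udp_like_input };

/* The caller sets up arguments according to input_fn, which is now
 * also what the callee expects. */
static int dispatch(size_t proto, const char *buf, int off)
{
    return handlers[proto](buf, off);
}
```

If one of these handlers were declared with `...` and cast to input_fn, the compiler would accept the table, but the call would be made with the wrong calling convention on platforms where the two differ.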
Christian S.J. Peron
9c57c204be - Change the return type for init_tables from void to int so we can propagate
errors from rn_inithead back to the ipfw initialization function.
- Check return value of rn_inithead for failure, if table allocation has
  failed for any reason, free up any tables we have created and return ENOMEM
- In ipfw_init check the return value of init_tables and free up any mutexes or
  UMA zones which may have been created.
- Assert that the supplied table is not NULL before attempting to dereference.

This fixes panics which were a result of invalid memory accesses due to failed
table allocation. This is an issue mainly because the R_Zalloc function is a
malloc(M_NOWAIT) wrapper, thus making it possible for allocations to fail.

Found by:	Coverity Prevent (tm)
Coverity ID:	CID79
MFC after:	1 week
2006-01-20 05:35:27 +00:00
Christian S.J. Peron
e9186cb94b Destroy the dynamic rule zone in the event that we fail to insert the
initial default rule.

MFC after:	1 week
2006-01-20 03:21:25 +00:00
Andre Oppermann
0270746230 Do not dereference the ip header pointer in the IPv6 case.
This fixes a bug in the previous commit.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID253
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-01-18 18:59:30 +00:00
Andre Oppermann
8f8d29f686 In in_delayed_cksum() we can't perform a m_pullup() as it may
change the mbuf pointer and we don't have any way of passing
it back to the callers.  Instead just fail silently without
updating the checksum but leaving the mbuf+chain intact.

A search in our GNATS database did not turn up any match for
the existing warning message when this case is encountered.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID779
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-01-18 18:49:16 +00:00
Andre Oppermann
79eb490467 In syncache_expand() insert a proper syncache_free() to fix a case
that currently can't be triggered.  But better be safe than sorry
later on.  Additionally it properly silences Coverity Prevent for
future tests.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID802
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-01-18 18:25:03 +00:00
Andre Oppermann
39550088cf Prevent dereferencing a NULL route pointer when trying to update the
route MTU.

This bug is very difficult to reach and not remotely exploitable.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID162
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-01-18 15:05:05 +00:00
Andre Oppermann
5d691e6da8 Return mbuf pointer or NULL from ip_fastforward() as the mbuf pointer
may have changed by m_pullup() during fastforward processing.

While this is a bug it is actually never triggered in real world
situations and it is not remotely exploitable.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID780
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-01-18 14:24:39 +00:00
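
The pattern behind 5d691e6da8 and 8f8d29f686 is that a helper which may move a buffer has to hand the new pointer back to its caller. A userland analogy using realloc() is sketched below; grow_sketch() and fastforward_sketch() are hypothetical names, and freeing the buffer on failure is a choice made for this sketch.

```c
#include <stdlib.h>

/* May move the buffer. On failure the old buffer is freed and NULL is
 * returned, so the caller must not touch the old pointer either way. */
static unsigned char *grow_sketch(unsigned char *buf, size_t newlen)
{
    unsigned char *nbuf = realloc(buf, newlen);

    if (nbuf == NULL) {
        free(buf);
        return NULL;
    }
    return nbuf;
}

/* Returns the (possibly different) buffer pointer, or NULL if it was
 * consumed, instead of returning only a status code. */
static unsigned char *fastforward_sketch(unsigned char *buf, size_t need)
{
    buf = grow_sketch(buf, need);
    if (buf == NULL)
        return NULL;
    buf[need - 1] = 0;  /* safe: buf now has at least need bytes */
    return buf;
}
```

A function that cannot return the pointer, as in the in_delayed_cksum() case, has no safe way to call such a helper, which is why that commit avoids the call altogether.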
Robert Watson
d248c7d7f5 Modify the IP fragment reassembly code so that it uses a new UMA zone,
ipq_zone, to allocate fragment headers from, rather than using cast mbuf
storage.  This was one of the few remaining uses of mbuf storage for
local data structures that relied on dtom().  Implement the resource
limit on ipq's using UMA zone limits, but preserve current sysctl
semantics using a sysctl proc.

MFC after:	3 weeks
2006-01-15 18:58:21 +00:00
Robert Watson
dfa60d9354 Staticize ipqlock, since it is local to ip_input.c.
MFC after:	3 days
2006-01-15 17:05:48 +00:00
George V. Neville-Neil
34f83c52e7 Check the correct TTL in both the IPv6 and IPv4 cases.
Submitted by:	glebius
Reviewed by:	gnn, bz
Found with:     Coverity Prevent(tm)
2006-01-14 16:39:31 +00:00
Gleb Smirnoff
ecedca7441 UMA can return NULL not only in case when our zone is full, but
also in case of generic memory shortage. In the latter case we may
not find an old entry.

Found with:	Coverity Prevent(tm)
2006-01-14 13:04:08 +00:00
Robert Watson
e5bc0aa3c3 Remove dead code: 'opts' is not used in udp_append(), only in udp_input(),
so no need to assign it to NULL or conditionally free it.

Found with:	Coverity Prevent(tm)
MFC after:	3 days
2006-01-14 11:18:32 +00:00
Andrew Thompson
54c427e0e2 Include the bridge interface itself in the special arp handling.
PR:		90973
MFC after:	1 week
2006-01-12 21:05:30 +00:00
Colin Percival
9ed97bee65 Correct insecure temporary file usage in texindex. [06:01]
Correct insecure temporary file usage in ee. [06:02]
Correct a race condition when setting file permissions, sanitize file
names by default, and fix a buffer overflow when handling files
larger than 4GB in cpio. [06:03]
Fix an error in the handling of IP fragments in ipfw which can cause
a kernel panic. [06:04]

Security:	FreeBSD-SA-06:01.texindex
Security:	FreeBSD-SA-06:02.ee
Security:	FreeBSD-SA-06:03.cpio
Security:	FreeBSD-SA-06:04.ipfw
2006-01-11 08:02:16 +00:00
Andrew Thompson
73ff045c57 Add RFC 3378 EtherIP support. This change makes it possible to add gif
interfaces to bridges, which will then send and receive IP protocol 97 packets.
Packets are Ethernet frames with an EtherIP header prepended.

Obtained from:	NetBSD
MFC after:	2 weeks
2005-12-21 21:29:45 +00:00
Xin LI
92e0a4a2a4 Use consistent indent character as other IPPROTO_* lines did. 2005-12-20 09:38:03 +00:00
George V. Neville-Neil
496f9fc522 Add protocol number for SCTP.
Submitted by:	Randall Stewart rrs at cisco.com
MFC after:	1 week
2005-12-20 09:24:04 +00:00
Gleb Smirnoff
3939390679 Add a knob to suppress logging of attempts to modify
permanent ARP entries.

Submitted by:	Andrew Alcheyev <buddy telenet.ru>
2005-12-18 19:11:56 +00:00
Ed Maste
bd2b686fe8 Add descriptions for sysctl -d.
Approved by:	glebius
Silence from:	rwatson (mentor)
2005-12-16 15:01:44 +00:00
Gleb Smirnoff
6e02dbdfa3 Cleanup __FreeBSD_version. 2005-12-16 13:10:32 +00:00
John Baldwin
636a309adb Use %t (ptrdiff_t modifier) to print a couple of pointer differences rather
than casting them to int.
2005-12-15 21:57:32 +00:00
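
As an illustration of 636a309adb, standard C provides the `t` length modifier for ptrdiff_t values, so a pointer difference can be printed without a cast. The sketch below uses userland snprintf() rather than the kernel printf the commit touches; format_offset() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats the distance between two pointers into out.
 * %td takes a ptrdiff_t directly, so no cast to int is needed. */
static int format_offset(char *out, size_t outlen,
                         const char *base, const char *p)
{
    ptrdiff_t delta = p - base;

    return snprintf(out, outlen, "offset %td", delta);
}
```

Casting to int instead would silently truncate differences that do not fit in an int on 64-bit platforms.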
Maxime Henrion
e59898ff36 Fix a bunch of SYSCTL_INT() that should have been SYSCTL_ULONG() to
match the type of the variable they are exporting.

Spotted by:	Thomas Hurst <tom@hur.st>
MFC after:	3 days
2005-12-14 22:27:48 +00:00
Gleb Smirnoff
40b1ae9e00 Add a new feature for optimizing ipfw rulesets - substitution of the
action argument with the value obtained from table lookup. The feature
is now applicable only to "pipe", "queue", "divert", "tee", "netgraph"
and "ngtee" rules.

An example usage:

  ipfw pipe 1000 config bw 1000Kbyte/s
  ipfw pipe 4000 config bw 4000Kbyte/s
  ipfw table 1 add x.x.x.x 1000
  ipfw table 1 add x.x.x.y 4000
  ipfw pipe tablearg ip from table(1) to any

In the example above the rule will throw different packets to different pipes.

TODO:
  - Support "skipto" action, but without searching all rules.
  - Improve parser, so that it warns about bad rules. These are:
    - "tablearg" argument to action, but no "table" in the rule. All
      traffic will be blocked.
    - "tablearg" argument to action, but "table" searches for entry with
      a specific value. All traffic will be blocked.
    - "tablearg" argument to action, and two "table" lookups - for src and
      for dst. The last lookup will match.
2005-12-13 12:16:03 +00:00
Gleb Smirnoff
bbce982bd5 When we drop packet due to no space in output interface output queue, also
increase the ifp->if_snd.ifq_drops.

PR:		72440
Submitted by:	ikob
2005-12-06 11:16:11 +00:00
Gleb Smirnoff
95d1f36f82 Optimize parallel processing of ipfw(4) rulesets eliminating the locking
of the radix lookup tables. Since several rnh_lookup() can run in
parallel on the same table, we can piggyback on the shared locking
provided by ipfw(4).
  However, the single entry cache in the ip_fw_table can't be used lockless,
so it is removed. This pessimizes two cases: processing of bursts of similar
packets and matching one packet against the same table several times during
one ipfw_chk() lookup. To optimize the processing of similar packet bursts
administrator should use stateful firewall. To optimize the second problem
a solution will be provided soon.

Details:
  o Since we piggyback on the ipfw(4) locking, and the latter is per-chain,
    the tables are moved from the global declaration to the
    struct ip_fw_chain.
  o The struct ip_fw_table is shrunk to one entry and thus vanished.
  o All table manipulating functions are extended to accept the struct
    ip_fw_chain * argument.
  o All table modifying functions use IPFW_WLOCK_ASSERT().
2005-12-06 10:45:49 +00:00
Ruslan Ermilov
f4e9888107 Fix -Wundef. 2005-12-04 02:12:43 +00:00
Hajimu UMEMOTO
8846bbf3ce obey opt_inet6.h and opt_ipsec.h in kernel build directory.
Requested by:	hrs
2005-11-29 17:56:11 +00:00
Gleb Smirnoff
b090e4ce1f Garbage-collect now unused struct _ipfw_insn_pipe and flush_pipe_ptrs(),
thus removing a few XXXes.
  Document the ABI breakage in UPDATING.
2005-11-29 08:59:41 +00:00
Gleb Smirnoff
99b41b34fb First step in removing welding between ipfw(4) and dummynet.
o Do not use ipfw_insn_pipe->pipe_ptr in locate_flowset(). The
  _ipfw_insn_pipe isn't touched by this commit to preserve ABI
  compatibility.
o To optimize the lookup of the pipe/flowset in locate_flowset()
  introduce hashes for pipes and queues:
  - To preserve ABI compatibility utilize the place of global list
    pointer for SLIST_ENTRY.
  - Introduce locate_flowset(queue nr) and locate_pipe(pipe nr).
o Rework all the dummynet code to deal with the hashes, not global
  lists. Also did some style(9) changes in the code blocks that were
  touched by this sweep:
  - Be conservative about flowset and pipe variable names on stack,
    use "fs" and "pipe" everywhere.
  - Cleanup whitespaces.
  - Sort variables.
  - Give variables more meaningful names.
  - Uppercase and dots in comments.
  - ENOMEM when malloc(9) failed.
2005-11-29 00:11:01 +00:00
Ruslan Ermilov
fc1eaecf4a Fix prototype. 2005-11-24 14:17:35 +00:00
Paul Saab
d0a14f55c3 Fix for a bug that causes SACK scoreboard corruption when the limit
on holes per connection is reached.

Reported by:	Patrik Roos
Submitted by:	Mohan Srinivasan
Reviewed by:	Raja Mukerji, Noritoshi Demizu
2005-11-21 19:22:10 +00:00
Andre Oppermann
22f2c8b5db Remove 'ipprintfs' which were protected under DIAGNOSTIC. It doesn't
have any knob to enable it from userland and could only be enabled by
either setting it to 1 at compile time or through the kernel debugger.

In the future it may be brought back as KTR tracing points.

Discussed with:	rwatson
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-19 17:04:52 +00:00
Andre Oppermann
c444cdded2 Move MAX_IPOPTLEN and struct ipoption back into ip_var.h as
userland programs depend on it.

Pointed out by:	le
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-19 14:01:32 +00:00
Andre Oppermann
ef39adf007 Consolidate all IP Options handling functions into ip_options.[ch] and
include ip_options.h into all files making use of IP Options functions.

From ip_input.c rev 1.306:
  ip_dooptions(struct mbuf *m, int pass)
  save_rte(m, option, dst)
  ip_srcroute(m0)
  ip_stripoptions(m, mopt)

From ip_output.c rev 1.249:
  ip_insertoptions(m, opt, phlen)
  ip_optcopy(ip, jp)
  ip_pcbopts(struct inpcb *inp, int optname, struct mbuf *m)

No functional changes in this commit.

Discussed with:	rwatson
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-18 20:12:40 +00:00
Andre Oppermann
147f74d176 Purge layer specific mbuf flags on layer crossings to avoid confusing
upper or lower layers.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-18 16:23:26 +00:00
Andre Oppermann
e86ebebc52 Rework icmp_error() to deal with truncated IP packets from
ip_forward() when doing extended quoting in error messages.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-18 14:48:42 +00:00
Andre Oppermann
780b2f698c In ip_forward() copy as much into the temporary error mbuf as we
have free space in it.  Allocate correct mbuf from the beginning.
This allows icmp_error() to quote the entire TCP header in error
messages.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-18 14:44:48 +00:00
Gleb Smirnoff
218837618a MFOpenBSD 1.62:
Prevent backup CARP hosts from replying to arp requests, fixes strangeness
  with some layer-3 switches. From Bill Marquette.

Tested by:	Kazuaki Oda <kaakun highway.ne.jp>
2005-11-17 12:56:40 +00:00
Ruslan Ermilov
433aaf04cb Unbreak for !INET6 case. 2005-11-14 12:50:23 +00:00
Ruslan Ermilov
4a0d6638b3 - Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
  through ifp anyway.  IF_LLADDR() works faster, and all (except
  one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
  and drop the IFP2ENADDR() macro; all users have been converted
  to use IF_LLADDR() instead.
2005-11-11 16:04:59 +00:00
SUZUKI Shinsuke
d9a989231e fixed a bug that uRPF does not work properly for an IPv6 packet bound for the sending machine itself (this is a bug introduced due to a change in ip6_input.c:Rev.1.83)
Pointed out by: Sean McNeil and J.R.Oldroyd
MFC after: 3 days
2005-11-10 22:10:39 +00:00
Ruslan Ermilov
303989a2f3 Use sparse initializers for "struct domain" and "struct protosw",
so they are easier to follow for the human being.
2005-11-09 13:29:16 +00:00
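
The "sparse initializers" mentioned in 303989a2f3 are C99 designated initializers. A small sketch follows, using a hypothetical structure rather than the real "struct protosw"; the field names and values are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical, much reduced stand-in for a protocol switch entry. */
struct protosw_sketch {
    int type;
    int protocol;
    unsigned flags;
    int (*input)(int);
    const char *name;
};

static int echo_input(int x)
{
    return x;
}

/* Each member is named at the point of initialization; members that
 * are not mentioned (flags here) are zero-initialized. */
static const struct protosw_sketch sw = {
    .type = 2,
    .protocol = 17,
    .input = echo_input,
    .name = "udp-like",
};
```

Compared with a positional initializer, a reader does not need the structure definition at hand to tell which value goes into which member.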
Andrew Thompson
4e7e0183e1 Move the cloned interface list management in to if_clone. For some drivers the
softc lists and associated mutex are now unused so these have been removed.

Calling if_clone_detach() will now destroy all the cloned interfaces for the
driver and in most cases is all that's needed to unload.

Idea by:	brooks
Reviewed by:	brooks
2005-11-08 20:08:34 +00:00
Gleb Smirnoff
e1ff74c58d Rework ARP retransmission algorithm so that ARP requests are
retransmitted without suppression, while there is demand for
such ARP entry. As before, retransmission is rate limited to
one packet per second. Details:
  - Remove net.link.ether.inet.host_down_time
  - Do not set/clear RTF_REJECT flag on route, to
    avoid rt_check() returning error. We will generate error
    ourselves.
  - Return EWOULDBLOCK on first arp_maxtries failed
    requests, and return EHOSTDOWN/EHOSTUNREACH
    on further requests.
  - Retransmit ARP request always, independently from return
    code. Ratelimit to 1 pps.
2005-11-08 12:05:57 +00:00
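
One possible reading of the behaviour listed in e1ff74c58d is sketched below: a request is retransmitted at most once per second regardless of the return code, the first arp_maxtries attempts report EWOULDBLOCK, and later ones report EHOSTDOWN. This is a simplified model and not the kernel code; the structure, its fields and the function name are hypothetical, and time is passed in as whole seconds.

```c
#include <errno.h>

/* Hypothetical per-entry state for an unresolved ARP entry. */
struct arp_state_sketch {
    int asked;       /* requests sent so far */
    long last_sent;  /* second in which the last request was sent */
};

/* Called for each outgoing packet that still lacks a resolved address.
 * Sets *sent to 1 if a request goes out now. */
static int arp_unresolved_sketch(struct arp_state_sketch *la, long now,
                                 int maxtries, int *sent)
{
    *sent = 0;

    /* Retransmit whenever there is demand, but only once per second. */
    if (la->asked == 0 || la->last_sent != now) {
        la->last_sent = now;
        la->asked++;
        *sent = 1;
    }

    /* The error is generated here, not via a flag on the route. */
    return (la->asked <= maxtries) ? EWOULDBLOCK : EHOSTDOWN;
}
```

Note that the request is still sent in the EHOSTDOWN case, matching the "independently from return code" item above.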
Andre Oppermann
34333b16cd Retire MT_HEADER mbuf type and change its users to use MT_DATA.
Having an additional MT_HEADER mbuf type is superfluous and redundant
as nothing depends on it.  It only adds a layer of confusion.  The
distinction between header mbuf's and data mbuf's is solely done
through the m->m_flags M_PKTHDR flag.

Non-native code is not changed in this commit.  For compatibility
MT_HEADER is mapped to MT_DATA.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-02 13:46:32 +00:00
Robert Watson
5bb84bc84b Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
Robert Watson
d374e81efd Push the assignment of a new or updated so_qlimit from solisten()
following the protocol pru_listen() call to solisten_proto(), so
that it occurs under the socket lock acquisition that also sets
SO_ACCEPTCONN.  This requires passing the new backlog parameter
to the protocol, which also allows the protocol to be aware of
changes in queue limit should it wish to do something about the
new queue limit.  This continues a move towards the socket layer
acting as a library for the protocol.

Bump __FreeBSD_version due to a change in the in-kernel protocol
interface.  This change has been tested with IPv4 and UNIX domain
sockets, but not other protocols.
2005-10-30 19:44:40 +00:00
Gleb Smirnoff
f3d30eb20d First fill in structure with valid values, and only then attach it
to the global list.

Reviewed by:	rwatson
2005-10-28 20:29:42 +00:00
Yaroslav Tykhiy
9f4abef9a3 Since carp(4) interfaces presently are kinda fake yet possess
IP addresses, mark them with LOOPBACK so that routing daemons
take them easy for link-state routing protocols.

Reviewed by:	glebius
2005-10-26 05:57:35 +00:00
Max Laier
1e4b360655 Fix build after in6_joingroup change. It remains unclear if DAD breaks CARP
or not.
2005-10-22 14:54:02 +00:00
Gleb Smirnoff
bfb26eecfb In in_addprefix() compare not only route addresses, but their masks,
too. This fixes problem when connected prefixes overlap.

Obtained from:	OpenBSD (rev. 1.40 by claudio);
		[ I came to this fix myself, and then found out that
		  OpenBSD had already fixed it the same way.]
2005-10-22 14:50:27 +00:00
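
The comparison described in bfb26eecfb (two connected prefixes are the same only if both the network address and the mask match) can be sketched as follows. Addresses are plain host-order 32-bit integers here for simplicity, and struct prefix_sketch and same_prefix() are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical IPv4 prefix: address and netmask in host byte order. */
struct prefix_sketch {
    uint32_t addr;
    uint32_t mask;
};

/* Two prefixes are equal only if the masks are equal and the masked
 * addresses are equal. Comparing masked addresses alone would treat
 * 10.0.0.0/8 and 10.0.0.0/24 as the same prefix. */
static int same_prefix(struct prefix_sketch a, struct prefix_sketch b)
{
    if (a.mask != b.mask)
        return 0;
    return (a.addr & a.mask) == (b.addr & b.mask);
}
```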
SUZUKI Shinsuke
743eee666f sync with KAME regarding NDP
- introduced fine-grain-timer to manage ND-caches and IPv6 Multicast-Listeners
- supports Router-Preference <draft-ietf-ipv6-router-selection-07.txt>
- better prefix lifetime management
- more spec-conformant DAD advertisement
- updated RFC/internet-draft revisions

Obtained from: KAME
Reviewed by: ume, gnn
MFC after: 2 month
2005-10-21 16:23:01 +00:00
Robert Watson
a65e12b09d Convert if (tp->t_state == TCPS_LISTEN) panic() into a KASSERT.
MFC after:	2 weeks
2005-10-19 09:37:52 +00:00
Andrew Thompson
febd0759f3 Change the reference counting to count the number of cloned interfaces for each
cloner. This ensures that ifc->ifc_units is not prematurely freed in
if_clone_detach() before the clones are destroyed, resulting in memory modified
after free. This could be triggered with if_vlan.

Assert that all cloners have been destroyed when freeing the memory.

Change all simple cloners to destroy their clones with ifc_simple_destroy() on
module unload so the reference count is properly updated. This also cleans up
the interface destroy routines and allows future optimisation.

Discussed with:	brooks, pjd, -current
Reviewed by:	brooks
2005-10-12 19:52:16 +00:00
Maxim Konovalov
d46ff6bd1e o INP_ONESBCAST is inpcb.inp_vflag flag not inp_flags. The confusion
with IP_PORTRANGE_HIGH leads to the incorrect checksum calculation.

PR:		kern/87306
Submitted by:	Rickard Lind
Reviewed by:	bms
MFC after:	2 weeks
2005-10-12 18:13:25 +00:00
Philip Paeps
7691747aac Unbreak the net.inet6.tcp6.getcred sysctl.
This makes inetd/auth work again in IPv6 setups.

Pointy hat to:	ume/KAME
2005-10-12 09:24:18 +00:00
Andrew Thompson
f69453ca8b When bridging is enabled and an ARP request is received on a member interface,
the arp code will search all local interfaces for a match. This triggers a
kernel log if the bridge has been assigned an address.

arp: ac:de:48:18:83:3d is using my IP address 192.168.0.142!

bridge0: flags=8041<UP,RUNNING,MULTICAST> mtu 1500
        inet 192.168.0.142 netmask 0xffffff00
        ether ac:de:48:18:83:3d

Silence this warning for 6.0 to stop unnecessary bug reports, the code will need
to be reworked.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-10-04 19:50:02 +00:00
Andre Oppermann
1fd7af262a Correct brainfart in SO_BINTIME test.
Pointed out by:	nate
Pointy hat to:	andre
2005-10-04 18:19:21 +00:00
Andre Oppermann
e5fbf72cd8 Make SO_BINTIME timestamps available on raw_ip sockets.
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-10-04 18:07:11 +00:00
Robert Watson
c48b03fb69 Unlock Giant symmetrically with respect to lock acquire order as that's
generally nicer.

Spotted by:	johan
MFC after:	1 week
2005-10-03 11:34:29 +00:00
Robert Watson
1fa9efeffb Acquire Giant conditionally in in_addmulti() and in_delmulti() based on
whether the interface being accessed is IFF_NEEDSGIANT or not.  This
avoids lock order reversals when calling into the interface ioctl
handler, which could potentially lead to deadlock.

The long term solution is to eliminate non-MPSAFE network drivers.

Discussed with:	jhb
MFC after:	1 week
2005-10-03 11:09:39 +00:00
Maxim Konovalov
ac827533df o Teach sysctl_drop() how to deal with the sockets in TIME_WAIT state.
This is a special case because tcp_twstart() destroys a tcp control
block via tcp_discardcb() so we cannot call tcp_drop(struct *tcpcb) on
such connections.  Use tcp_twclose() instead.

MFC after:	5 days
2005-10-02 08:43:57 +00:00
Max Laier
b6de9e91bd Remove bridge(4) from the tree. if_bridge(4) is a fully functional
replacement and has additional features which make it superior.

Discussed on:	-arch
Reviewed by:	thompsa
X-MFC-after:	never (RELENG_6 as transition period)
2005-09-27 18:10:43 +00:00
Andre Oppermann
b2828ad291 Implement IP_DONTFRAG IP socket option enabling the Don't Fragment
flag on IP packets.  Currently this option is only respected on udp
and raw ip sockets.  On tcp sockets the DF flag is controlled by the
path MTU discovery option.

Sending a packet larger than the MTU size of the egress interface
returns an EMSGSIZE error.

Discussed with:	rwatson
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-09-26 20:25:16 +00:00
Andre Oppermann
fe53256dc2 Use monotonic 'time_uptime' instead of 'time_second' as timebase
for rt->rt_rmx.rmx_expire.
2005-09-19 22:54:55 +00:00
Andre Oppermann
e6b9152d20 Use monotonic 'time_uptime' instead of 'time_second' as timebase
for timeouts.
2005-09-19 22:31:45 +00:00
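
The reasoning behind fe53256dc2 and e6b9152d20 is that a deadline computed from a monotonic uptime counter is unaffected when the wall clock is stepped. A minimal sketch, with the uptime value passed in explicitly so that it stays self-contained; the structure and function names are hypothetical.

```c
#include <time.h>

/* Hypothetical entry carrying an expiry deadline in uptime seconds. */
struct timed_entry_sketch {
    time_t expire;
};

/* Arms the timeout relative to the monotonic uptime, not the wall clock. */
static void arm_timeout(struct timed_entry_sketch *e, time_t uptime_now,
                        time_t ttl)
{
    e->expire = uptime_now + ttl;
}

/* Expiry check against the same monotonic time base. */
static int has_expired(const struct timed_entry_sketch *e, time_t uptime_now)
{
    return uptime_now >= e->expire;
}
```

Had the deadline been stored as a wall-clock value, setting the system time backwards or forwards would lengthen or shorten every pending timeout.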
Robert Watson
b1c53bc9c0 Take a first cut at cleaning up ifnet removal and multicast socket
panics, which occur when stale ifnet pointers are left in struct
moptions hung off of inpcbs:

- Add in_ifdetach(), which matches in6_ifdetach(), and allows the
  protocol to perform early tear-down on the interface early in
  if_detach().

- Annotate that if_detach() needs careful consideration.

- Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR --
  this is not the place to detect interface removal!  This also
  removes what is basically a nasty (and now unnecessary) hack.

- Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP
  IPv4 sockets.

It is now possible to run the msocket_ifnet_remove regression test
using HEAD without panicking.

MFC after:	3 days
2005-09-18 17:36:28 +00:00
Andre Oppermann
db1240661f Do not ignore all other TCP options (eg. timestamp, window scaling)
when responding to TCP SYN packets with TCP_MD5 enabled and set.

PR:		kern/82963
Submitted by:	<demizu at dd.iij4u.or.jp>
MFC after:	3 days
2005-09-14 15:06:22 +00:00
Bjoern A. Zeeb
75398603ad Fix panic when kernel compiled without INET6 by rejecting
IPv6 opcodes which are behind #if(n)def INET6 now.

PR:		kern/85826
MFC after:	3 days
2005-09-14 07:53:54 +00:00
Andre Oppermann
ffabe3dce8 In tcp_ctlinput() do not swap ip->ip_len a second time. It
has been done in icmp_input() already.

This fixes the ICMP_UNREACH_NEEDFRAG case where no MTU was
proposed in the ICMP reply.

PR:		kern/81813
Submitted by:	Vitezslav Novy <vita at fio.cz>
MFC after:	3 days
2005-09-10 07:43:29 +00:00
Gleb Smirnoff
a20e25385c - Do not hold route entry lock, when calling arprequest(). One such
call was introduced by me in 1.139, the other one was present before.
- Do all manipulations with rtentry and la before dropping the lock.
- Copy interface address from route into local variable before dropping
  the lock. Supply this copy as argument to arprequest()

LORs fixed:
		http://sources.zabbadoz.net/freebsd/lor/003.html
		http://sources.zabbadoz.net/freebsd/lor/037.html
		http://sources.zabbadoz.net/freebsd/lor/061.html
		http://sources.zabbadoz.net/freebsd/lor/062.html
		http://sources.zabbadoz.net/freebsd/lor/064.html
		http://sources.zabbadoz.net/freebsd/lor/068.html
		http://sources.zabbadoz.net/freebsd/lor/071.html
		http://sources.zabbadoz.net/freebsd/lor/074.html
		http://sources.zabbadoz.net/freebsd/lor/077.html
		http://sources.zabbadoz.net/freebsd/lor/093.html
		http://sources.zabbadoz.net/freebsd/lor/135.html
		http://sources.zabbadoz.net/freebsd/lor/140.html
		http://sources.zabbadoz.net/freebsd/lor/142.html
		http://sources.zabbadoz.net/freebsd/lor/145.html
		http://sources.zabbadoz.net/freebsd/lor/152.html
		http://sources.zabbadoz.net/freebsd/lor/158.html
2005-09-09 10:06:27 +00:00
Gleb Smirnoff
5d40d65b5a When a carp(4) interface is being destroyed and is in promiscuous mode,
first interface is detached from parent and then bpfdetach() is called.
If the interface was the last carp(4) interface attached to parent, then
the mutex on parent is destroyed. When bpfdetach() calls if_setflags()
we panic on destroyed mutex.

To prevent the above scenario, clear pointer to parent, when we detach
ourselves from parent.
2005-09-09 08:41:39 +00:00
Sam Leffler
245c31ccaf clear lock on error in O_LIMIT case of install_state
Submitted by:	Ted Unangst
MFC after:	3 days
2005-09-04 17:33:40 +00:00
Andre Oppermann
e0aec68255 Use the correct mbuf type for MGET(). 2005-08-30 16:35:27 +00:00
Gleb Smirnoff
e3ea67a077 Add newline to debugging printf.
PR:		kern/85271
Submitted by:	Simon Morgan
2005-08-26 15:27:18 +00:00
Gleb Smirnoff
360856f60e - Refuse hashsize of 0, since it is invalid.
- Use defined constant instead of 512.
2005-08-25 13:57:00 +00:00