Commit Graph

23 Commits

Author SHA1 Message Date
Luigi Rizzo
7da98b8ab6 MFC 205602:
Honor ip.fw.one_pass when a packet comes out of a pipe without being delayed.
I forgot to handle this case when i did the mtag cleanup three months ago.

I am merging immediately because this bugfix is important for
people using RELENG_8.

PR:           145004
2010-03-24 15:19:47 +00:00
Luigi Rizzo
8018e843a3 MFC of a large number of ipfw and dummynet fixes and enhancements
done in CURRENT over the last 4 months.
HEAD and RELENG_8 are almost in sync now for ipfw, dummynet
the pfil hooks and related components.

Among the most noticeable changes:
- r200855 more efficient lookup of skipto rules, and remove O(N)
  blocks from critical sections in the kernel;
- r204591 large restructuring of the dummynet module, with support
  for multiple scheduling algorithms (4 available so far)
See the original commit logs for details.

Changes in the kernel/userland ABI should be harmless because the
kernel is able to understand previous requests from RELENG_8 and
RELENG_7. For this reason, this changeset would be applicable
to RELENG_7 as well, but i am not sure if it is worthwhile.
2010-03-23 09:58:59 +00:00
Julian Elischer
2ae7ec29fd MFC of 197952 and 198075
Virtualize the pfil hooks so that different jails may chose different
    packet filters. ALso allows ipfw to be enabled on on ejail and disabled
    on another. In 8.0 it's a global setting.
and
    Unbreak the VIMAGE build with IPSEC, broken with r197952 by
    virtualizing the pfil hooks.
    For consistency add the V_ to virtualize the pfil hooks in here as well.
2010-02-07 09:00:22 +00:00
Hajimu UMEMOTO
479812d91c MFC r200055, r200102:
- Teach an IPv6 to the debug prints.
- Use INET_ADDRSTRLEN and INET6_ADDRSTRLEN rather than hard
  coded number.
2010-01-04 15:22:38 +00:00
Hajimu UMEMOTO
30feab0076 MFC r200027: Teach an IPv6 to send_pkt() and ipfw_tick().
It fixes the issue which keep-alive doesn't work for an IPv6.
2010-01-04 15:05:11 +00:00
Luigi Rizzo
3cdcbc4885 some simple MFC:
r200020:
  change the type of the opcode from enum *:8  to u_int8_t
  so the size and alignment of the ipfw_insn is not compiler dependent.
  No changes in the code generated by gcc.

r200023:
  Add new sockopt names for ipfw and dummynet.

  This commit is just grabbing entries for the new names
  that will be used in the future, so you don't need to
  rebuild anything now.

r200034
  Dispatch sockopt calls to ipfw and dummynet
  using the new option numbers, IP_FW3 and IP_DUMMYNET3.
  Right now the modules return an error if called with those arguments
  so there is no danger of unwanted behaviour.

r200040
  - initialize src_ip in the main loop to prevent a compiler warning
    (gcc 4.x under linux, not sure how real is the complaint).
  - rename a macro argument to prevent name clashes.
  -  add the macro name on a couple of #endif
  - add a blank line for readability.
2009-12-05 12:51:51 +00:00
Oleg Bulyzhin
c366b7c1e2 MFC r198845:
Fix two issues that can lead to exceeding configured pipe bandwidth:
- do not expire queues which are not ready to be expired.
- properly calculate available burst size.

MFC r199073:
style(9): add missing parentheses
2009-11-09 10:13:24 +00:00
Julian Elischer
f8f0b70474 MFC r196423
Fix ipfw's initialization functions to get the correct order of evaluation
  to allow vnet and non vnet operation. Move some functions from ip_fw_pfil.c
  to ip_fw2.c and mode to mostly using the SYSINIT and VNET_SYSINIT handlers
  instead of the modevent handler. Correct some spelling errors in comments
  in the affected code. Note this bug fixes a crash in NON VIMAGE kernels when
  ipfw is unloaded.

  This patch is a minimal patch for 8.0
  I have a much larger patch that actually fixes the underlying problems
  that will be applied after 8.0

Reviewed by:	zec@, rwatson@, bz@(earlier version)
Approved by:	re (rwatson)
2009-08-21 11:23:29 +00:00
Julian Elischer
f394882d89 MFC of r196201
URL: http://svn.freebsd.org/changeset/base/196201

  Fix ipfw crash on uid or gid check.
  Receiving any ip packet for which there is no existing socket will
  crash if ipfw has a uid or gid test rule, as the uid/gid
  of the non existent owner of said non existent socket is tested.
  Brooks introduced this error as part of his >16 gids patch.
  It appears to be a cut-n-paste error from similar code a few lines
  before. The old code used the 'pcb' variable here, but in the
  new code that switched the 'inp' variable, which is often NULL
  and what is tested in the code further up. The rest of the multi-gid
  patch for ipfw seems solid (and cleaner than previous code).

p.s. What's up with all the properties changing? It is a fresh checkout.

Reviewed by:	brooks
Approved by:	re (rwatson)
2009-08-14 10:25:14 +00:00
Robert Watson
530c006014 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
Julian Elischer
3d1001cb11 Startup the vnet part of initialization a bit after the global part.
Fixes crash on boot if ipfw compiled in.

Submitted by:	tegge@
Reviewed by:	tegge@
Approved by:	re (kib)
2009-07-28 19:58:07 +00:00
Julian Elischer
9d85f50ad5 Catch ipfw up to the rest of the vimage code.
It got left behind when it moved to its new location.

Approved by:	re (kensmith)
2009-07-25 06:42:42 +00:00
Robert Watson
1e77c1056a Remove unused VNET_SET() and related macros; only VNET_GET() is
ever actually used.  Rename VNET_GET() to VNET() to shorten
variable references.

Discussed with:	bz, julian
Reviewed by:	bz
Approved by:	re (kensmith, kib)
2009-07-16 21:13:04 +00:00
Robert Watson
eddfbb763d Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
Robert Watson
6c8615603b Update various IPFW-related modules to use if_addr_rlock()/
if_addr_runlock() rather than IF_ADDR_LOCK()/IF_ADDR_UNLOCK().

MFC after:	6 weeks
2009-06-26 00:46:50 +00:00
Oleg Bulyzhin
6882bf4d92 - fix dummynet 'fast' mode for WF2Q case.
- fix printing of pipe profile data.
- introduce new pipe parameter: 'burst' - how much data can be sent through
  pipe bypassing bandwidth limit.
2009-06-24 22:57:07 +00:00
Brooks Davis
838d985825 Rework the credential code to support larger values of NGROUPS and
NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024
and 1023 respectively.  (Previously they were equal, but under a close
reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
is the number of supplemental groups, not total number of groups.)

The bulk of the change consists of converting the struct ucred member
cr_groups from a static array to a pointer.  Do the equivalent in
kinfo_proc.

Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
a process credential before modifying it and for setting group lists
respectively.  Both interfaces take care for the details of allocating
groups array. crsetgroups() takes care of truncating the group list
to the current maximum (NGROUPS) if necessary.  In the future,
crsetgroups() may be responsible for insuring invariants such as sorting
the supplemental groups to allow groupmember() to be implemented as a
binary search.

Because we can not change struct xucred without breaking application
ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
always 16 and is to be used or NGRPS as appropriate for things such as
NFS which need to use no more than 16 groups.  When feasible, truncate
the group list rather than generating an error.

Minor changes:
  - Reduce the number of hand rolled versions of groupmember().
  - Do not assign to both cr_gid and cr_groups[0].
  - Modify ipfw to cache ucreds instead of part of their contents since
    they are immutable once referenced by more than one entity.

Submitted by:	Isilon Systems (initial implementation)
X-MFC after:	never
PR:		bin/113398 kern/133867
2009-06-19 17:10:35 +00:00
Oleg Bulyzhin
1917ef996d Since dn_pipe.numbytes is int64_t now - remove unnecessary overflow detection
code in ready_event_wfq().
2009-06-15 17:14:47 +00:00
Luigi Rizzo
6167e6c88f in ip_dn_ctl(), do not allocate a large structure on the stack,
and use malloc() instead if/when it is necessary.

The problem is less relevant in previous versions because
the variable involved (tmp_pipe) is much smaller there.
Still worth fixing though.

Submitted by:	Marta Carbone (GSOC)
MFC after:	3 days
2009-06-10 10:47:31 +00:00
Luigi Rizzo
1a5d0c2bf0 small simplifications to the code in charge of reaping deleted rules:
- clear the head pointer immediately before using it, so there is
  no chance of mistakes;
- call reap_rules() unconditionally. The function can handle a NULL
  argument just fine, and the cost of the extra call is hardly
  significant given that we do it rarely and outside the lock.

MFC after:	3 days
2009-06-10 10:34:59 +00:00
Oleg Bulyzhin
dda10d624c Close long existed race with net.inet.ip.fw.one_pass = 0:
If packet leaves ipfw to other kernel subsystem (dummynet, netgraph, etc)
it carries pointer to matching ipfw rule. If this packet then reinjected back
to ipfw, ruleset processing starts from that rule. If rule was deleted
meanwhile, due to existed race condition panic was possible (as well as
other odd effects like parsing rules in 'reap list').

P.S. this commit changes ABI so userland ipfw related binaries should be
recompiled.

MFC after:	1 month
Tested by:	Mikolaj Golub
2009-06-09 21:27:11 +00:00
Bjoern A. Zeeb
8d8bc0182e After r193232 rt_tables in vnet.h are no longer indirectly dependent on
the ROUTETABLES kernel option thus there is no need to include opt_route.h
anymore in all consumers of vnet.h and no longer depend on it for module
builds.

Remove the hidden include in flowtable.h as well and leave the two
explicit #includes in ip_input.c and ip_output.c.
2009-06-08 19:57:35 +00:00
Luigi Rizzo
908e960ea6 move kernel ipfw-related sources to a separate directory,
adjust conf/files and modules' Makefiles accordingly.

No code or ABI changes so this and most of previous related
changes can be easily MFC'ed

MFC after:	5 days
2009-06-05 19:22:47 +00:00