226 Commits

Author SHA1 Message Date
Gleb Smirnoff
6aef0416fb Use M_NOWAIT while holding the pf giant lock. 2012-07-15 19:10:00 +00:00
Gleb Smirnoff
40874f18de Merge revision 1.715 from OpenBSD:
date: 2010/12/24 20:12:56;  author: henning;  state: Exp;  lines: +3 -3
  in pf_src_connlimit, the indices to sk->addr were swapped.
  tracked down and diff sent by Robert B Mills <rbmills at sdf.lonestar.org>
  thanks, very good work! ok claudio

Impact is that the "flush" keyword didn't work.

Obtained from:	OpenBSD
MFC after:	1 week
2012-06-06 09:36:52 +00:00
Ermal Luçi
0ad5ef9c8f Correct table counter functionality to not panic.
This was caused by improper initialization of necessary parameters.

PR: 168200
Reviewed by:	bz@, glebius@
MFC after:	1 week
2012-05-31 20:10:05 +00:00
Alexander V. Chernikov
bdf942c3f0 Revert r234834 per luigi@ request.
Cleaner solution (e.g. adding another header) should be done here.

Original log:
  Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
  Remove ipfw/ip_fw_private.h header from non-ipfw code.

Requested by:      luigi
Approved by:       kib(mentor)
2012-05-03 08:56:43 +00:00
Alexander V. Chernikov
7bd5e9b143 Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Approved by:        ae(mentor)
MFC after:          2 weeks
2012-04-30 10:22:23 +00:00
Andrey V. Elsukov
59894e4a44 Fix VIMAGE build. 2012-04-05 04:41:06 +00:00
Gleb Smirnoff
07b6b55dce Merge from OpenBSD:
revision 1.173
  date: 2011/11/09 12:36:03;  author: camield;  state: Exp;  lines: +11 -12
  State expire time is a baseline time ("last active") for expiry
  calculations, and does _not_ denote the time when to expire.  So
  it should never be added to (set into the future).

  Try to reconstruct it with an educated guess on state import and
  just set it to the current time on state updates.

  This fixes a problem on pfsync listeners where the expiry time
  could be double the expected value and cause a lot more states
  to linger.
2012-04-04 14:47:59 +00:00
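The baseline semantics described in 07b6b55dce can be sketched in user-space C. This is a minimal illustration, not the pf(4) code: every `toy_` name is hypothetical, and the idea that an imported state arrives with its remaining lifetime in seconds is an assumption standing in for the commit's "educated guess".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a pf state; only the fields needed here. */
struct toy_state {
	uint32_t expire;	/* baseline: time the state was last active */
	uint32_t timeout;	/* lifetime in seconds, counted from the baseline */
};

/* A state is expired once `timeout` seconds have passed since the baseline. */
bool
toy_state_expired(const struct toy_state *s, uint32_t now)
{
	return (now >= s->expire && now - s->expire >= s->timeout);
}

/* On a state update, the baseline is simply set to the current time. */
void
toy_state_touch(struct toy_state *s, uint32_t now)
{
	s->expire = now;
}

/*
 * On import, reconstruct the baseline from the remaining lifetime
 * (assumed input). The baseline is moved into the past, never the future.
 */
void
toy_state_import(struct toy_state *s, uint32_t now, uint32_t remaining)
{
	uint32_t elapsed = remaining < s->timeout ? s->timeout - remaining : 0;

	s->expire = now > elapsed ? now - elapsed : 0;
}
```

Adding the timeout to `expire` on import, rather than subtracting the elapsed part, would push the baseline forward and lengthen the effective lifetime, which is the kind of lingering state the commit describes.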
Gleb Smirnoff
64484cf630 Since pf 4.5 import pf(4) has a mechanism to defer
forwarding a packet, that creates state, until
pfsync(4) peer acks state addition (or 10 msec
timeout passes).

This is needed for active-active CARP configurations,
which are poorly supported in FreeBSD and arguably
a good idea at all.

Unfortunately by the time of import this feature in
OpenBSD was turned on, and did not have a switch to
turn it off. This leaked to FreeBSD.

This change makes it possible to turn this feature
off via ioctl() and turns it off by default.

Obtained from:	OpenBSD
2012-04-03 18:09:20 +00:00
Gleb Smirnoff
0e2fe5f990 Merge from OpenBSD:
revision 1.146
  date: 2010/05/12 08:11:11;  author: claudio;  state: Exp;  lines: +2 -3
  bzero() the full compressed update struct before setting the values.
  This is needed because pf_state_peer_hton() skips some fields in certain
  situations which could result in garbage being sent to the other peer.
  This seems to fix the pfsync storms seen by stephan@ and so dlg owes me
  a whiskey.

I didn't see any storms, but this definitely fixes a useless memory
allocation on the receiving side, due to non zero scrub_flags field
in a pfsync_state_peer structure.
2012-03-08 09:20:00 +00:00
Bjoern A. Zeeb
0e2181f578 Extend IPv6 routing lookups in pf(4) to use the new multi-FIB KPI.
Try to make the "rtable" handling work but the current version of
pf(4) does not fully support it yet as especially callers of
PF_MISMATCHAW() are not fully FIB-aware.  OpenBSD seems to have
fixed this in a later version.  Prepare as much as possible.

Sponsored by:	Cisco Systems, Inc.
2012-02-03 13:20:48 +00:00
Gleb Smirnoff
3a8c7fa008 Allocate our mbuf with m_get2(). 2012-01-17 12:14:26 +00:00
Christian S.J. Peron
5646ad6d27 Revert to the old behavior of allocating table/table entries using
M_NOWAIT.  Currently, the code allows for sleeping in the ioctl path
to guarantee allocation.  However code also handles ENOMEM gracefully, so
propagate this error back to user-space, rather than sleeping while
holding the global pf mutex.

Reviewed by:	glebius
Discussed with:	bz
2012-01-14 22:51:34 +00:00
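The trade-off in 5646ad6d27 (fail fast with ENOMEM instead of sleeping under a lock) can be sketched in user-space C. This is only an illustration under stated assumptions: the `toy_` names are hypothetical, and an allocator passed as a function pointer stands in for a kernel allocation made with M_NOWAIT.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator type; returning NULL models a failed M_NOWAIT request. */
typedef void *(*toy_alloc_fn)(size_t);

/* An allocator that always fails, to exercise the error path. */
void *
toy_alloc_fail(size_t size)
{
	(void)size;
	return (NULL);
}

struct toy_entry {
	int addr;
};

/*
 * Add an entry without ever blocking: if memory is not available,
 * report ENOMEM to the caller, who can pass it back to user space.
 */
int
toy_table_add(toy_alloc_fn alloc, int addr, struct toy_entry **out)
{
	struct toy_entry *e = alloc(sizeof(*e));

	if (e == NULL)
		return (ENOMEM);
	e->addr = addr;
	*out = e;
	return (0);
}
```

The caller decides whether to retry; nothing in this path waits for memory while a lock is held.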
Gleb Smirnoff
b4f66a1781 Redo r226660:
 - Define schednetisr() to swi_sched.
 - In the swi handler check if there is some data prepared,
   and if true, then call pfsync_sendout(), however tell it
   not to schedule swi again.
 - Since now we don't obtain the pfsync lock in the swi handler,
   don't use ifqueue mutex to synchronize queue access.
2012-01-11 18:34:57 +00:00
Gleb Smirnoff
122d395f85 Fix some spacing in code under __FreeBSD__. 2012-01-11 14:24:03 +00:00
Gleb Smirnoff
c4f01d2d34 Add necessary locking in pfsync_in_ureq(). 2012-01-11 14:19:04 +00:00
Gleb Smirnoff
0744a28a79 Move PF_LOCK_ASSERT() under __FreeBSD__. 2012-01-11 14:13:42 +00:00
Gleb Smirnoff
3488c2786e Merge from OpenBSD:
revision 1.128
  date: 2009/08/16 13:01:57;  author: jsg;  state: Exp;  lines: +1 -5
  remove prototypes of a bunch of functions that had their implementations
  removed in pfsync v5.
2012-01-11 14:11:10 +00:00
Gleb Smirnoff
686cb93667 When running with INVARIANTS the mutex(9) code does all necessary
asserts for non-recursive mutexes.
2012-01-11 13:57:48 +00:00
Gleb Smirnoff
1d89f286c4 Can't pass MSIZE to m_cljget(), an mbuf can't be attached as external storage
to another mbuf.
2012-01-09 14:35:05 +00:00
Gleb Smirnoff
317ebc3d0d Backout of backout: we need SI_SUB_PROTO_DOMAIN for pfsync, since
it needs existing inetdomain on startup.
2012-01-09 12:06:02 +00:00
Gleb Smirnoff
101881ef42 Revert sub argument of MODULE_DECLARE back to r226532.
Noticed by:	bz
2012-01-09 09:19:00 +00:00
Gleb Smirnoff
151ceaa22c In FreeBSD we determine presence of pfsync(4) at run-time, not
at compile time, so define NPFSYNC to 1 always. While here, remove
unused defines.
2012-01-09 08:55:23 +00:00
Gleb Smirnoff
5c39f7bdeb Bunch of fixes to pfsync(4) module load/unload:
o Make the pfsync.ko actually usable. Before this change loading it
  didn't register protosw, so was a nop. However, a module in /boot/kernel
  did confuse users.
o Rewrite the way we are joining multicast group:
  - Move multicast initialization/destruction to separate functions.
  - Don't allocate memory if we aren't going to join a multicast group.
  - Use modern API for joining/leaving multicast group.
  - Now the utterly wrong pfsync_ifdetach() isn't needed.
o Move module initialization from SYSINIT(9) to moduledata_t method.
o Refuse to unload module, unless asked forcibly.
o Improve a bit some FreeBSD porting code:
  - Use separate malloc type.
  - Simplify swi scheduling.

This change is probably wrong from VIMAGE viewpoint, however pfsync
wasn't VIMAGE-correct before this change, too.

Glanced at by:	bz
2012-01-09 08:50:22 +00:00
Gleb Smirnoff
98a38f5b1d o Fix panic on module unload, that happened due to mutex being
  destroyed prior to pfsync_uninit(). To do this, move all the
  initialization to the module_t method, instead of SYSINIT(9).
o Fix another panic after module unload, due to not clearing the
  m_addr_chg_pf_p pointer.
o Refuse to unload module, unless being unloaded forcibly.
o Revert the sub argument to MODULE_DECLARE, to the stable/8 value.

This change probably isn't correct from viewpoint of VIMAGE, but
the module wasn't VIMAGE-correct before the change, as well.

Glanced at by:	bz
2012-01-09 08:36:12 +00:00
Gleb Smirnoff
dabfce9a5a Merge from OpenBSD:
revision 1.170
  date: 2011/10/30 23:04:38;  author: mikeb;  state: Exp;  lines: +6 -7
  Allow setting big MTU values on the pfsync interface but not larger
  than the syncdev MTU.  Prompted by the discussion with and tested
  by Maxim Bourmistrov;  ok dlg, mpf

Consistently use sc_ifp->if_mtu in the MTU check throughout the
module. This backs out r228813.
2012-01-07 14:39:45 +00:00
Gleb Smirnoff
e883df1d1b Fix indentation. 2012-01-07 12:40:45 +00:00
Sergey Kandaurov
da914858e1 Fix LINT-VIMAGE build after r228814: use virtualized pf_pool_limits. 2011-12-24 00:23:27 +00:00
Gleb Smirnoff
6bc752e028 Merge from OpenBSD:
revision 1.122
  date: 2009/05/13 01:01:34;  author: dlg;  state: Exp;  lines: +6 -4
  only keep track of the number of updates on tcp connections. state sync on
  all the other protocols is simply pushing the timeouts along which has a
  resolution of 1 second, so it isn't going to be hurt by pfsync taking up
  to a second to send it over.

  keep track of updates on tcp still though, their windows need constant
  attention.
2011-12-22 19:09:55 +00:00
Gleb Smirnoff
2662e31fc3 Merge from OpenBSD:
revision 1.120
  date: 2009/04/04 13:09:29;  author: dlg;  state: Exp;  lines: +5 -5
  use time_uptime instead of time_second internally. time_uptime isn't
  affected by adjusting the clock.

  revision 1.175
  date: 2011/11/25 12:52:10;  author: dlg;  state: Exp;  lines: +3 -3
  use time_uptime to set state creation values as time_second can be
  skewed at runtime by things like date(1) and ntpd. time_uptime is
  monotonic and therefore more useful to compare against.
2011-12-22 19:05:58 +00:00
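The reasoning in 2662e31fc3 (compare against a clock that cannot be set backwards) has a user-space analogue in POSIX `clock_gettime()`. The sketch below is an assumption-laden illustration, not kernel code: `toy_uptime()` and `toy_state_age()` are hypothetical names, and `CLOCK_MONOTONIC` plays the role that time_uptime plays in the kernel.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/*
 * Seconds from a monotonic clock. Unlike wall-clock time, this value
 * does not jump when the system date is changed.
 */
time_t
toy_uptime(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return ((time_t)-1);
	return (ts.tv_sec);
}

/* Age in seconds of a state whose creation time came from toy_uptime(). */
time_t
toy_state_age(time_t created)
{
	return (toy_uptime() - created);
}
```

Because both timestamps come from the same monotonic source, the computed age cannot go negative when the wall clock is adjusted.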
Gleb Smirnoff
e3b670692a Merge couple more fixes from OpenBSD to bulk processing:
revision 1.118
  date: 2009/03/23 06:19:59;  author: dlg;  state: Exp;  lines: +8 -6
  wait an appropriate amount of time before giving up on a bulk update,
  rather than giving up after a hardcoded 5 seconds (which is generally much
  too short an interval for a bulk update).
  pointed out by david@, eyeballed by mcbride@

  revision 1.171
  date: 2011/10/31 22:02:52;  author: mikeb;  state: Exp;  lines: +2 -1
  Don't forget to cancel bulk update failure timeout when destroying an
  interface.  Problem report and fix from Erik Lax, thanks!

Start a brief note of revisions merged from OpenBSD.
2011-12-22 18:56:27 +00:00
Gleb Smirnoff
c5360c2998 We really mean MTU of the real interface here, not of our pseudo. 2011-12-22 18:51:35 +00:00
Gleb Smirnoff
538c3a7cd0 In FreeBSD we always have bpf(4) API, either real or stub. No need
to detect the presence of 'device bpf'.
2011-12-22 18:31:47 +00:00
Gleb Smirnoff
f08535f872 Restore a feature that was present in 5.x and 6.x, and was cleared in
7.x, 8.x and 9.x with pf(4) imports: pfsync(4) should suppress CARP
preemption, while it is running its bulk update.

However, reimplement the feature in more elegant manner, that is
partially inspired by newer OpenBSD:

- Rename term "suppression" to "demotion", to match with OpenBSD.
- Keep a global demotion factor, that can be raised by several
  conditions, for now these are:
  - interface goes down
  - carp(4) has problems with ip_output() or ip6_output()
  - pfsync performs bulk update
- Unlike in OpenBSD the demotion factor isn't a counter, but
  is actual value added to advskew. The adjustment values for
  particular error conditions are also configurable, and their
  defaults are maximum advskew value, so a single failure bumps
  demotion to maximum. This is for POLA compatibility, and should
  satisfy most users.
- Demotion factor is a writable sysctl, so user can do
  foot shooting, if he desires to.
2011-12-20 13:53:31 +00:00
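The arithmetic described in f08535f872 can be sketched as follows. This is a toy model, not the carp(4) implementation: the `toy_` names are hypothetical, the maximum skew of 240 is an assumed value, and clamping the sum to that maximum is an assumption made for the sketch.

```c
/* Assumed maximum advskew; treat this value as illustrative. */
#define TOY_MAXSKEW 240

/* Global demotion factor: a value added to advskew, not an event counter. */
int toy_demotion = 0;

/* Raise or lower the demotion factor when a condition starts or ends. */
void
toy_demote_adj(int adj)
{
	toy_demotion += adj;
}

/* Skew actually advertised: configured advskew plus the demotion factor. */
int
toy_effective_advskew(int advskew)
{
	int v = advskew + toy_demotion;

	if (v > TOY_MAXSKEW)
		v = TOY_MAXSKEW;
	if (v < 0)
		v = 0;
	return (v);
}
```

With the adjustment set to the maximum skew, a single condition such as a running bulk update pushes the advertised skew to the maximum, and removing the same adjustment restores the configured value.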
Gleb Smirnoff
352e70652f - Cover pfsync callouts deletion with PF_LOCK().
- Cover setting up interface between pf and pfsync with PF_LOCK().
2011-12-20 12:34:16 +00:00
Gleb Smirnoff
c53680a8ec Return value should be conditional on return value of pfsync_defer_ptr()
PR:		kern/162947
Submitted by:	Matthieu Kraus <matthieu.kraus s2008.tu-chemnitz.de>
2011-11-30 08:47:17 +00:00
Kevin Lo
8d5eb1c4c8 Add missing PF_UNLOCK in pf_test
Reviewed by:	bz
2011-10-30 14:55:00 +00:00
Gleb Smirnoff
3e850a12ef Utilize new IF_DEQUEUE_ALL(ifq, m) macro in pfsyncintr() to reduce
contention on ifqueue lock.
2011-10-27 09:47:00 +00:00
Gleb Smirnoff
9932deae93 Merge several fixes to bulk update processing from OpenBSD. Merged
revisions: 1.148, 1.149, 1.150. This brings the number of states on
master/slave to a sane value.
2011-10-23 15:15:17 +00:00
Gleb Smirnoff
8ecd40b6b2 Fix indentation, no code changed. 2011-10-23 15:10:15 +00:00
Gleb Smirnoff
2f2086d57e - Fix a bad typo (FreeBSD specific) in pfsync_bulk_update(). Instead
  of scheduling next run pfsync_bulk_update(), pfsync_bulk_fail()
  was scheduled.
  This led to instant 100% state leak after first bulk update
  request.
- After above fix, it appeared that pfsync_bulk_update() lacks
  locking. To fix this, sc_bulk_tmo callout was converted to an
  mtx one. Eventually, all pf/pfsync callouts should be converted
  to mtx version, since it isn't possible to stop or drain a
  non-mtx callout without risk of race.
- Add comment that callout_stop() in pfsync_clone_destroy() lacks
  locking. Since pfsync0 can't be destroyed (yet), let it be here.
2011-10-23 15:08:18 +00:00
Gleb Smirnoff
35ad95774e Fix from r226623 is not sufficient to close all races in pfsync(4).
The root of problem is re-locking at the end of pfsync_sendout().
Several functions are calling pfsync_sendout() holding pointers
to pf data on stack, and these functions expect this data to be
consistent.

To fix this, the following approach was taken:

- The pfsync_sendout() doesn't call ip_output() directly, but
  enqueues the mbuf on sc->sc_ifp's interfaces queue, that
  is currently unused. Then pfsync netisr is scheduled. PF_LOCK
  isn't dropped in pfsync_sendout().
- The netisr runs through queue and ip_output()s packets
  on it.

Apart from fixing race, this also decouples stack, fixing
potential issues, that may happen, when sending pfsync(4)
packets on input path.

Reviewed by:	eri (a quick review)
2011-10-23 14:59:54 +00:00
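The approach taken in 35ad95774e (enqueue while the lock is held, transmit later from a separate handler) can be sketched in user-space C. This is a minimal model under stated assumptions: the `toy_` types and functions are hypothetical, locking is indicated only in comments, and a callback stands in for ip_output().

```c
#include <stddef.h>

struct toy_pkt {
	struct toy_pkt *next;
	int id;
};

struct toy_queue {
	struct toy_pkt *head;
	struct toy_pkt *tail;
	int len;
};

/* Records the id of the last packet "sent"; used as a sample output hook. */
int toy_last_sent = -1;

void
toy_record_output(struct toy_pkt *p)
{
	toy_last_sent = p->id;
}

/* Producer side: called with the (imagined) pf lock held; never transmits. */
void
toy_enqueue(struct toy_queue *q, struct toy_pkt *p)
{
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
}

/* Detach the whole list in one step, so the queue is touched only once. */
struct toy_pkt *
toy_dequeue_all(struct toy_queue *q)
{
	struct toy_pkt *list = q->head;

	q->head = q->tail = NULL;
	q->len = 0;
	return (list);
}

/* Handler side: walk the detached list and hand each packet to `output`. */
int
toy_drain(struct toy_queue *q, void (*output)(struct toy_pkt *))
{
	struct toy_pkt *p = toy_dequeue_all(q), *next;
	int sent = 0;

	for (; p != NULL; p = next) {
		next = p->next;
		p->next = NULL;
		output(p);
		sent++;
	}
	return (sent);
}
```

Detaching the whole list at once mirrors the idea behind the IF_DEQUEUE_ALL() change in 3e850a12ef: the queue is accessed once per run rather than once per packet.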
Gleb Smirnoff
68270a37c8 Absence of M_WAITOK in malloc flags for UMA doesn't
equal presence of M_NOWAIT. Specify M_NOWAIT explicitly.

This fixes sleeping with PF_LOCK().
2011-10-23 10:13:20 +00:00
Gleb Smirnoff
f54a3a046e Correct flag for uma_zalloc() is M_WAITOK. M_WAIT is an old and
deprecated flag from historical mbuf(9) allocator.

This is style only change.
2011-10-23 10:05:25 +00:00
Gleb Smirnoff
8dc59178a8 Fix a race: we should update sc_len before dropping the pf lock, otherwise a
number of packets can be queued on sc, while we are in ip_output(), and then
we wipe the accumulated sc_len. On next pfsync_sendout() that would lead to
writing beyond our mbuf cluster.
2011-10-21 22:28:15 +00:00
Gleb Smirnoff
b6b8562bfc In FreeBSD ip_output() expects ip_len and ip_off in host byte order
PR:		kern/159029
2011-10-21 11:11:18 +00:00
Bjoern A. Zeeb
e999988442 Fix recursive pf locking leading to panics. Splatter PF_LOCK_ASSERT()s
to document where we are expecting to be called with a lock held to
more easily catch unnoticed code paths.
This does not necessarily improve locking in pfsync, it just tries
to avoid the panics reported.

PR:		kern/159390, kern/158873
Submitted by:	pluknet (at least something that partly resembles
		my patch ignoring other cleanup, which I only saw
		too late on the 2nd PR)
MFC After:	3 days
2011-10-19 13:13:56 +00:00
Bjoern A. Zeeb
c902d29994 De-virtualize the pf_task_mtx lock. At the current state of pf locking
and virtualization it is not helpful but complicates things.

Current state of art is to not virtualize these kinds of locks -
inp_group/hash/info/.. are all not virtualized either.

MFC after:	3 days
2011-10-19 11:04:49 +00:00
Bjoern A. Zeeb
232ec0c97d Adjust the PF_ASSERT() macro to what we usually use in the network stack:
PF_LOCK_ASSERT() and PF_UNLOCK_ASSERT().

MFC after:	3 days
2011-10-19 10:16:42 +00:00
Bjoern A. Zeeb
72aed41bed In the non-FreeBSD case we do not expect PF_LOCK and friends to do anything.
MFC after:	3 days
2011-10-19 10:08:58 +00:00
Bjoern A. Zeeb
5b63183446 Pseudo interfaces should go at SI_SUB_PSEUDO. However at least
pfsync also depends on pf to be initialized already so pf goes at
FIRST and the interfaces go at ANY.
Then the (VNET_)SYSINIT startups for pf stays at SI_SUB_PROTO_BEGIN
and for pfsync we move to the later SI_SUB_PROTO_IF.

This is not ideal either but at least an order that should work for
the moment and can be refined with the VIMAGE merge, once this will
actually work with more than one network stack.

MFC after:	3 days
2011-10-19 10:04:24 +00:00