107 Commits

Author SHA1 Message Date
jhb
b92445bf9a MFC 266852,270223:
- Fix pf(4) to build with MAXCPU set to 256.  MAXCPU is actually a count,
  not a maximum ID value (so it is a cap on mp_ncpus, not mp_maxid).
- Bump MAXCPU on amd64 from 64 to 256.  In practice APIC only permits 255
  CPUs (IDs 0 through 254).  Getting above that limit requires x2APIC.
2015-05-22 21:51:36 +00:00
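The count-versus-ID distinction in b92445bf9a can be illustrated with a minimal sketch. The constant and function names below are hypothetical stand-ins, not the kernel's definitions; the only fact taken from the commit is that MAXCPU is a count, so valid IDs stop one below it.

```c
/* Hypothetical stand-in for the kernel constant; 256 matches the amd64 bump. */
#define SKETCH_MAXCPU 256

/*
 * Returns 1 if cpu_id can index an array of SKETCH_MAXCPU elements.
 * Because the constant is a count, the test is strict: ID 256 is invalid.
 */
int
cpu_id_is_valid(unsigned int cpu_id)
{
	return (cpu_id < SKETCH_MAXCPU);
}

/* Largest usable ID for a given CPU count; count must be nonzero. */
unsigned int
max_id_for_count(unsigned int count)
{
	return (count - 1);
}
```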
gnn
b67748dabd MFC: 281529
I can find no reason to allow packets with both SYN and FIN bits
set past this point in the code. The packet should be dropped and
not massaged as it is here.

Differential Revision:  https://reviews.freebsd.org/D2266
Submitted by: eri
Sponsored by: Rubicon Communications (Netgate)
2015-05-09 19:36:30 +00:00
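The SYN+FIN check described in b67748dabd might look roughly like the sketch below. The flag values are the standard TCP header bits; the enum and function name are hypothetical and are not taken from the pf source.

```c
#include <stdint.h>

/* Standard TCP header flag bits. */
#define SK_TH_FIN 0x01
#define SK_TH_SYN 0x02

/* Hypothetical verdicts, standing in for pf's pass/drop actions. */
enum sk_action { SK_PASS, SK_DROP };

/* Drop any segment that carries both SYN and FIN; pass everything else. */
enum sk_action
check_syn_fin(uint8_t th_flags)
{
	const uint8_t both = SK_TH_SYN | SK_TH_FIN;

	if ((th_flags & both) == both)
		return (SK_DROP);
	return (SK_PASS);
}
```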
ae
7fe0fee3eb MFC r279910:
Reset mbuf pointer to NULL in fastroute case to indicate that mbuf was
  consumed by filter. This fixes several panics caused by accessing the mbuf
  after it was freed.
2015-03-19 12:49:55 +00:00
glebius
52282183c0 Merge r274709 by eri@: deal with IPv6 the same way as we do with IPv4 and calculate
the checksum before entering pf_test6().

PR:		172648, 179392
2015-01-23 18:15:15 +00:00
gnn
c5ebff0eaa MFC: 272906
Change the PF hash from Jenkins to Murmur3.  In forwarding tests
this showed a conservative 3% increase in PPS.

Original Differential Revision:	https://reviews.freebsd.org/D461
Submitted by:	des
Reviewed by:	emaste
2014-11-13 21:58:42 +00:00
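For readers unfamiliar with the hash named in c5ebff0eaa, here is a sketch of 32-bit Murmur3 over an array of 32-bit words. It follows the published MurmurHash3 x86_32 constants; the function name and the word-array interface are assumptions made for illustration, not the kernel's actual routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by r bits (0 < r < 32). */
static uint32_t
rotl32(uint32_t x, int r)
{
	return ((x << r) | (x >> (32 - r)));
}

/* Murmur3 (x86_32 variant) over nwords 32-bit words; no tail bytes. */
uint32_t
murmur3_words(const uint32_t *key, size_t nwords, uint32_t seed)
{
	uint32_t h = seed;
	size_t i;

	for (i = 0; i < nwords; i++) {
		uint32_t k = key[i];

		/* Mix one input word into the running hash. */
		k *= 0xcc9e2d51;
		k = rotl32(k, 15);
		k *= 0x1b873593;
		h ^= k;
		h = rotl32(h, 13);
		h = h * 5 + 0xe6546b64;
	}

	/* Finalization: fold in the byte length, then avalanche. */
	h ^= (uint32_t)(nwords * 4);
	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return (h);
}
```

A caller would typically reduce the result to a bucket index with a power-of-two mask, for example `murmur3_words(key, n, seed) & hashmask`.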
hselasky
1f41d295fb MFC r263710, r273377, r273378, r273423 and r273455:
- De-vnet hash sizes and hash masks.
- Fix multiple issues related to arguments passed to SYSCTL macros.

Sponsored by:	Mellanox Technologies
2014-10-27 14:38:00 +00:00
glebius
86f98e492e Merge r272358 from head:
Use rn_detachhead() instead of direct free(9) for radix tables.
2014-10-16 20:43:12 +00:00
glebius
39bf2b6509 Merge r270928: explicitly free packet on PF_DROP, otherwise a "quick"
rule with "route-to" may still forward it.

PR:		177808
Approved by:	re (gjb)
2014-09-09 10:29:27 +00:00
glebius
c13a1bd643 Fix ABI broken in r270576. This is direct commit to stable/10.
Reported by:	kib
2014-09-01 08:34:39 +00:00
glebius
93084ab958 Merge r270023 from head:
Do not lookup source node twice when pf_map_addr() is used.

  PR:           184003
  Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net>
  Sponsored by: InnoGames GmbH
2014-08-25 15:51:07 +00:00
glebius
dbdbc550be Merge r270022 from head:
pf_map_addr() can fail and in this case we should drop the packet,
  otherwise bad consequences including a routing loop can occur.

  Move pf_set_rt_ifp() earlier in state creation sequence and
  inline it, cutting some extra code.

  PR:           183997
  Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net>
  Sponsored by: InnoGames GmbH
2014-08-25 15:49:41 +00:00
glebius
59e5a700b1 Merge 270010 from head:
Fix synproxy with IPv6. pf_test6() was missing a check for M_SKIP_FIREWALL.

  PR:           127920
  Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net>
  Sponsored by: InnoGames GmbH
2014-08-25 15:48:28 +00:00
glebius
3722b178a3 Merge r269998 from head:
- Count global pf(4) statistics in counter(9).
  - Do not count global number of states and of src_nodes,
    use uma_zone_get_cur() to obtain values.
  - Struct pf_status becomes merely an ioctl API structure,
    and moves to netpfil/pf/pf.h with its constants.
  - V_pf_status is now of type struct pf_kstatus.

  Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net>
  Sponsored by: InnoGames GmbH
2014-08-25 15:40:37 +00:00
glebius
3d5a7426d5 Merge r268492:
On machines with strict alignment copy pfsync_state_key from packet
  on stack to avoid unaligned access.

PR:	187381
2014-08-22 13:39:56 +00:00
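The technique in 3d5a7426d5, copying a record out of the packet before touching its fields, can be sketched as follows. The structure is a hypothetical stand-in and does not reproduce the real pfsync_state_key layout.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical wire record; not the real pfsync layout. */
struct sk_key {
	uint32_t addr;
	uint16_t port;
	uint16_t pad;
};

/*
 * Read the addr field of a record that starts at an arbitrary (possibly
 * unaligned) offset in the packet. The bytes are copied into a stack
 * variable, which the compiler aligns correctly, before any field access.
 */
uint32_t
read_key_addr(const unsigned char *pkt, size_t off)
{
	struct sk_key key;

	memcpy(&key, pkt + off, sizeof(key));
	return (key.addr);
}
```

Casting `pkt + off` to `struct sk_key *` and dereferencing it directly is what can fault on strict-alignment machines; the copy avoids that at the cost of a few bytes of stack.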
ae
7d6d803f86 MFC r266399:
Since ipfw nat configures all options in one step, we should set all bits
  in the mask when calling LibAliasSetMode() to properly clear unneeded
  options.

  PR:		189655
2014-05-26 07:02:03 +00:00
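The reasoning in 7d6d803f86 rests on mask-style option setting: only bits selected by the mask are changed. The sketch below models that behaviour with a hypothetical helper; it is an assumption about the semantics the commit message implies, not a copy of the libalias implementation.

```c
/*
 * Hypothetical model of a "set mode" call: bits selected by mask take
 * their value from flags, all other bits of current are left untouched.
 */
unsigned int
set_mode(unsigned int current, unsigned int flags, unsigned int mask)
{
	return ((current & ~mask) | (flags & mask));
}
```

With `current = 0x5` and a desired configuration of `0x1`, calling `set_mode(0x5, 0x1, 0x1)` leaves the stale bit `0x4` set, while `set_mode(0x5, 0x1, ~0u)` clears it. That is why a one-step configuration needs every bit in the mask.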
melifaro
89bf7e80ea Merge r258708, r258711, r260247, r261117.
r258708:
Check ipfw table numbers in both user and kernel space before rule addition.
Found by:       Saychik Pavel <umka@localka.net>

r258711:
Simplify O_NAT opcode handling.

r260247:
Use rnh_matchaddr instead of rnh_lookup for longest-prefix match.
rnh_lookup is effectively the same as rnh_matchaddr if called with
an empty network mask.

r261117:
Reorder struct ip_fw_chain:
* move rarely-used fields down
* move uh_lock to different cacheline
* remove some unused fields
2014-05-08 19:11:41 +00:00
trociny
4d8a3153db MFC r264963:
Define startup order the same way as it is in dummynet.
2014-05-02 14:44:17 +00:00
mm
5b89692b00 MFC r264689:
De-virtualize UMA zone pf_mtag_z and move to global initialization part.

The m_tag struct does not know about vnet context and the pf_mtag_free()
callback is called unaware of current vnet. This causes a panic.

PR:		kern/182964
2014-04-27 09:05:34 +00:00
ae
8057942ba8 MFC r264540:
Set oif only for outgoing packets.

  PR:		188543
2014-04-23 09:56:17 +00:00
brueffer
4bf359bbbf MFC: r264421
Free resources in error cases; re-indent a curly brace while here.

CID:		1199366
Found with:	Coverity Prevent(tm)
2014-04-23 07:22:40 +00:00
mm
e1ea0d7316 MFC r264220:
Execute pf_overload_task() in vnet context. Fixes a vnet kernel panic.

Reviewed by:	trociny
2014-04-14 09:36:15 +00:00
glebius
b117682e18 Merge r263497: fix ipfw + VIMAGE sysctls.
PR:		kern/187665
2014-03-24 10:19:07 +00:00
glebius
03fdc2934e Merge r262763, r262767, r262771, r262806 from head:
- Remove rt_metrics_lite and simply put its members into rtentry.
  - Use counter(9) for rt_pksent (former rt_rmx.rmx_pksent). This
    removes another cache thrashing ++ from packet forwarding path.
  - Create zinit/fini methods for the rtentry UMA zone, and initialize
    the mutex and counter in them.
  - Fix reporting of rmx_pksent to routing socket.
  - Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.
2014-03-21 15:15:30 +00:00
glebius
f937dcf2bd Bulk sync of pf changes from head, in an attempt to fix up the build I
broke in r263029.

Merge r257186,257215,257349,259736,261797.

These changesets split pfvar.h into several smaller headers and make
userland utilities include only some of them.
2014-03-12 10:45:58 +00:00
glebius
71d3a4f585 Merge r261882, r261898, r261937, r262760, r262799:
Once pf was no longer covered by a single mutex, many counters in it became
  race prone. Some just gather statistics, but some are later used in
  different calculations.

  A real problem was the race-provoked underflow of the states_cur counter
  on a rule. Once it goes below zero, it wraps to UINT32_MAX. Later this
  value is used in pf_state_expires() and any state created by this rule
  is immediately expired.

  Thus, make fields states_cur, states_tot and src_nodes of struct
  pf_rule be counter(9)s.
2014-03-11 15:43:06 +00:00
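The failure mode described in 71d3a4f585 can be reproduced in a few lines. The decrement shows the wrap to UINT32_MAX; the timeout function is a simplified, hypothetical model of adaptive expiry (not the real pf_state_expires()), with `start` and `end` standing in for adaptive thresholds.

```c
#include <stdint.h>

/* One decrement too many on an unsigned counter: 0 wraps to UINT32_MAX. */
uint32_t
racy_decrement(uint32_t states_cur)
{
	return (states_cur - 1);
}

/*
 * Hypothetical adaptive timeout: full timeout up to start states, scaled
 * down linearly between start and end, zero (expire now) at or above end.
 * Assumes start < end.
 */
uint32_t
scaled_timeout(uint32_t timeout, uint32_t states, uint32_t start, uint32_t end)
{
	if (states <= start)
		return (timeout);
	if (states >= end)
		return (0);
	return (timeout * (end - states) / (end - start));
}
```

A counter that has wrapped looks like an enormous state count, so in this model every state created by the rule gets a zero timeout.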
glebius
6f16f3acfd Merge r261029: remove NULL pointer dereference.
2014-03-11 15:20:47 +00:00
glebius
c517ab4bf1 Merge r261028: fix resource leak and simplify code for DIOCCHANGEADDR.
2014-03-11 15:19:11 +00:00
dim
591aab5d2a MFC r261915:
Under sys/netpfil/ipfw, surround two IPv6-specific static functions with
#ifdef INET6, since they are unused when INET6 is disabled.
2014-02-19 07:51:58 +00:00
glebius
dfcbb7ef75 Merge r260377: fix panic on pf_get_translation() failure.
PR:		182557
2014-01-22 10:45:16 +00:00
glebius
99ea781723 Merge r258478, r258479, r258480, r259719: fixes related to mass source
nodes removal.

PR:		176763
2014-01-22 10:29:15 +00:00
glebius
5da449f113 Merge several fixlets from head:
r257619: Remove unused PFTM_UNTIL_PACKET const.
r257620: Code logic of handling PFTM_PURGE into pf_find_state().
r258475: Don't compare unsigned <= 0.
r258477: Fix off by ones when scanning source nodes hash.
2014-01-22 10:18:25 +00:00
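The off-by-one fix in 5da449f113 (r258477) concerns scanning a hash table that is sized by a mask. The sketch below shows one plausible shape of a correct scan; the function and the integer bucket array are hypothetical and do not reproduce the pf source node hash.

```c
#include <stddef.h>

/*
 * Count non-empty buckets. A table addressed with "hash & hashmask" has
 * hashmask + 1 buckets, so the loop bound must be inclusive. Writing
 * "i < hashmask" would silently skip the last bucket.
 */
size_t
count_used(const int *buckets, size_t hashmask)
{
	size_t i, used = 0;

	for (i = 0; i <= hashmask; i++)
		if (buckets[i] != 0)
			used++;
	return (used);
}
```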
rodrigc
85a1f61056 MFC r258588
In sys/netpfil/ipfw/ip_fw_nat.c:vnet_ipfw_nat_uninit() we call "IPFW_WLOCK(chain);".
This lock gets deleted in sys/netpfil/ipfw/ip_fw2.c:vnet_ipfw_uninit().

Therefore, vnet_ipfw_nat_uninit() *must* be called before vnet_ipfw_uninit(),
but this doesn't always happen, because the VNET_SYSINIT order is the same for both functions.
In sys/netpfil/ipfw/ip_fw2.c and sys/netpfil/ipfw/ip_fw_nat.c,
IPFW_SI_SUB_FIREWALL == IPFW_NAT_SI_SUB_FIREWALL == SI_SUB_PROTO_IFATTACHDOMAIN
and
IPFW_MODULE_ORDER == IPFW_NAT_MODULE_ORDER

Consequently, if VIMAGE is enabled, and jails are created and destroyed,
the system sometimes crashes, because we are trying to use a deleted lock.

To reproduce the problem:
  (1)  Take a GENERIC kernel config, and add options for: VIMAGE, WITNESS,
       INVARIANTS.
  (2)  Run this command in a loop:
       jail -l -u root -c path=/ name=foo persist vnet && jexec foo ifconfig lo0 127.0.0.1/8 && jail -r foo

       (see http://lists.freebsd.org/pipermail/freebsd-current/2010-November/021280.html )

Fix the problem by increasing the value of IPFW_NAT_SI_SUB_FIREWALL,
so that vnet_ipfw_nat_uninit() runs before vnet_ipfw_uninit().

Approved by: re (gjb)
2013-12-04 07:50:18 +00:00
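The ordering requirement stated in 85a1f61056 (vnet_ipfw_nat_uninit() must run before vnet_ipfw_uninit()) can be modelled with a small sketch. It assumes that teardown handlers run in the reverse of startup order; the structure, the order values, and the function names are hypothetical and only model the idea.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical registration record: a name and a startup order value. */
struct sk_hook {
	const char *name;
	int order;
};

/* Sort ascending by order; ties are left in unspecified relative order. */
static int
cmp_order(const void *a, const void *b)
{
	const struct sk_hook *ha = a, *hb = b;

	return ((ha->order > hb->order) - (ha->order < hb->order));
}

/* Fill out[] with hook names in teardown order (reverse of startup). */
void
teardown_sequence(struct sk_hook *hooks, size_t n, const char **out)
{
	size_t i;

	qsort(hooks, n, sizeof(hooks[0]), cmp_order);
	for (i = 0; i < n; i++)
		out[i] = hooks[n - 1 - i].name;
}
```

When two hooks share the same order value, nothing constrains which one is torn down first, which is the situation the commit describes. Giving the NAT hook a larger value makes it start later and therefore, under the reverse-order assumption, shut down earlier.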
philip
eb3631100b Use the correct EtherType for logging IPv6 packets.
Reviewed by:	melifaro
Approved by:	re (kib, glebius)
MFC after:	3 days
2013-09-28 15:49:36 +00:00
glebius
8159278dbd Merge 1.12 of pf_lb.c from OpenBSD, with some changes. Original commit:
date: 2010/02/04 14:10:12;  author: sthen;  state: Exp;  lines: +24 -19;
  pf_get_sport() picks a random port from the port range specified in a
  nat rule. It should check to see if it's in-use (i.e. matches an existing
  PF state), if it is, it cycles sequentially through other ports until
  it finds a free one. However the check was being done with the state
  keys the wrong way round so it was never actually finding the state
  to be in-use.

  - switch the keys to correct this, avoiding random state collisions
  with nat. Fixes PR 6300 and problems reported by robert@ and viq.

  - check pf_get_sport() return code in pf_test(); if port allocation
  fails the packet should be dropped rather than sent out untranslated.

  Help/ok claudio@.

Some additional changes to 1.12:

- We also need to bzero() the key to zero padding, otherwise key
  won't match.
- Collapse two if blocks into one with ||, since both conditions
  lead to the same processing.
- Only naddr changes in the cycle, so move initialization of other
  fields above the cycle.
- s/u_intXX_t/uintXX_t/g

PR:		kern/181690
Submitted by:	Olivier Cochard-Labbé <olivier cochard.me>
Sponsored by:	Nginx, Inc.
2013-09-02 10:14:25 +00:00
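The port selection strategy quoted in 8159278dbd (start at a random port, then cycle sequentially until a free one is found, and fail if none exists) can be sketched as below. The function name and the in-use array are hypothetical; the real code checks against PF state keys rather than a flat array.

```c
#include <stdint.h>

/*
 * Pick a free port in [low, high], beginning at start and wrapping around.
 * used[i] != 0 means port (low + i) is taken. Requires low <= start <= high.
 * Returns the port, or -1 if every port in the range is in use, in which
 * case the caller should drop the packet rather than send it untranslated.
 */
int
pick_port(uint16_t low, uint16_t high, uint16_t start, const uint8_t *used)
{
	uint32_t span = (uint32_t)high - low + 1;
	uint32_t first = (uint32_t)start - low;
	uint32_t i;

	for (i = 0; i < span; i++) {
		uint32_t idx = (first + i) % span;

		if (!used[idx])
			return ((int)(low + idx));
	}
	return (-1);
}
```

In practice `start` would come from a random number generator; it is a parameter here so the behaviour is deterministic and easy to test.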
mav
413bf347cd Make dummynet use new direct callout(9) execution mechanism. Since the only
thing done by the dummynet handler is taskqueue_enqueue() call, it doesn't
need extra switch to the clock SWI context.

On an idle system this change halves the number of active CPU cycles and
wakes up only one CPU from sleep instead of two.

I was going to make this change much earlier as part of calloutng project,
but waited for better solution with skipping idle ticks to be implemented.
Unfortunately, with the 10.0 release coming, it is better to get at least this in.
2013-08-24 13:34:36 +00:00
trociny
583ac34809 Make ipfw nat init/uninit work correctly for VIMAGE:
* Do per vnet instance cleanup (previously it was only for vnet0 on
  module unload, and led to libalias leaks and possible panics due to
  stale pointer dereferences).

* Instead of protecting ipfw hooks registering/deregistering by only
  vnet0 lock (which does not prevent pointer access from other
  vnets), introduce per vnet ipfw_nat_loaded variable. The variable is
  set after hooks are registered and unset before they are deregistered.

* Devirtualize ifaddr_event_tag as we run only one event handler for
  all vnets.

* It is supposed that ifaddr_change event handler is called in the
  interface vnet context, so add an assertion.

Reviewed by:	zec
MFC after:	2 weeks
2013-08-24 11:59:51 +00:00
andre
7cc6cc696c Add m_clrprotoflags() to clear protocol specific mbuf flags at up and
downwards layer crossings.

Consistently use it within IP, IPv6 and ethernet protocols.

Discussed with:	trociny, glebius
2013-08-19 13:27:32 +00:00
ae
2b407e5f3f Fix a possible NULL-pointer dereference on the pfsync(4) reconfiguration.
Reported by:	Eugene M. Zheganin
2013-07-29 13:17:18 +00:00
glebius
6459f4d509 Improve locking strategy between keys hash and ID hash.
Before this change the state creation sequence was:

1) lock wire key hash
2) link state's wire key
3) unlock wire key hash
4) lock stack key hash
5) link state's stack key
6) unlock stack key hash
7) lock ID hash
8) link into ID hash
9) unlock ID hash

What could happen here is that another thread finds the state via key
hash lookup after 6), locks ID hash and does some processing of the
state. When the thread creating state unblocks, it finds the state
it was inserting already non-virgin.

Now we perform proper interlocking between key hash locks and ID hash
lock:

1) lock wire & stack hashes
2) link state's keys
3) lock ID hash
4) unlock wire & stack hashes
5) link into ID hash
6) unlock ID hash

To achieve that, the following hacking was performed in pf_state_key_attach():

- Key hash mutex is marked with MTX_DUPOK.
- To avoid deadlock on 2 key hash mutexes, we lock them in order determined
  by their address value.
- pf_state_key_attach() had a magic to reuse a > FIN_WAIT_2 state. It unlinked
  the conflicting state synchronously. In theory this could require locking
  a third key hash, which we can't do now.
  Now we do not remove the state immediately, instead we leave this task to
  the purge thread. To avoid conflicts in the short period before the state
  is purged, we push it to the very end of the TAILQ.
- On success, before dropping key hash locks, pf_state_key_attach() locks
  ID hash and returns.

Tested by:	Ian FREISLICH <ianf clue.co.za>
2013-06-13 06:07:19 +00:00
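The deadlock-avoidance rule from 6459f4d509 (take two key hash locks in an order determined by their addresses) can be sketched as below. The lock type is a hypothetical placeholder that only records that it was taken; handling of the case where both keys map to the same lock is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock: just records whether it has been acquired. */
struct sk_lock {
	int held;
};

static void
sk_acquire(struct sk_lock *l)
{
	l->held = 1;
}

/*
 * Acquire a and b in ascending address order so that two threads locking
 * the same pair can never each hold one and wait for the other. The order
 * actually used is reported through first and second; second is NULL when
 * both arguments name the same lock, which is then taken only once.
 */
void
lock_pair(struct sk_lock *a, struct sk_lock *b,
    struct sk_lock **first, struct sk_lock **second)
{
	if (a == b) {
		sk_acquire(a);
		*first = a;
		*second = NULL;
		return;
	}
	if ((uintptr_t)a > (uintptr_t)b) {
		struct sk_lock *tmp = a;

		a = b;
		b = tmp;
	}
	sk_acquire(a);
	sk_acquire(b);
	*first = a;
	*second = b;
}
```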
glebius
35ec1b4a11 Return meaningful error code from pf_state_key_attach() and
pf_state_insert().
2013-05-11 18:06:51 +00:00
glebius
3a8ddef6a9 Better debug message.
2013-05-11 18:03:36 +00:00
glebius
4a8f8f585a Fix DIOCADDSTATE operation.
2013-05-11 17:58:26 +00:00
glebius
375ef2e633 Invalid creatorid is always EINVAL, not only when we are in verbose mode.
2013-05-11 17:57:52 +00:00
glebius
8adbc6e4ae Improve KASSERT() message.
2013-05-06 21:44:06 +00:00
glebius
b3233b1bbb Simplify printf().
2013-05-06 21:43:15 +00:00
melifaro
858e632fa7 Use unified method for accessing / updating cached rule pointers.
MFC after:	2 weeks
2013-05-04 18:24:30 +00:00
eadler
a5a9ec51d6 Correct a few sizeof()s
Submitted by:	swildner@DragonFlyBSD.org
Reviewed by:	alfred
2013-05-01 04:37:34 +00:00
glebius
ccddbf9365 Remove useless ifdef KLD_MODULE from dummynet module unload path. This
fixes panic on unload.

Reported by:	pho
2013-04-29 06:11:19 +00:00
glebius
b4bc270e8f Add const qualifier to the dst parameter of the ifnet if_output method.
2013-04-26 12:50:32 +00:00
melifaro
bbeb8a5ba2 Fix ipfw rule validation partially broken by r248552.
Pointed by:	avg
MFC with:	r248552
2013-04-01 11:28:52 +00:00