3083 Commits

Author SHA1 Message Date
ae
24836ef695 MFC r279920:
Add if_input_default() method, that will be used for if_input
  initialization, when no input method specified before if_attach().

  This prevents panics when the if_input() method is called directly, e.g.
  from bpf(4) code.

  PR:		192426
2015-03-19 13:10:09 +00:00
luigi
6e901283bf sync with the version in head (r274338):
fix one comment, and return kernel-supplied error if available.
no API changes.
2015-02-14 19:18:56 +00:00
ae
efbb33d3cf MFC r277295:
Fix condition and really sort ports. Also add comment describing
  the intent of this code.
2015-01-25 16:35:03 +00:00
ae
45e30f880b MFC r276901:
Move the recursion detection code into separate function
  gif_check_nesting(). Also make MTAG_GIF definition private to if_gif.c.

MFC r276907:
  Restore Ethernet-within-IP Encapsulation support that was broken after
  r273087. Move all checks from gif_output() into gif_transmit(). Previously
  they were checked always, because if_start always called gif_output.
  Now gif_transmit() can be called directly from if_bridge() code, so we need
  to do the checks here.

  PR:		196646
2015-01-17 11:43:13 +00:00
ae
7a82e24551 MFC r273087 (with modifications):
Overhaul if_gif(4):
   o convert to if_transmit;
   o use rmlock to protect access to gif_softc;
   o use sx lock to protect from concurrent ioctls;
   o remove a lot of unneeded and duplicated code;
   o remove cached route support (it won't work with concurrent io);
   o style fixes.

MFC r273090:
  Move memset under ifdef INET6.

MFC r273091:
  Add more ifdefs. SIOC*_IN6 are defined only with INET6.

MFC r273121:
  Add inet/inet6 to the dependency list. Without them if_gif is useless.

MFC r273209 by bz:
  After r273087,r273090,r273091,r273121 changes to gif(4) try to fix
  NOIP builds for real.

MFC r273587:
  Remove redundant check and m_pullup() call.
2014-12-23 16:33:44 +00:00
ae
57be9990bd Add if_inc_counter() and if_get_counter_default() functions that
provide access to ifnet counters, for code compatibility with FreeBSD 11.

This is direct commit to stable/10.

Discussed with:	glebius@, arch@
2014-12-23 09:39:40 +00:00
ae
9a4e55b147 MFC r271917 by hrs:
Virtualize interface cloner for gif(4).  This fixes a panic when destroying
  a vnet jail which has a gif(4) interface.
2014-12-22 17:54:26 +00:00
ae
3449c92ea5 MFC r258167:
ANSIfy function definitions.
2014-12-22 17:32:13 +00:00
ae
3e533b7379 MFC r275394:
Remove unneeded check. No need to do m_pullup to the size that we prepended.

Sponsored by:	Yandex LLC
2014-12-16 11:53:45 +00:00
hselasky
9fcf944d2a MFC r274376:
Fix some minor TSO issues:
- Improve description of TSO limits.
- Remove an unneeded KASSERT().
- Remove some unneeded variable casts.

Sponsored by:	Mellanox Technologies
2014-11-19 09:03:12 +00:00
kib
e4b2ee7e2b Merge the fueword(9) and casueword(9). In particular,
MFC r273783:
Add fueword(9) and casueword(9) functions.
MFC note: ia64 is handled like arm, with NO_FUEWORD define.

MFC r273784:
Replace some calls to fuword() by fueword() with proper error checking.

MFC r273785:
Convert kern_umtx.c to use fueword() and casueword().
MFC note: the sys__umtx_lock and sys__umtx_unlock syscalls are not
converted, they are removed from HEAD, and not used.  The do_sem2*()
family is not yet merged to stable/10, corresponding chunk will be
merged after do_sem2* are committed.

MFC r273788 (by jkim):
Actually install casuword(9) to fix build.

MFC r273911:
Add type qualifier volatile to the base (userspace) address argument
of fuword(9) and suword(9).
2014-11-18 12:53:32 +00:00
hselasky
fa183f0174 MFC r271946 and r272595:
Improve transmit sending offload, TSO, algorithm in general. This
change allows all HCAs from Mellanox Technologies to function properly
when TSO is enabled. See r271946 and r272595 for more details about
this commit.

Sponsored by:	Mellanox Technologies
2014-11-03 12:38:29 +00:00
ae
33d2961d9a MFC r272770:
When a tunneling interface is going to insert an mbuf into the netisr queue after stripping
  the outer header, consider it a new packet and clear the protocol flags.

  This fixes problems when IPSEC traffic goes through various tunnels and the router
  doesn't send ICMP/ICMPv6 errors.

PR:		174602
Sponsored by:	Yandex LLC
2014-10-30 13:53:57 +00:00
hselasky
1d17f744c7 MFC r273733, r273740 and r273773:
The SYSCTL data pointers can come from userspace and must not be
directly accessed. Although this will work on some platforms, it can
throw an exception if the pointer is invalid and then panic the kernel.

Add a missing SYSCTL_IN() of "SCTP_BASE_STATS" structure.

Sponsored by:	Mellanox Technologies
2014-10-30 08:04:48 +00:00
hselasky
1f41d295fb MFC r263710, r273377, r273378, r273423 and r273455:
- De-vnet hash sizes and hash masks.
- Fix multiple issues related to arguments passed to SYSCTL macros.

Sponsored by:	Mellanox Technologies
2014-10-27 14:38:00 +00:00
glebius
9ea3e68626 Merge r272385 by melifaro from head:
Free radix mask entries on main radix destroy.
  This is a temporary commit to be merged to 10.
  Another approach (such as a hash table) should be used
  to store different masks.

PR:             194078
2014-10-16 20:46:02 +00:00
ae
f7ad542948 MFC r272176:
Keep list of lagg ports sorted by if_index.
2014-10-07 07:52:47 +00:00
asomers
f906790c87 MFC r265232
Fix a panic caused by doing "ifconfig -am" while a lagg is being destroyed.
The thread that is destroying the lagg has already set sc->sc_psc=NULL when
the "ifconfig -am" thread gets to lacp_req().  It tries to dereference
sc->sc_psc and panics.  The solution is for lacp_req() to check the value of
sc->sc_psc.  If NULL, harmlessly return an lacp_opreq structure full of
zeros.  Full details in GNATS.

PR:	189003
2014-10-06 23:17:01 +00:00
glebius
3722b178a3 Merge r269998 from head:
- Count global pf(4) statistics in counter(9).
  - Do not count global number of states and of src_nodes,
    use uma_zone_get_cur() to obtain values.
  - Struct pf_status becomes merely an ioctl API structure,
    and moves to netpfil/pf/pf.h with its constants.
  - V_pf_status is now of type struct pf_kstatus.

  Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net>
  Sponsored by: InnoGames GmbH
2014-08-25 15:40:37 +00:00
np
c11c6b7951 Update a couple of header files that were missed in r270252. This is a
direct commit to stable/10.

Submitted by:	luigi
2014-08-21 19:42:03 +00:00
mav
0959ad1632 MFC r269492:
Improve locking of multicast addresses in VLAN and LAGG interfaces.

This fixes several scenarios of reproducible panics, caused by races
between multicast address changes and interface destruction.
2014-08-18 15:54:35 +00:00
kevlo
f112206e5a MFC r268787:
Deprecate m_act.  Use m_nextpkt always.
2014-07-24 06:02:03 +00:00
tuexen
493873a6ef MFC r264241:
Call sctp_addr_change() from rt_addrmsg() instead of rt_newaddrmsg_fib(),
since rt_addrmsg() gets also called from other functions.
2014-06-22 16:36:14 +00:00
luigi
2472187c4f MFC 267168:
misc bugfixes:
- stdio.h is needed for fprintf()
- make memsize uint32_t to avoid errors due to overflow
- honor the *XPOLL flag in NIOCREGIF requests
- mmap fails with MAP_FAILED, not NULL.
2014-06-09 15:16:17 +00:00
luigi
34919b06cf MFC 267167: whitespace changes (comments) 2014-06-09 15:15:08 +00:00
asomers
322a1ee4a0 MFC r264887
Fix host and network routes for new interfaces when net.add_addr_allfibs=0

sys/net/route.c
        In rtinit1, use the interface fib instead of the process fib.  The
        latter wasn't very useful because ifconfig(8) is usually invoked
        with the default process fib.  Changing ifconfig(8) to use setfib(2)
        would be redundant, because it already sets the interface fib.

tests/sys/netinet/fibs_test.sh
        Clear the expected ATF failure

sys/net/if.c
        Pass the interface fib in calls to rtrequest1_fib and rtalloc1_fib

sys/netinet/in.c
sys/net/if_var.h
        Add a fibnum argument to ifa_switch_loopback_route, a subroutine of
        in_scrubprefix.  Pass it the interface fib.
2014-06-06 21:45:14 +00:00
asomers
a8aa481895 MFC changes relating to running multiple interfaces on different fibs but
with addresses on the same subnet.

MFC r266860

Fix unintended KBI change from r264905.  Add _fib versions of
ifa_ifwithnet() and ifa_ifwithdstaddr().  The legacy functions will call the
_fib() versions with RT_ALL_FIBS, preserving legacy behavior.

sys/net/if_var.h
sys/net/if.c
        Add legacy-compatible functions as described above.  Ensure legacy
        behavior when RT_ALL_FIBS is passed as fibnum.

sys/netinet/in_pcb.c
sys/netinet/ip_output.c
sys/netinet/ip_options.c
sys/net/route.c
sys/net/rtsock.c
sys/netinet6/nd6.c
        Call with _fib() functions if we must use a specific fib, or the
        legacy functions otherwise.

tests/sys/netinet/fibs_test.sh
tests/sys/netinet/udp_dontroute.c
        Improve the udp_dontroute test.  The bug that this test exercises is
        that ifa_ifwithnet() will return the wrong address, if multiple
        interfaces have addresses on the same subnet but with different
        fibs.  The previous version of the test only considered one possible
        failure mode: that ifa_ifwithnet_fib() might fail to find any
        suitable address at all.  The new version also checks whether
        ifa_ifwithnet_fib() finds the correct address by checking where the
        ARP request goes.

MFC r264917

Style fixes, mostly trailing whitespace elimination.  No functional change.

MFC r264905

Fix subnet and default routes on different FIBs on the same subnet.

These two bugs are closely related.  The root cause is that ifa_ifwithnet
does not consider FIBs when searching for an interface address.

sys/net/if_var.h
sys/net/if.c
        Add a fib argument to ifa_ifwithnet and ifa_ifwithdstaddr.  Those
        functions will only return an address whose interface fib equals the
        argument.

sys/net/route.c
        Update calls to ifa_ifwithnet and ifa_ifwithdstaddr with fib
        arguments.

sys/netinet/in.c
        Update in_addprefix to consider the interface fib when adding
        prefixes.  This will prevent it from not adding a subnet route when
        one already exists on a different fib.

sys/net/rtsock.c
sys/netinet/in_pcb.c
sys/netinet/ip_output.c
sys/netinet/ip_options.c
sys/netinet6/nd6.c
        Add RT_DEFAULT_FIB arguments to ifa_ifwithdstaddr and ifa_ifwithnet.
        In some cases there wasn't a clear specific fib number to use.
        In others, I was unable to test those functions so I chose
        RT_DEFAULT_FIB to minimize divergence from current behavior.  I will
        fix some of the latter changes along with PR kern/187553.

tests/sys/netinet/fibs_test.sh
tests/sys/netinet/udp_dontroute.c
tests/sys/netinet/Makefile
        Revert r263738.  The udp_dontroute test was right all along.
        However, bugs kern/187550 and kern/187553 cancelled each other out
        when it came to this test.  Because of kern/187553, ifa_ifwithnet
        searched the default fib instead of the requested one, but because
        of kern/187550, there was an applicable subnet route on the default
        fib.  The new test added in r263738 doesn't work right, however.  I
        can verify with dtrace that ifa_ifwithnet returned the wrong address
        before I applied this commit, but route(8) miraculously found the
        correct interface to use anyway.  I don't know how.

        Clear expected failure messages for kern/187550 and kern/187552.

MFC r263738

tests/sys/netinet/Makefile
tests/sys/netinet/fibs.sh
        Replace fibs:udp_dontroute with fibs:src_addr_selection_by_subnet.
        The original test was poorly written; it was actually testing
        kern/167947 instead of the desired kern/187553.  The root cause of the
        bug is that ifa_ifwithnet did not have a fib argument.  The new test
        more directly targets that behavior.

tests/sys/netinet/udp_dontroute.c
        Delete the auxiliary binary used by the old test
2014-06-06 20:35:40 +00:00
melifaro
aaa6b80bb3 Merge r260488, r260508.
r260488:
  Split rt_newaddrmsg_fib() into two different functions.
  Adding/deleting interface addresses involves access to 3 different subsystems,
  in different parts of code. Each call can fail, so reporting successful
  operation by rtsock in the middle of the process is error-prone.

  Further split routing notification API and actual rtsock calls via creating
  public-available rt_addrmsg() / rt_routemsg() functions with "private"
  rtsock_* backend.

r260508:
  Simplify inet alias handling code: if we're adding/removing alias which
  has the same prefix as some other alias on the same interface, use
  newly-added rt_addrmsg() instead of hand-rolled in_addralias_rtmsg().

  This eliminates the following rtsock messages:

  Pinned RTM_ADD for prefix (for alias addition).
  Pinned RTM_DELETE for prefix (for alias withdrawal).

  Example (got 10.0.0.1/24 on vlan4, playing with 10.0.0.2/24):

  before commit, addition:

    got message of size 116 on Fri Jan 10 14:13:15 2014
    RTM_NEWADDR: address being added to iface: len 116, metric 0, flags:
    sockaddrs: <NETMASK,IFP,IFA,BRD>
     255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255

    got message of size 192 on Fri Jan 10 14:13:15 2014
    RTM_ADD: Add Route: len 192, pid: 0, seq 0, errno 0, flags:<UP,PINNED>
    locks:  inits:
    sockaddrs: <DST,GATEWAY,NETMASK>
     10.0.0.0 10.0.0.2 (255) ffff ffff ff

  after commit, addition:

    got message of size 116 on Fri Jan 10 13:56:26 2014
    RTM_NEWADDR: address being added to iface: len 116, metric 0, flags:
    sockaddrs: <NETMASK,IFP,IFA,BRD>
     255.255.255.0 vlan4:8.0.27.c5.29.d4 14.0.0.2 14.0.0.255

  before commit, withdrawal:

    got message of size 192 on Fri Jan 10 13:58:59 2014
    RTM_DELETE: Delete Route: len 192, pid: 0, seq 0, errno 0, flags:<UP,PINNED>
    locks:  inits:
    sockaddrs: <DST,GATEWAY,NETMASK>
     10.0.0.0 10.0.0.2 (255) ffff ffff ff

    got message of size 116 on Fri Jan 10 13:58:59 2014
    RTM_DELADDR: address being removed from iface: len 116, metric 0, flags:
    sockaddrs: <NETMASK,IFP,IFA,BRD>
     255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255

  after commit, withdrawal:

    got message of size 116 on Fri Jan 10 14:14:11 2014
    RTM_DELADDR: address being removed from iface: len 116, metric 0, flags:
    sockaddrs: <NETMASK,IFP,IFA,BRD>
     255.255.255.0 vlan4:8.0.27.c5.29.d4 10.0.0.2 10.0.0.255

  Sending both RTM_ADD/RTM_DELETE messages to rtsock is completely wrong
  (and requires some hacks to keep prefix in route table on RTM_DELETE).

  I've tested this change with quagga (no change) and bird (*).

  bird alias handling is already broken in *BSD sysdep code, so nothing
  changes here, too.

  I'm going to MFC this change if there are no complaints about the behavior
  change.

  While here, fix some style(9) bugs introduced by r260488
  (pointed by glebius and bde).
2014-05-08 21:03:31 +00:00
melifaro
5ca6003c5c Merge r260379, r260460.
r260379:
  Partially fix IPv4 interface routes deletion in RADIX_MPATH.

  Noticed by:   Nikolay Denev <ndenev at gmail.com>

r260460:
  Consistently use RT_ALL_FIBS everywhere instead of -1.
2014-05-08 20:41:39 +00:00
melifaro
d42ec49fe7 Merge r259528, r260228, r260295.
r259528:
  Simplify contiguous mask checking.

  Suggested by: glebius

r260228:
  Remove useless register variable modifiers.
  Do some more style(9).

r260295:
  Change semantics for rnh_lookup() function: now
  it performs exact match search, regardless of netmask existence.
  This simplifies most of rnh_lookup() consumers.

  Fix panic triggered by deleting non-existent host route.

  PR:           kern/185092
  Submitted by: Nikolay Denev <ndenev at gmail.com>
2014-05-08 20:27:06 +00:00
rmacklem
5bd3f1337e MFC: r264630
For NFS mounts using rsize,wsize=65536 over TSO enabled
network interfaces limited to 32 transmit segments, there
are two known issues.
The more serious one is that for an I/O of slightly less than 64K,
the net device driver prepends an ethernet header, resulting in a
TSO segment slightly larger than 64K. Since m_defrag() copies this
into 33 mbuf clusters, the transmit fails with EFBIG.
A tester indicated observing a similar failure using iSCSI.

The second less critical problem is that the network
device driver must copy the mbuf chain via m_defrag()
(m_collapse() is not sufficient), resulting in measurable overhead.

This patch reduces the default size of if_hw_tsomax
slightly, so that the first issue is avoided.
Fixing the second issue will require a way for the
network device driver to inform tcp_output() that it
is limited to 32 transmit segments.
2014-05-06 02:54:59 +00:00
rmacklem
a54326376a MFC: r264517
Vlan did not set the value of if_hw_tsomax, so when vlan
was stacked on top of a network interface that set if_hw_tsomax,
tcp_output() would see the default value instead of the value
set by the network interface. This patch modifies vlan so that
it sets if_hw_tsomax to the value of the parent interface.
2014-05-06 02:49:31 +00:00
rmacklem
1f951a5c9b MFC: r264469, r264498
Lagg did not set the value of if_hw_tsomax, so when lagg
was stacked on top of network interfaces that set if_hw_tsomax,
tcp_output() would see the default value instead of the value
set by the network interface(s). This patch modifies lagg so that
it sets if_hw_tsomax to the minimum of the value(s) for the
underlying network interfaces.
2014-05-06 02:44:01 +00:00
mm
5b89692b00 MFC r264689:
De-virtualize UMA zone pf_mtag_z and move to global initialization part.

The m_tag struct does not know about vnet context and the pf_mtag_free()
callback is called unaware of current vnet. This causes a panic.

PR:		kern/182964
2014-04-27 09:05:34 +00:00
jmmv
d9b0a628da MFC various fixes to the tools/regression/ tests.
- r262953 Fix m4 tests so that they run cleanly with prove.
- r262954 Fix printf tests so that they run cleanly with prove.
- r262959 Fix sed tests so that they run cleanly with prove.
- r262960 Fix yacc tests so that they run cleanly with prove.
- r262961 Fix pkill tests so that they run cleanly with prove.
- r262962 Fix ncal tests so that they run cleanly with prove.
- r263081 Fix lastcomm tests under amd64.
- r263082 Only run the make tests when make is fmake.
- r263083 Fix sa tests.
- r263084 Turn a test precondition into a skip in the mdconfig tests.
- r263085 Make the strerror tests work without libtap.
- r263087 Remove broken tests for eui64_line.
- r263221 Change etcupdate tests to return 1 on test failures.
- r263352 Make the priv test program exit with non-zero if any failures are detected.
- r263353 errx prepends the program name to the message; don't do it by hand.
- r263362 Include strings.h so that bpf_filter.c can be built in userland.
2014-04-14 13:30:08 +00:00
glebius
a25c39725c Merge r263203: garbage collect long time obsoleted (or never used) stuff
from routing API.
2014-04-09 11:15:50 +00:00
glebius
1e3b300892 o Provide a compatibility shim for netstat(1) to obtain output queue
drops via NET_RT_IFLISTL sysctl. The sysctl handler appends oqdrops
  at the end of struct if_msghdrl, and netstat(1) sees that as an
  additional field of struct if_data. This allows us to fetch the data
  keeping ABI and API compatibility.
  This is direct commit to stable/10.

o Merge r263331 from head, to restore printing of queue drops.

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2014-04-03 14:58:52 +00:00
glebius
03fdc2934e Merge r262763, r262767, r262771, r262806 from head:
- Remove rt_metrics_lite and simply put its members into rtentry.
  - Use counter(9) for rt_pksent (formerly rt_rmx.rmx_pksent). This
    removes another cache-thrashing ++ from the packet forwarding path.
  - Create zini/fini methods for the rtentry UMA zone, and initialize
    the mutex and counter in them.
  - Fix reporting of rmx_pksent to routing socket.
  - Fix netstat(1) to report "Use" both in kvm(3) and sysctl(3) mode.
2014-03-21 15:15:30 +00:00
glebius
f937dcf2bd Bulk sync of pf changes from head, in an attempt to fix the broken build I
caused in r263029.

Merge r257186,257215,257349,259736,261797.

These changesets split pfvar.h into several smaller headers and make
userland utilities include only some of them.
2014-03-12 10:45:58 +00:00
glebius
71d3a4f585 Merge r261882, r261898, r261937, r262760, r262799:
Once pf became not covered by a single mutex, many counters in it became
  race prone. Some just gather statistics, but some are later used in
  different calculations.

  A real problem was the race provoked underflow of the states_cur counter
  on a rule. Once it goes below zero, it wraps to UINT32_MAX. Later this
  value is used in pf_state_expires() and any state created by this rule
  is immediately expired.

  Thus, make fields states_cur, states_tot and src_nodes of struct
  pf_rule be counter(9)s.
2014-03-11 15:43:06 +00:00
glebius
7616e36e49 Merge r262770 from head: pacify gcc. 2014-03-05 03:16:23 +00:00
glebius
ed41469327 Merge r261582, r261601, r261610, r261613, r261627, r261640, r261641, r261823,
r261825, r261859, r261875, r261883, r261911, r262027, r262028, r262029,
      r262030, r262162 from head.

  Large flowtable revamp. See commit messages for merged revisions for
  details.

Sponsored by:	Netflix
2014-03-04 15:14:47 +00:00
glebius
352d508b16 Merge r261590: Fixup for r261590 (vnet sysctl handlers cleanup) 2014-03-04 14:05:37 +00:00
glebius
4b9e17c3ef Merge r261590, r261592 from head:
Remove identical vnet sysctl handlers, and handle CTLFLAG_VNET
  in the sysctl_root().

  Note: SYSCTL_VNET_* macros can be removed as well. All that is
    needed to virtualize a sysctl oid is to set CTLFLAG_VNET on it.
    But for now keep macros in place to avoid large code churn.
2014-03-04 14:01:12 +00:00
luigi
5bacc3bb87 MFH: sync the netmap code with the one in HEAD
(enhanced VALE switch, netmap pipes, emulated netmap mode).
See details in the log for svn 261909.
2014-02-18 05:01:04 +00:00
gnn
183f607e23 MFC 260207
Convert #defines to enums so that the values are visible in the debugger.

Requested by:	gibbs
2014-02-14 00:26:30 +00:00
glebius
99ea781723 Merge r258478, r258479, r258480, r259719: fixes related to mass source
nodes removal.

PR:		176763
2014-01-22 10:29:15 +00:00
glebius
5da449f113 Merge several fixlets from head:
r257619: Remove unused PFTM_UNTIL_PACKET const.
r257620: Code logic of handling PFTM_PURGE into pf_find_state().
r258475: Don't compare unsigned <= 0.
r258477: Fix off by ones when scanning source nodes hash.
2014-01-22 10:18:25 +00:00
pluknet
9ee780ef73 MFC r258675: Fix build. 2014-01-18 21:57:38 +00:00
avg
c1dbdbde60 MFC r258622: dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE 2014-01-17 10:58:59 +00:00