Commit Graph

3505 Commits

Author SHA1 Message Date
n_hibma
6a72802152 Change net.link.log_promisc_mode_change to a read-only tunable
PR:		166255
Submitted by:	eugen.grosbein.net
Obtained from:	hselasky
MFC after:	3 days
2016-05-25 09:00:05 +00:00
tuexen
d1b4674834 Allow an MTU of 65535 bytes to be set via TUN[SG]IFINFO. This requires
changing the type on the mtu field in struct tuninfo from short to
unsigned short.
This is used, for example, by packetdrill to test with MTUs up to the
maximum value.

Differential Revision:	6452
2016-05-24 11:47:14 +00:00
pfg
40226afa3d sys/net: more spelling. 2016-05-19 16:28:05 +00:00
tuexen
513d614cdf Allow writing IP packets of length TUNMRU no matter if TUNSIFHEAD is set
or not.
2016-05-19 13:52:12 +00:00
bz
47d341d05f Rather than having the if_vmove() code intermixed in the vnet_destroy()
function in vnet.c move it to if.c where it logically belongs and put
it under a VNET_SYSUNINIT() call.
To not change the current behaviour make sure it runs first thing
during teardown. In the future this will allow us more flexibility
on changing the order on when we want to get rid of interfaces.

Stop exporting if_vmove() and make it file static.

Reviewed by:		gnn
Sponsored by:		The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D6438
2016-05-18 20:06:45 +00:00
bz
cd56bdcd66 Add a "vnet_state" field to struct vnet.
This is set to the SI_SUB_* value before executing any VNET_SYSINIT
or VNET_SYSUNINT.  While good for debugging especially VNET teardown
problems having a chance to know at which level during teardown we are,
it will also be used to identify to detcted a "stable state"
(as in fully up and running) later on.

Obtained from:	projects/vnet
Sponsored by:	The FreeBSD Foundation
2016-05-18 15:50:52 +00:00
scottl
eb3b190ade Activate the NO_64BIT_ATOMICS code for mips and powerpc 2016-05-18 15:45:12 +00:00
scottl
4f6cb0738c Remove assertions that don't make sense for the data type. 2016-05-18 15:44:45 +00:00
bz
6a7d3963bc Add a dummy VNET_SYSINIT that will make sure all VNETs started will
always end on SI_SUB_VNET_DONE.

Obtained from:	projects/vnet
Sponsored by:	The FreeBSD Foundation
2016-05-18 15:25:19 +00:00
bz
bfac6710a1 Split 'show vnets' into 'show vnet' and 'show all vnets'.
While here adjust some db_printf format string.

Document the two show commands in ddb.4.

Sponsored by:	The FreeBSD Foundation
2016-05-18 14:43:17 +00:00
bz
74d8a16a48 Make compile without INET or without IP support in the kernel by hiding
variables and lro function calls behind approriate #ifdefs.

Also move the #includes for "opt_*" to the place where they should be.
2016-05-18 14:18:03 +00:00
scottl
596f2a146d Import the 'iflib' API library for network drivers. From the author:
"iflib is a library to eliminate the need for frequently duplicated device
independent logic propagated (poorly) across many network drivers."

Participation is purely optional.  The IFLIB kernel config option is
provided for drivers that want to transition between legacy and iflib
modes of operation.  ixl and ixgbe driver conversions will be committed
shortly.  We hope to see participation from the Broadcom and maybe
Chelsio drivers in the near future.

Submitted by:   mmacy@nextbsd.org
Reviewed by:    gallatin
Differential Revision:  D5211
2016-05-18 04:35:58 +00:00
eadler
156fd4834a Don't repeat the the word 'the'
(one manual change to fix grammar)

Confirmed With: db
Approved by: secteam (not really, but this is a comment typo fix)
2016-05-17 12:52:31 +00:00
bz
393a88e32e Mark the unused arguments of various SYSINIT functions __unused.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2016-05-17 00:32:36 +00:00
truckman
4d98e41134 When handling SIOCSIFNAME ensure that the new interface name is NUL
terminated.  Reject the rename attempt if the name is too long.

MFC after:	1 week
2016-05-15 21:37:36 +00:00
jhb
bcc5b0c55d Add an EARLY_AP_STARTUP option to start APs earlier during boot.
Currently, Application Processors (non-boot CPUs) are started by
MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until
SI_SUB_SMP at which point they are released to run kernel threads.
SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter
the scheduler and start running threads until fairly late in the
boot.

This change moves SI_SUB_SMP up to just before software interrupt
threads are created allowing the APs to start executing kernel
threads much sooner (before any devices are probed).  This allows
several initialization routines that need to perform initialization
on all CPUs to now perform that initialization in one step rather
than having to defer the AP initialization to a second SYSINIT run
at SI_SUB_SMP.  It also permits all CPUs to be available for
handling interrupts before any devices are probed.

This last feature fixes a problem on with interrupt vector exhaustion.
Specifically, in the old model all device interrupts were routed
onto the boot CPU during boot.  Later after the APs were released at
SI_SUB_SMP, interrupts were redistributed across all CPUs.

However, several drivers for multiqueue hardware allocate N interrupts
per CPU in the system.  In a system with many CPUs, just a few drivers
doing this could exhaust the available pool of interrupt vectors on
the boot CPU as each driver was allocating N * mp_ncpu vectors on the
boot CPU.  Now, drivers will allocate interrupts on their desired CPUs
during boot meaning that only N interrupts are allocated from the boot
CPU instead of N * mp_ncpu.

Some other bits of code can also be simplified as smp_started is
now true much earlier and will now always be true for these bits of
code.  This removes the need to treat the single-CPU boot environment
as a special case.

As a transition aid, the new behavior is available under a new kernel
option (EARLY_AP_STARTUP).  This will allow the option to be turned off
if need be during initial testing.  I plan to enable this on x86 by
default in a followup commit in the next few days and to have all
platforms moved over before 11.0.  Once the transition is complete,
the option will be removed along with the !EARLY_AP_STARTUP code.

These changes have only been tested on x86.  Other platform maintainers
are encouraged to port their architectures over as well.  The main
things to check for are any uses of smp_started in MD code that can be
simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in
the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).

PR:		kern/199321
Reviewed by:	markj, gnn, kib
Sponsored by:	Netflix
2016-05-14 18:22:52 +00:00
n_hibma
34530d661f Allow silencing of 'promiscuous mode enabled/disabled' messages.
PR:		166255
Submitted by:	eugen.grosbein.net
Obtained from:	eugen.grosbein.net
MFC after:	1 week
2016-05-12 19:42:13 +00:00
asomers
09b44517ca Improve performance and functionality of the bitstring(3) api
Two new functions are provided, bit_ffs_at() and bit_ffc_at(), which allow
for efficient searching of set or cleared bits starting from any bit offset
within the bit string.

Performance is improved by operating on longs instead of bytes and using
ffsl() for searches within a long. ffsl() is a compiler builtin in both
clang and gcc for most architectures, converting what was a brute force
while loop search into a couple of instructions.

All of the bitstring(3) API continues to be contained in the header file.
Some of the functions are large enough that perhaps they should be uninlined
and moved to a library, but that is beyond the scope of this commit.

sys/sys/bitstring.h:
        Convert the majority of the existing bit string implementation from
        macros to inline functions.

        Properly protect the implementation from inadvertant macro expansion
        when included in a user's program by prefixing all private
        macros/functions and local variables with '_'.

        Add bit_ffs_at() and bit_ffc_at(). Implement bit_ffs() and
        bit_ffc() in terms of their "at" counterparts.

        Provide a kernel implementation of bit_alloc(), making the full API
        usable in the kernel.

        Improve code documenation.

share/man/man3/bitstring.3:
        Add pre-exisiting API bit_ffc() to the synopsis.

        Document new APIs.

        Document the initialization state of the bit strings
        allocated/declared by bit_alloc() and bit_decl().

        Correct documentation for bitstr_size(). The original code comments
        indicate the size is in bytes, not "elements of bitstr_t". The new
        implementation follows this lead. Only hastd assumed "elements"
        rather than bytes and it has been corrected.

etc/mtree/BSD.tests.dist:
tests/sys/Makefile:
tests/sys/sys/Makefile:
tests/sys/sys/bitstring.c:
        Add tests for all existing and new functionality.

include/bitstring.h
	Include all headers needed by sys/bitstring.h

lib/libbluetooth/bluetooth.h:
usr.sbin/bluetooth/hccontrol/le.c:
        Include bitstring.h instead of sys/bitstring.h.

sbin/hastd/activemap.c:
        Correct usage of bitstr_size().

sys/dev/xen/blkback/blkback.c
        Use new bit_alloc.

sys/kern/subr_unit.c:
        Remove hard-coded assumption that sizeof(bitstr_t) is 1.  Get rid of
        unrb.busy, which caches the number of bits set in unrb.map.  When
        INVARIANTS are disabled, nothing needs to know that information.
        callapse_unr can be adapted to use bit_ffs and bit_ffc instead.
        Eliminating unrb.busy saves memory, simplifies the code, and
        provides a slight speedup when INVARIANTS are disabled.

sys/net/flowtable.c:
        Use the new kernel implementation of bit-alloc, instead of hacking
        the old libc-dependent macro.

sys/sys/param.h
        Update __FreeBSD_version to indicate availability of new API

Submitted by:   gibbs, asomers
Reviewed by:    gibbs, ngie
MFC after:      4 weeks
Sponsored by:   Spectra Logic Corp
Differential Revision:  https://reviews.freebsd.org/D6004
2016-05-04 22:34:11 +00:00
pfg
d9c9113377 sys/net*: minor spelling fixes.
No functional change.
2016-05-03 18:05:43 +00:00
bz
a0e5ea962e Remove the most useful INET || INET6 check leftover from whenever,
doing nothing.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2016-05-03 16:01:53 +00:00
rrs
64e463c093 Complete the UDP tunneling of ICMP msgs to those protocols
interested in having tunneled UDP and finding out about the
ICMP (tested by Michael Tuexen with SCTP.. soon to be using
this feature).

Differential Revision:	http://reviews.freebsd.org/D5875
2016-04-28 15:53:10 +00:00
cem
5abde16df3 radix_mpath: Don't derefence a NULL pointer in for loop iteration
It seems rn_dupedkey may be NULL, because of the NULL check inside the loop.
(Also, the rt gets assigned from rn_dupedkey and NULL checked at top of loop.)
However, the for-loop update condition happens before the top-of-loop check and
dereferences 'rt' unconditionally.

Instead, NULL-check before dereferencing.

If rn_dupedkey cannot in fact be NULL, or something else protects this, feel
free to revert this and add an ASSERT of some kind instead.

This was introduced in r191080 (2009) and moved around slightly in r293657.

Reported by:	Coverity
CID:		1348482
Sponsored by:	EMC / Isilon Storage Division
2016-04-26 20:27:17 +00:00
pfg
fc01419148 sys: extend use of the howmany() macro when available.
We have a howmany() macro in the <sys/param.h> header that is
convenient to re-use as it makes things easier to read.
2016-04-26 15:38:17 +00:00
pfg
729533413f sys: use our roundup2/rounddown2() macros when param.h is available.
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
2016-04-21 19:57:40 +00:00
pfg
fc65edc1cd Remove slightly used const values that can be replaced with nitems().
Suggested by:	jhb
2016-04-21 15:38:28 +00:00
bz
d0fd815b72 Add more fields from struct ifnet needed during debugging a kernel panic.
Move if_fib into the right place.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2016-04-20 21:04:39 +00:00
cem
1fdda1ad1e radix rn_inithead: Fix minor leak in low memory conditions
R_Zalloc is essentially a malloc(M_NOWAIT) wrapper.  It is possible that 'rnh'
failed to allocate, but 'rmh' succeeds.  In that case, we bail out of
rn_inithead() but previously did not free 'rmh'.

Introduced in r287073 (projects/routing) / MFP r294706.

Reported by:	Coverity
CID:		1350258
Sponsored by:	EMC / Isilon Storage Division
2016-04-20 02:01:45 +00:00
cem
9904129a9f bpf_getdltlist: Don't overrun 'lst'
'lst' is allocated with 'n1' members.  'n' indexes 'lst'.  So 'n == n1' is an
invalid 'lst' index.  This is a follow-up to r296009.

Reported by:	Coverity
CID:		1352743
Sponsored by:	EMC / Isilon Storage Division
2016-04-20 01:39:31 +00:00
pfg
a7d40a88c9 kernel: use our nitems() macro when it is available through param.h.
No functional change, only trivial cases are done in this sweep,

Discussed in:	freebsd-current
2016-04-19 23:48:27 +00:00
pfg
12232f8463 sys/net* : for pointers replace 0 with NULL.
Mostly cosmetical, no functional change.

Found with devel/coccinelle.
2016-04-15 17:30:33 +00:00
bz
146f54f44d During if_vmove() we call if_detach_internal() which in turn calls the event
handler notifying about interface departure and one of the consumers will
detach if_bpf.
There is no way for us to re-attach this easily as the DLT and hdrlen are
only given on interface creation.
Add a function to allow us to query the DLT and hdrlen from a current
BPF attachment and after if_attach_internal() manually re-add the if_bpf
attachment using these values.

Found by panics triggered by nd6 packets running past BPF_MTAP() with no
proper if_bpf pointer on the interface.

Also add a basic DDB show function to investigate the if_bpf attachment
of an interface.

Reviewed by:	gnn
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D5896
2016-04-11 10:00:38 +00:00
pfg
b63211eed5 Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
rpokala
b4cec75b57 Revert accidental submit of WIP as part of r297609
Pointyhat to:	rpokala
2016-04-06 04:58:20 +00:00
rpokala
1f8d2c0640 Storage Controller Interface driver - typo in unimplemented macro in
scic_sds_controller_registers.h

s/contoller/controller/

PR:		207336
Submitted by:	Tony Narlock <tony @ git-pull.com>
2016-04-06 04:50:28 +00:00
jhb
70dc8c3b4a Remove an unneeded check.
CPUs with valid per-CPU data are not absent.

Sponsored by:	Netflix
2016-04-05 00:09:19 +00:00
bz
945ed5d635 Catch up with some more fields. I needed the bpf one lately.
Sponsored by:	The FreeBSD Foundation
2016-03-31 18:53:13 +00:00
trasz
ca92bb3067 Remove some NULL checks for M_WAITOK allocations.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-03-29 13:56:59 +00:00
gnn
e3bae03c17 Add ethertype reserved for network testing
MFC after:	2 weeks
2016-03-28 18:25:54 +00:00
bz
a460d01567 Fix compile errors after r297225:
- properly V_irtualise variable access unbreaking VIMAGE kernels.
- remove the volatile from the function return type to make architecture
  using gcc happy [-Wreturn-type]
  "type qualifiers ignored on function return type"
  I am not entirely happy with this solution putting the u_int there
  but it will do for now.
2016-03-24 11:40:10 +00:00
gnn
c3d5404bbe FreeBSD previously provided route caching for TCP (and UDP). Re-add
route caching for TCP, with some improvements. In particular, invalidate
the route cache if a new route is added, which might be a better match.
The cache is automatically invalidated if the old route is deleted.

Submitted by:	Mike Karels
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D4306
2016-03-24 07:54:56 +00:00
sephe
0123218eb8 buf_ring/drbr: Add buf_ring_peek_clear_sc and use it in drbr_peek
Unlike buf_ring_peek, it only supports single consumer mode, and it
clears the cons_head if DEBUG_BUFRING/INVARIANTS is defined.

The normal use case of drbr_peek for network drivers is:

m = drbr_peek(br);
err = hw_spec_encap(&m); /* could m_defrag/m_collapse */
(*)
if (err) {
    if (m == NULL)
        drbr_advance(br);
    else
        drbr_putback(br, m);
    /* break the loop */
}
drbr_advance(br);

The race is:
If hw_spec_encap() m_defrag or m_collapse the mbuf, i.e. the old mbuf
was freed, or like the Hyper-V's network driver, that transmission-
done does not even require the TX lock; then on the other CPU at the
(*) time, the freed mbuf could be recycled and being drbr_enqueue even
before the current CPU had the chance to call drbr_{advance,putback}.
This triggers a panic in drbr_enqueue duplicated element check, if
DEBUG_BUFRING/INVARIANTS is defined.

Use buf_ring_peek_clear_sc() in drbr_peek() to fix the above race.

This change is a NO-OP, if neither DEBUG_BUFRING nor INVARIANTS are
defined.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5416
2016-02-29 03:54:51 +00:00
kib
4c4becd77e In bpf_getdltlist(), do not call copyout(9) while holding bpf lock.
Copy the data into temprorary malloced buffer and drop the lock for
copyout.

Reported, reviewed and tested by:	cem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-02-24 22:00:35 +00:00
araujo
42270c385f Fix regression introduced on 272446r.
lagg(4) supports the protocol none, where it disables any traffic without
disabling the lagg(4) interface itself.

PR:		206921
Submitted by:	Pushkar Kothavade <pushkarbk@gmail.com>
Reviewed by:	rpokala
Approved by:	bapt (mentor)
MFC after:	3 weeks
Sponsored by:	gandi.net
Differential Revision:	https://reviews.freebsd.org/D5076
2016-02-19 06:35:53 +00:00
dteske
51b30e8967 Merge SVN r295220 (bz) from projects/vnet/
Fix a panic that occurs when a vnet interface is unavailable at the time the
vnet jail referencing said interface is stopped.

Sponsored by:	FIS Global, Inc.
2016-02-11 17:07:19 +00:00
glebius
306a6faf84 These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
glebius
2f1de40ae4 Provide TCPSTAT_DEC() and TCPSTAT_FETCH() macros. 2016-01-27 00:20:07 +00:00
zec
0035768b8e Prune a definition which is / was never used. 2016-01-25 20:35:15 +00:00
melifaro
196af28820 Fix flowtable part missed in r294706. 2016-01-25 09:31:32 +00:00
melifaro
23582454c7 MFP r287070,r287073: split radix implementation and route table structure.
There are number of radix consumers in kernel land (pf,ipfw,nfs,route)
  with different requirements. In fact, first 3 don't have _any_ requirements
  and first 2 does not use radix locking. On the other hand, routing
  structure do have these requirements (rnh_gen, multipath, custom
  to-be-added control plane functions, different locking).
Additionally, radix should not known anything about its consumers internals.

So, radix code now uses tiny 'struct radix_head' structure along with
  internal 'struct radix_mask_head' instead of 'struct radix_node_head'.
  Existing consumers still uses the same 'struct radix_node_head' with
  slight modifications: they need to pass pointer to (embedded)
  'struct radix_head' to all radix callbacks.

Routing code now uses new 'struct rib_head' with different locking macro:
  RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing
  information base).

New net/route_var.h header was added to hold routing subsystem internal
  data. 'struct rib_head' was placed there. 'struct rtentry' will also
  be moved there soon.
2016-01-25 06:33:15 +00:00
melifaro
6feca17fa9 Remove unused radix_mpath definitions. 2016-01-25 05:28:19 +00:00