Commit Graph

2724 Commits

Author SHA1 Message Date
Gleb Smirnoff
4c506522c1 When removing ifnets, we should first remove the reference to ifnet
from the interface index, then decrease refcount, not vice versa.

Otherwise there is a race (reproducible) when if_free_internal()
contests on IFNET_WLOCK(), and we got a zero-refed ifnet in the
index for a long time. It may be picked by some other thread,
that runs ifnet_byindex_ref(), who takes the ifnet from index,
and bumps refcount. When reader drops the lock, if_free_internal()
proceeds with free. Then reader tries to free it a second time.
2011-04-04 07:45:08 +00:00
Jeff Roberson
e4cd31dd3c - Merge changes to the base system to support OFED. These include
a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND,
   and other miscellaneous small features.
2011-03-21 09:40:01 +00:00
Dmitry Chagin
2093339ead Remove dead code.
MFC after:	1 Week
2011-03-20 08:35:00 +00:00
Dmitry Chagin
e579f1c1cf ouch, newrt is used on the return path, my fault.
Partialy revert the previous change.

MFC after:	1 Week.
2011-03-19 21:10:57 +00:00
Dmitry Chagin
523e60025b A bit rearranged rtalloc1_fib() code.
Initialize a variable when it is really needed.
To avoid code duplication move the miss label to line up and jump on it.

MFC after:	1 Week
2011-03-19 19:50:36 +00:00
Dmitry Chagin
6a873ef717 Remove a now unused variable.
MFC after:	1 Week
2011-03-19 16:52:06 +00:00
Ermal Luçi
5f82cfdf6c Fix a panic that can happen when trying to destroy a lagg(4) with scheduler set to none.
Approved by:	thompsa(mentor)
MFC after:	1 week
2011-03-04 20:37:38 +00:00
Bjoern A. Zeeb
e3416ab0c0 Hide the outer IP addresses of a tunnel interfaces (gif(4), gre(4))
from processes inside jails if the addresses do not belong to the jail.

Originally reported by: Pieter de Boer via remko
PR:		kern/151119
Tested by:	Piotr KUCHARSKI (nospam 42.pl) [gif]
MFC after:	1 week
2011-03-02 21:39:08 +00:00
Rebecca Cran
6bccea7c2b Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
Bjoern A. Zeeb
1fb51a12f2 Mfp4 CH=177274,177280,177284-177285,177297,177324-177325
VNET socket push back:
  try to minimize the number of places where we have to switch vnets
  and narrow down the time we stay switched.  Add assertions to the
  socket code to catch possibly unset vnets as seen in r204147.

  While this reduces the number of vnet recursion in some places like
  NFS, POSIX local sockets and some netgraph, .. recursions are
  impossible to fix.

  The current expectations are documented at the beginning of
  uipc_socket.c along with the other information there.

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by:  jhb
  Tested by:    zec

Tested by:	Mikolaj Golub (to.my.trociny gmail.com)
MFC after:	2 weeks
2011-02-16 21:29:13 +00:00
Bjoern A. Zeeb
144e6203ff Mfp4 CH=177255:
Resort the CURVNET_SET* macros in the non-VNET_DEBUG case to match
  the call order of the VNET_DEBUG case.

  Add the VNET_ASSERT() to the non-VNET_DEBUG case as well so that
  INVARIANTS will still catch problems.

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by:  jhb

MFC after:	2 weeks
2011-02-11 14:17:58 +00:00
Bjoern A. Zeeb
0028e52461 Mfp4 CH=177255:
Make VNET_ASSERT() available with either VNET_DEBUG or INVARIANTS.

  Change the syntax to match KASSERT() to allow more flexible panic
  messages rather than having a printf with hardcoded arguments
  before panic.

  Adjust the few assertions we have to the new format (and enhance
  the output).

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by:	jhb

MFC after:	2 weeks
2011-02-11 13:27:00 +00:00
Bjoern A. Zeeb
6cf986ac19 Mfp4 CH=177255:
Use __func__ rather than __FUNCTION__.

MFC after:	2 weeks
2011-02-11 12:56:05 +00:00
Max Laier
826bf287b5 As info.rti_info[RTAX_DST] can point inside of rtm we must not free the rtm
until rt_dispatch is done with the sockaddr.

Found by:	memguard
MFC after:	3 days
2011-02-10 01:24:09 +00:00
John Baldwin
5f3b301a43 Fix a LOR by dropping the global ifnet locks while allocating a new ifnet
table in if_grow().  The order of the SYSINIT's for ifnet state were swapped
so that the various locks were initialized before being used.

Reviewed by:	pluknet, bz
MFC after:	2 weeks
2011-01-24 22:21:58 +00:00
Matthew D Fleming
f8e4b4ef49 sysctl(8) should use the CTLTYPE to determine the type of data when
reading.  (This was already done for writing to a sysctl).  This
requires all SYSCTL setups to specify a type.  Most of them are now
checked at compile-time.

Remove SYSCTL_*X* sysctl additions as the print being in hex should be
controlled by the -x flag to sysctl(8).

Succested by:	bde
2011-01-19 17:04:07 +00:00
Matthew D Fleming
f88910cdf5 sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the net* piece.
2011-01-12 19:53:50 +00:00
John Baldwin
58ccf5b41c Remove unneeded includes of <sys/linker_set.h>. Other headers that use
it internally contain nested includes.

Reviewed by:	bde
2011-01-11 13:59:06 +00:00
Bjoern A. Zeeb
9269189f98 MfP4 CH=185246 [1]:
Add FEATURE() to announce optional VIMAGE.

MFC after:	3 days
[1] for the moment put it in vnet.c.
2011-01-09 20:40:21 +00:00
John Baldwin
a8f4344f08 - Restore dropping the priority of syncer down to PPAUSE when it is idle.
This was lost when it was converted to using a condition variable instead
  of lbolt.
- Drop the priority of flowtable down to PPAUSE when it is idle as well
  since it is a similar background task.

MFC after:	2 weeks
2011-01-06 22:17:07 +00:00
Marius Strobl
a0fc3825c3 Teach ifconfig(8) the handy shared option shortcut aliases the NetBSD
counterpart also takes, i.e. "fdx" for "full-duplex", "flow" for
"flowcontrol", "hdx" for "half-duplex" as well as "loop" and "loopback"
for "hw-loopback".

MFC after:	1 week
2011-01-05 15:28:30 +00:00
Marius Strobl
8b28e7e1a3 Fix whitespace.
MFC after:	1 week
2011-01-05 14:51:04 +00:00
Bjoern A. Zeeb
962be6dfb3 Use NULL rather than 0 to invalidate a pointer.
Rather than duplicating the LLE_FREE_LOCKED() macro code in LLE_FREE(),
call it directly (like we do for the RT_* macros).

Sponsored by:	ISPsystem [1]
Reviewed by:	julian [1]
MFC After:	1 week

[1] Early 2010.
2010-12-31 21:57:54 +00:00
Bjoern A. Zeeb
c9a2711a54 Print the vnet pointer under DDB when iterating over flowtables of each
virtual network stack instance.

Sponsored by:	ISPsystem [1]
Reviewed by:	julian [1]
MFC after:	1 week

[1] Early 2010.
2010-12-31 21:20:32 +00:00
Bjoern A. Zeeb
f0a56b0678 Move the increment operation under the lock and split the condition
variable into two so that we can see on which one we are waiting.
This might also more properly propagate the update of the
flowclean_cycles flag and avoid "hangs" people were seeing.

Suggested by:	rwatson [1]
Sponsored by:	ISPsystem [1]
Reviewed by:	julian [1]
Updated by:	Mikolaj Golub (to.my.trociny gmail.com)
Tested by:	Mikolaj Golub (to.my.trociny gmail.com)
MFC After:	1 week

[1] Early 2010, initial version.
2010-12-31 21:06:52 +00:00
Alan Cox
82de724fe1 Introduce and use a new VM interface for temporarily pinning pages. This
new interface replaces the combined use of vm_fault_quick() and
pmap_extract_and_hold() throughout the kernel.

In collaboration with:	kib@
2010-12-25 21:26:56 +00:00
Weongyo Jeong
c5649739a5 Adds IFF_CANTCONFIG to IFF_CANTCHANGE that it shouldn't happen through
ioctl(2).
2010-12-07 20:31:04 +00:00
Weongyo Jeong
6e3cb00068 Introduces IFF_CANTCONFIG interface flag to point that the interface
isn't configurable in a meaningful way.  This is for ifconfig(8) or
other tools not to change code whenever IFT_USB-like interfaces are
registered at the interface list.

Reviewed by:	brooks
No objections:	gavin, jkim
2010-12-07 20:23:47 +00:00
Maxim Konovalov
57542d0481 o Swap descriptions for net.bpf.bufsize and net.bpf.maxbufsize.
PR:		misc/152531
MFC after:	1 week
2010-11-24 05:50:19 +00:00
Marko Zec
ccf7ba972c Allow for vlan(4) ifnets to have overlapping unit numbers if they are
created in separated vnets.  As a side-effect of having a separated
if_cloner instance for each vnet, all vlan ifnets created in a vnet
will be automatically destroyed when vnet teardown is initiated.

Disallow SIOCSETVLAN and SIOCGETVLAN ioctls on vlan ifnets which are
associated with physical ifnets residing in parent vnets.

This is an interim vlan-specific solution which will be superseded by a
more generic if_cloner V_irtualization change from p4.  For nooptions
VIMAGE builds, this should be a no-op change.

Discussed with:	bz
MFC after:	3 days
2010-11-22 23:35:29 +00:00
Dimitry Andric
3e288e6238 After some off-list discussion, revert a number of changes to the
DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various
people working on the affected files.  A better long-term solution is
still being considered.  This reversal may give some modules empty
set_pcpu or set_vnet sections, but these are harmless.

Changes reverted:

------------------------------------------------------------------------
r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines

Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.

------------------------------------------------------------------------
r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.

------------------------------------------------------------------------
r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines

Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
2010-11-22 19:32:54 +00:00
Bjoern A. Zeeb
2c8b047c07 Add a missing ';' and change the debugging sysctl from xint to int.
Submitted by:		Mikolaj Golub (to.my.trociny gmail.com)
MFC after:		3 days
2010-11-21 19:33:19 +00:00
Dimitry Andric
c3adda9fc3 Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.
2010-11-14 20:40:55 +00:00
Dimitry Andric
31c6a0037e Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.
2010-11-14 20:38:11 +00:00
Dimitry Andric
47d46d92c2 Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE. 2010-11-14 20:23:02 +00:00
Marius Strobl
efd4fc3fb3 o Flesh out the generic IEEE 802.3 annex 31B full duplex flow control
support in mii(4):
  - Merge generic flow control advertisement (which can be enabled by
    passing by MIIF_DOPAUSE to mii_attach(9)) and parsing support from
    NetBSD into mii_physubr.c and ukphy_subr.c. Unlike as in NetBSD,
    IFM_FLOW isn't implemented as a global option via the "don't care
    mask" but instead as a media specific option this. This has the
    following advantages:
    o allows flow control advertisement with autonegotiation to be
      turned on and off via ifconfig(8) with the default typically
      being off (though MIIF_FORCEPAUSE has been added causing flow
      control to be always advertised, allowing to easily MFC this
      changes for drivers that previously used home-grown support for
      flow control that behaved that way without breaking POLA)
    o allows to deal with PHY drivers where flow control advertisement
      with manual selection doesn't work or at least isn't implemented,
      like it's the case with brgphy(4), e1000phy(4) and ip1000phy(4),
      by setting MIIF_NOMANPAUSE
    o the available combinations of media options are readily available
      from the `ifconfig -m` output
  - Add IFM_FLOW to IFM_SHARED_OPTION_DESCRIPTIONS and IFM_ETH_RXPAUSE
    and IFM_ETH_TXPAUSE to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so
    these are understood by ifconfig(8).
o Make the master/slave support in mii(4) actually usable:
  - Change IFM_ETH_MASTER from being implemented as a global option via
    the "don't care mask" to a media specific one as it actually is only
    applicable to IFM_1000_T to date.
  - Let mii_phy_setmedia() set GTCR_MAN_MS in IFM_1000_T slave mode to
    actually configure manually selected slave mode (like we also do in
    the PHY specific implementations).
  - Add IFM_ETH_MASTER to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS so it
    is understood by ifconfig(8).
o Switch bge(4), bce(4), msk(4), nfe(4) and stge(4) along with brgphy(4),
  e1000phy(4) and ip1000phy(4) to use the generic flow control support
  instead of home-grown solutions via IFM_FLAGs. This includes changing
  these PHY drivers and smcphy(4) to no longer unconditionally advertise
  support for flow control but only if the selected media has IFM_FLOW
  set (or MIIF_FORCEPAUSE is set) and implemented for these media variants,
  i.e. typically only for copper.
o Switch brgphy(4), ciphy(4), e1000phy(4) and ip1000phy(4) to report and
  set IFM_1000_T master mode via IFM_ETH_MASTER instead of via IFF_LINK0
  and some IFM_FLAGn.
o Switch brgphy(4) to add at least the the supported copper media based on
  the contents of the BMSR via mii_phy_add_media() instead of hardcoding
  them. The latter approach seems to have developed historically, besides
  causing unnecessary code duplication it was also undesirable because
  brgphy_mii_phy_auto() already based the capability advertisement on the
  contents of the BMSR though.
o Let brgphy(4) set IFM_1000_T master mode on all supported PHY and not
  just BCM5701. Apparently this was a misinterpretation of a workaround
  in the Linux tg3 driver; BCM5701 seem to require RGPHY_1000CTL_MSE and
  BRGPHY_1000CTL_MSC to be set when configuring autonegotiation but
  this doesn't mean we can't set these as well on other PHYs for manual
  media selection.
o Let ukphy_status() report IFM_1000_T master mode via IFM_ETH_MASTER so
  IFM_1000_T master mode support now is generally available with all PHY
  drivers.
o Don't let e1000phy(4) set master/slave bits for IFM_1000_SX as it's
  not applicable there.

Reviewed by:	yongari (plus additional testing)
Obtained from:	NetBSD (partially), OpenBSD (partially)
MFC after:	2 weeks
2010-11-14 13:26:10 +00:00
Konstantin Belousov
7b3b099e07 Use 'z' modifier for size_t printing. 2010-11-13 11:11:51 +00:00
Dimitry Andric
7e54af0831 Similar to r212647, remove the workaround in sys/net/vnet.h for an ld
bug (incorrect placement of __start_SECNAME in some cases) that was
fixed in r210245.

There is already an UPDATING entry about needing a recent ld.

MFC after:	1 month
2010-11-12 22:59:50 +00:00
George V. Neville-Neil
e162ea60d4 Add a queue to hold packets while we await an ARP reply.
When a fast machine first brings up some non TCP networking program
it is quite possible that we will drop packets due to the fact that
only one packet can be held per ARP entry.  This leads to packets
being missed when a program starts or restarts if the ARP data is
not currently in the ARP cache.

This code adds a new sysctl, net.link.ether.inet.maxhold, which defines
a system wide maximum number of packets to be held in each ARP entry.
Up to maxhold packets are queued until an ARP reply is received or
the ARP times out.  The default setting is the old value of 1
which has been part of the BSD networking code since time
immemorial.

Expose the time we hold an incomplete ARP entry by adding
the sysctl net.link.ether.inet.wait, which defaults to 20
seconds, the value used when the new ARP code was added..

Reviewed by:	bz, rpaulo
MFC after: 3 weeks
2010-11-12 22:03:02 +00:00
Dimitry Andric
4403994d7d Use the same treatment as in linker_set.h for the __start and __stop
symbols of the set_vnet and set_pcpu sections, so those symbols will
always be emitted in kernel modules, if they use vnet.h or pcpu.h.

Also, for pcpu.h, make the __(start|stop)_set_pcpu declarations, and
associated macros invisible to userland, to prevent it picking up these
symbols.

Reviewed by:	kib
2010-11-11 19:18:52 +00:00
Rui Paulo
09b6dcf968 Sync DLTs with the latest pcap version. 2010-10-29 18:41:09 +00:00
Bjoern A. Zeeb
a38de0134b Factor out DDB commands from r204145, r204279 into if_debug.c for further
enhancements (1).  Switch to a standard 2-clause BSD license for this (2).

Unfortunately we have to un-static the ifindex_table for this but do not
publicly export it.

Suggested by:	rwatson (1) a while back.
Approved by:	thompsa (2) for the change from r204279.
MFC after:	6 days
2010-10-25 08:30:19 +00:00
Sergey Kandaurov
9af74f3d68 Reshuffle SIOCGIFCONF32 handler from r155224.
- move all the chunks into one file, which allows to hide SIOCGIFCONF32
  global definition as well.
- replace __amd64__ with proper COMPAT_FREEBSD32 around.
- handle 32bit capacity before going into the handler itself instead of
  doing internal 32bit specific changes within it (e.g. as it's done for
  SIOCGDEFIFACE32_IN6).
- use explicitely sized types for ABI compat.

Approved by:	kib (mentor)
MFC after:	2 weeks
2010-10-21 16:20:48 +00:00
Bjoern A. Zeeb
ee7c7fee94 Close a race acquiring the IF_ADDR_LOCK() for each entry while iterating
over all interfaces to make sure the address will neither change nor be
freed while we are working on it.

PR:		kern/146250
Submitted by:	Mikolaj Golub (to.my.trociny gmail.com)
MFC after:	1 week
2010-10-16 19:25:27 +00:00
Bjoern A. Zeeb
fc2bfb3294 lltable_drain() has never been used so far, thus #if 0 it for now.
While touching it add the missing locking to the now disabled code
for the time when we'll resurrect it.

MFC after:	3 days
2010-10-16 18:42:09 +00:00
Bjoern A. Zeeb
b6b8c0779d Only hide the ifa and not the tp under #ifdef INET as the tp is needed
for locking evenwhen there is no INET.

MFC after:	3 days
2010-10-01 15:14:14 +00:00
John Baldwin
24f481fde2 - Expand scope of tun/tap softc locks to cover more softc fields and
driver-maintained ifnet fields (such as if_drv_flags).
- Use soft locks as the mutex that protects each interface's knote list
  rather than using the global knote list lock.  Also, use the softc
  for kn_hook instead of the cdev.
- Use mtx_sleep() instead of tsleep() when blocking in the read routines.
  This fixes a lost wakeup race.
- Remove D_NEEDGIANT now that the cdevsw routines use the softc lock
  where locking is needed.
- Lock IFQ when calculating the result for FIONREAD in tap(4).  tun(4)
  already did this.
- Remove remaining spl calls.

Submitted by:	Marcin Cieslak  saper of saper|info (3)
MFC after:	2 weeks
2010-09-22 21:02:43 +00:00
Jung-uk Kim
d0d7bcdf92 Fix a typo in a comment.
Submitted by:	afiveg
2010-09-16 18:37:33 +00:00
Matthew D Fleming
4d369413e1 Replace sbuf_overflowed() with sbuf_error(), which returns any error
code associated with overflow or with the drain function.  While this
function is not expected to be used often, it produces more information
in the form of an errno that sbuf_overflowed() did.
2010-09-10 16:42:16 +00:00
Bjoern A. Zeeb
73e39d6137 MFp4 CH=183259:
No reason to use if_free_type() as we don't change our type.
  Just if_free() is fine.

MFC after:	3 days
2010-09-02 16:11:12 +00:00