Commit Graph

228 Commits

Author SHA1 Message Date
Michael Tuexen
580e30a33e Use strlcpy() instead of strncpy().
Approved by:            re (kib@)
CID:			1395980, 1395981
X-MFC with:		r339012
MFC after:              1 week
2018-10-03 07:35:16 +00:00
Michael Tuexen
1687b1ab24 For changing the MTU on tun/tap devices, it should not matter whether it
is done via using ifconfig, which uses a SIOCSIFMTU ioctl() command, or
doing it using a TUNSIFINFO/TAPSIFINFO ioctl() command.
Without this patch, for IPv6 the new MTU is not used when creating routes.
Especially, when initiating TCP connections after increasing the MTU,
the old MTU is still used to compute the MSS.
Thanks to ae@ and bz@ for helping to improve the patch.

Reviewed by:		ae@, bz@
Approved by:		re (kib@)
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D17180
2018-09-29 13:01:23 +00:00
Mark Murray
19fa89e938 Remove the Yarrow PRNG algorithm option in accordance with due notice
given in random(4).

This includes updating of the relevant man pages, and no-longer-used
harvesting parameters.

Ensure that the pseudo-unit-test still does something useful, now also
with the "other" algorithm instead of Yarrow.

PR:		230870
Reviewed by:	cem
Approved by:	so(delphij,gtetlow)
Approved by:	re(marius)
Differential Revision:	https://reviews.freebsd.org/D16898
2018-08-26 12:51:46 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Hans Petter Selasky
fa3f256682 Disallow TUN and TAP character device IOCTLs to modify the network device
type to any value. This can cause page faults and panics due to accessing
uninitialized fields in the "struct ifnet" which are specific to the network
device type.

MFC after:	1 week
Found by:	jau@iki.fi
PR:		223767
Sponsored by:	Mellanox Technologies
2017-11-29 09:40:11 +00:00
Michael Tuexen
683300d1d5 Allow writing IP packets of length TUNMRU no matter if TUNSIFHEAD is set
or not.
2016-05-19 13:52:12 +00:00
Mark Murray
3aa77530ca * Address review (and add a bit myself).
- Tweek man page.
 - Remove all mention of RANDOM_FORTUNA. If the system owner wants YARROW or DUMMY, they ask for it, otherwise they get FORTUNA.
 - Tidy up headers a bit.
 - Tidy up declarations a bit.
 - Make static in a couple of places where needed.
 - Move Yarrow/Fortuna SYSINIT/SYSUNINIT to randomdev.c, moving us towards a single file where the algorithm context is used.
 - Get rid of random_*_process_buffer() functions. They were only used in one place each, and are better subsumed into those places.
 - Remove *_post_read() functions as they are stubs everywhere.
 - Assert against buffer size illegalities.
 - Clean up some silly code in the randomdev_read() routine.
 - Make the harvesting more consistent.
 - Make some requested argument name changes.
 - Tidy up and clarify a few comments.
 - Make some requested comment changes.
 - Make some requested macro changes.

* NOTE: the thing calling itself a 'unit test' is not yet a proper
  unit test, but it helps me ensure things work. It may be a proper
  unit test at some time in the future, but for now please don't make
  any assumptions or hold any expectations.

Differential Revision:	https://reviews.freebsd.org/D2025
Approved by:	so (/dev/random blanket)
2015-07-12 18:14:38 +00:00
Mark Murray
d1b06863fb Huge cleanup of random(4) code.
* GENERAL
- Update copyright.
- Make kernel options for RANDOM_YARROW and RANDOM_DUMMY. Set
  neither to ON, which means we want Fortuna
- If there is no 'device random' in the kernel, there will be NO
  random(4) device in the kernel, and the KERN_ARND sysctl will
  return nothing. With RANDOM_DUMMY there will be a random(4) that
  always blocks.
- Repair kern.arandom (KERN_ARND sysctl). The old version went
  through arc4random(9) and was a bit weird.
- Adjust arc4random stirring a bit - the existing code looks a little
  suspect.
- Fix the nasty pre- and post-read overloading by providing explictit
  functions to do these tasks.
- Redo read_random(9) so as to duplicate random(4)'s read internals.
  This makes it a first-class citizen rather than a hack.
- Move stuff out of locked regions when it does not need to be
  there.
- Trim RANDOM_DEBUG printfs. Some are excess to requirement, some
  behind boot verbose.
- Use SYSINIT to sequence the startup.
- Fix init/deinit sysctl stuff.
- Make relevant sysctls also tunables.
- Add different harvesting "styles" to allow for different requirements
  (direct, queue, fast).
- Add harvesting of FFS atime events. This needs to be checked for
  weighing down the FS code.
- Add harvesting of slab allocator events. This needs to be checked for
  weighing down the allocator code.
- Fix the random(9) manpage.
- Loadable modules are not present for now. These will be re-engineered
  when the dust settles.
- Use macros for locks.
- Fix comments.

* src/share/man/...
- Update the man pages.

* src/etc/...
- The startup/shutdown work is done in D2924.

* src/UPDATING
- Add UPDATING announcement.

* src/sys/dev/random/build.sh
- Add copyright.
- Add libz for unit tests.

* src/sys/dev/random/dummy.c
- Remove; no longer needed. Functionality incorporated into randomdev.*.

* live_entropy_sources.c live_entropy_sources.h
- Remove; content moved.
- move content to randomdev.[ch] and optimise.

* src/sys/dev/random/random_adaptors.c src/sys/dev/random/random_adaptors.h
- Remove; plugability is no longer used. Compile-time algorithm
  selection is the way to go.

* src/sys/dev/random/random_harvestq.c src/sys/dev/random/random_harvestq.h
- Add early (re)boot-time randomness caching.

* src/sys/dev/random/randomdev_soft.c src/sys/dev/random/randomdev_soft.h
- Remove; no longer needed.

* src/sys/dev/random/uint128.h
- Provide a fake uint128_t; if a real one ever arrived, we can use
  that instead. All that is needed here is N=0, N++, N==0, and some
  localised trickery is used to manufacture a 128-bit 0ULLL.

* src/sys/dev/random/unit_test.c src/sys/dev/random/unit_test.h
- Improve unit tests; previously the testing human needed clairvoyance;
  now the test will do a basic check of compressibility. Clairvoyant
  talent is still a good idea.
- This is still a long way off a proper unit test.

* src/sys/dev/random/fortuna.c src/sys/dev/random/fortuna.h
- Improve messy union to just uint128_t.
- Remove unneeded 'static struct fortuna_start_cache'.
- Tighten up up arithmetic.
- Provide a method to allow eternal junk to be introduced; harden
  it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explictit
  functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])

* src/sys/dev/random/yarrow.c src/sys/dev/random/yarrow.h
- Improve messy union to just uint128_t.
- Remove unneeded 'staic struct start_cache'.
- Tighten up up arithmetic.
- Provide a method to allow eternal junk to be introduced; harden
  it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explictit
  functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])
- Fix some magic numbers elsewhere used as FAST and SLOW.

Differential Revision: https://reviews.freebsd.org/D2025
Reviewed by: vsevolod,delphij,rwatson,trasz,jmg
Approved by: so (delphij)
2015-06-30 17:00:45 +00:00
Mark Murray
10cb24248a This is the much-discussed major upgrade to the random(4) device, known to you all as /dev/random.
This code has had an extensive rewrite and a good series of reviews, both by the author and other parties. This means a lot of code has been simplified. Pluggable structures for high-rate entropy generators are available, and it is most definitely not the case that /dev/random can be driven by only a hardware souce any more. This has been designed out of the device. Hardware sources are stirred into the CSPRNG (Yarrow, Fortuna) like any other entropy source. Pluggable modules may be written by third parties for additional sources.

The harvesting structures and consequently the locking have been simplified. Entropy harvesting is done in a more general way (the documentation for this will follow). There is some GREAT entropy to be had in the UMA allocator, but it is disabled for now as messing with that is likely to annoy many people.

The venerable (but effective) Yarrow algorithm, which is no longer supported by its authors now has an alternative, Fortuna. For now, Yarrow is retained as the default algorithm, but this may be changed using a kernel option. It is intended to make Fortuna the default algorithm for 11.0. Interested parties are encouraged to read ISBN 978-0-470-47424-2 "Cryptography Engineering" By Ferguson, Schneier and Kohno for Fortuna's gory details. Heck, read it anyway.

Many thanks to Arthur Mesh who did early grunt work, and who got caught in the crossfire rather more than he deserved to.

My thanks also to folks who helped me thresh this out on whiteboards and in the odd "Hallway track", or otherwise.

My Nomex pants are on. Let the feedback commence!

Reviewed by:	trasz,des(partial),imp(partial?),rwatson(partial?)
Approved by:	so(des)
2014-10-30 21:21:53 +00:00
Gleb Smirnoff
3751dddb3e Mechanically convert to if_inc_counter(). 2014-09-19 10:39:58 +00:00
Hans Petter Selasky
af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Gleb Smirnoff
45c203fce2 Remove AppleTalk support.
AppleTalk was a network transport protocol for Apple Macintosh devices
in 80s and then 90s. Starting with Mac OS X in 2000 the AppleTalk was
a legacy protocol and primary networking protocol is TCP/IP. The last
Mac OS X release to support AppleTalk happened in 2009. The same year
routing equipment vendors (namely Cisco) end their support.

Thus, AppleTalk won't be supported in FreeBSD 11.0-RELEASE.
2014-03-14 06:29:43 +00:00
Gleb Smirnoff
2c284d9395 Remove IPX support.
IPX was a network transport protocol in Novell's NetWare network operating
system from late 80s and then 90s. The NetWare itself switched to TCP/IP
as default transport in 1998. Later, in this century the Novell Open
Enterprise Server became successor of Novell NetWare. The last release
that claimed to still support IPX was OES 2 in 2007. Routing equipment
vendors (e.g. Cisco) discontinued support for IPX in 2011.

Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
2014-03-14 02:58:48 +00:00
Alexander V. Chernikov
50da3e886d Teach every SIOCGIFSTATUS provider to fill in ifs->ascii anyway.
Remove old bits of data concat for 'ascii' field.
Remove special SIOCGIFSTATUS handling from if.c (which Coverity yells at).

Reported by:	Coverity
Coverity CID:	1147174
MFC after:	2 weeks
2014-01-07 15:59:33 +00:00
Adrian Chadd
dd50b3107e Restore the entropy gathering from the m_data pointer value, not the
m_data payload.

After talking with markm/bde, this is what markm actually intended.
2013-11-02 15:13:02 +00:00
Adrian Chadd
a09968c479 Convert the random entropy harvesting code to use a const void * pointer
rather than just void *.

Then, as part of this, convert a couple of mbuf m->m_data accesses
to mtod(m, const void *).

Reviewed by:	markm
Approved by:	security-officer (delphij)
Sponsored by:	Netflix, Inc.
2013-11-01 20:53:49 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Mark Murray
ad1f331196 Debug run. This now works, except that the "live" sources haven't
been tested. With all sources turned on, this unlocks itself in
a couple of seconds! That is no my box, and there is no guarantee
that this will be the case everywhere.

* Cut debug prints.

* Use the same locks/mutexes all the way through.

* Be a tad more conservative about entropy estimates.
2013-10-06 12:40:32 +00:00
Mark Murray
f02e47dc1e Snapshot. This passes the build test, but has not yet been finished or debugged.
Contains:

* Refactor the hardware RNG CPU instruction sources to feed into
the software mixer. This is unfinished. The actual harvesting needs
to be sorted out. Modified by me (see below).

* Remove 'frac' parameter from random_harvest(). This was never
used and adds extra code for no good reason.

* Remove device write entropy harvesting. This provided a weak
attack vector, was not very good at bootstrapping the device. To
follow will be a replacement explicit reseed knob.

* Separate out all the RANDOM_PURE sources into separate harvest
entities. This adds some secuity in the case where more than one
is present.

* Review all the code and fix anything obviously messy or inconsistent.
Address som review concerns while I'm here, like rename the pseudo-rng
to 'dummy'.

Submitted by:	Arthur Mesh <arthurmesh@gmail.com> (the first item)
2013-10-04 06:55:06 +00:00
Gleb Smirnoff
c7063c15b0 Clear knlist before destroying it in tap(4) and tun(4). This fixes later
crash, when a kqueue descriptor tries to dereference appropriate knotes.

Approved by:	re (kib)
2013-10-02 20:44:36 +00:00
Gleb Smirnoff
540b1a7238 Clean up SIOCSIFDSTADDR usage from ifnet drivers. The ioctl itself is
extremely outdated, and I doubt that it was ever used for ifnet drivers.
It was used for AF_INET sockets in pre-FreeBSD time.

Approved by:	re (hrs)
Sponsored by:	Nginx, Inc.
2013-09-11 09:19:44 +00:00
Mark Murray
a40c2646a4 Bring in some behind-the-scenes development, mainly By Arthur Mesh,
the rest by me.

o Namespace cleanup; the Yarrow name is now restricted to where it
  really applies; this is in anticipation of being augmented or
  replaced by Fortuna in the future. Fortuna is mentioned, but behind
  #if logic, and is ignorable for now.

o The harvest queue is pulled out into its own modules.

o Entropy harvesting is emproved, both by being made more conservative,
  and by separating (a bit!) the sources. Available entropy crumbs are
  marginally improved.

o Selection of sources is made clearer. With recent revelations,
  this will receive more work in the weeks and months to come.

Submitted by:	 Arthur Mesh (partly) <arthurmesh@gmail.com>
2013-09-07 14:15:13 +00:00
Davide Italiano
ab97ad0806 Don't clear the unused SI_CHEAPCLONE flag in tap_create()/tuncreate().
Reviewed by:	kib
2013-09-07 13:50:13 +00:00
Mark Murray
c495c93567 Snapshot; Do some running repairs on entropy harvesting. More needs to follow. 2013-08-26 18:35:21 +00:00
Mark Johnston
5bc4f6b3ab Add a missing module version declaration to if_tun(4).
PR:		181078
Submitted by:	Brandon Gooch <jamesbrandongooch@gmail.com>
MFC after:	1 week
2013-08-07 01:32:08 +00:00
Gleb Smirnoff
47e8d432d5 Add const qualifier to the dst parameter of the ifnet if_output method. 2013-04-26 12:50:32 +00:00
Gleb Smirnoff
eb1b1807af Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
Gleb Smirnoff
42a58907c3 Make the "struct if_clone" opaque to users of the cloning API. Users
now use function calls:

  if_clone_simple()
  if_clone_advanced()

to initialize a cloner, instead of macros that initialize if_clone
structure.

Discussed with:		brooks, bz, 1 year ago
2012-10-16 13:37:54 +00:00
Kevin Lo
9823d52705 Revert previous commit...
Pointyhat to:	kevlo (myself)
2012-10-10 08:36:38 +00:00
Kevin Lo
a10cee30c9 Prefer NULL over 0 for pointers 2012-10-09 08:27:40 +00:00
Ed Maste
cf8f32025f Remove an incorrect comment 2012-09-25 21:19:17 +00:00
Ed Schouten
6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
Ed Schouten
a185bd12f3 Get rid of D_PSEUDO.
It seems the D_PSEUDO flag was meant to allow make_dev() to return NULL.
Nowadays we have a different interface for that; make_dev_p(). There's
no need to keep it there.

While there, remove an unneeded D_NEEDMINOR from the gpio driver.

Discussed with:	gonzo@ (gpio)
2011-10-18 08:09:44 +00:00
Attilio Rao
6aba400a70 Fix a deficiency in the selinfo interface:
If a selinfo object is recorded (via selrecord()) and then it is
quickly destroyed, with the waiters missing the opportunity to awake,
at the next iteration they will find the selinfo object destroyed,
causing a PF#.

That happens because the selinfo interface has no way to drain the
waiters before to destroy the registered selinfo object. Also this
race is quite rare to get in practice, because it would require a
selrecord(), a poll request by another thread and a quick destruction
of the selrecord()'ed selinfo object.

Fix this by adding the seldrain() routine which should be called
before to destroy the selinfo objects (in order to avoid such case),
and fix the present cases where it might have already been called.
Sometimes, the context is safe enough to prevent this type of race,
like it happens in device drivers which installs selinfo objects on
poll callbacks. There, the destruction of the selinfo object happens
at driver detach time, when all the filedescriptors should be already
closed, thus there cannot be a race.
For this case, mfi(4) device driver can be set as an example, as it
implements a full correct logic for preventing this from happening.

Sponsored by:	Sandvine Incorporated
Reported by:	rstone
Tested by:	pluknet
Reviewed by:	jhb, kib
Approved by:	re (bz)
MFC after:	3 weeks
2011-08-25 15:51:54 +00:00
Bjoern A. Zeeb
a34c6aeb85 Tag mbufs of all incoming frames or packets with the interface's FIB
setting (either default or if supported as set by SIOCSIFFIB, e.g.
from ifconfig).

Submitted by:	Alexander V. Chernikov (melifaro ipfw.ru)
Reviewed by:	julian
MFC after:	2 weeks
2011-07-03 16:08:38 +00:00
John Baldwin
190367ef1c Properly return an ENOBUFS error if a write to a tun(4) device fails
due to m_uiotombuf() failing.

While here, trim unneeded error handling related to tuninit() since it
can never fail.

Submitted by:	Martin Birgmeier  la5lbtyi aon at
Reviewed by:	glebius
MFC after:	1 week
2011-06-03 13:47:05 +00:00
Pyun YongHyeon
d2d0470dc6 Fix white space nits and style 2011-05-06 20:46:29 +00:00
Pyun YongHyeon
26b8066bce Do not increment collision counter if transmit have failed.
Transmission error in tun(4) is queueing error(i.e. ENOBUFS) and it
has nothing to do with collision.

Reported by:	Zeus V Panchenko (zeus <> ibs dot dn dot ua)
2011-05-06 20:37:07 +00:00
Bjoern A. Zeeb
b6b8c0779d Only hide the ifa and not the tp under #ifdef INET as the tp is needed
for locking evenwhen there is no INET.

MFC after:	3 days
2010-10-01 15:14:14 +00:00
John Baldwin
24f481fde2 - Expand scope of tun/tap softc locks to cover more softc fields and
driver-maintained ifnet fields (such as if_drv_flags).
- Use soft locks as the mutex that protects each interface's knote list
  rather than using the global knote list lock.  Also, use the softc
  for kn_hook instead of the cdev.
- Use mtx_sleep() instead of tsleep() when blocking in the read routines.
  This fixes a lost wakeup race.
- Remove D_NEEDGIANT now that the cdevsw routines use the softc lock
  where locking is needed.
- Lock IFQ when calculating the result for FIONREAD in tap(4).  tun(4)
  already did this.
- Remove remaining spl calls.

Submitted by:	Marcin Cieslak  saper of saper|info (3)
MFC after:	2 weeks
2010-09-22 21:02:43 +00:00
Qing Li
6b533b5ddb Verify interface up status using its link state only
if the interface has such capability. The interface
capability flag indicates whether such capability
exists. This approach is much more backward compatible.
Physical device driver changes will be part of another
commit.

Also updated the ifconfig utility to show the LINKSTATE
capability if present.

Reviewed by:	rwatson, imp, juli
MFC after:	3 days
2010-03-16 17:59:12 +00:00
Konstantin Belousov
22e62e7e6e In both if_tun and if_tap:
Do not do additional dev_ref() on the newly created interface in the
if_clone create method [1]. This reference is not needed and never
removed, causing struct cdevpriv leakage. Remove the setting of
SI_CHEAPCLONE flag as well, since it is unused.

For dev_clone handlers, create cdevs with the call make_dev_credf(MAKEDEV_REF)
instead of calling make_dev() and then dev_ref(), to avoid a race.

Call drain_dev_clone_events() at the module unload time after dev_clone
handler is deinstalled.

Submitted by:	Mikolaj Golub <to.my.trociny gmail com> [1]
MFC after:	1 week
2010-02-28 16:25:49 +00:00
Robert Watson
530c006014 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
Robert Watson
3893212ddc Update if_stf and if_tun to use if_addr_rlock()/if_addr_runlock() rather
than IF_ADDR_LOCK()/IF_ADDR_UNLOCK() when iterating ifp->if_addrhead.

MFC after:	6 weeks
2009-06-26 00:45:20 +00:00
Konstantin Belousov
9f80ce043d Change the type of uio_resid member of struct uio from int to ssize_t.
Note that this does not actually enable full-range i/o requests for
64 architectures, and is done now to update KBI only.

Tested by:	pho
Reviewed by:	jhb, bde (as part of the review of the bigger patch)
2009-06-25 18:46:30 +00:00
Bjoern A. Zeeb
ebd8672cc3 Add explicit includes for jail.h to the files that need them and
remove the "hidden" one from vimage.h.
2009-06-17 15:01:01 +00:00
Jamie Gritton
9ed47d01eb Get vnets from creds instead of threads where they're available, and from
passed threads instead of curthread.

Reviewed by:	zec, julian
Approved by:	bz (mentor)
2009-06-15 19:01:53 +00:00
Konstantin Belousov
d8b0556c6d Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use
vnode interlock to protect the knote fields [1]. The locking assumes
that shared vnode lock is held, thus we get exclusive access to knote
either by exclusive vnode lock protection, or by shared vnode lock +
vnode interlock.

Do not use kl_locked() method to assert either lock ownership or the
fact that curthread does not own the lock. For shared locks, ownership
is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared
lock not owned by curthread, causing false positives in kqueue subsystem
assertions about knlist lock.

Remove kl_locked method from knlist lock vector, and add two separate
assertion methods kl_assert_locked and kl_assert_unlocked, that are
supposed to use proper asserts. Change knlist_init accordingly.

Add convenience function knlist_init_mtx to reduce number of arguments
for typical knlist initialization.

Submitted by:	jhb [1]
Noted by:	jhb [2]
Reviewed by:	jhb
Tested by:	rnoland
2009-06-10 20:59:32 +00:00