Commit Graph

645 Commits

Author SHA1 Message Date
Ed Schouten
4b2361f811 Convert syscons on i386 to TERM=xterm.
TEKEN_XTERM is now gone. Because we always use xterm mode now, we only
need a TEKEN_CONS25 switch to go back to cons25.
2009-11-13 11:28:54 +00:00
Ed Schouten
e42fc36867 Switch the default terminal emulation style to xterm for most platforms.
Right now syscons(4) uses a cons25-style terminal emulator. The
disadvantages of that are:

- Little compatibility with embedded devices with serial interfaces.
- Bad bandwidth efficiency, mainly because of the lack of scrolling
  regions.
- A very hard transition path to support for modern character sets like
  UTF-8.

Our terminal emulation library, libteken, has been supporting
xterm-style terminal emulation for months, so flip the switch and make
everyone use an xterm-style console driver.

I still have to enable this on i386. Right now pc98 and i386 share the
same /etc/ttys file. I'm not going to switch pc98, because it uses its
own Kanji-capable cons25 emulator.

IMPORTANT: What to do if things go wrong (i.e. graphical artifacts):

- Run the application inside script(1), try to reduce the problem and
  send me the log file.
- In the mean time, you can run `vidcontrol -T cons25' and `export
  TERM=cons25' so you can run applications the same way you did before.
  You can also build your kernel with `options TEKEN_CONS25' to make all
  virtual terminals use the cons25 emulator by default.

Discussed on:	current@
2009-11-13 05:54:55 +00:00
Rui Paulo
07ddebb5fb Mention the layout change of ieee80211req_scan_result. 2009-11-09 16:05:32 +00:00
Andrew Thompson
21293e7016 Belatedly add an UPDATING message for the usb ethernet ifnet naming in r188412.
MFC after:	3 days
2009-11-03 21:06:19 +00:00
Alexander Motin
b868265d82 Document atapci kernel module split.
PR:		amd64/139859
MFC after:	3 days
2009-10-26 09:16:08 +00:00
Rui Paulo
63b49c2b8a Explain that iwn was updated and the firmware images are now split. 2009-10-25 10:29:37 +00:00
Hiroki Sato
2e77c5abfb Fix several logic bugs in the previous IPv6 variable change and
re-add $ipv6_enable support for backward compatibility.  From
UPDATING:

 1. To use IPv6, simply define $ifconfig_IF_ipv6 like $ifconfig_IF
    for IPv4.  For aliases, $ifconfig_IF_aliasN should be used.
    Note that both variables need the "inet6" keyword at the head.

    Do not set $ipv6_network_interfaces manually if you do not
    understand what you are doing.  It is not needed in most cases.

    $ipv6_ifconfig_IF and $ipv6_ifconfig_IF_aliasN still work, but
    they are obsolete.

 2. $ipv6_enable is obsolete.  Use $ipv6_prefer and/or
    "inet6 accept_rtadv" keyword in ifconfig(8) instead.

    If you define $ipv6_enable=YES, it means $ipv6_prefer=YES and
    all configured interfaces have "inet6 accept_rtadv" in the
    $ifconfig_IF_ipv6.  These are for backward compatibility.

 3. A new variable $ipv6_prefer has been added.  If NO, IPv6
    functionality of interfaces with no corresponding
    $ifconfig_IF_ipv6 is disabled by using "inet6 ifdisabled" flag,
    and the default address selection policy of ip6addrctl(8)
    is the IPv4-preferred one (see rc.d/ip6addrctl for more details).
    Note that if you want to configure IPv6 functionality on the
    disabled interfaces after boot, first you need to clear the flag by
    using ifconfig(8) like:

         ifconfig em0 inet6 -ifdisabled

    If YES, the default address selection policy is set as
    IPv6-preferred.

    The default value of $ipv6_prefer is NO.

 4. If your system need to receive Router Advertisement messages,
    define "inet6 accept_rtadv" in $ifconfig_IF_ipv6.  The rc(8)
    scripts automatically invoke rtsol(8) when the interface becomes
    UP.  The Router Advertisement messages are used for SLAAC
    (State-Less Address AutoConfiguration).
2009-09-26 18:59:00 +00:00
Rui Paulo
350036a06d Note the D3.03 mesh changes.
MFC after:	1 week
2009-09-22 18:19:18 +00:00
Pawel Jakub Dawidek
63e1d3df27 - Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular
df(1) and mount(8) output. This is a bit smilar to OpenSolaris and follows
  ZFS route of not listing snapshots by default with 'zfs list' command.
- Add UPDATING entry to note that ZFS snapshots are no longer visible in
  mount(8) and df(1) output by default.

Reviewed by:	kib
MFC after:	3 days
2009-09-14 21:10:40 +00:00
Warner Losh
f6a4f4b521 Go ahead and mention the CVS branch name as well as the svn branch name. 2009-09-05 08:09:35 +00:00
Warner Losh
411c7658a8 Note migration of tunable from hw.bus.devctl_disable to
hw.bus.devctl_queue.  The sysctl interface provides legacys upport for
the latter sysctl, but the tunable support was removed.

MFC after:	1 day
2009-09-05 08:08:14 +00:00
Warner Losh
4a19f42fe7 Actually, stable/8 is what was created... 2009-09-03 17:13:54 +00:00
Warner Losh
456b5dd815 Time for house-cleaning:
o remove all entries before RELENG_7 was branched, as is tradition[*].
o Update examples...  nobody cares about 5.x upgrades.
o minor format tweaking in a few places.
o update copyright (although at best I hold an editors copyright these days).
o Remove giving people permission to buy me beer.  I don't do enough for
  this document for that anymore...
2009-09-03 17:04:42 +00:00
Ken Smith
cf48cc9f69 Make head 9.0-CURRENT in preparation for lifting code freeze.
Approved by:	re (implicit)
2009-08-22 23:44:37 +00:00
Attilio Rao
dc6fbf6545 * Completely Remove the option STOP_NMI from the kernel. This option
has proven to have a good effect when entering KDB by using a NMI,
but it completely violates all the good rules about interrupts
disabled while holding a spinlock in other occasions.  This can be the
cause of deadlocks on events where a normal IPI_STOP is expected.
* Adds an new IPI called IPI_STOP_HARD on all the supported architectures.
This IPI is responsible for sending a stop message among CPUs using a
privileged channel when disponible. In other cases it just does match a
normal IPI_STOP.
Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64
architectures, while on the other has a normal IPI_STOP effect. It is
responsibility of maintainers to eventually implement an hard stop
when necessary and possible.
* Use the new IPI facility in order to implement a new userend SMP kernel
function called stop_cpus_hard(). That is specular to stop_cpu() but
it does use the privileged channel for the stopping facility.
* Let KDB use the newly introduced function stop_cpus_hard() and leave
stop_cpus() for all the other cases
* Disable interrupts on CPU0 when starting the process of APs suspension.
* Style cleanup and comments adding

This patch should fix the reboot/shutdown deadlocks many users are
constantly reporting on mailing lists.

Please don't forget to update your config file with the STOP_NMI
option removal

Reviewed by:	jhb
Tested by:	pho, bz, rink
Approved by:	re (kib)
2009-08-13 17:09:45 +00:00
Konstantin Belousov
560bf6e4ae Note that COMPAT_43 requires COMPAT_FREEBSD7 too.
Submitted by:	Steve Kargl
Approved by:	re (kensmith)
2009-07-26 20:12:06 +00:00
Ken Smith
3ca3047aee Bump the version of all non-symbol-versioned shared libraries in
preparation for 8.0-RELEASE.  Add the previous version of those
libraries to ObsoleteFiles.inc and bump __FreeBSD_Version.

Reviewed by:    kib
Approved by:    re (rwatson)
2009-07-19 17:25:24 +00:00
Robert Watson
eddfbb763d Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
Lawrence Stewart
237fbe0a1c Replace struct tcpopt with a proxy toeopt struct in the TOE driver interface to
the TCP syncache. This returns struct tcpopt to being private within the TCP
implementation, thus allowing it to be modified without ABI concerns.

The patch breaks the ABI. Bump __FreeBSD_version to 800103 accordingly. The cxgb
driver is the only TOE consumer affected by this change, and needs to be
recompiled along with the kernel.

Suggested by:	rwatson
Reviewed by:	rwatson, kmacy
Approved by:	re (kensmith), kensmith (mentor temporarily unavailable)
2009-07-13 11:51:02 +00:00
Lawrence Stewart
962ebef8c0 Pad the following TCP related structs to allow MFCs of upcoming features/fixes
back to the 8 branch:

tcp_var.h
- struct sackhint
- struct tcpcb
- struct tcpstat

The patch breaks the ABI. Bump __FreeBSD_version to 800102 accordingly. User
space tools that rely on the size of any of these structs (e.g. sockstat) need
to be recompiled.

Reviewed by:	rpaulo, sam, andre, rwatson
Approved by:	re & mentor (gnn)
2009-07-12 09:14:28 +00:00
Doug Rabson
81611ea7c6 Clarify the node about removing NFS_LEGACYRPC
Approved by: re
2009-07-01 18:12:50 +00:00
Doug Rabson
bab42aad98 Add an entry documenting removal of the NFS_LEGACYRPC option.
Submitted by: Steve Kargl
Approved by: re
2009-07-01 07:35:57 +00:00
Brooks Davis
6cb7f168db Remove support for the /dev/net/* per-interface devices. They serve
little purpose and are unused in the base system.

The IOCTL functionality is entirely duplicated and routing sockets
provide a richer interface than the kqueue functionality.

Further, it is not practical for these devices to be made sensible in
the face of VIMAGE.

Bump __FreeBSD_version on the off chance that there is any code out
there that actually uses this stuff.

Reviewed by:	rwatson
Discussed with:	bz, zec
Approved by:	re@ (kensmith)
2009-06-29 19:46:29 +00:00
Marc Fonvieille
944bc81da9 - release/* update to use freebsd-doc-* packages instead of building
FreeBSD docset during 'make release' this will speed up release
  builds;
- sysinstall(8) has also been updated to use these packages with a new
  menu allowing people to choose what localized doc to install;
- mention in UPDATING that docs from the FreeBSD Documentation project
  are now installed in /usr/local/share/doc/freebsd instead of
  /usr/share/doc.

Approved by:	re (kensmith)
2009-06-28 08:59:46 +00:00
John Baldwin
f5e4c1052a Note that as a result of the SYSV IPC changes, COMPAT_FREEBSD[456] now
require COMPAT_FREEBSD7.  Also, explicitly note in NOTES that any version
of COMPAT_FREEBSD<n> effectively requires for newer binaries (i.e.
COMPAT_FREEBSD<n+1>, etc.).  While this has been true in practice
previously, it used to compile ok before the commit earlier this week.

Discussed with:	peter
Approved by:	re (kensmith)
2009-06-26 17:50:52 +00:00
Doug Barton
b5820f18cc Revert the entry about pf and ipfw starting before netif 2009-06-26 01:10:10 +00:00
Bjoern A. Zeeb
b58ea5f310 Move virtualization of routing related variables into their own
Vimage module, which had been there already but now is stateful.

All variables are now file local; so this further limits the global
spreading of routing related things throughout the kernel.

Add a missing function local variable in case of MPATHing.

Reviewed by:	zec
2009-06-22 17:48:16 +00:00
Brooks Davis
838d985825 Rework the credential code to support larger values of NGROUPS and
NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024
and 1023 respectively.  (Previously they were equal, but under a close
reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
is the number of supplemental groups, not total number of groups.)

The bulk of the change consists of converting the struct ucred member
cr_groups from a static array to a pointer.  Do the equivalent in
kinfo_proc.

Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
a process credential before modifying it and for setting group lists
respectively.  Both interfaces take care for the details of allocating
groups array. crsetgroups() takes care of truncating the group list
to the current maximum (NGROUPS) if necessary.  In the future,
crsetgroups() may be responsible for insuring invariants such as sorting
the supplemental groups to allow groupmember() to be implemented as a
binary search.

Because we can not change struct xucred without breaking application
ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
always 16 and is to be used or NGRPS as appropriate for things such as
NFS which need to use no more than 16 groups.  When feasible, truncate
the group list rather than generating an error.

Minor changes:
  - Reduce the number of hand rolled versions of groupmember().
  - Do not assign to both cr_gid and cr_groups[0].
  - Modify ipfw to cache ucreds instead of part of their contents since
    they are immutable once referenced by more than one entity.

Submitted by:	Isilon Systems (initial implementation)
X-MFC after:	never
PR:		bin/113398 kern/133867
2009-06-19 17:10:35 +00:00
Attilio Rao
651175c9db Introduce support for adaptive spinning in lockmgr.
Actually, as it did receive few tuning, the support is disabled by
default, but it can opt-in with the option ADAPTIVE_LOCKMGRS.
Due to the nature of lockmgrs, adaptive spinning needs to be
selectively enabled for any interested lockmgr.
The support is bi-directional, or, in other ways, it will work in both
cases if the lock is held in read or write way.  In particular, the
read path is passible of further tunning using the sysctls
debug.lockmgr.retries and debug.lockmgr.loops .  Ideally, such sysctls
should be axed or compiled out before release.

Addictionally note that adaptive spinning doesn't cope well with
LK_SLEEPFAIL.  The reason is that many (and probabilly all) consumers
of LK_SLEEPFAIL are mainly interested in knowing if the interlock was
dropped or not in order to reacquire it and re-test initial conditions.
This directly interacts with adaptive spinning because lockmgr needs
to drop the interlock while spinning in order to avoid a deadlock
(further details in the comments inside the patch).

Final note: finding someone willing to help on tuning this with
relevant workloads would be either very important and appreciated.

Tested by:	jeff, pho
Requested by:	many
2009-06-17 01:55:42 +00:00
Sam Leffler
2c727cb9aa note abi change for IEEE80211_IOC_STA_INFO 2009-06-13 23:44:56 +00:00
Marko Zec
f089869fa5 Introduce a mechanism for detecting calls from outbound path of the
network stack when reentering the inbound path from netgraph, and
force queueing of mbufs at the outbound netgraph node.

The mechanism relies on two components.  First, in netgraph nodes
where outbound path of the network stack calls into netgraph, the
current thread has to be appropriately marked using the new
NG_OUTBOUND_THREAD_REF() macro before proceeding to call further
into the netgraph topology, and unmarked using the
NG_OUTBOUND_THREAD_UNREF() macro before returning to the caller.
Second, netgraph nodes which can potentially reenter the network
stack in the inbound path have to mark their inbound hooks using
NG_HOOK_SET_TO_INBOUND() macro.  The netgraph framework will then
detect when there is a danger of a call graph looping back from
outbound to inbound path via netgraph, and defer handing off the
mbufs to the "inbound" node to a worker thread with a clean stack.

In this first pass only the most obvious netgraph nodes have been
updated to ensure no outbound to inbound calls can occur.  Nodes
such as ng_ipfw, ng_gif etc. should be further examined whether a
potential for outbound to inbound call looping exists.

This commit changes the layout of struct thread, but due to
__FreeBSD_version number shortage a version bump has been omitted
at this time, nevertheless kernel and modules have to be rebuilt.

Reviewed by:	julian, rwatson, bz
Approved by:	julian (mentor)
2009-06-11 16:50:49 +00:00
Marko Zec
bc29160df3 Introduce an infrastructure for dismantling vnet instances.
Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state.  The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.

While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers.  Many of such issues are
already known and will be incrementaly fixed over the next weeks in
smaller incremental commits.

Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels.  Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.

Bump __FreeBSD_version to 800097.
Reviewed by:	bz, julian
Approved by:	rwatson, kib (re), julian (mentor)
2009-06-08 17:15:40 +00:00
Ed Schouten
89f98d57d6 Remove window(1) from the base system.
Some time ago Tom Rhodes sent me an email that he was willing to perform
various cleanups to the window(1) source code. After some discussion, we
both decided the best thing to do, was to move window(1) to the ports
tree. The application isn't used a lot nowadays, mainly because it has
been superseeded by screen, tmux, etc.

A couple of hours ago Tom committed window(1) to ports (misc/window), so
I'm removing it from the tree. I don't think people will really miss it,
but I'm describing the change in UPDATING anyway.

Discussed with:	trhodes, pav, kib
Approved by:	re
2009-06-02 13:44:36 +00:00
Doug Barton
ce62079b6b Add a note about the change to rcorder for pf and ipfw. 2009-06-01 22:47:59 +00:00
Bjoern A. Zeeb
ab1a4d4834 Decrement __FreeBSD_version again to 96 as we are runing out of digits
and want to be conservative - so not more than one version bump per day.

Discussed with:	jhb, kensmith
2009-06-01 18:07:38 +00:00
Robert Watson
529cb8e371 Update UPDATING for NETISR2 merge, fix a typo in another UPDATING entry. 2009-06-01 16:00:36 +00:00
Bjoern A. Zeeb
c2c2a7c11e Convert the two dimensional array to be malloced and introduce
an accessor function to get the correct rnh pointer back.

Update netstat to get the correct pointer using kvm_read()
as well.

This not only fixes the ABI problem depending on the kernel
option but also permits the tunable to overwrite the kernel
option at boot time up to MAXFIBS, enlarging the number of
FIBs without having to recompile. So people could just use
GENERIC now.

Reviewed by:	julian, rwatson, zec
X-MFC:		not possible
2009-06-01 15:49:42 +00:00
Attilio Rao
faef64cc39 Remove the now invalid (and possibly unused) debug.mpsafevfs
sysctl/tunable.

Reviewed by:	emaste
Sponsored by:	Sandvine Incorporated
2009-05-30 23:52:23 +00:00
Edward Tomasz Napierala
2a61ba476b Bump __FreeBSD_version after addition of VOP_ACCESSX(9). 2009-05-30 14:01:01 +00:00
Maxim Konovalov
2a5f2c1d7f o Add missed quotation mark. 2009-05-29 19:45:39 +00:00
Edward Tomasz Napierala
b89fed6747 Update __FreeBSD_version after addition of mnt_xflag. Add a note
to UPDATING.
2009-05-29 18:50:27 +00:00
Attilio Rao
1ae1c2a3bd Reverse the logic for ADAPTIVE_SX option and enable it by default.
Introduce for this operation the reverse NO_ADAPTIVE_SX option.
The flag SX_ADAPTIVESPIN to be passed to sx_init_flags(9) gets suppressed
and the new flag, offering the reversed logic, SX_NOADAPTIVE is added.

Additively implements adaptive spininning for sx held in shared mode.
The spinning limit can be handled through sysctls in order to be tuned
while the code doesn't reach the release, after which time they should
be dropped probabilly.

This change has made been necessary by recent benchmarks where it does
improve concurrency of workloads in presence of high contention
(ie. ZFS).

KPI breakage is documented by __FreeBSD_version bumping, manpage and
UPDATING updates.

Requested by:	jeff, kmacy
Reviewed by:	jeff
Tested by:	pho
2009-05-29 01:49:27 +00:00
Jamie Gritton
0304c73163 Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails.  Child jails may be restricted more than their parents,
but never less.  Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system.  Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings.  The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by:	bz (mentor)
2009-05-27 14:11:23 +00:00
Marko Zec
37f17770e0 V_irtualize the if_clone framework, thus allowing for clonable ifnets
to optionally have overlapping unit numbers if attached in different
vnets.

At this stage if_loop is the only clonable ifnet class that has been
extended to allow for such overlapping allocation of unit numbers, i.e.
in each vnet it is possible to have a lo0 interface.  Other clonable ifnet
classes remain to operate with traditional semantics, i.e. each instance
of a clonable ifnet will be assigned a globally unique unit number,
regardless in which vnet such an ifnet becomes instantiated.

While here, garbage collect unused _lo_list field in struct vnet_net,
as well as improve indentation for #defines in sys/net/vnet.h.

The layout of struct vnet_net has changed, therefore bump
__FreeBSD_version.

This change has no functional impact on nooptions VIMAGE kernel builds.

Reviewed by:	bz, brooks
Approved by:	julian (mentor)
2009-05-23 21:43:44 +00:00
Joel Dahl
813bb2c94e Fix minor typo. 2009-05-23 09:24:07 +00:00
Edwin Groothuis
609e2d1b62 Rework the text for the import of zic(8) at 20090523.
Suggested by Niclas Zeising (and he was absolutely right on it!)
2009-05-23 08:49:55 +00:00
Edwin Groothuis
dfc79e892f MFV of tzcode2009e:
Upgrade of the tzcode from 2004a to 2009e.

Changes are numerous, but include...

- New format of the output of zic, which supports both 32 and 64
  bit time_t formats.

- zdump on 64 bit platforms will actually produce some output instead
  of doing nothing for a looooooooong time.

- linux_base-fX, with X >= at least 8, will work without problems related
  to the local time again.

The original patch, based on the 2008e, has been running for a long
time on both my laptop and desktop machine and have been tested by
other people.

After the installation of this code and the running of zic(8), you
need to run tzsetup(8) again to install the new datafile.

Approved by:	wollman@ for usr.sbin/zic
MFC after:	1 month
2009-05-23 06:31:50 +00:00
Andrew Thompson
9360ae4073 Rename the usb sysctl tree from hw.usb2.* back to hw.usb.*.
Submitted by:	Hans Petter Selasky
2009-05-21 01:48:42 +00:00
Sam Leffler
23790ac0d7 bump for net80211 monitor mode changes 2009-05-20 20:05:56 +00:00
Marko Zec
f6dfe47a14 Permit buiding kernels with options VIMAGE, restricted to only a single
active network stack instance.  Turning on options VIMAGE at compile
time yields the following changes relative to default kernel build:

1) V_ accessor macros for virtualized variables resolve to structure
fields via base pointers, instead of being resolved as fields in global
structs or plain global variables.  As an example, V_ifnet becomes:

    options VIMAGE:          ((struct vnet_net *) vnet_net)->_ifnet
    default build:           vnet_net_0._ifnet
    options VIMAGE_GLOBALS:  ifnet

2) INIT_VNET_* macros will declare and set up base pointers to be used
by V_ accessor macros, instead of resolving to whitespace:

    INIT_VNET_NET(ifp->if_vnet); becomes

    struct vnet_net *vnet_net = (ifp->if_vnet)->mod_data[VNET_MOD_NET];

3) Memory for vnet modules registered via vnet_mod_register() is now
allocated at run time in sys/kern/kern_vimage.c, instead of per vnet
module structs being declared as globals.  If required, vnet modules
can now request the framework to provide them with allocated bzeroed
memory by filling in the vmi_size field in their vmi_modinfo structures.

4) structs socket, ifnet, inpcbinfo, tcpcb and syncache_head are
extended to hold a pointer to the parent vnet.  options VIMAGE builds
will fill in those fields as required.

5) curvnet is introduced as a new global variable in options VIMAGE
builds, always pointing to the default and only struct vnet.

6) struct sysctl_oid has been extended with additional two fields to
store major and minor virtualization module identifiers, oid_v_subs and
oid_v_mod.  SYSCTL_V_* family of macros will fill in those fields
accordingly, and store the offset in the appropriate vnet container
struct in oid_arg1.
In sysctl handlers dealing with virtualized sysctls, the
SYSCTL_RESOLVE_V_ARG1() macro will compute the address of the target
variable and make it available in arg1 variable for further processing.

Unused fields in structs vnet_inet, vnet_inet6 and vnet_ipfw have
been deleted.

Reviewed by:	bz, rwatson
Approved by:	julian (mentor)
2009-04-30 13:36:26 +00:00