FreeBSD src
Go to file
Rick Macklem 90d2dfab19 Merge the pNFS server code from projects/pnfs-planb-server into head.
This code merge adds a pNFS service to the NFSv4.1 server. Although it is
a large commit it should not affect behaviour for a non-pNFS NFS server.
Some documentation on how this works can be found at:
http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt
and will hopefully be turned into a proper document soon.
This is a merge of the kernel code. Userland and man page changes will
come soon, once the dust settles on this merge.
It has passed a "make universe", so I hope it will not cause build problems.
It also adds NFSv4.1 server support for the "current stateid".

Here is a brief overview of the pNFS service:
A pNFS service separates the Read/Write oeprations from all the other NFSv4.1
Metadata operations. It is hoped that this separation allows a pNFS service
to be configured that exceeds the limits of a single NFS server for either
storage capacity and/or I/O bandwidth.
It is possible to configure mirroring within the data servers (DSs) so that
the data storage file for an MDS file will be mirrored on two or more of
the DSs.
When this is used, failure of a DS will not stop the pNFS service and a
failed DS can be recovered once repaired while the pNFS service continues
to operate.  Although two way mirroring would be the norm, it is possible
to set a mirroring level of up to four or the number of DSs, whichever is
less.
The Metadata server will always be a single point of failure,
just as a single NFS server is.

A Plan B pNFS service consists of a single MetaData Server (MDS) and K
Data Servers (DS), all of which are recent FreeBSD systems.
Clients will mount the MDS as they would a single NFS server.
When files are created, the MDS creates a file tree identical to what a
single NFS server creates, except that all the regular (VREG) files will
be empty. As such, if you look at the exported tree on the MDS directly
on the MDS server (not via an NFS mount), the files will all be of size 0.
Each of these files will also have two extended attributes in the system
attribute name space:
pnfsd.dsfile - This extended attrbute stores the information that
    the MDS needs to find the data storage file(s) on DS(s) for this file.
pnfsd.dsattr - This extended attribute stores the Size, AccessTime, ModifyTime
    and Change attributes for the file, so that the MDS doesn't need to
    acquire the attributes from the DS for every Getattr operation.
For each regular (VREG) file, the MDS creates a data storage file on one
(or more if mirroring is enabled) of the DSs in one of the "dsNN"
subdirectories.  The name of this file is the file handle
of the file on the MDS in hexadecimal so that the name is unique.
The DSs use subdirectories named "ds0" to "dsN" so that no one directory
gets too large. The value of "N" is set via the sysctl vfs.nfsd.dsdirsize
on the MDS, with the default being 20.
For production servers that will store a lot of files, this value should
probably be much larger.
It can be increased when the "nfsd" daemon is not running on the MDS,
once the "dsK" directories are created.

For pNFS aware NFSv4.1 clients, the FreeBSD server will return two pieces
of information to the client that allows it to do I/O directly to the DS.
DeviceInfo - This is relatively static information that defines what a DS
             is. The critical bits of information returned by the FreeBSD
             server is the IP address of the DS and, for the Flexible
             File layout, that NFSv4.1 is to be used and that it is
             "tightly coupled".
             There is a "deviceid" which identifies the DeviceInfo.
Layout     - This is per file and can be recalled by the server when it
             is no longer valid. For the FreeBSD server, there is support
             for two types of layout, call File and Flexible File layout.
             Both allow the client to do I/O on the DS via NFSv4.1 I/O
             operations. The Flexible File layout is a more recent variant
             that allows specification of mirrors, where the client is
             expected to do writes to all mirrors to maintain them in a
             consistent state. The Flexible File layout also allows the
             client to report I/O errors for a DS back to the MDS.
             The Flexible File layout supports two variants referred to as
             "tightly coupled" vs "loosely coupled". The FreeBSD server always
             uses the "tightly coupled" variant where the client uses the
             same credentials to do I/O on the DS as it would on the MDS.
             For the "loosely coupled" variant, the layout specifies a
             synthetic user/group that the client uses to do I/O on the DS.
             The FreeBSD server does not do striping and always returns
             layouts for the entire file. The critical information in a layout
             is Read vs Read/Writea and DeviceID(s) that identify which
             DS(s) the data is stored on.

At this time, the MDS generates File Layout layouts to NFSv4.1 clients
that know how to do pNFS for the non-mirrored DS case unless the sysctl
vfs.nfsd.default_flexfile is set non-zero, in which case Flexible File
layouts are generated.
The mirrored DS configuration always generates Flexible File layouts.
For NFS clients that do not support NFSv4.1 pNFS, all I/O operations
are done against the MDS which acts as a proxy for the appropriate DS(s).
When the MDS receives an I/O RPC, it will do the RPC on the DS as a proxy.
If the DS is on the same machine, the MDS/DS will do the RPC on the DS as
a proxy and so on, until the machine runs out of some resource, such as
session slots or mbufs.
As such, DSs must be separate systems from the MDS.

Tested by:	james.rose@framestore.com
Relnotes:	yes
2018-06-12 19:36:32 +00:00
bin Add an example to the chflags(1) man page. 2018-06-12 16:44:13 +00:00
cddl Process CUs with a language attribute of DW_LANG_Mips_Assembler. 2018-06-11 16:33:36 +00:00
contrib Add DW_LANG_* definitions from DWARF 4 and 5. 2018-06-09 14:50:38 +00:00
crypto Merge upstream patch to unbreak tunnel forwarding. 2018-05-16 14:04:39 +00:00
etc User service foo rather than /etc/rc.d/foo. 2018-06-11 22:48:34 +00:00
gnu Use a script wrapper for <compress>grep 2018-04-25 13:23:58 +00:00
include Add time2posix and posix2time to time.h 2018-05-25 13:40:05 +00:00
kerberos5 various: general adoption of SPDX licensing ID tags. 2017-11-27 15:37:16 +00:00
lib pmc gcc fixups 2018-06-11 16:09:54 +00:00
libexec Make rtld use libc_nossp_pic.a. Remove SSP shims. 2018-05-09 10:30:56 +00:00
release Enable USB OTG serial terminal on ARM SD card images. This configures 2018-06-12 16:45:52 +00:00
rescue Avoid referencing private lib names directly. 2017-11-10 07:53:02 +00:00
sbin Follow r333233, add fat32lba description to gpart(8) 2018-06-12 01:50:58 +00:00
secure Upgrade to OpenSSH 7.7p1. 2018-05-11 13:22:43 +00:00
share Add hrs, meta and myself to share/misc/committers-ports.dot 2018-06-12 07:51:03 +00:00
stand lualoader: Match Forth module-loading behavior w.r.t flags 2018-06-12 18:42:41 +00:00
sys Merge the pNFS server code from projects/pnfs-planb-server into head. 2018-06-12 19:36:32 +00:00
targets Add kernel and userspace code to dump the firmware state of supported 2018-03-08 15:21:56 +00:00
tests audit(4): add tests for stat(2) and friends 2018-06-10 21:36:29 +00:00
tools WITHOUT_NLS cleanup of more empty dirs. 2018-06-12 19:26:25 +00:00
usr.bin Fix memory leak 2018-06-12 16:42:11 +00:00
usr.sbin cpucontrol: 2018-06-12 18:58:56 +00:00
.arcconfig callsign isn't required anymore 2016-09-29 06:19:45 +00:00
.arclint arc lint: ignore /tests/ in chmod 2017-12-19 03:38:06 +00:00
.gitattributes .git*: add gitattributes and gitignore 2017-12-25 21:07:54 +00:00
.gitignore .git*: add gitattributes and gitignore 2017-12-25 21:07:54 +00:00
COPYRIGHT Remove 'All Rights Reserved' from the collection copyright and templates. 2018-05-09 02:02:49 +00:00
LOCKS LOCKS: update current locks 2018-06-09 03:08:04 +00:00
MAINTAINERS Pass on bhyve kernel module maintenance to 2018-06-10 04:25:19 +00:00
Makefile Restore arm, riscv, sparc64, and mips to UNIVERSE after r334128 2018-05-24 14:01:22 +00:00
Makefile.inc1 Bump __FreeBSD_version after r334881 and force libdwarf to be rebuilt. 2018-06-09 20:01:03 +00:00
Makefile.libcompat libpmc/pmu: enable for i386 as well 2018-05-31 22:26:55 +00:00
Makefile.sys.inc AUTO_OBJ: For all top-level targets enforce using an OBJDIR. 2017-12-05 21:29:47 +00:00
ObsoleteFiles.inc Revert r334929 2018-06-10 19:15:38 +00:00
README README: Reduce the textdump; describe the project 2018-05-23 04:09:01 +00:00
README.md README: Reduce the textdump; describe the project 2018-05-23 04:09:01 +00:00
UPDATING Revert size limits. 2018-06-11 20:38:30 +00:00

FreeBSD Source:

This is the top level of the FreeBSD source directory. This file was last revised on: FreeBSD

FreeBSD is an operating system used to power modern servers, desktops, and embedded platforms. A large community has continually developed it for more than thirty years. Its advanced networking, security, and storage features have made FreeBSD the platform of choice for many of the busiest web sites and most pervasive embedded networking and storage devices.

For copyright information, please see the file COPYRIGHT in this directory. Additional copyright information also exists for some sources in this tree - please see the specific source directories for more information.

The Makefile in this directory supports a number of targets for building components (or all) of the FreeBSD source tree. See build(7), config(8), https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html, and https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig.html for more information, including setting make(1) variables.

Source Roadmap:

bin		System/user commands.

cddl		Various commands and libraries under the Common Development
		and Distribution License.

contrib		Packages contributed by 3rd parties.

crypto		Cryptography stuff (see crypto/README).

etc		Template files for /etc.

gnu		Various commands and libraries under the GNU Public License.
		Please see gnu/COPYING* for more information.

include		System include files.

kerberos5	Kerberos5 (Heimdal) package.

lib		System libraries.

libexec		System daemons.

release		Release building Makefile & associated tools.

rescue		Build system for statically linked /rescue utilities.

sbin		System commands.

secure		Cryptographic libraries and commands.

share		Shared resources.

stand		Boot loader sources.

sys		Kernel sources.

sys/<arch>/conf Kernel configuration file

tests		Regression tests which can be run by Kyua.  See tests/README
		for additional information.

tools		Utilities for regression testing and miscellaneous tasks.

usr.bin		User commands.

usr.sbin	System administration commands.

For information on synchronizing your source tree with one or more of the FreeBSD Project's development branches, please see:

https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/current-stable.html