Kirk McKusick 47806d1b93 Occasional cylinder-group check-hash errors were being reported on
systems running with a heavy filesystem load. Tracking down this
bug was elusive because there were actually two problems. Sometimes
the in-memory check hash was wrong and sometimes the check hash
computed when doing the read was wrong. The occurrence of either
error caused a check-hash mismatch to be reported.

The first error was that the check hash in the in-memory cylinder
group was incorrect. This error was caused by the following
sequence of events:

- We read a cylinder-group buffer and the check hash is valid.
- We update its cg_time and cg_old_time which makes the in-memory
  check-hash value invalid but we do not mark the cylinder group dirty.
- We do not make any other changes to the cylinder group, so we
  never mark it dirty, thus do not write it out, and hence never
  update the incorrect check hash for the in-memory buffer.
- Later, the buffer gets freed, but the page with the old incorrect
  check hash is still in the VM cache.
- Later, we read the cylinder group again, and the first page with
  the old check hash is still in the VM cache, but some other pages
  are not, so we have to do a read.
- The read does not actually get the first page from disk, but rather
  from the VM cache, resulting in the old check hash in the buffer.
- The value computed after doing the read does not match causing the
  error to be printed.

The fix for this problem is to only set cg_time and cg_old_time as
the cylinder group is being written to disk. This keeps the in-memory
check-hash valid unless the cylinder group has had other modifications
which will require it to be written with a new check hash calculated.
It also requires that the check hash be recalculated in the in-memory
cylinder group when it is marked clean after doing a background write.

The second problem was that the check hash computed at the end of the
read was incorrect because the calculation of the check hash on
completion of the read was being done too soon.

- When a read completes we had the following sequence:

  - bufdone()
  -- b_ckhashcalc (calculates check hash)
  -- bufdone_finish()
  --- vfs_vmio_iodone() (replaces bogus pages with the cached ones)

- When we are reading a buffer where one or more pages are already
  in memory (but not all pages, or we wouldn't be doing the read),
  the I/O is done with bogus_page mapped in for the pages that exist
  in the VM cache. This mapping is done to avoid corrupting the
  cached pages if there is any I/O overrun. The vfs_vmio_iodone()
  function is responsible for replacing the bogus_page(s) with the
  cached ones. But we were calculating the check hash before the
  bogus_page(s) were replaced. Hence, when we were calculating the
  check hash, we were partly reading from bogus_page, which means
  we calculated a bad check hash (e.g., because multiple pages have
  been mapped to bogus_page, so its contents are indeterminate).

The second fix is to move the check-hash calculation from bufdone()
to bufdone_finish() after the call to vfs_vmio_iodone() so that it
computes the check hash over the correct set of pages.

With these two changes, the occasional cylinder-group check-hash
errors are gone.

Submitted by: David Pfitzner <dpfitzner@netflix.com>
Reviewed by: kib
Tested by: David Pfitzner
2018-02-06 00:19:46 +00:00
2018-02-05 18:48:00 +00:00
2018-02-05 18:10:28 +00:00
2017-12-07 18:02:57 +00:00
2017-12-29 23:56:06 +00:00
2018-02-05 18:48:00 +00:00
2017-12-19 03:38:06 +00:00
2017-12-31 16:48:04 +00:00
2017-12-07 17:37:15 +00:00
2017-12-07 17:37:15 +00:00

FreeBSD Source:

This is the top level of the FreeBSD source directory. This file was last revised on: FreeBSD

For copyright information, please see the file COPYRIGHT in this directory (additional copyright information also exists for some sources in this tree - please see the specific source directories for more information).

The Makefile in this directory supports a number of targets for building components (or all) of the FreeBSD source tree. See build(7) and https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html for more information, including setting make(1) variables.

The buildkernel and installkernel targets build and install the kernel and the modules (see below). Please see the top of the Makefile in this directory for more information on the standard build targets and compile-time flags.

Building a kernel is a somewhat more involved process. See build(7), config(8), and https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig.html for more information.

Note: If you want to build and install the kernel with the buildkernel and installkernel targets, you might need to build world before. More information is available in the handbook.

The kernel configuration files reside in the sys/<arch>/conf sub-directory. GENERIC is the default configuration used in release builds. NOTES contains entries and documentation for all possible devices, not just those commonly used.

Source Roadmap:

bin				System/user commands.

cddl			Various commands and libraries under the Common Development  
				and Distribution License.

contrib			Packages contributed by 3rd parties.

crypto			Cryptography stuff (see crypto/README).

etc				Template files for /etc.

gnu				Various commands and libraries under the GNU Public License.  
				Please see gnu/COPYING* for more information.

include			System include files.

kerberos5		Kerberos5 (Heimdal) package.

lib				System libraries.

libexec			System daemons.

release			Release building Makefile & associated tools.

rescue			Build system for statically linked /rescue utilities.

sbin			System commands.

secure			Cryptographic libraries and commands.

share			Shared resources.

stand			Boot loader sources.

sys				Kernel sources.

tests			Regression tests which can be run by Kyua.  See tests/README
				for additional information.

tools			Utilities for regression testing and miscellaneous tasks.

usr.bin			User commands.

usr.sbin		System administration commands.

For information on synchronizing your source tree with one or more of the FreeBSD Project's development branches, please see:

https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/current-stable.html

Description
freebsd with flexible iflib nic queues
Readme 2.6 GiB
Languages
C 60.1%
C++ 26.1%
Roff 4.9%
Shell 3%
Assembly 1.7%
Other 3.7%