Commit Graph

264 Commits

Author SHA1 Message Date
John Baldwin
a08d2e7fe1 - Conditionally acquire Giant in mdstart_vnode(), mdcreate_vnode(), and
mddestroy() only if the file is from a non-MPSAFE VFS.
- No longer unconditionally hold Giant in the md kthread for vnode-backed
  kthreads.
- Improve the handling of the thread exit race when destroying an md
  device.
2006-03-28 21:25:11 +00:00
Wojciech A. Koszek
c27a895433 Teach md(4) and mdconfig(8) how to understand XML. Right now there won't be
a problem with listing large number of md(4) devices. Either 'list' or
'query' mode uses XML.

Additionally, new functionality was introduced. It's possible to pass
multiple devices to -u:

	# ./mdconfig -l -u md0,md1

Approved by:	cognet (mentor)
2006-03-26 23:21:11 +00:00
Luigi Rizzo
de64f22aa4 make sure that the start and end preloaded MFS markers are
in contiguous strings, and that the compiler does not optimize them
away because it thinks they are unused.
2006-01-31 13:35:30 +00:00
Pawel Jakub Dawidek
b322d85d53 Call NDFREE() only when vn_open() succeeded.
MFC after:	3 days
2006-01-27 11:27:55 +00:00
Maxim Konovalov
6c3cd0e2f6 o Fix typos in the comments.
Submitted by:	Wojciech A. Koszek
2005-12-28 15:18:18 +00:00
Robert Watson
5bb84bc84b Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
Poul-Henning Kamp
947fc8de03 Make sure that the worker thread knows the type early enough to
grab Giant for vnode backing.

Found by:	pho & tegge
2005-10-06 19:47:04 +00:00
Poul-Henning Kamp
9b00ca1961 Fix configuration locking in MD.
Remove  md_mtx.

Remove GIANT from the mdctl device driver and avoid DROP_GIANT,
PICKUP_GIANT and geom events since we can call into GEOM directly
now.

Pick up Giant around vn_close().

Apply an exclusive sx around mdctls ioctl and preloading to protect
lists etc..

Don't initialize our lock (md_mtx or md_sx) from a
SYSINIT when there is a perfectly good pair of _fini/_init
functions to do it from.

Prune any final fractional sector from the mediasize to
keep GEOM happy.

Cleanups:

Unify MDIOVERSION check in (x)mdctlioctl()

Add pointer to start() routine to softc to eliminate a switch{}

Inline guts of mddetach().

Always pass error pointer to mdnew(), simplify implementation.
2005-09-19 06:55:27 +00:00
Poul-Henning Kamp
9fbea3e365 Do not destroy the queue mutex until the thread is done with it. 2005-09-11 12:35:32 +00:00
Pawel Jakub Dawidek
7ee3c044d0 - Add md_mtx lock to protect ID number and list of devices.
- Always check mdnew() return value, as even in !autounit case
  kthread_create() can fail.

Those two changes fix serval panics provked by simple stress test.

Tested by:	Kris The BugMagnet
MFC after:	3 days
2005-08-31 19:45:11 +00:00
Christian S.J. Peron
8677689134 Ensure that file flags such as schg, sappnd (and others) are honored
by md(4). Before this change, it was possible to by-pass these flags
by creating memory disks which used a file as a backing store and
writing to the device.

This was discussed by the security team, and although this is problematic,
it was decided that it was not critical as we never guarantee that root will
be restricted.

This change implements the following behavior changes:

-If the user specifies the readonly flag, unset write operations before
 opening the file. If the FWRITE mask is unset, the device will be
 created with the MD_READONLY mask set. (readonly)
-Add a check in g_md_access which checks to see if the MD_READONLY mask
 is set, if so return EROFS
-Do not gracefully downgrade access modes without telling the user. Instead
 make the user specify their intentions for the device (assuming the file is
 read only). This seems like the more correct way to handle things.

This is a RELENG_6 candidate.

PR:		kern/84635
Reviewed by:	phk
2005-08-17 01:24:55 +00:00
Alan Cox
e340fc602b Request a CPU private mapping from sf_buf_alloc(). If the swap-backed
memory disk is larger than the number of available sf_bufs, this improves
performance on SMPs by eliminating interprocessor TLB shootdowns.  For
example, with 6656 sf_bufs, the default on my test machine, and a 256MB
swap-backed memory disk, I see the command
"dd if=/dev/md0 of=/dev/null bs=64k" achieve ~489MB/sec with the default,
shared mappings, and ~587MB/sec with CPU private mappings.
2005-02-13 21:51:50 +00:00
Poul-Henning Kamp
d9aaa28f63 Use MAXMINOR 2005-01-29 16:50:04 +00:00
Pawel Jakub Dawidek
1db17c6db2 - Don't destroy UMA zone on error in mdcreate_malloc(), because we need it
in mddestroy() to properly free already allocated memory.
  This fixes a panic when we want to create too big memory backed device
  with preallocate memory (-o reserve).
- Remove redundant { }.

MFC after:	1 week
2005-01-22 19:56:03 +00:00
Poul-Henning Kamp
9d3a77c463 Add a couple of mtx_asserts() to try to narrow down the window on
a bug repeatedly reported.
2005-01-22 19:08:50 +00:00
Warner Losh
098ca2bda9 Start each of the license/copyright comments with /*-, minor shuffle of lines 2005-01-06 01:43:34 +00:00
Alan Cox
c935314fae Add needed synchronization to the error handling code that was introduced
in revision 1.141.

Lock assertion failures reported by: Kris Kennaway
2005-01-05 05:32:52 +00:00
John Baldwin
63710c4d35 Stop explicitly touching td_base_pri outside of the scheduler and simply
set a thread's priority via sched_prio() when that is the desired action.
The schedulers will start managing td_base_pri internally shortly.
2004-12-30 20:29:58 +00:00
Pawel Jakub Dawidek
88b5b78d59 Rewrite piece of code which I committed some time ago that allows to
show file name for 'mdconfig -l -u <x>' command.
This allows to preserve API/ABI compatibility with version 0 (that's why
I changed version number back to 0) and will allow to merge this change
to RELENG_5.

MFC after:	5 days
2004-12-27 17:20:06 +00:00
Marcel Moolenaar
8b6fc67a49 Fix the MDIOCDETACH ioctl() for md(4). Now that the md_file field in
the mdio structure is an array and not a pointer, we cannot test for
it to be NULL. It never is. Instead, test for md_file[0] to be '\0'.
2004-11-13 05:00:12 +00:00
Pawel Jakub Dawidek
e3ed29a739 Be consistent and use 'if (error != 0)' instead of 'if (error)' everywhere. 2004-11-06 13:16:35 +00:00
Pawel Jakub Dawidek
61a6eb62ec For file backed md(4) devices output their source file via
'mdconfig -l -u <unit>'.
Bump version number, as this change breaks ABI/API.
2004-11-06 13:07:02 +00:00
Poul-Henning Kamp
3b66ad07db Don't explicitly call g_waitidle(), it happens automagically now. 2004-10-23 20:50:06 +00:00
Brian Feldman
812851b6c9 Account for failure in vm_pager_allocate() or vm_pager_get_pages() in
md(8).  The former is generally not going to fail, but the latter can
fail when the underlying swap device returns an error.

There are still plenty of other places where vm_pager_get_pages() failing
will lead directly to crashes, so it's a good idea to put your swap on
RAID if you care enough to put any of your disks on RAID....
2004-10-12 04:47:16 +00:00
Pawel Jakub Dawidek
e4cdd0d4b5 Actually this order (unlock, wakeup) in this case is race-safe and can
save us 2 context switches.

Explained by:	njl
2004-09-18 09:16:19 +00:00
Pawel Jakub Dawidek
b830359bc5 - Make md(4) 64-bit clean.
After this change it should be possible to use very big md(4) devices.
- Clean up and simplify the code a bit.
- Use humanize_number(3) to print size of md(4) devices.
- Add 't' suffix which stands for terabyte.
- Make '-S' to really work with all types of devices.
- Other minor changes.
2004-09-16 21:32:13 +00:00
Pawel Jakub Dawidek
fcd57fbe6f There is no need to keep 'npage' value inside our softc structure,
it is only used in one function. While doing so, change its type to
vm_ooffset_t.
We are still limited for swap-backed devices to 16TB on 32-bit architectures
where PAGE_SIZE is 4096 bytes.
2004-09-16 20:38:11 +00:00
Pawel Jakub Dawidek
a8a58d03f6 - Do not use bio_pblkno as it is going away anyway.
- Prefer bio_length than bio_bcount.
2004-09-16 19:42:17 +00:00
Pawel Jakub Dawidek
4b07ede4a7 First wakeup, then unlock. 2004-09-16 18:59:19 +00:00
Pawel Jakub Dawidek
6ab0a0aefe Type 'int' is too small for 'i' and 'lastp' variables. Use proper type,
which is vm_pindex_t (unsigned 64bit on i386).
2004-09-16 18:56:20 +00:00
Pawel Jakub Dawidek
2eafd8b126 Deallocate VM object on failure. 2004-09-14 19:55:07 +00:00
Pawel Jakub Dawidek
7a0970111f One more missing NDFREE(9). 2004-09-14 19:27:59 +00:00
Pawel Jakub Dawidek
52c6716fee - Don't forget about NDFREE() in case of vn_open() failure.
- Don't forget about vn_close() in case of failure.
2004-09-14 18:43:24 +00:00
Pawel Jakub Dawidek
f9963bbc08 Fix UMA zone leak. 2004-09-14 18:32:05 +00:00
Poul-Henning Kamp
affa470653 Use bioq_takefirst() 2004-09-07 07:54:45 +00:00
Colin Percival
972be79add Don't g_waitidle() when initializing a preloaded md. This fixes a
deadlock which otherwise occurs during the boot process.

Reported by:	kensmith
MFC after:	3 days
		(assuming that re@ approves)
2004-08-30 08:38:30 +00:00
Colin Percival
2b004a226d When creating a new md, wait for geom's event queue to become empty
before returning.  Device nodes are created via the "taste" mechanism,
so this is necessary in order to make sure that devfs entries are
created before mdconfig(8) returns.

This may be a MFC candidate for 5.3.

Suggested by:	phk
2004-08-22 19:44:24 +00:00
Poul-Henning Kamp
5721c9c76a Tag all geom classes in the tree with a version number. 2004-08-08 07:57:53 +00:00
Poul-Henning Kamp
199456973f Use a ->fini() from the geom class to destroy the control device.
Use default initialization of geom methods.
2004-08-08 06:47:43 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Pawel Jakub Dawidek
6a40892929 Fix panic which occurs when given sector size for memory-backed device
is less than DEV_BSIZE (512) bytes.

Reported by:	Mike Bristow <mike@urgle.com>
Approved by:	phk
2004-05-18 07:30:04 +00:00
Warner Losh
ed010cdfa2 Ooops, removed this acknowledgement bogusly.
Eagle Eyes: bde
2004-04-09 05:12:47 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Alan Cox
121230a40d In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it
should not.  Add a new parameter so that the caller can specify which is
the case.

Reported by:	dillon
2004-04-03 09:16:27 +00:00
Luigi Rizzo
5d4ca75e56 Fix a bug with preloaded image -- for some reason [that i don't
completely understand], md_takeroot() runs before md_preloaded(),
rendering both useless.
As a fix, move the body (effectively one line!) of md_takeroot()
into md_preloaded(), and get rid of the stuff that has become useless.

Bug and fix reported 10 days ago on -current, no reply.
2004-03-31 21:48:02 +00:00
Alan Cox
07be617f09 - Remove some unused #includes.
- Apply some style fixes to mdstart_swap().
2004-03-19 21:19:15 +00:00
Alan Cox
7cd53fdda8 Utilize sf_buf_alloc() and sf_buf_free() to implement the ephemeral
mappings required by mdstart_swap().  On i386, if the ephemeral mapping
is already in the sf_buf mapping cache, a swap-backed md performs
similarly to a malloc-backed md.  Even if the ephemeral mapping is not
cached, this implementation is still faster.  On 64-bit platforms, this
change has the effect of using the direct virtual-to-physical mapping,
avoiding ephemeral mapping overheads, such as TLB shootdowns on SMPs.

On a 2.4GHz, 400MHz FSB P4 Xeon configured with 64K sf_bufs and
"mdmfs -S -o async -s 128m md /mnt"

before:
dd if=/dev/md0 of=/dev/null bs=64k
134217728 bytes transferred in 0.430923 secs (311465697 bytes/sec)

after with cold sf_buf cache:
dd if=/dev/md0 of=/dev/null bs=64k
134217728 bytes transferred in 0.367948 secs (364773576 bytes/sec)

after with warm sf_buf cache:
dd if=/dev/md0 of=/dev/null bs=64k
134217728 bytes transferred in 0.252826 secs (530870010 bytes/sec)

malloc-backed md:
dd if=/dev/md0 of=/dev/null bs=64k
134217728 bytes transferred in 0.253126 secs (530240978 bytes/sec)
2004-03-18 18:23:37 +00:00
Alan Cox
33651381b7 Allow swap-backed devices to run without Giant. 2004-03-14 00:24:30 +00:00
Poul-Henning Kamp
7a6b2b6429 Fix a long-standing deadlock issue with vnode backed md(4) devices:
On vnode backed md(4) devices over a certain, currently undetermined
size relative to the buffer cache our "lemming-syncer" can provoke
a buffer starvation which puts the md thread to sleep on wdrain.

This generally tends to grind the entire system to a stop because the
event that is supposed to wake up the thread will not happen until a fair
bit of the piled up I/O requests in the system finish, and since a lot
of those are on a md(4) vnode backed device which is currently waiting
on wdrain until a fair amount of the piled up ... you get the picture.

The cure is to issue all VOP_WRITES on the vnode backing the device
with IO_SYNC.

In addition to more closely emulating a real disk device with a
non-lying write-cache, this makes the writes exempt from rate-limited
(there to avoid starving the buffer cache) and consequently prevents
the deadlock.

Unfortunately performance takes a hit.

Add "async" option to give people who know what they are doing the
old behaviour.
2004-03-10 20:41:09 +00:00
John Baldwin
6074439965 kthread_exit() no longer requires Giant, so don't force callers to acquire
Giant just to call kthread_exit().

Requested by:	many
2004-03-05 22:42:17 +00:00
Poul-Henning Kamp
9ed40643ea Make swapbacked md(4) devices respect the -x and -y emulation arguments. 2004-03-02 20:13:23 +00:00
Colin Percival
e07113d65c Use DEV_BSIZE byte sectors instead of PAGE_SIZE byte sectors for
swap-backed memory disks.  This reduces filesystem allocation overhead
and makes swap-backed memory disks compatible with broken code (dd,
for example) which expects to see 512 byte sectors.  The size of a
swap-backed memory disk must still be a multiple of the page size.

When performing page-aligned operations, this change has zero
performance impact.

Reviewed by:	phk
Approved by:	rwatson (mentor)
2004-02-29 15:58:54 +00:00
Poul-Henning Kamp
dc08ffec87 Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.

Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
2004-02-21 21:10:55 +00:00
Poul-Henning Kamp
d5a929dc17 Allow specification of a geometry for vnode backed devices as well as
for malloc backed devices.
2004-01-12 10:52:00 +00:00
Poul-Henning Kamp
0a93720637 Fix a locking problem with MD_ROOT_SIZE.
Retire md(4)'s static major number.
2003-12-13 18:12:58 +00:00
Poul-Henning Kamp
0eb1430969 Use the class->init() to hitch up preload devices, rather than rely on
the "old" SYSINIT.  This makes sure things happen in the right order.

XXX: md(4) needs to be fully geom-ified and in particluar /dev/md.ctl
should be abandonded for the GEOM OaM api.

Approved by:	re@
2003-11-18 18:19:26 +00:00
Poul-Henning Kamp
6e9a011a6a Don't initialize unused bio_blkno field. 2003-10-18 11:25:42 +00:00
Poul-Henning Kamp
70cd771337 The present defaults for the open and close for device drivers which
provide no methods does not make any sense, and is not used by any
driver.

It is a pretty hard to come up with even a theoretical concept of
a device driver which would always fail open and close with ENODEV.

Change the defaults to be nullopen() and nullclose() which simply
does nothing.

Remove explicit initializations to these from the drivers which
already used them.
2003-09-27 12:01:01 +00:00
John Baldwin
8b149b5131 Consistently use the BSD u_int and u_short instead of the SYSV uint and
ushort.  In most of these files, there was a mixture of both styles and
this change just makes them self-consistent.

Requested by:	bde (kern_ktrace.c)
2003-08-07 15:04:27 +00:00
Poul-Henning Kamp
8e28326a8b Change the implementation of swap backing to use the VM system in normal
ways, and drop the need for vm_pager_strategy().
2003-08-05 06:54:44 +00:00
Poul-Henning Kamp
7c89f162bc Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout. 2003-07-27 17:04:56 +00:00
Poul-Henning Kamp
8198a1a472 Remove 256 unit limit, there is no evil minor number encoding to
deal with any more.

Spotted by:	"Darren Freestone" <df@cops.org>
2003-06-22 11:31:38 +00:00
Poul-Henning Kamp
f075585f67 Remove the G_CLASS_INITIALIZER, we do not need it anymore. 2003-05-31 16:59:27 +00:00
Poul-Henning Kamp
17a1391990 The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent
deadlocks with vnode backed md(4) devices because md now uses a
kthread to run the bio requests instead of doing it directly from
the bio down path.
2003-05-31 16:42:45 +00:00
Alan Cox
f820bc501e Use vm_object_deallocate(), not vm_pager_deallocate(), to destroy a
vm object.  (vm_pager_deallocate() does not, in fact, destroy a vm object.)

Approved by:	re (scottl)
Reviewed by:	phk
2003-05-16 07:28:27 +00:00
Poul-Henning Kamp
6b60a2cd57 Call g_wither_geom(), instead of just setting the flag. 2003-05-02 06:18:58 +00:00
Poul-Henning Kamp
4e8bfe1482 Add a couple of undocumented test options to MD(4) to aid in regression
testting of GEOM.
2003-04-09 11:59:29 +00:00
Poul-Henning Kamp
4eba52a2d2 Remove all references to BIO_SETATTR. We will not be using it. 2003-04-03 19:19:36 +00:00
Poul-Henning Kamp
891619a66d Use bioq_flush() to drain a bio queue with a specific error code.
Retain the mistake of not updating the devstat API for now.

Spell bioq_disksort() consistently with the remaining bioq_*().

#include <geom/geom_disk.h> where this is more appropriate.
2003-04-01 15:06:26 +00:00
Poul-Henning Kamp
51a5b7f1ba Don't include <sys/disk.h>. 2003-04-01 13:33:28 +00:00
Poul-Henning Kamp
67ffbc63f3 remove a blank line. 2003-03-29 22:13:32 +00:00
Poul-Henning Kamp
83e13864c3 Allocate the toplevel indir with M_WAITOK to avoid complicating things
needlessly.

Detected by:	rwatsons EvilMalloc(9)
2003-03-27 10:14:36 +00:00
Poul-Henning Kamp
5d445dcb4e Change g_class initialization to sparse format. 2003-03-24 19:46:26 +00:00
Poul-Henning Kamp
b4b138c27f Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
Poul-Henning Kamp
60794e0478 Centralize the devstat handling for all GEOM disk device drivers
in geom_disk.c.

As a side effect this makes a lot of #include <sys/devicestat.h>
lines not needed and some biofinish() calls can be reduced to
biodone() again.
2003-03-08 08:01:31 +00:00
Poul-Henning Kamp
ebe789d61c Add a "-S sectorsize" option to enable Kirk to find a bug :-) 2003-03-03 13:05:00 +00:00
Poul-Henning Kamp
7ac40f5f59 Gigacommit to improve device-driver source compatibility between
branches:

Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.

This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.

Approved by:    re(scottl)
2003-03-03 12:15:54 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Poul-Henning Kamp
b3b3d1b7fe Mark our provider with G_PF_CANDELETE in the cases where this is actually
the case.
2003-02-11 12:35:44 +00:00
Poul-Henning Kamp
5777c5b989 NO_GEOM cleanup: unifdef 2003-01-30 13:12:31 +00:00
Poul-Henning Kamp
16bcbe8cf7 Implement MDIOCLIST which returns the unit numbers of configured md(4)
devices.

We use the md_pad[] array and if there are more units than its size the
last returned unit number will be -1, but the number of units returned
is correct.
2003-01-27 07:58:18 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Poul-Henning Kamp
26d48b4044 OK Ok, so I didn't check the NO_GEOM case for the final version...
Stumbled on by:	bde
2003-01-13 20:19:04 +00:00
Poul-Henning Kamp
a4f8615810 Enable the new h0h0magic code which on GEOM kernels make the md(4)
driver a _real_ GEOM driver.
2003-01-13 17:31:46 +00:00
Poul-Henning Kamp
0f8500a5b3 Add a mutex around the per unit bioqueue.
Only grab giant in the per unit kthread for SWAP and VNODE backed devices.

Initialize the bioq before the kthread gets a chance to study it.

Don't lock Giant in mddone_swap, we shouldn't need it.
2003-01-13 08:50:23 +00:00
Poul-Henning Kamp
64bfc43b26 Remove the printf which announces the creation of malloc disks: it is
inconsistent when we do not do it for swap or vnode.

We still printf for preloaded disks because of the weak debugging
options people have in embedded/tiny environments where this is
usually used.
2003-01-13 08:01:09 +00:00
Poul-Henning Kamp
6f4f00f114 Add code to make md(4) a GEOM device driver instead of relying in
the disk mini-layer.

This is currently not enabled.
2003-01-12 21:16:49 +00:00
Poul-Henning Kamp
a522a15950 Shift things around a bit in preparation for future evilness. 2003-01-12 17:39:29 +00:00
Ian Dowse
e176446dc8 Move the check for the MD_SHUTDOWN flag to before the tsleep() call
in the per-device kthread. This ensures that synchronisation with
mddestroy() succeeds even if the kthread was not waiting in tsleep()
at the time of the wakeup(). Among other things, this fixes the
problem of mdconfig getting stuck when an attempt is made to use a
zero-length file as a vnode-type backing store.

Approved by:	re
2002-11-30 22:03:53 +00:00
Poul-Henning Kamp
8689acc48c We want /dev/md0 for ramdisk roots, not /dev/md0c.
Sponsored by:	DARPA & NAI Labs
2002-10-21 20:08:28 +00:00
Poul-Henning Kamp
975b628f27 Use ENOSPC error return, not ENOMEM.
Use %jd rather than %lld.
2002-10-20 20:50:31 +00:00
Jake Burkholder
81bb0b95b1 MODINFO_SIZE metadata has type size_t, not unsigned. This makes preloaded
md root work on sparc64.
2002-10-13 18:19:22 +00:00
Scott Long
316ec49abd Some kernel threads try to do significant work, and the default KSTACK_PAGES
doesn't give them enough stack to do much before blowing away the pcb.
This adds MI and MD code to allow the allocation of an alternate kstack
who's size can be speficied when calling kthread_create.  Passing the
value 0 prevents the alternate kstack from being created.  Note that the
ia64 MD code is missing for now, and PowerPC was only partially written
due to the pmap.c being incomplete there.
Though this patch does not modify anything to make use of the alternate
kstack, acpi and usb are good candidates.

Reviewed by:	jake, peter, jhb
2002-10-02 07:44:29 +00:00
Poul-Henning Kamp
1cff889a46 Put the casts on the right hand side of =. 2002-09-28 14:17:24 +00:00
Peter Grehan
a0c6726472 Initialize fwsectors/fwheads to allow the DIOCGFWSECTORS and
DIOCGFWHEADS ioctls to return meaningful values to disklabel/newfs

Approved by: phk
2002-09-22 10:07:18 +00:00
Poul-Henning Kamp
7812d86f03 (This commit touches about 15 disk device drivers in a very consistent
and predictable way, and I apologize if I have gotten it wrong anywhere,
getting prior review on a patch like this is not feasible, considering
the number of people involved and hardware availability etc.)

If struct disklabel is the messenger: kill the messenger.

Inside struct disk we had a struct disklabel which disk drivers used to
communicate certain metrics to the disklayer above (GEOM or the disk
mini-layer).  This commit changes this communication to use four
explicit fields instead.

Amongst the benefits is that the fields do not get overwritten by
wrong or bogus on-disk disklabels.

Once that is clear, <sys/disk.h> which is included in the drivers
no longer need to pull <sys/disklabel.h> and <sys/diskslice.h> in,
the few places that needs them, have gotten explicit #includes for
them.

The disklabel inside struct disk is now only for internal use in
the disk mini-layer, so instead of embedding it, we malloc it as
we need it.

This concludes (modulus any mistakes) the series of disklabel related
commits.

I belive it all amounts to a NOP for all the rest of you :-)

Sponsored by:   DARPA & NAI Labs.
2002-09-20 19:36:05 +00:00
Archie Cobbs
4a6a94d8d8 Replace (ab)uses of "NULL" where "0" is really meant. 2002-08-22 21:24:01 +00:00
Maxime Henrion
6569d6f3ec Yet another warning fix for 64 bits platforms.
Reviewed by:	phk
2002-06-24 12:07:02 +00:00
Poul-Henning Kamp
58f3c42e6d mdcreate_vnode() isn't correctly clearing things out of the linked
list if the file is of 0 size or mdsetcred() fails.

Submitted by:	Martin Faxer <gmh003532@brfmasthugget.se>
2002-06-15 19:18:43 +00:00
Maxim Sobolev
d8a186ebbb - Whitespace only: use return statement consistentlt (return (foo), not
return(foo)), kill extra blank names between function names;
- fix format string in printf(): devtoname() returns string, not pointer.
2002-06-10 19:25:21 +00:00
Ian Dowse
5c97ca54e5 Use a per-device worker thread to avoid blocking in mdstrategy()
until the I/O completes. This fixes some easily reproducable deadlocks
that occur when using md(4) with GEOM.

Reviewed by:	phk
2002-06-03 22:09:04 +00:00
Poul-Henning Kamp
fcf867e9f7 Mis-edit in last commit. 2002-05-26 09:57:59 +00:00
Poul-Henning Kamp
fde2a2e414 Be a bit smarter about rewriting data so we don't loose too much performance.
Sponsored by: DARPA & NAI Labs.
2002-05-26 09:38:51 +00:00
Poul-Henning Kamp
f43b2bac72 Use an umazone per unit for allocating the sectors for malloc backing.
Clean up things properly when we unconfigure malloc backed units.

Sponsored by:	DARPA & NAI Labs.
2002-05-26 06:48:55 +00:00
Poul-Henning Kamp
c6517568df Give the "malloc" backing of md(4) an adaptive multilevel index tree to
remove the need for a contiguous array with pointers to all the sectors.

Try to make failure to malloc(9) memory a non-hang situation.

Eventually this will allow us to test the 64bit cleanness of the disk
I/O patch, but more work is outstanding here and elsewhere.

Sponsored by:	DARPA & NAI Labs.
2002-05-25 20:44:20 +00:00
Poul-Henning Kamp
9589c2561c Fix a memory-leak when configuring a vnode backed md(4) device fails.
Submitted by:	Martin Faxér <gmh003532@brfmasthugget.se>
MFC after:	4 weeks
2002-05-03 17:55:10 +00:00
Jeff Roberson
421e6a659e Remove unused include. 2002-03-20 09:55:07 +00:00
Bruce Evans
1ab0b5f920 The previous commit missed fixing 2 old printf format errors and
introduced a format printf error.
2002-03-19 04:07:29 +00:00
Andrew Gallatin
54ed0c3221 Fix printf warning caused by recent changes in bio_pblkno's type. 2002-03-19 01:45:04 +00:00
Kirk McKusick
0d2af52141 Introduce the new 64-bit size disk block, daddr64_t. Change
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
2002-03-15 18:49:47 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
Poul-Henning Kamp
e087ce2dd2 Staticize the malloc definitions.
Obtained from:	~bde/sys.dif.gz
2002-02-10 21:42:44 +00:00
Poul-Henning Kamp
3ca627fefa Gah! last commit botched indentation, fix indentation and some other
white-space nits while at it.
2002-01-21 20:57:03 +00:00
Poul-Henning Kamp
b4a4f93c5e Restructure slightly, eliminating some repetitive source lines and
making GEOM patches simpler and more readable at the same time.
2002-01-21 20:50:06 +00:00
Dima Dorfman
53d745bc7c Actually make use of the md_version field of 'struct mdio'. In order
not to needlessly break compatibility, decrement MDIOVERSION to 0.

Approved by:	phk
2001-12-20 06:38:21 +00:00
Matthew Dillon
7e76bb562e Implement IO_NOWDRAIN and B_NOWDRAIN - prevents the buffer cache from blocking
in wdrain during a write.  This flag needs to be used in devices whos
strategy routines turn-around and issue another high level I/O, such as
when MD turns around and issues a VOP_WRITE to vnode backing store, in order
to avoid deadlocking the dirty buffer draining code.

Remove a vprintf() warning from MD when the backing vnode is found to be
in-use.  The syncer of buf_daemon could be flushing the backing vnode at
the time of an MD operation so the warning is not correct.

MFC after:	1 week
2001-11-05 18:48:54 +00:00
John Baldwin
bd78cece5d Change the kernel's ucred API as follows:
- crhold() returns a reference to the ucred whose refcount it bumps.
- crcopy() now simply copies the credentials from one credential to
  another and has no return value.
- a new crshared() primitive is added which returns true if a ucred's
  refcount is > 1 and false (0) otherwise.
2001-10-11 23:38:17 +00:00
John Baldwin
9935282d50 Use crhold() instead of crdup(). The md(4) driver doesn't modify the ucred
that it uses, so it merely needs to bump its refcount to make it immutable
rather than obtain its own copy.
2001-10-09 16:37:51 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Maxim Sobolev
166cd0b42f OOPS, remove local change that somehow slipped into a commit (I swear that
I already deleted it some time ago). This should fix problem people have with
unsefined reference to `MD_PRELOAD_COMPRESSED'.

Submitted by:	Manfred Antar <null@pozo.com>
2001-08-27 17:48:37 +00:00
Maxim Sobolev
9d4b594558 - On module unload try to detach all configured disks and let unload proceed
if all disks were detached sucessfully;
- use consistent style for return statements and fix several others style
  inconsistencies.

Reviewed by:	ru
Approved by:	phk
2001-08-27 13:25:47 +00:00
Dima Dorfman
e0cebb4027 There is no MD_OBJET disk type, it's actually MD_SWAP. I guess the
former was either a previous or proposed name that kind of snuck in.
2001-08-16 03:04:49 +00:00
Dima Dorfman
26a0ee75c6 Introduce a force option, MD_FORCE, that instructs the driver to
bypass some extra anti-foot-shooting measures.  Currently, its only
effect is to allow detaching a device while it's still open (e.g.,
mounted).  This is useful for testing how the system reacts to a disk
suddenly going away, which can happen with some removeable media.

At this point, the force option is only checked on detach, so it
would've been possible to allow the option to be passed with the
MDIOCDETACH operation.  This was not done to allow the possibility of
having the force flag influence other tests in the future, which may
not necessarily deal with detaching the device.

Reviewed by:	sobomax
Approved by:	phk
2001-08-07 19:23:16 +00:00
Maxim Sobolev
fe603109d1 - Deny detaching requests until device is still open, otherwise it is possible
to hang or panic kernel by detaching disk from which fs is mounted;
- replace "md" with MD_NAME in yet another place.

Reviewed by:	phk
Approved by:	phk
2001-08-02 10:19:13 +00:00
Thomas Moestl
3d3c27fedf Make sure the total number of sectors is not 0 for a vnode-type md to
avoid a division by zero which would occur on open() in this case.

Reviewed by:	phk
2001-07-26 20:05:20 +00:00
Dima Dorfman
10b0e058bb Use MD_NAME and MDCTL_NAME constants where appropriate. 2001-07-18 13:32:38 +00:00
Matthew Dillon
0cddd8f023 With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
John Baldwin
277be4c219 We don't need the vm lock to perform a few simple calculations on the
md device's softc.
2001-06-25 18:24:52 +00:00
Poul-Henning Kamp
8adeb35aff Remove MFS compat bits. 2001-05-29 18:49:23 +00:00
Dima Dorfman
5a025167e7 Acquire vm_mtx before calling vm_pager_deallocate.
Reviewed by:	phk
2001-05-27 00:42:46 +00:00
John Baldwin
9dceb26b23 Sort includes. 2001-05-21 18:52:02 +00:00
Alfred Perlstein
2395531439 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
Poul-Henning Kamp
d4e6d409ca Polish error handling code using biofinish() 2001-05-08 09:09:32 +00:00
Poul-Henning Kamp
a468031ce8 Actually biofinish(struct bio *, struct devstat *, int error) is more general
than the bioerror().

Most of this patch is generated by scripts.
2001-05-06 20:00:03 +00:00
Poul-Henning Kamp
1f4ee1aac4 Fix a panic if MD devices were left half-created.
XXX: the real bug is that devstat isn't part of the disk minilayer.

PR:		27158
Submitted by:	Anders Nordby <anders@fix.no>
2001-05-06 17:17:23 +00:00
Mark Murray
fb919e4d5a Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
Poul-Henning Kamp
53233f9499 Fix a reference to the "vn" driver in a warning message. 2001-03-20 12:31:53 +00:00
Poul-Henning Kamp
0ac6032373 Use a more BIOS friendly geometry.
Submitted by:	joe
2001-03-09 20:06:30 +00:00
Poul-Henning Kamp
174b5e9aec Make "md" and "mdctl" macroized parameters.
Implement "-l" option to mdconfig which can list one or all md devices.

Submitted by:   Dima Dorfman <dima@unixfreak.org>
2001-02-25 13:12:57 +00:00
Poul-Henning Kamp
57e9624ec9 Make md/mdconfig do kld.
Submitted by:	dcs
2001-02-24 16:26:41 +00:00
Poul-Henning Kamp
c93849206e Remove devstat entries in mddelete()
Spotted:	tegge
2001-01-28 20:55:55 +00:00
Poul-Henning Kamp
96b6a55f97 General cleanup. 2001-01-21 22:57:56 +00:00
Peter Wemm
dc57d7c6dc Fix a maybe-not-so-harmless warning. 2001-01-19 09:02:40 +00:00
Poul-Henning Kamp
637f671a3d Either cvs(1) or I forgot this file in my last commit.
Please see commit log for rev 1.4 of src/sbin/mdconfig/mdconfig.c
2001-01-02 09:42:47 +00:00
Poul-Henning Kamp
8f8def9e2c This is the first snapshot of the new all-singing-and-dancing md(4).
Using the mdconfig(8) program you can now configure memory disks
on malloc(9), swap or a file/vnode.  preloaded md disks also work
as usual.
2000-12-31 13:03:42 +00:00
Poul-Henning Kamp
e0913a2c44 Enforce disk unit numbers upper limit in cloning. 2000-12-15 16:42:38 +00:00
David Malone
7cc0979fd6 Convert more malloc+bzero to malloc+M_ZERO.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
2000-12-08 21:51:06 +00:00
Poul-Henning Kamp
db90128160 Avoid the modules madness I inadvertently introduced by making the
cloning infrastructure standard in kern_conf.  Modules are now
the same with or without devfs support.

If you need to detect if devfs is present, in modules or elsewhere,
check the integer variable "devfs_present".

This happily removes an ugly hack from kern/vfs_conf.c.

This forces a rename of the eventhandler and the standard clone
helper function.

Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include
like <sys/queue.h>

Remove all #includes of opt_devfs.h they no longer matter.
2000-09-02 19:17:34 +00:00
Doug Rabson
21c3015a24 * Completely rewrite the alpha busspace to hide the implementation from
the drivers.
* Remove legacy inx/outx support from chipset and replace with macros
  which call busspace.
* Rework pci config accesses to route through the pcib device instead of
  calling a MD function directly.

With these changes it is possible to cleanly support machines which have
more than one independantly numbered PCI busses. As a bonus, the new
busspace implementation should be measurably faster than the old one.
2000-08-28 21:48:13 +00:00