This produced races resulting in panics and filesystem corruption
under some circumstances.
Reviewed by: luoqi chen <luoqi@freebsd.org>
Reviewed by: Kirk McKusick <mckusick@mckusick.com>
Submitted by: Matt Dillon <dillon@freebsd.org>
time now.
For whatever reason, the kernel seems to have generated SIGIOs
previously without an initial fcntl(...,F_SETOWN), but no longer does.
This caused window(1) to wait indefinitely for input.
Also, undo rev 1.3 of wwspawn.c; it was not well thought out, and
apparently not even tested at all. The replacement of vfork() by
fork(), applied blindly (even in a nonsensical place like the comment
on top of the function), totally ignored that window(1) *does* abuse
the feature of vfork() where a modification of the parent's address
space is possible (in this case, to notify the parent of a failed
exec*). Also, with vfork(), it is guaranteed that the parent is only
woken up after the exec*() has happened, whereas the replacement by
fork() made the parent almost always become runnable again before the
child, in which case the parent simply reported `subprocess died'.
Unfortunately, working around _this_ seems to need a lot more redesign
work than the gain is worth, so I think relying on the specifics of
vfork() is the simpler way.
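
For illustration, a sketch of the vfork() idiom being relied upon. It
is not the code in wwspawn.c; spawn_vfork() and its parameters are
hypothetical names. Note that POSIX leaves it undefined for the vfork()
child to modify anything but the pid variable; the sketch depends on
the traditional BSD behaviour where the child borrows the parent's
address space.

```c
#define _DEFAULT_SOURCE	/* exposes vfork() on glibc */
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: spawn path with argv using vfork().  Returns
 * the child's pid on success.  If the exec fails, the child is reaped
 * here and -1 is returned with *exec_errno set to the errno of execv().
 */
pid_t
spawn_vfork(const char *path, char *const argv[], int *exec_errno)
{
	/* volatile: the child modifies this behind the compiler's back. */
	volatile int child_errno = 0;
	pid_t pid;

	pid = vfork();
	if (pid == -1) {
		*exec_errno = errno;
		return (-1);
	}
	if (pid == 0) {
		/* Child: still running in the parent's address space. */
		execv(path, argv);
		child_errno = errno;	/* visible to the parent */
		_exit(127);
	}
	/* Parent: resumes only after the child has exec'ed or exited. */
	if (child_errno != 0) {
		(void)waitpid(pid, NULL, 0);
		*exec_errno = child_errno;
		return (-1);
	}
	*exec_errno = 0;
	return (pid);
}
```

With fork() the assignment to child_errno would land in the child's
private copy, and the parent would have no way of telling a failed
exec from a child that died later.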
Submitted by: Philipp Mergenthaler <un1i@rz.uni-karlsruhe.de>
Describe /dev/vinum/control*
Describe drive "referenced" state.
Remove warning about kldunload; it seems to work now.
Still more descriptions of how to debug things.
Wait4 zombies.
make_devices: Don't try if the /dev directory is mounted read-only.
Create daemon superdevice /dev/vinum/controld.
Format a couple of multiline comments conformant with style(9).
for us.
Rebuild the (almost empty) /dev/vinum directory.
vinum_start: remove superfluous "read" parameter when starting with no
parameters.
vinum_stop: without an argument, stop Vinum and remove the kld if
it's idle.
vinum_saveconfig: New command to save configuration.
Change VINUM_SAVECONFIG: it now requires a parameter. 0 means
"configuration updates are finished, please save", and 1 means "please
just save the config". This second meaning is invoked by the new
"saveconfig" command to vinum(8).
Recognize "referenced" drives by the lack of a slash in the device
name, not by a NUL character.
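
The new test amounts to something like the following sketch. The
function name drive_is_referenced() is hypothetical and the real code
may be structured differently.

```c
#include <string.h>

/*
 * Hypothetical helper: non-zero if devicename belongs to a drive that
 * has only been referenced, i.e. the name is not a path to a device.
 */
int
drive_is_referenced(const char *devicename)
{
	return (strchr(devicename, '/') == NULL);
}
```

An empty name contains no slash either, so drives that the old NUL
test would have caught are still recognized.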
vinum_scandisk: return error indication (ENOENT if we can't find any
vinum drive, otherwise 0).
VINUM_SAVECONFIG: change parameters.
Don't save config while we're reading it from disk.
Change the way we handle the daemon: if we can't communicate with it
for 1 second (which is possible), start a new one. The daemon saves
its pid in daemonpid; on each iteration of the main loop the daemon
checks whether it's still in favour. If not, it silently exits.
Also, when trying to communicate with the daemon, check daemonpid
first. If it's set to 0, don't even try.
Rename VF_KERNELOP to VF_DISKCONFIG and checkkernel () to
checkdiskconfig (), which better describes their function.
Disable configuration updates if we have an error reading in the
configuration. This stops a "shoot-in-foot" problem where a mistake
can cause the configuration to be obliterated.
Tidy up some messages, which included superfluous \ns.
Recognize RAID-5 configuration information even in the non-RAID-5
version. This fixes shoot-in-foot problems where starting the wrong
version of vinum would kill RAID-5 plexes.
Recognize drives that have been referenced, but for which no physical
location is known. This is part of a modification which will
ultimately allow incrementally reading configurations. Such drives
will have a device name "unknown".
New function return_drive_space () returns space to a drive.
Previously this was part of free_sd ().
give_sd_to_drive: don't do it if the subdisk needs more space than the
drive has available.
config_sd: if reading config from disk, accept plex offset, drive
offset and length specs of -1 to indicate error conditions.
parse_config: return ENOENT if the "read" command doesn't find any
drives.
remove_sd_entry: don't do it, even by force, if it's open.
If the size of a striped or RAID-5 plex is not an integral multiple of
the stripe size, trim the size until it is.
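
As arithmetic, the trimming is a round-down to a multiple of the stripe
size. A sketch, assuming sizes are held in 64-bit unsigned integers;
trim_to_stripe() is a hypothetical name, not a vinum function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: round length down to a whole number of stripes.
 */
uint64_t
trim_to_stripe(uint64_t length, uint64_t stripesize)
{
	if (stripesize == 0)
		return (length);	/* not striped: nothing to trim */
	return (length - length % stripesize);
}
```

For example, a length of 1000 with a stripe size of 256 is trimmed to
768, while a length of 1024 is left alone.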
Reinstate update_volume_config, which had atrophied, to recalculate
the size of a volume if a plex has shrunk due to stripe size
considerations.
vinumattach: Zero out tables after allocating them.
Modify procedure at unload: if a vinum(8) has the superdev open, don't
close down. If only the daemon has it open, send the daemon a stop
request and wait for it to close the superdev, then unload.
In order to do this, create a second superdev which is opened by the
daemon. The open and close routines set a different bit in
vinum_conf.flags; otherwise the treatment is identical.
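
The unload decision can be modelled roughly as below. This is a sketch
only: the flag names and values, the enum and unload_decision() are
all hypothetical stand-ins for whatever bits the open and close
routines really set in vinum_conf.flags.

```c
/* Hypothetical flag bits, one per superdevice. */
#define VF_OPEN		0x1	/* a vinum(8) has the superdev open */
#define VF_DAEMONOPEN	0x2	/* the daemon has its superdev open */

enum unload_action {
	UNLOAD_NOW,		/* nobody has a superdev open */
	UNLOAD_REFUSE,		/* a vinum(8) is still using us */
	UNLOAD_STOP_DAEMON	/* stop the daemon, wait, then unload */
};

/*
 * Hypothetical helper: decide what to do with an unload request,
 * given the current flag bits.
 */
enum unload_action
unload_decision(int flags)
{
	if (flags & VF_OPEN)
		return (UNLOAD_REFUSE);
	if (flags & VF_DAEMONOPEN)
		return (UNLOAD_STOP_DAEMON);
	return (UNLOAD_NOW);
}
```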
Remove opencount field in vol structure; replace by a flag bit, since
we can't count the number of opens.
Remove dead LKM grunge.
lives in ext2_vnops.c for ext2fs. Also remove cast from comparison.
Bruce pointed out that it was bogus since we'd force a signed
comparison when we really wanted an unsigned comparison.
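
To illustrate the general problem (this is not the expression that was
changed, and both function names are hypothetical): a cast to a signed
type lets a huge unsigned value slip past a limit check. The result of
the cast itself is implementation-defined; on the usual two's
complement compilers a value with the top bit set becomes negative.

```c
#include <stddef.h>

/* Bogus variant: the cast forces a signed comparison. */
int
fits_signed(size_t len, int limit)
{
	return ((int)len <= limit);
}

/* Intended variant: both operands are compared as unsigned values. */
int
fits_unsigned(size_t len, size_t limit)
{
	return (len <= limit);
}
```

With len set to (size_t)-1 and a limit of 10, the signed variant
typically accepts the length and the unsigned variant rejects it.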
to write all the dirty blocks. If some of those blocks have dependencies,
they will be re-marked dirty when the I/O completes. On machines with
really fast I/O systems, it is possible to get into an infinite loop trying
to flush the buffers, because the I/O finishes before we can get all the
dirty buffers off the v_dirtyblkhd list and into the I/O queue. (The
previous algorithm looped over the v_dirtyblkhd list writing out buffers
until the list emptied.) So, now we mark each buffer that we try to
write so that we can distinguish the ones that are being re-marked dirty
from those that we have not yet tried to flush. Once we have tried to
push every buffer once, we then push any associated metadata that is
causing the remaining buffers to be redirtied.
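
A toy user-space model of the two-phase flush described above. It is a
sketch, not the kernel code: struct buf here is a made-up three-field
structure, write_buf() pretends the I/O completes instantly, and
clearing has_dep stands in for pushing the associated metadata.

```c
#include <stddef.h>

/* Toy buffer, unrelated to the kernel's struct buf. */
struct buf {
	int dirty;	/* needs writing */
	int scanned;	/* already tried during this flush */
	int has_dep;	/* dependency that redirties on I/O completion */
};

/* Model of a write whose I/O completes immediately. */
static void
write_buf(struct buf *bp)
{
	bp->dirty = bp->has_dep;	/* re-marked dirty if dependent */
}

/* Flush all buffers; returns the number of writes issued. */
int
flush_buffers(struct buf *bufs, size_t n)
{
	struct buf *bp;
	size_t i;
	int writes = 0;

	for (i = 0; i < n; i++)
		bufs[i].scanned = 0;

	/*
	 * Phase 1: keep picking dirty buffers we have not tried yet.
	 * Without the scanned mark this loop would never terminate
	 * while a buffer with a dependency keeps coming back dirty.
	 */
	for (;;) {
		bp = NULL;
		for (i = 0; i < n; i++) {
			if (bufs[i].dirty && !bufs[i].scanned) {
				bp = &bufs[i];
				break;
			}
		}
		if (bp == NULL)
			break;
		bp->scanned = 1;
		write_buf(bp);
		writes++;
	}

	/* Phase 2: resolve dependencies, then write what is left. */
	for (i = 0; i < n; i++) {
		if (!bufs[i].dirty)
			continue;
		bufs[i].has_dep = 0;	/* stands in for the metadata push */
		write_buf(&bufs[i]);
		writes++;
	}
	return (writes);
}
```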
Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
The much-rumored replacement for our current IDE/ATA/ATAPI drivers is
materialising in the CVS repositories around the globe.
So what does this bring us:
A new reengineered ATA/ATAPI subsystem, that tries to overcome
most of the deficiencies with the current drivers.
It supports PCI as well as ISA devices without all the hackery
in ide_pci.c to make PCI devices look like ISA counterparts.
It doesn't have the excessive wait problem on probe; in fact you
shouldn't notice any delay when your devices are getting probed.
Probing and attaching of devices are postponed until interrupts
are enabled (well almost, not finished yet for disks), making
things a lot cleaner.
Improved performance: although DMA support is still WIP and not
in this pre-alpha release, worldstone is faster with the new
driver than with the old one, even when the old one uses DMA.
So what does it take away:
There is NO support for old MFM/RLL/ESDI disks.
There is NO support for bad144. If your disk is bad, ditch it; it has
already outgrown its internal spare sectors, and is dying.
For you to try this out, you will have to modify your kernel config
file to use the "ata" controller instead of all wdc? entries.
example:
# for a PCI only system (most modern machines)
controller ata0
device atadisk0 # ATA disks
device atapicd0 # ATAPI CDROM's
device atapist0 # ATAPI tapes
#You should add the following on ISA systems:
controller ata1 at isa? port "IO_WD1" bio irq 14
controller ata2 at isa? port "IO_WD2" bio irq 15
You can leave it all in there, the system knows how to manage.
For now this driver reuses the device entries from the old system
(that will probably change later), but remember that disks are
now numbered in the sequence they are found (like the SCSI system),
not by absolute position as in the old system.
Although I have tested this on all the systems I can get my hands on,
there might very well be gremlins in there, so use AT YOUR OWN RISK!!
This is still WIP, so there are lots of rough edges and unfinished
things in there, and what I have in my lab might look very different
from what's in CVS at any given time. So please have any changes
go through me, or chances are they will just disappear...
I would very much like to hear from you, both good and bad news
are very welcome.
Enjoy!!
-Søren