Commit Graph

68 Commits

Author SHA1 Message Date
pjd
7a0948ca69 Fix storing offset of already synchronized data. Offset in entire array was
stored in metadata instead of an offset in single disk.
After reboot/crash synchronization process started from a wrong offset
skipping (not synchronizing) part of the component which can lead to data
corrutpion (when synchronization process was interrupted on initial
synchronization) or other strange situations like 'graid3 status' showing
value more than 100%.

Reported, reviewed and tested by:	ru
Reported by:	Dmitry Morozovsky <marck@rinet.ru>
MFC after:	1 day
2006-04-18 13:52:11 +00:00
pjd
d7eb5b2fe9 Introduce and use delayed-destruction functionality from a pre-sync hook,
which means that devices will be destroyed on last close.

This fixes destruction order problems when, eg. RAID3 array is build on
top of RAID1 arrays.

Requested, reviewed and tested by:	ru
MFC after:	2 weeks
2006-04-10 10:32:22 +00:00
pjd
46a2a98421 Preserve previous behaviour of kern.geom.raid3.n{64,16,4}k tunables were 0
means unlimited.

Reported by:	ru
MFC after:	3 days
2006-03-28 18:34:36 +00:00
pjd
2f146bc4fd Increase debug level for "Thread exiting." message. It's not that important
and is 0 by accident.

MFC after:	3 days
2006-03-25 23:30:36 +00:00
pjd
ba3414666e Update copyright for 2006. 2006-03-19 12:55:51 +00:00
pjd
5990508a15 kern.geom.raid3.sync_requests=2 seems to be a better default - it still
keeps disks very busy, but makes system much more responsive.

While here, kill extra space.
2006-03-19 11:18:33 +00:00
ru
9348187bf1 Fix build on 64-bit platforms. 2006-03-13 14:48:45 +00:00
pjd
349adc9b52 - Reimplement I/O data allocation to prevent deadlocks.
Submitted by:	green

- Speed up synchronization process by using configurable number of I/O
  requests in parallel.
  + Add kern.geom.raid3.sync_requests tunable which defines how many parallel
    I/O requests should be used.
  + Retire kern.geom.raid3.reqs_per_sync and kern.geom.raid3.syncs_per_sec
    sysctls.
- Fix race between regular and synchronization requests.
- Reimplement raid3's data synchronization - do not use the topology lock
  for this purpose, as it may case deadlocks.
- Stop synchronization from pre-sync hook.
- Fix some other minor issues.

Tested by:	Mike Tancsa <mike@sentex.net>
MFC after:	3 days
2006-03-13 01:03:18 +00:00
pjd
f0925fcaf9 When inserting a new component md_provsize metadata field wasn't set, which
means that old problem was triggered (when two providers end at the same
offset, eg. ad0 and ad0s1 and the wrong was is picked up by gmirror/graid3).

Reported by:	Michal Suszko <dry@dry.pl>
MFC after:	3 days
2006-03-10 07:41:31 +00:00
pjd
46e57ae3d3 Do not use bio structure after g_io_deliver(), it may not longer by valid.
Found and fixed by:	Vsevolod Lobko <seva@ip.net.ua>
MFC after:		3 days
2006-02-22 10:21:05 +00:00
pjd
dded50a417 On component state change to ACTIVE don't forget to update metadata.
MFC after:	3 days
2006-02-12 17:38:09 +00:00
pjd
a9a29a4821 Use time_uptime instead of time_second, as the latter may go backwards.
Suggested by:	ru
MFC after:	3 days
2006-02-12 17:36:09 +00:00
pjd
9357beb7f2 Allow to set kern.geom.raid3.disconnect_on_failure from loader.conf.
MFC after:	3 days
2006-02-12 02:01:38 +00:00
pjd
beaa5fcb4d - Add kern.geom.raid3.disconnect_on_failure sysctl/tunnable (default to 1
to preserve currect behaviour). When set to 0, components are not
  disconnected - graid3 will try to still use them (only first error will
  be logged). This is helpful when we have two broken components, but in
  different places, so actually all data is available.
  Such buggy component will be visible in 'graid3 list' output with flag
  BROKEN.
- Never disconnect the last valid component. If we detect errors there we
  will just pass them up. This wasn't reasonable to deny access to the
  whole provider because of one broken sector.

Prodded by:	ru
MFC after:	3 days
2006-02-11 17:42:31 +00:00
pjd
1aa881eae6 Correct typo. 'fbp' is NULL here so this will result in a panic.
MFC after:	3 days
2006-02-11 17:29:06 +00:00
pjd
26f9aeb047 Mark array as CLEAN when there are no write requests in
kern.geom.raid3.idletime seconds. Write, not any requests.
Mark array as clean immediatelly on last write close.

Prodded by:	ru
MFC after:	3 days
2006-02-11 14:42:58 +00:00
pjd
6f074b6d64 Remove trailing spaces. 2006-02-01 12:06:01 +00:00
pjd
1fe4753153 Fix typo which cased that 64kB elements limit was not set properly and
16kB elements limit wasn't set at all.

Submitted by:	Vsevolod Lobko <seva@ip.net.ua>
MFC after:	3 days
2006-01-30 22:45:43 +00:00
pjd
fb2c7cfc24 Remove dead code.
Found by:	Coverity Prevent(tm)
Coverity ID:	CID105
MFC after:	3 days
2006-01-18 21:43:27 +00:00
sobomax
29543921ea Check for g_read_data(9) errors properly:
o The only indication of error condition is NULL value returned by
  the function;

o value pointed to by error argument is undefined in the case when
  operation completes successfully.

Discussed with: phk
2005-11-30 19:24:51 +00:00
rwatson
be4f357149 Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
pjd
38ae497c6a Fix possible live-lock under heavy load where we can't allocate more
memory for request.
I was sure graid3 should handle such situations well, but green@ reported
it is not and we want to fix it before 6.0.

Submitted by:	green
2005-10-28 20:25:02 +00:00
pjd
eb467446d5 Use root_mount KPI for RAID3 to delay root file system mount.
Actually, one cannot setup root file system on RAID3 device, but when
other file system exist in /etc/fstab which are placed on RAID3 device,
boot process will be interrupted when these devices are missing.

MFC after:	3 days
X-MFC-note:	MFC only to RELENG_6, as RELENG_5 doesn't have root_mount KPI.
2005-07-27 09:03:51 +00:00
pjd
0a798a236d cp can't be NULL.
Noticed by:	Coverity Prevent analysis tool
2005-05-11 19:36:56 +00:00
pjd
0e95eeadc2 gp can't be NULL.
Noticed by:	Coverity Prevent analysis tool
2005-05-11 19:35:43 +00:00
pjd
9129b5403c If an error occurs, clean up before returning from g_raid3_connect_disk(). 2005-03-26 17:24:19 +00:00
pjd
90f14e00b7 Check for return values.
Submitted by:	sam
Found by:	Coverity Prevent analysis tool
2005-03-26 16:51:19 +00:00
pjd
668a028670 - Add md_provsize field to metadata, which will help with
shared-last-sector problem.
  After this change, even if there is more than one provider with the same
  last sector, the proper one will be chosen based on its size.
  It still doesn't fix the 'c' partition problem (when da0s1 can be confused
  with da0s1c) and situation when 'a' partition starts at offset 0
  (then da0s1a can be confused with da0s1 and da0s1c). One can use '-h'
  option there, when creating device or avoid sharing last sector.
  Actually, when providers share the same last sector and their size is equal,
  they provide exactly the same data, so the name (da0s1, da0s1a, da0s1c)
  isn't important at all.
- Provide backward compatibility.
- Update copyright's year.

MFC after:	1 week
2005-02-27 23:07:47 +00:00
pjd
c05fe5379b Update copyright in files changed this year. 2005-02-16 22:14:52 +00:00
pjd
a063b2627e Increase default synchronization speed.
MFC after: 3 days
2005-01-09 14:43:39 +00:00
pjd
589a9682ce - Fix 'rebuild' command - it can no longer relay on retaste event
(we ignore it).
- Remove code used for handling spoil events, as spoiling is not possible
  anymore, because we keep consumers open for writing all the time.

MFC after:	4 days
2005-01-04 12:15:21 +00:00
pjd
25d6a6cbc3 Remove unused #include. 2005-01-03 12:53:10 +00:00
jhb
7b611b0cb2 Stop explicitly touching td_base_pri outside of the scheduler and simply
set a thread's priority via sched_prio() when that is the desired action.
The schedulers will start managing td_base_pri internally shortly.
2004-12-30 20:29:58 +00:00
pjd
1955652326 Remove debug code. 2004-12-28 21:52:45 +00:00
pjd
0bb72c3b00 - Add genid field to the metadata which will allow to improve reliability a bit.
After this change, when component is disconnected because of an I/O error,
  it will not be connected and synchronized automatically, it will be logged
  as broken and skipped. Autosynchronization can occur, when component is
  disconnected (on orphan event) and connected again - there were no I/O
  error, so there is no need to not connected the component, but when there were
  writes while it wasn't connected, it will be synchronized.
  This fix cases, when component is disconnected because of I/O error and can be
  connected again and again.
- Bump version number.
- Implement backward compatibility mechanism. After this change when metadata in
  old version is detected, it is automatically upgraded to the new (current)
  version.
2004-12-25 19:17:47 +00:00
pjd
968f03faa7 Now, when force device destruction is done on shutdown, hide warning,
that device cannot be destroyed immediately, under debug=1.

Suggested by:	simon
2004-12-21 19:50:18 +00:00
pjd
c18c5197dc Improve reliability and clean up code a bit.
For more details check src/sys/geom/mirror/g_mirror.c rev.1.47,1.48,1.49,1.50.
2004-12-21 19:30:59 +00:00
pjd
7136589630 bioq_insert_head() function is already in subr_disk.c. 2004-12-13 13:02:06 +00:00
pjd
ca88614a1e When initializing device, set d_softc and d_no fields for all components,
because we know it then and we need it when inserting a component which
wasn't destroyed while device was running.

Reported by:	Michael Handler <handler@grendel.net>
MFC after:	1 week
2004-12-04 21:20:59 +00:00
pjd
64bf081bc3 Before trying to update metadata (so open consumer for writing), be sure
that the events queue is empty. In other case we're able to hit the race
where for example da0s1 is tasted by some other class, which means that
da0 is open with exclusive bit set, which means that we can't open da0
for writing if it is our component.

Reported by:	Attila Nagy <bra@fsn.hu> (and somebody else sometime ago,
		                          but I cannot find who it was)
2004-11-09 23:27:21 +00:00
pjd
a2d5ae235d Don't rely on DIRTY flag to be sure that consumer if open, because
DIRTY flag can be removed in idle process. Use consumer's acw field
instead to avoid opening consumer twice.
2004-11-09 23:15:40 +00:00
pjd
5cac522046 For BIO_READ check if provider is open for reading and for BIO_WRITE,
check if provider is open for writing.
This fixes panic when device is open only for writing and we send write
request.
2004-11-09 23:04:45 +00:00
pjd
592adb99c2 Drop Giant lock before grabbing the topology lock. 2004-11-09 00:35:08 +00:00
pjd
b9bb54bcb8 If device is marked as beeing destroyed, deny all access requests. 2004-11-08 20:23:53 +00:00
pjd
d62be2e0d6 Don't forget to make sure that there are no not-finished requests before
marking components as clean.

Pointed out by:	scottl
2004-11-05 17:18:39 +00:00
pjd
b004592010 - Mark all raid3 components as clean after kern.geom.raid3.idletime seconds.
- Make kern.geom.raid3.timeout variable tunable.
2004-11-05 13:12:58 +00:00
pjd
f229109eb7 Mark raid3 devices as clean on shutdown (after all file systems are
unmounted).

Suggested by:	scottl
2004-11-05 13:01:25 +00:00
pjd
270f218c1d - Use ->index consumer's field to track number of in-flight requests.
- Remove unused #include.
2004-11-05 12:42:16 +00:00
pjd
63dd0f756b Just use MAXPHYS as maximum I/O request size, instead of using my own
#define for this purpose.
No functional change.
2004-09-28 07:33:37 +00:00
pjd
ef6747fa18 Decrease kern.geom.raid3.timeout to 4, so it is smaller than
vfs.root.mountdelay by default.
2004-09-27 22:12:14 +00:00