41563 Commits

Author SHA1 Message Date
phk
d0c4c329b1 Use sparse struct initialization for struct pagerops.
Mark our buffers B_KEEPGIANT before sending them downstream.

Remove swap_pager_strategy implementation.
2003-08-05 06:54:56 +00:00
phk
5997dd2e8b Change the implementation of swap backing to use the VM system in normal
ways, and drop the need for vm_pager_strategy().
2003-08-05 06:54:44 +00:00
phk
a295f12128 Use sparse struct initializations for struct pagerops.
This makes grepping for which pagers implement which methods easier.
2003-08-05 06:51:26 +00:00
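The sparse initialization mentioned above can be illustrated with C99 designated initializers. This is a minimal sketch only: struct demo_pagerops, its members and the demo_* functions are hypothetical stand-ins, not the real struct pagerops from the VM code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified method table; NOT the real struct pagerops. */
struct demo_pagerops {
	void (*pgo_init)(void);
	int  (*pgo_alloc)(size_t size);
	void (*pgo_dealloc)(int handle);
	int  (*pgo_haspage)(int handle, size_t index);
};

/* Toy methods for an imaginary pager. */
static int  demo_alloc(size_t size)  { return size > 0 ? 1 : -1; }
static void demo_dealloc(int handle) { (void)handle; }

/*
 * Sparse (designated) initialization: only the implemented methods are
 * named, every other member is implicitly NULL.  A grep for ".pgo_alloc"
 * then lists exactly the pagers that implement that method.
 */
static const struct demo_pagerops demo_ops = {
	.pgo_alloc   = demo_alloc,
	.pgo_dealloc = demo_dealloc,
};

/* Returns 1 if the given table implements the haspage method. */
static int demo_implements_haspage(const struct demo_pagerops *ops)
{
	return ops->pgo_haspage != NULL;
}
```

Compared with a positional initializer, unimplemented slots need no placeholder entries, and reordering the struct members does not silently shift methods into the wrong slot.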
phk
aa4433feb6 Only drop Giant around the driver's ->d_strategy() if the buffer is not
marked to prevent this.
2003-08-05 06:43:56 +00:00
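A rough sketch of the conditional described above, with Giant modeled as a plain integer so the example runs in userland. The flag value, struct demo_buf and all demo_* names are assumptions made for illustration; the kernel code uses real mutex operations.

```c
#include <assert.h>

#define DEMO_B_KEEPGIANT 0x1	/* hypothetical flag value */

struct demo_buf {
	int b_flags;
};

static int demo_giant_held = 1;	/* stand-in for "Giant is held" */
static int demo_giant_seen;	/* Giant state observed by the driver */

/* Stand-in for a driver's strategy routine. */
static void demo_strategy(struct demo_buf *bp)
{
	(void)bp;
	demo_giant_seen = demo_giant_held;
}

/* Drop Giant around the driver call unless the buffer asks to keep it. */
static void demo_dev_strategy(struct demo_buf *bp)
{
	int drop = (bp->b_flags & DEMO_B_KEEPGIANT) == 0;

	if (drop)
		demo_giant_held = 0;	/* would be a mutex unlock */
	demo_strategy(bp);
	if (drop)
		demo_giant_held = 1;	/* would be a mutex lock */
}
```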
phk
bf614e1208 Add a B_KEEPGIANT flag so non-SMPng code can get preferential treatment. 2003-08-05 06:43:12 +00:00
simokawa
2e2a46ee29 Change device name notation.
- /dev/fw{,mem}X.Y represents the Y'th unit on the X'th bus.
- /dev/fw{,mem}X is an alias of fw{,mem}X.0 for compatibility.
- Clone devices.
2003-08-05 03:11:39 +00:00
simokawa
8e8a7cd92a Enable IFCAP_VLAN_MTU and increase MTU for it.
Reviewed by: wpaul
2003-08-05 02:34:35 +00:00
hsu
fb82c18f66 Make the second argument to sooptcopyout() constant in order to
simplify the upcoming PIM patches.

Submitted by:   Pavlin Radoslavov <pavlin@icir.org>
2003-08-05 00:27:54 +00:00
iedowse
7bf5fa9caf In the mknod(), mkfifo(), link(), symlink() and undelete() syscalls,
use vrele() instead of vput() on the parent directory vnode returned
by namei() in the case where it is equal to the target vnode. This
handles namei()'s somewhat strange (but documented) behaviour of
not locking either vnode when the two vnodes are equal and LOCKPARENT
but not LOCKLEAF is specified.

Note that since a vnode double-unlock is not currently fatal, these
coding errors were effectively harmless.

Spotted by:	Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
Reviewed by:	mckusick
2003-08-05 00:26:51 +00:00
scottl
c5cc3acad0 In _bus_dmamap_load_buffer(), only count the number of bounce pages needed if
they haven't been counted before.  This test was omitted when bus_dmamap_load()
was merged into this function, and results in the pagesneeded field growing
without bounds when multiple deferrals happen.

Thanks to Paul Saab for beating his head against this for a few hours =-)
2003-08-04 23:40:35 +00:00
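The guard described above might look roughly like the following. It is a simplified sketch: struct demo_dmamap and demo_count_bounce_pages() are hypothetical, and the real function walks the buffer page by page rather than taking a precomputed count.

```c
#include <assert.h>

/* Hypothetical, reduced view of a DMA map. */
struct demo_dmamap {
	int pagesneeded;	/* bounce pages required for this load */
};

/*
 * Count bounce pages only if no count has been recorded yet.  Without
 * the test, every deferred and retried load would add the same pages
 * again and pagesneeded would grow without bound.
 */
static void demo_count_bounce_pages(struct demo_dmamap *map, int pages_in_buffer)
{
	if (map->pagesneeded == 0)
		map->pagesneeded += pages_in_buffer;
}
```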
marcel
1e97e213cf Fix logic bug in the previous commit. Any region less than 5 is a
user space region. Hence, we need to test if 5 is greater than the
region, not greater than or equal to it.
This bug caused us to call ast() while interrupting kernel mode.
2003-08-04 22:00:48 +00:00
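The comparison can be sketched as follows. On ia64 the region number is the top three bits of a 64-bit virtual address; the macro and function names here are made up for illustration and do not come from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_REGION_SHIFT	 61	/* region number = top 3 address bits */
#define DEMO_FIRST_KERNEL_REGION 5	/* regions 0..4 are user space */

/* Extract the region number from a virtual address. */
static unsigned demo_region(uint64_t va)
{
	return (unsigned)(va >> DEMO_REGION_SHIFT);
}

/*
 * User space iff 5 is strictly greater than the region.  Using >=
 * instead would wrongly classify region 5 (kernel) as user space.
 */
static int demo_is_user_address(uint64_t va)
{
	return DEMO_FIRST_KERNEL_REGION > demo_region(va);
}
```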
dwmalone
cb188056e6 Do some minor Giant pushdown made possible by copyin, fget, fdrop,
malloc and mbuf allocation all not requiring Giant.

1) ostat, fstat and nfstat don't need Giant until they call fo_stat.
2) accept can copyin the address length without grabbing Giant.
3) sendit doesn't need Giant, so don't bother grabbing it until kern_sendit.
4) move Giant grabbing from each individual recv* syscall to recvit.
2003-08-04 21:28:57 +00:00
jhb
e71dfc3b00 Adjust a comment to remove staleness and take a slightly less
implementation-specific perspective.
2003-08-04 20:35:13 +00:00
jhb
b47c7929f5 - GC unused cpu_thread_link().
- Move the enabling of interrupts out of assembly and into C a few
  instructions later at cpu_critical_fork_exit().  This puts more of the
  MD critical section implementation under the MD critical section API
  making it easier to test and develop alternative implementations.
2003-08-04 20:34:25 +00:00
jhb
e4889cd470 - Since td_critnest is now initialized in MI code, it doesn't have to be
set in cpu_critical_fork_exit() anymore.
- As far as I can tell, cpu_thread_link() has never been used, not even
  when it was originally added, so remove it.
2003-08-04 20:32:45 +00:00
jhb
52adb98aef Set td_critnest to 1 when setting up a thread since it is a MI field with
MI values.  This ensures that td_critnest for a newly fork'd thread is
always valid.

Requested by:	bde (a long time ago)
2003-08-04 20:28:20 +00:00
jhb
a69166c61f Insert cosmetic spaces.
Reported by:	kris
2003-08-04 19:24:25 +00:00
julian
cbe1603500 Allow foot shooting as Linux emulation needs it.
Also change "Auto mode" to use a "special" value
instead of 0, and define and document it.
I had thought libpthread had already been switched to use auto mode but
it appears that patch hasn't been committed yet.

Discussed with:	 Davidxu
2003-08-04 19:11:56 +00:00
des
f907274e33 Add support for multiple CPUs to cpuinfo. 2003-08-04 10:55:22 +00:00
phk
8952fe0759 Put an uncovered page between the swap devices, that way we can be sure
to not get any cross-device I/O requests.  (The unallocated first page
protecting BSD labels already gave us this, but that hack may go away
at some point in time).

Remove the check for cross-device I/O requests in swap_pager_strategy.

Move the repeated statistics updating into flushchainbuf().
2003-08-04 08:22:49 +00:00
wpaul
3268777224 Set the BGE_RX_MTU register correctly so that we can receive slightly
larger than normal frames, to account for the case where a bge(4) NIC
is used with VLANs. Since we set the IFCAP_VLAN_MTU flag, we must allow
reception of frames up to 1522 bytes in size rather than 1518.

Note that it is possible to work around this bug by doing:

# ifconfig bge0 mtu 1504

prior to configuring any VLAN interfaces.
2003-08-04 05:50:53 +00:00
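The frame sizes quoted above follow from standard Ethernet arithmetic, sketched below. The DEMO_* constants mirror the usual header, CRC and 802.1Q tag lengths; the helper name is hypothetical, and the reading of the workaround (that the receive limit is derived from the configured MTU) is an assumption based on the commit text.

```c
#include <assert.h>

#define DEMO_ETHER_HDR_LEN	14	/* dst MAC + src MAC + ethertype */
#define DEMO_ETHER_CRC_LEN	4	/* frame check sequence */
#define DEMO_VLAN_ENCAP_LEN	4	/* 802.1Q tag */

/* Largest frame on the wire for a given MTU, with or without a VLAN tag. */
static int demo_max_frame_len(int mtu, int vlan_tagged)
{
	int len = mtu + DEMO_ETHER_HDR_LEN + DEMO_ETHER_CRC_LEN;

	if (vlan_tagged)
		len += DEMO_VLAN_ENCAP_LEN;
	return len;
}
```

With the default MTU of 1500 this gives 1518 bytes untagged and 1522 bytes tagged. An MTU of 1504 yields an untagged limit of 1522 as well, which would explain why the ifconfig workaround helps.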
simokawa
ab808e6a79 - Don't mess with TX queue in fwohci_stop() if we failed to attach the device.
Tested by: wilko

- Detect memory mapping failure of registers by checking OHCI version.

Tested by: KONDOU, Kazuhiro <kazuhiro@alib.jp>
2003-08-04 05:43:02 +00:00
marcel
d5a33e59d1 Cleanup the clock code. This includes:
o  Remove alpha specific timer code (mc146818A) and compiled-out
   calibration of said timer.
o  Remove i386 inherited timer code (i8253) and related acquire and
   release functions.
o  Move sysbeep() from clock.c to machdep.c and have it return
   ENODEV. Console beeps should be implemented using ACPI or if no
   such device is described, using the sound driver.
o  Move the sysctls related to adjkerntz, disable_rtc_set and
   wall_cmos_clock from machdep.c to clock.c, where the variables
   are.
o  Don't hardcode a hz value of 1024 in cpu_initclocks() and don't
   bother faking a stathz that's 1/8 of that. Keep it simple: hz
   defaults to HZ and stathz equals hz. This is also how it's done
   for sparc64.
o  Keep a per-CPU ITC counter (pc_clock) and adjustment (pc_clockadj)
   to calculate ITC skew and corrections. On average, we adjust the
   ITC match register once every ~1500 interrupts for a duration of
   2 consecutive interrupts. This is to correct the non-deterministic
   behaviour of the ITC interrupt (there's a delay between the match
   and the raising of the interrupt).
o  Add 4 debugging sysctls to monitor clock behaviour. Those are
   debug.clock_adjust_edges, debug.clock_adjust_excess,
   debug.clock_adjust_lost and debug.clock_adjust_ticks. The first
   counts the individual adjustment cycles (when the skew first
   crosses the threshold), the second counts the number of times the
   adjustment was excessive (any non-zero value is to be considered
   a bug), the third counts lost clock interrupts and the last counts
   the number of interrupts for which we applied an adjustment
   (debug.clock_adjust_ticks / debug.clock_adjust_edges gives the
   average duration of an individual adjustment -- should be ~2).

While here, remove some nearby (trivial) left-overs from alpha and
other cleanups.
2003-08-04 05:13:18 +00:00
alc
321771d262 Use kmem_alloc_nofault() instead of kmem_alloc_pageable() to allocate
swapbkva.  Swapbkva mappings are explicitly managed using pmap_qenter(),
not on-demand by vm_fault(), making kmem_alloc_nofault() more appropriate.

Submitted by:	tegge
2003-08-04 04:35:04 +00:00
rwatson
bf98881a21 Now that the central POSIX.1e ACL code implements functions to
generate the inode mode from a default ACL and creation mask,
implement ufs_sync_inode_from_acl() using acl_posix1e_newfilemode().

Since ACL_OVERRIDE_MASK/ACL_PRESERVE_MASK are defined, we no
longer need to explicitly pass in a "preserve_mask" field: this
is implicit in the use of POSIX.1e semantics.

Note: this change contains a semantic bugfix for new file creation:
we now intersect the ACL-generated mode and the cmode requested by
the user process.  This means permissions on newly created file
objects will now be more conservative.  In the future, we may want
to provide alternative semantics (similar to Solaris and Linux) in
which the ACL mask overrides the umask, permitting ACLs to broaden
the rights beyond the requested umask.

PR:		50148
Reported by:	Ritz, Bruno <bruno_ritz@gmx.ch>
Obtained from:	TrustedBSD Project
2003-08-04 03:29:13 +00:00
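The intersection described in the note above can be sketched as a bitwise AND of the two modes. This is an illustration only: demo_newfilemode() and DEMO_PERM_BITS are hypothetical, and the sketch looks at the nine permission bits alone, ignoring everything else the real acl_posix1e_newfilemode() may handle.

```c
#include <assert.h>

#define DEMO_PERM_BITS 0777u	/* rwx for user, group and other */

/*
 * New-file mode as the intersection of the ACL-generated mode and the
 * creation mode requested by the process: a permission bit survives
 * only if both sides grant it, so the result is never broader than
 * either input.
 */
static unsigned demo_newfilemode(unsigned cmode, unsigned acl_mode)
{
	return (cmode & acl_mode) & DEMO_PERM_BITS;
}
```

For example, a requested mode of 0600 combined with a default ACL that maps to 0664 yields 0600, and a requested 0666 against an ACL mode of 0644 yields 0644.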
imp
f63ee25e0f fix disordering of filenames. Place the dev/ppc files in alphabetical
order.
2003-08-04 02:39:14 +00:00
rwatson
543a037619 Move more ACL logic from the UFS code (ufs_acl.c) to the central POSIX.1e
support routines in kern_acl.c:

- Define ACL_OVERRIDE_MASK and ACL_PRESERVE_MASK centrally in acl.h: the
  mode bits that are (and aren't) stored in the ACL.

- Add acl_posix1e_acl_to_mode(): given a POSIX.1e extended ACL, generate
  a compatibility mode (only the bits supported by the POSIX.1e ACL).

- acl_posix1e_newfilemode(): Given a requested creation mode and default
  ACL, calculate the mode for the new file system object (only the bits
  supported by the POSIX.1e ACL).

PR:		50148
Reported by:	Ritz, Bruno <bruno_ritz@gmx.ch>
Obtained from:	TrustedBSD Project
2003-08-04 02:13:05 +00:00
rwatson
ba4ccf26ea In ufs_chmod(), use privilege only when required in the following
cases:

- Setting sticky bit on non-directory
- Setting setgid on a file with a group that isn't in the effective
  or extended groups of the authorizing credential

I.e., test the requirement first, then do the privilege test,
rather than doing the privilege test regardless of the need for
privilege.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-08-04 00:31:01 +00:00
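The ordering change (test the requirement first, then the privilege) can be sketched like this. All names are hypothetical, the credential is reduced to two integers, and both failure cases return EPERM for simplicity; the real ufs_chmod() may use different error codes.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_S_ISVTX 01000u	/* sticky bit */
#define DEMO_S_ISGID 02000u	/* setgid bit */

static int demo_priv_checks;	/* how often privilege was consulted */

/* Stand-in for a privilege check: 0 if privileged, EPERM otherwise. */
static int demo_priv_check(int is_privileged)
{
	demo_priv_checks++;
	return is_privileged ? 0 : EPERM;
}

/* Consult privilege only in the cases that actually require it. */
static int demo_chmod_check(int is_dir, unsigned mode, int in_group,
    int is_privileged)
{
	/* Sticky bit on a non-directory needs privilege. */
	if (!is_dir && (mode & DEMO_S_ISVTX) != 0 &&
	    demo_priv_check(is_privileged) != 0)
		return EPERM;
	/* Setgid for a group the credential is not a member of. */
	if ((mode & DEMO_S_ISGID) != 0 && !in_group &&
	    demo_priv_check(is_privileged) != 0)
		return EPERM;
	return 0;
}
```

An ordinary chmod to 0644 by an unprivileged user succeeds without the privilege check ever running, which is the point of the reordering.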
jdp
7d24cc9a9e Use the revision ID from PCI configuration space to identify Intel
8255x chips more precisely.  The information was obtained from Intel's
Open Source Software Developer Manual for the 8255x.

MFC after:	1 day
2003-08-04 00:17:16 +00:00
marcel
47e1af7da8 Fix handling of external interrupts: we weren't calling ast() when
interrupting user mode. The net effect of this bug is that a clock
interrupt does not cause rescheduling and processes are not
preempted. It only takes a "while (1);" to render the machine
useless.

This bug was introduced by the context changes and EPC syscall code.
Handling of ASTs was moved to C for clarity and ease of maintenance,
but was not added for the external interrupt case.

This needs to be revisited. We now have calls to do_ast() in trap(),
break_syscall() and ivt_External_Interrupt(). A single call in
exception_restore covers these 3 places without duplication. This
is where we handled ASTs prior to the overhaul, except that the
meat has been moved to do_ast(), a C function. This was the goal
to begin with.

Pointy hat: marcel
2003-08-04 00:08:39 +00:00
phk
049e0c4c31 Name swap_pager_find_dev() more correctly swp_pager_find_dev().
Use ->bio_children to count child buffers, rather than abuse the
bio_caller1 pointer.

Expand the relevant bits of waitchainbuf() inline; this clarifies
the code a little bit.
2003-08-03 21:22:42 +00:00
phk
2b330fbdd3 I accidentally hit undo before committing, fix the resulting off-by-one. 2003-08-03 14:53:52 +00:00
phk
064fa9d0bb Remove the NSWAPDEV option, we have no upper limit on how many
swap devices we can have anymore.
2003-08-03 13:39:59 +00:00
phk
b51aac6e92 Change the layout policy of the swap_pager from a hardcoded width
striping to a per device round-robin algorithm.

Because of the policy of not attempting to retain previous swap
allocation on page-out, this means that a newly added swap device
almost instantly takes its 1/N share of the I/O load but it takes
somewhat longer for it to assume its 1/N share of the pages if there
is plenty of space on the other devices.

Change the 8G total swapspace limitation to 8G per device instead
by using a per device blist rather than one global blist.  This
reduces the memory footprint by 75% (typically a couple hundred
kilobytes) for the common case with one swapdevice but NSWAPDEV=4.

Remove the compile time constant limit of number of swap devices,
there is no limit now.  Instead of a fixed size array, store the
per swapdev structure in a TAILQ.

Total swap space is still addressed by a 32 bit page number and
therefore the upper limit is now 2^44 bytes = 16TB (for i386).

We still do not allocate the first page of each device in order to
give some amount of protection to any bsdlabel at the start of the
device.

A new device is appended after the existing devices in the swap space,
no attempt is made to fill in holes left behind by swapoff (this can
trivially be changed should it ever become a problem).

The sysctl vm.nswapdev now reflects the number of currently configured
swap devices.

Rename vm_swap_size to swap_pager_avail for consistency with other
exported names.

Change argument type for vm_proc_swapin_all() and swap_pager_isswapped()
to be a struct swdevt pointer rather than an index.

Not changed: we are still using blists to manage the free space,
but since the swapspace is no longer fragmented by the striping
different resource managers might fare better.
2003-08-03 13:35:31 +00:00
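A per-device round-robin policy of the kind described above could be sketched as follows. This is not the swap_pager code: it uses a plain array and a free-page counter where the commit describes a TAILQ and a per-device blist, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device state; the real code keeps a blist per device. */
struct demo_swdev {
	int free_pages;
};

static size_t demo_next;	/* device to try first on the next allocation */

/*
 * Allocate one page, rotating through the devices.  Starts at the
 * device after the one used last time and skips devices that are full.
 * Returns the index of the chosen device, or -1 if none has space.
 */
static int demo_swap_alloc(struct demo_swdev *devs, size_t ndev)
{
	for (size_t i = 0; i < ndev; i++) {
		size_t idx = (demo_next + i) % ndev;

		if (devs[idx].free_pages > 0) {
			devs[idx].free_pages--;
			demo_next = (idx + 1) % ndev;
			return (int)idx;
		}
	}
	return -1;
}
```

Because each allocation moves on to the next device, a newly added device starts receiving its share of new page-outs right away, while its share of already-allocated pages only catches up as old pages are rewritten.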
phk
61f64f46ab Move extern declaration of the various pagerops from vm_pager.c
to vm_pager.h where the various pagers will also see them.
2003-08-03 09:27:39 +00:00
obrien
3882b9d783 Deal with GCC annoyingly defining _BIG_ENDIAN. 2003-08-03 07:53:50 +00:00
obrien
150d7d3036 Style sync. 2003-08-03 07:50:19 +00:00
alc
52878a6770 Revise obj_alloc(). Most notably, use the object's lock to prevent two
concurrent invocations from acquiring the same address(es).  Also, in case
of an incomplete allocation, free any allocated pages.

In collaboration with:	tegge
2003-08-03 06:08:48 +00:00
bmilekic
2a8e0c5c0a When INVARIANTS is on and we're in uma_zalloc_free(), we need to make
sure that uma_dbg_free() is called if we're about to call
uma_zfree_internal() but we're asking it to skip the dtor and
uma_dbg_free() call itself.  So, if we're about to call
uma_zfree_internal() from uma_zfree_arg() and skip == 1, call
uma_dbg_free() ourselves.
2003-08-02 22:40:27 +00:00
alc
c38b9c732f Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in pmap_mapdev().
See revision 1.140 of kern/sys_pipe.c for a detailed rationale.

Submitted by:	tegge
2003-08-02 19:26:09 +00:00
ru
c1f8b453c0 There's already the elink.ko module available, don't embed it here.
Reviewed by:	markm
2003-08-02 18:46:02 +00:00
jhb
f0ef0df712 Both 'c' and 'lines' are unused; the bogus init of lines was accidentally
left behind.
2003-08-02 17:35:00 +00:00
alc
15ec2b9212 Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in proc_rwmem().
See revision 1.140 of kern/sys_pipe.c for a detailed rationale.

Submitted by:	tegge
2003-08-02 17:08:21 +00:00
julian
0aaea9d619 fix braino in last commit.
Beaten with clue-stick by: Davidxu
2003-08-02 16:45:32 +00:00
bde
06b828941c Support the Titan VScom PCI-200HV2 2 port serial card.
MFC after:	3 days
2003-08-02 13:25:31 +00:00
phk
4a97de3d53 Kick Giant compatibility one layer up. 2003-08-02 10:11:58 +00:00
phk
adb4818b64 Grab Giant in bufdonebio() since drivers may not hold it.
This only protects the "struct buf" consumers (ie: DEV_STRATEGY()),
but does not protect BIO_STRATEGY() users.
2003-08-02 09:45:10 +00:00
nyan
545a2236ae Merged from sys/dev/sio/sio.c revision 1.400. 2003-08-02 09:41:31 +00:00
phk
e1e146913d Grab Giant in physio() since non-giant drivers are starting to appear. 2003-08-02 09:40:53 +00:00
nyan
10c63974c2 Merged from sys/dev/ppc/ppc.c revision 1.42. 2003-08-02 09:25:25 +00:00