Commit Graph

41272 Commits

Author SHA1 Message Date
Bosko Milekic
d56368d779 Plug a race and a leak in UMA.
1) The race has to do with zone destruction.  From the zone destructor we
   would lock the zone, set the working set size to 0, then unlock the zone,
   drain it, and then free the structure.  In the window between unlocking
   the zone (after setting the working set size to 0) and re-acquiring the
   zone lock in zone_drain, the uma timer routine could have fired and
   changed the working set size to something non-zero,
   thereby potentially preventing us from completely freeing slabs before
   destroying the zone (and thus leaking them).

2) The leak has to do with zone destruction as well.  When destroying a
   zone we would take care to free all the buckets cached in the zone, but
   although we would drain the pcpu cache buckets, we would not free them.
   This resulted in leaking a couple of bucket structures (512 bytes each)
   per cpu on SMP during zone destruction.

While I'm here, also silence GCC warnings by turning uma_slab_alloc()
from an inline into a real function.  It's too big to be an inline.

Reviewed by: JeffR
2003-07-30 18:55:15 +00:00
Alan Cox
93b4c5b707 The introduction of vm object locking has caused witness to reveal
a long-standing mistake in the way a portion of a pipe's KVA is
allocated.  Specifically, kmem_alloc_pageable() is inappropriate
for use in the "direct" case because it allows a preceding vm map entry
and vm object to be extended to support the new KVA allocation.
However, the direct case KVA allocation should not have a backing
vm object.  This is corrected by using kmem_alloc_nofault().

Submitted by:	tegge (with the above explanation by me)
2003-07-30 18:55:04 +00:00
Nate Lawson
a7985e4feb Use ACPI_FLUSH_CPU_CACHE() instead of wbinvd(). Verified .o with md5.
Pointed out by:	Mark Santcroos <marks@ripe.net>
2003-07-30 17:20:33 +00:00
Thomas Moestl
416c84a212 Return 1 from pmap_protect_tte() instead of 0. When used with
tsb_foreach(), 0 signals to terminate the tsb traversal, so when
tsb_foreach() was used in pmap_protect() (which only happens when
the area to be protected is larger than PMAP_TSB_THRESH = 16MB), only
the first tsb entry in the specified range would be protected.

Reported by:	Andrew Belashov <bel@orel.ru>
2003-07-30 16:27:51 +00:00
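
The pmap_protect_tte() fix above (416c84a212) depends on a callback convention in which a return value of 0 stops the traversal. The following is a minimal sketch of that convention only, not the sparc64 pmap code: foreach_entry(), protect_buggy() and protect_fixed() are hypothetical names, and the "entries" are plain ints standing in for tsb entries.

```c
#include <stddef.h>

/* Hypothetical callback type: return nonzero to continue, 0 to stop. */
typedef int (*entry_cb)(int *entry, void *arg);

/*
 * Visit entries[0..n-1] in order and stop early when the callback
 * returns 0.  Returns the number of entries that were visited.
 */
size_t
foreach_entry(int *entries, size_t n, entry_cb cb, void *arg)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (cb(&entries[i], arg) == 0)
			return (i + 1);
	}
	return (n);
}

/* Buggy variant: returning 0 ends the traversal after one entry. */
int
protect_buggy(int *entry, void *arg)
{
	(void)arg;
	*entry = 1;		/* mark the entry as "protected" */
	return (0);
}

/* Fixed variant: returning 1 lets the traversal continue. */
int
protect_fixed(int *entry, void *arg)
{
	(void)arg;
	*entry = 1;
	return (1);
}
```

With the buggy callback only the first entry in a range is marked, which mirrors the symptom described in the commit message; with the fixed callback every entry is visited.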
Nate Lawson
a329ebca91 Add and document the hw.acpi.ec.poll_timeout tunable. 2003-07-30 16:22:53 +00:00
Bosko Milekic
a40fdcb439 When generating the zone stats make sure to handle the master zone
("UMA Zone") carefully, because it does not have pcpu caches allocated
at all.  In the UP case, we did not catch this because one pcpu cache
is always allocated with the zone, but for the MP case, we were getting
bogus stats for this zone.

Tested by: Lukas Ertl <le@univie.ac.at>
2003-07-30 15:22:37 +00:00
Hartmut Brandt
539f18e4a4 Rearrange the vcc structure so that the generic getvcc function
can be used and add per-VC statistics.
2003-07-30 14:20:00 +00:00
Hartmut Brandt
6662c0c12e Rearrange the fields in the vcc table entry to fit the requirements
of the generic getvcc function and use that function instead of the
home-grown one.
2003-07-30 11:32:42 +00:00
Hartmut Brandt
223e90573f Generate events when the carrier goes up or down.
Add two sysctl's that allow read-only access to the current
state of the utopia interface and to the carrier state.
2003-07-30 08:35:58 +00:00
Poul-Henning Kamp
7b4bd98ad5 Remove the disabling of buckets workaround.
Thanks to:	jeffr
2003-07-30 07:50:19 +00:00
Marcel Moolenaar
c9085788f6 In get_mcontext(), if we need to clear the return value, clear
FRAME_A3 as well. Otherwise swapcontext() will return -1.
2003-07-30 06:38:35 +00:00
Jeff Roberson
f828e5bedb - Get rid of the ill-conceived uz_cachefree member of uma_zone.
- In sysctl_vm_zone use the per cpu locks to read the current cache
   statistics; this makes them more accurate under heavy load.

Submitted by:	tegge
2003-07-30 05:59:17 +00:00
Jeff Roberson
d11e0ba565 - Check to see if we need a slab prior to allocating one. Failure to do
   so not only wastes memory but it can also cause a leak in zones that
   will be destroyed later.  The problem is that the slab allocation code
   places newly created slabs on the partially allocated list because it
   assumes that the caller will actually allocate some memory from it.
   Failure to do so places an otherwise free slab on the partial slab list
   where we won't find it later in zone_drain().

Continuously prodded to fix by:	phk (Thanks)
2003-07-30 05:42:55 +00:00
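
The idea in d11e0ba565 above, checking whether a slab is needed before allocating one, can be illustrated with a toy counter model. This is a sketch under stated assumptions and not UMA code: struct toy_zone and toy_zone_take() are hypothetical, and a "slab" is reduced to a count of free items.

```c
#include <stddef.h>

/* Hypothetical toy model of a zone: only counters, no real memory. */
struct toy_zone {
	size_t free_items;	/* items available in existing slabs */
	size_t slabs;		/* number of slabs created so far */
	size_t items_per_slab;	/* capacity of each new slab */
};

/*
 * Take one item from the zone.  A new slab is created only when no
 * free item remains, so a slab is never created without being used.
 * Returns 0 on success, -1 if the zone cannot hold any items.
 */
int
toy_zone_take(struct toy_zone *z)
{
	if (z->free_items == 0) {
		if (z->items_per_slab == 0)
			return (-1);
		z->slabs++;
		z->free_items += z->items_per_slab;
	}
	z->free_items--;
	return (0);
}
```

Because the check comes first, every slab that exists has had at least one item taken from it, which is the invariant the partial slab list in the commit message relies on.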
Peter Wemm
f3d3771beb Detour via (void *) to defeat gcc's strict-aliasing warnings when using
-O2 or -Os (such as 'make release').

This commit brought to you by the warning:
  dereferencing type-punned pointer will break strict-aliasing rules
2003-07-30 00:04:58 +00:00
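
For context on the warning quoted in f3d3771beb above: the commit silences it by detouring through (void *). The sketch below shows a different technique, copying the bytes with memcpy(), which avoids dereferencing a type-punned pointer at all. float_bits() is a hypothetical helper, and the code assumes float is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the object representation of a float as a 32-bit integer.
 * memcpy() copies the bytes, so no pointer of the wrong type is ever
 * dereferenced and the strict-aliasing rules are not involved.
 */
uint32_t
float_bits(float f)
{
	uint32_t bits;

	/* Assumption: float and uint32_t have the same size. */
	assert(sizeof(f) == sizeof(bits));
	memcpy(&bits, &f, sizeof(bits));
	return (bits);
}
```

The form that triggers the warning would be `*(uint32_t *)&f`; the memcpy() version typically compiles to the same machine code when optimizing.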
Poul-Henning Kamp
0c32d97ab5 Temporary workaround: Always disable buckets, there is a bug there
somewhere.

JeffR will look at this as soon as he has time.

OK'ed by:	jeffr
2003-07-29 22:07:10 +00:00
Bruce Evans
f52ecc3346 Restored clearing of the bss, except for putting it in a correct place
with up to date comments.  This fixes booting kernels with boot2
(except for loss of the features provided by loader) and is suitable
for MFC.  Contrary to the old comments, most loaders don't clear the bss.
biosboot lost clearing of the bss in a code crunch in 1997, and boot2
never did it.

kan didn't notice the problem with gcc-3.3 putting variables that are
initialized to 0 in the bss until after committing gcc-3.3 because he
was already using essentially this patch.  Before gcc-3.3, only the
non-critical `bootdev' variable was clobbered by clearing the bss.

MFC after:	3 days
2003-07-29 21:57:01 +00:00
Poul-Henning Kamp
114ebb2f28 Fix a memory leak in CCD's mirror code. 2003-07-29 20:04:06 +00:00
Nate Lawson
dda5f182ae Fix the new DA_OLD_QUIRKS option for normal and module compiles.
Pointed out by: 	bde
2003-07-29 18:08:16 +00:00
Hartmut Brandt
fa6a7ef6e4 Process events from the ATM drivers. Carrier change and PVC change
messages are forwarded as netgraph control messages to the node
that is connected to the manage hook. If that hook is not connected,
the event is lost. Flow control events are converted to netgraph
flow control messages and sent along the hook that is connected to
the flow controlled VC. ACR change events are converted to control
messages and sent along the hook for the given VC.
2003-07-29 16:27:23 +00:00
Hajimu UMEMOTO
6a2a90b794 Cleanup useless break.
Submitted by:	JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp>
2003-07-29 14:10:13 +00:00
Hartmut Brandt
2df0c5212a Generate events for carrier state, PVC state changes and flow control
changes. Still have to figure out how to get at the ABR information.
2003-07-29 14:07:19 +00:00
Hartmut Brandt
b6e1558bf4 Remove the rather bogus statistics sysctl and merge it into the
internal driver statistics sysctl.
2003-07-29 14:05:45 +00:00
Hartmut Brandt
07227b05dc Generate events when the interface state or a PVC state changes. 2003-07-29 14:00:59 +00:00
Hartmut Brandt
f345f5e020 The number of prefixes can never be negative so use a u_int for this. 2003-07-29 13:46:43 +00:00
Hartmut Brandt
bfbb5daa0f Use a size_t where a buffer length is meant. 2003-07-29 13:33:14 +00:00
Hartmut Brandt
dd937e32bd Make the ioctl() interface cleaner with regard to types: use size_t
instead of int where the variable has to hold buffer lengths,
use u_int for things like number of network interfaces which
in principle can never be negative.
2003-07-29 13:32:10 +00:00
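
The type cleanups in bfbb5daa0f and dd937e32bd above follow a common C convention: buffer lengths are size_t, and counts that cannot be negative are unsigned. The following is a minimal sketch of that convention, not the ioctl() interface itself; copy_name() is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, truncating to fit, and always NUL-terminate when
 * dstlen is nonzero.  Lengths are size_t, so a negative length cannot
 * be expressed.  Returns the number of bytes copied, excluding the NUL.
 */
size_t
copy_name(char *dst, size_t dstlen, const char *src)
{
	size_t n;

	if (dstlen == 0)
		return (0);
	n = strlen(src);
	if (n >= dstlen)
		n = dstlen - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return (n);
}
```

A caller would pass `sizeof(buf)` directly as dstlen, with no cast and no sign check.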
Hartmut Brandt
fd9f148922 Send events for VCC state changes, ACR rate changes and interface state
changes.
2003-07-29 13:21:57 +00:00
Hartmut Brandt
59db9a86db Implement a mechanism by which ATM drivers can inform interested
parts of the system about certain kinds of events, like changes
in the ABR rate, changes in the carrier state, PVC changes. The
main consumers of these events are the harp(4) pseudo-driver
and the ILMI daemon via ng_atm(4).
2003-07-29 13:04:52 +00:00
David Xu
5a92cbc206 Use PSL_KERNEL as upcall thread's initial rflags, don't use
scratch user rflags.
2003-07-29 12:44:16 +00:00
Bruce Evans
011a891406 Don't hide the name of tmpstk, since there is no need to do so and the
HIDENAME() macro seems to be unimplementable in C.  (HIDENAME() used
to use invalid token pasting using ## for the STDC case until gcc
started rejecting that; now it uses unportable token pasting using
juxtaposition in all cases.)  This reduces use of HIDENAME() in the
kernel to only i386 and amd64 profiling code so that it doesn't bite
most kernels whenever gcc becomes stricter.  Problems with HIDENAME()
in userland are smaller because userland mostly doesn't use strict
flags yet.  There are some advantages to hiding the name of mcount,
but newer arches shouldn't do it; only amd64 does.

MFC after:	3 days

On second thoughts hide tmpstk better by staticizing it.
2003-07-29 11:44:31 +00:00
Poul-Henning Kamp
3f5187f276 Implement DOSPTYP_EXTLBA more completely: loop until we find no more
partitions.

Submitted by:	Rudolf Cejka <cejkar@fit.vutbr.cz>
PR:	53719
2003-07-29 10:09:13 +00:00
Dag-Erling Smørgrav
7576b4b4c0 Try to make 'uname -a' look more like it does on Linux:
- cut the version string at the newline, suppressing information about
   who built the kernel and in what directory.  Most of this information
   was already lost to truncation.

- on i386, return the precise CPU class (if known) rather than just
   "i386".  Linux software which uses this information to select
   which binary to run often does not know what to make of "i386".
2003-07-29 10:03:15 +00:00
Alan Cox
fbe1bdddcc Revision 1.51 of vm/uma_core.c modified uma_large_free() to acquire Giant
when needed.  So, don't do it here.
2003-07-29 05:23:19 +00:00
John-Mark Gurney
44de8a989d Fix another bus_dma leak due to not having a size param for our bus_dma
allocation function.  This patch prevents continuous growth of
the devbuf memory pool.

Tested with ssh <host> dd of=/dev/null < /dev/zero and vmstat -m | grep devbuf
2003-07-29 05:07:37 +00:00
Nate Lawson
af991a6d16 Deprecate USB and Firewire quirks. We should now never send 6 byte commands
to such devices.  If a device fails due to this commit, add:
   options DA_OLD_QUIRKS
to the kernel config and recompile.  Then send the output of "camcontrol
inquiry da0" to scsi@freebsd.org so the quirk can be re-enabled.
2003-07-29 04:32:33 +00:00
Tim J. Robbins
aae962d56e Fix a problem that occurs when truncating files on NFSv3 mounts: we need
to set np->n_size back to the desired size again after calling
nfs_meta_setsize(), since it could end up in nfs_loadattrcache() getting
called, which would change n_size back to the value it had before the
truncate request was issued. The result of this bug is that the size info
cached in the nfsnode becomes incorrect, lseek(fd, ofs, SEEK_END) seeks
past the end of the file, stat() returns the wrong size, etc.

PR:		41792
MFC after:	2 weeks
2003-07-29 00:17:29 +00:00
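
The pattern behind aae962d56e above is to re-assert a cached value after a call that may have overwritten it as a side effect. The sketch below models only that pattern and is not the NFS client code: struct cached_node, reload_attrs() and set_size() are hypothetical stand-ins for the nfsnode, nfs_loadattrcache() and the truncate path.

```c
#include <stdint.h>

/* Hypothetical node with a cached size and the server's older view. */
struct cached_node {
	uint64_t n_size;	/* size cached on the client */
	uint64_t server_size;	/* size reported before the truncate */
};

/* Stand-in for a call that may reload attributes and clobber n_size. */
void
reload_attrs(struct cached_node *np)
{
	np->n_size = np->server_size;
}

/* Set the cached size, then set it again after the clobbering call. */
void
set_size(struct cached_node *np, uint64_t size)
{
	np->n_size = size;
	reload_attrs(np);	/* may reset n_size to the old value */
	np->n_size = size;	/* restore the size the caller asked for */
}
```

Without the final assignment the node would keep the pre-truncate size, which corresponds to the wrong lseek() and stat() results described in the commit message.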
Robert Watson
9080ff25cf Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the
kernel ACL interfaces and system call names.

Break out UFS2 and FFS extattr delete and list vnode operations from
setextattr and getextattr to deleteextattr and listextattr, which
cleans up the implementations, makes the results more readable,
and makes the APIs clearer.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-07-28 18:53:29 +00:00
Robert Watson
2e4a71cdb1 When exporting file descriptor data for threads invoking the
kern.file sysctl, don't return information about processes that
fail p_cansee(td, p).  This prevents sockstat and related
programs from seeing file descriptors owned by processes not
in the same jail as the thread, as well as having implications
for MAC, etc.

This is a partial solution: it permits an information leak about
the number of descriptors in the sizing calculation (but this is
not new information, you can also get it from kern.openfiles),
and doesn't attempt to mask file descriptors based on the
properties of the descriptor, only the process referencing it.
However, it provides most of what you want under most
circumstances, without complicating the locking.

PR:	54211
Based on a patch submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2003-07-28 16:03:53 +00:00
Peter Wemm
3718a191eb Make this compile on 64 bit systems again. You cannot just cast a 32 bit
int to a 64 bit pointer.  This file is already off the vendor branch.
2003-07-28 10:25:26 +00:00
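
The problem named in 3718a191eb above, casting a 32 bit int directly to a 64 bit pointer, is usually handled by converting through an integer type that has pointer width. The following is a minimal sketch of that approach, not the code from the commit; int_to_cookie() and cookie_to_int() are hypothetical helpers, and the round trip relies on the usual behaviour of common 32 and 64 bit platforms.

```c
#include <stdint.h>

/*
 * Carry a small integer through a void * argument.  Widening to
 * intptr_t first avoids the "cast to pointer from integer of
 * different size" diagnostic on 64 bit systems.
 */
void *
int_to_cookie(int value)
{
	return ((void *)(intptr_t)value);
}

/* Recover the integer that int_to_cookie() stored in the pointer. */
int
cookie_to_int(void *cookie)
{
	return ((int)(intptr_t)cookie);
}
```

The resulting pointer is only a container for the integer and must never be dereferenced.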
Nate Lawson
1deac58179 Add a PATH_INQ flag, PIM_NO_6_BYTE, which indicates the SIM never wishes to
receive 6 byte commands.  Add a check for this flag to da(4) and cd(4) so
that they honor it.  This is a quick workaround for many devices (especially
USB) that require da(4) quirks to operate.  The more complete approach is
to finish the new transport code which will be aware of the SCSI version a
transport implements.

MFC after:	1 day
2003-07-28 06:15:59 +00:00
Alan Cox
234c7726c8 None of the "alloc" functions used by UMA assume that Giant is held any
longer.  (If they still need it, e.g., contigmalloc(), they acquire it
themselves.)  Therefore, we need not acquire Giant in slab_zalloc().
2003-07-28 02:29:07 +00:00
Warner Losh
373eec79d1 The LP_ETH_10_100_CF entry needs to be tagged as a DL100019.
Submitted by: Scott Renfro
2003-07-28 00:07:58 +00:00
Marcel Moolenaar
de22416ef6 Reset the per-CPU unique value at boot and clear it in the PCB of the
child when forking. This provides a consistent initial state.
Note that cpu_set_upcall() does not clear the per-CPU unique value as
it is followed by a call to set_mcontext(), which sets it accordingly.
2003-07-27 23:45:48 +00:00
Alan Cox
cdedf48666 Make pmap_pvo_allocf() callable without Giant. 2003-07-27 20:57:53 +00:00
Poul-Henning Kamp
cf7742997a Pass the file descriptor index down to vn_open.
If the method vector was replaced and we got the "special return code",
smile and trust that whatever happened below DTRT.
2003-07-27 20:09:13 +00:00
Poul-Henning Kamp
3ab6b09c53 Pass the fdidx argument from vn_open{_cred}() onto VOP_OPEN() 2003-07-27 20:05:36 +00:00
Alan Cox
f50ab15dff Remove GIANT_REQUIRED from kmem_alloc(). 2003-07-27 18:31:32 +00:00
Poul-Henning Kamp
7c89f162bc Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout. 2003-07-27 17:04:56 +00:00
Poul-Henning Kamp
1b6c609507 Call the new argument "fdidx", which is more precise than "fd". 2003-07-27 17:03:20 +00:00
Hajimu UMEMOTO
c2ada8f1de ip6fw does not handle ESP correctly
PR:		kern/54874
Submitted by:	JINMEI Tatuya <jinmei@shuttle.wide.toshiba.co.jp>
MFC after:	1 week
2003-07-27 16:21:10 +00:00