times consistently wrong (up to 1 tick too late), but recent changes
fixed the setting of the main clock, making other times inconsistent.
The inconsistencies tended to show up as a negative resource usage
for the process that set the time.
Fixed the check for setting the clock backwards. A stale timestamp
(`time') was checked, so it was possible to set the clock backwards
by up to almost 1 tick. Until recently, this bug was compensated
for by setting the clock consistently wrong.
Merged the comment about setting the clock backwards from Lite2.
Removed latency micro-optimizations/speed pessimizations in settime().
microtime() and set_timecounter() are relatively expensive, and
they must be called together with clock updates blocked to get a
consistent `delta', so significant latency optimizations are not
possible.
Removed some stale comments.
in a way identical to before.) I had problems with the system properly
handling the number of vnodes when there is a lot of system memory and the
default VM_KMEM_SIZE. Two new options, "VM_KMEM_SIZE_SCALE" and
"VM_KMEM_SIZE_MAX", have been added to support better auto-sizing for systems
with more than 128MB of memory.
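
A rough sketch of how a scale/maximum pair can drive auto-sizing follows. The numeric values and the function name `kmem_autosize()` are assumptions chosen for illustration; they are not taken from this commit.

```c
/* Illustrative values only (assumptions); real values come from the
 * kernel configuration. */
#define VM_KMEM_SIZE        (32ULL * 1024 * 1024)  /* default floor */
#define VM_KMEM_SIZE_SCALE  4ULL                   /* physmem / scale */
#define VM_KMEM_SIZE_MAX    (80ULL * 1024 * 1024)  /* upper clamp */

/*
 * Hypothetical helper: start from the default size, grow it in
 * proportion to physical memory, and clamp it to the maximum.
 */
static unsigned long long kmem_autosize(unsigned long long physmem_bytes)
{
    unsigned long long size = VM_KMEM_SIZE;

    if (physmem_bytes / VM_KMEM_SIZE_SCALE > size)
        size = physmem_bytes / VM_KMEM_SIZE_SCALE;
    if (size > VM_KMEM_SIZE_MAX)
        size = VM_KMEM_SIZE_MAX;
    return size;
}
```

Under these assumed values, a 64MB machine keeps the 32MB default, a 256MB machine gets 64MB, and a 512MB machine is clamped to 80MB.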
vnodes, therefore vget doesn't need to do so anymore. Other minor
improvements include the temp free vnode queue obeying the VAGE
flag and a printf that warns of to-be-removed code being executed.
Highlights:
* Simple model for underlying hardware.
* Hardware basis for timekeeping can be changed on the fly.
* Only one hardware clock responsible for TOD keeping.
* Provides a real nanotime() function.
* Time granularity: .232E-18 seconds.
* Frequency granularity: .238E-12 s/s
* Frequency adjustment is continuous in time.
* Less overhead for frequency adjustment.
* Improves xntpd performance.
Reviewed by: bde, bde, bde
so_error is set, clear it before returning it. The behavior
introduced in 4.3-Reno (to not clear so_error) causes potentially
transient errors (e.g. ECONNREFUSED if the other end hasn't opened
its socket yet) to be permanent on connected datagram sockets that
are only used for writing.
(soreceive() clears so_error before returning it, as does
getsockopt(...,SO_ERROR,...).)
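
The fetch-and-clear behavior can be sketched in user space as follows. `struct mini_socket`, `take_so_error()` and `mini_send()` are hypothetical names for illustration; only the `so_error` field name comes from the text above.

```c
#include <errno.h>

/* Hypothetical, cut-down socket holding only the pending-error field. */
struct mini_socket {
    int so_error;
};

/* Return the pending error and clear it, so it is reported only once. */
static int take_so_error(struct mini_socket *so)
{
    int error = so->so_error;

    so->so_error = 0;
    return error;
}

/*
 * Send path: a pending error is reported once and then cleared, so a
 * transient error does not make every later send fail.
 */
static int mini_send(struct mini_socket *so)
{
    if (so->so_error != 0)
        return take_so_error(so);
    return 0;   /* the actual transmit is out of scope here */
}
```

If `so_error` holds ECONNREFUSED, the first `mini_send()` returns it and the next call succeeds, which is the behavior wanted for write-only connected datagram sockets.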
Submitted by: Van Jacobson <van@ee.lbl.gov>, via a comment in the vat sources.
or shrinking an open partition (by changing the label for a compatibility
slice while partitions on the corresponding real slice are open, or vice
versa).
waslocked = TRUE. This change may fix lockmgr panic in umapfs/nullfs.
PR: 5634
Reviewed by: "John S. Dyson" <toor@dyson.iquest.net>
Suggested by: Bruce Evans <bde@zeta.org.au>
of the various ad-hoc schemes.
2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup.
3) When appropriate, set the PG_A or PG_M bits a-priori to both avoid some
processor errata, and to minimize redundant processor updating of page
tables.
4) Modify pmap_protect so that it can only remove permissions (as it
originally supported.) The additional capability is not needed.
5) Streamline read-only to read-write page mappings.
6) For pmap_copy_page, don't enable write mapping for source page.
7) Correct and clean-up pmap_incore.
8) Cluster initial kern_exec pagein.
9) Removal of some minor lint from kern_malloc.
10) Correct some ioopt code.
11) Remove some dead code from the MI swapout routine.
12) Correct vm_object_deallocate (to remove backing_object ref.)
13) Fix dead object handling, which had problems under heavy memory load.
14) Add minor vm_page_lookup improvements.
15) Some pages are not in objects; make sure that vm_page.c can
properly support such pages.
16) Add some more page deficit handling.
17) Some minor code readability improvements.
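
Item 3 above (presetting PG_A and PG_M) can be sketched as below. The bit values are the architectural i386 page-table-entry bits; the function `make_pte()` and its parameters are hypothetical and do not reproduce the pmap code. The sketch assumes "when appropriate" means the caller already knows the page is about to be touched or written.

```c
#include <stdbool.h>
#include <stdint.h>

/* i386 page-table entry bits. */
#define PG_V   0x001u  /* valid */
#define PG_RW  0x002u  /* writable */
#define PG_A   0x020u  /* accessed */
#define PG_M   0x040u  /* modified (dirty) */

/*
 * Hypothetical helper: build a PTE for physical address `pa`.
 * PG_A is set up front because the mapping is being entered in order
 * to be used; PG_M is set up front only for a writable mapping that
 * is known to be written, so the processor need not update the page
 * table itself later.
 */
static uint32_t make_pte(uint32_t pa, bool writable, bool will_write)
{
    uint32_t pte = (pa & ~0xfffu) | PG_V | PG_A;

    if (writable) {
        pte |= PG_RW;
        if (will_write)
            pte |= PG_M;
    }
    return pte;
}
```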
Realtime priority has to be restricted for reasons which should be
obvious. However, for idle priority, there is a potential for
system deadlock if an idleprio process gains a lock on a resource
that other processes need (and the idleprio process can't run
due to a CPU-bound normal process). Fix me! XXX
PR: 5639
MUST be PG_BUSY. It is bogus to free a page that isn't busy,
because it is in a state of being "unavailable" when being
freed. The additional advantage is that the page_remove code
has a better cross-check that the page should be busy and
unavailable for other use. There were some minor problems
with the collapse code, and this plugs those subtle "holes."
Also, the vfs_bio code wasn't checking correctly for PG_BUSY
pages. I am going to develop a more consistent scheme for
grabbing pages, busy or otherwise. For now, we are stuck
with the current morass.
If you want to play with it, you can find the final version of the
code in the repository under the tag LFS_RETIREMENT.
If somebody makes LFS work again, adding it back is certainly
desirable, but as it is now nobody seems to care much about it,
and it has suffered considerable bitrot since its somewhat haphazard
integration.
R.I.P
Make vfs_bio buffer mgmt work better.
Buffers were being used after brelse.
Make nfs_getpages work independently of other NFS
interfaces. This eliminates some difficult
recursion problems and decreases pagefault
overhead.
Remove an erroneous vfs_unbusy_pages.
Fix a reentrancy problem with nfs_vinvalbuf when the
vnode is already being run down.
Reassignbuf wasn't being called when needed under
certain circumstances.
(Thanks to Bill Paul for help.)
This introduces an xxxFS_BOOT for each of the rootable filesystems.
(Presently not required, but encouraged to allow a smooth move of option *FS
to opt_dontuse.h later.)
LFS is temporarily disabled, and will be re-enabled tomorrow.
1) Start using TSM.
Struct procs continue to point to upages structure, after being freed.
Struct vmspace continues to point to pte object and kva space for kstack.
u_map is now superfluous.
2) vm_map's don't need to be reference counted. They always exist either
in the kernel or in a vmspace. The vmspaces are managed by reference
counts.
3) Remove the "wired" vm_map nonsense.
4) No need to keep a cache of kernel stack kva's.
5) Get rid of strange looking ++var, and change to var++.
6) Change more data structures to use our "zone" allocator. Added
struct proc, struct vmspace and struct vnode. This saves a significant
amount of kva space and physical memory. Additionally, this enables
TSM for the zone managed memory.
7) Keep ioopt disabled for now.
8) Remove the now bogus "single use" map concept.
9) Use generation counts or id's for data structures residing in TSM, where
it allows us to avoid unneeded restart overhead during traversals, where
blocking might occur.
10) Account better for memory deficits, so the pageout daemon will be able
to make enough memory available (experimental.)
11) Fix some vnode locking problems. (From Tor, I think.)
12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
(experimental.)
13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c
code. Use generation counts, get rid of unneeded collapse operations,
and clean up the cluster code.
14) Make vm_zone more suitable for TSM.
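
The generation-count idea from item 9 above can be sketched as follows. All names here (`struct gen_list`, `still_valid_after()`, the two example operations) are hypothetical; the sketch only shows the pattern of saving a counter before a possibly blocking operation and restarting the traversal only if the counter changed.

```c
#include <stdbool.h>

/* Hypothetical structure whose generation is bumped on every change. */
struct gen_list {
    unsigned generation;
    int nitems;
};

/* Every modification bumps the generation count. */
static void gen_list_modify(struct gen_list *l, int delta)
{
    l->nitems += delta;
    l->generation++;
}

/* Stand-in for an operation that may block while others run. */
typedef void (*blocking_op)(struct gen_list *);

/* Example operation during which nothing touched the list. */
static void op_quiet(struct gen_list *l)
{
    (void)l;
}

/* Example operation during which the list was changed. */
static void op_mutating(struct gen_list *l)
{
    gen_list_modify(l, 1);
}

/*
 * Run `op` and report whether traversal state captured beforehand is
 * still usable. A false return means the caller must restart.
 */
static bool still_valid_after(struct gen_list *l, blocking_op op)
{
    unsigned saved = l->generation;

    op(l);
    return saved == l->generation;
}
```

The benefit is that the common case, where nothing changed during the blocking operation, avoids the restart entirely.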
This commit is partially the result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)
This is not the infamous, final cleanup of the vnode stuff, but a necessary
step. Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
a null pointer panic when the pointer for the incorrect process is
NULL. getpriority() was broken in rev.1.27. Rev.1.28 broke the
warning instead of fixing the problem.
PR: 5495
config option in pmap. Fix a problem with faulting in pages. Clean-up
some loose ends in swap pager memory management.
The system should be much more stable, but not all subtle bugs are fixed yet.
Fix the UIO optimization code.
Fix an assumption in vm_map_insert regarding allocation of swap pagers.
Fix an spl problem in the collapse handling in vm_object_deallocate.
When pages are freed from vnode objects, and the criteria for putting
the associated vnode onto the free list is reached, either put the
vnode onto the list, or put it onto an interrupt safe version of the
list, for further transfer onto the actual free list.
Some minor syntax changes: pre-decs and pre-incs changed to post versions.
Remove a bogus timeout (that I added for debugging) from vn_lock.
PHK will likely still have problems with the vnode list management, and
so do I, but it is better than it was.
original BSD code. The association between the vnode and the vm_object
no longer includes reference counts. The major difference is that
vm_object's are no longer freed gratuitously from the vnode, and so
once an object is created for the vnode, it will last as long as the
vnode does.
When a vnode object reference count is incremented, then the underlying
vnode reference count is incremented also. The two "objects" are now
more intimately related, and so the interactions are now much less
complex.
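
The coupling of the two reference counts can be sketched like this. The structures and function names are hypothetical simplifications, and the symmetric release path is an assumption; the text above only states that incrementing the object count also increments the vnode count.

```c
#include <stddef.h>

/* Hypothetical, cut-down vnode and VM object. */
struct mini_vnode {
    int v_usecount;
};

struct mini_object {
    int ref_count;
    struct mini_vnode *vp;   /* backing vnode, or NULL */
};

/* Taking an object reference also takes a vnode reference. */
static void object_reference(struct mini_object *obj)
{
    obj->ref_count++;
    if (obj->vp != NULL)
        obj->vp->v_usecount++;
}

/* Assumed symmetric release: drop both references together. */
static void object_deallocate(struct mini_object *obj)
{
    obj->ref_count--;
    if (obj->vp != NULL)
        obj->vp->v_usecount--;
}
```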
Vnodes are now normally placed onto the free queue with an object still
attached. The rundown of the object happens at vnode rundown time, and
happens with exactly the same filesystem semantics of the original VFS
code. There is absolutely no need for vnode_pager_uncache and other
travesties like that anymore.
A side-effect of these changes is that SMP locking should be much simpler,
the I/O copyin/copyout optimizations work, NFS should be more ponderable,
and further work on layered filesystems should be less frustrating, because
of the totally coherent management of the vnode objects and vnodes.
Please be careful with your system while running this code, but I would
greatly appreciate feedback as soon as reasonably possible.
of vnodes and objects. There are some metadata performance improvements
that come along with this. There are also a few prototypes added when
the need is noticed. Changes include:
1) Cleaning up vref, vget.
2) Removal of the object cache.
3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore.
4) Correct some missing LK_RETRY's in vn_lock.
5) Correct the page range in the code for msync.
Be gentle, and please give me feedback asap.
for field widths being 2 larger than specified for "%<number>p". Only
printing of null pointers is "wrong" now (it is actually "right", but
inconsistent with printf(3)).
here, but kmem_malloc() is used and it takes the same "flags" as
malloc().
Use the mbuf allocation "flags" M_WAIT and M_DONTWAIT consistently.
There is really only one boolean flag, M_DONTWAIT, but the "flags"
were always treated as enum-like values, except in some places here
where the values are tacitly converted to boolean flags. Treat
them as enum-like values everywhere, except where we tacitly assume
that there are only two values in order to convert them to the
corresponding two kmem_malloc() "flags".
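
The conversion described above can be sketched as an explicit mapping. The numeric values below are illustrative assumptions, deliberately chosen to differ so that the mapping is visible, and `mbuf_how_to_malloc_flags()` is a hypothetical helper name.

```c
/* mbuf allocation "flags", treated as enum-like values.
 * Values are illustrative assumptions, not the kernel's. */
enum {
    M_WAIT = 10,
    M_DONTWAIT = 11
};

/* malloc()/kmem_malloc() flags. Values are again assumptions. */
enum {
    M_WAITOK = 0x0000,
    M_NOWAIT = 0x0001
};

/*
 * Hypothetical helper: the one place that assumes there are only two
 * mbuf values, in order to pick the corresponding kmem_malloc() flag.
 */
static int mbuf_how_to_malloc_flags(int how)
{
    return (how == M_WAIT) ? M_WAITOK : M_NOWAIT;
}
```

Everywhere else, callers compare `how` against M_WAIT or M_DONTWAIT by value rather than testing it as a boolean.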
of time that the laptop was suspending. Thus, select() calls that might have
suspended rather than firing at 1hr + "time suspended" since the timer was
posted.
Adding:
options APM_FIXUP_CALLTODO
to the kernel config enables the patch.
[
This patch was slightly modified to use a consistent indent style and
I removed some unused local variables. After this has been tested a
few weeks we'll make the option the default, so for now I'm not
documenting it in LINT. Mike can later if he wants.
]
Reviewed by: Mike Smith <msmith@freebsd.org>
Submitted by: Ken Key <key@cs.utk.edu>
flag is set in the p_pfsflags field. This, essentially, prevents an SUID
program from hanging after being traced. (E.g., "truss /usr/bin/rlogin" would
fail, but leave rlogin in a stopevent state.) Yet another case where procctl
is (hopefully ;)) no longer needed in the general case.
Reviewed by: bde (thanks bruce :))
Quite amazing that the system runs at all with this bug. Also present in
2.2.5. The bug appears to have come in with changes in rev 1.53.
PR: might fix PR#5313
Submitted by: bde
if one of the new poll types is requested; hopefully this will not break
any existing code. (This is done so that programs have a dependable
way of determining whether a filesystem supports the extended poll types
or not.)
The new poll types added are:
POLLWRITE - file contents may have been modified
POLLNLINK - file was linked, unlinked, or renamed
POLLATTRIB - file's attributes may have been changed
POLLEXTEND - file was extended
Note that the internal operation of poll() means that it is impossible
for two processes to reliably poll for the same event (this could
be fixed but may not be worth it), so it is not possible to rewrite
`tail -f' to use poll at this time.
- A nonprofiling version of s_lock (called s_lock_np) is used
by mcount.
- When profiling is active, more registers are clobbered in
seemingly simple assembly routines. This means that some
callers needed to save/restore extra registers.
- The stack pointer must have space for a 'fake' return address
in idle, to avoid stack underflow.
... fix a bug with orecvfrom() or recvfrom() called with
the MSG_COMPAT flag on kernels compiled with the COMPAT_43 option.
The symptom is that the fromaddr is not correctly returned.
This affects the Linux emulator.
Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)
noticed some major enhancements available for UP situations. The number
of UP TLB flushes is decreased much more than significantly with these
changes. Since a TLB flush appears to cost minimally approx 80 cycles,
this is a "nice" enhancement, equiv to eliminating between 40 and 160
instructions per TLB flush.
Changes include making sure that kernel threads all use the same PTD,
and eliminating unneeded PTD switches at context switch time.
quite a while, but forgot to do so. For now, this code supports
most daemons running as kernel threads in UP kernels, and as
full processes in SMP. We will soon be able to run them as
threads in SMP, but not yet.
Note that an unload facility should be used to call rm_at_exit() (if
procfs is being loaded as an LKM and is subsequently removed), but it
was non-obvious how to do this in the VFS framework.
Reviewed by: Julian Elischer
surprise, procfs actually is optional, and some people truly do generate
kernels without it. Wow. I built a kernel without 'options PROCFS' and
it compiled and linked.