MI API with empty cpu_pause() functions on other arch's, but this
functionality is definitely unique to IA-32, so I decided to leave it
as i386-only and wrap it in #ifdef's. I should have dropped the cpu_
prefix when I made that decision.
Requested by: bde
byte offset of the directory entry for the inode number for all types
of files except directories, although this breaks hard links for
non-directories even if it doesn't cause overflow. Just ignore this
broken inode number for stat() and readdir() and return a less broken
one (the block offset of the file), so that applications normally can't
see the brokenness.
This leaves at least the following brokenness:
- extra inodes, vnodes and caching for hard links.
- various overflow bugs. cd9660 supports 64-bit block numbers, but we
silently ignore the top 32 bits in isonum_733() and then drop another
10 bits for our broken inode numbers. We may also have sign extension
bugs from storing 32-bit extents in ints and longs even if ints are
32-bits. These bugs affect DVDs. mkisofs apparently limits them
by writing directory entries first.
Inode numbers were broken mainly in 4.4BSD-Lite2. FreeBSD-1.1.5 seems
to have a correct implementation modulo the overflow bugs. We need
to look up directory entries from inodes for symlinks only. FreeBSD-1.1.5
use separate fields (iso_parent_extent, iso_parent) to point to the
directory entry. 4.4BSD-Lite doesn't have these, and abuses i_ino to
point to the directory entry. Correct pointers are impossible for
hard links, but symlinks can't be hard links.
Pentium 4's and newer IA32 processors. The "pause" instruction has been
verified by Intel to be a NOP on all currently existing IA32 processors
prior to the Pentium 4.
option is used (not on by default).
- In the case of trying to lock a mutex, if the MTX_CONTESTED flag is set,
then we can safely read the thread pointer from the mtx_lock member while
holding sched_lock. We then examine the thread to see if it is currently
executing on another CPU. If it is, then we keep looping instead of
blocking.
- In the case of trying to unlock a mutex, it is now possible for a mutex
to have MTX_CONTESTED set in mtx_lock but to not have any threads
actually blocked on it, so we need to handle that case. In that case,
we just release the lock as if MTX_CONTESTED was not set and return.
- We do not adaptively spin on Giant as Giant is held for long times and
it slows SMP systems down to a crawl (it was taking several minutes,
like 5-10 or so for my test alpha and sparc64 SMP boxes to boot up when
they adaptively spinned on Giant).
- We only compile in the code to do this for SMP kernels, it doesn't make
sense for UP kernels.
Tested on: i386, alpha, sparc64
the relevant classes.
Some methods may implement various "magic spaces", this is reserved
or magic areas on the disk, set a side for various and sundry purposes.
A good example is the BSD disklabel and boot code on i386 which occupies
a total of four magic spaces: boot1, the disklabel, the padding behind
the disklabel and boot2. The reason we don't simply tell people to
write the appropriate stuff on the underlying device is that (some of)
the magic spaces might be real-time modifiable. It is for instance
possible to change a disklabel while partitions are open, provided
the open partitions do not get trampled in the process.
Sponsored by: DARPA & NAI Labs.
value of the tag or data field.
Add macros for getting the page shift, size and mask for the physical page
that a tte maps (which may be one of several sizes).
Use the new cache functions for invalidating single pages.
that td_intr_nesting_level is 0 (like malloc() does). Since malloc() calls
uma we can probably remove the check in malloc() for this now. Also,
perform an extra witness check in that case to make sure we don't hold
any locks when performing a M_WAITOK allocation.
yet. We just return without performing any checks.
- Don't explicitly enter and exit critical sections when walking lock
lists. We don't need a critical section to walk the list of sleep
locks for a thread. We check to see if a spin lock list is empty
before we walk it. If the list is empty we don't need to walk it. If
it isn't then we already hold at least one spin lock and are already in
a critical section and thus don't need our own explicit critical
section.
initialized socket with no qlimit was being passed in. In order
to handle this case properly, we must not use >= when comparing
queue sizes to qlimit. As a result of this improper handling,
a panic could result in certain cases.
PR: 38325
MFC after: 3 days
o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a
socket buffer. The mutex in the receive buffer also protects the data
in struct socket.
o Determine the lock strategy for each members in struct socket.
o Lock down the following members:
- so_count
- so_options
- so_linger
- so_state
o Remove *_locked() socket APIs. Make the following socket APIs
touching the members above now require a locked socket:
- sodisconnect()
- soisconnected()
- soisconnecting()
- soisdisconnected()
- soisdisconnecting()
- sofree()
- soref()
- sorele()
- sorwakeup()
- sotryfree()
- sowakeup()
- sowwakeup()
Reviewed by: alfred
make_dev() to create device nodes for each of the serial port channels
(ttym%d and cuam%d respectively, as borrowed from MAKEDEV). This allows
the rc driver to work in 5.0. I've tested it with only one card, but
will try sticking in a second card tomorrow and see what happens.
combining too much conditions and as such ended up with the
kernel map instead of the corresponding process map. While
here, remove code to allow access to the stackgap and restyle
slightly to improve readability.
This fix specifically fixes the procfs failure we're having
when reading the process map (cat /proc/curproc/map)
As a minor positive side-effect, code at -O0 is more optimal. As a
minor negative side-effect, certain boundary cases yield no better
code than non-boundary cases. For example, atomic_set_acq_32(p, 0)
does a useless logical OR with value 0. This was previously elimina-
ted as part of if/while optimizations. Non-boundary cases yield
identical code at -O1 and -O2.
checking, followed by a lookup of the process. Do not call
ptrace() for permission checking, but do it inline.
Spotted by: rwatson
o While here, copy-in arguments before we lock. This fixes
a possible permanent lock.
Reviewed by: rwatson
- Don't include ia64_cpu.h and cpu.h
- Guard definitions by _NO_NAMESPACE_POLLUTION
- Move definition of KERNBASE to vmparam.h
o Move definitions of IA64_RR_{BASE|MASK} to vmparam.h
o Move definitions of IA64_PHYS_TO_RR{6|7} to vmparam.h
o While here, remove some left-over Alpha references.
pointer instead of a proc pointer and require the process pointed to
by the second argument to be locked. We now use the thread ucred reference
for the credential checks in p_can*() as a result. p_canfoo() should now
no longer need Giant.
IFS had its fingers deep in the belly of the UFS/FFS split. IFS
will be reimplemented by the maintainer at a later date.
Requested by: adrian (maintainer)
ext2fs, inode numbers start at 1, so the maximum valid inode number
is (s_inodes_per_group * s_groups_count), not one less. This is
just a minimal change to avoid unnecessary panics and errors; some
other related bugs that Bruce Evans mentioned to me are not addressed.
Reviewed by: bde (ages ago)
yields incorrect behaviour. The hardwiring was present in the very
first commit that implemented msgrcv() (revision 1.4) and hasn't been
changed since. The native implementation was complete at that time,
so there doesn't seem to be a reason for the hardwiring from a
technical point of view.
Submitted by: Reinier Bezuidenhout <rbezuide@yahoo.com>
release Giant around vm_map_madvise()'s call to pmap_object_init_pt().
o Replace GIANT_REQUIRED in vm_object_madvise() with the acquisition
and release of Giant.
o Remove the acquisition and release of Giant from madvise().
kernel BOOTP option. The format will be:
FreeBSD:<MACHINE>:<osrelease>
this way people can tune their DHCP server to server up root file systems
via the OS, machine type and version.
Obtained from: NetBSD
MFC after: 3 weeks
regardless of if they are signed or unsigned since it is easier to work
with sign-extended values. Thus, remove the disabled zapnot to
zero-extend the sign-extended value we read from *p in atomic_cmpset_32()
since the cmpval we are comparing against should already be
sign-extended.
- To ensure that the compiler knows to sign-extend the upper 32 bits of
cmpval rather than leaving garbage in there, cast the appropriately in
the constraints section.
Help from: Richard Henderson <rth@redhat.com>
shared code and converting all ufs references. Originally it may
have made sense to share common features between the two filesystems,
but recently it has only caused problems, the UFS2 work being the
final straw.
All UFS_* indirect calls are now direct calls to ext2_* functions,
and ext2fs-specific mount and inode structures have been introduced.
struct _scrmap, so that it doesn't break C++ programs (name of element of
the structure is the same as the name of the scructure itself).
MFC after: 5 days
the former blocks software interrupts, while the latter blocks
hardware interrupts.
Avoid one place where I'm at splnet across a call to copyout. Leave
one in place to give bde something to complain about :-). Actaully,
I'll fix it in a subsequent commit.
Reviewed by: bde
spl conical hat to: imp
allow recovery from transmission lockups which occur in the middle
of the descriptor list, rather than just at the beginning.
For some unknown reason, Rhine II chips have a tendency to stop
transmitting while under heavy load, possibly due to collisions.
Whether this behavior is due to a hardware bug or a driver glitch
is unknown as of now.
In either case, this change allows the driver to gracefully recover
from such situations.
Special thanks go to The Anarcat <anarcat@anarcat.dyndns.org>, who
bugged me into looking at this and to
Dominic Marks <dominic_marks@btinternet.com>, who performed a great
deal of testing to help characterize this problem.
MFC after: 3 days
inter-process signalling ceased to preserve and return that value,
instead always returning EPERM. This meant that it was possible
to "probe" the pid space for processes that were not otherwise
visible. This change reverts that reversion.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
previously used "micro-optimization" (count-down loop) into a
pessimization. Now the loops are written in the more natural count-up
form.
Also, while being there, i made the logic in out_fdc() similar to the
logic in in_fdc(). The old implementation was a bit bogus anyway
since it first tested the DIO bit and only afterwards the RQM bit.
However, according to the description of the i82077, the DIO bit is
only guaranteed to be valid once the RQM bit is set. Thus, the old
implementatoin would have had the chance to misbehave on a controller
that is implemented in accordance with the i82077 description (but is
not bug-for-bug compatible).
MFC after: 3 days
results in the syncache entry being turned into a socket. While it's
not used in the main tree, this is required in the MAC tree so that
labels can be propagated from the mbuf to the socket. This is also
useful if you're doing things like transparent IP connection hijacking
and you want to use the syncache/cookie mechanism, but we won't go
there.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
structures etc. to ext2fs-specific names, and remove ufs-specific
code that is no longer required. As a first stage, the code will
still convert back and forth between the on-disk format and struct
inode, so the struct dinode fields have been added to struct inode
for now.
Note that these files are not yet connected to the build.
additional system boot ordering entry, SI_SUB_MAC_LATE, which occurs
after all MAC policies have been initialized, permitting the MAC
subsystem to take action once all "early loaded" modules are in place.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
with 16-bit ints, since u_short is promoted when it is passed to a
varargs function. gcc now warns about this. We always pass small
integers (this is well obuscated), so there are no conversion problems.
Fixed a related style bug (bogus cast).
-current, since offsetof() is defined a header under /sys so that
system sources don't need to have this wrong include.
This bug was only detected because my version of <stddef.h> has some
spelling fixes (s/field/member/g) and gcc is now sensitive to the spelling
of arg names in macros as required by standards (ISO C90 6.8.3...).
Get rid of the INTERNALSTATICLIB knob and just use plain INTERNALLIB.
INTERNALLIB now means to build static library only and don't install
anything. Added a NOINSTALLLIB knob for libpam/modules. To not
build any library at all, just do not set LIB.
Ipfw processing of frames at layer 2 can be enabled by the sysctl variable
net.link.ether.ipfw=1
Consider this feature experimental, because right now, the firewall
is invoked in the places indicated below, and controlled by the
sysctl variables listed on the right. As a consequence, a packet
can be filtered from 1 to 4 times depending on the path it follows,
which might make a ruleset a bit hard to follow.
I will add an ipfw option to tell if we want a given rule to apply
to ether_demux() and ether_output_frame(), but we have run out of
flags in the struct ip_fw so i need to think a bit on how to implement
this.
to upper layers
| |
+----------->-----------+
^ V
[ip_input] [ip_output] net.inet.ip.fw.enable=1
| |
^ V
[ether_demux] [ether_output_frame] net.link.ether.ipfw=1
| |
+->- [bdg_forward]-->---+ net.link.ether.bridge_ipfw=1
^ V
| |
to devices
several reasons before. Fixing it involved restructuring the generic hash
code to require calling code to handle locking, unlocking, and freeing hashes
on error conditions.
bridged packets only, soon to come also for packets on ordinary
ether_input() and ether_output() paths. The syntax is
ipfw add <action> MAC dst src type
where dst and src can be "any" or a MAC address optionallyfollowed
by a mask, e.g.
10:20:30:40:50
10:20:30:40:50/32
10:20:30:40:50&ff:ff:ff:f0:ff:0f
and type can be a single ethernet type, a range, or a type followed by
a mask (values are always in hexadecimal) e.g.
0800
0800-0806
0800/8
0800&03ff
Note, I am still uncertain on what is the best format for inputting
these values, having the values in hexadecimal is convenient in most
cases but can be confusing sometimes. Suggestions welcome.
Implement suggestion from PR 37778 to allow "not me" on destination
and source IP. The code in the PR was slightly wrong and interfered
with the normal handling of IP addresses. This version hopefully is
correct.
Minor cleanup of the code, in some places moving the indentation to 4
spaces because the code was becoming too deep. Eventually, in a
separate commit, I will move the whole file to 4 space indent.
default of -fguess-branch-probablility causes time optimizations (?)
like rewriting `if (foo) x++;' as
`if (!foo) goto forth; back: ; ...; forth: x++; goto back;". This is
pessimizes space especially well on i386's because one short branch
gets converted to 2 long ones.
Removed -fno-align-foo since it is implied by -Os. Previous commit
messages seem to have overstated the new alignment bugs in gcc. The
only case that affects boot2 is that -fno-align-functions (or
equivalently -falign-functions=1) actually gives -falign-functions=2.
This is caused by FUNCTION_BOUNDARY being 2 (bytes) instead of 1.
The default case where the optimization level is 1 and no alignment
options are given is more broken. All alignments are minimal, modulo
the bug in FUNCTION_BOUNDARY. This is caused by toplev.c setting
defaults too early.
Some hacks in previous commits ar not needed now, but may as well be
kept until gcc is fixed. The previous on in the Makefile saved 96
bytes of text due to the wrong FUNCTION_BOUNDARY and 32 bytes of data
due to unrelated bloat in the alignment of large objects. There aren't
even any options to control alignment of data.
before rev 1.229 (~ 100 ms). According to bde, some (old) broken
hardware could require it. In order to make timing more accurate than
what could be achieved with a loop around DELAY(1), increase loop
timing after the initial ~ 1 ms.
Also, move the declaration of FDSTS_TIMEOUT out from fdreg.h into fd.c
where it actually belongs to.
MFC after: 2 days
function to return the total number of CPUs and not the highest
CPU id.
o Define mp_maxid based on the minimum of the actual number of
CPUs in the system and MAXCPU.
o In cpu_mp_add, when the CPU id of the CPU we're trying to add
is larger than mp_maxid, don't add the CPU. Formerly this was
based on MAXCPU. Don't count CPUs when we add them. We already
know how many CPUs exist.
o Replace MAXCPU with mp_maxid when used in loops that iterate
over the id space. This avoids a couple of useless iterations.
o In cpu_mp_unleash, use the number of CPUs to determine if we
need to launch the CPUs.
o Remove mp_hardware as it's not used anymore.
o Move the IPI vector array from mp_machdep.c to sal.c. We use
the array as a centralized place to collect vector assignments.
Note that we still assign vectors to SMP specific IPIs in
non-SMP configurations. Rename the array from mp_ipi_vector to
ipi_vector.
o Add IPI_MCA_RENDEZ and IPI_MCA_CMCV. These are used by MCA.
Note that IPI_MCA_CMCV is not SMP specific.
o Initialize the ipi_vector array so that we place the IPIs in
sensible priority classes. The classes are relative to where
the AP wake-up vector is located to guarantee that it's the
highest priority (external) interrupt. Class assignment is
as follows:
class IPI notes
x AP wake-up (normally x=15)
x-1 MCA rendezvous
x-2 AST, Rendezvous, stop
x-3 CMCV, test
vm_object_deallocate(), replacing the assertion GIANT_REQUIRED.
o Remove GIANT_REQUIRED from vm_map_protect() and vm_map_simplify_entry().
o Acquire and release Giant around vm_map_protect()'s call to pmap_protect().
Altogether, these changes eliminate the need for mprotect() to acquire
and release Giant.
nearly in its entirety from i386, so it retains the phk/nati copyright.
Savecore likes the results, but I have no way to test it as gdb is
still broken.
to 4 bytes free. I removed a printf (the Keyboard yes/no) since it is of
marginal value and sed'ed the generated asm output to remove the unwanted
aligns. There's probably a better way to gain a few extra bytes than
losing the printf. Shortening strings is probably a better option but this
should get us over the hurdle.
than the first one on a controller, and work for secondary
controllers.
Due to the prom not having nodes for each disk, but a catch-all one,
we have to iterate over each device, trying to open it to determine
whether it is actually present.
Since probing this way takese some time (and spews some spurious
warnings), it should maybe be short-circuited if we use the
device we were booted from.
Implement lazy device probing, and correct slice/partiniton
handling in the ofwd_open() code. With this, I can now actually boot
a kernel from disk, and the loader does not create unnecessary
delays.
Submitted by: tmm
a floating point instruction into a 6-bit register number for
double and quad arguments.
Make use of the new INSFPdq_RN macro where apporpriate; this
is required for correctly handling the "high" fp registers
(>= %f32).
Fix a number of bugs related to the handling of the high registers
which were caused by using __fpu_[gs]etreg() where __fpu_[gs]etreg64()
should be used (the former can only access the low, single-precision,
registers).
Submitted by: tmm
value we load from memory. gcc3.1 passes in the u_int32_t old value to
compare against as a _sign_-extended 64-bit value for some reason (bug?).
This is a temporary workaround so kernels work again on alpha.
in each cycle, with a tunable max cycle count defined in fdreg.h.
This is said to fix the problem on some Compaq hardware (and perhaps
on other machines using the Natsemi PC87317 chip) where the fdc(4)
driver failed to operate at all.
PR: kern/21397
Submitted by: Jung-uk Kim <jkim@niksun.com>
MFC after: 3 days
make sure it's a correct operation for devfs, do it only in the
ISLASTCN case. If we don't, we are assuming that the final file will
be in devfs, which is not true if another partition is mounted on top
of devfs or with special filenames (like /dev/net/../../foo).
Reviewed by: phk
it into an "#ifdef INET6" block. This caused a (harmless but annoying)
EINVAL return value to be sent even though the operation completed
successfully.
PR: kern/37786
Submitted by: Ari Suutari <ari.suutari@syncrontech.com>,David Malone <dwmalone@maths.tcd.ie>
MFC after: 1 day
- Axe -fdata-sections as turning it on or off makes no difference. If
it did make a difference it would serve to bloat boot2 even further with
extra padding.
- Axe -fforce-addr. This gets us 32 bytes so we are down to only being
64-bytes over.
We still can't compile this with gcc 3.1. The problem seems to be that
the -fno-align-foo options don't actually work. Comparing the new and
old output it turns out that gcc is 4-byte padding all the functions and
labels and what not despite the passed in arguments thus adding the
unfortunate bloat to boot2.
This code works by converting the Sun label to a struct disklabel, which
is probably even the right thing for reading a label. The original
checksum is taken over, so that the label source can be distinguished.
The NetBSD code to wrap a BSD-style disklabel into the Sun disklabel has been
deleted for now - don't know whether that is really desirable, after all Sun
disklabels could just be used always (BSD disklabels are going to have
problems with PROM compatability). The dsinit() call in diskopen() has been
#ifdef'ed out for now, this will be changed to use the minimal slice struct
in case of dsinit() failure.
Submitted by: tmm
Obtained from: NetBSD