Commit Graph

5577 Commits

Author SHA1 Message Date
Konstantin Belousov
30af71199e Fix the remaining race in the revs. 1.232, 1,233 that could occur during
unmount when mp structure is reused while waiting for coveredvp lock.
Introduce struct mount generation count, increment it on each reuse and
compare the generations before and after obtaining the coveredvp lock.

Reviewed by:	tegge, pjd
Approved by:	pjd (mentor)
MFC after:	2 weeks
2006-10-03 10:47:04 +00:00
John Birrell
678faae279 Solaris compatibility only: Be specific about the fact that
the inline function takes no arguments.
2006-10-03 04:01:30 +00:00
Poul-Henning Kamp
e5037a18a9 Use utc_offset() where applicable, and hide the internals of it
as static variables.
2006-10-02 18:23:37 +00:00
John Baldwin
278d119ae6 Update description of td_locks.
MFC after: 	3 days
Requested by:	pjd
2006-10-02 17:48:13 +00:00
Poul-Henning Kamp
f97c1c4bf7 Introduce utc_offset() to capture a calculation currently done all over the
place.
2006-10-02 16:17:23 +00:00
Poul-Henning Kamp
f645b0b51c First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
Tor Egge
04aa807cb6 If the buffer lock has waiters after the buffer has changed identity then
getnewbuf() needs to drop the buffer in order to wake waiters that might
sleep on the buffer in the context of the old identity.
2006-10-02 02:06:27 +00:00
Simon L. B. Nielsen
1b43c389b6 Bump __FreeBSD_version for OpenSSL 0.9.8d import. 2006-10-01 08:26:41 +00:00
Ruslan Ermilov
4e9d799f8b Retire macros for the old kernel memory allocator.
Submitted by:	bde
2006-09-28 08:36:08 +00:00
Ruslan Ermilov
9fddcc6661 Fix our ioctl(2) implementation when the argument is "int". New
ioctls passing integer arguments should use the _IOWINT() macro.
This fixes a lot of ioctl's not working on sparc64, most notable
being keyboard/syscons ioctls.

Full ABI compatibility is provided, with the bonus of fixing the
handling of old ioctls on sparc64.

Reviewed by:	bde (with contributions)
Tested by:	emax, marius
MFC after:	1 week
2006-09-27 19:57:02 +00:00
Robert Watson
d39de3122d PC98 would also like a trademark.
Who would have thought that getting a kernel printf right would be so
tricky?

MFC after:	3 days
Submitted by:	Gavin Atkinson <gavin dot atkinson at ury dot york dot ac dot uk>
2006-09-26 12:45:47 +00:00
Tor Egge
55b4ff0d9f Increase mnt_noasync once in softdep_mount() to disallow async io,
closing a window where a file system using softupdates could be async
for a short while if both MNT_UPDATE and MNT_ASYNC were passed as flags
to nmount().  Add MNTK_SOFTDEP flag to ensure that softdep_mount()
doesn't increase mnt_noasync multiple times.
2006-09-26 04:17:17 +00:00
Tor Egge
a1e363f256 Add mnt_noasync counter to better handle interleaved calls to nmount(),
sync() and sync_fsync() without losing MNT_ASYNC.  Add MNTK_ASYNC flag
which is set only when MNT_ASYNC is set and mnt_noasync is zero, and
check that flag instead of MNT_ASYNC before initiating async io.
2006-09-26 04:15:59 +00:00
Tor Egge
5da56ddb21 Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag.
This eliminates a race where MNT_UPDATE flag could be lost when nmount()
raced against sync(), sync_fsync() or quotactl().
2006-09-26 04:12:49 +00:00
Robert Watson
5add74b4a7 Add "FreeBSD" trademark statement to copyright section of boot messages.
MFC after:	3 days
Approved by:	core, board at FreeBSDFoundation dot org
2006-09-25 23:19:01 +00:00
John-Mark Gurney
4db71d27a1 hide kqueue_register from public view, and replace it w/ kqfd_register...
this eliminates a possible race in aio registering a kevent..
2006-09-24 04:47:47 +00:00
Alexander Leidinger
b611c801f0 MFp4 the sound Google Summer of Code project:
The goal was to sync with the OSSv4 API 4Front Technologies uses in their
proprietary OSS driver. This was successful as far as possible. The part
of the API which is stable is implemented, for the rest there are some
stubs already.

New system ioctls:
 - SNDCTL_SYSINFO - obtain audio system info (version, # of audio/midi/
   mixer devices, etc.)
 - SNDCTL_AUDIOINFO - fetch details about a specific audio device
 - SNDCTL_MIXERINFO - fetch details about a specific mixer device

New audio ioctls:
 - Sync groups (SNDCTL_DSP_SYNCGROUP/SNDCTL_DSP_SYNCSTART) which allow
   triggered playback/recording on multiple devices (even across processes
   simultaneously).
 - Peak meters (SNDCTL_DSP_GETIPEAKS/SNDCTL_DSP_GETOPEAKS) - can query
   audio drivers for peak levels (needs driver support, disabled for now).
 - Per channel playback/recording levels -
   SNDCTL_DSP_{GET,SET}{PLAY,REC}VOL.  Note that these are still in name
   only, just wrapping around the AC97-style mixer at the moment. The next
   step is to push them down to the drivers.

Audio ioctls still under development by 4Front (for which stubs may exist
in this commit):
 - SNDCTL_GETNAME, SNDCTL_{GET,SET}{SONG,LABEL}
 - SNDCTL_DSP_{GET,SET}_CHNORDER
 - SNDCTL_MIX_ENUMINFO, SNDCTL_MIX_EXTINFO - (might be documented enough in
   the OSS releases to work on this.  These ioctls cover the cool "twiddle
   any knob on your card" features.)

Missing:
 - SNDCTL_DSP_COOKEDMODE -- this ioctl is used to give applications direct
   access to a card's buffers, bypassing the feeder architecture.  It's
   a toughy -- "someone" needs to decide :
   (a) if this is desireable, and (b) if it's reasonably feasible.

Updates for driver writers:
 So far, only two routines to the channel class (in channel_if.m) are added.
 One is for fetching a list of discrete supported playback/recording rates
 of a channel, and the other is for fetching peak level info (useful for
 drawing peak meters).  Interested parties may want to help pushing down
 SNDCTL_DSP_{GET,SET}{PLAY,REC}VOL into the drivers.

To use the new stuff you need to rebuild the sound drivers or your kernel
(depending on if you use modules or not) and to install soundcard.h (a
buildworld/installworld handles this).

Sponsored by:	Google SoC 2006
Submitted by:	ryanb
Many thanks to:	4Front Technologies for their cooperation, explanations
		and the nice license of their soundcard.h.
2006-09-23 20:45:47 +00:00
John Baldwin
d72a078647 Update the ipmi(4) driver:
- Split out the communication protocols into their own files and use
  a couple of function pointers in the softc that the commuication
  protocols setup in their own attach routine.
- Add support for the SSIF interface (talking to IPMI over SMBus).
- Add an ACPI attachment.
- Add a PCI attachment that attaches to devices with the IPMI interface
  subclass.
- Split the ISA attachment out into its own file: ipmi_isa.c.
- Change the code to probe the SMBIOS table for an IPMI entry to just use
  pmap_mapbios() to map the table in rather than trying to setup a fake
  resource on an isa device and then activating the resource to map in the
  table.
- Make bus attachments leaner by adding attach functions for each
  communication interface (ipmi_kcs_attach(), ipmi_smic_attach(), etc.)
  that setup per-interface data.
- Formalize the model used by the driver to handle requests by adding an
  explicit struct ipmi_request object that holds the state of a given
  request and reply for the entire lifetime of the request.  By bundling
  the request into an object, it is easier to add retry logic to the various
  communication backends (as well as eventually support BT mode which uses
  a slightly different message format than KCS, SMIC, and SSIF).
- Add a per-softc lock and remove D_NEEDGIANT as the driver is now MPSAFE.
- Add 32-bit compatibility ioctl shims so you can use a 32-bit ipmitool
  on FreeBSD/amd64.
- Add ipmi(4) to i386 and amd64 NOTES.

Submitted by:	ambrisko (large portions of 2 and 3)
Sponsored by:	IronPort Systems, Yahoo!
MFC after:	6 days
2006-09-22 22:11:29 +00:00
Ruslan Ermilov
43e5f4b8cd Update a comment about M_VLANTAG. 2006-09-22 19:50:04 +00:00
David Xu
cda9a0d1c2 Add compatible code to let 32bit libthr work on 64bit kernel. 2006-09-22 15:04:28 +00:00
David Xu
1eec02f538 Add umtx support for 32bit process on AMD64 machine. 2006-09-22 00:52:54 +00:00
David Xu
cca0a557dd Regenerate. 2006-09-21 04:19:48 +00:00
David Xu
73fa3e5b88 Replace system call thr_getscheduler, thr_setscheduler, thr_setschedparam
with rtprio_thread, while rtprio system call is for process only, the new
system call rtprio_thread is responsible for LWP.
2006-09-21 04:18:46 +00:00
Alexander Kabaev
fa034a084b Use __builtin_offsetof for GCC 4.1. 2006-09-21 01:38:58 +00:00
Robert Watson
5702e0965e Declare security and security.bsd sysctl hierarchies in sysctl.h along
with other commonly used sysctl name spaces, rather than declaring them
all over the place.

MFC after:	1 month
Sponsored by:	nCircle Network Security, Inc.
2006-09-17 20:00:36 +00:00
Andre Oppermann
78ba57b9e1 Move ethernet VLAN tags from mtags to its own mbuf packet header field
m_pkthdr.ether_vlan.  The presence of the M_VLANTAG flag on the mbuf
signifies the presence and validity of its content.

Drivers that support hardware VLAN tag stripping fill in the received
VLAN tag (containing both vlan and priority information) into the
ether_vtag mbuf packet header field:

	m->m_pkthdr.ether_vtag = vlan_id;	/* ntohs()? */
	m->m_flags |= M_VLANTAG;

to mark the packet m with the specified VLAN tag.

On output the driver should check the mbuf for the M_VLANTAG flag to
see if a VLAN tag is present and valid:

	if (m->m_flags & M_VLANTAG) {
		... = m->m_pkthdr.ether_vtag;	/* htons()? */
		... pass tag to hardware ...
	}

VLAN tags are stored in host byte order.  Byte swapping may be necessary.

(Note: This driver conversion was mechanic and did not add or remove any
byte swapping in the drivers.)

Remove zone_mtag_vlan UMA zone and MTAG_VLAN definition.  No more tag
memory allocation have to be done.

Reviewed by:	thompsa, yar
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-09-17 13:33:30 +00:00
Robert Watson
da7cbdc2b3 Regenerate. 2006-09-17 13:29:36 +00:00
Mohan Srinivasan
7d7d9e2242 Fixes up the handling of shared vnode lock lookups in the NFS client,
adds a FS type specific flag indicating that the FS supports shared
vnode lock lookups, adds some logic in vfs_lookup.c to test this flag
and set lock flags appropriately.

- amd on 6.x is a non-starter (without this change). Using amd under
  heavy load results in a deadlock (with cascading vnode locks all the
  way to the root) very quickly.
- This change should also fix the more general problem of cascading
  vnode deadlocks when an NFS server goes down.

Ideally, we wouldn't need these changes, as enabling shared vnode lock
lookups globally would work. Unfortunately, UFS, for example isn't
ready for shared vnode lock lookups, crashing pretty quickly.

This change is the result of discussions with Stephan Uphoff (ups@).

Reviewed by:	ups@
2006-09-13 18:39:09 +00:00
Christian S.J. Peron
d94f2a68f8 Introduce a new entry point, mac_create_mbuf_from_firewall. This entry point
exists to allow the mandatory access control policy to properly initialize
mbufs generated by the firewall. An example where this might happen is keep
alive packets, or ICMP error packets in response to other packets.

This takes care of kernel panics associated with un-initialize mbuf labels
when the firewall generates packets.

[1] I modified this patch from it's original version, the initial patch
    introduced a number of entry points which were programmatically
    equivalent. So I introduced only one. Instead, we should leverage
    mac_create_mbuf_netlayer() which is used for similar situations,
    an example being icmp_error()

    This will minimize the impact associated with the MFC

Submitted by:	mlaier [1]
MFC after:	1 week

This is a RELENG_6 candidate
2006-09-12 04:25:13 +00:00
John Baldwin
bd4b6eb964 Add prototype for bus_generic_add_child() missed in previous commit. 2006-09-11 19:42:27 +00:00
Robert Watson
198e7d90f9 Add struct msg to the forwarded declared data structures in mac_policy.h.
Obtained from:	TrustedBSD Project
2006-09-09 16:35:44 +00:00
Konstantin Belousov
6c71207db5 Bump __FreeBSD_version for rev. 1.117 of libexec/rtld-elf/rtld.c.
Requested by:	jkim
Approved by:	kan (mentor)
2006-09-09 04:41:40 +00:00
Andre Oppermann
2a6b443f9f Reserve a precious 16bit gap in the mbuf pkthdr struct for ethernet 802.1pq
vlan tags.
2006-09-06 22:33:49 +00:00
Andre Oppermann
233dcce118 First step of TSO (TCP segmentation offload) support in our network stack.
o add IFCAP_TSO[46] for drivers to announce this capability for IPv4 and IPv6
 o add CSUM_TSO flag to mbuf pkthdr csum_flags field
 o add tso_segsz field to mbuf pkthdr
 o enhance ip_output() packet length check to allow for large TSO packets
 o extend tcp_maxmtu[46]() with a flag pointer to pass interface capabilities
 o adjust all callers of tcp_maxmtu[46]() accordingly

Discussed on:	-current, -net
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-09-06 21:51:59 +00:00
Andre Oppermann
60d4ab7abb Improve description of if_capabilities, if_capenable and ifi_hwassist.
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-09-06 18:06:04 +00:00
Sam Leffler
50bdd72083 bump version for libpcap+tcpdump imports 2006-09-04 21:49:31 +00:00
Robert Watson
89ede214c7 Regenerate for updated audit event identifiers. 2006-09-03 15:11:13 +00:00
Robert Watson
863ccba5d5 Regenerate. 2006-09-03 13:48:48 +00:00
Matt Jacob
048963977c Bump __FreeBSD_version by one due to newbus changes. 2006-09-03 01:12:34 +00:00
John-Mark Gurney
a890c1dd21 up the default msgbuf limit to 64k.. a verbose boot on i386 on modern
hardware returns almost 48k of data...  to change the default per platform,
it should be done in DEFAULTS, not here...

Discussed on:	-arch
2006-09-03 00:33:19 +00:00
John-Mark Gurney
378f231e7d add a newbus method for obtaining the bus's bus_dma_tag_t... This is
required by arches like sparc64 (not yet implemented) and sun4v where there
are seperate IOMMU's for each PCI bus...  For all other arches, it will
end up returning NULL, which makes it a no-op...

Convert a few drivers (the ones we've been working w/ on sun4v) to the
new convection...  Eventually all drivers will need to replace the parent
tag of NULL, w/ bus_get_dma_tag(dev), though dev is usually different for
each driver, and will require hand inspection...

Reviewed by:	scottl (earlier version)
2006-09-03 00:27:42 +00:00
John-Mark Gurney
01646adbc8 Break out typedefs from bus_dma.h to _bus_dma.h so that we can get the
typedef for bus_dma_tag_t in sys/bus.h w/o poluting the namespace...
This is in preperation for adding bus_get_dma_tag to sys/bus.h...
2006-09-03 00:26:17 +00:00
John Baldwin
fa4a2ffd58 The _sx_assert() prototype should exist if either of INVARIANTS or
INVARIANT_SUPPORT is defined so you can build a kernel with
INVARIANT_SUPPORT, but build a module with just INVARIANTS on.

MFC after:	3 days
Reported by:	kuriyama
2006-08-29 20:36:33 +00:00
David Xu
cd42ca3c27 Regenerate. 2006-08-28 04:28:25 +00:00
David Xu
d10183d94d This is initial version of POSIX priority mutex support, a new userland
mutex structure is added as following:
struct umutex {
        __lwpid_t       m_owner;
        uint32_t        m_flags;
        uint32_t        m_ceilings[2];
        uint32_t        m_spare[4];
};
The m_owner represents owner thread, it is a thread id, in non-contested
case, userland can simply use atomic_cmpset_int to lock the mutex, if the
mutex is contested, high order bit will be set, and userland should do locking
and unlocking via kernel syscall. Flag UMUTEX_PRIO_INHERIT represents
pthread's PTHREAD_PRIO_INHERIT mutex, which when contention happens, kernel
should do priority propagating. Flag UMUTEX_PRIO_PROTECT indicates it is
pthread's PTHREAD_PRIO_PROTECT mutex, userland should initialize m_owner
to contested state UMUTEX_CONTESTED, then atomic_cmpset_int will be failure
and kernel syscall should be invoked to do locking, this becauses
for such a mutex, kernel should always boost the thread's priority before
it can lock the mutex, m_ceilings is used by PTHREAD_PRIO_PROTECT mutex,
the first element is used to boost thread's priority when it locked the mutex,
second element is used when the mutex is unlocked, the PTHREAD_PRIO_PROTECT
mutex's link list is kept in userland, the m_ceiling[1] is managed by thread
library so kernel needn't allocate memory to keep the link list, when such
a mutex is unlocked, kernel reset m_owner to UMUTEX_CONTESTED.
Flag USYNC_PROCESS_SHARED indicate if the synchronization object is process
shared, if the flag is not set, it saves a vm_map_lookup() call.

The umtx chain is still used as a sleep queue, when a thread is blocked on
PTHREAD_PRIO_INHERIT mutex, a umtx_pi is allocated to support priority
propagating, it is dynamically allocated and reference count is used,
it is not optimized but works well in my tests, while the umtx chain has
its own locking protocol, the priority propagating protocol are all protected
by sched_lock because priority propagating function is called with sched_lock
held from scheduler.

No visible performance degradation is found which these changes. Some parameter
names in _umtx_op syscall are renamed.
2006-08-28 04:24:51 +00:00
David Xu
66e1c26dba Implement casuword32, compare and set user integer, thank Marcel Moolenarr
who wrote the IA64 version of casuword32.
2006-08-28 02:28:15 +00:00
David Xu
3db720fdce Add user priority loaning code to support priority propagation for
1:1 threading's POSIX priority mutexes, the code is no-op unless
priority-aware umtx code is committed.
2006-08-25 06:12:53 +00:00
David Xu
31135ac304 Add member kg_base_user_pri and flag TDF_UBORROWING, they will be used
to support userland priority propagation for 1:1 threading.
2006-08-25 03:15:27 +00:00
Roman Kurakin
9eb5ad2319 Fix typo in a comment: DEFINE_CLASSx => DEFINE_CLASS_x.
MFC after: 1 week
2006-08-24 21:09:39 +00:00
Alan Cox
b276ae6f6a Add _vm_stats and _vm_stats_misc to the sysctl declarations in sysctl.h and
eliminate their declarations from various source files.
2006-08-21 06:27:28 +00:00
Maxim Konovalov
0e33a87e97 o Re-word a comment.
PR:		kern/102127
Submitted by:	Eric Anderson
2006-08-16 09:34:56 +00:00
John Baldwin
462a7add8e Add a new 'show sleepchain' ddb command similar to 'show lockchain' except
that it operates on lockmgr and sx locks.  This can be useful for tracking
down vnode deadlocks in VFS for example.  Note that this command is a bit
more fragile than 'show lockchain' as we have to poke around at the
wait channel of a thread to see if it points to either a struct lock or
a condition variable inside of a struct sx.  If td_wchan points to
something unmapped, then this command will terminate early due to a fault,
but no harm will be done.
2006-08-15 18:29:01 +00:00
John Baldwin
57d6c87c0e Use SYS_AUE_<syscallname> to include the appropriate audit event identifier
for syscalls in kld's, even when compiled into the kernel statically.
Note that since this hardcodes the SYS_ prefix SYSCALL_MODULE_HELPER() now
only works for native ABI system calls.  Those are the only ones that
used the macro anyway, and I chose to not require a second argument to the
macro to specify the prefix or audit event directly.
2006-08-15 17:42:14 +00:00
John Baldwin
f8f1f7fb85 Regen to propogate <prefix>_AUE_<mumble> changes as well as the earlier
systrace changes.
2006-08-15 17:37:01 +00:00
Alexander Leidinger
993182e57c - Change process_exec function handlers prototype to include struct
image_params arg.
- Change struct image_params to include struct sysentvec pointer and
  initialize it.
- Change all consumers of process_exit/process_exec eventhandlers to
  new prototypes (includes splitting up into distinct exec/exit functions).
- Add eventhandler to userret.

Sponsored by:		Google SoC 2006
Submitted by:		rdivacky
Parts suggested by:	jhb (on hackers@)
2006-08-15 12:10:57 +00:00
David E. O'Brien
c157a036a9 Add an extension to the UINT & ULONG types. The XINT & XLONG types behave
the same, except sysctl(8) will print out the values in hex.
2006-08-12 23:33:10 +00:00
Pawel Jakub Dawidek
73c0c41140 Add strstr() function to the libkern. 2006-08-12 15:28:39 +00:00
Robert Watson
e4445a031f Move definition of UNIX domain socket protosw and domain entries from
uipc_proto.c to uipc_usrreq.c, making localdomain static.  Remove
uipc_proto.c as it's no longer used.  With this change, UNIX domain
sockets are entirely encapsulated in uipc_usrreq.c.
2006-08-07 12:02:43 +00:00
Giorgos Keramidas
6d35e42f0c Spell "determine" correctly.
Reviewed by:	jb
2006-08-07 10:33:07 +00:00
Robert Watson
14f212e215 Make mpo_associate_nfsd_label() return void, not int, to match
mac_associate_nfsd_label().

Head nod:	csjp
2006-08-06 16:56:15 +00:00
John Birrell
6d9b0007ca Add OpenSolaris compatibility definitions which are only visible if
_SOLARIS_C_SOURCE is defined.

The _OpenSolaris_version is set to match the last import of the OpenSolaris
tar ball and is based on the date in that file name.
2006-08-05 20:35:11 +00:00
John Birrell
3255d30131 Add OpenSolaris compatibility definitions for stat64 and fstat64 which
are only visible if _SOLARIS_C_SOURCE is defined.

Note thar FreeBSD stat() and fstat() are 64-bit functions now and Solaris
still persists with both 32- and 64-bit versions. When I query this, I am
referred to: <http://www.unix.org/version2/whatsnew/lfs20mar.html>.
But when you look at the main page of unix.org you will see that the
Single Unix Specification <http://www.unix.org/version3/> is the most
recent standard they are pushing. And there are no stat64() fstat64()
functions defined there. I guess this just goes to prove that there are so
many standards, you can take your pick.
2006-08-04 23:47:30 +00:00
John Birrell
30436fb40d Add a type definition for the cyclic timer callback function.
The cyclic timer is a high-resolution timer allows timeouts at nanosecond
intervals where hardware support is available. Typically on i386 there
is no HPET (high performance event timer) like the one Intel started
specifying some time in 2004, so the best that tye cyclic timer subsystem
can do is run at Hz.

The cyclic timer code itself is ported from OpenSolaris and is covered
by the CDDL, so it is only loaded as a module. This function type definition
is used in machine-dependent code to provide a hook for the module to
register it's callback function.
2006-08-04 23:31:16 +00:00
John Birrell
3632bc0ae1 Add some OpenSolaris compatibility definitions which are only visible
if _SOLARIS_C_SOURCE is defined.

Add two function prototypes which are required to feed high-resolution
times to DTrace. DTrace requires it's own functions with the dtrace_
prefix so that it knows not to try and trace them. This is a rule that
code executed from the DTrace probe context must obey.

The two functions are only be compiled if the KDTRACE option is defined
to compile in kernel support for loading the DTrace modules.
2006-08-04 23:10:11 +00:00
John Birrell
1d70d6cb78 Add some compatibility definitions for OpenSolaris source.
These are only defined if _SOLARIS_C_SOURCE is defined, so they don't
polute the FreeBSD compile environment.

They are used all over the OpenSolaris source, so defining them here
removes the need to continually resolve differences in FreeBSD system
haeder files from Solaris header files.
2006-08-04 22:54:10 +00:00
John Birrell
d80c69964b Add fields to struct sysent to support the DTrace syscall provider called
systrace.

Another file called systrace_args.c is generated. This will be compiled
into systrace and is used to map the syscall arguments into the 64-bit
parameter array.
2006-08-03 05:26:51 +00:00
David Xu
4657002e65 don't include sys/thr.h and sys/umtx.h, it is unnecessary. 2006-08-02 07:38:59 +00:00
John Baldwin
03e161fdb1 Make system call modules a bit more robust:
- If we fail to register the system call during MOD_LOAD, then note that
  so that we don't try to deregister it or invoke the chained event handler
  during the subsequent MOD_UNLOAD event.  Doing the deregister when the
  register failed could result in trashing system call entries.
- Add a SI_SUB_SYSCALLS just before starting up init and use that to
  register syscall modules instead of SI_SUB_DRIVERS.  Registering system
  calls as late as possible increases the chances that any other module
  event handlers or SYSINITs in a module are executed to initialize the
  data in a kld before a syscall dependent on that data is able to be
  invoked.

MFC after:	3 days
2006-08-01 16:32:20 +00:00
Robert Watson
eaa6dfbcc2 Reimplement socket buffer tear-down in sofree(): as the socket is no
longer referenced by other threads (hence our freeing it), we don't need
to set the can't send and can't receive flags, wake up the consumers,
perform two levels of locking, etc.  Implement a fast-path teardown,
sbdestroy(), which flushes and releases each socket buffer.  A manual
dom_dispose of the receive buffer is still required explicitly to GC
any in-flight file descriptors, etc, before flushing the buffer.

This results in a 9% UP performance improvement and 16% SMP performance
improvement on a tight loop of socket();close(); in micro-benchmarking,
but will likely also affect CPU-bound macro-benchmark performance.
2006-08-01 10:30:26 +00:00
Simon L. B. Nielsen
f57d6668ad Bump __FreeBSD_version for OpenSSL 0.9.8b import. 2006-07-29 19:44:07 +00:00
John Baldwin
cb76d9b05c Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is
now back to just being an argument count.
2006-07-28 20:22:58 +00:00
John Baldwin
91ce2694d1 Regen for MPSAFE flag removal. 2006-07-28 19:08:37 +00:00
John Baldwin
186abbd727 Write a magic value into mtx_lock when destroying a mutex that will force
all other mtx_lock() operations to block.  Previously, when the mutex was
destroyed, it would still have a valid value in mtx_lock(): either the
unowned cookie, which would allow a subsequent mtx_lock() to succeed, or a
pointer to the thread who destroyed the mutex if the mutex was locked when
it was destroyed.

MFC after:	3 days
2006-07-27 19:58:18 +00:00
John Baldwin
f30e89ced3 Fix a file descriptor race I reintroduced when I split accept1() up into
kern_accept() and accept1().  If another thread closed the new file
descriptor and the first thread later got an error trying to copyout the
socket address, then it would attempt to close the wrong file object.  To
fix, add a struct file ** argument to kern_accept().  If it is non-NULL,
then on success kern_accept() will store a pointer to the new file object
there and not release any of the references.  It is up to the calling code
to drop the references appropriately (including a call to fdclose() in case
of error to safely handle the aforementioned race).  While I'm at it, go
ahead and fix the svr4 streams code to not leak the accept fd if it gets an
error trying to copyout the streams structures.
2006-07-27 19:54:41 +00:00
Sam Leffler
246b546762 add support for 802.11 packet injection via bpf
Together with:	Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Reviewed by:	arch@
MFC after:	1 month
2006-07-26 03:15:16 +00:00
Robert Watson
b0668f7151 soreceive_generic(), and sopoll_generic(). Add new functions sosend(),
soreceive(), and sopoll(), which are wrappers for pru_sosend,
pru_soreceive, and pru_sopoll, and are now used univerally by socket
consumers rather than either directly invoking the old so*() functions
or directly invoking the protocol switch method (about an even split
prior to this commit).

This completes an architectural change that was begun in 1996 to permit
protocols to provide substitute implementations, as now used by UDP.
Consumers now uniformly invoke sosend(), soreceive(), and sopoll() to
perform these operations on sockets -- in particular, distributed file
systems and socket system calls.

Architectural head nod:	sam, gnn, wollman
2006-07-24 15:20:08 +00:00
Robert Watson
ad75f55693 Remove MT_FTABLE, as it's no longer used.
Comment that many stats in mbstat are now not used, as libmemstat and
UMA stats are used.
2006-07-24 01:49:57 +00:00
Robert Watson
cac4afa816 Garbage collect #if 0'd MT_ mbuf types, as they are no longer used, and
there are no plans to re-introduce them.
2006-07-24 01:14:05 +00:00
Robert Watson
f9f4beac68 Tweak so_gencnt comment: it was once last, but that is no longer the
case.
2006-07-24 01:05:36 +00:00
Robert Watson
03b8ff0b8f Tweak comment for so_head: it is a pointer to the listen socket, rather
than the accept socket.
2006-07-24 01:02:07 +00:00
Robert Watson
6581a6d7a4 Fix a spelling error in a comment.
Found with:	mckusick's code walkthrough DVDs
2006-07-24 00:33:24 +00:00
Robert Watson
07b1b82e55 Comment extended attribute name space constants. 2006-07-23 19:26:54 +00:00
Robert Watson
ba68fd5b2f Improve comments for label data structure. 2006-07-23 19:26:32 +00:00
Robert Watson
4f1f0ef523 Add two new unpcb flags, UNP_BINDING and UNP_CONNECTING, which will be
used to mark UNIX domain sockets as being in the process of binding or
connecting.  Use these to prevent simultaneous bind or connect
operations by multiple threads or processes on the same socket at the
same time, which closes race conditions present in the UNIX domain
socket implementation since inception.
2006-07-23 12:01:14 +00:00
Warner Losh
16c84e5e51 Add new kernel config option. NO_SYSCTL_DESCR to omit the descriptions for
the sysctls.  This saves a lot of space in the resulting kernel which is
important for embedded systems.  This change was done in a ABI compatible
way.  The pointer is still there, it just points to an empty string instead
of the description.

MFC After: 3 days
2006-07-18 17:00:51 +00:00
Poul-Henning Kamp
eb184de351 Add some casts to make these files more C++ compatible.
Submitted by:	Kristen Nielsen <krn@krn.dk>
2006-07-17 09:05:21 +00:00
Alexander Leidinger
0fa7ab6a31 - Connect the snd_emu10kx driver to the build. [1]
- Bump __FreeBSD_version, no need to build the port now.

Submitted by:	Yuriy Tsibizov <Yuriy.Tsibizov@gfk.ru> [1]
2006-07-15 20:22:40 +00:00
Robert Watson
7455136bbc Define prototype for pru_close, which in the future will notify the
protocol of a socket close event distinct from a detach event, which
will (in a future commit) become aligned with pru_abort, which will
also be a notification of close prior to detach.  Add prurequests event
for close, as well as patch up some existing missing ones.
2006-07-14 09:44:28 +00:00
David Xu
ba493ceb6b regenerate. 2006-07-13 06:32:55 +00:00
David Xu
60088160c9 Add syscalls thr_setscheduler, thr_getscheduler, and thr_setschedparam,
these syscalls are designed to set thread's scheduling parameters and
policy, because each syscall contains a size parameter, it is possible
to support future scheduling option, e.g SCHED_SPORADIC, this option
needs other fields in structure sched_param, current they are not
avaiblable.
2006-07-13 06:26:43 +00:00
Robert Watson
5908c617bb Several protocol switch functions (pru_abort, pru_detach, pru_sosetlabel)
return void, so don't implement no-op versions of these functions.
Instead, consistently check if those switch pointers are NULL before
invoking them.
2006-07-11 23:18:28 +00:00
John Baldwin
90aff9de2d Regen. 2006-07-11 20:55:23 +00:00
David Xu
65343c788c Extended the POSIX scheduler APIs to accept lwpid as well, we've already
done this in ptrace syscall, when a pid is large than PID_MAX, the syscall
will search a thread in current process. It permits 1:1 thread library to
get and set a thread's scheduler attributes.
2006-07-11 06:11:34 +00:00
David Xu
a0712c99d0 Add POSIX scheduler parameters support to thr_new syscall, this permits
privileged process to create realtime thread.
2006-07-11 05:34:35 +00:00
John Baldwin
c870740e09 - Split out kern_accept(), kern_getpeername(), and kern_getsockname() for
use by ABI emulators.
- Alter the interface of kern_recvit() somewhat.  Specifically, go ahead
  and hard code UIO_USERSPACE in the uio as that's what all the callers
  specify.  In place, add a new uioseg to indicate what type of pointer
  is in mp->msg_name.  Previously it was always a userland address, but
  ABI emulators may pass in kernel-side sockaddrs.  Also, remove the
  namelenp field and instead require the two places that used it to
  explicitly copy mp->msg_namelen out to userland.
- Use the patched kern_recvit() to replace svr4_recvit() and the stock
  kern_sendit() to replace svr4_sendit().
- Use kern_bind() instead of stackgap use in ti_bind().
- Use kern_getpeername() and kern_getsockname() instead of stackgap in
  svr4_stream_ti_ioctl().
- Use kern_connect() instead of stackgap in svr4_do_putmsg().
- Use kern_getpeername() and kern_accept() instead of stackgap in
  svr4_do_getmsg().
- Retire the stackgap from SVR4 compat as it is no longer used.
2006-07-10 21:38:17 +00:00
Scott Long
e3546a7549 Use a sleep mutex instead of an sx lock for the kernel environment. This
allows greater flexibility for drivers that want to query the environment.

Reviewed by: jhb, mux
2006-07-09 21:42:58 +00:00
Sam Leffler
6b7330e2d4 Revise network interface cloning to take an optional opaque
parameter that can specify configuration parameters:
o rev cloner api's to add optional parameter block
o add SIOCCREATE2 that accepts parameter data
o rev vlan support to use new api (maintain old code)

Reviewed by:	arch@
2006-07-09 06:04:01 +00:00
John Baldwin
d9f4623307 - Split ioctl() up into ioctl() and kern_ioctl(). The kern_ioctl() assumes
that the 'data' pointer is already setup to point to a valid KVM buffer
  or contains the copied-in data from userland as appropriate (ioctl(2)
  still does this).  kern_ioctl() takes care of looking up a file pointer,
  implementing FIONCLEX and FIOCLEX, and calling fi_ioctl().
- Use kern_ioctl() to implement xenix_rdchk() instead of using the stackgap
  and mark xenix_rdchk() MPSAFE.
2006-07-08 20:12:14 +00:00
John Baldwin
c1cccebe8b Add a kern_close() so that the ABIs can close a file descriptor w/o having
to populate a close_args struct and change some of the places that do.
2006-07-08 20:03:39 +00:00
John Baldwin
b1ee5b654d Rework kern_semctl a bit to always assume the UIO_SYSSPACE case. This
mostly consists of pushing a few copyin's and copyout's up into
__semctl() as all the other callers were already doing the UIO_SYSSPACE
case.  This also changes kern_semctl() to set the return value in a passed
in pointer to a register_t rather than td->td_retval[0] directly so that
callers can only set td->td_retval[0] if all the various copyout's succeed.

As a result of these changes, kern_semctl() no longer does copyin/copyout
(except for GETALL/SETALL) so simplify the locking to acquire the semakptr
mutex before the MAC check and hold it all the way until the end of the
big switch statement.  The GETALL/SETALL cases have to temporarily drop it
while they do copyin/malloc and copyout.  Also, simplify the SETALL case to
remove handling for a non-existent race condition.
2006-07-08 19:51:38 +00:00
Warner Losh
db2bc1bb82 Create bus_enumerate_hinted_children. This routine will allow drivers
to use the hinted child system.  Bus drivers that use this need to
implmenet the bus_hinted_child method, where they actually add the
child to their bus, as they see fit.  The bus is repsonsible for
getting the attribtues for the child, adding it in the right order,
etc.  ISA hinting will be updated to use this method.

MFC After: 3 days
2006-07-08 17:06:15 +00:00
John Baldwin
3cb83e714d Add kern_setgroups() and kern_getgroups() and use them to implement
ibcs2_[gs]etgroups() rather than using the stackgap.  This also makes
ibcs2_[gs]etgroups() MPSAFE.  Also, it cleans up one bit of weirdness in
the old setgroups() where it allocated an entire credential just so it had
a place to copy the group list into.  Now setgroups just allocates a
NGROUPS_MAX array on the stack that it copies into and then passes to
kern_setgroups().
2006-07-06 21:32:20 +00:00
Wayne Salamon
761aed363f Regen the system calls files, picking up the extended attr events, and some
mount-related changes done previously.

Approved by: rwatson (mentor)
2006-07-05 19:24:14 +00:00
John Baldwin
49d409a108 - Add a kern_semctl() helper function for __semctl(). It accepts a pointer
to a copied-in copy of the 'union semun' and a uioseg to indicate which
  memory space the 'buf' pointer of the union points to.  This is then used
  in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap.
- Mark linux_ipc() and svr4_sys_semsys() MPSAFE.
2006-06-27 18:28:50 +00:00
John Baldwin
d90db84797 Fix the name of the data set item for the SYSUNINIT in RW_SYSINIT to use
'rw' instead of 'mtx'.  This should only be a cosmetic change rather than
a functional one.

Submitted by:	Alex Lyashkov <shadow AT itt dot net dot ru>
2006-06-23 19:36:50 +00:00
Marcel Moolenaar
97b94c0e72 Add the UUID of Apple's HFS file system as can be found in the Intel
based Macs.
2006-06-22 22:11:12 +00:00
John Baldwin
40bdac68d8 Add a sx_xlocked() macro which returns true if the current thread holds an
exclusive lock on the specified sx lock.
2006-06-21 20:38:29 +00:00
John Baldwin
880eb8c1ef Add a new section in this file for functions that are only exported by the
linker for use in the linker class handlers.  Move linker_add_class(),
linker_file_unload(), linker_load_dependencies(), and linker_make_file()
into this section.
2006-06-20 20:59:55 +00:00
John Baldwin
aeeb017bd6 - Push Giant down into linker_reference_module().
- Add a new function linker_release_module() as a more intuitive complement
  to linker_reference_module() that wraps linker_file_unload().
  linker_release_module() can either take the module name and version info
  passed to linker_reference_module() or it can accept the linker file
  object returned by linker_reference_module().
2006-06-20 20:54:13 +00:00
John Baldwin
f462ce3edd Make linker_find_file_by_name() and linker_find_file_by_id() static to
simplify linker locking.  The only external consumers now use
linker_file_foreach().
2006-06-20 20:41:15 +00:00
John Baldwin
932151064a - Add a new linker_file_foreach() function that walks the list of linker
file objects calling a user-specified predicate function on each object.
  The iteration terminates either when the entire list has been iterated
  over or the predicate function returns a non-zero value.
  linker_file_foreach() returns the value returned by the last invocation
  of the predicate function.  It also accepts a void * context pointer that
  is passed to the predicate function as well.  Using an iterator function
  avoids exposing linker internals to the rest of the kernel making locking
  simpler.
- Use linker_file_foreach() instead of walking the list of linker files
  manually to lookup ndis files in ndis(4).
- Use linker_file_foreach() to implement linker_hwpmc_list_objects().
2006-06-20 20:37:17 +00:00
John Baldwin
aaf3170501 Make linker_file_add_dependency() and linker_load_module() static since
only the linker uses them.
2006-06-20 20:18:42 +00:00
Max Laier
0dad3f0e15 Import interface groups from OpenBSD. This allows to group interfaces in
order to - for example - apply firewall rules to a whole group of
interfaces.  This is required for importing pf from OpenBSD 3.9

Obtained from:	OpenBSD (with changes)
Discussed on:	-net (back in April)
2006-06-19 22:20:45 +00:00
Robert Watson
cd3a3a269f Remove sbinsertoob(), sbinsertoob_locked(). They violate (and have
basically always violated) invariannts of soreceive(), which assume
that the first mbuf pointer in a receive socket buffer can't change
while the SB_LOCK sleepable lock is held on the socket buffer,
which is precisely what these functions do.  No current protocols
invoke these functions, and removing them will help discourage them
from ever being used.  I should have removed them years ago, but
lost track of it.

MFC after:	1 week
Prodded almost by accident by:	peter
2006-06-17 22:48:34 +00:00
Robert Watson
7ffadf3508 Remove extra blank line below comment.
MFC after:	1 week
2006-06-16 22:31:56 +00:00
David Xu
36ec198bd5 Add scheduler API sched_relinquish(), the API is used to implement
yield() and sched_yield() syscalls. Every scheduler has its own way
to relinquish cpu, the ULE and CORE schedulers have two internal run-
queues, a timesharing thread which calls yield() syscall should be
moved to inactive queue.
2006-06-15 06:37:39 +00:00
John Baldwin
d53885879d - Add a kern_kldload() that is most of the previous kldload() and push
Giant down in it.
- Push Giant down in kern_kldunload() and reorganize it slightly to avoid
  using gotos.  Also, expose this function to the rest of the kernel.
2006-06-13 21:28:18 +00:00
David Xu
b41f1452d9 Add scheduler CORE, the work I have done half a year ago, recent,
I picked it up again. The scheduler is forked from ULE, but the
algorithm to detect an interactive process is almost completely
different with ULE, it comes from Linux paper "Understanding the
Linux 2.6.8.1 CPU Scheduler", although I still use same word
"score" as a priority boost in ULE scheduler.

Briefly, the scheduler has following characteristic:
1. Timesharing process's nice value is seriously respected,
   timeslice and interaction detecting algorithm are based
   on nice value.
2. per-cpu scheduling queue and load balancing.
3. O(1) scheduling.
4. Some cpu affinity code in wakeup path.
5. Support POSIX SCHED_FIFO and SCHED_RR.
Unlike scheduler 4BSD and ULE which using fuzzy RQ_PPQ, the scheduler
uses 256 priority queues. Unlike ULE which using pull and push, the
scheduelr uses pull method, the main reason is to let relative idle
cpu do the work, but current the whole scheduler is protected by the
big sched_lock, so the benefit is not visible, it really can be worse
than nothing because all other cpu are locked out when we are doing
balancing work, which the 4BSD scheduelr does not have this problem.
The scheduler does not support hyperthreading very well, in fact,
the scheduler does not make the difference between physical CPU and
logical CPU, this should be improved in feature. The scheduler has
priority inversion problem on MP machine, it is not good for
realtime scheduling, it can cause realtime process starving.
As a result, it seems the MySQL super-smack runs better on my
Pentium-D machine when using libthr, despite on UP or SMP kernel.
2006-06-13 13:12:56 +00:00
Warner Losh
ccdc8d9bff Add a convenience function rman_init_from_resource for initializing
a rman from a resource.

Also, include _bus.h since the implementation of bus_space isn't
needed here, just the definitions of the types.
2006-06-12 04:06:21 +00:00
Ian Dowse
caea21fc85 Move the new flags field to the end of the structure to maintain
ABI compatibility.

Suggested by:	mlaier (and forgotten by me)
2006-06-10 19:23:49 +00:00
Ian Dowse
eb1030c4fd Keep firmware images on the list until they have been unregistered
with firmware_unregister(). Previously when the last driver reference
had been dropped we would clear the list entry under the assumption
that the firmware module was about to be unloaded, but this was not
true if the firmware image had been loaded manually with kldload.

This makes it possible to manually kldload firmware images as a
workaround for drivers such as ipw that attempt to load firmware
while resuming after a suspend.

Reviewed by:	mlaier (an earlier version of the patch)
2006-06-10 17:04:07 +00:00
Robert Watson
b37ffd3189 Move some functions and definitions from uipc_socket2.c to uipc_socket.c:
- Move sonewconn(), which creates new sockets for incoming connections on
  listen sockets, so that all socket allocate code is together in
  uipc_socket.c.

- Move 'maxsockets' and associated sysctls to uipc_socket.c with the
  socket allocation code.

- Move kern.ipc sysctl node to uipc_socket.c, add a SYSCTL_DECL() for it
  to sysctl.h and remove lots of scattered implementations in various
  IPC modules.

- Sort sodealloc() after soalloc() in uipc_socket.c for dependency order
  reasons.  Statisticize soalloc() and sodealloc() as they are now
  required only in uipc_socket.c, and are internal to the socket
  implementation.

After this change, socket allocation and deallocation is entirely
centralized in one file, and uipc_socket2.c consists entirely of socket
buffer manipulation and default protocol switch functions.

MFC after:	1 month
2006-06-10 14:34:07 +00:00
Robert Watson
6fbb9cf860 Update comments in struct protosw to reflect changing times:
- Between 1996 and 1997, wollman eliminated pr_usrreq() and replaced it
  with direct function pointers.  Update comment to reflect these changes.

- In 2003, I added pru_sosetlabel().  Update comment to reflect this
  change.

MFC after:	1 week
2006-06-07 13:09:04 +00:00
John Baldwin
49b94bfc54 Bah, fix fat finger in last. Invert the ~ on MTX_FLAGMASK as it's
non-intuitive for the ~ to be built into the mask.  All the users now
explicitly ~ the mask.  In addition, add MTX_UNOWNED to the mask even
though it technically isn't a flag.  This should unbreak mtx_owner().

Quickly spotted by:	kris
2006-06-03 21:11:33 +00:00
Maxim Konovalov
e3243f2bf4 o Correct URL to ELF header documantation.
PR:		kern/98213
Submitted by:	Robert Gogolok
2006-05-31 13:47:32 +00:00
Ed Maste
af8d1678e1 Add sanity checking for QUEUE(3) TAILQs under INVARIANTS (similar to
the LIST checks).  Races may lead to list corruption, which can be
difficult to unravel in a post-mortem analysis.  These checks verify
that the prev and next pointers are consistent when inserting or
removing elements, thus catching any corruption earlier.
2006-05-26 18:17:53 +00:00
Poul-Henning Kamp
56de0c9a17 Add new CONSOLE_DRIVER macro which takes just the name of the console
and constructs the member function names with CPPs' ##.

Do not include the checkc entry as it is going away.
2006-05-26 10:58:39 +00:00
Poul-Henning Kamp
16b1613a31 GC the cn_dbctl_t hook for consoles, it is unused.
This used to make syscons switch to vty0 when we entered DDB but this
was lost in the KDB shuffle.  We may want to bring it back down the road
but it should be done by calling cn_init_t/cn_term_t instead, possibly
with a flag argument saying "Debugger!"
2006-05-26 10:24:00 +00:00
Poul-Henning Kamp
9806934e47 Be less harsh on brueffers eyes :-) 2006-05-26 10:23:05 +00:00
Poul-Henning Kamp
0aa6cf76bf Remove SI_SUB_CONSOLE, porting from 4.4-Lite is no longer an issue. 2006-05-26 10:00:58 +00:00
Ed Maste
1b861e4a10 QUEUE_MACRO_DEBUG is intended for userland code, so don't include checks
that call panic under it.
2006-05-26 02:26:53 +00:00
Ruslan Ermilov
8df65d80e2 GC long unused hostnamelen and domainnamelen.
Submitted by:	Alex Lyashkov <shadow@psoft.net>
2006-05-24 07:54:42 +00:00
David Xu
7b8d821268 Move flag TDF_UMTXQ into structure umtxq, this eliminates the requirement
of scheduler lock in some umtx code.
2006-05-18 08:43:46 +00:00
Paul Saab
6befa6ae1b Allow concurrent read(2)/readv(2) access to a file.
Lock file offset against multiple read calls.

Submitted by:	ups
Obtained from:	Yahoo!
MFC after:	2 weeks
2006-05-16 07:50:54 +00:00
Olivier Houchard
ef4d5877dd Switch to a 64bit time_t, while it's not a big problem to do so.
Suggested by:	imp
2006-05-15 00:17:27 +00:00
Max Laier
a0a9755e09 Update UPDATING and bump __FreeBSD_version for the ip6fw removal. 2006-05-13 06:08:25 +00:00
John-Mark Gurney
b0d081a0b4 drop D_MEMDISK, not used in the tree... 2006-05-12 19:40:54 +00:00
John Baldwin
73dbd3da73 Remove various bits of conditional Alpha code and fixup a few comments. 2006-05-12 05:04:46 +00:00
Tor Egge
5ac6cbfdfb Avoid dereferencing NULL pointer. 2006-05-05 19:32:35 +00:00
Marcel Moolenaar
8f405ed335 Remove the puc-specific hacks. The puc(4) driver now properly uses
the rman(9) interface.
2006-04-28 21:23:09 +00:00
Jeff Roberson
6ca9fcc586 - Add a BO_NEEDSGIANT flag to the bufobj. This flag forces all child
buffers to go on the buf daemon's DIRTYGIANT queue.
 - Set BO_NEEDSGIANT on ffs's devvp since the ffs_copyonwrite handler
   runs in the context of the buf daemon and may require Giant.
2006-04-28 01:05:31 +00:00
Robert Watson
1d82b39143 Reconstitute struct mac_policy_ops by breaking out individual function
pointer prototypes from it into their own typedefs.  No functional or
ABI change.  This allows policies to declare their own function
prototypes based on a common definition from mac_policy.h rather than
duplicating these definitions.

Obtained from:	SEDarwin, SPARTA
MFC after:	1 month
2006-04-26 14:18:55 +00:00
Daniel Eischen
1aaee95a1d Bump __FreeBSD_version to reflect the addition of fcloseall() to libc. 2006-04-22 15:12:50 +00:00
Paul Saab
4f590175b7 Allow for nmbclusters and maxsockets to be increased via sysctl.
An eventhandler is used to update all the various zones that depend
on these values.
2006-04-21 09:25:40 +00:00
John-Mark Gurney
be4db476a6 const'ify resource_spec to note that we won't be changing anything while
releasing resources... also, NULL out the resources as we free them...
2006-04-20 01:44:16 +00:00
John Baldwin
fea3efe5bf Implement rw_try_upgrade() and rw_downgrade(). rw_try_upgrade() makes a
single attempt at upgrading a read lock to a write lock, and rw_downgrade()
converts curthread's write lock into a read lock.
2006-04-19 21:06:52 +00:00
John Baldwin
62375b03cd Update comments to mention that each turnstile contains two queues and to
describe turnstile_disown() and turnstile_empty().
2006-04-18 18:21:38 +00:00
John Baldwin
f1a4b852dc - Bring back turnstile_empty() which can check to see if an individual
queue on a turnstile is empty.
- Add a turnstile_disown() function that allows a thread to give up
  ownership of a turnstile w/o waking up any waiters.
2006-04-18 18:16:54 +00:00
Xin LI
4207c279d4 In vfs_hash_get(): mount point should never be changed
so explicitly constify the mp parameter.

Reviewed by:	phk
2006-04-18 08:05:08 +00:00
John Baldwin
efff0b01b6 Update comments to indicate that locks are held by threads, not processes. 2006-04-17 20:17:09 +00:00
John Baldwin
2971c36136 Add a new module_file() function that returns the linker_file_t associated
with a given module_t.  I use this in some the MOD_LOAD event handler for
some test kernel modules to ask the kernel linker to look up the linker
sets in my test modules. (I use linker sets to generate the list of
possible events that I then signal to execute via a sysctl.  On non-amd64,
ld(8) would resolve the entire linker set, but on amd64 I have to ask the
kernel linker to do it for me, and having the kernel linker do it works on
all archs.)
2006-04-17 19:44:44 +00:00
Sam Leffler
587070382a backout rev 1.74
Requested by:	ssouhlal
2006-04-07 05:16:02 +00:00
Christian S.J. Peron
7935d5382b Introduce a new MAC entry point for label initialization of the NFS daemon's
credential: mac_associate_nfsd_label()

This entry point can be utilized by various Mandatory Access Control policies
so they can properly initialize the label of files which get created
as a result of an NFS operation. This work will be useful for fixing kernel
panics associated with accessing un-initialized or invalid vnode labels.

The implementation of these entry points will come shortly.

Obtained from:	TrustedBSD
Requested by:	mdodd
MFC after:	3 weeks
2006-04-06 23:33:11 +00:00
Suleiman Souhlal
90e72822ac Replace FILEDESC_[UN]LOCK_FAST() with a critical section on UP.
Gives a small but measurable performance improvement.

Submitted by:	Divacky Roman <xdivac02@stud.fit.vutbr.cz>
MFC after:	1 month
2006-04-06 16:27:48 +00:00
David Xu
9d9b92aaf4 WARNS level 4 cleanup, still has work to do. 2006-04-04 02:57:09 +00:00
Robert Watson
bc725eafc7 Chance protocol switch method pru_detach() so that it returns void
rather than an error.  Detaches do not "fail", they other occur or
the protocol flags SS_PROTOREF to take ownership of the socket.

soclose() no longer looks at so_pcb to see if it's NULL, relying
entirely on the protocol to decide whether it's time to free the
socket or not using SS_PROTOREF.  so_pcb is now entirely owned and
managed by the protocol code.  Likewise, no longer test so_pcb in
other socket functions, such as soreceive(), which have no business
digging into protocol internals.

Protocol detach routines no longer try to free the socket on detach,
this is performed in the socket code if the protocol permits it.

In rts_detach(), no longer test for rp != NULL in detach, and
likewise in other protocols that don't permit a NULL so_pcb, reduce
the incidence of testing for it during detach.

netinet and netinet6 are not fully updated to this change, which
will be in an upcoming commit.  In their current state they may leak
memory or panic.

MFC after:	3 months
2006-04-01 15:42:02 +00:00
Robert Watson
ac45e92ff2 Change protocol switch pru_abort() API so that it returns void rather
than an int, as an error here is not meaningful.  Modify soabort() to
unconditionally free the socket on the return of pru_abort(), and
modify most protocols to no longer conditionally free the socket,
since the caller will do this.

This commit likely leaves parts of netinet and netinet6 in a situation
where they may panic or leak memory, as they have not are not fully
updated by this commit.  This will be corrected shortly in followup
commits to these components.

MFC after:      3 months
2006-04-01 15:15:05 +00:00
Robert Watson
c43a4e8a83 Add a comment describing SS_PROTOREF in detail. This will eventually be
in socket(9).

MFC after:	3 months
2006-04-01 10:54:51 +00:00
Søren Schmidt
cd8a592bb3 Make the ATAPI sense data accessible when using the ioctl interface
MFC candidate.
2006-03-31 08:09:05 +00:00
Jeff Roberson
9eee260594 - Define mnt_startzero and mnt_endzero as a range that excludes mnt_mtx
and mnt_lock so that the mountpoint can be explicitly zeroed on
   creation.

Discussed with: tegge
Tested by:      kris
Sponsored by:   Isilon Systems, Inc.
2006-03-31 03:49:16 +00:00
Jeff Roberson
084d64ac21 - Add the B_NEEDSGIANT flag which is only set if the vnode that owns a buf
requires Giant.  It is set in bgetvp and cleared in brelvp.
 - Create QUEUE_DIRTY_GIANT for dirty buffers that require giant.
 - In the buf daemon, only grab giant when processing QUEUE_DIRTY_GIANT and
   only if we think there are buffers in that queue.

Sponsored by:	Isilon Systems, Inc.
2006-03-31 02:56:30 +00:00
Marcel Moolenaar
1d9c4393ff o Don't make the SER_INT_* defines visible to userland. They
are related to internals, not user-visible state.
o  Add a typedef for serdev_intr_t and protect it with !LOCORE.
2006-03-30 17:24:42 +00:00
John Baldwin
8346951353 Style fix. 2006-03-30 15:48:06 +00:00
John Baldwin
1aa4d65200 Move the PC_TO_I() and KCOUNT() macros so they aren't GUPROF specific
since they operate on fields of struct gmonparam which is not GUPROF
specific.

Approved by:	bde
Reported by:	alc
2006-03-29 18:17:03 +00:00
Joseph Koshy
7aa1f640f5 Remove unused symbols. 2006-03-28 16:20:29 +00:00
Dag-Erling Smørgrav
867c089bc7 Revert previous commit at davidxu's insistance. Instead, use __DECONST
(argh!) and rearrange the prototypes to make it clear that _umtx_op()
is not deprecated.
2006-03-28 14:32:38 +00:00
Dag-Erling Smørgrav
b3efbabe87 The undocumented and deprecated system call _umtx_op() takes two pointer
arguments.  The first one is never used (all callers pass in 0); the
second is sometimes used to pass in a struct timespec * which is used as
a timeout and never modified.  Constify that argument so callers can pass
a const struct timespec * without jumping through hoops.
2006-03-28 09:18:34 +00:00
Robert Watson
d9949cb211 Declare regression subtree in sysctl.h so that components outside of
kern_mib.c can easily add regression sysctls.

MFC after:	1 month
2006-03-26 22:29:45 +00:00
Joseph Koshy
49874f6ea3 MFP4: Support for profiling dynamically loaded objects.
Kernel changes:

  Inform hwpmc of executable objects brought into the system by
  kldload() and mmap(), and of their removal by kldunload() and
  munmap().  A helper function linker_hwpmc_list_objects() has been
  added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve
  the list of currently loaded kernel modules.

  The unused `MAPPINGCHANGE' event has been deprecated in favour
  of separate `MAP_IN' and `MAP_OUT' events; this change reduces
  space wastage in the log.

  Bump the hwpmc's ABI version to "2.0.00".  Teach hwpmc(4) to
  handle the map change callbacks.

  Change the default per-cpu sample buffer size to hold
  32 samples (up from 16).

  Increment __FreeBSD_version.

libpmc(3) changes:

  Update libpmc(3) to deal with the new events in the log file; bring
  the pmclog(3) manual page in sync with the code.

pmcstat(8) changes:

  Introduce new options to pmcstat(8): "-r" (root fs path), "-M"
  (mapfile name), "-q"/"-v" (verbosity control).  Option "-k" now
  takes a kernel directory as its argument but will also work with
  the older invocation syntax.

  Rework string handling in pmcstat(8) to use an opaque type for
  interned strings.  Clean up ELF parsing code and add support for
  tracking dynamic object mappings reported by a v2.0.00 hwpmc(4).

  Report statistics at the end of a log conversion run depending
  on the requested verbosity level.

Reviewed by:	jhb, dds (kernel parts of an earlier patch)
Tested by:	gallatin (earlier patch)
2006-03-26 12:20:54 +00:00
Warner Losh
f9c68c6334 The year field is the 4 digit year (eg, 2006), not 'year - 1900' (eg
106).  Fix the comment to reflect this.
2006-03-24 06:27:34 +00:00
David Xu
177e987e63 Regenerate. 2006-03-23 08:48:37 +00:00
David Xu
53fcc63c10 Add aio_fsync() prototype. 2006-03-23 08:47:28 +00:00
Poul-Henning Kamp
cc548bd9fb Remove nested includes of <sys/_lock.h> and <sys/_mutex.h> which spill into
userland.  The comment indicated that something in userland needed them, but
make universe can't seem to find any traces of it.

Move <sys/queue.h> include up.
2006-03-16 11:19:36 +00:00
Robert Watson
92c07a345e Change soabort() from returning int to returning void, since all
consumers ignore the return value, soabort() is required to succeed,
and protocols produce errors here to report multiple freeing of the
pcb, which we hope to eliminate.
2006-03-16 07:03:14 +00:00
Sam Leffler
47e2996e8b promote fast ipsec's m_clone routine for public use; it is renamed
m_unshare and the caller can now control how mbufs are allocated

Reviewed by:	andre, luigi, mlaier
MFC after:	1 week
2006-03-15 21:11:11 +00:00
Robert Watson
a5c0b80e37 Back out accidentally committed protosw.h:1.49. One of those days. It
will be recommitted with the remainder of the change in the next day or
two.

Submitted by:	thompsa
2006-03-15 20:41:15 +00:00
Andre Oppermann
b1955eecdd Add definitions for MD5_BLOCK_LENGTH, MD5_DIGEST_LENGTH and
MD5_DIGEST_STRING_LENGTH.

MFC after:	3 days
2006-03-15 19:47:12 +00:00
Robert Watson
cf4f9f6d81 Correct spelling of 0x4000 in previous commit. This one line change from
a 42k patch seemed easier to retype than apply, but apparently not. :-)

Submitted by:	pjd
2006-03-15 19:02:43 +00:00
Robert Watson
5d511d26c3 Add SS_PROTOREF socket flag, which represents a strong reference by the
protocol to the socket.  Normally protocol references are weak: that is,
the socket layer can tear down the socket (and hence protocol state)
when it finds convenient.  This flag will allow the protocol to
explicitly declare to the socket layer that it is maintaining a
strong reference, rather than the current implicit model associated
with so_pcb pointer values and repeated attempts to possibly free the
socket.
2006-03-15 12:30:06 +00:00
David Xu
28e989e9ca Remove unused code. 2006-03-13 10:37:25 +00:00
Daniel Eischen
51f38c318b Add macros for generating symbol version assembler opcodes. 2006-03-13 00:49:28 +00:00
Andre Oppermann
83d8a7bbbe Remove comment that does not appy to FreeBSD. 2006-03-12 15:34:33 +00:00
Andre Oppermann
5e82cadbce Import of OpenBSD's sys/sys/hash.h providing generic 32bit hash functions.
Requested by:	flz (to port Open[BGP|OSPF]D)
MFC after:	3 days
2006-03-12 15:33:19 +00:00
Poul-Henning Kamp
272601f8f0 Go over calcru and friends once more.
Reintroduce the monotonicity for the normal case and make the two
special cases behave in what is belived to be the most sensible fasion.
2006-03-11 10:48:19 +00:00
Poul-Henning Kamp
d13856608d Remove last traces of disk_enumerate() 2006-03-11 10:24:50 +00:00
Tor Egge
ca2fa80767 Block secondary writes while expunging active unlinked files.
Fix detection of active unlinked files by checking VI_OWEINACT and
VI_DOINGINACT in addition to v_usecount.

Defer inactive handling for unlinked files if the file system is mostly
suspended (secondary writes being blocked).

Perform deferred inactive handling after the file system is resumed.
2006-03-11 01:08:37 +00:00
Tor Egge
791dd2fade Use vn_start_secondary_write() and vn_finished_secondary_write() as a
replacement for vn_write_suspend_wait() to better account for secondary write
processing.

Close race where secondary writes could be started after ffs_sync() returned
but before the file system was marked as suspended.

Detect if secondary writes or softdep processing occurred during vnode sync
loop in ffs_sync() and retry the loop if needed.
2006-03-08 23:43:39 +00:00
Søren Schmidt
62fba1c397 Add USB modes. 2006-03-05 21:32:38 +00:00
Søren Schmidt
64de47b35e Add two new ATAPI commands. 2006-03-05 17:43:13 +00:00
Maxime Henrion
1bf308c1ea Cast the pointer to void * before casting it back to struct type * in
STAILQ_LAST.  This quiets a warning from GCC about increased required
alignment for the cast.

Idea from:      cognet
2006-03-03 18:54:33 +00:00
David Xu
3dfcaad667 Add signal set sq_kill to sigqueue structure, the member saves all
signals sent by kill() syscall, without this, a signal sent by
sigqueue() can cause a signal sent by kill() to be lost.
2006-03-02 14:06:40 +00:00
Jeff Roberson
eb2ea10590 - Move softdep from using a global worklist to per-mount worklists. This
has many positive effects including improved smp locking, reducing
   interdependencies between mounts that can lead to deadlocks, etc.
 - Add the softdep worklist and various counters to the ufsmnt structure.
 - Add a mount pointer to the workitem and remove mount pointers from the
   various structures derived from the workitem as they are now redundant.
 - Remove the poor-man's semaphore protecting softdep_process_worklist and
   softdep_flushworklist.  Several threads may now process the list
   simultaneously.
 - Add softdep_waitidle() to block the thread until all pending
   dependencies being operated on by other threads have been flushed.
 - Use softdep_waitidle() in unmount and snapshots to block either
   operation until the fs is stable.
 - Remove softdep worklist processing from the syncer and move it into the
   softdep_flush() thread.  This thread processes all softdep mounts
   once each second and when it is called via the new softdep_speedup()
   when there is a resource shortage.  This removes the softdep hook
   from the kernel and various hacks in header files to support it.

Reviewed by/Discussed with:	tegge, truckman, mckusick
Tested by:	kris
2006-03-02 05:50:23 +00:00
Pawel Jakub Dawidek
92ee312dd4 Assert proper use of bio_caller1, bio_caller2, bio_cflags, bio_driver1,
bio_driver2 and bio_pflags fields.

Reviewed by:	phk
2006-03-01 19:01:58 +00:00
David Xu
80452384e6 Regenerate. 2006-03-01 06:49:38 +00:00
David Xu
48d0e3ac7d s/timer_t/int/g 2006-03-01 06:48:31 +00:00
David Xu
61d3a4efc2 Let kernel POSIX timer code and mqueue code to use integer as a resource
handle, the timer_t and mqd_t types will be a pointer which userland
will define it.
2006-03-01 06:29:34 +00:00
John Baldwin
eac727ae4a Allow PHOLD()'s of curproc even if P_WEXIT is set. Normally we don't want
to allow PHOLD()'s of processes that have P_WEXIT set as once that flag
is set we aren't guaranteed to block in exit1() waiting for the PRELE()
(we might already be past the wait).  However, curproc is a bit of a
special case.  By the time P_WEXIT is set, the process is single-threaded,
so the only thread for which can do a PHOLD(curproc) is the thread
executing in exit1().  The fact that this thread is executing ensures
that the process won't go away before the current hold is released via
PRELE().  This fixes some panics due to kicking off softupdate operations
inside of exit1() after the recent PHOLD changes to fix ptrace/procfs vs
exit races.

MFC after:	1 week
Tested by:	pho
2006-02-28 20:11:30 +00:00
Paul Saab
fa545f434c Fix 32bit sendfile by implementing kern_sendfile so that it takes
the header and trailers as iovec arguments instead of copying them
in inside of sendfile.

Reviewed by:	jhb
MFC after:	3 weeks
2006-02-28 19:39:18 +00:00
Marcel Moolenaar
6fcbf91d09 MFp4:
o  Add defines for the 5 interrupt sources typical for serial devices.
   These defines can be used for more finegrained interrupt handling
   between drivers that cooperatively handle multiple serial ports.
o  Add defines for the various bitmasks applicable when all information
   is passed between drivers as a single integral.
2006-02-24 02:24:10 +00:00
Marcel Moolenaar
061ba7bb43 MFp4:
style(9): <tab> after #define
2006-02-24 02:16:09 +00:00