freebsd-skq

Author	SHA1	Message	Date
rpaulo	4e702aed16	Add Qualcomm ZTE CMDMA MSM modem to the list of supported modems. MFC after: 1 week	2008-03-28 14:20:06 +00:00
attilio	b6fa73a781	Bump __FreeBSD_version in order to reflect BUF_LOCKWAITERS() reintegration and lockmgr_waiters() introduction.	2008-03-28 12:31:26 +00:00
attilio	7e107a0c8c	b_waiters cannot be adequately protected by the interlock because it is dropped after the call to lockmgr() so just revert this approach using something similar to the precedent one: BUF_LOCKWAITERS() just checks if there are waiters (not the actual number of them) and it is based on newly introduced lockmgr_waiters() which returns if the lockmgr has waiters or not. The name has been chosen differently from the old lockwaiters() in order to not confuse them. KPI results enriched by this commit so __FreeBSD_version bumping and manpage update will be happening soon. 'struct buf' also changes, so kernel ABI is disturbed. Bug found by: jeff Approved by: jeff, kib	2008-03-28 12:30:12 +00:00
dfr	3d7f9d1b14	Minor changes to improve compatibility with older FreeBSD releases.	2008-03-28 09:50:32 +00:00
brooks	5e4993bfe3	Use ; instead of : to end a line. Submitted by: Niclas Zeising <niclas dot zeising at gmail dot com>	2008-03-28 08:19:03 +00:00
marcel	dd866faa70	When retasting, wither any existing GEOMs of the same class. This allows the class to create a different GEOM for the same provider as well as avoid that we end up with multiple GEOMs of the same class with the same name. For example, when a disk contains a PC98 partition table but only MBR is supported, then the partition table can be treated as a MBR. If support for PC98 is later loaded as a module, the MBR scheme is pre-empted for the PC98 scheme as expected.	2008-03-28 06:31:12 +00:00
ps	41d5b26ff8	Add support to mincore for detecting whether a page is part of a "super" page or not. Reviewed by: alc, ups	2008-03-28 04:29:27 +00:00
attilio	f6c880cb2d	_lockmgr_args() accepts a 'char *' string as file, so modify _BUF_LOCK() and _BUF_TIMELOCK() prototypes accordingly with this.	2008-03-28 02:48:16 +00:00
yongari	e8dec714c1	In revision 1.70, 1.71 and 1.84 re(4) tried to workaround checksum offload bugs by manual padding for short IP/UDP frames. Unfortunately it seems that these workarounds do not work reliably on newer PCIe variants of RealTek chips. To workaround the hardware bug, always pad short frames if Tx IP checksum offload is requested. It seems that the hardware has a bug in IP checksum offload handling. NetBSD manually pads short frames only when the length of IP frame is less than 28 bytes but I chose 60 bytes for safety. Also unconditionally set IP checksum offload bit in Tx descriptor if any TCP or UDP checksum offload is requested. This is the same way as Linux does but it's not mentioned in data sheet. Obtained from: NetBSD Tested by: remko, danger	2008-03-28 01:21:21 +00:00
jb	cee2462540	Remove the last 3 files I missed. These have been repo copied to the new location under a cddl part of the tree following the core@ license review.	2008-03-28 00:28:45 +00:00
attilio	67b6d2277a	Instruments buffer lock objects in order to track correctly consumers in locking operations. While here, operates some style(9) cleanups.	2008-03-28 00:14:33 +00:00
jb	291b24b755	Remove files that have been repo copied to their new location in cddl-specific parts of the source tree.	2008-03-28 00:08:47 +00:00
jb	5794ada908	The sources covered by Sun's CDDL have been repo copied below the src/cddl and src/sys/cddl directories per the core@ decision following the license review. This change modifies the affected Makefiles to reference the sources in their new location.	2008-03-27 23:21:25 +00:00
mav	586e0246eb	Remove ng_setisr() call from ng_dequeue(). It is useless as we any way will never exit ngintr(), while there is some ready requests on the queue. It was made years ago with hope of parallel queue processing by several net threads. But even if we have several threads sometimes, we have no rights to process queue in parallel as it will break original requests serialization that is critically important for some setups.	2008-03-27 23:02:30 +00:00
antoine	162471f2fb	Remove option headers that do not exist and are not used from the Makefiles in sys/modules. (opt_devfs.h, opt_bdg.h, opt_emu10kx.h and opt_uslcom.h) Approved by: rwatson (mentor)	2008-03-27 20:38:03 +00:00
mav	fd0bef772c	Switch from timeval to bintime, to use 1/(2^20) of seconds instead of microseconds. It allows to use bit shifts instead of some heavy 64bit mul/div math operations.	2008-03-27 20:04:20 +00:00
iedowse	8b81d719e1	Add IFF_NEEDSGIANT to IFF_CANTCHANGE, to prevent user-level code from clearing the IFF_NEEDSGIANT flag on Giant-locked interfaces. In particular, wpa_supplicant was doing this on USB interfaces, causing panics when Giant-locked code was then called without Giant. Submitted by: Alexey Popov Reviewed by: rwatson MFC after: 3 days	2008-03-27 18:02:30 +00:00
dfr	c26f064bd4	Add nfslockd and krpc modules.	2008-03-27 11:55:03 +00:00
dfr	dc98ee4196	Add kernel module support for nfslockd and krpc. Use the module system to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k' option to rpc.lockd and make kernel NLM the default. A user can still force the use of the old user NLM by building a kernel without NFSLOCKD and/or removing the nfslockd.ko module.	2008-03-27 11:54:20 +00:00
jb	34e730ca27	When building a kernel module, define MAXCPU the same as SMP so that modules work with and without SMP.	2008-03-27 05:03:26 +00:00
alc	2a244be094	MFamd64 with few changes: 1. Add support for automatic promotion of 4KB page mappings to 2MB page mappings. Automatic promotion can be enabled by setting the tunable "vm.pmap.pg_ps_enabled" to a non-zero value. By default, automatic promotion is disabled. Tested by: kris 2. To date, we have assumed that the TLB will only set the PG_M bit in a PTE if that PTE has the PG_RW bit set. However, this assumption does not hold on recent processors from Intel. For example, consider a PTE that has the PG_RW bit set but the PG_M bit clear. Suppose this PTE is cached in the TLB and later the PG_RW bit is cleared in the PTE, but the corresponding TLB entry is not (yet) invalidated. Historically, upon a write access using this (stale) TLB entry, the TLB would observe that the PG_RW bit had been cleared and initiate a page fault, aborting the setting of the PG_M bit in the PTE. Now, however, P4- and Core2-family processors will set the PG_M bit before observing that the PG_RW bit is clear and initiating a page fault. In other words, the write does not occur but the PG_M bit is still set. The real impact of this difference is not that great. Specifically, we should no longer assert that any PTE with the PG_M bit set must also have the PG_RW bit set, and we should ignore the state of the PG_M bit unless the PG_RW bit is set.	2008-03-27 04:34:17 +00:00
jb	735643909e	Regen after makesyscalls.sh change.	2008-03-27 01:55:06 +00:00
jb	4a6deb0614	Generate another function for the DTrace syscall provider to specify the syscall argument types. This code is only compiled into the systrace kernel module and has no effect otherwise.	2008-03-27 01:53:44 +00:00
attilio	b94ea1b2e3	Really, smb_iod_main() is not totally MPSAFE, so just acquire and drop Giant around it in order to assume MPSAFETY. Reported by: jhb, rwatson Pointy hat to: attilio	2008-03-27 01:23:59 +00:00
phk	c763b22a79	Back in the good old days, PC's had random pieces of rock for frequency generation and what frequency they generated was anyone's guess. In general the 32.768kHz RTC clock x-tal was the best, because that was a regular wrist-watch Xtal, whereas the X-tal generating the ISA bus frequency was much lower quality, often costing as much as several cents a piece, so it made good sense to check the ISA bus frequency against the RTC clock. The other relevant property of those machines, is that they typically had no more than 16MB RAM. These days, CPU chips croak if their clocks are not tightly within specs and all necessary frequencies are derived from the master crystal by means of PLL's. Considering that it takes on average 1.5 second to calibrate the frequency of the i8254 counter, that more likely than not, we will not actually use the result of the calibration, and as the final clincher, we seldom use the i8254 for anything besides BEL in syscons anyway, it has become time to drop the calibration code. If you need to tell the system what frequency your i8254 runs, you can do so from the loader using hw.i8254.freq or using the sysctl kern.timecounter.tc.i8254.frequency.	2008-03-26 22:12:00 +00:00
phk	f5d8b74690	Further cleanup of sound generation in syscons: The timer_spkr_() functions take care of the enabling/disabling of the speaker. Test on the existence of timer_spkr_() functions, rather than architectures.	2008-03-26 22:02:51 +00:00
phk	3cbe36127b	Make speaker a pseudo device driver instead of attaching to a PnP id. If somebody cleaned this code up to proper style(9), it could become a great educational starting point for aspiring kernel hackers.	2008-03-26 21:33:41 +00:00
rwatson	61a4ef5ea0	Add a comment explaining that we initialize the 'a' buffer for zero-copy to the store buffer position on the BPF descriptor, and the 'b' buffer as the free buffer in order to fill them in the order documented in bpf(4). MFC after: 4 months Suggested by: csjp	2008-03-26 21:29:13 +00:00
mav	5b9ac353f2	Some minor code and math optimizations.	2008-03-26 21:19:03 +00:00
jhb	20cadd93f0	Fix a nit with the 'nofoo' options where 'foo' is mapped to 'nonofoo' (such as 'atime' vs 'noatime'). The filesystems will always see either 'nofoo' or 'nonofoo', never plain 'foo'. As such, their list of valid mount options should include 'nofoo' instead of 'foo'. With this fix, you can do 'mount -u -o atime' on a FFS filesystem that isn't marked as noatime without getting an error. You can also update a noatime FFS filesystem mounted via mount(2) (e.g. 6.x /sbin/mount binary) to 'atime' using nmount(2) (e.g. 7.x /sbin/mount binary). MFC after: 1 week Reviewed by: crodig	2008-03-26 20:48:07 +00:00
phk	259c5e1579	Remove two variables which are handled MI now.	2008-03-26 20:28:52 +00:00
phk	168398fe50	Eliminate unnecessary #includes	2008-03-26 20:26:12 +00:00
phk	fa71439e44	The "free-lance" timer in the i8254 is only used for the speaker these days, so de-generalize the acquire_timer/release_timer api to just deal with speakers. The new (optional) MD functions are: timer_spkr_acquire() timer_spkr_release() and timer_spkr_setfreq() the last of which configures the timer to generate a tone of a given frequency, in Hz instead of 1/1193182th of seconds. Drop entirely timer2 on pc98, it is not used anywhere at all. Move sysbeep() to kern/tty_cons.c and use the timer_spkr() if they exist, and do nothing otherwise. Remove prototypes and empty acquire-/release-timer() and sysbeep() functions from the non-beeping archs. This eliminates the need for the speaker driver to know about i8254frequency at all. In theory this makes the speaker driver MI, contingent on the timer_spkr_() functions existing but the driver does not know this yet and still attaches to the ISA bus. Syscons is more tricky, in one function, sc_tone(), it knows the hz and things are just fine. In the other function, sc_bell() it seems to get the period from the KDMKTONE ioctl in terms of 1/1193182th second, so we hardcode the 1193182 and leave it at that. It's probably not important. Change a few other sysbeep() uses which obviously knew that the argument was in terms of i8254 frequency, and leave alone those that look like people thought sysbeep() took frequency in hertz. This eliminates the knowledge of i8254_freq from all but the actual clock.c code and the prof_machdep.c on amd64 and i386, where I think it would be smart to ask for help from the timecounters anyway [TBD].	2008-03-26 20:09:21 +00:00
dfr	7ce50c542d	Bump __FreeBSD_version for the addition of 'l_sysid' to the flock structure.	2008-03-26 15:41:00 +00:00
emaste	aa3c79c94c	Add \n to the end of a printf string and remove it from panic strings.	2008-03-26 15:28:56 +00:00
dfr	1c5a20ad66	Regen.	2008-03-26 15:24:02 +00:00
dfr	79d2dfdaa6	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. 
* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
phk	632e5d39f7	Rename timer0_max_count to i8254_max_count. Rename timer0_real_max_count to i8254_real_max_count and make it static. Rename timer_freq to i8254_freq and make it a loader tunable.	2008-03-26 15:03:24 +00:00
phk	44bfb30efd	The RTC related pscnt and psdiv variables have no business being public.	2008-03-26 13:25:27 +00:00
phk	6f9c1b7d47	Remove old sysctl stuff which is long gone in other arch's.	2008-03-26 13:03:51 +00:00
brueffer	b64d211df2	Fix some "in in" typos in comments. PR: 121490 Submitted by: Anatoly Borodin <anatoly.borodin@gmail.com> Approved by: rwatson (mentor), jkoshy MFC after: 3 days	2008-03-26 07:32:08 +00:00
alc	0e2f1e0b38	Enable the automatic creation of superpage reservations.	2008-03-26 03:12:00 +00:00
sam	2b9c326fca	split out tty create part of ucom_attach into ucom_attach_tty so derived drivers can use it Submitted by: Jared Go MFC after: 3 weeks	2008-03-25 23:46:24 +00:00
sam	5f88f5ba90	add some CDMA modems Submitted by: Jared Go MFC after: 1 week	2008-03-25 23:35:32 +00:00
scottl	a3b7a4bce8	Implement taskqueue_block() and taskqueue_unblock(). These functions allow the owner of a queue to block and unblock execution of the tasks in the queue while allowing tasks to continue to be added queue. Combining this with taskqueue_drain() allows a queue to be safely disabled. The unblock function may run (or schedule to run) the queue when it is called, just as calling taskqueue_enqueue() would. Reviewed by: jhb, sam	2008-03-25 22:38:45 +00:00
emaste	5e698c9f5e	Add 64-bit array support for RAIDs > 2TB. This corresponds to ~ Adaptec driver build 15317. Tested on: Adaptec 2230S, Firmware 4.2-0 (8205) ICP ICP5085BL, Firmware 5.2-0 (12814) Submitted by: Adaptec	2008-03-25 21:39:06 +00:00
sam	f9c4823d64	add __noinline Submitted by: imp Reviewed by: kan (long ago) MFC after: 3 weeks	2008-03-25 21:30:01 +00:00
sam	d5c642ca44	expose if_purgemaddrs, it will be used by the vap code unless someone redesigns the mcast support code in the next few weeks MFC after: 3 weeks	2008-03-25 21:23:32 +00:00
sam	12e7d1940e	IFM_IEEE80211_IBSSMASTER hasn't been used in many years; replace it with IFM_IEEE80211_WDS which will be used by the forthcoming vap code MFC after: 3 weeks	2008-03-25 21:22:43 +00:00
sam	8e10753c85	enable dynamic addition of "show all" commands MFC after: 3 weeks	2008-03-25 20:36:32 +00:00
jhb	fce41b3b76	Regen.	2008-03-25 19:35:34 +00:00
jhb	a8ff4f0990	Add entries for the cpuset-related system calls. The existing system calls can be used on little endian systems. Pointy hat to: jeff	2008-03-25 19:34:47 +00:00
emaste	4146778caa	Correct data direction flags in aac_bio_command() in the !AAC_FLAGS_RAW_IO && AAC_FLAGS_SG_64BIT case. Submitted by: Adaptec	2008-03-25 18:34:04 +00:00
ru	e9ab62a9ff	Fix build. Reported by: ache, tinderbox	2008-03-25 13:20:52 +00:00
ru	3b1bf8c2e9	Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT. Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA. Reviewed by: arch There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.	2008-03-25 09:39:02 +00:00
ru	0655a583e2	Regen after changing prototypes of cpuset_{get,set}affinity().	2008-03-25 09:14:17 +00:00
ru	4feaeed265	Fixed type of the fourth argument of cpuset_{get,set}affinity(2) to be size_t. Prodded by: davidxu	2008-03-25 09:11:53 +00:00
rwatson	59900e5206	Check for a NULL free buffer pointer in BPF before invoking bpf_canfreebuf() in order to avoid potentially calling a non-inlinable but trivial function in zero-copy buffer mode for every packet received when we couldn't free the buffer anyway. MFC after: 4 months	2008-03-25 07:41:33 +00:00
weongyo	9a9594d179	Add support for Marvell Libertas 88W8335 based PCI network adapters. Reviewed by: sam, many wireless people Approved by: thompsa (mentor)	2008-03-25 06:32:33 +00:00
mav	9af7fc155d	Rewrite node to support multiple hooks, alike to ng_l2tp, to use one pair of pptpgre and ksocket nodes for all calls between two peers. This patch modifies node's API by adding new "session_%04x" hook names support, while keeping backward compatibility. Together with appropriate user-level support (by latest mpd5) it gives huge performance benefits for case of multiple active calls between two peers because of avoiding data duplication and extra socket processing. On my benchmarks I have got more than 10 times speedup for the 200 simultaneous PPTP calls between two peers. In conclusion, it allows now to build effective "clients <=> PAC <=> PNS" setups.	2008-03-24 22:55:22 +00:00
jkim	3e99f5d364	Belatedly add BPF_JITTER in NOTES for supported architectures.	2008-03-24 22:23:22 +00:00
jkim	e4afcc95a9	Fix build with option BPF_JITTER.	2008-03-24 22:21:32 +00:00
jkim	f1bce3f01d	Remove redundant inclusions of net/bpfdesc.h.	2008-03-24 22:16:46 +00:00
kmacy	59f40fe008	change inp_wlock_assert to inp_lock_assert	2008-03-24 20:24:04 +00:00
emaste	86c0a7d5de	Diff reduction to Adaptec's driver (around build 15317): catch up with a change in debugging routines. The fwprintf macro in the AAC_DEBUG case (mapping to printf) isn't from the Adaptec driver.	2008-03-24 19:23:33 +00:00
sam	f95a1f76b1	o add M_PROTO[678]; they'll be needed by net80211 vap code o sort mbuf flags together and extend values to 32 bits o write M_COPYFLAGS in terms of M_PROTOFLAGS o move M_COPYFLAGS and M_PROTOFLAGS up to be together with flag defs Reviewed by: rwatson MFC after: 3 weeks	2008-03-24 19:01:29 +00:00
marius	dd3d50596e	- Const'ify the bus_stream_asi and bus_type_asi arrays. - Replace hard-coded functions names missed in bus_machdep.c rev. 1.44 with __func__. - Break some long lines. MFC after: 1 month	2008-03-24 17:57:01 +00:00
marius	9310ab33a4	- Take advantage of bus_dmamap_load_mbuf_sg(9). - Take advantage of m_collapse(9). - Sync with other NIC drivers and prepend a TX mbuf if the first attempt to load it fails with an error other than EFBIG and stop trying instead of freeing it and keeping on trying to enqueue more mbufs. Also ensure the driver queue isn't empty before trying to enqueue mbufs in order to reduce locking operations. - In xl_ifmedia_upd() add a missing XL_UNLOCK(). [1] - Const'ify the xl_devs array. - Remove an outdated comment. PR: 113406 [1] MFC after: 1 month	2008-03-24 17:49:06 +00:00
marius	9813122d2a	- Const'ify the dc_devs array. - Correct the maxsize parameter when creating the mbufs busdma tag to reflect the actual requirement of dc(4). - Move the KASSERT in dc_newbuf() to the right spot. - Also convert the TX side to take advantage of bus_dmamap_load_mbuf_sg(9). - Move the comment regarding dc_start_locked() to the right spot. MFC after: 2 weeks	2008-03-24 17:38:24 +00:00
marius	cf4d38b379	Split the registers into two halves in preparation for SBus support. Obtained from: NetBSD (loosely) MFC after: 2 weeks	2008-03-24 17:23:53 +00:00
emaste	bfdd190b82	Diff reduction to Adaptec driver build 15317 (refactoring and code shuffling): - Resource allocation in aac_alloc (moved from aac_init) - Interrupt setup in aac_setup_intr (from aac_attach) - Container probing in aac_get_container_info (from aac_startup and aac_handle_aif) - Firmware status check moved to aac_check_firmware from aac_init	2008-03-24 16:38:47 +00:00
bz	e1cf25141c	Fix a bug that when getting/dumping the soft lifetime we reported the hard lifetime instead. MFC after: 3 days	2008-03-24 15:01:20 +00:00
bz	42fbad307b	Import change from KAME, rev. 1.362 kame/kame/sys/netkey/key.c In case of "new SA", we must check the hard lifetime of the old SA to find out if it is not permanent and we can delete it. Submitted by: sakane via gnn MFC after: 3 days	2008-03-24 14:55:09 +00:00
csjp	5c0a194548	Bump the FreeBSD version for zerocopy bpf buffers and changes to the bpf(4) monitoring ABI/structures.	2008-03-24 14:30:01 +00:00
csjp	310e3f93dd	Introduce support for zero-copy BPF buffering, which reduces the overhead of packet capture by allowing a user process to directly "loan" buffer memory to the kernel rather than using read(2) to explicitly copy data from kernel address space. The user process will issue new BPF ioctls to set the shared memory buffer mode and provide pointers to buffers and their size. The kernel then wires and maps the pages into kernel address space using sf_buf(9), which on supporting architectures will use the direct map region. The current "buffered" access mode remains the default, and support for zero-copy buffers must, for the time being, be explicitly enabled using a sysctl for the kernel to accept requests to use it. The kernel and user process synchronize use of the buffers with atomic operations, avoiding the need for system calls under load; the user process may use select()/poll()/kqueue() to manage blocking while waiting for network data if the user process is able to consume data faster than the kernel generates it. Patches to libpcap are available to allow libpcap applications to transparently take advantage of this support. Detailed information on the new API may be found in bpf(4), including specific atomic operations and memory barriers required to synchronize buffer use safely. These changes modify the base BPF implementation to (roughly) abstract the current buffer model, allowing the new shared memory model to be added, and add new monitoring statistics for netstat to print. The implementation, with the exception of some monitoring changes that break the netstat monitoring ABI for BPF, will be MFC'd. Zerocopy bpf buffers are still considered experimental and are disabled by default. To experiment with this new facility, adjust the net.bpf.zerocopy_enable sysctl variable to 1. Changes to libpcap will be made available as a patch for the time being, and further refinements to the implementation are expected. Sponsored by: Seccuris Inc. 
In collaboration with: rwatson Tested by: pwood, gallatin MFC after: 4 months [1] [1] Certain portions will probably not be MFCed, specifically things that can break the monitoring ABI.	2008-03-24 13:49:17 +00:00
kmacy	9fbcabc6c7	remove unnecessary tcbinfo lock acquisitions - set tp to null after calling enter_timewait as we no longer own the inpcb	2008-03-24 05:21:10 +00:00
jeff	3ad75daf19	- Greatly simplify vget() by removing the guarantee that any new references to a vnode with VI_OWEINACT set will force the vinactive() call. The kernel makes no guarantees about which reference was the last to close a file or when the actual inactive processing will happen. The previous code was designed to preserve existing semantics in the face of shared locks, however, this was unnecessary. Discussed with: mckusick	2008-03-24 04:22:58 +00:00
jeff	1bf44343e2	- Don't acquire the vnode interlock in _vn_lock() unless no lock type is requested. Handle this case specially before the while loop. - Use the held vnode lock to check for VI_DOOMED. The vnode lock and interlock must both be held to set VI_DOOMED so either one held, even shared, is sufficient to check it. No objection by: kib	2008-03-24 04:17:35 +00:00
jeff	955d594912	- Remove an old comment; vnodes have been working without Giant for years now. - Clarify the locking required for VI_DOOMED in preparation for simplifications to vget() and vn_lock().	2008-03-24 04:11:40 +00:00
kmacy	08877248a3	Label inp as unused in the non-INVARIANTS case	2008-03-24 00:29:01 +00:00
peter	112e790f78	First pass at (possibly futile) microoptimizing of cpu_switch. Results are mixed. Some pure context switch microbenchmarks show up to 29% improvement. Pipe based context switch microbenchmarks show up to 7% improvement. Real world tests are far less impressive as they are dominated more by actual work than switch overheads, but depending on the machine in question, workload, kernel options, phase of moon, etc, a few percent gain might be seen. Summary of changes: - don't reload MSR_[FG]SBASE registers when context switching between non-threaded userland apps. These typically cost 120 clock cycles each on an AMD cpu (less on Barcelona/Phenom). Intel cores are probably no faster on this. - The above change only helps unthreaded userland apps that tend to use the same value for gsbase. Threaded apps will get no benefit from this. - reorder things like accessing the pcb to be in memory order, to give prefetching a better chance of working. Operations are now in increasing memory address order, rather than reverse or random. - Push some lesser used code out of the main code paths. Hopefully allowing better code density in cache lines. This is probably futile. - (part 2 of previous item) Reorder code so that branches have a more realistic static branch prediction hint. Both Intel and AMD cpus default to predicting branches to lower memory addresses as being taken, and to higher memory addresses as not being taken. This is overridden by the limited dynamic branch prediction subsystem. A trip through userland might overflow this. - Futile attempt at spreading the use of the results of previous operations in new operations. Hopefully this will allow the cpus to execute in parallel better. - stop wasting 16 bytes at the top of kernel stack, below the PCB. - Never load the userland fs/gsbase registers for kthreads, but preserve curpcb->pcb_[fg]sbase as caches for the cpu. (Thanks Jeff!) 
Microbenchmarking this code seems to be really sensitive to things like scheduling luck, timing, cache behavior, tlb behavior, kernel options, other random code changes, etc. While it doesn't help heavy userland workloads much, it does help high context switch loads a little, and should help those that involve switching via kthreads a bit more. A special thanks to Kris for the testing and reality checks, and Jeff for tormenting me into doing this. :) This is still work-in-progress.	2008-03-23 23:09:06 +00:00
alc	f9d9755304	Correct an error in pmap_mincore() when applied to a 2MB page mapping: Use PG_PS_FRAME, not PG_FRAME, to obtain the physical address of the 2MB physical page from the PDE.	2008-03-23 23:04:09 +00:00
peter	b238ee1007	Export TDP_KTHREAD to asm files.	2008-03-23 22:46:37 +00:00
peter	1f7e9770bb	Move pcb_flags to make trivially better use of cache lines.	2008-03-23 22:45:51 +00:00
peter	075e9da352	Protect the setting of the fsbase/gsbase MSR registers and the pcb_[fg]sbase values with a critical section, like the rest of the kernel.	2008-03-23 22:44:56 +00:00
kmacy	fb74f62b24	Insulate inpcb consumers outside the stack from the lock type and offset within the pcb by adding accessor functions. Reviewed by: rwatson MFC after: 3 weeks	2008-03-23 22:34:16 +00:00
alc	e702727e2c	To date, we have assumed that the TLB will only set the PG_M bit in a PTE if that PTE has the PG_RW bit set. However, this assumption does not hold on recent processors from Intel. For example, consider a PTE that has the PG_RW bit set but the PG_M bit clear. Suppose this PTE is cached in the TLB and later the PG_RW bit is cleared in the PTE, but the corresponding TLB entry is not (yet) invalidated. Historically, upon a write access using this (stale) TLB entry, the TLB would observe that the PG_RW bit had been cleared and initiate a page fault, aborting the setting of the PG_M bit in the PTE. Now, however, P4- and Core2-family processors will set the PG_M bit before observing that the PG_RW bit is clear and initiating a page fault. In other words, the write does not occur but the PG_M bit is still set. The real impact of this difference is not that great. Specifically, we should no longer assert that any PTE with the PG_M bit set must also have the PG_RW bit set, and we should ignore the state of the PG_M bit unless the PG_RW bit is set. However, these changes enable me to remove a work-around from pmap_promote_pde(), the superpage promotion procedure. (Note: The AMD processors that we have tested, including the latest, the Phenom, still exhibit the historical behavior.) Acknowledgments: After I observed the problem, Stephan (ups) was instrumental in characterizing the exact behavior of Intel's recent TLBs. Tested by: Peter Holm	2008-03-23 20:38:01 +00:00
kib	5ddf5664cc	Yield the cpu in the kernel while iterating the list of the vnodes belonging to the mountpoint. Also, yield when in the softdep_process_worklist() even when we are not going to sleep due to buffer drain. It is believed that the ULE fixed the problem [1], but the yielding seems to be needed at least for the 4BSD case. Discussed: on stable@, with bde Reviewed by: tegge, jeff [1] MFC after: 2 weeks	2008-03-23 13:45:24 +00:00
kib	53a15ee1ea	Prevent the overflow in the calculation of the next page directory. The overflow causes the wraparound with consequent corruption of the (almost) whole address space mapping. As Alan noted, pmap_copy() does not require the wrap-around checks because it cannot be applied to the kernel's pmap. The checks there are included for consistency. Reported and tested by: kris (i386/pmap.c:pmap_remove() part) Reviewed by: alc MFC after: 1 week	2008-03-23 07:07:27 +00:00
yongari	fcd39263e4	MSI handling on some RealTek chips are broken so disable it by default. Reported by: Giulio Ferro ( auryn AT zirakzigil DOT org ) Tested by: Giulio Ferro ( auryn AT zirakzigil DOT org )	2008-03-23 05:35:18 +00:00
yongari	031ecde733	For MSI capable hardware, set the MSI enable bit in the RL_CFG2 register. If MSI was disabled by the hw.re.msi_disable tunable, explicitly clear the MSI enable bit.	2008-03-23 05:31:35 +00:00
yongari	7fea7ba914	Some RealTek chips are known to be buggy on DAC handling, so disable DAC by default.	2008-03-23 05:13:45 +00:00
yongari	00b0cf0b1a	VLAN hardware tag information should be set for all descriptors of a multi-descriptor transmission attempt. Datasheet said nothing about this requirement. This should fix long-standing VLAN hardware tagging issues with re(4). Reported by: Giulio Ferro ( auryn AT zirakzigil DOT org ) Tested by: Giulio Ferro ( auryn AT zirakzigil DOT org )	2008-03-23 05:06:16 +00:00
yongari	fd413d352f	Always honor configured VLAN/checksum offload capabilities. Previously re(4) used to blindly enable VLAN hardware tag stripping and Rx checksum offload regardless of enabled optional features of interface.	2008-03-23 04:59:13 +00:00
davidxu	c32a483ae9	Remove commented out code, thread suspension is done in thread library.	2008-03-23 02:03:06 +00:00
jeff	8103d042fb	- Only return 1 from sync_vnode() in cases where the vnode is still at the head of the sync list. This prevents sched_sync() from re-queueing a vnode which may have been freed already. Discussed with: kib	2008-03-23 01:44:28 +00:00
marcel	124e0025d3	Instead of making a single geom_part.ko module, make a module for each partitioning scheme. The gpart code is currently non- optional.	2008-03-23 01:42:47 +00:00
jeff	73b6a5597c	- Pass BO_MTX(bo) to lockmgr in vtruncbuf, we don't own the vnode interlock here anymore. Reported by: kris	2008-03-23 01:42:19 +00:00
marcel	c184f6ced2	Redefine G_PART_SCHEME_DECLARE() from populating a private linker set to declaring a proper module. The module event handler is part of the gpart core and will add the scheme to an internal list on module load and will remove the scheme from the internal list on module unload. This makes it possible to dynamically load and unload partitioning schemes.	2008-03-23 01:31:59 +00:00
marcel	31a163ef06	Add g_retaste(), which given a class will present all non-open providers to it for tasting. This is useful when the class, through means outside the scope of GEOM, can claim providers previously unclaimed. The g_retaste() function posts an event which is handled by the g_retaste_event(). Event suggested by: phk	2008-03-23 01:23:35 +00:00
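
Note on the alc pmap commits above (e702727e2c, 2a244be094): the rule they state, "ignore the state of the PG_M bit unless the PG_RW bit is set", can be sketched as a single mask test. This is an illustrative userland sketch only, not the code from those commits. The helper name pte_is_dirty() is hypothetical; the flag values are the architectural x86 bit positions for the read/write and dirty bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-table entry flag bits. */
#define PG_RW	0x002ULL	/* page is writable */
#define PG_M	0x040ULL	/* page has been modified (dirty) */

/*
 * Hypothetical helper (not taken from the commits): a PTE counts as
 * dirty only when PG_M and PG_RW are both set, so a PG_M bit set
 * through a stale TLB entry after PG_RW was cleared is ignored.
 */
static bool
pte_is_dirty(uint64_t pte)
{
	return ((pte & (PG_M | PG_RW)) == (PG_M | PG_RW));
}
```

With this test, a PTE carrying PG_M alone is treated as clean, which matches the commit's point that PG_M no longer implies PG_RW on P4- and Core2-family processors.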


66862 Commits
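
Note on mav's commit fd0bef772c (timeval to bintime): the message says that counting in 1/(2^20) of a second lets bit shifts replace 64-bit multiply/divide. A minimal sketch of that idea follows, assuming a binary-fraction timestamp whose fraction field counts units of 2^-64 second, as FreeBSD's struct bintime does. The struct bt and the function bt_to_units20() are hypothetical stand-ins so the example compiles in userland; they are not the code from the commit.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a bintime-style timestamp:
 * whole seconds plus a 64-bit binary fraction (units of 2^-64 s).
 */
struct bt {
	int64_t		sec;
	uint64_t	frac;
};

/*
 * Convert to units of 1/(2^20) second using only shifts:
 * seconds move up by 20 bits, and the top 20 bits of the
 * fraction fill the low 20 bits. No multiply or divide needed.
 * Assumes a non-negative seconds value.
 */
static uint64_t
bt_to_units20(struct bt t)
{
	return (((uint64_t)t.sec << 20) | (t.frac >> 44));
}
```

By contrast, the same conversion from a microsecond timeval would need a multiplication by 2^20 and a division by 1000000 per sample.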