freebsd-skq

Author	SHA1	Message	Date
ray	c28e9771d6	MFC @r257698.	2013-11-05 12:55:28 +00:00
kib	79afbd5fdd	Add bus_dmamap_load_ma() function to load map with the array of vm_pages. Provide trivial implementation which forwards the load to _bus_dmamap_load_phys() page by page. Right now all architectures use bus_dmamap_load_ma_triv(). Tested by: pho (as part of the functional patch) Sponsored by: The FreeBSD Foundation MFC after: 1 month	2013-10-27 21:39:16 +00:00
ray	f690470e36	MFC @r257107.	2013-10-25 08:41:36 +00:00
marius	20009df359	Move the implementation of bus_space_barrier(9) to the inline function in the header. Actually, there's only one version for all types of busses, so it doesn't make sense to walk up the hierarchy.	2013-10-24 17:06:41 +00:00
ray	da824c4a2b	MFC @r256148.	2013-10-08 14:02:35 +00:00
marius	fe39ca00e6	Implement GET_STACK_USAGE. Discussed with: mav Approved by: re (kib) MFC after: 1 week	2013-09-29 13:09:25 +00:00
glebius	12fb397079	- Create kern.ipc.sendfile namespace, and put the new "readhead" OID there as "kern.ipc.sendfile.readahead". - Push all nsfbuf related tunables into MD code. Don't move them to new namespace in favor of POLA. Reviewed by: scottl Approved by: re (gjb)	2013-09-22 13:36:52 +00:00
alc	88a4d0f31a	The pmap function pmap_clear_reference() is no longer used. Remove it. pmap_clear_reference() has had exactly one caller in the kernel for several years, more precisely, since FreeBSD 8. Now, that call no longer exists. Approved by: re (kib) Sponsored by: EMC / Isilon Storage Division	2013-09-20 04:30:18 +00:00
pjd	667d7255be	Fix panic in ktrcapfail() when no capability rights are passed. While here, correct all consumers to pass NULL instead of 0 as we pass capability rights as pointers now, not uint64_t. Reported by: Daniel Peyrolon Tested by: Daniel Peyrolon Approved by: re (marius)	2013-09-18 19:26:08 +00:00
jhb	04bb6e10cd	Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use an address in the first 2GB of the process's address space. This flag should have the same semantics as the same flag on Linux. To facilitate this, add a new parameter to vm_map_find() that specifies an optional maximum virtual address. While here, fix several callers of vm_map_find() to use a VMFS_* constant for the findspace argument instead of TRUE and FALSE. Reviewed by: alc Approved by: re (kib)	2013-09-09 18:11:59 +00:00
glebius	cf37135185	Fix build with gcc. Move sf_buf_alloc()/sf_buf_free() declarations to MD headers.	2013-09-06 17:44:13 +00:00
ray	c555be1512	MFC @r255128.	2013-09-01 23:06:28 +00:00
alc	aa9a7bb9e6	Significantly reduce the cost, i.e., run time, of calls to madvise(..., MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new pmap function, pmap_advise(), that operates on a range of virtual addresses within the specified pmap, allowing for a more efficient implementation of MADV_DONTNEED and MADV_FREE. Previously, the implementation of MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as pmap_clear_reference(). Intuitively, the problem with this implementation is that the pmap-level locks are acquired and released and the page table traversed repeatedly, once for each resident page in the range that was specified to madvise(2). A more subtle flaw with the previous implementation is that pmap_clear_reference() would clear the reference bit on all mappings to the specified page, not just the mapping in the range specified to madvise(2). Since our malloc(3) makes heavy use of madvise(2), this change can have a measureable impact. For example, the system time for completing a parallel "buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%. Note: This change only contains pmap_advise() implementations for a subset of our supported architectures. I will commit implementations for the remaining architectures after further testing. For now, a stub function is sufficient because of the advisory nature of pmap_advise(). Discussed with: jeff, jhb, kib Tested by: pho (i386), marcel (ia64) Sponsored by: EMC / Isilon Storage Division	2013-08-29 15:49:05 +00:00
kib	05a9dff802	Revert r254501. Instead, reuse the type stability of the struct pmap which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE zone. Initialize the pmap lock in the vmspace zone init function, and remove pmap lock initialization and destruction from pmap_pinit() and pmap_release(). Suggested and reviewed by: alc (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation	2013-08-22 18:12:24 +00:00
kib	ba12eedccd	Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9). The flag was mandatory since r209792, where vm_page_grab(9) was changed to only support the alloc retry semantic. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation	2013-08-22 07:39:53 +00:00
pjd	ae1f4cb9ce	Add process descriptors support to the GENERIC kernel. It is already being used by the tools in base systems and with sandboxing more and more tools the usage should only increase. Submitted by: Mariusz Zaborski <oshogbo@FreeBSD.org> Sponsored by: Google Summer of Code 2013 MFC after: 1 month	2013-08-18 10:21:29 +00:00
ray	3f79da49d7	MFC @r254374.	2013-08-15 19:32:08 +00:00
attilio	16c7563cf4	The soft and hard busy mechanism rely on the vm object lock to work. Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl	2013-08-09 11:11:11 +00:00
avg	6f05d223ad	follow up to r254051 - update powerpc/GENERIC64 as well, suggested by mdf - update comments so that they make sense after the change, suggested by jhb X-MFC after: never (change specific to head)	2013-08-09 08:11:09 +00:00
kib	8de1718b60	Split the pagequeues per NUMA domains, and split pageademon process into threads each processing queue in a single domain. The structure of the pagedaemons and queues is kept intact, most of the changes come from the need for code to find an owning page queue for given page, calculated from the segment containing the page. The tie between NUMA domain and pagedaemon thread/pagequeue split is rather arbitrary, the multithreaded daemon could be allowed for the single-domain machines, or one domain might be split into several page domains, to further increase concurrency. Right now, each pagedaemon thread tries to reach the global target, precalculated at the start of the pass. This is not optimal, since it could cause excessive page deactivation and freeing. The code should be changed to re-check the global page deficit state in the loop after some number of iterations. The pagedaemons reach the quorum before starting the OOM, since one thread inability to meet the target is normal for split queues. Only when all pagedaemons fail to produce enough reusable pages, OOM is started by single selected thread. Launder is modified to take into account the segments layout with regard to the region for which cleaning is performed. Based on the preliminary patch by jeff, sponsored by EMC / Isilon Storage Division. Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation	2013-08-07 16:36:38 +00:00
avg	0255b98224	enable KDB_TRACE in GENERICs KDB_TRACE is not an alternative to DDB/etc, they are complementary. So I do not see any reason to not enable KDB_TRACE by default. X-MFC after: never (change specific to head)	2013-08-07 08:03:50 +00:00
jeff	de4ecca213	Replace kernel virtual address space allocation with vmem. This provides transparent layering and better fragmentation. - Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem. Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division	2013-08-07 06:21:20 +00:00
marius	95f847fd26	Add MD (for now) atomic_store_acq_<type>() and use it in pmap_activate() to get the semantics when setting the PMAP right. Prior to r251782, the latter already used implicit acquire semantics, which - currently - means to not employ additional explicit memory barriers under the hood (see also r225889).	2013-08-06 15:34:11 +00:00
attilio	598e23d2ea	Remove unused member. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-08-04 21:17:05 +00:00
obrien	7999076e3e	Back out r253779 & r253786.	2013-07-31 17:21:18 +00:00
obrien	721ce839c7	Decouple yarrow from random(4) device. * Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option. The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow. * random(4) device doesn't really depend on rijndael-. Yarrow, however, does. Add random_adaptors.[ch] which is basically a store of random_adaptor's. random_adaptor is basically an adapter that plugs in to random(4). random_adaptor can only be plugged in to random(4) very early in bootup. Unplugging random_adaptor from random(4) is not supported, and is probably a bad idea anyway, due to potential loss of entropy pools. We currently have 3 random_adaptors: + yarrow + rdrand (ivy.c) + nehemeiah * Remove platform dependent logic from probe.c, and move it into corresponding registration routines of each random_adaptor provider. probe.c doesn't do anything other than picking a specific random_adaptor from a list of registered ones. * If the kernel doesn't have any random_adaptor adapters present then the creation of /dev/random is postponed until next random_adaptor is kldload'ed. * Fix randomdev_soft.c to refer to its own random_adaptor, instead of a system wide one. Submitted by: arthurmesh@gmail.com, obrien Obtained from: Juniper Networks Reviewed by: obrien	2013-07-29 20:26:27 +00:00
avg	4e6c4b2a36	Revert r253748,253749 This WIP should not have been committed yet. Pointyhat to: avg	2013-07-28 18:44:17 +00:00
avg	ea04b46439	put contents of cpu.h under _KERNEL no userland-serviceable parts inside MFC after: 20 days	2013-07-28 18:32:27 +00:00
ray	0ac02980ce	MFC @r219886.	2013-07-25 12:43:22 +00:00
ae	998db3fe4a	Include sys/systm.h after sys/param.h. Suggested by: pluknet	2013-07-15 15:40:57 +00:00
ae	6f8e41d6cb	Introduce new structure sfstat for collecting sendfile's statistics and remove corresponding fields from struct mbstat. Use PCPU counters and SFSTAT_INC() macro for update these statistics. Discussed with: glebius	2013-07-15 06:16:57 +00:00
marius	98abe96b02	Prefix the alias macros for members of struct __mcontext with an underscore in order to avoid a clash in the net80211 code.	2013-07-12 14:24:52 +00:00
kib	8a0279994d	Fix issues with zeroing and fetching the counters, on x86 and ppc64. Issues were noted by Bruce Evans and are present on all architectures. On i386, a counter fetch should use atomic read of 64bit value, otherwise carry from the increment on other CPU could be lost for the given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not available on the machine, it cannot be SMP and it is enough to disable preemption around read to avoid the split read. On x86 the counter increment is not atomic on purpose, which makes it possible for the store of the incremented result to override just zeroed per-cpu slot. The effect would be a counter going off by arbitrary value after zeroing. Perform the counter zeroing on the same processor which does the increments, making the operations mutually exclusive. On i386, same as for the fetching, if the cmpxchg8b is not available, machine is not SMP and we disable preemption for zeroing. PowerPC64 is treated the same as amd64. For other architectures, the changes made to allow the compilation to succeed, without fixing the issues with zeroing or fetching. It should be possible to handle them by using the 64bit loads and stores atomic WRT preemption (assuming the architectures also converted from using critical sections to proper asm). If architecture does not provide the facility, using global (spin) mutex would be non-optimal but working solution. Noted by: bde Sponsored by: The FreeBSD Foundation	2013-07-01 02:48:27 +00:00
ed	1fdf0e5b3c	Remove conflicting macros from SPARC64's atomic(9) header. The atomic_load() and atomic_store() macros conflict with the equally named macros from <stdatomic.h>. Remove them, as they are only used to implement functions that are not present on any of the other architectures.	2013-06-15 08:23:53 +00:00
ed	13f575ad68	Stick to using the documented atomic(9) API. The atomic_store_ptr() function is not part of the atomic(9) API. We only provide a version with a release barrier.	2013-06-15 08:21:54 +00:00
jeff	7ee88fb112	- Add a BIT_FFS() macro and use it to replace cpusetffs_obj() Discussed with: attilio Sponsored by: EMC / Isilon Storage Division	2013-06-13 20:46:03 +00:00
attilio	fdf82ef9cf	o Relax locking assertions for vm_page_find_least() o Relax locking assertions for pmap_enter_object() and add them also to architectures that currently don't have any o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade operation on the per-object rwlock o Use all the mechanisms above to make vm_map_pmap_enter() to work mostl of the times only with readlocks. Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-05-21 20:38:19 +00:00
alc	b55ba81a92	Relax the object locking assertion in pmap_enter_locked(). Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-05-17 18:59:00 +00:00
peter	cec511ab65	Tidy up some CVS workarounds.	2013-05-12 01:53:47 +00:00
attilio	b24a52ec9e	Rename VM_NDOMAIN into MAXMEMDOM and move it into machine/param.h in order to match the MAXCPU concept. The change should also be useful for consolidation and consistency. Sponsored by: EMC / Isilon storage division Obtained from: jeff Reviewed by: alc	2013-05-07 22:46:24 +00:00
trasz	80b8b2f779	Remove ctl(4) from GENERIC. Also remove 'options CTL_DISABLE' and kern.cam.ctl.disable tunable; those were introduced as a workaround to make it possible to boot GENERIC on low memory machines. With ctl(4) being built as a module and automatically loaded by ctladm(8), this makes CTL work out of the box. Reviewed by: ken Sponsored by: FreeBSD Foundation	2013-04-12 16:25:03 +00:00
glebius	9cf64d6c35	Merge from projects/counters: counter(9). Introduce counter(9) API, that implements fast and raceless counters, provided (but not limited to) for gathering of statistical data. See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html for more details. In collaboration with: kib Reviewed by: luigi Tested by: ae, ray Sponsored by: Nginx, Inc.	2013-04-08 19:40:53 +00:00
glebius	8c6eba117e	Merge from projects/counters: Pad struct pcpu so that its size is denominator of PAGE_SIZE. This is done to reduce memory waste in UMA_PCPU_ZONE zones. Sponsored by: Nginx, Inc.	2013-04-08 19:19:10 +00:00
mav	7c2b81b0e9	Remove all legacy ATA code parts, not used since options ATA_CAM enabled in most kernels before FreeBSD 9.0. Remove such modules and respective kernel options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the atacontrol utility and some man pages. Remove useless now options ATA_CAM. No objections: current@, stable@ MFC after: never	2013-04-04 07:12:24 +00:00
ian	5b38501da6	Fix low-level uart drivers that set their fifo sizes in the softc too late. uart(4) allocates send and receiver buffers in attach() before it calls the low-level driver's attach routine. Many low-level drivers set the fifo sizes in their attach routine, which is too late. Other drivers set them in the probe() routine, so that they're available when uart(4) allocates buffers. This fixes the ones that were setting the values too late by moving the code to probe().	2013-04-01 00:44:20 +00:00
ray	3625eb7f3d	MFC@r248830 Approved by: ed (project owner)	2013-03-28 20:27:01 +00:00
kib	7c26a038f9	Implement the concept of the unmapped VMIO buffers, i.e. buffers which do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminate the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads. The unmapped buffer should be explicitely requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag. When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation. Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bio carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitely specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap. The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests. Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested. In the rework, filesystem metadata is not the subject to maxbufspace limit anymore. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, is accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not worked, because buffer_map fragmentation does not allow the limit to be reached. By Jeff Roberson request, the getnewbuf() function was split into smaller single-purpose functions. Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks	2013-03-19 14:13:12 +00:00
kib	63efc821c3	Add pmap function pmap_copy_pages(), which copies the content of the pages around, taking array of vm_page_t both for source and destination. Starting offsets and total transfer size are specified. The function implements optimal algorithm for copying using the platform-specific optimizations. For instance, on the architectures were the direct map is available, no transient mappings are created, for i386 the per-cpu ephemeral page frame is used. The code was typically borrowed from the pmap_copy_page() for the same architecture. Only i386/amd64, powerpc aim and arm/arm-v6 implementations were tested at the time of commit. High-level code, not committed yet to the tree, ensures that the use of the function is only allowed after explicit enablement. For sparc64, the existing code has known issues and a stab is added instead, to allow the kernel linking. Sponsored by: The FreeBSD Foundation Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6) MFC after: 2 weeks	2013-03-14 20:18:12 +00:00
attilio	7fd2627275	MFC	2013-03-09 01:39:42 +00:00
attilio	bf1dc90446	MFC	2013-03-08 00:03:07 +00:00
gavin	2caee59a46	Correct two spelling mistakes in a comment.	2013-03-07 13:24:49 +00:00
attilio	e98f58faf6	MFC	2013-03-02 14:48:41 +00:00
marius	ee248b021f	- Revert the part of r247601 which turned the overtemperature and power fail interrupt shutdown handlers into filters. Shutdown_nice(9) acquires a sleep lock, which filters shouldn't do. It also seems that kern_reboot(9) still may require Giant to be hold. - Correct an incorrect argument to shutdown_nice(9). Submitted by: bde	2013-03-02 13:08:13 +00:00
marius	b1d7b9754b	Revert the part of r247600 which turned the overtemperature and power fail interrupt shutdown handlers into filters. Shutdown_nice(9) acquires a sleep lock, which filters shouldn't do. It also seems that kern_reboot(9) still may require Giant to be hold. Submitted by: bde	2013-03-02 13:04:58 +00:00
marius	dd15932a15	- Apparently, it's no longer a problem to call shutdown_nice(9) from within an interrupt filter (some other drivers in the tree do the same). So change the overtemperature and power fail interrupts from handlers in order to code and get rid of a !INTR_MPSAFE handlers. - Mark unused parameters as such. - Use NULL instead of 0 for pointers. MFC after: 1 week	2013-03-02 00:41:51 +00:00
marius	2774e0404e	- While Netra X1 generally show no ill effects when registering a power fail interrupt handler, there seems to be either a broken batch of them or a tendency to develop a defect which causes this interrupt to fire inadvertedly. Given that apart from this problem these machines work just fine, add a tunable allowing the setup of the power fail interrupt to be disabled. While at it, remove the DEBUGGER_ON_POWERFAIL compile time option and make that behavior also selectable via the newly added tunable. - Apparently, it's no longer a problem to call shutdown_nice(9) from within an interrupt filter (some other drivers in the tree do the same). So change the power fail interrupt from an handler in order to simplify the code and get rid of a !INTR_MPSAFE handler. - Use NULL instead of 0 for pointers. MFC after: 1 week	2013-03-02 00:37:31 +00:00
marius	0c5e0b209e	- In sbbc_pci_attach() just pass the already obtained bus tag and handle instead of acquiring these anew. - Use NULL instead of 0 for pointers. MFC after: 1 week	2013-03-01 20:36:59 +00:00
marius	944a48f5cd	- Remove an unused header. - Use NULL instead of 0 for pointers. - Let ofw_pcib_probe() return BUS_PROBE_DEFAULT instead of 0 so specialized PCI-PCI-bridge drivers may attach instead. - Add WARs for PLX Technology PEX 8114 bridges and PEX 8532 switches. Ideally, these should live in MI code but at least for the latter we're missing the necessary infrastructure there. MFC after: 1 week	2013-03-01 20:34:02 +00:00
mav	6cf7cc6e4d	MFcalloutng: Switch eventtimers(9) from using struct bintime to sbintime_t. Even before this not a single driver really supported full dynamic range of struct bintime even in theory, not speaking about practical inexpediency. This change legitimates the status quo and cleans up the code.	2013-02-28 13:46:03 +00:00
attilio	8d28f94790	Merge from vmobj-rwlock: VM_OBJECT_LOCKED() macro is only used to implement a custom version of lock assertions right now (which likely spread out thanks to copy and paste). Remove it and implement actual assertions. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-02-27 18:12:13 +00:00
attilio	cb47f0509b	Merge from vmobj-rwlock branch: Remove unused inclusion of vm/vm_pager.h and vm/vnode_pager.h. Sponsored by: EMC / Isilon storage division Tested by: pho Reviewed by: alc	2013-02-26 01:00:11 +00:00
attilio	905e648d42	Hide the details for the assertion for VM_OBJECT_LOCK operations. Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo) Sponsored by: EMC / Isilon storage division Requested by: alc	2013-02-21 21:54:53 +00:00
attilio	066bbc97b6	Fix other architectures and ZFS. Sponsored by: EMC / Isilon storage division	2013-02-21 15:02:36 +00:00
nwhitehorn	c5f6ffc937	IFC @ 247036	2013-02-20 14:26:51 +00:00
attilio	15bf891afe	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:03:20 +00:00
attilio	1f1e13ca03	There is no need to use VM_OBJECT_LOCKED() as the assertion won't make the check available in any case if INVARIANTS is switched off. Remove VM_OBJECT_LOCKED().	2013-02-20 10:51:34 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
kib	bd7f0fa0bb	Reform the busdma API so that new types may be added without modifying every architecture's busdma_machdep.c. It is done by unifying the bus_dmamap_load_buffer() routines so that they may be called from MI code. The MD busdma is then given a chance to do any final processing in the complete() callback. The cam changes unify the bus_dmamap_load* handling in cam drivers. The arm and mips implementations are updated to track virtual addresses for sync(). Previously this was done in a type specific way. Now it is done in a generic way by recording the list of virtuals in the map. Submitted by: jeff (sponsored by EMC/Isilon) Reviewed by: kan (previous version), scottl, mjacob (isp(4), no objections for target mode changes) Discussed with: ian (arm changes) Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris), amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)	2013-02-12 16:57:20 +00:00
kib	dfffe11d71	The 'end' word was missed in the comment. MFC after: 3 days	2013-02-08 15:52:20 +00:00
eadler	6a1efe1ad9	Remove support for plip from the GENERIC kernel as no systems in the last 10 years require this support. Discussed with: db Discussed with: kib Reviewed by: imp Reviewed by: jhb Reviewed by: -hackers Approved by: cperciva (mentor)	2013-02-01 20:17:11 +00:00
marius	6892c8a3be	Revert the part of r239864 which removed obtaining the SMP mutex around reading registers from other CPUs. As it turns out, the hardware doesn't really like concurrent IPI'ing causing adverse effects. Also the thought deadlock when using this spin lock here and the targeted CPU(s) are also holding or in case of nested locks can't actually happen. This is due to the fact that on sparc64, spinlock_enter() only raises the PIL but doesn't disable interrupts completely. Thus direct cross calls as used for the register reading (and all other MD IPI needs) still will be executed by the targeted CPU(s) in that case. MFC after: 3 days	2013-01-23 22:52:20 +00:00
marius	0e18f8466b	Revert bogus part of r241740. Reported by: Michael Moll MFC after: 3 days	2013-01-03 23:12:08 +00:00
kib	5a9188f8d3	Enable the UFS quotas for big-iron GENERIC kernels. Discussed with: mckusick MFC after: 2 weeks	2013-01-03 19:03:41 +00:00
des	67e77c00a8	As discussed on -current last October, remove the firewire drivers from GENERIC.	2013-01-03 14:30:24 +00:00
marius	f218d978bf	Revert r237842 and switch back to SCHED_ULE. All problems I encountered with the latter have been fixed with r241780. MFC after: 3 days	2012-12-16 20:54:07 +00:00
nwhitehorn	76289396e6	IFC @ 243795	2012-12-02 21:00:52 +00:00
kib	bc5bfde14d	Move the declaration of vm_phys_paddr_to_vm_page() from vm/vm_page.h to vm/vm_phys.h, where it belongs. Requested and reviewed by: alc MFC after: 2 weeks	2012-11-16 05:55:56 +00:00
jeff	f40f3c3255	- Implement run-time expansion of the KTR buffer via sysctl. - Implement a function to ensure that all preempted threads have switched back out at least once. Use this to make sure there are no stale references to the old ktr_buf or the lock profiling buffers before updating them. Reviewed by: marius (sparc64 parts), attilio (earlier patch) Sponsored by: EMC / Isilon Storage Division	2012-11-15 00:51:57 +00:00
kib	e8ae50d444	Flip the semantic of M_NOWAIT to only require the allocation to not sleep, and perform the page allocations with VM_ALLOC_SYSTEM class. Previously, the allocation was also allowed to completely drain the reserve of the free pages, being translated to VM_ALLOC_INTERRUPT request class for vm_page_alloc() and similar functions. Allow the caller of malloc* to request the 'deep drain' semantic by providing M_USE_RESERVE flag, now translated to VM_ALLOC_INTERRUPT class. Previously, it resulted in less aggressive VM_ALLOC_SYSTEM allocation class. Centralize the translation of the M_* malloc(9) flags in the single inline function malloc2vm_flags(). Discussion started by: "Sears, Steven" <Steven.Sears@netapp.com> Reviewed by: alc, mdf (previous version) Tested by: pho (previous version) MFC after: 2 weeks	2012-11-14 20:01:40 +00:00
dim	373133f0ad	Remove duplicate const specifiers in many drivers (I hope I got all of them, please let me know if not). Most of these are of the form: static const struct bzzt_type { [...list of members...] } const bzzt_devs[] = { [...list of initializers...] }; The second const is unnecessary, as arrays cannot be modified anyway, and if the elements are const, the whole thing is const automatically (e.g. it is placed in .rodata). I have verified this does not change the binary output of a full kernel build (except for build timestamps embedded in the object files). Reviewed by: yongari, marius MFC after: 1 week	2012-11-05 19:16:27 +00:00
attilio	f3501b109e	Rework the known rwlock to benefit about staying on their own cache line in order to avoid manual frobbing but using struct rwlock_padalign. Reviewed by: alc, jimharris	2012-11-03 23:03:14 +00:00
marius	807619d8ba	- Give PIL_PREEMPT the lowest priority just above low/stray interrupts. The reason for this is that the SPARC v9 architecture allows nested interrupts of higher priority/level than that of the current interrupt to occur (and we can't just entirely bypass this model, also, at least for tick interrupts, this also wouldn't be wise). However, when a preemption interrupt interrupts another interrupt of lower priority, f.e. PIL_ITHREAD, and that one in turn is nested by a third interrupt, f.e. PIL_TICK, with SCHED_ULE the execution of interrupts higher than PIL_PREEMPT may be migrated to another CPU. In particular, tl1_ret(), which is responsible for restoring the state of the CPU prior to entry to the interrupt based on the (also migrated) trap frame, then is run on a CPU which actually didn't receive the interrupt in question, causing an inappropriate processor interrupt level to be "restored". In turn, this causes interrupts of the first level, i.e. PIL_ITHREAD in the above scenario, to be blocked on the target of the migration until the correct PIL happens to be restored again on that CPU again. Making PIL_PREEMPT the lowest real priority, this effectively prevents this scenario from happening, as preemption interrupts no longer can interrupt any other interrupt besides stray ones (which is no issue). Thanks to attilio@ and especially mav@ for helping me to understand this problem at the 201208DevSummit. - Give PIL_STOP (which is also used for IPI_STOP_HARD, given that there's no real equivalent to NMIs on SPARC v9) the highest possible priority just below the hardwired PIL_TICK, so it has a chance to interrupt more things. MFC after: 1 week	2012-10-20 12:07:48 +00:00
marius	0e1f679c31	- Remove an unused header. - Don't waste a delay slot. MFC after: 3 days	2012-10-19 17:12:55 +00:00
marius	9e47c0d1ff	Let SCHED_ULE give affinity to the CPU the tick interrupt triggered on when running tick_process(), similarly to what the x86 equivalents of this function do, however employing the less racy sequence also used in intr_event_handle(). MFC after: 3 days	2012-10-19 13:32:37 +00:00
attilio	6997194551	Add an unified macro to deny ability from the compiler to reorder instruction loads/stores at its will. The macro __compiler_membar() is currently supported for both gcc and clang, but kernel compilation will fail otherwise. Reviewed by: bde, kib Discussed with: dim, theraven MFC after: 2 weeks	2012-10-09 14:32:30 +00:00
attilio	3212891c92	Reverts r234074,234105,234564,234723,234989,235231-235232 and part of r234247. Use, instead, the static intializer introduced in r239923 for x86 and sparc64 intr_cpus, unwinding the code to the initial version. Reviewed by: marius	2012-10-09 12:22:43 +00:00
alc	55f6ff40ed	Eliminate a stale comment. It describes another use case for the pmap in Mach that doesn't exist in FreeBSD.	2012-09-28 05:30:59 +00:00
eadler	8600cbb5b6	Correct double "the the" Approved by: cperciva MFC after: 3 days	2012-09-14 21:28:56 +00:00
attilio	8dece93b14	userret() already checks for td_locks when INVARIANTS is enabled, so there is no need to check if Giant is acquired after it. Reviewed by: kib MFC after: 1 week	2012-09-08 18:27:11 +00:00
gavin	9db084a6c3	Prevent indent(1) from reformatting this comment, as it contains a formatting-sensitive table.	2012-09-07 08:18:06 +00:00
marius	1829ce3546	Add a global MD macro for the VIS block size instead of duplicating it and using magic values all over the place. MFC after: 1 week	2012-08-31 11:15:01 +00:00
marius	9f542929d1	- Unlike cache invalidation and TLB demapping IPIs, reading registers from other CPUs doesn't require locking so get rid of it. As the latter is used for the timecounter on certain machine models, using a spin lock in this case can lead to a deadlock with the upcoming callout(9) rework. - Merge r134227/r167250 from x86: Avoid cross-IPI SMP deadlock by using the smp_ipi_mtx spin lock not only for smp_rendezvous_cpus() but also for the MD cache invalidation and TLB demapping IPIs. - Mark some unused function arguments as such. MFC after: 1 week	2012-08-29 16:56:50 +00:00
gjb	3f013cdf9f	Grammar fix: s/NIC's/NICs/ MFC after: 3 days	2012-08-26 01:21:02 +00:00
marius	4c012f63e6	Merge r236494 from x86: Isolate the global TTE list lock from data and other locks to prevent false sharing within the cache. MFC after: 3 days	2012-08-05 22:03:13 +00:00
marius	d339b71305	Switch back to the 4BSD scheduler for now. There is some more or less recent regression with ULE, causing processes to get stuck in getblk as well as interrupt handler execution delays to rise above the command timeout of mpt(4). MFC after: 3 days	2012-06-30 14:55:36 +00:00
ken	da17d879d7	Now that the mps(4) driver is endian-safe, add it to the powerpc and sparc64 GENERIC config files. MFC after: 3 days	2012-06-28 20:48:24 +00:00
alc	c5e6daff9d	Add new pmap layer locks to the predefined lock order. Change the names of a few existing VM locks to follow a consistent naming scheme.	2012-06-27 03:45:25 +00:00
andrew	0a7002aae7	Make the wchar_t type machine dependent. This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an unsigned short with the former preferred. Because of this requirement we need to move the definition of __wchar_t to a machine dependent header. It also cleans up the macros defining the limits of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine dependent header then using them to define WCHAR_MIN and WCHAR_MAX respectively. Discussed with: bde	2012-06-24 04:15:58 +00:00
kib	7b36a08108	Implement mechanism to export some kernel timekeeping data to usermode, using shared page. The structures and functions have vdso prefix, to indicate the intended location of the code in some future. The versioned per-algorithm data is exported in the format of struct vdso_timehands, which mostly repeats the content of in-kernel struct timehands. Usermode reading of the structure can be lockless. Compatibility export for 32bit processes on 64bit host is also provided. Kernel also provides usermode with indication about currently used timecounter, so that libc can fall back to syscall if configured timecounter is unknown to usermode code. The shared data updates are initiated both from the tc_windup(), where a fast task is queued to do the update, and from sysctl handlers which change timecounter. A manual override switch kern.timecounter.fast_gettime allows to turn off the mechanism. Only x86 architectures export the real algorithm data, and there, only for tsc timecounter. HPET counters page could be exported as well, but I prefer to not further glue the kernel and libc ABI there until proper vdso-based solution is developed. Minimal stubs neccessary for non-x86 architectures to still compile are provided. Discussed with: bde Reviewed by: jhb Tested by: flo MFC after: 1 month	2012-06-22 07:06:40 +00:00
kib	d98ad62d7e	Reserve AT_TIMEKEEP auxv entry for providing usermode the pointer to timekeeping information. MFC after: 1 week	2012-06-22 06:38:31 +00:00

1 2 3 4 5 ...

2088 Commits