freebsd-skq

Author	SHA1	Message	Date
delphij	c07f0f872d	Implement AT_SECURE properly. AT_SECURE auxv entry has been added to the Linux 2.5 kernel to pass a boolean flag indicating whether secure mode should be enabled. 1 means that the program has changed its credentials during the execution. Being exported, AT_SECURE is used by the glibc issetugid() call. Submitted by: imp, dchagin Security: FreeBSD-SA-16:10.linux Security: CVE-2016-1883	2016-01-27 07:20:55 +00:00
dchagin	1fea9511b6	Remove obsolete comment. MFC after: 3 days	2016-01-23 08:08:06 +00:00
dchagin	4258ee00b5	Fix a typo. MFC after: 3 days	2016-01-23 08:04:29 +00:00
hselasky	3e2da6e430	Add missing atomic wrapper macro. Reviewed by: alfred @ Sponsored by: Mellanox Technologies MFC after: 1 week	2016-01-21 18:22:50 +00:00
kib	63cfb09a9a	Use ANSI definitions. Wrap long line. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-01-19 08:08:08 +00:00
kib	bee088bb00	Clear whole XMM register file instead of only XMM0. Also clear x87 registers. This brings amd64 on par with i386, providing consistent initial FPU state. Note that we do not clear any extended state, at least because kernel does not understand extended state structure and consequences of zero overwrite after fninit()/fpusave(). Submitted by: joss.upton@yahoo.com PR: 206370 MFC after: 2 weeks	2016-01-19 08:04:02 +00:00
glebius	f65cb2db64	Regen after r293907.	2016-01-14 10:15:21 +00:00
glebius	d87c627c80	Change linux get_robust_list system call to match actual linux one. The set_robust_list system call requests the kernel to record the head of the list of robust futexes owned by the calling thread. The head argument is the list head to record. The get_robust_list system call should return the head of the robust list of the thread whose thread id is specified in pid argument. The list head should be stored in the location pointed to by head argument. In contrast, our implementation of the get_robust_list system call copies the known portion of memory pointed to by the robust list head pointer recorded by the set_robust_list system call to the location pointed to by the head argument. So, it is possible for a local attacker to read portions of kernel memory, which may result in a privilege escalation. Submitted by: mjg Security: SA-16:03.linux	2016-01-14 10:13:58 +00:00
jkim	9dcfa1d85c	Remove dead code when the target processor has POPCNT instruction.	2016-01-13 19:19:50 +00:00
dchagin	e706df7b9a	Implement vsyscall hack. Prior to 2.13 glibc uses vsyscall instead of vdso. An upcoming linux_base-c6 needs it. Differential Revision: https://reviews.freebsd.org/D1090 Reviewed by: kib, trasz MFC after: 1 week	2016-01-09 20:18:53 +00:00
emaste	2029b75c0e	Move amd64 metadata.h to x86 and share with i386 MFC after: 1 week	2016-01-07 19:47:26 +00:00
ian	3d96cedc35	Make the 'env' directive described in config(5) work on all architectures, providing compiled-in static environment data that is used instead of any data passed in from a boot loader. Previously 'env' worked only on i386 and arm xscale systems, because it required the MD startup code to examine the global envmode variable and decide whether to use static_env or an environment obtained from the boot loader, and set the global kern_envp accordingly. Most startup code wasn't doing so. Making things even more complex, some mips startup code uses an alternate scheme that involves calling init_static_kenv() to pass an empty buffer and its size, then uses a series of kern_setenv() calls to populate that buffer. Now all MD startup code calls init_static_kenv(), and that routine provides a single point where envmode is checked and the decision is made whether to use the compiled-in static_kenv or the values provided by the MD code. The routine also continues to serve its original purpose for mips; if a non-zero buffer size is passed the routine installs the empty buffer ready to accept kern_setenv() values. Now if the size is zero, the provided buffer full of existing env data is installed. A NULL pointer can be passed if the boot loader provides no env data; this allows the static env to be installed if envmode is set to do so. Most of the work here is a near-mechanical change to call the init function instead of directly setting kern_envp. A notable exception is in xen/pv.c; that code was originally installing a buffer full of preformatted env data along with its non-zero size (like mips code does), which would have allowed kern_setenv() calls to wipe out the preformatted data. Now it passes a zero for the size so that the buffer of data it installs is treated as non-writeable.	2016-01-02 02:53:48 +00:00
jhb	994c23f093	Move shared variables from {amd64,i386}/initcpu.c to x86/identcpu.c. While here, move the common bits of <machine/cputypes.h> to <x86/cputypes.h> as well. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D4670	2015-12-23 21:41:42 +00:00
ngie	3dc5879a19	Remove redundant ctx_switch_xsave declaration in sys/amd64/include/md_var.h This variable was added to sys/x86/include/x86_var.h recently. This unbreaks building kernel source that #includes both md_var.h and x86_var.h with gcc 4.2.1 on amd64 Differential Revision: https://reviews.freebsd.org/D4686 Reviewed by: kib X-MFC with: r291949 Sponsored by: EMC / Isilon Storage Division	2015-12-22 20:08:32 +00:00
imp	3e2743eaf6	Save the physical address passed into the kernel of the UEFI system table.	2015-12-19 19:01:43 +00:00
kib	f124247e27	Merge common parts of i386 and amd64 md_var.h and smp.h into new headers x86/include x86_var.h and x86_smp.h. Reviewed by: emaste, jhb Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D4358	2015-12-07 17:41:20 +00:00
kib	fcdb3dc23f	Use ANSI C definition. MFC after: 1 week	2015-12-07 17:24:55 +00:00
cem	a15dada94b	pmap_invalidate_range: For very large ranges, flush the whole TLB Typical TLBs have 40-512 entries available. At some point, iterating every single page in a requested invalidation range and issuing invlpg on it is more expensive than flushing the TLB and allowing it to reload on demand. Broadwell CPUs have 1536 L2 TLB entries, so I've picked the arbitrary number 4096 entries as a heuristic at which point we flush TLB rather than invalidating every single potential page. Reviewed by: alc Feedback from: jhb, kib MFC notes: Depends on r291688 Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D4280	2015-12-06 17:39:13 +00:00
kib	f741f698b7	For amd64 non-PCID machines, and for i386 machines with support for the PG_G global pte flag, pmap_invalidate_all() fails to flush global TLB entries []. This is because TLB shootdown handler for such configs reloads CR3, and on i386 pmap_invalidate_all() does the same for the initiating CPU. Note that current code does not issue total invalidation requests for the kernel_pmap. Rename amd64 function invltlb_globpcid() to invltlb_glob(), it is not specific for PCID for quite some time, and implement the same functionality for i386. Use the function instead of invltlb() in shootdown handlers and in i386 pmap_invalidate_all(), but only for the kernel pmap (which maps pages with the PG_G attribute set), which takes care of PG_G TLB entries on flush. To detect the affected pmap in i386 TLB shootdown handler, pmap should be passed to the smp_masked_invltlb() function, which makes amd64 and i386 TLB shootdown code almost identical. Merge the code under x86/. Noted by: jhb [] Reviewed by: cem, jhb, pho Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D4346	2015-12-03 11:14:14 +00:00
kib	ee461b4bba	Remove sv_prepsyscall, sv_sigsize and sv_sigtbl members of the struct sysent. sv_prepsyscall is unused. sv_sigsize and sv_sigtbl translate signal number from the FreeBSD namespace into the ABI domain. It is only utilized on i386 for iBCS2 binaries. The issue with this approach is that signals for iBCS2 were delivered with the FreeBSD signal frame layout, which does not follow iBCS2. The same note is true for any other potential user of sv_sigtbl. In other words, if ABI needs signal number translation, it really needs custom sv_sendsig method instead. Sponsored by: The FreeBSD Foundation	2015-11-28 08:49:07 +00:00
emaste	c168857c6a	Fix whitespace on addition of IPSEC option	2015-11-26 21:35:50 +00:00
kib	e0c4faece4	Split kernel timekeep ABI structure vdso_sv_tk out of the struct sysentvec. This allows the timekeep data to be shared between similar ABIs which cannot share sysentvec. Make the timekeep_push_vdso() tick callback to the timekeep structures instead of sysentvecs. If several sysentvec share the vdso_sv_tk structure, we would update the userspace data several times on each tick, without the change. Only allocate vdso_sv_tk in the exec_sysvec_init() sysinit when sysentvec is marked with the new SV_TIMEKEEP flag. This saves allocation and update of unneeded vdso_sv_tk for ABIs which do not provide userspace gettimeofday yet, which are PowerPC arches right now. Make vdso_sv_tk allocator public, namely split out and export alloc_sv_tk() and alloc_sv_tk_compat32(). ABIs which share timekeep data now can allocate it manually and share as appropriate. Requested by: nwhitehorn Tested by: nwhitehorn, pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-11-23 07:09:35 +00:00
markj	3e47d7787e	Remove unneeded includes of opt_kdtrace.h. As of r258541, KDTRACE_HOOKS is defined in opt_global.h, so opt_kdtrace.h is not needed when defining SDT(9) probes.	2015-11-22 02:01:01 +00:00
jhb	c1d9f70889	Export various helper variables describing the layout and size of certain kernel structures for use by debuggers. This mostly aids in examining cores from a kernel without debug symbols as a debugger can infer these values if debug symbols are available. One set of variables describes the layout of 'struct linker_file' to walk the list of loaded kernel modules. A second set of variables describes the layout of 'struct proc' and 'struct thread' to walk the list of processes in the kernel and the threads in each process. The 'pcb_size' variable is used to index into the stoppcbs[] array. The 'vm_maxuser_address' is used to distinguish kernel virtual addresses from user addresses. This doesn't have to be perfect, and 'vm_maxuser_address' is a cheap and simple way to differentiate kernel pointers from simple values like TIDs and PIDs. While here, annotate the fields in struct pcb used by kgdb on amd64 and i386 to note that their ABI should be preserved. Annotations for other platforms will be added in the future. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D3773	2015-11-12 22:00:59 +00:00
cem	dffa7f0590	pmap_change_attr: Only fixup DMAP for DMAPed ranges pmap_change_attr must change the memory type of both the requested KVA and the corresponding DMAP mappings (if such mappings exist), to satisfy an Intel requirement that two or more mappings to the same physical pages must have the same memory type. However, not all kernel mapped pages have corresponding DMAP mappings -- for example, 64-bit BARs. Skip fixing up the DMAP for out-of-bounds addresses. Submitted by: Steve Wahl <steve_wahl@dell.com> Reviewed by: alc, jhb Sponsored by: Dell Compellent Differential Revision: https://reviews.freebsd.org/D4030	2015-10-29 19:07:00 +00:00
jhb	9ee931e10a	Update for LINUX32 rename. The assembler didn't complain about undefined symbols but just used 0 after the rename.	2015-10-29 15:20:47 +00:00
jhb	617f6c60b6	Fix build with DEBUG defined. Reported by: hselasky	2015-10-29 15:16:47 +00:00
mckusick	485c0ddc22	Bring the tags and links entries for amd64 up to date. Based on how out of date it is, I doubt that anyone other than me and my code-reading students still use it.	2015-10-27 22:59:24 +00:00
kib	919ebacc13	Intel SDM before revision 56 described the CLFLUSH instruction as only ordered with the MFENCE instruction. Similar weak guarantees are also specified by the AMD APM vol. 3 rev. 3.22. x86 pmap methods pmap_invalidate_cache_range() and pmap_invalidate_cache_pages() braced CLFLUSH loop with MFENCE both before and after the loop. In the revision 56 of SDM, Intel stated that all existing implementations of CLFLUSH are strict, CLFLUSH instructions execution is ordered WRT other CLFLUSH and writes. Also, the strict behaviour is made architectural. A new instruction CLFLUSHOPT (which was documented for some time in the Instruction Set Extensions Programming Reference) provides the weak behaviour which was previously attributed to CLFLUSH. Use CLFLUSHOPT when available. When CLFLUSH is used on Intel CPUs, do not execute MFENCE before and after the flushing loop. Reviewed by: alc Sponsored by: The FreeBSD Foundation	2015-10-24 21:37:47 +00:00
kib	7eb36dd3f9	Add CLFLUSHOPT instruction wrappers. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-10-23 11:45:38 +00:00
jhb	4a875c71d5	Regen for linux32 rename and linux64 systrace.	2015-10-22 21:33:37 +00:00
jhb	9740ac3060	Rename remaining linux32 symbols such as linux_sysent[] and linux_syscallnames[] from linux_* to linux32_* to avoid conflicts with linux64.ko. While here, add support for linux64 binaries to systrace. - Update NOPROTO entries in amd64/linux/syscalls.master to match the main table to fix systrace build. - Add a special case for union l_semun arguments to the systrace generation. - The systrace_linux32 module now only builds the systrace_linux32.ko. module on amd64. - Add a new systrace_linux module that builds on both i386 and amd64. For i386 it builds the existing systrace_linux.ko. For amd64 it builds a systrace_linux.ko for 64-bit binaries. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D3954	2015-10-22 21:28:20 +00:00
jhb	2705fe5cc1	Merge r289055 to amd64/linux32: linux: fix handling of out-of-bounds syscall attempts Due to an off by one the code would read an entry past the table, as opposed to the last entry which contains the nosys handler.	2015-10-22 21:23:58 +00:00
ed	7fb0afec66	Refactoring: move out generic bits from cloudabi64_sysvec.c. In order to make it easier to support CloudABI on ARM64, move out all of the bits from the AMD64 cloudabi_sysvec.c into a new file cloudabi_module.c that would otherwise remain identical. This reduces the AMD64 specific code to just ~160 lines. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D3974	2015-10-22 09:07:53 +00:00
royger	d31f5fb7b9	x86/xen: Consolidate xen-os.h in a single place amd64 and i386 platform code contain very similar xen/xen-os.h The only differences are: - Functions/variables/types which were unused in i386/xen/xen-os.h: * xen_xchg * __xchg_dummy * __xg * __xchg * atomic_t * atomic_inc * rdtscll The functions/variables/types unused in xen-os.h can be dropped and there are no more differences between amd64 and i386. The new header is placed in x86/include/xen and each platform will have dummy headers include x86/xen/.h. This is to be able to include machine/xen/.h in the PV drivers. Submitted by: Julien Grall <julien.grall@citrix.com> Reviewed by: royger Differential Revision: https://reviews.freebsd.org/D3880 Sponsored by: Citrix Systems R&D	2015-10-21 10:04:35 +00:00
mav	64d53c4c7d	Remove compatibility shims for legacy ATA device names. We got new ATA stack in FreeBSD 8.x, switched to it at 9.x, completely removed old stack at 10.x, so at 11.x it is time to remove compat shims.	2015-10-11 13:01:51 +00:00
mjg	d8dc4fc1ae	linux: fix handling of out-of-bounds syscall attempts Due to an off by one the code would read an entry past the table, as opposed to the last entry which contains the nosys handler. Reported by: Pawel Biernacki <pawel.biernacki gmail.com>	2015-10-08 21:08:35 +00:00
royger	375ecc42de	xen/console: Introduce a new console driver for Xen guest The current Xen console driver is crashing very quickly when using it on an ARM guest. This is because the console lock is recursive and it may lead to recursion on the tty lock and/or corrupt the ring pointer. Furthermore, the console lock is not always taken where it should be and has to be released too early because of the way the console has been designed. Over the years, code has been modified to support various new features but the driver has not been reworked. This new driver has been rewritten with the idea of only having a small set of specific functions to write either via the shared ring or the hypercall interface. Note that HVM support has been left aside for now because it requires additional features which are not yet supported. A follow-up patch will be sent with HVM guest support. List of items that may be good to have but not mandatory: - Avoid flushing for each character written when using the tty - Support multiple consoles Submitted by: Julien Grall <julien.grall@citrix.com> Reviewed by: royger Differential Revision: https://reviews.freebsd.org/D3698 Sponsored by: Citrix Systems R&D	2015-10-08 16:39:43 +00:00
royger	c1bb2e3246	Update Xen headers from 4.2 to 4.6 Pull the latest headers for Xen which allow us to add support for ARM and use new features in FreeBSD. This is a verbatim copy of the xen/include/public so every header which doesn't exist anymore in the Xen repositories has been dropped. Note the interface version hasn't been bumped, it will be done in a follow-up. However, it requires fixes in the code to get it compiled: - sys/xen/xen_intr.h: evtchn_port_t is already defined in the headers so drop it. - {amd64,i386}/include/intr_machdep.h: NR_EVENT_CHANNELS now depends on xen/interface/event_channel.h, so include it. - {amd64,i386}/{amd64,i386}/support.S: It's not necessary to include machine/intr_machdep.h. This also fixes the build with the new headers. - dev/xen/blkfront/blkfront.c: The typedef for blkif_request_segment has been dropped. So directly use struct blkif_request_segment Finally, modify xen/interface/xen-compat.h to throw a preprocessing error if __XEN_INTERFACE_VERSION__ is not set. This allows us to catch any file where xen/xen-os.h is not correctly included. Submitted by: Julien Grall <julien.grall@citrix.com> Reviewed by: royger Differential Revision: https://reviews.freebsd.org/D3805 Sponsored by: Citrix Systems R&D	2015-10-06 11:29:44 +00:00
alc	57f2addb31	Exploit r288122 to address a cosmetic issue. Since PV chunk pages don't belong to a vm object, they can't be paged out. Since they can't be paged out, they are never enqueued in a paging queue. Nonetheless, passing PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages are being enqueued in the inactive queue. As of r288122, we can avoid this false impression by passing PQ_NONE. Submitted by: kmacy (an earlier version) Differential Revision: https://reviews.freebsd.org/D1674	2015-09-26 07:18:05 +00:00
mjg	0ea1d42c18	amd64: plug redundant bootAP declaration Reported by: gcc5	2015-09-22 21:07:47 +00:00
kib	518734671f	Add support for weak symbols to the kernel linkers. It means that linkers no longer raise an error when undefined weak symbols are found, but relocate as if the symbol value was 0. Note that we do not repeat the mistake of userspace dynamic linker of making the symbol lookup prefer non-weak symbol definition over the weak one, if both are available. In fact, kernel linker uses the first definition found, and ignores duplicates. Signature of the elf_lookup() and elf_obj_lookup() functions changed to split result/error code and the symbol address returned. Otherwise, it is impossible to return zero address as the symbol value, to MD relocation code. This explains the mechanical changes in elf_machdep.c sources. The powerpc64 R_PPC_JMP_SLOT handler did not check error from the lookup() call, the patch leaves the code as is (untested). Reported by: glebius Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-09-20 01:27:59 +00:00
markj	e8967c8bd9	Add stack_save_td_running(), a function to trace the kernel stack of a running thread. It is currently implemented only on amd64 and i386; on these architectures, it is implemented by raising an NMI on the CPU on which the target thread is currently running. Unlike stack_save_td(), it may fail, for example if the thread is running in user mode. This change also modifies the kern.proc.kstack sysctl to use this function, so that stacks of running threads are shown in the output of "procstat -kk". This is handy for debugging threads that are stuck in a busy loop. Reviewed by: bdrewery, jhb, kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3256	2015-09-11 03:54:37 +00:00
markj	dfb0cc5c03	Merge stack(9) implementations for i386 and amd64 under x86/. Reviewed by: jhb, kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3255	2015-09-11 03:24:07 +00:00
kib	98df6be028	Do not hold the process around the vm_fault() call from the trap()s. The only operation which is prevented by the hold is the kernel stack swapout for the faulted thread, which should be fine to allow. Remove useless checks for NULL curproc or curproc->p_vmspace from the trap_pfault() wrappers on x86 and powerpc. Reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-09-10 17:46:48 +00:00
markj	c72177e471	Remove an unneeded instruction. MFC after: 1 week	2015-08-28 00:17:21 +00:00
cem	06ccf4bc96	Import ioat(4) driver I/OAT is also referred to as Crystal Beach DMA and is a Platform Storage Extension (PSE) on some Intel server platforms. This driver currently supports DMA descriptors only and is part of a larger effort to upstream an interconnect between multiple systems using the Non-Transparent Bridge (NTB) PSE. For now, this driver is only built on AMD64 platforms. It may be ported to work on i386 later, if that is desired. The hardware is exclusive to x86. Further documentation on ioat(4), including API documentation and usage, can be found in the new manual page. Bring in a test tool, ioatcontrol(8), in tools/tools/ioat. The test tool is not hooked up to the build and is not intended for end users. Submitted by: jimharris, Carl Delsey <carl.r.delsey@intel.com> Reviewed by: jimharris (reviewed my changes) Approved by: markj (mentor) Relnotes: yes Sponsored by: Intel Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3456	2015-08-24 19:32:03 +00:00
royger	5b319fbe38	preload_search_info: make sure mod is set Add a check to preload_search_info to make sure mod is set. Most of the callers of preload_search_info don't check that the mod parameter is set, which can cause page faults. While at it, remove some now unnecessary checks before calling preload_search_info. Sponsored by: Citrix Systems R&D Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D3440	2015-08-21 15:57:57 +00:00
bapt	95d9fae2fd	Add a kern.features.cloudabi64 entry when the module is loaded to help userland test whether cloudabi64 is supported or not Reviewed by: ed Differential Revision: https://reviews.freebsd.org/D3430	2015-08-19 15:18:32 +00:00
marcel	a428fe7759	Add 24 more page table pages we allocate on boot-up. 16MB slop is a little tight in and by itself, but severely insufficient when one needs to map a large frame buffer as part of console initialization. 64MB slop should be enough for a while. As an example: a 15" MacBook Pro with retina display needs ~28MB of KVA for the frame buffer. PR: 193745	2015-08-18 01:53:41 +00:00
kib	6e1eb32ced	XEN/amd64 may initiate i/o over the pages not mapped by the direct map. Handle busdma bouncing and ata PIO accesses by using global frame used by the current CPU locally for the duration of pmap_quick_enter/remove_page(). A spin mutex protects the concurrent frame use and prevents thread migration. Noted by: royger Reviewed by: alc, jah, royger (previous version) Sponsored by: The FreeBSD Foundation	2015-08-17 18:42:45 +00:00
marcel	bacabe8a7e	Better support memory mapped console devices, such as VGA and EFI frame buffers and memory mapped UARTs. 1. Delay calling cninit() until after pmap_bootstrap(). This makes sure we have PMAP initialized enough to add translations. Keep kdb_init() after cninit() so that we have console when we need to break into the debugger on boot. 2. Unfortunately, the ATPIC code had to be moved as well so as to avoid a spurious trap #30. The reason for which is not known at this time. 3. In pmap_mapdev_attr(), when we need to map a device prior to the VM system being initialized, use virtual_avail as the KVA to map the device at. In particular, avoid using the direct map on amd64 because we can't demote by virtue of not being able to allocate yet. Keep track of the translation. Re-use the translation after the VM has been initialized to not waste KVA and to satisfy the assumption in uart(4) that the handle returned for the low-level console is the same as later returned when the device is probed and attached. 4. In pmap_unmapdev() remove the mapping from the table when called pre-init. Otherwise keep the mapping. During bus probe and attach device resources are mapped and unmapped multiple times, which would have us destroy the mapping used by the low-level console. 5. In pmap_init(), set pmap_initialized to signal that we're not pre-init anymore. On amd64, bring the direct map in sync with the translations created at that time. 6. Implement bus_space_map() and bus_space_unmap() for real: when the tag corresponds to memory space, call the corresponding pmap_mapdev() and pmap_unmapdev() functions to construct an actual handle. 7. In efifb.c and vt_vga.c, remove the crutches and hacks and simply call pmap_mapdev_attr() or bus_space_map() as desired. Notes: 1. uart(4) already used bus_space_map() during low-level console setup but since serial ports have traditionally been I/O port based, the lack of a proper implementation for said function was not a problem. It has always supported memory mapped UARTs for low-level consoles by setting hw.uart.console accordingly. 2. The use of the direct map on amd64 without setting caching attributes has been a bigger problem than previously thought. This change has the fortunate (and unexpected) side-effect of fixing various EFI frame buffer problems (though not all). PR: 191564, 194952 Special thanks to: 1. XipLink, Inc -- generously donated an Intel Bay Trail E3800 based eval board (ADLE3800PC). 2. The FreeBSD Foundation, in particular emaste@ -- for UEFI support in general and testing. 3. Everyone who tested the proposed for PR 191564. 4. jhb@ and kib@ for being a soundboard and applying a clue bat if so needed.	2015-08-12 15:26:32 +00:00
kib	9458aa2b57	Initialization of smp_tlb_wait does not require release semantic, no data is synchronized by store/load to the variable. The lapic_write_icr() function ensures that store buffers are flushed before IPI command is issued. Discussed with: bde Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-08-12 09:46:39 +00:00
kib	bff0ec5bef	AP should load aps_ready with acquire semantic to see BSP updates to the SMP structures, synchronized with the load by release store in release_aps(). The change is formal, x86 strong memory model implicitly provided the guarantees. Discussed with: bde Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-08-12 09:43:12 +00:00
kib	9033c894a1	Make kstack_pages a tunable on arm, x86, and powerpc. On i386, the initial thread stack is not adjusted by the tunable, the stack is allocated too early to get access to the kernel environment. See TD0_KSTACK_PAGES for the thread0 stack sizing on i386. The tunable was tested on x86 only. From the visual inspection, it seems that it might work on arm and powerpc. The arm USPACE_SVC_STACK_TOP and powerpc USPACE macros seem to be already incorrect for the threads with non-default kstack size. I only changed the macros to use variable instead of constant, since I cannot test. On arm64, mips and sparc64, some static data structures are sized by KSTACK_PAGES, so the tunable is disabled. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-08-10 17:18:21 +00:00
jhb	47d8edd4b1	Remove some more vestiges of the Xen PV domu support. Specifically, use vtophys() directly instead of vtomach() and retire the no-longer-used headers <machine/xenfunc.h> and <machine/xenvar.h>. Reported by: bde (stale bits in <machine/xenfunc.h>) Reviewed by: royger (earlier version) Differential Revision: https://reviews.freebsd.org/D3266	2015-08-06 17:07:21 +00:00
emaste	002d9943c1	Rationalize BSD license on sys/*/include/in_cksum.h Remove the advertising clause from the Regents of the University of California's license, per the letter dated July 22, 1999. Update clause numbering.	2015-08-05 19:05:12 +00:00
jah	b8c4d76738	Add two new pmap functions: vm_offset_t pmap_quick_enter_page(vm_page_t m) void pmap_quick_remove_page(vm_offset_t kva) These will create and destroy a temporary, CPU-local KVA mapping of a specified page. Guarantees: --Will not sleep and will not fail. --Safe to call under a non-sleepable lock or from an ithread Restrictions: --Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms --Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page. --MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm. NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing. Reviewed by: kib Approved by: kib (mentor) Differential Revision: http://reviews.freebsd.org/D3013	2015-08-04 19:46:13 +00:00
imp	04e6a60087	Add pmspcv device back to GENERIC. The issues with the device playing grabby hands with other drivers' devices have been solved. MFC After: 3 weeks	2015-08-03 13:49:46 +00:00
ed	4f67d9c92c	Let CloudABI use the SV_CAPSICUM flag. CloudABI processes will now start up in capabilities mode. Reviewed by: kib	2015-08-03 13:42:52 +00:00
kib	b31c115daa	Clear the IA32_MISC_ENABLE MSR bit, which limits the max CPUID reported, on APs. We already did this on BSP. Otherwise, the userspace software which depends on the features reported by the high CPUID levels is misbehaving. In particular, AVX detection is non-functional, depending on which CPU thread happens to execute when doing CPUID. Another victim is the libthr signal handlers interposer, which needs to save full FPU extended state. Reported and tested by: Andre Meiser <ortadur@web.de> Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-08-03 12:14:42 +00:00
ed	c2712ff846	Set p_osrel to __FreeBSD_version on process startup. Certain system calls have quirks applied to make them work as if called on an older version of FreeBSD. As CloudABI executables don't have the FreeBSD OS release number in the ELF header, this value is set to zero, making the system calls fall back to typically historic, non-standard behaviour. Reviewed by: kib	2015-08-03 07:29:57 +00:00
gjb	8ddbbce4f6	Pull pmspcv (pms(4)) from GENERIC. It has PCI ID conflicts with ahd(4), mvs(4), and likely other drivers. MFC after: immediately With hat: re Sponsored by: The FreeBSD Foundation	2015-07-31 15:23:48 +00:00
kib	88c35d6516	Improve comments. Submitted by: bde MFC after: 2 weeks	2015-07-30 15:47:53 +00:00
kib	45167e7aef	Remove full barrier from the amd64 atomic_load_acq_*(). Strong ordering semantic of x86 CPUs makes only the compiler barrier necessary to give the acquire behaviour. Existing implementation ensured sequentially consistent semantic for load_acq, making much stronger guarantee than required by standard's definition of the load acquire. Consumers which depend on the barrier are believed to be identified and already fixed to use proper operations. Noted by: alc (long time ago) Reviewed by: alc, bde Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-07-28 07:04:51 +00:00
alc	268d180ee5	Add a comment discussing the appropriate use of the atomic_*() functions with acquire and release semantics versus the *mb() functions on amd64 processors. Reviewed by: bde (an earlier version), kib Sponsored by: EMC / Isilon Storage Division	2015-07-24 19:43:18 +00:00
jhb	d3b87ae234	Various changes to the registers displayed in DDB for x86. - Fix segment registers to only display the low 16 bits. - Remove unused handlers and entries for the debug registers. - Display xcr0 (if valid) in 'show sysregs'. - Add '0x' prefix to MSR values to match other values in 'show sysregs'. - MFamd64: Display various MSRs in 'show sysregs'. - Add a 'show dbregs' to display the value of debug registers. - Dynamically size the column width for register values to properly align columns on 64-bit platforms. - Display %gs for i386 in 'show registers'. Differential Revision: https://reviews.freebsd.org/D2784 Reviewed by: kib, markj MFC after: 2 weeks	2015-07-22 01:09:02 +00:00
markj	825752c2e5	Let the unwinder handle faults during function prologues or epilogues. The i386 and amd64 DDB stack unwinders contain code to detect and handle the case where the first frame is not completely set up or torn down. This code was accidentally unused however, since db_backtrace() was never called with a non-NULL trap frame. This change fixes that. Also remove get_rsp() from the amd64 code. It appears to have come from i386, which needs to take into account whether the exception triggered a CPL switch, since SS:ESP is only pushed onto the stack if so. On amd64, SS:RSP is pushed regardless, so get_rsp() was doing the wrong thing for kernel-mode exceptions. As a result, we can also remove custom print functions for these registers. Reviewed by: jhb Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D2881	2015-07-21 23:22:23 +00:00
markj	0f1eab2023	Improve stack unwinding on i386 and amd64 after an IP fault. If we can't find a symbol corresponding to the faulting instruction, assume that the previously-executed function is a call and attempt to find the calling function using the return address on the stack. Otherwise we end up associating the last stack frame with the current call, which is incorrect and causes the unwinder to skip printing of the calling function, resulting in a confusing backtrace. Reviewed by: jhb Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D2859	2015-07-21 23:13:11 +00:00
markj	2d0eee8395	Remove some dead code from DDB's amd64 stack unwinder. The amd64 port copied some code from i386 to fetch function arguments and display them in backtraces. However, it was commented out and can't easily be implemented since the function arguments are passed in registers rather than on the stack in amd64. Remove it in preparation for some bug fixes in this area. Reviewed by: jhb Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D2857	2015-07-21 23:03:21 +00:00
ed	9ab04b81ba	Describe COMPAT_CLOUDABI64 in the amd64 configuration NOTES file.	2015-07-21 12:53:47 +00:00
ed	5b106acac6	Make thread creation work for CloudABI processes. Summary: Remove the stub system call that was put in place during the system call import and replace it by a target-dependent version stored in sys/amd64. Initialize the thread in a way similar to cpu_set_upcall_kse(). We provide the entry point with two arguments: the thread ID and the argument pointer. Test Plan: Thread creation still seems to work, both for FreeBSD and CloudABI binaries. Reviewers: dchagin, mjg, kib Reviewed By: kib Subscribers: imp Differential Revision: https://reviews.freebsd.org/D3110	2015-07-21 12:47:15 +00:00
ed	3200268ec4	Make forking of CloudABI processes work. Just like FreeBSD+Capsicum, CloudABI uses process descriptors. Return the file descriptor number to the parent process. To the child process we both return a special value for the file descriptor number (CLOUDABI_PROCESS_CHILD). We also return the thread ID of the new thread in the copied process, so the threading library can reinitialize itself. Obtained from: https://github.com/NuxiNL/freebsd	2015-07-20 13:46:22 +00:00
markj	fb4cb70b7d	Implement the lockstat provider using SDT(9) instead of the custom provider in lockstat.ko. This means that lockstat probes now have typed arguments and will utilize SDT probe hot-patching support when it arrives. Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D2993	2015-07-19 22:14:09 +00:00
benno	e853fb249f	Merge driver for PMC Sierra's range of SAS/SATA HBAs. Submitted by: Achim Leubner <Achim.Leubner@pmcs.com> Reviewed by: scottl	2015-07-17 23:30:43 +00:00
kib	67db66d5ea	When checking for the valid value of the frame pointer, verify that it belongs to the kernel stack address range for the thread. Right now, code checks that new frame is not farther than KSTACK_PAGES pages from the current frame, which allows the address to point past the top of the stack. Reviewed by: andrew, emaste, markj Differential revision: https://reviews.freebsd.org/D3108 Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-07-16 19:40:18 +00:00
ed	e0ca6c5ecb	Add a sysentvec for CloudABI on x86-64. Summary: For CloudABI we need to put two things on the stack of new processes: the argument data (a binary blob; not strings) and a startup data structure. The startup data structure contains interesting things such as a pointer to the ELF program header, the thread ID of the initial thread, a stack smashing protection canary, and a pointer to the argument data. Fetching system call arguments and setting the return value is similar to FreeBSD. The only differences are that system call 0 does not exist and that we call into cloudabi_convert_errno() to convert the error code. We also need this function in a couple of other places, so we'd better reuse it here. Reviewers: dchagin, kib Reviewed By: kib Subscribers: imp Differential Revision: https://reviews.freebsd.org/D3098	2015-07-16 18:24:06 +00:00
pkelsey	ca799427e1	Revert inadvertent change to amd64/GENERIC.	2015-07-15 01:04:54 +00:00
pkelsey	1f2d9f52c0	Add netmap support for ixgbe SRIOV VFs (that is, to if_ixv). Differential Revision: https://reviews.freebsd.org/D2923 Reviewed by: erj, gnn Approved by: jmallett (mentor) Sponsored by: Norse Corp, Inc.	2015-07-15 01:02:01 +00:00
brueffer	d9ba778236	Spell crypto correctly.	2015-07-14 10:47:56 +00:00
jmg	42299ebf0d	Now that aesni won't reuse fpu contexts (D3016), add seatbelts to the fpu code to prevent other reuse of the contexts in the future... Differential Revision: https://reviews.freebsd.org/D3015 Reviewed by: kib, gnn	2015-07-08 19:26:36 +00:00
kib	c17f8bfdd5	Add the atomic_thread_fence() family of functions with intent to provide a semantic defined by the C11 fences with corresponding memory_order. atomic_thread_fence_acq() gives r | r, w, where r and w are read and write accesses, and | denotes the fence itself. atomic_thread_fence_rel() is r, w | w. atomic_thread_fence_acq_rel() is the combination of the acquire and release in single operation. Note that reads after the acq+rel fence could be made visible before writes preceding the fence. atomic_thread_fence_seq_cst() orders all accesses before/after the fence, and the fence itself is globally ordered against other sequentially consistent atomic operations. Reviewed by: alc Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-07-08 18:12:24 +00:00
achim	14e58df516	Driver 'pmspcv' added. Supports PMC-Sierra PM8001/8081/8088/8089/8074/8076/8077 SAS/SATA HBA Controllers.	2015-07-07 13:17:02 +00:00
neel	e9213dd046	Move the 'devmem' device nodes from /dev/vmm to /dev/vmm.io Some external tools just do a 'ls /dev/vmm' to figure out the bhyve virtual machines on the host. These tools break if the devmem device nodes also appear in /dev/vmm. Requested by: grehan	2015-07-06 19:41:43 +00:00
gnn	26ad2548c0	Enable IPSEC in all GENERIC kernels. Universe and kernel build tests passed 4 July 2015 PR: 128030 Sponsored by: Rubicon Communications (Netgate)	2015-07-04 17:37:00 +00:00
kib	77205a0535	Use single instance of the identical INKERNEL() and PMC_IN_KERNEL() macros on amd64 and i386. Move the definition to machine/param.h. kgdb defines INKERNEL() too, the conflict is resolved by renaming kgdb version to PINKERNEL(). On i386, correct the lowest kernel address. After the shared page was introduced, USRSTACK no longer points to the last user address + 1 [*] Submitted by: Oliver Pinter [*] Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-07-02 14:37:21 +00:00
kib	753543c5c5	Disallow a debugger on 64bit system to set fs/gs bases of the 32bit process beyond the end of the process address space. Such setting is not dangerous to the kernel integrity, but it causes confusing application misbehaviour. Sponsored by: The FreeBSD Foundation MFC after: 12 days	2015-07-01 16:37:03 +00:00
kib	f6cfae6dab	Add a comment about too strong semantic of atomic_load_acq() on x86. Submitted by: bde MFC after: 2 weeks	2015-06-29 09:58:40 +00:00
kib	e85612a06d	pcb_gs32sd is unused for long time, remove it. Keep the padding in pcb. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-06-29 07:53:44 +00:00
kib	9b07cc4555	Add x86 PT_GETFSBASE, PT_GETGSBASE machine-dependent ptrace requests to obtain the thread %fs and %gs bases. Add x86 PT_SETFSBASE and PT_SETGSBASE requests to set the bases from debuggers. The set requests, similarly to the sysarch({I386,AMD64}_SET_FSBASE), override the corresponding segment registers. The main purpose of the operations is to retrieve and modify the tcb address for debuggee. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-06-29 07:07:24 +00:00
kib	6279b7c930	Remove unneeded data dependency, currently imposed by atomic_load_acq(9), on its source, for x86. Right now, atomic_load_acq() on x86 is sequentially consistent with other atomics, code ensures this by doing store/load barrier by performing locked nop on the source. Provide separate primitive __storeload_barrier(), which is implemented as the locked nop done on a cpu-private variable, and put __storeload_barrier() before load, to keep seq_cst semantic but avoid introducing false dependency on the no-modification of the source for its later use. Note that seq_cst property of x86 atomic_load_acq() is not documented and not carried by atomics implementations on other architectures, although some kernel code relies on the behaviour. This commit does not intend to change this. Reviewed by: alc Discussed with: bde Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-06-28 05:04:08 +00:00
tychon	d4a6573433	verify_gla() needs to account for non-zero segment base addresses. Reviewed by: neel	2015-06-26 18:00:29 +00:00
royger	c61d8ab317	amd64: set the correct LMA values The current linker script generates program headers with VMA == LMA: Entry point 0xffffffff802e7000 There are 6 program headers, starting at offset 64 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align PHDR 0x0000000000000040 0xffffffff80200040 0xffffffff80200040 0x0000000000000150 0x0000000000000150 R E 8 INTERP 0x0000000000000190 0xffffffff80200190 0xffffffff80200190 0x000000000000000d 0x000000000000000d R 1 [Requesting program interpreter: /red/herring] LOAD 0x0000000000000000 0xffffffff80200000 0xffffffff80200000 0x00000000010559b0 0x00000000010559b0 R E 200000 LOAD 0x0000000001056000 0xffffffff81456000 0xffffffff81456000 0x0000000000132638 0x000000000052ecf8 RW 200000 DYNAMIC 0x0000000001056000 0xffffffff81456000 0xffffffff81456000 0x00000000000000d0 0x00000000000000d0 RW 8 GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RWE 8 This is fine for the FreeBSD loader, because it completely ignores p_paddr and instead uses p_vaddr with a hardcoded offset. Other loaders however acknowledge p_paddr (like the Xen ELF loader), in which case they will try to load the kernel at the wrong place. 
Fix this by adding an AT keyword to the first section specifying the physical address, other sections will follow suit, so it ends up looking like: Entry point 0xffffffff802e7000 There are 6 program headers, starting at offset 64 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align PHDR 0x0000000000000040 0xffffffff80200040 0x0000000000200040 0x0000000000000150 0x0000000000000150 R E 8 INTERP 0x0000000000000190 0xffffffff80200190 0x0000000000200190 0x000000000000000d 0x000000000000000d R 1 [Requesting program interpreter: /red/herring] LOAD 0x0000000000000000 0xffffffff80200000 0x0000000000200000 0x00000000010559b0 0x00000000010559b0 R E 200000 LOAD 0x0000000001056000 0xffffffff81456000 0x0000000001456000 0x0000000000132638 0x000000000052ecf8 RW 200000 DYNAMIC 0x0000000001056000 0xffffffff81456000 0x0000000001456000 0x00000000000000d0 0x00000000000000d0 RW 8 GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RWE 8 Tested on bare metal using the native FreeBSD loader and grub2 from TRUEOS. Sponsored by: Citrix Systems R&D Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D2783	2015-06-26 07:12:17 +00:00
neel	2c76978d5a	Restore the host's GS.base before returning from 'svm_launch()'. Previously this was done by the caller of 'svm_launch()' after it returned. This works fine as long as no code is executed in the interim that depends on pcpu data. The dtrace probe 'fbt:vmm:svm_launch:return' broke this assumption because it calls 'dtrace_probe()' which in turn relies on pcpu data. Reported by: avg MFC after: 1 week	2015-06-23 02:17:23 +00:00
neel	8c70d6c7af	Restructure memory allocation in bhyve to support "devmem". devmem is used to represent MMIO devices like the boot ROM or a VESA framebuffer where doing a trap-and-emulate for every access is impractical. devmem is a hybrid of system memory (sysmem) and emulated device models. devmem is mapped in the guest address space via nested page tables similar to sysmem. However the address range where devmem is mapped may be changed by the guest at runtime (e.g. by reprogramming a PCI BAR). Also devmem is usually mapped RO or RW as compared to RWX mappings for sysmem. Each devmem segment is named (e.g. "bootrom") and this name is used to create a device node for the devmem segment (e.g. /dev/vmm/testvm.bootrom). The device node supports mmap(2) and this decouples the host mapping of devmem from its mapping in the guest address space (which can change). Reviewed by: tychon Discussed with: grehan Differential Revision: https://reviews.freebsd.org/D2762 MFC after: 4 weeks	2015-06-18 06:00:17 +00:00
jhb	fd66a5bf8b	Report the values of x86 segment registers to remote debuggers. While here, also report %eflags from the i386 trapframe. Differential Revision: https://reviews.freebsd.org/D2743 Reviewed by: kib Obtained from: 1 month	2015-06-12 15:14:08 +00:00
br	1383b5af08	Allow DTrace to be compiled-in to the kernel. This will be required for AArch64 as we don't have modules yet. Sponsored by: HEIF5 Sponsored by: ARM Ltd. Differential Revision: https://reviews.freebsd.org/D1997	2015-06-10 15:53:39 +00:00
mjg	b6be1c5ace	Fixup the build after r284215. Submitted by: Ivan Klymenko <fidaj ukr.net> [slightly modified]	2015-06-10 12:39:01 +00:00
mjg	d7bc9285a6	Implement lockless resource limits. Use the same scheme implemented to manage credentials. Code needing to look at process's credentials (as opposed to thread's) is provided with *_proc variants of relevant functions. Places which possibly had to take the proc lock anyway still use the proc pointer to access limits.	2015-06-10 10:48:12 +00:00
mjg	67f2eebb44	Generalised support for copy-on-write structures shared by threads. Thread credentials are maintained as follows: each thread has a pointer to creds and a reference on them. The pointer is compared with proc's creds on userspace<->kernel boundary and updated if needed. This patch introduces a counter which can be compared instead, so that more structures can use this scheme without adding more comparisons on the boundary.	2015-06-10 10:43:59 +00:00
alc	cbefd9b195	Account for superpage mappings that are created by pmap_copy().	2015-06-09 18:04:28 +00:00
tychon	3259f3f35f	Support guest writes to the TSC by enabling the "use TSC offsetting" execution control and writing the difference between the host TSC and the guest TSC into the TSC offset in the VMCS upon encountering a write. Reviewed by: neel	2015-06-09 00:14:47 +00:00
dchagin	b837e3372d	Futex is an aligned 32-bit integer. Use the proper instruction and operand when dereferencing futex pointer.	2015-06-08 17:39:25 +00:00
alc	263927b83e	Retire VM_FREEPOOL_CACHE as the next step in eliminating PG_CACHE pages. Differential Revision: https://reviews.freebsd.org/D2712 Reviewed by: kib Sponsored by: EMC / Isilon Storage Division	2015-06-08 04:59:32 +00:00
kib	a1956e48c3	Update print_INTEL_TLB() by the tag values from the Intel SDM rev. 55. The modern CPUs cache and TLB descriptions looked quite questionable without the update, e.g. Haswell i7 4770S reported: Data TLB: 4 KB pages, 4-way set associative, 64 entries L2 cache: 256 kbytes, 8-way associative, 64 bytes/line After the update, the report is: Data TLB: 1 GByte pages, 4-way set associative, 4 entries Data TLB: 4 KB pages, 4-way set associative, 64 entries Instruction TLB: 2M/4M pages, fully associative, 8 entries Instruction TLB: 4KByte pages, 8-way set associative, 64 entries 64-Byte prefetching Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries L2 cache: 256 kbytes, 8-way associative, 64 bytes/line Some tags were apparently removed from the table 3-21, Vol. 2A. Keep them around, but add a comment stating the removal. Update the format line for cpu_stdext_feature according to the bits from the SDM rev.55. It appears that Haswells do not store %cs and %ds values in the FPU save area. Store content of the %ecx register from the CPUID leaf 0x7 subleaf 0 as cpu_stdext_feature2 and print defined bits from it, again according to SDM rev. 55. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-06-06 22:03:24 +00:00
neel	954e6fdc3b	The 'verify_gla()' function is used to ensure that the effective address after decoding the instruction matches the one provided by hardware. Prior to r283293 'vie->num_valid' used to contain the actual length of the instruction whereas now it contains the maximum instruction length possible. This introduced a bug when calculating a RIP-relative base address. Fix this by using 'vie->num_processed' rather than 'vie->num_valid' as the length of the emulated instruction. Reported and tested by: tychon MFC after: 1 week	2015-06-05 21:22:26 +00:00
neel	0a2b315713	Use tunable 'hw.vmm.svm.features' to disable specific SVM features even though they might be available in hardware. Use tunable 'hw.vmm.svm.num_asids' to limit the number of ASIDs used by the hypervisor. MFC after: 1 week	2015-06-04 02:12:23 +00:00
dim	2474a7a0d8	Remove unneeded NULL checks in amd64's trap_fatal(). Since td_name is an array member of struct thread, it can never be NULL, so the check can be removed. In addition, curproc can never be NULL, so remove the if statement, and splice the two printfs() together. While here, remove the u_long cast, and use the correct printf format specifier curproc->p_pid. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D2695	2015-06-01 06:50:39 +00:00
kib	e2f56205b5	Remove several write-only variables, all reported by the gcc 4.9 buildkernel run. Some of them were write-only under some kernel options, e.g. variables keeping values only used by CTR() macros. It costs nothing to the code readability and correctness to eliminate the warnings in those cases too by removing the local cached values used only for single-access. Review: https://reviews.freebsd.org/D2665 Reviewed by: rodrigc Looked at by: bjk Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-05-29 13:24:17 +00:00
neel	3f2b4fc770	Fix non-deterministic delays when accessing a vcpu that was in "running" or "sleeping" state. This is done by forcing the vcpu to transition to "idle" by returning to userspace with an exit code of VM_EXITCODE_REQIDLE. MFC after: 2 weeks	2015-05-28 17:37:01 +00:00
kib	167672d7cd	Enabled rewritten PCID support by default. Sponsored by: The FreeBSD Foundation MFC after: 1 month	2015-05-27 09:50:18 +00:00
dchagin	50f49b7240	When I merged the lemul branch I missed kib@'s r282708 commit. This is not the final fix as I need properly cleanup thread resources before other threads suicide. Tested by: Ruslan Makhmatkhanov	2015-05-25 20:44:46 +00:00
dchagin	9ffc61f923	Regen for r283492.	2015-05-24 18:09:01 +00:00
dchagin	5cd0d07723	Implement Linux specific syncfs() system call.	2015-05-24 18:08:01 +00:00
dchagin	21f92dba1f	Regen for r283488.	2015-05-24 18:05:21 +00:00
dchagin	ea835975de	Implement recvmmsg() and sendmmsg() system calls.	2015-05-24 18:04:04 +00:00
dchagin	8164e61122	Reduce duplication between MD Linux code by moving msg related struct definitions out into the compat/linux/linux_socket.h	2015-05-24 18:03:14 +00:00
dchagin	791c259f28	Regen for r283484.	2015-05-24 18:02:17 +00:00
dchagin	8e85763052	Implement epoll_pwait() system call.	2015-05-24 18:00:14 +00:00
dchagin	bf8c32ec80	Regen for r283480.	2015-05-24 17:58:24 +00:00
dchagin	5361a6bf58	Add utimensat() system call. The patch developed by Jilles Tjoelker and Andrew Wilcox and adopted for lemul branch by me.	2015-05-24 17:57:07 +00:00
dchagin	0969667a9e	The kernel sends signals to the processes via ABI specific sv_sendsig method. The native ABI does not need signal conversion, only emulators may want this. Usually emulators implement their own sv_sendsig method. For now only the ibcs2 emulator does not have its own sv_sendsig implementation and depends on the native sendsig() method. So, remove any extra attempts to convert signal numbers from native sendsig() methods except from i386 where ibcs2 is living.	2015-05-24 17:56:02 +00:00
dchagin	fc55d94b46	Rework signal code to allow using it by other modules, like linprocfs: 1. Linux sigset always 64 bit on all platforms. In order to move Linux sigset code to the linux_common module define it as 64 bit int. Move Linux sigset manipulation routines to the MI path. 2. Move Linux signal number definitions to the MI path. In general, they are the same on all platforms except for a few signals. 3. Map Linux RT signals to the FreeBSD RT signals and hide signal conversion tables to avoid conversion errors. 4. Emulate Linux SIGPWR signal via FreeBSD SIGRTMIN signal which is outside of allowed on Linux signal numbers. PR: 197216	2015-05-24 17:47:20 +00:00
dchagin	cace25f46d	According to Linux man sigaltstack(3) shall return EINVAL if the ss argument is not a null pointer, and the ss_flags member pointed to by ss contains flags other than SS_DISABLE. However, in fact, Linux also allows SS_ONSTACK flag which is simply ignored. For buggy apps (at least mono) ignore flags other than SS_DISABLE as Linux does. While here move MI part of sigaltstack code to the appropriate place. Reported by: abi at abinet dot ru	2015-05-24 17:44:08 +00:00
dchagin	9d4a2ad595	Regen for r283467.	2015-05-24 17:39:18 +00:00
dchagin	92d496261e	Call nosys in case when the incorrect syscall number is specified. Reported by: trinity	2015-05-24 17:38:02 +00:00
dchagin	df01339e31	Regen for r283465.	2015-05-24 17:35:42 +00:00
dchagin	a346bc7dc8	Add preliminary fallocate system call implementation to emulate posix_fallocate() function. Differential Revision: https://reviews.freebsd.org/D1523 Reviewed by: emaste	2015-05-24 17:33:21 +00:00
dchagin	54f00beef0	Regen for r283451.	2015-05-24 17:00:43 +00:00
dchagin	b37f23e513	Implement ppoll() system call. Differential Revision: https://reviews.freebsd.org/D1105 Reviewed by: trasz	2015-05-24 16:59:25 +00:00
dchagin	a03d7a9e7f	Include opt_compat.h, so that COMPAT_LINUX32 is defined, and we can access to the semop structs and functions. Submitted by: cognet@ Differential Revision: https://reviews.freebsd.org/D1095 Reviewed by: trasz	2015-05-24 16:51:04 +00:00
dchagin	495cb86c87	Regen for r283444.	2015-05-24 16:50:17 +00:00
dchagin	d7e47c502a	Implement eventfd system call. Differential Revision: https://reviews.freebsd.org/D1094 In collaboration with: Jilles Tjoelker	2015-05-24 16:49:14 +00:00
dchagin	2336f8bb01	Put the correct value for the abi_nfdbits parameter of kern_select() for all supported Linuxulators. Differential Revision: https://reviews.freebsd.org/D1093 Reviewed by: trasz	2015-05-24 16:47:13 +00:00
dchagin	5e8bb6e371	Regen for r283441.	2015-05-24 16:42:49 +00:00
dchagin	5f069937a3	Implement epoll family system calls. This is a tiny wrapper around kqueue() to implement epoll subset of functionality. The kqueue user data are 32bit on i386 which is not enough for epoll user data, so we keep user data in the proc emuldata. Initial patch developed by rdivacky@ in 2007, then extended by Yuri Victorovich @ r255672 and finished by me in collaboration with mjg@ and jilles@. Differential Revision: https://reviews.freebsd.org/D1092	2015-05-24 16:41:39 +00:00
dchagin	db8a000521	To avoid code duplication move open/fcntl definitions to the MI header file. Differential Revision: https://reviews.freebsd.org/D1087 Reviewed by: trasz	2015-05-24 16:31:44 +00:00
dchagin	d1ecbe4998	Use the BSD_TO_LINUX_SIGNAL() wherever there is no need to check the ABI as it is known. Differential Revision: https://reviews.freebsd.org/D1086	2015-05-24 16:30:23 +00:00
dchagin	d1e150aef1	Being exported through vdso the note.Linux section used by glibc to determine the kernel version (this saves one uname call). Temporarily disable the export of a note.Linux section until I figured out how to change the kernel version in the note.Linux on the fly. Differential Revision: https://reviews.freebsd.org/D1081 Reviewed by: trasz	2015-05-24 16:25:44 +00:00
dchagin	eb881eec7e	Add AT_RANDOM and AT_EXECFN auxiliary vector entries which are used by glibc. At least since glibc version 2.16 using AT_RANDOM is mandatory. Differential Revision: https://reviews.freebsd.org/D1080	2015-05-24 16:24:24 +00:00
dchagin	edf6d015c3	Regen for r283428.	2015-05-24 16:19:57 +00:00
dchagin	bb042fb0da	Change linux faccessat syscall definition to match actual linux one. The AT_EACCESS and AT_SYMLINK_NOFOLLOW flags are actually implemented within the glibc wrapper function for faccessat(). If either of these flags are specified, then the wrapper function employs fstatat() to determine access permissions. Differential Revision: https://reviews.freebsd.org/D1078 Reviewed by: trasz	2015-05-24 16:18:03 +00:00
dchagin	4dc42a9cd3	Regen for r283424.	2015-05-24 16:11:21 +00:00
dchagin	2f453c26e7	Add preliminary support for x86-64 Linux binaries. Differential Revision: https://reviews.freebsd.org/D1076	2015-05-24 16:07:11 +00:00
dchagin	f18b3d51fa	Refund the proc emuldata struct for future use. For now move flags from thread emuldata to proc emuldata as it was originally intended. As we can have both 64 & 32 bit Linuxulator running any eventhandler can be called twice for us. To prevent this move eventhandlers code from linux_emul.c to the linux_common.ko module. Differential Revision: https://reviews.freebsd.org/D1073	2015-05-24 15:54:58 +00:00
dchagin	b08f3f43f9	Introduce a new module linux_common.ko which is intended for the following primary purposes: 1. Remove the dependency of linsysfs and linprocfs modules from linux.ko, which will be architecture specific on amd64. 2. Incorporate into linux_common.ko general code for platforms on which we'll support two Linuxulator modules (for both instruction set - 32 & 64 bit). 3. Move malloc(9) declaration to linux_common.ko, to enable getting memory usage statistics properly. Currently linux_common.ko incorporates a code from linux_mib.c and linux_util.c and linprocfs, linsysfs and linux kernel modules depend on linux_common.ko. Temporarily remove dtrace garbage from linux_mib.c and linux_util.c Differential Revision: https://reviews.freebsd.org/D1072 In collaboration with: Vassilis Laganakos. Reviewed by: trasz	2015-05-24 15:51:18 +00:00
dchagin	dc4523e6a4	x86_64 Linux does not use multiplexing on ipc system calls. Move struct ipc_perm definition to the MD path as it differs for 64 and 32 bit platform. Differential Revision: https://reviews.freebsd.org/D1068 Reviewed by: trasz	2015-05-24 15:44:41 +00:00
dchagin	58270b3600	Remove stale comment about a signal trampoline which is moved to the shared page at r219609. Differential Revision: https://reviews.freebsd.org/D1063 Reviewed by: trasz	2015-05-24 15:32:52 +00:00
dchagin	4178f554e5	Put linux_platform into the vdso to avoid copying it onto the stack at every exec. Differential Revision: https://reviews.freebsd.org/D1062 Reviewed by: trasz	2015-05-24 15:30:52 +00:00
dchagin	c7a9185ada	Eliminate a now unused global declaration of elf_linux_sysvec. Differential Revision: https://reviews.freebsd.org/D1061 Reviewed by: trasz	2015-05-24 15:29:20 +00:00

7539 Commits