freebsd-dev

Author	SHA1	Message	Date
Jason A. Harmening	eb36b1d0bc	Clean up MD pollution of bus_dma.h: --Remove special-case handling of sparc64 bus_dmamap* functions. Replace with a more generic mechanism that allows MD busdma implementations to generate inline mapping functions by defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>. This is currently useful for sparc64, x86, and arm64, which all implement non-load dmamap operations as simple wrappers around map objects which may be bus- or device-specific. --Remove NULL-checked bus_dmamap macros. Implement the equivalent NULL checks in the inlined x86 implementation. For non-x86 platforms, these checks are a minor pessimization as those platforms do not currently allow NULL maps. NULL maps were originally allowed on arm64, which appears to have been the motivation behind adding arm[64]-specific barriers to bus_dma.h, but that support was removed in r299463. --Simplify the internal interface used by the bus_dmamap_load* variants and move it to bus_dma_internal.h --Fix some drivers that directly include sys/bus_dma.h despite the recommendations of bus_dma(9) Reviewed by: kib (previous revision), marius Differential Revision: https://reviews.freebsd.org/D10729	2017-07-01 05:35:29 +00:00
Konstantin Belousov	f7df80f4ed	Fix indent. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2017-06-24 10:19:06 +00:00
Konstantin Belousov	746e20fdb1	Correct translations between abridged and full x87 tags. Reported and tested by: karnajit wangkhem <karnajitw@gmail.com> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-06-17 11:25:31 +00:00
Konstantin Belousov	2d88da2f06	Move struct syscall_args syscall arguments parameters container into struct thread. For all architectures, the syscall trap handlers have to allocate the structure on the stack. The structure takes 88 bytes on 64bit arches which is not negligible. Also, it cannot be easily found by other code, which e.g. caused duplication of some members of the structure to struct thread already. The change removes td_dbg_sc_code and td_dbg_sc_nargs which were directly copied from syscall_args. The structure is put into the copied on fork part of the struct thread to make the syscall arguments information correct in the child after fork. This move will also allow several more uses shortly. Reviewed by: jhb (previous version) Sponsored by: The FreeBSD Foundation MFC after: 3 weeks X-Differential revision: https://reviews.freebsd.org/D11080	2017-06-12 21:03:23 +00:00
Konstantin Belousov	43f41dd393	Make struct syscall_args visible to userspace compilation environment from machine/proc.h, consistently on all architectures. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 3 weeks X-Differential revision: https://reviews.freebsd.org/D11080	2017-06-12 20:53:44 +00:00
John Baldwin	9d24f98ca8	Remove the BSD/OS 2.1 system call gate LDT entry. An extra copy of the system call gate was added to the default LDT back in 1996 (r18513 / r18514). However, the ability to run BSD/OS 2.1 i386 binaries under FreeBSD's native ABI is most likely no longer needed. Discussed with: kib	2017-05-23 22:34:18 +00:00
Ed Maste	3e85b721d6	Remove register keyword from sys/ and ANSIfy prototypes A long long time ago the register keyword told the compiler to store the corresponding variable in a CPU register, but it is not relevant for any compiler used in the FreeBSD world today. ANSIfy related prototypes while here. Reviewed by: cem, jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D10193	2017-05-17 00:34:34 +00:00
Jung-uk Kim	af1973281e	Use kmem_malloc() instead of malloc(9) for the native amd64 filter. r316767 broke the BPF JIT compiler for amd64 because malloc()'d space is no longer executable. Discussed with: kib, alc	2017-04-17 22:02:09 +00:00
Jung-uk Kim	e329e330d4	Move declarations for a machine-dependent function to the header file.	2017-04-17 21:51:26 +00:00
Jung-uk Kim	3a0e8b7373	Reduce diff with amd64 version.	2017-04-17 21:46:54 +00:00
Gleb Smirnoff	83c9dea1ba	- Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter in place. To do per-cpu stats, convert all fields that previously were maintained in the vmmeters that sit in pcpus to counter(9). - Since some vmmeter stats may be touched at very early stages of boot, before we have set up UMA and we can do counter_u64_alloc(), provide an early counter mechanism: o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter. o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter, so that at early stages of boot, before counters are allocated we already point to a counter that can be safely written to. o For sparc64 that required a whole dummy pcpu[MAXCPU] array. Further related changes: - Don't include vmmeter.h into pcpu.h. - vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit, to match kernel representation. - struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion. This is based on benno@'s 4-year old patch: https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html Reviewed by: kib, gallatin, marius, lidl Differential Revision: https://reviews.freebsd.org/D10156	2017-04-17 17:34:47 +00:00
Gleb Smirnoff	75c4b0b5ac	Remove unused assembly symbols pointing to vmmeter.	2017-04-17 17:18:07 +00:00
Patrick Kelsey	67d955aab4	Corrected misspelled versions of rendezvous. The MFC will include a compat definition of smp_no_rendevous_barrier() that calls smp_no_rendezvous_barrier(). Reviewed by: gnn, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D10313	2017-04-09 02:00:03 +00:00
Tai-hwa Liang	7ece126ed8	Trying to be more compatible with Linux if.h definitions: - renaming l_ifreq::ifru_metric to l_ifreq::ifru_ivalue; - adding a definition for ifr_ifindex which points to l_ifreq::ifru_ivalue. A quick search indicates that Linux already got the above changes since 2.1.14. Reviewed by: kib, marcel, dchagin MFC after: 1 week	2017-04-08 14:41:39 +00:00
Mark Johnston	5788c2bde1	Adjust the constraint for "src" in atomic_(f)cmpset_8. "r" is not sufficient to prevent the use of invalid byte-width registers with at least gcc. Reported and reviewed by: bde X-MFC-With: r315718	2017-03-27 16:18:19 +00:00
Andriy Gapon	978f3da16f	revert r315959 because it causes build problems The change introduced a dependency between genassym.c and header files generated from .m files, but that dependency is not specified in the make files. Also, the change could be not as useful as I thought it was. Reported by: dchagin, Manfred Antar <null@pozo.com>, and many others	2017-03-27 12:34:29 +00:00
Bruce Evans	f434f3515b	Fix printing of negative offsets (typically from frame pointers) again. I fixed this in 1997, but the fix was over-engineered and fragile and was broken in 2003 if not before. i386 parameters were copied to 8 other arches verbatim, mostly after they stopped working on i386, and mostly without the large comment saying how the values were chosen on i386. powerpc has a non-verbatim copy which just changes the uncritical parameter and seems to add a sign extension bug to it. Just treat negative offsets as offsets if they are no more negative than -db_offset_max (default -64K), and remove all the broken parameters. -64K is not very negative, but it is enough for frame and stack pointer offsets since kernel stacks are small. The over-engineering was mainly to go more negative than -64K for the negative offset format, without affecting printing for more than a single address. Addresses in the top 64K of a (full 32-bit or 64-bit) address space are now printed less well, but there aren't many interesting ones. For arches that have many interesting ones very near the top (e.g., 68k has interrupt vectors there), there would be no good limit for the negative offset format and -64K is as good as anything.	2017-03-26 18:46:35 +00:00
Andriy Gapon	a7b4c009e1	specific end of interrupt implementation for AMD Local APIC The change is more intrusive than I would like because the feature requires that a vector number is written to a special register. Thus, now the vector number has to be provided to lapic_eoi(). It was readily available in the IO-APIC and MSI cases, but the IPI handlers required more work. Also, we now store the VMM IPI number in a global variable, so that it is available to the justreturn handler for the same reason. Reviewed by: kib MFC after: 6 weeks Differential Revision: https://reviews.freebsd.org/D9880	2017-03-25 18:45:09 +00:00
Dmitry Chagin	6c2a934b79	Implement Linux mincore() system call. This is necessary for the upcoming drm-next. Suggested by: hselasky@ MFC after: 1 month	2017-03-25 15:47:29 +00:00
Bruce Evans	4e501eb7cc	Remove buggy adjustment of page tables in db_write_bytes(). Long ago, perhaps only on i386, kernel text was mapped read-only and it was necessary to change the mapping to read-write to set breakpoints in kernel text. Other writes by ddb to kernel text were also allowed. This write protection is harder to implement with 4MB pages, and was lost even for 4K pages when 4MB pages were implemented. So changing the mapping became useless. It was actually worse than useless since it followed various null and otherwise garbage pointers to not change random memory instead of the mapping. (On i386s, the pointers became good in pmap_bootstrap(), and on amd64 the pointers became bad in pmap_bootstrap() if not before.) Another bug broke detection of following of null pointers on i386, except early in boot where not detecting this was a feature. When I fixed the bug, I accidentally broke the feature and soon got traps in db_write_bytes(). Setting breakpoints early in ddb was broken. kib pointed out that a clean way to do the adjustment would be to use a special [sub]map giving a small window on the bytes to be written. The trap handler didn't know how to fix up errors for pagefaults accessing the map itself. Such errors rarely need fixups, since most traps for the map are for the first access which is a read. Reviewed by: kib	2017-03-24 17:34:55 +00:00
Gleb Smirnoff	061f01b16e	Remove Solaris 2.6 syscalls selector. Discussed with: kib	2017-03-23 19:54:41 +00:00
Ed Schouten	ebfc28088b	Stop providing the compat_3_brand. As of r315860, the ELF image activator works fine for CloudABI without it. Reviewed by: kib MFC after: 2 weeks	2017-03-23 14:12:21 +00:00
Konstantin Belousov	2274ab3d7b	Update r315753 with the proper flag name. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-03-22 22:28:13 +00:00
Konstantin Belousov	1438fe3cf2	Add a flag BI_BRAND_ONLY_STATIC to specify that the brand only matches static binaries. Interpretation of the 'static' there is that the binary must not specify an interpreter. In particular, shared objects are matched by the brand if BI_CAN_EXEC_DYN is also set. This improves precision of the brand matching, which should eliminate surprises due to brand ordering. Revert r315701. Discussed with and tested by: ed (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-03-22 22:23:01 +00:00
Mark Johnston	3d6732549d	Add support for 8- and 16-bit atomic_(f)cmpset to x86. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D10068	2017-03-22 17:29:04 +00:00
Ed Schouten	ae2373da91	Set the interpreter path to /nonexistent. CloudABI executables are statically linked and don't have an interpreter. Setting the interpreter path to NULL used to work previously, but r314851 introduced code that checks the string unconditionally. Running CloudABI executables now causes a null pointer dereference. Looking at the rest of imgact_elf.c, it seems various other codepaths already leaned on the fact that the interpreter path is set. Let's just go ahead and pick an obviously incorrect interpreter path to appease imgact_elf.c. MFC after: 1 week	2017-03-22 07:05:27 +00:00
Dmitry Chagin	b1ba0846f1	Implement getrandom() syscall. Note. GRND_RANDOM option is not supported for now. MFC after: 1 month	2017-03-18 18:34:29 +00:00
Dmitry Chagin	857129394d	To reduce code duplication move socket defines to the MI path. MFC after: 1 week	2017-03-18 18:23:30 +00:00
Bruce Evans	ff17a6773e	Don't access the reserved registers %dr4 and %dr5 on i386. On the original i386, %dr[4-5] were unimplemented but not very clearly reserved, so debuggers read them to print them. i386 was still doing this. On the original athlon64, %dr[4-5] are documented as reserved but are aliased to %dr[6-7] unless CR4_DE is set, when accessing them traps. On 2 of my systems, accessing %dr[4-5] trapped sometimes. On my Haswell system, the apparent randomness was because the boot CPU starts with CR4_DE set while all other CPUs start with CR4_DE clear. FreeBSD doesn't support the data breakpoints enabled by CR4_DE and it never changes this flag, so the flag remains different across CPUs and the behaviour seemed inconsistent except while booting when the CPU doesn't change. The invalid accesses broke: - read access for printing the registers in ddb "show watches" on CPUs with CR4_DE set - read accesses in fill_dbregs() on CPUs with CR4_DE set. This didn't implement panic(3) since the user case always skipped %dr[4-5]. - write accesses in set_dbregs(). This also didn't affect userland. When it didn't trap, the aliasing made it fragile. Don't print the dummy (zero) values of %dr[4-5] in "show watches" for i386 or amd64. Fix style bugs near this printing. amd64 also has space in the dbregs struct for the reserved %dr[8-15] and already didn't print the dummy values for these, and never accessed any of the 10 reserved debug registers. Remove cpufuncs for making the invalid accesses. Even amd64 had these.	2017-03-17 13:49:05 +00:00
Emmanuel Vadot	aa6b345634	Remove i915drm and radeondrm from NOTES and conf. This unbreaks the LINT kernel. Reported by: lwhsu	2017-03-12 00:52:16 +00:00
Dmitry Chagin	ab60bc8488	Reduce code duplication between MD Linux code by moving SYSV IPC 64-bit related struct definitions out into the MI path. Invert the native ipc structs to the Linux ipc structs conversion logic. Since the 64-bit variant of ipc structs has more precision, convert native ipc structs to the 64-bit Linux ipc structs and then truncate 64-bit values into the non 64-bit if needed. Unlike Linux, return EOVERFLOW if the values do not fit. Fix SYSV IPC for 64-bit Linuxulator which never sets IPC_64 bit. MFC after: 1 month	2017-03-07 17:07:16 +00:00
Mahdi Mokhtari	881b1219aa	Regenerated Linuxulator syscall tables for r314782 Approved by: dchagin MFC after: 1 month	2017-03-06 18:20:37 +00:00
Mahdi Mokhtari	8049c6bfb8	Add UNIMPLEMENTED() placeholder macro for the syscalls that are not implemented in Linux kernel itself. Cleanup DUMMY() macros. Reviewed by: dchagin, trasz Approved by: dchagin MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D9804	2017-03-06 18:11:38 +00:00
Pedro F. Giffuni	25ef829b03	Revert r314669, r314670: Bring back the i486 option in GENERIC by default. The code related to i386 CPU variants configuration has received many changes in the last years: most of the features are detected automatically, so there are no performance penalties from keeping the 486 support enabled. Re-instate the 486 support: while the general configuration could still be cleaned a bit, there is no advantage in removing it. Differential Revision: https://reviews.freebsd.org/D9879	2017-03-06 03:52:15 +00:00
Pedro F. Giffuni	a5730cc510	Drop i486 from the default i386 GENERIC kernel configuration. 80486 production was stopped by Intel in September 2007. Dropping the 486 configuration option from the GENERIC kernel improves performance slightly. Removing I486_CPU is consistent at this time: we don't support any processor without an FPU and the PC-98 arch, which frequently involved i486 CPUs, is also gone so we don't test such platforms anymore. Relnotes: yes MFC after: 2 weeks https://reviews.freebsd.org/D9879	2017-03-04 15:04:17 +00:00
Warner Losh	fbbd9655e5	Renumber copyright clause 4 Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
Konstantin Belousov	2e6e48fb59	Initialize pcb_save for thread0. Otherwise kernel traps on NULL dereference if fpu_kern(9) is used from the thread0 context. Reported by: cem Reviewed by: cem, jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-02-28 22:54:52 +00:00
Gleb Smirnoff	efe3b0de14	Remove SVR4 (System V Release 4) binary compatibility support. UNIX System V Release 4 is an operating system released in 1988. It ceased to exist in the early 2000s.	2017-02-28 05:14:42 +00:00
Dmitry Chagin	af68739567	Regen for r314312 (Linux epoll_pwait). MFC after: 1 month	2017-02-26 19:59:28 +00:00
Dmitry Chagin	f8ae1bb64d	Change Linux epoll_pwait syscall definition to match Linux actual one. MFC after: 1 month	2017-02-26 19:57:18 +00:00
Alan Cox	0314966858	Refine the fix from r312954. Specifically, add a new PDE-only flag, PG_PROMOTED, that indicates whether lingering 4KB page mappings might need to be flushed on a PDE change that restricts or destroys a 2MB page mapping. This flag allows the pmap to avoid range invalidations that are both unnecessary and costly. Reviewed by: kib, markj MFC after: 6 weeks Differential Revision: https://reviews.freebsd.org/D9665	2017-02-26 19:54:02 +00:00
Dmitry Chagin	dd93b628e9	Implement timerfd family syscalls. MFC after: 1 month	2017-02-26 09:48:18 +00:00
Dmitry Chagin	354aa2dd56	Regen after r314291 (timerfd definition). MFC after: 1 month	2017-02-26 09:37:25 +00:00
Dmitry Chagin	1064d53fde	Change Linuxulator timerfd syscalls definition to match actual Linux one. MFC after: 1 month	2017-02-26 09:35:44 +00:00
Edward Tomasz Napierala	e801ac7852	Fix linux_fstatfs() to return proper value for f_frsize. Without it, linux df(1) binary from Xenial shows garbage. Reviewed by: dchagin MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D9692	2017-02-25 20:32:37 +00:00
Mahdi Mokhtari	bd911530b7	Add linux_preadv() and linux_pwritev() syscalls to Linuxulator. Reviewed by: dchagin Approved by: dchagin, trasz (src committers) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D9722	2017-02-24 20:04:02 +00:00
Dmitry Chagin	8665c4d9cd	Revert r314217. The commit does not match what I approved.	2017-02-24 19:47:27 +00:00
Mahdi Mokhtari	21d23e3249	Add linux_preadv() and linux_pwritev() syscalls to Linuxulator. Reviewed by: dchagin Approved by: dchagin, trasz (src committers) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D9722	2017-02-24 19:22:17 +00:00
Dmitry Chagin	486a06bdf0	Implement rt_tgsigqueueinfo system call used by glibc for pthread_sigqueue(3). MFC after: 2 week	2017-02-19 07:38:11 +00:00
Konstantin Belousov	dab486441f	MFamd64 r313933: microoptimize pmap_protect_pde(). Noted by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-02-19 06:14:33 +00:00
Konstantin Belousov	b1fa987835	Merge i386 and amd64 mtrr drivers. Reviewed by: royger, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D9648	2017-02-17 21:08:32 +00:00
Roger Pau Monné	43b00aeb88	x86: fix MTRR initialization if EARLY_AP_STARTUP is used MTRR handlers are set in {amd64/i686}_mem_drvinit, which is called at SI_SUB_DRIVERS, and that's too late when EARLY_AP_STARTUP is set because APs have already started at this point. {amd64/i686}_mrinit is also called too late for the BSP, since that happens when the memory device is attached, also after APs have already started. Move the position to SI_SUB_CPU, and also initialize the state for the BSP, so that the APs can correctly get to the same state as the BSP. Sponsored by: Citrix Systems R&D MFC after: 1 week Reviewed by: jhb, kib Differential Revision: https://reviews.freebsd.org/D9630	2017-02-17 12:47:51 +00:00
Warner Losh	86d99b6884	Remove EISA bus support for add-in cards. Remove related kernel and compile options. Remove doxygen pointers to now deleted files. Remove EISA and VME as examples in bus_space.9. Retained EISA mode code for IO PIC and MPTABLES because that's not EISA bus, per se, and some people have abused EISA to mean "EISA-like behavior as opposed to ISA" rather than using it for EISA add-in cards. Relnotes: yes	2017-02-16 21:57:35 +00:00
Warner Losh	5625fe9246	Remove Micro Channel Architecture support. Of the commonly available machines, only a few 486 machines used it, and those haven't had enough memory to run FreeBSD for quite some time (often limited to 16MB). Not to be confused with the Machine Check Architecture, which is still very much alive and used (and untouched by this commit). No Objection From: arch@	2017-02-15 23:04:25 +00:00
John Baldwin	bb9b710477	Regenerate all the system call tables to drop "created from" lines. One of the ibcs2 files contains some actual changes (new headers) as it hasn't been regenerated after older changes to makesyscalls.sh.	2017-02-10 19:45:02 +00:00
Dmitry Chagin	12bc0fb56f	Regen after r313284. MFC after: 2 week	2017-02-05 14:19:19 +00:00
Dmitry Chagin	8b756d40a7	Update syscall.master to 4.10-rc6. Also fix comments, a typo, and wrong numbering for a few unimplemented syscalls. For 32-bit Linuxulator, socketcall() syscall was historically the entry point for the sockets API. Starting in Linux 4.3, direct syscalls are provided for the sockets API. Enable it. The initial version of patch was provided by trasz@ and extended by me. Submitted by: trasz MFC after: 2 week Differential Revision: https://reviews.freebsd.org/D9381	2017-02-05 14:17:09 +00:00
Konstantin Belousov	57f6622f92	For i386, remove config options CPU_DISABLE_CMPXCHG, CPU_DISABLE_SSE and device npx. This means that FPU is always initialized and handled when available, and SSE+ register file and exception are handled when available. This makes the kernel FPU code much easier to maintain at the cost of slight bloat for CPUs older than 25 years. CPU_DISABLE_CMPXCHG outlived its usefulness, see the removed comment explaining the original purpose. Suggested by and discussed with: bde Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2017-02-03 12:51:40 +00:00
Konstantin Belousov	9c16356ccd	Use ANSI definitions for some i386 functions. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-02-02 22:02:10 +00:00
Mateusz Guzik	ed869ff019	i386: fixup fcmpset An incorrect output specifier was used which worked with clang by accident, but breaks with the in-tree gcc version. While here plug a whitespace nit. Reported by: bde	2017-02-02 01:33:08 +00:00
Edward Tomasz Napierala	ae6b6ef6cb	Replace sys_ftruncate() with kern_ftruncate() in various compats. Reviewed by: kib@ MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D9368	2017-01-30 11:50:54 +00:00
Mateusz Guzik	e7a98aef79	i386: add atomic_fcmpset Tested by: pho	2017-01-30 02:24:54 +00:00
Konstantin Belousov	a0f64f38a1	Do not leave stale 4K TLB entries on pde (superpage) removal or protection change. On superpage promotion, x86 pmaps do not invalidate existing 4K entries for the superpage range, because they are compatible with the promoted 2/4M entry. But the invalidation on superpage removal or protection change only did single INVLPG with the base address of the superpage. This reliably flushed superpage TLB entry, and 4K entry for the first page of the superpage, potentially leaving other 4K TLB entries lingering. Do the invalidation of the whole superpage range to correct the problem. Note that the precise invalidation is done by x86 code for kernel_pmap only, for user pmaps whole (per-AS) TLB is flushed. This made the bug well hidden, because promotions of the kernel mappings require specific load. Reported and tested by: Jonathan Looney <jtl@netflix.com> (previous version) Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-29 19:14:48 +00:00
Jason A. Harmening	d986450859	Implement get_pcpu() for i386 and use it to replace pcpu_find(curcpu) in the i386 pmap. The curcpu macro loads the per-cpu data pointer as its first step, so the remaining steps of pcpu_find(curcpu) are circular. get_pcpu() is already implemented for arm, arm64, and risc-v. My plan is to implement it for the remaining architectures and use it to replace several instances of pcpu_find(curcpu) in MI code. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D9370	2017-01-29 16:54:55 +00:00
Yoshihiro Takahashi	f7c79dd679	Garbage collect the FPU_ERROR_BROKEN option. It is for pc98 only.	2017-01-28 03:53:53 +00:00
Yoshihiro Takahashi	2b375b4edd	Remove pc98 support completely. I thank all developers and contributors for pc98. Relnotes: yes	2017-01-28 02:22:15 +00:00
Konstantin Belousov	5611aaa195	Use SFENCE for ordering CLFLUSHOPT. SDM states that CLFLUSHOPT instructions can be ordered with other writes by SFENCE, heavier MFENCE is not required. Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2017-01-20 19:08:44 +00:00
Ed Schouten	4423244072	Catch up with changes to structure member names. Pointer/length pairs are now always named ${name} and ${name}_len.	2017-01-17 22:05:52 +00:00
Jason A. Harmening	86785b54a5	Add comment explaining relative order of sched_unpin() and mtx_unlock(). Suggested by: alc MFC after: 1 week	2017-01-14 19:35:36 +00:00
Jason A. Harmening	28699efd43	For i386 temporary mappings, unpin the thread before releasing the cmap lock. Releasing the lock first may result in the thread being immediately rescheduled and bound to the same CPU, only to unpin itself upon resuming execution. Noted by: skra (in review for armv6 equivalent) MFC after: 1 week	2017-01-14 09:56:01 +00:00
Mark Johnston	bd7abab0c9	Coalesce TLB shootdowns of global PTEs in pmap_advise() on x86. We would previously invalidate such entries individually, resulting in more IPIs than necessary. Reviewed by: alc, kib MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D9094	2017-01-10 21:52:48 +00:00
Sean Bruno	f2d6ace4a6	Migrate e1000 to the IFLIB framework: - em(4) igb(4) and lem(4) - deprecate the igb device from kernel configurations - create a symbolic link in /boot/kernel from if_em.ko to if_igb.ko Devices tested: - 82574L - I218-LM - 82546GB - 82579LM - I350 - I217 Please report problems to freebsd-net@freebsd.org Partial review from jhb and suggestions on how to not brick folks who originally would have lost their igbX device. Submitted by: mmacy@nextbsd.org MFC after: 2 weeks Relnotes: yes Sponsored by: Limelight Networks and Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8299	2017-01-10 03:23:22 +00:00
Konstantin Belousov	2f304845e2	Do not allocate struct statfs on kernel stack. Right now size of the structure is 472 bytes on amd64, which is already large and stack allocations are undesirable. With the ino64 work, MNAMELEN is increased to 1024, which will make it impossible to have struct statfs on the stack. Extracted from: ino64 work by gleb Discussed with: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-05 17:19:26 +00:00
Jason A. Harmening	43aabbefd8	Move the objects used to create temporary mappings for i386 pmap zero and copy operations to the MD PCPU region. Change sysmap initialization to only allocate KVA pages for CPUs that are actually present. As a minor optimization, this also prevents false sharing between adjacent sysmap objects since the pcpu struct is already cacheline-aligned. While here, move pc_qmap_addr initialization for the BSP into pmap_bootstrap(), which allows use of pmap_quick* functions during early boot. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D8833	2016-12-23 15:14:56 +00:00
John Baldwin	b663816443	Enable EARLY_AP_STARTUP on amd64 and i386 kernels by default. PR: 199321, 203682 MFC after: 2 months Sponsored by: Netflix	2016-12-16 21:10:37 +00:00
Konrad Witaszczyk	480f31c214	Add support for encrypted kernel crash dumps. Changes include modifications in kernel crash dump routines, dumpon(8) and savecore(8). A new tool called decryptcore(8) was added. A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump configuration in the diocskerneldump_arg structure to the kernel. The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for backward ABI compatibility. dumpon(8) generates a one-time random symmetric key and encrypts it using an RSA public key in capability mode. Currently only AES-256-CBC is supported but EKCD was designed to implement support for other algorithms in the future. The public key is chosen using the -k flag. The dumpon rc(8) script can do this automatically during startup using the dumppubkey rc.conf(5) variable. Once the keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O control. When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random IV and sets up the key schedule for the specified algorithm. Each time the kernel tries to write a crash dump to the dump device, the IV is replaced by a SHA-256 hash of the previous value. This is intended to make a possible differential cryptanalysis harder since it is possible to write multiple crash dumps without reboot by repeating the following commands: # sysctl debug.kdb.enter=1 db> call doadump(0) db> continue # savecore A kernel dump key consists of an algorithm identifier, an IV and an encrypted symmetric key. The kernel dump key size is included in a kernel dump header. The size is an unsigned 32-bit integer and it is aligned to a block size. The header structure has 512 bytes to match the block size so it was required to make a panic string 4 bytes shorter to add a new field to the header structure. If the kernel dump key size in the header is nonzero it is assumed that the kernel dump key is placed after the first header on the dump device and the core dump is encrypted. Separate functions were implemented to write the kernel dump header and the kernel dump key as they need to be unencrypted. The dump_write function encrypts data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps are not supported due to the way they are constructed which makes it impossible to use the CBC mode for encryption. It should be also noted that textdumps don't contain sensitive data by design as a user decides what information should be dumped. savecore(8) writes the kernel dump key to a key.# file if its size in the header is nonzero. # is the number of the current core dump. decryptcore(8) decrypts the core dump using a private RSA key and the kernel dump key. This is performed by a child process in capability mode. If the decryption was not successful the parent process removes a partially decrypted core dump. Description on how to encrypt crash dumps was added to the decryptcore(8), dumpon(8), rc.conf(5) and savecore(8) manual pages. EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU. The feature still has to be tested on arm and arm64 as it wasn't possible to run FreeBSD due to the problems with QEMU emulation and lack of hardware. Designed by: def, pjd Reviewed by: cem, oshogbo, pjd Partial review: delphij, emaste, jhb, kib Approved by: pjd (mentor) Differential Revision: https://reviews.freebsd.org/D4712	2016-12-10 16:20:39 +00:00
Mark Johnston	7f68a896dc	Add a COMPAT_FREEBSD11 kernel option. Use it wherever COMPAT_FREEBSD10 is currently specified. Reviewed by: glebius, imp, jhb Differential Revision: https://reviews.freebsd.org/D8736	2016-12-09 18:54:12 +00:00
Alan Cox	e94965d82e	Previously, vm_radix_remove() would panic if the radix trie didn't contain a vm_page_t at the specified index. However, with this change, vm_radix_remove() no longer panics. Instead, it returns NULL if there is no vm_page_t at the specified index. Otherwise, it returns the vm_page_t. The motivation for this change is that it simplifies the use of radix tries in the amd64, arm64, and i386 pmap implementations. Instead of performing a lookup before every remove, the pmap can simply perform the remove. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D8708	2016-12-08 04:29:29 +00:00
John Baldwin	3cf1f0c347	MFamd64: Various fatal page fault fixes. - If a page fault is triggered due to reserved bits in a PTE, treat it as a fatal fault and panic. - If PG_NX is in use, report whether a fatal page fault is due to an instruction fetch or a data access. - If a fatal page fault is due to reserved bits in a PTE, report that as the page fault type rather than a protection violation. MFC after: 1 month	2016-11-19 01:36:44 +00:00
Bryan Drewery	28323add09	Fix improper use of "its". Sponsored by: Dell EMC Isilon	2016-11-08 23:59:41 +00:00
Konstantin Belousov	d3e4d71f1d	Handle pmap_enter() over an existing 4/2M page in KVA on i386. The userspace case was already handled by pmap_allocpte(). For kernel VA, page table page must exist, and demote cannot fail, so we need to just call pmap_demote_pde(). Also note that due to the machine AS layout, promotions in the KVA on i386 are highly unlikely, so this change is mostly for completeness. Reviewed by: alc, markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D8323	2016-10-28 11:53:22 +00:00
John Baldwin	16dcd7734f	MFamd64: Add bounds checks on addresses used with /dev/mem. Reject attempts to read from or memory map offsets in /dev/mem that are beyond the maximum-supported physical address of the current CPU. Reviewed by: kib MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D7408	2016-10-27 21:23:14 +00:00
John Baldwin	726f4773ec	Enable EFER_NXE properly on APs. EFER_NXE is set in the EFER MSR by initializecpu() and must be set on all CPUs in the system. When PG_NX support was added to PAE on i386, the block to enable EFER_NXE was placed in a section of initializecpu() that only runs if 'cpu == CPU_686'. During early boot, locore does an initial pass to set cpu that sets it to CPU_686 on all CPUs later than a Pentium. Later, printcpuinfo() adjusts the 'cpu' variable on PII and later CPUs to one of CPU_PII, CPU_PIII, or CPU_P4. However, printcpuinfo() is called after initializecpu() on the BSP, so the BSP would enable EFER_NXE and pg_nx. The APs execute initializecpu() much later after printcpuinfo() has run. The end result on a modern CPU was that cpu was set to CPU_PIII when the APs invoked initializecpu(), so they did not enable EFER_NXE. As a result, the APs would fault when trying to access any pages marked with PG_NX set. When booting a 2 CPU PAE kernel in bhyve this manifested as a hang before single user mode. The attempt to execute /bin/init tried to copy out the exec strings (argv, etc.) to a non-executable mapping while running on the AP. The instruction kept faulting due to invalid bits in the PTE in an infinite loop. Fix this by moving the code to enable EFER_NXE out of the switch statement on 'cpu' and always doing it if 'amd_feature' supports AMDID_NX. MFC after: 2 weeks	2016-10-26 18:47:47 +00:00
Konstantin Belousov	295f4b6cfe	Follow-up to r307866: - Make !KDB config buildable. - Simplify interface to nmi_handle_intr() by evaluating panic_on_nmi in one place, namely nmi_call_kdb(). This allows to remove do_panic argument from the functions, and to remove i386/amd64 duplication of the variable and sysctl definitions. Note that now NMI causes panic(9) instead of trap_fatal() reporting and then panic(9), consistently for NMIs delivered while CPU operated in ring 0 and 3. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-10-24 20:47:46 +00:00
Konstantin Belousov	835c2787be	Handle broadcast NMIs. On several Intel chipsets, diagnostic NMIs sent from BMC or NMIs reporting hardware errors are broadcasted to all CPUs. When kernel is configured to enter kdb on NMI, the outcome is problematic, because each CPU tries to enter kdb. All CPUs are executing NMI handlers, which set the latches disabling the nested NMI delivery; this means that stop_cpus_hard(), used by kdb_enter() to stop other cpus by broadcasting IPI_STOP_HARD NMI, cannot work. One indication of this is the harmless but annoying diagnostic "timeout stopping cpus". Much more harming behaviour is that because all CPUs try to enter kdb, and if ddb is used as debugger, all CPUs issue prompt on console and race for the input, not to mention the simultaneous use of the ddb shared state. Try to fix this by introducing a pseudo-lock for simultaneous attempts to handle NMIs. If one core happens to enter NMI trap handler, other cores see it and simulate reception of the IPI_STOP_HARD. More, generic_stop_cpus() avoids sending IPI_STOP_HARD and avoids waiting for the acknowledgement, relying on the nmi handler on other cores suspending and then restarting the CPU. Since it is impossible to detect at runtime whether some stray NMI is broadcast or unicast, add a knob for administrator (really developer) to configure debugging NMI handling mode. The updated patch was debugged with the help from Andrey Gapon (avg) and discussed with him. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D8249	2016-10-24 16:40:27 +00:00
Jung-uk Kim	69d410eeb1	Implement BPF_MOD and BPF_XOR instructions. These two ALU instructions first appeared on Linux. Then, libpcap adopted and made them available since 1.6.2. Now more platforms including NetBSD have them in kernel. So do we.	2016-10-21 06:55:07 +00:00
Jung-uk Kim	730b3be34f	Reduce code for conditional jumps.	2016-10-21 06:09:30 +00:00
Jung-uk Kim	99e3ae6839	Fix compiler warnings for user land.	2016-10-21 06:06:54 +00:00
John Baldwin	31dc1e9681	Drop support for using mmap() with /dev/kmem. Using the device pager with /dev/kmem is not stable since KVA mappings are transient, but the device pager caches the PA associated with a given offset forever. Interestingly, mips' implementation of memmap() already refused requests for /dev/kmem. Note that kvm_read/kvm_write do not use mmap, but use read and write on /dev/kmem, so this should not affect libkvm users. Reviewed by: kib MFC after: 2 months	2016-10-14 20:01:07 +00:00
Warner Losh	b2a7ac4802	Fix building on i386 and arm. Put 'public domain' headers on the files with no creative content. Include "lost" changes from git: o Use /dev/efi instead of /dev/efidev o Remove redundant NULL checks. Submitted by: kib@, dim@, zbb@, emaste@	2016-10-13 06:56:23 +00:00
Jonathan T. Looney	bd79708dbf	In the TCP stack, the hhook(9) framework provides hooks for kernel modules to add actions that run when a TCP frame is sent or received on a TCP session in the ESTABLISHED state. In the base tree, this functionality is only used for the h_ertt module, which is used by the cc_cdg, cc_chd, cc_hd, and cc_vegas congestion control modules. Presently, we incur overhead to check for hooks each time a TCP frame is sent or received on an ESTABLISHED TCP session. This change adds a new compile-time option (TCP_HHOOK) to determine whether to include the hhook(9) framework for TCP. To retain backwards compatibility, I added the TCP_HHOOK option to every configuration file that already defined "options INET". (Therefore, this patch introduces no functional change. In order to see a functional difference, you need to compile a custom kernel without the TCP_HHOOK option.) This change will allow users to easily exclude this functionality from their kernel, should they wish to do so. Note that any users who use a custom kernel configuration and use one of the congestion control modules listed above will need to add the TCP_HHOOK option to their kernel configuration. Reviewed by: rrs, lstewart, hiren (previous version), sjg (makefiles only) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D8185	2016-10-12 02:16:42 +00:00
Warner Losh	f79d484dff	Create /dev/efidev to provide an ioctl interface to userland. It supports userland interfaces to UEFI Runtime Services. This is intended to be the MI portion of EFI RuntimeServices support. Differential Revision: https://reviews.freebsd.org/D8128 Reviewed by: kib@, wblock@, Ganael Laplanche	2016-10-11 22:24:30 +00:00
Konstantin Belousov	83c001d3c2	Re-apply r306516 (by cem): Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags Reduce contention during TLB invalidation operations by using a per-CPU completion flag, rather than a single atomically-updated variable. On a Westmere system (2 sockets x 4 cores x 1 threads), dtrace measurements show that smp_tlb_shootdown is about 50% faster with this patch; observations with VTune show that the percentage of time spent in invlrng_single_page on an interrupt (actually doing invalidation, rather than synchronization) increases from 31% with the old mechanism to 71% with the new one. (Running a basic file server workload.) Submitted by: Anton Rang <rang at acm.org> Reviewed by: cem (earlier version) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8041	2016-10-04 17:01:24 +00:00
Hans Petter Selasky	97549c34ec	Move the ConnectX-3 and ConnectX-2 driver from sys/ofed into sys/dev/mlx4 like other PCI network drivers. The sys/ofed directory is now mainly reserved for generic infiniband code, with the exception of the mthca driver. - Add new manual page, mlx4en(4), describing how to configure and load mlx4en. - All relevant driver C-files are now prefixed mlx4, mlx4_en and mlx4_ib respectively to avoid object filename collisions when compiling the kernel. This also fixes an issue with proper dependency file generation for the C-files in question. - Device mlxen is now device mlx4en and depends on device mlx4, see mlx4en(4). Only the network device name remains unchanged. - The mlx4 and mlx4en modules are now built by default on i386 and amd64 targets. Only building the mlx4ib module depends on WITH_OFED=YES. Sponsored by: Mellanox Technologies	2016-09-30 08:23:06 +00:00
Bruce Evans	9eeaa0ea1f	Minor fixes for 16-bit disassembly: (1) Print the default segment %ss before addresses relative to %bp. This is too cluttered for me, but so is printing some other default prefixes, and this is a reasonable reminder that %ss is quite likely to be different from %ds in 16-bit mode. db_disasm still handles prefixes poorly, by trying to discard redundant ones. This loses information, and sometimes the result is wrong or misleading. Clean up nearby initializations and dead code. (2) Fix decoding of operand and address size prefixes in 16-bit mode. They reverse the default in all modes. Obtained from: (1) is partly from r1.4 (2003/11/08) in DFlyBSD (?)	2016-09-25 18:39:24 +00:00
Tijl Coosemans	81d7ca7761	MFamd64: r266901 Allocate a zeroed LDT. Failing to do this might result in the LDT appearing to run out of free descriptors because of random junk in the descriptor's 'sd_type' field. http://lists.freebsd.org/pipermail/freebsd-amd64/2014-May/016088.html PR: 212639 Submitted by: wheelcomplex@gmail.com MFC after: 2 weeks	2016-09-25 18:29:02 +00:00
Bruce Evans	808cf02c24	Determine the operand/address size of %cs in a new function db_segsize(). Use db_segsize() to set the default operand/address size for disassembling. Allow overriding this with the "alternate" display format /I. The API of db_disasm() should be debooleanized to pass a more general request (amd64 needs overrides to sizes of 16, 32, and 64, but this commit doesn't implement anything for amd64 since much larger changes are needed to restore the amd64 disassembler's support for non-default sizes). Fix db_print_loc_and_inst() to ask for the normal format and not the alternate in normal operation. This is most useful for vm86 mode, but also works for 16-bit protected mode. Use db_segsize() to avoid trying to print a garbage stack trace if %cs is 16 bits. Print something like the stack trace termination message for a trap boundary instead. Document that the alternate format is now useful on i386.	2016-09-25 16:30:29 +00:00
Bruce Evans	f5435b8bbe	Fix vm86 initialization, part 3 of 2 and a half. (Actually, just fix early printfs and debugging of vm86 initialization and some other early initialization in some cases.) Add an option debug.late_console (with default 1=off) to move console and kdb initialization back where it was. Do the same for amd64 although there is no vm86 there. On my test system, debug.late_console=0 works for the syscons, sio and uart console drivers on amd64 and i386, and for vt on i386 but not on amd64. The early printfs fixed by debug.late_console=0 are: - on i386, the message about lost memory above 4G - with -v in otherwise normal use, about 20 printfs for SMAP - other debugging messages for memory sizing. Mostly under -v and not printed in normal use. Document in a comment how much earlier the initialization and early printf()s can be. That is very early for the console. Not much more than curthread is needed. kdb use obviously needs to be not so early, since it needs IDT initialization and that is done relatively late for convenience and historical reasons.	2016-09-25 14:56:24 +00:00
Mark Johnston	bdaf6d6913	Regenerate syscall provider argument strings.	2016-09-22 04:50:03 +00:00
Bruce Evans	1d3c0fa7b2	Remove all kernel uses of pcb_psl, but keep it in the struct to preserve the ABI and API for applications. It was removed in the port to amd64, but remained as garbage giving a micro-pessimization and spurious single-step traps on i386. pcb_psl was intended to be used just to do a context switch of PSL_I, but this context switch was null in most or all versions, and mis-switching of PSL_T was done instead. Some history: - in 386BSD-0.0, cpu_switch() ran at splhigh() and splhigh() did too much interrupt disabling, so interrupts were hard-disabled across cpu_switch() and too many other places - in 386BSD-0.0-patchkit through FreeBSD-4 and FreeBSD-5 before SMPng, splhigh() did soft interrupt masking, and cpu_switch() was excessively cautious and did a cli at the start and a sti at the end to hard-disable interrupts across the switch - SMPng replaced the spl's and cli's by spinlocks (just sched_lock?), so interrupts were hard-disabled across cpu_switch() and too many other places again - initial attempts to fix this intended to restore some soft interrupt disabling, but to support variations in this cpu_switch() used pushfl/popfl into pcb_psl to avoid hard-coding the assumption that the initial and final states have PSL_I enabled. But the version with soft interrupt disabling wasn't used for long, or was never committed, (except I always used my different version of it for UP) so the pushfl/popfl and pcb_psl to hold them have been doing less than nothing for about 14 years.	2016-09-17 14:00:52 +00:00


13047 Commits