exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
from arbitrary processes. Similar to ktrace it can apply a change to all
existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
control operations on processes (as opposed to the debugger-specific
operations provided by ptrace(2)). procctl(2) uses a combination of
idtype_t and an id to identify the set of processes on which to operate
similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
of a set of processes. MADV_PROTECT still works for backwards
compatability.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
the first bit of which is used to track if P_PROTECT should be inherited
by new child processes.
Reviewed by: kib, jilles (earlier version)
Approved by: re (delphij)
MFC after: 1 month
Set a 'fake' acpi_id for the i386 PV port, it is needed in
order to use VIRQs or IPI event channels.
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Reviewed by: gibbs
Approved by: re (blanket Xen)
MFC after: 2 weeks
fix for LIO (Linux target), removing possibility for the target to avoid mutual
CHAP by choosing to skip authentication altogether, and fixing truncated error
messages in iscsictl(8) output. This also fixes several of the problems found
with Coverity.
Note that this change requires world rebuild.
Coverity CID: 1088038, 1087998, 1087990, 1088004, 1088044, 1088041, 1088040
Approved by: re (blanket)
Sponsored by: FreeBSD Foundation
While here, correct all consumers to pass NULL instead of 0 as we pass
capability rights as pointers now, not uint64_t.
Reported by: Daniel Peyrolon
Tested by: Daniel Peyrolon
Approved by: re (marius)
to implement epoll subset of functionality. The kqueue user data are 32bit
on i386 which is not enough for epoll user data so this patch overrides
kqueue fileops to maintain enough space in struct file.
Initial patch developed by me in 2007 and then extended and finished
by Yuri Victorovich.
Approved by: re (delphij)
Sponsored by: Google Summer of Code
Submitted by: Yuri Victorovich <yuri at rawbw dot com>
Tested by: Yuri Victorovich <yuri at rawbw dot com>
wireless home router.
Notable things:
2x 16 MB flash devices
Atheros Wireless
Atheros Switching
Many thanks to adrian@ for his guidance on this and keeping the drivers in
the base system up to date
Approved by: re (delphij)
The freebsd32 compatibility mode (for running 32-bit binaries on 64-bit
kernels) does not currently allow any system calls in capability mode, but
still permits cap_enter(). As a result, 32-bit binaries on 64-bit kernels
that use capability mode do not work (they crash after being disallowed to
call sys_exit()). Affected binaries include dhclient and uniq. The latter's
crashes cause obscure build failures.
This commit makes freebsd32 cap_enter() fail with [ENOSYS], as if capability
mode was not compiled in. Applications deal with this by doing their work
without capability mode.
This commit does not fix the uncommon situation where a 64-bit process
enters capability mode and then executes a 32-bit binary using fexecve().
This commit should be reverted when allowing the necessary freebsd32 system
calls in capability mode.
Reviewed by: pjd
Approved by: re (hrs)
Requirements) systems from the projects/pseries branch. This in principle
includes all IBM POWER hardware released in the last 15 years with the
exception of POWER3-based systems when run in 64-bit mode. The main
development target, however, has been the PAPR logical partition support
that is the default target in KVM on POWER and QEMU -- mileage may vary
on actual hardware at present. Much of the heavy lifting here was done
by Andreas Tobler.
Approved by: re (kib)
immediate operand. The presence of an SIB byte in decoding the ModR/M field
would cause 'imm_bytes' to not be set to the correct value.
Fix this by initializing 'imm_bytes' independent of the ModR/M decoding.
Reported by: grehan@
Approved by: re@
needed on some more fragile systems to avoid machine checks when blindly
probing the PCI bus. Also reduce ofw_pcibus's priority slightly so that it
can be overridden.
Approved by: re (gjb)
into the DMA map. The length of the buffer had not yet been
initialized, however, so this would copy gibberish unless it
happened to be right by chance. This bug mostly only affected
systems with IOMMUs.
Approved by: re (gjb)
MFC after: 3 days
Apply the given advice to the specified range of addresses within the
given pmap. Depending on the advice, clear the referenced and/or
modified flags in each mapping. Superpage within the given range will
be demoted or destroyed.
Reviewed by: alc
Approved by: cognet (mentor)
Approved by: re
When clearing the modification status of the superpage, one of the
base pages produced during demotion should be marked as write disabled.
The intention is that subsequent write access may repromote.
In the current implementation this was done wrong as write permission was
granted instead of forbidden.
Approved by: cognet (mentor)
Approved by: re
and the equivalent functionality is now provided by sendfile(2) over
posix shared memory filedescriptor.
Remove the cow member of struct vm_page, and rearrange the remaining
members. While there, make hold_count unsigned.
Requested and reviewed by: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
Approved by: re (delphij)
not cover entire superpage, avoid copying. Doing partial copy would
require demotion, which is incompatible with the already held locks.
Reported by: cperciva
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (delphij)
used as cross-references in the device tree and phandles as used by the
Open Firmware client interface are in different namespaces. This include
IBM pSeries hardware as well as FDT systems. FDT certainly abuses
ihandles for this purpose and should be modified to use this API
eventually. This changes no behavior on systems where FreeBSD already
worked.
Reviewed by: marius
Approved by: re (kib)
MFC after: 2 weeks
Calling those functions with the drmn device as argument causes a panic,
because it's not a direct child of pci$N. They must be called with the
vgapci device instead.
This fix is not enough to make suspend/resume work reliably.
Approved by: re (blanket)
vga_pci_(un)map_bios() takes a vgapci device as argument, not a drmn
one. This fixes a bug where the BIOS couldn't be mapped if the device
wasn't the boot display.
Approved by: re (kib; blanket for following drm2/radeon commits)
portion is invalidated, invalidate the whole page. Otherwise,
partially valid page appears on a page queue, which is wrong. This
could only happen for the last page, because only then buffer which
triggered invalidation could not cover the whole page.
Reported and tested by: pho (previous version)
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
Approved by: re (delphij)
MFC after: 2 weeks
time removal on kqueue close.
Reported and tested by: pho
Reviewed by: jmg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (delphij)
exclusively. Filesystems are assumed to disable shared locking for
the fifo vnode locks, but some do not.
Reported and tested by: olgeni
Discussed with: avg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (glebius)
continuous calls to the uprintf(9), the proctree_lock could be
shared-locked for indefinite amount of time, starving exclusive
requests. Since proctree_lock is needed for fork() and exit(), this
effectively stops the machine.
While there, do the similar reduction for tprintf(9).
Reported and tested by: pho
Reviewed by: ed
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (glebius)
chain. This repairs a panic observed during pageout on some 64-bit
PowerPC systems.
Submitted by: grehan
Approved by: re (kib)
MFC after: 2 weeks
Revisit after: 10.0
to not get scanned on boot.
The problem originated in change 253549. With the change to the mps(4)
driver to scan only targets that it knows it has (as opposed to scanning
the entire bus), scanning RAID volumes on boot was omitted.
So, for versions of FreeBSD that have the scanning changes
(__FreeBSD_version 1000039 and higher), scan RAID volumes that are added
whether or not we're booting.
PR: kern/181784
Reported by: Xiguang Wang <kurapica@gmail.com>
Tested by: Dennis Glatting <dg@pki2.com>
Sponsored by: Spectra Logic
Approved by: re (delphij)
MFC After: 3 days
amount of free memory was close to the point at which we would begin
reclaiming pages. Now, we continuously scan the active page queue,
regardless of the amount of free memory. Consequently, we are continuously
calling pmap_ts_referenced() on active pages.
Prior to this change, pmap_ts_referenced() would always demote superpage
mappings in order to obtain finer-grained reference information. This made
sense because we were coming under memory pressure and would soon have to
begin reclaiming pages. Now, however, with continuous scanning of the
active page queue, these demotions are taking a toll on performance. To
address this problem, I have replaced the demotion with a heuristic for
periodically clearing the reference flag on superpage mappings.
Approved by: re (kib)
Sponsored by: EMC / Isilon Storage Division
extremely outdated, and I doubt that it was ever used for ifnet drivers.
It was used for AF_INET sockets in pre-FreeBSD time.
Approved by: re (hrs)
Sponsored by: Nginx, Inc.
the maximum number of VT-d domains (256 on a Sandybridge). We now allocate a
VT-d domain for a guest only if the administrator has explicitly configured
one or more PCI passthru device(s).
If there are no PCI passthru devices configured (the common case) then the
number of virtual machines is no longer limited by the maximum number of
VT-d domains.
Reviewed by: grehan@
Approved by: re@
in addition to the regular files.
Requested by: alc
Discussed with: emaste
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
Approved by: re (hrs)
transmission which could be tricked into rounding up to the nearest
page size, leaking up to a page of kernel memory. [13:11]
In IPv6 and NetATM, stop SIOCSIFADDR, SIOCSIFBRDADDR, SIOCSIFDSTADDR
and SIOCSIFNETMASK at the socket layer rather than pass them on to the
link layer without validation or credential checks. [SA-13:12]
Prevent cross-mount hardlinks between different nullfs mounts of the
same underlying filesystem. [SA-13:13]
Security: CVE-2013-5666
Security: FreeBSD-SA-13:11.sendfile
Security: CVE-2013-5691
Security: FreeBSD-SA-13:12.ifioctl
Security: CVE-2013-5710
Security: FreeBSD-SA-13:13.nullfs
Approved by: re
This should be sufficient for 10.0 and will do
until forthcoming work to avoid limitations
in this area is complete.
Thanks to Bela Lubkin at tidalscale for the
headsup on the apic/cpu id/io apic ASL parameters
that are actually hex values and broke when
written as decimal when 11 vCPUs were configured.
Approved by: re@
Illumos ZFS issues:
3582 zfs_delay() should support a variable resolution
3584 DTrace sdt probes for ZFS txg states
Provide a compatibility shim for Solaris's cv_timedwait_hires
to help aid future porting.
Approved by: re (ZFS blanket)
an address in the first 2GB of the process's address space. This flag should
have the same semantics as the same flag on Linux.
To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address. While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.
Reviewed by: alc
Approved by: re (kib)
alias it to the contents of the output property if it is defined. This
avoids a panic when booting machines (QEMU) where the output-device
property is not defined.
Since output-device is free-form and potentially conflicts with other
entries in /dev, I also am not sure we should be doing the aliasing at
all, but this at least makes things work again.
Approved by: re (kib)
an on-stack array to a pointer and therefore sizeof(errmsg) would
become 4 or 8 bytes depending on the architecture.
Fix this by using ERRMSGL in place of sizeof().
Submitted by: J David <j.david.lists@gmail.com>
MFC after: 3 days
Approved by: re (kib)
amount of free memory was close to the point at which we would begin
reclaiming pages. Now, we continuously scan the active page queue,
regardless of the amount of free memory. Consequently, we are continuously
calling pmap_ts_referenced() on active pages.
Prior to this change, pmap_ts_referenced() would always demote superpage
mappings in order to obtain finer-grained reference information. This made
sense because we were coming under memory pressure and would soon have to
begin reclaiming pages. Now, however, with continuous scanning of the
active page queue, these demotions are taking a toll on performance. For
example, on one of my test machines, the running time for the HPCC Random
Access benchmark (also known as GUPS) has increased by 54%. To address this
problem, I have replaced the demotion with a heuristic for periodically
clearing the reference flag on superpage mappings.
Reviewed by: kib
Approved by: re (glebius)
Sponsored by: EMC / Isilon Storage Division
pmap_remove_all(). Not doing the drain allows the pmap_enter() to
proceed in parallel, making the pmap_remove_all() effects void.
The race results in an invalidated page mapped wired by usermode.
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
Approved by: re (glebius)
Pass the pointy hat please.
Also unblock the software (Yarrow) generator for now. This will be
reverted; Yarrow needs to block until secure, not this behaviour
of serving as soon as asked.
Folks with specific requiremnts will be able to (can!) unblock this
device with any write, and are encouraged to do so in /etc/rc.d/*
scripting. ("Any" in this case could be "echo '' > /dev/random" as
root).
Changes are to
- update board and network interface detection logic
- fix reading onboard CPLD in little-endian config
- print NAE frequency conrrectly for Bx chips
- update XAUI config to disable Rx/Tx until interface is up
Submitted by: Venkatesh J V <venkatesh.vivekanandan@broadcom.com>
Use a variant of mips libc memcpy for kernel. This implementation uses
64-bit operations when compiled for 64-bit, and is significantly faster
in that case.
Submitted by: Tanmay Jagdale <tanmayj@broadcom.com>
ffsl() implementation, when it is available, instead of homegrown iteration.
On dual-E5645 amd64 system (2x6x2 cores) under heavy I/O load that reduces
time spent inside cpu_search() from 19% to 13%, while IOPS increased by 5%.
the rest by me.
o Namespace cleanup; the Yarrow name is now restricted to where it
really applies; this is in anticipation of being augmented or
replaced by Fortuna in the future. Fortuna is mentioned, but behind
#if logic, and is ignorable for now.
o The harvest queue is pulled out into its own modules.
o Entropy harvesting is emproved, both by being made more conservative,
and by separating (a bit!) the sources. Available entropy crumbs are
marginally improved.
o Selection of sources is made clearer. With recent revelations,
this will receive more work in the weeks and months to come.
Submitted by: Arthur Mesh (partly) <arthurmesh@gmail.com>
dev_ref() in the clone handlers that still use it.
- Don't set SI_CHEAPCLONE flag, it's not used anywhere neither in devfs
(for anything real)
Reviewed by: kib
IMAN register to clear the pending interrupt status bits. This patch
tries to solve problems seen on the MacBook Air, as reported by
Johannes Lundberg <johannes@brilliantservice.co.jp>
MFC after: 1 week
Our code does not consider yet the case of hash collisions. This
is a rather annoying situation where two or more files that
happen to have the same hash value will not appear accessible.
The situation is not difficult to work-around but given that things
will just work without enabling htree we will save possible
embarrassments for the next release.
Reported by: Kevin Lo
shutdown was reporetd via email. The crashes occurred because the
client side NLM would attempt to use its socket after it had been
destroyed. Looking at the code, it would soclose() once the reference
count on the socket handling structure went to 0. Unfortunately,
nlm_host_get_rpc() will simply allocate a new socket handling structure
when none exists and use the now soclose()d socket. Since there doesn't
seem to be a safe way to determine when the socket is no longer needed,
this patch modifies the code so that it never soclose()es the socket.
Since there is only one socket ever created, this does not introduce a
leak when the rpc.lockd is stopped/restarted. The patch also disables
unloading of the nfslockd module, since it is not safe to do so (and
has never been safe to do so, from what I can see).
Reported by: mav
Tested by: mav
MFC after: 2 weeks
IPI implmementations.
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Submitted by: gibbs (misc cleanup, table driven config)
Reviewed by: gibbs
MFC after: 2 weeks
sys/amd64/include/cpufunc.h:
sys/amd64/amd64/pmap.c:
Move invltlb_globpcid() into cpufunc.h so that it can be
used by the Xen HVM version of tlb shootdown IPI handlers.
sys/x86/xen/xen_intr.c:
sys/xen/xen_intr.h:
Rename xen_intr_bind_ipi() to xen_intr_alloc_and_bind_ipi(),
and remove the ipi vector parameter. This api allocates
an event channel port that can be used for ipi services,
but knows nothing of the actual ipi for which that port
will be used. Removing the unused argument and cleaning
up the comments surrounding its declaration helps clarify
its actual role.
sys/amd64/amd64/mp_machdep.c:
sys/amd64/include/cpu.h:
sys/i386/i386/mp_machdep.c:
sys/i386/include/cpu.h:
Implement a generic framework for amd64 and i386 that allows
the implementation of certain CPU management functions to
be selected at runtime. Currently this is only used for
the ipi send function, which we optimize for Xen when running
on a Xen hypervisor, but can easily be expanded to support
more operations.
sys/x86/xen/hvm.c:
Implement Xen PV IPI handlers and operations, replacing native
send IPI.
sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
sys/i386/include/smp.h:
Remove NR_VIRQS and NR_IPIS from FreeBSD headers. NR_VIRQS
is defined already for us in the xen interface files.
NR_IPIS is only needed in one file per Xen platform and is
easily inferred by the IPI vector table that is defined in
those files.
sys/i386/xen/mp_machdep.c:
Restructure to more closely match the HVM implementation by
performing table driven IPI setup.
These were used to control/export dispatch policy but they're not anymore.
This commit cannot be MFC'ed to 9 because old netstat(9) binary relies
on such sysctl to work. On the other hand, there's no real reason to
keep'em around in 10.
To enable them, set WITH_GCC and WITH_GNUCXX in src.conf.
Make clang default to using libc++ on FreeBSD 10.
Bumped __FreeBSD_version for the change.
GCC is still enabled on PC98, because the PC98 bootloader requires GCC to build
(or, at least, hard-codes the use of gcc into its build).
Thanks to everyone who helped make the ports tree ready for this (and bapt
for coordinating them all). Also to imp for reviewing this and working on the
forward-porting of the changes in our gcc so that we're getting to a much
better place with regard to external toolchains.
Sorry to all of the people who helped who I forgot to mention by name.
Reviewed by: bapt, imp, dim, ...
pmap_is_modified() and pmap_is_referenced(), same as it was done for
pmap_ts_referenced().
Consolidate identical code for pmap_is_modified() and
pmap_is_referenced() into helper pmap_page_test_mappings().
Reviewed by: alc
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
add some packet(s) to tx ring and arge_stop() is called before receive the
sent packet interrupt from hardware. Fix arge_stop() to unload the in use
dma tags and free the associated mbuf.
PR: 178319, 163670
Approved by: adrian (mentor)
sf_buf_alloc()/sf_buf_free() inlines, to save two calls to an absolutely
empty functions.
Reviewed by: alc, kib, scottl
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
krpc client side UDP was observed as way out of range and
caused the rpc.lockd daemon to hang trying to do an RPC.
Inspection of the code found two places where the RPC request
is re-queued, but the value of cu_sent was not incremented.
Since cu_sent is always decremented when the RPC request is
dequeued, I think this could have caused cu_sent to go out of
range. This patch adds lines to increment cu_sent for these
two cases.
Reported by: dwhite@ixsystems.com
Discussed with: dwhite@ixsystems.com
MFC after: 2 weeks
making sure they are all misaligned at +8 bytes. This fixes clang builds
of powerpc64 kernels (aside from a required increase in KSTACK_PAGES which
will come later).
This commit from FreeBSD/powerpc64 with a clang-built kernel.
MFC after: 2 weeks