freebsd-dev

Author	SHA1	Message	Date
Rick Macklem	e2ade3b6f7	Move the "retry:" label so that the calls to m_pullup() are not done after the call to m_defrag(). This fixes a problem where m_pullup() would prepend an mbuf to the list created by m_defrag() making the chain greater than 32 again. Tested by: rcarter@pinyon.org Reviewed by: yongari, jfv MFC after: 2 weeks	2014-07-15 23:32:13 +00:00
Xin LI	7079d5877c	MFV r268714: Improve extreme rewind import. When doing an "extreme rewind" import ("zpool import -XF"), we attempt to verify all data in the pool, essentially scrubbing the entire pool. The problem is that spa_load_verify_cb() issues an unbounded number of concurrent scrub i/os. This can lead to all of memory being used for these zio's, wedging the system. Like normal scrub, we need to put a cap on the number of outstanding i/os, and have the traverse thread block when we reach this cap. For this purpose the cap can be very large (10,000) to optimize the elevator algorithm. Three kernel tunables have been added: vfs.zfs.spa_load_verify_maxinflight vfs.zfs.spa_load_verify_metadata vfs.zfs.spa_load_verify_data The latter two tunables control whether metadata and/or user data are verified when doing extreme rewind. Make 'zpool import -T' imply scrub. Make zpool import -T <txg> accept hexadecimal values for the txg when prefixed with 0x. Skip txg's for which there is no uberblock when doing extreme rewind. Skip reading all user data twice by skipping prefetches when doing extreme rewinds as we do not access via the ARC. Illumos issues: 4970 need controls on i/o issued by zpool import -XF 4971 zpool import -T should accept hex values 4972 zpool import -T implies extreme rewind, and thus a scrub 4973 spa_load_retry retries the same txg 4974 spa_load_verify() reads all data twice MFC after: 2 weeks	2014-07-15 22:44:04 +00:00
Xin LI	eb75155228	MFV r268702: Add missing *_destroy() calls in various places with ZFS. Illumos issue: 4975 missing mutex_destroy() calls in zfs MFC after: 2 weeks	2014-07-15 20:32:23 +00:00
Konstantin Belousov	a62eb1398a	Followup to r268466. - Move the code to calculate resident count into separate function. It reduces the indent level and makes the operation of vmmap_skip_res_cnt tunable more clear. - Optimize the calculation of the resident page count for map entry. Skip directly to the next lowest available index and page among the whole shadow chain. - Restore the use of pmap_incore(9), only to verify that current mapping is indeed superpage. - Note the issue with the invalid pages. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-15 19:57:03 +00:00
Konstantin Belousov	3760e341ca	Change the calculation of the kinfo_vmentry field kve_private_resident to reflect its name. Noted and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-15 19:49:00 +00:00
Navdeep Parhar	bae4e5af99	cxgbe(4): Display CF facility correctly in the device log. MFC after: 3 days	2014-07-15 18:24:41 +00:00
Neel Natu	f7a9f1784f	Add support for operand size and address size override prefixes in bhyve's instruction emulation [1]. Fix bug in emulation of opcode 0x8A where the destination is a legacy high byte register and the guest vcpu is in 32-bit mode. Prior to this change instead of modifying %ah, %bh, %ch or %dh the emulation would end up modifying %spl, %bpl, %sil or %dil instead. Add support for moffsets by treating it as a 2, 4 or 8 byte immediate value during instruction decoding. Fix bug in verify_gla() where the linear address computed after decoding the instruction was not being truncated to the effective address size [2]. Tested by: Leon Dang [1] Reported by: Peter Grehan [2] Sponsored by: Nahanni Systems	2014-07-15 17:37:17 +00:00
Alan Cox	87dd8ef960	Actually set the "no execute" bit on 1 MB page mappings in pmap_protect(). Previously, the "no execute" bit was being set directly in the PTE, instead of the local variable in which the new PTE value is being constructed. So, when the local variable was finally assigned to the PTE, the "no execute" bit setting was lost.	2014-07-15 17:16:06 +00:00
John Baldwin	fae9277339	Fix build with SMP disabled. CR: https://phabric.freebsd.org/D407 Reviewed by: royger	2014-07-15 15:40:33 +00:00
Konstantin Belousov	5e351014e0	Make amd64 pmap_copy_pages() functional for pages not mapped by DMAP. Requested and reviewed by: royger Tested by: pho, royger Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-15 09:30:43 +00:00
Alan Cox	c3c820296f	Eliminate repeated calculation of next_bucket in pmap_protect() and pmap_remove(). Eliminate an unnecessary variable from pmap_remove() and pmap_advise().	2014-07-15 05:34:27 +00:00
Navdeep Parhar	44eb893659	Allow multi-byte reads in the private CHELSIO_T4_GET_I2C ioctl. The firmware allows up to 48B to be read this way but the driver limits itself to 8B at a time to remain compatible with old cxgbetool binaries. MFC after: 1 week	2014-07-15 01:03:29 +00:00
Mateusz Guzik	965d08605f	Plug p_pptr null test in do_execve. It is always true.	2014-07-14 22:40:46 +00:00
Mateusz Guzik	c959c23740	Manage struct sigacts refcnt with atomics instead of a mutex. MFC after: 1 week	2014-07-14 21:12:59 +00:00
Ian Lepore	0f822edead	Fix the Zedboard/Zynq ethernet driver to handle media speed changes so that it can connect to switches at speeds other than 1gb. This requires changing the reference clock speed. Since we still don't have a general clock API that lets a SoC-independent driver manipulate its own clocks, this change includes a weak reference to a routine named cgem_set_ref_clk(). The default implementation is a no-op; SoC-specific code can provide an implementation that actually changes the speed. Submitted by: Thomas Skibo <ThomasSkibo@sbcglobal.net>	2014-07-14 20:58:57 +00:00
Nathan Whitehorn	b85beee188	On my Lenovo laptop, the firmware maps the EFI framebuffer with MTRRs set to uncacheable. This leads to execrable console performance. Once PMAP is up, remap the framebuffer as write-combining. This reduces boot time on my laptop by 60% when booting with EFI. MFC after: 2 weeks	2014-07-14 17:42:22 +00:00
Alan Cox	db3ddfd672	Eliminate dead code. There is no direct map. This code was cut-and-pasted from amd64.	2014-07-14 17:16:09 +00:00
Konstantin Belousov	4cda7f7ece	Rework the tmpfs unmount. - Suspend filesystem for unmount. This prevents new tmpfs nodes from instantiating, and also ensures that only unmount thread can destroy nodes. - Do not start tmpfs node deletion until all vnodes are reclaimed, which guarantees that no thread can access tmpfs data. For this, call vflush() in the loop, until the mnt_nvnodelistsize is non-zero. Note that after mnt_nvnodelistsize becomes 0, insmntque() blocks insertion of a vnode germ into the mount list of vnodes. - Fail node allocation when the filesystem is being unmounted. This is race-free due to the vflush() call in loop. This is mostly cosmetic, avoiding some more work which might be done until suspension in unmount is started. Note that there is currently no way to prevent new vnode instantiation from readers during the unmount. Due to this, forced unmount might live-lock if vflush() loop cannot get to the zero vnode count due to races with readers. The unmount would proceed after the load is lifted. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 09:52:33 +00:00
Konstantin Belousov	b5b3326191	Change forgotten in r268615. Set the OBJ_TMPFS_NODE flag for vm_object of VREG tmpfs node. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 09:35:14 +00:00
Konstantin Belousov	f08f7dca40	The OBJ_TMPFS flag of vm_object means that there is unreclaimed tmpfs vnode for the tmpfs node owning this object. The flag is currently used for two purposes. First, it allows to correctly handle VV_TEXT for tmpfs vnode when the ref count on the object is decremented to 1, similar to vnode_pager_dealloc() for regular filesystems. Second, it prevents some operations, which are done on OBJT_SWAP vm objects backing user anonymous memory, but are incorrect for the object owned by tmpfs node. The second kind of use of the OBJ_TMPFS flag is incorrect, since the vnode might be reclaimed, which clears the flag, but vm object operations must still be disallowed. Introduce one more flag, OBJ_TMPFS_NODE, which is permanently set on the object for VREG tmpfs node, and used instead of OBJ_TMPFS to test whether vm object collapse and similar actions should be disabled. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 09:30:37 +00:00
Konstantin Belousov	eb2c06b63a	Use tmpfs_vn_get_ino_gen() to handle the races with reclaim in tmpfs dotdot lookup. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 09:16:55 +00:00
Konstantin Belousov	fd63693dcf	Style. Add comment about lock mode. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 09:13:56 +00:00
Konstantin Belousov	895b3782c6	Extract the code to put a filesystem into the suspended state (at the unmount time) in the helper vfs_write_suspend_umnt(). Use it instead of two inline copies in FFS. Fix the bug in the FFS unmount, when suspension failed, the ufs extattrs were not reinitialized. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 09:10:00 +00:00
Konstantin Belousov	7a41bc2f41	In tmpfs_alloc_file(), code after the 'out' label does only 'return error;'. Replace goto's with the return. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 09:02:40 +00:00
Konstantin Belousov	d2ca06cdd2	Add convenience macro to assert tmpfs node lock. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 08:59:25 +00:00
Konstantin Belousov	55781cb922	Add some assertions for the code handling vm_object for tmpfs vnode. In particular, vnode must be exclusively locked when the tmpfs vnode and object are divorced. When the vnode is opened, the object must be still alive, since only live vnode can be opened, and the tmpfs node owns a reference on the object. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 08:55:02 +00:00
Konstantin Belousov	706f80801d	The tmpfs_link() must not dereference the filesystem-specific data for a vnode until it is verified that the vnode indeed belongs to tmpfs mount. Otherwise, it might access random memory, at least in the debug kernel. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 08:45:29 +00:00
Konstantin Belousov	57ef02ff0f	In kern_linkat(), avoid passing doomed vnode to the VOP. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 08:41:13 +00:00
Konstantin Belousov	a69452162a	Generalize vn_get_ino() to allow filesystems to use custom vnode producer, instead of hard-coding VFS_VGET(). New function, which takes callback, is called vn_get_ino_gen(), standard callback for vn_get_ino() is provided. Convert inline copies of vn_get_ino() in msdosfs and cd9660 into the uses of vn_get_ino_gen(). Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 08:34:54 +00:00
Konstantin Belousov	fca015d301	Remove code separator lines which do not conform to style(9). Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-14 08:17:11 +00:00
Kevin Lo	cb7df69b7e	Make bind(2) and connect(2) return EAFNOSUPPORT for AF_UNIX on wrong address family. See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191586 for the original discussion. Reviewed by: terry	2014-07-14 06:00:01 +00:00
Mark Johnston	291624fdf6	Invoke the DTrace trap handler before calling trap() on amd64. This matches the upstream implementation and helps ensure that a trap induced by tracing fbt::trap:entry is handled without recursively generating another trap. This makes it possible to run most (but not all) of the DTrace tests under common/safety/ without triggering a kernel panic. Submitted by: Anton Rang <anton.rang@isilon.com> (original version) Phabric: D95	2014-07-14 04:38:17 +00:00
Alan Cox	f26bcf99e0	Eliminate an unused variable. Refresh two comments.	2014-07-13 17:52:07 +00:00
Alan Cox	a844c68fc2	Implement pmap_unwire(). See r268327 for the motivation behind this change.	2014-07-13 16:27:57 +00:00
Mark Johnston	05929f8bd0	Add a headphone redirection quirk for the Lenovo G580. MFC after: 1 week	2014-07-13 10:31:29 +00:00
Hans Petter Selasky	817a8cac2e	Turn off blinking device leds at attach. MFC after: 3 days PR: 183735	2014-07-13 09:34:59 +00:00
Hans Petter Selasky	948d799e27	Fix performance problems with AXGE network adapter in RX direction: - Remove 4 extra bytes from the ethernet payload. - The maximum RX buffer was incorrectly set. Increase it to 64K for now, until the exact limit is understood. - Enable hardware checksumming again. - Make hardware data structure packed. MFC after: 3 days	2014-07-13 07:39:28 +00:00
Alexander Motin	4d877c4148	Merge several equal serialization indexes.	2014-07-13 06:01:23 +00:00
Mateusz Guzik	8bedd5d782	Clear nonblock and async on devctl close instead of open. This is a purely cosmetic change.	2014-07-12 15:35:04 +00:00
Rui Paulo	efce3748f3	Revert r268543. We should probably fix sys/gpio.h instead.	2014-07-12 06:23:42 +00:00
Adrian Chadd	c7c0d94874	Add IPv6 flowid, bindmulti and RSS awareness.	2014-07-12 05:46:33 +00:00
Adrian Chadd	a8a2d8003a	Add INP_RSS_BUCKET_SET awareness for IPv6 pcbgroup entries. This ensures that a listen socket with INP_RSS_BUCKET_SET set will use the pre-determined PCBGROUP rather than what the hashing path chooses.	2014-07-12 05:45:53 +00:00
Adrian Chadd	6e4405cee1	Add the IPv6 versions of the multi-bind, hash/hash type and RSS options.	2014-07-12 05:44:16 +00:00
Adrian Chadd	e989b65f79	Add RSS hashing awareness for IPv6 and TCP IPv6 hash types.	2014-07-12 05:43:43 +00:00
Adrian Chadd	76e63232b6	Add some hash types for UDP RSS for both IPv4 and IPv6. Nothing is yet using this but I'd like to reserve these values.	2014-07-12 05:42:57 +00:00
Adrian Chadd	d5bb8bd315	Expose in_pcbbind_check_bindmulti() so the upcoming IPv6 RSS changes can be made to use it.	2014-07-12 05:40:13 +00:00
Rui Paulo	bd08cbb81a	Move iic.h to sys/ so that it's automatically installed in /usr/include/sys. This lets us call iic(4) ioctls without needing the kernel source code and follows the same model of GPIO. MFC after: 3 weeks	2014-07-12 01:04:10 +00:00
Rui Paulo	0f912c5a5d	Remove _DTRACE_VERSION from sdt.h. It will now come from the command line (bsd.dep.mk). MFC after: 3 weeks	2014-07-12 00:57:00 +00:00
Michael Tuexen	0c8682e8ad	Whitespace changes. MFC after: 1 week	2014-07-11 21:15:40 +00:00
Navdeep Parhar	30f337891d	cxgbe(4): Add an iSCSI softc to the adapter structure.	2014-07-11 21:02:54 +00:00
Gleb Smirnoff	1fbe6a82f4	Improve reference counting of EXT_SFBUF pages attached to mbufs. o Do not use UMA refcount zone. The problem with this zone is that several refcounting words (16 on amd64) share the same cache line, and issuing atomic(9) updates on them creates cache line contention. Also, allocating and freeing them is extra CPU cycles. Instead, refcount the page directly via vm_page_wire() and the sfbuf via sf_buf_alloc(sf_buf_page(sf)) [1]. o Call refcounting/freeing function for EXT_SFBUF via direct function call, instead of function pointer. This removes barrier for CPU branch predictor. o Do not cleanup the mbuf to be freed in mb_free_ext(), merely to satisfy assertion in mb_dtor_mbuf(). Remove the assertion from mb_dtor_mbuf(). Use bcopy() instead of manual assignments to copy m_ext in mb_dupcl(). [1] This has some problems for now. Using sf_buf_alloc() merely to increase refcount is expensive, and is broken on sparc64. To be fixed. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-07-11 19:40:50 +00:00
Michael Tuexen	f64a0b069a	Bugfix: When a remote address was added to an endpoint, a source address was selected and cached, but it was not stored that it was cached. This resulted in selecting different source addresses for the INIT-ACK and COOKIE-ACK when possible. Thanks to Niu Zhixiong for reporting the issue. MFC after: 1 week	2014-07-11 17:31:40 +00:00
Cy Schubert	42834773b3	Remove redundant USE_INET6 test that enables INET6 in the ipfilter userland regardless of the setting in make.conf. PR: 190964 Approved by: glebius (mentor) MFC after: 1 week	2014-07-11 16:26:51 +00:00
Gleb Smirnoff	fcc34a238c	Fix style bug: rename the refcount field of m_ext to ext_cnt, to match other members. Sponsored by: Nginx, Inc.	2014-07-11 14:34:29 +00:00
Gleb Smirnoff	15c28f87b8	All mbuf external free functions never fail, so let them be void. Sponsored by: Nginx, Inc.	2014-07-11 13:58:48 +00:00
Michael Tuexen	4474d71a7b	Integrate upstream changes. MFC after: 1 week	2014-07-11 06:52:48 +00:00
Andrey V. Elsukov	ff899182ec	Fix condition. Sponsored by: Yandex LLC	2014-07-11 06:34:15 +00:00
Neel Natu	3ada6e07ac	Use the correct offset when converting a logical address (segment:offset) to a linear address.	2014-07-11 01:23:38 +00:00
Mateusz Guzik	88f98985aa	Eliminate plim and vtmp local vars in exit1. No functional changes. MFC after: 1 week	2014-07-10 22:54:38 +00:00
Mateusz Guzik	30d58d6b39	Don't make a temporary copy of fixed sysctl strings.	2014-07-10 21:46:57 +00:00
Mateusz Guzik	b23c40d7b1	Don't zero fd_nfiles during fdp destruction. Code trying to take a look has to check fd_refcnt and it is 0 by that time. This is a follow up to r268505, without this the code would leak memory for tables bigger than the default. MFC after: 1 week	2014-07-10 21:05:45 +00:00
Mateusz Guzik	e518baf8f9	Avoid relocking filedesc lock when closing fds during fdp destruction. Don't call bzero nor fdunused from fdfree for such cases. It would do unnecessary work and complain that the lock is not taken. MFC after: 1 week	2014-07-10 20:59:54 +00:00
Alan Cox	bfc30490a7	Correct the accounting code for wired mappings. The wrong field of the PVO entry was being tested. We were incrementing and decrementing the pmap's wired mapping count based on whether the physical page being mapped or unmapped was cache coherent, not whether it was a wired mapping. Reviewed by: nwhitehorn	2014-07-10 20:55:38 +00:00
Mark Johnston	58e6549541	Correct the setting of the VID in transmit descriptors when hardware VLAN tagging is enabled. This was broken in r266978. Reported by: gjb Tested by: gjb	2014-07-10 16:46:46 +00:00
Ian Lepore	8d99c2a062	Pending interrupt status is cleared by writing to the ISR, not the data reg. MFC after: 1 week	2014-07-10 14:06:18 +00:00
Pietro Cerutti	7150b86bfe	Implement Short/Small String Optimization in SBUF(9) and change lengths and positions in the API from ssize_t and int to size_t. CR: D388 Approved by: des, bapt	2014-07-10 13:08:51 +00:00
Gleb Smirnoff	8ff2bd98d6	On machines with strict alignment copy pfsync_state_key from packet on stack to avoid unaligned access. PR: 187381 Submitted by: Lytochkin Boris <lytboris gmail.com>	2014-07-10 12:41:58 +00:00
Konstantin Belousov	479fcb4e32	Unconditionally initialize addr to handle the case of changed map timestamp while the map is unlocked. Reported by: bz Sponsored by: The FreeBSD Foundation MFC after: 6 days	2014-07-10 11:20:24 +00:00
Kevin Lo	b11ce478cf	Enable 8051 before downloading firmware. Tested by: Carlos Jacobo Puga Medina <cpm at fbsd dot es>	2014-07-10 09:42:34 +00:00
Bryan Venteicher	32487a8973	Rework when the Tx queue completion interrupt is enabled The Tx interrupt is now kept disabled in the common case, only enabled when the number of free descriptors in the queue falls below a threshold. Transmitted frames are cleared from the VQ before subsequent transmit, or in the watchdog timer. This was a very big performance improvement for an experimental Netmap bhyve backend. MFC after: 1 month	2014-07-10 05:36:04 +00:00
Bryan Venteicher	4b59668f0e	Add accessor to get the number of free descriptors in the virtqueue MFC after: 1 month	2014-07-10 05:26:01 +00:00
Adrian Chadd	0a100a6f1e	Implement the first stage of multi-bind listen sockets and RSS socket awareness. * Introduce IP_BINDMULTI - indicating that it's okay to bind multiple sockets on the same bind details. Although the PCB code has been taught about this (see below) this patch doesn't introduce the rest of the PCB changes necessary to distribute lookups among multiple PCB entries in the global wildcard table. * Introduce IP_RSS_LISTEN_BUCKET - placing a listen socket into the given RSS bucket (and thus a single PCBGROUP hash.) * Modify the PCB add path to be aware of IP_BINDMULTI: + Only allow further PCB entries to be added if the owner credentials and IP_BINDMULTI has been specified. Ie, only allow further IP_BINDMULTI sockets to appear if the first bind() was IP_BINDMULTI. * Teach the PCBGROUP code about IP_RSS_LISTEN_BUCKET marked PCB entries. Instead of using the wildcard logic and hashing, these sockets are simply placed into the PCBGROUP and _not_ in the wildcard hash. * When doing a PCBGROUP lookup, also do a wildcard match as well. This allows for an RSS bucket PCB entry to appear in a PCBGROUP rather than having to exist in the wildcard list. Tested: * TCP IPv4 server testing with igb(4) * TCP IPv4 server testing with ix(4) TODO: * The pcbgroup lookup code duplicated the wildcard and wildcard-PCB logic. This could be refactored into a single function. * This doesn't yet work for IPv6 (The PCBGROUP code in netinet6/ doesn't yet know about this); nor does it yet fully work for UDP.	2014-07-10 03:10:56 +00:00
Warner Losh	aa0b5651c1	Compile boot2 with clang on pc98.	2014-07-10 00:15:50 +00:00
Warner Losh	53dda6a8d5	Make SERIAL support optional again. Enable it for i386 because a huge percentage of machines has a 16550. Disable it for pc98 since only a tiny fraction of them have one. These changes save 293 bytes when building with clang, but preserves the ability to build with serial if you really want. We now have 92 bytes free (412 with the in-tree gcc).	2014-07-10 00:15:42 +00:00
Warner Losh	522d68a17f	Merge the clang support from i386. Don't move to clang yet.	2014-07-10 00:15:38 +00:00
Xin LI	1b174fa1eb	MFV r268455: Use reserved space for ZFS administrative commands. We reserve 1/2^spa_slop_shift = 1/32 or 3.125% of pool space (or 32MB at least) for system use. Most ZPL operations, e.g. write(2), creat(2), will fail with ENOSPC if we fall below this. Certain operations, e.g. file removal and most administrative actions, still permitted until half of the slop space is used. This would allow users to use these operations to free up space in the pool when pool is close to full but half of slop space is still free. A very restricted set of operations that frees up space or change quota are always permitted, regardless of the amount of free space. MFC after: 2 weeks	2014-07-09 23:14:59 +00:00
Aleksandr Rybalko	79b647995d	Should check fb_read method presence instead of double check for fb_write. Pointed by: emaste Sponsored by: The FreeBSD Foundation	2014-07-09 21:55:34 +00:00
Konstantin Belousov	fd815c0b8d	For safety, ensure that any consumer of the set_regs() and ptrace_set_pc() use the correct return to userspace using iret. The signal return, PT_CONTINUE (which in fact uses signal return path) set the pcb flag already. The setcontext(2) enforces iret return when %rip is incorrect. Due to this, the change is redundant, but is made to ensure that no path which modifies context, forgets to set PCB_FULL_IRET. Inspired by: CVE-2014-4699 Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-09 21:39:40 +00:00
Konstantin Belousov	a91831a261	Current code in sysctl proc.vmmap, whose intent is to calculate the amount of resident pages, in fact calculates the amount of installed pte entries in the region. Resident pages which were not soft-faulted yet are not counted. Calculate the amount of resident pages by looking in the objects chain backing the region. Add a knob to disable the residency calculation at all. For large sparse regions, either previous or updated algorithm runs for too long time, while several introspection tools do not need the (advisory) RSS value at all. PR: kern/188911 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-09 19:11:57 +00:00
Xin LI	fdc0ee2cf5	MFV r268452: Explicitly mark file removal transactions as "presumed to result in a net free of space" so they will not fail with ENOSPC. Illumos issue: 4950 files sometimes can't be removed from a full filesystem MFC after: 2 weeks	2014-07-09 18:32:40 +00:00
Aleksandr Rybalko	97f3c4e8a4	Fix inconsistent token parameters for kbd_allocate() and kbd_release() in vt(4). PR: 191306 Submitted by: jau789@gmail.com Sponsored by: The FreeBSD Foundation	2014-07-09 14:36:03 +00:00
Roger Pau Monné	38d6b2dcb2	vm_phys: remove limitation on number of fictitious regions The number of vm fictitious regions was limited to 8 by default, but Xen will make heavy usage of those kind of regions in order to map memory from foreign domains, so instead of increasing the default number, change the implementation to use a red-black tree to track vm fictitious ranges. The public interface remains the same. Sponsored by: Citrix Systems R&D Reviewed by: kib, alc Approved by: gibbs vm/vm_phys.c: - Replace the vm fictitious static array with a red-black tree. - Use a rwlock instead of a mutex, since now we also need to take the lock in vm_phys_fictitious_to_vm_page, and it can be shared.	2014-07-09 08:12:58 +00:00
Gleb Smirnoff	fe82cbe85c	In several cases in ip_output() we obtain reference on ifa. Do not leak it. Together with: asomers, np Sponsored by: Nginx, Inc.	2014-07-09 07:48:05 +00:00
Alexander Motin	409a3c1383	Add LUN options to specify 64-bit EUI and NAA identifiers.	2014-07-09 04:37:50 +00:00
Peter Wemm	ba8cd08ba9	Bump __FreeBSD_version after last SA-14:17.kmem so we have something to test against in the freebsd.org cluster.	2014-07-09 00:12:05 +00:00
Xin LI	e432298ade	Initialize SCTP cmsg's and notification's buffer before copying out to userland. Submitted by: tuexen Security: CVE-2014-3953 Security: FreeBSD-SA-14:17.kmem	2014-07-08 21:54:27 +00:00
Xin LI	2827952eb4	Don't leave the padding between the msg header and the cmsg data, and the padding after the cmsg data un-initialized. Submitted by: tuexen Security: CVE-2014-3952 Security: FreeBSD-SA-14:17.kmem	2014-07-08 21:54:23 +00:00
Neel Natu	b301b9e28f	Accurately identify the vcpu's operating mode as 64-bit, compatibility, protected or real.	2014-07-08 21:48:57 +00:00
Neel Natu	3527963b26	Invalidate guest TLB mappings as a side-effect of its CR3 being updated. This is a pre-requisite for task switch emulation since the CR3 is loaded from the new TSS.	2014-07-08 20:51:03 +00:00
Alexander Motin	3120a49e50	Remove status setting from datamove() path. Leave that to other places.	2014-07-08 18:51:03 +00:00
Alexander Motin	e327a057a7	Remove IO_SYNC flag when writing extended file attributes on ZFS. While it is possible to create and write file, modify its permissions, etc. without ever doing sync, it looks odd that it is required for setting extended file attributes on ZFS. UFS does not do sync there too. Samba uses those extended attributes to store some its data, and doing it synchronously by many times reduces file creation performance for systems without SLOG device. Reviewed by: delphij, jpaetzel, silence on fs@ MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2014-07-08 17:26:08 +00:00
Alexander Motin	ad3cd840f2	Fix use-after-free on XPT_RESET_BUS. That command is not queued, so does not use later status update.	2014-07-08 16:56:21 +00:00
Alexander Motin	b33b96e352	Enable TAS feature: notify initiator if its command was aborted by other. That should make operation more kind to multi-initiator environment. Without this, other initiators may find out that something bad happened to their commands only via command timeout.	2014-07-08 16:38:05 +00:00
Ian Lepore	1e3d53c687	Use named constant rather than '0' to access the reset controller register.	2014-07-08 14:35:09 +00:00
Alexander Motin	d6205772da	Fix typo in r267873.	2014-07-08 13:28:37 +00:00
Alexander Motin	950b6e126b	Pass correct command that should be aborted to ISPCTL_ABORT_CMD. This makes XPT_ABORT to work for me on initiator side of isp(4). Previous code was trying to abort the XPT_ABORT itself and failed. MFC after: 1 week	2014-07-08 13:01:36 +00:00
Alexander Motin	6c7a99be9f	Do not return statuses for aborted iSCSI commands.	2014-07-08 12:16:28 +00:00
Alexander Motin	f5ffef352f	Return task management requests to queued execution, but differently. Testing shown that both original queued design with separate task queue, and recent direct execution design had significant flaw: If abort request arrives just after the victim, the last one may not be in the ooa_queue yet, and so invisible for the task management function. Unlike original queued implementation, use same queue for all SCSI and TASK requests from the same initiator. That avoids races between them: task functions are always executed in proper time, relatively to other requests.	2014-07-08 12:15:15 +00:00
Alexander Motin	aa75114c57	Add XPT_ABORT support to iSCSI initiator. While CAM does not use it normally, it is useful for targets testing. MFC after: 2 weeks	2014-07-08 09:37:41 +00:00
Alexander Motin	fdfc6c8ebd	Fix task management functions status: task not found is not an error, while not implemented function is.	2014-07-08 08:34:34 +00:00
Konstantin Belousov	3bcc218f46	Correct the problem reported by test16 from tools/regression/file/flock/flock.c, which completes the fix in r192685. When the lock was stolen from us, retry the whole lock sequence in kernel, instead of returning EINTR to usermode and hoping that application would handle it correctly by restarting the lock acquire. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-07-08 08:10:15 +00:00
Konstantin Belousov	a5244bac2e	Correct si_code for the SIGBUS signal generated by the alignment trap. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-07-08 08:05:42 +00:00
Warner Losh	e3a6cb96d6	Fix typo in flag name.	2014-07-07 23:21:15 +00:00
Warner Losh	cfe87f0076	Naughty NANDFS was using hidden unused flag, hiding the fact that the flag was used and wasn't really available. Change the name without fixing any laying issues that might be present in NANDFS' use of this flag.	2014-07-07 23:21:07 +00:00
Don Lewis	626a79752f	Declaration whitespace changes for style(9). MFC after: 1 week	2014-07-07 22:02:39 +00:00
Alexander Motin	462cf3ba2a	Make XPT_GET_TRAN_SETTINGS to report CAM that command queueing is enabled, but make couple changes to handle non-queued commands too, if happen. MFC after: 2 weeks	2014-07-07 17:34:48 +00:00
Mateusz Guzik	5e2554b7f8	Don't call crdup nor uifind under vnode lock. A locked vnode can get into the way of satisfying malloc with M_WAITOK. This is a fixup to r268087. Suggested by: kib MFC after: 1 week	2014-07-07 14:03:30 +00:00
Alexander Motin	dbd849d868	Fix "use after free" on port creation error in r268291.	2014-07-07 11:52:22 +00:00
Alexander Motin	1e5a8b8f4b	Add support for READ FULL STATUS action of PERSISTENT RESERVE IN command.	2014-07-07 11:05:04 +00:00
Alexander Motin	604e257984	Teach ctl_add_initiator() to dynamically allocate IIDs from pool. If port passed negative IID value, the function will try to allocate IID from the pool of unused, based on passed wwpn or name arguments. It does all its best to make IID unique and persistent across reconnects. This makes persistent reservation properly work for iSCSI. Previously, in case of reconnects, reservation could be unexpectedly lost, or even migrate between initiators.	2014-07-07 09:37:22 +00:00
Alexander Motin	0f8de8afaa	Fix bugs for PERSISTENT RESERVE OUT bits in r268096.	2014-07-07 08:58:36 +00:00
Fabien Thomas	b8fad6c0ef	Optim and Fix for mge driver: - add missing rcvif in mbuf - add missing ipacket stat - remove unnecessary mbuf copy on output path - fix deadlock of the TX engine in case of error Obtained from: NETASQ MFC after: 2 weeks	2014-07-07 08:22:39 +00:00
Alexander Motin	99f8c067e6	Correction to r268356: collide only sessions to the same target.	2014-07-07 06:17:07 +00:00
Alexander Motin	2c6c9e47b2	When new connection comes in, check whether we already have session from the same initiator (Name+ISID). If so -- terminate the old session and let the new one take its place, as required by iSCSI RFC.	2014-07-07 05:48:11 +00:00
Hans Petter Selasky	88e0a63961	Improve support for Intel Lynx Point USB 3.0 controllers by masking the port routing bits like done in Linux. MFC after: 1 week Tested by: Tur-Wei Chan <twchan@singnet.com.sg>	2014-07-07 05:17:16 +00:00
Alexander Motin	0020682baa	Implement ABORT TASK SET and I_T NEXUS RESET task management functions. Use the last one to terminate active commands on iSCSI session termination. Previous code was aborting only commands doing some data moves.	2014-07-07 03:10:56 +00:00
Marcel Moolenaar	e7d939bda2	Remove ia64. This includes: o All directories named ia64 o All files named ia64 o All ia64-specific code guarded by __ia64__ o All ia64-specific makefile logic o Mention of ia64 in comments and documentation This excludes: o Everything under contrib/ o Everything under crypto/ o sys/xen/interface o sys/sys/elf_common.h Discussed at: BSDcan	2014-07-07 00:27:09 +00:00
Nathan Whitehorn	00cf40b0ca	Use common vt_fb parts in ofwfb as far as we are able without sacrificing performance. MFC after: 2 weeks	2014-07-07 00:12:18 +00:00
Bryan Venteicher	6700a7d44b	Use the appropriate IPv6 hashtype defines when looking up the PCBGROUP Reviewed by: adrian@	2014-07-07 00:02:49 +00:00
Andreas Tobler	64175581d0	Make gcc happy, init idlen2.	2014-07-06 20:09:23 +00:00
Alexander Motin	1380b77c12	Close race in r268291 between port destruction, delayed by sessions teardown, and new port creation during `service ctld restart`. Close it by returning iSCSI port internal state, that allows to identify dying ports, which should not be counted as existing, from really alive.	2014-07-06 17:57:59 +00:00
Alan Cox	09132ba6ac	Introduce pmap_unwire(). It will replace pmap_change_wiring(). There are several reasons for this change: pmap_change_wiring() has never (in my memory) been used to set the wired attribute on a virtual page. We have always used pmap_enter() to do that. Moreover, it is not really safe to use pmap_change_wiring() to set the wired attribute on a virtual page. The description of pmap_change_wiring() says that it assumes the existence of a mapping in the pmap. However, non-wired mappings may be reclaimed by the pmap at any time. (See pmap_collect().) Many implementations of pmap_change_wiring() will crash if the mapping does not exist. pmap_unwire() accepts a range of virtual addresses, whereas pmap_change_wiring() acts upon a single virtual page. Since we are typically unwiring a range of virtual addresses, pmap_unwire() will be more efficient. Moreover, pmap_unwire() allows us to unwire superpage mappings. Previously, we were forced to demote the superpage mapping, because pmap_change_wiring() only allowed us to express the unwiring of a single base page mapping at a time. This added to the overhead of unwiring for large ranges of addresses, including the implicit unwiring that occurs at process termination. Implementations for arm and powerpc will follow. Discussed with: jeff, marcel Reviewed by: kib Sponsored by: EMC / Isilon Storage Division	2014-07-06 17:42:38 +00:00
Alexander Motin	ffe82e05b3	Make iSCSI initiator keep Initiator Session ID (ISID) across reconnects. Previously ISID was changed every time, that made impossible correct persistent reservation, because reconnected session was identified as completely new one. Reviewed by: trasz MFC after: 1 week	2014-07-06 17:37:49 +00:00
Nathan Whitehorn	0558e4bb2b	In case we ever support little-endian PowerPC (probably userland only), avoid hardcoding endianness here.	2014-07-06 16:20:37 +00:00
Nathan Whitehorn	770047f5bb	Add a new CPU id for a POWER8 variant.	2014-07-06 16:19:55 +00:00
Hans Petter Selasky	5cb6b3afa4	Fix OFED startup order: All SYSINIT()'s and modules should be loaded prior to starting "/sbin/init" which will run all the "/etc/rc.d/xxx" scripts. Else there can be a race configuring the interfaces via "/etc/rc.conf". MFC after: 4 weeks Sponsored by: Mellanox Technologies	2014-07-06 14:22:13 +00:00
Hans Petter Selasky	22239af86c	Fix compile warning. MFC after: 4 weeks Sponsored by: Mellanox Technologies	2014-07-06 14:20:47 +00:00
Hans Petter Selasky	d291b07865	Fix some compile warnings. MFC after: 4 weeks Sponsored by: Mellanox Technologies	2014-07-06 14:14:07 +00:00
Alexander Motin	99ae56ac82	Add support for SCSI Ports (88h) VPD page.	2014-07-06 07:34:18 +00:00
Alexander Motin	69d7b87790	Make REPORT TARGET PORT GROUPS command report realistic data instead of hardcoded garbage.	2014-07-06 07:02:36 +00:00
Alexander Motin	c26eee2dc9	Move lun_map() method from command nexus to port. Previous implementation made it impossible to do some things, such as calling it for ports other than the one through which command arrived.	2014-07-06 06:21:34 +00:00
Alexander Motin	561764b1c5	Relax some bit checks for INQUIRY command. FreeBSD still tries to put LUN number in second byte until it gets device protocol version, even though it was obsoleted about 20 years ago.	2014-07-06 06:12:29 +00:00
Gavin Atkinson	764442e03d	Add support to asmc(4) for Macmini 3,1. PR: 190195 Submitted by: fbsdbugs2 sentry.org MFC after: 1 week Relnotes: yes	2014-07-05 21:34:37 +00:00
Alexander Motin	6d81c129dd	Pass through iSCSI session ISID from LOGIN request to the CTL frontend. ISID is an important part of initiator transport ID for iSCSI. It is not used now, but should be to properly implement persistent reservation.	2014-07-05 21:18:33 +00:00
Luiz Otavio O Souza	28b07d23a9	Allow the PVID setting on CPU port. Return our static list of supported media for the CPU port. Tested on TP-Link 1043ND.	2014-07-05 19:31:22 +00:00
Alexander Motin	027e5269c9	Bury devid port method, which was a gross hack. Instead make ports provide wanted port and target IDs, and LUNs provide wanted LUN IDs. After that core Device ID VPD code only had to link all of them together and add relative port and port group numbers. LUN ID for iSCSI LUNs is no longer created by CTL, but by ctld, and passed to CTL as "scsiname" LUN option. This makes LUNs report the same set of IDs, independently from the port through which it is accessed, as required by SCSI specifications.	2014-07-05 19:30:20 +00:00
Alexander Motin	917d38fb99	Create separate CTL port for every iSCSI target (and maybe portal group). Having single port for all iSCSI connections makes it problematic to implement some more advanced SCSI functionality in CTL, that requires proper ports enumeration and identification. This change extends CTL iSCSI API, making ctld daemon control the list of iSCSI ports in CTL. When new target is defined in config file, ctld will create respective port in CTL. When target is removed -- port will be also removed after all active commands through that port are properly aborted. This change requires ctld to be rebuilt to match the kernel. As a minor side effect, this allows to have iSCSI targets without LUNs. While that may look odd and not very useful, that is not incorrect.	2014-07-05 18:15:00 +00:00
Pedro F. Giffuni	5f40879138	Merge from OpenSolaris (24-Jul-2010): 6679140 asymmetric alloc/dealloc activity can induce dynamic variable drops 6679193 dtrace_dynvar walker produces flood of dtrace_dynhash_sink This finishes a set of merges from the older OpenSolaris releases. Still the FreeBSD port has many differences that are difficult to account for but that seems normal given that the kernels are different. MFC after: 1 week	2014-07-05 15:36:17 +00:00
Alexander Motin	831e16f359	Improve CTL_BEARG_* flags support, including optional values copyout.	2014-07-05 14:32:42 +00:00
Alexander Motin	ab2616c5b0	Implement and use ctl_frontend_find().	2014-07-05 13:50:05 +00:00
Hans Petter Selasky	604bf9d37e	When getting the initial value of numeric tunables use the getenv_xxx() functions instead of strtoq(), because the getenv_xxx() functions include wrappers for various postfixes like G/M/K, which strtoq() doesn't do.	2014-07-05 06:12:48 +00:00
Alexander Motin	92782c33a6	Introduce new IOCTL CTL_PORT_LIST reporting in more flexible XML format. Leave old CTL_GET_PORT_LIST in place so far. Garbage-collect it later.	2014-07-05 05:44:26 +00:00
Alexander Motin	2cfbcb9b3a	Improve readability of XML generated by CTL_LUN_LIST.	2014-07-05 04:10:24 +00:00
Alexander Motin	43fb3a65e3	Make options KPI more generic to allow it to be used for ports too, not only for LUNs.	2014-07-05 03:34:52 +00:00
Alexander Motin	487ddad55e	Use proper links field for ports linking.	2014-07-05 01:24:06 +00:00
Rick Macklem	6c7d2293d3	The new NFSv3 server did not generate directory postop attributes for the reply to ReaddirPlus when the server failed within the loop that calls VFS_VGET(). This failure is most likely an error return from VFS_VGET() caused by a bogus d_fileno that was truncated to 32bits. This patch fixes the server so that it will return directory postop attributes for the failure. It does not fix the underlying issue caused by d_fileno being uint32_t when a file system like ZFS generates a fileno that is greater than 32bits. Reported by: jpaetzel Reviewed by: jpaetzel MFC after: 1 month	2014-07-04 22:47:07 +00:00
Alexander Motin	92168f4c01	Separate concepts of frontend and port. Before iSCSI implementation CTL had no knowledge about frontend drivers, it had only frontends, which really were ports (alike to LUNs, if comparing to backends). But iSCSI added there ioctl() method, which does not belong to frontend as a port, but belongs to a frontend driver.	2014-07-04 19:27:06 +00:00
Alexander Motin	2f5be87a14	Remove targ_enable()/targ_disable() frontend methods. Those methods were never implemented, and I believe that their concept is wrong, since single frontend (SCSI port) can not handle several targets.	2014-07-04 19:19:03 +00:00
Nathan Whitehorn	1ee0f08975	After EFI support was added to the installer, it needed to allow boot partitions of types other than "freebsd-boot" (in particular, "efi"). This allows the removal of some nasty hacks for supporting PowerPC systems, in particular aliasing freebsd-boot to apple-boot on APM and an IBM-specific code on MBR. This changes the installer to use the correct names, which also breaks a degeneracy in the meaning of "freebsd-boot" that allows the addition of support for some newer IBM systems that can boot from GPT in addition to MBR. Since I have no idea how to detect which those systems are, leave the default on IBM PPC systems as MBR for now.	2014-07-04 15:55:32 +00:00
John-Mark Gurney	962ce8cf82	add a hint that you can enable this by default if you want... necessary if you want the keyboard break to work early in boot.. MFC after: 1 week	2014-07-04 14:49:40 +00:00

99095 Commits