freebsd-skq

Author	SHA1	Message	Date
Martin Matuska	8b2aa22d8f	Partially fix ZFS compat code for sparc64. Some endianness bugs still need to be resolved. Submitted by: marius (parts of the fix) MFC after: 1 month	2011-04-08 11:08:26 +00:00
Artem Belevich	7a3f3cabb1	Stripped '32' suffix from linux systrace module name on i386. Approved by: avg	2011-04-08 06:27:43 +00:00
Jung-uk Kim	3453537fa5	Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. Worse, it can be easily changed by the user with the 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add "virtual TSC" with lower frequency if maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).	2011-04-07 23:28:28 +00:00
Pawel Jakub Dawidek	65612637e8	Checking file access on size change is bogus. The checks are done earlier by VFS where we know if this is truncate(2) or ftruncate(2). If this is the latter we should depend on the mode the file was opened and not on the current permission. PR: standards/154873 Reported by: Mark Martinec <Mark.Martinec@ijs.si> Discussed with: Eric Schrock <eric.schrock@delphix.com> Discussed with: Mark Maybee <Mark.Maybee@Oracle.COM> MFC after: 1 month	2011-03-24 20:28:09 +00:00
Pawel Jakub Dawidek	d7d23301ae	Fix potential panic in dbuf_sync_list() related to spill block handling. Obtained from: IllumOS MFC after: 1 month	2011-03-14 11:07:12 +00:00
Andriy Gapon	308bce2a0e	add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls Add systrace_linux32 and systrace_freebsd32 modules which provide support for tracing compat system calls in addition to native system call tracing provided by systrace module. Provided that all the systrace modules are loaded now you can select what syscalls to trace in the following manner: syscall::xxx:yyy - work on all system calls that match the specification syscall:freebsd:xxx:yyy - only native system calls syscall:linux32:xxx:yyy - linux32 compat system calls syscall:freebsd32:xxx:yyy - freebsd32 compat system calls on amd64 PR: kern/152822 Submitted by: Artem Belevich <fbsdlist@src.cx> Reviewed by: jhb (earlier version) MFC after: 3 weeks	2011-03-12 09:09:25 +00:00
Pawel Jakub Dawidek	cae905e5d0	Correct readdir over ZFS handling. Reported by: Pierre Beyssac <pb@fasterix.frmug.org> MFC after: 1 month	2011-03-08 18:39:41 +00:00
Pawel Jakub Dawidek	a96e8e86f0	Fix libzpool build. MFC after: 1 month	2011-03-06 01:22:14 +00:00
Pawel Jakub Dawidek	2348f1110e	Make renaming of a ZVOL, ZVOL's parent directory and ZVOL snapshot work. Reported by: avg MFC after: 1 month	2011-03-05 22:31:03 +00:00
Pawel Jakub Dawidek	5bf0660559	Simplify zvol_remove_minors() a bit. MFC after: 1 month	2011-03-05 22:24:31 +00:00
Pawel Jakub Dawidek	2fbdb9c0a0	Use proper lock in assertion. MFC after: 1 month	2011-02-28 05:45:31 +00:00
Pawel Jakub Dawidek	10b9d77bf1	Finally... Import the latest open-source ZFS version - (SPA) 28. Few new things available from now on: - Data deduplication. - Triple parity RAIDZ (RAIDZ3). - zfs diff. - zpool split. - Snapshot holds. - zpool import -F. Allows to rewind corrupted pool to earlier transaction group. - Possibility to import pool in read-only mode. MFC after: 1 month	2011-02-27 19:41:40 +00:00
Rebecca Cran	6bccea7c2b	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
Marcel Moolenaar	6e23016fd7	Use the preload_fetch_addr() and preload_fetch_size() convenience functions to obtain the address and size of the preloaded pool configuration file/repository. Sponsored by: Juniper Networks.	2011-02-13 19:46:55 +00:00
Konstantin Belousov	ca67168159	For UIO_NOCOPY case of reading request on zfs vnode, which has vm object attached, activate the page after the successful read, and free the page if read was unsuccessful. Freshly allocated page is not on any queue yet, and not activating (or deactivating) the page leaves it on no queue, excluding the page from pagedaemon scans and making the memory disappear until the vnode is reclaimed. Reviewed by: avg MFC after: 1 week	2011-02-11 10:46:15 +00:00
Edward Tomasz Napierala	dc7a965673	Make it impossible to clear the MNT_NFS4ACLS flag on ZFS filesystem by using "mount -uw". Reviewed by: pjd MFC after: 2 weeks	2011-02-06 23:34:09 +00:00
Andrey V. Elsukov	459d0e830d	vdev's sectorsize should not be greater than 8 Kbytes and also it should be power of 2. This prevents non-aligned access while probing vdev's labels. PR: kern/147852 Reviewed by: pjd MFC after: 1 week	2011-02-04 15:22:56 +00:00
Martin Matuska	5c92680fa9	Recommit r218169, enclosing with #ifdef _KERNEL This change is sufficient for the ZFS kernel module. Discussed with: pjd MFC after: 1 week	2011-02-01 23:12:13 +00:00
Alexander Kabaev	a9c28a203d	Revert r218169 until it can be tested and fixed properly.	2011-02-01 21:15:35 +00:00
Martin Matuska	4530e5f790	For ZFS, change the type of clock_t to int64_t. The clock_t type in OpenSolaris is long (int64_t on amd64). On FreeBSD clock_t is int32_t. The clock_t type is used in several places in the ZFS code to store system uptime in milliseconds ("seconds * hz"). With hz=1000 we have a 32-bit integer overflow in 24 days, 20 hours, 31 minutes and 23.648 seconds. This has a user reported negative impact on l2arc_feed_thread() and may cause unexpected results from other functions using clock_t. Reported by: Artem Belevich <fbsdlist@src.cx> on freebsd-fs@ MFC after: 1 week	2011-02-01 14:28:50 +00:00
Jayachandran C.	baa8c35cb4	CDDL fixes for MIPS n32. Provide 64 bit atomic ops, and use 32 bit pointer.	2011-01-28 06:12:59 +00:00
Matthew D Fleming	cbc134ad03	Introduce signed and unsigned version of CTLTYPE_QUAD, renaming existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().	2011-01-19 23:00:25 +00:00
Edward Tomasz Napierala	7a93bf9a69	Add MNT_NFS4ACLS to ZFS mount flags. It's not conditional, since there is no way to disable NFSv4 ACLs in ZFS. This should make it easier for the NFS server to figure out whether the exported filesystem supports ACLs or not. Reviewed by: pjd MFC after: 2 weeks	2011-01-19 17:11:52 +00:00
Matthew D Fleming	e704482d43	Re-commit the zfs sysctl(9) type-safety changes. Thanks to dim and pjd for the pointer to zfs_context.h for building userland.	2011-01-13 18:20:19 +00:00
Matthew D Fleming	374a993a88	Revert cddl changes for sysctl(9) until I understand why this isn't building on universe.	2011-01-12 23:06:38 +00:00
Matthew D Fleming	4a2ce5903f	sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. Commit the zfs piece.	2011-01-12 19:53:30 +00:00
Martin Matuska	df06a59a77	MFp4 r186485, r186859: Fix a race by defining two tasks in the zio structure as we can still be returning from issue task when interrupt task is used. Tested by: pjd Approved by: pjd, delphij (mentor) MFC after: 3 days	2011-01-03 12:57:07 +00:00
Andriy Gapon	dfe3a1b374	cyclic xcall: use smp_no_rendevous_barrier as setup function parameter In this case we call target function only on a single CPU and do not need any synchronization at the setup stage. It's a bit non-obvious but setup function of NULL means that smp_rendezvous_cpus waits for all CPUs to arrive at the rendezvous point, but without doing any actual setup. While using smp_no_rendevous_barrier means that each CPU proceeds on its own schedule without any synchronization whatsoever. MFC after: 3 weeks	2010-12-17 18:22:50 +00:00
Pawel Jakub Dawidek	8735863465	Remove redundant semicolon and empty line.	2010-12-11 13:35:25 +00:00
Ivan Voras	d7ccd95be8	Undo r216230: the interaction between saved ashift in metadata and detected ashift does not support this. With this change, pools created while stripesize=512 could not be imported when stripesize becomes larger (on the same drive). Noticed by: pjd	2010-12-07 15:24:08 +00:00
Andriy Gapon	58f61ce4eb	opensolaris cyclic: fix deadlock and make a little bit closer to upstream The deadlock was caused in the following way: - thread T1 on CPU C1 holds a spin mutex, IPIs CPU C2 and waits for the IPI to be handled - C2 executes timer interrupt filter, thus has interrupts disabled, and gets blocked on the spin mutex held by T1 The problem seems to have been introduced by simplifications made to OpenSolaris code during porting. The problem is fixed by reorganizing the code to more closely resemble the upstream version. Interrupt filter (cyclic_fire) now doesn't acquire any locks, all per-CPU data accesses are performed on a target CPU with preemption and interrupts disabled thus precluding concurrent access to the data. cyp_mtx spin mutex is used to disable preemption and interrupts; it's not used for classical mutual exclusion, because xcall already serializes calls to a CPU. It's an emulation of OpenSolaris cyb_set_level(CY_HIGH_LEVEL) call, the spin mutexes could probably be reduced to just a spinlock_enter()/_exit() pair. Diff with upstream version is now reduced by ~500 lines, however it still remains quite large - many things that are not needed (at the moment) or are irrelevant on FreeBSD were simply ripped out during porting. Examples of such things: - support for CPU onlining/offlining - support for suspend/resume - support for running callouts at soft interrupt levels - support for callout rebinding from CPU to CPU - support for CPU partitions Tested by: Artem Belevich <fbsdlist@src.cx> MFC after: 3 weeks X-MFC with: r216252	2010-12-07 12:25:26 +00:00
Andriy Gapon	a10b0e67d9	opensolaris cyclic xcall: no need for special handling of curcpu smp_rendezvous_cpus already properly handles current CPU case and non-SMP case. MFC after: 3 weeks	2010-12-07 12:04:06 +00:00
Andriy Gapon	fe8c7b3d77	dtrace_xcall: no need for special handling of curcpu smp_rendezvous_cpus already does the right thing in a very similar fashion, so the code was kind of duplicating that. MFC after: 3 weeks	2010-12-07 09:19:47 +00:00
Andriy Gapon	7becfa95b9	dtrace_gethrtime_init: pin to master while examining other CPUs Also use pc_cpumask to be future-friendly. Reviewed by: jhb MFC after: 2 weeks	2010-12-07 09:03:17 +00:00
Ivan Voras	8b08562112	Use GEOM stripesize field when calculating ashift. This will enable correct alignment on drives with large sector sizes (e.g. 4 KiB) but the implementation might need to be revisited if devices with large stripesizes appear (e.g. if RAID controllers or flash drives start using the field), probably by introducing a physsectorsize field in GEOM providers. Discussed with: mav, mostly silence on freebsd-geom@ and freebsd-fs@	2010-12-06 12:18:02 +00:00
Edward Tomasz Napierala	de2a57325d	Don't panic when we read an empty ACL from ZFS. Apparently this may happen with filesystems created under MacOS X ZFS port. This is kind of filesystem corruption (we don't allow for setting empty ACLs), so make acl_get_file(3) and related syscalls fail with EINVAL in that case. In theory, we could return empty ACL to userland, but I'm afraid this would break some code. MFC after: 3 days	2010-11-30 21:04:05 +00:00
Andriy Gapon	c59690f249	zfs+sendfile: populate all requested pages, not just those already cached kern_sendfile() uses vm_rdwr() to read-ahead blocks of data to populate page cache. When sendfile stumbles upon a page that is not populated yet, it sends out all the mbufs that it collected so far. This resulted in very poor performance with ZFS when file data is not in the page cache, because ZFS vop_read for UIO_NOCOPY case populated only those pages that are already in cache, but not valid. Which means that most of the time it populated only the first requested page in the described above scenario. Reported by: Alexander Zagrebin <alexz@visp.ru> Tested by: Alexander Zagrebin <alexz@visp.ru>, Artemiev Igor <ai@kliksys.ru> MFC after: 12 days	2010-11-16 15:53:44 +00:00
Andriy Gapon	f9e2e99d5d	fix misspelling in a comment Reported by: Daniel Braniss <danny@cs.huji.ac.il> MFC after: 3 days	2010-11-16 12:30:47 +00:00
Martin Matuska	8db47aa15e	Disable VFS_HOLD placed on mnt_vnodecovered during the mount of a snapshot and VFS_RELE on a non-existing hold on snapshot parent's z_vfs. This disables the changes from OpenSolaris onnv-revision 9234:bffdc4fc05c4 (bug IDs: 6792139, 6794830) - not applicable to FreeBSD. This fixes the process hang if umounting a manually mounted snapshot. Reported by: Alexander Zagrebin <alexz@visp.ru> Approved by: delphij (mentor) MFC after: 1 week	2010-11-13 21:09:18 +00:00
Xin LI	b97a9057c2	Validate whether the zfs_cmd_t submitted from userland is not smaller than what we have. Without the check the kernel could access memory that does not belong to the request struct. Note that we do not test if the struct equals in size at this time, which may facilitate forward compatibility with newer binaries. Reviewed by: pjd at MeetBSD CA '2010 MFC after: 1 week	2010-11-05 22:18:09 +00:00
Martin Matuska	e25376bdd0	Bugfix merge from OpenSolaris: OpenSolaris onnv-revision: 10209:91f47f0e7728 6830541 zfs_get_data_trips on a verify 6696242 multiple zfs_fillpage() zfs: accessing past end of object panics 6785914 zfs fails to drop dn_struct_rwlock in recovery code path Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6830541, 6696242, 6785914) MFC after: 2 weeks	2010-10-26 15:48:03 +00:00
Andriy Gapon	23a1bcf8c6	zfs: add vop_getpages method implementation This should make vnode_pager_getpages path a bit shorter and clearer. Also this should eliminate problems with partially valid pages. Having this method opens room for future optimizations. To do: try to satisfy other pages besides the required one taking into account tradeoffs between number of page faults, read throughput and read latency. Also, eventually vop_putpages should be added too. Reviewed by: kib, mm, pjd MFC after: 3 weeks	2010-10-16 20:43:05 +00:00
Rui Paulo	910a5e18ba	Pass a format string to panic() and to taskqueue_start_threads(). Found with: clang	2010-10-13 17:13:43 +00:00
Rui Paulo	6e634bb80f	In zfs_post_common(), use %d instead of %hhu. Found with: clang	2010-10-13 17:12:23 +00:00
Andriy Gapon	f6bb41924c	zfs + sendfile: do not produce partially valid pages for vnode's tail Since r212650 and before this change sendfile(2) could produce a partially valid page for a trailing portion of a ZFS vnode. vm_fault() always wants to see a fully valid page even if it's the last page that partially extends beyond vnode's end. Otherwise it calls vop_getpages() to bring in the page. In the case of ZFS this means that the data is read from the page into the same page and this breaks checks in ZFS mappedread() - a thread that set VPO_BUSY on the page in vm_fault() will get blocked forever waiting for it to be cleared. Many thanks to Kai and Jeremy for reproducing the issue and providing important debugging information and help. Reported by: Kai Gallasch <gallasch@free.de>, Jeremy Chadwick <freebsd@jdc.parodius.com> Tested by: Kai Gallasch <gallasch@free.de>, Jeremy Chadwick <freebsd@jdc.parodius.com> Reviewed by: kib MFC after: 3 days To-Do: apply the same treatment to tmpfs + sendfile	2010-10-12 17:04:21 +00:00
Pawel Jakub Dawidek	19ebc67beb	Provide internal ioflags() function that converts ioflag provided by FreeBSD's VFS to OpenSolaris-specific ioflag expected by ZFS. Use it for read and write operations. Reviewed by: mm MFC after: 1 week	2010-10-10 20:49:33 +00:00
Martin Matuska	a362d75576	Change FAPPEND to IO_APPEND as this is a ioflag and not a fflag. This corrects writing to append-only files on ZFS. PR: kern/149495 [1], kern/151082 [2] Submitted by: Daniel Zhelev <daniel@zhelev.biz> [1], Michael Naef <cal@linu.gs> [2] Approved by: delphij (mentor) MFC after: 1 week	2010-10-08 23:01:38 +00:00
Andriy Gapon	6c6aca1203	opensolaris_kmem kmem_size(): report lesser of vm_kmem_size and available physical memory This is needed to correctly autotune ZFS ARC size when vm_kmem_size is set to value larger than available physical memory. MFC after: 2 weeks	2010-10-07 18:16:14 +00:00
Martin Matuska	aa007a9f0e	Properly handle IO with B_FAILFAST Retry IO once with ZIO_FLAG_TRYHARD before declaring a pool faulted OpenSolaris revision and Bug IDs: 9725:0bf7402e8022 6843014 ZFS B_FAILFAST handling is broken Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6843014) MFC after: 3 weeks	2010-09-27 09:42:31 +00:00
Martin Matuska	96a1a6a568	Enable offlining of log devices. OpenSolaris revision and Bug IDs: 9701:cc5b64682e64 6803605 should be able to offline log devices 6726045 vdev_deflate_ratio is not set when offlining a log device 6599442 zpool import has faults in the display Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6803605, 6726045, 6599442) MFC after: 3 weeks	2010-09-27 09:05:51 +00:00
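
The overflow figure quoted in commit 4530e5f790 can be checked with a little arithmetic. The following is a minimal C sketch, not code from the tree: the helper names uptime_ticks, overflows_int32 and seconds_until_overflow are hypothetical, and it assumes the counter is a signed 32-bit tick count advancing at hz ticks per second, as the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of ticks accumulated after `seconds` of
 * uptime at `hz` ticks per second, computed in 64 bits so the result
 * itself cannot wrap for realistic inputs.
 */
int64_t
uptime_ticks(int64_t seconds, int64_t hz)
{
	assert(seconds >= 0 && hz > 0);
	return (seconds * hz);
}

/* Returns 1 if the tick count no longer fits in a signed 32-bit counter. */
int
overflows_int32(int64_t ticks)
{
	return (ticks > INT32_MAX);
}

/*
 * Whole seconds of uptime before a signed 32-bit tick counter wraps,
 * i.e. floor(2^31 / hz).
 */
int64_t
seconds_until_overflow(int64_t hz)
{
	assert(hz > 0);
	return (((int64_t)INT32_MAX + 1) / hz);
}
```

With hz=1000, seconds_until_overflow(1000) yields 2147483 whole seconds, which is 24 days, 20 hours, 31 minutes and 23 seconds; the remaining 0.648 s in the commit message is the fractional part of 2^31 ms. A 64-bit counter at the same rate does not reach its limit in any realistic uptime.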

512 Commits