fspacectl(2): Clarifies the return values

rmacklem@ spotted two things in the system call:
- Upon returning from a successful operation, vop_stddeallocate can
  update rmsr.r_offset to a value greater than the file size. This
  behavior, although harmless, can be confusing.
- The EINVAL return value for rqsr.r_offset + rqsr.r_len > OFF_MAX is
  undocumented.
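
The second point can be illustrated with a minimal sketch of an
overflow-safe range check. This is not the kernel's code: the helper
name sketch_check_range and the SKETCH_OFF_MAX macro are hypothetical,
and a 64-bit signed off_t is assumed.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in for OFF_MAX; assumes a 64-bit signed off_t. */
#define	SKETCH_OFF_MAX	INT64_MAX

/*
 * Hypothetical helper: returns 0 when the requested range is acceptable
 * and EINVAL otherwise, mirroring the documented error cases.
 */
static int
sketch_check_range(int64_t offset, int64_t len)
{
	/* Offset must be >= 0 and length must be > 0. */
	if (offset < 0 || len <= 0)
		return (EINVAL);
	/*
	 * offset + len > OFF_MAX, rearranged so that the addition itself
	 * can never overflow.
	 */
	if (offset > SKETCH_OFF_MAX - len)
		return (EINVAL);
	return (0);
}
```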

This commit has the following changes:
- vop_stddeallocate and shm_deallocate now bound the affected area
  further by the file size.
- The EINVAL case for rqsr.r_offset + rqsr.r_len > OFF_MAX is
  documented.
- The return len of fspacectl(2), vn_deallocate(9) and
  VOP_DEALLOCATE(9) is explicitly documented to be the value 0, and the
  return offset is restricted to be the smallest of off + len and the
  current file size, as suggested by kib@. These semantics allow callers
  to interact better with potential file size growth after the call.
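
The documented result of a fully successful call can be modeled with a
small sketch. The struct and function names below are hypothetical
stand-ins (only the r_offset and r_len field names come from the manual
page), and the model ignores partial completion.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct spacectl_range. */
struct sketch_range {
	int64_t	r_offset;
	int64_t	r_len;
};

/*
 * Hypothetical model: given the requested range and the current file
 * size, compute the range reported back after a successful completion
 * with no unprocessed part.
 */
static struct sketch_range
sketch_full_success(struct sketch_range rq, int64_t file_size)
{
	struct sketch_range rm;
	int64_t end;

	/* Preconditions that would otherwise yield EINVAL. */
	assert(rq.r_offset >= 0 && rq.r_len > 0);
	assert(rq.r_offset <= INT64_MAX - rq.r_len);

	end = rq.r_offset + rq.r_len;
	/* Returned offset: the smaller of off + len and the file size. */
	rm.r_offset = end < file_size ? end : file_size;
	/* Returned length: nothing is left to process. */
	rm.r_len = 0;
	return (rm);
}
```

In this model a request that starts beyond end-of-file also reports the
file size as the returned offset, which matches the "offset is beyond
EOF" handling added to vop_stddeallocate and shm_deallocate.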

Sponsored by:	The FreeBSD Foundation
Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D31604
Ka Ho Ng 2021-08-24 17:04:02 +08:00
parent 5425ba8332
commit 1eaa36523c
5 changed files with 53 additions and 9 deletions

@@ -27,7 +27,7 @@
 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 .\" SUCH DAMAGE.
 .\"
-.Dd August 4, 2021
+.Dd August 18, 2021
 .Dt FSPACECTL 2
 .Os
 .Sh NAME
@@ -67,6 +67,17 @@ argument is non-NULL, the
 .Fa spacectl_range
 structure it points to is updated to contain the unprocessed operation range
 after the system call returns.
+.Pp
+For a successful completion without an unprocessed part in the requested
+operation range,
+.Fa "rmsr->r_len"
+is updated to be the value 0, and
+.Fa "rmsr->r_offset"
+is updated to be the smallest of
+.Fa "rqsr->r_offset" +
+.Fa "rqsr->r_len" ;
+and the end-of-file offset.
+The file descriptor's file offset is not used or modified by the system call.
 Both
 .Fa rqsr
 and
@@ -92,9 +103,9 @@ Zero a region in the file specified by the
 .Fa rqsr
 argument.
 The
-.Va "rqsr->r_offset"
+.Fa "rqsr->r_offset"
 has to be a value greater than or equal to 0, and the
-.Va "rqsr->r_len"
+.Fa "rqsr->r_len"
 has to be a value greater than 0.
 .Pp
 If the file system supports hole-punching,
@@ -132,11 +143,17 @@ If the
 argument is
 .Dv SPACECTL_DEALLOC ,
 either the
-.Fa "range->r_offset"
+.Fa "rqsr->r_offset"
 argument was less than zero, or the
-.Fa "range->r_len"
+.Fa "rqsr->r_len"
 argument was less than or equal to zero.
 .It Bq Er EINVAL
+The value of
+.Fa "rqsr->r_offset" +
+.Fa "rqsr->r_len"
+is greater than
+.Dv OFF_MAX .
+.It Bq Er EINVAL
 An invalid or unsupported flag is included in
 .Fa flags .
 .It Bq Er EINVAL

@@ -74,6 +74,14 @@ and
 are updated to reflect the portion of the range that
 still needs to be zeroed/deallocated on return.
 Partial result is considered a successful operation.
+For a successful completion without an unprocessed portion of the range,
+.Fa *len
+is updated to be the value 0, and
+.Fa *offset
+is updated to be the smallest of
+.Fa *offset +
+.Fa *len
+passed to the call and the end-of-file offset.
 .Sh LOCKS
 The vnode should be locked on entry and will still be locked on exit.
 .Sh RETURN VALUES

@@ -95,6 +95,14 @@ Attempt to bypass buffer cache.
 and
 .Fa *length
 are updated to reflect the unprocessed operation range of the call.
+For a successful completion,
+.Fa *length
+is updated to be the value 0, and
+.Fa *offset
+is updated to be the smallest of
+.Fa *offset +
+.Fa *length
+passed to the call and the end-of-file offset.
 .Sh RETURN VALUES
 Upon successful completion, the value 0 is returned; otherwise the
 appropriate error is returned.

@@ -1905,6 +1905,8 @@ shm_deallocate(struct shmfd *shmfd, off_t *offset, off_t *length, int flags)
 	off = *offset;
 	len = *length;
 	KASSERT(off + len <= (vm_ooffset_t)OFF_MAX, ("off + len overflows"));
+	if (off + len > shmfd->shm_size)
+		len = shmfd->shm_size - off;
 	object = shmfd->shm_object;
 	startofs = off & PAGE_MASK;
 	endofs = (off + len) & PAGE_MASK;
@@ -1913,6 +1915,13 @@ shm_deallocate(struct shmfd *shmfd, off_t *offset, off_t *length, int flags)
 	pi = OFF_TO_IDX(off + PAGE_MASK);
 	error = 0;
+	/* Handle the case when offset is beyond shm size */
+	if ((off_t)len < 0) {
+		*offset = shmfd->shm_size;
+		*length = 0;
+		return (0);
+	}
 	VM_OBJECT_WLOCK(object);
 	if (startofs != 0) {
@@ -1974,8 +1983,6 @@ shm_fspacectl(struct file *fp, int cmd, off_t *offset, off_t *length, int flags,
 			break;
 		}
 		error = shm_deallocate(shmfd, &off, &len, flags);
-		if (error != 0)
-			break;
 		*offset = off;
 		*length = len;
 		break;

@@ -1138,14 +1138,13 @@ vop_stddeallocate(struct vop_deallocate_args *ap)
 	vp = ap->a_vp;
 	offset = *ap->a_offset;
-	len = *ap->a_len;
 	cred = ap->a_cred;
 	error = VOP_GETATTR(vp, &va, cred);
 	if (error)
 		return (error);
-	len = omin(OFF_MAX - offset, *ap->a_len);
+	len = omin((off_t)va.va_size - offset, *ap->a_len);
 	while (len > 0) {
 		noff = offset;
 		error = vn_bmap_seekhole_locked(vp, FIOSEEKDATA, &noff, cred);
@@ -1185,6 +1184,11 @@ vop_stddeallocate(struct vop_deallocate_args *ap)
 		if (should_yield())
 			break;
 	}
+	/* Handle the case when offset is beyond EOF */
+	if (len < 0) {
+		offset += len;
+		len = 0;
+	}
 out:
 	*ap->a_offset = offset;
 	*ap->a_len = len;