with 128K of random data and truncated to 800K can have SEEK_DATA return -1
when given an offset of 128K. On UFS, the SEEK_DATA returns 800K (the size
of the file). SEEK_HOLE on ZFS seems to behave the same as UFS.
To handle this, map -1 to the size of the file (`end') when lseek returns
this for either SEEK_HOLE or SEEK_DATA. When sparse files are not supported
by the file system both `hole' and `data' will now be equal to `end' and we
will treat the entire file as data. This way, the -1 return for SEEK_DATA
on ZFS will end up doing the right thing.
Reported by: gjb@
MFC after: 3 days
cylinder, head and track numbers. Return ~0U for these values when
mkimg wasn't given both -T and -H (i.e. no geometry) or the cylinder
would be larger than the provided maximum.
Use mkimgs_chs() for the EBR, MBR and PC98 schemes to fill in the
appropriate fields. Make sure to use a "rounded" size so that the
partition is always a multiple of the track size. We reserved the
room for it in the metadata callback so that's a valid thing to
do.
Bump the mkimg version number.
While doing that again: have mkimg.o depend on the Makefile so that
a version change triggers a rebuild as needed.
that keeps track of a particular region of the image. In particular the
image_data() function needs to return to the caller whether a region
contains data or is all zeroes. This required reading the region from
the temporary file and comparing the bytes. When image_data() is used
multiple times for the same region, this will get painful fast.
With a chunk describing a region of the image, we now also have a way
to refer to the image provided on the command line. This means we don't
need to copy the image into a temporary file. We just keep track of the
file descriptor and offset within the source file on a per-chunk basis.
For streams (pipes, sockets, fifos, etc) we now use the temporary file
as a swap file. We read from the input file and create a chunk of type
"zeroes" for each sequence of zeroes that's a multiple of the sector
size. Otherwise, we allocte from the swap file, mmap(2) it, read into
the mmap(2)'d memory and create a chunk representing data.
For regular files, we use SEEK_HOLE and SEEK_DATA to handle sparse files
eficiently and create a chunk of type zeroes for holes and a chunk of
type data for data regions. For data regions, we still compare the bytes
we read to handle differences between a file system's block size and our
sector size.
After reading all files, image_write() is used by schemes to scribble in
the reserved sectors. Since this never amounts to much, keep this data
in memory in chunks of exactly 1 sector.
The output image is created by looking using the chunk list to find the
data and write it out to the output file. For chunks of type "zeroes"
we prefer to seek, but fall back to writing zeroes to handle pipes.
For chunks of type "file" and "memoty" we simply write.
The net effect of this is that for reasonably large images the execution
time drops from 1-2 minutes to 10-20 seconds. A typical speedup is about
5 to 8 times, depending on partition sizes, output format whether in
input files are sparse or not.
Bump version to 20141001.
options. Bump the version number to 20140927.
While here, use explicit fputc() calls to skip a line in the output.
This to avoid having to hunt for extra '\n' characters in the printf
format strings.
MFC after: 1 week
Relnotes: yes
--version print the version of mkimg and also whether it's
64- or 32-bit.
--formats list the supported output formats separated by space.
--schemes list the supported partitioning schemes separated by
space.
Inspired by a patch from: gjb@
MFC after: 1 week
Relnotes: yes
We have a different ordering for the RC block(s) and L2 tables.
This is expected to be a non-issue, because everything is found
through file offsets in the corresponding RC table and L1 table.
Files that grow organically have RC blocks and L2 tables scattered
all over the place anyway.
The reason for the difference is that mkimg needs to be able to
write to a pipe. We can't seek forward and backward to fill in
the bits in non-sequential order.
variable was assigned the image offset in bytes and not in blocks
(i.e. sectors). This had image_data() return FALSE, which meant that
we didn't assign a cluster when we needed and also meant that we
didn't write parts of the L2 table when we should have. The result
being that the actual data clusters were written at the wrong offset.
Improve support for QCOW version 2. We're having the right layout
and even know how many refcnt blocks we need. All we need to do is
populate the refcnt blocks for every cluster we write and allocate
a cluster when we need a new refcnt block. The allocation part is
tricky in that it'll interleave with the assignment of clusters to
L2 tables and data. Since version 2 is not quite done, keep it
compiled out for now.
trying to get the test name right, failed, gave up and used a sequence
number instead. When I realized it wasn't because of the number of
underscores in the name that I really started to think. I didn't have
braces around the variable names ...
Thus: test_1 is now called apm_1x1_512_qcow, which gives you all you
need to run mkimg by hand.
Dumb-ass: marcel
to the baseline. Since we don't run gzip with the -n option, the output
of gzip varies for identical result files if and when they are created
at different time. Ouch...
Rather than add -n and commit a 600K+ diff for the changes to all the .uu
files, it's less of a churn to uudecode and gunzip the baseline file and
compare that to the new result file to determine if the baseline file
needs to be updated.
This way, "atf-sh mkimg.sh rebase" can be run as many times as people like
and a subsequent "svn status" will not show unnecessary diffs.
And because of that, it's entirely disabled for now. Both versions
are similar enough that a single header definition works for both
of them. The only "diverting" side-effect is that the union of the
two is larger than the official V1 header.
What this means for our V1 support is that we can't put the L1 table
adjacent to the V1 header (i.e. at offset 0x30 in the file), unless
we revert to hackery and klugery. Let's not. Instead, we align the L1
table at the cluster boundary. This is in line with the V2 layout and
perfectly ok for V1 anyway (ok -- as far as I've seen so far).
Due to the alignment, our V1 image seems to be 1 cluster larger than
the V1 image created by qemu-img (on average).
Compression of the clusters is not supported at this time.
MFC after: 2 months
-T (track size) or -H (number of heads) is given:
o scheme_metadata() always rounded to the block size. This is not
always valid (e.g. vtoc8 that must have partitions start at cylinder
boundaries).
o The bsd and vtoc8 schemes "resized" the image to make it match the
geometry, but since the geometry is an approximation and the size
of the image computed from cylinders * heads * sectors is always
smaller than the original image size, the partition information ran
out of bounds.
The fix is to have scheme_metadata() simply pass it's arguments to the
per-scheme metadata callback, so that schemes not only know where the
metadata is to go, but also what the current block address is. It's now
up to the per-scheme callback to reserve room for metadata and to make
sure alignment and rounding is applied.
The BSD scheme now has the most elaborate alignment and rounding. Just
to make the point: partitions are aligned on block boundaries, but the
image is rounded to the next cyclinder boundary.
vtoc8 now properly has all partitions aligned (and rounded) to the
cyclinder boundary.
Obtained from: Juniper Networks, Inc.
MFC after: 3 days
numbers or names. This gives more control over the actual layout and
helps to construct BSD disklabels with /usr or /var at dedicated
partitions.
Obtained from: Juniper Networks, Inc.
MFC after: 3 days
Relnotes: yes
the second sector by only clearing the amount of bytes needed for the
disklabel in the second sector. Previously we were clearing exactly 1
sector worth of bytes and as such writing over boot code that may have
been there.
Since we do support more than 8 partitions, make sure to set all fields
in d_partitions. For the first 8 partitions this is unneeded, but for
partitioons 9 and up this compensates for the fact that we don't clear
an entire sector anymore.
Obviously, one cannot use more than 8 partitions when using boot code
that starts right after the disk label.
Relevant GRNs:
107879 - Employ unused bytes after the disklabel in the second sector.
189500 - Revert the part of change 107879 that employs the unused bytes
after the disklabel in the 2nd sector for boot code.
Obtained from: Juniper Networks, Inc.
MFC after: 3 days
1. Iterate over all partitions counted in the label, which can be more
than the number of partitions given to mkimg(1).
2. Start the checksum from the beginning of the label; not the beginning
of the bootarea.
Tested with bsdlabel(8).
MFC after: 3 days
cheating by assigning the same sector offset to both directories,
but it seems that VirtualBox doesn't like that. Neither does
qemu from the looks of it. We now actually write the directory
and table twice.
MFC after: 3 days
a raw image with a VHD footer appended. There's little value that I
can see to use the fixed image type, but in order to make VHD images
for use by Microsoft's Azure platform, they must be fixed VHD images.
Support has been added by refactoring the code to re-use common code
and by adding a second output format structure. To created fixed VHD
images, specify "vhdf" as the output format.
Use this for VHD and VMDK to avoid allocating space in the image
for empty sectors.
Note that this negatively affects performance because mkimg uses a
temporary file for the intermediate storage. When mkimg has better
internal book keeping, performance can be significantly improved.
among others.
Add an undocumented option for unit testing (-y). When given, the image
will have UUIDs and timestamps synthesized in a way that gives identical
results across runs. As such, UUIDs stop being unique, globally or
otherwise.
VHD support requested by: gjb@
Add support for different output formats:
1. The output file that was previously written is now called the raw format.
2. Add the vmdk output format to create VMDK images.
When the format is not given, the raw output format is assumed.
sector granularity for both offset and length. Have all schemes
use mkimg_write() instead of mkimg_seek() followed by write(2).
Now that schemes don't use lseek(2) nor write(2) directly, it's
easier to support output formats other than raw disks.