Cédric Le Goater [Mon, 25 Jan 2016 14:07:34 +0000 (15:07 +0100)]
ipmi: add ACPI power and GUID commands
>From the specs (20.8 Get Device GUID Command), the command needs to
return a GUID (Globally Unique ID), or UUID, that should never change
over the lifetime of the device. qemu_uuid looked like a good
candidate to start with but we could use a specific BMC property also
if needed.
Cédric Le Goater [Mon, 25 Jan 2016 14:07:31 +0000 (15:07 +0100)]
ipmi: introduce a struct ipmi_sdr_compact
Currently, sdr attributes are identified using byte offsets and this
can be a bit confusing.
This patch adds a struct ipmi_sdr_compact conforming to the IPMI specs
and replaces byte offsets with names. It also introduces and uses a
struct ipmi_sdr_header in sections of the code where no assumption is
made on the type of SDR. This leave rooms to potential usage of other
types in the future.
Cédric Le Goater [Mon, 25 Jan 2016 14:07:30 +0000 (15:07 +0100)]
ipmi: fix SDR length value
The IPMI BMC simulator populates the SDR table with a set of initial
SDRs. The length of each SDR is taken from the record itself (byte 4)
which does not include the size of the header. But, the full length
(header + data) is required by the sdr_add_entry() routine.
Cédric Le Goater [Mon, 25 Jan 2016 14:07:27 +0000 (15:07 +0100)]
ipmi: replace goto by a return statement
Each routine using the IPMI_ADD_RSP_DATA, IPMI_CHECK_CMD_LEN or
IPMI_CHECK_RESERVATION macros needs to define a goto label 'out' to
handle hidden errors. Using directly a return statement has the same
effect and it removes the fact that 'out' needs to be defined.
The code exits in ipmi_sim_handle_command() are a little different
from the rest and a "possible" error in the macro IPMI_ADD_RSP_DATA is
handled before making use of it. This might be a bit excessive as a
minimum response len is currently 300 bytes and the patch checks that
at least 3 are available.
Marcel Apfelbaum [Mon, 18 Jan 2016 15:27:26 +0000 (17:27 +0200)]
hw/pci: ensure that only PCI/PCIe bridges can be attached to pxb/pxb-pcie devices
PCI devices can't be plugged directly into PCI extra root bridges
because their resources can't be computed by firmware before the ACPI
tables are loaded.
Paolo Bonzini [Thu, 4 Feb 2016 15:00:52 +0000 (16:00 +0100)]
vhost-user-test: use correct ROM to speed up and avoid spurious failures
The mechanism to get the option ROM for virtio-net does not block the
PCI ROM from being loaded. Therefore, in vhost-user-test there are
two entries in the boot menu for the virtio-net card: one as an
embedded option ROM, one from the ROM BAR.
The embedded option ROM in vhost-user-test is the non-EFI-enabled,
while the ROM BAR has an EFI-enabled ROM. The two are compiled with
slightly different parameters, where only the old BIOS-only one doesn't
have a timeout for the "Press Ctrl-B" banner. When using a new
machine type, therefore, the vhost-user-test has to wait for the
EFI-enabled ROM's banner to go away. There are several ways to fix
this:
1) fix the ROMs to have the same configuration
2) add ",romfile=" to the -device line
3) remove --option-rom and add the ROM file name to the -device line
4) use an old machine type
This patch chooses 3. In addition, the file name was wrong because
qtest runs QEMU relative to the top build directory, not to the
x86_64-softmmu/ subdirectory, which is fixed too.
Fill in an element of the used ring with a single combined access to the
guest physical memory, rather than using two separated accesses.
This reduces the overhead due to expensive address translation.
virtio: read avail_idx from VQ only when necessary
The virtqueue_pop() implementation needs to check if the avail ring
contains some pending buffers. To perform this check, it is not
always necessary to fetch the avail_idx in the VQ memory, which is
expensive. This patch introduces a shadow variable tracking avail_idx
and modifies virtio_queue_empty() to access avail_idx in physical
memory only when necessary.
Accessing used_idx in the VQ requires an expensive access to
guest physical memory. Before this patch, 3 accesses are normally
done for each pop/push/notify call. However, since the used_idx is
only written by us, we can track it in our internal data structure.
Paolo Bonzini [Sun, 31 Jan 2016 10:29:03 +0000 (11:29 +0100)]
virtio: combine the read of a descriptor
Compared to vring, virtio has a performance penalty of 10%. Fix it
by combining all the reads for a descriptor in a single address_space_read
call. This also simplifies the code nicely.
Paolo Bonzini [Sun, 31 Jan 2016 10:29:02 +0000 (11:29 +0100)]
vring: slim down allocation of VirtQueueElements
Build the addresses and s/g lists on the stack, and then copy them
to a VirtQueueElement that is just as big as required to contain this
particular s/g list. The cost of the copy is minimal compared to that
of a large malloc.
Paolo Bonzini [Sun, 31 Jan 2016 10:29:01 +0000 (11:29 +0100)]
virtio: slim down allocation of VirtQueueElements
Build the addresses and s/g lists on the stack, and then copy them
to a VirtQueueElement that is just as big as required to contain this
particular s/g list. The cost of the copy is minimal compared to that
of a large malloc.
When virtqueue_map is used on the destination side of migration or on
loadvm, the iovecs have already been split at memory region boundary,
so we can just reuse the out_num/in_num we find in the file.
Paolo Bonzini [Sun, 31 Jan 2016 10:29:00 +0000 (11:29 +0100)]
virtio: introduce virtqueue_alloc_element
Allocate the arrays for in_addr/out_addr/in_sg/out_sg outside the
VirtQueueElement. For now, virtqueue_pop and vring_pop keep
allocating a very large VirtQueueElement.
Paolo Bonzini [Sun, 31 Jan 2016 10:28:59 +0000 (11:28 +0100)]
virtio: introduce qemu_get/put_virtqueue_element
Move allocation to virtio functions also when loading/saving a
VirtQueueElement. This will also let the load/save functions
keep backwards compatibility when the VirtQueueElement layout
is changed.
Paolo Bonzini [Thu, 4 Feb 2016 14:26:51 +0000 (16:26 +0200)]
virtio: move allocation to virtqueue_pop/vring_pop
The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0. We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.
The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items. Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.
The patch is pretty large, but changes to each device are testable
more or less independently. Splitting it would mostly add churn.
Paolo Bonzini [Sun, 31 Jan 2016 10:28:57 +0000 (11:28 +0100)]
virtio: move VirtQueueElement at the beginning of the structs
The next patch will make virtqueue_pop/vring_pop allocate memory for
the VirtQueueElement. In some cases (blk, scsi, gpu) the device wants
to extend VirtQueueElement with device-specific fields and, until now,
the place of the VirtQueueElement within the containing struct didn't
matter. When allocating the entire block in virtqueue_pop/vring_pop,
however, the containing struct must basically be a "subclass" of
VirtQueueElement, with the VirtQueueElement as the first field. Make
that the case for blk and scsi; gpu is already doing it.
Igor Mammedov [Fri, 22 Jan 2016 14:36:06 +0000 (15:36 +0100)]
pc: acpi: merge SSDT into DSDT
Since both tables are built dynamically now,
there is no point in keeping ASL in them in separate
tables.
So do the same as we do for ARM where we have only
DSDT table, i.e. move SSDT ASL into DSDT and
drop SSDT altogether.
This patch doesn't change moved SSDT ASL in any way,
but it opens a way to relatively independently simplify
generated ASL on per device/subsystem basis in
followup series.
It also simplifies bios-tables-test where expected
SSDT blobs could be dropped and only DSDT ones
have to be maintained.
I misunderstood the vmstate macro definition when I reworked the
virtio .get/.put.
The VMSTATE_STRUCT_VARRAY_KNOWN, was described as being for "a
variable length array (i.e. _type *_field) but we know the
length". However it actually specified operation for arrays embedded in
the struct (i.e. _type _field[]) since it lacked the VMS_POINTER
flag. This caused offset calculation to be completely off, examining and
potentially sending random data instead of the VirtQueue content.
Replace the otherwise unused VMSTATE_STRUCT_VARRAY_KNOWN with a
VMSTATE_STRUCT_VARRAY_POINTER_KNOWN that includes the VMS_POINTER flag
(so now actually doing what it advertises) and use it in the virtio
migration code.
Fixes and description as per Sascha's suggestions/debug.
Peter Maydell [Wed, 3 Feb 2016 19:00:33 +0000 (19:00 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Wed 03 Feb 2016 15:47:34 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <[email protected]>"
# gpg: aka "Stefan Hajnoczi <[email protected]>"
* remotes/stefanha/tags/tracing-pull-request:
log: add "-d trace:PATTERN"
trace: switch default backend to "log"
trace: convert stderr backend to log
log: move qemu-log.c into util/ directory
log: do not unnecessarily include qom/cpu.h
trace: add "-trace help"
trace: add "-trace enable=..."
trace: no need to call trace_backend_init in different branches now
trace: split trace_init_file out of trace_init_backends
trace: split trace_init_events out of trace_init_backends
trace: fix documentation
trace: track enabled events in a separate array
trace: count number of enabled events
Peter Maydell [Wed, 3 Feb 2016 12:23:48 +0000 (12:23 +0000)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20160203-1' into staging
virtio-gpu: bugfixes and spice support preparation
# gpg: Signature made Wed 03 Feb 2016 09:47:13 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg: aka "Gerd Hoffmann <[email protected]>"
# gpg: aka "Gerd Hoffmann (private) <[email protected]>"
* remotes/kraxel/tags/pull-vga-20160203-1:
virtio-gpu: block any rendering until client (ui) is done
virtio-gpu: add support to enable/disable command processing
virtio-gpu: maintain command queue
virtio-gpu: fix memory leak in error path
console: block rendering until client is done
zap qemu_egl_has_ext in include/ui/egl-helpers.h
Paolo Bonzini [Thu, 7 Jan 2016 13:55:32 +0000 (16:55 +0300)]
log: add "-d trace:PATTERN"
This is a bit easier to use than "-trace" if you are also enabling
other kinds of logging. It is also more discoverable for experienced
QEMU users, and accessible from user-mode emulators.
Gerd Hoffmann [Tue, 1 Dec 2015 11:05:14 +0000 (12:05 +0100)]
virtio-gpu: maintain command queue
We'll go take out the commands we receive out of the virt queue and put
them into a linked list, to decouple virtio queue handling from actual
command processing.
Also move cmd processing to new virtio_gpu_handle_ctrl func, so we can
easily kick it from different places.
Gerd Hoffmann [Thu, 3 Dec 2015 11:34:25 +0000 (12:34 +0100)]
console: block rendering until client is done
Allow gl user interfaces to block display device gl rendering.
The ui code might want to do that in case it takes a little
longer to bring things to screen, for example because we'll
hand over a dma-buf to another process (spice will do that).
Paolo Bonzini [Wed, 28 Oct 2015 06:06:26 +0000 (07:06 +0100)]
trace: count number of enabled events
This lets trace_event_get_state_dynamic quickly return false. Right
now there is hardly any benefit because there are also many assertions
and indirections, but the next patch will streamline all of this.
hmp: fix sendkey out of bounds write (CVE-2015-8619)
When processing 'sendkey' command, hmp_sendkey routine null
terminates the 'keyname_buf' array. This results in an OOB
write issue, if 'keyname_len' was to fall outside of
'keyname_buf' array.
Since the keyname's length is known the keyname_buf can be
removed altogether by adding a length parameter to
index_from_key() and using it for the error output as well.
Reported-by: Ling Liu <[email protected]> Signed-off-by: Wolfgang Bumiller <[email protected]>
Message-Id: <20160113080958.GA18934@olga>
[Comparison with "<" dumbed down, test for junk after strtoul()
tweaked] Signed-off-by: Markus Armbruster <[email protected]>
Peter Maydell [Tue, 2 Feb 2016 18:04:04 +0000 (18:04 +0000)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-for-peter-2016-02-02' into staging
Block patches
# gpg: Signature made Tue 02 Feb 2016 17:23:44 GMT using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <[email protected]>"
* remotes/maxreitz/tags/pull-block-for-peter-2016-02-02: (50 commits)
block: qemu-iotests - add test for snapshot, commit, snapshot bug
block: set device_list.tqe_prev to NULL on BDS removal
iotests: Add "qemu-img map" test for VMDK extents
qemu-img: Make MapEntry a QAPI struct
qemu-img: In "map", use the returned "file" from bdrv_get_block_status
block: Use returned *file in bdrv_co_get_block_status
vmdk: Return extent's file in bdrv_get_block_status
vmdk: Fix calculation of block status's offset
vpc: Assign bs->file->bs to file in vpc_co_get_block_status
vdi: Assign bs->file->bs to file in vdi_co_get_block_status
sheepdog: Assign bs to file in sd_co_get_block_status
qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status
parallels: Assign bs->file->bs to file in parallels_co_get_block_status
iscsi: Assign bs to file in iscsi_co_get_block_status
raw: Assign bs to file in raw_co_get_block_status
qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
qcow: Assign bs->file->bs to file in qcow_co_get_block_status
block: Add "file" output parameter to block status query functions
block: acquire in bdrv_query_image_info
iotests: Add test for block jobs and BDS ejection
...
Jeff Cody [Tue, 2 Feb 2016 01:33:10 +0000 (20:33 -0500)]
block: set device_list.tqe_prev to NULL on BDS removal
This fixes a regression introduced with commit 3f09bfbc7. Multiple
bugs arise in conjunction with live snapshots and mirroring operations
(which include active layer commit).
After a live snapshot occurs, the active layer and the base layer both
have a non-NULL tqe_prev field in the device_list, although the base
node's tqe_prev field points to a NULL entry. This non-NULL tqe_prev
field occurs after the bdrv_append() in the external snapshot calls
change_parent_backing_link().
In change_parent_backing_link(), when the previous active layer is
removed from device_list, the device_list.tqe_prev pointer is not
set to NULL.
The operating scheme in the block layer is to indicate that a BDS belongs
in the bdrv_states device_list iff the device_list.tqe_prev pointer
is non-NULL.
This patch does two things:
1.) Introduces a new block layer helper bdrv_device_remove() to remove a
BDS from the device_list, and
2.) uses that new API, which also fixes the regression once used in
change_parent_backing_link().
Fam Zheng [Tue, 26 Jan 2016 03:58:48 +0000 (11:58 +0800)]
block: Add "file" output parameter to block status query functions
The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.
Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.
The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.
Max Reitz [Fri, 29 Jan 2016 15:36:15 +0000 (16:36 +0100)]
iotests: Add test for multiple BB on BDS tree
This adds a test for having multiple BlockBackends in one BDS tree. In
this case, there is one BB for the protocol BDS and one BB for the
format BDS in a simple two-BDS tree (with the protocol BDS and BB added
first).
When bdrv_close_all() is executed, no cached data from any BDS should be
lost; the protocol BDS may not be closed until the format BDS is closed.
Otherwise, metadata updates may be lost.
Max Reitz [Fri, 29 Jan 2016 15:36:14 +0000 (16:36 +0100)]
block: Rewrite bdrv_close_all()
This patch rewrites bdrv_close_all(): Until now, all root BDSs have been
force-closed. This is bad because it can lead to cached data not being
flushed to disk.
Instead, try to make all reference holders relinquish their reference
voluntarily:
1. All BlockBackend users are handled by making all BBs simply eject
their BDS tree. Since a BDS can never be on top of a BB, this will
not cause any of the issues as seen with the force-closing of BDSs.
The references will be relinquished and any further access to the BB
will fail gracefully.
2. All BDSs which are owned by the monitor itself (because they do not
have a BB) are relinquished next.
3. Besides BBs and the monitor, block jobs and other BDSs are the only
things left that can hold a reference to BDSs. After every remaining
block job has been canceled, there should not be any BDSs left (and
the loop added here will always terminate (as long as NDEBUG is not
defined), because either all_bdrv_states will be empty or there will
not be any block job left to cancel, failing the assertion).
Max Reitz [Fri, 29 Jan 2016 15:36:13 +0000 (16:36 +0100)]
block: Add blk_remove_all_bs()
When bdrv_close_all() is called, instead of force-closing all root
BlockDriverStates, it is better to just drop the reference from all
BlockBackends and let them be closed automatically. This prevents BDS
from getting closed that are still referenced by other BDS, which may
result in loss of cached data.
This patch adds a function for doing that, but does not yet incorporate
it in bdrv_close_all().
Max Reitz [Fri, 29 Jan 2016 15:36:11 +0000 (16:36 +0100)]
block: Add list of all BlockDriverStates
We need this list so that bdrv_close_all() can keep track of which BDSs
are still open after having removed the BDSs from all of the BBs and
having released all monitor BDS references.
Max Reitz [Fri, 29 Jan 2016 15:36:10 +0000 (16:36 +0100)]
block: Make bdrv_close() static
There are no users of bdrv_close() left, except for one of bdrv_open()'s
failure paths, bdrv_close_all() and bdrv_delete(), and that is good.
Make bdrv_close() static so nobody makes the mistake of directly using
bdrv_close() again.
Max Reitz [Fri, 29 Jan 2016 15:36:06 +0000 (16:36 +0100)]
nbd: Switch from close to eject notifier
The NBD code uses the BDS close notifier to determine when a medium is
ejected. However, now it should use the BB's BDS removal notifier for
that instead of the BDS's close notifier.
Max Reitz [Fri, 29 Jan 2016 15:36:04 +0000 (16:36 +0100)]
virtio-blk: Functions for op blocker management
Put the code for setting up and removing op blockers into an own
function, respectively. Then, we can invoke those functions whenever a
BDS is removed from an virtio-blk BB or inserted into it.
Max Reitz [Fri, 29 Jan 2016 15:36:03 +0000 (16:36 +0100)]
block: Add BB-BDS remove/insert notifiers
bdrv_close() no longer signifies ejection of a medium, this is now done
by removing the BDS from the BB. Therefore, we want to have a notifier
for that in the BB instead of a close notifier in the BDS. The former is
added now, the latter is removed later.
Symmetrically, another notifier list is added that is invoked whenever a
BDS is inserted. We will need that for virtio-blk and virtio-scsi, which
can then remove their op blockers on BDS ejection and set them up on
insertion.
Max Reitz [Fri, 29 Jan 2016 15:36:01 +0000 (16:36 +0100)]
block: Release named dirty bitmaps in bdrv_close()
bdrv_delete() is not very happy about deleting BlockDriverStates with
dirty bitmaps still attached to them. In the past, we got around that
very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
bdrv_close() simply ignoring that condition. We should fix that by
releasing all named dirty bitmaps in bdrv_close() (there should not be
any unnamed bitmaps left) and moving the assertion from bdrv_delete()
there.
Max Reitz [Mon, 25 Jan 2016 18:41:14 +0000 (19:41 +0100)]
iotests: Make redirecting qemu's stderr optional
Redirecting qemu's stderr to stdout makes working with the stderr output
difficult due to the other file descriptor magic performed in
_launch_qemu ("ambiguous redirect").
Add an option which specifies whether stderr should be redirected to
stdout or not (allowing for other modes to be added in the future).
Max Reitz [Mon, 25 Jan 2016 18:41:13 +0000 (19:41 +0100)]
iotests: Make _filter_nbd support more URL types
This function should support URLs of the "nbd://" format (without
swallowing the export name), and for "nbd:///" URLs it should replace
"?socket=$TEST_DIR" by "?socket=TEST_DIR" because putting the Unix
socket files into the test directory makes sense.
Max Reitz [Mon, 25 Jan 2016 18:41:10 +0000 (19:41 +0100)]
iotests: Change coding style of _filter_nbd in 083
In order to be able to move _filter_nbd to common.filter in the next
patch, its coding style needs to be adapted to that of common.filter.
That means, we have to convert tabs to four spaces, adjust the alignment
of the last line (done with spaces already, assuming one tab equals
eight spaces), fix the line length of the comment, and add a line break
before the opening brace.
Max Reitz [Mon, 25 Jan 2016 18:41:09 +0000 (19:41 +0100)]
iotests: Rename filter_nbd to _filter_nbd in 083
In the patch after the next, this function is moved to common.filter.
Therefore, its name should be preceded by an underscore to signify its
global availability.
To keep the code motion patch clean, we cannot rename it in the same
patch, so we need to choose some order of renaming vs. motion. It is
better to keep a supposedly global function used by only a single test
in that test than to keep a supposedly local function in a common* file
and use it from a test, so we should rename the function before moving
it.
Max Reitz [Mon, 25 Jan 2016 18:41:08 +0000 (19:41 +0100)]
nbd: client_close on error in nbd_co_client_start
Use client_close() if an error in nbd_co_client_start() occurs instead
of manually inlining parts of it. This fixes an assertion error on the
server side if nbd_negotiate() fails.
Fam Zheng [Mon, 25 Jan 2016 02:26:23 +0000 (10:26 +0800)]
vmdk: Fix converting to streamOptimized
Commit d62d9dc4b8 lifted streamOptimized images's version to 3, but we
now refuse to open version 3 images read-write. We need to make
streamOptimized an exception to allow converting to it. This fixes the
accidentally broken iotests case 059 for the same reason.