* jsnow/tags/ide-pull-request:
ahci-test: add QMP tray test for ATAPI
libqos/ahci: Add get_sense and test_ready
libqos/ahci: Add ATAPI tray macros
libqos/ahci: Support expected errors
libqtest: add qmp_eventwait_ref
block-backend: Always notify on blk_eject
ahci-test: test atapi read_cd with bcl, nb_sectors = 0
ahci-test: Create smaller test ISO images
atapi: classify read_cd as conditionally returning data
John Snow [Mon, 14 Nov 2016 16:15:55 +0000 (11:15 -0500)]
libqos/ahci: Add ATAPI tray macros
(1) Add START_STOP_UNIT command to ahci-test suite
(2) Add eject/start macro commands; this is not a data transfer
command so it is not well-served by the existing generic pipeline.
John Snow [Mon, 14 Nov 2016 16:15:54 +0000 (11:15 -0500)]
block-backend: Always notify on blk_eject
blk_eject is only used by scsi-disk and atapi, and in both cases we
only attempt to invoke blk_eject if we have a bona-fide change in
tray state.
The "issue" here is that the tray state does not generate a QMP event
unless there is a medium/BDS attached to the device, so if libvirt et al
are waiting for a tray event to occur from an empty-but-closed drive,
software opening that drive will not emit an event and libvirt will
wait forever.
Change this by modifying blk_eject to always emit an event, instead of
conditionally on a "real" backend eject.
John Snow [Mon, 14 Nov 2016 16:15:54 +0000 (11:15 -0500)]
ahci-test: test atapi read_cd with bcl, nb_sectors = 0
Commit 9ef2e93f introduced the concept of tagging ATAPI commands as
NONDATA, but this introduced a regression for certain commands better
described as CONDDATA. read_cd is such a command: it requires
a non-zero BCL if a transfer size is set, but is perfectly content to
accept a zero BCL if the transfer size is 0.
This patch adds a regression test for the case where BCL and nb_sectors
are both 0.
Flesh out the CDROM tests by:
(1) Allowing the test to specify a BCL
(2) Allowing the buffer comparison test to compare a 0-size buffer
(3) Fix the BCL specification in libqos (It is LE, not BE)
(4) Add a nice human-readable message for future SCSI command additions
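As an illustration of point (3), the byte count limit is a 16-bit value spread over two byte-wide registers, low byte first. The following is only a minimal sketch; the two-byte register array and the names bcl_store_le and bcl_load_le are hypothetical and are not libqos functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store a 16-bit byte count limit (BCL) into two
 * consecutive byte registers in little-endian order (low byte first). */
static void bcl_store_le(uint8_t regs[2], uint16_t bcl)
{
    regs[0] = (uint8_t)(bcl & 0xff);   /* low byte */
    regs[1] = (uint8_t)(bcl >> 8);     /* high byte */
}

/* Hypothetical helper: read the BCL back from the same two registers. */
static uint16_t bcl_load_le(const uint8_t regs[2])
{
    return (uint16_t)(regs[0] | (regs[1] << 8));
}
```

Swapping the two assignments would give the big-endian layout that the commit message describes as the bug.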
John Snow [Mon, 14 Nov 2016 16:15:54 +0000 (11:15 -0500)]
atapi: classify read_cd as conditionally returning data
For the purposes of byte_count_limit verification, add a new flag that
identifies read_cd as sometimes returning data, then check the BCL in
its command handler after we know that it will indeed return data.
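The two-stage check described above could look roughly like the sketch below. The flag names and both helper functions are illustrative assumptions; they are not the identifiers used in QEMU's ATAPI code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative command classification flags (names are assumptions). */
enum {
    CMD_NONDATA  = 1 << 0,  /* never returns data */
    CMD_CONDDATA = 1 << 1   /* returns data only for some requests */
};

/* Generic check before dispatch: only commands that always return data
 * must have a non-zero byte count limit at this point. */
static bool bcl_ok_at_dispatch(unsigned flags, uint16_t bcl)
{
    if (flags & (CMD_NONDATA | CMD_CONDDATA)) {
        return true;
    }
    return bcl != 0;
}

/* Check inside a conditional-data handler, once the transfer size is
 * known: a zero BCL is acceptable only when nothing is transferred. */
static bool bcl_ok_in_handler(uint32_t nb_sectors, uint16_t bcl)
{
    return nb_sectors == 0 || bcl != 0;
}
```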
Stefan Hajnoczi [Mon, 14 Nov 2016 15:42:22 +0000 (15:42 +0000)]
Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging
Block layer patches for 2.8.0-rc0
# gpg: Signature made Fri 11 Nov 2016 03:46:12 PM GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* kwolf/tags/for-upstream:
raw-posix: Rename 'raw_s' to 'rs'
iotests: Always use -machine accel=qtest
iotests: Skip test 162 if there is no SSH support
block: Emit modules in bdrv_iterate_format()
block: Fix bdrv_iterate_format() sorting
nfs: Fix memory leak in nfs_file_create()
qcow2: Remove stale FIXME comment
raw_bsd: don't check size alignment when only offset is set
raw_bsd: move check to prevent overflow
hmp: Make block_stream set an explicit job ID
block/ssh: Code cleanup for unused parameter
block/nbd: Fix the leaked visitor
Kevin Wolf [Fri, 11 Nov 2016 14:58:12 +0000 (15:58 +0100)]
Merge remote-tracking branch 'mreitz/tags/pull-block-2016-11-11' into queue-block
Block patches for qemu 2.8
# gpg: Signature made Fri Nov 11 15:56:59 2016 CET
# gpg: using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2016-11-11:
raw-posix: Rename 'raw_s' to 'rs'
iotests: Always use -machine accel=qtest
iotests: Skip test 162 if there is no SSH support
block: Emit modules in bdrv_iterate_format()
block: Fix bdrv_iterate_format() sorting
Max Reitz [Mon, 17 Oct 2016 18:39:17 +0000 (20:39 +0200)]
iotests: Always use -machine accel=qtest
Currently, we only use -machine accel=qtest when qemu is invoked through
the common.qemu functions. However, we always want to use it, so move it
from common.qemu directly into QEMU_OPTIONS.
Max Reitz [Wed, 12 Oct 2016 20:49:05 +0000 (22:49 +0200)]
block: Fix bdrv_iterate_format() sorting
bdrv_iterate_format() did not actually sort the formats by name but by
"pointer interpreted as string". That is probably not what we intended
to do, so fix it (by changing qsort_strcmp() so it matches the example
from qsort()'s manual page).
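For reference, qsort() hands the comparator pointers to the array elements, so for an array of strings each argument is a pointer to a char pointer and has to be dereferenced once before calling strcmp(). A minimal sketch in the style of the manual page example; sort_format_names is a hypothetical wrapper added only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for an array of C strings: a and b point at array elements,
 * i.e. at "char *" values, so dereference once before comparing. */
static int qsort_strcmp(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Hypothetical wrapper: sort an array of format names in place. */
static void sort_format_names(const char **names, size_t count)
{
    qsort(names, count, sizeof(names[0]), qsort_strcmp);
}
```

Passing a and b straight to strcmp() would compare the bytes of the pointers themselves, which matches the "pointer interpreted as string" behaviour described above.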
raw_bsd: don't check size alignment when only offset is set
We make sure that the size is aligned to sector length to prevent any
round ups. Otherwise we could end up reading/writing data outside the
area specified by user. This is only needed when user supplies the size
option to avoid any surprises. It is not necessary when only offset is
set.
Moreover, the check made it difficult to use the offset option without
the size option. The check put an unneeded restriction on the offset, which
had to be aligned too: because bdrv_getlength() returns an aligned value,
an unaligned offset would make the check fail.
When only offset is specified but no size and the offset is greater than
the real size of the containing device an overflow occurs when parsing
the options. This overflow is harmless because we do check for this
exact situation a little bit later, but it leads to an error message with
weird values. It is better to do the check sooner and prevent the
overflow.
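The ordering issue can be sketched as follows: with unsigned arithmetic, subtracting an offset larger than the device size wraps around, so the comparison has to come before the subtraction. The function name and signature are assumptions made for this example, not the raw driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the usable size behind 'offset'.
 * Returns 0 and stores the result in *size, or -1 if the offset lies
 * beyond the end of the device. */
static int size_after_offset(uint64_t real_size, uint64_t offset,
                             uint64_t *size)
{
    if (offset > real_size) {
        /* Checking first avoids real_size - offset wrapping around. */
        return -1;
    }
    *size = real_size - offset;
    return 0;
}
```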
Alberto Garcia [Fri, 4 Nov 2016 13:44:43 +0000 (15:44 +0200)]
hmp: Make block_stream set an explicit job ID
A job ID is always required in order to create a block job on a
non-root node. The default ID (obtained with bdrv_get_device_name())
is otherwise empty in this scenario and the job cannot be created.
The HMP block_stream command doesn't set a job ID and therefore it
doesn't allow streaming to intermediate nodes. One solution is to add
an extra parameter to set a job ID. The other solution is to simply
use the node name passed to block_stream as job ID. This won't work
if it's automatically generated (because it contains a '#') but is
otherwise simple enough for all other cases.
This way 'block_stream node3' will create a job with the ID 'node3'
and the good old 'block_stream virtio0' will keep the previous
behaviour and use 'virtio0' for the job ID.
This patch drops the unused parameter "BDRVSSHState" being passed into
the ssh_config() function and does code cleanup. The unused parameter
was introduced by the commit c322712.
* bonzini/tags/for-upstream:
nbd: Don't inf-loop on early EOF
target-i386: document how x86 gdb_num_core_regs is computed.
qdev: fix use-after-free regression from becdfa00cfa
target-i386/machine: fix migrate faile because of Hyper-V HV_X64_MSR_VP_RUNTIME
vl.c: move pidfile creation up the line
target-i386: fix typo
John Snow [Thu, 4 Aug 2016 18:18:51 +0000 (14:18 -0400)]
MAINTAINERS: Add Fam and Jsnow for Bitmap support
These files are currently unmaintained.
I'm proposing that Fam and I co-maintain them; under the model that
whoever between us isn't authoring a given series will be responsible
for reviewing it.
Thomas Huth [Tue, 8 Nov 2016 11:46:22 +0000 (12:46 +0100)]
MAINTAINERS: Add an entry for the CHRP NVRAM files
I recently added new files to the source tree that are not
covered by any maintainer yet -- and since every new source
file should have a maintainer nowadays, I volunteer to look
after these files now, too.
Thomas Huth [Wed, 2 Nov 2016 08:39:33 +0000 (09:39 +0100)]
m68k: Update the 68k sections in the MAINTAINERS file
disas/m68k.c obviously belongs to the m68k CPU section in
the MAINTAINERS file, but remove the hw/m68k/ directory
here since it only contains machine (not CPU) related
files, as requested by Laurent. Add the machine related
files to the right machine sections instead.
Thomas Huth [Fri, 23 Sep 2016 12:14:18 +0000 (14:14 +0200)]
MAINTAINERS: Add some ARM related files to the corresponding sections
The files hw/cpu/a*mpcore.c are already assigned to the ARM CPU
section, but the corresponding headers include/hw/cpu/a*mpcore.h
are still missing.
The files hw/*/imx* are already assigned to the i.MX31 machine, but
the corresponding header files include/hw/*/imx* are still missing.
The file hw/misc/arm_integrator_debug.c seems to belong to Integrator
CP, hw/cpu/realview_mpcore.c seems to belong to Real View, and
hw/misc/mst_fpga.c seems to belong to PXA2XX.
And the files hw/misc/zynq* and include/hw/misc/zynq* seem to belong
to the Xilinx Zynq machine.
Samuel Thibault [Wed, 9 Nov 2016 10:27:52 +0000 (11:27 +0100)]
Fix cursesw detection
On systems which do not provide ncursesw.pc and whose /usr/include/curses.h
does not include wide support, we should not only try with no -I, i.e.
/usr/include, but also with -I/usr/include/ncursesw.
To properly detect for wide support with and without -Werror, we need to
check for the presence of e.g. the WACS_DEGREE macro.
We also want to stop at the first curses_inc_list configuration which works,
and make sure to set IFS to : at each new loop.
Peter Korsgaard [Fri, 28 Oct 2016 14:51:32 +0000 (16:51 +0200)]
hw/input/hid: support alternative sysrq/break scancodes for gtk-vnc
The printscreen/sysrq and pause/break keys currently don't work for guests
using -usbdevice keyboard when accessed through vnc with a gtk-vnc based
client.
The reason for this is a mismatch between gtk-vnc and qemu in how these keys
should be mapped to XT keycodes.
On the original IBM XT these keys behaved differently than other keys.
Quoting from https://www.win.tue.nl/~aeb/linux/kbd/scancodes-1.html:
The keys PrtSc/SysRq and Pause/Break are special. The former produces
scancode e0 2a e0 37 when no modifier key is pressed simultaneously, e0 37
together with Shift or Ctrl, but 54 together with (left or right) Alt. (And
one gets the expected sequences upon release. But see below.) The latter
produces scancode sequence e1 1d 45 e1 9d c5 when pressed (without modifier)
and nothing at all upon release. However, together with (left or right)
Ctrl, one gets e0 46 e0 c6, and again nothing at release. It does not
repeat.
Gtk-vnc supports the 'QEMU Extended Key Event Message' RFB extension to send
raw XT keycodes directly to qemu, but the specification doesn't explicitly
specify how to map such long/complicated keycode sequences. From the spec
(https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#qemu-extended-key-event-message)
The keycode is the XT keycode that produced the keysym. An XT keycode is an
XT make scancode sequence encoded to fit in a single U32 quantity. Single
byte XT scancodes with a byte value less than 0x7f are encoded as is.
2-byte XT scancodes whose first byte is 0xe0 and second byte is less than
0x7f are encoded with the high bit of the first byte set
hid.c currently expects the keycode sequence with shift/ctrl for sysrq (e0 37
-> 0xb7 in RFB), whereas gtk-vnc uses the sequence with alt (0x54).
Likewise, hid.c expects the code without modifiers (e1 1d 45 -> 0xc5 in
RFB), whereas gtk-vnc sends the keycode sequence with ctrl for pause (e0 46
-> 0xc6 in RFB).
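The two encoding rules quoted from the spec can be written down as a small function. This is only a sketch of those two rules; the function name is made up, and returning 0 for sequences the quoted text does not cover (such as e1 1d 45) is an assumption of this example, not something the spec defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode an XT make scancode sequence as an RFB
 * keycode using only the two rules quoted above. Returns 0 when the
 * sequence is not covered by those rules. */
static uint32_t xt_to_rfb_keycode(const uint8_t *seq, size_t len)
{
    if (len == 1 && seq[0] < 0x7f) {
        return seq[0];              /* single byte: encoded as is */
    }
    if (len == 2 && seq[0] == 0xe0 && seq[1] < 0x7f) {
        return seq[1] | 0x80u;      /* e0 xx: xx with the high bit set */
    }
    return 0;                       /* longer sequences: not specified */
}
```

With these rules, e0 37 becomes 0xb7 and e0 46 becomes 0xc6, matching the values mentioned above, while the single-byte 0x54 stays 0x54.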
See keymaps.csv in gtk-vnc for the mapping used:
https://git.gnome.org/browse/gtk-vnc/tree/src/keymaps.csv#n150
Now, it isn't obvious to me which sequence is really "right", but as the
0x54/0xc6 keycodes are currently unused in hid.c, supporting both seems like
the pragmatic solution to me. The USB HID keyboard boot protocol used by
hid.c doesn't have any other mapping applicable to these keys.
The other guest keyboard interfaces (ps/2, virtio, ..) are not affected,
because they handle these keys differently.
Thomas Huth [Wed, 2 Nov 2016 10:08:48 +0000 (11:08 +0100)]
ui/gtk: Fix build with older versions of gtk
GDK_KEY_Delete is only defined with gtk version 2.22 and newer,
on older versions this key was called GDK_Delete instead.
Since this is the case for all GDK_KEY_* defines, change the
already existing preprocessor check there to test for version 2.22,
so we know that we can remove this code block in case we require
that version as a minimum one day.
Li Qiang [Tue, 8 Nov 2016 05:57:46 +0000 (21:57 -0800)]
usbredir: free vm_change_state_handler in usbredir destroy dispatch
The usbredir destroy dispatch function does not free the VM change
state handler registered in the usbredir_realize function. This
leads to a memory leak. This patch avoids it.
Li Qiang [Tue, 8 Nov 2016 12:11:10 +0000 (04:11 -0800)]
usb: ehci: fix memory leak in ehci_init_transfer
In the ehci_init_transfer function, if 'cpage' is bigger than 4,
the previously allocated 'p->sgl' is not freed, leading to
a memory leak. This patch avoids it.
Laszlo Ersek (3):
[efi] Install the HII config access protocol on a child of the SNP handle
[librm] Conditionalize the workaround for the Tivoli VMM's SSE garbling
[build] Disable TIVOLI_VMM_WORKAROUND in the qemu configuration
Lukas Grossar (1):
[intel] Add PCI device ID for I219-V/LM
Michael Brown (57):
[efi] Fix uninitialised data in HII IFR structures
[bios] Do not enable interrupts when printing to the console
[pxe] Disable interrupts on the PIC before starting NBP
[dhcp] Allow for variable encapsulation of architecture-specific options
[dhcpv6] Include RFC5970 client architecture options in DHCPv6 requests
[dhcpv6] Include vendor class identifier option in DHCPv6 requests
[dhcp] Automatically generate vendor class identifier string
[xfer] Send intf_close() if redirection fails
[downloader] Treat redirection failures as fatal
[iscsi] Treat redirection failures as fatal
[debug] Allow per-object runtime enabling/disabling of debug messages
[debug] Allow debug messages to be initially disabled at runtime
[libc] Allow assertions to be globally enabled or disabled
[profile] Allow profiling to be globally enabled or disabled
[rng] Check for functioning RTC interrupt
[acpi] Add support for ACPI power off
[acpi] Allow time for ACPI power off to take effect
[ipv4] Send gratuitous ARPs whenever a new IPv4 address is applied
[intel] Strip spurious VLAN tags received by virtual function NICs
[intel] Remove duplicate intelvf_mbox_queues() function
[ipv6] Perform SLAAC only during autoconfiguration
[settings] Create space for IPv6 in settings display order
[ipv6] Rename ipv6_scope to dhcpv6_scope
[settings] Correctly mortalise autovivified child settings blocks
[ipv6] Allow settings to comprise arbitrary subsets of NDP options
[ipv6] Expose IPv6 settings acquired through NDP
[dhcpv6] Expose IPv6 address setting acquired through DHCPv6
[ipv6] Expose IPv6 link-local address settings
[settings] Allow settings blocks to specify a sibling ordering
[ipv6] Match user expectations for IPv6 settings priorities
[ipv6] Create routing table based on IPv6 settings
[ipv6] Rename ipv6_scope to ipv6_settings_scope
[test] Update IPv6 tests to use okx()
[ipv6] Allow for multiple routers
[hyperv] Use instance UUID in device name
[crypto] Remove obsolete extern declaration for asn1_invalidate_cursor()
[crypto] Allow for parsing of partial ASN.1 cursors
[image] Add image_asn1() to extract ASN.1 objects from image
[crypto] Add DER image format
[crypto] Add PEM image format
[image] Use image_asn1() to extract data from CMS signature images
[build] Remove obsolete explicit object requirements
[crypto] Enable both DER and PEM formats by default
[build] Remove more obsolete explicit object requirements
[pixbuf] Enable PNG format by default
[crypto] Add image_x509() to extract X.509 certificates from image
[crypto] Generalise X.509 "valid" field to a "flags" field
[list] Add list_next_entry() and list_prev_entry()
[crypto] Expose certstore_del() to explicitly remove stored certificates
[crypto] Allow certificates to be marked as having been added explicitly
[crypto] Add certstat() to display basic certificate information
[cmdline] Add certificate management commands
[crypto] Mark permanent certificates as permanent
[efi] Mark AppleNetBoot.h as a native iPXE header
[efi] Update to current EDK2 headers
[efi] Add EFI_BLOCK_IO2_PROTOCOL header and GUID definition
[bzimage] Fix page alignment of initrd images
Eric Blake [Mon, 7 Nov 2016 20:38:13 +0000 (14:38 -0600)]
nbd: Don't inf-loop on early EOF
Commit 7d3123e converted a single read_sync() into a while loop
that assumed that read_sync() would either make progress or give
an error. But when the server hangs up early, the client sees
EOF (a read_sync() of 0) and never makes progress, which in turn
caused qemu-iotest './check -nbd 83' to go into an infinite loop.
Rework the loop to accommodate reads cut short by EOF.
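A read loop that tolerates short reads but still terminates on EOF might look like the sketch below. The read_fn callback type, read_full, and the in-memory mem_source reader are all hypothetical stand-ins for this example; they are not QEMU's read_sync() or its callers, and returning -EIO on early EOF is a choice made here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical reader: returns bytes read, 0 on EOF, negative errno on error. */
typedef ssize_t (*read_fn)(void *opaque, void *buf, size_t len);

/* Read exactly len bytes, treating an early EOF as an error instead of
 * retrying forever. */
static int read_full(read_fn rd, void *opaque, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = rd(opaque, p, len);
        if (n < 0) {
            return (int)n;      /* propagate the error */
        }
        if (n == 0) {
            return -EIO;        /* EOF: no further progress is possible */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Demo source that hands out at most two bytes per call, then EOF. */
struct mem_source {
    const char *data;
    size_t left;
};

static ssize_t mem_read(void *opaque, void *buf, size_t len)
{
    struct mem_source *s = opaque;
    size_t n = len < 2 ? len : 2;

    if (n > s->left) {
        n = s->left;
    }
    memcpy(buf, s->data, n);
    s->data += n;
    s->left -= n;
    return (ssize_t)n;
}
```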
Michael Tokarev [Wed, 2 Nov 2016 14:18:50 +0000 (17:18 +0300)]
vl.c: move pidfile creation up the line
With the current code, the pid file is opened after various
sockets, chardevs, fsdevs and the like. This causes
interesting effects: for example, when the monitor is a
unix socket and another qemu instance is already
running, the new qemu first "damages" the socket and
then complains that it can't acquire the pid file and
exits, making the running qemu unreachable.
Move pid file creation earlier, right after the call
to os_daemonize(), where we know our process id (pid).
Paolo Bonzini [Wed, 2 Nov 2016 19:58:25 +0000 (20:58 +0100)]
target-i386: fix typo
The impact is small because kvm_get_vcpu_events fixes env->hflags, but
it is wrong and could cause INITs to be delayed arbitrarily with
-machine kernel_irqchip=off.
Kevin Wolf [Fri, 4 Nov 2016 23:03:15 +0000 (00:03 +0100)]
block: Don't mark node clean after failed flush
Commit 3ff2f67a changed bdrv_co_flush() so that no flush is issued if
the image hasn't been dirtied since the last flush. This is not quite
correct: The condition should be that the image hasn't been dirtied
since the last _successful_ flush. This patch changes the logic
accordingly.
Without this fix, subsequent bdrv_co_flush() calls would return success
without actually doing anything even though the image is still dirty.
The difference is visible in some blkdebug test cases where error
messages incorrectly disappeared after commit 3ff2f67a.
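The intended rule, "clean only after a successful flush", can be modelled with two generation counters. This is a simplified sketch; struct node and its fields are invented for the example and do not mirror the real block layer state.

```c
#include <assert.h>

/* Simplified, hypothetical model of a block node. */
struct node {
    unsigned write_gen;        /* bumped on every write */
    unsigned flushed_gen;      /* generation covered by the last
                                  successful flush */
    int flush_result;          /* what the fake backend flush returns */
    unsigned backend_flushes;  /* how often the backend was asked */
};

static void node_write(struct node *n)
{
    n->write_gen++;
}

static int node_flush(struct node *n)
{
    unsigned gen = n->write_gen;
    int ret;

    if (gen == n->flushed_gen) {
        return 0;              /* nothing dirtied since last good flush */
    }
    n->backend_flushes++;
    ret = n->flush_result;
    if (ret == 0) {
        n->flushed_gen = gen;  /* mark clean only on success */
    }
    return ret;
}
```

Updating flushed_gen before looking at the result would make the second flush after a failure return 0 without touching the backend, which is the behaviour the commit message describes as wrong.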
Stefan Hajnoczi [Mon, 7 Nov 2016 14:02:15 +0000 (14:02 +0000)]
Merge remote-tracking branch 'pm215/tags/pull-target-arm-20161107' into staging
target-arm queue:
* bitbang_i2c: Handle NACKs from devices
* Fix corruption of CPSR when SCTLR.EE is set
* nvic: set pending status for not active interrupts
* char: cadence: check baud rate generator and divider values
# gpg: Signature made Mon 07 Nov 2016 10:43:07 AM GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <[email protected]>"
# gpg: aka "Peter Maydell <[email protected]>"
# gpg: aka "Peter Maydell <[email protected]>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* pm215/tags/pull-target-arm-20161107:
hw/i2c/bitbang_i2c: Handle NACKs from devices
Fix corruption of CPSR when SCTLR.EE is set
nvic: set pending status for not active interrupts
char: cadence: check baud rate generator and divider values
Cornelia Huck [Wed, 2 Nov 2016 16:21:03 +0000 (17:21 +0100)]
s390x/kvm: fix run_on_cpu sigp conversions
Commit 14e6fe12a ("*_run_on_cpu: introduce run_on_cpu_data type")
attempted to convert all users of run_on_cpu to use the new
run_on_cpu_data type. It failed to convert the called sigp_* routines,
however. Fix that.
Peter Maydell [Mon, 24 Oct 2016 18:12:29 +0000 (19:12 +0100)]
hw/i2c/bitbang_i2c: Handle NACKs from devices
If the guest attempts to talk to a nonexistent device over i2c,
the i2c_start_transfer() function will return non-zero, indicating
that the bus is signalling a NACK. Similarly, if the i2c_send()
function returns nonzero then the target device returned a NACK.
Handle this possibility in the bitbang_i2c code, by returning
the state machine to the STOPPED state and returning the NACK
bit to the guest.
This bit of missing functionality was spotted by Coverity
(it noticed that we weren't checking the return value from
i2c_start_transfer()).
Julian Brown [Mon, 7 Nov 2016 10:00:24 +0000 (10:00 +0000)]
Fix corruption of CPSR when SCTLR.EE is set
Fix a typo in arm_cpu_do_interrupt_aarch32 (OR'ing with ~CPSR_E
instead of CPSR_E) which meant that when we took an interrupt with
SCTLR.EE set we would corrupt the CPSR.
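The effect of the stray complement is easy to see on a sample value. In this sketch CPSR_E is defined locally as bit 9 (the CPSR endianness bit), and the two helper functions are hypothetical names used only to contrast the two expressions.

```c
#include <assert.h>
#include <stdint.h>

#define CPSR_E (1u << 9)   /* CPSR endianness bit */

/* The typo: OR-ing with the complement sets every bit except E. */
static uint32_t set_e_buggy(uint32_t cpsr)
{
    return cpsr | ~CPSR_E;
}

/* The intended operation: set only the E bit. */
static uint32_t set_e_fixed(uint32_t cpsr)
{
    return cpsr | CPSR_E;
}
```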
nvic: set pending status for not active interrupts
According to ARM DUI 0552A 4.2.10, the NVIC sets the pending status
for disabled interrupts as well. Correct the logic for
when interrupts are marked pending, both on input level
transition and when interrupts are dismissed, to match
the NVIC behaviour rather than the 11MPCore GIC.
char: cadence: check baud rate generator and divider values
The Cadence UART device emulator calculates speed by dividing the
baud rate by a 'baud rate generator' & 'baud rate divider' value.
The device specification defines these register values to be
non-zero and within certain limits. Add checks for these limits
to avoid errors like divide by zero.
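A guarded baud rate computation could be sketched as below. The formula (input clock divided by the generator value times the divider value plus one) and the function name are assumptions for illustration; the exact register limits come from the device specification and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the baud rate from the input clock, the
 * 'baud rate generator' value and the 'baud rate divider' value.
 * Returns 0 on success, -1 if the generator value would cause a
 * division by zero. The formula itself is an assumption. */
static int uart_baud(uint32_t input_clk, uint32_t brgr, uint32_t bdiv,
                     uint32_t *baud)
{
    uint64_t div;

    if (brgr == 0) {
        return -1;             /* reject instead of dividing by zero */
    }
    div = (uint64_t)brgr * ((uint64_t)bdiv + 1);
    *baud = (uint32_t)(input_clk / div);
    return 0;
}
```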
# gpg: Signature made Wed 02 Nov 2016 08:31:11 AM GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>"
# gpg: aka "Paolo Bonzini <[email protected]>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (30 commits)
main-loop: Suppress I/O thread warning under qtest
docs/rcu.txt: Fix minor typo
vl: exit qemu on guest panic if -no-shutdown is not set
checkpatch: allow spaces before parenthesis for 'coroutine_fn'
x86: add AVX512_4VNNIW and AVX512_4FMAPS features
slirp: fix CharDriver breakage
qemu-char: do not forward events through the mux until QEMU has started
nbd: Implement NBD_CMD_WRITE_ZEROES on client
nbd: Implement NBD_CMD_WRITE_ZEROES on server
nbd: Improve server handling of shutdown requests
nbd: Refactor conversion to errno to silence checkpatch
nbd: Support shorter handshake
nbd: Less allocation during NBD_OPT_LIST
nbd: Let client skip portions of server reply
nbd: Let server know when client gives up negotiation
nbd: Share common option-sending code in client
nbd: Send message along with server NBD_REP_ERR errors
nbd: Share common reply-sending code in server
nbd: Rename struct nbd_request and nbd_reply
nbd: Rename NbdClientSession to NBDClientSession
...
Wei Liu [Tue, 1 Nov 2016 17:44:16 +0000 (17:44 +0000)]
PCMachineState: introduce acpi_build_enabled field
Introduce this field to control whether ACPI build is enabled by a
particular machine or accelerator.
It defaults to true if the machine itself supports ACPI build. Xen
accelerator will disable it because Xen is in charge of building ACPI
tables for the guest.
Max Reitz [Mon, 17 Oct 2016 18:09:39 +0000 (20:09 +0200)]
main-loop: Suppress I/O thread warning under qtest
We do not want to display the "I/O thread spun" warning for test cases
that run under qtest. The first attempt for this (commit
01c22f2cdd4fcf02276ea10f48253850a5fd7259) tested whether qtest_enabled()
was true.
Commit 21a24302e85024dd7b2a151158adbc1f5dc5c4dd correctly recognized
that just testing qtest_enabled() is not sufficient since there are some
tests that do not use the qtest accelerator but just the qtest character
device, and thus replaced qtest_enabled() by qtest_driver().
However, there are also some tests that only use the qtest accelerator
and not the qtest chardev; perhaps most notably the bash iotests.
Therefore, we have to check both qtest_enabled() and qtest_driver().
vl: exit qemu on guest panic if -no-shutdown is not set
For automated testing purposes it can be helpful to exit qemu
(poweroff) when the guest panics. Make this the default unless
-no-shutdown is specified.
For internal-errors like errors from KVM_RUN the behaviour is
not changed, in other words QEMU does not exit to allow debugging
in the QEMU monitor.
Paolo Bonzini [Thu, 27 Oct 2016 20:04:58 +0000 (22:04 +0200)]
slirp: fix CharDriver breakage
SLIRP expects a CharBackend as the third argument to slirp_add_exec,
but net/slirp.c was passing a CharDriverState. Fix this to restore
guestfwd functionality.
Paolo Bonzini [Thu, 27 Oct 2016 13:38:19 +0000 (15:38 +0200)]
qemu-char: do not forward events through the mux until QEMU has started
Otherwise, the CHR_EVENT_OPENED event is sent twice: first when the
backend (for example "stdio") is opened, and second after processing
the command line.
The incorrect sending of the event prints the monitor banner when
QEMU is started with "-serial mon:stdio". This includes the "(qemu)"
prompt; thus the monitor seems to be dead, whereas actually the
active front-end is the serial port.
Eric Blake [Fri, 14 Oct 2016 18:33:18 +0000 (13:33 -0500)]
nbd: Implement NBD_CMD_WRITE_ZEROES on client
Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants a hole.
The generic block code takes care of falling back to the obvious
write of lots of zeroes if we return -ENOTSUP because the server
does not have WRITE_ZEROES.
Ideally, since NBD_CMD_WRITE_ZEROES does not involve any data
over the wire, we want to support transactions that are much
larger than the normal 32M limit imposed on NBD_CMD_WRITE. But
the server may still have a limit smaller than UINT_MAX, so
until experimental NBD protocol additions for advertising various
command sizes are finalized (see [1], [2]), for now we just stick to
the same limits as normal writes.
Eric Blake [Fri, 14 Oct 2016 18:33:17 +0000 (13:33 -0500)]
nbd: Implement NBD_CMD_WRITE_ZEROES on server
Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants to allow
a hole.
Note that when it comes to requiring full allocation, vs.
permitting optimizations, the NBD spec intentionally picked a
different sense for the flag; the rules in qemu are:
MAY_UNMAP == 0: must write zeroes
MAY_UNMAP == 1: may use holes if reads will see zeroes
while in NBD, the rules are:
FLAG_NO_HOLE == 1: must write zeroes
FLAG_NO_HOLE == 0: may use holes if reads will see zeroes
In all cases, the 'may use holes' scenario is optional (the
server need not use a hole, and must not use a hole if
subsequent reads would not see zeroes).
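Because the two flags have opposite senses, translating between them is an inversion rather than a copy. A minimal sketch; the bit values and names below are stand-ins chosen for this example and are not the real qemu or NBD constants.

```c
#include <assert.h>

/* Stand-in flag bits (values and names are assumptions). */
#define REQ_MAY_UNMAP (1u << 0)   /* qemu side: holes are permitted */
#define WIRE_NO_HOLE  (1u << 1)   /* NBD side: holes are forbidden */

/* Server direction: wire flags to internal request flags. */
static unsigned wire_to_req(unsigned wire_flags)
{
    return (wire_flags & WIRE_NO_HOLE) ? 0 : REQ_MAY_UNMAP;
}

/* Client direction: internal request flags to wire flags. */
static unsigned req_to_wire(unsigned req_flags)
{
    return (req_flags & REQ_MAY_UNMAP) ? 0 : WIRE_NO_HOLE;
}
```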
Eric Blake [Fri, 14 Oct 2016 18:33:16 +0000 (13:33 -0500)]
nbd: Improve server handling of shutdown requests
NBD commit 6d34500b clarified how clients and servers are supposed
to behave before closing a connection. It added NBD_REP_ERR_SHUTDOWN
(for the server to announce it is about to go away during option
haggling, so the client should quit sending NBD_OPT_* other than
NBD_OPT_ABORT) and ESHUTDOWN (for the server to announce it is about
to go away during transmission, so the client should quit sending
NBD_CMD_* other than NBD_CMD_DISC). It also clarified that
NBD_OPT_ABORT gets a reply, while NBD_CMD_DISC does not.
This patch merely adds the missing reply to NBD_OPT_ABORT and teaches
the client to recognize server errors. Actually teaching the server
to send NBD_REP_ERR_SHUTDOWN or ESHUTDOWN would require knowing that
the server has been requested to shut down soon (maybe we could do
that by installing a SIGINT handler in qemu-nbd, which transitions
from RUNNING to a new state that waits for the client to react,
rather than just out-right quitting - but that's a bigger task for
another day).
Eric Blake [Fri, 14 Oct 2016 18:33:15 +0000 (13:33 -0500)]
nbd: Refactor conversion to errno to silence checkpatch
Checkpatch complains that 'return EINVAL' is usually wrong
(since we tend to favor 'return -EINVAL'). But it is a
false positive for nbd_errno_to_system_errno(). Since NBD
may add future defined wire values, refactor the code to
keep checkpatch happy.
Eric Blake [Fri, 14 Oct 2016 18:33:14 +0000 (13:33 -0500)]
nbd: Support shorter handshake
The NBD Protocol allows the server and client to mutually agree
on a shorter handshake (omit the 124 bytes of reserved 0), via
the server advertising NBD_FLAG_NO_ZEROES and the client
acknowledging with NBD_FLAG_C_NO_ZEROES (only possible in
newstyle, whether or not it is fixed newstyle). It doesn't
shave much off the wire, but we might as well implement it.
Eric Blake [Fri, 14 Oct 2016 18:33:13 +0000 (13:33 -0500)]
nbd: Less allocation during NBD_OPT_LIST
Since we know that the maximum name we are willing to accept
is small enough to stack-allocate, rework the iteration over
NBD_OPT_LIST responses to reuse a stack buffer rather than
allocating every time. Furthermore, we don't even have to
allocate if we know the server's length doesn't match what
we are searching for.
Eric Blake [Fri, 14 Oct 2016 18:33:12 +0000 (13:33 -0500)]
nbd: Let client skip portions of server reply
The server has a nice helper function nbd_negotiate_drop_sync()
which lets it easily ignore fluff from the client (such as the
payload to an unknown option request). We can't quite make it
common, since it depends on nbd_negotiate_read() which handles
coroutine magic, but we can copy the idea into the client where
we have places where we want to ignore data (such as the
description tacked on the end of NBD_REP_SERVER).
Eric Blake [Fri, 14 Oct 2016 18:33:11 +0000 (13:33 -0500)]
nbd: Let server know when client gives up negotiation
The NBD spec says that a client should send NBD_OPT_ABORT
rather than just dropping the connection, if the client doesn't
like something the server sent during option negotiation. This
is a best-effort attempt only, and can only be done in places
where we know the server is still in sync with what we've sent,
whether or not we've read everything the server has sent.
Technically, the server then has to reply with NBD_REP_ACK, but
it's not worth complicating the client to wait around for that
reply.
Eric Blake [Fri, 14 Oct 2016 18:33:10 +0000 (13:33 -0500)]
nbd: Share common option-sending code in client
Rather than open-coding each option request, it's easier to
have common helper functions do the work. That in turn requires
having convenient packed types for handling option requests
and replies.
Eric Blake [Fri, 14 Oct 2016 18:33:09 +0000 (13:33 -0500)]
nbd: Send message along with server NBD_REP_ERR errors
The NBD Protocol allows us to send human-readable messages
along with any NBD_REP_ERR error during option negotiation;
make use of this fact for clients that know what to do with
our message.
Eric Blake [Fri, 14 Oct 2016 18:33:08 +0000 (13:33 -0500)]
nbd: Share common reply-sending code in server
Rather than open-coding NBD_REP_SERVER, reuse the code we
already have by adding a length parameter. Additionally,
the refactoring will make adding NBD_OPT_GO in a later patch
easier.
Eric Blake [Fri, 14 Oct 2016 18:33:07 +0000 (13:33 -0500)]
nbd: Rename struct nbd_request and nbd_reply
Our coding convention prefers CamelCase names, and we already
have other existing structs with NBDFoo naming. Let's be
consistent, before later patches add even more structs.
Eric Blake [Fri, 14 Oct 2016 18:33:05 +0000 (13:33 -0500)]
nbd: Rename NBDRequest to NBDRequestData
We have both 'struct NBDRequest' and 'struct nbd_request'; making
it confusing to see which does what. Furthermore, we want to
rename nbd_request to align with our normal CamelCase naming
conventions. So, rename the struct which is used to associate
the data received during request callbacks, while leaving the
shorter name for the description of the request sent over the
wire in the NBD protocol.
Eric Blake [Fri, 14 Oct 2016 18:33:04 +0000 (13:33 -0500)]
nbd: Treat flags vs. command type as separate fields
Current upstream NBD documents that requests have a 16-bit flags field,
followed by a 16-bit type integer; although older versions mentioned
only a 32-bit field with masking to find flags. Since the protocol
is in network order (big-endian over the wire), the ABI is unchanged;
but dealing with the flags as a separate field rather than masking
will make it easier to add support for upcoming NBD extensions that
increase the number of both flags and commands.
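The claim that the wire format is unchanged can be checked with a small encoding sketch: writing 16-bit flags followed by a 16-bit type in big-endian order yields the same four bytes as writing one 32-bit big-endian value with the flags in the upper half. The helper names are invented for this example, and placing the flags in the upper 16 bits of the old field is an assumption about the older masking scheme.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)(v & 0xff);
}

/* New view: two separate 16-bit fields, flags first. */
static void encode_split(uint8_t out[4], uint16_t flags, uint16_t type)
{
    put_be16(out, flags);
    put_be16(out + 2, type);
}

/* Old view (assumed): one 32-bit field, flags masked into the upper half. */
static void encode_masked(uint8_t out[4], uint16_t flags, uint16_t type)
{
    put_be32(out, ((uint32_t)flags << 16) | type);
}
```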
Improve some comments in nbd.h based on the current upstream
NBD protocol (https://github.com/yoe/nbd/blob/master/doc/proto.md),
and touch some nearby code to keep checkpatch.pl happy.
Eric Blake [Fri, 14 Oct 2016 18:33:03 +0000 (13:33 -0500)]
nbd: Add qemu-nbd -D for human-readable description
The NBD protocol allows servers to advertise a human-readable
description alongside an export name during NBD_OPT_LIST. Add
an option to pass through the user's string to the NBD client.
Doing this also makes it easier to test commit 200650d4, which
is the client counterpart of receiving the description.
Haozhong Zhang [Wed, 19 Oct 2016 09:19:25 +0000 (17:19 +0800)]
acpi: fix assert failure caused by commit 35c5a52d
Commit 35c5a52d "acpi: do not use TARGET_PAGE_SIZE" changed struct
NvdimmDsmIn from a variable-size structure to a fixed-size structure of
4096 bytes. It forgot to adjust an assert in
nvdimm_dsm_set_label_data(..., NvdimmDsmIn *in, ...):
assert(sizeof(*in) + sizeof(*set_label_data) + set_label_data->length <=
4096);
which could crash QEMU when guest writes NVDIMM labels.
Fix it by replacing sizeof(*in) by offsetof(NvdimmDsmIn, arg3).
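Why sizeof() breaks the assert while offsetof() does not can be shown with a stand-in structure. The layout below (three 32-bit header fields followed by an argument area padding the struct to 4096 bytes) is a hypothetical simplification, not the real NvdimmDsmIn definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size input structure of 4096 bytes. */
struct dsm_in {
    uint32_t handle;
    uint32_t revision;
    uint32_t function;
    uint8_t arg3[4096 - 12];   /* payload area fills the rest */
};

/* Old check: sizeof() already equals 4096, so any payload fails. */
static bool fits_sizeof(size_t hdr_len, size_t data_len)
{
    return sizeof(struct dsm_in) + hdr_len + data_len <= 4096;
}

/* Fixed check: only count the bytes that precede the payload area. */
static bool fits_offsetof(size_t hdr_len, size_t data_len)
{
    return offsetof(struct dsm_in, arg3) + hdr_len + data_len <= 4096;
}
```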
Corey Minyard [Mon, 24 Oct 2016 20:10:20 +0000 (15:10 -0500)]
ipmi: Add graceful shutdown handling to the external BMC
I misunderstood the workings of the power settings: the power off
is a force off operation, and there needs to be a separate graceful
shutdown operation. So replace the force off operation with a
graceful shutdown.
Cédric Le Goater [Mon, 24 Oct 2016 20:10:17 +0000 (15:10 -0500)]
ipmi: chassis poweroff should use qemu_system_shutdown_request()
When issuing a chassis 'powerdown' control command, the routine
qemu_system_shutdown_request() should be used to exit the guest.
qemu_system_powerdown_request() will initiate a soft shutdown which is
not what is required by the IPMI (28.3 Chassis Control Command):
0h = power down. Force system into soft off (S4/S45) state. This
is for 'emergency' management power down actions. The command does
not initiate a clean shut-down of the operating system prior to
powering down the system
Xiao Guangrong [Fri, 28 Oct 2016 16:35:39 +0000 (00:35 +0800)]
nvdimm acpi: introduce _FIT
_FIT is required for hotplug support; the guest will query the updated
device info from it when a hotplug event is received.
As the FIT buffer is not completely mapped into the guest address space, a
new function, Read FIT, with UUID 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62,
handle 0x10000 and function index 0x1, is reserved by QEMU to read a piece
of the FIT buffer. The pieces are concatenated before _FIT returns.
Refer to docs/specs/acpi-nvdimm.txt for the detailed design.
Xiao Guangrong [Fri, 28 Oct 2016 16:35:38 +0000 (00:35 +0800)]
nvdimm acpi: introduce fit buffer
The buffer is used to save the FIT info for all the present nvdimm
devices; it is updated after an nvdimm device is plugged or
unplugged. In a later patch, it will be used to construct the NVDIMM
ACPI _FIT method, which reflects the present nvdimm devices after
nvdimm hotplug.
As the FIT buffer cannot be completely mapped into the guest address space,
OSPM will exit to QEMU multiple times. However, there is a race
condition: the FIT may change during these multiple exits. Some rules are
therefore introduced:
1) the user should hold the @lock to access the buffer and
2) mark @dirty whenever the buffer is updated.
@dirty is cleared the first time OSPM gets the fit buffer; if
dirty is detected in a later access, OSPM will restart the
access.
As the fit should be updated after the nvdimm device is successfully realized,
a new hotplug callback, post_hotplug, is introduced.