Kevin Wolf [Tue, 11 Jul 2017 12:04:08 +0000 (14:04 +0200)]
ide: bdrv_attach_dev() for empty CD-ROM
If no drive=... option is passed (for an empty drive), we not only
lack the BlockBackend normally created by parse_drive(), but also
need to call blk_attach_dev() manually.
IDE does not support hot unplug, but if it did, qdev would take care to
call the matching blk_detach_dev() on unplug.
This fixes at least the bug that such devices didn't show up in
query-block, and probably some more problems.
Kevin Wolf [Tue, 11 Jul 2017 12:00:57 +0000 (14:00 +0200)]
block: List anonymous device BBs in query-block
Instead of listing only monitor-owned BlockBackends in query-block, also
add those anonymous BlockBackends that are owned by a qdev device and as
such under the control of the user.
This allows using query-block to inspect BlockBackends for the modern
configuration syntax with -blockdev and -device.
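The listing rule described above can be modeled roughly as follows. This is a minimal Python sketch and not QEMU's C code: the `Backend` class and its fields are hypothetical stand-ins for a BlockBackend's monitor name and its attached qdev device.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical stand-in for a BlockBackend."""
    name: str = ""               # non-empty only for monitor-owned backends
    qdev: Optional[str] = None   # ID or QOM path of the attached qdev device


def listed_in_query_block(backends: List[Backend]) -> List[Backend]:
    """Keep named backends and anonymous ones owned by a qdev device."""
    return [bb for bb in backends if bb.name or bb.qdev is not None]


backends = [
    Backend(name="drive0"),                    # monitor-owned
    Backend(qdev="/machine/peripheral/cd0"),   # anonymous, owned by a device
    Backend(),                                 # anonymous, no device: skipped
]
visible = listed_in_query_block(backends)
```

In this model an anonymous backend with no device stays hidden, since nothing identifies it to the user.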
Kevin Wolf [Tue, 11 Jul 2017 11:04:28 +0000 (13:04 +0200)]
block/qapi: Use blk_all_next() for query-block
This patch replaces the blk_next() loop in query-block by a
blk_all_next() one so that we also get access to BlockBackends that
aren't owned by the monitor. For now, the next thing we do is check
whether each BB has a name, so there is no semantic difference.
Kevin Wolf [Tue, 11 Jul 2017 11:27:38 +0000 (13:27 +0200)]
block/qapi: Add qdev device name to query-block
With -blockdev/-device, users can indirectly create anonymous
BlockBackends, while the state of such backends is still of interest. As
a preparation for making such BBs visible in query-block, make sure that
they can be identified even without a name by adding the ID/QOM path of
their qdev device to BlockInfo.
Peter Maydell [Sun, 9 Jul 2017 21:07:17 +0000 (22:07 +0100)]
block/vpc.c: Handle write failures in get_image_offset()
Coverity (CID 1355236) points out that get_image_offset() doesn't check that
it actually succeeded in writing the updated block bitmap to the file.
Check the error return from bdrv_pwrite_sync() and propagate an error
response back up to the function which calls get_image_offset() for
a write so that it can return the error to its caller.
get_sector_offset() is only used for reads, but we move it to the
same API for consistency.
Peter Maydell [Sun, 9 Jul 2017 17:06:14 +0000 (18:06 +0100)]
block/vmdk: Report failures in vmdk_read_cid()
The function vmdk_read_cid() can fail if the read on the underlying
block device fails, or if there's a format error in the VMDK file.
However its API doesn't provide a mechanism to report these errors,
and in some cases we were returning a CID of 0 and in some cases a
CID of 0xffffffff, either of which might potentially be valid values.
Change the function to return 0 on success or a negative errno, and
return the CID via a uint32_t* argument. Update the callsites to
handle and propagate the error appropriately.
This fixes in passing a Coverity-spotted issue (CID 1350038) where
we weren't checking the return value from sscanf().
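The reason for separating the status from the value can be sketched in Python. This is an illustrative model only: the real function is C and hands the CID back through a uint32_t* argument, whereas here the second tuple element plays that role. The `read_cid` helper and the assumption that the descriptor carries `CID=` and `parentCID=` lines with hexadecimal values are for illustration.

```python
import errno
import re
from typing import Optional, Tuple


def read_cid(descriptor: str, parent: bool = False) -> Tuple[int, Optional[int]]:
    """Return (0, cid) on success or (-errno, None) on a format error."""
    key = "parentCID" if parent else "CID"
    # Anchor at the start of a line so "CID=" does not match "parentCID=".
    pattern = r"^%s=([0-9a-fA-F]{1,8})\s*$" % key
    match = re.search(pattern, descriptor, re.MULTILINE)
    if match is None:
        return -errno.EINVAL, None
    return 0, int(match.group(1), 16)


desc = "version=1\nCID=ffffffff\nparentCID=0\n"
ok_ret, cid = read_cid(desc)
parent_ret, parent_cid = read_cid(desc, parent=True)
bad_ret, bad_cid = read_cid("version=1\n")
```

With this shape, 0 and 0xffffffff remain usable as CIDs because failure is signalled by the status alone.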
block: remove timer canceling in throttle_config()
throttle_config() cancels the timers of the calling BlockBackend. This
doesn't make sense because other BlockBackends in the group remain
untouched. There's no need to cancel the timers in the one specific
BlockBackend so let's not do that. Throttled requests will run as
scheduled and future requests will follow the new configuration. This
also allows a throttle group's configuration to be changed even when it
has no members.
Clock type in throttling is currently inferred by the ThrottleTimer's
clock type even though it is a per-ThrottleGroup property; it doesn't
make sense to have different clock types in the same group. Moving this
to a field in ThrottleGroup can simplify some of the throttle functions.
Kevin Wolf [Mon, 10 Jul 2017 11:42:35 +0000 (13:42 +0200)]
commit: Add NULL check for overlay_bs
I can't see how overlay_bs could become NULL with the current code, but
other code in this function already checks it and we can make Coverity
happy with this check, so let's add it.
* remotes/stefanha/tags/block-pull-request:
block: fix shadowed variable in bdrv_co_pdiscard
util/aio-win32: Only select on what we are actually waiting for
Peter Maydell [Tue, 18 Jul 2017 10:41:03 +0000 (11:41 +0100)]
Merge remote-tracking branch 'remotes/aurel/tags/pull-target-mips-20170717' into staging
Queued target/mips patches
# gpg: Signature made Mon 17 Jul 2017 15:50:27 BST
# gpg: using RSA key 0xBA9C78061DDD8C9B
# gpg: Good signature from "Aurelien Jarno <[email protected]>"
# gpg: aka "Aurelien Jarno <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7746 2642 A9EF 94FD 0F77 196D BA9C 7806 1DDD 8C9B
* remotes/aurel/tags/pull-target-mips-20170717:
target/mips: optimize WSBH, DSBH and DSHD
mips: set CP0 Debug DExcCode for SDBBP instruction
Peter Maydell [Tue, 18 Jul 2017 09:35:06 +0000 (10:35 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170717' into staging
target-arm queue:
* new model of the ARM MPS2/MPS2+ FPGA based development board
* clean up DISAS_* exit conditions and fix various regressions
since commits e75449a346 and 8a6b28c7b5 (in particular including
ones which broke OP-TEE guests)
* make Cortex-M3 and M4 correctly default to 8 PMSA regions
* remotes/pmaydell/tags/pull-target-arm-20170717:
MAINTAINERS: Add entries for MPS2 board
hw/arm/mps2: Add ethernet
hw/arm/mps2: Add SCC
hw/misc/mps2_scc: Implement MPS2 Serial Communication Controller
hw/arm/mps2: Add timers
hw/char/cmsdk-apb-timer: Implement CMSDK APB timer device
hw/arm/mps2: Add UARTs
hw/char/cmsdk-apb-uart.c: Implement CMSDK APB UART
hw/arm/mps2: Implement skeleton mps2-an385 and mps2-an511 board models
target/arm: use DISAS_EXIT for eret handling
target/arm: use gen_goto_tb for ISB handling
target/arm/translate: ensure gen_goto_tb sets exit flags
target/arm/translate.h: expand comment on DISAS_EXIT
target/arm/translate: make DISAS_UPDATE match declared semantics
include/exec/exec-all: document common exit conditions
target/arm: Make Cortex-M3 and M4 default to 8 PMSA regions
qdev: support properties which don't set a default value
qdev-properties.h: Explicitly set the default value for arraylen properties
Peter Maydell [Tue, 18 Jul 2017 08:16:43 +0000 (09:16 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 17 Jul 2017 13:17:17 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
virtio-net: fix offload ctrl endian
virtion-net: Prefer is_power_of_2()
docs/colo-proxy.txt: Update colo-proxy usage of net driver with vnet_header
net/filter-rewriter.c: Make filter-rewriter support vnet_hdr_len
net/colo-compare.c: Add vnet packet's tcp/udp/icmp compare
net/colo.c: Add vnet packet parse feature in colo-proxy
net/colo-compare.c: Make colo-compare support vnet_hdr_len
net/colo-compare.c: Introduce parameter for compare_chr_send()
net/colo.c: Make vnet_hdr_len as packet property
net/filter-mirror.c: Add new option to enable vnet support for filter-redirector
net/filter-mirror.c: Make filter mirror support vnet support.
net/filter-mirror.c: Introduce parameter for filter_send()
net/net.c: Add vnet_hdr support in SocketReadState
net: Add vnet_hdr_len arguments in NetClientState
* remotes/stefanha/tags/tracing-pull-request:
trace: update old trace events in docs
trace: [trivial] Statically enable all guest events
trace: [tcg, trivial] Re-align generated code
trace: [tcg] Do not generate TCG code to trace dynamically-disabled events
exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
trace: [tcg] Delay changes to dynamic state when translating
trace: Allocate cpu->trace_dstate in place
Pavel Dovgalyuk [Fri, 23 Jun 2017 10:41:16 +0000 (12:41 +0200)]
mips: set CP0 Debug DExcCode for SDBBP instruction
This patch fixes setting DExcCode field of CP0 Debug register
when SDBBP instruction is executed. According to EJTAG specification,
this field must be set to the value 9 (Bp).
* remotes/kraxel/tags/audio-20170717-pull-request:
audio/adlib: remove limitation of one adlib card
audio/fmopl: modify timer callback to give opaque and channel parameters in two arguments
audio: st_rate_flow exist a infinite loop
Alexander Graf [Wed, 12 Jul 2017 12:34:28 +0000 (14:34 +0200)]
PPC: E500: Update u-boot to v2017.07
Quite a while has passed since we last updated U-Boot for e500. This patch
bumps it to the last released version 2017.07 to make sure users don't feel
like they're using out of date software.
Peter Maydell [Mon, 17 Jul 2017 12:36:09 +0000 (13:36 +0100)]
MAINTAINERS: Add entries for MPS2 board
Add entries to the MAINTAINERS file for the new MPS2
board and devices.
Since the CMSDK devices are not specific to the MPS2 board,
extend the existing 'PrimeCell' section to cover CMSDK
devices as well; in both cases these are devices implemented
by ARM and provided as RTL that may be used in multiple
SoCs and boards.
Peter Maydell [Mon, 17 Jul 2017 12:36:09 +0000 (13:36 +0100)]
hw/arm/mps2: Add ethernet
The MPS2 FPGA images support ethernet via a LAN9220. We use
QEMU's LAN9118 model, which is software compatible except
that it is missing the checksum-offload feature.
Peter Maydell [Mon, 17 Jul 2017 12:36:08 +0000 (13:36 +0100)]
hw/misc/mps2_scc: Implement MPS2 Serial Communication Controller
Implement a model of the Serial Communication Controller (SCC) found
in MPS2 FPGA images.
The primary purpose of this device is to communicate with the
Motherboard Configuration Controller (MCC) which is located on
the MPS board itself, outside the FPGA image. This is used
for programming the MPS clock generators. The SCC also has
some basic ID registers and an output for the board LEDs.
Peter Maydell [Mon, 17 Jul 2017 12:36:08 +0000 (13:36 +0100)]
hw/arm/mps2: Add UARTs
Add the UARTs to the MPS2 board models.
Unfortunately the details of the wiring of the interrupts through
various OR gates differ between AN511 and AN385 so this can't
be purely a data-driven difference.
Peter Maydell [Mon, 17 Jul 2017 12:36:08 +0000 (13:36 +0100)]
hw/arm/mps2: Implement skeleton mps2-an385 and mps2-an511 board models
Model the ARM MPS2/MPS2+ FPGA based development board.
The MPS2 and MPS2+ dev boards are FPGA based (the 2+ has a bigger
FPGA but is otherwise the same as the 2). Since the CPU itself
and most of the devices are in the FPGA, the details of the board
as seen by the guest depend significantly on the FPGA image.
We model the following FPGA images:
"mps2_an385" -- Cortex-M3 as documented in ARM Application Note AN385
"mps2_an511" -- Cortex-M3 'DesignStart' as documented in AN511
They are fairly similar but differ in the details for some
peripherals.
Alex Bennée [Mon, 17 Jul 2017 12:36:07 +0000 (13:36 +0100)]
target/arm: use DISAS_EXIT for eret handling
Previously DISAS_JUMP ensured that we left the run-loop after an eret,
but with the optimisation of 8a6b28c7 (optimize indirect branches) we
might not leave the loop.
This means if any pending interrupts are cleared by changing IRQ flags
we might never get around to servicing them. You usually notice this
by seeing the lookup_tb_ptr() helper gainfully chaining TBs together
while cpu->interrupt_request remains high and the exit_request has not
been set.
This breaks amongst other things the OPTEE test suite which executes
an eret from the secure world after a non-secure world IRQ has gone
pending which then never gets serviced.
Instead of using the previously implied semantics of DISAS_JUMP we use
DISAS_EXIT which will always exit the run-loop.
Alex Bennée [Mon, 17 Jul 2017 12:36:07 +0000 (13:36 +0100)]
target/arm: use gen_goto_tb for ISB handling
While an ISB will ensure any raised IRQs happen on the next
instruction it doesn't cause any to get raised by itself. We can
therefore use a simple tb exit for ISB instructions and rely on the
exit_request check at the top of each TB to deal with exiting if
needed.
Alex Bennée [Mon, 17 Jul 2017 12:36:07 +0000 (13:36 +0100)]
target/arm/translate: make DISAS_UPDATE match declared semantics
DISAS_UPDATE should be used when the wider CPU state other than just
the PC has been updated and we should therefore exit the TCG runtime
and return to the main execution loop rather than assuming DISAS_JUMP
would do that.
Peter Maydell [Mon, 17 Jul 2017 12:36:07 +0000 (13:36 +0100)]
target/arm: Make Cortex-M3 and M4 default to 8 PMSA regions
The Cortex-M3 and M4 CPUs always have 8 PMSA MPU regions (this isn't
a configurable option for the hardware). Make the default value of
the pmsav7-dregion property be set per-cpu, so we don't need to have
every user of these CPUs set it manually. (The existing default of
16 is correct for the other PMSAv7 core, the Cortex-R5.)
This fixes a bug where we were creating the M3 and M4 with
too many regions; most guest software would not notice or
care, though, since it would just not use the registers
associated with the unexpected extra regions.
Peter Maydell [Mon, 17 Jul 2017 12:36:06 +0000 (13:36 +0100)]
qdev: support properties which don't set a default value
In some situations it's useful to have a qdev property which doesn't
automatically set its default value when qdev_property_add_static is
called (for instance when the default value is not constant).
Support this by adding a flag to the Property struct indicating
whether to set the default value. This replaces the existing test
for whether the PropertyInfo set_default_value function pointer is
NULL, and we set the .set_default field to true for all those cases
of struct Property which use a PropertyInfo with a non-NULL
set_default_value, so behaviour remains the same as before.
This gives us the semantics of:
* if .set_default is true, then .info->set_default_value must
be non-NULL, and .defval is used as the default value of
the property
* otherwise, the property system does not set any default, and
the field will retain whatever initial value it was given by
the device's .instance_init method
We define two new macros DEFINE_PROP_SIGNED_NODEFAULT and
DEFINE_PROP_UNSIGNED_NODEFAULT, to cover the most plausible use cases
of wanting to set an integer property with no default value.
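The two cases above can be modeled with a short Python sketch. It is not the qdev implementation: the classes only mirror the field names from the commit message, while `store_default`, `add_static_property` and the `num-irq` property are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PropertyInfo:
    # Hypothetical model of the set_default_value function pointer.
    set_default_value: Optional[Callable[[Dict[str, int], str, int], None]] = None


@dataclass
class Property:
    name: str
    info: PropertyInfo
    set_default: bool = False
    defval: int = 0


def store_default(fields: Dict[str, int], name: str, value: int) -> None:
    fields[name] = value


def add_static_property(fields: Dict[str, int], prop: Property) -> None:
    """Apply the default only when the property asks for it."""
    if prop.set_default:
        # The commit message requires a non-NULL setter in this case.
        assert prop.info.set_default_value is not None
        prop.info.set_default_value(fields, prop.name, prop.defval)
    # Otherwise keep whatever value instance_init stored in the field.


int_info = PropertyInfo(set_default_value=store_default)
fields = {"num-irq": 0, "pmsav7-dregion": 8}   # values set by instance_init
add_static_property(fields, Property("num-irq", int_info, set_default=True, defval=64))
add_static_property(fields, Property("pmsav7-dregion", int_info))  # no default
```

The second property keeps the value chosen at instance-init time, which is the behaviour a per-CPU default needs.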
Peter Maydell [Mon, 17 Jul 2017 12:36:06 +0000 (13:36 +0100)]
qdev-properties.h: Explicitly set the default value for arraylen properties
In DEFINE_PROP_ARRAY, because we use a PropertyInfo (qdev_prop_arraylen)
which has a .set_default_value member we will set the field to a default
value. That default value will be zero, by the C rule that struct
initialization sets unmentioned members to zero if at least one member
is initialized. However it's clearer to state it explicitly.
net/filter-rewriter.c: Make filter-rewriter support vnet_hdr_len
We add the vnet_hdr_support option for filter-rewriter; it is disabled by default.
If you use virtio-net-pci or another driver that needs vnet_hdr, please enable it.
For example:
-object filter-rewriter,id=rew0,netdev=hn0,queue=all,vnet_hdr_support
We get the vnet_hdr_len from NetClientState, which lets us
parse network packets correctly.
net/colo-compare.c: Make colo-compare support vnet_hdr_len
We add the vnet_hdr_support option for colo-compare; it is disabled by default.
If you use virtio-net-pci or another driver that needs vnet_hdr, please enable it.
For example:
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support
COLO-compare can get the vnet header length from the filter.
Add vnet_hdr_len to struct packet and output packets with
the vnet_hdr_len.
net/colo-compare.c: Introduce parameter for compare_chr_send()
This patch changes the compare_chr_send() parameter from CharBackend to CompareState,
so that we can get more information such as vnet_hdr (we use it to support packets with a vnet header).
net/filter-mirror.c: Add new option to enable vnet support for filter-redirector
We add the vnet_hdr_support option for filter-redirector; it is disabled by default.
If you use the virtio-net-pci net driver or another driver that needs vnet_hdr, please enable it.
Because colo-compare and other modules need the vnet_hdr_len to parse
packets, this new option sends the length to them.
For example:
-object filter-redirector,id=r0,netdev=hn0,queue=tx,outdev=red0,vnet_hdr_support
net/filter-mirror.c: Make filter mirror support vnet support.
We add the vnet_hdr_support option for filter-mirror; it is disabled by default.
If you use virtio-net-pci or another driver that needs vnet_hdr, please enable it.
For example:
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0,vnet_hdr_support
If the vnet_hdr_support flag is set, we change the format of sent packets from
struct {int size; const uint8_t buf[];} to {int size; int vnet_hdr_len; const uint8_t buf[];},
so that other modules (like colo-compare) know how to parse network packets correctly.
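The two frame layouts can be sketched with Python's struct module. This is a model under stated assumptions, not the filter-mirror code: it assumes both integers are 32-bit values in network byte order and that size counts only the bytes of buf. The `pack_frame` and `unpack_frame` helpers are hypothetical.

```python
import struct
from typing import Optional, Tuple


def pack_frame(buf: bytes, vnet_hdr_len: Optional[int] = None) -> bytes:
    """Build one frame; pass vnet_hdr_len to use the extended layout."""
    header = struct.pack("!i", len(buf))
    if vnet_hdr_len is not None:
        header += struct.pack("!i", vnet_hdr_len)
    return header + buf


def unpack_frame(frame: bytes, vnet_hdr_support: bool) -> Tuple[int, bytes]:
    """Return (vnet_hdr_len, buf); vnet_hdr_len is 0 in the legacy layout."""
    (size,) = struct.unpack_from("!i", frame, 0)
    offset = 4
    vnet_hdr_len = 0
    if vnet_hdr_support:
        (vnet_hdr_len,) = struct.unpack_from("!i", frame, offset)
        offset += 4
    return vnet_hdr_len, frame[offset:offset + size]


legacy = pack_frame(b"payload")
extended = pack_frame(b"payload", vnet_hdr_len=12)
```

In this model a receiver that reads an extended frame with the legacy layout treats the vnet_hdr_len field as payload, so sender and receiver have to agree on the flag.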
Stefan Hajnoczi [Fri, 14 Jul 2017 13:31:11 +0000 (14:31 +0100)]
trace: update old trace events in docs
Commit c5f1ad429cdf26023cf331075a7d327708e3db6d ("block: Remove
bdrv_aio_readv/writev/flush()") removed
bdrv_aio_readv()/bdrv_aio_writev() so the example in the tracing
documentation is no longer valid.
trace: [tcg] Do not generate TCG code to trace dynamically-disabled events
If an event is dynamically disabled, the TCG code that calls the
execution-time tracer is not generated.
Removes the overheads of execution-time tracers for dynamically disabled
events. As a bonus, also avoids checking the event state when the
execution-time tracer is called from TCG-generated code (since otherwise
TCG would simply not call it).
exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Every vCPU now uses a separate set of TBs for each set of dynamic
tracing event state values. Each set of TBs can be used by any number of
vCPUs to maximize TB reuse when vCPUs have the same tracing state.
This feature is later used by tracetool to optimize tracing of guest
code events.
The maximum number of TB sets is defined as 2^E, where E is the number
of events that have the 'vcpu' property (their state is stored in
CPUState->trace_dstate).
For this to work, a change on the dynamic tracing state of a vCPU will
force it to flush its virtual TB cache (which is only indexed by
address), and fall back to the physical TB cache (which now contains the
vCPU's dynamic tracing state as part of the hashing function).
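A toy Python model of the idea, assuming a hypothetical count of three 'vcpu' events: the cache key combines the address with the tracing state, so vCPUs in the same state share a TB and a state change leads to a separate one. `NUM_VCPU_EVENTS` and `TBCache` are illustrative names, and the dictionary only stands in for the physical TB cache's hashing.

```python
from typing import Dict, Tuple

NUM_VCPU_EVENTS = 3                      # hypothetical E
MAX_TB_SETS = 2 ** NUM_VCPU_EVENTS       # upper bound on distinct TB sets


class TBCache:
    """Toy physical TB cache keyed by (address, tracing state)."""

    def __init__(self) -> None:
        self.blocks: Dict[Tuple[int, int], str] = {}
        self.translations = 0

    def lookup(self, pc: int, trace_dstate: int) -> str:
        key = (pc, trace_dstate)
        if key not in self.blocks:
            # Cache miss: "translate" the block for this tracing state.
            self.translations += 1
            self.blocks[key] = "tb@%#x/state=%s" % (pc, format(trace_dstate, "03b"))
        return self.blocks[key]


cache = TBCache()
tb_a = cache.lookup(0x1000, 0b001)   # vCPU 0, event 0 enabled
tb_b = cache.lookup(0x1000, 0b001)   # vCPU 1, same state: TB is reused
tb_c = cache.lookup(0x1000, 0b011)   # vCPU 1 after enabling event 1
```

Only two translations happen here, because the first two lookups share one tracing state.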
trace: [tcg] Delay changes to dynamic state when translating
This keeps consistency across all decisions taken during translation
when the dynamic state of a vCPU is changed in the middle of translating
some guest code.
There's little point in dynamically allocating the bitmap if we
know at compile-time the max number of events we want to support.
Thus, make room in the struct for the bitmap, which will make things
easier later: this paves the way for upcoming changes, in which
we'll use a u32 to fully capture cpu->trace_dstate.
This change also increases performance by saving a dereference and
improving locality--note that this is important since upcoming work
makes reading this bitmap fairly common.
net/filter-mirror.c: Introduce parameter for filter_send()
This patch changes the filter_send() parameter from CharBackend to MirrorState,
so that we can get more information such as vnet_hdr (we use it to support packets with a vnet header).
Peter Maydell [Mon, 17 Jul 2017 11:52:59 +0000 (12:52 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170717' into staging
ppc patch queue 2017-07-17
This pull request supersedes the one from 2017-07-14. That one had a
couple of subtle regressions: there was a build error for mingw32, and
an instance_size which was theoretically wrong everywhere, but only
actually bit on the Travis OSX build.
There are two major batches in this set, rather than the usual
collection of assorted fixes.
* More DRC cleanup. This gets the state management into a state
which should fix many of the hotplug+migration problems we've
had. Plus it gets the migration stream format into something
well defined and pretty minimal which we can reasonably support
into the future.
* Hashed Page Table resizing. It's been a while since this was
posted, but it's been through several previous rounds of review.
The kernel parts (both guest and host) are merged in 4.11, so
this is the only remaining piece left to allow resizing of the
HPT in a running guest.
Peter Maydell [Mon, 17 Jul 2017 10:46:36 +0000 (11:46 +0100)]
Merge remote-tracking branch 'remotes/famz/tags/block-and-testing-pull-request' into staging
# gpg: Signature made Mon 17 Jul 2017 04:47:05 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/block-and-testing-pull-request:
travis: add no-TCG build
docker.py: Improve subprocess exit code handling
docker.py: Drop infile parameter
docker: Don't enable networking as a side-effect of DEBUG=1
ssh: support I/O from any AioContext
sheepdog: add queue_lock
qed: protect table cache with CoMutex
qed: introduce bdrv_qed_init_state
block: invoke .bdrv_drain callback in coroutine context and from AioContext
qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc}
vvfat: make it thread-safe
vpc: make it thread-safe
vdi: make it thread-safe
coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade
qcow2: call CoQueue APIs under CoMutex
Alexander Graf [Wed, 12 Jul 2017 12:43:45 +0000 (14:43 +0200)]
vnc: Set default kbd delay to 10ms
The current VNC default keyboard delay is 1ms. With that we're constantly
typing faster than the guest receives keyboard events from an XHCI attached
USB HID device.
The default keyboard delay time in the input layer however is 10ms. I don't know
how that number came to be, but empirical tests on some OpenQA driven ARM
systems show that 10ms really is a reasonable default number for the delay.
This patch moves the VNC delay also to 10ms. That way our default is much
safer (good!) and also consistent with the input layer default (also good!).
If voice recording equipment is kept open for a long time (several days)
in a Windows guest, rate->ipos will overflow and rate->opos will never
have a chance to change. This results in an infinite loop.
Peter Maydell [Mon, 17 Jul 2017 09:02:22 +0000 (10:02 +0100)]
Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
slirp updates
# gpg: Signature made Sat 15 Jul 2017 13:30:03 BST
# gpg: using RSA key 0xB0A51BF58C9179C5
# gpg: Good signature from "Samuel Thibault <[email protected]>"
# gpg: aka "Samuel Thibault <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82 304B D017 8C76 7D06 9EE6
# Subkey fingerprint: AEBF 7448 FAB9 453A 4552 390E B0A5 1BF5 8C91 79C5
* remotes/thibault/tags/samuel-thibault:
slirp: Handle error returns from sosendoob()
slirp: Handle error returns from slirp_send() in sosendoob()
slirp: fork_exec(): Don't close() a negative number in fork_exec()
slirp: use DIV_ROUND_UP
Christian Nilsson (1):
[intel] Add INTEL_NO_PHY_RST for I219-LM (2)
David Decotigny (2):
[build] Return const char * from uuid_ntoa()
[af_packet] Add new AF_PACKET driver for Linux
Jason Wang (1):
[virtio] Support VIRTIO_NET_F_IOMMU_PLATFORM
Jerone Young (1):
[intel] Add support for I219-V in 7th Gen Intel NUC
Konrad Adamczyk (1):
[thunderx] Don't disable NIC when exiting from iPXE
Ladi Prosek (3):
[virtio] Cap queue size to MAX_QUEUE_NUM
[virtio] Simplify virtqueue shutdown
[virtio] Remove queue size limit in legacy virtio
Martin Habets (1):
[sfc] Add driver for Solarflare SFC8XXX adapters
Michael Brown (159):
[interface] Provide intf_reinit() to reinitialise nullified interfaces
[iscsi] Avoid potential infinite loops during shutdown
[efi] Add basic EFI SAN booting capability
[undi] Allocate base memory before calling UNDI loader entry point
[romprefix] Avoid using PMM-allocated memory in UNDI loader entry point
[undi] Clean up driver and device name information
[prefix] Remove impossible progress message
[prefix] Include diagnostic information within progress messages
[undi] Try matching UNDI ROMs in BIOS enumeration order
[efi] Work around temporal anomaly encountered during ExitBootServices()
[ipv4] Accept unicast packets for the local network broadcast address
[build] Add %.vhd target for building VM bootable disk images
[virtio] Use separate RX and TX empty header buffers
[cloud] Add ability to retrieve Google Compute Engine metadata
[virtio] Use host-specified MTU when available
[netdevice] Allow MTU to be changed at runtime
[cloud] Show CPU vendor and model in example cloud boot scripts
[hyperv] Ignore unsolicited VMBus messages
[pic8259] Fix definitions for "read IRR" and "read ISR" commands
[efi] Fix building elf2efi.c when -fpic is enabled by default
[interface] Avoid unnecessary reference counting in intf_unplug()
[interface] Remove misleading comment
[interface] Unplug interface before calling intf_close() in intf_shutdown()
[netdevice] Limit MTU by hardware maximum frame length
[cpuid] Provide cpuid_supported() to test for supported functions
[time] Allow timer to be selected at runtime
[hyperv] Provide timer based on the 10MHz time reference count MSR
[int13] Avoid potential division by zero
[int13] Test correct return status from INT 13 calls
[settings] Add "unixtime" builtin setting to expose the current time
[time] Report attempts to use timers before initialisation
[interface] Provide the ability to shut down multiple interfaces
[http] Cleanly shut down potentially looped interfaces
[efi] Add missing SANBOOT_PROTO_HTTP to EFI default configuration
[block] Remove spurious comments
[block] Centralise SAN device abstraction
[block] Centralise "san-drive" setting
[int13] Refactor to use centralised SAN device abstraction
[efi] Refactor to use centralised SAN device abstraction
[block] Retry any SAN device operation
[iscsi] Use intfs_shutdown() when shutting down multiple interfaces
[scsi] Use intfs_shutdown() when shutting down multiple interfaces
[block] Use intfs_shutdown() when shutting down multiple interfaces
[scsi] Avoid duplicate calls to scsicmd_close()
[build] Provide common ARRAY_SIZE() definition
[efi] Update to current EDK2 headers
[efi] Add EFI_ACPI_TABLE_PROTOCOL header and GUID definition
[efi] Provide ACPI table description for SAN devices
[efi] Skip cable detection at initialisation where possible
[undi] Move PXE API caller back into UNDI driver
[dhcp] Allow vendor class to be changed in DHCP requests
[hermon] Avoid potential integer overflow when calculating memory mappings
[arbel] Avoid potential integer overflow when calculating memory mappings
[xfer] Ensure va_end() is called on failure path
[nfs] Fix double free bug on error path
[linda] Use correct length for memset()
[qib7322] Use correct length for memset()
[sis900] Remove extraneous memset() with incorrect length
[802.11] Remove redundant NULL pointer check after dereference
[crypto] Free correct pointer on the error path
[librm] Fail gracefully if asked to ioremap() a zero length
[usb] Use correct length for memcpy()
[mucurses] Attempt to fix test for empty string
[mucurses] Attempt to fix keypress processing logic
[mucurses] Attempt to fix resource leaks
[hyperv] Fix resource leaks on error path
[slam] Fix resource leak on error path
[slam] Avoid NULL pointer dereference in slam_pull_value()
[eoib] Avoid passing a NULL I/O buffer to netdev_tx_complete_err()
[http] Add missing check for memory allocation failure
[mucurses] Attempt to fix use of uninitialised buffer with strcat()
[xhci] Avoid accessing beyond end of endpoint context array
[build] Avoid confusing sparse in single-argument DBG() macros
[infiniband] Return status code from ib_create_cq() and ib_create_qp()
[infiniband] Return status code from ib_create_mi()
[block] Quell spurious Coverity size mismatch warning
[ath] Add missing break statements
[pixbuf] Avoid potential division by zero
[usb] Use correct length for memcpy()
[xen] Use standard calling pattern for asprintf()
[tcp] Use correct length for memset()
[video_subr] Use memmove() for overlapping memory copy
[arbel] Assert that mapping length is non-zero
[hermon] Assert that mapping length is non-zero
[tlan] Guard against failure to identify chip
[w89c840] Avoid potential array overrun
[sis190] Avoid NULL pointer dereference
[mucurses] Ensure SLK labels are always terminated
[coverity] Add Coverity user model
[malloc] Track maximum heap usage
[travis] Add minimal .travis.yml file
[travis] Build and run the unit test suite
[travis] Integrate with Coverity Scan
[rtl818x] Fix resource leak on error path
[pcnet32] Eliminate redundant register read
[iobuf] Increase minimum I/O buffer size to 128 bytes
[vxge] Fix use of stale I/O buffer on error path
[scsi] Avoid duplicate call to scsicmd_close() on TEST UNIT READY failure
[block] Add dummy SAN device
[block] Add basic multipath support
[int13] Improve geometry guessing for unaligned partitions
[int13con] Avoid overwriting random portions of SAN boot disks
[time] Add sleep_fixed() function to sleep without checking for Ctrl-C
[block] Allow SAN retry count to be reconfigured
[block] Add a small delay between attempts to reopen SAN targets
[block] Retry reopening indefinitely for multipath devices
[block] Gracefully close SAN device if registration fails
[linux] Use dummy SAN device
[block] Ignore redundant xfer_window_changed() messages
[block] Describe all SAN devices via ACPI tables
[iscsi] Do not install iBFT when no iSCSI targets exist
[http] Notify data transfer interface when underlying connection is ready
[mucurses] Fix erroneous __nonnull attribute
[build] Avoid implicit-fallthrough warnings on GCC 7
[linux] Fix building with kernel 4.11 headers
[scsi] Retry TEST UNIT READY command
[libc] Add stdbool.h standard header
[efi] Fix typo in efi_acpi_table_protocol_guid
[efi] Add efi_sprintf() and efi_vsprintf()
[block] Allow use of a non-default EFI SAN boot filename
[intel] Show original CTRL and STATUS values in debugging output
[intel] Do not enable ASDE on i350 backplane NIC
[block] Provide sandev_read() and sandev_write() as global symbols
[block] Provide abstraction to allow system to be quiesced
[hyperv] Do not fail if guest OS ID MSR is already set
[hyperv] Remove redundant return status code from mapping functions
[hyperv] Cope with Windows Server 2016 enlightenments
[efi] Standardise PCI debug messages
[iscsi] Always send FirstBurstLength parameter
[iscsi] Fix iBFT when no explicit initiator name setting exists
[xen] Provide 18 4kB receive buffers to work around xen-netback bug
[efi] Prevent EFI code from being linked in to non-EFI builds
[tls] Keep cipherstream window open until TLS negotiation is complete
[settings] Extend numerical setting tags to 64 bits
[acpi] Make acpi_find_rsdt() a per-platform method
[efi] Provide access to ACPI tables
[acpi] Expose ACPI tables via settings mechanism
[syslog] Handle backspace characters
[hdprefix] Avoid attempts to read beyond the end of the disk
[usb] Allow for USB network devices with no interrupt endpoint
[build] Use -no-pie on newer versions of gcc
[ecm] Display invalid MAC address strings in debug messages
[cpuid] Allow input %ecx value to be specified
[crypto] Expose RSA_CTX_SIZE constant
[crypto] Expose asn1_grow()
[crypto] Provide asn1_built() to construct a cursor from a builder
[crypto] Expose pem_asn1() for use with non-image data
[exanic] Add driver for Exablaze ExaNIC cards
[usb] Use non-zero language ID to retrieve strings
[mucurses] Avoid potential division by zero
[tls] Support RFC5746 secure renegotiation
[smscusb] Abstract out common SMSC USB device functionality
[smsc95xx] Use common SMSC USB device functionality
[smsc75xx] Use common SMSC USB device functionality
[smscusb] Add ability to read MAC address from OTP
[smscusb] Move non-inline register access functions to smscusb.c
[smscusb] Allow for alternative PHY register layouts
[smsc75xx] Expose functionality shared with LAN78xx devices
[lan78xx] Add driver for Microchip LAN78xx USB Ethernet NICs
Mika Tiainen (1):
[intel] Add INTEL_NO_PHY_RST for I219-V
Mike McCormack (1):
[sky2] Use 32-bit read to read Y2_VAUX_AVAIL
Raed Salem (2):
[golan] Update Connect-IB, ConnectX-4 and ConnectX-4 Lx (Infiniband) support
[golan] Bug fixes and improved paging allocation method
Vishvananda Ishaya (1):
[intel] Reset all virtual function settings
Vishvananda Ishaya Abrams (1):
[iscsi] Don't close when receiving NOP-In
target/ppc: fix CPU hotplug when radix is enabled (TCG)
But when a guest initializes radix mode, it issues a H_REGISTER_PROC_TBL
to update the LPCR of all CPUs. Hot-plugged CPUs inherit from the same
setting under KVM but not under TCG. So, let's check for radix and update
the default LPCR to keep new CPUs in sync.
David Gibson [Wed, 12 Jul 2017 07:56:55 +0000 (17:56 +1000)]
pseries: Allow HPT resizing with KVM
So far, qemu implements the PAPR Hash Page Table (HPT) resizing extension
with TCG. The same implementation will work with KVM PR, but we don't
currently allow that. For KVM HV we can only implement resizing with the
assistance of the host kernel, which needs a new capability and ioctl()s.
This patch adds support for testing the new KVM capability and implementing
the resize in terms of KVM facilities when necessary. If we're running on
a kernel which doesn't have the new capability flag at all, we fall back to
testing for PR vs. HV KVM using the same hack that we already use in a
number of places for older kernels.
David Gibson [Wed, 12 Jul 2017 07:56:06 +0000 (17:56 +1000)]
pseries: Use smaller default hash page tables when guest can resize
We've now implemented a PAPR extension allowing PAPR guests to resize
their hash page table (HPT) during runtime.
This patch makes use of that facility to allocate smaller HPTs by default.
Specifically when a guest is aware of the HPT resize facility, qemu sizes
the HPT to the initial memory size, rather than the maximum memory size on
the assumption that the guest will resize its HPT if necessary for hot
plugged memory.
When the initial memory size is much smaller than the maximum memory size
(a common configuration with e.g. oVirt / RHEV) then this can save
significant memory on the HPT.
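As a rough worked example of the saving, the sketch below computes an HPT
size from a RAM size. It assumes the HPT is sized at about 1/128 of RAM,
rounded up to a power of two, which is consistent with the "~2GiB HPT for a
256GiB guest" figure quoted in the "Implement HPT resizing" message. The
function names and the minimum and maximum shift bounds are assumptions for
illustration, not taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define HPT_RATIO_SHIFT 7   /* HPT is roughly RAM / 128 */
#define HPT_SHIFT_MIN   18  /* assumed smallest HPT: 256 KiB */
#define HPT_SHIFT_MAX   46  /* assumed largest HPT */

/* Smallest n such that 2^n >= value, capped at 63 (value must be non-zero). */
static int ceil_log2(uint64_t value)
{
    int n = 0;

    assert(value != 0);
    while (n < 63 && (UINT64_C(1) << n) < value) {
        n++;
    }
    return n;
}

/* Return log2 of the HPT size in bytes for a given RAM size in bytes. */
static int hpt_shift_for_ramsize(uint64_t ramsize)
{
    int shift = ceil_log2(ramsize) - HPT_RATIO_SHIFT;

    if (shift < HPT_SHIFT_MIN) {
        shift = HPT_SHIFT_MIN;
    }
    if (shift > HPT_SHIFT_MAX) {
        shift = HPT_SHIFT_MAX;
    }
    return shift;
}
```

Under these assumptions a guest started with 4GiB but allowed to grow to
256GiB would get a shift of 25 (32MiB) when sized for initial memory,
instead of 31 (2GiB) when sized for maximum memory.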
If the guest does *not* advertise HPT resize awareness when it makes the
ibm,client-architecture-support call, qemu resizes the HPT for maximum
memory size (unless it's been configured not to allow such guests at all).
For now we make that reallocation assuming the guest has not yet used the
HPT at all. That's true in practice, but not, strictly, an architectural
or PAPR requirement. If we need to in future we can fix this by having
the client-architecture-support call reboot the guest with the revised
HPT size (the client-architecture-support call is explicitly permitted to
trigger a reboot in this way).
David Gibson [Wed, 12 Jul 2017 07:53:17 +0000 (17:53 +1000)]
pseries: Enable HPT resizing for 2.10
We've now implemented a PAPR extension which allows PAPR guests (i.e.
"pseries" machine type) to resize their hash page table during runtime.
However, that extension is only enabled if explicitly chosen on the
command line. This patch enables it by default for spapr-2.10, but leaves
it disabled (by default) for older machine types.
David Gibson [Fri, 12 May 2017 05:46:49 +0000 (15:46 +1000)]
pseries: Implement HPT resizing
This patch implements hypercalls allowing a PAPR guest to resize its own
hash page table. This will eventually allow for more flexible memory
hotplug.
The implementation is partially asynchronous, handled in a special thread
running the hpt_prepare_thread() function. The state of a pending resize
is stored in SPAPR_MACHINE->pending_hpt.
The H_RESIZE_HPT_PREPARE hypercall will kick off creation of a new HPT, or,
if one is already in progress, monitor it for completion. If there is an
existing HPT resize in progress that doesn't match the size specified in
the call, it will cancel it, replacing it with a new one matching the
given size.
The H_RESIZE_HPT_COMMIT completes transition to a resized HPT, and can only
be called successfully once H_RESIZE_HPT_PREPARE has successfully
completed initialization of a new HPT. The guest must ensure that there
are no concurrent accesses to the existing HPT while this is called (this
effectively means stop_machine() for Linux guests).
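The prepare/commit protocol described above can be sketched as a small
state machine. This is a simplified, single-threaded model: the type names,
return codes, the worker_done() stand-in for the preparation thread and the
shift argument to commit are all hypothetical, not the hypercall ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes for this sketch. */
enum resize_rc { RC_SUCCESS, RC_BUSY, RC_NOT_PREPARED };

struct pending_hpt {
    bool active;    /* a resize has been requested */
    bool complete;  /* the new HPT has finished initializing */
    int shift;      /* log2 of the requested HPT size */
};

struct machine {
    int hpt_shift;              /* size of the HPT currently in use */
    struct pending_hpt pending; /* at most one pending resize */
};

/* Start a resize, or report progress of a matching one already running. */
static enum resize_rc resize_prepare(struct machine *m, int shift)
{
    assert(m != NULL);
    if (m->pending.active && m->pending.shift != shift) {
        m->pending.active = false;  /* cancel the mismatched resize */
    }
    if (!m->pending.active) {
        m->pending.active = true;
        m->pending.complete = false;
        m->pending.shift = shift;
        return RC_BUSY;             /* creation has just been kicked off */
    }
    return m->pending.complete ? RC_SUCCESS : RC_BUSY;
}

/* Stand-in for the preparation thread finishing its work. */
static void worker_done(struct machine *m)
{
    assert(m != NULL);
    if (m->pending.active) {
        m->pending.complete = true;
    }
}

/* Switch to the new HPT; only valid after a completed, matching prepare. */
static enum resize_rc resize_commit(struct machine *m, int shift)
{
    assert(m != NULL);
    if (!m->pending.active || !m->pending.complete ||
        m->pending.shift != shift) {
        return RC_NOT_PREPARED;
    }
    /* A real implementation would rehash every old HPTE here. */
    m->hpt_shift = shift;
    m->pending.active = false;
    return RC_SUCCESS;
}
```

In this model a guest polls resize_prepare() until it stops returning
RC_BUSY, then quiesces HPT accesses and calls resize_commit().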
For now H_RESIZE_HPT_COMMIT goes through the whole old HPT, rehashing each
HPTE into the new HPT. This can have quite high latency, but it seems to
be of the order of typical migration downtime latencies for HPTs of size
up to ~2GiB (which would be used in a 256GiB guest).
In future we probably want to move more of the rehashing to the "prepare"
phase, by having H_ENTER and other hcalls update both current and
pending HPTs. That's a project for another day, but should be possible
without any changes to the guest interface.
David Gibson [Fri, 12 May 2017 05:46:11 +0000 (15:46 +1000)]
pseries: Stubs for HPT resizing
This introduces stub implementations of the H_RESIZE_HPT_PREPARE and
H_RESIZE_HPT_COMMIT hypercalls which we hope to add in a PAPR
extension to allow run time resizing of a guest's hash page table. It
also adds a new machine property for controlling whether this new
facility is available.
For now we only allow resizing with TCG; allowing it with KVM will require
kernel changes as well.
Finally, it adds a new string to the hypertas property in the device
tree, advertising to the guest the availability of the HPT resizing
hypercalls. This is a tentative suggested value, and would need to be
standardized by PAPR before being merged.
Greg Kurz [Wed, 12 Jul 2017 09:48:39 +0000 (11:48 +0200)]
spapr: fix potential memory leak in spapr_core_plug()
Since commit 5c1da81215c7 ("spapr: Remove unnecessary differences between
hotplug and coldplug paths"), the CPU DT for the DRC is always allocated.
This causes a memory leak for pseries-2.6 and older machine types, which
don't support CPU hotplug and don't allocate DRCs for CPUs.
David Gibson [Thu, 8 Jun 2017 13:55:03 +0000 (23:55 +1000)]
spapr: Implement DR-indicator for physical DRCs only
According to PAPR, the DR-indicator should only be valid for physical DRCs,
not logical DRCs. At the moment we implement it for all DRCs, so restrict
it to physical ones only.
We move the state to the physical DRC subclass, which means adding some
QOM boilerplate to handle the newly distinct type.
Most of the time, the state of a DRC object is contained in the single
'state' variable. However, the transition from UNISOLATE to
CONFIGURED state requires multiple calls to the ibm,configure-connector
RTAS call to retrieve the device tree for the attached device. We need
some extra state to keep track of where we're up to in delivering the
device tree information to the guest.
Currently that extra state is in a sPAPRConfigureConnectorState
substructure which is only allocated when we're in the middle of the
configure connector process. That sounds like a good idea, but the extra
state is only two integers - on many platforms that will take up the same
room as the (maybe NULL) ccs pointer even before malloc() overhead. Plus
it's another object whose lifetime we need to manage. In short, it's not
worth it.
So, fold the sPAPRConfigureConnectorState substructure directly into the
DRC object.
Previously the structure was allocated lazily when the configure-connector
call discovers it's not there. Now, we need to initialize the subfields
pre-emptively, as soon as we enter UNISOLATE state.
Although it's not strictly necessary (the field values should only ever
be consulted when in UNISOLATE state), we try to keep them at -1 when in
other states, as a debugging aid.
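A minimal sketch of the folded layout follows. The two integers are modelled
as an offset and a depth into the device tree; those field names, the
two-value state enum and the drc_set_state() helper are assumptions for
illustration, not the names used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical state set: UNISOLATE versus everything else. */
enum drc_state { DRC_OTHER, DRC_UNISOLATE };

struct drc {
    enum drc_state state;
    int fdt_start_offset; /* where the attached device's tree begins */
    /* Formerly a separately allocated substructure, now held inline. */
    int ccs_offset;       /* progress through the tree, -1 when unused */
    int ccs_depth;        /* current nesting depth, -1 when unused */
};

/* Change state and keep the configure-connector fields consistent. */
static void drc_set_state(struct drc *drc, enum drc_state next)
{
    assert(drc != NULL);
    drc->state = next;
    if (next == DRC_UNISOLATE) {
        /* Initialize pre-emptively instead of lazily on first use. */
        drc->ccs_offset = drc->fdt_start_offset;
        drc->ccs_depth = 0;
    } else {
        /* Not strictly required; -1 makes stale use easy to spot. */
        drc->ccs_offset = -1;
        drc->ccs_depth = -1;
    }
}
```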
David Gibson [Tue, 20 Jun 2017 13:57:48 +0000 (21:57 +0800)]
spapr: Consolidate DRC state variables
Each DRC has three fields describing its state: isolation_state,
allocation_state and configured. At first this seems like a reasonable
representation, since it's based directly on the PAPR defined
isolation-state and allocation-state indicators. However:
* Only a few combinations of the two fields' values are permitted
* allocation_state isn't used at all for physical DRCs
* The indicators are write only so they don't really have a well
defined current value independent of each other
This replaces these variables with a single state variable, whose names
and numbers are based on the diagram in LoPAPR section 13.4. Along with
this we add code to check the current state on various operations and make
sure the requested transition is permitted.
Strictly speaking, this makes guest visible changes to behaviour (since we
probably allowed some transitions we shouldn't have before). However, a
hypothetical guest broken by that wasn't PAPR compliant, and probably
wouldn't have worked under PowerVM.
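The idea of a single state variable with checked transitions can be
sketched as a lookup table. The four states and the set of permitted
transitions below are illustrative assumptions only; they are not copied
from the LoPAPR diagram or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative states; names and numbering are assumptions. */
enum drc_state {
    DRC_UNUSABLE,
    DRC_AVAILABLE,
    DRC_UNISOLATE,
    DRC_CONFIGURED,
    DRC_STATE_COUNT
};

/* permitted[from][to] is true when the transition is allowed (assumed set). */
static const bool permitted[DRC_STATE_COUNT][DRC_STATE_COUNT] = {
    [DRC_UNUSABLE][DRC_AVAILABLE]   = true,
    [DRC_AVAILABLE][DRC_UNUSABLE]   = true,
    [DRC_AVAILABLE][DRC_UNISOLATE]  = true,
    [DRC_UNISOLATE][DRC_AVAILABLE]  = true,
    [DRC_UNISOLATE][DRC_CONFIGURED] = true,
    [DRC_CONFIGURED][DRC_AVAILABLE] = true,
};

/* Apply a requested transition; refuse and leave state alone if invalid. */
static bool drc_transition(enum drc_state *state, enum drc_state next)
{
    assert(state != NULL);
    assert(*state < DRC_STATE_COUNT && next < DRC_STATE_COUNT);
    if (!permitted[*state][next]) {
        return false;
    }
    *state = next;
    return true;
}
```

With one variable, an invalid combination of indicator values cannot be
represented at all, and every rejected request goes through one check.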
David Gibson [Tue, 20 Jun 2017 13:02:41 +0000 (21:02 +0800)]
spapr: Cleanups relating to DRC awaiting_release field
'awaiting_release' indicates that the host has requested an unplug of the
device attached to the DRC, but the guest has not (yet) put the device
into a state where it is safe to complete removal.
1. Rename it to 'unplug_requested' which to me at least is clearer
2. Remove the ->release_pending() method used to check this from outside
spapr_drc.c. The method only plausibly has one implementation, so use
a plain function (spapr_drc_unplug_requested()) instead.
3. Remove it from the migration stream. Attempting to migrate mid-unplug
is broken not just for spapr - in general management has no good way to
determine if the device should be present on the destination or not. So,
until that's fixed, there's no point adding extra things to the stream.
David Gibson [Tue, 4 Jul 2017 11:07:14 +0000 (21:07 +1000)]
spapr: Refactor spapr_drc_detach()
This function has two unused parameters - remove them.
It also sets awaiting_release on all paths, except one. On that path
setting it is harmless, since it will be immediately cleared by
spapr_drc_release(). So factor it out of the if statements.
David Gibson [Thu, 13 Jul 2017 00:52:39 +0000 (10:52 +1000)]
spapr: Abort on delete failure in spapr_drc_release()
We currently ignore errors from the object_property_del() in
spapr_drc_release(). But the only way that could fail is if the property
doesn't exist, in which case it's a bug that we're in spapr_drc_release()
at all. So change from ignoring to abort()ing on errors.
David Gibson [Thu, 13 Jul 2017 00:45:35 +0000 (10:45 +1000)]
spapr: Simplify unplug path
spapr_lmb_release() and spapr_core_release() call hotplug_handler_unplug()
which after a bunch of indirection calls spapr_memory_unplug() or
spapr_core_unplug(). But we already know which is the appropriate thing
to call here, so we can just fold it directly into the release function.
Once that's done, there's no need for an hc->unplug method in the spapr
machine at all: since we also have an hc->unplug_request method, the
hotplug core will never use ->unplug.
David Gibson [Mon, 3 Jul 2017 10:20:53 +0000 (20:20 +1000)]
spapr: Remove 'awaiting_allocation' DRC flag
The awaiting_allocation flag in the DRC was introduced by aab9913
"spapr_drc: Prevent detach racing against attach for CPU DR", allegedly to
prevent a guest crash on racing attach and detach. Except... information
from the BZ actually suggests a qemu crash, not a guest crash. And there
shouldn't be a problem here anyway: if the guest has already moved the DRC
away from UNUSABLE state, the detach would already be deferred, and if it
hadn't it should be safe to detach it (the guest should fail gracefully
when it attempts to change the allocation state).
I think this was probably just a bandaid for some other problem in the
state management. So, remove awaiting_allocation and associated code.
Laurent Vivier [Fri, 9 Jun 2017 11:08:10 +0000 (13:08 +0200)]
spapr: Treat devices added before inbound migration as coldplugged
When migrating a guest which has already had devices hotplugged,
libvirt typically starts the destination qemu with -incoming defer,
adds those hotplugged devices with qmp, then initiates the incoming
migration.
This causes problems for the management of spapr DRC state. Because
the device is treated as hotplugged, it goes into a DRC state for a
device immediately after it's plugged, but before the guest has
acknowledged its presence. However, chances are the guest on the
source machine *has* acknowledged the device's presence and configured
it.
If the source has fully configured the device, then DRC state won't be
sent in the migration stream: for maximum migration compatibility with
earlier versions we don't migrate DRCs in coldplug-equivalent state.
That means that the DRC effectively changes state over the migrate,
causing problems later on.
In addition, logging hotplug events for these devices isn't what we
want because a) those events should already have been issued on the
source host and b) the event queue should get wiped out by the
incoming state anyway.
In short, what we really want is to treat devices added before an
incoming migration as if they were coldplugged.
To do this, we first add a spapr_drc_hotplugged() helper which
determines if the device is hotplugged in the sense relevant for DRC
state management. We only send hotplug events when this is true.
Second, when we add a device which isn't hotplugged in this sense, we
force a reset of the DRC state - this ensures the DRC is in a
coldplug-equivalent state (there isn't usually a system reset between
these device adds and the incoming migration).
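The two steps above can be sketched as follows. The predicate shown (the
device's hotplugged flag is set and no incoming migration is pending) is an
assumption about what "hotplugged in the sense relevant for DRC state
management" means; the struct layouts, counters and function names are
hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct device {
    bool hotplugged;                 /* added after machine creation */
};

struct machine {
    bool incoming_migration_pending; /* e.g. started with -incoming defer */
    int hotplug_events_sent;         /* stand-in for the guest event queue */
    int drc_resets;                  /* stand-in for forcing a DRC reset */
};

/* True only if the guest really needs to be told about a new device. */
static bool drc_hotplugged(const struct machine *m, const struct device *dev)
{
    return dev->hotplugged && !m->incoming_migration_pending;
}

/* Attach a device, choosing between the hotplug and coldplug paths. */
static void drc_attach(struct machine *m, const struct device *dev)
{
    assert(m != NULL && dev != NULL);
    if (drc_hotplugged(m, dev)) {
        m->hotplug_events_sent++;    /* notify the guest */
    } else {
        m->drc_resets++;             /* force coldplug-equivalent state */
    }
}
```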
This is based on an earlier patch by Laurent Vivier, cleaned up and
extended.
David Gibson [Wed, 12 Jul 2017 01:55:53 +0000 (11:55 +1000)]
spapr: Minor cleanups to events handling
The rtas_error_log structure is marked packed, which strongly suggests its
precise layout is important to match an external interface. Along with
that one could expect it to have a fixed endianness to match the same
interface. That used to be the case - matching the layout of PAPR RTAS
event format and requiring BE fields.
Now, however, it's only used embedded within sPAPREventLogEntry with the
fields in native order, since they're processed internally.
Clear that up by removing the nested structure in sPAPREventLogEntry.
struct rtas_error_log is moved back to spapr_events.c where it is used as
a temporary to help convert the fields in sPAPREventLogEntry to the correct
in memory format when delivering an event to the guest.
In racing situations between hotplug events and migration operation,
a rtas hotplug event might not yet have been delivered to the source
guest when migration is started. In this case the pending_events of
spapr state need to be transmitted to the target so that the hotplug
event can be finished on the target.
To achieve the minimal VMSD possible to migrate the pending_events list,
this patch makes the changes in spapr_events.c:
- 'log_type' of the sPAPREventLogEntry struct was deleted. This information can be
derived by inspecting the rtas_error_log summary field. A new function
called 'spapr_event_log_entry_type' was added to retrieve the type of
a given sPAPREventLogEntry.
- sPAPREventLogEntry, epow_log_full and hp_log_full were redesigned. The
only data we're going to migrate in the VMSD is the event log data itself,
which can be divided into two parts: a rtas_error_log header and an extended
event log field. The rtas_error_log header contains information about the
size of the extended log field, which can be used inside VMSD as the size
parameter of the VBUFFER_ALLOC field that will store it. To allow this use,
the header.extended_length field must be exposed inline to the VMSD instead
of embedded into a 'data' field that holds everything. With this in mind,
the following changes were done:
* a new 'header' field was added to sPAPREventLogEntry. This field holds
a struct rtas_error_log inline.
* the declaration of the 'rtas_error_log' struct was moved to spapr.h
to be visible to the VMSD macros.
* 'data' field of sPAPREventLogEntry was renamed to 'extended_log' and
now holds only the contents of the extended event log.
* 'struct rtas_error_log hdr' was removed from both epow_log_full
and hp_log_full. This information is now available at the header field of
sPAPREventLogEntry.
* epow_log_full and hp_log_full were renamed to epow_extended_log and
hp_extended_log respectively. This rename makes it clearer to understand
the new purpose of both structures: hold the information of an extended
event log field.
* spapr_powerdown_req and spapr_hotplug_req_event now create a
sPAPREventLogEntry structure that contains the full rtas log entry.
* rtas_event_log_queue and rtas_event_log_dequeue now receive a
sPAPREventLogEntry pointer as a parameter instead of a void pointer.
- the endianness of the sPAPREventLogEntry header is now native instead
of be32. We can use the fields in native endianness internally and write
them in be32 in the guest physical memory inside 'check_exception'. This
allows the VMSD inside spapr.c to read the correct size of the
extended_log field.
- inside spapr.c, pending_events is put in a subsection in the spapr state
VMSD to make sure migration across different versions is not broken.
A small change in rtas_event_log_queue and rtas_event_log_dequeue was also
made: instead of calling qdev_get_machine(), both functions now receive
a pointer to the sPAPRMachineState. This pointer is already available in
the callers of these functions and we don't need to waste resources
calling qdev_get_machine() again.
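The native-versus-be32 split can be sketched like this: the header is kept
in host byte order so its length field can be read directly, and is only
converted to big-endian when copied into guest memory. The two-field header
layout and the function names are simplified assumptions, not the real
rtas_error_log structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified header kept in native byte order (hypothetical layout). */
struct log_header {
    uint32_t summary;
    uint32_t extended_length; /* readable directly as a buffer size */
};

/* Store a 32-bit value as big-endian bytes, whatever the host order. */
static void store_be32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value >> 24);
    dst[1] = (uint8_t)(value >> 16);
    dst[2] = (uint8_t)(value >> 8);
    dst[3] = (uint8_t)value;
}

/* Serialize the header into a guest buffer; returns bytes written. */
static size_t write_header_be(uint8_t *guest, size_t len,
                              const struct log_header *hdr)
{
    assert(guest != NULL && hdr != NULL);
    assert(len >= 8);
    store_be32(guest, hdr->summary);
    store_be32(guest + 4, hdr->extended_length);
    return 8;
}
```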
David Gibson [Mon, 17 Jul 2017 04:51:09 +0000 (14:51 +1000)]
spapr: Remove unnecessary instance_size specifications from DRC subtypes
All the DRC subtypes explicitly list instance_size in TypeInfo (all as
sizeof(sPAPRDRConnector)). This isn't necessary, since if it's not listed
it will be derived from the parent type.
Worse, this is dangerous, because if a subtype is changed in future to
have a larger structure, then subtypes of that subtype also need to have
instance_size changed, or it will lead to hard-to-track memory corruption
bugs.
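The inheritance rule relied on here can be modelled as below: a size of zero
means "use the nearest ancestor's size". This is a standalone sketch, not
QOM itself; the struct, the lookup function and the example type names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of a type description (not the real TypeInfo). */
struct type_info {
    const char *name;
    const struct type_info *parent;
    size_t instance_size; /* 0 means inherit from the parent */
};

/* Resolve the effective instance size by walking up the parent chain. */
static size_t type_instance_size(const struct type_info *type)
{
    assert(type != NULL);
    for (; type != NULL; type = type->parent) {
        if (type->instance_size != 0) {
            return type->instance_size;
        }
    }
    return 0;
}

/* Example hierarchy: the middle type grows, the leaf says nothing. */
struct base_drc { int state; };
struct physical_drc { struct base_drc parent; int dr_indicator; };

static const struct type_info base_type = {
    "drc", NULL, sizeof(struct base_drc)
};
static const struct type_info physical_type = {
    "drc-physical", &base_type, sizeof(struct physical_drc)
};
static const struct type_info leaf_type = {
    "drc-leaf", &physical_type, 0
};
```

Because leaf_type leaves the size unset, it automatically picks up the
larger physical_drc size. Had it hard-coded sizeof(struct base_drc),
instances would be allocated too small once the parent grew.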