Git Repo - qemu.git/log

s390x/pci: pass the retaddr to all PCI instructions

Once we wire up TCG, we will need the retaddr to correctly inject
program interrupts. As we want to get rid of the function
program_interrupt(), convert PCI code too.

For KVM, we can simply use RA_IGNORED.

Convert program_interrupt() to s390_program_interrupt() directly, making
use of the passed address.

Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20171130162744 [email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x/ioinst: pass the retaddr to all IO instructions

TCG needs the retaddr when injecting an interrupt. Let's just pass it
along and use RA_IGNORED for KVM. The value will be completely ignored for
KVM.

Convert program_interrupt() to s390_program_interrupt() directly, making
use of the passed address.

Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20171130162744 [email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x/tcg: rip out dead tpi code

It is broken and not even wired up. We'll add a new handler soon, but
that will live somewhere else.

Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20171130162744 [email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x/tcg: get rid of runtime_exception()

Let's use s390_program_interrupt() instead.

Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20171130162744 [email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x/tcg: introduce and use s390_program_interrupt()

Allows to easily convert more callers of program_interrupt() and to
easily introduce new exceptions without forgetting about the cpu state
reset.

Use s390_program_interrupt() in places where we already had the same
pattern. We will later get rid of program_interrupt().

RA != 0 checks are already done behind the scenes.

Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <20171130162744 [email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

target/s390x: nuke DPRINTF in helper.c

It is not used anywhere.

Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x: introduce 2.12 compat machine

Acked-by: Christian Borntraeger <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

pc-bios/s390-ccw.img: update image

Contains the following commit:
- pc-bios/s390-ccw: zero out bss section

Signed-off-by: Cornelia Huck <[email protected]>

pc-bios/s390-ccw: zero out bss section

The QEMU ELF loader does not zero the bss segment.
This resulted in several bugs, e.g. see

commit 5d739a4787a5 (s390-ccw.img: Fix sporadic errors with ccw boot image - initialize css)
commit 6a40fa2669d3 (s390-ccw.img: Initialize next_idx)
commit 8775d91a0f42 (pc-bios/s390-ccw: Fix problem with invalid virtio-scsi LUN when rebooting)

Let's fix this once and forever by letting the BIOS zero the bss itself.

Suggested-by: Alexander Graf <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Message-Id: <20171122142627 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x/migration: use zero flag parameter

valgrind pointed out that we call KVM_S390_GET_IRQ_STATE with an
undefined value for flags. Kernels prior to 4.15 did not use that
field, and later kernels ignore it for compatibility reasons, but we
better play safe.

The same is true for SET_IRQ_STATE. We should make sure to not use the
flag field, either.

Signed-off-by: Christian Borntraeger <[email protected]>
Message-Id: <20171122142627 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20171214' into staging

HMP pull 2017-12-14

# gpg: Signature made Thu 14 Dec 2017 12:46:41 GMT
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-hmp-20171214:
  tests: test-hmp: print command execution result
  hmp-commands: Remove the deprecated usb_add and usb_del

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20171213' into staging

target-arm queue:
* xilinx_spips: set reset values correctly
* MAINTAINERS: fix an email address
* hw/display/tc6393xb: limit irq handler index to TC6393XB_GPIOS
* nvic: Make systick banked for v8M
* refactor get_phys_addr() so we can return the right format PAR
   for ATS operations
* implement v8M TT instruction
* fix some minor v8M bugs
* Implement reset for GICv3 ITS
* xlnx-zcu102: Add support for the ZynqMP QSPI

# gpg: Signature made Wed 13 Dec 2017 18:01:31 GMT
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <[email protected]>"
# gpg:                 aka "Peter Maydell <[email protected]>"
# gpg:                 aka "Peter Maydell <[email protected]>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20171213: (43 commits)
  xilinx_spips: Use memset instead of a for loop to zero registers
  xilinx_spips: Set all of the reset values
  xilinx_spips: Update the QSPI Mod ID reset value
  MAINTAINERS: replace the unavailable email address
  hw/display/tc6393xb: limit irq handler index to TC6393XB_GPIOS
  nvic: Make systick banked
  nvic: Make nvic_sysreg_ns_ops work with any MemoryRegion
  target/arm: Extend PAR format determination
  target/arm: Remove fsr argument from get_phys_addr() and arm_tlb_fill()
  target/arm: Ignore fsr from get_phys_addr() in do_ats_write()
  target/arm: Use ARMMMUFaultInfo in deliver_fault()
  target/arm: Convert get_phys_addr_pmsav8() to not return FSC values
  target/arm: Convert get_phys_addr_pmsav7() to not return FSC values
  target/arm: Convert get_phys_addr_pmsav5() to not return FSC values
  target/arm: Convert get_phys_addr_lpae() to not return FSC values
  target/arm: Convert get_phys_addr_v6() to not return FSC values
  target/arm: Convert get_phys_addr_v5() to not return FSC values
  target/arm: Remove fsr argument from arm_ld*_ptw()
  target/arm: Provide fault type enum and FSR conversion functions
  target/arm: Implement TT instruction
  ...

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20171213.0' into staging

VFIO updates for v2.12

- Fix bug failing to register all but the first group attached to
   a container with kvm-vfio device (Alex Williamson)

- Explicit QLIST init (Yi Lui)

- SPAPR IOMMU v1 fallback (Alexey Kardashevskiy)

- Remove unused structure fields (Alexey Kardashevskiy)

# gpg: Signature made Wed 13 Dec 2017 18:03:48 GMT
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"
# gpg:                 aka "Alex Williamson <[email protected]>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-update-20171213.0:
  vfio-pci: Remove unused fields from VFIOMSIXInfo
  vfio/spapr: Allow fallback to SPAPR TCE IOMMU v1
  vfio/common: init giommu_list and hostwin_list of vfio container
  vfio: Fix vfio-kvm group registration

Signed-off-by: Peter Maydell <[email protected]>

tests: test-hmp: print command execution result

Provide HMP monitor command execution result as it would be seen
by user who established an HMP monitor session.

Currently many commands may silently fail without any sign of that.
This patch let this info to be printed once test is running in
verbose mode.

For the future it might be useful to fail the test if command has
failed, however it would require a bit of rework inside test
engine itself.

A simple example of silent failure without reporting it would to
add some non-existent HMP command into 'hmp_cmds' list. In this case
test will report it successfully passed without error.

Signed-off-by: Vadim Galitsyn <[email protected]>
Cc: Dr. David Alan Gilbert <[email protected]>
Cc: [email protected]
Message-Id: <20171023151310 [email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

hmp-commands: Remove the deprecated usb_add and usb_del

It's easy to use device_add and device_del as replacement instead.
The usb_add and usb_del commands are deprecated since QEMU 2.10,
and nobody complained that they are still needed, so let's get rid
of them now to make the HMP interface a little bit less overloaded.

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>
Message-Id: <1512073140 [email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

xilinx_spips: Use memset instead of a for loop to zero registers

Use memset() instead of a for loop to zero all of the registers.

Signed-off-by: Alistair Francis <[email protected]>
Reviewed-by: KONRAD Frederic <[email protected]>
Reviewed-by: Francisco Iglesias <[email protected]>
Message-id: c076e907f355923864cb1afde31b938ffb677778.1513104804 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Set all of the reset values

Following the ZynqMP register spec let's ensure that all reset values
are set.

Signed-off-by: Alistair Francis <[email protected]>
Reviewed-by: Francisco Iglesias <[email protected]>
Message-id: 19836f3e0a298b13343c5a59c87425355e7fd8bd.1513104804 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Update the QSPI Mod ID reset value

Update the reset value to match the latest ZynqMP register spec.

Signed-off-by: Alistair Francis <[email protected]>
Reviewed-by: KONRAD Frederic <[email protected]>
Reviewed-by: Francisco Iglesias <[email protected]>
Message-id: c03e51d041db7f055596084891aeb1e856e32b9f.1513104804 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

MAINTAINERS: replace the unavailable email address

Since I'm not working as an assignee in Linaro, replace the Linaro email
address with my personal one.

Signed-off-by: Zhaoshenglong <[email protected]>
Message-id: 1513058845 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/display/tc6393xb: limit irq handler index to TC6393XB_GPIOS

The ctz32() routine could return a value greater than
TC6393XB_GPIOS=16, because the device has 24 GPIO level
bits but we only implement 16 outgoing lines. This could
lead to an OOB array access. Mask 'level' to avoid it.

Reported-by: Moguofang <[email protected]>
Signed-off-by: Prasad J Pandit <[email protected]>
Message-id: 20171212041539 [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

nvic: Make systick banked

For the v8M security extension, there should be two systick
devices, which use separate banked systick exceptions. The
register interface is banked in the same way as for other
banked registers, including the existence of an NS alias
region for secure code to access the nonsecure timer.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 1512154296 [email protected]

nvic: Make nvic_sysreg_ns_ops work with any MemoryRegion

Generalize nvic_sysreg_ns_ops so that we can pass it an
arbitrary MemoryRegion which it will use as the underlying
register implementation to apply the NS-alias behaviour
to. We'll want this so we can do the same with systick.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 1512154296 [email protected]

target/arm: Extend PAR format determination

Now that do_ats_write() is entirely in control of whether to
generate a 32-bit PAR or a 64-bit PAR, we can make it use the
correct (complicated) condition for doing so.

Signed-off-by: Edgar E. Iglesias <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1512503192 [email protected]
[PMM: Rebased Edgar's patch on top of get_phys_addr() refactoring;
use arm_s1_regime_using_lpae_format() rather than
regime_using_lpae_format() because the latter will assert
if passed ARMMMUIdx_S12NSE0 or ARMMMUIdx_S12NSE1;
updated commit message appropriately]
Signed-off-by: Peter Maydell <[email protected]>

target/arm: Remove fsr argument from get_phys_addr() and arm_tlb_fill()

All of the callers of get_phys_addr() and arm_tlb_fill() now ignore
the FSR values they return, so we can just remove the argument
entirely.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Ignore fsr from get_phys_addr() in do_ats_write()

In do_ats_write(), rather than using the FSR value from get_phys_addr(),
construct the PAR values using the information in the ARMMMUFaultInfo
struct. This allows us to create a PAR of the correct format regardless
of what the translation table format is.

For the moment we leave the condition for "when should this be a
64 bit PAR" as it was previously; this will need to be fixed to
properly support AArch32 Hyp mode.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Use ARMMMUFaultInfo in deliver_fault()

Now that ARMMMUFaultInfo is guaranteed to have enough information
to construct a fault status code, we can pass it in to the
deliver_fault() function and let it generate the correct type
of FSR for the destination, rather than relying on the value
provided by get_phys_addr().

I don't think there are any cases the old code was getting
wrong, but this is more obviously correct.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Convert get_phys_addr_pmsav8() to not return FSC values

Make get_phys_addr_pmsav8() return a fault type in the ARMMMUFaultInfo
structure, which we convert to the FSC at the callsite.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Convert get_phys_addr_pmsav7() to not return FSC values

Make get_phys_addr_pmsav7() return a fault type in the ARMMMUFaultInfo
structure, which we convert to the FSC at the callsite.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Convert get_phys_addr_pmsav5() to not return FSC values

Make get_phys_addr_pmsav5() return a fault type in the ARMMMUFaultInfo
structure, which we convert to the FSC at the callsite.

Note that PMSAv5 does not define any guest-visible fault status
register, so the different "fsr" values we were previously
returning are entirely arbitrary. So we can just switch to using
the most appropriae fi->type values without worrying that we
need to special-case FaultInfo->FSC conversion for PMSAv5.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Convert get_phys_addr_lpae() to not return FSC values

Make get_phys_addr_v6() return a fault type in the ARMMMUFaultInfo
structure, which we convert to the FSC at the callsite.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Convert get_phys_addr_v6() to not return FSC values

Make get_phys_addr_v6() return a fault type in the ARMMMUFaultInfo
structure, which we convert to the FSC at the callsite.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Convert get_phys_addr_v5() to not return FSC values

Make get_phys_addr_v5() return a fault type in the ARMMMUFaultInfo
structure, which we convert to the FSC at the callsite.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Remove fsr argument from arm_ld*_ptw()

All the callers of arm_ldq_ptw() and arm_ldl_ptw() ignore the value
that those functions store in the fsr argument on failure: if they
return failure to their callers they will always overwrite the fsr
value with something else.

Remove the argument from these functions and S1_ptw_translate().
This will simplify removing fsr from the calling functions.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Provide fault type enum and FSR conversion functions

Currently get_phys_addr() and its various subfunctions return
a hard-coded fault status register value for translation
failures. This is awkward because FSR values these days may
be either long-descriptor format or short-descriptor format.
Worse, the right FSR type to use doesn't depend only on the
translation table being walked -- some cases, like fault
info reported to AArch32 EL2 for some kinds of ATS operation,
must be in long-descriptor format even if the translation
table being walked was short format. We can't get those cases
right with our current approach.

Provide fields in the ARMMMUFaultInfo struct which allow
get_phys_addr() to provide sufficient information for a caller to
construct an FSR value themselves, and utility functions which do
this for both long and short format FSR values, as a first step in
switching get_phys_addr() and its children to only returning the
failure cause in the ARMMMUFaultInfo struct.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Stefano Stabellini <[email protected]>
Message-id: 1512503192 [email protected]

target/arm: Implement TT instruction

Implement the TT instruction which queries the security
state and access permissions of a memory location.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1512153879 [email protected]

target/arm: Factor MPU lookup code out of get_phys_addr_pmsav8()

For the TT instruction we're going to need to do an MPU lookup that
also tells us which MPU region the access hit. This requires us
to do the MPU lookup without first doing the SAU security access
check, so pull the MPU lookup parts of get_phys_addr_pmsav8()
out into their own function.

The TT instruction also needs to know the MPU region number which
the lookup hit, so provide this information to the caller of the
MPU lookup code, even though get_phys_addr_pmsav8() doesn't
need to know it.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1512153879 [email protected]
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

target/arm: Create new arm_v7m_mmu_idx_for_secstate_and_priv()

The TT instruction is going to need to look up the MMU index
for a specified security and privilege state. Refactor the
existing arm_v7m_mmu_idx_for_secstate() into a version that
lets you specify the privilege state and one that uses the
current state of the CPU.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1512153879 [email protected]
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

target/arm: Split M profile MNegPri mmu index into user and priv

For M profile, we currently have an mmu index MNegPri for
"requested execution priority negative". This fails to
distinguish "requested execution priority negative, privileged"
from "requested execution priority negative, usermode", but
the two can return different results for MPU lookups. Fix this
by splitting MNegPri into MNegPriPriv and MNegPriUser, and
similarly for the Secure equivalent MSNegPri.

This takes us from 6 M profile MMU modes to 8, which means
we need to bump NB_MMU_MODES; this is OK since the point
where we are forced to reduce TLB sizes is 9 MMU modes.

(It would in theory be possible to stick with 6 MMU indexes:
{mpu-disabled,user,privileged} x {secure,nonsecure} since
in the MPU-disabled case the result of an MPU lookup is
always the same for both user and privileged code. However
we would then need to rework the TB flags handling to put
user/priv into the TB flags separately from the mmuidx.
Adding an extra couple of mmu indexes is simpler.)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1512153879 [email protected]

target/arm: Add missing M profile case to regime_is_user()

When we added the ARMMMUIdx_MSUser MMU index we forgot to
add it to the case statement in regime_is_user(), so we
weren't treating it as unprivileged when doing MPU lookups.
Correct the omission.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1512153879 [email protected]
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

target/arm: Allow explicit writes to CONTROL.SPSEL in Handler mode

In ARMv7M the CPU ignores explicit writes to CONTROL.SPSEL
in Handler mode. In v8M the behaviour is slightly different:
writes to the bit are permitted but will have no effect.

We've already done the hard work to handle the value in
CONTROL.SPSEL being out of sync with what stack pointer is
actually in use, so all we need to do to fix this last loose
end is to update the condition we use to guard whether we
call write_v7m_control_spsel() on the register write.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1512153879 [email protected]

target/arm: Handle SPSEL and current stack being out of sync in MSP/PSP reads

For v8M it is possible for the CONTROL.SPSEL bit value and the
current stack to be out of sync. This means we need to update
the checks used in reads and writes of the PSP and MSP special
registers to use v7m_using_psp() rather than directly checking
the SPSEL bit in the control register.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-id: 1512153879 [email protected]
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

hw/intc/arm_gicv3_its: Implement full reset

Voiding the ITS caches is not supposed to happen via
individual register writes. So we introduced a dedicated
ITS KVM device ioctl to perform a cold reset of the ITS:
KVM_DEV_ARM_VGIC_GRP_CTRL/KVM_DEV_ARM_ITS_CTRL_RESET. Let's
use this latter if the kernel supports it.

Signed-off-by: Eric Auger <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Message-id: 1511883692 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

linux-headers: update to 4.15-rc1

Update headers against v4.15-rc1.

Signed-off-by: Eric Auger <[email protected]>
Message-id: 1511883692 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/intc/arm_gicv3_its: Implement a minimalist reset

At the moment the ITS is not properly reset and this causes
various bugs on save/restore. We implement a minimalist reset
through individual register writes but for kernel versions
before v4.15 this fails voiding the vITS cache. We cannot
claim we have a comprehensive reset (hence the error message)
but that's better than nothing.

Signed-off-by: Eric Auger <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Message-id: 1511883692 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

hw/intc/arm_gicv3_its: Don't call post_load on reset

From the very beginning, post_load() was called from common
reset. This is not standard and obliged to discriminate the
reset case from the restore case using the iidr value.

Let's get rid of that call.

Signed-off-by: Eric Auger <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Message-id: 1511883692 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xlnx-zcu102: Add support for the ZynqMP QSPI

Add support for the ZynqMP QSPI (consisting of the Generic QSPI and Legacy
QSPI) and connect Numonyx n25q512a11 flashes to it.

Signed-off-by: Francisco Iglesias <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Add support for the ZynqMP Generic QSPI

Add support for the Zynq Ultrascale MPSoc Generic QSPI.

Signed-off-by: Francisco Iglesias <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Don't set TX FIFO UNDERFLOW at cmd done

Don't set TX FIFO UNDERFLOW interrupt after transmitting the commands.
Also update interrupts after reading out the interrupt status.

Signed-off-by: Francisco Iglesias <[email protected]>
Acked-by: Alistair Francis <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Add support for 4 byte addresses in the LQSPI

Add support for 4 byte addresses in the LQSPI and correct LQSPI_CFG_SEP_BUS.

Signed-off-by: Francisco Iglesias <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Add support for zero pumping

Add support for zero pumping according to the transfer size register.

Signed-off-by: Francisco Iglesias <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Make tx/rx_data_bytes more generic and reusable

Make tx/rx_data_bytes more generic so they can be reused (when adding
support for the Zynqmp Generic QSPI).

Signed-off-by: Francisco Iglesias <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Add support for RX discard and RX drain

Add support for the RX discard and RX drain functionality. Also transmit
one byte per dummy cycle (to the flash memories) with commands that require
these.

Signed-off-by: Francisco Iglesias <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Update striping to be big-endian bit order

Update striping functionality to be big-endian bit order (as according to
the Zynq-7000 Technical Reference Manual). Output thereafter the even bits
into the flash memory connected to the lower QSPI bus and the odd bits into
the flash memory connected to the upper QSPI bus.

Signed-off-by: Francisco Iglesias <[email protected]>
Acked-by: Alistair Francis <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

xilinx_spips: Move FlashCMD, XilinxQSPIPS and XilinxSPIPSClass

Move the FlashCMD enum, XilinxQSPIPS and XilinxSPIPSClass structures to the
header for consistency (struct XilinxSPIPS is found there). Also move out
a define and remove two double included headers (while touching the code).
Finally, add 4 byte address commands to the FlashCMD enum.

Signed-off-by: Francisco Iglesias <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

m25p80: Add support for n25q512a11 and n25q512a13

Add support for Micron (Numonyx) n25q512a11 and n25q512a13 flashes.

Signed-off-by: Francisco Iglesias <[email protected]>
Acked-by: Marcin Krzemiński <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

m25p80: Add support for BRRD/BRWR and BULK_ERASE (0x60)

Add support for the bank address register access commands (BRRD/BRWR) and
the BULK_ERASE (0x60) command.

Signed-off-by: Francisco Iglesias <[email protected]>
Acked-by: Marcin Krzemiński <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

m25p80: Add support for SST READ ID 0x90/0xAB commands

Add support for SST READ ID 0x90/0xAB commands for reading out the flash
manufacturer ID and device ID.

Signed-off-by: Francisco Iglesias <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

m25p80: Add support for continuous read out of RDSR and READ_FSR

Add support for continuous read out of the RDSR and READ_FSR status
registers until the chip select is deasserted. This feature is supported
by amongst others 1 or more flashtypes manufactured by Numonyx (Micron),
Windbond, SST, Gigadevice, Eon and Macronix.

Signed-off-by: Francisco Iglesias <[email protected]>
Acked-by: Marcin Krzemiński<[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Tested-by: Edgar E. Iglesias <[email protected]>
Message-id: 20171126231634 [email protected]
Signed-off-by: Peter Maydell <[email protected]>

vfio-pci: Remove unused fields from VFIOMSIXInfo

When support for multiple mappings per a region were added, this was
left behind, let's finish and remove unused bits.

Fixes: db0da029a185 ("vfio: Generalize region support")
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

vfio/spapr: Allow fallback to SPAPR TCE IOMMU v1

The vfio_iommu_spapr_tce driver advertises kernel's support for
v1 and v2 IOMMU support, however it is not always possible to use
the requested IOMMU type. For example, a pseries host platform does not
support dynamic DMA windows so v2 cannot initialize and QEMU fails to
start.

This adds a fallback to the v1 IOMMU if v2 cannot be used.

Fixes: 318f67ce1371 ("vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2)")
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

vfio/common: init giommu_list and hostwin_list of vfio container

The init of giommu_list and hostwin_list is missed during container
initialization.

Signed-off-by: Liu, Yi L <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

vfio: Fix vfio-kvm group registration

Commit 8c37faa475f3 ("vfio-pci, ppc64/spapr: Reorder group-to-container
attaching") moved registration of groups with the vfio-kvm device from
vfio_get_group() to vfio_connect_container(), but it missed the case
where a group is attached to an existing container and takes an early
exit. Perhaps this is a less common case on ppc64/spapr, but on x86
(without viommu) all groups are connected to the same container and
thus only the first group gets registered with the vfio-kvm device.
This becomes a problem if we then hot-unplug the devices associated
with that first group and we end up with KVM being misinformed about
any vfio connections that might remain. Fix by including the call to
vfio_kvm_device_add_group() in this early exit path.

Fixes: 8c37faa475f3 ("vfio-pci, ppc64/spapr: Reorder group-to-container attaching")
Cc: [email protected] # qemu-2.10+
Reviewed-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Tested-by: Peter Xu <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Tested-by: Eric Auger <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>

Open 2.12 development tree

Signed-off-by: Peter Maydell <[email protected]>

Update version for v2.11.0 release

Signed-off-by: Peter Maydell <[email protected]>

Update version for v2.11.0-rc5 release

Signed-off-by: Peter Maydell <[email protected]>

target/arm: Generate UNDEF for 32-bit Thumb2 insns

The refactoring of commit 296e5a0a6c3935 has a nasty bug:
it accidentally dropped the generation of code to raise
the UNDEF exception when disas_thumb2_insn() returns nonzero.
This means that 32-bit Thumb2 instruction patterns that
ought to UNDEF just act like nops instead. This is likely
to break any number of things, including the kernel's "disable
the FPU and use the UNDEF exception to identify when to turn
it back on again" trick.

Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1513006964 [email protected]
Reviewed-by: Richard Henderson <[email protected]>

Update version for v2.11.0-rc4 release

Signed-off-by: Peter Maydell <[email protected]>

vhost-scsi: add missing virtqueue_size parameter

Commit 5c0919d02066 ("virtio-scsi: Add virtqueue_size parameter allowing
virtqueue size to be set.") introduced a new parameter to virtio-scsi.
Later, commit 920036106044 ("vhost-user-scsi: add missing virtqueue_size
param") added that parameter to the new vhost-user-scsi interface but
neglected the existing vhost-scsi interface it was built on.

Apply the same change to vhost-scsi, so that we can boot a guest with
a device defined. This also avoids crashing a guest when hotplugging
a vhost-scsi device.

Signed-off-by: Eric Farman <[email protected]>
Message-id: 20171201151538 [email protected]
Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171205' into staging

ppc patch queue 2017-12-05

Alas, this is yet another fix for ppc that I think it's worth
squeezing into 2.11.  It's a really ugly fix for some pretty ugly
code, but it does seem to address a real problem.  It's also a problem
that's appeared relatively recently, since it was either created by,
or made much easier to trigger by, by the merge of MTTCG.

# gpg: Signature made Tue 05 Dec 2017 05:24:04 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.11-20171205:
  target/ppc: Fix system lockups caused by interrupt_request state corruption

Signed-off-by: Peter Maydell <[email protected]>

target/ppc: Fix system lockups caused by interrupt_request state corruption

Occasionally in Linux guests on x86_64 we're seeing logs like:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000004

when they should read:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000002

The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls
cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function
just before the log message. Something is causing the HARD bit setting
to get lost.

The knock on effect of losing that bit is the decrementer timer interrupts
don't get delivered which causes the guest to sit idle in its idle handler
and 'hang'.

The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB.

Rather than poking directly into cs->interrupt_request, that code needs to:

a) hold BQL
b) use the cpu_interrupt() helper

This patch fixes the call sites to do this, fixing the hang. The calls
are made from a variety of contexts so a helper function is added to handle
the necessary locking. This can likely be improved and optimised in the future
but it ensures the code is correct and doesn't lockup as it stands today.

Signed-off-by: Richard Purdie <[email protected]>
Signed-off-by: David Gibson <[email protected]>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches for 2.11.0-rc4

# gpg: Signature made Mon 04 Dec 2017 16:46:07 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  blockjob: Make block_job_pause_all() keep a reference to the jobs

Signed-off-by: Peter Maydell <[email protected]>

blockjob: Make block_job_pause_all() keep a reference to the jobs

Starting from commit 40840e419be31e6a32e6ea24511c74b389d5e0e4 we are
pausing all block jobs during bdrv_reopen_multiple() to prevent any of
them from finishing and removing nodes from the graph while they are
being reopened.

It turns out that pausing a block job doesn't necessarily prevent it
from finishing: a paused block job can still run its exit function
from the main loop and call block_job_completed(). The mirror block
job in particular always goes to the main loop while it is paused (by
virtue of the bdrv_drained_begin() call in mirror_run()).

Destroying a paused block job during bdrv_reopen_multiple() has two
consequences:

   1) The references to the nodes involved in the job are released,
      possibly destroying some of them. If those nodes were in the
      reopen queue this would trigger the problem originally described
      in commit 40840e419be, crashing QEMU.

   2) At the end of bdrv_reopen_multiple(), bdrv_drain_all_end() would
      not be doing all necessary bdrv_parent_drained_end() calls.

I can reproduce problem 1) easily with iotest 030 by increasing
STREAM_BUFFER_SIZE from 512KB to 8MB in block/stream.c, or by tweaking
the iotest like in this example:

   https://lists.gnu.org/archive/html/qemu-block/2017-11/msg00934.html

This patch keeps an additional reference to all block jobs between
block_job_pause_all() and block_job_resume_all(), guaranteeing that
they are kept alive.

Signed-off-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc, pci, virtio: fixes for rc3

A bunch of fixes all over the place.

Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Fri 01 Dec 2017 17:06:33 GMT
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg:                 aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  pc: fix crash on attempted cpu unplug
  virtio: check VirtQueue Vring object is set
  vhost: fix error check in vhost_verify_ring_mappings()
  dump-guest-memory.py: fix No symbol "vmcoreinfo_find"
  vhost: restore avail index from vring used index on disconnection
  virtio: Add queue interface to restore avail index from vring used index
  i386/msi: Correct mask of destination ID in MSI address

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171204' into staging

ppc patch queue 2017-12-04

We are, alas, not yet to the bottom of ppc bugs.  This pull request
fixes several more.  I believe they're important enough to include in
2.11. despite the late date.

# gpg: Signature made Mon 04 Dec 2017 03:40:56 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.11-20171204:
  spapr: Include "pre-plugged" DIMMS in ram size calculation at reset
  target-ppc: Don't invalidate non-supported msr bits
  pseries: fix TCG migration

Signed-off-by: Peter Maydell <[email protected]>

spapr: Include "pre-plugged" DIMMS in ram size calculation at reset

At guest reset time, we allocate a hash page table (HPT) for the guest
based on the guest's RAM size.  If dynamic HPT resizing is not available we
use the maximum RAM size, if it is we use the current RAM size.

But the "current RAM size" calculation is incorrect - we just use the
"base" ram_size from the machine structure.  This doesn't include any
pluggable DIMMs that are already plugged at reset time.

This means that if you try to start a 'pseries' machine with a DIMM
specified on the command line that's much larger than the "base" RAM size,
then the guest will get a woefully inadequate HPT.  This can lead to a
guest freeze during boot as it runs out of HPT space during initial MMU
setup.

Signed-off-by: David Gibson <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Tested-by: Greg Kurz <[email protected]>

pc: fix crash on attempted cpu unplug

when qemu is started with '-no-acpi' CLI option, an attempt
to unplug a CPU using device_del results in null pointer
dereference at:

  #0 object_get_class
  #1 pc_machine_device_unplug_request_cb
  #2 qmp_marshal_device_del

which is caused by pcms->acpi_dev == NULL due to ACPI support
being disabled.

Considering that ACPI support is necessary for unplug to work,
check that it's enabled and fail unplug request gracefully
if no acpi device were found.

Signed-off-by: Igor Mammedov <[email protected]>
Reviewed-by: Eduardo Habkost <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: check VirtQueue Vring object is set

A guest could attempt to use an uninitialised VirtQueue object
or unset Vring.align leading to a arithmetic exception. Add check
to avoid it.

Reported-by: Zhangboxian <[email protected]>
Signed-off-by: Prasad J Pandit <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>

vhost: fix error check in vhost_verify_ring_mappings()

Since commit f1f9e6c5 "vhost: adapt vhost_verify_ring_mappings() to
virtio 1 ring layout", we check the mapping of each part (descriptor
table, available ring and used ring) of each virtqueue separately.

The checking of a part is done by the vhost_verify_ring_part_mapping()
function: it returns either 0 on success or a negative errno if the
part cannot be mapped at the same place.

Unfortunately, the vhost_verify_ring_mappings() function checks its
return value the other way round. It means that we either:
- only verify the descriptor table of the first virtqueue, and if it
is valid we ignore all the other mappings
- or ignore all broken mappings until we reach a valid one

ie, we only raise an error if all mappings are broken, and we consider
all mappings are valid otherwise (false success), which is obviously
wrong.

This patch ensures that vhost_verify_ring_mappings() only returns
success if ALL mappings are okay.

Reported-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

dump-guest-memory.py: fix No symbol "vmcoreinfo_find"

When qemu is compiled without debug, the dump gdb python script can fail with:

Error occurred in Python command: No symbol "vmcoreinfo_find" in current context.

Because vmcoreinfo_find() is inlined and not exported.

Use the underlying object_resolve_path_type() to get the instance instead.

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Laszlo Ersek <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

vhost: restore avail index from vring used index on disconnection

vhost_virtqueue_stop() gets avail index value from the backend,
except if the backend is not responding.

It happens when the backend crashes, and in this case, internal
state of the virtio queue is inconsistent, making packets
to corrupt the vring state.

With a Linux guest, it results in following error message on
backend reconnection:

[   22.444905] virtio_net virtio0: output.0:id 0 is not a head!
[   22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
[   22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5

Fixes: 283e2c2adcb8 ("net: virtio-net discards TX data after link down")
Cc: [email protected]
Signed-off-by: Maxime Coquelin <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

virtio: Add queue interface to restore avail index from vring used index

In case of backend crash, it is not possible to restore internal
avail index from the backend value as vhost_get_vring_base
callback fails.

This patch provides a new interface to restore internal avail index
from the vring used index, as done by some vhost-user backend on
reconnection.

Signed-off-by: Maxime Coquelin <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

i386/msi: Correct mask of destination ID in MSI address

According to SDM 10.11.1, only [19:12] bits of MSI address are
Destination ID, change the mask to avoid ambiguity for VT-d spec
has used the bit 4 to indicate a remappable interrupt request.

Signed-off-by: Chao Gao <[email protected]>
Signed-off-by: Lan Tianyu <[email protected]>
Reviewed-by: Anthony PERARD <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>

target-ppc: Don't invalidate non-supported msr bits

The msr invalidation code (commits 993eb and 2360b) inverts all
bits except MSR_TGPR and MSR_HVB. On non PowerPC 601 processors
this leads to incorrect change of excp_prefix in hreg_store_msr()
function. The problem is that new msr value get multiplied by msr_mask
and inverted msr does not, thus values of MSR_EP bit in new msr value
and inverted msr are distinct, so that excp_prefix changes but should
not.

Signed-off-by: Kurban Mallachiev <[email protected]>
Signed-off-by: David Gibson <[email protected]>

pseries: fix TCG migration

Migration of pseries is broken with TCG because
QEMU tries to restore KVM MMU state unconditionally.

The result is a SIGSEGV in kvm_vm_ioctl():

  #0  kvm_vm_ioctl (s=0x0, type=-2146390353)
      at qemu/accel/kvm/kvm-all.c:2032
  #1  0x00000001003e3e2c in kvmppc_configure_v3_mmu (cpu=<optimized out>,
      radix=<optimized out>, gtse=<optimized out>, proc_tbl=<optimized out>)
      at qemu/target/ppc/kvm.c:396
  #2  0x00000001002f8b88 in spapr_post_load (opaque=0x1019103c0,
      version_id=<optimized out>) at qemu/hw/ppc/spapr.c:1578
  #3  0x000000010059e4cc in vmstate_load_state (f=0x106230000,
      vmsd=0x1009479e0 <vmstate_spapr>, opaque=0x1019103c0,
      version_id=<optimized out>) at qemu/migration/vmstate.c:165
  #4  0x00000001005987e0 in vmstate_load (f=<optimized out>, se=<optimized out>)
      at qemu/migration/savevm.c:748

This patch fixes the problem by not calling the KVM function with the
TCG mode.

Fixes: d39c90f5f3 ("spapr: Fix migration of Radix guests")
Signed-off-by: Laurent Vivier <[email protected]>
Reviewed-by: Suraj Jitindar Singh <[email protected]>
Signed-off-by: David Gibson <[email protected]>

Update version for v2.11.0-rc3 release

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches for 2.11.0-rc3

# gpg: Signature made Wed 29 Nov 2017 15:25:13 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  block/nfs: fix nfs_client_open for filesize greater than 1TB
  blockjob: reimplement block_job_sleep_ns to allow cancellation
  blockjob: introduce block_job_do_yield
  blockjob: remove clock argument from block_job_sleep_ns
  block: Expect graph changes in bdrv_parent_drained_begin/end
  blockjob: Remove the job from the list earlier in block_job_unref()
  QAPI & interop: Clarify events emitted by 'block-job-cancel'
  qemu-options: Mention locking option of file driver
  docs: Add image locking subsection
  iotests: fix 075 and 078

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'mreitz/tags/pull-block-2017-11-29' into queue-block

One block patch for 2.11.0-rc3

# gpg: Signature made Wed Nov 29 15:28:38 2017 CET
# gpg:                using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <[email protected]>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* mreitz/tags/pull-block-2017-11-29:
  block/nfs: fix nfs_client_open for filesize greater than 1TB

Signed-off-by: Kevin Wolf <[email protected]>

block/nfs: fix nfs_client_open for filesize greater than 1TB

DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE) was overflowing ret (int) if
st.st_size is greater than 1TB.

Cc: [email protected]
Signed-off-by: Peter Lieven <[email protected]>
Message-id: 1511798407 [email protected]
Signed-off-by: Max Reitz <[email protected]>

blockjob: reimplement block_job_sleep_ns to allow cancellation

This reverts the effects of commit 4afeffc857 ("blockjob: do not allow
coroutine double entry or entry-after-completion", 2017-11-21)

This fixed the symptom of a bug rather than the root cause. Canceling the
wait on a sleeping blockjob coroutine is generally fine, we just need to
make it work correctly across AioContexts. To do so, use a QEMUTimer
that calls block_job_enter. Use a mutex to ensure that block_job_enter
synchronizes correctly with block_job_sleep_ns.

Signed-off-by: Paolo Bonzini <[email protected]>
Tested-By: Jeff Cody <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

blockjob: introduce block_job_do_yield

Hide the clearing of job->busy in a single function, and set it
in block_job_enter. This lets block_job_do_yield verify that
qemu_coroutine_enter is not used while job->busy = false.

Signed-off-by: Paolo Bonzini <[email protected]>
Tested-By: Jeff Cody <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

blockjob: remove clock argument from block_job_sleep_ns

All callers are using QEMU_CLOCK_REALTIME, and it will not be possible to
support more than one clock when block_job_sleep_ns switches to a single
timer stored in the BlockJob struct.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Tested-By: Jeff Cody <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Expect graph changes in bdrv_parent_drained_begin/end

The .drained_begin/end callbacks can (directly or indirectly via
aio_poll()) cause block nodes to be removed or the current BdrvChild to
point to a different child node.

Use QLIST_FOREACH_SAFE() to make sure we don't access invalid
BlockDriverStates or accidentally continue iterating the parents of the
new child node instead of the node we actually came from.

Signed-off-by: Kevin Wolf <[email protected]>
Tested-by: Jeff Cody <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Jeff Cody <[email protected]>
Reviewed-by: Alberto Garcia <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

blockjob: Remove the job from the list earlier in block_job_unref()

When destroying a block job in block_job_unref() we should remove it
from the job list before calling block_job_remove_all_bdrv().

This is because removing the BDSs can trigger an aio_poll() and wake
up other jobs that might attempt to use the block job list. If that
happens the job we're currently destroying should not be in that list
anymore.

Signed-off-by: Alberto Garcia <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-11-28' into staging

nbd patches for 2017-11-28

Eric Blake - 0/2 fix two NBD server CVEs

# gpg: Signature made Tue 28 Nov 2017 12:58:29 GMT
# gpg:                using RSA key 0xA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <[email protected]>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <[email protected]>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2017-11-28:
  nbd/server: CVE-2017-15118 Stack smash on large export name
  nbd/server: CVE-2017-15119 Reject options larger than 32M

Signed-off-by: Peter Maydell <[email protected]>

nbd/server: CVE-2017-15118 Stack smash on large export name

Introduced in commit f37708f6b8 (2.10).  The NBD spec says a client
can request export names up to 4096 bytes in length, even though
they should not expect success on names longer than 256.  However,
qemu hard-codes the limit of 256, and fails to filter out a client
that probes for a longer name; the result is a stack smash that can
potentially give an attacker arbitrary control over the qemu
process.

The smash can be easily demonstrated with this client:
$ qemu-io f raw nbd://localhost:10809/$(printf %3000d 1 | tr ' ' a)

If the qemu NBD server binary (whether the standalone qemu-nbd, or
the builtin server of QMP nbd-server-start) was compiled with
-fstack-protector-strong, the ability to exploit the stack smash
into arbitrary execution is a lot more difficult (but still
theoretically possible to a determined attacker, perhaps in
combination with other CVEs).  Still, crashing a running qemu (and
losing the VM) is bad enough, even if the attacker did not obtain
full execution control.

CC: [email protected]
Signed-off-by: Eric Blake <[email protected]>

nbd/server: CVE-2017-15119 Reject options larger than 32M

The NBD spec gives us permission to abruptly disconnect on clients
that send outrageously large option requests, rather than having
to spend the time reading to the end of the option.  No real
option request requires that much data anyways; and meanwhile, we
already have the practice of abruptly dropping the connection on
any client that sends NBD_CMD_WRITE with a payload larger than 32M.

For comparison, nbdkit drops the connection on any request with
more than 4096 bytes; however, that limit is probably too low
(as the NBD spec states an export name can theoretically be up
to 4096 bytes, which means a valid NBD_OPT_INFO could be even
longer) - even if qemu doesn't permit exports longer than 256
bytes.

It could be argued that a malicious client trying to get us to
read nearly 4G of data on a bad request is a form of denial of
service.  In particular, if the server requires TLS, but a client
that does not know the TLS credentials sends any option (other
than NBD_OPT_STARTTLS or NBD_OPT_EXPORT_NAME) with a stated
payload of nearly 4G, then the server was keeping the connection
alive trying to read all the payload, tying up resources that it
would rather be spending on a client that can get past the TLS
handshake.  Hence, this warranted a CVE.

Present since at least 2.5 when handling known options, and made
worse in 2.6 when fixing support for NBD_FLAG_C_FIXED_NEWSTYLE
to handle unknown options.

CC: [email protected]
Signed-off-by: Eric Blake <[email protected]>

Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2017-11-28-1' into staging

Merge qio 2017/11/28 v1

# gpg: Signature made Tue 28 Nov 2017 10:49:08 GMT
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <[email protected]>"
# gpg:                 aka "Daniel P. Berrange <[email protected]>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/pull-qio-2017-11-28-1:
  sockets: avoid crash when cleaning up sockets for an invalid FD

Signed-off-by: Peter Maydell <[email protected]>

sockets: avoid crash when cleaning up sockets for an invalid FD

If socket_listen_cleanup is passed an invalid FD, then querying the socket
local address will fail. We must thus be prepared for the returned addr to
be NULL

Reported-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Daniel P. Berrange <[email protected]>

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Tue 28 Nov 2017 03:58:11 GMT
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  virtio-net: don't touch virtqueue if vm is stopped

Signed-off-by: Peter Maydell <[email protected]>

virtio-net: don't touch virtqueue if vm is stopped

Guest state should not be touched if VM is stopped, unfortunately we
didn't check running state and tried to drain tx queue unconditionally
in virtio_net_set_status(). A crash was then noticed as a migration
destination when user type quit after virtqueue state is loaded but
before region cache is initialized. In this case,
virtio_net_drop_tx_queue_data() tries to access the uninitialized
region cache.

Fix this by only dropping tx queue data when vm is running.

Fixes: 283e2c2adcb80 ("net: virtio-net discards TX data after link down")
Cc: Yuri Benditovich <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Stefan Hajnoczi <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: [email protected]
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Jason Wang <[email protected]>