David Gibson [Thu, 4 Jun 2020 06:42:18 +0000 (16:42 +1000)]
target/i386: sev: Remove redundant handle field
The user can explicitly specify a handle via the "handle" property wired
to SevGuestState::handle. That gets passed to the KVM_SEV_LAUNCH_START
ioctl() which may update it, the final value being copied back to both
SevGuestState::handle and SEVState::handle.
AFAICT, nothing will be looking SEVState::handle before it and
SevGuestState::handle have been updated from the ioctl(). So, remove the
field and just use SevGuestState::handle directly.
David Gibson [Thu, 4 Jun 2020 06:42:17 +0000 (16:42 +1000)]
target/i386: sev: Remove redundant policy field
SEVState::policy is set from the final value of the policy field in the
parameter structure for the KVM_SEV_LAUNCH_START ioctl(). But, AFAICT
that ioctl() won't ever change it from the original supplied value which
comes from SevGuestState::policy.
So, remove this field and just use SevGuestState::policy directly.
David Gibson [Thu, 4 Jun 2020 06:42:16 +0000 (16:42 +1000)]
target/i386: sev: Remove redundant cbitpos and reduced_phys_bits fields
The SEVState structure has cbitpos and reduced_phys_bits fields which are
simply copied from the SevGuestState structure and never changed. Now that
SEVState is embedded in SevGuestState we can just access the original copy
directly.
David Gibson [Thu, 4 Jun 2020 06:42:15 +0000 (16:42 +1000)]
target/i386: sev: Partial cleanup to sev_state global
The SEV code uses a pretty ugly global to access its internal state. Now
that SEVState is embedded in SevGuestState, we can avoid accessing it via
the global in some cases. In the remaining cases use a new global
referencing the containing SevGuestState which will simplify some future
transformations.
David Gibson [Thu, 4 Jun 2020 06:42:13 +0000 (16:42 +1000)]
target/i386: sev: Rename QSevGuestInfo
At the moment this is a purely passive object which is just a container for
information used elsewhere, hence the name. I'm going to change that
though, so as a preliminary rename it to SevGuestState.
That name risks confusion with both SEVState and SevState, but I'll be
working on that in following patches.
David Gibson [Thu, 4 Jun 2020 06:42:12 +0000 (16:42 +1000)]
target/i386: sev: Move local structure definitions into .c file
Neither QSevGuestInfo nor SEVState (not to be confused with SevState) is
used anywhere outside target/i386/sev.c, so they might as well live in
there rather than in a (somewhat) exposed header.
Anthony PERARD [Wed, 3 Jun 2020 16:04:42 +0000 (17:04 +0100)]
xen: fix build without pci passthrough
Xen PCI passthrough support may not be available and thus the global
variable "has_igd_gfx_passthru" might be compiled out. Common code
should not access it in that case.
Unfortunately, we can't use CONFIG_XEN_PCI_PASSTHROUGH directly in
xen-common.c so this patch instead move access to the
has_igd_gfx_passthru variable via function and those functions are
also implemented as stubs. The stubs will be used when QEMU is built
without passthrough support.
Now, when one will want to enable igd-passthru via the -machine
property, they will get an error message if QEMU is built without
passthrough support.
Roman Bolshakov [Thu, 28 May 2020 19:37:52 +0000 (22:37 +0300)]
i386: hvf: Drop fetch_rip from HVFX86EmulatorState
The field is used to print address of instructions that have no parser
in decode_invalid(). RIP from VMCS is saved into fetch_rip before
decoding starts but it's also saved into env->eip in load_regs().
Therefore env->eip can be used instead of fetch_rip.
While at it, correct address printed in decode_invalid(). It prints an
address before the unknown instruction.
Roman Bolshakov [Thu, 28 May 2020 19:37:46 +0000 (22:37 +0300)]
i386: hvf: Move HVFState definition into hvf
"sysemu/hvf.h" is intended for inclusion in generic code. However it
also contains several hvf definitions and declarations, including
HVFState that are used only inside "hvf.c". "hvf-i386.h" would be more
appropriate place to define HVFState as it's only included by "hvf.c"
and "x86_task.c".
Joseph Myers [Fri, 12 Jun 2020 13:45:23 +0000 (13:45 +0000)]
target/i386: correct fix for pcmpxstrx substring search
This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri
/ pcmpestrm / pcmpistri / pcmpistrm substring search, commit ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60.
That commit fixed a bug that showed up in four GCC tests with one libc
implementation. The tests in question generate random inputs to the
intrinsics and compare results to a C implementation, but they only
test 1024 possible random inputs, and when the tests use the cases of
those instructions that work with word rather than byte inputs, it's
easy to have problematic cases that show up much less frequently than
that. Thus, testing with a different libc implementation, and so a
different random number generator, showed up a problem with the
previous patch.
When investigating the previous test failures, I found the description
of these instructions in the Intel manuals (starting from computing a
16x16 or 8x8 set of comparison results) confusing and hard to match up
with the more optimized implementation in QEMU, and referred to AMD
manuals which described the instructions in a different way. Those
AMD descriptions are very explicit that the whole of the string being
searched for must be found in the other operand, not running off the
end of that operand; they say "If the prototype and the SUT are equal
in length, the two strings must be identical for the comparison to be
TRUE.". However, that statement is incorrect.
In my previous commit message, I noted:
The operation in this case is a search for a string (argument d to
the helper) in another string (argument s to the helper); if a copy
of d at a particular position would run off the end of s, the
resulting output bit should be 0 whether or not the strings match in
the region where they overlap, but the QEMU implementation was
wrongly comparing only up to the point where s ends and counting it
as a match if an initial segment of d matched a terminal segment of
s. Here, "run off the end of s" means that some byte of d would
overlap some byte outside of s; thus, if d has zero length, it is
considered to match everywhere, including after the end of s.
The description "some byte of d would overlap some byte outside of s"
is accurate only when understood to refer to overlapping some byte
*within the 16-byte operand* but at or after the zero terminator; it
is valid to run over the end of s if the end of s is the end of the
16-byte operand. So the fix in the previous patch for the case of d
being empty was correct, but the other part of that patch was not
correct (as it never allowed partial matches even at the end of the
16-byte operand). Nor was the code before the previous patch correct
for the case of d nonempty, as it would always have allowed partial
matches at the end of s.
Fix with a partial revert of my previous change, combined with
inserting a check for the special case of s having maximum length to
determine where it is necessary to check for matches.
In the added test, test 1 is for the case of empty strings, which
failed before my 2017 patch, test 2 is for the bug introduced by my
2017 patch and test 3 deals with the case where a match of an initial
segment at the end of the string is not valid when the string ends
before the end of the 16-byte operand (that is, the case that would be
broken by a simple revert of the non-empty-string part of my 2017
patch).
Most x87 instruction implementations fail to raise the expected IEEE
floating-point exceptions because they do nothing to convert the
exception state from the softfloat machinery into the exception flags
in the x87 status word. There is special-case handling of division to
raise the divide-by-zero exception, but that handling is itself buggy:
it raises the exception in inappropriate cases (inf / 0 and nan / 0,
which should not raise any exceptions, and 0 / 0, which should raise
"invalid" instead).
Fix this by converting the floating-point exceptions raised during an
operation by the softfloat machinery into exceptions in the x87 status
word (passing through the existing fpu_set_exception function for
handling related to trapping exceptions). There are special cases
where some functions convert to integer internally but exceptions from
that conversion are not always correct exceptions for the instruction
to raise.
There might be scope for some simplification if the softfloat
exception state either could always be assumed to be in sync with the
state in the status word, or could always be ignored at the start of
each instruction and just set to 0 then; I haven't looked into that in
detail, and it might run into interactions with the various ways the
emulation does not yet handle trapping exceptions properly. I think
the approach taken here, of saving the softfloat state, setting
exceptions there to 0 and then merging the old exceptions back in
after carrying out the operation, is conservatively safe.
When mapping physical memory into host's virtual address space,
'address_space_map' may return NULL if BounceBuffer is in_use.
Set and return '*plen = 0' to avoid later NULL pointer dereference.
qemu/thread: Mark qemu_thread_exit() with 'noreturn' attribute
After upgrading to Ubuntu 20.04 LTS, GCC 9.3 complains:
util/qemu-thread-posix.c: In function ‘qemu_thread_exit’:
util/qemu-thread-posix.c:577:6: error: function might be candidate for attribute ‘noreturn’ [-Werror=suggest-attribute=noreturn]
577 | void qemu_thread_exit(void *retval)
| ^~~~~~~~~~~~~~~~
Fix by marking the qemu_thread_exit function with QEMU_NORETURN
to set the 'noreturn' attribute.
memory: Make 'info mtree' not display disabled regions by default
We might have many disabled memory regions, making the 'info mtree'
output too verbose to be useful.
Remove the disabled regions in the default output, but allow the
monitor user to display them using the '-D' option.
Like Xu [Fri, 29 May 2020 07:43:47 +0000 (15:43 +0800)]
target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES
The Perfmon and Debug Capability MSR named IA32_PERF_CAPABILITIES is
a feature-enumerating MSR, which only enumerates the feature full-width
write (via bit 13) by now which indicates the processor supports IA32_A_PMCx
interface for updating bits 32 and above of IA32_PMCx.
The existence of MSR IA32_PERF_CAPABILITIES is enumerated by CPUID.1:ECX[15].
Julio Faracco [Mon, 23 Mar 2020 20:05:38 +0000 (17:05 -0300)]
i386: Remove unused define's from hax and hvf
Commit acb9f95a removed boundary checks for ID and VCPU ID. After that,
the max definitions of that boundaries are not required anymore. This
commit is only a code cleanup.
Pavel Dovgalyuk [Thu, 30 Apr 2020 09:13:49 +0000 (12:13 +0300)]
replay: implement fair mutex
In record/replay icount mode main loop thread and vCPU thread
do not perform simultaneously. They take replay mutex to synchronize
the actions. Sometimes vCPU thread waits for locking the mutex for
very long time, because main loop releases the mutex and takes it
back again. Standard qemu mutex do not provide the ordering
capabilities.
This patch adds a "queue" for replay mutex. Therefore thread ordering
becomes more "fair". Threads are executed in the same order as
they are trying to take the mutex.
hw/i386/amd_iommu: Fix the reserved bits definition of IOMMU commands
Many reserved bits of amd_iommu commands are defined incorrectly in QEMU.
Because of it, QEMU incorrectly injects lots of illegal commands into guest
VM's IOMMU event log.
chardev/char-socket: Properly make qio connections non blocking
In tcp_chr_sync_read function, there is a possibility of socket
disconnection during blocking read, then tcp_chr_hup function would clean up
the qio channel pointers(i.e ioc, sioc).
Peter Xu [Wed, 18 Mar 2020 14:52:03 +0000 (10:52 -0400)]
KVM: Kick resamplefd for split kernel irqchip
This is majorly only for X86 because that's the only one that supports
split irqchip for now.
When the irqchip is split, we face a dilemma that KVM irqfd will be
enabled, however the slow irqchip is still running in the userspace.
It means that the resamplefd in the kernel irqfds won't take any
effect and it will miss to ack INTx interrupts on EOIs.
One example is split irqchip with VFIO INTx, which will break if we
use the VFIO INTx fast path.
This patch can potentially supports the VFIO fast path again for INTx,
that the IRQ delivery will still use the fast path, while we don't
need to trap MMIOs in QEMU for the device to emulate the EIOs (see the
callers of vfio_eoi() hook). However the EOI of the INTx will still
need to be done from the userspace by caching all the resamplefds in
QEMU and kick properly for IOAPIC EOI broadcast.
This is tricky because in this case the userspace ioapic irr &
remote-irr will be bypassed. However such a change will greatly boost
performance for assigned devices using INTx irqs (TCP_RR boosts 46%
after this patch applied).
When the userspace is responsible for the resamplefd kickup, don't
register it on the kvm_irqfd anymore, because on newer kernels (after
commit 654f1f13ea56, 5.2+) the KVM_IRQFD will fail if with both split
irqchip and resamplefd. This will make sure that the fast path will
work for all supported kernels.
Peter Xu [Wed, 18 Mar 2020 14:52:02 +0000 (10:52 -0400)]
KVM: Pass EventNotifier into kvm_irqchip_assign_irqfd
So that kvm_irqchip_assign_irqfd() can have access to the
EventNotifiers, especially the resample event. It is needed in follow
up patch to cache and kick resamplefds from QEMU.
Peter Xu [Wed, 18 Mar 2020 14:52:01 +0000 (10:52 -0400)]
vfio/pci: Use kvm_irqchip_add_irqfd_notifier_gsi() for irqfds
VFIO is currently the only one left that is not using the generic
function (kvm_irqchip_add_irqfd_notifier_gsi()) to register irqfds.
Let VFIO use the common framework too.
Follow up patches will introduce extra features for kvm irqfd, so that
VFIO can easily leverage that after the switch.
AVX512_VP2INTERSECT compute vector pair intersection to a pair
of mask registers, which is introduced with intel Tiger Lake,
defining as CPUID.(EAX=7,ECX=0):EDX[bit 08].
Refer to the following release spec:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
AddressSanitizer:DEADLYSIGNAL
=================================================================
==29476==ERROR: AddressSanitizer: SEGV on unknown address 0x000000008840 (pc 0x56448bec4d79 bp 0x7ffeec9741b0 sp 0x7ffeec9740e0 T0)
==29476==The signal is caused by a READ memory access.
#0 0x56448bec4d78 in vmport_ioport_read (qemu-fuzz-i386+0x1260d78)
#1 0x56448bb5f175 in memory_region_read_accessor (qemu-fuzz-i386+0xefb175)
#2 0x56448bb30c13 in access_with_adjusted_size (qemu-fuzz-i386+0xeccc13)
#3 0x56448bb2ea27 in memory_region_dispatch_read1 (qemu-fuzz-i386+0xecaa27)
#4 0x56448bb2e443 in memory_region_dispatch_read (qemu-fuzz-i386+0xeca443)
#5 0x56448b961ab1 in flatview_read_continue (qemu-fuzz-i386+0xcfdab1)
#6 0x56448b96336d in flatview_read (qemu-fuzz-i386+0xcff36d)
#7 0x56448b962ec4 in address_space_read_full (qemu-fuzz-i386+0xcfeec4)
Joseph Myers [Fri, 15 May 2020 21:20:25 +0000 (21:20 +0000)]
target/i386: fix fisttpl, fisttpll handling of out-of-range values
The fist / fistt family of instructions should all store the most
negative integer in the destination format when the rounded /
truncated integer result is out of range or the input is an invalid
encoding, infinity or NaN. The fisttpl and fisttpll implementations
(32-bit and 64-bit results, truncate towards zero) failed to do this,
producing the most positive integer in some cases instead. Fix this
by copying the code used to handle this issue for fistpl and fistpll,
adjusted to use the _round_to_zero functions for the actual
conversion (but without any other changes to that code).
Joseph Myers [Wed, 13 May 2020 23:51:42 +0000 (23:51 +0000)]
target/i386: fix fbstp handling of out-of-range values
The fbstp implementation fails to check for out-of-range and invalid
values, instead just taking the result of conversion to int64_t and
storing its sign and low 18 decimal digits. Fix this by checking for
an out-of-range result (invalid conversions always result in INT64_MAX
or INT64_MIN from the softfloat code, which are large enough to be
considered as out-of-range by this code) and storing the packed BCD
indefinite encoding in that case.
Joseph Myers [Wed, 13 May 2020 23:51:09 +0000 (23:51 +0000)]
target/i386: fix fbstp handling of negative zero
The fbstp implementation stores +0 when the rounded result should be
-0 because it compares an integer value with 0 to determine the sign.
Fix this by checking the sign bit of the operand instead.
Joseph Myers [Wed, 13 May 2020 23:50:19 +0000 (23:50 +0000)]
target/i386: fix fxam handling of invalid encodings
The fxam implementation does not check for invalid encodings, instead
treating them like NaN or normal numbers depending on the exponent.
Fix it to check that the high bit of the significand is set before
treating an encoding as NaN or normal, thus resulting in correct
handling (all of C0, C2 and C3 cleared) for invalid encodings.
The implementations of the fldl2t, fldl2e, fldpi, fldlg2 and fldln2
instructions load fixed constants independent of the rounding mode.
Fix them to load a value correctly rounded for the current rounding
mode (but always rounded to 64-bit precision independent of the
precision control, and without setting "inexact") as specified.
Joseph Myers [Thu, 7 May 2020 00:46:28 +0000 (00:46 +0000)]
target/i386: fix fscale handling of rounding precision
The fscale implementation uses floatx80_scalbn for the final scaling
operation. floatx80_scalbn ends up rounding the result using the
dynamic rounding precision configured for the FPU. But only a limited
set of x87 floating-point instructions are supposed to respect the
dynamic rounding precision, and fscale is not in that set. Fix the
implementation to save and restore the rounding precision around the
call to floatx80_scalbn.
Joseph Myers [Thu, 7 May 2020 00:45:38 +0000 (00:45 +0000)]
target/i386: fix fscale handling of infinite exponents
The fscale implementation passes infinite exponents through to generic
code that rounds the exponent to a 32-bit integer before using
floatx80_scalbn. In round-to-nearest mode, and ignoring exceptions,
this works in many cases. But it fails to handle the special cases of
scaling 0 by a +Inf exponent or an infinity by a -Inf exponent, which
should produce a NaN, and because it produces an inexact result for
finite nonzero numbers being scaled, the result is sometimes incorrect
in other rounding modes. Add appropriate handling of infinite
exponents to produce a NaN or an appropriately signed exact zero or
infinity as a result.
Joseph Myers [Thu, 7 May 2020 00:44:57 +0000 (00:44 +0000)]
target/i386: fix fscale handling of invalid exponent encodings
The fscale implementation does not check for invalid encodings in the
exponent operand, thus treating them like INT_MIN (the value returned
for invalid encodings by floatx80_to_int32_round_to_zero). Fix it to
treat them similarly to signaling NaN exponents, thus generating a
quiet NaN result.
Joseph Myers [Thu, 7 May 2020 00:44:14 +0000 (00:44 +0000)]
target/i386: fix fscale handling of signaling NaN
The implementation of the fscale instruction returns a NaN exponent
unchanged. Fix it to return a quiet NaN when the provided exponent is
a signaling NaN.
Joseph Myers [Thu, 7 May 2020 00:43:30 +0000 (00:43 +0000)]
target/i386: implement special cases for fxtract
The implementation of the fxtract instruction treats all nonzero
operands as normal numbers, so yielding incorrect results for invalid
formats, infinities, NaNs and subnormal and pseudo-denormal operands.
Implement appropriate handling of all those cases.
While in megasas_handle_frame(), megasas_enqueue_frame() may
set a NULL frame into MegasasCmd object for a given 'frame_addr'
address. Add check to avoid a NULL pointer dereference issue.
megasas: use unsigned type for reply_queue_head and check index
A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.
Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.
Pan Nengyuan [Wed, 13 May 2020 13:26:30 +0000 (09:26 -0400)]
i386/kvm: fix a use-after-free when vcpu plug/unplug
When we hotplug vcpus, cpu_update_state is added to vm_change_state_head
in kvm_arch_init_vcpu(). But it forgot to delete in kvm_arch_destroy_vcpu() after
unplug. Then it will cause a use-after-free access. This patch delete it in
kvm_arch_destroy_vcpu() to fix that.
The UAF stack:
==qemu-system-x86_64==28233==ERROR: AddressSanitizer: heap-use-after-free on address 0x62e00002e798 at pc 0x5573c6917d9e bp 0x7fff07139e50 sp 0x7fff07139e40
WRITE of size 1 at 0x62e00002e798 thread T0
#0 0x5573c6917d9d in cpu_update_state /mnt/sdb/qemu/target/i386/kvm.c:742
#1 0x5573c699121a in vm_state_notify /mnt/sdb/qemu/vl.c:1290
#2 0x5573c636287e in vm_prepare_start /mnt/sdb/qemu/cpus.c:2144
#3 0x5573c6362927 in vm_start /mnt/sdb/qemu/cpus.c:2150
#4 0x5573c71e8304 in qmp_cont /mnt/sdb/qemu/monitor/qmp-cmds.c:173
#5 0x5573c727cb1e in qmp_marshal_cont qapi/qapi-commands-misc.c:835
#6 0x5573c7694c7a in do_qmp_dispatch /mnt/sdb/qemu/qapi/qmp-dispatch.c:132
#7 0x5573c7694c7a in qmp_dispatch /mnt/sdb/qemu/qapi/qmp-dispatch.c:175
#8 0x5573c71d9110 in monitor_qmp_dispatch /mnt/sdb/qemu/monitor/qmp.c:145
#9 0x5573c71dad4f in monitor_qmp_bh_dispatcher /mnt/sdb/qemu/monitor/qmp.c:234
WangBowen [Sat, 9 May 2020 03:59:52 +0000 (11:59 +0800)]
hax: Dynamic allocate vcpu state structure
Dynamic allocating vcpu state structure according to smp value to be
more precise and safe. Previously it will alloccate array of fixed size
HAX_MAX_VCPU.
This is achieved by using g_new0 to dynamic allocate the array. The
allocated size is obtained from smp.max_cpus in MachineState. Also, the
size is compared with HAX_MAX_VCPU when creating the vm. The reason for
choosing dynamic array over linked list is because the status is visited
by index all the time.
This will lead to QEMU checking whether the smp value is larger than the
HAX_MAX_VCPU when creating vm, if larger, the process will terminate,
otherwise it will allocate array of size smp to store the status.
cpus: Fix botched configure_icount() error API violation fix
Before recent commit abc9bf69a66, configure_icount() returned early
when option "shift" was absent: succeed when option "align" was also
absent, else fail.
Since then, it still errors out when only "align" is present, but
continues when both are absent. Crashes when examining the value of
"shift" further. Reproducer: -icount "".
Liran Alon [Thu, 12 Mar 2020 16:54:31 +0000 (18:54 +0200)]
hw/i386/vmport: Assert vmport initialized before registering commands
vmport_register() is also called from other modules such as vmmouse.
Therefore, these modules rely that vmport is realized before those call
sites. If this is violated, vmport_register() will NULL-deref.
To make such issues easier to debug, assert in vmport_register() that
vmport is already realized.
Liran Alon [Thu, 12 Mar 2020 16:54:30 +0000 (18:54 +0200)]
hw/i386/vmport: Add support for CMD_GETHZ
This command returns to guest information on LAPIC bus frequency and TSC
frequency.
One can see how this interface is used by Linux vmware_platform_setup()
introduced in Linux commit 88b094fb8d4f ("x86: Hypervisor detection and
get tsc_freq from hypervisor").
Liran Alon [Thu, 12 Mar 2020 16:54:28 +0000 (18:54 +0200)]
hw/i386/vmport: Allow x2apic without IR
Signal to guest that hypervisor supports x2apic without VT-d/IOMMU
Interrupt-Remapping support. This allows guest to use x2apic in
case all APIC IDs fits in 8-bit (i.e. Max APIC ID < 255).
See Linux kernel commit 4cca6ea04d31 ("x86/apic: Allow x2apic
without IR on VMware platform") and Linux try_to_enable_x2apic()
function.
Liran Alon [Thu, 12 Mar 2020 16:54:24 +0000 (18:54 +0200)]
hw/i386/vmport: Add support for CMD_GETBIOSUUID
This is VMware documented functionallity that some guests rely on.
Returns the BIOS UUID of the current virtual machine.
Note that we also introduce a new compatability flag "x-cmds-v2" to
make sure to expose new VMPort commands only to new machine-types.
This flag will also be used by the following patches that will introduce
additional VMPort commands.
Liran Alon [Thu, 12 Mar 2020 16:54:23 +0000 (18:54 +0200)]
hw/i386/vmport: Define enum for all commands
No functional change.
Defining an enum for all VMPort commands have the following advantages:
* It gets rid of the error-prone requirement to update VMPORT_ENTRIES
when new VMPort commands are added to QEMU.
* It makes it clear to know by looking at one place at the source, what
are all the VMPort commands supported by QEMU.
vmware-vmx-version is a number returned from CMD_GETVERSION which specifies
to guest VMware Tools the the host VMX version. If the host reports a number
that is different than what the guest VMware Tools expects, it may force
guest to upgrade VMware Tools. (See comment above VERSION_MAGIC and
VmCheck_IsVirtualWorld() function in open-vm-tools open-source code).
For better readability and allow maintaining compatability for guests
which may expect different vmware-vmx-version, make vmware-vmx-version a
VMPort object property. This would allow user to control it's value via
"-global vmport.vmware-vmx-version=X".
Liran Alon [Thu, 12 Mar 2020 16:54:19 +0000 (18:54 +0200)]
hw/i386/vmport: Set EAX to -1 on failed and unsupported commands
This is used as a signal for VMware Tools to know if a command it
attempted to invoke, failed or is unsupported. As a result, VMware Tools
will either report failure to user or fallback to another backdoor command
in attempt to perform some operation.
A few examples:
* open-vm-tools TimeSyncReadHost() function fallbacks to
CMD_GETTIMEFULL command when CMD_GETTIMEFULL_WITH_LAG
fails/unsupported.
* open-vm-tools Hostinfo_NestingSupported() function verifies
EAX != -1 to check for success.
* open-vm-tools Hostinfo_VCPUInfoBackdoor() functions checks
if reserved-bit is set to indicate command is unimplemented.
Liran Alon [Thu, 12 Mar 2020 16:54:18 +0000 (18:54 +0200)]
hw/i386/vmport: Propagate IOPort read to vCPU EAX register
vmport_ioport_read() returns the value that should propagate to vCPU EAX
register when guest reads VMPort IOPort (i.e. By x86 IN instruction).
However, because vmport_ioport_read() calls cpu_synchronize_state(), the
returned value gets overridden by the value in QEMU vCPU EAX register.
i.e. cpu->env.regs[R_EAX].
To fix this issue, change vmport_ioport_read() to explicitly override
cpu->env.regs[R_EAX] with the value it wish to propagate to vCPU EAX
register.
Liran Alon [Thu, 12 Mar 2020 16:54:16 +0000 (18:54 +0200)]
hw/i386/vmport: Add reference to VMware open-vm-tools
This official VMware open-source project can be used as reference to
understand how guest code interacts with VMPort virtual device. Thus,
providing understanding on how device is expected to behave.
Janne Grunau [Wed, 1 Apr 2020 22:52:53 +0000 (00:52 +0200)]
target/i386: fix phadd* with identical destination and source register
Detected by asm test suite failures in dav1d
(https://code.videolan.org/videolan/dav1d). Can be reproduced by
`qemu-x86_64 -cpu core2duo ./tests/checkasm --test=mc_8bpc 1659890620`.
CPUID leaf CPUID_Fn80000008_ECX provides information about the
number of threads supported by the processor. It was found that
the field ApicIdSize(bits 15-12) was not set correctly.
ApicIdSize is defined as the number of bits required to represent
all the ApicId values within a package.
Valid Values: Value Description
3h-0h Reserved.
4h up to 16 threads.
5h up to 32 threads.
6h up to 64 threads.
7h up to 128 threads.
Fh-8h Reserved.
Jon Doron [Fri, 24 Apr 2020 12:34:43 +0000 (15:34 +0300)]
i386: Hyper-V VMBus ACPI DSDT entry
Guest OS uses ACPI to discover VMBus presence. Add a corresponding
entry to DSDT in case VMBus has been enabled.
Experimentally Windows guests were found to require this entry to
include two IRQ resources. They seem to never be used but they still
have to be there.
Make IRQ numbers user-configurable via corresponding properties; use 7
and 13 by default.
Jon Doron [Fri, 24 Apr 2020 12:34:41 +0000 (15:34 +0300)]
vmbus: vmbus implementation
Add the VMBus infrastructure -- bus, devices, root bridge, vmbus state
machine, vmbus channel interactions, etc.
VMBus is a collection of technologies. At its lowest layer, it's a message
passing and signaling mechanism, allowing efficient passing of messages to and
from guest VMs. A layer higher, it's a mechanism for defining channels of
communication, where each channel is tagged with a type (which implies a
protocol) and a instance ID. A layer higher than that, it's a bus driver,
serving as the basis of device enumeration within a VM, where a channel can
optionally be exposed as a paravirtual device. When a server-side (paravirtual
back-end) component wishes to offer a channel to a guest VM, it does so by
specifying a channel type, a mode, and an instance ID. VMBus then exposes this
in the guest.
More information about VMBus can be found in the file
vmbuskernelmodeclientlibapi.h in Microsoft's WDK.
TODO:
- split into smaller palatable pieces
- more comments
- check and handle corner cases
Igor Mammedov [Mon, 11 May 2020 14:11:03 +0000 (10:11 -0400)]
numa: prevent usage of -M memory-backend and -numa memdev at the same time
Options -M memory-backend and -numa memdev are mutually exclusive,
and if used together, it might lead to a crash in the worst case.
For example when the same backend is used with these options together:
-m 4G \
-object memory-backend-ram,id=mem0,size=4G \
-M pc,memory-backend=mem0 \
-numa node,memdev=mem0
QEMU will abort with:
exec.c:2006: qemu_ram_set_idstr: Assertion `!new_block->idstr[0]' failed.
and following backtrace:
abort ()
qemu_ram_set_idstr ()
vmstate_register_ram ()
vmstate_register_ram_global ()
machine_consume_memdev ()
numa_init_memdev_container ()
numa_complete_configuration ()
machine_run_board_init ()
add a check to error out in case the user tries to use both options at
the same time.
Igor Mammedov [Mon, 11 May 2020 14:11:02 +0000 (10:11 -0400)]
vl.c: run preconfig loop before creating default RAM backend
Default RAM backend depends on numa_uses_legacy_mem(), which is
infulenced by -numa options on CLI or set-numa-node QMP command
at preconfig time. If QEMU is started with '-preconfig'
without -numa, it will lead to creating default RAM backend
even if later set-numa-node is used to assing RAM to NUMA nodes
using 'memdev' NUMA option.
That at best will waste RAM object created by default and with
next patch adding a check to prevent usage of conflicting
'-M memory-backend' and '-numa memdev'
options, it will make QEMU error out if user tries to configure
NUMA at preconfig time with memdev option, making set-numa-node
unusable.
To fix issue, move preconfig loop before default RAM backend is
created, so that numa_uses_legacy_mem() would take into account
effects of set-numa-node commands executed at preconfig time.