Peter Maydell [Tue, 24 Mar 2020 17:36:30 +0000 (17:36 +0000)]
dump: Fix writing of ELF section
In write_elf_section() we set the 'shdr' pointer to point to local
structures shdr32 or shdr64, which we fill in to be written out to
the ELF dump. Unfortunately the address we pass to fd_write_vmcore()
has a spurious '&' operator, so instead of writing out the section
header we write out the literal pointer value followed by whatever is
on the stack after the 'shdr' local variable.
Peter Maydell [Fri, 3 Apr 2020 12:47:12 +0000 (13:47 +0100)]
hw/gpio/aspeed_gpio.c: Don't directly include assert.h
Remove a direct include of assert.h -- this is already
provided by qemu/osdep.h, and it breaks our rule that the
first include must always be osdep.h.
In particular we must get the assert() macro via osdep.h
to avoid compile failures on mingw (see the comment in
osdep.h where we redefine assert() for that platform).
Peter Maydell [Tue, 31 Mar 2020 14:34:07 +0000 (15:34 +0100)]
target/arm: Remove obsolete TODO note from get_phys_addr_lpae()
An old comment in get_phys_addr_lpae() claims that the code does not
support the different format TCR for VTCR_EL2. This used to be true
but it is not true now (in particular the aa64_va_parameters() and
aa32_va_parameters() functions correctly handle the different
register format by checking whether the mmu_idx is Stage2).
Remove the out of date parts of the comment.
Peter Maydell [Mon, 30 Mar 2020 17:06:51 +0000 (18:06 +0100)]
target/arm: PSTATE.PAN should not clear exec bits
Our implementation of the PSTATE.PAN bit incorrectly cleared all
access permission bits for privileged access to memory which is
user-accessible. It should only affect the privileged read and write
permissions; execute permission is dealt with via XN/PXN instead.
Peter Maydell [Thu, 26 Mar 2020 20:49:19 +0000 (20:49 +0000)]
hw/arm/collie: Put StrongARMState* into a CollieMachineState struct
Coverity complains that the collie_init() function leaks the memory
allocated in sa1110_init(). This is true but not significant since
the function is called only once on machine init and the memory must
remain in existence until QEMU exits anyway.
Still, we can avoid the technical memory leak by keeping the pointer
to the StrongARMState inside the machine state struct. Switch from
the simple DEFINE_MACHINE() style to defining a subclass of
TYPE_MACHINE which extends the MachineState struct, and keep the
pointer there.
Alex Bennée [Thu, 2 Apr 2020 14:39:13 +0000 (15:39 +0100)]
target/arm: don't expose "ieee_half" via gdbstub
While support for parsing ieee_half in the XML description was added
to gdb in 2019 (a6d0f249) there is no easy way for the gdbstub to know
if the gdb end will understand it. Disable it for now and allow older
gdbs to successfully connect to the default -cpu max SVE enabled
QEMUs.
Stefan Hajnoczi [Thu, 2 Apr 2020 14:54:34 +0000 (15:54 +0100)]
aio-posix: fix test-aio /aio/event/wait with fdmon-io_uring
When a file descriptor becomes ready we must re-arm POLL_ADD. This is
done by adding an sqe to the io_uring sq ring. The ->need_wait()
function wasn't taking pending sqes into account and therefore
io_uring_submit_and_wait() was not being called. Polling for cqes
failed to detect fd readiness since we hadn't submitted the sqe to
io_uring.
This patch fixes the following tests/test-aio -p /aio/event/wait
failure:
ok 11 /aio/event/wait
**
ERROR:tests/test-aio.c:374:test_flush_event_notifier: assertion failed: (aio_poll(ctx, false))
Peter Maydell [Fri, 3 Apr 2020 09:07:27 +0000 (10:07 +0100)]
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 queue for -rc2
Fixes:
* EPYC CPU model APIC ID topology fixes (Babu Moger)
* Fix crash when enabling intel-pt on older machine types
(Luwei Kang)
* Add missing ARCH_CAPABILITIES bits to Icelake-Server CPU model
(Xiaoyao Li)
* remotes/ehabkost/tags/x86-next-pull-request:
target/i386: Add ARCH_CAPABILITIES related bits into Icelake-Server CPU model
target/i386: set the CPUID level to 0x14 on old machine-type
i386: Fix pkg_id offset for EPYC cpu models
target/i386: Enable new apic id encoding for EPYC based cpus models
hw/i386: Move arch_id decode inside x86_cpus_init
i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition
hw/i386: Introduce apicid functions inside X86MachineState
target/i386: Cleanup and use the EPYC mode topology functions
hw/386: Add EPYC mode topology decoding functions
* remotes/bonzini/tags/for-upstream:
xen: fixup RAM memory region initialization
object-add: don't create return value if failed
qmp: fix leak on callbacks that return both value and error
migration: fix cleanup_bh leak on resume
target/i386: do not set unsupported VMX secondary execution controls
serial: Fix double migration data
i386: hvf: Reset IRQ inhibition after moving RIP
vl: fix broken IPA range for ARM -M virt with KVM enabled
util/bufferiszero: improve avx2 accelerator
util/bufferiszero: assign length_to_accel value for each accelerator case
MAINTAINERS: Add an entry for the HVF accelerator
softmmu: fix crash with invalid -M memory-backend=
virtio-iommu: depend on PCI
hw/isa/superio: Correct the license text
hw/scsi/vmw_pvscsi: Remove assertion for kick after reset
Igor Mammedov [Thu, 2 Apr 2020 14:54:18 +0000 (10:54 -0400)]
xen: fixup RAM memory region initialization
Since bd457782b3b0 ("x86/pc: use memdev for RAM") Xen
machine fails to start with:
qemu-system-i386: xen: failed to populate ram at 0
The reason is that xen_ram_alloc() which is called by
memory_region_init_ram(), compares memory region with
statically allocated 'global' ram_memory memory region
that it uses for RAM, and does nothing in case it matches.
While it's possible feed machine->ram to xen_ram_alloc()
in the same manner to keep that hack working, I'd prefer
not to keep that circular dependency and try to untangle that.
However it doesn't look trivial to fix, so as temporary
fixup opt out Xen machine from memdev based RAM allocation,
and let xen_ram_alloc() do its trick for now.
Paolo Bonzini [Thu, 26 Mar 2020 09:41:21 +0000 (10:41 +0100)]
object-add: don't create return value if failed
No need to return an empty value from object-add (it would also leak
if the command failed). While at it, remove the "if" around object_unref
since object_unref handles NULL arguments just fine.
qmp: fix leak on callbacks that return both value and error
Direct leak of 4120 byte(s) in 1 object(s) allocated from:
#0 0x7fa114931887 in __interceptor_calloc (/lib64/libasan.so.6+0xb0887)
#1 0x7fa1144ad8f0 in g_malloc0 (/lib64/libglib-2.0.so.0+0x588f0)
#2 0x561e3c9c8897 in qmp_object_add /home/elmarco/src/qemu/qom/qom-qmp-cmds.c:291
#3 0x561e3cf48736 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:155
#4 0x561e3c8efb36 in monitor_qmp_dispatch /home/elmarco/src/qemu/monitor/qmp.c:145
#5 0x561e3c8f09ed in monitor_qmp_bh_dispatcher /home/elmarco/src/qemu/monitor/qmp.c:234
#6 0x561e3d08c993 in aio_bh_call /home/elmarco/src/qemu/util/async.c:136
#7 0x561e3d08d0a5 in aio_bh_poll /home/elmarco/src/qemu/util/async.c:164
#8 0x561e3d0a535a in aio_dispatch /home/elmarco/src/qemu/util/aio-posix.c:380
#9 0x561e3d08e3ca in aio_ctx_dispatch /home/elmarco/src/qemu/util/async.c:298
#10 0x7fa1144a776e in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x5276e)
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
tests/qtest/migration-test -p /x86_64/migration/postcopy/recovery
tests/qtest/libqtest.c:140: kill_qemu() tried to terminate QEMU
process but encountered exit status 1 (expected 0)
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f25971dfc58 in __interceptor_malloc (/lib64/libasan.so.5+0x10dc58)
#1 0x7f2596d08358 in g_malloc (/lib64/libglib-2.0.so.0+0x57358)
#2 0x560970d006f8 in qemu_bh_new /home/elmarco/src/qemu/util/main-loop.c:532
#3 0x5609704afa02 in migrate_fd_connect
/home/elmarco/src/qemu/migration/migration.c:3407
#4 0x5609704b6b6f in migration_channel_connect
/home/elmarco/src/qemu/migration/channel.c:92
#5 0x5609704b2bfb in socket_outgoing_migration
/home/elmarco/src/qemu/migration/socket.c:108
#6 0x560970b9bd6c in qio_task_complete /home/elmarco/src/qemu/io/task.c:196
#7 0x560970b9aa97 in qio_task_thread_result
/home/elmarco/src/qemu/io/task.c:111
#8 0x7f2596cfee3a (/lib64/libglib-2.0.so.0+0x4de3a)
Vitaly Kuznetsov [Tue, 31 Mar 2020 16:27:52 +0000 (18:27 +0200)]
target/i386: do not set unsupported VMX secondary execution controls
Commit 048c95163b4 ("target/i386: work around KVM_GET_MSRS bug for
secondary execution controls") added a workaround for KVM pre-dating
commit 6defc591846d ("KVM: nVMX: include conditional controls in /dev/kvm
KVM_GET_MSRS") which wasn't setting certain available controls. The
workaround uses generic CPUID feature bits to set missing VMX controls.
It was found that in some cases it is possible to observe hosts which
have certain CPUID features but lack the corresponding VMX control.
In particular, it was reported that Azure VMs have RDSEED but lack
VMX_SECONDARY_EXEC_RDSEED_EXITING; attempts to enable this feature
bit result in QEMU abort.
Resolve the issue but not applying the workaround when we don't have
to. As there is no good way to find out if KVM has the fix itself, use 95c5c7c77c ("KVM: nVMX: list VMX MSRs in KVM_GET_MSR_INDEX_LIST") instead
as these [are supposed to] come together.
After c9808d60281 we have both an object representing the serial-isa
device and a separate object representing the underlying common serial
uart. Both of these have vmsd's associated with them and thus the
migration stream ends up with two copies of the migration data - the
serial-isa includes the vmstate of the core serial. Besides
being wrong, it breaks backwards migration compatibility.
Fix this by removing the dc->vmsd from the core device, so it only
gets migrated by any parent devices including it.
Add a vmstate_serial_mm so that any device that uses serial_mm_init
rather than creating a device still gets migrated.
(That doesn't fix backwards migration for serial_mm_init users,
but does seem to work forwards for ppce500).
Roman Bolshakov [Sat, 28 Mar 2020 17:44:12 +0000 (20:44 +0300)]
i386: hvf: Reset IRQ inhibition after moving RIP
The sequence of instructions exposes an issue:
sti
hlt
Interrupts cannot be delivered to hvf after hlt instruction cpu because
HF_INHIBIT_IRQ_MASK is set just before hlt is handled and never reset
after moving instruction pointer beyond hlt.
So, after hvf_vcpu_exec() returns, CPU thread gets locked up forever in
qemu_wait_io_event() (cpu_thread_is_idle() evaluates inhibition
flag and considers the CPU idle if the flag is set).
Igor Mammedov [Thu, 26 Mar 2020 11:28:29 +0000 (07:28 -0400)]
vl: fix broken IPA range for ARM -M virt with KVM enabled
Commit a1b18df9a4848, broke virt_kvm_type() logic, which depends on
maxram_size, ram_size, ram_slots being parsed/set on machine instance
at the time accelerator (KVM) is initialized.
set_memory_options() part was already reverted by commit 2a7b18a3205b,
so revert remaining initialization of above machine fields to make
virt_kvm_type() work as it used to.
Older QEMU versions did fixup the ram size to match what can be reported
via sclp. We need to mimic this behaviour for machine types 4.2 and
older to not fail on inbound migration for memory sizes that do not fit.
Old machines with proper aligned memory sizes are not affected.
Suggested action is to replace unaligned -m value with a suitable
aligned one or if a change to a newer machine type is possible, use a
machine version >= 5.0.
A future version might remove the compatibility handling.
For machine types >= 5.0 we can simply use an increment size of 1M and
use the full range of increment number which allows for all possible
memory sizes. The old limitation of having a maximum of 1020 increments
was added for standby memory, which we no longer support. With that we
can now support even weird memory sizes like 10001234 MB.
As we no longer fixup maxram_size as well, make other users use ram_size
instead. Keep using maxram_size when setting the maximum ram size in KVM,
as that will come in handy in the future when supporting memory hotplug
(in contrast, storage keys and storage attributes for hotplugged memory
will have to be migrated per RAM block in the future).
Janosch Frank [Tue, 31 Mar 2020 11:01:23 +0000 (07:01 -0400)]
s390x: kvm: Fix number of cpu reports for stsi 3.2.2
The cpu number reporting is handled by KVM and QEMU only fills in the
VM name, uuid and other values.
Unfortunately KVM doesn't report reserved cpus and doesn't even know
they exist until the are created via the ioctl.
So let's fix up the cpu values after KVM has written its values to the
3.2.2 sysib. To be consistent, we use the same code to retrieve the cpu
numbers as the STSI TCG code in target/s390x/misc_helper.c:HELPER(stsi).
Paolo Bonzini [Fri, 20 Mar 2020 10:41:24 +0000 (11:41 +0100)]
virtio-iommu: depend on PCI
The virtio-iommu device attaches itself to a PCI bus, so it makes
no sense to include it unless PCI is supported---and in fact
compilation fails without this change.
The license is the 'GNU General Public License v2.0 or later',
not 'and':
This program is free software; you can redistribute it and/ori
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.
Elazar Leibovich [Sun, 15 Mar 2020 13:26:34 +0000 (15:26 +0200)]
hw/scsi/vmw_pvscsi: Remove assertion for kick after reset
When running Ubuntu 3.13.0-65-generic guest, QEMU sometimes crashes
during guest ACPI reset. It crashes on assert(s->rings_info_valid)
in pvscsi_process_io().
Analyzing the crash revealed that it happens when userspace issues
a sync during a reboot syscall.
Since QEMU pvscsi should imitate VMware pvscsi device emulation,
we decided to imitate VMware's behavior in this case.
To check VMware behavior, we wrote a kernel module that issues
a reset to the pvscsi device and then issues a kick. We ran it on
VMware ESXi 6.5 and it seems that it simply ignores the kick.
Hence, we decided to ignore the kick as well.
Luwei Kang [Thu, 12 Mar 2020 16:48:06 +0000 (00:48 +0800)]
target/i386: set the CPUID level to 0x14 on old machine-type
The CPUID level need to be set to 0x14 manually on old
machine-type if Intel PT is enabled in guest. E.g. the
CPUID[0].EAX(level)=7 and CPUID[7].EBX[25](intel-pt)=1 when the
Qemu with "-machine pc-i440fx-3.1 -cpu qemu64,+intel-pt" parameter.
Some Intel PT capabilities are exposed by leaf 0x14 and the
missing capabilities will cause some MSRs access failed.
This patch add a warning message to inform the user to extend
the CPUID level.
Babu Moger [Wed, 11 Mar 2020 22:54:09 +0000 (17:54 -0500)]
target/i386: Enable new apic id encoding for EPYC based cpus models
The APIC ID is decoded based on the sequence sockets->dies->cores->threads.
This works fine for most standard AMD and other vendors' configurations,
but this decoding sequence does not follow that of AMD's APIC ID enumeration
strictly. In some cases this can cause CPU topology inconsistency.
When booting a guest VM, the kernel tries to validate the topology, and finds
it inconsistent with the enumeration of EPYC cpu models. The more details are
in the bug https://bugzilla.redhat.com/show_bug.cgi?id=1728166.
To fix the problem we need to build the topology as per the Processor
Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
Processors. The documentation is available from the bugzilla Link below.
Here is the text from the PPR.
Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
number of least significant bits in the Initial APIC ID that indicate core ID
within a processor, in constructing per-core CPUID masks.
Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
(MNC) that the processor could theoretically support, not the actual number of
cores that are actually implemented or enabled on the processor, as indicated
by Core::X86::Cpuid::SizeId[NC].
Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
The new apic id encoding is enabled for EPYC and EPYC-Rome models.
Babu Moger [Wed, 11 Mar 2020 22:54:02 +0000 (17:54 -0500)]
hw/i386: Move arch_id decode inside x86_cpus_init
Apicid calculation depends on knowing the total number of numa nodes
for EPYC cpu models. Right now, we are calculating the arch_id while
parsing the numa(parse_numa). At this time, it is not known how many
total numa nodes are configured in the system.
Move the arch_id calculation inside x86_cpus_init. At this time, smp
parse is already completed and numa node information is available.
Override the handlers if use_epyc_apic_id_encoding is enabled in
cpu model definition.
Also replace the calling convention to use handlers from
X86MachineState.
Babu Moger [Wed, 11 Mar 2020 22:53:41 +0000 (17:53 -0500)]
target/i386: Cleanup and use the EPYC mode topology functions
Use the new functions from topology.h and delete the unused code. Given the
sockets, nodes, cores and threads, the new functions generate apic id for EPYC
mode. Removes all the hardcoded values.
Babu Moger [Wed, 11 Mar 2020 22:53:34 +0000 (17:53 -0500)]
hw/386: Add EPYC mode topology decoding functions
These functions add support for building EPYC mode topology given the smp
details like numa nodes, cores, threads and sockets.
The new apic id decoding is mostly similar to current apic id decoding
except that it adds a new field node_id when numa configured. Removes all
the hardcoded values. Subsequent patches will use these functions to build
the topology.
Following functions are added.
apicid_llc_width_epyc
apicid_llc_offset_epyc
apicid_pkg_offset_epyc
apicid_from_topo_ids_epyc
x86_topo_ids_from_idx_epyc
x86_topo_ids_from_apicid_epyc
x86_apicid_from_cpu_idx_epyc
Bugfixes all over the place.
Add a new balloon maintainer.
A checkpatch enhancement to enforce ACPI change rules.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Tue 31 Mar 2020 15:54:36 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "[email protected]"
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>" [full]
# gpg: aka "Michael S. Tsirkin <[email protected]>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
vhost-vsock: fix double close() in the realize() error path
acpi: add acpi=OnOffAuto machine property to x86 and arm virt
fix vhost_user_blk_watch crash
hw/i386/amd_iommu.c: Fix corruption of log events passed to guest
virtio-iommu: avoid memleak in the unrealize
virtio-blk: delete vqs on the error path in realize()
acpi: pcihp: fix left shift undefined behavior in acpi_pcihp_eject_slot()
virtio-serial-bus: Plug memory leak on realize() error paths
MAINTAINERS: Add myself as virtio-balloon co-maintainer
checkpatch: enforce process for expected files
vhost-vsock: fix double close() in the realize() error path
vhost_dev_cleanup() closes the vhostfd parameter passed to
vhost_dev_init(), so this patch avoids closing it twice in
the vhost_vsock_device_realize() error path.
Peter Maydell [Tue, 31 Mar 2020 13:49:46 +0000 (14:49 +0100)]
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 31 Mar 2020 14:15:18 BST
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
qtest: add tulip test case
hw/net/allwinner-sun8i-emac.c: Fix REG_ADDR_HIGH/LOW reads
net: tulip: check frame size and r/w data length
net/colo-compare.c: Expose "expired_scan_cycle" to users
net/colo-compare.c: Expose "compare_timeout" to users
hw/net/can: Make CanBusClientInfo::can_receive() return a boolean
hw/net: Make NetCanReceive() return a boolean
hw/net/rtl8139: Update coding style to make checkpatch.pl happy
hw/net/rtl8139: Simplify if/else statement
hw/net/smc91c111: Let smc91c111_can_receive() return a boolean
hw/net/e1000e_core: Let e1000e_can_receive() return a boolean
Fixed integer overflow in e1000e
hw/net/i82596.c: Avoid reading off end of buffer in i82596_receive()
hw/net/i82596: Correct command bitmask (CID 1419392)
Li Qiang [Mon, 30 Mar 2020 14:52:01 +0000 (07:52 -0700)]
qtest: add tulip test case
The tulip networking card emulation has an OOB issue in
'tulip_copy_tx_buffers' when the guest provide malformed descriptor.
This test will trigger a ASAN heap overflow crash. To trigger this
issue we can construct the data as following:
1. construct a 'tulip_descriptor'. Its control is set to
'0x7ff | 0x7ff << 11', this will make the 'tulip_copy_tx_buffers's
'len1' and 'len2' to 0x7ff(2047). So 'len1+len2' will overflow
'TULIPState's 'tx_frame' field. This descriptor's 'buf_addr1' and
'buf_addr2' should set to a guest address.
2. write this descriptor to tulip device's CSR4 register. This will
set the 'TULIPState's 'current_tx_desc' field.
3. write 'CSR6_ST' to tulip device's CSR6 register. This will trigger
'tulip_xmit_list_update' and finally calls 'tulip_copy_tx_buffers'.
Following shows the backtrack of crash:
==31781==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x628000007cd0 at pc 0x7fe03c5a077a bp 0x7fff05b46770 sp 0x7fff05b45f18
WRITE of size 2047 at 0x628000007cd0 thread T0
#0 0x7fe03c5a0779 (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x79779)
#1 0x5575fb6daa6a in flatview_read_continue /home/test/qemu/exec.c:3194
#2 0x5575fb6daccb in flatview_read /home/test/qemu/exec.c:3227
#3 0x5575fb6dae66 in address_space_read_full /home/test/qemu/exec.c:3240
#4 0x5575fb6db0cb in address_space_rw /home/test/qemu/exec.c:3268
#5 0x5575fbdfd460 in dma_memory_rw_relaxed /home/test/qemu/include/sysemu/dma.h:87
#6 0x5575fbdfd4b5 in dma_memory_rw /home/test/qemu/include/sysemu/dma.h:110
#7 0x5575fbdfd866 in pci_dma_rw /home/test/qemu/include/hw/pci/pci.h:787
#8 0x5575fbdfd8a3 in pci_dma_read /home/test/qemu/include/hw/pci/pci.h:794
#9 0x5575fbe02761 in tulip_copy_tx_buffers hw/net/tulip.c:585
#10 0x5575fbe0366b in tulip_xmit_list_update hw/net/tulip.c:678
#11 0x5575fbe04073 in tulip_write hw/net/tulip.c:783
Coverity points out (CID 1421926) that the read code for
REG_ADDR_HIGH reads off the end of the buffer, because it does a
32-bit read from byte 4 of a 6-byte buffer.
The code also has an endianness issue for both REG_ADDR_HIGH and
REG_ADDR_LOW, because it will do the wrong thing on a big-endian
host.
Rewrite the read code to use ldl_le_p() and lduw_le_p() to fix this;
the write code is not incorrect, but for consistency we make it use
stl_le_p() and stw_le_p().
Tulip network driver while copying tx/rx buffers does not check
frame size against r/w data length. This may lead to OOB buffer
access. Add check to avoid it.
Limit iterations over descriptors to avoid potential infinite
loop issue in tulip_xmit_list_update.
Zhang Chen [Wed, 18 Mar 2020 08:23:19 +0000 (16:23 +0800)]
net/colo-compare.c: Expose "compare_timeout" to users
The "compare_timeout" determines the maximum time to hold the primary net packet.
This patch expose the "compare_timeout", make user have ability to
adjest the value according to application scenarios.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1737400
Fixed setting max_queue_num if there are no peers in
NICConf. qemu_new_nic() creates NICState with 1 NetClientState(index
0) without peers, set max_queue_num to 0 - It prevents undefined
behavior and possible crashes, especially during pcie hotplug.
Peter Maydell [Thu, 12 Mar 2020 20:16:38 +0000 (20:16 +0000)]
hw/net/i82596.c: Avoid reading off end of buffer in i82596_receive()
The i82596_receive() function attempts to pass the guest a buffer
which is effectively the concatenation of the data it is passed and a
4 byte CRC value. However, rather than implementing this as "write
the data; then write the CRC" it instead bumps the length value of
the data by 4, and writes 4 extra bytes from beyond the end of the
buffer, which it then overwrites with the CRC. It also assumed that
we could always fit all four bytes of the CRC into the final receive
buffer, which might not be true if the CRC needs to be split over two
receive buffers.
Calculate separately how many bytes we need to transfer into the
guest's receive buffer from the source buffer, and how many we need
to transfer from the CRC work.
We add a count 'bufsz' of the number of bytes left in the source
buffer, which we use purely to assert() that we don't overrun.
Spotted by Coverity (CID 1419396) for the specific case when we end
up using a local array as the source buffer.
The command is 32-bit, but we are loading the 16 upper bits with
the 'get_uint16(s->scb + 2)' call.
Once shifted by 16, the command bits match the status bits:
- Command
Bit 31 ACK-CX Acknowledges that the CU completed an Action Command.
Bit 30 ACK-FR Acknowledges that the RU received a frame.
Bit 29 ACK-CNA Acknowledges that the Command Unit became not active.
Bit 28 ACK-RNR Acknowledges that the Receive Unit became not ready.
- Status
Bit 15 CX The CU finished executing a command with its I(interrupt) bit set.
Bit 14 FR The RU finished receiving a frame.
Bit 13 CNA The Command Unit left the Active state.
Bit 12 RNR The Receive Unit left the Ready state.
Add the SCB_COMMAND_ACK_MASK definition to simplify the code.
This fixes Coverity 1419392 (CONSTANT_EXPRESSION_RESULT):
/hw/net/i82596.c: 352 in examine_scb()
346 cuc = (command >> 8) & 0x7;
347 ruc = (command >> 4) & 0x7;
348 DBG(printf("MAIN COMMAND %04x cuc %02x ruc %02x\n", command, cuc, ruc));
349 /* and clear the scb command word */
350 set_uint16(s->scb + 2, 0);
351
>>> CID 1419392: (CONSTANT_EXPRESSION_RESULT)
>>> "command & (2147483648UL /* 1UL << 31 */)" is always 0 regardless of the values of its operands. This occurs as the logical operand of "if".
352 if (command & BIT(31)) /* ACK-CX */
353 s->scb_status &= ~SCB_STATUS_CX;
>>> CID 1419392: (CONSTANT_EXPRESSION_RESULT)
>>> "command & (1073741824UL /* 1UL << 30 */)" is always 0 regardless of the values of its operands. This occurs as the logical operand of "if".
354 if (command & BIT(30)) /*ACK-FR */
355 s->scb_status &= ~SCB_STATUS_FR;
>>> CID 1419392: (CONSTANT_EXPRESSION_RESULT)
>>> "command & (536870912UL /* 1UL << 29 */)" is always 0 regardless of the values of its operands. This occurs as the logical operand of "if".
356 if (command & BIT(29)) /*ACK-CNA */
357 s->scb_status &= ~SCB_STATUS_CNA;
>>> CID 1419392: (CONSTANT_EXPRESSION_RESULT)
>>> "command & (268435456UL /* 1UL << 28 */)" is always 0 regardless of the values of its operands. This occurs as the logical operand of "if".
358 if (command & BIT(28)) /*ACK-RNR */
359 s->scb_status &= ~SCB_STATUS_RNR;
Peter Maydell [Tue, 31 Mar 2020 10:20:21 +0000 (11:20 +0100)]
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20200330' into staging
Improve PIE and other linkage
Fix for decodetree vs Python3 floor division operator
Fix i386 INDEX_op_dup2_vec expansion
Fix loongson multimedia condition instructions
# gpg: Signature made Tue 31 Mar 2020 04:50:15 BST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "[email protected]"
# gpg: Good signature from "Richard Henderson <[email protected]>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20200330:
decodetree: Use Python3 floor division operator
tcg/i386: Fix INDEX_op_dup2_vec
target/mips: Fix loongson multimedia condition instructions
configure: Support -static-pie if requested
configure: Override the os default with --disable-pie
configure: Unnest detection of -z,relro and -z,now
configure: Always detect -no-pie toolchain support
configure: Do not force pie=no for non-x86
tcg: Remove softmmu code_gen_buffer fixed address
configure: Drop adjustment of textseg
This script started using Python2, where the 'classic' division
operator returns the floor result. In commit 3d004a371 we started
to use Python3, where the division operator returns the float
result ('true division').
To keep the same behavior, use the 'floor division' operator "//"
which returns the floor result.
* remotes/pmaydell/tags/pull-target-arm-20200330:
target/arm: fix incorrect current EL bug in aarch32 exception emulation
hw/arm/xlnx-zynqmp.c: Add missing error-propagation code
hw/arm/xlnx-zynqmp.c: Avoid memory leak in error-return path
docs/conf.py: Raise ConfigError for bad Sphinx Python version
hw/misc/allwinner-h3-dramc: enforce 64-bit multiply when calculating row mirror address
hw/arm/orangepi: check for potential NULL pointer when calling blk_is_available
Changbin Du [Sat, 28 Mar 2020 14:02:32 +0000 (22:02 +0800)]
target/arm: fix incorrect current EL bug in aarch32 exception emulation
The arm_current_el() should be invoked after mode switching. Otherwise, we
get a wrong current EL value, since current EL is also determined by
current mode.
In some places in xlnx_zynqmp_realize() we were putting an
error into our local Error*, but forgetting to check for
failure and pass it back to the caller. Add the missing code.
Peter Maydell [Mon, 30 Mar 2020 12:18:59 +0000 (13:18 +0100)]
hw/arm/xlnx-zynqmp.c: Avoid memory leak in error-return path
In xlnx_zynqmp_realize() if the attempt to realize the SD
controller object fails then the error-return path will leak
the 'bus_name' string. Fix this by deferring the allocation
until after the realize has succeeded.
Peter Maydell [Mon, 30 Mar 2020 12:18:59 +0000 (13:18 +0100)]
docs/conf.py: Raise ConfigError for bad Sphinx Python version
Raise ConfigError rather than VersionRequirementError when we detect
that the Python being used by Sphinx is too old.
Currently the way we flag the Python version problem up to the user
causes Sphinx to print an unnecessary Python stack trace as well as
the information about the problem; in most versions of Sphinx this is
unavoidable.
The upstream Sphinx developers kindly added a feature to allow
conf.py to report errors to the user without the backtrace:
https://github.com/sphinx-doc/sphinx/commit/be608ca2313fc08eb842f3dc19d0f5d2d8227d08
but the exception type they chose for this was ConfigError.
Switch to ConfigError, which won't make any difference with currently
deployed Sphinx versions, but will be prettier one day when the user
is using a Sphinx version with the new feature.
Niek Linnenbank [Mon, 30 Mar 2020 12:18:58 +0000 (13:18 +0100)]
hw/misc/allwinner-h3-dramc: enforce 64-bit multiply when calculating row mirror address
The allwinner_h3_dramc_map_rows function simulates row addressing behavior
when bootloader software attempts to detect the amount of available SDRAM.
Currently the line that calculates the 64-bit address of the mirrored row
uses a signed 32-bit multiply operation that in theory could result in the
upper 32-bit be all 1s. This commit ensures that the row mirror address
is calculated using only 64-bit operations.
Niek Linnenbank [Mon, 30 Mar 2020 12:18:58 +0000 (13:18 +0100)]
hw/arm/orangepi: check for potential NULL pointer when calling blk_is_available
The Orange Pi PC initialization function needs to verify that the SD card
block backend is usable before calling the Boot ROM setup routine. When
calling blk_is_available() the input parameter should not be NULL.
This commit ensures that blk_is_available is only called with non-NULL input.
Li Feng [Mon, 23 Mar 2020 05:29:24 +0000 (13:29 +0800)]
fix vhost_user_blk_watch crash
the G_IO_HUP is watched in tcp_chr_connect, and the callback
vhost_user_blk_watch is not needed, because tcp_chr_hup is registered as
callback. And it will close the tcp link.
Peter Maydell [Thu, 26 Mar 2020 10:53:49 +0000 (10:53 +0000)]
hw/i386/amd_iommu.c: Fix corruption of log events passed to guest
In the function amdvi_log_event(), we write an event log buffer
entry into guest ram, whose contents are passed to the function
via the "uint64_t *evt" argument. Unfortunately, a spurious
'&' in the call to dma_memory_write() meant that instead of
writing the event to the guest we would write the literal value
of the pointer, plus whatever was in the following 8 bytes
on the stack. This error was spotted by Coverity.
Pan Nengyuan [Sat, 28 Mar 2020 00:57:04 +0000 (08:57 +0800)]
virtio-blk: delete vqs on the error path in realize()
virtio_vqs forgot to free on the error path in realize(). Fix that.
The asan stack:
Direct leak of 14336 byte(s) in 1 object(s) allocated from:
#0 0x7f58b93fd970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7f58b858249d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x5562cc627f49 in virtio_add_queue /mnt/sdb/qemu/hw/virtio/virtio.c:2413
#3 0x5562cc4b524a in virtio_blk_device_realize /mnt/sdb/qemu/hw/block/virtio-blk.c:1202
#4 0x5562cc613050 in virtio_device_realize /mnt/sdb/qemu/hw/virtio/virtio.c:3615
#5 0x5562ccb7a568 in device_set_realized /mnt/sdb/qemu/hw/core/qdev.c:891
#6 0x5562cd39cd45 in property_set_bool /mnt/sdb/qemu/qom/object.c:2238
Igor Mammedov [Thu, 26 Mar 2020 13:56:24 +0000 (09:56 -0400)]
acpi: pcihp: fix left shift undefined behavior in acpi_pcihp_eject_slot()
Coverity spots subj in following guest triggered code path
pci_write(, data = 0) -> acpi_pcihp_eject_slot(,slots = 0)
uinst32_t slot = ctz32(slots)
...
... = ~(1U << slot)
where 'slot' value is 32 in case 'slots' bitmap is empty.
'slots' is a bitmap and empty one shouldn't do anything
so return early doing nothing if resulted slot value is
not valid (i.e. not in 0-31 range)
The leak stack:
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f04a8008ae8 in __interceptor_malloc (/lib64/libasan.so.5+0xefae8)
#1 0x7f04a73cf1d5 in g_malloc (/lib64/libglib-2.0.so.0+0x531d5)
#2 0x56273eaee484 in aio_bh_new /mnt/sdb/backup/qemu/util/async.c:125
#3 0x56273eafe9a8 in qemu_bh_new /mnt/sdb/backup/qemu/util/main-loop.c:532
#4 0x56273d52e62e in virtser_port_device_realize /mnt/sdb/backup/qemu/hw/char/virtio-serial-bus.c:946
#5 0x56273dcc5040 in device_set_realized /mnt/sdb/backup/qemu/hw/core/qdev.c:891
#6 0x56273e5ebbce in property_set_bool /mnt/sdb/backup/qemu/qom/object.c:2238
#7 0x56273e5e5a9c in object_property_set /mnt/sdb/backup/qemu/qom/object.c:1324
#8 0x56273e5ef5f8 in object_property_set_qobject /mnt/sdb/backup/qemu/qom/qom-qobject.c:26
#9 0x56273e5e5e6a in object_property_set_bool /mnt/sdb/backup/qemu/qom/object.c:1390
#10 0x56273daa40de in qdev_device_add /mnt/sdb/backup/qemu/qdev-monitor.c:680
#11 0x56273daa53e9 in qmp_device_add /mnt/sdb/backup/qemu/qdev-monitor.c:805
MAINTAINERS: Add myself as virtio-balloon co-maintainer
As suggested by Michael, let's add me as co-maintainer of virtio-balloon.
While at it, also add "balloon.c" and "include/sysemu/balloon.h" to the
file list.
If the process documented in tests/qtest/bios-tables-test.c
is followed, then same patch never touches both expected
files and code. Teach checkpatch to enforce this rule.
Loongson multimedia condition instructions were previously implemented as
write 0 to rd due to lack of documentation. So I just confirmed with Loongson
about their encoding and implemented them correctly.
configure: Override the os default with --disable-pie
Some distributions, e.g. Ubuntu 19.10, enable PIE by default.
If for some reason one wishes to build a non-pie binary, we
must provide additional options to override.
At the same time, reorg the code to an elif chain.
configure: Always detect -no-pie toolchain support
The CFLAGS_NOPIE and LDFLAGS_NOPIE variables are used
in pc-bios/optionrom/Makefile, which has nothing to do
with the PIE setting of the main qemu executables.
This overrides any operating system default to build
all executables as PIE, which is important for ROMs.
The commentary talks about "in concert with the addresses
assigned in the relevant linker script", except there is no
linker script for softmmu, nor has there been for some time.
(Do not confuse the user-only linker script editing that was
removed in the previous patch, because user-only does not
use this code_gen_buffer allocation method.)
This adjustment was random and unnecessary. The user mode
startup code in probe_guest_base() will choose a value for
guest_base that allows the host qemu binary to not conflict
with the guest binary.
With modern distributions, this isn't even used, as the default
is PIE, which does the same job in a more portable way.
* remotes/jnsnow/tags/ide-pull-request:
cmd646-ide: use qdev gpio rather than qemu_allocate_irqs()
via-ide: use qdev gpio rather than qemu_allocate_irqs()
via-ide: don't use PCI level for legacy IRQs
hw/ide/sii3112: Use qdev gpio rather than qemu_allocate_irqs()
fdc/i8257: implement verify transfer mode
Mark Cave-Ayland [Tue, 24 Mar 2020 21:05:17 +0000 (21:05 +0000)]
via-ide: don't use PCI level for legacy IRQs
The PCI level calculation was accidentally left in when rebasing from a
previous patchset. Since both IRQs are driven separately, the value
being passed into the IRQ handler should be used directly.
Peter Maydell [Mon, 23 Mar 2020 15:17:15 +0000 (15:17 +0000)]
hw/ide/sii3112: Use qdev gpio rather than qemu_allocate_irqs()
Coverity points out (CID 1421984) that we are leaking the
memory returned by qemu_allocate_irqs(). We can avoid this
leak by switching to using qdev_init_gpio_in(); the base
class finalize will free the irqs that this allocates under
the hood.
Sven Schnelle [Fri, 1 Nov 2019 16:55:13 +0000 (17:55 +0100)]
fdc/i8257: implement verify transfer mode
While working on the Tulip driver i tried to write some Teledisk images to
a floppy image which didn't work. Turned out that Teledisk checks the written
data by issuing a READ command to the FDC but running the DMA controller
in VERIFY mode. As we ignored the DMA request in that case, the DMA transfer
never finished, and Teledisk reported an error.
The i8257 spec says about verify transfers:
3) DMA verify, which does not actually involve the transfer of data. When an
8257 channel is in the DMA verify mode, it will respond the same as described
for transfer operations, except that no memory or I/O read/write control signals
will be generated.
Hervé proposed to remove all the dma_mode_ok stuff from fdc to have a more
clear boundary between DMA and FDC, so this patch also does that.
Peter Maydell [Fri, 27 Mar 2020 16:04:22 +0000 (16:04 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- Fix another case of mirror block job deadlocks
- Minor fixes
# gpg: Signature made Fri 27 Mar 2020 15:18:37 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
qcow2: Remove unused fields from BDRVQcow2State
mirror: Wait only for in-flight operations
Revert "mirror: Don't let an operation wait for itself"
nvme: Print 'cqid' for nvme_del_cq
block: fix bdrv_root_attach_child forget to unref child_bs
block/iscsi:use the flags in iscsi_open() prevent Clang warning
Kevin Wolf [Thu, 26 Mar 2020 17:07:57 +0000 (18:07 +0100)]
qcow2: Remove unused fields from BDRVQcow2State
These fields were already removed in commit c3c10f72, but then commit b58deb34 revived them probably due to bad merge conflict resolution.
They are still unused, so remove them again.
Kevin Wolf [Thu, 26 Mar 2020 15:36:28 +0000 (16:36 +0100)]
mirror: Wait only for in-flight operations
mirror_wait_for_free_in_flight_slot() just picks a random operation to
wait for. However, a MirrorOp is already in s->ops_in_flight when
mirror_co_read() waits for free slots, so if not enough slots are
immediately available, an operation can end up waiting for itself, or
two or more operations can wait for each other to complete, which
results in a hang.
Fix this by adding a flag to MirrorOp that tells us if the request is
already in flight (and therefore occupies slots that it will later
free), and picking only such operations for waiting.
The fix was incomplete as it only protected against requests waiting for
themselves, but not against requests waiting for each other. We need a
different solution.