Kevin Wolf [Fri, 17 Aug 2018 12:58:49 +0000 (14:58 +0200)]
block: Remove aio_poll() in bdrv_drain_poll variants
bdrv_drain_poll_top_level() was buggy because it didn't release the
AioContext lock of the node to be drained before calling aio_poll().
This way, callbacks called by aio_poll() would possibly take the lock a
second time and run into a deadlock with a nested AIO_WAIT_WHILE() call.
However, it turns out that the aio_poll() call isn't actually needed any
more. It was introduced in commit 91af091f923, which is effectively
reverted by this patch. The cases it was supposed to fix are now covered
by bdrv_drain_poll(), which waits for block jobs to reach a quiescent
state.
Kevin Wolf [Fri, 7 Sep 2018 13:31:22 +0000 (15:31 +0200)]
blockjob: Lie better in child_job_drained_poll()
Block jobs claim in .drained_poll() that they are in a quiescent state
as soon as job->deferred_to_main_loop is true. This is obviously wrong,
they still have a completion BH to run. We only get away with this
because commit 91af091f923 added an unconditional aio_poll(false) to the
drain functions, but this is bypassing the regular drain mechanisms.
However, just removing this and telling that the job is still active
doesn't work either: The completion callbacks themselves call drain
functions (directly, or indirectly with bdrv_reopen), so they would
deadlock then.
As a better lie, tell that the job is active as long as the BH is
pending, but falsely call it quiescent from the point in the BH when the
completion callback is called. At this point, nested drain calls won't
deadlock because they ignore the job, and outer drains will wait for the
job to really reach a quiescent state because the callback is already
running.
Kevin Wolf [Thu, 6 Sep 2018 15:47:22 +0000 (17:47 +0200)]
block-backend: Decrease in_flight only after callback
Request callbacks can do pretty much anything, including operations that
will yield from the coroutine (such as draining the backend). In that
case, a decreased in_flight would be visible to other code and could
lead to a drain completing while the callback hasn't actually completed
yet.
Note that reordering these operations forbids calling drain directly
inside an AIO callback. As Paolo explains, indirectly calling it is
okay:
- Calling it through a coroutine is okay, because then
bdrv_drained_begin() goes through bdrv_co_yield_to_drain() and you
have in_flight=2 when bdrv_co_yield_to_drain() yields, then soon
in_flight=1 when the aio_co_wake() in the AIO callback completes, then
in_flight=0 after the bottom half starts.
- Calling it through a bottom half would be okay too, as long as the AIO
callback remembers to do inc_in_flight/dec_in_flight just like
bdrv_co_yield_to_drain() and bdrv_co_drain_bh_cb() do
A few more important cases that come to mind:
- A coroutine that yields because of I/O is okay, with a sequence
similar to bdrv_co_yield_to_drain().
- A coroutine that yields with no I/O pending will correctly decrease
in_flight to zero before yielding.
- Calling more AIO from the callback won't overflow the counter just
because of mutual recursion, because AIO functions always yield at
least once before invoking the callback.
Kevin Wolf [Fri, 7 Sep 2018 11:45:54 +0000 (13:45 +0200)]
block-backend: Fix potential double blk_delete()
blk_unref() first decreases the refcount of the BlockBackend and calls
blk_delete() if the refcount reaches zero. Requests can still be in
flight at this point, they are only drained during blk_delete():
At this point, arbitrary callbacks can run. If any callback takes a
temporary BlockBackend reference, it will first increase the refcount to
1 and then decrease it to 0 again, triggering another blk_delete(). This
will cause a use-after-free crash in the outer blk_delete().
Fix it by draining the BlockBackend before decreasing to refcount to 0.
Assert in blk_ref() that it never takes the first refcount (which would
mean that the BlockBackend is already being deleted).
Kevin Wolf [Thu, 6 Sep 2018 15:43:49 +0000 (17:43 +0200)]
block-backend: Add .drained_poll callback
A bdrv_drain operation must ensure that all parents are quiesced, this
includes BlockBackends. Otherwise, callbacks called by requests that are
completed on the BDS layer, but not quite yet on the BlockBackend layer
could still create new requests.
Kevin Wolf [Fri, 17 Aug 2018 16:54:18 +0000 (18:54 +0200)]
block: Add missing locking in bdrv_co_drain_bh_cb()
bdrv_do_drained_begin/end() assume that they are called with the
AioContext lock of bs held. If we call drain functions from a coroutine
with the AioContext lock held, we yield and schedule a BH to move out of
coroutine context. This means that the lock for the home context of the
coroutine is released and must be re-acquired in the bottom half.
Kevin Wolf [Thu, 6 Sep 2018 09:58:01 +0000 (11:58 +0200)]
test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callback
This is a regression test for a deadlock that occurred in block job
completion callbacks (via job_defer_to_main_loop) because the AioContext
lock was taken twice: once in job_finish_sync() and then again in
job_defer_to_main_loop_bh(). This would cause AIO_WAIT_WHILE() to hang.
Kevin Wolf [Fri, 17 Aug 2018 12:58:49 +0000 (14:58 +0200)]
job: Use AIO_WAIT_WHILE() in job_finish_sync()
job_finish_sync() needs to release the AioContext lock of the job before
calling aio_poll(). Otherwise, callbacks called by aio_poll() would
possibly take the lock a second time and run into a deadlock with a
nested AIO_WAIT_WHILE() call.
Also, job_drain() without aio_poll() isn't necessarily enough to make
progress on a job, it could depend on bottom halves to be executed.
Combine both open-coded while loops into a single AIO_WAIT_WHILE() call
that solves both of these problems.
Kevin Wolf [Fri, 17 Aug 2018 15:29:08 +0000 (17:29 +0200)]
test-blockjob: Acquire AioContext around job_cancel_sync()
All callers in QEMU proper hold the AioContext lock when calling
job_finish_sync(). test-blockjob should do the same when it calls the
function indirectly through job_cancel_sync().
Kevin Wolf [Wed, 5 Sep 2018 16:14:17 +0000 (18:14 +0200)]
aio-wait: Increase num_waiters even in home thread
Even if AIO_WAIT_WHILE() is called in the home context of the
AioContext, we still want to allow the condition to change depending on
other threads as long as they kick the AioWait. Specfically block jobs
can be running in an I/O thread and should then be able to kick a drain
in the main loop context.
Kevin Wolf [Fri, 17 Aug 2018 12:53:05 +0000 (14:53 +0200)]
blockjob: Wake up BDS when job becomes idle
In the context of draining a BDS, the .drained_poll callback of block
jobs is called. If this returns true (i.e. there is still some activity
pending), the drain operation may call aio_poll() with blocking=true to
wait for completion.
As soon as the pending activity is completed and the job finally arrives
in a quiescent state (i.e. its coroutine either yields with busy=false
or terminates), the block job must notify the aio_poll() loop to wake
up, otherwise we get a deadlock if both are running in different
threads.
Fam Zheng [Fri, 24 Aug 2018 02:43:42 +0000 (10:43 +0800)]
job: Fix nested aio_poll() hanging in job_txn_apply
All callers have acquired ctx already. Doing that again results in
aio_poll() hang. This fixes the problem that a BDRV_POLL_WHILE() in the
callback cannot make progress because ctx is recursively locked, for
example, when drive-backup finishes.
util/async: use qemu_aio_coroutine_enter in co_schedule_bh_cb
AIO Coroutines shouldn't by managed by an AioContext different than the
one assigned when they are created. aio_co_enter avoids entering a
coroutine from a different AioContext, calling aio_co_schedule instead.
Scheduled coroutines are then entered by co_schedule_bh_cb using
qemu_coroutine_enter, which just calls qemu_aio_coroutine_enter with the
current AioContext obtained with qemu_get_current_aio_context.
Eventually, co->ctx will be set to the AioContext passed as an argument
to qemu_aio_coroutine_enter.
This means that, if an IO Thread's AioConext is being processed by the
Main Thread (due to aio_poll being called with a BDS AioContext, as it
happens in AIO_WAIT_WHILE among other places), the AioContext from some
coroutines may be wrongly replaced with the one from the Main Thread.
This is the root cause behind some crashes, mainly triggered by the
drain code at block/io.c. The most common are these abort and failed
assertion:
util/async.c:aio_co_schedule
456 if (scheduled) {
457 fprintf(stderr,
458 "%s: Co-routine was already scheduled in '%s'\n",
459 __func__, scheduled);
460 abort();
461 }
But it's also known to cause random errors at different locations, and
even SIGSEGV with broken coroutine backtraces.
By using qemu_aio_coroutine_enter directly in co_schedule_bh_cb, we can
pass the correct AioContext as an argument, making sure co->ctx is not
wrongly altered.
Alberto Garcia [Thu, 6 Sep 2018 14:25:41 +0000 (17:25 +0300)]
block: Fix use after free error in bdrv_open_inherit()
When a block device is opened with BDRV_O_SNAPSHOT and the
bdrv_append_temp_snapshot() call fails then the error code path tries
to unref the already destroyed 'options' QDict.
This can be reproduced easily by setting TMPDIR to a location where
the QEMU process can't write:
block/linux-aio: acquire AioContext before qemu_laio_process_completions
In qemu_laio_process_completions_and_submit, the AioContext is acquired
before the ioq_submit iteration and after qemu_laio_process_completions,
but the latter is not thread safe either.
This change avoids a number of random crashes when the Main Thread and
an IO Thread collide processing completions for the same AioContext.
This is an example of such crash:
- The IO Thread is trying to acquire the AioContext at aio_co_enter,
which evidences that it didn't lock it before:
Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)):
#0 0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0
#2 0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0)
at ../nptl/pthread_mutex_lock.c:114
#3 0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66
#4 0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493
#5 0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478
#6 0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104
#7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
at block/linux-aio.c:222
#8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
at block/linux-aio.c:237
#9 0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406
#10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true)
at util/aio-posix.c:693
#11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64
#12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308
#13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
- The Main Thread is also processing completions from the same
AioContext, and crashes due to failed assertion at util/iov.c:78:
Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)):
#0 0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90
#2 0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset")
at assert.c:92
#3 0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101
#4 0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78
#5 0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410
#6 0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88
#7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
at block/linux-aio.c:222
#8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
at block/linux-aio.c:237
#9 0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272
#10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497
#11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574
#12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false)
at util/aio-posix.c:604
#13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273
#14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390
#15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590
#16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774
#17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401
#18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449
#19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080)
at block/commit.c:92
#20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973
#21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90
#22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118
#23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436
#24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
#25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201
#26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854
#27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215
#28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
#29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
#30 0x00005631fbd81412 in main_loop () at vl.c:1866
#31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
at vl.c:4647
- A closer examination shows that s->io_q.in_flight appears to have
gone backwards:
Kevin Wolf [Tue, 27 Jun 2017 15:18:20 +0000 (17:18 +0200)]
commit: Add top-node/base-node options
The block-commit QMP command required specifying the top and base nodes
of the commit jobs using the file name of that node. While this works
in simple cases (local files with absolute paths), the file names
generated for more complicated setups can be hard to predict.
The block-commit command has more problems than just this, so we want to
replace it altogether in the long run, but libvirt needs a reliable way
to address nodes now. So we don't want to wait for a new, cleaner
command, but just add the minimal thing needed right now.
This adds two new options top-node and base-node to the command, which
allow specifying node names instead. They are mutually exclusive with
the old options.
John Snow [Thu, 6 Sep 2018 13:02:25 +0000 (09:02 -0400)]
blockdev: document transactional shortcomings
Presently only the backup job really guarantees what one would consider
transactional semantics. To guard against someone helpfully adding them
in the future, document that there are shortcomings in the model that
would need to be audited at that time.
John Snow [Thu, 6 Sep 2018 13:02:18 +0000 (09:02 -0400)]
tests/test-blockjob: remove exit callback
We remove the exit callback and the completed boolean along with it.
We can simulate it just fine by waiting for the job to defer to the
main loop, and then giving it one final kick to get the main loop
portion to run.
During refactor, a potential problem with bdrv_drop_intermediate
was identified, the patched behavior is no worse than the pre-patch
behavior, so leave a FIXME for now to be fixed in a future patch.
Peter Maydell [Tue, 25 Sep 2018 12:30:45 +0000 (13:30 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-3.1-20180925' into staging
ppc patch queue 2018-09-25
Here are the accumulated ppc target patches for the last several
weeks. Highlights are:
* A number of 40p / PReP cleanups
* Preliminary irq rework on the pseries machine towards the new
XIVE interrupt controller
There are a few patches which make small changes to generic device and
arm code as prerequisites to the 40p interrupt routing cleanup. They
have acks from the relevant maintainers.
* remotes/dgibson/tags/ppc-for-3.1-20180925:
40p: add fixed IRQ routing for LSI SCSI device
lsi53c895a: add optional external IRQ via qdev
scsi: remove unused lsi53c895a_create() and lsi53c810_create() functions
scsi: move lsi53c8xx_create() callers to lsi53c8xx_handle_legacy_cmdline()
scsi: add lsi53c8xx_handle_legacy_cmdline() function
sm501: Adjust endianness of pixel value in rectangle fill
spapr_pci: add an extra 'nr_msis' argument to spapr_populate_pci_dt
spapr: increase the size of the IRQ number space
spapr: introduce a spapr_irq class 'nr_msis' attribute
40p: use OR gate to wire up raven PCI interrupts
raven: some minor IRQ-related tidy-ups
hw/ppc: on 40p machine, change default firmware to OpenBIOS
target/ppc/cpu-models: Re-group the 970 CPUs together again
Record history of ppcemb target in common.json
* remotes/cody/tags/block-pull-request:
curl: Make sslverify=off disable host as well as peer verification.
block/rbd: add deprecation documentation for filename keyvalue pairs
block/rbd: add iotest for rbd legacy keyvalue filename parsing
block/rbd: Attempt to parse legacy filenames
block/rbd: pull out qemu_rbd_convert_options
* remotes/armbru/tags/pull-qobject-2018-09-24:
json: Eliminate lexer state IN_WHITESPACE, pseudo-token JSON_SKIP
json: Eliminate lexer state IN_ERROR
json: Nicer recovery from lexical errors
json: Make lexer's "character consumed" logic less confusing
json: Clean up how lexer consumes "end of input"
json: Fix lexer for lookahead character beyond '\x7F'
* remotes/armbru/tags/pull-error-2018-09-24:
MAINTAINERS: Fix F: patterns that don't match anything
Drop "qemu:" prefix from error_report() arguments
qemu-error: make use of {error, warn}_report_once_cond
qemu-error: add {error, warn}_report_once_cond
Peter Maydell [Tue, 25 Sep 2018 10:05:56 +0000 (11:05 +0100)]
Merge remote-tracking branch 'remotes/xtensa/tags/20180918-xtensa' into staging
target/xtensa updates:
- fix gdbstub register counts;
- add big-endian core test_kc705_be;
- convert to do_transaction_failed and add test for failed memory
transactions;
- fix couple FPU2000 bugs;
- fix s32c1i implementation;
- clean up exception handlers generation in xtensa tests;
- add support for semihosting console input through a chardev.
curl: Make sslverify=off disable host as well as peer verification.
The sslverify setting is supposed to turn off all TLS certificate
checks in libcurl. However because of the way we use it, it only
turns off peer certificate authenticity checks
(CURLOPT_SSL_VERIFYPEER). This patch makes it also turn off the check
that the server name in the certificate is the same as the server
you're connecting to (CURLOPT_SSL_VERIFYHOST).
We can use Google's server at 8.8.8.8 which happens to have a bad TLS
certificate to demonstrate this:
$ ./qemu-img create -q -f qcow2 -b 'json: { "file.sslverify": "off", "file.driver": "https", "file.url": "https://8.8.8.8/foo" }' /var/tmp/file.qcow2
qemu-img: /var/tmp/file.qcow2: CURL: Error opening file: SSL: no alternative certificate subject name matches target host name '8.8.8.8'
Could not open backing image to determine size.
With this patch applied, qemu-img connects to the server regardless of
the bad certificate:
$ ./qemu-img create -q -f qcow2 -b 'json: { "file.sslverify": "off", "file.driver": "https", "file.url": "https://8.8.8.8/foo" }' /var/tmp/file.qcow2
qemu-img: /var/tmp/file.qcow2: CURL: Error opening file: The requested URL returned error: 404 Not Found
(The 404 error is expected because 8.8.8.8 is not actually serving a
file called "/foo".)
Of course the default (without sslverify=off) remains to always check
the certificate:
$ ./qemu-img create -q -f qcow2 -b 'json: { "file.driver": "https", "file.url": "https://8.8.8.8/foo" }' /var/tmp/file.qcow2
qemu-img: /var/tmp/file.qcow2: CURL: Error opening file: SSL: no alternative certificate subject name matches target host name '8.8.8.8'
Could not open backing image to determine size.
Further information about the two settings is available here:
Jeff Cody [Tue, 11 Sep 2018 22:32:32 +0000 (18:32 -0400)]
block/rbd: add iotest for rbd legacy keyvalue filename parsing
This is a small test that will check for the ability to parse
both legacy and modern options for rbd.
The way the test is set up is for failure to occur, but without
having to wait to timeout on a non-existent rbd server. The error
messages in the success path show that the arguments were parsed.
The failure behavior prior to the patch series that has this test, is
qemu-img complaining about mandatory options (e.g. 'pool') not being
provided.
Jeff Cody [Tue, 11 Sep 2018 22:32:31 +0000 (18:32 -0400)]
block/rbd: Attempt to parse legacy filenames
When we converted rbd to get rid of the older key/value-centric
encoding format, we broke compatibility with image files with backing
file strings encoded in the old format.
This leaves a bit of an ugly conundrum, and a hacky solution.
If the initial attempt to parse the "proper" options fails, it assumes
that we may have an older key/value encoded filename. Fall back to
attempting to parse the filename, and extract the required options from
it. If that fails, pass along the original error message.
We do not support mixed modern usage alongside legacy keyvalue pair
usage.
A deprecation warning has been added, although care should be taken
when actually deprecating since the impact is not limited to
commandline or qapi usage, but also opening existing images.
Mark Cave-Ayland [Wed, 19 Sep 2018 17:21:01 +0000 (18:21 +0100)]
40p: add fixed IRQ routing for LSI SCSI device
Whilst the PReP specification describes how all PCI IRQs are routed via IRQ
15 on the interrupt controller, the real 40p machine has a routing quirk in
that the LSI SCSI device is routed directly to IRQ 13.
Enable the external IRQ for the LSI SCSI device by wiring up the IRQ with
qdev to the relevant interrupt controller gpio.
Mark Cave-Ayland [Wed, 19 Sep 2018 17:20:58 +0000 (18:20 +0100)]
scsi: move lsi53c8xx_create() callers to lsi53c8xx_handle_legacy_cmdline()
As part of commits a64aa5785d "hw: Deprecate -drive if=scsi with non-onboard
HBAs" and b891538e81 "hw/ppc/prep: Fix implicit creation of "-drive if=scsi"
devices" the lsi53c895a_create() and lsi53c810_create() functions were added
to wrap pci_create_simple() and scsi_bus_legacy_handle_cmdline().
Unfortunately this prevents us from changing qdev properties on the device
and/or changing the PCI configuration. By switching over to using the new
lsi53c8xx_handle_legacy_cmdline() function then the caller can now configure
and realize the LSI SCSI device exactly as required.
Marcus Comstedt [Wed, 19 Sep 2018 12:31:14 +0000 (14:31 +0200)]
sm501: Adjust endianness of pixel value in rectangle fill
The value from twoD_foreground (which is in host endian format) must
be converted to the endianness of the framebuffer (currently always
little endian) before it can be used to perform the fill operation.
The new layout using static IRQ number does not leave much space to
the dynamic MSI range, only 0x100 IRQ numbers. Increase the total
number of IRQS for newer machines and introduce a legacy XICS backend
for pre-3.1 machines to maintain compatibility.
For the old backend, provide a 'nr_msis' value covering the full IRQ
number space as it does not use the bitmap allocator to allocate MSI
interrupt numbers.
spapr: introduce a spapr_irq class 'nr_msis' attribute
The number of MSI interrupts a sPAPR machine can allocate is in direct
relation with the number of interrupts of the sPAPRIrq backend. Define
statically this value at the sPAPRIrq class level and use it for the
"ibm,pe-total-#msi" property of the sPAPR PHB.
According to the PAPR specs, "ibm,pe-total-#msi" defines the maximum
number of MSIs that are available to the PE. We choose to advertise
the maximum number of MSIs that are available to the machine for
simplicity of the model and to avoid segmenting the MSI interrupt pool
which can be easily shared. If the pool limit is reached, it can be
extended dynamically.
Finally, remove XICS_IRQS_SPAPR which is now unused.
According to the PReP specification section 6.1.6 "System Interrupt
Assignments", all PCI interrupts are routed via IRQ 15.
Instead of mapping each PCI IRQ separately, we introduce an OR gate within the
raven PCI host bridge and then wire the single output of the OR gate to the
interrupt controller.
Note that whilst the (now deprecated) PReP machine still exists we still need
to preserve the old IRQ routing. This is done by adding a new "is-legacy-prep"
property to the raven PCI host bridge which is set to true for the PReP
machine.
This really lays the groundwork for the upcoming patches: it renames the
irqs PREPPCIState struct member to pci_irqs (as soon there will be a
distinction) and then changes the raven IRQ opaque to use PREPPCIState
instead of just irqs array.
Thomas Huth [Fri, 7 Sep 2018 13:19:54 +0000 (15:19 +0200)]
target/ppc/cpu-models: Re-group the 970 CPUs together again
The addition of the POWER9 CPUs divided the entries for the 970 CPUs,
which is a little bit confusing when you look at the code. So let's
re-group the 970 CPUs together again, and since these chips have been
based on the POWER4 processor, move them also in front of the POWER5
chips now.
Thomas Huth [Tue, 21 Aug 2018 11:27:48 +0000 (13:27 +0200)]
Record history of ppcemb target in common.json
We recently removed the long deprecated "ppcemb" target. This adds a
comment in common.json about the SysEmuTarget type, recording when it was
removed.
Peter Maydell [Mon, 24 Sep 2018 17:49:11 +0000 (18:49 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio: fixes, features
pci resource capability + misc fixes everywhere.
Signed-off-by: Michael S. Tsirkin <[email protected]>
# gpg: Signature made Fri 07 Sep 2018 22:50:38 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <[email protected]>"
# gpg: aka "Michael S. Tsirkin <[email protected]>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
tests: update acpi expected files
vhost: fix invalid downcast
pc: make sure that guest isn't able to unplug the first cpu
hw/pci: add PCI resource reserve capability to legacy PCI bridge
hw/pci: factor PCI reserve resources to a separate structure
virtio: update MemoryRegionCaches when guest negotiates features
pc: acpi: revert back to 1 SRAT entry for hotpluggable area
Peter Maydell [Mon, 24 Sep 2018 17:12:54 +0000 (18:12 +0100)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2018-09-07-1' into staging
Merge tpm 2018/09/07 v1
# gpg: Signature made Fri 07 Sep 2018 21:38:06 BST
# gpg: using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2018-09-07-1:
tests: Fix signalling race condition in TPM tests
Peter Maydell [Mon, 24 Sep 2018 16:14:10 +0000 (17:14 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-3.1-20180907' into staging
ppc patch queue 2018-09-07
Here's another pull request for qemu-3.1. No real theme here, just an
assortment of various fixes. Probably the most notable thing is the
removal of the ppcemb target which has been deprecated for some time
now.
* remotes/dgibson/tags/ppc-for-3.1-20180907:
target-ppc: Extend HWCAP2 bits for ISA 3.0
target/ppc/kvm: set vcpu as online/offline
Fix a deadlock case in the CPU hotplug flow
spapr: Correct reference count on spapr-cpu-core
mac_newworld: implement custom FWPathProvider
uninorth: add ofw-addr property to allow correct fw path generation
mac_oldworld: implement custom FWPathProvider
grackle: set device fw_name and address for correct fw path generation
macio: add addr property to macio IDE object
macio: add macio bus to help with fw path generation
macio: move MACIOIDEState type declarations to macio.h
spapr_pci: fix potential NULL pointer dereference
spapr: fix leak of rev array
ppc: Remove deprecated ppcemb target
json: Eliminate lexer state IN_WHITESPACE, pseudo-token JSON_SKIP
The lexer ignores whitespace like this:
on whitespace on non-ws spontaneously
IN_START --> IN_WHITESPACE --> JSON_SKIP --> IN_START
^ |
\__/ on whitespace
This accumulates a whitespace token in state IN_WHITESPACE, only to
throw it away on the transition via JSON_SKIP to the start state.
Wasteful. Go from IN_START to IN_START on whitespace directly,
dropping the whitespace character.
When the lexer chokes on an input character, it consumes the
character, emits a JSON error token, and enters its start state. This
can lead to suboptimal error recovery. For instance, input
0123 ,
produces the tokens
JSON_ERROR 01
JSON_INTEGER 23
JSON_COMMA ,
Make the lexer skip characters after a lexical error until a
structural character ('[', ']', '{', '}', ':', ','), an ASCII control
character, or '\xFE', or '\xFF'.
Note that we must not skip ASCII control characters, '\xFE', '\xFF',
because those are documented to force the JSON parser into known-good
state, by docs/interop/qmp-spec.txt.
The lexer now produces
JSON_ERROR 01
JSON_COMMA ,
Update qmp-test for the nicer error recovery: QMP now reports just one
error for input %p instead of two. Also drop the newline after %p; it
was needed to tease out the second error.
json: Make lexer's "character consumed" logic less confusing
The lexer uses macro TERMINAL_NEEDED_LOOKAHEAD() to decide whether a
state transition consumes the input character. It returns true when
the state transition is defined with the TERMINAL() macro. To detect
that, it checks whether input '\0' would have resulted in the same
state transition, and the new state is not IN_ERROR.
Why does that even work? For all states, the new state on input '\0'
is either IN_ERROR or defined with TERMINAL(). If the state
transition equals the one we'd get for input '\0', it goes to IN_ERROR
or to the argument of TERMINAL(). We never use TERMINAL(IN_ERROR),
because it makes no sense. Thus, if it doesn't go to IN_ERROR, it
must be defined with TERMINAL().
Since this isn't quite confusing enough, we negate the result to get
@char_consumed, and ignore it when @flush is true.
Instead of deriving the lookahead bit from the state transition, make
it explicit. This is easier to understand, and a bit more flexible,
too.
When the lexer isn't in its start state at the end of input, it's
working on a token. To flush it out, it needs to transit to its start
state on "end of input" lookahead.
There are two ways to the start state, depending on the current state:
* If the lexer is in a TERMINAL(JSON_FOO) state, it can emit a
JSON_FOO token.
* Else, it can go to IN_ERROR state, and emit a JSON_ERROR token.
There are complications, however:
* The transition to IN_ERROR state consumes the input character and
adds it to the JSON_ERROR token. The latter is inappropriate for
the "end of input" character, so we suppress that. See also recent
commit a2ec6be72b8 "json: Fix lexer to include the bad character in
JSON_ERROR token".
* The transition to a TERMINAL(JSON_FOO) state doesn't consume the
input character. In that case, the lexer normally loops until it is
consumed. We have to suppress that for the "end of input" input
character. If we didn't, the lexer would consume it by entering
IN_ERROR state, emitting a bogus JSON_ERROR token. We fixed that in
commit bd3924a33a6.
However, simply breaking the loop this way assumes that the lexer
needs exactly one state transition to reach its start state. That
assumption is correct now, but it's unclean, and I'll soon break it.
Clean up: instead of breaking the loop after one iteration, break it
after it reached the start state.
Peter Maydell [Mon, 24 Sep 2018 15:46:43 +0000 (16:46 +0100)]
Merge remote-tracking branch 'remotes/alistair/tags/pull-riscv-pullreq-20180905' into staging
A misc collection of RISC-V related patches for 3.1.
# gpg: Signature made Wed 05 Sep 2018 23:06:55 BST
# gpg: using RSA key 21E10D29DF977054
# gpg: Good signature from "Alistair Francis <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: F6C4 AC46 D493 4868 D3B8 CE8F 21E1 0D29 DF97 7054
* remotes/alistair/tags/pull-riscv-pullreq-20180905:
riscv: remove define cpu_init()
hw/riscv/spike: Set the soc device tree node as a simple-bus
hw/riscv/virtio: Set the soc device tree node as a simple-bus
target/riscv: call gen_goto_tb on DISAS_TOO_MANY
target/riscv: optimize indirect branches
target/riscv: optimize cross-page direct jumps in softmmu
RISC-V: Simplify riscv_cpu_local_irqs_pending
RISC-V: Use atomic_cmpxchg to update PLIC bitmaps
RISC-V: Improve page table walker spec compliance
RISC-V: Update address bits to support sv39 and sv48
* remotes/kraxel/tags/vga-20180903-pull-request:
virtio-gpu: add iommu support
virtio-gpu: pass down VirtIOGPU pointer to a bunch of functions
use dpy_gfx_update_full
Revert "virtio-gpu: fix crashes upon warm reboot with vga mode"
virtio-vga: fix reset
Cornelia Huck [Thu, 30 Aug 2018 14:59:01 +0000 (16:59 +0200)]
qemu-error: add {error, warn}_report_once_cond
Add two functions to print an error/warning report once depending
on a passed-in condition variable and flip it if printed. This is
useful if you want to print a message not once-globally, but e.g.
once-per-device.
Inspired by warn_once() in hw/vfio/ccw.c, which has been replaced
with warn_report_once_cond().
Markus spotted some issues with this new test case which
unfortunately I didn't notice had been flagged until after
I'd applied the pull request. Revert the relevant commit.
* remotes/cohuck/tags/s390x-20180829:
target/s390x: use regular spaces in translate.c
hw/s390x: Move virtio-ccw-blk code to a separate file
hw/s390x: Move virtio-ccw-net code to a separate file
hw/s390x: Move virtio-ccw-input code to a separate file
hw/s390x: Move virtio-ccw-gpu code to a separate file
hw/s390x: Move vhost-vsock-ccw code to a separate file
hw/s390x: Move virtio-ccw-crypto code to a separate file
hw/s390x: Move virtio-ccw-9p code to a separate file
hw/s390x: Move virtio-ccw-rng code to a separate file
hw/s390x: Move virtio-ccw-scsi code to a separate file
hw/s390x: Move virtio-ccw-balloon code to a separate file
hw/s390x: Move virtio-ccw-serial code to a separate file
hw/s390x/virtio-ccw: Consolidate calls to virtio_ccw_unrealize()
target/s390x: fix PACK reading 1 byte less and writing 1 byte more
target/s390x: add EX support for TRT and TRTR
target/s390x: fix IPM polluting irrelevant bits
target/s390x: fix CSST decoding and runtime alignment check
target/s390x: add BAL and BALR instructions
tests/tcg: add a simple s390x test
Peter Maydell [Mon, 24 Sep 2018 09:46:33 +0000 (10:46 +0100)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2018-08-28' into staging
QAPI patches for 2018-08-28
# gpg: Signature made Tue 28 Aug 2018 17:23:32 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg: aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2018-08-28:
qapi: Add comments to aid debugging generated introspection
qapi: Minor introspect.py cleanups
qapi: Update docs for generator changes since commit 9ee86b85267
qapi: Emit a blank line before dummy declaration
qapi: Drop qapi_event_send_FOO()'s Error ** argument
qapi: Fix build_params() for empty parameter list
Max Filippov [Tue, 11 Sep 2018 01:23:35 +0000 (18:23 -0700)]
target/xtensa: fix s32c1i TCGMemOp flags
s32c1i must load and store value with target endianness, not host.
This results in an infinite loop in atomic cmpxchg sequences when target
endianness doesn't match host endianness.
Max Filippov [Fri, 31 Aug 2018 08:48:26 +0000 (01:48 -0700)]
tests/tcg/xtensa: only generate defined exception handlers
Don't generate handlers for IRQ levels that are not defined for the CPU
or for window overflow/underflow exceptions for configs w/o windowed
registers.
Max Filippov [Fri, 31 Aug 2018 19:00:42 +0000 (12:00 -0700)]
tests/tcg/xtensa: move exception handlers to separate section
Not all CPU configurations may have enough space for handler code
between exception/interrupt vectors. Leave jumps to the handlers at the
vectors, but move all handlers past the vectors area.
Max Filippov [Fri, 31 Aug 2018 07:40:28 +0000 (00:40 -0700)]
target/xtensa: fix FPU2000 bugs
- FPU2000 defines rfr and wfr opcodes, not rfr.s and wfr.s;
- movcond.s uses incorrect operand in tcg_gen_movcond: in case the
condition is not satisfied it must not change its argument 0.
Max Filippov [Mon, 20 Aug 2018 03:21:50 +0000 (20:21 -0700)]
tests/tcg/xtensa: add test for failed memory transactions
Failed memory transactions should raise exceptions 14 (for fetch) or 15
(for load/store) with XEA2.
Memory accesses that result in TLB miss followed by an attempt to load
PTE from physical memory which fails should raise InstTLBMiss or
LoadStoreTLBMiss with XEA2.
When a container fails, it leaves a dangling tarball which name is
based on a timestamp. Further uses of make won't clean those files,
neither calling the 'docker-clean' target.
Use the .DELETE_ON_ERROR built-in target to let make remove those
temporary tarballs in case of failure.
virtio_queue_get_desc_addr returns 64-bit hwaddr while int is usually 32-bit.
If returned hwaddr is not equal to 0 but least-significant 32 bits are
equal to 0 then this code will not actually stop running queue.