Git Repo - qemu.git/log

target/s390x: Move DisasFields into DisasContext

I believe that the separate allocation of DisasFields from DisasContext
was meant to limit the places from which we could access fields. But
that plan did not go unchanged, and since DisasContext contains a pointer
to fields, the substructure is accessible everywhere.

By allocating the substructure with DisasContext, we improve the locality
of the accesses by avoiding one level of pointer chasing. In addition,
we avoid a dangling pointer to stack allocated memory, diagnosed by static
checkers.
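
As a rough illustration of the layout change (a simplified sketch, not
the actual target/s390x code; the Demo* names, the field layout and
DEMO_NUM_FIELDS are made up for this example), embedding the
substructure means a field access goes through a single pointer:

```c
#include <stdbool.h>

#define DEMO_NUM_FIELDS 4   /* hypothetical; the real field count differs */

/* Simplified stand-in for DisasFields. */
typedef struct DemoFields {
    unsigned present;           /* bitmask of decoded fields */
    int c[DEMO_NUM_FIELDS];     /* decoded field values */
} DemoFields;

/* Simplified stand-in for DisasContext with the substructure embedded. */
typedef struct DemoContext {
    unsigned long pc;
    DemoFields fields;          /* embedded, not a pointer to a stack object */
} DemoContext;

/* Helpers take the context directly, as callers only ever had s->fields. */
static bool demo_have_field(const DemoContext *s, int idx)
{
    return (s->fields.present >> idx) & 1;
}

static int demo_get_field(const DemoContext *s, int idx)
{
    return s->fields.c[idx];
}
```

Because the fields live inside the context, their lifetime matches the
context's and no pointer to a shorter-lived object is left behind.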

Launchpad: https://bugs.launchpad.net/bugs/1661815
Signed-off-by: Richard Henderson <[email protected]>
Message-Id: <20200123232248 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

target/s390x: Pass DisasContext to get_field and have_field

All callers pass s->fields, so we might as well pass s directly.

Signed-off-by: Richard Henderson <[email protected]>
Message-Id: <20200123232248 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

target/s390x: Remove DisasFields argument from callbacks

The DisasFields data is available from DisasContext.
We do not need to pass a separate argument.

Signed-off-by: Richard Henderson <[email protected]>
Message-Id: <20200123232248 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

target/s390x: Move struct DisasFields definition earlier

We will want to include the struct in DisasContext.

Signed-off-by: Richard Henderson <[email protected]>
Message-Id: <20200123232248 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

target/s390x/kvm: Enable adapter interruption suppression again

The AIS feature has been disabled late in the v2.10 development cycle since
there were some issues with migration (see commit 3f2d07b3b01ea61126b -
"s390x/ais: for 2.10 stable: disable ais facility"). We originally wanted
to enable it again for newer machine types, but apparently we forgot to do
this so far. Let's do it now for the machines that support proper CPU models.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1756946
Signed-off-by: Thomas Huth <[email protected]>
Message-Id: <20200122101437 [email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Tested-by: Matthew Rosato <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

docs/devel: fix stable process doc formatting

Enumeration of stable criteria needs proper bullet points.

Message-Id: <20200113103023 [email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

target/s390x: Remove duplicated ifdef macro

Commit ae71ed8610 replaced the use of global max_cpus variable
with a machine property, but introduced an unnecessary ifdef, as
this block is already in the 'not CONFIG_USER_ONLY' branch part:

  #if defined(CONFIG_USER_ONLY)
  ...
  #else /* !CONFIG_USER_ONLY */
  ...
  static void do_ext_interrupt(CPUS390XState *env)
  {
  ...
  #ifndef CONFIG_USER_ONLY
          MachineState *ms = MACHINE(qdev_get_machine());
          unsigned int max_cpus = ms->smp.max_cpus;
  #endif

To ease code review, remove the duplicated preprocessor macro,
and move the declarations to the beginning of the statement.

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x/event-facility: fix error propagation

We currently check (by mistake) if the passed-in Error pointer errp
is non-null and return after realizing the first child of the
event facility in that case. The symptom is that 'virsh shutdown'
does not work, as the sclpquiesce device is not realized.

Fix this by (correctly) checking the local Error err.
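
The difference between the two checks can be sketched as follows. This
is a minimal model only: DemoError and the demo_* functions are
hypothetical stand-ins, and QEMU's real Error API is richer than this.

```c
#include <stddef.h>
#include <stdbool.h>

/* Minimal stand-in for an error object. */
typedef struct DemoError {
    const char *msg;
} DemoError;

static int demo_realized;   /* counts children realized so far */

/* Hypothetical child realize: fails only when 'fail' is true. */
static void demo_child_realize(bool fail, DemoError **errp)
{
    static DemoError failed = { "child realize failed" };

    if (fail) {
        if (errp) {
            *errp = &failed;
        }
        return;
    }
    demo_realized++;
}

/* Realize two children, stopping only if a child actually failed. */
static void demo_realize(bool first_fails, DemoError **errp)
{
    DemoError *err = NULL;

    demo_child_realize(first_fails, &err);
    if (err) {              /* the buggy variant tested 'errp' here */
        if (errp) {
            *errp = err;
        }
        return;
    }
    demo_child_realize(false, &err);
    if (err && errp) {
        *errp = err;
    }
}
```

With the buggy test, any caller passing a non-null errp would stop
after the first child even though nothing had failed.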

Reported-by: Christian Borntraeger <[email protected]>
Fixes: 3d508334dd2c ("s390x/event-facility: Fix realize() error API violations")
Message-Id: <20200121095506 [email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Tested-by: Christian Borntraeger <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Markus Armbruster <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x: adapter routes error handling

If the kernel irqchip has been disabled, we don't want the
{add,release}_adapter_routes routines to call any kvm_irqchip_*
interfaces, as they may rely on an irqchip actually having been
created. Just take a quick exit in that case instead. If you are
trying to use irqfd without a kernel irqchip, we will fail with
an error.

Also initialize routes->gsi[] with -1 in the virtio-ccw handling,
to make sure we don't trip over other errors, either. (Nobody
else uses the gsi array in that structure.)
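
A minimal sketch of the idea, assuming a made-up DemoRoutes structure
and array size (the real structure and the kvm_irqchip_* calls are not
reproduced here): slots start out as -1, and the release path exits
early when no kernel irqchip is in use.

```c
#define DEMO_MAX_ROUTES 4   /* hypothetical; not the real array size */

typedef struct DemoRoutes {
    int num_routes;
    int gsi[DEMO_MAX_ROUTES];
} DemoRoutes;

/* Mark every slot as unused so later cleanup can skip it. */
static void demo_routes_init(DemoRoutes *routes)
{
    routes->num_routes = 0;
    for (int i = 0; i < DEMO_MAX_ROUTES; i++) {
        routes->gsi[i] = -1;
    }
}

/* Release only the slots that were actually assigned; returns the count. */
static int demo_routes_release(DemoRoutes *routes, int irqchip_enabled)
{
    int released = 0;

    if (!irqchip_enabled) {
        return 0;           /* quick exit: do not touch any route */
    }
    for (int i = 0; i < DEMO_MAX_ROUTES; i++) {
        if (routes->gsi[i] >= 0) {
            routes->gsi[i] = -1;
            released++;
        }
    }
    return released;
}
```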

Fixes: d426d9fba8ea ("s390x/virtio-ccw: wire up irq routing and irqfds")
Reviewed-by: Thomas Huth <[email protected]>
Acked-by: Christian Borntraeger <[email protected]>
Message-Id: <20200117111147 [email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x/event-facility.c: remove unneeded labels

'out' label from write_event_mask() and write_event_data()
can be replaced by 'return'.

The 'out' label from read_event_data() can also be replaced.
However, as suggested by Cornelia Huck, instead of simply
replacing the 'out' label, let's also change the code flow
a bit to make it clearer that sccb events are always handled
regardless of the mask for unconditional reads, while selective
reads are handled if the mask is valid.

CC: Cornelia Huck <[email protected]>
CC: Thomas Huth <[email protected]>
CC: Halil Pasic <[email protected]>
CC: Christian Borntraeger <[email protected]>
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Message-Id: <20200108144607 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

intc/s390_flic_kvm.c: remove unneeded label in kvm_flic_load()

'out' label can be replaced by 'return' with the appropriate
value that is set by 'r' right before the jump.

Cc: Christian Borntraeger <[email protected]>
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Message-Id: <20200106182425 [email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

s390x/sclp.c: remove unneeded label in sclp_service_call()

'out' label can be replaced by 'return' with the appropriate
value. The 'r' integer, which is used solely to set the
return value for this label, can also be removed.
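
The shape of this cleanup, shown on a made-up function (the status
codes and checks below are hypothetical and do not mirror
sclp_service_call() itself):

```c
/* Hypothetical status codes for the sketch. */
enum { DEMO_OK = 0, DEMO_ERR_PRIV = 1, DEMO_ERR_ADDR = 2 };

/* Before: a local 'r' exists only to feed the 'out' label. */
static int demo_call_with_label(int privileged, unsigned addr)
{
    int r = DEMO_OK;

    if (!privileged) {
        r = DEMO_ERR_PRIV;
        goto out;
    }
    if (addr & 7) {
        r = DEMO_ERR_ADDR;
        goto out;
    }
out:
    return r;
}

/* After: return the value directly, no label and no 'r'. */
static int demo_call(int privileged, unsigned addr)
{
    if (!privileged) {
        return DEMO_ERR_PRIV;
    }
    if (addr & 7) {
        return DEMO_ERR_ADDR;
    }
    return DEMO_OK;
}
```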

CC: Cornelia Huck <[email protected]>
CC: Halil Pasic <[email protected]>
CC: Christian Borntraeger <[email protected]>
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Message-Id: <20200106182425 [email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Register qdev properties as class properties (Marc-André)
* Cleanups (Philippe)
* virtio-scsi fix (Pan Nengyuan)
* Tweak Skylake-v3 model id (Kashyap)
* x86 UCODE_REV support and nested live migration fix (myself)
* Advisory mode for pvpanic (Zhenwei)

# gpg: Signature made Fri 24 Jan 2020 20:16:23 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <[email protected]>" [full]
# gpg:                 aka "Paolo Bonzini <[email protected]>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (58 commits)
  build-sys: clean up flags included in the linker command line
  target/i386: Add the 'model-id' for Skylake -v3 CPU models
  qdev: use object_property_help()
  qapi/qmp: add ObjectPropertyInfo.default-value
  qom: introduce object_property_help()
  qom: simplify qmp_device_list_properties()
  vl: print default value in object help
  qdev: register properties as class properties
  qdev: move instance properties to class properties
  qdev: rename DeviceClass.props
  qdev: set properties with device_class_set_props()
  object: return self in object_ref()
  object: release all props
  object: add object_class_property_add_link()
  object: express const link with link property
  object: add direct link flag
  object: rename link "child" to "target"
  object: check strong flag with &
  object: do not free class properties
  object: add object_property_set_default
  ...

Signed-off-by: Peter Maydell <[email protected]>

build-sys: clean up flags included in the linker command line

Some of the CFLAGS that are discovered during configure, for example
compiler warnings, are being included on the linker command line because
QEMU_CFLAGS is added to it.  Other flags, such as -m32, appear twice
because they are included in both QEMU_CFLAGS and LDFLAGS.  All this
leads to confusion with respect to what goes in which Makefile variables
(and we have plenty).

So, introduce QEMU_LDFLAGS for flags discovered by configure, following
the lead of QEMU_CFLAGS, and stop adding to it:

1) options that are already in CFLAGS, for example "-g"

2) duplicate options

At the same time, options that _are_ needed by both compiler and linker
must now be added to both QEMU_CFLAGS and QEMU_LDFLAGS, which is clearer.
This is mostly -fsanitize options.  For now, --extra-cflags has this behavior
(but --extra-cxxflags does not).

Meson will not include CFLAGS on the linker command line, so do the same
in our build system as well.

Signed-off-by: Marc-André Lureau <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target/i386: Add the 'model-id' for Skylake -v3 CPU models

This fixes a confusion in the help output.  (Although, if you squint
long enough at the '-cpu help' output, you _do_ notice that
"Skylake-Client-noTSX-IBRS" is an alias of "Skylake-Client-v3";
similarly for Skylake-Server-v3.)

Without this patch:

    $ qemu-system-x86 -cpu help
    ...
    x86 Skylake-Client-v1     Intel Core Processor (Skylake)
    x86 Skylake-Client-v2     Intel Core Processor (Skylake, IBRS)
    x86 Skylake-Client-v3     Intel Core Processor (Skylake, IBRS)
    ...
    x86 Skylake-Server-v1     Intel Xeon Processor (Skylake)
    x86 Skylake-Server-v2     Intel Xeon Processor (Skylake, IBRS)
    x86 Skylake-Server-v3     Intel Xeon Processor (Skylake, IBRS)
    ...

With this patch:

    $ ./qemu-system-x86 -cpu help
    ...
    x86 Skylake-Client-v1     Intel Core Processor (Skylake)
    x86 Skylake-Client-v2     Intel Core Processor (Skylake, IBRS)
    x86 Skylake-Client-v3     Intel Core Processor (Skylake, IBRS, no TSX)
    ...
    x86 Skylake-Server-v1     Intel Xeon Processor (Skylake)
    x86 Skylake-Server-v2     Intel Xeon Processor (Skylake, IBRS)
    x86 Skylake-Server-v3     Intel Xeon Processor (Skylake, IBRS, no TSX)
    ...

Signed-off-by: Kashyap Chamarthy <[email protected]>
Message-Id: <20200123090116 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qdev: use object_property_help()

Use the common function introduced earlier, and report default value.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
[Sort the properties, following what is done for -object ...,help. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>

qapi/qmp: add ObjectPropertyInfo.default-value

Report the default value associated with a property.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
[Report it as type "any", not string. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>

qom: introduce object_property_help()

Let's factor out the code to format a help string for a property. We
are going to reuse it in qdev next, which will bring some consistency.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
[Adjust for removal of object_property_get_default, move default
after description. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>

qom: simplify qmp_device_list_properties()

All qdev properties are object properties, so there is no need for
the make_device_property_info() helper.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

vl: print default value in object help

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qdev: register properties as class properties

Use class properties facilities to add properties to the class during
device_class_set_props().

qdev_property_add_static() must be adapted as PropertyInfo now
operates with classes (and not instances), so we must
set_default_value() on the ObjectProperty, before calling its init()
method on the object instance.

Also, PropertyInfo.create() is now exclusively used for class
properties. Fortunately, qdev_property_add_static() is only used in
target/arm/cpu.c so far, which doesn't use "link" properties (that
require create()).

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qdev: move instance properties to class properties

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qdev: rename DeviceClass.props

Ensure that conflicts in the future will cause a syntax error.

Signed-off-by: Paolo Bonzini <[email protected]>

qdev: set properties with device_class_set_props()

The following patch will need to handle properties registration during
class_init time. Let's use a device_class_set_props() setter.

spatch --macro-file scripts/cocci-macro-file.h --sp-file
./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place
--dir .

@@
typedef DeviceClass;
DeviceClass *d;
expression val;
@@
- d->props = val
+ device_class_set_props(d, val)
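
What such a setter buys can be sketched like this. The Demo* types and
the 'props_' member name are assumptions made for illustration; the
real DeviceClass layout is not reproduced here.

```c
/* Simplified stand-ins for Property and DeviceClass. */
typedef struct DemoProperty {
    const char *name;
} DemoProperty;

typedef struct DemoDeviceClass {
    const DemoProperty *props_;  /* hypothetical name; set only via the setter */
    int num_registered;          /* bookkeeping a plain assignment would skip */
} DemoDeviceClass;

/* Single entry point: a later patch can add registration work here. */
static void demo_device_class_set_props(DemoDeviceClass *dc,
                                        const DemoProperty *props)
{
    dc->props_ = props;
    dc->num_registered = 0;
    for (const DemoProperty *p = props; p && p->name; p++) {
        dc->num_registered++;
    }
}
```

The property array is assumed to end with an entry whose name is NULL.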

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: return self in object_ref()

This allows for simpler assignment with ref: foo = object_ref(bar)
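
A minimal sketch of the pattern, using a made-up DemoObject with a
plain (non-atomic) counter rather than the real Object type:

```c
#include <stddef.h>

typedef struct DemoObject {
    unsigned ref;
} DemoObject;

/* Take a reference and hand the same pointer back to the caller. */
static DemoObject *demo_object_ref(DemoObject *obj)
{
    if (obj) {
        obj->ref++;
    }
    return obj;
}
```

A caller can then write foo = demo_object_ref(bar) in one statement
instead of taking the reference and assigning separately.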

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: release all props

Class properties may have to release resources when the object is
destroyed. Let's use the existing release() callback for that, but
class properties must not release ObjectProperty, as it can be shared
by various instances.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: add object_class_property_add_link()

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: express const link with link property

Let's not mix child property and link property callbacks, as this is
confusing. Use LinkProperty with the DIRECT flag to hold the target pointer.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: add direct link flag

Allow the link property to hold the pointer to the target, instead of
indirectly through another variable.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: rename link "child" to "target"

A child property is a different kind of property. Let's use "target"
for the link target.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: check strong flag with &

The following patch is going to introduce more flags.
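
Why the bitwise test matters once a second flag exists, as a small
sketch (the flag names and values below are hypothetical):

```c
#include <stdbool.h>

/* Hypothetical flag values, modelled on a link-property flags enum. */
typedef enum {
    DEMO_LINK_STRONG = 1 << 0,
    DEMO_LINK_DIRECT = 1 << 1,
} DemoLinkFlags;

/* Fragile: breaks as soon as any other flag is set alongside STRONG. */
static bool demo_is_strong_eq(unsigned flags)
{
    return flags == DEMO_LINK_STRONG;
}

/* Robust: tests only the STRONG bit. */
static bool demo_is_strong(unsigned flags)
{
    return (flags & DEMO_LINK_STRONG) != 0;
}
```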

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: do not free class properties

The release callback is called during object_property_del_all(), on a
live instance. But class properties are common among all
instances. It is not currently called, because we don't release
classes, but it would not be correct if we did.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: add object_property_set_default

Add a default value to ObjectProperty and an implementation of
ObjectPropertyInit that uses it. This will make it easier to show the
default in help messages.

Also provide convenience functions object_property_set_default_{bool,
str, int, uint}().

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qstring: add qstring_free()

Similar to g_string_free(), optionally return the underlying char*.
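
The ownership rule can be sketched on a made-up string wrapper. The
DemoString type and the meaning of the boolean argument are
assumptions for illustration, not the actual QString implementation.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoString {
    char *data;     /* heap-allocated, NUL-terminated */
    size_t len;
} DemoString;

static DemoString *demo_string_new(const char *s)
{
    DemoString *str = malloc(sizeof(*str));

    if (!str) {
        return NULL;
    }
    str->len = strlen(s);
    str->data = malloc(str->len + 1);
    if (!str->data) {
        free(str);
        return NULL;
    }
    memcpy(str->data, s, str->len + 1);
    return str;
}

/*
 * Free the wrapper. If return_str is true, hand ownership of the
 * character data to the caller (who must free() it); otherwise free
 * the data as well and return NULL.
 */
static char *demo_string_free(DemoString *str, bool return_str)
{
    char *rv = NULL;

    if (return_str) {
        rv = str->data;
    } else {
        free(str->data);
    }
    free(str);
    return rv;
}
```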

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: make object_class_property_add* return property

This makes it easier to call other ObjectProperty-related functions
afterwards.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: add class property initializer

This callback is used to set the default value in the following patch,
"object: add object_property_set_default_{bool,str,int,uint}()".

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: avoid extra class property key duplication

As with object properties, there is no need to duplicate the property
name, as it is already owned by the ObjectProperty value.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qdev: move helper function to monitor/misc

Move the one-user function to the place it is being used.

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qdev: remove extraneous error

All callers use error_abort, and even the function itself calls with
error_abort.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qdev: remove duplicated qdev_property_add_static() doc

The function is already documented in the header.

Signed-off-by: Marc-André Lureau <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

object: add extra sanity checks

The type system checked that a child's class_size >= its parent's
class_size, but did not check instance sizes. Fix that.
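
The kind of check described here, as a sketch over a made-up
DemoTypeInfo (the real type registration code is not reproduced):

```c
#include <stddef.h>
#include <stdbool.h>

typedef struct DemoTypeInfo {
    size_t class_size;
    size_t instance_size;
    const struct DemoTypeInfo *parent;  /* NULL for the root type */
} DemoTypeInfo;

/* A child must be at least as large as its parent, class and instance. */
static bool demo_type_sizes_ok(const DemoTypeInfo *ti)
{
    if (!ti->parent) {
        return true;
    }
    return ti->class_size >= ti->parent->class_size &&
           ti->instance_size >= ti->parent->instance_size;
}
```

A child whose instance is smaller than its parent's would let code that
treats it as the parent type read past the end of the allocation.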

Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200110153039.1379601 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

accel/tcg: Sanitize include path

Commit af0440ae852 moved the qemu_tcg_configure() function,
but introduced extraneous 'include/' in the includes path.
As it is not necessary, remove it.

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

accel: Replace current_machine->accelerator by current_accel() wrapper

We actually want to access the accelerator, not the machine, so
use the current_accel() wrapper instead.

Suggested-by: Paolo Bonzini <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

accel: Introduce the current_accel() wrapper

The accel/ code only accesses the MachineState::accel field.
As we simply want to access the accelerator, not the machine,
add a current_accel() wrapper.
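
In outline, such a wrapper could look like the sketch below. The Demo*
types and the global are simplified stand-ins, and the member name is
an assumption for this example.

```c
/* Simplified stand-ins for AccelState and MachineState. */
typedef struct DemoAccel {
    const char *name;
} DemoAccel;

typedef struct DemoMachine {
    DemoAccel *accelerator;
} DemoMachine;

static DemoMachine *demo_current_machine;  /* stand-in for the global */

/* Callers ask for the accelerator without naming the machine at all. */
static DemoAccel *demo_current_accel(void)
{
    return demo_current_machine->accelerator;
}
```

Hiding the machine behind the wrapper leaves a single place to change
if the accelerator is ever stored elsewhere.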

Suggested-by: Paolo Bonzini <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qom/object: Display more helpful message when a parent is missing

The QEMU object model is sparsely documented. Some calls are
recursive, and it might be hard to figure out even trivial issues.

We can save developers from wasting time in a debugging session by
displaying a simple error message.

This commit is also similar to e02bdf1cecd2 ("Display more helpful
message when an object type is missing").

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target/arm/kvm: Use CPUState::kvm_state in kvm_arm_pmu_supported()

KVMState is already accessible via CPUState::kvm_state, use it.

Reviewed-by: Alistair Francis <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/ppc/spapr_rtas: Remove local variable

We only access this variable in the RTAS_SYSPARM_SPLPAR_CHARACTERISTICS
case. Use it in place and remove the local declaration.

Suggested-by: Greg Kurz <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/ppc/spapr_rtas: Access MachineState via SpaprMachineState argument

We received a SpaprMachineState argument. Since SpaprMachineState
inherits from MachineState, use it instead of calling qdev_get_machine.

Reviewed-by: Greg Kurz <[email protected]>
Acked-by: David Gibson <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/ppc/spapr_rtas: Use local MachineState variable

Since we have the MachineState already available locally,
use it instead of the global current_machine.

Reviewed-by: Greg Kurz <[email protected]>
Acked-by: David Gibson <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200121110349 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

virtio-scsi: convert to new virtio_delete_queue

Use virtio_delete_queue to make the code clearer.

Signed-off-by: Pan Nengyuan <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Message-Id: <20200117075547 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

virtio-scsi: delete vqs in unrealize to avoid memleaks

This patch fixes memleaks when attaching/detaching a virtio-scsi device.
The memory leak stack is as follows:

Direct leak of 21504 byte(s) in 3 object(s) allocated from:
  #0 0x7f491f2f2970 (/lib64/libasan.so.5+0xef970)  ??:?
  #1 0x7f491e94649d (/lib64/libglib-2.0.so.0+0x5249d)  ??:?
  #2 0x564d0f3919fa (./x86_64-softmmu/qemu-system-x86_64+0x2c3e9fa)  /mnt/sdb/qemu/hw/virtio/virtio.c:2333
  #3 0x564d0f2eca55 (./x86_64-softmmu/qemu-system-x86_64+0x2b99a55)  /mnt/sdb/qemu/hw/scsi/virtio-scsi.c:912
  #4 0x564d0f2ece7b (./x86_64-softmmu/qemu-system-x86_64+0x2b99e7b)  /mnt/sdb/qemu/hw/scsi/virtio-scsi.c:924
  #5 0x564d0f39ee47 (./x86_64-softmmu/qemu-system-x86_64+0x2c4be47)  /mnt/sdb/qemu/hw/virtio/virtio.c:3531
  #6 0x564d0f980224 (./x86_64-softmmu/qemu-system-x86_64+0x322d224)  /mnt/sdb/qemu/hw/core/qdev.c:865

Reported-by: Euler Robot <[email protected]>
Signed-off-by: Pan Nengyuan <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Message-Id: <20200117075547 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target/i386: kvm: initialize microcode revision from KVM

KVM can return the host microcode revision as a feature MSR.
Use it as the default value for -cpu host.

Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <1579544504 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target/i386: add a ucode-rev property

Add the property and plumb it in TCG and HVF (the latter of which
tried to support returning a constant value but used the wrong MSR).

Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <1579544504 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

target/i386: kvm: initialize feature MSRs very early

Some read-only MSRs affect the behavior of ioctls such as
KVM_SET_NESTED_STATE. We can initialize them once and for all
right after the CPU is realized, since they will never be modified
by the guest.

Reported-by: Qingua Cheng <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
Message-Id: <1579544504 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/core/Makefile: Group generic objects versus system-mode objects

To ease review/modifications of this Makefile, group generic
objects first, then system-mode specific ones, and finally
peripherals (which are only used in system-mode).

No logical changes introduced here.

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200118140619 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/core: Restrict reset handlers API to system-mode

The user-mode code does not use this API, so restrict it
to system-mode.

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Message-Id: <20200118140619 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Makefile: Remove unhelpful comment

It is pointless to keep qapi/ object separate from the other
common-objects. Drop the comment.

Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Message-Id: <20200118140619 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Makefile: Restrict system emulation and tools objects

Restrict all the system emulation and tools objects with a
Makefile IF (CONFIG_SOFTMMU OR CONFIG_TOOLS) check.

Using the same description over and over is not very helpful.
Use it once, just before the if() block.

Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Message-Id: <20200118140619 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Makefile: Clarify all the codebase requires qom/ objects

QEMU user-mode also requires the qom/ objects; they are not only
used by "system emulation and qemu-img". As we will use a big
if() block, move them up into the "Common libraries for tools
and emulators" section.

Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Message-Id: <20200118140619 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

configure: Do not build libfdt if not required

We only require libfdt for system emulation, in a small set
of architectures:

  # fdt support is mandatory for at least some target architectures,
  # so insist on it if we're building those system emulators.
  fdt_required=no
  for target in $target_list; do
    case $target in
      aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu|riscv*-softmmu)
        fdt_required=yes

Do not build libfdt if we did not manually specify --enable-fdt,
or have one of the platforms that require it in our target list.

Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Message-Id: <20200118140619 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/pci-host/designware: Remove unuseful FALLTHROUGH comment

We don't need to spell out this obvious switch fall through.
Stay consistent with the rest of the codebase.

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Message-Id: <20191218192526 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/net/imx_fec: Remove unuseful FALLTHROUGH comments

We don't need to spell out these obvious switch fall throughs
with comments. Stay consistent with the rest of the codebase.

Suggested-by: Thomas Huth <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Message-Id: <20191218192526 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/net/imx_fec: Rewrite fall through comments

GCC9 is confused by this comment when building with CFLAG
-Wimplicit-fallthrough=2:

  hw/net/imx_fec.c: In function ‘imx_eth_write’:
  hw/net/imx_fec.c:906:12: error: this statement may fall through [-Werror=implicit-fallthrough=]
    906 |         if (unlikely(single_tx_ring)) {
        |            ^
  hw/net/imx_fec.c:912:5: note: here
    912 |     case ENET_TDAR:     /* FALLTHROUGH */
        |     ^~~~
  cc1: all warnings being treated as errors

Rewrite the comments in the correct place, using 'fall through'
which is recognized by GCC and static analyzers.
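
For reference, the placement GCC expects looks like this on a made-up
register switch (the register names are hypothetical, not taken from
imx_fec.c): the comment sits directly before the next case label,
after the last statement of the falling case.

```c
enum { DEMO_REG_CTRL, DEMO_REG_RELOAD, DEMO_REG_STATUS };

/* Returns 1 if the register was handled, 0 otherwise. */
static int demo_write(int reg, int value, int *reload, int *status)
{
    switch (reg) {
    case DEMO_REG_RELOAD:
        *reload = value;
        /* fall through */
    case DEMO_REG_STATUS:
        *status = value;
        return 1;
    default:
        return 0;
    }
}
```

A comment placed on the case label line itself, as in the warning
above, comes too late for the compiler to associate it with the
preceding statement.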

Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Message-Id: <20191218192526 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/timer/aspeed_timer: Add a fall through comment

Reported by GCC9 when building with CFLAG -Wimplicit-fallthrough=2:

  hw/timer/aspeed_timer.c: In function ‘aspeed_timer_set_value’:
  hw/timer/aspeed_timer.c:283:24: error: this statement may fall through [-Werror=implicit-fallthrough=]
    283 |         if (old_reload || !t->reload) {
        |             ~~~~~~~~~~~^~~~~~~~~~~~~
  hw/timer/aspeed_timer.c:287:5: note: here
    287 |     case TIMER_REG_STATUS:
        |     ^~~~
  cc1: all warnings being treated as errors

Add the missing fall through comment.

Fixes: 1403f364472
Reviewed-by: Cédric Le Goater <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20191218192526 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

hw/display/tcx: Add missing fall through comments

When building with GCC9 using CFLAG -Wimplicit-fallthrough=2 we get:

  hw/display/tcx.c: In function ‘tcx_dac_writel’:
  hw/display/tcx.c:453:26: error: this statement may fall through [-Werror=implicit-fallthrough=]
    453 |             s->dac_index = (s->dac_index + 1) & 0xff; /* Index autoincrement */
        |             ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
  hw/display/tcx.c:454:9: note: here
    454 |         default:
        |         ^~~~~~~
  hw/display/tcx.c: In function ‘tcx_dac_readl’:
  hw/display/tcx.c:412:22: error: this statement may fall through [-Werror=implicit-fallthrough=]
    412 |         s->dac_index = (s->dac_index + 1) & 0xff; /* Index autoincrement */
        |         ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
  hw/display/tcx.c:413:5: note: here
    413 |     default:
        |     ^~~~~~~
  cc1: all warnings being treated as errors

Give a hint to GCC by adding the missing fall through comments.

Fixes: 55d7bfe22
Reviewed-by: Richard Henderson <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Reviewed-by: Mark Cave-Ayland <[email protected]>
Message-Id: <20191218192526 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

audio/audio: Add missing fall through comment

When building with GCC9 using CFLAG -Wimplicit-fallthrough=2 we get:

  audio/audio.c: In function ‘audio_pcm_init_info’:
  audio/audio.c:306:14: error: this statement may fall through [-Werror=implicit-fallthrough=]
    306 |         sign = 1;
        |         ~~~~~^~~
  audio/audio.c:307:5: note: here
    307 |     case AUDIO_FORMAT_U8:
        |     ^~~~
  cc1: all warnings being treated as errors

Similarly to e46349414, add the missing fall through comment to
hint GCC.

Fixes: 2b9cce8c8c
Reviewed-by: Richard Henderson <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Aleksandar Markovic <[email protected]>
Reviewed-by: Gerd Hoffmann <[email protected]>
Message-Id: <20191218192526 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

qom/object: Display more helpful message when an interface is missing

When adding new devices implementing QOM interfaces, we might
forget to add the Kconfig dependency that pulls the required
objects in when building.

Since QOM dependencies are resolved at runtime, we don't get any
link-time failures, and QEMU aborts while starting:

  $ qemu ...
  Segmentation fault (core dumped)

  (gdb) bt
  #0  0x00007ff6e96b1e35 in raise () from /lib64/libc.so.6
  #1  0x00007ff6e969c895 in abort () from /lib64/libc.so.6
  #2  0x00005572bc5051cf in type_initialize (ti=0x5572be6f1200) at qom/object.c:323
  #3  0x00005572bc505074 in type_initialize (ti=0x5572be6f1800) at qom/object.c:301
  #4  0x00005572bc505074 in type_initialize (ti=0x5572be6e48e0) at qom/object.c:301
  #5  0x00005572bc506939 in object_class_by_name (typename=0x5572bc56109a) at qom/object.c:959
  #6  0x00005572bc503dd5 in cpu_class_by_name (typename=0x5572bc56109a, cpu_model=0x5572be6d9930) at hw/core/cpu.c:286

Since the caller has access to the qdev parent/interface names,
we can simply display them to avoid starting a debugger:

  $ qemu ...
  qemu: missing interface 'fancy-if' for object 'fancy-dev'
  Aborted (core dumped)
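
A rough sketch of the idea, assuming a toy type registry (known_types,
type_exists() and check_interface() are hypothetical stand-ins, not the
code in qom/object.c): when the lookup fails, build a diagnostic from
the two names the caller already has, so it can be printed before
aborting.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a type registry; not QEMU's real table. */
static const char *known_types[] = { "fancy-dev", "device", "object" };

static int type_exists(const char *name)
{
    for (size_t i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++) {
        if (strcmp(known_types[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Check that an interface required by 'parent' is registered.
 * Returns 0 on success. On failure, writes a diagnostic naming both
 * types into 'msg' and returns -1; a caller would print it and abort.
 */
int check_interface(const char *parent, const char *iface,
                    char *msg, size_t msg_len)
{
    if (type_exists(iface)) {
        return 0;
    }
    snprintf(msg, msg_len, "missing interface '%s' for object '%s'",
             iface, parent);
    return -1;
}
```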

This commit is similar to e02bdf1cecd2 ("Display more helpful message
when an object type is missing").

Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <20200118162348 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pvpanic: implement crashloaded event handling

Handle a write of bit 1, then post an event to the monitor.

As suggested by Paolo, declare a new event, since using GUEST_PANICKED
could cause upper layers to react by shutting down or rebooting the guest.

To allow for future extension, add GuestPanicInformation to the event
message.

Signed-off-by: zhenwei pi <[email protected]>
Message-Id: <20200114023102 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

pvpanic: introduce crashloaded for pvpanic

Add bit 1 for pvpanic. This bit means that the guest has hit a panic but
wants to handle the error by itself. Typical case: a Linux guest runs
kdump on panic. It helps us to separate an abnormal reboot from
normal operation.
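
A minimal sketch of how a device model could decode the two bits. The
macro names, the enum and pvpanic_decode() are illustrative assumptions;
only the bit positions (bit 0 for a plain panic, bit 1 for crashloaded)
come from the description above.

```c
#include <stdint.h>

/* Bit 0: guest panicked. Bit 1: guest panicked but handles it itself. */
#define PVPANIC_PANICKED     (1u << 0)
#define PVPANIC_CRASHLOADED  (1u << 1)

/* Hypothetical outcome codes for this sketch. */
enum pvpanic_action {
    ACT_NONE,
    ACT_GUEST_PANICKED,
    ACT_GUEST_CRASHLOADED,
};

/* Decide which event a write to the pvpanic port should raise. */
enum pvpanic_action pvpanic_decode(uint8_t event)
{
    if (event & PVPANIC_PANICKED) {
        return ACT_GUEST_PANICKED;
    }
    if (event & PVPANIC_CRASHLOADED) {
        return ACT_GUEST_CRASHLOADED;
    }
    return ACT_NONE;    /* unknown or empty event: nothing to report */
}
```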

Signed-off-by: zhenwei pi <[email protected]>
Message-Id: <20200114023102 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

cpu: Use cpu_class_set_parent_reset()

Convert all targets to use cpu_class_set_parent_reset() with the following
coccinelle script:

@@
type CPUParentClass;
CPUParentClass *pcc;
CPUClass *cc;
identifier parent_fn;
identifier child_fn;
@@
+cpu_class_set_parent_reset(cc, child_fn, &pcc->parent_fn);
-pcc->parent_fn = cc->reset;
...
-cc->reset = child_fn;

Signed-off-by: Greg Kurz <[email protected]>
Acked-by: David Gibson <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Message-Id: <157650847817.354886.7047137349018460524 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

cpu: Introduce cpu_class_set_parent_reset()

Similarly to what we already do with qdev, use a helper to overload the
reset QOM methods of the parent in children classes, for clarity.
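
For illustration, a self-contained sketch of such a helper. CPUState and
CPUClass here are simplified stand-ins, and the argument order is
inferred from the call shape cpu_class_set_parent_reset(cc, child_fn,
&pcc->parent_fn); the handlers and saved_parent_reset are hypothetical.

```c
/* Simplified stand-ins for QEMU's CPUState/CPUClass; illustrative only. */
typedef struct CPUState { int resets_seen; } CPUState;
typedef void (*CPUResetFn)(CPUState *cpu);
typedef struct CPUClass { CPUResetFn reset; } CPUClass;

/* Save the parent's reset handler, then install the child's. */
void cpu_class_set_parent_reset(CPUClass *cc, CPUResetFn child_reset,
                                CPUResetFn *parent_reset)
{
    *parent_reset = cc->reset;
    cc->reset = child_reset;
}

/* Where a child class would keep the saved parent handler. */
static CPUResetFn saved_parent_reset;

static void base_reset(CPUState *cpu)
{
    cpu->resets_seen += 1;
}

static void child_reset(CPUState *cpu)
{
    saved_parent_reset(cpu);    /* chain to the parent first */
    cpu->resets_seen += 10;     /* then do child-specific work */
}
```

A child class init function would make the single helper call instead
of the two open-coded assignments.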

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-Id: <157650847239.354886.2782881118916307978 [email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

Merge remote-tracking branch 'remotes/palmer/tags/riscv-for-master-5.0-sf1' into staging

RISC-V Patches for the 5.0 Soft Freeze, Part 1

This patch set contains a handful of collected fixes that I'd like to target
for the 5.0 soft freeze (I know that's a long way away, I just don't know what
else to call these):

* A fix for a memory leak initializing the sifive_u board.
* Fixes to privilege mode emulation related to interrupts and fstatus.

Notably absent is the H extension implementation.  That's pretty much reviewed,
but not quite ready to go yet and I didn't want to hold back these important
fixes.  This boots 32-bit and 64-bit Linux (buildroot this time, just for fun)
and passes "make check".

# gpg: Signature made Tue 21 Jan 2020 22:55:28 GMT
# gpg:                using RSA key 2B3C3747446843B24A943A7A2E1319F35FBB1889
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Palmer Dabbelt <[email protected]>" [unknown]
# gpg:                 aka "Palmer Dabbelt <[email protected]>" [unknown]
# gpg:                 aka "Palmer Dabbelt <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 00CE 76D1 8349 60DF CE88  6DF8 EF4C A150 2CCB AB41
#      Subkey fingerprint: 2B3C 3747 4468 43B2 4A94  3A7A 2E13 19F3 5FBB 1889

* remotes/palmer/tags/riscv-for-master-5.0-sf1:
  target/riscv: update mstatus.SD when FS is set dirty
  target/riscv: fsd/fsw doesn't dirty FP state
  target/riscv: Fix tb->flags FS status
  riscv: Set xPIE to 1 after xRET
  riscv/sifive_u: fix a memory leak in soc_realize()

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20200123b' into staging

virtiofsd first pull v2

Import our virtiofsd.
This pulls in the daemon to drive a file system connected to the
existing qemu virtiofsd device.
It's derived from upstream libfuse with lots of changes (and a lot
trimmed out).
The daemon lives in the newly created qemu/tools/virtiofsd

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
v2
  drop the docs while we discuss where they should live
  and we need to redo the manpage in anything but texi

# gpg: Signature made Thu 23 Jan 2020 16:45:18 GMT
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert-gitlab/tags/pull-virtiofs-20200123b: (108 commits)
  virtiofsd: add some options to the help message
  virtiofsd: stop all queue threads on exit in virtio_loop()
  virtiofsd/passthrough_ll: Pass errno to fuse_reply_err()
  virtiofsd: Convert lo_destroy to take the lo->mutex lock itself
  virtiofsd: add --thread-pool-size=NUM option
  virtiofsd: fix lo_destroy() resource leaks
  virtiofsd: prevent FUSE_INIT/FUSE_DESTROY races
  virtiofsd: process requests in a thread pool
  virtiofsd: use fuse_buf_writev to replace fuse_buf_write for better performance
  virtiofsd: add definition of fuse_buf_writev()
  virtiofsd: passthrough_ll: Use cache_readdir for directory open
  virtiofsd: Fix data corruption with O_APPEND write in writeback mode
  virtiofsd: Reset O_DIRECT flag during file open
  virtiofsd: convert more fprintf and perror to use fuse log infra
  virtiofsd: do not always set FUSE_FLOCK_LOCKS
  virtiofsd: introduce inode refcount to prevent use-after-free
  virtiofsd: passthrough_ll: fix refcounting on remove/rename
  libvhost-user: Fix some memtable remap cases
  virtiofsd: rename inode->refcount to inode->nlookup
  virtiofsd: prevent races with lo_dirp_put()
  ...

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/ui-20200123-pull-request' into staging

vnc: fix zlib compression artifacts.
ui: add "none" to -display help.

# gpg: Signature made Thu 23 Jan 2020 14:20:53 GMT
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>" [full]
# gpg:                 aka "Gerd Hoffmann <[email protected]>" [full]
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/ui-20200123-pull-request:
  ui/console: Display the 'none' backend in '-display help'
  vnc: prioritize ZRLE compression over ZLIB
  Revert "vnc: allow fall back to RAW encoding"

Signed-off-by: Peter Maydell <[email protected]>

virtiofsd: add some options to the help message

Add following options to the help message:
- cache
- flock|no_flock
- norace
- posix_lock|no_posix_lock
- readdirplus|no_readdirplus
- timeout
- writeback|no_writeback
- xattr|no_xattr

Signed-off-by: Masayoshi Mizuma <[email protected]>
dgilbert: Split cache, norace, posix_lock, readdirplus off
into our own earlier patches that added the options

Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Misono Tomohiro <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: stop all queue threads on exit in virtio_loop()

On guest graceful shutdown, virtiofsd receives a VHOST_USER_GET_VRING_BASE
request from the VMM and shuts down virtqueues by calling fv_set_started(),
which joins the fv_queue_thread() threads. So when virtio_loop() returns,
no thread should still be accessing data in the fuse session and/or
virtio dev.

But on abnormal exit, e.g. the guest got killed for whatever reason, the
vhost-user socket is closed and virtio_loop() breaks out of the main loop
and returns to main(). It is possible that fv_queue_worker()s are still
working and accessing the fuse session and virtio dev, which results in
a crash or use-after-free.

Fix it by stopping the fv_queue_thread()s before virtio_loop() returns,
to make sure that no one can access the fuse session and virtio dev.

Reported-by: Qingming Su <[email protected]>
Signed-off-by: Eryu Guan <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd/passthrough_ll: Pass errno to fuse_reply_err()

lo_copy_file_range() passes -errno to fuse_reply_err(), and fuse_reply_err()
then changes it to errno again, so that the subsequent
fuse_send_reply_iov_nofree() catches the wrong errno (i.e. it reports
"fuse: bad error value: ...").

Make fuse_send_reply_iov_nofree() accept the correct -errno by passing errno
directly in lo_copy_file_range().

Signed-off-by: Xiao Yang <[email protected]>
Reviewed-by: Eryu Guan <[email protected]>
dgilbert: Sent upstream and now Merged as aa1185e153f774f1df65
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: Convert lo_destroy to take the lo->mutex lock itself

lo_destroy was relying on some implicit knowledge of the locking;
we can avoid this if we create an unref_inode that doesn't take
the lock and then grab it for the whole of the lo_destroy.

Suggested-by: Vivek Goyal <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: add --thread-pool-size=NUM option

Add an option to control the size of the thread pool. Requests are now
processed in parallel by default.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: fix lo_destroy() resource leaks

Now that lo_destroy() is serialized we can call unref_inode() so that
all inode resources are freed.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: prevent FUSE_INIT/FUSE_DESTROY races

When running with multiple threads it can be tricky to handle
FUSE_INIT/FUSE_DESTROY in parallel with other request types or in
parallel with themselves. Serialize FUSE_INIT and FUSE_DESTROY so that
malicious clients cannot trigger race conditions.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Masayoshi Mizuma <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: process requests in a thread pool

Introduce a thread pool so that fv_queue_thread() just pops
VuVirtqElements and hands them to the thread pool.  For the time being
only one worker thread is allowed since passthrough_ll.c is not
thread-safe yet.  Future patches will lift this restriction so that
multiple FUSE requests can be processed in parallel.

The main new concept is struct FVRequest, which contains both
VuVirtqElement and struct fuse_chan.  We now have fv_VuDev for a device,
fv_QueueInfo for a virtqueue, and FVRequest for a request.  Some of
fv_QueueInfo's fields are moved into FVRequest because they are
per-request.  The name FVRequest conforms to QEMU coding style and I
expect the struct fv_* types will be renamed in a future refactoring.

This patch series is not optimal.  fbuf reuse is dropped so each request
does malloc(se->bufsize), but there is no clean and cheap way to keep
this with a thread pool.  The vq_lock mutex is held for longer than
necessary, especially during the eventfd_write() syscall.  Performance
can be improved in the future.

prctl(2) had to be added to the seccomp whitelist because glib invokes
it.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Misono Tomohiro <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: use fuse_buf_writev to replace fuse_buf_write for better performance

fuse_buf_writev() only handles the normal write in which src is a buffer
and dest is an fd. In particular, if the src buffer represents a guest
physical address that can't be mapped by the daemon process, IO must be
bounced back to the VMM, and that case is left to fuse_buf_copy().

Signed-off-by: Jun Piao <[email protected]>
Suggested-by: Dr. David Alan Gilbert <[email protected]>
Suggested-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: add definition of fuse_buf_writev()

Define fuse_buf_writev(), which uses pwritev and writev to improve IO
bandwidth. In particular, src bufs with 0 size should be skipped, as
their memory is not *block_size* aligned, which would cause writev to
fail in direct IO mode.
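
A rough sketch of the zero-size skipping, assuming a simplified buffer
descriptor (struct simple_buf and write_bufs() are hypothetical names;
per the description, the real fuse_buf_writev() also uses pwritev for
positioned writes, which is left out here):

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical minimal descriptor; the real struct fuse_buf has more fields. */
struct simple_buf {
    void *mem;
    size_t size;
};

/*
 * Gather all non-empty buffers into one iovec array and submit them
 * with a single writev(). Returns bytes written, or -1 on error.
 * This sketch does not split the array if count exceeds IOV_MAX.
 */
ssize_t write_bufs(int fd, const struct simple_buf *bufs, size_t count)
{
    struct iovec *iov = calloc(count ? count : 1, sizeof(*iov));
    size_t n = 0;
    ssize_t res;

    if (!iov) {
        return -1;
    }
    for (size_t i = 0; i < count; i++) {
        if (bufs[i].size == 0) {
            continue;           /* skip empty, possibly unaligned, buffers */
        }
        iov[n].iov_base = bufs[i].mem;
        iov[n].iov_len = bufs[i].size;
        n++;
    }
    res = n ? writev(fd, iov, (int)n) : 0;
    free(iov);
    return res;
}
```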

Signed-off-by: Jun Piao <[email protected]>
Suggested-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: passthrough_ll: Use cache_readdir for directory open

Since keep_cache (FOPEN_KEEP_CACHE) has no effect for directories, as
described in fuse_common.h, use cache_readdir (FOPEN_CACHE_DIR) for
directory open in cache=always mode.

Signed-off-by: Misono Tomohiro <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: Fix data corruption with O_APPEND write in writeback mode

When writeback mode is enabled (-o writeback), O_APPEND handling is
done in the kernel. Therefore virtiofsd clears the O_APPEND flag on open.
Otherwise the O_APPEND flag takes precedence over pwrite() and the
written data may be corrupted.

Currently clearing O_APPEND flag is done in lo_open(), but we also
need the same operation in lo_create(). So, factor out the flag
update operation in lo_open() to update_open_flags() and call it
in both lo_open() and lo_create().
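
A minimal sketch of the factored-out flag update, covering only the
O_APPEND handling described here. The name update_open_flags_sketch()
and its signature are assumptions; the real update_open_flags() may
adjust other flags as well.

```c
#include <fcntl.h>

/*
 * Return the flags to pass to the host open().
 * In writeback mode the guest kernel already computes the append
 * offset, so O_APPEND must not reach the host file descriptor.
 */
int update_open_flags_sketch(int writeback, int flags)
{
    if (writeback && (flags & O_APPEND)) {
        flags &= ~O_APPEND;
    }
    return flags;
}
```

Calling the same helper from both the open and the create path keeps
the two from drifting apart again.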

This fixes the failure of xfstest generic/069 in writeback mode
(which tests O_APPEND write data integrity).

Signed-off-by: Misono Tomohiro <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: Reset O_DIRECT flag during file open

If an application wants to do direct IO and opens a file with O_DIRECT
in guest, that does not necessarily mean that we need to bypass page
cache on host as well. So reset this flag on host.

If somebody needs to bypass page cache on host as well (and it is safe to
do so), we can add a knob in daemon later to control this behavior.

I checked virtio-9p and it does reset the O_DIRECT flag.

Signed-off-by: Vivek Goyal <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: convert more fprintf and perror to use fuse log infra

Signed-off-by: Eryu Guan <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Reviewed-by: Misono Tomohiro <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: do not always set FUSE_FLOCK_LOCKS

Right now we always enable it regardless of the given command-line
options. Fix it by setting the flag according to the lo->flock bit.

Signed-off-by: Peng Tao <[email protected]>
Reviewed-by: Misono Tomohiro <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: introduce inode refcount to prevent use-after-free

If thread A is using an inode it must not be deleted by thread B when
processing a FUSE_FORGET request.

The FUSE protocol itself already has a counter called nlookup that is
used in FUSE_FORGET messages. We cannot trust this counter since the
untrusted client can manipulate it via FUSE_FORGET messages.

Introduce a new refcount to keep inodes alive for the required lifespan.
lo_inode_put() must be called to release a reference. FUSE's nlookup
counter holds exactly one reference so that the inode stays alive as
long as the client still wants to remember it.
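
A sketch of the get/put pattern under simplified assumptions (struct
inode_sketch and the inode_*() helpers are hypothetical; the real
lo_inode carries more state and its release path does more work than
free()):

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical, pared-down inode; illustrative only. */
struct inode_sketch {
    atomic_int refcount;
    int fd;
};

/* A new inode starts with one reference, owned by the creator. */
struct inode_sketch *inode_new(int fd)
{
    struct inode_sketch *inode = malloc(sizeof(*inode));

    if (inode) {
        atomic_init(&inode->refcount, 1);
        inode->fd = fd;
    }
    return inode;
}

/* Take an extra reference while a thread is using the inode. */
void inode_get(struct inode_sketch *inode)
{
    atomic_fetch_add(&inode->refcount, 1);
}

/*
 * Drop one reference and clear the caller's pointer.
 * Returns 1 if this was the last reference and the inode was freed.
 */
int inode_put(struct inode_sketch **inodep)
{
    struct inode_sketch *inode = *inodep;

    if (!inode) {
        return 0;
    }
    *inodep = NULL;
    if (atomic_fetch_sub(&inode->refcount, 1) == 1) {
        free(inode);
        return 1;
    }
    return 0;
}
```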

Note that the lo_inode->is_symlink field is moved to avoid creating a
hole in the struct due to struct field alignment.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Misono Tomohiro <[email protected]>
Reviewed-by: Sergio Lopez <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: passthrough_ll: fix refcounting on remove/rename

Signed-off-by: Miklos Szeredi <[email protected]>
Reviewed-by: Misono Tomohiro <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

libvhost-user: Fix some memtable remap cases

If a new setmemtable command comes in once the vhost threads are
running, it will remap the guest's address space and the threads
will now be looking in the wrong place.

Fortunately we're running this command under lock, so we can
update the queue mappings so that threads will look in the new,
correct place.

Note: This doesn't fix things that the threads might be doing
without a lock (e.g. a readv/writev!) That's for another time.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: rename inode->refcount to inode->nlookup

This reference counter plays a specific role in the FUSE protocol. It's
not a generic object reference counter and the FUSE kernel code calls it
"nlookup".

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: prevent races with lo_dirp_put()

Introduce lo_dirp_put() so that FUSE_RELEASEDIR does not cause
use-after-free races with other threads that are accessing lo_dirp.

Also make lo_releasedir() atomic to prevent FUSE_RELEASEDIR racing with
itself. This prevents double-frees.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: make lo_release() atomic

Hold the lock across both lo_map_get() and lo_map_remove() to prevent
races between two FUSE_RELEASE requests. In this case I don't see a
serious bug but it's safer to do things atomically.
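
As an illustration of the pattern, a sketch with a toy fixed-size map
(map_fd, map_put() and map_take() are hypothetical and unrelated to the
real lo_map code): the lookup and the removal happen inside one critical
section, so two concurrent releases of the same key cannot both succeed.

```c
#include <pthread.h>
#include <stddef.h>

#define MAP_SLOTS 8

static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static int map_fd[MAP_SLOTS];   /* 0 means empty; stored values are > 0 */

/* Store a value (> 0) under 'key'; out-of-range keys are ignored. */
void map_put(size_t key, int fd)
{
    pthread_mutex_lock(&map_lock);
    if (key < MAP_SLOTS) {
        map_fd[key] = fd;
    }
    pthread_mutex_unlock(&map_lock);
}

/* Look up and remove in one step; returns the value or -1 if absent. */
int map_take(size_t key)
{
    int fd = -1;

    pthread_mutex_lock(&map_lock);
    if (key < MAP_SLOTS && map_fd[key] > 0) {
        fd = map_fd[key];
        map_fd[key] = 0;
    }
    pthread_mutex_unlock(&map_lock);
    return fd;
}
```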

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: prevent fv_queue_thread() vs virtio_loop() races

We call into libvhost-user from the virtqueue handler thread and the
vhost-user message processing thread without a lock.  There is nothing
protecting the virtqueue handler thread if the vhost-user message
processing thread changes the virtqueue or memory table while it is
running.

This patch introduces a read-write lock.  Virtqueue handler threads are
readers.  The vhost-user message processing thread is a writer.  This
will allow concurrency for multiqueue in the future while protecting
against fv_queue_thread() vs virtio_loop() races.

Note that the critical sections could be made smaller but it would be
more invasive and require libvhost-user changes.  Let's start simple and
improve performance later, if necessary.  Another option would be an
RCU-style approach with lighter-weight primitives.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: use fuse_lowlevel_is_virtio() in fuse_session_destroy()

vu_socket_path is NULL when --fd=FDNUM was used. Use
fuse_lowlevel_is_virtio() instead.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: Support remote posix locks

Doing posix locks within the guest kernel is not sufficient if a file/dir
is being shared by multiple guests. So we need the notion of the daemon
doing the locks, which are visible to the rest of the guests.

Given posix locks are per process, one cannot call the posix lock API on
the host; otherwise a bunch of basic posix lock properties are broken. For
example, if two processes (A and B) in the guest open the file and take
locks on different sections of the file, and one of the processes closes
the fd, it will close the fd on virtiofsd and all posix locks on the file
will go away. This means if process A closes the fd, then the locks of
process B will go away too.

Similar other problems exist too.

This patch set tries to emulate posix locks while using open file
description locks provided on Linux.

Daemon provides two options (-o posix_lock, -o no_posix_lock) to enable
or disable posix locking in daemon. By default it is enabled.

There are a few issues though.

- GETLK() returns the pid of the process holding the lock. As we are
  emulating locks using OFD, and these locks are not per process and don't
  return the pid of a process, GETLK() in the guest does not return a
  process pid.

- As of now only F_SETLK is supported and not F_SETLKW. We can't block
  the thread in virtiofsd for an arbitrarily long duration as there is only
  one thread serving the queue. That means the unlock request will not make
  it to the daemon and F_SETLKW will block infinitely and bring virtio-fs
  to a halt. This is a solvable problem though and will require significant
  changes in virtiofsd and the kernel. Left as a TODO item for now.

Signed-off-by: Vivek Goyal <[email protected]>
Reviewed-by: Masayoshi Mizuma <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

Virtiofsd: fix memory leak on fuse queueinfo

For fuse's queueinfo, both queueinfo array and queueinfos are allocated in
fv_queue_set_started() but not cleaned up when the daemon process quits.

This fixes the leak in proper places.

Signed-off-by: Liu Bo <[email protected]>
Signed-off-by: Eric Ren <[email protected]>
Reviewed-by: Misono Tomohiro <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>

virtiofsd: fix incorrect error handling in lo_do_lookup

Signed-off-by: Eric Ren <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Signed-off-by: Dr. David Alan Gilbert <[email protected]>