Git Repo - qemu.git/log

block: Drop unmaintained 'archipelago' driver

The driver has failed to build since commit da34e65, in qemu 2.6,
due to a missing include of qapi/error.h for error_setg().
Since no one has complained in three releases, it is easier to
remove the dead code than to keep it around, especially since it
is not being built by default and therefore prone to bitrot.

Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Max Reitz <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

file-posix: Consider max_segments for BlockLimits.max_transfer

BlockLimits.max_transfer can be too high without this fix, guest will
encounter I/O error or even get paused with werror=stop or rerror=stop. The
cause is explained below.

Linux has a separate limit, /sys/block/.../queue/max_segments, which in
the worst case can be more restrictive than the BLKSECTGET which we
already consider (note that they are two different things). So, the
failure scenario before this patch is:

1) host device has max_sectors_kb = 4096 and max_segments = 64;
2) guest learns max_sectors_kb limit from QEMU, but doesn't know
   max_segments;
3) guest issues e.g. a 512KB request thinking it's okay, but actually
   it's not, because it will be passed through to host device as an
   SG_IO req that has niov > 64;
4) host kernel doesn't like the segmenting of the request, and returns
   -EINVAL;

This patch checks the max_segments sysfs entry for the host device and
calculates a "conservative" bytes limit using the page size, which is
then merged into the existing max_transfer limit. Guest will discover
this from the usual virtual block device interfaces. (In the case of
scsi-generic, it will be done in the INQUIRY reply interception in
device model.)

The other possibility is to actually propagate it as a separate limit,
but it's not better. On the one hand, there is a big complication: the
limit is per-LUN in QEMU PoV (because we can attach LUNs from different
host HBAs to the same virtio-scsi bus), but the channel to communicate
it in a per-LUN manner is missing down the stack; on the other hand,
two limits versus one doesn't change much about the valid size of I/O
(because guest has no control over host segmenting).

Also, the idea to fall back to bounce buffering in QEMU, upon -EINVAL,
was explored. Unfortunately there is no neat way to ensure the bounce
buffer is less segmented (in terms of DMA addr) than the guest buffer.

Practically, this bug is not very common. It is only reported on a
Emulex (lpfc), so it's okay to get it fixed in the easier way.

Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

backup: allow target without .bdrv_get_info

Currently backup to nbd target is broken, as nbd doesn't have
.bdrv_get_info realization.

Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170309-1' into staging

2.9 bugfixes for ohci and qxl

# gpg: Signature made Thu 09 Mar 2017 09:09:44 GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <[email protected]>"
# gpg:                 aka "Gerd Hoffmann <[email protected]>"
# gpg:                 aka "Gerd Hoffmann (private) <[email protected]>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170309-1:
  qxl: clear guest_cursor on QXL_CURSOR_HIDE
  ohci: relax link check

Signed-off-by: Peter Maydell <[email protected]>

qxl: clear guest_cursor on QXL_CURSOR_HIDE

Make sure we don't leave guest_cursor pointing into nowhere. This might
lead to (rare) live migration failures, due to target trying to restore
the cursor from the stale pointer.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1421788
Reported-by: Dr. David Alan Gilbert <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Message-id: 1488789111 [email protected]

ohci: relax link check

The strict td link limit added by commit "95ed569 usb: ohci: limit the
number of link eds" causes problems with macos guests. Lets raise the
limit.

Reported-by: Programmingkid <[email protected]>
Reported-by: Howard Spoelstra <[email protected]>
Signed-off-by: Gerd Hoffmann <[email protected]>
Reviewed-by: BALATON Zoltan <[email protected]>
Reviewed-by: John Arbuckle <[email protected]>
Message-id: 1488876018 [email protected]

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer fixes for 2.9.0-rc0

# gpg: Signature made Tue 07 Mar 2017 14:59:18 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <[email protected]>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (27 commits)
  commit: Don't use error_abort in commit_start
  block: Don't use error_abort in blk_new_open
  sheepdog: Support blockdev-add
  qapi-schema: Rename SocketAddressFlat's variant tcp to inet
  qapi-schema: Rename GlusterServer to SocketAddressFlat
  gluster: Plug memory leaks in qemu_gluster_parse_json()
  gluster: Don't duplicate qapi-util.c's qapi_enum_parse()
  gluster: Drop assumptions on SocketTransport names
  sheepdog: Implement bdrv_parse_filename()
  sheepdog: Use SocketAddress and socket_connect()
  sheepdog: Report errors in pseudo-filename more usefully
  sheepdog: Don't truncate long VDI name in _open(), _create()
  sheepdog: Fix snapshot ID parsing in _open(), _create, _goto()
  sheepdog: Mark sd_snapshot_delete() lossage FIXME
  sheepdog: Fix error handling sd_create()
  sheepdog: Fix error handling in sd_snapshot_delete()
  sheepdog: Defuse time bomb in sd_open() error handling
  block: Fix error handling in bdrv_replace_in_backing_chain()
  block: Handle permission errors in change_parent_backing_link()
  block: Ignore multiple children in bdrv_check_update_perm()
  ...

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/armbru/tags/pull-block-2017-02-28-v4' into staging

block: Command line option -blockdev

# gpg: Signature made Tue 07 Mar 2017 15:07:59 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg:                 aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-block-2017-02-28-v4: (24 commits)
  keyval: Support lists
  docs/qapi-code-gen.txt: Clarify naming rules
  qapi: Improve how keyval input visitor reports unexpected dicts
  block: Initial implementation of -blockdev
  qapi: New qobject_input_visitor_new_str() for convenience
  keyval: Restrict key components to valid QAPI names
  qapi: New parse_qapi_name()
  test-qapi-util: New, covering qapi/qapi-util.c
  monitor: Assert qmp_schema_json[] is sane
  test-visitor-serialization: Pass &error_abort to qobject_from_json()
  check-qjson: Test errors from qobject_from_json()
  block: More detailed syntax error reporting for JSON filenames
  qobject: Propagate parse errors through qobject_from_json()
  test-qobject-input-visitor: Abort earlier on bad test input
  qjson: Abort earlier on qobject_from_jsonf() misuse
  libqtest: Fix qmp() & friends to abort on JSON parse errors
  qobject: Propagate parse errors through qobject_from_jsonv()
  qapi: Factor out common qobject_input_get_keyval()
  qapi: Factor out common part of qobject input visitor creation
  test-keyval: Cover use with qobject input visitor
  ...

Signed-off-by: Peter Maydell <[email protected]>

keyval: Support lists

Additionally permit non-negative integers as key components.  A
dictionary's keys must either be all integers or none.  If all keys
are integers, convert the dictionary to a list.  The set of keys must
be [0,N].

Examples:

* list.1=goner,list.0=null,list.1=eins,list.2=zwei
  is equivalent to JSON [ "null", "eins", "zwei" ]

* a.b.c=1,a.b.0=2
  is inconsistent: a.b.c clashes with a.b.0

* list.0=null,list.2=eins,list.2=zwei
  has a hole: list.1 is missing

Similar design flaw as for objects: there is no way to denote an empty
list.  While interpreting "key absent" as empty list seems natural
(removing a list member from the input string works when there are
multiple ones, so why not when there's just one), it doesn't work:
"key absent" already means "optional list absent", which isn't the
same as "empty list present".

Update the keyval object visitor to use this a.0 syntax in error
messages rather than the usual a[0].

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1488317230 [email protected]>
[Off-by-one fix squashed in, as per Kevin's review]
Reviewed-by: Kevin Wolf <[email protected]>

docs/qapi-code-gen.txt: Clarify naming rules

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

qapi: Improve how keyval input visitor reports unexpected dicts

Incorrect option

    -blockdev node-name=foo,driver=file,filename=foo.img,aio.unmap=on

is rejected with "Invalid parameter type for 'aio', expected: string".
To make sense of this, you almost have to translate it into the
equivalent QMP command

    { "execute": "blockdev-add", "arguments": { "node-name": "foo", "driver": "file", "filename": "foo.img", "aio": { "unmap": true } } }

Improve the error message to "Parameters 'aio.*' are unexpected".
Take care not to confuse the case "unexpected nested parameters"
(i.e. the object is a QDict or QList) with the case "non-string scalar
parameter".  The latter is a misuse of the visitor, and should perhaps
be an assertion.  Note that test-qobject-input-visitor exercises this
misuse in test_visitor_in_int_keyval(), test_visitor_in_bool_keyval()
and test_visitor_in_number_keyval().

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

block: Initial implementation of -blockdev

The new command line option -blockdev works like QMP command
blockdev-add.

The option argument may be given in JSON syntax, exactly as in QMP.
Example usage:

    -blockdev '{"node-name": "foo", "driver": "raw", "file": {"driver": "file", "filename": "foo.img"} }'

The JSON argument doesn't exactly blend into the existing option
syntax, so the traditional KEY=VALUE,... syntax is also supported,
using dotted keys to do the nesting:

    -blockdev node-name=foo,driver=raw,file.driver=file,file.filename=foo.img

This does not yet support lists, but that will be addressed shortly.

Note that calling qmp_blockdev_add() (say via qmp_marshal_block_add())
right away would crash.  We need to stash the configuration for later
instead.  This is crudely done, and bypasses QemuOpts, even though
storing configuration is what QemuOpts is for.  Need to revamp option
infrastructure to support QAPI types like BlockdevOptions.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488317230 [email protected]>

qapi: New qobject_input_visitor_new_str() for convenience

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1488317230 [email protected]>

keyval: Restrict key components to valid QAPI names

Until now, key components are separated by '.'. This leaves little
room for evolving the syntax, and is incompatible with the __RFQDN_
prefix convention for downstream extensions.

Since key components will be commonly used as QAPI member names by the
QObject input visitor, we can just as well borrow the QAPI naming
rules here: letters, digits, hyphen and period starting with a letter,
with an optional __RFQDN_ prefix for downstream extensions.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

qapi: New parse_qapi_name()

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488317230 [email protected]>

test-qapi-util: New, covering qapi/qapi-util.c

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

monitor: Assert qmp_schema_json[] is sane

qmp_query_qmp_schema() parses qmp_schema_json[] with
qobject_from_json(). This must not fail, so pass &error_abort.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

test-visitor-serialization: Pass &error_abort to qobject_from_json()

qmp_deserialize() calls qobject_from_json() ignoring errors. It
passes the result to qobject_input_visitor_new(), which asserts it's
not null. Therefore, we can just as well pass &error_abort to
qobject_from_json().

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

check-qjson: Test errors from qobject_from_json()

Pass &error_abort with known-good input. Else pass &err and check
what comes back. This demonstrates that the parser fails silently for
many errors.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

block: More detailed syntax error reporting for JSON filenames

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

qobject: Propagate parse errors through qobject_from_json()

The next few commits will put the errors to use where appropriate.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488317230 [email protected]>

test-qobject-input-visitor: Abort earlier on bad test input

visitor_input_test_init_internal() parses test input with
qobject_from_jsonv(), and asserts it succeeds. Pass &error_abort for
good measure.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

qjson: Abort earlier on qobject_from_jsonf() misuse

Ignoring errors first, then asserting success is suboptimal. Pass
&error_abort instead, so we abort earlier, and hopefully get more
useful clues on what's wrong.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

libqtest: Fix qmp() & friends to abort on JSON parse errors

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

qobject: Propagate parse errors through qobject_from_jsonv()

The next few commits will put the errors to use where appropriate.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

qapi: Factor out common qobject_input_get_keyval()

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-Id: <1488317230 [email protected]>

qapi: Factor out common part of qobject input visitor creation

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

test-keyval: Cover use with qobject input visitor

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

qapi: qobject input visitor variant for use with keyval_parse()

Currently the QObjectInputVisitor assumes that all scalar values are
directly represented as the final types declared by the thing being
visited. i.e. it assumes an 'int' is using QInt, and a 'bool' is using
QBool, etc.  This is good when QObjectInputVisitor is fed a QObject
that came from a JSON document on the QMP monitor, as it will strictly
validate correctness.

To allow QObjectInputVisitor to be reused for visiting a QObject
originating from keyval_parse(), an alternative mode is needed where
all the scalars types are represented as QString and converted on the
fly to the final desired type.

Signed-off-by: Daniel P. Berrange <[email protected]>
Message-Id: <1475246744 [email protected]>

Rebased, conflicts resolved, commit message updated to refer to
keyval_parse().  autocast replaced by keyval in identifiers,
noautocast replaced by fail in tests.

Fix qobject_input_type_uint64_keyval() not to reject '-', for QemuOpts
compatibility: replace parse_uint_full() by open-coded
parse_option_number().  The next commit will add suitable tests.
Leave out the fancy ERANGE error reporting for now, but add a TODO
comment.  Add it qobject_input_type_int64_keyval() and
qobject_input_type_number_keyval(), too.

Open code parse_option_bool() and parse_option_size() so we have to
call qobject_input_get_name() only when actually needed.  Again, leave
out ERANGE error reporting for now.

QAPI/QMP downstream extension prefixes __RFQDN_ don't work, because
keyval_parse() splits them at '.'.  This will be addressed later in
the series.

qobject_input_type_int64_keyval(), qobject_input_type_uint64_keyval(),
qobject_input_type_number_keyval() tweaked for style.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

keyval: New keyval_parse()

keyval_parse() parses KEY=VALUE,... into a QDict.  Works like
qemu_opts_parse(), except:

* Returns a QDict instead of a QemuOpts (d'oh).

* Supports nesting, unlike QemuOpts: a KEY is split into key
  fragments at '.' (dotted key convention; the block layer does
  something similar on top of QemuOpts).  The key fragments are QDict
  keys, and the last one's value is updated to VALUE.

* Each key fragment may be up to 127 bytes long.  qemu_opts_parse()
  limits the entire key to 127 bytes.

* Overlong key fragments are rejected.  qemu_opts_parse() silently
  truncates them.

* Empty key fragments are rejected.  qemu_opts_parse() happily
  accepts empty keys.

* It does not store the returned value.  qemu_opts_parse() stores it
  in the QemuOptsList.

* It does not treat parameter "id" specially.  qemu_opts_parse()
  ignores all but the first "id", and fails when its value isn't
  id_wellformed(), or duplicate (a QemuOpts with the same ID is
  already stored).  It also screws up when a value contains ",id=".

* Implied value is not supported.  qemu_opts_parse() desugars "foo" to
  "foo=on", and "nofoo" to "foo=off".

* An implied key's value can't be empty, and can't contain ','.

I intend to grow this into a saner replacement for QemuOpts.  It'll
take time, though.

Note: keyval_parse() provides no way to do lists, and its key syntax
is incompatible with the __RFQDN_ prefix convention for downstream
extensions, because it blindly splits at '.', even in __RFQDN_.  Both
issues will be addressed later in the series.

Signed-off-by: Markus Armbruster <[email protected]>
Message-Id: <1488317230 [email protected]>

tests: Fix gcov-files-test-qemu-opts-y, gcov-files-test-logging-y

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

test-qemu-opts: Cover qemu_opts_parse() of "no"

qemu_opts_parse() interprets "no" as negated empty key. Consistent
with its acceptance of empty keys elsewhere, whatever that's worth.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-Id: <1488317230 [email protected]>

disas/arm: Avoid unintended sign extension

When assembling 'given' from the instruction bytes, C's integer
promotion rules mean we may promote an unsigned char to a signed
integer before shifting it, and then sign extend to a 64-bit long,
which can set the high bits of the long. The code doesn't in fact
care about the high bits if the long is 64 bits, but this is
surprising, so don't do it.

(Spotted by Coverity, CID 1005404.)

Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1488556233 [email protected]

disas/cris: Avoid unintended sign extension

In the cris disassembler we were using 'unsigned long' to calculate
addresses which are supposed to be 32 bits. This meant that we might
accidentally sign extend or calculate a value that was outside the 32
bit range of the guest CPU. Use 'uint32_t' instead so we give the
right answers on 64-bit hosts.

(Spotted by Coverity, CID 1005402, 1005403.)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 1488556233 [email protected]

disas/microblaze: Avoid unintended sign extension

In read_insn_microblaze() we assemble 4 bytes into an 'unsigned
long'. If 'unsigned long' is 64 bits and the high byte has its top
bit set, then C's implicit conversion from 'unsigned char' to 'int'
for the shift will result in an unintended sign extension which sets
the top 32 bits in 'inst'. Add casts to prevent this. (Spotted by
Coverity, CID 1005401.)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Edgar E. Iglesias <[email protected]>
Message-id: 1488556233 [email protected]

disas/m68k: Avoid unintended sign extension in get_field()

In get_field(), we take an 'unsigned char' value and shift it left,
which implicitly promotes it to 'signed int', before ORing it into an
'unsigned long' type. If 'unsigned long' is 64 bits then this will
result in a sign extension and the top 32 bits of the result will be
1s. Add explicit casts to unsigned long before shifting to prevent
this.

(Spotted by Coverity, CID 715697.)

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Laurent Vivier <[email protected]>
Message-id: 1488556233 [email protected]

disas/i386: Avoid NULL pointer dereference in error case

In a code path where we hit an internal disassembler error, execution
would subsequently attempt to dereference a NULL pointer. This
should never happen, but avoid the crash.

Signed-off-by: Peter Maydell <[email protected]>
Message-id: 1488556233 [email protected]

disas/hppa: Remove dead code

Coverity complains (CID 1302705) that the "fr0" part of the ?: in
fput_fp_reg_r() is dead. This looks like cut-n-paste error from
fput_fp_reg(); delete the dead code.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-id: 1488556233 [email protected]

commit: Don't use error_abort in commit_start

bdrv_set_backing_hd failure needn't be abort. Since we already have
error parameter, use it.

Signed-off-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Don't use error_abort in blk_new_open

We have an errp and bdrv_root_attach_child can fail permission check,
error_abort is not the best choice here.

Signed-off-by: Fam Zheng <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Support blockdev-add

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qapi-schema: Rename SocketAddressFlat's variant tcp to inet

QAPI type SocketAddressFlat differs from SocketAddress pointlessly:
the discriminator value for variant InetSocketAddress is 'tcp' instead
of 'inet'. Rename.

The type is so far only used by the Gluster block drivers. Take care
to keep 'tcp' working in things like -drive's file.server.0.type=tcp.
The "gluster+tcp" URI scheme in pseudo-filenames stays the same.
blockdev-add changes, but it has changed incompatibly since 2.8
already.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

qapi-schema: Rename GlusterServer to SocketAddressFlat

As its documentation says, it's not specific to Gluster. Rename it,
as I'm going to use it for something else.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

gluster: Plug memory leaks in qemu_gluster_parse_json()

To reproduce, run

$ valgrind qemu-system-x86_64 --nodefaults -S --drive driver=gluster,volume=testvol,path=/a/b/c,server.0.type=xxx

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Niels de Vos <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

gluster: Don't duplicate qapi-util.c's qapi_enum_parse()

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Niels de Vos <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

gluster: Drop assumptions on SocketTransport names

qemu_gluster_glfs_init() passes the names of QAPI enumeration type
SocketTransport to glfs_set_volfile_server(). Works, because they
were chosen to match. But the coupling is artificial. Use the
appropriate literal strings instead.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Niels de Vos <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Implement bdrv_parse_filename()

This permits configuration with driver-specific options in addition to
pseudo-filename parsed as URI.  For instance,

    --drive driver=sheepdog,host=fido,vdi=dolly

instead of

    --drive driver=sheepdog,file=sheepdog://fido/dolly

It's also a first step towards supporting blockdev-add.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Use SocketAddress and socket_connect()

sd_parse_uri() builds a string from host and port parts for
inet_connect(). inet_connect() parses it into host, port and options.
Whether this gets exactly the same host, port and no options for all
inputs is not obvious.

Cut out the string middleman and build a SocketAddress for
socket_connect() instead.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Report errors in pseudo-filename more usefully

Errors in the pseudo-filename are all reported with the same laconic
"Can't parse filename" message.

Add real error reporting, such as:

    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog:///
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog:///: missing file path in URI
    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepgod:///vdi
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepgod:///vdi: URI scheme must be 'sheepdog', 'sheepdog+tcp', or 'sheepdog+unix'
    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock: unexpected query parameters

The code to translate legacy syntax to URI fails to escape URI
meta-characters.  The new error messages are misleading then.  Replace
them by the old "Can't parse filename" message.  "Internal error"
would be more honest.  Anyway, no worse than before.  Also add a FIXME
comment.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Don't truncate long VDI name in _open(), _create()

sd_parse_uri() truncates long VDI names silently. Reject them
instead.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Fix snapshot ID parsing in _open(), _create, _goto()

sd_parse_uri() and sd_snapshot_goto() screw up error checking after
strtoul(), and truncate long tag names silently.  Fix by replacing
those parts by new sd_parse_snapid_or_tag(), which checks more
carefully.

sd_snapshot_delete() also parses snapshot IDs, but is currently too
broken for me to touch.  Mark TODO.

Two calls of strtol() without error checking remain in
parse_redundancy().  Mark them FIXME.

More silent truncation of configuration strings remains elsewhere.
Not marked.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Mark sd_snapshot_delete() lossage FIXME

sd_snapshot_delete() should delete the snapshot whose ID matches
@snapshot_id and whose name matches @name.  But that's not what it
does.  If @snapshot_id is a valid ID, it deletes the snapshot with
that ID, else it deletes the snapshot with that name.  It doesn't use
@name at all.  Add suitable FIXME comments, so someone who actually
knows Sheepdog can fix it.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Fix error handling sd_create()

As a bdrv_create() method, sd_create() must set an error and return
negative errno on failure.  It prints the error instead of setting it
when connect_to_sdog() fails.  Fix that.

While there, return the value of connect_to_sdog() like we do
elsewhere, instead of -EIO.  No functional change, as
connect_to_sdog() returns no other error code.

Many more suspicious uses of error_report() and error_report_err()
remain in other functions.  Left for another day.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Fix error handling in sd_snapshot_delete()

As a bdrv_snapshot_delete() method, sd_snapshot_delete() must set an
error and return negative errno on failure.  It sometimes returns -1,
and sometimes neglects to set an error.  It also prints error messages
with error_report().  Fix all that.

Moreover, its handling of an attempt to delete a nonexistent snapshot
is wrong: it error_report()s and succeeds.  Fix it to set an error and
return -ENOENT instead.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

sheepdog: Defuse time bomb in sd_open() error handling

When qemu_opts_absorb_qdict() fails, sd_open() closes stdin, because
sd->fd is still zero. Fortunately, qemu_opts_absorb_qdict() can't
fail, because:

1. it only fails when qemu_opt_parse() fails, and
2. the only member of runtime_opts.desc[] is a QEMU_OPT_STRING, and
3. qemu_opt_parse() can't fail for QEMU_OPT_STRING.

Defuse this ticking time bomb by jumping behind the file descriptor
cleanup on error.

Also do that for the error paths where sd->fd is still -1. The file
descriptor cleanup happens to do nothing then, but let's not rely on
that here.

While there, rename label out to err, because it's on the error path,
not the normal path out of the function.

Signed-off-by: Markus Armbruster <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>

block: Fix error handling in bdrv_replace_in_backing_chain()

When adding an Error parameter, bdrv_replace_in_backing_chain() would
become nothing more than a wrapper around change_parent_backing_link().
So make the latter public, renamed as bdrv_replace_node(), and remove
bdrv_replace_in_backing_chain().

Most of the callers just remove a node from the graph that they just
inserted, so they can use &error_abort, but completion of a mirror job
with 'replaces' set can actually fail.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Handle permission errors in change_parent_backing_link()

Instead of just trying to change parents by parent over to reference @to
instead of @from, and abort()ing whenever the permissions don't allow
this, do proper permission checking beforehand and pass any error to the
callers.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Ignore multiple children in bdrv_check_update_perm()

change_parent_backing_link() will need to update multiple BdrvChild
objects at once. Checking permissions reference by reference doesn't
work because permissions need to be consistent only with all parents
moved to the new child.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Factor out bdrv_replace_child_noperm()

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

block: Factor out should_update_child()

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

block: Fix blockdev-snapshot error handling

For blockdev-snapshot, external_snapshot_prepare() accepts an arbitrary
node reference at first and only checks later whether it already has a
backing file. Between those places, other errors can occur.

Therefore checking in external_snapshot_abort() whether state->new_bs
has a backing file is not sufficient to tell whether bdrv_append() was
already completed or not. Trying to undo the bdrv_append() when it
wasn't even executed is wrong.

Introduce a new boolean flag in the state to fix this.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

mirror: Fix error path for dirty bitmap creation

mirror_top_bs must be removed from the graph again when creating the
dirty bitmap fails.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

mirror: Fix permissions for removing mirror_top_bs

mirror_top_bs takes write permissions on its backing file, which can
make it impossible to attach that backing file node to another parent.
However, this is exactly what needs to be done in order to remove
mirror_top_bs from the backing chain. So give up the write permission
first.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

mirror: Fix permission problem with 'replaces'

The 'replaces' option of drive-mirror can be used to mirror a Quorum
node to a new image and then let the target image replace one of the
Quorum children. In order for this graph modification to succeed, the
mirror job needs to lift its restrictions on the target node first
before actually replacing the child.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

commit: Fix error handling

Apparently some kind of mismerge happened in commit 8dfba279, which
broke the error handling without any real reason by removing the
assignment of the return value to ret in a blk_insert_bs() call.

Signed-off-by: Kevin Wolf <[email protected]>
Reviewed-by: Fam Zheng <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

Merge remote-tracking branch 'remotes/xtensa/tags/20170306-xtensa' into staging

target/xtensa updates:

- instantiate local memories in xtensa sim machine;
- add two missing include files to xtensa core importing script.

# gpg: Signature made Mon 06 Mar 2017 22:32:45 GMT
# gpg:                using RSA key 0x51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <[email protected]>"
# gpg:                 aka "Max Filippov <[email protected]>"
# gpg:                 aka "Max Filippov <[email protected]>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20170306-xtensa:
  target/xtensa: add two missing headers to core import script
  target/xtensa: sim: instantiate local memories

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/gkurz/tags/fixes-for-2.9' into staging

Fixes issues that got merged with the latest pull request:
- missing O_NOFOLLOW flag for CVE-2016-960
- build break with older glibc that don't have O_PATH and AT_EMPTY_PATH
- various bugs reported by Coverity

# gpg: Signature made Mon 06 Mar 2017 17:51:29 GMT
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <[email protected]>"
# gpg:                 aka "Greg Kurz <[email protected]>"
# gpg:                 aka "Greg Kurz <[email protected]>"
# gpg:                 aka "Gregory Kurz (Groug) <[email protected]>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/fixes-for-2.9:
  9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()
  9pfs: fix O_PATH build break with older glibc versions
  9pfs: don't use AT_EMPTY_PATH in local_set_cred_passthrough()
  9pfs: fail local_statfs() earlier
  9pfs: fix fd leak in local_opendir()
  9pfs: fix bogus fd check in local_remove()

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2017-03-06-tag' into staging

qemu-ga patch queue for 2.9

* fix fsfreeze for filesystems mounted in multiple locations
* fix test failure when running in a chroot
* support for socket-based activation

# gpg: Signature made Mon 06 Mar 2017 07:54:17 GMT
# gpg:                using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <[email protected]>"
# gpg:                 aka "Michael Roth <[email protected]>"
# gpg:                 aka "Michael Roth <[email protected]>"
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D  3FA0 3353 C9CE F108 B584

* remotes/mdroth/tags/qga-pull-2017-03-06-tag:
  tests: check path to avoid a failing qga/get-vcpus test
  qga: ignore EBUSY when freezing a filesystem
  qga: add systemd socket activation support

Signed-off-by: Peter Maydell <[email protected]>

9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()

We should pass O_NOFOLLOW otherwise openat() will follow symlinks and make
QEMU vulnerable.

While here, we also fix local_unlinkat_common() to use openat_dir() for
the same reasons (it was a leftover in the original patchset actually).

This fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

9pfs: fix O_PATH build break with older glibc versions

When O_PATH is used with O_DIRECTORY, it only acts as an optimization: the
openat() syscall simply finds the name in the VFS, and doesn't trigger the
underlying filesystem.

On systems that don't define O_PATH, because they have glibc version 2.13
or older for example, we can safely omit it. We don't want to deactivate
O_PATH globally though, in case it is used without O_DIRECTORY. The is done
with a dedicated macro.

Systems without O_PATH may thus fail to resolve names that involve
unreadable directories, compared to newer systems succeeding, but such
corner case failure is our only option on those older systems to avoid
the security hole of chasing symlinks inappropriately.

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
(added last paragraph to changelog as suggested by Eric Blake)
Signed-off-by: Greg Kurz <[email protected]>

9pfs: don't use AT_EMPTY_PATH in local_set_cred_passthrough()

The name argument can never be an empty string, and dirfd always point to
the containing directory of the file name. AT_EMPTY_PATH is hence useless
here. Also it breaks build with glibc version 2.13 and older.

It is actually an oversight of a previous tentative patch to implement this
function. We can safely drop it.

Reported-by: Mark Cave-Ayland <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>
Tested-by: Mark Cave-Ayland <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

9pfs: fail local_statfs() earlier

If we cannot open the given path, we can return right away instead of
passing -1 to fstatfs() and close(). This will make Coverity happy.

(Coverity issue CID1371729)

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Daniel P. berrange <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

9pfs: fix fd leak in local_opendir()

Coverity issue CID1371731

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

9pfs: fix bogus fd check in local_remove()

This was spotted by Coverity as a fd leak. This is certainly true, but also
local_remove() would always return without doing anything, unless the fd is
zero, which is very unlikely.

(Coverity issue CID1371732)

Signed-off-by: Greg Kurz <[email protected]>
Reviewed-by: Eric Blake <[email protected]>

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Mon 06 Mar 2017 04:15:17 GMT
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <[email protected]>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net/filter-mirror: Follow CODING_STYLE
  COLO-compare: Fix icmp and udp compare different packet always dump bug
  COLO-compare: Optimize compare_common and compare_tcp
  COLO-compare: Rename compare function and remove duplicate codes
  filter-rewriter: skip net_checksum_calculate() while offset = 0
  net/colo: fix memory double free error
  vmxnet3: VMStatify rx/tx q_descr and int_state
  vmxnet3: Convert ring values to uint32_t's
  net/colo-compare: Fix memory free error
  colo-compare: Fix removing fds been watched incorrectly in finalization
  char: remove the right fd been watched in qemu_chr_fe_set_handlers()
  colo-compare: kick compare thread to exit after some cleanup in finalization
  colo-compare: use g_timeout_source_new() to process the stale packets
  NetRxPkt: Remove code duplication in net_rx_pkt_pull_data()
  NetRxPkt: Account buffer with ETH header in IOV length
  NetRxPkt: Do not try to pull more data than present
  NetRxPkt: Fix memory corruption on VLAN header stripping
  eth: Extend vlan stripping functions
  net: Remove useless local var pkt

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170306' into staging

ppc patch queue for 2017-03-06

Looks like my previous batch wasn't quite the last before hard freeze.
This has a handful of bugfixes to go in.  They're all genuine
bugfixes, though not regressions in some cases.

# gpg: Signature made Mon 06 Mar 2017 04:07:48 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <[email protected]>"
# gpg:                 aka "David Gibson (Red Hat) <[email protected]>"
# gpg:                 aka "David Gibson (ozlabs.org) <[email protected]>"
# gpg:                 aka "David Gibson (kernel.org) <[email protected]>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170306:
  target/ppc: use helper for excp handling
  target/ppc: fmadd: add macro for updating flags
  target/ppc: fmadd check for excp independently
  spapr: ensure that all threads within core are on the same NUMA node
  ppc/xics: register reset handlers for the ICP and ICS objects

Signed-off-by: Peter Maydell <[email protected]>

Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-02-28' into staging

QAPI patches for 2017-02-28

# gpg: Signature made Sun 05 Mar 2017 08:21:51 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <[email protected]>"
# gpg:                 aka "Markus Armbruster <[email protected]>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-02-28: (27 commits)
  qapi: Improve qobject visitor documentation
  qapi: Fix object input visit beyond end of list
  tests: Cover input visit beyond end of list
  qapi: Make input visitors detect unvisited list tails
  test-qobject-input-visitor: Cover missing nested struct member
  tests: Cover partial input visit of list
  test-string-input-visitor: Improve list coverage
  test-string-input-visitor: Tear down existing test automatically
  tests-qobject-input-strict: Merge into test-qobject-input-visitor
  qapi: Drop unused non-strict qobject input visitor
  test-qobject-input-visitor: Use strict visitor
  qom: Make object_property_set_qobject()'s input visitor strict
  qapi: Make string input and opts visitor require non-null input
  qapi: Drop string input visitor method optional()
  qapi: Improve qobject input visitor error reporting
  qapi: Make QObject input visitor set *list reliably
  qapi: Clean up after commit 3d344c2
  qapi: Improve a QObject input visitor error message
  qmp: Eliminate silly QERR_QMP_* macros
  qmp: Drop duplicated QMP command object checks
  ...

Signed-off-by: Peter Maydell <[email protected]>

tests: check path to avoid a failing qga/get-vcpus test

The qga/get-vcpus test fails in a simple chroot environment, as
used in an openSUSE Build Service local build, so first check
that the sysfs based path exists in order to avoid calling this
test in an environment where it won't work right.

Signed-off-by: Bruce Rogers <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Michael Roth <[email protected]>

qga: ignore EBUSY when freezing a filesystem

the current implementation fails if we try to freeze an
already frozen filesystem. This can happen if a filesystem
is mounted more than once (e.g. with a bind mount).

Suggested-by: Christian Theune <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Lieven <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Michael Roth <[email protected]>

qga: add systemd socket activation support

AF_UNIX and AF_VSOCK listen sockets can be passed in by systemd on
startup.  This allows systemd to manage the listen socket until the
first client connects and between restarts.  Advantages of socket
activation are that parallel startup of network services becomes
possible and that unused daemons do not consume memory.

The key to achieving this is the LISTEN_FDS environment variable, which
is a stable ABI as shown here:
https://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/

We could link against libsystemd and use sd_listen_fds(3) but it's easy
to implement the tiny LISTEN_FDS ABI so that qemu-ga does not depend on
libsystemd.  Some systems may not have systemd installed and wish to
avoid the dependency.  Other init systems or socket activation servers
may implement the same ABI without systemd involvement.

Test as follows:

  $ cat ~/.config/systemd/user/qga.service
  [Unit]
  Description=qga

  [Service]
  WorkingDirectory=/tmp
  ExecStart=/path/to/qemu-ga --logfile=/tmp/qga.log --pidfile=/tmp/qga.pid --statedir=/tmp

  $ cat ~/.config/systemd/user/qga.socket
  [Socket]
  ListenStream=/tmp/qga.sock

  [Install]
  WantedBy=default.target

  $ systemctl --user daemon-reload
  $ systemctl --user start qga.socket
  $ nc -U /tmp/qga.sock

Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrange <[email protected]>
Signed-off-by: Michael Roth <[email protected]>

net/filter-mirror: Follow CODING_STYLE

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

COLO-compare: Fix icmp and udp compare different packet always dump bug

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

COLO-compare: Optimize compare_common and compare_tcp

Add offset args for colo_packet_compare_common, optimize
colo_packet_compare_icmp() and colo_packet_compare_udp()
just compare the IP payload. Before compare all tcp packet,
we compare tcp checksum firstly, this function can get
better performance.

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

COLO-compare: Rename compare function and remove duplicate codes

Rename colo_packet_compare() to colo_packet_compare_common() that
make tcp_compare udp_compare icmp_compare reuse this function.
Remove minimum packet size check in icmp_compare, because we have
check this in parse_packet_early().

Signed-off-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

filter-rewriter: skip net_checksum_calculate() while offset = 0

While the offset of packets's sequence for primary side and
secondary side is zero, it is unnecessary to call net_checksum_calculate()
to recalculate the checksume value of packets.

Signed-off-by: zhanghailiang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net/colo: fix memory double free error

The 'primary_list' and 'secondary_list' members of struct Connection
is not allocated through dynamically g_queue_new(), but we free it by using
g_queue_free(), which will lead to a double-free bug.

Reviewed-by: Zhang Chen <[email protected]>
Signed-off-by: zhanghailiang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

vmxnet3: VMStatify rx/tx q_descr and int_state

Fairly simple mechanical conversion of all fields.

TODO!!!!
The problem is vmxnet3-ring size/cell_size/next are declared as size_t
but written as 32bit.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Acked-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

vmxnet3: Convert ring values to uint32_t's

The index's in the Vmxnet3Ring were migrated as 32bit ints
yet are declared as size_t's. They appear to be derived
from 32bit values loaded from guest memory, so actually
store them as that.

Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Acked-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net/colo-compare: Fix memory free error

We use g_queue_init() to init s->conn_list, so we should use g_queue_clear()
to instead of g_queue_free().

Signed-off-by: Zhang Chen <[email protected]>
Reviewed-by: zhanghailiang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

colo-compare: Fix removing fds been watched incorrectly in finalization

We will catch the bellow error report while try to delete compare object
by qmp command:
chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.

This is caused by failing to remove the right fd been watched while
call qemu_chr_fe_set_handlers();

Fix it by pass the worker_context parameter to qemu_chr_fe_set_handlers().

Signed-off-by: zhanghailiang <[email protected]>
Reviewed-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

char: remove the right fd been watched in qemu_chr_fe_set_handlers()

We can call qemu_chr_fe_set_handlers() to add/remove fd been watched
in 'context' which can be either default main context or other explicit
context. But the original logic is not correct, we didn't remove
the right fd because we call g_main_context_find_source_by_id(NULL, tag)
which always try to find the Gsource from default context.

Fix it by passing the right context to g_main_context_find_source_by_id().

Cc: Paolo Bonzini <[email protected]>
Cc: Marc-André Lureau <[email protected]>
Signed-off-by: zhanghailiang <[email protected]>
Reviewed-by: Marc-André Lureau <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

colo-compare: kick compare thread to exit after some cleanup in finalization

We should call g_main_loop_quit() to notify colo compare thread to
exit, Or it will run in g_main_loop_run() forever.

Besides, the finalizing process can't happen in context of colo thread,
it is reasonable to remove the 'if (qemu_thread_is_self(&s->thread))'
branch.

Before compare thead exits, some cleanup works need to be
done, All unhandled packets need to be released and connection_track_table
needs to be freed, or there will be memory leak.

Signed-off-by: zhanghailiang <[email protected]>
Reviewed-by: Zhang Chen <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

colo-compare: use g_timeout_source_new() to process the stale packets

Instead of using qemu timer to process the stale packets,
We re-use the colo compare thread to process these packets
by creating a new timeout coroutine.

Besides, since we process all the same vNIC's net connection/packets
in one thread, it is safe to remove the timer_check_lock.

Signed-off-by: zhanghailiang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

NetRxPkt: Remove code duplication in net_rx_pkt_pull_data()

This is a refactoring commit that does not change behavior.

Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

NetRxPkt: Account buffer with ETH header in IOV length

In case of VLAN stripping ETH header is stored in a
separate chunk and length of IOV should take this into
account.

This patch fixes checksum validation for RX packets
with VLAN header.

Devices affected by this problem: e1000e and vmxnet3.

Cc: [email protected]
Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

NetRxPkt: Do not try to pull more data than present

In case of VLAN stripping, ETH header put into a
separate buffer, therefore amont of data copied
from original IOV should be smaller.

Cc: [email protected]
Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

NetRxPkt: Fix memory corruption on VLAN header stripping

This patch fixed a problem that was introduced in commit eb700029.

When net_rx_pkt_attach_iovec() calls eth_strip_vlan()
this can result in pkt->ehdr_buf being overflowed, because
ehdr_buf is only sizeof(struct eth_header) bytes large
but eth_strip_vlan() can write
sizeof(struct eth_header) + sizeof(struct vlan_header)
bytes into it.

Devices affected by this problem: vmxnet3.

Cc: [email protected]
Reported-by: Peter Maydell <[email protected]>
Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

eth: Extend vlan stripping functions

Make VLAN stripping functions return number of bytes
copied to given Ethernet header buffer.

This information should be used to re-compose
packet IOV after VLAN stripping.

Cc: [email protected]
Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

net: Remove useless local var pkt

This has been pointless since commit 605d52e62, which was a
search-and-replace, overlooked the redundancy.

Signed-off-by: Fam Zheng <[email protected]>
Reviewed-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>

target/ppc: use helper for excp handling

Use the helper routine float[32,64]_maddsub_update_excp() in VSX_MADD
macro.

Signed-off-by: Nikunj A Dadhania <[email protected]>
Signed-off-by: David Gibson <[email protected]>