The O_DIRECT flag imposes alignment restrictions on the length and address
of userspace buffers and the file offset of I/Os.
While VirtFS/9P has plans to implement O_DIRECT behavior on the server,
for now we will stick to a behavior like NFS by bypassing the page cache
only on the client. The server may still cache the I/O.
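The alignment restrictions mentioned above can be checked before an I/O is issued. The following is a minimal sketch, not code from the patch: the helper name is hypothetical, and the required alignment is passed in by the caller because it depends on the filesystem and block device (512 is used below only as a common example value).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if the buffer address, the length and
 * the file offset are all multiples of 'align'.  'align' must be a power
 * of two; the value actually required is an assumption of the caller.
 */
static bool dio_request_is_aligned(const void *buf, size_t len,
                                   uint64_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t addr_mask = (uintptr_t)align - 1;
    uint64_t off_mask = (uint64_t)align - 1;

    return ((uintptr_t)buf & addr_mask) == 0 &&
           (len & (align - 1)) == 0 &&
           (offset & off_mask) == 0;
}
```

A caller could use such a check to fall back to buffered I/O, or to fail the request early, when any of the three values is misaligned.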
[virtio-9p] Introduce server side TFSYNC/RFSYNC for dotl
SYNOPSIS
size[4] Tfsync tag[2] fid[4]
size[4] Rfsync tag[2]
DESCRIPTION
The Tfsync transaction transfers ("flushes") all modified in-core data of
the file identified by fid to the disk device (or other permanent storage
device) where that file resides.
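The wire layout in the SYNOPSIS can be sketched as a small encoder. This is an illustration only: the function names are hypothetical, the message type value 50 is taken from the Linux 9P headers and is an assumption here, and fields are written in little-endian order as 9P does for its integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: Tfsync is message type 50, as in Linux's include/net/9p/9p.h. */
enum {
    P9_TFSYNC_TYPE = 50,
    P9_TFSYNC_SIZE = 4 + 1 + 2 + 4  /* size[4] type[1] tag[2] fid[4] */
};

/* Store the low 'nbytes' bytes of v at p, least significant byte first. */
static void put_le(uint8_t *p, uint32_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Encode "size[4] Tfsync tag[2] fid[4]" into out; returns bytes written. */
static size_t p9_encode_tfsync(uint8_t *out, size_t cap,
                               uint16_t tag, uint32_t fid)
{
    assert(out != NULL && cap >= P9_TFSYNC_SIZE);

    put_le(out, P9_TFSYNC_SIZE, 4);
    out[4] = P9_TFSYNC_TYPE;
    put_le(out + 5, tag, 2);
    put_le(out + 7, fid, 4);
    return P9_TFSYNC_SIZE;
}
```

The size field counts the whole message including itself, so a Tfsync request in this layout is always 11 bytes long.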
TGetlock is used to test for the existence of byte range POSIX locks on
a file identified by the given fid. The reply contains a getlock structure.
If the lock could be placed, it returns F_UNLCK in the type field of the
getlock structure. Otherwise it returns the details of the conflicting lock
in the getlock structure.
getlock structure:
  type[1] - Type of lock: F_RDLCK, F_WRLCK
  start[8] - Starting offset for lock
  length[8] - Number of bytes to lock
              If length is 0, lock all bytes starting at the location
              'start' through to the end of file
  proc_id[4] - process id that wants to take lock/owns the task
               in case of reply
  client[4] - Client id of the system that owns the process
Tlock is used to acquire/release byte range POSIX locks on a file
identified by the given fid. The reply contains the status of the lock request.
flock structure:
  type[1] - Type of lock: F_RDLCK, F_WRLCK, F_UNLCK
  flags[4] - Flags could be either of
    P9_LOCK_FLAGS_BLOCK(1) - Blocking lock request; if a conflicting
      lock exists, wait for that lock to be released.
    P9_LOCK_FLAGS_RECLAIM(2) - Reclaim lock request, used when the client
      is trying to reclaim a lock after a server restart (due to a crash)
  start[8] - Starting offset for lock
  length[8] - Number of bytes to lock
              If length is 0, lock all bytes starting at the location
              'start' through to the end of file
  pid[4] - PID of the process that wants to take lock
  client_id[4] - Unique client id
status[1] - Status of the lock request, can be
  P9_LOCK_SUCCESS(0), P9_LOCK_BLOCKED(1), P9_LOCK_ERROR(2) or
  P9_LOCK_GRACE(3)
  P9_LOCK_SUCCESS - Request was successful
  P9_LOCK_BLOCKED - A conflicting lock is held by another process
  P9_LOCK_ERROR - Error while processing the lock request
  P9_LOCK_GRACE - Server is in grace period, it can't accept new lock
                  requests in this period (except locks with
                  P9_LOCK_FLAGS_RECLAIM flag set)
When the 9P server fails to create a file due to permission problems it should
return EPERM. However, the current 9P2000.L code returns EBADF. EBADF is NOT
a valid return value from an open() call.
The problem is that we do not preserve the errno variable properly. If the
file open had failed, the call to close() on the fd in v9fs_post_lcreate()
fails and sets errno to EBADF. We should preserve the errno that we got from
open() and we should call close() only if we had a valid fd.
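The pattern described above can be sketched as follows. This is not the code of v9fs_post_lcreate(); the function name and the negative-errno return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create 'path' and return 0 on success or a
 * negative errno value on failure.
 */
static int create_file_keep_errno(const char *path, mode_t mode)
{
    int err = 0;
    int fd;

    assert(path != NULL);

    fd = open(path, O_CREAT | O_EXCL | O_WRONLY, mode);
    if (fd < 0) {
        err = errno;   /* save before any later call can overwrite it */
    }

    /* Shared cleanup: close() on an invalid fd would set errno to EBADF. */
    if (fd >= 0) {
        close(fd);
    }
    return -err;
}
```

Saving errno into a local immediately after the failing call keeps the original cause intact no matter what the cleanup path does afterwards.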
Stefan Hajnoczi [Mon, 18 Oct 2010 12:42:54 +0000 (13:42 +0100)]
trace: Relax trace-events parsing regex in simpletrace.py
The regular expression to parse trace event definitions assumed the
format string would be a simple double-quoted string. However, we now
use PRI?64 for portability which splits string literals. The regular
expression can disregard the format string entirely since simpletrace.py
never needs to use it.
snd_pcm_start() starts the capture process and ensures that the events
are delivered to the poll handler. Without the call, capture can be started
only when there is simultaneous playback running.
Blue Swirl [Wed, 13 Oct 2010 19:14:29 +0000 (19:14 +0000)]
trace: print a warning if user tries to enable an unknown trace event
There was no warning if a bad trace event name was given to
'trace-event' command, thus the user could think that the command
was successful even if this was not the case.
Print a warning if the user tries to enable a trace event which is not
known.
Blue Swirl [Wed, 13 Oct 2010 18:38:08 +0000 (18:38 +0000)]
mips: avoid write only variables
Compiling with GCC 4.6.0 20100925 produced a lot of warnings like:
/src/qemu/target-mips/translate.c: In function 'gen_ld':
/src/qemu/target-mips/translate.c:1039:17: error: variable 'opn' set but not used [-Werror=unused-but-set-variable]
Fix by adding a dummy cast so that the variable is not unused.
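A minimal sketch of the dummy cast technique, using an invented function and an invented debug macro (DEBUG_DISAS_EXAMPLE) rather than the real gen_ld() code:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical decoder: 'opn' is only read when debugging is compiled in. */
static int decode_width_example(int opcode)
{
    const char *opn = "invalid";
    int width = 0;

    assert(opcode >= 0);   /* opcodes in this toy example are non-negative */

    switch (opcode) {
    case 0:
        opn = "lb";
        width = 1;
        break;
    case 1:
        opn = "lw";
        width = 4;
        break;
    default:
        break;
    }

#ifdef DEBUG_DISAS_EXAMPLE
    fprintf(stderr, "decoded %s\n", opn);
#endif
    (void)opn;   /* dummy cast: counts as a use when the debug code is absent */
    return width;
}
```

The cast generates no code; it only tells the compiler that the value is intentionally unused in non-debug builds.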
Blue Swirl [Wed, 13 Oct 2010 18:38:08 +0000 (18:38 +0000)]
ppc: avoid write only variables
Compiling with GCC 4.6.0 20100925 produced warnings:
/src/qemu/target-ppc/op_helper.c: In function 'helper_icbi':
/src/qemu/target-ppc/op_helper.c:351:14: error: variable 'tmp' set but not used [-Werror=unused-but-set-variable]
/src/qemu/target-ppc/op_helper.c: In function 'do_6xx_tlb':
/src/qemu/target-ppc/op_helper.c:3805:28: error: variable 'EPN' set but not used [-Werror=unused-but-set-variable]
/src/qemu/target-ppc/op_helper.c: In function 'do_74xx_tlb':
/src/qemu/target-ppc/op_helper.c:3838:28: error: variable 'EPN' set but not used [-Werror=unused-but-set-variable]
Fix by adding a dummy cast so that the variable is not unused. Delete tmp.
Blue Swirl [Wed, 13 Oct 2010 18:38:08 +0000 (18:38 +0000)]
i386: avoid a write only variable
Compiling with GCC 4.6.0 20100925 produced warnings:
/src/qemu/target-i386/op_helper.c: In function 'switch_tss':
/src/qemu/target-i386/op_helper.c:283:53: error: variable 'new_trap' set but not used [-Werror=unused-but-set-variable]
Fix by adding a dummy cast so that the variable is not unused. Also add a
pointer to the docs.
Blue Swirl [Wed, 13 Oct 2010 18:38:08 +0000 (18:38 +0000)]
vnc: avoid write only variables
Compiling with GCC 4.6.0 20100925 produced warnings:
/src/qemu/ui/vnc.c: In function 'vnc_client_cache_auth':
/src/qemu/ui/vnc.c:217:12: error: variable 'qdict' set but not used [-Werror=unused-but-set-variable]
/src/qemu/ui/vnc.c: In function 'vnc_display_open':
/src/qemu/ui/vnc.c:2526:9: error: variable 'acl' set but not used [-Werror=unused-but-set-variable]
Fix by making the variable declarations and their uses conditional
on the debug definition as well.
Blue Swirl [Wed, 13 Oct 2010 18:38:08 +0000 (18:38 +0000)]
cris: avoid a write only variable
Compiling with GCC 4.6.0 20100925 produced a warning:
In file included from /src/qemu/target-cris/translate.c:3154:0:
/src/qemu/target-cris/translate_v10.c: In function 'dec10_prep_move_m':
/src/qemu/target-cris/translate_v10.c:111:22: error: variable 'rd' set but not used [-Werror=unused-but-set-variable]
Blue Swirl [Wed, 13 Oct 2010 18:41:29 +0000 (18:41 +0000)]
Delete write only variables
Compiling with GCC 4.6.0 20100925 produced warnings like:
/src/qemu/net/tap-win32.c: In function 'tap_win32_open':
/src/qemu/net/tap-win32.c:582:12: error: variable 'hThread' set but not used [-Werror=unused-but-set-variable]
Blue Swirl [Wed, 13 Oct 2010 18:38:07 +0000 (18:38 +0000)]
lsi53c895a: avoid a write only variable
Compiling with GCC 4.6.0 20100925 produced a warning:
/src/qemu/hw/lsi53c895a.c: In function 'lsi_do_msgout':
/src/qemu/hw/lsi53c895a.c:848:9: error: variable 'len' set but not used [-Werror=unused-but-set-variable]
Fix by adding a dummy cast so that the variable is not unused for
non-debug case.
Blue Swirl [Wed, 13 Oct 2010 18:38:07 +0000 (18:38 +0000)]
eepro100: initialize a variable in all cases
Compiling with GCC 4.6.0 20100925 produced warnings:
/src/qemu/hw/eepro100.c: In function 'eepro100_read4':
/src/qemu/hw/eepro100.c:1351:14: error: 'val' may be used uninitialized in this function [-Werror=uninitialized]
/src/qemu/hw/eepro100.c: In function 'eepro100_read2':
/src/qemu/hw/eepro100.c:1328:14: error: 'val' may be used uninitialized in this function [-Werror=uninitialized]
/src/qemu/hw/eepro100.c: In function 'eepro100_read1':
/src/qemu/hw/eepro100.c:1285:13: error: 'val' may be used uninitialized in this function [-Werror=uninitialized]
Blue Swirl [Wed, 13 Oct 2010 18:38:07 +0000 (18:38 +0000)]
cirrus: avoid write only variables
Compiling with GCC 4.6.0 20100925 produced a lot of warnings like:
In file included from /src/qemu/hw/cirrus_vga_rop.h:174:0,
from /src/qemu/hw/cirrus_vga.c:284:
/src/qemu/hw/cirrus_vga_rop2.h: In function 'cirrus_patternfill_0_8':
/src/qemu/hw/cirrus_vga_rop2.h:48:18: error: variable 'col' set but not used [-Werror=unused-but-set-variable]
/src/qemu/hw/cirrus_vga_rop2.h: In function 'cirrus_colorexpand_transp_0_8':
/src/qemu/hw/cirrus_vga_rop2.h:104:18: error: variable 'col' set but not used [-Werror=unused-but-set-variable]
Fix the warnings by introducing an inline function, which avoids
exposing write-only variables.
Blue Swirl [Wed, 13 Oct 2010 18:38:07 +0000 (18:38 +0000)]
block: avoid a write only variable
Compiling with GCC 4.6.0 20100925 produced a warning:
/src/qemu/block/qcow2-refcount.c: In function 'update_refcount':
/src/qemu/block/qcow2-refcount.c:552:13: error: variable 'dummy' set but not used [-Werror=unused-but-set-variable]
Fix by adding a dummy cast so that the result is not unused.
Blue Swirl [Sat, 9 Oct 2010 08:24:17 +0000 (08:24 +0000)]
trace: remove timestamp files when cleaning up
'make clean' did not remove trace.[ch]-timestamp files,
only trace.[ch]. But 'make' did not know how to make trace.[ch]
files if the timestamp files were present.
Fix by removing the timestamp files along with trace.[ch].
Stefan Hajnoczi [Thu, 7 Oct 2010 11:07:15 +0000 (12:07 +0100)]
.gitignore: Ignore *-timestamp
Timestamp files were recently added to reduce make churn on source files
that use tracing. The timestamp files should never be committed and
should not be visible in git status.
Scott Wood [Tue, 5 Oct 2010 19:28:17 +0000 (14:28 -0500)]
configure: include stddef.h for NULL
This fixes an observed failure to detect madvise() on Linux.
To avoid similar issues, all other tests that use NULL but don't already
have stddef.h (or another header that is defined to provide NULL,
such as stdio.h, unistd.h, or time.h) are also fixed.
Stefan Hajnoczi [Tue, 5 Oct 2010 13:28:52 +0000 (14:28 +0100)]
trace: Use TP_PROTO() and TP_ARGS() for LTTng UST
The LTTng UserSpace Tracer formerly used TPPROTO() and TPARGS() instead
of TP_PROTO() and TP_ARGS() like the kernel uses. This has been changed
so QEMU needs to follow.
I am not aware of a graceful way of making the transition but since no
one complained that the UST build is broken, it should be fine to just
switch over without compatibility for old UST headers. The newer UST
headers are shipping in distro packages so it is realistic to make this
change now.
Although comment lines must be skipped, the '#' character can occur in
valid format strings. Be more careful when checking for comments.
Leave comments at the end of the line where they will not interfere with
other processing.
Stefan Hajnoczi [Tue, 5 Oct 2010 13:28:50 +0000 (14:28 +0100)]
trace: Use portable format strings
It is not portable to use "%ld" for int64_t because int64_t may have
type long on 64-bit platforms and long long on 32-bit platforms. Use
the standard library PRId64 macros to keep format strings portable.
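A short sketch of the portable form, assuming a hypothetical helper that formats a 64-bit offset. PRId64 comes from <inttypes.h> and expands to the correct length modifier for the platform; the format string is built by string literal concatenation.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit value without assuming sizeof(long). */
static int format_offset_example(char *buf, size_t cap, int64_t offset)
{
    assert(buf != NULL && cap > 0);

    /* "offset %" PRId64 is two adjacent literals joined by the compiler. */
    return snprintf(buf, cap, "offset %" PRId64, offset);
}
```

Because the macro sits outside the quotes, the format string is no longer a single double-quoted literal in the source, which is why tools that parse such strings textually need to cope with split literals.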
When using irqfd with vhost-net to inject interrupts,
a single eventfd might inject multiple interrupts.
Implementing this is much easier with a single
per-device callback to set guest notifiers.
Stefan Weil [Wed, 29 Sep 2010 19:59:55 +0000 (21:59 +0200)]
eepro100: Add support for multiple individual addresses (multiple IA)
I reviewed the latest sources of Linux, FreeBSD and NetBSD.
They all reset the multiple IA bit (multi_ia in BSD) to zero,
but I did not find code which sets this bit to one
(like it is done by some routers).
Running Windows guests also did not set this bit.
Intel's Open Source Software Developer Manual does not
give much information on the semantics related to this bit,
so I had to guess how it works. The guess was good enough
to make the router emulation work.
Related changes in this patch:
* Update naming and documentation of the internal hash register.
It is not limited to multicast, but also used for multiple IA.
* Dump complete configuration register when debug traces are enabled.
* Debug output when multiple IA bit is set during CmdConfigure.
* Debug output when frames are received because multiple IA bit is set,
or when they are ignored although it is set.
As status is set to 0 on reset, invoke the relevant callback. This makes
for cleaner code in devices, as they don't need to duplicate the code
in their reset routine, and exercises this path a little more.
In particular this makes it possible to unify
vhost-net handling code with the following patch.
With -netdev, virtio devices present offload
features to guest, depending on the backend used.
Thus, removing host netdev peer while guest is
active leads to guest-visible inconsistency and/or crashes.
As a solution, while guest (NIC) peer device exists,
we prevent the host peer from being deleted.
This patch does this by adding peer_deleted flag in nic state:
if host device is going away while guest device
is around, set this flag and keep a shell of
the host device around for as long as guest device exists.
The link is put down so all packets will get discarded.
At the moment, management can detect that device deletion
is delayed by running 'info net'. As a next step, we shall add
commands that control hotplug/unplug without
removing the device, and an event to report that
guest has responded to the hotplug event.
Stefan Weil [Thu, 15 Jul 2010 20:28:02 +0000 (22:28 +0200)]
Add new user mode option -ignore-environment
An empty environment is sometimes useful in user mode.
The new option provides it for linux-user and bsd-user
(darwin-user still has no environment related options).
The patch also adds the documentation for other
environment related options.
Stefan Hajnoczi [Mon, 20 Sep 2010 13:11:19 +0000 (14:11 +0100)]
console: Avoid dereferencing NULL active_console
The console_select() function does not check that active_console is
non-NULL before dereferencing it. When invoked with qemu -nodefaults it
is possible to hit this case.
This patch checks that active_console is non-NULL before stashing away
the old console dimensions in console_select().
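The shape of the fix can be sketched with simplified, hypothetical types; QEMU's real console structures and the real console_select() signature differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced console state: only the stashed dimensions. */
struct example_console {
    int g_width;
    int g_height;
};

/* NULL until the first console has been selected (e.g. with -nodefaults). */
static struct example_console *active_console;

/* Switch to 'next', remembering the current display size in the old console. */
static void example_console_select(struct example_console *next,
                                   int cur_width, int cur_height)
{
    assert(next != NULL);

    if (active_console != NULL) {   /* the guard described above */
        active_console->g_width = cur_width;
        active_console->g_height = cur_height;
    }
    active_console = next;
}
```

With the guard in place the first call, made while no console is active, simply records the new console instead of dereferencing a NULL pointer.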
Stefan Weil [Thu, 23 Sep 2010 18:47:32 +0000 (20:47 +0200)]
blockdev: Use GCC_FMT_ATTR (format checking)
Additional changes:
* Removed 'extern' from drive_add (avoids too long line).
* Removed 'extern' from other functions (makes declarations
consistent with others in same header file).
Stefan Weil [Thu, 23 Sep 2010 19:28:03 +0000 (21:28 +0200)]
Replace most gcc format attributes by macro GCC_FMT_ATTR (format checking)
Since version 4.4.x, gcc supports additional format attributes.
__attribute__ ((format (gnu_printf, 1, 2)))
should be used instead of
__attribute__ ((format (printf, 1, 2)))
because QEMU always uses standard format strings (even with mingw32).
The patch replaces format attribute printf / __printf__ by macro
GCC_FMT_ATTR which uses gnu_printf if supported.
It also removes an #ifdef __GNUC__ (not needed any longer).
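A sketch of how such a macro could be defined and used. The version test and the exclusion of clang are assumptions of this example and may not match the definition in the QEMU tree; example_report() is a hypothetical function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Use gnu_printf where gcc >= 4.4 supports it, plain printf otherwise. */
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ * 100 + __GNUC_MINOR__ >= 404)
# define GCC_FMT_ATTR(n, m) __attribute__((format(gnu_printf, n, m)))
#elif defined(__GNUC__)
# define GCC_FMT_ATTR(n, m) __attribute__((format(printf, n, m)))
#else
# define GCC_FMT_ATTR(n, m)
#endif

/* Argument 3 is the format string, variadic arguments start at 4. */
static int example_report(char *buf, size_t cap, const char *fmt, ...)
    GCC_FMT_ATTR(3, 4);

static int example_report(char *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL);

    va_start(ap, fmt);
    n = vsnprintf(buf, cap, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place, a call such as example_report(buf, cap, "%s", 42) is flagged at compile time when format warnings are enabled.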
Andreas Färber [Sun, 19 Sep 2010 22:50:43 +0000 (00:50 +0200)]
configure: Add basic support for Haiku
For compatibility with BeOS, Haiku's error codes are negative whereas recent
POSIX versions require them to be positive. As spotted by François, some
parts of QEMU code rely on this, so use a mapper library to convert them
to positive ones.
Cc: François Revol
Cc: Ingo Weinhold
Haiku has network functions in libnetwork.so. It doesn't ship libutil.so.
Blue Swirl [Sat, 2 Oct 2010 14:28:12 +0000 (14:28 +0000)]
trace: avoid unnecessary recompilation if nothing changed
Add logic to detect changes in generated files. If the old
and new files are identical, don't touch the generated file.
This avoids a lot of churn since many files depend on trace.h.
Blue Swirl [Sat, 2 Oct 2010 14:28:08 +0000 (14:28 +0000)]
Makefile: fix config-devices.mak generation
The logic of detecting changes in default-configs/*.mak is
flawed as can be demonstrated by 'touch default-configs/*.mak'
followed by make. This results in a message claiming that user
made changes to the */config-devices.mak files.
Fix by separating the detection of changes made by the user and
changes in the default-configs.
That name makes no sense anymore, as dispatch tables have been split.
A better name is handler_is_qobject(), which really communicates
the handler's type.
QMP has its own dispatch tables, we can now drop the following
checks:
o 'info' command: this command doesn't exist in QMP's
dispatch table, the right thing will happen when it's
issued by a client (i.e. command not found error)
o monitor_handler_ported(): all QMP handlers are 'ported', no
need to check for that
o monitor_cmd_user_only(): no HMP handler will exist in QMP's
dispatch tables, that's why we have split them after all :-)
This file contains a copy of the following information from the
qemu-monitor.hx file:
o QObject handlers entries
o QMP documentation (all SQMP/EQMP sections)
Right now it's only used to generate the QMP docs in QMP/, but
next commits will turn this into QMP's command dispatch table.
It's important to note that QObject handlers entries are going
to get duplicated: they will exist in both QMP's and HMP's
dispatch tables.
This will be fixed in the near future, when we add a proper
QMP call interface and HMP is converted to use it. This way we
can completely drop QObject handlers entries from HMP's tables.
NOTE: HMP specific constructions, like "q|quit", have been dropped.
If I understood it correctly, the is_async_return() logic was only
used to prevent QMP from issuing duplicated success responses
for asynchronous handlers.
However, QMP doesn't use do_info() anymore so this is dead logic
and (hopefully) can be safely dropped.
disable guest-provided stats on "info balloon" command
The addition of memory stats reporting to the virtio balloon causes
the 'info balloon' command to become asynchronous. This is a regression
because in some cases it can hang the user monitor.
This is an alternative to Adam Litke's patch. Adam's patch disabled the
corresponding (guest-visible) virtio feature bit, causing issues for migration.
Original discussion is available at:
http://marc.info/?l=qemu-devel&m=128448124328314&w=2
The monitor does not pretty-print JSON output, so every reply ends up
on a single line. When JSON docs get large this is quite unpleasant
to read. The future command line capabilities query will return huge
JSON docs, so the monitor needs the ability to pretty-print.
This introduces a new API qobject_to_json_pretty() that does
a minimal indentation of list and dict members. As an example,
this makes
Andreas Färber [Tue, 28 Sep 2010 21:48:42 +0000 (23:48 +0200)]
tap: Remove double include of util.h
If none of __FreeBSD__, __FreeBSD_kernel__ and __DragonFly__ is defined,
util.h is included from tap-bsd.c.
Don't include it again if __OpenBSD__ is defined.
Fix a rpos coordination bug between qpa_run_out() and qpa_thread_out(),
which shows up as playback noises.
qpa_run_out()
                 qpa_thread_out loop N critical section 1
qpa_run_out()    qpa_thread_out loop N doing pa_simple_write()
qpa_run_out()    qpa_thread_out loop N doing pa_simple_write()
                 qpa_thread_out loop N critical section 2
                 qpa_thread_out loop N+1 critical section 1
qpa_run_out()    qpa_thread_out loop N+1 doing pa_simple_write()
In the above scheme, "qpa_thread_out loop N+1 critical section 1" will
get the same rpos as the one used by "qpa_thread_out loop N critical
section 1". So it will be reading dead samples from the old rpos.
The rpos can only be updated back to qpa_thread_out when there is a
qpa_run_out() run between two qpa_thread_out loops.
normal sequence:
  qpa_thread_out:
    hw->rpos (X0) => local rpos => pa->rpos (X1)
  qpa_run_out:
    pa->rpos (X1) => hw->rpos (X1)
  qpa_thread_out:
    hw->rpos (X1) => local rpos => pa->rpos (X2)
buggy sequence:
  qpa_thread_out:
    hw->rpos (X0) => local rpos => pa->rpos (X1)
  qpa_thread_out:
    hw->rpos (X0) => local rpos => pa->rpos (X1')
Obviously qpa_run_out() must be called at least once between any two
qpa_thread_out loops (after pa->rpos is set), in order for the new
qpa_thread_out loop to see the updated rpos.
Setting pa->live to 0 does the trick. The next loop will have to wait
for one qpa_run_out() invocation in order to get a non-zero pa->live
and proceed.