Peter Maydell [Thu, 19 Apr 2018 12:57:40 +0000 (13:57 +0100)]
linux-user: Fix getdents emulation for 64 bit guest on 32 bit host
Currently we mishandle emulation of the getdents syscall for the
case of a 64 bit guest on a 32 bit host -- it defaults into
the 'host and guest same size' codepath and generates incorrect
structures in the guest buffer.
We can't easily handle the 64-on-32 case using the host getdents
syscall, because the guest struct dirent is bigger than the
host struct dirent, and we might find the host syscall has handed
us back more records than we can fit in the guest buffer after
conversion. Instead, always emulate 64-on-32 getdents with
the host getdents64. This avoids the buffer-overrun problem
because a dirent64 struct is always the same size on any host
and always larger than any architecture's dirent struct.
Alex Bennée [Wed, 25 Apr 2018 10:02:18 +0000 (11:02 +0100)]
linux-user: set minimum uname for RISC-V
As support for RISC-V was only merged into the mainline kernel at 4.15
it is unlikely that glibc will be happy with a reported kernel version
of 3.8.0. Indeed when I testing binaries created by the current Debian
Sid compiler the tests failed with:
FATAL: kernel too old
Bump the version to the minimum a RISC-V glibc would expect:
* remotes/kraxel/tags/usb-20180427-pull-request:
ccid-card: include libcacard.h only
Fix libusb-1.0.22 deprecated libusb_set_debug with libusb_set_option
ccid: Fix dwProtocols advertisement of T=0
Peter Maydell [Fri, 27 Apr 2018 09:49:23 +0000 (10:49 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.13-20180427' into staging
ppc patch queue 2018-04-27
Here's the first batch of ppc patches for 2.13. This has a lot of
stuff that's accumulated during the 2.12 freeze. Highlights are:
* Many improvements for the Uninorth PCI host bridge for Mac
machine types
* Preliminary helpers improve handling of multiple backing
pagesizes (not strictly ppc related, but have acks and aimed to
allow future ppc changes)
* Cleanups to pseries cpu initialization
* Cleanups to hash64 MMU handling
* Assorted bugfixes and improvements
* remotes/dgibson/tags/ppc-for-2.13-20180427: (49 commits)
Clear mem_path if we fall back to anonymous RAM allocation
spapr: Set compatibility mode before the rest of spapr_cpu_reset()
target/ppc: Don't bother with MSR_EP in cpu_ppc_set_papr()
spapr: Support ibm,dynamic-memory-v2 property
ppc: e500: switch E500 based machines to full machine definition
spapr: Add ibm,max-associativity-domains property
target/ppc: Fold slb_nr into PPCHash64Options
target/ppc: Get rid of POWERPC_MMU_VER() macros
target/ppc: Remove unnecessary POWERPC_MMU_V3 flag from mmu_model
target/ppc: Fold ci_large_pages flag into PPCHash64Options
target/ppc: Move 1T segment and AMR options to PPCHash64Options
target/ppc: Make hash64_opts field mandatory for 64-bit hash MMUs
target/ppc: Split page size information into a separate allocation
target/ppc: Move page size setup to helper function
target/ppc: Remove fallback 64k pagesize information
target/ppc: Avoid taking "env" parameter to mmu-hash64 functions
target/ppc: Pass cpu instead of env to ppc_create_page_sizes_prop()
target/ppc: Simplify cpu valid check in ppc_cpu_realize
target/ppc: Standardize instance_init and realize function names
spapr: drop useless dynamic sysbus device sanity check
...
Tina Zhang [Fri, 27 Apr 2018 09:11:06 +0000 (17:11 +0800)]
ui: introduce vfio_display_reset
During guest OS reboot, guest framebuffer is invalid. It will cause
bugs, if the invalid guest framebuffer is still used by host.
This patch is to introduce vfio_display_reset which is invoked
during vfio display reset. This vfio_display_reset function is used
to release the invalid display resource, disable scanout mode and
replace the invalid surface with QemuConsole's DisplaySurafce.
This patch can fix the GPU hang issue caused by gd_egl_draw during
guest OS reboot.
Changes v3->v4:
- Move dma-buf based display check into the vfio_display_reset().
(Gerd)
Changes v2->v3:
- Limit vfio_display_reset to dma-buf based vfio display. (Gerd)
Changes v1->v2:
- Use dpy_gfx_update_full() update screen after reset. (Gerd)
- Remove dpy_gfx_switch_surface(). (Gerd)
When trying to build with latest libcacard-2.5.1, I hit the
following error:
In file included from hw/usb/ccid-card-passthru.c:12:0:
/usr/include/cacard/vscard_common.h:26:2: error: #warning "Only <libcacard.h> can be included directly" [-Werror=cpp]
#warning "Only <libcacard.h> can be included directly"
While it was fixed in libcacard upstream (so that individual
files can be included directly), it doesn't make much sense.
Let's switch to including the main libcacard.h and also require
at least libcacard-2.5.1 which introduced it. It's available
since late 2015.
CC hw/usb/host-libusb.o
/builds/xen/src/qemu-xen/hw/usb/host-libusb.c: In function 'usb_host_init':
/builds/xen/src/qemu-xen/hw/usb/host-libusb.c:250:5: error: 'libusb_set_debug' is deprecated: Use libusb_set_option instead [-Werror=deprecated-declarations]
libusb_set_debug(ctx, loglevel);
^~~~~~~~~~~~~~~~
In file included from /builds/xen/src/qemu-xen/hw/usb/host-libusb.c:40:0:
/usr/include/libusb-1.0/libusb.h:1300:18: note: declared here
void LIBUSB_CALL libusb_set_debug(libusb_context *ctx, int level);
^~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make: *** [/builds/xen/src/qemu-xen/rules.mak:66: hw/usb/host-libusb.o] Error 1
make: Leaving directory '/builds/xen/src/xen/tools/qemu-xen-build'
Jason Andryuk [Fri, 20 Apr 2018 18:32:19 +0000 (14:32 -0400)]
ccid: Fix dwProtocols advertisement of T=0
Commit d7d218ef02d87c637d20d64da8f575d434ff6f78 attempted to change
dwProtocols to only advertise support for T=0 and not T=1. The change
was incorrect as it changed 0x00000003 to 0x00010000.
lsusb -v in a linux guest shows:
"dwProtocols 65536 (Invalid values detected)", though the
smart card could still be accessed. Windows 7 does not detect inserted
smart cards and logs the the following Error in the Event Logs:
Source: Smart Card Service
Event ID: 610
Smart Card Reader 'QEMU QEMU USB CCID 0' rejected IOCTL SET_PROTOCOL:
Incorrect function. If this error persists, your smart card or reader
may not be functioning correctly
David Gibson [Thu, 19 Apr 2018 07:19:11 +0000 (17:19 +1000)]
Clear mem_path if we fall back to anonymous RAM allocation
If the -mem-path option is set, we attempt to map the guest's RAM from a
file in the given path; it's usually used to back guest RAM with hugepages.
If we're unable to (e.g. not enough free hugepages) then we fall back to
allocating normal anonymous pages. This behaviour can be surprising, but a
comment in allocate_system_memory_nonnuma() suggests it's legacy behaviour
we can't change.
What really isn't ok, though, is that in this case we leave mem_path set.
That means functions which attempt to determine the pagesize of main RAM
can erroneously think it is hugepage based on the requested path, even
though it's not.
This is particular bad for the pseries machine type. KVM HV limitations
mean the guest can't use pagesizes larger than the host page size used to
back RAM. That means that such a fallback, rather than merely giving
poorer performance than expected will cause the guest to freeze up early in
boot as it attempts to use large page mappings that can't work.
This patch addresses the problem by clearing the mem_path variable when we
fall back to anonymous pages, meaning that subsequent attempts to
determine the RAM page size will get an accurate result.
David Gibson [Thu, 5 Apr 2018 05:49:23 +0000 (15:49 +1000)]
spapr: Set compatibility mode before the rest of spapr_cpu_reset()
Although the order doesn't really matter at the moment, it's possible
other initializastions could depend on the compatiblity mode, so make sure
we set it first in spapr_cpu_reset().
While we're at it drop the test against first_cpu. Setting the compat mode
to the value it already has is redundant, but harmless, so we might as well
make a small simplification to the code.
David Gibson [Fri, 13 Apr 2018 04:54:34 +0000 (14:54 +1000)]
target/ppc: Don't bother with MSR_EP in cpu_ppc_set_papr()
cpu_ppc_set_papr() removes the EP and HV bits from the MSR mask. While
removing the HV bit makes sense (a cpu in PAPR mode should never be
emulated in hypervisor mode), the EP bit is just bizarre. Although it's
true that a papr mode guest shouldn't be able to change the exception
prefix, the MSR[EP] bit doesn't even exist on the cpus supported for PAPR
mode, so it's pointless to do anything with it here.
Igor Mammedov [Wed, 18 Apr 2018 14:28:02 +0000 (16:28 +0200)]
ppc: e500: switch E500 based machines to full machine definition
Convert PPCE500Params to PCCE500MachineClass which it essentially is,
and introduce PCCE500MachineState to keep track of E500 specific
state instead of adding global variables or extra parameters to
functions when we need to keep data beyond machine init
(i.e. make it look like typical fully defined machine).
It's pretty shallow conversion instead of currently used trivial
DEFINE_MACHINE() macro. It adds extra 60LOC of boilerplate code
of full machine definition.
The patch on top[1] will use PCCE500MachineState to keep track of
platform_bus device and add E500Plate specific machine class
to use HOTPLUG_HANDLER for explicitly initializing dynamic
sysbus devices at the time they are added instead of delaying
it to machine done time by platform_bus_init_notify() which is
being removed.
Now recent kernels (i.e. since linux-stable commit a346137e9142
("powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes")
support this property to mark initially memory-less NUMA nodes as "possible"
to allow further memory hot-add to them.
Advertise this property for pSeries machines to let guest kernels detect
maximum supported node configuration and benefit from kernel side change
when hot-add memory to specific, possibly empty before, NUMA node.
David Gibson [Thu, 29 Mar 2018 07:29:38 +0000 (18:29 +1100)]
target/ppc: Fold slb_nr into PPCHash64Options
The env->slb_nr field gives the size of the SLB (Segment Lookaside Buffer).
This is another static-after-initialization parameter of the specific
version of the 64-bit hash MMU in the CPU. So, this patch folds the field
into PPCHash64Options with the other hash MMU options.
This is a bit more complicated that the things previously put in there,
because slb_nr was foolishly included in the migration stream. So we need
some of the usual dance to handle backwards compatible migration.
David Gibson [Fri, 23 Mar 2018 05:48:43 +0000 (16:48 +1100)]
target/ppc: Get rid of POWERPC_MMU_VER() macros
These macros were introduced to deal with the fact that the mmu_model
field has bit flags mixed in with what's otherwise an enum of various mmu
types.
We've now eliminated all those flags except for one, and that one -
POWERPC_MMU_64 - is already included/compared in the MMU_VER macros. So,
we can get rid of those macros and just directly compare mmu_model values
in the places it was used.
David Gibson [Fri, 23 Mar 2018 05:42:45 +0000 (16:42 +1100)]
target/ppc: Remove unnecessary POWERPC_MMU_V3 flag from mmu_model
The only place we test this flag is in conjunction with
ppc64_use_proc_tbl(). That checks for the LPCR_UPRT bit, which we already
ensure can't be set except on a machine with a v3 MMU (i.e. POWER9).
David Gibson [Fri, 23 Mar 2018 03:32:48 +0000 (14:32 +1100)]
target/ppc: Fold ci_large_pages flag into PPCHash64Options
The ci_large_pages boolean in CPUPPCState is only relevant to 64-bit hash
MMU machines, indicating whether it's possible to map large (> 4kiB) pages
as cache-inhibitied (i.e. for IO, rather than memory). Fold it as another
flag into the PPCHash64Options structure.
David Gibson [Fri, 23 Mar 2018 03:11:07 +0000 (14:11 +1100)]
target/ppc: Move 1T segment and AMR options to PPCHash64Options
Currently env->mmu_model is a bit of an unholy mess of an enum of distinct
MMU types, with various flag bits as well. This makes which bits of the
field should be compared pretty confusing.
Make a start on cleaning that up by moving two of the flags bits -
POWERPC_MMU_1TSEG and POWERPC_MMU_AMR - which are specific to the 64-bit
hash MMU into a new flags field in PPCHash64Options structure.
David Gibson [Fri, 23 Mar 2018 02:59:20 +0000 (13:59 +1100)]
target/ppc: Make hash64_opts field mandatory for 64-bit hash MMUs
Currently some cpus set the hash64_opts field in the class structure, with
specific details of their variant of the 64-bit hash mmu. For the
remaining cpus with that mmu, ppc_hash64_realize() fills in defaults.
But there are only a couple of cpus that use those fallbacks, so just have
them to set the has64_opts field instead, simplifying the logic.
David Gibson [Fri, 23 Mar 2018 02:31:52 +0000 (13:31 +1100)]
target/ppc: Split page size information into a separate allocation
env->sps contains page size encoding information as an embedded structure.
Since this information is specific to 64-bit hash MMUs, split it out into
a separately allocated structure, to reduce the basic env size for other
cpus. Along the way we make a few other cleanups:
* Rename to PPCHash64Options which is more in line with qemu name
conventions, and reflects that we're going to merge some more hash64
mmu specific details in there in future. Also rename its
substructures to match qemu conventions.
* Move structure definitions to the mmu-hash64.[ch] files.
David Gibson [Fri, 23 Mar 2018 02:07:48 +0000 (13:07 +1100)]
target/ppc: Move page size setup to helper function
Initialization of the env->sps structure at the end of instance_init is
specific to the 64-bit hash MMU, so move the code into a helper function
in mmu-hash64.c.
We also create a corresponding function to be called at finalize time -
it's empty for now, but we'll need it shortly.
David Gibson [Fri, 23 Mar 2018 00:24:28 +0000 (11:24 +1100)]
target/ppc: Remove fallback 64k pagesize information
CPU definitions for cpus with the 64-bit hash MMU can include a table of
available pagesizes. If this isn't supplied ppc_cpu_instance_init() will
fill it in a fallback table based on the POWERPC_MMU_64K bit in mmu_model.
However, it turns out all the cpus which support 64K pages already include
an explicit table of page sizes, so there's no point to the fallback table
including 64k pages.
That removes the only place which tests POWERPC_MMU_64K, so we can remove
it. Which in turn allows some logic to be removed from
kvm_fixup_page_sizes().
David Gibson [Thu, 22 Mar 2018 05:49:28 +0000 (16:49 +1100)]
target/ppc: Avoid taking "env" parameter to mmu-hash64 functions
In most cases we prefer to pass a PowerPCCPU rather than the (embedded)
CPUPPCState.
For ppc_hash64_update_{rmls,vrma}() change to take "cpu" instead of "env".
For ppc_hash64_set_{dsi,isi}() remove the redundant "env" parameter.
In theory this makes more work for the functions, but since "cs", "cpu"
and "env" are related by at most constant offsets, the compiler should be
able to optimize out the difference at effectively zero cost.
helper_*() functions are left alone - since they're more closely tied to
the TCG generated code, passing "env" is still the standard there.
David Gibson [Fri, 23 Mar 2018 01:55:04 +0000 (12:55 +1100)]
target/ppc: Simplify cpu valid check in ppc_cpu_realize
The #if isn't necessary, because there's a suitable one inside
ppc_cpu_is_valid(). We've already filtered for suitable cpu models in the
functions that search and register them. So by the time we get to realize
having an invalid one indicates a code error, not a user error, so an
assert() is more appropriate than error_setg().
David Gibson [Wed, 21 Mar 2018 04:30:13 +0000 (15:30 +1100)]
target/ppc: Standardize instance_init and realize function names
Because of the various hooks called some variant on "init" - and the rather
greater number that used to exist, I'm always wondering when a function
called simply "*_init" or "*_initfn" will be called.
To make it easier on myself, and maybe others, rename the instance_init
hooks for ppc cpus to *_instance_init(). While we're at it rename the
realize time hooks to *_realize() (from *_realizefn()) which seems to be
the more common current convention.
Greg Kurz [Wed, 11 Apr 2018 15:01:20 +0000 (17:01 +0200)]
spapr: drop useless dynamic sysbus device sanity check
Since commit 7da79a167aa11, the machine class init function registers
dynamic sysbus device types it supports. Passing an unsupported device
type on the command line causes QEMU to exit with an error message
just after machine init.
It is hence not needed to do the same sanity check at machine reset.
Leave change @node type from uint32_t to to int from reverted commit
because node < 0 is always false.
Note that implementing capability or some trick to detect if guest
kernel does not support hot-add to memory: this returns previous
behavour where memory added to first non-empty node.
David Gibson [Tue, 3 Apr 2018 05:05:45 +0000 (15:05 +1000)]
Add host_memory_backend_pagesize() helper
There are a couple places (one generic, one target specific) where we need
to get the host page size associated with a particular memory backend. I
have some upcoming code which will add another place which wants this. So,
for convenience, add a helper function to calculate this.
host_memory_backend_pagesize() returns the host pagesize for a given
HostMemoryBackend object.
David Gibson [Tue, 3 Apr 2018 04:55:11 +0000 (14:55 +1000)]
Make qemu_mempath_getpagesize() accept NULL
qemu_mempath_getpagesize() gets the effective (host side) page size for
a block of memory backed by an mmap()ed file on the host. It requires
the mem_path parameter to be non-NULL.
This ends up meaning all the callers need a different case for handling
anonymous memory (for memory-backend-ram or default memory with -mem-path
is not specified).
We can make all those callers a little simpler by having
qemu_mempath_getpagesize() accept NULL, and treat that as the anonymous
memory case.
BALATON Zoltan [Sun, 25 Mar 2018 23:54:28 +0000 (01:54 +0200)]
target/ppc: Fix reserved bit mask of dstst instruction
According to the Vector/SIMD extension documentation bit 6 that is
currently masked is valid (listed as transient bit) but bits 7 and 8
should be reserved instead. Fix the mask to match this.
Michael Matz [Fri, 23 Feb 2018 17:29:56 +0000 (17:29 +0000)]
ppc: Fix size of ppc64 xer register
The normal gdb definition of the XER registers is only 32 bit,
and that's what the current version of power64-core.xml also
says (seems copied from gdb's). But qemu's idea of the XER register
is target_ulong (in CPUPPCState, ppc_gdb_register_len and
ppc_cpu_gdb_read_register)
That mismatch leads to the following message when attaching
with gdb:
Truncated register 32 in remote 'g' packet
(and following on that qemu stops responding). The simple fix is
to say the truth in the .xml file. But the better fix is to
actually make it 32bit on the wire, as old gdbs don't support
XML files for describing registers. Also the XER state in qemu
doesn't seem to use the high 32 bits, so sending it off to gdb
doesn't seem worthwhile.
uninorth: move PCI IO (ISA) memory region into the uninorth device
Do this for both the uninorth main and uninorth u3 AGP buses, using the main
PCI bus for each machine (this ensures the IO addresses still match those
used by OpenBIOS).
uninorth: remove obsolete pci_pmac_u3_init() function
Instead wire up the PCI/AGP host bridges in mac_newworld.c. Now this is complete
it is possible to move the initialisation of the PCI hole alias into
pci_u3_agp_init().
uninorth: remove obsolete pci_pmac_init() function
Instead wire up the PCI/AGP host bridges in mac_newworld.c. Now this is complete
it is possible to move the initialisation of the PCI hole alias into
pci_unin_main_init().
Somewhere in the history of time, the initialisation of the PCI buses for the
AGP and PCI host bridges got mixed up in that the PCI host bridge was
creating an instance of the AGP PCI bus, and the AGP PCI bus was missing.
Swap the PCI host bridge over to use the correct PCI bus (including setting
the kMacRISCPCIAddressSelect register used by MacOS X) and add the missing
reference to the AGP PCI bus.
uninorth: move PCI host bridge bus initialisation into device realize
Since the IO address space is fixed to use the standard system IO address
space then we can also use the opportunity to remove the address_space_io
parameter from pci_pmac_init() and pci_pmac_u3_init().
Note we also move the default mac99 PCI bus to the end of the initialisation
list so that it becomes the default destination for any devices specified
via -device without an explicit PCI bus provided.
This is in preparation for moving the PCI bus wiring inside the uninorth
host bridge devices. In the future it will be possible to remove this once the
PICs have been switched to use qdev GPIOs.
mac_oldworld: move wiring of macio IRQs to macio_oldworld_realize()
Since the macio device has a link to the PIC device, we can now wire up the
IRQs directly via qdev GPIOs rather than having to use an intermediate array.
This is the first step towards removing the old-style pci_grackle_init()
function. Following on from the previous commit we can now pass the heathrow
device as an object link and wire up the heathrow IRQs via qdev GPIOs.
uninorth: move uninorth definitions into uninorth.h
Signed-off-by: Mark Cave-Ayland <[email protected]>
[dwg: Added hw/hw.h #include as suggested by Philippe Mathieu-Daudé] Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Signed-off-by: David Gibson <[email protected]>
uninorth: remove second set of uninorth token registers
Commit 593c181160: "PPC: Newworld: Add second uninorth control register set"
added a second set of uninorth registers at 0xf3000000.
Testing MacOS 9.2 to MacOS X 10.4 reveals no accesses to this address and I
can't find any reference to it in Apple's Core99.cpp source so I'm assuming
that this was the result of another bug that has now been fixed.
Peter Maydell [Thu, 26 Apr 2018 18:22:09 +0000 (19:22 +0100)]
Merge remote-tracking branch 'remotes/iwj/tags/for-upstream.depriv-2' into staging
xen: xen-domid-restrict improvements
# gpg: Signature made Thu 26 Apr 2018 19:11:22 BST
# gpg: using RSA key E3E3392348B50D39
# gpg: Good signature from "Ian Jackson (new general purpose key) <[email protected]>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 559A E46C 2D6B 6D32 65E7 CBA1 E3E3 3923 48B5 0D39
* remotes/iwj/tags/for-upstream.depriv-2:
configure: do_compiler: Dump some extra info under bash
os-posix: cleanup: Replace perror with error_report
os-posix: cleanup: Replace fprintf with error_report in remaining call sites
xen: Expect xenstore write to fail when restricted
xen: Remove now-obsolete xen_xc_domain_add_to_physmap
xen: Use newly added dmops for mapping VGA memory
os-posix: Provide new -runas <uid>:<gid> facility
os-posix: cleanup: Replace fprintfs with error_report in change_process_uid
xen: destroy_hvm_domain: Try xendevicemodel_shutdown
xen: move xc_interface compatibility fallback further up the file
xen: destroy_hvm_domain: Move reason into a variable
xen: defer call to xen_restrict until just before os_setup_post
xen: restrict: use xentoolcore_restrict_all
xen: link against xentoolcore
AccelClass: Introduce accel_setup_post
checkpatch: Add xendevicemodel_handle to the list of types
Ian Jackson [Mon, 25 Sep 2017 15:41:03 +0000 (16:41 +0100)]
configure: do_compiler: Dump some extra info under bash
This makes it much easier to find a particular thing in config.log.
We have to use the ${BASH_LINENO[*]} syntax which is a syntax error in
other shells, so test what shell we are running and use eval.
The extra output is only printed if configure is run with bash. On
systems where /bin/sh is not bash, it is necessary to say bash
./configure to get the extra debug info in the log.
Ross Lagerwall [Mon, 5 Mar 2018 10:07:46 +0000 (10:07 +0000)]
xen: Expect xenstore write to fail when restricted
Saving the current state to xenstore may fail when running restricted
(in particular, after a migration). Therefore, don't report the error or
exit when running restricted. Toolstacks that want to allow running
QEMU restricted should instead make use of QMP events to listen for
state changes.
Ross Lagerwall [Mon, 23 Oct 2017 09:27:27 +0000 (10:27 +0100)]
xen: Use newly added dmops for mapping VGA memory
Xen unstable (to be in 4.11) has two new dmops, relocate_memory and
pin_memory_cacheattr. Use these to set up the VGA memory, replacing the
previous calls to libxc. This allows the VGA console to work properly
when QEMU is running restricted (-xen-domid-restrict).
Wrapper functions are provided to allow QEMU to work with older versions
of Xen.
Tweak the error handling while making this change:
* Report pin_memory_cacheattr errors.
* Report errors even when DEBUG_HVM is not set. This is useful for
trying to understand why VGA is not working, since otherwise it just
fails silently.
* Fix the return values when an error occurs. The functions now
consistently return -1 and set errno.
Ian Jackson [Fri, 15 Sep 2017 17:10:44 +0000 (18:10 +0100)]
os-posix: Provide new -runas <uid>:<gid> facility
This allows the caller to specify a uid and gid to use, even if there
is no corresponding password entry. This will be useful in certain
Xen configurations.
We don't support just -runas <uid> because: (i) deprivileging without
calling setgroups would be ineffective (ii) given only a uid we don't
know what gid we ought to use (since uids may eppear in multiple
passwd file entries with different gids).
Ian Jackson [Tue, 3 Oct 2017 17:51:05 +0000 (18:51 +0100)]
xen: move xc_interface compatibility fallback further up the file
We are going to want to use the dummy xendevicemodel_handle type in
new stub functions in the CONFIG_XEN_CTRL_INTERFACE_VERSION < 41000
section. So we need to provide that definition, or (as applicable)
include the appropriate header, earlier in the file.
(Ideally the newer compatibility layers would be at the bottom of the
file, so that they can naturally benefit from the compatibility layers
for earlier version. But that's rather too much for this series.)
Ian Jackson [Fri, 15 Sep 2017 15:02:24 +0000 (16:02 +0100)]
xen: defer call to xen_restrict until just before os_setup_post
We need to restrict *all* the control fds that qemu opens. Looking in
/proc/PID/fd shows there are many; their allocation seems scattered
throughout Xen support code in qemu.
We must postpone the restrict call until roughly the same time as qemu
changes its uid, chroots (if applicable), and so on.
There doesn't seem to be an appropriate hook already. The RunState
change hook fires at different times depending on exactly what mode
qemu is operating in.
And it appears that no-one but the Xen code wants a hook at this phase
of execution. So, introduce a bare call to a new function
xen_setup_post, just before os_setup_post. Also provide the
appropriate stub for when Xen compilation is disabled.
We do the restriction before rather than after os_setup_post, because
xen_restrict may need to open /dev/null, and os_setup_post might have
called chroot.
Currently this does not work with migration, because when running as
the Xen device model qemu needs to signal to the toolstack that it is
ready. It currently does this using xenstore, and for incoming
migration (but not for ordinary startup) that happens after
os_setup_post.
It is correct that this happens late: we want the incoming migration
stream to be processed by a restricted qemu. The fix for this will be
to do the startup notification a different way, without using
xenstore. (QMP is probably a reasonable choice.)
So for now this restriction feature cannot be used in conjunction with
migration. (Note that this is not a regression in this patch, because
previously the -xen-restrict-domid call was, in fact, simply
ineffective!) We will revisit this in the Xen 4.11 release cycle.
Ian Jackson [Fri, 15 Sep 2017 15:03:14 +0000 (16:03 +0100)]
xen: restrict: use xentoolcore_restrict_all
And insist that it works.
Drop individual use of xendevicemodel_restrict and
xenforeignmemory_restrict. These are not actually effective in this
version of qemu, because qemu has a large number of fds open onto
various Xen control devices.
The restriction arrangements are still not right, because the
restriction needs to be done very late - after qemu has opened all of
its control fds.
xentoolcore_restrict_all and xentoolcore.h are available in Xen 4.10
and later, only. Provide a compatibility stub. And drop the
compatibility stubs for the old functions.
Ian Jackson [Thu, 8 Mar 2018 18:07:26 +0000 (18:07 +0000)]
checkpatch: Add xendevicemodel_handle to the list of types
This avoids checkpatch misparsing (as statements) long function
definitions or declarations, which sometimes start with constructs
like this:
static inline int xendevicemodel_relocate_memory(
xendevicemodel_handle *dmod, domid_t domid, ...
The type xendevicemodel_handle does not conform to Qemu CODING_STYLE,
which would suggest CamelCase. However, it is a type defined by the
Xen Project in xen.git. It would be possible to introduce a typedef
to allow the qemu code to refer to it by a differently-spelled name,
but that would obfuscate more than it would clarify.
Peter Maydell [Fri, 20 Apr 2018 14:52:48 +0000 (15:52 +0100)]
vl.c: Remove compile time limit on number of serial ports
Instead of having a fixed sized global serial_hds[] array,
use a local dynamically reallocated one, so we don't have
a compile time limit on how many serial ports a system has.
Peter Maydell [Fri, 20 Apr 2018 14:52:47 +0000 (15:52 +0100)]
superio: Don't use MAX_SERIAL_PORTS for serial port limit
The superio device has a limit on the number of serial
ports it supports which is really only there because
it has a fixed-size array serial[]. This limit isn't
related particularly to the global MAX_SERIAL_PORTS limit,
so use a different #define for it.
(In practice the users of superio only ever want 2 serial ports.)
Peter Maydell [Fri, 20 Apr 2018 14:52:46 +0000 (15:52 +0100)]
serial-isa: Use MAX_ISA_SERIAL_PORTS instead of MAX_SERIAL_PORTS
The ISA serial port handling in serial-isa.c imposes a limit
of 4 serial ports. This is because we only know of 4 IO port
and IRQ settings for them, and is unrelated to the generic
MAX_SERIAL_PORTS limit, though they happen to both be set at
4 currently.
Use a new MAX_ISA_SERIAL_PORTS wherever that is the correct
limit to be checking against.
Peter Maydell [Fri, 20 Apr 2018 14:52:45 +0000 (15:52 +0100)]
hw/char/exynos4210_uart.c: Remove unneeded handling of NULL chardev
The handling of NULL chardevs in exynos4210_uart_create() is now
all unnecessary: we don't need to create 'null' chardevs, and we
don't need to enforce a bounds check on serial_hd().
Peter Maydell [Fri, 20 Apr 2018 14:52:44 +0000 (15:52 +0100)]
Remove checks on MAX_SERIAL_PORTS that are just bounds checks
Remove checks on MAX_SERIAL_PORTS that were just checking whether
they were within bounds for the serial_hds[] array and falling
back to NULL if not. This isn't needed with the serial_hd()
function, which returns NULL for all indexes beyond what the
user set up.
Peter Maydell [Fri, 20 Apr 2018 14:52:43 +0000 (15:52 +0100)]
Change references to serial_hds[] to serial_hd()
Change all the uses of serial_hds[] to go via the new
serial_hd() function. Code change produced with:
find hw -name '*.[ch]' | xargs sed -i -e 's/serial_hds\[\([^]]*\)\]/serial_hd(\1)/g'
Peter Maydell [Fri, 20 Apr 2018 14:52:42 +0000 (15:52 +0100)]
vl.c: Provide accessor function serial_hd() for serial_hds[] array
Provide an accessor function serial_hd() to return the Chardev
(if any) associated with the numbered serial port. This will
be used to replace direct accesses to the serial_hds[] array,
so that calling code doesn't need to care about the size of
that array.