Long Li [Wed, 15 May 2019 21:09:05 +0000 (14:09 -0700)]
cifs: Allocate memory for all iovs in smb2_ioctl
An IOCTL uses up to 2 iovs. The 1st iov is the command itself, the 2nd iov is
optional data for that command. The 1st iov is always allocated on the heap
but the 2nd iov may point to a variable on the stack. This will trigger an
error when passing the 2nd iov for RDMA I/O.
Linus Torvalds [Thu, 16 May 2019 01:56:50 +0000 (18:56 -0700)]
Merge tag 'libnvdimm-fixes-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"Just a small collection of fixes this time around.
The new virtio-pmem driver is nearly ready, but some last minute
device-mapper acks and virtio questions made it prudent to await v5.3.
Other major topics that were brewing on the linux-nvdimm mailing list
like sub-section hotplug, and other devm_memremap_pages() reworks will
go upstream through Andrew's tree.
Summary:
- Fix a long standing namespace label corruption scenario when
re-provisioning capacity for a namespace.
- Restore the ability of the dax_pmem module to be built-in.
- Harden the build for the 'nfit_test' unit test modules so that the
userspace test harness can ensure all required test modules are
available"
* tag 'libnvdimm-fixes-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
drivers/dax: Allow to include DEV_DAX_PMEM as builtin
libnvdimm/namespace: Fix label tracking error
tools/testing/nvdimm: add watermarks for dax_pmem* modules
dax/pmem: Fix whitespace in dax_pmem
Linus Torvalds [Thu, 16 May 2019 01:50:40 +0000 (18:50 -0700)]
Merge tag 'for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
"Core:
- Add over-current health state
- Add standard, adaptive and custom charge types
- Add new properties for start/end charge threshold
New Drivers / Hardware:
- UCS1002 Programmable USB Port Power Controller
- Ingenic JZ47xx Battery Fuel Gauge
- AXP20x USB Power: Add AXP813 support
- AT91 poweroff: Add SAM9X60 support
- OLPC battery: Add XO-1.5 and XO-1.75 support
Misc Changes:
- syscon-reboot: support mask property
- AXP288 fuel gauge: Blacklist ACEPC T8/T11. Looks like some vendor
thought it's a good idea to build a desktop system with a fuel
gauge, that slowly "discharges"...
- cpcap-battery: Fix calculation errors
- misc fixes"
* tag 'for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (54 commits)
power: supply: olpc_battery: force the le/be casts
power: supply: ucs1002: Fix build error without CONFIG_REGULATOR
power: supply: ucs1002: Fix wrong return value checking
power: supply: Add driver for Microchip UCS1002
dt-bindings: power: supply: Add bindings for Microchip UCS1002
power: supply: core: Add POWER_SUPPLY_HEALTH_OVERCURRENT constant
power: supply: core: fix clang -Wunsequenced
power: supply: core: Add missing documentation for CHARGE_CONTROL_* properties
power: supply: core: Add CHARGE_CONTROL_{START_THRESHOLD,END_THRESHOLD} properties
power: supply: core: Add Standard, Adaptive, and Custom charge types
power: supply: axp288_fuel_gauge: Add ACEPC T8 and T11 mini PCs to the blacklist
power: supply: bq27xxx_battery: Notify also about status changes
power: supply: olpc_battery: Have the framework register sysfs files for us
power: supply: olpc_battery: Add OLPC XO 1.75 support
power: supply: olpc_battery: Avoid using platform_info
power: supply: olpc_battery: Use devm_power_supply_register()
power: supply: olpc_battery: Move priv data to a struct
power: supply: olpc_battery: Use DT to get battery version
x86/platform/olpc: Use a correct version when making up a battery node
x86/platform/olpc: Trivial code move in DT fixup
...
Linus Torvalds [Thu, 16 May 2019 01:44:52 +0000 (18:44 -0700)]
Merge tag 'for-linus-5.2b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- some minor cleanups
- two small corrections for Xen on ARM
- two fixes for Xen PVH guest support
- a patch for a new command line option to tune virtual timer handling
* tag 'for-linus-5.2b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/arm: Use p2m entry with lock protection
xen/arm: Free p2m entry if fail to add it to RB tree
xen/pvh: correctly setup the PV EFI interface for dom0
xen/pvh: set xen_domain_type to HVM in xen_pvh_init
xenbus: drop useless LIST_HEAD in xenbus_write_watch() and xenbus_file_write()
xen-netfront: mark expected switch fall-through
xen: xen-pciback: fix warning Using plain integer as NULL pointer
x86/xen: Add "xen_timer_slop" command line option
Tony Luck [Thu, 16 May 2019 01:04:14 +0000 (18:04 -0700)]
ia64: Make sure that we have a mmiowb function real early
Generic kernels feed many operation through the "machvec" logic to get
the correct form of the operation for the current system. "mmiowb()" is
one of those operations.
Although machvec is initialized very early in boot, it isn't early
enough for a recent upstream kernel change that added mmiowb to the
spin_unlock() path.
Statically initialize the mmiowb field of machvec so that we won't die
with a call through a NULL pointer. This should be safe because we do
the real initialization of machvec before bringing up any addtional CPUs
or doing any I/O.
Fixes: 49ca6462fc9e ("ia64/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()") Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
Linus Torvalds [Thu, 16 May 2019 01:21:43 +0000 (18:21 -0700)]
Merge tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"This consists mostly of nfsd container work:
Scott Mayhew revived an old api that communicates with a userspace
daemon to manage some on-disk state that's used to track clients
across server reboots. We've been using a usermode_helper upcall for
that, but it's tough to run those with the right namespaces, so a
daemon is much friendlier to container use cases.
Trond fixed nfsd's handling of user credentials in user namespaces. He
also contributed patches that allow containers to support different
sets of NFS protocol versions.
The only remaining container bug I'm aware of is that the NFS reply
cache is shared between all containers. If anyone's aware of other
gaps in our container support, let me know.
The rest of this is miscellaneous bugfixes"
* tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux: (23 commits)
nfsd: update callback done processing
locks: move checks from locks_free_lock() to locks_release_private()
nfsd: fh_drop_write in nfsd_unlink
nfsd: allow fh_want_write to be called twice
nfsd: knfsd must use the container user namespace
SUNRPC: rsi_parse() should use the current user namespace
SUNRPC: Fix the server AUTH_UNIX userspace mappings
lockd: Pass the user cred from knfsd when starting the lockd server
SUNRPC: Temporary sockets should inherit the cred from their parent
SUNRPC: Cache the process user cred in the RPC server listener
nfsd: Allow containers to set supported nfs versions
nfsd: Add custom rpcbind callbacks for knfsd
SUNRPC: Allow further customisation of RPC program registration
SUNRPC: Clean up generic dispatcher code
SUNRPC: Add a callback to initialise server requests
SUNRPC/nfs: Fix return value for nfs4_callback_compound()
nfsd: handle legacy client tracking records sent by nfsdcld
nfsd: re-order client tracking method selection
nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
nfsd: un-deprecate nfsdcld
...
Dave Airlie [Thu, 16 May 2019 00:19:29 +0000 (10:19 +1000)]
Merge tag 'drm-misc-next-fixes-2019-05-15' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
- A couple new panfrost fixes
- Fix the low refresh rate register in adv7511
- A handful of msm fixes that fell out of 5.1 bringup on SDM845
- Fix spinlock initialization in pl111
Linus Torvalds [Wed, 15 May 2019 23:05:47 +0000 (16:05 -0700)]
Merge tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"The major changes in this tracing update includes:
- Removal of non-DYNAMIC_FTRACE from 32bit x86
- Removal of mcount support from x86
- Emulating a call from int3 on x86_64, fixes live kernel patching
- Consolidated Tracing Error logs file
Minor updates:
- Removal of klp_check_compiler_support()
- kdb ftrace dumping output changes
- Accessing and creating ftrace instances from inside the kernel
- Clean up of #define if macro
- Introduction of TRACE_EVENT_NOP() to disable trace events based on
config options
And other minor fixes and clean ups"
* tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
x86: Hide the int3_emulate_call/jmp functions from UML
livepatch: Remove klp_check_compiler_support()
ftrace/x86: Remove mcount support
ftrace/x86_32: Remove support for non DYNAMIC_FTRACE
tracing: Simplify "if" macro code
tracing: Fix documentation about disabling options using trace_options
tracing: Replace kzalloc with kcalloc
tracing: Fix partial reading of trace event's id file
tracing: Allow RCU to run between postponed startup tests
tracing: Fix white space issues in parse_pred() function
tracing: Eliminate const char[] auto variables
ring-buffer: Fix mispelling of Calculate
tracing: probeevent: Fix to make the type of $comm string
tracing: probeevent: Do not accumulate on ret variable
tracing: uprobes: Re-enable $comm support for uprobe events
ftrace/x86_64: Emulate call function while updating in breakpoint handler
x86_64: Allow breakpoints to emulate call instructions
x86_64: Add gap to int3 to allow for call emulation
tracing: kdb: Allow ftdump to skip all but the last few entries
tracing: Add trace_total_entries() / trace_total_entries_cpu()
...
GCC 4.7, which is still permitted, emits code using the original
syntax. This means we end up with lots of assembler warnings when
building with a currently-supported version of gcc.
Revert the commit (with fixups to keep the follow-on -mauto-it
change) to avoid these warnings.
Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU"
The RDPMC-exiting control is dependent on the existence of the RDPMC
instruction itself, i.e. is not tied to the "Architectural Performance
Monitoring" feature. For all intents and purposes, the control exists
on all CPUs with VMX support since RDPMC also exists on all VCPUs with
VMX supported. Per Intel's SDM:
The RDPMC instruction was introduced into the IA-32 Architecture in
the Pentium Pro processor and the Pentium processor with MMX technology.
The earlier Pentium processors have performance-monitoring counters, but
they must be read with the RDMSR instruction.
Because RDPMC-exiting always exists, KVM requires the control and refuses
to load if it's not available. As a result, hiding the PMU from a guest
breaks nested virtualization if the guest attemts to use KVM.
While it's not explicitly stated in the RDPMC pseudocode, the VM-Exit
check for RDPMC-exiting follows standard fault vs. VM-Exit prioritization
for privileged instructions, e.g. occurs after the CPL/CR0.PE/CR4.PCE
checks, but before the counter referenced in ECX is checked for validity.
In other words, the original KVM behavior of injecting a #GP was correct,
and the KVM unit test needs to be adjusted accordingly, e.g. eat the #GP
when the unit test guest (L3 in this case) executes RDPMC without
RDPMC-exiting set in the unit test host (L2).
Kai Huang [Fri, 3 May 2019 08:40:25 +0000 (01:40 -0700)]
kvm: x86: Fix L1TF mitigation for shadow MMU
Currently KVM sets 5 most significant bits of physical address bits
reported by CPUID (boot_cpu_data.x86_phys_bits) for nonpresent or
reserved bits SPTE to mitigate L1TF attack from guest when using shadow
MMU. However for some particular Intel CPUs the physical address bits
of internal cache is greater than physical address bits reported by
CPUID.
Use the kernel's existing boot_cpu_data.x86_cache_bits to determine the
five most significant bits. Doing so improves KVM's L1TF mitigation in
the unlikely scenario that system RAM overlaps the high order bits of
the "real" physical address space as reported by CPUID. This aligns with
the kernel's warnings regarding L1TF mitigation, e.g. in the above
scenario the kernel won't warn the user about lack of L1TF mitigation
if x86_cache_bits is greater than x86_phys_bits.
Also initialize shadow_nonpresent_or_rsvd_mask explicitly to make it
consistent with other 'shadow_{xxx}_mask', and opportunistically add a
WARN once if KVM's L1TF mitigation cannot be applied on a system that
is marked as being susceptible to L1TF.
KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible
If L1 is using an MSR bitmap, unconditionally merge the MSR bitmaps from
L0 and L1 for MSR_{KERNEL,}_{FS,GS}_BASE. KVM unconditionally exposes
MSRs L1. If KVM is also running in L1 then it's highly likely L1 is
also exposing the MSRs to L2, i.e. KVM doesn't need to intercept L2
accesses.
Based on code from Jintack Lim.
Cc: Jintack Lim <jintack@xxxxxxxxxxxxxxx> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> Signed-off-by: Paolo Bonzini <[email protected]>
Stephen Boyd [Thu, 18 Apr 2019 22:20:22 +0000 (15:20 -0700)]
clk: Remove io.h from clk-provider.h
Now that we've gotten rid of clk_readl() we can remove io.h from the
clk-provider header and push out the io.h include to any code that isn't
already including the io.h header but using things like readl/writel,
etc.
I also reordered a couple includes when they weren't alphabetical and
removed clk.h from kona, replacing it with clk-provider.h because
that driver doesn't use clk consumer APIs.
UBIFS stores inode numbers as LE64 integers.
We have to convert them to host oder, otherwise
BE hosts won't be able to use the integer correctly.
Reported-by: kbuild test robot <[email protected]> Fixes: 9ca2d7326444 ("ubifs: Limit number of xattrs per inode") Signed-off-by: Richard Weinberger <[email protected]>
CONFIG_UBIFS_FS_ENCRYPTION is gone, fscrypt is now
controlled via CONFIG_FS_ENCRYPTION.
This problem slipped into the tree because of a mis-merge on
my side.
Reported-by: Eric Biggers <[email protected]> Fixes: eea2c05d927b ("ubifs: Remove #ifdef around CONFIG_FS_ENCRYPTION") Signed-off-by: Richard Weinberger <[email protected]>
YueHaibing [Fri, 10 May 2019 03:21:44 +0000 (11:21 +0800)]
ubifs: Fix build error without CONFIG_UBIFS_FS_XATTR
Fix gcc build error while CONFIG_UBIFS_FS_XATTR
is not set
fs/ubifs/dir.o: In function `ubifs_unlink':
dir.c:(.text+0x260): undefined reference to `ubifs_purge_xattrs'
fs/ubifs/dir.o: In function `do_rename':
dir.c:(.text+0x1edc): undefined reference to `ubifs_purge_xattrs'
fs/ubifs/dir.o: In function `ubifs_rmdir':
dir.c:(.text+0x2638): undefined reference to `ubifs_purge_xattrs'
Reported-by: Hulk Robot <[email protected]> Fixes: 9ca2d7326444 ("ubifs: Limit number of xattrs per inode") Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Richard Weinberger <[email protected]>
ARM64's implementation of get_cpuidr_str() masks out the revision bits
[3:0] while reading the CPU identifier, there is no need for the
[[:xdigit:]] wildcard.
The shell tests should not redirect useful output to /dev/null, as that
is done automatically by 'perf test' in non verbose mode, so remove that
from the zstd comp/decomp test, fixing up verbose mode.
Before:
$ perf test zstd
68: Zstd perf.data compression/decompression : Ok
$ perf test -v zstd
68: Zstd perf.data compression/decompression :
--- start ---
test child forked, pid 11956
-z, --compression-level[=<n>]
Collecting compressed record file:
Checking compressed events stats:
test child finished with 0
---- end ----
Zstd perf.data compression/decompression: Ok
$
Now:
$ perf test zstd
68: Zstd perf.data compression/decompression : Ok
$ perf test -v zstd
68: Zstd perf.data compression/decompression :
--- start ---
test child forked, pid 12695
Collecting compressed record file:
0+500 records in
72+1 records out
37361 bytes (37 kB, 36 KiB) copied, 9.83796 s, 3.8 kB/s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB /tmp/perf.data.rzq, compressed (original 0.004 MB, ratio is 3.679) ]
Checking compressed events stats:
# compressed : Zstd, level = 1, ratio = 4
COMPRESSED events: 3
test child finished with 0
---- end ----
Zstd perf.data compression/decompression: Ok
$
Alexey Budankov [Mon, 18 Mar 2019 17:46:17 +0000 (20:46 +0300)]
perf tests: Implement Zstd comp/decomp integration test
Introduce a basic integration test for Zstd based record
compression/decompression using 'perf record' and 'perf report'.
Committer notes:
Reduce a bit the freq (from 25 kHz to 5 kHz) and the number of /dev/null
records read (from 1000 to 500), reducing the time it takes to something
more in line with the time existing 'perf test' entries take to run.
With that in place:
$ time perf test zstd
68: Zstd perf.data compression/decompression : Ok
real 0m10.376s
user 0m0.105s
sys 0m0.440s
$ grep "model name" /proc/cpuinfo | head -1
model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
$
Alexey Budankov [Mon, 18 Mar 2019 17:45:11 +0000 (20:45 +0300)]
perf report: Implement perf.data record decompression
zstd_init(, comp_level = 0) initializes decompression part of API only
hat now consists of zstd_decompress_stream() function.
The perf.data PERF_RECORD_COMPRESSED records are decompressed using
zstd_decompress_stream() function into a linked list of mmaped memory
regions of mmap_comp_len size (struct decomp).
After decompression of one COMPRESSED record its content is iterated and
fetched for usual processing. The mmaped memory regions with
decompressed events are kept in the linked list till the tool process
termination.
When dumping raw records (e.g., perf report -D --header) file offsets of
events from compressed records are printed as zero.
Committer notes:
Since now we have support for processing PERF_RECORD_COMPRESSED, we see
none, in raw form, like we saw in the previous patch commiter notes,
they were decompressed into the usual PERF_RECORD_{FORK,MMAP,COMM,etc}
records, we only see the stats for those PERF_RECORD_COMPRESSED events,
and since I used the file generated in the commiter notes for the
previous patch, there they are, 2 compressed records:
Implemented -z,--compression_level[=<n>] option that enables compression
of mmaped kernel data buffers content in runtime during perf record mode
collection. Default option value is 1 (fastest compression).
Compression overhead has been measured for serial and AIO streaming when
profiling matrix multiplication workload:
OVH = (Execution time with -z N) / (Execution time with -z 0)
ratio - compression ratio
size - number of bytes that was compressed
size ~= trace size x ratio
Committer notes:
Testing it I noticed that it failed to disable build id processing when
compression is enabled, and as we'd have to uncompress everything to
look for the PERF_RECORD_{MMAP,SAMPLE,etc} to figure out which build ids
to read from DSOs, we better disable build id processing when
compression is enabled, logging with pr_debug() when doing so:
Original patch:
# perf record -z2
^C[ perf record: Woken up 1 times to write data ]
0x1746e0 [0x76]: failed to process type: 81 [Invalid argument]
[ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ]
#
After auto-disabling build id processing when compression is enabled:
$ perf record -z2 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ]
$ perf record -v -z2 sleep 1
Compression enabled, disabling build id collection at the end of the session.
<SNIP extra -v pr_debug() messages>
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ]
$
Also, with parts of the patch originally after this one moved to just
before this one we get:
$ perf record -z2 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ]
$ perf report -D | grep COMPRESS
0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled!
0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled!
COMPRESSED events: 2
COMPRESSED events: 0
$
I.e. when faced with PERF_RECORD_COMPRESSED that we still have no code
to process, we just show it as not being handled, skip them and
continue, while before we had:
$ perf report -D | grep COMPRESS
0x1b8 [0x169]: failed to process type: 81 [Invalid argument]
Error:
failed to process sample
0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED
$
Alexey Budankov [Mon, 18 Mar 2019 17:45:11 +0000 (20:45 +0300)]
perf report: Add stub processing of compressed events for -D
Committer note:
Split from a larger patch, this only dumps PERF_RECORD_COMPRESSED as
unhandled, so that when we introduce the record part in the next patch,
we don't see unhandled events when using 'perf record -D'.
Changed it so that we dump the event if the handler is just a stub, i.e.
for the case where we don't have ZSTD linked but we're processing a
perf.data file generated by a tool with that linked.
Also when failing to decompress we can't just dump the uncompressed
event and return 0, we have to propagate the error.
Alexey Budankov [Mon, 18 Mar 2019 17:44:12 +0000 (20:44 +0300)]
perf record: Implement compression for AIO trace streaming
Compression is implemented using the functions from zstd.c. As the memory
to operate on the compression uses mmap->aio.data[] buffers. If Zstd
streaming compression API fails for some reason the data to be compressed
are just copied into the memory buffers using plain memcpy().
Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
records. Each element of the array is not longer that PERF_SAMPLE_MAX_SIZE
and consists of perf_event_header followed by the compressed chunk
that is decompressed on the loading stage.
perf_mmap__aio_push() is replaced by perf_mmap__push() which is now used
in the both serial and AIO streaming cases. perf_mmap__push() is extended
with positive return values to signify absence of data ready for
processing.
Alexey Budankov [Mon, 18 Mar 2019 17:43:35 +0000 (20:43 +0300)]
perf record: Implement compression for serial trace streaming
Compression is implemented using the functions from zstd.c. As the
memory to operate on the compression uses mmap->data buffer.
If Zstd streaming compression API fails for some reason the data to be
compressed are just copied into the memory buffers using plain memcpy().
Compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
records. Each element of the array is not longer that
PERF_SAMPLE_MAX_SIZE and consists of perf_event_header followed by the
compressed chunk that is decompressed on the loading stage.
Comitter notes:
Undo some unnecessary line breaks, remove some unnecessary () around
zstd_data to then just get its address, and fix conflicts with
BPF_PROG_INFO/BPF_BTF patchkits.
Alexey Budankov [Mon, 18 Mar 2019 17:42:55 +0000 (20:42 +0300)]
perf tools: Introduce Zstd streaming based compression API
Implemented functions are based on Zstd streaming compression API.
The functions are used in runtime to compress data that come from mmaped
kernel buffer. zstd_init(), zstd_fini() are used for initialization and
finalization to allocate and deallocate internal zstd objects.
zstd_compress_stream_to_records() is used to convert parts of mmaped
kernel buffer into an array of PERF_RECORD_COMPRESSED records.
Alexey Budankov [Mon, 18 Mar 2019 17:41:33 +0000 (20:41 +0300)]
perf record: Implement COMPRESSED event record and its attributes
Implemented PERF_RECORD_COMPRESSED event, related data types, header
feature and functions to write, read and print feature attributes from
the trace header section.
comp_mmap_len preserves the size of mmaped kernel buffer that was used
during collection. comp_mmap_len size is used on loading stage as the
size of decomp buffer for decompression of COMPRESSED events content.
Committer notes:
Fixed up conflict with BPF_PROG_INFO and BTF_BTF header features.
Alexey Budankov [Mon, 18 Mar 2019 17:41:02 +0000 (20:41 +0300)]
perf session: Define 'bytes_transferred' and 'bytes_compressed' metrics
Define 'bytes_transferred' and 'bytes_compressed' metrics to calculate
ratio in the end of the data collection:
compression ratio = bytes_transferred / bytes_compressed
The 'bytes_transferred' metric accumulates the amount of bytes that was
extracted from the mmaped kernel buffers for compression, while
'bytes_compressed' accumulates the amount of bytes that was received
after applying compression.
"The Intel Ultra Path Interconnect (UPI) is a point-to-point processor
interconnect developed by Intel which replaced the Intel QuickPath
Interconnect (QPI) in Xeon Skylake-SP platforms starting in 2017.
UPI is a low-latency coherent interconnect for scalable multiprocessor
systems with a shared address space. It uses a directory-based home
snoop coherency protocol with a transfer speed of up to 10.4 GT/s.
Supporting processors typically have two or three UPI links."
With support for Python 2 or 3 and PySide 1 or 2 (Qt 4 or 5), it is
useful to see what versions are in use. Add an 'About' dialog box that
displays Python, PySide, Qt and database server (SQLite or PostgreSQL)
version numbers.
Then go to 'Help', then 'About', select all the lines with the mouse
press 'Control+C', then, on the same terminal press control+shift+V
which shows my current environment:
Adrian Hunter [Fri, 3 May 2019 12:08:27 +0000 (15:08 +0300)]
perf scripts python: exported-sql-viewer.py: Add context menu
Add a context menu (right-click) that provides options for copying to
clipboard, including, for trees, the ability to copy only the cell under
the mouse pointer.
Simply right click and pick "Copy selection", that at this point has
just the first line, not expanded, then see what was copied by pressing
shift+control+v on a terminal:
Adrian Hunter [Fri, 3 May 2019 12:08:26 +0000 (15:08 +0300)]
perf scripts python: exported-sql-viewer.py: Add copy to clipboard
Add support for copying to clipboard. Two menu options are added to copy the
selected rows / columns with normal spacing, or as comma-separated-values.
In the case of trees, only entire rows can be copied.
Adrian Hunter [Fri, 3 May 2019 12:08:23 +0000 (15:08 +0300)]
perf scripts python: exported-sql-viewer.py: Fix error when shrinking / enlarging font
Fix the following error if shrink / enlarge font is used with the help
window.
Traceback (most recent call last):
File "tools/perf/scripts/python/exported-sql-viewer.py", line 2791, in ShrinkFont
ShrinkFont(win.view)
AttributeError: 'HelpWindow' object has no attribute 'view'
Committer testing:
Before, matches above output:
$ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db
Traceback (most recent call last):
File "/home/acme/libexec/perf-core/scripts/python/exported-sql-viewer.py", line 2780, in EnlargeFont
EnlargeFont(win.view)
AttributeError: 'HelpWindow' object has no attribute 'view'
$
After:
No more tracebacks, but the fonts don't get enlarged, which is kinda
frustrating...
Andi Kleen [Mon, 6 May 2019 14:19:24 +0000 (07:19 -0700)]
perf tools x86: Add support for recording and printing XMM registers
Icelake and later platforms support collecting XMM registers with PEBS
event.
Add support for 'perf script' to dump them, and support for the register
parser in 'perf record -I=' ... to configure them.
For now they are just printed in hex, we could potentially later add
other formats too.
Committer testing:
Before:
# perf record -IXMM0
Warning:
unknown register XMM0, check man page or run 'perf record -I?'
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
#
# perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
#
After:
# perf record -IXMM0
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
/bin/dmesg | grep -i perf may provide additional information.
#
# perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-I, --intr-regs[=<any register>]
sample selected machine registers on interrupt, use -I ? to list register names
#
More work is needed to, when faced with such error, warn the user that
that register is not available on the running platform.
perf parse-regs: Improve error output when faced with unknown register name
Add quotes around the register name and suggest using 'perf record -I?'
to get the list of available registers.
Before:
# perf record -Idi,xmm20,xmm1
Warning:
unknown register xmm20, check man page
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-I, --intr-regs[=<any register>]
sample selected machine registers on interrupt, use -I ? to list register names
#
# perf record -Idi,xmm20,xmm1
Warning:
unknown register "xmm20", check man page or run "perf record -I?"
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-I, --intr-regs[=<any register>]
sample selected machine registers on interrupt, use -I ? to list register names
#
tools x86 uapi asm: Sync the pt_regs.h copy with the kernel sources
To get the changes in:
878068ea270e ("perf/x86: Support outputting XMM registers")
That will be used in a followup patch to allow users to ask for some or
all of those registers to be collected in certain contatexts.
This silences the following perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/perf_regs.h' differs from latest version at 'arch/x86/include/uapi/asm/perf_regs.h'
diff -u tools/arch/x86/include/uapi/asm/perf_regs.h arch/x86/include/uapi/asm/perf_regs.h
59073aaf6de0 ("kvm: x86: Add exception payload fields to kvm_vcpu_events")
This silences the following perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
The changes in this file are in something not used at this time in any
tools/perf/ tool.
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
No changes in the tooling using this, that was just to ease some objtool
return checking.
Jiri Olsa [Fri, 26 Apr 2019 07:38:04 +0000 (09:38 +0200)]
perf tools: Speed up report for perf compiled with linwunwind
When compiled with libunwind, perf does some preparatory work when
processing side-band events. This is not needed when report actually
don't unwind dwarf callchains, so it's disabled with
dwarf_callchain_users bool.
However we could move that check to higher level and shield more
unwanted code for normal report processing, giving us following speed up
on kernel build profile:
Before:
$ perf record make -j40
...
$ ll ../../perf.data
-rw-------. 1 jolsa jolsa 461783932 Apr 26 09:11 perf.data
$ perf stat -e cycles:u,instructions:u perf report -i perf.data > out
Performance counter stats for 'perf report -i perf.data':
78,669,920,155 cycles:u
99,076,431,951 instructions:u # 1.26 insn per cycle
Mao Han [Sun, 21 Apr 2019 15:33:14 +0000 (23:33 +0800)]
csky: Add support for libdw
This patch add support for DWARF register mappings and libdw registers
initialization, which is used by perf callchain analyzing when
--call-graph=dwarf is given.
Jin Yao [Fri, 15 Mar 2019 21:16:17 +0000 (05:16 +0800)]
perf annotate: Remove hist__account_cycles() from callback
The hist__account_cycles() function is executed when the
hist_iter__branch_callback() is called.
But it looks it's not necessary. In hist__account_cycles, it already
walks on all branch entries.
This patch moves the hist__account_cycles out of callback, now the data
processing is much faster than before.
Previous code has an issue that the ch[offset].num++ (in
__symbol__account_cycles) is executed repeatedly since
hist__account_cycles is called in each hist_iter__branch_callback, so
the counting of ch[offset].num is not correct (too big).
With this patch, the issue is fixed. And we don't need the code of
"ch->reset >= ch->num / 2" to check if there are too many overlaps (in
annotation__count_and_fill), otherwise some data would be hidden.
Now, we can try, for example:
perf record -b ...
perf annotate or perf report -s symbol
The before/after output should be no change.
v3:
---
Fix the crash in stdio mode.
Like previous code, it needs the checking of ui__has_annotation()
before hist__account_cycles()
v2:
---
1. Cover the similar perf report
2. Remove the checking code "ch->reset >= ch->num / 2"
David Howells [Fri, 3 May 2019 17:26:55 +0000 (18:26 +0100)]
dns_resolver: Allow used keys to be invalidated
Allow used DNS resolver keys to be invalidated after use if the caller is
doing its own caching of the results. This reduces the amount of resources
required.
Fix AFS to invalidate DNS results to kill off permanent failure records
that get lodged in the resolver keyring and prevent future lookups from
happening.
Fixes: 0a5143f2f89c ("afs: Implement VL server rotation") Signed-off-by: David Howells <[email protected]>
David Howells [Tue, 7 May 2019 14:16:26 +0000 (15:16 +0100)]
afs: Fix missing lock when replacing VL server list
When afs_update_cell() replaces the cell->vl_servers list, it uses RCU
protocol so that proc is protected, but doesn't take ->vl_servers_lock to
protect afs_start_vl_iteration() (which does actually take a shared lock).
Fix this by making afs_update_cell() take an exclusive lock when replacing
->vl_servers.
Fixes: 0a5143f2f89c ("afs: Implement VL server rotation") Signed-off-by: David Howells <[email protected]>
David Howells [Sun, 12 May 2019 07:31:23 +0000 (08:31 +0100)]
afs: Fix afs_xattr_get_yfs() to not try freeing an error value
afs_xattr_get_yfs() tries to free yacl, which may hold an error value (say
if yfs_fs_fetch_opaque_acl() failed and returned an error).
Fix this by allocating yacl up front (since it's a fixed-length struct,
unlike afs_acl) and passing it in to the RPC function. This also allows
the flags to be placed in the object rather than passing them through to
the RPC function.
Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs") Signed-off-by: David Howells <[email protected]>
David Howells [Sun, 12 May 2019 07:05:10 +0000 (08:05 +0100)]
afs: Fix incorrect error handling in afs_xattr_get_acl()
Fix incorrect error handling in afs_xattr_get_acl() where there appears to
be a redundant assignment before return, but in fact the return should be a
goto to the error handling at the end of the function.
Fixes: 260f082bae6d ("afs: Get an AFS3 ACL as an xattr")
Addresses-Coverity: ("Unused Value") Reported-by: Colin Ian King <[email protected]> Signed-off-by: David Howells <[email protected]>
cc: Joe Perches <[email protected]>
Linus Torvalds [Wed, 15 May 2019 16:06:14 +0000 (09:06 -0700)]
Merge tag 'kconfig-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig updates from Masahiro Yamada:
- error out if a user specifies a directory instead of a file from
"Save" menu of GUI interfaces
- do not overwrite .config if there is no change in the configuration
- create parent directories as needed when a user specifies a new file
path from "Save" menu of menuconfig/nconfig
- fix potential buffer overflow
- some trivial cleanups
* tag 'kconfig-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: make conf_get_autoconfig_name() static
kconfig: use snprintf for formatting pathnames
kconfig: remove useless NULL pointer check in conf_write_dep()
kconfig: make parent directories for the saved .config as needed
kconfig: do not write .config if the content is the same
kconfig: do not accept a directory for configuration output
kconfig: remove trailing whitespaces
kconfig: Make nconf-cfg.sh executable
Linus Torvalds [Wed, 15 May 2019 15:58:49 +0000 (08:58 -0700)]
Merge tag 'acpi-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These fix two regressions introduced during the 5.0 cycle, in ACPICA
and in device PM, cause the values returned by _ADR to be stored in 64
bits and fix two ACPI documentation issues.
Specifics:
- Update the ACPICA code in the kernel to upstream revision 20190509
including one regression fix:
* Prevent excessive ACPI debug messages from being printed by
moving the ACPI_DEBUG_DEFAULT definition to the right place
(Erik Schmauss).
- Set the enable_for_wake bits for wakeup GPEs during suspend to idle
to allow acpi_enable_all_wakeup_gpes() to enable them as
aproppriate and make wakeup devices sighaling events through ACPI
GPEs work with suspend-to-idle again (Rajat Jain).
- Use 64 bits to store the return values of _ADR which are assumed to
be 64-bit by some bus specs and may contain nonzero bits in the
upper 32 bits part for some devices (Pierre-Louis Bossart).
- Fix two minor issues with the ACPI documentation (Sakari Ailus)"
* tag 'acpi-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Set enable_for_wake for wakeup GPEs during suspend-to-idle
Documentation: ACPI: Direct references are allowed to devices only
Documentation: ACPI: Use tabs for graph ASL indentation
ACPICA: Update version to 20190509
ACPICA: Linux: move ACPI_DEBUG_DEFAULT flag out of ifndef
ACPI: bus: change _ADR representation to 64 bits