Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC platform updates from Olof Johansson:
"SoC platform changes. Main theme this merge window:
- The Netx platform (Netx 100/500) platform is removed by Linus
Walleij-- the SoC doesn't have active maintainers with hardware,
and in discussions with the vendor the agreement was that it's OK
to remove.
- Russell King has a series of patches that cleans up and refactors
SA1101 and RiscPC support"
* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (47 commits)
ARM: stm32: use "depends on" instead of "if" after prompt
ARM: sa1100: convert to common clock framework
ARM: exynos: Cleanup cppcheck shifting warning
ARM: pxa/lubbock: remove lubbock_set_misc_wr() from global view
ARM: exynos: Only build MCPM support if used
arm: add missing include platform-data/atmel.h
ARM: davinci: Use GPIO lookup table for DA850 LEDs
ARM: OMAP2: drop explicit assembler architecture
ARM: use arch_extension directive instead of arch argument
ARM: imx: Switch imx7d to imx-cpufreq-dt for speed-grading
ARM: bcm: Enable PINCTRL for ARCH_BRCMSTB
ARM: bcm: Enable ARCH_HAS_RESET_CONTROLLER for ARCH_BRCMSTB
ARM: riscpc: enable chained scatterlist support
ARM: riscpc: reduce IRQ handling code
ARM: riscpc: move RiscPC assembly files from arch/arm/lib to mach-rpc
ARM: riscpc: parse video information from tagged list
ARM: riscpc: add ecard quirk for Atomwide 3port serial card
MAINTAINERS: mvebu: Add git entry
soc: ti: pm33xx: Add a print while entering RTC only mode with DDR in self-refresh
ARM: OMAP2+: Make some variables static
...
Merge tag 'drm-next-2019-07-19' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Daniel Vetter:
"Dave is back in shape, but now family got it so I'm doing the pull.
Two things worthy of note:
- nouveau feature pull was way too late, Dave&me decided to not take
that, so Ben spun up a pull with just the fixes.
- after some chatting with the arm display maintainers we decided to
change a bit how that's maintained, for more oversight/review and
cross vendor collab.
amdgpu:
- large pile of fixes for new hw support this release (navi, vega20)
- audio hotplug fix
- bunch of corner cases and small fixes all over for amdgpu/kfd
komeda:
- back out some new properties (from this merge window) that needs
more pondering.
bochs:
- fb pitch setup
core:
- a new panel quirk
- misc fixes"
* tag 'drm-next-2019-07-19' of git://anongit.freedesktop.org/drm/drm: (73 commits)
drm/nouveau/secboot/gp102-: remove WAR for SEC2 RTOS start bug
drm/nouveau/flcn/gp102-: improve implementation of bind_context() on SEC2/GSP
drm/nouveau: fix memory leak in nouveau_conn_reset()
drm/nouveau/dmem: missing mutex_lock in error path
drm/nouveau/hwmon: return EINVAL if the GPU is powered down for sensors reads
drm/nouveau: fix bogus GPL-2 license header
drm/nouveau: fix bogus GPL-2 license header
drm/nouveau/i2c: Enable i2c pads & busses during preinit
drm/nouveau/disp/tu102-: wire up scdc parameter setter
drm/nouveau/core: recognise TU116 chipset
drm/nouveau/kms: disallow dual-link harder if hdmi connection detected
drm/nouveau/disp/nv50-: fix center/aspect-corrected scaling
drm/nouveau/disp/nv50-: force scaler for any non-default LVDS/eDP modes
drm/nouveau/mcp89/mmu: Use mcp77_mmu_new instead of g84_mmu_new on MCP89.
drm/amd/display: init res_pool dccg_ref, dchub_ref with xtalin_freq
drm/amdgpu/pm: remove check for pp funcs in freq sysfs handlers
drm/amd/display: Force uclk to max for every state
drm/amdkfd: Remove GWS from process during uninit
drm/amd/amdgpu: Fix offset for vmid selection in debugfs interface
drm/amd/powerplay: update vega20 driver if to fit latest SMU firmware
...
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Fix missed wake-up race in padata
- Use crypto_memneq in ccp
- Fix version check in ccp
- Fix fuzz test failure in ccp
- Fix potential double free in crypto4xx
- Fix compile warning in stm32
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
padata: use smp_mb in padata_reorder to avoid orphaned padata jobs
crypto: ccp - Fix SEV_VERSION_GREATER_OR_EQUAL
crypto: ccp/gcm - use const time tag comparison.
crypto: ccp - memset structure fields to zero before reuse
crypto: crypto4xx - fix a potential double free in ppc4xx_trng_probe
crypto: stm32/hash - Fix incorrect printk modifier for size_t
Merge tag 'trace-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Eiichi Tsukata found a small bug from the fixup of the stack code
Removing ULONG_MAX as the marker for the user stack trace end, made
the tracing code not know where the end is. The end is now marked with
a zero (NULL) pointer. Eiichi fixed this in the tracing code"
* tag 'trace-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix user stack trace "??" output
Merge tag 'csky-for-linus-5.3-rc1' of git://github.com/c-sky/csky-linux
Pull arch/csky pupdates from Guo Ren:
"This round of csky subsystem gives two features (ASID algorithm
update, Perf pmu record support) and some fixups.
ASID updates:
- Revert mmu ASID mechanism
- Add new asid lib code from arm
- Use generic asid algorithm to implement switch_mm
- Improve tlb operation with help of asid
Perf pmu record support:
- Init pmu as a device
- Add count-width property for csky pmu
- Add pmu interrupt support
- Fix perf record in kernel/user space
- dt-bindings: Add csky PMU bindings
Fixes:
- Fixup no panic in kernel for some traps
- Fixup some error count in 810 & 860.
- Fixup abiv1 memset error"
* tag 'csky-for-linus-5.3-rc1' of git://github.com/c-sky/csky-linux:
csky: Fixup abiv1 memset error
csky: Improve tlb operation with help of asid
csky: Use generic asid algorithm to implement switch_mm
csky: Add new asid lib code from arm
csky: Revert mmu ASID mechanism
dt-bindings: csky: Add csky PMU bindings
dt-bindings: interrupt-controller: Update csky mpintc
csky: Fixup some error count in 810 & 860.
csky: Fix perf record in kernel/user space
csky: Add pmu interrupt support
csky: Add count-width property for csky pmu
csky: Init pmu as a device
csky: Fixup no panic in kernel for some traps
csky: Select intc & timer drivers
Merge tag 'for-linus-5.3a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
"Fixes and features:
- A series to introduce a common command line parameter for disabling
paravirtual extensions when running as a guest in virtualized
environment
- A fix for int3 handling in Xen pv guests
- Removal of the Xen-specific tmem driver as support of tmem in Xen
has been dropped (and it was experimental only)
- A security fix for running as Xen dom0 (XSA-300)
- A fix for IRQ handling when offlining cpus in Xen guests
- Some small cleanups"
* tag 'for-linus-5.3a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: let alloc_xenballooned_pages() fail if not enough memory free
xen/pv: Fix a boot up hang revealed by int3 self test
x86/xen: Add "nopv" support for HVM guest
x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable
xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete
x86: Add "nopv" parameter to disable PV extensions
x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init
xen: remove tmem driver
Revert "x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized"
xen/events: fix binding user event channels to cpus
Merge tag 'iomap-5.3-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull iomap split/cleanup from Darrick Wong:
"As promised, here's the second part of the iomap merge for 5.3, in
which we break up iomap.c into smaller files grouped by functional
area so that it'll be easier in the long run to maintain cohesiveness
of code units and to review incoming patches. There are no functional
changes and fs/iomap.c split cleanly.
Summary:
- Regroup the fs/iomap.c code by major functional area so that we can
start development for 5.4 from a more stable base"
* tag 'iomap-5.3-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: move internal declarations into fs/iomap/
iomap: move the main iteration code into a separate file
iomap: move the buffered IO code into a separate file
iomap: move the direct IO code into a separate file
iomap: move the SEEK_HOLE code into a separate file
iomap: move the file mapping reporting code into a separate file
iomap: move the swapfile code into a separate file
iomap: start moving code to fs/iomap/
Merge branch 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull adfs updates from Al Viro:
"More ADFS patches from Russell King"
* 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/adfs: add time stamp and file type helpers
fs/adfs: super: limit idlen according to directory type
fs/adfs: super: fix use-after-free bug
fs/adfs: super: safely update options on remount
fs/adfs: super: correct superblock flags
fs/adfs: clean up indirect disc addresses and fragment IDs
fs/adfs: clean up error message printing
fs/adfs: use %pV for error messages
fs/adfs: use format_version from disc_record
fs/adfs: add helper to get filesystem size
fs/adfs: add helper to get discrecord from map
fs/adfs: correct disc record structure
Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs mount updates from Al Viro:
"The first part of mount updates.
Convert filesystems to use the new mount API"
* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
mnt_init(): call shmem_init() unconditionally
constify ksys_mount() string arguments
don't bother with registering rootfs
init_rootfs(): don't bother with init_ramfs_fs()
vfs: Convert smackfs to use the new mount API
vfs: Convert selinuxfs to use the new mount API
vfs: Convert securityfs to use the new mount API
vfs: Convert apparmorfs to use the new mount API
vfs: Convert openpromfs to use the new mount API
vfs: Convert xenfs to use the new mount API
vfs: Convert gadgetfs to use the new mount API
vfs: Convert oprofilefs to use the new mount API
vfs: Convert ibmasmfs to use the new mount API
vfs: Convert qib_fs/ipathfs to use the new mount API
vfs: Convert efivarfs to use the new mount API
vfs: Convert configfs to use the new mount API
vfs: Convert binfmt_misc to use the new mount API
convenience helper: get_tree_single()
convenience helper get_tree_nodev()
vfs: Kill sget_userns()
...
2) Fix handling of PHY power-down on RTL8411B, from Heiner Kallweit.
3) Add some new PCI IDs to iwlwifi, from Ihab Zhaika.
4) Fix handling of neigh timers wrt. entries added by userspace, from
Lorenzo Bianconi.
5) Various cases of missing of_node_put(), from Nishka Dasgupta.
6) The new NET_ACT_CT needs to depend upon NF_NAT, from Yue Haibing.
7) Various RDS layer fixes, from Gerd Rausch.
8) Fix some more fallout from TCQ_F_CAN_BYPASS generalization, from
Cong Wang.
9) Fix FIB source validation checks over loopback, also from Cong Wang.
10) Use promisc for unsupported number of filters, from Justin Chen.
11) Missing sibling route unlink on failure in ipv6, from Ido Schimmel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
tcp: fix tcp_set_congestion_control() use from bpf hook
ag71xx: fix return value check in ag71xx_probe()
ag71xx: fix error return code in ag71xx_probe()
usb: qmi_wwan: add D-Link DWM-222 A2 device ID
bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips.
net: dsa: sja1105: Fix missing unlock on error in sk_buff()
gve: replace kfree with kvfree
selftests/bpf: fix test_xdp_noinline on s390
selftests/bpf: fix "valid read map access into a read-only array 1" on s390
net/mlx5: Replace kfree with kvfree
MAINTAINERS: update netsec driver
ipv6: Unlink sibling route in case of failure
liquidio: Replace vmalloc + memset with vzalloc
udp: Fix typo in net/ipv4/udp.c
net: bcmgenet: use promisc for unsupported filters
ipv6: rt6_check should return NULL if 'from' is NULL
tipc: initialize 'validated' field of received packets
selftests: add a test case for rp_filter
fib: relax source validation check for loopback packets
mlxsw: spectrum: Do not process learned records with a dummy FID
...
Merge yet more updates from Andrew Morton:
"The rest of MM and a kernel-wide procfs cleanup.
Summary of the more significant patches:
- Patch series "mm/memory_hotplug: Factor out memory block
devicehandling", v3. David Hildenbrand.
Some spring-cleaning of the memory hotplug code, notably in
drivers/base/memory.c
- "mm: thp: fix false negative of shmem vma's THP eligibility". Yang
Shi.
Fix /proc/pid/smaps output for THP pages used in shmem.
- "resource: fix locking in find_next_iomem_res()" + 1. Nadav Amit.
Bugfix and speedup for kernel/resource.c
- Patch series "mm: Further memory block device cleanups", David
Hildenbrand.
More spring-cleaning of the memory hotplug code.
- Patch series "mm: Sub-section memory hotplug support". Dan
Williams.
Generalise the memory hotplug code so that pmem can use it more
completely. Then remove the hacks from the libnvdimm code which
were there to work around the memory-hotplug code's constraints.
- "proc/sysctl: add shared variables for range check", Matteo Croce.
We have about 250 instances of
int zero;
...
.extra1 = &zero,
in the tree. This is a tree-wide sweep to make all those private
"zero"s and "one"s use global variables.
Alas, it isn't practical to make those two global integers const"
* emailed patches from Andrew Morton <[email protected]>: (38 commits)
proc/sysctl: add shared variables for range check
mm: migrate: remove unused mode argument
mm/sparsemem: cleanup 'section number' data types
libnvdimm/pfn: stop padding pmem namespaces to section alignment
libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
mm/devm_memremap_pages: enable sub-section remap
mm: document ZONE_DEVICE memory-model implications
mm/sparsemem: support sub-section hotplug
mm/sparsemem: prepare for sub-section ranges
mm: kill is_dev_zone() helper
mm/hotplug: kill is_dev_zone() usage in __remove_pages()
mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()
mm/hotplug: prepare shrink_{zone, pgdat}_span for sub-section removal
mm/sparsemem: add helpers track active portions of a section at boot
mm/sparsemem: introduce a SECTION_IS_EARLY flag
mm/sparsemem: introduce struct mem_section_usage
drivers/base/memory.c: get rid of find_memory_block_hinted()
mm/memory_hotplug: move and simplify walk_memory_blocks()
mm/memory_hotplug: rename walk_memory_range() and pass start+size instead of pfns
mm: make register_mem_sect_under_node() static
...
Eiichi Tsukata [Sun, 30 Jun 2019 08:54:38 +0000 (17:54 +0900)]
tracing: Fix user stack trace "??" output
Commit c5c27a0a5838 ("x86/stacktrace: Remove the pointless ULONG_MAX
marker") removes ULONG_MAX marker from user stack trace entries but
trace_user_stack_print() still uses the marker and it outputs unnecessary
"??".
The user stack trace code zeroes the storage before saving the stack, so if
the trace is shorter than the maximum number of entries it can terminate
the print loop if a zero entry is detected.
The recent rewrite of PCM link lock management introduced the refcount
in snd_pcm_group object, managed by the kernel refcount_t API. This
caused unexpected kernel warnings when the kernel is built with
CONFIG_REFCOUNT_FULL=y. As the warning line indicates, the problem is
obviously that we start with refcount=0 and do refcount_inc() for
adding each PCM link, while refcount_t API doesn't like refcount_inc()
performed on zero.
For adapting the proper refcount_t usage, this patch changes the logic
slightly:
- The initial refcount is 1, assuming the single list entry
- The refcount is incremented / decremented at each PCM link addition
and deletion
- ... which allows us concentrating only on the refcount as a release
condition
dma-direct: correct the physical addr in dma_direct_sync_sg_for_cpu/device
dma_map_sg() may use swiotlb buffer when the kernel command line includes
"swiotlb=force" or the dma_addr is out of dev->dma_mask range. After
DMA complete the memory moving from device to memory, then user call
dma_sync_sg_for_cpu() to sync with DMA buffer, and copy the original
virtual buffer to other space.
So dma_direct_sync_sg_for_cpu() should use swiotlb physical addr, not
the original physical addr from sg_phys(sg).
dma_direct_sync_sg_for_device() also has the same issue, correct it as
well.
Fixes: 55897af63091("dma-direct: merge swiotlb_dma_ops into the dma_direct code") Signed-off-by: Fugang Duan <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
Hui Wang [Fri, 19 Jul 2019 09:38:58 +0000 (12:38 +0300)]
Input: alps - fix a mismatch between a condition check and its comment
In the function alps_is_cs19_trackpoint(), we check if the param[1] is
in the 0x20~0x2f range, but the code we wrote for this checking is not
correct:
(param[1] & 0x20) does not mean param[1] is in the range of 0x20~0x2f,
it also means the param[1] is in the range of 0x30~0x3f, 0x60~0x6f...
Now fix it with a new condition checking ((param[1] & 0xf0) == 0x20).
Input: psmouse - fix build error of multiple definition
trackpoint_detect() should be static inline while
CONFIG_MOUSE_PS2_TRACKPOINT is not set, otherwise, we build fails:
drivers/input/mouse/alps.o: In function `trackpoint_detect':
alps.c:(.text+0x8e00): multiple definition of `trackpoint_detect'
drivers/input/mouse/psmouse-base.o:psmouse-base.c:(.text+0x1b50): first defined here
Mao Wenan [Tue, 16 Jul 2019 18:16:28 +0000 (20:16 +0200)]
Input: applespi - remove set but not used variables 'sts'
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/input/keyboard/applespi.c: In function applespi_set_bl_level:
drivers/input/keyboard/applespi.c:902:6: warning: variable sts set but not used [-Wunused-but-set-variable]
Fixes: b426ac0452093d ("Input: add Apple SPI keyboard and trackpad driver") Signed-off-by: Mao Wenan <[email protected]> Signed-off-by: Dmitry Torokhov <[email protected]>
Ronald Tschalär [Mon, 15 Jul 2019 17:30:11 +0000 (10:30 -0700)]
Input: add Apple SPI keyboard and trackpad driver
The keyboard and trackpad on recent MacBook's (since 8,1) and
MacBookPro's (13,* and 14,*) are attached to an SPI controller instead
of USB, as previously. The higher level protocol is not publicly
documented and hence has been reverse engineered. As a consequence there
are still a number of unknown fields and commands. However, the known
parts have been working well and received extensive testing and use.
In order for this driver to work, the proper SPI drivers need to be
loaded too; for MB8,1 these are spi_pxa2xx_platform and spi_pxa2xx_pci;
for all others they are spi_pxa2xx_platform and intel_lpss_pci.
Dexuan Cui [Fri, 19 Jul 2019 03:22:35 +0000 (03:22 +0000)]
x86/hyper-v: Zero out the VP ASSIST PAGE on allocation
The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
5.2.1 "GPA Overlay Pages" for the details) and here is an excerpt:
"The hypervisor defines several special pages that "overlay" the guest's
Guest Physical Addresses (GPA) space. Overlays are addressed GPA but are
not included in the normal GPA map maintained internally by the hypervisor.
Conceptually, they exist in a separate map that overlays the GPA map.
If a page within the GPA space is overlaid, any SPA page mapped to the
GPA page is effectively "obscured" and generally unreachable by the
virtual processor through processor memory accesses.
If an overlay page is disabled, the underlying GPA page is "uncovered",
and an existing mapping becomes accessible to the guest."
SPA = System Physical Address = the final real physical address.
When a CPU (e.g. CPU1) is onlined, hv_cpu_init() allocates the VP ASSIST
PAGE and enables the EOI optimization for this CPU by writing the MSR
HV_X64_MSR_VP_ASSIST_PAGE. From now on, hvp->apic_assist belongs to the
special SPA page, and this CPU *always* uses hvp->apic_assist (which is
shared with the hypervisor) to decide if it needs to write the EOI MSR.
When a CPU is offlined then on the outgoing CPU:
1. hv_cpu_die() disables the EOI optimizaton for this CPU, and from
now on hvp->apic_assist belongs to the original "normal" SPA page;
2. the remaining work of stopping this CPU is done
3. this CPU is completely stopped.
Between 1 and 3, this CPU can still receive interrupts (e.g. reschedule
IPIs from CPU0, and Local APIC timer interrupts), and this CPU *must* write
the EOI MSR for every interrupt received, otherwise the hypervisor may not
deliver further interrupts, which may be needed to completely stop the CPU.
So, after the EOI optimization is disabled in hv_cpu_die(), it's required
that the hvp->apic_assist's bit0 is zero, which is not guaranteed by the
current allocation mode because it lacks __GFP_ZERO. As a consequence the
bit might be set and interrupt handling would not write the EOI MSR causing
interrupt delivery to become stuck.
Add the missing __GFP_ZERO to the allocation.
Note 1: after the "normal" SPA page is allocted and zeroed out, neither the
hypervisor nor the guest writes into the page, so the page remains with
zeros.
Note 2: see Section 10.3.5 "EOI Assist" for the details of the EOI
optimization. When the optimization is enabled, the guest can still write
the EOI MSR register irrespective of the "No EOI required" value, but
that's slower than the optimized assist based variant.
Dave Airlie [Fri, 19 Jul 2019 07:21:48 +0000 (17:21 +1000)]
Merge tag 'drm-next-5.3-2019-07-18' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.3-2019-07-18:
amdgpu:
- Navi DC fix for secondary adapters
- Fix Navi flickering with high res panels
- Navi SMU fixes
- Vega20 SMU fixes
- Fixes for audio hotplug on HG systems
- Fix for potential integer overflows on large buffer
migrations
- debugfs fixes for umr
- Various other small fixes
amdkfd:
- Apply noretry setting consistently
- Fix hang in eviction
- Properly clean up GWS on uninit
Yongxin Liu [Mon, 1 Jul 2019 01:46:22 +0000 (09:46 +0800)]
drm/nouveau: fix memory leak in nouveau_conn_reset()
In nouveau_conn_reset(), if connector->state is true,
__drm_atomic_helper_connector_destroy_state() will be called,
but the memory pointed by asyc isn't freed. Memory leak happens
in the following function __drm_atomic_helper_connector_reset(),
where newly allocated asyc->state will be assigned to connector->state.
So using nouveau_conn_atomic_destroy_state() instead of
__drm_atomic_helper_connector_destroy_state to free the "old" asyc.
Ralph Campbell [Fri, 14 Jun 2019 20:20:03 +0000 (13:20 -0700)]
drm/nouveau/dmem: missing mutex_lock in error path
In nouveau_dmem_pages_alloc(), the drm->dmem->mutex is unlocked before
calling nouveau_dmem_chunk_alloc() as shown when CONFIG_PROVE_LOCKING
is enabled:
Ben Skeggs [Fri, 5 Jul 2019 05:11:42 +0000 (15:11 +1000)]
drm/nouveau: fix bogus GPL-2 license header
The bulk SPDX addition made all these files into GPL-2.0 licensed files.
However the remainder of the project is MIT-licensed, these files
were simply missing the boiler plate and got caught up in the global update.
Ilia Mirkin [Thu, 20 Jun 2019 00:13:43 +0000 (20:13 -0400)]
drm/nouveau: fix bogus GPL-2 license header
The bulk SPDX addition made all these files into GPL-2.0 licensed files.
However the remainder of the project is MIT-licensed, these files
(primarily header files) were simply missing the boiler plate and got
caught up in the global update.
Fixes: b24413180f5 (License cleanup: add SPDX GPL-2.0 license identifier to files with no license) Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Emil Velikov <[email protected]> Acked-by: Karol Herbst <[email protected]> Signed-off-by: Ben Skeggs <[email protected]>
Lyude Paul [Wed, 26 Jun 2019 18:10:27 +0000 (14:10 -0400)]
drm/nouveau/i2c: Enable i2c pads & busses during preinit
It turns out that while disabling i2c bus access from software when the
GPU is suspended was a step in the right direction with:
commit 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after
->fini()")
We also ended up accidentally breaking the vbios init scripts on some
older Tesla GPUs, as apparently said scripts can actually use the i2c
bus. Since these scripts are executed before initializing any
subdevices, we end up failing to acquire access to the i2c bus which has
left a number of cards with their fan controllers uninitialized. Luckily
this doesn't break hardware - it just means the fan gets stuck at 100%.
This also means that we've always been using our i2c busses before
initializing them during the init scripts for older GPUs, we just didn't
notice it until we started preventing them from being used until init.
It's pretty impressive this never caused us any issues before!
So, fix this by initializing our i2c pad and busses during subdev
pre-init. We skip initializing aux busses during pre-init, as those are
guaranteed to only ever be used by nouveau for DP aux transactions.
Ben Skeggs [Mon, 17 Jun 2019 02:52:48 +0000 (12:52 +1000)]
drm/nouveau/core: recognise TU116 chipset
Modesetting only, still waiting on ACR/GR firmware from NVIDIA for Turing
graphics/compute bring-up.
Each subsystem was compared with traces, along with various tests to check
that things generally work as they should, and appears compatible enough
with the current TU117 code to enable support.
Ben Skeggs [Tue, 28 May 2019 23:58:18 +0000 (09:58 +1000)]
drm/nouveau/kms: disallow dual-link harder if hdmi connection detected
The fallthrough cases (pre-Fermi) would accidentally allow dual-link pixel
clocks even where they shouldn't be. This leads to a high resolution HDMI
displays, connected via a DVI->HDMI adapter, to fail on the original NV50.
Previously center scaling would get scaling applied to it (when it was
only supposed to center the image), and aspect-corrected scaling did not
always correctly pick whether to reduce width or height for a particular
combination of inputs/outputs.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110660 Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Ben Skeggs <[email protected]>
Ilia Mirkin [Sat, 25 May 2019 22:41:48 +0000 (18:41 -0400)]
drm/nouveau/disp/nv50-: force scaler for any non-default LVDS/eDP modes
Higher layers tend to add a lot of modes not actually in the EDID, such
as the standard DMT modes. Changing this would be extremely intrusive to
everyone, so just force the scaler more often. There are no practical
cases we're aware of where a LVDS/eDP panel has multiple resolutions
exposed, and i915 already does it this way.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110660 Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Ben Skeggs <[email protected]>
Guo Ren [Fri, 28 Jun 2019 12:39:46 +0000 (20:39 +0800)]
csky: Fixup abiv1 memset error
Current memset implementation in abiv1 is wrong and it'll cause unalign
access. Just remove it and use the generic one. This patch will cause
performance degradation and we will improve it with a new design in next
patchset.
Guo Ren [Tue, 18 Jun 2019 12:34:35 +0000 (20:34 +0800)]
csky: Improve tlb operation with help of asid
There are two generations of tlb operation instruction for C-SKY.
First generation is use mcr register and it need software do more
things, second generation is use specific instructions, eg:
tlbi.va, tlbi.vas, tlbi.alls
We implemented the following functions:
- flush_tlb_range (a range of entries)
- flush_tlb_page (one entry)
Above functions use asid from vma->mm to invalid tlb entries and
we could use tlbi.vas instruction for newest generation csky cpu.
- flush_tlb_kernel_range
- flush_tlb_one
Above functions don't care asid and it invalid the tlb entries only
with vpn and we could use tlbi.vaas instruction for newest generat-
ion csky cpu.
Guo Ren [Tue, 18 Jun 2019 12:33:32 +0000 (20:33 +0800)]
csky: Use generic asid algorithm to implement switch_mm
Use linux generic asid/vmid algorithm to implement csky
switch_mm function. The algorithm is from arm and it could
work with SMP system. It'll help reduce tlb flush for
switch_mm in task/vm switch.
Guo Ren [Tue, 18 Jun 2019 12:06:52 +0000 (20:06 +0800)]
csky: Add new asid lib code from arm
This patch only contains asid help code from arm for next patch to
use.
The asid allocator use five level check to reduce the cost of
switch_mm.
1. Check if the asid version is the same (it's general)
2. Check reserved_asid which is set in rollover flush_context()
and key point is to keep the same bit position with the current
asid version instead of input version.
3. Check if the position of bitmap is free then it could be set &
used directly.
4. find_next_zero_bit() (a little performance cost)
5. flush_context (this is the worst cost with increase current asid
version)
Check is level by level and cost is also higher with the next level.
The reserved_asid and bitmap mechanism prevent unnecessary
find_next_zero_bit().
The atomic 64 bit asid is also suitable for 32-bit system and it
won't cost a lot in 1th 2th 3th level check.
The operation of set/clear mm_cpumask was removed in arm64 compared to
arm32. It seems no side effect on current arm64 system, but from
software meaning it's wrong. Although csky also needn't it, we add it
back for csky.
The asid_per_ctxt is no use for csky and it reserves the lowest bits for
other use, maybe: trust zone ? Ok, just keep it in csky copy.
Seems it also could be used by other archs and it's worth to move asid
code to generic in future.
Guo Ren [Tue, 18 Jun 2019 09:20:10 +0000 (17:20 +0800)]
csky: Revert mmu ASID mechanism
Current C-SKY ASID mechanism is from mips and it doesn't work well
with multi-cores. ASID per core mechanism is not suitable for C-SKY
SMP tlb maintain operations, eg: tlbi.vas need share the same asid
in all processors and it'll invalid the tlb entry in all cores with
the same asid.
Add trigger type setting for csky,mpintc. The driver also could
support #interrupt-cells <1> and it wouldn't invalidate existing
DTs. Here we only show the complete format.
Guo Ren [Tue, 4 Jun 2019 10:54:48 +0000 (18:54 +0800)]
csky: Fixup some error count in 810 & 860.
CK810 pmu only support event with index 0-8 and 0xd; CK860 only
support event 1~4, 0xa~0x1b. So do not register unsupport event
to hardware cache event, which may leader to unknown behavior.
Mao Han [Tue, 4 Jun 2019 10:54:49 +0000 (18:54 +0800)]
csky: Fix perf record in kernel/user space
csky_pmu_event_init is called several times during the perf record
initialzation. After configure the event counter in either kernel
space or user space, csky_pmu_event_init is called twice with no
attr specified. Configuration will be overwritten with sampling in
both kernel space and user space. --all-kernel/--all-user is
useless without this patch applied.
Mao Han [Tue, 4 Jun 2019 10:54:45 +0000 (18:54 +0800)]
csky: Add count-width property for csky pmu
The csky pmu counter may have different io width. When the counter is
smaller then 64 bits and counter value is smaller than the old value, it
will result to a extremely large delta value. So the sampled value should
be extend to 64 bits to avoid this, the extension bits base on the
count-width property from dts.
Mao Han [Tue, 4 Jun 2019 10:54:44 +0000 (18:54 +0800)]
csky: Init pmu as a device
This patch change the csky pmu initialization from arch init to
device init. The pmu can be configued with information from
device tree(pmu device name, irq number and etc.).
Accessing 'current' in bpf context makes no sense, since packets
are processed from softirq context.
As Neal stated : The capability check in tcp_set_congestion_control()
was written assuming a system call context, and then was reused from
a BPF call site.
The fix is to add a new parameter to tcp_set_congestion_control(),
so that the ns_capable() call is only performed under the right
context.
In case of error, the function of_get_mac_address() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range. This
function uses the extra1 and extra2 members from struct ctl_table as
minimum and maximum allowed value.
On sysctl handler declaration, in every source file there are some
readonly variables containing just an integer which address is assigned
to the extra1 and extra2 members, so the sysctl range is enforced.
The special values 0, 1 and INT_MAX are very often used as range
boundary, leading duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:
Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.
This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:
# scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
Data old new delta
sysctl_vals - 12 +12
__kstrtab_sysctl_vals - 12 +12
max 14 10 -4
int_max 16 - -16
one 68 - -68
zero 128 28 -100
Total: Before=20583249, After=20583085, chg -0.00%
Dan Williams [Thu, 18 Jul 2019 22:58:43 +0000 (15:58 -0700)]
mm/sparsemem: cleanup 'section number' data types
David points out that there is a mixture of 'int' and 'unsigned long'
usage for section number data types. Update the memory hotplug path to
use 'unsigned long' consistently for section numbers.
Dan Williams [Thu, 18 Jul 2019 22:58:40 +0000 (15:58 -0700)]
libnvdimm/pfn: stop padding pmem namespaces to section alignment
Now that the mm core supports section-unaligned hotplug of ZONE_DEVICE
memory, we no longer need to add padding at pfn/dax device creation
time. The kernel will still honor padding established by older kernels.
At namespace creation time there is the potential for the "expected to
be zero" fields of a 'pfn' info-block to be filled with indeterminate
data. While the kernel buffer is zeroed on allocation it is immediately
overwritten by nd_pfn_validate() filling it with the current contents of
the on-media info-block location. For fields like, 'flags' and the
'padding' it potentially means that future implementations can not rely on
those fields being zero.
In preparation to stop using the 'start_pad' and 'end_trunc' fields for
section alignment, arrange for fields that are not explicitly
initialized to be guaranteed zero. Bump the minor version to indicate
it is safe to assume the 'padding' and 'flags' are zero. Otherwise,
this corruption is expected to benign since all other critical fields
are explicitly initialized.
Note The cc: stable is about spreading this new policy to as many
kernels as possible not fixing an issue in those kernels. It is not
until the change titled "libnvdimm/pfn: Stop padding pmem namespaces to
section alignment" where this improper initialization becomes a problem.
So if someone decides to backport "libnvdimm/pfn: Stop padding pmem
namespaces to section alignment" (which is not tagged for stable), make
sure this pre-requisite is flagged.
Dan Williams [Thu, 18 Jul 2019 22:58:33 +0000 (15:58 -0700)]
mm/devm_memremap_pages: enable sub-section remap
Teach devm_memremap_pages() about the new sub-section capabilities of
arch_{add,remove}_memory(). Effectively, just replace all usage of
align_start, align_end, and align_size with res->start, res->end, and
resource_size(res). The existing sanity check will still make sure that
the two separate remap attempts do not collide within a sub-section (2MB
on x86).
Dan Williams [Thu, 18 Jul 2019 22:58:26 +0000 (15:58 -0700)]
mm/sparsemem: support sub-section hotplug
The libnvdimm sub-system has suffered a series of hacks and broken
workarounds for the memory-hotplug implementation's awkward
section-aligned (128MB) granularity.
For example the following backtrace is emitted when attempting
arch_add_memory() with physical address ranges that intersect 'System
RAM' (RAM) with 'Persistent Memory' (PMEM) within a given section:
It was discovered that the problem goes beyond RAM vs PMEM collisions as
some platform produce PMEM vs PMEM collisions within a given section.
The libnvdimm workaround for that case revealed that the libnvdimm
section-alignment-padding implementation has been broken for a long
while.
A fix for that long-standing breakage introduces as many problems as it
solves as it would require a backward-incompatible change to the
namespace metadata interpretation. Instead of that dubious route [1],
address the root problem in the memory-hotplug implementation.
Note that EEXIST is no longer treated as success as that is how
sparse_add_section() reports subsection collisions, it was also obviated
by recent changes to perform the request_region() for 'System RAM'
before arch_add_memory() in the add_memory() sequence.
Dan Williams [Thu, 18 Jul 2019 22:58:22 +0000 (15:58 -0700)]
mm/sparsemem: prepare for sub-section ranges
Prepare the memory hot-{add,remove} paths for handling sub-section
ranges by plumbing the starting page frame and number of pages being
handled through arch_{add,remove}_memory() to
sparse_{add,remove}_one_section().
This is simply plumbing, small cleanups, and some identifier renames.
No intended functional changes.
Dan Williams [Thu, 18 Jul 2019 22:58:15 +0000 (15:58 -0700)]
mm/hotplug: kill is_dev_zone() usage in __remove_pages()
The zone type check was a leftover from the cleanup that plumbed altmap
through the memory hotplug path, i.e. commit da024512a1fa "mm: pass the
vmem_altmap to arch_remove_memory and __remove_pages".
Dan Williams [Thu, 18 Jul 2019 22:58:11 +0000 (15:58 -0700)]
mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()
Allow sub-section sized ranges to be added to the memmap.
populate_section_memmap() takes an explict pfn range rather than
assuming a full section, and those parameters are plumbed all the way
through to vmmemap_populate(). There should be no sub-section usage in
current deployments. New warnings are added to clarify which memmap
allocation paths are sub-section capable.
Dan Williams [Thu, 18 Jul 2019 22:58:07 +0000 (15:58 -0700)]
mm/hotplug: prepare shrink_{zone, pgdat}_span for sub-section removal
Sub-section hotplug support reduces the unit of operation of hotplug
from section-sized-units (PAGES_PER_SECTION) to sub-section-sized units
(PAGES_PER_SUBSECTION). Teach shrink_{zone,pgdat}_span() to consider
PAGES_PER_SUBSECTION boundaries as the points where pfn_valid(), not
valid_section(), can toggle.
Dan Williams [Thu, 18 Jul 2019 22:58:04 +0000 (15:58 -0700)]
mm/sparsemem: add helpers track active portions of a section at boot
Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
sub-section active bitmask, each bit representing a PMD_SIZE span of the
architecture's memory hotplug section size.
The implications of a partially populated section is that pfn_valid()
needs to go beyond a valid_section() check and either determine that the
section is an "early section", or read the sub-section active ranges
from the bitmask. The expectation is that the bitmask (subsection_map)
fits in the same cacheline as the valid_section() / early_section()
data, so the incremental performance overhead to pfn_valid() should be
negligible.
The rationale for using early_section() to short-ciruit the
subsection_map check is that there are legacy code paths that use
pfn_valid() at section granularity before validating the pfn against
pgdat data. So, the early_section() check allows those traditional
assumptions to persist while also permitting subsection_map to tell the
truth for purposes of populating the unused portions of early sections
with PMEM and other ZONE_DEVICE mappings.
Dan Williams [Thu, 18 Jul 2019 22:58:00 +0000 (15:58 -0700)]
mm/sparsemem: introduce a SECTION_IS_EARLY flag
In preparation for sub-section hotplug, track whether a given section
was created during early memory initialization, or later via memory
hotplug. This distinction is needed to maintain the coarse expectation
that pfn_valid() returns true for any pfn within a given section even if
that section has pages that are reserved from the page allocator.
For example one of the of goals of subsection hotplug is to support
cases where the system physical memory layout collides System RAM and
PMEM within a section. Several pfn_valid() users expect to just check
if a section is valid, but they are not careful to check if the given
pfn is within a "System RAM" boundary and instead expect pgdat
information to further validate the pfn.
Rather than unwind those paths to make their pfn_valid() queries more
precise a follow on patch uses the SECTION_IS_EARLY flag to maintain the
traditional expectation that pfn_valid() returns true for all early
sections.
Dan Williams [Thu, 18 Jul 2019 22:57:57 +0000 (15:57 -0700)]
mm/sparsemem: introduce struct mem_section_usage
Patch series "mm: Sub-section memory hotplug support", v10.
The memory hotplug section is an arbitrary / convenient unit for memory
hotplug. 'Section-size' units have bled into the user interface
('memblock' sysfs) and can not be changed without breaking existing
userspace. The section-size constraint, while mostly benign for typical
memory hotplug, has and continues to wreak havoc with 'device-memory'
use cases, persistent memory (pmem) in particular. Recall that pmem
uses devm_memremap_pages(), and subsequently arch_add_memory(), to
allocate a 'struct page' memmap for pmem. However, it does not use the
'bottom half' of memory hotplug, i.e. never marks pmem pages online and
never exposes the userspace memblock interface for pmem. This leaves an
opening to redress the section-size constraint.
To date, the libnvdimm subsystem has attempted to inject padding to
satisfy the internal constraints of arch_add_memory(). Beyond
complicating the code, leading to bugs [2], wasting memory, and limiting
configuration flexibility, the padding hack is broken when the platform
changes this physical memory alignment of pmem from one boot to the
next. Device failure (intermittent or permanent) and physical
reconfiguration are events that can cause the platform firmware to
change the physical placement of pmem on a subsequent boot, and device
failure is an everyday event in a data-center.
It turns out that sections are only a hard requirement of the
user-facing interface for memory hotplug and with a bit more
infrastructure sub-section arch_add_memory() support can be added for
kernel internal usages like devm_memremap_pages(). Here is an analysis
of the current design assumptions in the current code and how they are
addressed in the new implementation:
Current design assumptions:
- Sections that describe boot memory (early sections) are never
unplugged / removed.
- pfn_valid(), in the CONFIG_SPARSEMEM_VMEMMAP=y, case devolves to a
valid_section() check
- __add_pages() and helper routines assume all operations occur in
PAGES_PER_SECTION units.
- The memblock sysfs interface only comprehends full sections
New design assumptions:
- Sections are instrumented with a sub-section bitmask to track (on
x86) individual 2MB sub-divisions of a 128MB section.
- Partially populated early sections can be extended with additional
sub-sections, and those sub-sections can be removed with
arch_remove_memory(). With this in place we no longer lose usable
memory capacity to padding.
- pfn_valid() is updated to look deeper than valid_section() to also
check the active-sub-section mask. This indication is in the same
cacheline as the valid_section() so the performance impact is
expected to be negligible. So far the lkp robot has not reported any
regressions.
- Outside of the core vmemmap population routines which are replaced,
other helper routines like shrink_{zone,pgdat}_span() are updated to
handle the smaller granularity. Core memory hotplug routines that
deal with online memory are not touched.
- The existing memblock sysfs user api guarantees / assumptions are not
touched since this capability is limited to !online
!memblock-sysfs-accessible sections.
Meanwhile the issue reports continue to roll in from users that do not
understand when and how the 128MB constraint will bite them. The current
implementation relied on being able to support at least one misaligned
namespace, but that immediately falls over on any moderately complex
namespace creation attempt. Beyond the initial problem of 'System RAM'
colliding with pmem, and the unsolvable problem of physical alignment
changes, Linux is now being exposed to platforms that collide pmem ranges
with other pmem ranges by default [3]. In short, devm_memremap_pages()
has pushed the venerable section-size constraint past the breaking point,
and the simplicity of section-aligned arch_add_memory() is no longer
tenable.
These patches are exposed to the kbuild robot on a subsection-v10 branch
[4], and a preview of the unit test for this functionality is available
on the 'subsection-pending' branch of ndctl [5].
Towards enabling memory hotplug to track partial population of a section,
introduce 'struct mem_section_usage'.
A pointer to a 'struct mem_section_usage' instance replaces the existing
pointer to a 'pageblock_flags' bitmap. Effectively it adds one more
'unsigned long' beyond the 'pageblock_flags' (usemap) allocation to house
a new 'subsection_map' bitmap. The new bitmap enables the memory
hot{plug,remove} implementation to act on incremental sub-divisions of a
section.
SUBSECTION_SHIFT is defined as global constant instead of per-architecture
value like SECTION_SIZE_BITS in order to allow cross-arch compatibility of
subsection users. Specifically a common subsection size allows for the
possibility that persistent memory namespace configurations be made
compatible across architectures.
The primary motivation for this functionality is to support platforms that
mix "System RAM" and "Persistent Memory" within a single section, or
multiple PMEM ranges with different mapping lifetimes within a single
section. The section restriction for hotplug has caused an ongoing saga
of hacks and bugs for devm_memremap_pages() users.
Beyond the fixups to teach existing paths how to retrieve the 'usemap'
from a section, and updates to usemap allocation path, there are no
expected behavior changes.
mm/memory_hotplug: rename walk_memory_range() and pass start+size instead of pfns
walk_memory_range() was once used to iterate over sections. Now, it
iterates over memory blocks. Rename the function, fixup the
documentation.
Also, pass start+size instead of PFNs, which is what most callers
already have at hand. (we'll rework link_mem_sections() most probably
soon)
Follow-up patches will rework, simplify, and move walk_memory_blocks()
to drivers/base/memory.c.
Note: walk_memory_blocks() only works correctly right now if the
start_pfn is aligned to a section start. This is the case right now,
but we'll generalize the function in a follow up patch so the semantics
match the documentation.
Patch series "mm: Further memory block device cleanups", v1.
Some further cleanups around memory block devices. Especially, clean up
and simplify walk_memory_range(). Including some other minor cleanups.
This patch (of 6):
We are using a mixture of "int" and "unsigned long". Let's make this
consistent by using "unsigned long" everywhere. We'll do the same with
memory block ids next.
While at it, turn the "unsigned long i" in removable_show() into an int
- sections_per_block is an int.
Nadav Amit [Thu, 18 Jul 2019 22:57:34 +0000 (15:57 -0700)]
resource: avoid unnecessary lookups in find_next_iomem_res()
find_next_iomem_res() shows up to be a source for overhead in dax
benchmarks.
Improve performance by not considering children of the tree if the top
level does not match. Since the range of the parents should include the
range of the children such check is redundant.
Running sysbench on dax (pmem emulation, with write_cache disabled):
sysbench fileio --file-total-size=3G --file-test-mode=rndwr \
--file-io-mode=mmap --threads=4 --file-fsync-mode=fdatasync run
Nadav Amit [Thu, 18 Jul 2019 22:57:31 +0000 (15:57 -0700)]
resource: fix locking in find_next_iomem_res()
Since resources can be removed, locking should ensure that the resource
is not removed while accessing it. However, find_next_iomem_res() does
not hold the lock while copying the data of the resource.
Keep holding the lock while the data is copied. While at it, change the
return value to a more informative value. It is disregarded by the
callers.
Yang Shi [Thu, 18 Jul 2019 22:57:27 +0000 (15:57 -0700)]
mm: thp: fix false negative of shmem vma's THP eligibility
Commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
vma") introduced THPeligible bit for processes' smaps. But, when
checking the eligibility for shmem vma, __transparent_hugepage_enabled()
is called to override the result from shmem_huge_enabled(). It may
result in the anonymous vma's THP flag override shmem's. For example,
running a simple test which create THP for shmem, but with anonymous THP
disabled, when reading the process's smaps, it may show:
And, /proc/meminfo does show THP allocated and PMD mapped too:
ShmemHugePages: 4096 kB
ShmemPmdMapped: 4096 kB
This doesn't make too much sense. The shmem objects should be treated
separately from anonymous THP. Calling shmem_huge_enabled() with
checking MMF_DISABLE_THP sounds good enough. And, we could skip stack
and dax vma check since we already checked if the vma is shmem already.
Also check if vma is suitable for THP by calling
transhuge_vma_suitable().
And minor fix to smaps output format and documentation.
Yang Shi [Thu, 18 Jul 2019 22:57:24 +0000 (15:57 -0700)]
mm: thp: make transhuge_vma_suitable available for anonymous THP
transhuge_vma_suitable() was only available for shmem THP, but anonymous
THP has the same check except pgoff check. And, it will be used for THP
eligible check in the later patch, so make it available for all kind of
THPs. This also helps reduce code duplication slightly.
Since anonymous THP doesn't have to check pgoff, so make pgoff check
shmem vma only.
And regroup some functions in include/linux/mm.h to solve compile issue
since transhuge_vma_suitable() needs call vma_is_anonymous() which was
defined after huge_mm.h is included.
Wei Yang [Thu, 18 Jul 2019 22:57:21 +0000 (15:57 -0700)]
mm/sparse.c: set section nid for hot-add memory
In case of NODE_NOT_IN_PAGE_FLAGS is set, we store section's node id in
section_to_node_table[]. While for hot-add memory, this is missed.
Without this information, page_to_nid() may not give the right node id.
BTW, current online_pages works because it leverages nid in
memory_block. But the granularity of node id should be mem_section
wide.
mm/memory_hotplug: remove "zone" parameter from sparse_remove_one_section
The parameter is unused, so let's drop it. Memory removal paths should
never care about zones. This is the job of memory offlining and will
require more refactorings.
mm/memory_hotplug: make unregister_memory_block_under_nodes() never fail
We really don't want anything during memory hotunplug to fail. We
always pass a valid memory block device, that check can go. Avoid
allocating memory and eventually failing. As we are always called under
lock, we can use a static piece of memory. This avoids having to put
the structure onto the stack, having to guess about the stack size of
callers.
Patch inspired by a patch from Oscar Salvador.
In the future, there might be no need to iterate over nodes at all.
mem->nid should tell us exactly what to remove. Memory block devices
with mixed nodes (added during boot) should properly fenced off and
never removed.
mm/memory_hotplug: remove memory block devices before arch_remove_memory()
Let's factor out removing of memory block devices, which is only
necessary for memory added via add_memory() and friends that created
memory block devices. Remove the devices before calling
arch_remove_memory().
This finishes factoring out memory block device handling from
arch_add_memory() and arch_remove_memory().
mm/memory_hotplug: create memory block devices after arch_add_memory()
Only memory to be added to the buddy and to be onlined/offlined by user
space using /sys/devices/system/memory/... needs (and should have!)
memory block devices.
Factor out creation of memory block devices. Create all devices after
arch_add_memory() succeeded. We can later drop the want_memblock
parameter, because it is now effectively stale.
Only after memory block devices have been added, memory can be onlined
by user space. This implies, that memory is not visible to user space
at all before arch_add_memory() succeeded.
While at it
- use WARN_ON_ONCE instead of BUG_ON in moved unregister_memory()
- introduce find_memory_block_by_id() to search via block id
- Use find_memory_block_by_id() in init_memory_block() to catch
duplicates
mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE
We want to improve error handling while adding memory by allowing to use
arch_remove_memory() and __remove_pages() even if
CONFIG_MEMORY_HOTREMOVE is not set to e.g., implement something like:
arch_add_memory()
rc = do_something();
if (rc) {
arch_remove_memory();
}
We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require
quite some dependencies for memory offlining.
A proper arch_remove_memory() implementation is on its way, which also
cleanly removes page tables in arch_add_memory() in case something goes
wrong.
As we want to use arch_remove_memory() in case something goes wrong
during memory hotplug after arch_add_memory() finished, let's add a
temporary hack that is sufficient enough until we get a proper
implementation that cleans up page table entries.
We will remove CONFIG_MEMORY_HOTREMOVE around this code in follow up
patches.
mm/memory_hotplug: simplify and fix check_hotplug_memory_range()
Patch series "mm/memory_hotplug: Factor out memory block devicehandling", v3.
We only want memory block devices for memory to be onlined/offlined
(add/remove from the buddy). This is required so user space can
online/offline memory and kdump gets notified about newly onlined
memory.
Let's factor out creation/removal of memory block devices. This helps
to further cleanup arch_add_memory/arch_remove_memory() and to make
implementation of new features easier - especially sub-section memory
hot add from Dan.
Anshuman Khandual is currently working on arch_remove_memory(). I added
a temporary solution via "arm64/mm: Add temporary arch_remove_memory()
implementation", that is sufficient as a firsts tep in the context of
this series. (we don't cleanup page tables in case anything goes wrong
already)
Did a quick sanity test with DIMM plug/unplug, making sure all devices
and sysfs links properly get added/removed. Compile tested on s390x and
x86-64.
This patch (of 11):
By converting start and size to page granularity, we actually ignore
unaligned parts within a page instead of properly bailing out with an
error.
Michael Chan [Wed, 17 Jul 2019 07:07:23 +0000 (03:07 -0400)]
bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips.
Unlike legacy chips, 57500 chips don't need additional VNIC resources
for aRFS/ntuple. Fix the code accordingly so that we don't reserve
and allocate additional VNICs on 57500 chips. Without this patch,
the driver is failing to initialize when it tries to allocate extra
VNICs.
Fixes: ac33906c67e2 ("bnxt_en: Add support for aRFS on 57500 chips.") Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Ronnie Sahlberg [Thu, 18 Jul 2019 22:12:11 +0000 (08:12 +1000)]
cifs: flush before set-info if we have writeable handles
Servers can defer destaging any data and updating the mtime until close().
This means that if we do a setinfo to modify the mtime while other handles
are open for write the server may overwrite our setinfo timestamps when
if flushes the file on close() of the writeable handle.
To solve this we add an explicit flush when the mtime is about to
be updated.
This fixes "cp -p" to preserve mtime when copying a file onto an SMB2 share.
Steve French [Thu, 18 Jul 2019 22:22:18 +0000 (17:22 -0500)]
smb3: optimize open to not send query file internal info
We can cut one third of the traffic on open by not querying the
inode number explicitly via SMB3 query_info since it is now
returned on open in the qfid context.
This is better in multiple ways, and
speeds up file open about 10% (more if network is slow).
Merge tag 'for-5.3/dm-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull more device mapper updates from Mike Snitzer:
- Fix zone state management race in DM zoned target by eliminating the
unnecessary DMZ_ACTIVE state.
- A couple fixes for issues the DM snapshot target's optional discard
support added during first week of the 5.3 merge.
- Increase default size of outstanding IO that is allowed for a each
dm-kcopyd client and introduce tunable to allow user adjust.
- Update DM core to use printk ratelimiting functions rather than
duplicate them and in doing so fix an issue where DMDEBUG_LIMIT()
rate limited KERN_DEBUG messages had excessive "callbacks suppressed"
messages.
* tag 'for-5.3/dm-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: use printk ratelimiting functions
dm kcopyd: Increase default sub-job size to 512KB
dm snapshot: fix oversights in optional discard support
dm zoned: fix zone state management race