Git Repo - linux.git/log

apparmor: test: make static symbols visible during kunit testing

Use macros, VISIBLE_IF_KUNIT and EXPORT_SYMBOL_IF_KUNIT, to allow
static symbols to be conditionally set to be visible during
apparmor_policy_unpack_test, which removes the need to include the testing
file in the implementation file.

Change the namespace of the symbols that are now conditionally visible (by
adding the prefix aa_) to avoid confusion with symbols of the same name.

Allow the test to be built as a module and namespace the module name from
policy_unpack_test to apparmor_policy_unpack_test to improve clarity of
the module name.

Provide an example of how static symbols can be dealt with in testing.

Signed-off-by: Rae Moar <[email protected]>
Reviewed-by: David Gow <[email protected]>
Acked-by: John Johansen <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: add macro to allow conditionally exposing static symbols to tests

Create two macros:

VISIBLE_IF_KUNIT - A macro that sets symbols to be static if CONFIG_KUNIT
is not enabled. Otherwise if CONFIG_KUNIT is enabled there is no change to
the symbol definition.

EXPORT_SYMBOL_IF_KUNIT(symbol) - Exports symbol into
EXPORTED_FOR_KUNIT_TESTING namespace only if CONFIG_KUNIT is enabled. Must
use MODULE_IMPORT_NS(EXPORTED_FOR_KUNIT_TESTING) in test file in order to
use symbols.

Signed-off-by: Rae Moar <[email protected]>
Reviewed-by: John Johansen <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: make parser preserve whitespace when printing test log

Currently, kunit_parser.py is stripping all leading whitespace to make
parsing easier. But this means we can't accurately show kernel output
for failing tests or when the kernel crashes.

Embarassingly, this affects even KUnit's own output, e.g.
[13:40:46] Expected 2 + 1 == 2, but
[13:40:46] 2 + 1 == 3 (0x3)
[13:40:46] not ok 1 example_simple_test
[13:40:46] [FAILED] example_simple_test

After this change, here's what the output in context would look like
[13:40:46] =================== example (4 subtests) ===================
[13:40:46] # example_simple_test: initializing
[13:40:46] # example_simple_test: EXPECTATION FAILED at lib/kunit/kunit-example-test.c:29
[13:40:46] Expected 2 + 1 == 2, but
[13:40:46] 2 + 1 == 3 (0x3)
[13:40:46] [FAILED] example_simple_test
[13:40:46] [SKIPPED] example_skip_test
[13:40:46] [SKIPPED] example_mark_skipped_test
[13:40:46] [PASSED] example_all_expect_macros_test
[13:40:46] # example: initializing suite
[13:40:46] # example: pass:1 fail:1 skip:2 total:4
[13:40:46] # Totals: pass:1 fail:1 skip:2 total:4
[13:40:46] ===================== [FAILED] example =====================

This example shows one minor cosmetic defect this approach has.
The test counts lines prevent us from dedenting the suite-level output.
But at the same time, any form of non-KUnit output would do the same
unless it happened to be indented as well.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

Documentation: kunit: Fix "How Do I Use This" / "Next Steps" sections

The "How Do I Use This" section of index.rst and "Next Steps" section of
start.rst were just copies of the table of contents, and therefore
weren't really useful either when looking a sphinx generated output
(which already had the TOC visible) or when reading the source (where
it's just a list of files that ls could give you).

Instead, provide a small number of concrete next steps, and a bit more
description about what the pages contain.

This also removes the broken reference to 'tips.rst', which was
previously removed.

Fixed git am whitespace complaints during commit:
Shuah Khan <[email protected]>

Fixes: 4399c737a97d ("Documentation: kunit: Remove redundant 'tips.rst' page")
Signed-off-by: David Gow <[email protected]>
Reviewed-by: Sadiya Kazi <[email protected]>
Reviewed-by: Bagas Sanjaya <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: don't include KTAP headers and the like in the test log

We print the "test log" on failure.
This is meant to be all the kernel output that happened during the test.

But we also include the special KTAP lines in it, which are often
redundant.

E.g. we include the "not ok" line in the log, right before we print
that the test case failed...
[13:51:48] Expected 2 + 1 == 2, but
[13:51:48] 2 + 1 == 3 (0x3)
[13:51:48] not ok 1 example_simple_test
[13:51:48] [FAILED] example_simple_test

More full example after this patch:
[13:51:48] =================== example (4 subtests) ===================
[13:51:48] # example_simple_test: initializing
[13:51:48] # example_simple_test: EXPECTATION FAILED at lib/kunit/kunit-example-test.c:29
[13:51:48] Expected 2 + 1 == 2, but
[13:51:48] 2 + 1 == 3 (0x3)
[13:51:48] [FAILED] example_simple_test

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: improve KTAP compliance of KUnit test output

Change KUnit test output to better comply with KTAP v1 specifications
found here: https://kernel.org/doc/html/latest/dev-tools/ktap.html.
1) Use "KTAP version 1" instead of "TAP version 14" as test output header
2) Remove '-' between test number and test name on test result lines
2) Add KTAP version lines to each subtest header as well

Note that the new KUnit output still includes the “# Subtest” line now
located after the KTAP version line. This does not completely match the
KTAP v1 spec but since it is classified as a diagnostic line, it is not
expected to be disruptive or break any existing parsers. This
“# Subtest” line comes from the TAP 14 spec
(https://testanything.org/tap-version-14-specification.html) and it is
used to define the test name before the results.

Original output:

TAP version 14
1..1
   # Subtest: kunit-test-suite
   1..3
   ok 1 - kunit_test_1
   ok 2 - kunit_test_2
   ok 3 - kunit_test_3
# kunit-test-suite: pass:3 fail:0 skip:0 total:3
# Totals: pass:3 fail:0 skip:0 total:3
ok 1 - kunit-test-suite

New output:

KTAP version 1
1..1
   KTAP version 1
   # Subtest: kunit-test-suite
   1..3
   ok 1 kunit_test_1
   ok 2 kunit_test_2
   ok 3 kunit_test_3
# kunit-test-suite: pass:3 fail:0 skip:0 total:3
# Totals: pass:3 fail:0 skip:0 total:3
ok 1 kunit-test-suite

Signed-off-by: Rae Moar <[email protected]>
Reviewed-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Tested-by: Anders Roxell <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: parse KTAP compliant test output

Change the KUnit parser to be able to parse test output that complies with
the KTAP version 1 specification format found here:
https://kernel.org/doc/html/latest/dev-tools/ktap.html. Ensure the parser
is able to parse tests with the original KUnit test output format as
well.

KUnit parser now accepts any of the following test output formats:

Original KUnit test output format:

TAP version 14
1..1
   # Subtest: kunit-test-suite
   1..3
   ok 1 - kunit_test_1
   ok 2 - kunit_test_2
   ok 3 - kunit_test_3
# kunit-test-suite: pass:3 fail:0 skip:0 total:3
# Totals: pass:3 fail:0 skip:0 total:3
ok 1 - kunit-test-suite

KTAP version 1 test output format:

KTAP version 1
1..1
   KTAP version 1
   1..3
   ok 1 kunit_test_1
   ok 2 kunit_test_2
   ok 3 kunit_test_3
ok 1 kunit-test-suite

New KUnit test output format (changes made in the next patch of
this series):

KTAP version 1
1..1
   KTAP version 1
   # Subtest: kunit-test-suite
   1..3
   ok 1 kunit_test_1
   ok 2 kunit_test_2
   ok 3 kunit_test_3
# kunit-test-suite: pass:3 fail:0 skip:0 total:3
# Totals: pass:3 fail:0 skip:0 total:3
ok 1 kunit-test-suite

Signed-off-by: Rae Moar <[email protected]>
Reviewed-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

mm: slub: test: Use the kunit_get_current_test() function

Use the newly-added function kunit_get_current_test() instead of
accessing current->kunit_test directly. This function uses a static key
to return more quickly when KUnit is enabled, but no tests are actively
running. There should therefore be a negligible performance impact to
enabling the slub KUnit tests.

Other than the performance improvement, this should be a no-op.

Cc: Oliver Glitta <[email protected]>
Cc: Hyeonggon Yoo <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: David Gow <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Hyeonggon Yoo <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: Use the static key when retrieving the current test

In order to detect if a KUnit test is running, and to access its
context, the 'kunit_test' member of the current task_struct is used.
Usually, this is accessed directly or via the kunit_fail_current_task()
function.

In order to speed up the case where no test is running, add a wrapper,
kunit_get_current_test(), which uses the static key to fail early.
Equally, Speed up kunit_fail_current_test() by using the static key.

This should make it convenient for code to call this
unconditionally in fakes or error paths, without worrying that this will
slow the code down significantly.

If CONFIG_KUNIT=n (or m), this compiles away to nothing. If
CONFIG_KUNIT=y, it will compile down to a NOP (on most architectures) if
no KUnit test is currently running.

Note that kunit_get_current_test() does not work if KUnit is built as a
module. This mirrors the existing restriction on kunit_fail_current_test().

Note that the definition of kunit_fail_current_test() still wraps an
empty, inline function if KUnit is not built-in. This is to ensure that
the printf format string __attribute__ will still work.

Also update the documentation to suggest users use the new
kunit_get_current_test() function, update the example, and to describe
the behaviour when KUnit is disabled better.

Cc: Jonathan Corbet <[email protected]>
Cc: Sadiya Kazi <[email protected]>
Signed-off-by: David Gow <[email protected]>
Reviewed-by: Daniel Latypov <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: Provide a static key to check if KUnit is actively running tests

KUnit does a few expensive things when enabled. This hasn't been a
problem because KUnit was only enabled on test kernels, but with a few
people enabling (but not _using_) KUnit on production systems, we need a
runtime way of handling this.

Provide a 'kunit_running' static key (defaulting to false), which allows
us to hide any KUnit code behind a static branch. This should reduce the
performance impact (on other code) of having KUnit enabled to a single
NOP when no tests are running.

Note that, while it looks unintuitive, tests always run entirely within
__kunit_test_suites_init(), so it's safe to decrement the static key at
the end of this function, rather than in __kunit_test_suites_exit(),
which is only there to clean up results in debugfs.

Signed-off-by: David Gow <[email protected]>
Reviewed-by: Daniel Latypov <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: make --json do nothing if --raw_ouput is set

When --raw_output is set (to any value), we don't actually parse the
test results. So asking to print the test results as json doesn't make
sense.

We internally create a fake test with one passing subtest, so --json
would actually print out something misleading.

This patch:
* Rewords the flag descriptions so hopefully this is more obvious.
* Also updates --raw_output's description to note the default behavior
is to print out only "KUnit" results (actually any KTAP results)
* also renames and refactors some related logic for clarity (e.g.
test_result => test, it's a kunit_parser.Test object).

Notably, this patch does not make it an error to specify --json and
--raw_output together. This is an edge case, but I know of at least one
wrapper around kunit.py that always sets --json. You'd never be able to
use --raw_output with that wrapper.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: tweak error message when no KTAP found

We currently tell people we "couldn't find any KTAP output" with no
indication as to what this might mean.

After this patch, we get:

$ ./tools/testing/kunit/kunit.py parse /dev/null
============================================================
[ERROR] Test: <missing>: Could not find any KTAP output. Did any KUnit tests run?
============================================================
Testing complete. Ran 0 tests: errors: 1

Note: we could try and generate a more verbose message like
> Please check .kunit/test.log to see the raw kernel output.
or the like, but we'd need to know what the build dir was to know where
test.log actually lives.

This patch tries to make a more minimal improvement.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: remove KUNIT_INIT_MEM_ASSERTION macro

Commit 870f63b7cd78 ("kunit: eliminate KUNIT_INIT_*_ASSERT_STRUCT
macros") removed all the other macros of this type.

But it raced with commit b8a926bea8b1 ("kunit: Introduce
KUNIT_EXPECT_MEMEQ and KUNIT_EXPECT_MEMNEQ macros"), which added another
instance.

Remove KUNIT_INIT_MEM_ASSERTION and just use the generic
KUNIT_INIT_ASSERT macro instead.
Rename the `size` arg to avoid conflicts by appending a "_" (like we did
in the previous commit).

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

Documentation: kunit: Remove redundant 'tips.rst' page

The contents of 'tips.rst' was mostly included in 'usage.rst' way back in
commit 953574390634 ("Documentation: KUnit: Rework writing page to focus on writing tests"),
but the tips page remained behind as well.

The parent patches in this series fill in the gaps, so now 'tips.rst' is
redundant.
Therefore, delete 'tips.rst'.

While I regret breaking any links to 'tips' which might exist
externally, it's confusing to have two subtly different versions of the
same content around.

Signed-off-by: David Gow <[email protected]>
Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: Sadiya Kazi <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

Documentation: KUnit: reword description of assertions

The existing wording implies that kunit_kmalloc_array() is "the method
under test". We're actually testing the sort() function in that example.
This is because the example was changed in commit 953574390634
("Documentation: KUnit: Rework writing page to focus on writing tests"),
but the wording was not.

Also add a `note` telling people they can use the KUNIT_ASSERT_EQ()
macros from any function. Some users might be coming from a framework
like gUnit where that'll compile but silently do the wrong thing.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: Sadiya Kazi <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

Documentation: KUnit: make usage.rst a superset of tips.rst, remove duplication

usage.rst had most of the content of the tips.rst page copied over.
But it's missing https://www.kernel.org/doc/html/v6.0/dev-tools/kunit/tips.html#customizing-error-messages
Copy it over so we can retire tips.rst w/o losing content.

And in that process, it also gained a duplicate section about how
KUNIT_ASSERT_*() exit the test case early. Remove that.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: Sadiya Kazi <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: eliminate KUNIT_INIT_*_ASSERT_STRUCT macros

These macros exist because passing an initializer list to other macros
is hard.

The goal of these macros is to generate a line like
  struct $ASSERT_TYPE __assertion = $APPROPRIATE_INITIALIZER;
e.g.
  struct kunit_unary_assertion __assertion = {
  .condition = "foo()",
  .expected_true = true
  };

But the challenge is you can't pass `{.condition=..., .expect_true=...}`
as a macro argument, since the comma means you're actually passing two
arguments, `{.condition=...` and `.expect_true=....}`.
So we'd made custom macros for each different initializer-list shape.

But we can work around this with the following generic macro
  #define KUNIT_INIT_ASSERT(initializers...) { initializers }

Note: this has the downside that we have to rename some macros arguments
to not conflict with the struct field names (e.g. `expected_true`).
It's a bit gross, but probably worth reducing the # of macros.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: remove redundant file.close() call in unit test

We're using a `with` block above, so the file object is already closed.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: unit tests all check parser errors, standardize formatting a bit

Let's verify that the parser isn't reporting any errors for valid
inputs.

This change also
* does result.status checking on one line
* makes sure we consistently do it outside of the `with` block

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: make TestCounts a dataclass

Since we're using Python 3.7+, we can use dataclasses to tersen the
code.

It also lets us create pre-populated TestCounts() objects and compare
them in our unit test. (Before, you could only create empty ones).

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: print summary of failed tests if a few failed out of a lot

E.g. all the hw_breakpoint tests are failing right now.
So if I run `kunit.py run --altests --arch=x86_64`, then I see
> Testing complete. Ran 408 tests: passed: 392, failed: 9, skipped: 7

Seeing which 9 tests failed out of the hundreds is annoying.
If my terminal doesn't have scrollback support, I have to resort to
looking at `.kunit/test.log` for the `not ok` lines.

Teach kunit.py to print a summarized list of failures if the # of tests
reachs an arbitrary threshold (>=100 tests).

To try and keep the output from being too long/noisy, this new logic
a) just reports "parent_test failed" if every child test failed
b) won't print anything if there are >10 failures (also arbitrary).

With this patch, we get an extra line of output showing:
> Testing complete. Ran 408 tests: passed: 392, failed: 9, skipped: 7
> Failures: hw_breakpoint

This also works with parameterized tests, e.g. if I add a fake failure
> Failures: kcsan.test_atomic_builtins_missing_barrier.threads=6

Note: we didn't have enough tests for this to be a problem before.
But with commit 980ac3ad0512 ("kunit: tool: rename all_test_uml.config,
use it for --alltests"), --alltests works and thus running >100 tests
will probably become more common.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: tool: make unit test not print parsed testdata to stdout

Currently, if you run
$ ./tools/testing/kunit/kunit_tool_test.py
you'll see a lot of output from the parser as we feed it testdata.

This makes the output hard to read and fairly confusing, esp. since our
testdata includes example failures, which get printed out in red.

Silence that output so real failures are easier to see.

Signed-off-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: remove unused structure definition

remove unused string_stream_alloc_context structure definition.

Signed-off-by: YoungJun.park <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: Use KUNIT_EXPECT_MEMEQ macro

Use KUNIT_EXPECT_MEMEQ to compare memory blocks in replacement of the
KUNIT_EXPECT_EQ macro. Therefor, the statement

KUNIT_EXPECT_EQ(test, memcmp(foo, bar, size), 0);

is replaced by:

KUNIT_EXPECT_MEMEQ(test, foo, bar, size);

Signed-off-by: Maíra Canal <[email protected]>
Acked-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: Add KUnit memory block assertions to the example_all_expect_macros_test

Augment the example_all_expect_macros_test with the KUNIT_EXPECT_MEMEQ
and KUNIT_EXPECT_MEMNEQ macros by creating a test with memory block
assertions.

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: Introduce KUNIT_EXPECT_MEMEQ and KUNIT_EXPECT_MEMNEQ macros

Currently, in order to compare memory blocks in KUnit, the KUNIT_EXPECT_EQ
or KUNIT_EXPECT_FALSE macros are used in conjunction with the memcmp
function, such as:
    KUNIT_EXPECT_EQ(test, memcmp(foo, bar, size), 0);

Although this usage produces correct results for the test cases, when
the expectation fails, the error message is not very helpful,
indicating only the return of the memcmp function.

Therefore, create a new set of macros KUNIT_EXPECT_MEMEQ and
KUNIT_EXPECT_MEMNEQ that compare memory blocks until a specified size.
In case of expectation failure, those macros print the hex dump of the
memory blocks, making it easier to debug test failures for memory blocks.

That said, the expectation

    KUNIT_EXPECT_EQ(test, memcmp(foo, bar, size), 0);

would translate to the expectation

    KUNIT_EXPECT_MEMEQ(test, foo, bar, size);

Signed-off-by: Maíra Canal <[email protected]>
Reviewed-by: Daniel Latypov <[email protected]>
Reviewed-by: Muhammad Usama Anjum <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

Documentation: Kunit: Update architecture.rst for minor fixes

Updated the architecture.rst page with the following changes:
-Add missing article _the_ across the document.
-Reword content across for style and standard.
-Update all occurrences of Command Line to Command-line
across the document.
-Correct grammatical issues, for example,
added _it_wherever missing.
-Update all occurrences of “via" to either use
“through” or “using”.
-Update the text preceding the external links and pushed the full
link to a new line for better readability.
-Reword content under the config command to make it more clear and concise.

Signed-off-by: Sadiya Kazi <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

kunit: log numbers in decimal and hex

When KUNIT_EXPECT_EQ() or KUNIT_ASSERT_EQ() log a failure, they log the
two values being compared, with numerical values logged in decimal.

In some cases, decimal output is painful to consume, and hexadecimal
output would be more helpful. For example, this is the case for tests
I'm currently developing for the arm64 insn encoding/decoding code,
where comparing two 32-bit instruction opcodes results in output such
as:

|  # test_insn_add_shifted_reg: EXPECTATION FAILED at arch/arm64/lib/test_insn.c:2791
|  Expected obj_insn == gen_insn, but
|      obj_insn == 2332164128
|      gen_insn == 1258422304

To make this easier to consume, this patch logs the values in both
decimal and hexadecimal:

|  # test_insn_add_shifted_reg: EXPECTATION FAILED at arch/arm64/lib/test_insn.c:2791
|  Expected obj_insn == gen_insn, but
|      obj_insn == 2332164128 (0x8b020020)
|      gen_insn == 1258422304 (0x4b020020)

As can be seen from the example, having hexadecimal makes it
significantly easier for a human to spot which specific bits are
incorrect.

Signed-off-by: Mark Rutland <[email protected]>
Cc: Brendan Higgins <[email protected]>
Cc: David Gow <[email protected]>
Cc: [email protected]
Cc: [email protected]
Acked-by: Daniel Latypov <[email protected]>
Reviewed-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>

Linux 6.1-rc2

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"RISC-V:

   - Fix compilation without RISCV_ISA_ZICBOM

   - Fix kvm_riscv_vcpu_timer_pending() for Sstc

  ARM:

   - Fix a bug preventing restoring an ITS containing mappings for very
     large and very sparse device topology

   - Work around a relocation handling error when compiling the nVHE
     object with profile optimisation

   - Fix for stage-2 invalidation holding the VM MMU lock for too long
     by limiting the walk to the largest block mapping size

   - Enable stack protection and branch profiling for VHE

   - Two selftest fixes

  x86:

   - add compat implementation for KVM_X86_SET_MSR_FILTER ioctl

  selftests:

   - synchronize includes between include/uapi and tools/include/uapi"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  tools: include: sync include/api/linux/kvm.h
  KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER
  KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter()
  kvm: Add support for arch compat vm ioctls
  RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc
  RISC-V: Fix compilation without RISCV_ISA_ZICBOM
  KVM: arm64: vgic: Fix exit condition in scan_its_table()
  KVM: arm64: nvhe: Fix build with profile optimization
  KVM: selftests: Fix number of pages for memory slot in memslot_modification_stress_test
  KVM: arm64: selftests: Fix multiple versions of GIC creation
  KVM: arm64: Enable stack protection and branch profiling for VHE
  KVM: arm64: Limit stage2_apply_range() batch size to largest block
  KVM: arm64: Work out supported block level at compile time

Revert "mfd: syscon: Remove repetition of the regmap_get_val_endian()"

This reverts commit 72a95859728a7866522e6633818bebc1c2519b17.

It broke reboots on big-endian MIPS and MIPS64 malta QEMU instances,
which use the syscon driver. Little-endian is not effected, which means
likely it's important to handle regmap_get_val_endian() in this function
after all.

Fixes: 72a95859728a ("mfd: syscon: Remove repetition of the regmap_get_val_endian()")
Reviewed-by: Andy Shevchenko <[email protected]>
Cc: Lee Jones <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

kernel/utsname_sysctl.c: Fix hostname polling

Commit bfca3dd3d068 ("kernel/utsname_sysctl.c: print kernel arch") added
a new entry to the uts_kern_table[] array, but didn't update the
UTS_PROC_xyz enumerators of older entries, breaking anything that used
them.

Which is admittedly not many cases: it's really just the two uses of
uts_proc_notify() in kernel/sys.c. But apparently journald-systemd
actually uses this to detect hostname changes.

Reported-by: Torsten Hilbrich <[email protected]>
Fixes: bfca3dd3d068 ("kernel/utsname_sysctl.c: print kernel arch")
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://linux-regtracking.leemhuis.info/regzbot/regression/[email protected]/
Cc: Petr Vorel <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

Merge tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

- Fix raw data handling when perf events are used in bpf

- Rework how SIGTRAPs get delivered to events to address a bunch of
   problems with it. Add a selftest for that too

* tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  bpf: Fix sample_flags for bpf_perf_event_output
  selftests/perf_events: Add a SIGTRAP stress test with disables
  perf: Fix missing SIGTRAPs

Merge tag 'sched_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

- Adjust code to not trip up CFI

- Fix sched group cookie matching

* tag 'sched_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Introduce struct balance_callback to avoid CFI mismatches
sched/core: Fix comparison in sched_group_cookie_match()

Merge tag 'objtool_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Borislav Petkov:

- Fix ORC stack unwinding when GCOV is enabled

* tag 'objtool_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/unwind/orc: Fix unreliable stack dump with gcov

Merge tag 'x86_urgent_for_v6.0_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
"As usually the case, right after a major release, the tip urgent
  branches accumulate a couple more fixes than normal. And here is the
  x86, a bit bigger, urgent pile.

   - Use the correct CPU capability clearing function on the error path
     in Intel perf LBR

   - A CFI fix to ftrace along with a simplification

   - Adjust handling of zero capacity bit mask for resctrl cache
     allocation on AMD

   - A fix to the AMD microcode loader to attempt patch application on
     every logical thread

   - A couple of topology fixes to handle CPUID leaf 0x1f enumeration
     info properly

   - Drop a -mabi=ms compiler option check as both compilers support it
     now anyway

   - A couple of fixes to how the initial, statically allocated FPU
     buffer state is setup and its interaction with dynamic states at
     runtime"

* tag 'x86_urgent_for_v6.0_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly
  perf/x86/intel/lbr: Use setup_clear_cpu_cap() instead of clear_cpu_cap()
  ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph()
  x86/ftrace: Remove ftrace_epilogue()
  x86/resctrl: Fix min_cbm_bits for AMD
  x86/microcode/AMD: Apply the patch early on every logical thread
  x86/topology: Fix duplicated core ID within a package
  x86/topology: Fix multiple packages shown on a single-package system
  hwmon/coretemp: Handle large core ID value
  x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB
  x86/fpu: Exclude dynamic states from init_fpstate
  x86/fpu: Fix the init_fpstate size check with the actual size
  x86/fpu: Configure init_fpstate attributes orderly

Merge tag 'io_uring-6.1-2022-10-22' of git://git.kernel.dk/linux

Pull io_uring follow-up from Jens Axboe:
"Currently the zero-copy has automatic fallback to normal transmit, and
  it was decided that it'd be cleaner to return an error instead if the
  socket type doesn't support it.

  Zero-copy does work with UDP and TCP, it's more of a future proofing
  kind of thing (eg for samba)"

* tag 'io_uring-6.1-2022-10-22' of git://git.kernel.dk/linux:
  io_uring/net: fail zc sendmsg when unsupported by socket
  io_uring/net: fail zc send when unsupported by socket
  net: flag sockets supporting msghdr originated zerocopy

Merge tag 'hwmon-for-v6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- corsair-psu: Fix typo in USB id description, and add USB ID for new
   PSU

- pwm-fan: Fix fan power handling when disabling fan control

* tag 'hwmon-for-v6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (corsair-psu) Add USB id of the new HX1500i psu
  hwmon: (pwm-fan) Explicitly switch off fan power when setting pwm1_enable to 0
  hwmon: (corsair-psu) fix typo in USB id description

Merge tag 'i2c-for-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"RPM fix for qcom-cci, platform module alias for xiic, build warning
  fix for mlxbf, typo fixes in comments"

* tag 'i2c-for-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mlxbf: depend on ACPI; clean away ifdeffage
  i2c: fix spelling typos in comments
  i2c: qcom-cci: Fix ordering of pm_runtime_xx and i2c_add_adapter
  i2c: xiic: Add platform module alias

Merge tag 'pci-v6.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci fixes from Bjorn Helgaas:

- Revert a simplification that broke pci-tegra due to a masking error

- Update MAINTAINERS for Kishon's email address change and TI
   DRA7XX/J721E maintainer change

* tag 'pci-v6.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  MAINTAINERS: Update Kishon's email address in PCI endpoint subsystem
  MAINTAINERS: Add Vignesh Raghavendra as maintainer of TI DRA7XX/J721E PCI driver
  Revert "PCI: tegra: Use PCI_CONF1_EXT_ADDRESS() macro"

Merge tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull missed media updates from Mauro Carvalho Chehab:
"It seems I screwed-up my previous pull request: it ends up that only
  half of the media patches that were in linux-next got merged in -rc1.

  The script which creates the signed tags silently failed due to
  5.19->6.0 so it ended generating a tag with incomplete stuff.

  So here are the missing parts:

   - a DVB core security fix

   - lots of fixes and cleanups for atomisp staging driver

   - old drivers that are VB1 are being moved to staging to be
     deprecated

   - several driver updates - mostly for embedded systems, but there are
     also some things addressing issues with some PC webcams, in the UVC
     video driver"

* tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (163 commits)
  media: sun6i-csi: Move csi buffer definition to main header file
  media: sun6i-csi: Introduce and use video helper functions
  media: sun6i-csi: Add media ops with link notify callback
  media: sun6i-csi: Remove controls handler from the driver
  media: sun6i-csi: Register the media device after creation
  media: sun6i-csi: Pass and store csi device directly in video code
  media: sun6i-csi: Tidy up video code
  media: sun6i-csi: Tidy up v4l2 code
  media: sun6i-csi: Tidy up Kconfig
  media: sun6i-csi: Use runtime pm for clocks and reset
  media: sun6i-csi: Define and use variant to get module clock rate
  media: sun6i-csi: Always set exclusive module clock rate
  media: sun6i-csi: Tidy up platform code
  media: sun6i-csi: Refactor main driver data structures
  media: sun6i-csi: Define and use driver name and (reworked) description
  media: cedrus: Add a Kconfig dependency on RESET_CONTROLLER
  media: sun8i-rotate: Add a Kconfig dependency on RESET_CONTROLLER
  media: sun8i-di: Add a Kconfig dependency on RESET_CONTROLLER
  media: sun4i-csi: Add a Kconfig dependency on RESET_CONTROLLER
  media: sun6i-csi: Add a Kconfig dependency on RESET_CONTROLLER
  ...

io_uring/net: fail zc sendmsg when unsupported by socket

The previous patch fails zerocopy send requests for protocols that don't
support it, do the same for zerocopy sendmsg.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/0854e7bb4c3d810a48ec8b5853e2f61af36a0467.1666346426.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

io_uring/net: fail zc send when unsupported by socket

If a protocol doesn't support zerocopy it will silently fall back to
copying. This type of behaviour has always been a source of troubles
so it's better to fail such requests instead.

Cc: <[email protected]> # 6.0
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/2db3c7f16bb6efab4b04569cd16e6242b40c5cb3.1666346426.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

net: flag sockets supporting msghdr originated zerocopy

We need an efficient way in io_uring to check whether a socket supports
zerocopy with msghdr provided ubuf_info. Add a new flag into the struct
socket flags fields.

Cc: <[email protected]> # 6.0
Signed-off-by: Pavel Begunkov <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Link: https://lore.kernel.org/r/3dafafab822b1c66308bb58a0ac738b1e3f53f74.1666346426.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

hwmon: (corsair-psu) Add USB id of the new HX1500i psu

Also update the documentation accordingly.

Signed-off-by: Wilken Gottwalt <[email protected]>
Link: https://lore.kernel.org/r/Y0FghqQCHG/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>

tools: include: sync include/api/linux/kvm.h

Provide a definition of KVM_CAP_DIRTY_LOG_RING_ACQ_REL.

Fixes: 17601bfed909 ("KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option")
Cc: Marc Zyngier <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER

The KVM_X86_SET_MSR_FILTER ioctls contains a pointer in the passed in
struct which means it has a different struct size depending on whether
it gets called from 32bit or 64bit code.

This patch introduces compat code that converts from the 32bit struct to
its 64bit counterpart which then gets used going forward internally.
With this applied, 32bit QEMU can successfully set MSR bitmaps when
running on 64bit kernels.

Reported-by: Andrew Randrianasulu <[email protected]>
Fixes: 1a155254ff937 ("KVM: x86: Introduce MSR filtering")
Signed-off-by: Alexander Graf <[email protected]>
Message-Id: <20221017184541 [email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter()

In the next patch we want to introduce a second caller to
set_msr_filter() which constructs its own filter list on the stack.
Refactor the original function so it takes it as argument instead of
reading it through copy_from_user().

Signed-off-by: Alexander Graf <[email protected]>
Message-Id: <20221017184541 [email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

kvm: Add support for arch compat vm ioctls

We will introduce the first architecture specific compat vm ioctl in the
next patch. Add all necessary boilerplate to allow architectures to
override compat vm ioctls when necessary.

Signed-off-by: Alexander Graf <[email protected]>
Message-Id: <20221017184541 [email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>

Merge tag 'kvm-riscv-fixes-6.1-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 6.1, take #1

- Fix compilation without RISCV_ISA_ZICBOM
- Fix kvm_riscv_vcpu_timer_pending() for Sstc

Merge tag 'kvmarm-fixes-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.1, take #2

- Fix a bug preventing restoring an ITS containing mappings
for very large and very sparse device topology

- Work around a relocation handling error when compiling
the nVHE object with profile optimisation

Merge tag 'kvmarm-fixes-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.1, take #1

- Fix for stage-2 invalidation holding the VM MMU lock
for too long by limiting the walk to the largest
block mapping size

- Enable stack protection and branch profiling for VHE

- Two selftest fixes

Merge tag 'thermal-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fix from Rafael Wysocki:
"This fixes the control CPU selection in the intel_powerclamp thermal
driver"

* tag 'thermal-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel_powerclamp: Use first online CPU as control_cpu

Merge tag 'pm-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix some issues and clean up code in ARM cpufreq drivers.

  Specifics:

   - Fix module loading in the Tegra124 cpufreq driver (Jon Hunter)

   - Fix memory leak and update to read-only region in the qcom cpufreq
     driver (Fabien Parent)

   - Miscellaneous minor cleanups to cpufreq drivers (Fabien Parent,
     Yang Yingliang)"

* tag 'pm-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: sun50i: Switch to use dev_err_probe() helper
  cpufreq: qcom-nvmem: Switch to use dev_err_probe() helper
  cpufreq: imx6q: Switch to use dev_err_probe() helper
  cpufreq: dt: Switch to use dev_err_probe() helper
  cpufreq: qcom: remove unused parameter in function definition
  cpufreq: qcom: fix writes in read-only memory region
  cpufreq: qcom: fix memory leak in error path
  cpufreq: tegra194: Fix module loading

Merge tag 'acpi-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
"These fix issues introduced during this merge window (ACPI/PCI, device
  enumeration and documentation) and some other ones found recently.

  Specifics:

   - Add missing device reference counting to acpi_get_pci_dev() after
     changing it recently (Rafael Wysocki)

   - Fix resource list walk in acpi_dma_get_range() (Robin Murphy)

   - Add IRQ override quirk for LENOVO IdeaPad and extend the IRQ
     override warning message (Jiri Slaby)

   - Fix integer overflow in ghes_estatus_pool_init() (Ashish Kalra)

   - Fix multiple error records handling in one of the ACPI extlog
     driver code paths (Tony Luck)

   - Prune DSDT override documentation from index after dropping it
     (Bagas Sanjaya)"

* tag 'acpi-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: scan: Fix DMA range assignment
  ACPI: PCI: Fix device reference counting in acpi_get_pci_dev()
  ACPI: resource: note more about IRQ override
  ACPI: resource: do IRQ override on LENOVO IdeaPad
  ACPI: extlog: Handle multiple records
  ACPI: APEI: Fix integer overflow in ghes_estatus_pool_init()
  Documentation: ACPI: Prune DSDT override documentation from index

Merge tag 'efi-fixes-for-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

- fixes for the EFI variable store refactor that landed in v6.0

- fixes for issues that were introduced during the merge window

- back out some changes related to EFI zboot signing - we'll add a
   better solution for this during the next cycle

* tag 'efi-fixes-for-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: runtime: Don't assume virtual mappings are missing if VA == PA == 0
  efi: libstub: Fix incorrect payload size in zboot header
  efi: libstub: Give efi_main() asmlinkage qualification
  efi: efivars: Fix variable writes without query_variable_store()
  efi: ssdt: Don't free memory if ACPI table was loaded successfully
  efi: libstub: Remove zboot signing from build options

Merge tag 'iommu-fixes-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:
"Intel VT-d fixes:

   - Fix a lockdep splat issue in intel_iommu_init()

   - Allow NVS regions to pass RMRR check

   - Domain cleanup in error path"

* tag 'iommu-fixes-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Clean up si_domain in the init_dmars() error path
  iommu/vt-d: Allow NVS regions in arch_rmrr_sanity_check()
  iommu/vt-d: Use rcu_lock in get_resv_regions
  iommu: Add gfp parameter to iommu_alloc_resv_region

Merge tag 'for-linus-2022102101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Benjamin Tissoires:

- a 12 year old bug fix for the Apple Magic Trackpad v1 (José Expósito)

- a fix for a potential crash on removal of the Playstation controllers
   (Roderick Colenbrander)

- a few new device IDs and device-specific quirks, most notably support
   of the new Playstation DualSense Edge controller

* tag 'for-linus-2022102101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: lenovo: Make array tp10ubkbd_led static const
  HID: saitek: add madcatz variant of MMO7 mouse device ID
  HID: playstation: support updated DualSense rumble mode.
  HID: playstation: add initial DualSense Edge controller support
  HID: playstation: stop DualSense output work on remove.
  HID: magicmouse: Do not set BTN_MOUSE on double report

Merge tag '6.1-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:

- memory leak fixes

- fixes for directory leases, including an important one which fixes a
   problem noticed by git functional tests

- fixes relating to missing free_xid calls (helpful for
   tracing/debugging of entry/exit into cifs.ko)

- a multichannel fix

- a small cleanup fix (use of list_move instead of list_del/list_add)

* tag '6.1-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module number
  cifs: fix memory leaks in session setup
  cifs: drop the lease for cached directories on rmdir or rename
  smb3: interface count displayed incorrectly
  cifs: Fix memory leak when build ntlmssp negotiate blob failed
  cifs: set rc to -ENOENT if we can not get a dentry for the cached dir
  cifs: use LIST_HEAD() and list_move() to simplify code
  cifs: Fix xid leak in cifs_get_file_info_unix()
  cifs: Fix xid leak in cifs_ses_add_channel()
  cifs: Fix xid leak in cifs_flock()
  cifs: Fix xid leak in cifs_copy_file_range()
  cifs: Fix xid leak in cifs_create()

Merge tag 'nfsd-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:
"Fixes for patches merged in v6.1"

* tag 'nfsd-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: ensure we always call fh_verify_error tracepoint
NFSD: unregister shrinker when nfsd_init_net() fails

x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly

When an extended state component is not present in fpstate, but in init
state, the function copies from init_fpstate via copy_feature().

But, dynamic states are not present in init_fpstate because of all-zeros
init states. Then retrieving them from init_fpstate will explode like this:

BUG: kernel NULL pointer dereference, address: 0000000000000000
...
RIP: 0010:memcpy_erms+0x6/0x10
  ? __copy_xstate_to_uabi_buf+0x381/0x870
  fpu_copy_guest_fpstate_to_uabi+0x28/0x80
  kvm_arch_vcpu_ioctl+0x14c/0x1460 [kvm]
  ? __this_cpu_preempt_check+0x13/0x20
  ? vmx_vcpu_put+0x2e/0x260 [kvm_intel]
  kvm_vcpu_ioctl+0xea/0x6b0 [kvm]
  ? kvm_vcpu_ioctl+0xea/0x6b0 [kvm]
  ? __fget_light+0xd4/0x130
  __x64_sys_ioctl+0xe3/0x910
  ? debug_smp_processor_id+0x17/0x20
  ? fpregs_assert_state_consistent+0x27/0x50
  do_syscall_64+0x3f/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Adjust the 'mask' to zero out the userspace buffer for the features that
are not available both from fpstate and from init_fpstate.

The dynamic features depend on the compacted XSAVE format. Ensure it is
enabled before reading XCOMP_BV in init_fpstate.

Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
Reported-by: Yuan Yao <[email protected]>
Suggested-by: Dave Hansen <[email protected]>
Signed-off-by: Chang S. Bae <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Tested-by: Yuan Yao <[email protected]>
Link: https://lore.kernel.org/lkml/BYAPR11MB3717EDEF2351C958F2C86EED95259@BYAPR11MB3717.namprd11.prod.outlook.com/
Link: https://lkml.kernel.org/r/[email protected]

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Two small changes, one in the lpfc driver and the other in the core.

  The core change is an additional footgun guard which prevents users
  from writing the wrong state to sysfs and causing a hang"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: lpfc: Fix memory leak in lpfc_create_port()
  scsi: core: Restrict legal sdev_state transitions via sysfs

Merge tag 'block-6.1-2022-10-20' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

- NVMe pull request via Christoph:
      - fix nvme-hwmon for DMA non-cohehrent architectures (Serge Semin)
      - add a nvme-hwmong maintainer (Christoph Hellwig)
      - fix error pointer dereference in error handling (Dan Carpenter)
      - fix invalid memory reference in nvmet_subsys_attr_qid_max_show
        (Daniel Wagner)
      - don't limit the DMA segment size in nvme-apple (Russell King)
      - fix workqueue MEM_RECLAIM flushing dependency (Sagi Grimberg)
      - disable write zeroes on various Kingston SSDs (Xander Li)

- fix a memory leak with block device tracing (Ye)

- flexible-array fix for ublk (Yushan)

- document the ublk recovery feature from this merge window
   (ZiyangZhang)

- remove dead bfq variable in struct (Yuwei)

- error handling rq clearing fix (Yu)

- add an IRQ safety check for the cached bio freeing (Pavel)

- drbd bio cloning fix (Christoph)

* tag 'block-6.1-2022-10-20' of git://git.kernel.dk/linux:
  blktrace: remove unnessary stop block trace in 'blk_trace_shutdown'
  blktrace: fix possible memleak in '__blk_trace_remove'
  blktrace: introduce 'blk_trace_{start,stop}' helper
  bio: safeguard REQ_ALLOC_CACHE bio put
  block, bfq: remove unused variable for bfq_queue
  drbd: only clone bio if we have a backing device
  ublk_drv: use flexible-array member instead of zero-length array
  nvmet: fix invalid memory reference in nvmet_subsys_attr_qid_max_show
  nvmet: fix workqueue MEM_RECLAIM flushing dependency
  nvme-hwmon: kmalloc the NVME SMART log buffer
  nvme-hwmon: consistently ignore errors from nvme_hwmon_init
  nvme: add Guenther as nvme-hwmon maintainer
  nvme-apple: don't limit DMA segement size
  nvme-pci: disable write zeroes on various Kingston SSD
  nvme: fix error pointer dereference in error handling
  Documentation: document ublk user recovery feature
  blk-mq: fix null pointer dereference in blk_mq_clear_rq_mapping()

Merge tag 'io_uring-6.1-2022-10-20' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

- Fix a potential memory leak in the error handling path of io-wq setup
   (Rafael)

- Kill an errant debug statement that got added in this release (me)

- Fix an oops with an invalid direct descriptor with IORING_OP_MSG_RING
   (Harshit)

- Remove unneeded FFS_SCM flagging (Pavel)

- Remove polling off the exit path (Pavel)

- Move out direct descriptor debug check to the cleanup path (Pavel)

- Use the proper helper rather than open-coding cached request get
   (Pavel)

* tag 'io_uring-6.1-2022-10-20' of git://git.kernel.dk/linux:
  io-wq: Fix memory leak in worker creation
  io_uring/msg_ring: Fix NULL pointer dereference in io_msg_send_fd()
  io_uring/rw: remove leftover debug statement
  io_uring: don't iopoll from io_ring_ctx_wait_and_kill()
  io_uring: reuse io_alloc_req()
  io_uring: kill hot path fixed file bitmap debug checks
  io_uring: remove FFS_SCM

Merge tag 'for-linus-6.1-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
"Just two fixes for the new 'virtio with grants' feature"

* tag 'for-linus-6.1-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/virtio: Convert PAGE_SIZE/PAGE_SHIFT/PFN_UP to Xen counterparts
xen/virtio: Handle cases when page offset > PAGE_SIZE properly

Merge tag 'selinux-pr-20221020' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
"A small SELinux fix for a GFP_KERNEL allocation while a spinlock is
  held.

  The patch, while still fairly small, is a bit larger than one might
  expect from a simple s/GFP_KERNEL/GFP_ATOMIC/ conversion because we
  added support for the function to be called with different gfp flags
  depending on the context, preserving GFP_KERNEL for those cases that
  can safely sleep"

* tag 'selinux-pr-20221020' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: enable use of both GFP_KERNEL and GFP_ATOMIC in convert_context()

Merge tag 'mm-hotfixes-stable-2022-10-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morron:
"Seventeen hotfixes, mainly for MM.

  Five are cc:stable and the remainder address post-6.0 issues"

* tag 'mm-hotfixes-stable-2022-10-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  nouveau: fix migrate_to_ram() for faulting page
  mm/huge_memory: do not clobber swp_entry_t during THP split
  hugetlb: fix memory leak associated with vma_lock structure
  mm/page_alloc: reduce potential fragmentation in make_alloc_exact()
  mm: /proc/pid/smaps_rollup: fix maple tree search
  mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages
  mm/mmap: fix MAP_FIXED address return on VMA merge
  mm/mmap.c: __vma_adjust(): suppress uninitialized var warning
  mm/mmap: undo ->mmap() when mas_preallocate() fails
  init: Kconfig: fix spelling mistake "satify" -> "satisfy"
  ocfs2: clear dinode links count in case of error
  ocfs2: fix BUG when iput after ocfs2_mknod fails
  gcov: support GCC 12.1 and newer compilers
  zsmalloc: zs_destroy_pool: add size_class NULL check
  mm/mempolicy: fix mbind_range() arguments to vma_merge()
  mailmap: update email for Qais Yousef
  mailmap: update Dan Carpenter's email address

Merge tag 'trace-tools-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing tool update from Steven Rostedt:

- Make dot2c generate monitor's automata definition static

* tag 'trace-tools-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rv/dot2c: Make automaton definition static

Merge tag 'linux-watchdog-6.1-rc2' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

- Add tracing events for the most common watchdog events

* tag 'linux-watchdog-6.1-rc2' of git://www.linux-watchdog.org/linux-watchdog:
watchdog: Add tracing events for the most usual watchdog events

Merge branches 'acpi-scan', 'acpi-resource', 'acpi-apei', 'acpi-extlog' and 'acpi-docs'

Merge assorted ACPI fixes for 6.1-rc2:

- Fix resource list walk in acpi_dma_get_range() (Robin Murphy).

- Add IRQ override quirk for LENOVO IdeaPad and extend the IRQ
   override warning message (Jiri Slaby).

- Fix integer overflow in ghes_estatus_pool_init() (Ashish Kalra).

- Fix multiple error records handling in one of the ACPI extlog driver
   code paths (Tony Luck).

- Prune DSDT override documentation from index after dropping it (Bagas
   Sanjaya).

* acpi-scan:
  ACPI: scan: Fix DMA range assignment

* acpi-resource:
  ACPI: resource: note more about IRQ override
  ACPI: resource: do IRQ override on LENOVO IdeaPad

* acpi-apei:
  ACPI: APEI: Fix integer overflow in ghes_estatus_pool_init()

* acpi-extlog:
  ACPI: extlog: Handle multiple records

* acpi-docs:
  Documentation: ACPI: Prune DSDT override documentation from index

x86/unwind/orc: Fix unreliable stack dump with gcov

When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL
enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder,
causing the stack trace to show all text addresses as unreliable:

  # echo l > /proc/sysrq-trigger
  [  477.521031] sysrq: Show backtrace of all active CPUs
  [  477.523813] NMI backtrace for cpu 0
  [  477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65
  [  477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
  [  477.526439] Call Trace:
  [  477.526854]  <TASK>
  [  477.527216]  ? dump_stack_lvl+0xc7/0x114
  [  477.527801]  ? dump_stack+0x13/0x1f
  [  477.528331]  ? nmi_cpu_backtrace.cold+0xb5/0x10d
  [  477.528998]  ? lapic_can_unplug_cpu+0xa0/0xa0
  [  477.529641]  ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0
  [  477.530393]  ? arch_trigger_cpumask_backtrace+0x1d/0x30
  [  477.531136]  ? sysrq_handle_showallcpus+0x1b/0x30
  [  477.531818]  ? __handle_sysrq.cold+0x4e/0x1ae
  [  477.532451]  ? write_sysrq_trigger+0x63/0x80
  [  477.533080]  ? proc_reg_write+0x92/0x110
  [  477.533663]  ? vfs_write+0x174/0x530
  [  477.534265]  ? handle_mm_fault+0x16f/0x500
  [  477.534940]  ? ksys_write+0x7b/0x170
  [  477.535543]  ? __x64_sys_write+0x1d/0x30
  [  477.536191]  ? do_syscall_64+0x6b/0x100
  [  477.536809]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
  [  477.537609]  </TASK>

This happens when the compiled code for show_stack() has a single word
on the stack, and doesn't use a tail call to show_stack_log_lvl().
(CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.)  Then the
__unwind_start() skip logic hits an off-by-one bug and fails to unwind
all the way to the intended starting frame.

Fix it by reverting the following commit:

  f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")

The original justification for that commit no longer exists.  That
original issue was later fixed in a different way, with the following
commit:

  f2ac57a4c49d ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels")

Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
Signed-off-by: Chen Zhongjin <[email protected]>
[jpoimboe: rewrite commit log]
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>

efi: runtime: Don't assume virtual mappings are missing if VA == PA == 0

The generic EFI stub can be instructed to avoid SetVirtualAddressMap(),
and simply run with the firmware's 1:1 mapping. In this case, it
populates the virtual address fields of the runtime regions in the
memory map with the physical address of each region, so that the mapping
code has to be none the wiser. Only if SetVirtualAddressMap() fails, the
virtual addresses are wiped and the kernel code knows that the regions
cannot be mapped.

However, wiping amounts to setting it to zero, and if a runtime region
happens to live at physical address 0, its valid 1:1 mapped virtual
address could be mistaken for a wiped field, resulting on loss of access
to the EFI services at runtime.

So let's only assume that VA == 0 means 'no runtime services' if the
region in question does not live at PA 0x0.

Signed-off-by: Ard Biesheuvel <[email protected]>

efi: libstub: Fix incorrect payload size in zboot header

The linker script symbol definition that captures the size of the
compressed payload inside the zboot decompressor (which is exposed via
the image header) refers to '.' for the end of the region, which does
not give the correct result as the expression is not placed at the end
of the payload. So use the symbol name explicitly.

Signed-off-by: Ard Biesheuvel <[email protected]>

efi: libstub: Give efi_main() asmlinkage qualification

To stop the bots from sending sparse warnings to me and the list about
efi_main() not having a prototype, decorate it with asmlinkage so that
it is clear that it is called from assembly, and therefore needs to
remain external, even if it is never declared in a header file.

Signed-off-by: Ard Biesheuvel <[email protected]>

efi: efivars: Fix variable writes without query_variable_store()

Commit bbc6d2c6ef22 ("efi: vars: Switch to new wrapper layer")
refactored the efivars layer so that the 'business logic' related to
which UEFI variables affect the boot flow in which way could be moved
out of it, and into the efivarfs driver.

This inadvertently broke setting variables on firmware implementations
that lack the QueryVariableInfo() boot service, because we no longer
tolerate a EFI_UNSUPPORTED result from check_var_size() when calling
efivar_entry_set_get_size(), which now ends up calling check_var_size()
a second time inadvertently.

If QueryVariableInfo() is missing, we support writes of up to 64k -
let's move that logic into check_var_size(), and drop the redundant
call.

Cc: <[email protected]> # v6.0
Fixes: bbc6d2c6ef22 ("efi: vars: Switch to new wrapper layer")
Signed-off-by: Ard Biesheuvel <[email protected]>

efi: ssdt: Don't free memory if ACPI table was loaded successfully

Amadeusz reports KASAN use-after-free errors introduced by commit
3881ee0b1edc ("efi: avoid efivars layer when loading SSDTs from
variables"). The problem appears to be that the memory that holds the
new ACPI table is now freed unconditionally, instead of only when the
ACPI core reported a failure to load the table.

So let's fix this, by omitting the kfree() on success.

Cc: <[email protected]> # v6.0
Link: https://lore.kernel.org/all/[email protected]/
Fixes: 3881ee0b1edc ("efi: avoid efivars layer when loading SSDTs from variables")
Reported-by: Amadeusz Sławiński <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>

efi: libstub: Remove zboot signing from build options

The zboot decompressor series introduced a feature to sign the PE/COFF
kernel image for secure boot as part of the kernel build. This was
necessary because there are actually two images that need to be signed:
the kernel with the EFI stub attached, and the decompressor application.

This is a bit of a burden, because it means that the images must be
signed on the the same system that performs the build, and this is not
realistic for distros.

During the next cycle, we will introduce changes to the zboot code so
that the inner image no longer needs to be signed. This means that the
outer PE/COFF image can be handled as usual, and be signed later in the
release process.

Let's remove the associated Kconfig options now so that they don't end
up in a LTS release while already being deprecated.

Signed-off-by: Ard Biesheuvel <[email protected]>

iommu/vt-d: Clean up si_domain in the init_dmars() error path

A splat from kmem_cache_destroy() was seen with a kernel prior to
commit ee2653bbe89d ("iommu/vt-d: Remove domain and devinfo mempool")
when there was a failure in init_dmars(), because the iommu_domain
cache still had objects. While the mempool code is now gone, there
still is a leak of the si_domain memory if init_dmars() fails. So
clean up si_domain in the init_dmars() error path.

Cc: Lu Baolu <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Robin Murphy <[email protected]>
Fixes: 86080ccc223a ("iommu/vt-d: Allocate si_domain in init_dmars()")
Signed-off-by: Jerry Snitselaar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>

iommu/vt-d: Allow NVS regions in arch_rmrr_sanity_check()

arch_rmrr_sanity_check() warns if the RMRR is not covered by an ACPI
Reserved region, but it seems like it should accept an NVS region as
well. The ACPI spec
https://uefi.org/specs/ACPI/6.5/15_System_Address_Map_Interfaces.html
uses similar wording for "Reserved" and "NVS" region types; for NVS
regions it says "This range of addresses is in use or reserved by the
system and must not be used by the operating system."

There is an old comment on this mailing list that also suggests NVS
regions should pass the arch_rmrr_sanity_check() test:

The warnings come from arch_rmrr_sanity_check() since it checks whether
the region is E820_TYPE_RESERVED. However, if the purpose of the check
is to detect RMRR has regions that may be used by OS as free memory,
isn't  E820_TYPE_NVS safe, too?

This patch overlaps with another proposed patch that would add the region
type to the log since sometimes the bug reporter sees this log on the
console but doesn't know to include the kernel log:

https://lore.kernel.org/lkml/20220611204859 [email protected]/

Here's an example of the "Firmware Bug" apparent false positive (wrapped
for line length):

DMAR: [Firmware Bug]: No firmware reserved region can cover this RMRR
       [0x000000006f760000-0x000000006f762fff], contact BIOS vendor for
       fixes
DMAR: [Firmware Bug]: Your BIOS is broken; bad RMRR
       [0x000000006f760000-0x000000006f762fff]

This is the snippet from the e820 table:

BIOS-e820: [mem 0x0000000068bff000-0x000000006ebfefff] reserved
BIOS-e820: [mem 0x000000006ebff000-0x000000006f9fefff] ACPI NVS
BIOS-e820: [mem 0x000000006f9ff000-0x000000006fffefff] ACPI data

Fixes: f036c7fa0ab6 ("iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reserved")
Cc: Will Mortensen <[email protected]>
Link: https://lore.kernel.org/linux-iommu/[email protected]/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216443
Signed-off-by: Charlotte Tan <[email protected]>
Reviewed-by: Aaron Tomlin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>

iommu/vt-d: Use rcu_lock in get_resv_regions

Commit 5f64ce5411b46 ("iommu/vt-d: Duplicate iommu_resv_region objects
per device list") converted rcu_lock in get_resv_regions to
dmar_global_lock to allow sleeping in iommu_alloc_resv_region(). This
introduced possible recursive locking if get_resv_regions is called from
within a section where intel_iommu_init() already holds dmar_global_lock.

Especially, after commit 57365a04c921 ("iommu: Move bus setup to IOMMU
device registration"), below lockdep splats could always be seen.

============================================
WARNING: possible recursive locking detected
6.0.0-rc4+ #325 Tainted: G          I
--------------------------------------------
swapper/0/1 is trying to acquire lock:
ffffffffa8a18c90 (dmar_global_lock){++++}-{3:3}, at:
intel_iommu_get_resv_regions+0x25/0x270

but task is already holding lock:
ffffffffa8a18c90 (dmar_global_lock){++++}-{3:3}, at:
intel_iommu_init+0x36d/0x6ea

...

Call Trace:
  <TASK>
  dump_stack_lvl+0x48/0x5f
  __lock_acquire.cold.73+0xad/0x2bb
  lock_acquire+0xc2/0x2e0
  ? intel_iommu_get_resv_regions+0x25/0x270
  ? lock_is_held_type+0x9d/0x110
  down_read+0x42/0x150
  ? intel_iommu_get_resv_regions+0x25/0x270
  intel_iommu_get_resv_regions+0x25/0x270
  iommu_create_device_direct_mappings.isra.28+0x8d/0x1c0
  ? iommu_get_dma_cookie+0x6d/0x90
  bus_iommu_probe+0x19f/0x2e0
  iommu_device_register+0xd4/0x130
  intel_iommu_init+0x3e1/0x6ea
  ? iommu_setup+0x289/0x289
  ? rdinit_setup+0x34/0x34
  pci_iommu_init+0x12/0x3a
  do_one_initcall+0x65/0x320
  ? rdinit_setup+0x34/0x34
  ? rcu_read_lock_sched_held+0x5a/0x80
  kernel_init_freeable+0x28a/0x2f3
  ? rest_init+0x1b0/0x1b0
  kernel_init+0x1a/0x130
  ret_from_fork+0x1f/0x30
  </TASK>

This rolls back dmar_global_lock to rcu_lock in get_resv_regions to avoid
the lockdep splat.

Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration")
Signed-off-by: Lu Baolu <[email protected]>
Tested-by: Alex Williamson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>

iommu: Add gfp parameter to iommu_alloc_resv_region

Add gfp parameter to iommu_alloc_resv_region() for the callers to specify
the memory allocation behavior. Thus iommu_alloc_resv_region() could also
be available in critical contexts.

Signed-off-by: Lu Baolu <[email protected]>
Tested-by: Alex Williamson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>

RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc

The kvm_riscv_vcpu_timer_pending() checks per-VCPU next_cycles
and per-VCPU software injected VS timer interrupt. This function
returns incorrect value when Sstc is available because the per-VCPU
next_cycles are only updated by kvm_riscv_vcpu_timer_save() called
from kvm_arch_vcpu_put(). As a result, when Sstc is available the
VCPU does not block properly upon WFI traps.

To fix the above issue, we introduce kvm_riscv_vcpu_timer_sync()
which will update per-VCPU next_cycles upon every VM exit instead
of kvm_riscv_vcpu_timer_save().

Fixes: 8f5cb44b1bae ("RISC-V: KVM: Support sstc extension")
Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Atish Patra <[email protected]>
Signed-off-by: Anup Patel <[email protected]>

RISC-V: Fix compilation without RISCV_ISA_ZICBOM

riscv_cbom_block_size and riscv_init_cbom_blocksize() should always
be available and riscv_init_cbom_blocksize() should always be
invoked, even when compiling without RISCV_ISA_ZICBOM enabled. This
is because disabling RISCV_ISA_ZICBOM means "don't use zicbom
instructions in the kernel" not "pretend there isn't zicbom, even
when there is". When zicbom is available, whether the kernel enables
its use with RISCV_ISA_ZICBOM or not, KVM will offer it to guests.
Ensure we can build KVM and that the block size is initialized even
when compiling without RISCV_ISA_ZICBOM.

Fixes: 8f7e001e0325 ("RISC-V: Clean up the Zicbom block size probing")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Andrew Jones <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Reviewed-by: Heiko Stuebner <[email protected]>
Tested-by: Heiko Stuebner <[email protected]>
Signed-off-by: Anup Patel <[email protected]>

i2c: mlxbf: depend on ACPI; clean away ifdeffage

This fixes maybe_unused warnings/errors.

According to a comment during device tree removal, only ACPI is supported,
thus let's actually require it.

Fixes: be18c5ede25d ("i2c: mlxbf: remove device tree support")
Signed-off-by: Adam Borowski <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>

nouveau: fix migrate_to_ram() for faulting page

Commit 16ce101db85d ("mm/memory.c: fix race when faulting a device private
page") changed the migrate_to_ram() callback to take a reference on the
device page to ensure it can't be freed while handling the fault.
Unfortunately the corresponding update to Nouveau to accommodate this
change was inadvertently dropped from that patch causing GPU to CPU
migration to fail so add it here.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 16ce101db85d ("mm/memory.c: fix race when faulting a device private page")
Signed-off-by: Alistair Popple <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Lyude Paul <[email protected]>
Cc: Ben Skeggs <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mm/huge_memory: do not clobber swp_entry_t during THP split

The following has been observed when running stressng mmap since commit
b653db77350c ("mm: Clear page->private when splitting or migrating a page")

   watchdog: BUG: soft lockup - CPU#75 stuck for 26s! [stress-ng:9546]
   CPU: 75 PID: 9546 Comm: stress-ng Tainted: G            E      6.0.0-revert-b653db77-fix+ #29 0357d79b60fb09775f678e4f3f64ef0579ad1374
   Hardware name: SGI.COM C2112-4GP3/X10DRT-P-Series, BIOS 2.0a 05/09/2016
   RIP: 0010:xas_descend+0x28/0x80
   Code: cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 75 08 <48> 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc 48 c1 e8 02 89 c2
   RSP: 0018:ffffbbf02a2236a8 EFLAGS: 00000246
   RAX: ffff9cab7d6a0002 RBX: ffffe04b0af88040 RCX: 0000000000000002
   RDX: 0000000000000030 RSI: ffff9cab60509b60 RDI: ffffbbf02a2236c0
   RBP: 0000000000000000 R08: ffff9cab60509b60 R09: ffffbbf02a2236c0
   R10: 0000000000000001 R11: ffffbbf02a223698 R12: 0000000000000000
   R13: ffff9cab4e28da80 R14: 0000000000039c01 R15: ffff9cab4e28da88
   FS:  00007fab89b85e40(0000) GS:ffff9cea3fcc0000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 00007fab84e00000 CR3: 00000040b73a4003 CR4: 00000000003706e0
   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
   Call Trace:
    <TASK>
    xas_load+0x3a/0x50
    __filemap_get_folio+0x80/0x370
    ? put_swap_page+0x163/0x360
    pagecache_get_page+0x13/0x90
    __try_to_reclaim_swap+0x50/0x190
    scan_swap_map_slots+0x31e/0x670
    get_swap_pages+0x226/0x3c0
    folio_alloc_swap+0x1cc/0x240
    add_to_swap+0x14/0x70
    shrink_page_list+0x968/0xbc0
    reclaim_page_list+0x70/0xf0
    reclaim_pages+0xdd/0x120
    madvise_cold_or_pageout_pte_range+0x814/0xf30
    walk_pgd_range+0x637/0xa30
    __walk_page_range+0x142/0x170
    walk_page_range+0x146/0x170
    madvise_pageout+0xb7/0x280
    ? asm_common_interrupt+0x22/0x40
    madvise_vma_behavior+0x3b7/0xac0
    ? find_vma+0x4a/0x70
    ? find_vma+0x64/0x70
    ? madvise_vma_anon_name+0x40/0x40
    madvise_walk_vmas+0xa6/0x130
    do_madvise+0x2f4/0x360
    __x64_sys_madvise+0x26/0x30
    do_syscall_64+0x5b/0x80
    ? do_syscall_64+0x67/0x80
    ? syscall_exit_to_user_mode+0x17/0x40
    ? do_syscall_64+0x67/0x80
    ? syscall_exit_to_user_mode+0x17/0x40
    ? do_syscall_64+0x67/0x80
    ? do_syscall_64+0x67/0x80
    ? common_interrupt+0x8b/0xa0
    entry_SYSCALL_64_after_hwframe+0x63/0xcd

The problem can be reproduced with the mmtests config
config-workload-stressng-mmap.  It does not always happen and when it
triggers is variable but it has happened on multiple machines.

The intent of commit b653db77350c patch was to avoid the case where
PG_private is clear but folio->private is not-NULL.  However, THP tail
pages uses page->private for "swp_entry_t if folio_test_swapcache()" as
stated in the documentation for struct folio.  This patch only clobbers
page->private for tail pages if the head page was not in swapcache and
warns once if page->private had an unexpected value.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: b653db77350c ("mm: Clear page->private when splitting or migrating a page")
Signed-off-by: Mel Gorman <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Brian Foster <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Oleksandr Natalenko <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Vitaly Wool <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

hugetlb: fix memory leak associated with vma_lock structure

The hugetlb vma_lock structure hangs off the vm_private_data pointer of
sharable hugetlb vmas.  The structure is vma specific and can not be
shared between vmas.  At fork and various other times, vmas are duplicated
via vm_area_dup().  When this happens, the pointer in the newly created
vma must be cleared and the structure reallocated.  Two hugetlb specific
routines deal with this hugetlb_dup_vma_private and hugetlb_vm_op_open.
Both routines are called for newly created vmas.  hugetlb_dup_vma_private
would always clear the pointer and hugetlb_vm_op_open would allocate the
new vms_lock structure.  This did not work in the case of this calling
sequence pointed out in [1].

  move_vma
    copy_vma
      new_vma = vm_area_dup(vma);
      new_vma->vm_ops->open(new_vma); --> new_vma has its own vma lock.
    is_vm_hugetlb_page(vma)
      clear_vma_resv_huge_pages
        hugetlb_dup_vma_private --> vma->vm_private_data is set to NULL

When clearing hugetlb_dup_vma_private we actually leak the associated
vma_lock structure.

The vma_lock structure contains a pointer to the associated vma.  This
information can be used in hugetlb_dup_vma_private and hugetlb_vm_op_open
to ensure we only clear the vm_private_data of newly created (copied)
vmas.  In such cases, the vma->vma_lock->vma field will not point to the
vma.

Update hugetlb_dup_vma_private and hugetlb_vm_op_open to not clear
vm_private_data if vma->vma_lock->vma == vma.  Also, log a warning if
hugetlb_vm_op_open ever encounters the case where vma_lock has already
been correctly allocated for the vma.

[1] https://lore.kernel.org/linux-mm/5154292a-4c55-28cd-0935-82441e512fc3@huawei.com/

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 131a79b474e9 ("hugetlb: fix vma lock handling during split vma and range unmapping")
Signed-off-by: Mike Kravetz <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: James Houghton <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mina Almasry <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Prakash Sangappa <[email protected]>
Cc: Sven Schnelle <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mm/page_alloc: reduce potential fragmentation in make_alloc_exact()

Try to avoid using the left over split page on the next request for a page
by calling __free_pages_ok() with FPI_TO_TAIL. This increases the
potential of defragmenting memory when it's used for a short period of
time.

Link: https://lkml.kernel.org/r/20220531185626.yvlmymbxyoe5vags@revolver
Signed-off-by: Liam R. Howlett <[email protected]>
Suggested-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mm: /proc/pid/smaps_rollup: fix maple tree search

/proc/pid/smaps_rollup showed 0 kB for everything: now find first vma.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: c4c84f06285e ("fs/proc/task_mmu: stop using linked list and highest_vm_end")
Signed-off-by: Hugh Dickins <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages

The h->*_huge_pages counters are protected by the hugetlb_lock, but
alloc_huge_page has a corner case where it can decrement the counter
outside of the lock.

This could lead to a corrupted value of h->resv_huge_pages, which we have
observed on our systems.

Take the hugetlb_lock before decrementing h->resv_huge_pages to avoid a
potential race.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: a88c76954804 ("mm: hugetlb: fix hugepage memory leak caused by wrong reserve count")
Signed-off-by: Rik van Riel <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Glen McCready <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mm/mmap: fix MAP_FIXED address return on VMA merge

mmap should return the start address of newly mapped area when successful.
On a successful merge of a VMA, the return address was changed and thus
was violating that expectation from userspace.

This is a restoration of functionality provided by 309d08d9b3a3
(mm/mmap.c: fix mmap return value when vma is merged after call_mmap()).
For completeness of fixing MAP_FIXED, implement the comments from the
previous discussion to never update the address and fail if the address
changes. Leaving the error as a WARN_ON() to avoid crashing the kernel.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/Y06yk66SKxlrwwfb@lakrids/
Link: https://lore.kernel.org/all/[email protected]/
Fixes: 4dd1b84140c1 ("mm/mmap: use advanced maple tree API for mmap_region()")
Signed-off-by: Liam R. Howlett <[email protected]>
Reported-by: Mark Rutland <[email protected]>
Cc: Liu Zixian <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mm/mmap.c: __vma_adjust(): suppress uninitialized var warning

The code is OK, but it fools gcc.

mm/mmap.c:802 __vma_adjust() error: uninitialized symbol 'next_next'.

Fixes: 524e00b36e8c5 ("mm: remove rb tree.")
Reported-by: kernel test robot <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mm/mmap: undo ->mmap() when mas_preallocate() fails

A memory leak in hugetlb_reserve_pages was reported in [1].  The root
cause was traced to an error path in mmap_region when mas_preallocate()
fails.  In this case, the vma is freed after a successful call to
filesystem specific mmap.  The hugetlbfs mmap routine may allocate data
structures pointed to by m_private_data.  These need to be cleaned up by
the hugetlb vm_ops->close() routine.

The same issue was addressed by commit deb0f6562884 ("mm/mmap: undo
->mmap() when arch_validate_flags() fails") for the arch_validate_flags()
test.  Go to the same close_and_free_vma label if mas_preallocate() fails.

[1] https://lore.kernel.org/linux-mm/CAKXUXMxf7OiCwbxib7MwfR4M1b5+b3cNTU7n5NV9Zm4967=FPQ@mail.gmail.com/

Link: https://lkml.kernel.org/r/[email protected]
Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
Signed-off-by: Mike Kravetz <[email protected]>
Reported-by: Lukas Bulwahn <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Carlos Llamas <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Muchun Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

init: Kconfig: fix spelling mistake "satify" -> "satisfy"

There is a spelling mistake in a Kconfig description. Fix it.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

ocfs2: clear dinode links count in case of error

In ocfs2_mknod(), if error occurs after dinode successfully allocated,
ocfs2 i_links_count will not be 0.

So even though we clear inode i_nlink before iput in error handling, it
still won't wipe inode since we'll refresh inode from dinode during inode
lock. So just like clear inode i_nlink, we clear ocfs2 i_links_count as
well. Also do the same change for ocfs2_symlink().

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Joseph Qi <[email protected]>
Reported-by: Yan Wang <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

ocfs2: fix BUG when iput after ocfs2_mknod fails

Commit b1529a41f777 "ocfs2: should reclaim the inode if
'__ocfs2_mknod_locked' returns an error" tried to reclaim the claimed
inode if __ocfs2_mknod_locked() fails later.  But this introduce a race,
the freed bit may be reused immediately by another thread, which will
update dinode, e.g.  i_generation.  Then iput this inode will lead to BUG:
inode->i_generation != le32_to_cpu(fe->i_generation)

We could make this inode as bad, but we did want to do operations like
wipe in some cases.  Since the claimed inode bit can only affect that an
dinode is missing and will return back after fsck, it seems not a big
problem.  So just leave it as is by revert the reclaim logic.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: b1529a41f777 ("ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error")
Signed-off-by: Joseph Qi <[email protected]>
Reported-by: Yan Wang <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

gcov: support GCC 12.1 and newer compilers

Starting with GCC 12.1, the created .gcda format can't be read by gcov
tool.  There are 2 significant changes to the .gcda file format that
need to be supported:

a) [gcov: Use system IO buffering]
   (23eb66d1d46a34cb28c4acbdf8a1deb80a7c5a05) changed that all sizes in
   the format are in bytes and not in words (4B)

b) [gcov: make profile merging smarter]
   (72e0c742bd01f8e7e6dcca64042b9ad7e75979de) add a new checksum to the
   file header.

Tested with GCC 7.5, 10.4, 12.2 and the current master.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Martin Liska <[email protected]>
Tested-by: Peter Oberparleiter <[email protected]>
Reviewed-by: Peter Oberparleiter <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

zsmalloc: zs_destroy_pool: add size_class NULL check

Inside the zs_destroy_pool() function, there can still be NULL size_class
pointers: if when the next size_class is allocated, inside
zs_create_pool() function, kzalloc will return NULL and handling the error
condition, zs_create_pool() will call zs_destroy_pool().

Link: https://lkml.kernel.org/r/[email protected]
Fixes: f24263a5a076 ("zsmalloc: remove unnecessary size_class NULL check")
Signed-off-by: Alexey Romanov <[email protected]>
Reviewed-by: Sergey Senozhatsky <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mm/mempolicy: fix mbind_range() arguments to vma_merge()

Fuzzing produced an invalid argument to vma_merge() which was caught by
the newly added verification of the number of VMAs being removed on
process exit. Analyzing the failure eventually resulted in finding an
issue with the search of a VMA that started at address 0, which caused an
underflow and thus the loss of many VMAs being tracked in the tree. Fix
the underflow by changing the search of the maple tree to use the start
address directly.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 66850be55e8e ("mm/mempolicy: use vma iterator & maple state instead of vma linked list")
Signed-off-by: Liam R. Howlett <[email protected]>
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: Yu Zhao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

mailmap: update email for Qais Yousef

Update my email address for old entry and add a new entry for my
contribution while working with arm to continue support that work.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Qais Yousef <[email protected]>
Acked-by: Qais Yousef <[email protected]>
Acked-by: Qais Yousef <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>