Git Repo - linux.git/log

crypto: ecdh - avoid unaligned accesses in ecdh_set_secret()

ecdh_set_secret() casts a void* pointer to a const u64* in order to
feed it into ecc_is_key_valid(). This is not generally permitted by
the C standard, and leads to actual misalignment faults on ARMv6
cores. In some cases, these are fixed up in software, but this still
leads to performance hits that are entirely avoidable.

So let's copy the key into the ctx buffer first, which we will do
anyway in the common case, and which guarantees correct alignment.

Cc: <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: ccree - rework cache parameters handling

Rework the setting of DMA cache parameters, program more appropriate
values and explicitly set sharability domain.

Signed-off-by: Gilad Ben-Yossef <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: cavium - Use dma_set_mask_and_coherent to simplify code

'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: marvell/octeontx - Use dma_set_mask_and_coherent to simplify code

'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: cavium/zip - Use dma_set_mask_and_coherent to simplify code

'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: ccree - Fix fall-through warnings for Clang

In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding multiple break statements instead of
letting the code fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Acked-by: Gilad Ben-Yossef <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: tcrypt - include 1420 byte blocks in aead and skcipher benchmarks

WireGuard and IPsec both typically operate on input blocks that are
~1420 bytes in size, given the default Ethernet MTU of 1500 bytes and
the overhead of the VPN metadata.

Many aead and sckipher implementations are optimized for power-of-2
block sizes, and whether they perform well when operating on 1420
byte blocks cannot be easily extrapolated from the performance on
power-of-2 block size. So let's add 1420 bytes explicitly, and round
it up to the next blocksize multiple of the algo in question if it
does not support 1420 byte blocks.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: tcrypt - permit tcrypt.ko to be builtin

When working on crypto algorithms, being able to run tcrypt quickly
without booting an entire Linux installation can be very useful. For
instance, QEMU/kvm can be used to boot a kernel from the command line,
and having tcrypt.ko builtin would allow tcrypt to be executed to run
benchmarks, or to run tests for algorithms that need to be instantiated
from templates, without the need to make it past the point where the
rootfs is mounted.

So let's relax the requirement that tcrypt can only be built as a module
when CONFIG_EXPERT is enabled.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: tcrypt - don't initialize at subsys_initcall time

Commit c4741b2305979 ("crypto: run initcalls for generic implementations
earlier") converted tcrypt.ko's module_init() to subsys_initcall(), but
this was unintentional: tcrypt.ko currently cannot be built into the core
kernel, and so the subsys_initcall() gets converted into module_init()
under the hood. Given that tcrypt.ko does not implement a generic version
of a crypto algorithm that has to be available early during boot, there
is no point in running the tcrypt init code earlier than implied by
module_init().

However, for crypto development purposes, we will lift the restriction
that tcrypt.ko must be built as a module, and when builtin, it makes sense
for tcrypt.ko (which does its work inside the module init function) to run
as late as possible. So let's switch to late_initcall() instead.

Signed-off-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

MAINTAINERS: Move HiSilicon TRNG V2 driver

Move HiSilicon TRNG V2 driver into 'drivers/crypto/hisilicon/trng'
with some updating on 'MAINTAINERS'.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zaibo Xu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/trng - add support for PRNG

This patch adds support for pseudo random number generator(PRNG)
in Crypto subsystem.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zaibo Xu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/trng - add HiSilicon TRNG driver support

Move existing char/hw_random/hisi-trng-v2.c to crypto/hisilicon/trng.c.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zaibo Xu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

hwrng: hisi - remove HiSilicon TRNG driver

Driver of HiSilicon true random number generator(TRNG)
is removed from 'drivers/char/hw_random'.

Both 'Kunpeng 920' and 'Kunpeng 930' chips have TRNG,
however, PRNG is only supported by 'Kunpeng 930'.
So, this driver is moved to 'drivers/crypto/hisilicon/trng/'
in the next to enable the two's TRNG better.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zaibo Xu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: sparc - Fix sparse endianness warnings

This patch fixes a coulpe of sparse endianness warnings.

Signed-off-by: Herbert Xu <[email protected]>

crypto: powerpc/sha256-spe - Fix sparse endianness warning

This patch fixes a sparse endianness warning in sha256-spe.

Signed-off-by: Herbert Xu <[email protected]>

crypto: mips/octeon - Fix sparse endianness warnings

This patch fixes a number of endianness warnings in the mips/octeon
code.

Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - fix excluded_middle.cocci warnings

Condition !A || A && B is equivalent to !A || B.

Generated by: scripts/coccinelle/misc/excluded_middle.cocci

Fixes: b76f0ea01312 ("coccinelle: misc: add excluded_middle.cocci script")
CC: Denis Efremov <[email protected]>
Reported-by: kernel test robot <[email protected]>
Signed-off-by: kernel test robot <[email protected]>
Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qce - Fix SHA result buffer corruption issues

Partial hash was being copied into the final result buffer without the
entire message block processed. Depending on how the end user processes
this result buffer, errors vary from result buffer corruption to result
buffer poisoing. Fix this issue by ensuring that only the final hash value
is copied into the result buffer.

Reviewed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Thara Gopinath <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qce - Enable support for crypto engine on sdm845

Add support Qualcomm Crypto Engine accelerated encryption and
authentication algorithms on sdm845.

Reviewed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Thara Gopinath <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: aegis128 - expose SIMD code path as separate driver

Wiring the SIMD code into the generic driver has the unfortunate side
effect that the tcrypt testing code cannot distinguish them, and will
therefore not use the latter to fuzz test the former, as it does for
other algorithms.

So let's refactor the code a bit so we can register two implementations:
aegis128-generic and aegis128-simd.

Signed-off-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: aegis128/neon - move final tag check to SIMD domain

Instead of calculating the tag and returning it to the caller on
decryption, use a SIMD compare and min across vector to perform
the comparison. This is slightly more efficient, and removes the
need on the caller's part to wipe the tag from memory if the
decryption failed.

While at it, switch to unsigned int when passing cryptlen and
assoclen - we don't support input sizes where it matters anyway.

Signed-off-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: aegis128/neon - optimize tail block handling

Avoid copying the tail block via a stack buffer if the total size
exceeds a single AEGIS block. In this case, we can use overlapping
loads and stores and NEON permutation instructions instead, which
leads to a modest performance improvement on some cores (< 5%),
and is slightly cleaner. Note that we still need to use a stack
buffer if the entire input is smaller than 16 bytes, given that
we cannot use 16 byte NEON loads and stores safely in this case.

Signed-off-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: aegis128 - wipe plaintext and tag if decryption fails

The AEGIS spec mentions explicitly that the security guarantees hold
only if the resulting plaintext and tag of a failed decryption are
withheld. So ensure that we abide by this.

While at it, drop the unused struct aead_request *req parameter from
crypto_aegis128_process_crypt().

Reviewed-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: sun8i-ce - fix two error path's memory leak

This patch fixes the following smatch warnings:
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-hash.c:412
sun8i_ce_hash_run() warn: possible memory leak of 'result'
Note: "buf" is leaked as well.

Furthermore, in case of ENOMEM, crypto_finalize_hash_request() was not
called which was an error.

Fixes: 56f6d5aee88d ("crypto: sun8i-ce - support hash algorithms")
Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Corentin Labbe <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: kconfig - fix a couple of spelling mistakes

There are a couple of spelling mistakes in two crypto Kconfig files.
Fix these.

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add qat_4xxx driver

Add support for QAT 4xxx devices.

Signed-off-by: Giovanni Cabiddu <[email protected]>
Reviewed-by: Fiona Trahe <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add hook to initialize vector routing table

Add an hook to initialize the vector routing table with the default
values before MSIx is enabled.
The new function set_msix_rttable() is called only if present in the
struct adf_hw_device_data of the device. This is to allow for QAT
devices that do not support that functionality.

Signed-off-by: Giovanni Cabiddu <[email protected]>
Reviewed-by: Fiona Trahe <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - target fw images to specific AEs

Introduce support for devices that require multiple firmware images.
If a device requires more than a firmware image to operate, load the
image to the appropriate Acceleration Engine (AE).

Signed-off-by: Giovanni Cabiddu <[email protected]>
Reviewed-by: Fiona Trahe <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: omap-aes - Fix PM disable depth imbalance in omap_aes_probe

The pm_runtime_enable will increase power disable depth.
Thus a pairing decrement is needed on the error handling
path to keep it balanced according to context.

Fixes: f7b2b5dd6a62a ("crypto: omap-aes - add error check for pm_runtime_get_sync")
Signed-off-by: Zhang Qilong <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/zip - add a work_queue for zip irq

The patch 'irqchip/gic-v3-its: Balance initial LPI affinity across CPUs'
set the IRQ to an uncentain CPU. If an IRQ is bound to the CPU used by the
thread which is sending request, the throughput will be just half.

So allocate a 'work_queue' and set as 'WQ_UNBOUND' to do the back half work
on some different CPUS.

Signed-off-by: Yang Shen <[email protected]>
Reviewed-by: Zaibo Xu <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: lib/curve25519 - Move selftest prototype into header file

This patch moves the curve25519_selftest into curve25519.h so
we don't get a warning from gcc complaining about a missing
prototype.

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: sha - split sha.h into sha1.h and sha2.h

Currently <crypto/sha.h> contains declarations for both SHA-1 and SHA-2,
and <crypto/sha3.h> contains declarations for SHA-3.

This organization is inconsistent, but more importantly SHA-1 is no
longer considered to be cryptographically secure. So to the extent
possible, SHA-1 shouldn't be grouped together with any of the other SHA
versions, and usage of it should be phased out.

Therefore, split <crypto/sha.h> into two headers <crypto/sha1.h> and
<crypto/sha2.h>, and make everyone explicitly specify whether they want
the declarations for SHA-1, SHA-2, or both.

This avoids making the SHA-1 declarations visible to files that don't
want anything to do with SHA-1. It also prepares for potentially moving
sha1.h into a new insecure/ or dangerous/ directory.

Signed-off-by: Eric Biggers <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Acked-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: crypto4xx - Replace bitwise OR with logical OR in crypto4xx_build_pd

Clang warns:

drivers/crypto/amcc/crypto4xx_core.c:921:60: warning: operator '?:' has
lower precedence than '|'; '|' will be evaluated first
[-Wbitwise-conditional-parentheses]
                 (crypto_tfm_alg_type(req->tfm) == CRYPTO_ALG_TYPE_AEAD) ?
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
drivers/crypto/amcc/crypto4xx_core.c:921:60: note: place parentheses
around the '|' expression to silence this warning
                 (crypto_tfm_alg_type(req->tfm) == CRYPTO_ALG_TYPE_AEAD) ?
                                                                         ^
                                                                        )
drivers/crypto/amcc/crypto4xx_core.c:921:60: note: place parentheses
around the '?:' expression to evaluate it first
                 (crypto_tfm_alg_type(req->tfm) == CRYPTO_ALG_TYPE_AEAD) ?
                                                                         ^
                 (
1 warning generated.

It looks like this should have been a logical OR so that
PD_CTL_HASH_FINAL gets added to the w bitmask if crypto_tfm_alg_type
is either CRYPTO_ALG_TYPE_AHASH or CRYPTO_ALG_TYPE_AEAD. Change the
operator so that everything works properly.

Fixes: 4b5b79998af6 ("crypto: crypto4xx - fix stalls under heavy load")
Link: https://github.com/ClangBuiltLinux/linux/issues/1198
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Christian Lamparter <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: caam/qi - simplify error path for context allocation

Wang Qing reports that IS_ERR_OR_NULL() should be matched with
PTR_ERR_OR_ZERO(), not PTR_ERR().

As it turns out, the error path always returns an error code,
i.e. NULL is never returned.
Update the code accordingly - s/IS_ERR_OR_NULL/IS_ERR.

Reported-by: Wang Qing <[email protected]>
Signed-off-by: Horia Geantă <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: arm64/gcm - move authentication tag check to SIMD domain

Instead of copying the calculated authentication tag to memory and
calling crypto_memneq() to verify it, use vector bytewise compare and
min across vector instructions to decide whether the tag is valid. This
is more efficient, and given that the tag is only transiently held in a
NEON register, it is also safer, given that calculated tags for failed
decryptions should be withheld.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/sec2 - Fix aead authentication setting key error

Fix aead auth setting key process error. if use soft shash function, driver
need to use digest size replace of the user input key length.

Signed-off-by: Kai Ye <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: arm64/chacha - simplify tail block handling

Based on lessons learnt from optimizing the 32-bit version of this driver,
we can simplify the arm64 version considerably, by reordering the final
two stores when the last block is not a multiple of 64 bytes. This removes
the need to use permutation instructions to calculate the elements that are
clobbered by the final overlapping store, given that the store of the
penultimate block now follows it, and that one carries the correct values
for those elements already.

While at it, simplify the overlapping loads as well, by calculating the
address of the final overlapping load upfront, and switching to this
address for every load that would otherwise extend past the end of the
source buffer.

There is no impact on performance, but the resulting code is substantially
smaller and easier to follow.

Cc: Eric Biggers <[email protected]>
Cc: "Jason A . Donenfeld" <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add gen4 firmware loader

Add support for the QAT gen4 devices in the firmware loader.

Signed-off-by: Jack Xu <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add support for broadcasting mode

Add support for broadcasting mode in firmware loader to enable the next
generation of QAT devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add support for shared ustore

Add support for shared ustore mode support. This is required by the next
generation of QAT devices to share the same fw image across engines.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - allow to target specific AEs

Introduce new API, qat_uclo_set_cfg_ae_mask(), to allow the load of the
firmware image to a subset of Acceleration Engines (AEs). This is
required by the next generation of QAT devices to be able to load
different firmware images to the device.

Signed-off-by: Jack Xu <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add FCU CSRs to chip info

Add firmware control unit (FCU) CSRs to chip info so the firmware
authentication code is common between all devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add CSS3K support

Add support for CSS3K, which uses RSA3K as image signature algorithm,
to support the next generation of QAT devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - use ae_mask

Use ae_mask to decide which Accelerator Engine (AE) to target in AE
related operations, instead of a sequential loop, to skip AEs that are
fused out.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add check for null pointer

Add null pointer check when freeing the memory for firmware.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add misc control CSR to chip info

Add misc control CSR to chip info since the CSR offset will be different
in the next generation of QAT devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add wake up event to chip info

Add the wake up event to chip info since this value will be different
in the next generation of QAT devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add clock enable CSR to chip info

Add global clock enable CSR to the chip info since the CSR offset
will be different in the next generation of QAT devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add reset CSR and mask to chip info

Add reset CSR offset and mask to chip info since they are different
in new QAT devices. This also simplifies the reset/clrReset functions
by using the reset mask.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add local memory size to chip info

Add the local memory size to the chip info since the size of this memory
will be different in the next generation of QAT devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add support for lm2 and lm3

Add support for local memory lm2 and lm3 which is introduced in the next
generation of QAT devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add next neighbor to chip_info

Introduce the next neighbor (NN) capability in chip_info as NN registers
are not supported in certain SKUs of QAT.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - replace check based on DID

Modify condition in qat_uclo_wr_mimage() to use a capability of the
device (sram_visible), rather than the device ID, so the check is not
specific to devices of the same type.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - introduce chip info structure

Introduce the chip info structure which contains device specific
information. The initialization path has been split between common and
hardware specific in order to facilitate the introduction of the next
generation hardware.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - refactor long expressions

Replace long expressions with local variables in the functions
qat_uclo_wr_uimage_page(), qat_uclo_init_globals() and
qat_uclo_init_umem_seg() to improve readability.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - refactor qat_uclo_set_ae_mode()

Refactor qat_uclo_set_ae_mode() by moving the logic that sets the AE
modes to a separate function, qat_hal_set_modes().

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - move defines to header files

Move the definition of ICP_QAT_AE_OFFSET, ICP_QAT_CAP_OFFSET,
LOCAL_TO_XFER_REG_OFFSET and ICP_QAT_EP_OFFSET from qat_hal.c to
icp_qat_hal.h to avoid the definition of generation specific constants
in qat_hal.c.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - remove global CSRs helpers

Include the offset of GLOBAL_CSR directly into the enum hal_global_csr
and remove the macros SET_GLB_CSR/GET_GLB_CSR to simplify the global CSR
access.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - refactor AE start

Change the API and the behaviour of the qat_hal_start() function.
With this change, the function starts under the hood all acceleration
engines (AEs) and there is no longer need to call it for each engine.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - change micro word data mask

Change micro word data mask since the Acceleration Engine (AE)
instruction codes have been changed in the new generation QAT devices.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - change type for ctx_mask

Change type for ctx_mask from unsigned char to unsigned long to avoid
type casting.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - add support for relative FW ucode loading

Improve the way micro instructions (FW code) are uploaded to Accelerator
Engines (AEs). If code starts at PC zero (absolute addressing), read
uwords with no relative address. Otherwise, use relative addressing to
the page region.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - rename qat_uclo_del_uof_obj()

Rename the function qat_uclo_del_uof_obj() in qat_uclo_del_obj() since
it frees the memory allocated for all firmware objects.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - introduce additional parenthesis

Introduce additional parenthesis to resolve a warninga reported by
checkpatch.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - remove unnecessary parenthesis

Remove unnecessary parenthesis across the firmware loader.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - fix error message

Change message in error path of qat_uclo_check_image_compat() to report
an incompatible firmware image that contains a neighbor register table.

Signed-off-by: Jack Xu <[email protected]>
Co-developed-by: Wojciech Ziemba <[email protected]>
Signed-off-by: Wojciech Ziemba <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - fix CSR access

Do not mask the AE number with the AE mask when accessing the AE local
CSRs. Bit 12 of the local CSR address is the start of AE number so just
take out the AE mask here.

Signed-off-by: Jack Xu <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - fix status check in qat_hal_put_rel_rd_xfer()

The return value of qat_hal_rd_ae_csr() is always a CSR value and never
a status and should not be stored in the status variable of
qat_hal_put_rel_rd_xfer().

This removes the assignment as qat_hal_rd_ae_csr() is not expected to
fail.
A more comprehensive handling of the theoretical corner case which could
result in a fail will be submitted in a separate patch.

Fixes: 8c9478a400b7 ("crypto: qat - reduce stack size with KASAN")
Signed-off-by: Jack Xu <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Reviewed-by: Fiona Trahe <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - support for mof format in fw loader

Implement infrastructure for the Multiple Object File (MOF) format
in the firmware loader. This will allow to load a specific firmware
image contained inside an MOF file.

This patch is based on earlier work done by Pingchao Yang.

Signed-off-by: Giovanni Cabiddu <[email protected]>
Reviewed-by: Jack Xu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: cavium/nitrox - Fix sparse warnings

This patch fixes all the sparse warnings in cavium/nitrox:

- Fix endianness warnings by adding the correct markers to unions.
- Add missing header inclusions for prototypes.
- Move nitrox_sriov_configure prototype into the isr header file.

Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - replace pci with PCI in comments

Change all lower case pci in comments to be upper case PCI.

Suggested-by: Andy Shevchenko <[email protected]>
Signed-off-by: Adam Guerin <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: arm/chacha-neon - optimize for non-block size multiples

The current NEON based ChaCha implementation for ARM is optimized for
multiples of 4x the ChaCha block size (64 bytes). This makes sense for
block encryption, but given that ChaCha is also often used in the
context of networking, it makes sense to consider arbitrary length
inputs as well.

For example, WireGuard typically uses 1420 byte packets, and performing
ChaCha encryption involves 5 invocations of chacha_4block_xor_neon()
and 3 invocations of chacha_block_xor_neon(), where the last one also
involves a memcpy() using a buffer on the stack to process the final
chunk of 1420 % 64 == 12 bytes.

Let's optimize for this case as well, by letting chacha_4block_xor_neon()
deal with any input size between 64 and 256 bytes, using NEON permutation
instructions and overlapping loads and stores. This way, the 140 byte
tail of a 1420 byte input buffer can simply be processed in one go.

This results in the following performance improvements for 1420 byte
blocks, without significant impact on power-of-2 input sizes. (Note
that Raspberry Pi is widely used in combination with a 32-bit kernel,
even though the core is 64-bit capable)

   Cortex-A8  (BeagleBone)       :   7%
   Cortex-A15 (Calxeda Midway)   :  21%
   Cortex-A53 (Raspberry Pi 3)   :   3%
   Cortex-A72 (Raspberry Pi 4)   :  19%

Cc: Eric Biggers <[email protected]>
Cc: "Jason A . Donenfeld" <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - remove cast for mailbox CSR

Remove cast for mailbox CSR in adf_admin.c as it is not needed.

Suggested-by: Andy Shevchenko <[email protected]>
Signed-off-by: Adam Guerin <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: Kconfig - CRYPTO_MANAGER_EXTRA_TESTS requires the manager

The extra tests in the manager actually require the manager to be
selected too. Otherwise the linker gives errors like:

ld: arch/x86/crypto/chacha_glue.o: in function `chacha_simd_stream_xor':
chacha_glue.c:(.text+0x422): undefined reference to `crypto_simd_disabled_for_test'

Fixes: 2343d1529aff ("crypto: Kconfig - allow tests to be disabled when manager is disabled")
Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: caam - fix printing on xts fallback allocation error path

At the time xts fallback tfm allocation fails the device struct
hasn't been enabled yet in the caam xts tfm's private context.

Fix this by using the device struct from xts algorithm's private context
or, when not available, by replacing dev_err with pr_err.

Fixes: 9d9b14dbe077 ("crypto: caam/jr - add fallback for XTS with more than 8B IV")
Fixes: 83e8aa912138 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
Fixes: 36e2d7cfdcf1 ("crypto: caam/qi2 - add fallback for XTS with more than 8B IV")
Signed-off-by: Horia Geantă <[email protected]>
Reviewed-by: Iuliana Prodan <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/qm - split 'hisi_qm_init' into smaller pieces

'hisi_qm_init' initializes configuration of QM.
To improve code readability, split it into two pieces.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/qm - split 'qm_eq_ctx_cfg' into smaller pieces

'qm_eq_ctx_cfg' initializes configuration of EQ and AEQ,
split it into two pieces to improve code readability.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/qm - split 'qm_qp_ctx_cfg' into smaller pieces

'qm_qp_ctx_cfg' initializes configuration of SQ and CQ,
split it into two pieces to improve code readability.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/qm - replace 'sprintf' with 'scnprintf'

Replace 'sprintf' with 'scnprintf' to avoid overrun.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/qm - modify return type of 'qm_set_sqctype'

Since 'qm_set_sqctype' always returns 0, change it as 'void'.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/qm - modify the return type of debugfs interface

Since 'qm_create_debugfs_file' always returns 0, change it as 'void'.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/qm - modify the return type of function

The returns of 'qm_get_hw_error_status' and 'qm_get_dev_err_status'
are values from the hardware registers, which should not be defined
as 'int', so update as 'u32'.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: hisilicon/qm - numbers are replaced by macros

Some numbers are replaced by macros to avoid incomprehension.

Signed-off-by: Weili Qian <[email protected]>
Reviewed-by: Zhou Wang <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

hwrng: imx-rngc - irq already prints an error

Clean up the check for irq. dev_err() is superfluous as
platform_get_irq() already prints an error. Check for zero
would indicate a bug. Remove curly braces to conform to
styling requirements.
Signed-off-by: Nigel Christian <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: arm/aes-neonbs - fix usage of cbc(aes) fallback

Loading the module deadlocks since:
-local cbc(aes) implementation needs a fallback and
-crypto API tries to find one but the request_module() resolves back to
the same module

Fix this by changing the module alias for cbc(aes) and
using the NEED_FALLBACK flag when requesting for a fallback algorithm.

Fixes: 00b99ad2bac2 ("crypto: arm/aes-neonbs - Use generic cbc encryption path")
Signed-off-by: Horia Geantă <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: qat - remove unneeded semicolon

A semicolon is not needed after a switch statement.

Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Giovanni Cabiddu <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: cavium/nitrox - remove unneeded semicolon

A semicolon is not needed after a switch statement.

Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: arm64/poly1305-neon - reorder PAC authentication with SP update

PAC pointer authentication signs the return address against the value
of the stack pointer, to prevent stack overrun exploits from corrupting
the control flow. However, this requires that the AUTIASP is issued with
SP holding the same value as it held when the PAC value was generated.
The Poly1305 NEON code got this wrong, resulting in crashes on PAC
capable hardware.

Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS ...")
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: af_alg - avoid undefined behavior accessing salg_name

Commit 3f69cc60768b ("crypto: af_alg - Allow arbitrarily long algorithm
names") made the kernel start accepting arbitrarily long algorithm names
in sockaddr_alg.  However, the actual length of the salg_name field
stayed at the original 64 bytes.

This is broken because the kernel can access indices >= 64 in salg_name,
which is undefined behavior -- even though the memory that is accessed
is still located within the sockaddr structure.  It would only be
defined behavior if the array were properly marked as arbitrary-length
(either by making it a flexible array, which is the recommended way
these days, or by making it an array of length 0 or 1).

We can't simply change salg_name into a flexible array, since that would
break source compatibility with userspace programs that embed
sockaddr_alg into another struct, or (more commonly) declare a
sockaddr_alg like 'struct sockaddr_alg sa = { .salg_name = "foo" };'.

One solution would be to change salg_name into a flexible array only
when '#ifdef __KERNEL__'.  However, that would keep userspace without an
easy way to actually use the longer algorithm names.

Instead, add a new structure 'sockaddr_alg_new' that has the flexible
array field, and expose it to both userspace and the kernel.
Make the kernel use it correctly in alg_bind().

This addresses the syzbot report
"UBSAN: array-index-out-of-bounds in alg_bind"
(https://syzkaller.appspot.com/bug?extid=92ead4eb8e26a26d465e).

Reported-by: [email protected]
Fixes: 3f69cc60768b ("crypto: af_alg - Allow arbitrarily long algorithm names")
Cc: <[email protected]> # v4.12+
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: caam - enable crypto-engine retry mechanism

Use the new crypto_engine_alloc_init_and_set() function to
initialize crypto-engine and enable retry mechanism.

Set the maximum size for crypto-engine software queue based on
Job Ring size (JOBR_DEPTH) and a threshold (reserved for the
non-crypto-API requests that are not passed through crypto-engine).

The callback for do_batch_requests is NULL, since CAAM
doesn't support linked requests.

Signed-off-by: Iuliana Prodan <[email protected]>
Reviewed-by: Horia Geantă <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: testmgr - WARN on test failure

Currently, by default crypto self-test failures only result in a
pr_warn() message and an "unknown" status in /proc/crypto.  Both of
these are easy to miss.  There is also an option to panic the kernel
when a test fails, but that can't be the default behavior.

A crypto self-test failure always indicates a kernel bug, however, and
there's already a standard way to report (recoverable) kernel bugs --
the WARN() family of macros.  WARNs are noisier and harder to miss, and
existing test systems already know to look for them in dmesg or via
/proc/sys/kernel/tainted.

Therefore, call WARN() when an algorithm fails its self-tests.

Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: testmgr - always print the actual skcipher driver name

When alg_test() is called from tcrypt.ko rather than from the algorithm
registration code, "driver" is actually the algorithm name, not the
driver name. So it shouldn't be used in places where a driver name is
wanted, e.g. when reporting a test failure or when checking whether the
driver is the generic driver or not.

Fix this for the skcipher algorithm tests by getting the driver name
from the crypto_skcipher that actually got allocated.

Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: testmgr - always print the actual AEAD driver name

When alg_test() is called from tcrypt.ko rather than from the algorithm
registration code, "driver" is actually the algorithm name, not the
driver name. So it shouldn't be used in places where a driver name is
wanted, e.g. when reporting a test failure or when checking whether the
driver is the generic driver or not.

Fix this for the AEAD algorithm tests by getting the driver name from
the crypto_aead that actually got allocated.

Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: testmgr - always print the actual hash driver name

When alg_test() is called from tcrypt.ko rather than from the algorithm
registration code, "driver" is actually the algorithm name, not the
driver name. So it shouldn't be used in places where a driver name is
wanted, e.g. when reporting a test failure or when checking whether the
driver is the generic driver or not.

Fix this for the hash algorithm tests by getting the driver name from
the crypto_ahash or crypto_shash that actually got allocated.

Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: aead - add crypto_aead_driver_name()

Add crypto_aead_driver_name(), which is analogous to
crypto_skcipher_driver_name().

Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: atmel-sha - remove unneeded break

A break is not needed if it is preceded by a return

Signed-off-by: Tom Rix <[email protected]>
Reviewed-by: Alexandre Belloni <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: lib/sha256 - Unroll LOAD and BLEND loops

Unrolling the LOAD and BLEND loops improves performance by ~8% on x86_64
(tested on Broadwell Xeon) while not increasing code size too much.

Signed-off-by: Arvind Sankar <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: lib/sha256 - Unroll SHA256 loop 8 times intead of 64

This reduces code size substantially (on x86_64 with gcc-10 the size of
sha256_update() goes from 7593 bytes to 1952 bytes including the new
SHA256_K array), and on x86 is slightly faster than the full unroll
(tested on Broadwell Xeon).

Signed-off-by: Arvind Sankar <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: lib/sha256 - Clear W[] in sha256_update() instead of sha256_transform()

The temporary W[] array is currently zeroed out once every call to
sha256_transform(), i.e. once every 64 bytes of input data. Moving it to
sha256_update() instead so that it is cleared only once per update can
save about 2-3% of the total time taken to compute the digest, with a
reasonable memset() implementation, and considerably more (~20%) with a
bad one (eg the x86 purgatory currently uses a memset() coded in C).

Signed-off-by: Arvind Sankar <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

crypto: lib/sha256 - Don't clear temporary variables

The assignments to clear a through h and t1/t2 are optimized out by the
compiler because they are unused after the assignments.

Clearing individual scalar variables is unlikely to be useful, as they
may have been assigned to registers, and even if stack spilling was
required, there may be compiler-generated temporaries that are
impossible to clear in any case.

So drop the clearing of a through h and t1/t2.

Signed-off-by: Arvind Sankar <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>