APC and your system crashes randomly.
apic= [APIC,X86] Advanced Programmable Interrupt Controller
- Change the output verbosity whilst booting
+ Change the output verbosity while booting
Format: { quiet (default) | verbose | debug }
Change the amount of debugging information output
when initialising the APIC and IO-APIC components.
cut the overhead, others just disable the usage. So
only cgroup_disable=memory is actually worthy}
- cgroup_no_v1= [KNL] Disable one, multiple, all cgroup controllers in v1
- Format: { controller[,controller...] | "all" }
+ cgroup_no_v1= [KNL] Disable cgroup controllers and named hierarchies in v1
+ Format: { { controller | "all" | "named" }
+ [,{ controller | "all" | "named" }...] }
Like cgroup_disable, but only applies to cgroup v1;
the blacklisted controllers remain available in cgroup2.
+ "all" blacklists all controllers and "named" disables
+ named mounts. Specifying both "all" and "named" disables
+ all v1 hierarchies.
cgroup.memory= [KNL] Pass options to the cgroup memory controller.
Format: <string>
cpuidle.off=1 [CPU_IDLE]
disable the cpuidle sub-system
+ cpuidle.governor=
+ [CPU_IDLE] Name of the cpuidle governor to use.
+
cpufreq.off=1 [CPU_FREQ]
disable the cpufreq sub-system
causing system reset or hang due to sending
INIT from AP to BSP.
- disable_counter_freezing [HW]
+ perf_v4_pmi= [X86,INTEL]
+ Format: <bool>
Disable Intel PMU counter freezing feature.
The feature only exists starting from
Arch Perfmon v4 (Skylake and newer).
off
Disables hypervisor mitigations and doesn't
emit any warnings.
+ It also drops the swap size and available
+ RAM limit restriction on both hypervisor and
+ bare metal.
Default is 'flush'.
check bypass). With this option data leaks are possible
in the system.
- nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2
+ nospectre_v2 [X86,PPC_FSL_BOOK3E] Disable all mitigations for the Spectre variant 2
(indirect branch prediction) vulnerability. System may
allow data leaks with this option, which is equivalent
to spectre_v2=off.
before loading.
See Documentation/blockdev/ramdisk.txt.
+ psi= [KNL] Enable or disable pressure stall information
+ tracking.
+ Format: <bool>
+
psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to
probe for; one of (bare|imps|exps|lifebook|any).
psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports
in microseconds. The default of zero says
no holdoff.
- rcutorture.cbflood_inter_holdoff= [KNL]
- Set holdoff time (jiffies) between successive
- callback-flood tests.
-
- rcutorture.cbflood_intra_holdoff= [KNL]
- Set holdoff time (jiffies) between successive
- bursts of callbacks within a given callback-flood
- test.
-
- rcutorture.cbflood_n_burst= [KNL]
- Set the number of bursts making up a given
- callback-flood test. Set this to zero to
- disable callback-flood testing.
-
- rcutorture.cbflood_n_per_burst= [KNL]
- Set the number of callbacks to be registered
- in a given burst of a callback-flood test.
-
rcutorture.fqs_duration= [KNL]
Set duration of force_quiescent_state bursts
in microseconds.
Set wait time between force_quiescent_state bursts
in seconds.
+ rcutorture.fwd_progress= [KNL]
+ Enable RCU grace-period forward-progress testing
+ for the types of RCU supporting this notion.
+
+ rcutorture.fwd_progress_div= [KNL]
+ Specify the fraction of a CPU-stall-warning
+ period to do tight-loop forward-progress testing.
+
+ rcutorture.fwd_progress_holdoff= [KNL]
+ Number of seconds to wait between successive
+ forward-progress tests.
+
+ rcutorture.fwd_progress_need_resched= [KNL]
+ Enclose cond_resched() calls within checks for
+ need_resched() during tight-loop forward-progress
+ testing.
+
rcutorture.gp_cond= [KNL]
Use conditional/asynchronous update-side
primitives, if available.
spectre_v2= [X86] Control mitigation of Spectre variant 2
(indirect branch speculation) vulnerability.
+ The default operation protects the kernel from
+ user space attacks.
- on - unconditionally enable
- off - unconditionally disable
+ on - unconditionally enable, implies
+ spectre_v2_user=on
+ off - unconditionally disable, implies
+ spectre_v2_user=off
auto - kernel detects whether your CPU model is
vulnerable
CONFIG_RETPOLINE configuration option, and the
compiler with which the kernel was built.
+ Selecting 'on' will also enable the mitigation
+ against user space to user space task attacks.
+
+ Selecting 'off' will disable both the kernel and
+ the user space protections.
+
Specific mitigations can also be selected manually:
retpoline - replace indirect branches
Not specifying this option is equivalent to
spectre_v2=auto.
+ spectre_v2_user=
+ [X86] Control mitigation of Spectre variant 2
+ (indirect branch speculation) vulnerability between
+ user space tasks
+
+ on - Unconditionally enable mitigations. Is
+ enforced by spectre_v2=on
+
+ off - Unconditionally disable mitigations. Is
+ enforced by spectre_v2=off
+
+ prctl - Indirect branch speculation is enabled,
+ but mitigation can be enabled via prctl
+ per thread. The mitigation control state
+ is inherited on fork.
+
+ prctl,ibpb
+ - Like "prctl" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different user
+ space processes.
+
+ seccomp
+ - Same as "prctl" above, but all seccomp
+ threads will enable the mitigation unless
+ they explicitly opt out.
+
+ seccomp,ibpb
+ - Like "seccomp" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different
+ user space processes.
+
+ auto - Kernel selects the mitigation depending on
+ the available CPU features and vulnerability.
+
+ Default mitigation:
+ If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
+
+ Not specifying this option is equivalent to
+ spectre_v2_user=auto.
+
spec_store_bypass_disable=
[HW] Control Speculative Store Bypass (SSB) Disable mitigation
(Speculative Store Bypass vulnerability)
prevent spurious wakeup);
n = USB_QUIRK_DELAY_CTRL_MSG (Device needs a
pause after every control message);
+ o = USB_QUIRK_HUB_SLOW_RESET (Hub needs extra
+ delay after resetting its port);
Example: quirks=0781:5580:bk,0a5c:5834:gij
usbhid.mousepoll=
The security list is not a disclosure channel. For that, see Coordination
below.
-Once a robust fix has been developed, our preference is to release the
-fix in a timely fashion, treating it no differently than any of the other
-thousands of changes and fixes the Linux kernel project releases every
-month.
-
-However, at the request of the reporter, we will postpone releasing the
-fix for up to 5 business days after the date of the report or after the
-embargo has lifted; whichever comes first. The only exception to that
-rule is if the bug is publicly known, in which case the preference is to
-release the fix as soon as it's available.
+Once a robust fix has been developed, the release process starts. Fixes
+for publicly known bugs are released immediately.
+
+Although our preference is to release fixes for publicly undisclosed bugs
+as soon as they become available, this may be postponed at the request of
+the reporter or an affected party for up to 7 calendar days from the start
+of the release process, with an exceptional extension to 14 calendar days
+if it is agreed that the criticality of the bug requires more time. The
+only valid reason for deferring the publication of a fix is to accommodate
+the logistics of QA and large scale rollouts which require release
+coordination.
- Whilst embargoed information may be shared with trusted individuals in
+ While embargoed information may be shared with trusted individuals in
order to develop a fix, such information will not be published alongside
the fix or on any other disclosure channel without the permission of the
reporter. This includes but is not limited to the original bug report
to better maintained higher layer. Also, as init failure path is
shared with exit path, both can get more testing.
+ Note though that when converting current calls or assignments to
+ managed devm_* versions it is up to you to check if internal operations
+ like allocating memory, have failed. Managed resources pertains to the
+ freeing of these resources *only* - all other checks needed are still
+ on you. In some cases this may mean introducing checks that were not
+ necessary before moving to the managed devm_* calls.
+
3. Devres group
---------------
devm_gpiod_get_index_optional()
devm_gpiod_get_optional()
devm_gpiod_put()
+ devm_gpiod_unhinge()
devm_gpiochip_add_data()
- devm_gpiochip_remove()
devm_gpio_request()
devm_gpio_request_one()
devm_gpio_free()
The link self points to the process reading the file system. Each process
subdirectory has the entries listed in Table 1-1.
+ Note that an open a file descriptor to /proc/<pid> or to any of its
+ contained files or subdirectories does not prevent <pid> being reused
+ for some other process in the event that <pid> exits. Operations on
+ open /proc/<pid> file descriptors corresponding to dead processes
+ never act on any new process that the kernel may, through chance, have
+ also assigned the process ID <pid>. Instead, operations on these FDs
+ usually fail with ESRCH.
Table 1-1: Process specific entries in /proc
..............................................................................
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
+ THP_enabled: 1
Threads: 1
SigQ: 0/28578
SigPnd: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
+ CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
+ Speculation_Store_Bypass: thread vulnerable
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 1
snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table.
It's slow but very precise.
- Table 1-2: Contents of the status files (as of 4.8)
+ Table 1-2: Contents of the status files (as of 4.19)
..............................................................................
Field Content
Name filename of the executable
HugetlbPages size of hugetlb memory portions
CoreDumping process's memory is currently being dumped
(killing the process may lead to a corrupted core)
+ THP_enabled process is allowed to use THP (returns 0 when
+ PR_SET_THP_DISABLE is set on the process
Threads number of threads
SigQ number of signals queued/max. number for queue
SigPnd bitmap of pending signals for the thread
CapPrm bitmap of permitted capabilities
CapEff bitmap of effective capabilities
CapBnd bitmap of capabilities bounding set
+ CapAmb bitmap of ambient capabilities
NoNewPrivs no_new_privs, like prctl(PR_GET_NO_NEW_PRIV, ...)
Seccomp seccomp mode, like prctl(PR_GET_SECCOMP, ...)
+ Speculation_Store_Bypass speculative store bypass mitigation status
Cpus_allowed mask of CPUs on which this process may run
Cpus_allowed_list Same as previous, but in "list format"
Mems_allowed mask of memory nodes allowed to this process
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
+THPeligible: 0
VmFlags: rd ex mr mw me dw
the first of these lines shows the same information as is displayed for the
"SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this
does not take into account swapped out page of underlying shmem objects.
"Locked" indicates whether the mapping is locked in memory or not.
+"THPeligible" indicates whether the mapping is eligible for THP pages - 1 if
+true, 0 otherwise.
"VmFlags" field deserves a separate description. This member represents the kernel
flags associated with the particular virtual memory area in two letter encoded
Note that there is no guarantee that every flag and associated mnemonic will
be present in all further kernel releases. Things get changed, the flags may
-be vanished or the reverse -- new added.
+be vanished or the reverse -- new added. Interpretation of their meaning
+might change in future as well. So each consumer of these flags has to
+follow each specific kernel version for the exact semantic.
This file is only present if the CONFIG_MMU kernel configuration option is
enabled.
Simply running out of kernel/system memory is signalled through ENOMEM.
- EPERM/EACCESS:
+ EPERM/EACCES:
Returned for an operation that is valid, but needs more privileges.
E.g. root-only or much more common, DRM master-only operations return
this when when called by unpriviledged clients. There's no clear
- difference between EACCESS and EPERM.
+ difference between EACCES and EPERM.
ENODEV:
+ The device is not (yet) present or fully initialized.
+
+EOPNOTSUPP:
Feature (like PRIME, modesetting, GEM) is not supported by the driver.
ENXIO:
-.. -*- coding: utf-8; mode: rst -*-
+.. Permission is granted to copy, distribute and/or modify this
+.. document under the terms of the GNU Free Documentation License,
+.. Version 1.1 or any later version published by the Free Software
+.. Foundation, with no Invariant Sections, no Front-Cover Texts
+.. and no Back-Cover Texts. A copy of the license is included at
+.. Documentation/media/uapi/fdl-appendix.rst.
+..
+.. TODO: replace it to GFDL-1.1-or-later WITH no-invariant-sections
.. _extended-controls:
``V4L2_CID_MPEG_VIDEO_H264_LOOP_FILTER_ALPHA (integer)``
Loop filter alpha coefficient, defined in the H264 standard.
+ This value corresponds to the slice_alpha_c0_offset_div2 slice header
+ field, and should be in the range of -6 to +6, inclusive. The actual alpha
+ offset FilterOffsetA is twice this value.
Applicable to the H264 encoder.
``V4L2_CID_MPEG_VIDEO_H264_LOOP_FILTER_BETA (integer)``
Loop filter beta coefficient, defined in the H264 standard.
+ This corresponds to the slice_beta_offset_div2 slice header field, and
+ should be in the range of -6 to +6, inclusive. The actual beta offset
+ FilterOffsetB is twice this value.
Applicable to the H264 encoder.
.. _v4l2-mpeg-video-h264-entropy-mode:
configuring a stateless hardware decoding pipeline for MPEG-2.
The bitstream parameters are defined according to :ref:`mpeg2part2`.
+ .. note::
+
+ This compound control is not yet part of the public kernel API and
+ it is expected to change.
+
.. c:type:: v4l2_ctrl_mpeg2_slice_params
.. cssclass:: longtable
Specifies quantization matrices (as extracted from the bitstream) for the
associated MPEG-2 slice data.
+ .. note::
+
+ This compound control is not yet part of the public kernel API and
+ it is expected to change.
+
.. c:type:: v4l2_ctrl_mpeg2_quantization
.. cssclass:: longtable
converts that received signal to lower intermediate frequency (IF) or
baseband frequency (BB). Tuners that could do baseband output are often
called Zero-IF tuners. Older tuners were typically simple PLL tuners
- inside a metal box, whilst newer ones are highly integrated chips
+ inside a metal box, while newer ones are highly integrated chips
without a metal box "silicon tuners". These controls are mostly
applicable for new feature rich silicon tuners, just because older
tuners does not have much adjustable features.
--- /dev/null
- pause whilst the driver figures out where its media went). My tests
+ Originally, this driver was written for the Digital Equipment
+ Corporation series of EtherWORKS Ethernet cards:
+
+ DE425 TP/COAX EISA
+ DE434 TP PCI
+ DE435 TP/COAX/AUI PCI
+ DE450 TP/COAX/AUI PCI
+ DE500 10/100 PCI Fasternet
+
+ but it will now attempt to support all cards which conform to the
+ Digital Semiconductor SROM Specification. The driver currently
+ recognises the following chips:
+
+ DC21040 (no SROM)
+ DC21041[A]
+ DC21140[A]
+ DC21142
+ DC21143
+
+ So far the driver is known to work with the following cards:
+
+ KINGSTON
+ Linksys
+ ZNYX342
+ SMC8432
+ SMC9332 (w/new SROM)
+ ZNYX31[45]
+ ZNYX346 10/100 4 port (can act as a 10/100 bridge!)
+
+ The driver has been tested on a relatively busy network using the DE425,
+ DE434, DE435 and DE500 cards and benchmarked with 'ttcp': it transferred
+ 16M of data to a DECstation 5000/200 as follows:
+
+ TCP UDP
+ TX RX TX RX
+ DE425 1030k 997k 1170k 1128k
+ DE434 1063k 995k 1170k 1125k
+ DE435 1063k 995k 1170k 1125k
+ DE500 1063k 998k 1170k 1125k in 10Mb/s mode
+
+ All values are typical (in kBytes/sec) from a sample of 4 for each
+ measurement. Their error is +/-20k on a quiet (private) network and also
+ depend on what load the CPU has.
+
+ =========================================================================
+
+ The ability to load this driver as a loadable module has been included
+ and used extensively during the driver development (to save those long
+ reboot sequences). Loadable module support under PCI and EISA has been
+ achieved by letting the driver autoprobe as if it were compiled into the
+ kernel. Do make sure you're not sharing interrupts with anything that
+ cannot accommodate interrupt sharing!
+
+ To utilise this ability, you have to do 8 things:
+
+ 0) have a copy of the loadable modules code installed on your system.
+ 1) copy de4x5.c from the /linux/drivers/net directory to your favourite
+ temporary directory.
+ 2) for fixed autoprobes (not recommended), edit the source code near
+ line 5594 to reflect the I/O address you're using, or assign these when
+ loading by:
+
+ insmod de4x5 io=0xghh where g = bus number
+ hh = device number
+
+ NB: autoprobing for modules is now supported by default. You may just
+ use:
+
+ insmod de4x5
+
+ to load all available boards. For a specific board, still use
+ the 'io=?' above.
+ 3) compile de4x5.c, but include -DMODULE in the command line to ensure
+ that the correct bits are compiled (see end of source code).
+ 4) if you are wanting to add a new card, goto 5. Otherwise, recompile a
+ kernel with the de4x5 configuration turned off and reboot.
+ 5) insmod de4x5 [io=0xghh]
+ 6) run the net startup bits for your new eth?? interface(s) manually
+ (usually /etc/rc.inet[12] at boot time).
+ 7) enjoy!
+
+ To unload a module, turn off the associated interface(s)
+ 'ifconfig eth?? down' then 'rmmod de4x5'.
+
+ Automedia detection is included so that in principle you can disconnect
+ from, e.g. TP, reconnect to BNC and things will still work (after a
++ pause while the driver figures out where its media went). My tests
+ using ping showed that it appears to work....
+
+ By default, the driver will now autodetect any DECchip based card.
+ Should you have a need to restrict the driver to DIGITAL only cards, you
+ can compile with a DEC_ONLY define, or if loading as a module, use the
+ 'dec_only=1' parameter.
+
+ I've changed the timing routines to use the kernel timer and scheduling
+ functions so that the hangs and other assorted problems that occurred
+ while autosensing the media should be gone. A bonus for the DC21040
+ auto media sense algorithm is that it can now use one that is more in
+ line with the rest (the DC21040 chip doesn't have a hardware timer).
+ The downside is the 1 'jiffies' (10ms) resolution.
+
+ IEEE 802.3u MII interface code has been added in anticipation that some
+ products may use it in the future.
+
+ The SMC9332 card has a non-compliant SROM which needs fixing - I have
+ patched this driver to detect it because the SROM format used complies
+ to a previous DEC-STD format.
+
+ I have removed the buffer copies needed for receive on Intels. I cannot
+ remove them for Alphas since the Tulip hardware only does longword
+ aligned DMA transfers and the Alphas get alignment traps with non
+ longword aligned data copies (which makes them really slow). No comment.
+
+ I have added SROM decoding routines to make this driver work with any
+ card that supports the Digital Semiconductor SROM spec. This will help
+ all cards running the dc2114x series chips in particular. Cards using
+ the dc2104x chips should run correctly with the basic driver. I'm in
+ this feature working. So far we have tested KINGSTON, SMC8432, SMC9332
+ (with the latest SROM complying with the SROM spec V3: their first was
+ broken), ZNYX342 and LinkSys. ZNYX314 (dual 21041 MAC) and ZNYX 315
+ (quad 21041 MAC) cards also appear to work despite their incorrectly
+ wired IRQs.
+
+ I have added a temporary fix for interrupt problems when some SCSI cards
+ share the same interrupt as the DECchip based cards. The problem occurs
+ because the SCSI card wants to grab the interrupt as a fast interrupt
+ (runs the service routine with interrupts turned off) vs. this card
+ which really needs to run the service routine with interrupts turned on.
+ This driver will now add the interrupt service routine as a fast
+ interrupt if it is bounced from the slow interrupt. THIS IS NOT A
+ RECOMMENDED WAY TO RUN THE DRIVER and has been done for a limited time
+ until people sort out their compatibility issues and the kernel
+ interrupt service code is fixed. YOU SHOULD SEPARATE OUT THE FAST
+ INTERRUPT CARDS FROM THE SLOW INTERRUPT CARDS to ensure that they do not
+ run on the same interrupt. PCMCIA/CardBus is another can of worms...
+
+ Finally, I think I have really fixed the module loading problem with
+ more than one DECchip based card. As a side effect, I don't mess with
+ the device structure any more which means that if more than 1 card in
+ 2.0.x is installed (4 in 2.1.x), the user will have to edit
+ linux/drivers/net/Space.c to make room for them. Hence, module loading
+ is the preferred way to use this driver, since it doesn't have this
+ limitation.
+
+ Where SROM media detection is used and full duplex is specified in the
+ SROM, the feature is ignored unless lp->params.fdx is set at compile
+ time OR during a module load (insmod de4x5 args='eth??:fdx' [see
+ below]). This is because there is no way to automatically detect full
+ duplex links except through autonegotiation. When I include the
+ autonegotiation feature in the SROM autoconf code, this detection will
+ occur automatically for that case.
+
+ Command line arguments are now allowed, similar to passing arguments
+ through LILO. This will allow a per adapter board set up of full duplex
+ and media. The only lexical constraints are: the board name (dev->name)
+ appears in the list before its parameters. The list of parameters ends
+ either at the end of the parameter list or with another board name. The
+ following parameters are allowed:
+
+ fdx for full duplex
+ autosense to set the media/speed; with the following
+ sub-parameters:
+ TP, TP_NW, BNC, AUI, BNC_AUI, 100Mb, 10Mb, AUTO
+
+ Case sensitivity is important for the sub-parameters. They *must* be
+ upper case. Examples:
+
+ insmod de4x5 args='eth1:fdx autosense=BNC eth0:autosense=100Mb'.
+
+ For a compiled in driver, in linux/drivers/net/CONFIG, place e.g.
+ DE4X5_OPTS = -DDE4X5_PARM='"eth0:fdx autosense=AUI eth2:autosense=TP"'
+
+ Yes, I know full duplex isn't permissible on BNC or AUI; they're just
+ examples. By default, full duplex is turned off and AUTO is the default
+ autosense setting. In reality, I expect only the full duplex option to
+ be used. Note the use of single quotes in the two examples above and the
+ lack of commas to separate items.
setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7);
The keyring can be manipulated after it has been given to the socket. This
- permits the server to add more keys, replace keys, etc. whilst it is live.
+ permits the server to add more keys, replace keys, etc. while it is live.
(3) A local address must then be bound:
struct sockaddr_rxrpc *srx,
struct key *key);
- This attempts to partially reinitialise a call and submit it again whilst
+ This attempts to partially reinitialise a call and submit it again while
reusing the original call's Tx queue to avoid the need to repackage and
re-encrypt the data to be sent. call indicates the call to retry, srx the
new address to send it to and key the encryption key to use for signing or
u32 rxrpc_kernel_check_life(struct socket *sock,
struct rxrpc_call *call);
+ void rxrpc_kernel_probe_life(struct socket *sock,
+ struct rxrpc_call *call);
- This returns a number that is updated when ACKs are received from the peer
- (notably including PING RESPONSE ACKs which we can elicit by sending PING
- ACKs to see if the call still exists on the server). The caller should
- compare the numbers of two calls to see if the call is still alive after
- waiting for a suitable interval.
+ The first function returns a number that is updated when ACKs are received
+ from the peer (notably including PING RESPONSE ACKs which we can elicit by
+ sending PING ACKs to see if the call still exists on the server). The
+ caller should compare the numbers of two calls to see if the call is still
+ alive after waiting for a suitable interval.
This allows the caller to work out if the server is still contactable and
- if the call is still alive on the server whilst waiting for the server to
+ if the call is still alive on the server while waiting for the server to
process a client operation.
- This function may transmit a PING ACK.
+ The second function causes a ping ACK to be transmitted to try to provoke
+ the peer into responding, which would then cause the value returned by the
+ first function to change. Note that this must be called in TASK_RUNNING
+ state.
(*) Get reply timestamp.
(*) connection_expiry
The amount of time in seconds after a connection was last used before we
- remove it from the connection list. Whilst a connection is in existence,
+ remove it from the connection list. While a connection is in existence,
it serves as a placeholder for negotiated security; when it is deleted,
the security must be renegotiated.
(*) transport_expiry
The amount of time in seconds after a transport was last used before we
- remove it from the transport list. Whilst a transport is in existence, it
+ remove it from the transport list. While a transport is in existence, it
serves to anchor the peer data and keeps the connection ID counter.
(*) rxrpc_rx_window_size
protocol entry point.
Protocol 2.12: (Kernel 3.8) Added the xloadflags field and extension fields
- to struct boot_params for loading bzImage and ramdisk
+ to struct boot_params for loading bzImage and ramdisk
above 4G in 64bit.
-Protocol 2.13: (Kernel 3.14) Support 32- and 64-bit flags being set in
- xloadflags to support booting a 64-bit kernel from 32-bit
- EFI
-
-Protocol 2.14: (Kernel 4.20) Added acpi_rsdp_addr holding the physical
- address of the ACPI RSDP table.
- The bootloader updates version with:
- 0x8000 | min(kernel-version, bootloader-version)
- kernel-version being the protocol version supported by
- the kernel and bootloader-version the protocol version
- supported by the bootloader.
-
**** MEMORY LAYOUT
The traditional memory map for the kernel loader, used for Image or
0258/8 2.10+ pref_address Preferred loading address
0260/4 2.10+ init_size Linear memory required during initialization
0264/4 2.11+ handover_offset Offset of handover entry point
-0268/8 2.14+ acpi_rsdp_addr Physical address of RSDP table
(1) For backwards compatibility, if the setup_sects field contains 0, the
real value is 4.
Contains the magic number "HdrS" (0x53726448).
Field name: version
-Type: modify
+Type: read
Offset/size: 0x206/2
Protocol: 2.00+
e.g. 0x0204 for version 2.04, and 0x0a11 for a hypothetical version
10.17.
- Up to protocol version 2.13 this information is only read by the
- bootloader. From protocol version 2.14 onwards the bootloader will
- write the used protocol version or-ed with 0x8000 to the field. The
- used protocol version will be the minimum of the supported protocol
- versions of the bootloader and the kernel.
-
Field name: realmode_swtch
Type: modify (optional)
Offset/size: 0x208/4
See EFI HANDOVER PROTOCOL below for more details.
-Field name: acpi_rsdp_addr
-Type: write
-Offset/size: 0x268/8
-Protocol: 2.14+
-
- This field can be set by the boot loader to tell the kernel the
- physical address of the ACPI RSDP table.
-
- A value of 0 indicates the kernel should fall back to the standard
- methods to locate the RSDP.
-
**** THE IMAGE CHECKSUM
static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
-
#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
-
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;
/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+ return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
}
/*
{
void *ret = kmem_cache_alloc(s, flags);
- kasan_kmalloc(s, ret, size, flags);
+ ret = kasan_kmalloc(s, ret, size, flags);
return ret;
}
{
void *ret = kmem_cache_alloc_node(s, gfpflags, node);
- kasan_kmalloc(s, ret, size, gfpflags);
+ ret = kasan_kmalloc(s, ret, size, gfpflags);
return ret;
}
#endif /* CONFIG_TRACING */
* kmalloc is the normal method of allocating memory
* for objects smaller than page size in the kernel.
*
- * The @flags argument may be one of:
+ * The @flags argument may be one of the GFP flags defined at
+ * include/linux/gfp.h and described at
+ * :ref:`Documentation/core-api/mm-api.rst <mm-api-gfp-flags>`
*
- * %GFP_USER - Allocate memory on behalf of user. May sleep.
+ * The recommended usage of the @flags is described at
+ * :ref:`Documentation/core-api/memory-allocation.rst <memory-allocation>`
*
- * %GFP_KERNEL - Allocate normal kernel ram. May sleep.
+ * Below is a brief outline of the most useful GFP flags
*
- * %GFP_ATOMIC - Allocation will not sleep. May use emergency pools.
- * For example, use this inside interrupt handlers.
+ * %GFP_KERNEL
+ * Allocate normal kernel ram. May sleep.
*
- * %GFP_HIGHUSER - Allocate pages from high memory.
+ * %GFP_NOWAIT
+ * Allocation will not sleep.
*
- * %GFP_NOIO - Do not do any I/O at all while trying to get memory.
+ * %GFP_ATOMIC
+ * Allocation will not sleep. May use emergency pools.
*
- * %GFP_NOFS - Do not make any fs calls while trying to get memory.
- *
- * %GFP_NOWAIT - Allocation will not sleep.
- *
- * %__GFP_THISNODE - Allocate node-local memory only.
- *
- * %GFP_DMA - Allocation suitable for DMA.
- * Should only be used for kmalloc() caches. Otherwise, use a
- * slab created with SLAB_DMA.
+ * %GFP_HIGHUSER
+ * Allocate memory from high memory on behalf of user.
*
* Also it is possible to set different flags by OR'ing
* in one or more of the following additional @flags:
*
- * %__GFP_HIGH - This allocation has high priority and may use emergency pools.
- *
- * %__GFP_NOFAIL - Indicate that this allocation is in no way allowed to fail
- * (think twice before using).
+ * %__GFP_HIGH
+ * This allocation has high priority and may use emergency pools.
*
- * %__GFP_NORETRY - If memory is not immediately available,
- * then give up at once.
+ * %__GFP_NOFAIL
+ * Indicate that this allocation is in no way allowed to fail
+ * (think twice before using).
*
- * %__GFP_NOWARN - If allocation fails, don't issue any warnings.
+ * %__GFP_NORETRY
+ * If memory is not immediately available,
+ * then give up at once.
*
- * %__GFP_RETRY_MAYFAIL - Try really hard to succeed the allocation but fail
- * eventually.
+ * %__GFP_NOWARN
+ * If allocation fails, don't issue any warnings.
*
- * There are other flags available as well, but these are not intended
- * for general use, and so are not documented here. For a full list of
- * potential flags, always refer to linux/gfp.h.
+ * %__GFP_RETRY_MAYFAIL
+ * Try really hard to succeed the allocation but fail
+ * eventually.
*/
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
goto out;
}
- /*
- * kmem_cache_create_usercopy - Create a cache.
+ /**
+ * kmem_cache_create_usercopy - Create a cache with a region suitable
+ * for copying to userspace
* @name: A string which is used in /proc/slabinfo to identify this cache.
* @size: The size of objects to be created in this cache.
* @align: The required alignment for the objects.
* @usersize: Usercopy region size
* @ctor: A constructor for the objects.
*
- * Returns a ptr to the cache on success, NULL on failure.
* Cannot be called within a interrupt, but can be interrupted.
* The @ctor is run when new pages are allocated by the cache.
*
* %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
* to catch references to uninitialised memory.
*
- * %SLAB_RED_ZONE - Insert `Red' zones around the allocated memory to check
+ * %SLAB_RED_ZONE - Insert `Red` zones around the allocated memory to check
* for buffer overruns.
*
* %SLAB_HWCACHE_ALIGN - Align the objects in this cache to a hardware
* cacheline. This can be beneficial if you're counting cycles as closely
* as davem.
+ *
+ * Return: a pointer to the cache on success, NULL on failure.
*/
struct kmem_cache *
kmem_cache_create_usercopy(const char *name,
}
EXPORT_SYMBOL(kmem_cache_create_usercopy);
+ /**
+ * kmem_cache_create - Create a cache.
+ * @name: A string which is used in /proc/slabinfo to identify this cache.
+ * @size: The size of objects to be created in this cache.
+ * @align: The required alignment for the objects.
+ * @flags: SLAB flags
+ * @ctor: A constructor for the objects.
+ *
+ * Cannot be called within a interrupt, but can be interrupted.
+ * The @ctor is run when new pages are allocated by the cache.
+ *
+ * The flags are
+ *
+ * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
+ * to catch references to uninitialised memory.
+ *
+ * %SLAB_RED_ZONE - Insert `Red` zones around the allocated memory to check
+ * for buffer overruns.
+ *
+ * %SLAB_HWCACHE_ALIGN - Align the objects in this cache to a hardware
+ * cacheline. This can be beneficial if you're counting cycles as closely
+ * as davem.
+ *
+ * Return: a pointer to the cache on success, NULL on failure.
+ */
struct kmem_cache *
kmem_cache_create(const char *name, unsigned int size, unsigned int align,
slab_flags_t flags, void (*ctor)(void *))
css_get(&s->memcg_params.memcg->css);
s->memcg_params.deact_fn = deact_fn;
- call_rcu_sched(&s->memcg_params.deact_rcu_head, kmemcg_deactivate_rcufn);
+ call_rcu(&s->memcg_params.deact_rcu_head, kmemcg_deactivate_rcufn);
}
void memcg_deactivate_kmem_caches(struct mem_cgroup *memcg)
mutex_unlock(&slab_mutex);
/*
- * SLUB deactivates the kmem_caches through call_rcu_sched. Make
+ * SLUB deactivates the kmem_caches through call_rcu. Make
* sure all registered rcu callbacks have been invoked.
*/
if (IS_ENABLED(CONFIG_SLUB))
- rcu_barrier_sched();
+ rcu_barrier();
/*
* SLAB and SLUB create memcg kmem_caches through workqueue and SLUB
index = size_index[size_index_elem(size)];
} else {
- if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
- WARN_ON(1);
+ if (WARN_ON_ONCE(size > KMALLOC_MAX_CACHE_SIZE))
return NULL;
- }
index = fls(size - 1);
}
page = alloc_pages(flags, order);
ret = page ? page_address(page) : NULL;
kmemleak_alloc(ret, size, 1, flags);
- kasan_kmalloc_large(ret, size, flags);
+ ret = kasan_kmalloc_large(ret, size, flags);
return ret;
}
EXPORT_SYMBOL(kmalloc_order);
ks = ksize(p);
if (ks >= new_size) {
- kasan_krealloc((void *)p, new_size, flags);
+ p = kasan_krealloc((void *)p, new_size, flags);
return (void *)p;
}
}
ret = __do_krealloc(p, new_size, flags);
- if (ret && p != ret)
+ if (ret && kasan_reset_tag(p) != kasan_reset_tag(ret))
kfree(p);
return ret;