Add docs and arch tests to BMI.

diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi

index cf0bfa8f5d822f51b0560e289c1b77837ac19d4e..4ea33f6db513c05c267f7499716d28de273f28b4 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -3,6 +3,8 @@
  @c Free Software Foundation, Inc.
  @c This is part of the GAS manual.
  @c For copying conditions, see the file as.texinfo.
+@c man end
+
  @ifset GENERIC
  @page
  @node i386-Dependent
@@ -32,6 +34,8 @@ extending the Intel architecture to 64-bits.
  * i386-Jumps::                  Handling of Jump Instructions
  * i386-Float::                  Floating Point
  * i386-SIMD::                   Intel's MMX and AMD's 3DNow! SIMD Operations
+* i386-LWP::                    AMD's Lightweight Profiling Instructions
+* i386-BMI::                    Bit Manipulation Instructions
  * i386-16bit::                  Writing 16-bit Code
  * i386-Arch::                   Specifying an x86 CPU architecture
  * i386-Bugs::                   AT&T Syntax bugs
@@ -49,15 +53,19 @@ extending the Intel architecture to 64-bits.
  The i386 version of @code{@value{AS}} has a few machine
  dependent options:
  
-@table @code
+@c man begin OPTIONS
+@table @gcctabopt
  @cindex @samp{--32} option, i386
  @cindex @samp{--32} option, x86-64
+@cindex @samp{--n32} option, i386
+@cindex @samp{--n32} option, x86-64
  @cindex @samp{--64} option, i386
  @cindex @samp{--64} option, x86-64
-@item --32 | --64
-Select the word size, either 32 bits or 64 bits. Selecting 32-bit
-implies Intel i386 architecture, while 64-bit implies AMD x86-64
-architecture.
+@item --32 | --n32 | --64
+Select the word size, either 32 bits or 64 bits.  @samp{--32}
+implies Intel i386 architecture, while @samp{--n32} and @samp{--64}
+imply AMD x86-64 architecture with 32-bit or 64-bit word size,
+respectively.
  
  These options are only available with the ELF object file format, and
  require that the necessary BFD support has been included (on a 32-bit
@@ -108,6 +116,7 @@ processor names are recognized:
  @code{opteron},
  @code{k8},
  @code{amdfam10},
+@code{bdver1},
  @code{generic32} and
  @code{generic64}.
  
@@ -134,12 +143,19 @@ accept various extension mnemonics.  For example,
  @code{vmx},
  @code{smx},
  @code{xsave},
+@code{xsaveopt},
  @code{aes},
  @code{pclmul},
+@code{fsgsbase},
+@code{rdrnd},
+@code{f16c},
  @code{fma},
  @code{movbe},
  @code{ept},
  @code{clflush},
+@code{lwp},
+@code{fma4},
+@code{xop},
  @code{syscall},
  @code{rdtscp},
  @code{3dnow},
@@ -175,8 +191,8 @@ with VEX prefix.
  @cindex @samp{-msse-check=} option, i386
  @cindex @samp{-msse-check=} option, x86-64
  @item -msse-check=@var{none}
-@item -msse-check=@var{warning}
-@item -msse-check=@var{error}
+@itemx -msse-check=@var{warning}
+@itemx -msse-check=@var{error}
  These options control if the assembler should check SSE intructions.
  @option{-msse-check=@var{none}} will make the assembler not to check SSE
  instructions,  which is the default.  @option{-msse-check=@var{warning}}
@@ -184,10 +200,20 @@ will make the assembler issue a warning for any SSE intruction.
  @option{-msse-check=@var{error}} will make the assembler issue an error
  for any SSE intruction.
  
+@cindex @samp{-mavxscalar=} option, i386
+@cindex @samp{-mavxscalar=} option, x86-64
+@item -mavxscalar=@var{128}
+@itemx -mavxscalar=@var{256}
+These options control how the assembler should encode scalar AVX
+instructions.  @option{-mavxscalar=@var{128}} will encode scalar
+AVX instructions with 128-bit vector length, which is the default.
+@option{-mavxscalar=@var{256}} will encode scalar AVX instructions
+with 256-bit vector length.
+
  @cindex @samp{-mmnemonic=} option, i386
  @cindex @samp{-mmnemonic=} option, x86-64
  @item -mmnemonic=@var{att}
-@item -mmnemonic=@var{intel}
+@itemx -mmnemonic=@var{intel}
  This option specifies instruction mnemonic for matching instructions. 
  The @code{.att_mnemonic} and @code{.intel_mnemonic} directives will
  take precedent.
@@ -195,7 +221,7 @@ take precedent.
  @cindex @samp{-msyntax=} option, i386
  @cindex @samp{-msyntax=} option, x86-64
  @item -msyntax=@var{att}
-@item -msyntax=@var{intel}
+@itemx -msyntax=@var{intel}
  This option specifies instruction syntax when processing instructions. 
  The @code{.att_syntax} and @code{.intel_syntax} directives will
  take precedent.
@@ -207,6 +233,7 @@ This opetion specifies that registers don't require a @samp{%} prefix.
  The @code{.att_syntax} and @code{.intel_syntax} directives will take precedent.
  
  @end table
+@c man end
  
  @node i386-Directives
  @section x86 specific Directives
@@ -309,6 +336,9 @@ this by prefixing memory operands (@emph{not} the instruction mnemonics) with
  Intel @samp{mov al, byte ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T
  syntax.
  
+In 64-bit code, @samp{movabs} can be used to encode the @samp{mov}
+instruction with a 64-bit displacement or immediate operand.
+
  @cindex return instructions, i386
  @cindex i386 jump, call, return
  @cindex return instructions, x86-64
@@ -372,7 +402,8 @@ quadruple word).
  
  Different encoding options can be specified via optional mnemonic
  suffix.  @samp{.s} suffix swaps 2 register operands in encoding when
-moving from one register to another.
+moving from one register to another.  @samp{.d32} suffix forces a 32-bit
+displacement in encoding.
  
  @cindex conversion instructions, i386
  @cindex i386 conversion instructions
@@ -796,6 +827,40 @@ as the floating point stack.
  See Intel and AMD documentation, keeping in mind that the operand order in
  instructions is reversed from the Intel syntax.
  
+@node i386-LWP
+@section AMD's Lightweight Profiling Instructions
+
+@cindex LWP, i386
+@cindex LWP, x86-64
+
+@code{@value{AS}} supports AMD's Lightweight Profiling (LWP)
+instruction set, available on AMD's Family 15h (Orochi) processors.
+
+LWP enables applications to collect and manage performance data, and
+react to performance events.  The collection of performance data
+requires no context switches.  LWP runs in the context of a thread and
+so several counters can be used independently across multiple threads.
+LWP can be used in both 64-bit and legacy 32-bit modes.
+
+For detailed information on the LWP instruction set, see the
+@cite{AMD Lightweight Profiling Specification} available at
+@uref{http://developer.amd.com/cpu/LWP,Lightweight Profiling Specification}.
+
+@node i386-BMI
+@section Bit Manipulation Instructions
+
+@cindex BMI, i386
+@cindex BMI, x86-64
+
+@code{@value{AS}} supports the Bit Manipulation (BMI) instruction set.
+
+The BMI extension provides several instructions that implement individual
+bit manipulation operations such as isolation, masking, setting, or
+resetting.
+
+@c Need to add a specification citation here when available.
+
+
  @node i386-16bit
  @section Writing 16-bit Code
  
@@ -812,8 +877,9 @@ or 64-bit x86-64 code depending on the default configuration,
  it also supports writing code to run in real mode or in 16-bit protected
  mode code segments.  To do this, put a @samp{.code16} or
  @samp{.code16gcc} directive before the assembly language instructions to
-be run in 16-bit mode.  You can switch @code{@value{AS}} back to writing
-normal 32-bit code with the @samp{.code32} directive.
+be run in 16-bit mode.  You can switch @code{@value{AS}} to writing
+32-bit code with the @samp{.code32} directive or 64-bit code with the
+@samp{.code64} directive.
  
  @samp{.code16gcc} provides experimental support for generating 16-bit
  code from gcc, and differs from @samp{.code16} in that @samp{call},
@@ -888,15 +954,17 @@ supported on the CPU specified.  The choices for @var{cpu_type} are:
  @item @samp{prescott} @tab @samp{nocona} @tab @samp{core} @tab @samp{core2}
  @item @samp{corei7} @tab @samp{l1om}
  @item @samp{k6} @tab @samp{k6_2} @tab @samp{athlon} @tab @samp{k8}
-@item @samp{amdfam10}
+@item @samp{amdfam10} @tab @samp{bdver1}
  @item @samp{generic32} @tab @samp{generic64}
  @item @samp{.mmx} @tab @samp{.sse} @tab @samp{.sse2} @tab @samp{.sse3}
  @item @samp{.ssse3} @tab @samp{.sse4.1} @tab @samp{.sse4.2} @tab @samp{.sse4}
-@item @samp{.avx} @tab @samp{.vmx} @tab @samp{.smx} @tab @samp{.xsave}
-@item @samp{.aes} @tab @samp{.pclmul} @tab @samp{.fma} @tab @samp{.movbe}
-@item @samp{.ept} @tab @samp{.clflush}
+@item @samp{.avx} @tab @samp{.vmx} @tab @samp{.smx} @tab @samp{.ept}
+@item @samp{.clflush} @tab @samp{.movbe} @tab @samp{.xsave} @tab @samp{.xsaveopt}
+@item @samp{.aes} @tab @samp{.pclmul} @tab @samp{.fma} @tab @samp{.fsgsbase}
+@item @samp{.rdrnd} @tab @samp{.f16c}
  @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
  @item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} @tab @samp{.abm}
+@item @samp{.lwp} @tab @samp{.fma4} @tab @samp{.xop}
  @item @samp{.padlock}
  @end multitable
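
Note on the new i386-BMI section: the operations it lists (isolation, masking, resetting) can be illustrated in portable C. This is only a sketch. The helper names are hypothetical, and the mapping to the BLSI, BLSMSK, BLSR and ANDN mnemonics in the comments is an assumption based on the published BMI1 instruction set, not something stated in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: each computes, in plain C, the result that a
   single bit manipulation instruction is assumed to produce. */

/* Isolate the lowest set bit (assumed to correspond to BLSI). */
static uint32_t isolate_lowest_set(uint32_t x)
{
    return x & (0u - x);
}

/* Build a mask of all bits up to and including the lowest set bit
   (assumed to correspond to BLSMSK). */
static uint32_t mask_up_to_lowest_set(uint32_t x)
{
    return x ^ (x - 1u);
}

/* Reset (clear) the lowest set bit (assumed to correspond to BLSR). */
static uint32_t reset_lowest_set(uint32_t x)
{
    return x & (x - 1u);
}

/* AND the second operand with the complement of the first
   (assumed to correspond to ANDN). */
static uint32_t and_not(uint32_t a, uint32_t b)
{
    return ~a & b;
}

/* Small self-check on the value 0x58 (binary 0101 1000). */
static void bmi_sketch_self_check(void)
{
    assert(isolate_lowest_set(0x58u) == 0x08u);
    assert(mask_up_to_lowest_set(0x58u) == 0x0fu);
    assert(reset_lowest_set(0x58u) == 0x50u);
    assert(and_not(0x0fu, 0x58u) == 0x50u);
}
```

A compiler targeting a BMI-capable CPU may reduce each helper to one instruction. Whether it does depends on the compiler and its flags.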