From e5fe9eb8996d2d8236755e1f21f673f86f8c854c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timoth=C3=A9e=20Delabrouille?=
Date: Thu, 8 Apr 2021 23:06:21 +0200
Subject: [PATCH 1/7] New 'Label' section with example and explainations

---
 .../unstable-book/src/library-features/asm.md | 40 +++++++++++++++++--
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/src/doc/unstable-book/src/library-features/asm.md b/src/doc/unstable-book/src/library-features/asm.md
index 946c354fd9d..0503c26014a 100644
--- a/src/doc/unstable-book/src/library-features/asm.md
+++ b/src/doc/unstable-book/src/library-features/asm.md
@@ -372,6 +372,43 @@ unsafe {
 # }
 ```
 
+## Labels
+
+The compiler is allowed to instantiate multiple copies an `asm!` block, for example when the function containing it is inlined in multiple places. As a consequence, you should only use GNU assembler [local labels] inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.
+
+Moreover due to [a llvm bug], you cannot use `0` or `1` as labels. Therefore only labels in the `2`-`99` range are allowed.
+
+```rust
+#![feature(asm)]
+
+let mut a = 0;
+unsafe {
+    asm!(
+        "mov {0}, 10",
+        "2:",
+        "sub {0}, 1",
+        "cmp {0}, 3",
+        "jle 2f",
+        "jmp 2b",
+        "2:",
+        "add {0}, 2",
+        out(reg) a
+    );
+}
+assert_eq!(a, 5);
+```
+
+This will decrement the `{0}` register value from 10 to 3, then add 2 and store it in `a`.
+
+This example show a few thing:
+
+First that the same number can be used as a label multiple times in the same inline block.
+
+Second, that when a numeric label is used as a reference (as an instruction operand, for example), the suffixes b (“backward”) or f (“forward”) should be added to the numeric label. It will then refer to the nearest label defined by this number in this direction.
+
+[local labels]: https://sourceware.org/binutils/docs/as/Symbol-Names.html#Local-Labels
+[a llvm bug]: https://bugs.llvm.org/show_bug.cgi?id=36144
+
 ## Options
 
 By default, an inline assembly block is treated the same way as an external FFI function call with a custom calling convention: it may read/write memory, have observable side effects, etc. However in many cases, it is desirable to give the compiler more information about what the assembly code is actually doing so that it can optimize better.
@@ -787,8 +824,5 @@ The compiler performs some additional checks on options:
 - You are responsible for switching any target-specific state (e.g. thread-local storage, stack bounds).
 - The set of memory locations that you may access is the intersection of those allowed by the `asm!` blocks you entered and exited.
 - You cannot assume that an `asm!` block will appear exactly once in the output binary. The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places.
-  - As a consequence, you should only use [local labels] inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.
 
 > **Note**: As a general rule, the flags covered by `preserves_flags` are those which are *not* preserved when performing a function call.
-
-[local labels]: https://sourceware.org/binutils/docs/as/Symbol-Names.html#Local-Labels

From 406cfc3b209e7f10d909698432a9729d2f1ef2c2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timoth=C3=A9e?=
Date: Fri, 9 Apr 2021 00:09:14 +0200
Subject: [PATCH 2/7] add 'allow_fail' to example

---
 src/doc/unstable-book/src/library-features/asm.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/doc/unstable-book/src/library-features/asm.md b/src/doc/unstable-book/src/library-features/asm.md
index 0503c26014a..dc65e40ad05 100644
--- a/src/doc/unstable-book/src/library-features/asm.md
+++ b/src/doc/unstable-book/src/library-features/asm.md
@@ -378,7 +378,7 @@ The compiler is allowed to instantiate multiple copies an `asm!` block, for exam
 
 Moreover due to [a llvm bug], you cannot use `0` or `1` as labels. Therefore only labels in the `2`-`99` range are allowed.
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 
 let mut a = 0;

From 41c9c9b51f1d9a5e1650bbdb398fc78f836936c9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timoth=C3=A9e=20Delabrouille?=
Date: Fri, 9 Apr 2021 12:18:12 +0200
Subject: [PATCH 3/7] precisions on the authorized labels + typo

---
 src/doc/unstable-book/src/library-features/asm.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/doc/unstable-book/src/library-features/asm.md b/src/doc/unstable-book/src/library-features/asm.md
index 0503c26014a..348b14dccbd 100644
--- a/src/doc/unstable-book/src/library-features/asm.md
+++ b/src/doc/unstable-book/src/library-features/asm.md
@@ -376,7 +376,7 @@ unsafe {
 
 The compiler is allowed to instantiate multiple copies an `asm!` block, for example when the function containing it is inlined in multiple places. As a consequence, you should only use GNU assembler [local labels] inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.
 
-Moreover due to [a llvm bug], you cannot use `0` or `1` as labels. Therefore only labels in the `2`-`99` range are allowed.
+Moreover, due to [an llvm bug], you shouldn't use labels exclusively make of `0` and `1` digits, e.g. `0`, `11` or `101010`, as they may end up being interpreted as binary values.
 
 ```rust
 #![feature(asm)]
@@ -407,7 +407,7 @@ First that the same number can be used as a label multiple times in the same inl
 Second, that when a numeric label is used as a reference (as an instruction operand, for example), the suffixes b (“backward”) or f (“forward”) should be added to the numeric label. It will then refer to the nearest label defined by this number in this direction.
 
 [local labels]: https://sourceware.org/binutils/docs/as/Symbol-Names.html#Local-Labels
-[a llvm bug]: https://bugs.llvm.org/show_bug.cgi?id=36144
+[an llvm bug]: https://bugs.llvm.org/show_bug.cgi?id=36144
 
 ## Options
 

From fab2d46d24b4c3328ebdf0ffecc8b67ad7990392 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timoth=C3=A9e=20Delabrouille?=
Date: Fri, 9 Apr 2021 12:34:30 +0200
Subject: [PATCH 4/7] remove allow_fail and uncomment the [feature(asm)] on every example

---
 .../unstable-book/src/library-features/asm.md | 60 +++++++++----------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/src/doc/unstable-book/src/library-features/asm.md b/src/doc/unstable-book/src/library-features/asm.md
index d39f195455b..3edad008020 100644
--- a/src/doc/unstable-book/src/library-features/asm.md
+++ b/src/doc/unstable-book/src/library-features/asm.md
@@ -34,8 +34,8 @@ Inline assembly is currently supported on the following architectures:
 
 Let us start with the simplest possible example:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 unsafe {
     asm!("nop");
 }
@@ -51,8 +51,8 @@ in the first argument of the `asm!` macro as a string literal.
 Now inserting an instruction that does nothing is rather boring. Let us do something that actually acts on data:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let x: u64;
 unsafe {
     asm!("mov {}, 5", out(reg) x);
@@ -73,8 +73,8 @@ the template and will read the variable from there after the inline assembly fin
 
 Let us see another example that also uses an input:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let i: u64 = 3;
 let o: u64;
 unsafe {
@@ -113,8 +113,8 @@ readability, and allows reordering instructions without changing the argument or
 
 We can further refine the above example to avoid the `mov` instruction:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let mut x: u64 = 3;
 unsafe {
     asm!("add {0}, {number}", inout(reg) x, number = const 5);
@@ -127,8 +127,8 @@ This is different from specifying an input and output separately in that it is g
 
 It is also possible to specify different variables for the input and output parts of an `inout` operand:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let x: u64 = 3;
 let y: u64;
 unsafe {
@@ -149,8 +149,8 @@ There is also a `inlateout` variant of this specifier.
 
 Here is an example where `inlateout` *cannot* be used:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let mut a: u64 = 4;
 let b: u64 = 4;
 let c: u64 = 4;
@@ -170,8 +170,8 @@ Here the compiler is free to allocate the same register for inputs `b` and `c` s
 
 However the following example can use `inlateout` since the output is only modified after all input registers have been read:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let mut a: u64 = 4;
 let b: u64 = 4;
 unsafe {
@@ -189,8 +189,8 @@ Therefore, Rust inline assembly provides some more specific constraint specifier
 While `reg` is generally available on any architecture, these are highly architecture specific. E.g. for x86 the general purpose registers `eax`, `ebx`, `ecx`, `edx`, `ebp`, `esi`, and `edi` among others can be addressed by their name.
 
-```rust,allow_fail,no_run
-# #![feature(asm)]
+```rust,no_run
+#![feature(asm)]
 let cmd = 0xd1;
 unsafe {
     asm!("out 0x64, eax", in("eax") cmd);
@@ -205,8 +205,8 @@ Note that unlike other operand types, explicit register operands cannot be used
 
 Consider this example which uses the x86 `mul` instruction:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 fn mul(a: u64, b: u64) -> u128 {
     let lo: u64;
     let hi: u64;
@@ -241,8 +241,8 @@ This state is generally referred to as being "clobbered".
 We need to tell the compiler about this since it may need to save and restore this state around the inline assembly block.
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let ebx: u32;
 let ecx: u32;
 
@@ -271,8 +271,8 @@ However we still need to tell the compiler that `eax` and `edx` have been modifi
 
 This can also be used with a general register class (e.g. `reg`) to obtain a scratch register for use inside the asm code:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 // Multiply x by 6 using shifts and adds
 let mut x: u64 = 4;
 unsafe {
@@ -293,8 +293,8 @@ assert_eq!(x, 4 * 6);
 A special operand type, `sym`, allows you to use the symbol name of a `fn` or `static` in inline assembly code. This allows you to call a function or access a global variable without needing to keep its address in a register.
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 extern "C" fn foo(arg: i32) {
     println!("arg = {}", arg);
 }
@@ -335,8 +335,8 @@ By default the compiler will always choose the name that refers to the full regi
 
 This default can be overriden by using modifiers on the template string operands, just like you would with format strings:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let mut x: u16 = 0xab;
 
 unsafe {
@@ -360,7 +360,7 @@ You have to manually use the memory address syntax specified by the respectively
 
 For example, in x86/x86_64 and intel assembly syntax, you should wrap inputs/outputs in `[]` to indicate they are memory operands:
 
-```rust,allow_fail
+```rust
 # #![feature(asm, llvm_asm)]
 # fn load_fpu_control_word(control: u16) {
 unsafe {
@@ -378,7 +378,7 @@ The compiler is allowed to instantiate multiple copies an `asm!` block, for exam
 
 Moreover, due to [an llvm bug], you shouldn't use labels exclusively make of `0` and `1` digits, e.g. `0`, `11` or `101010`, as they may end up being interpreted as binary values.
 
-```rust,allow_fail
+```rust
 #![feature(asm)]
 
 let mut a = 0;
@@ -415,8 +415,8 @@ By default, an inline assembly block is treated the same way as an external FFI
 
 Let's take our previous example of an `add` instruction:
 
-```rust,allow_fail
-# #![feature(asm)]
+```rust
+#![feature(asm)]
 let mut a: u64 = 4;
 let b: u64 = 4;
 unsafe {

From d58a0de505a008af1998f1c5abbeb3fd75c6f76c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timoth=C3=A9e=20Delabrouille?=
Date: Fri, 9 Apr 2021 12:39:35 +0200
Subject: [PATCH 5/7] conjugation

---
 src/doc/unstable-book/src/library-features/asm.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/doc/unstable-book/src/library-features/asm.md b/src/doc/unstable-book/src/library-features/asm.md
index 3edad008020..6895c323c29 100644
--- a/src/doc/unstable-book/src/library-features/asm.md
+++ b/src/doc/unstable-book/src/library-features/asm.md
@@ -376,7 +376,7 @@ unsafe {
 
 The compiler is allowed to instantiate multiple copies an `asm!` block, for example when the function containing it is inlined in multiple places. As a consequence, you should only use GNU assembler [local labels] inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.
 
-Moreover, due to [an llvm bug], you shouldn't use labels exclusively make of `0` and `1` digits, e.g. `0`, `11` or `101010`, as they may end up being interpreted as binary values.
+Moreover, due to [an llvm bug], you shouldn't use labels exclusively made of `0` and `1` digits, e.g. `0`, `11` or `101010`, as they may end up being interpreted as binary values.
 
 ```rust
 #![feature(asm)]

From 4f8dbf66de1c66b4e61e2f2b9a6c22f9b70a30d5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timoth=C3=A9e=20Delabrouille?=
Date: Fri, 9 Apr 2021 14:08:49 +0200
Subject: [PATCH 6/7] fix misspelling of register xmm23 which made xmm13 being clobbered twice

---
 src/doc/unstable-book/src/library-features/asm.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/doc/unstable-book/src/library-features/asm.md b/src/doc/unstable-book/src/library-features/asm.md
index 6895c323c29..31082b64b27 100644
--- a/src/doc/unstable-book/src/library-features/asm.md
+++ b/src/doc/unstable-book/src/library-features/asm.md
@@ -316,7 +316,7 @@ fn call_foo(arg: i32) {
             // Also mark AVX-512 registers as clobbered. This is accepted by the
             // compiler even if AVX-512 is not enabled on the current target.
             out("xmm16") _, out("xmm17") _, out("xmm18") _, out("xmm19") _,
-            out("xmm20") _, out("xmm21") _, out("xmm22") _, out("xmm13") _,
+            out("xmm20") _, out("xmm21") _, out("xmm22") _, out("xmm23") _,
             out("xmm24") _, out("xmm25") _, out("xmm26") _, out("xmm27") _,
             out("xmm28") _, out("xmm29") _, out("xmm30") _, out("xmm31") _,
         )

From 1f7de3fa983ef1153fcae10a67a1d6f1f9efdeb3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timoth=C3=A9e=20Delabrouille?=
Date: Fri, 9 Apr 2021 15:41:26 +0200
Subject: [PATCH 7/7] set allow_fail back on each example

---
 .../unstable-book/src/library-features/asm.md | 34 +++++++++----------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/doc/unstable-book/src/library-features/asm.md b/src/doc/unstable-book/src/library-features/asm.md
index 31082b64b27..4f9033cedc3 100644
--- a/src/doc/unstable-book/src/library-features/asm.md
+++ b/src/doc/unstable-book/src/library-features/asm.md
@@ -34,7 +34,7 @@ Inline assembly is currently supported on the following architectures:
 
 Let us start with the simplest possible example:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 unsafe {
     asm!("nop");
@@ -51,7 +51,7 @@ in the first argument of the `asm!` macro as a string literal.
 Now inserting an instruction that does nothing is rather boring. Let us do something that actually acts on data:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let x: u64;
 unsafe {
@@ -73,7 +73,7 @@ the template and will read the variable from there after the inline assembly fin
 
 Let us see another example that also uses an input:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let i: u64 = 3;
 let o: u64;
@@ -113,7 +113,7 @@ readability, and allows reordering instructions without changing the argument or
 
 We can further refine the above example to avoid the `mov` instruction:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let mut x: u64 = 3;
 unsafe {
@@ -127,7 +127,7 @@ This is different from specifying an input and output separately in that it is g
 
 It is also possible to specify different variables for the input and output parts of an `inout` operand:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let x: u64 = 3;
 let y: u64;
@@ -149,7 +149,7 @@ There is also a `inlateout` variant of this specifier.
 
 Here is an example where `inlateout` *cannot* be used:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let mut a: u64 = 4;
 let b: u64 = 4;
@@ -170,7 +170,7 @@ Here the compiler is free to allocate the same register for inputs `b` and `c` s
 
 However the following example can use `inlateout` since the output is only modified after all input registers have been read:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let mut a: u64 = 4;
 let b: u64 = 4;
@@ -189,7 +189,7 @@ Therefore, Rust inline assembly provides some more specific constraint specifier
 While `reg` is generally available on any architecture, these are highly architecture specific. E.g. for x86 the general purpose registers `eax`, `ebx`, `ecx`, `edx`, `ebp`, `esi`, and `edi` among others can be addressed by their name.
 
-```rust,no_run
+```rust,allow_fail,no_run
 #![feature(asm)]
 let cmd = 0xd1;
 unsafe {
@@ -205,7 +205,7 @@ Note that unlike other operand types, explicit register operands cannot be used
 
 Consider this example which uses the x86 `mul` instruction:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 fn mul(a: u64, b: u64) -> u128 {
     let lo: u64;
@@ -241,7 +241,7 @@ This state is generally referred to as being "clobbered".
 We need to tell the compiler about this since it may need to save and restore this state around the inline assembly block.
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let ebx: u32;
 let ecx: u32;
@@ -271,7 +271,7 @@ However we still need to tell the compiler that `eax` and `edx` have been modifi
 
 This can also be used with a general register class (e.g. `reg`) to obtain a scratch register for use inside the asm code:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 // Multiply x by 6 using shifts and adds
 let mut x: u64 = 4;
@@ -293,7 +293,7 @@ assert_eq!(x, 4 * 6);
 A special operand type, `sym`, allows you to use the symbol name of a `fn` or `static` in inline assembly code. This allows you to call a function or access a global variable without needing to keep its address in a register.
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 extern "C" fn foo(arg: i32) {
     println!("arg = {}", arg);
@@ -335,7 +335,7 @@ By default the compiler will always choose the name that refers to the full regi
 
 This default can be overriden by using modifiers on the template string operands, just like you would with format strings:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let mut x: u16 = 0xab;
 
@@ -360,8 +360,8 @@ You have to manually use the memory address syntax specified by the respectively
 
 For example, in x86/x86_64 and intel assembly syntax, you should wrap inputs/outputs in `[]` to indicate they are memory operands:
 
-```rust
-# #![feature(asm, llvm_asm)]
+```rust,allow_fail
+#![feature(asm, llvm_asm)]
 # fn load_fpu_control_word(control: u16) {
 unsafe {
     asm!("fldcw [{}]", in(reg) &control, options(nostack));
@@ -378,7 +378,7 @@ The compiler is allowed to instantiate multiple copies an `asm!` block, for exam
 
 Moreover, due to [an llvm bug], you shouldn't use labels exclusively made of `0` and `1` digits, e.g. `0`, `11` or `101010`, as they may end up being interpreted as binary values.
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 
 let mut a = 0;
@@ -415,7 +415,7 @@ By default, an inline assembly block is treated the same way as an external FFI
 
 Let's take our previous example of an `add` instruction:
 
-```rust
+```rust,allow_fail
 #![feature(asm)]
 let mut a: u64 = 4;
 let b: u64 = 4;