17 高级特征

👈 16 模式与模式匹配

`unsafe` Rust

编译器有时缺乏必要的信息，会拒绝实际安全的代码，使用 unsafe Rust 将禁用编译器的部分检查机制，从而使代码过编。

`unsafe` 的超能力

关键字 unsafe 禁用的编译器检查：

解引用裸指针
调用不安全函数或方法
访问或修改可变静态变量
实现不安全 trait
访问 union 的字段

这 5 类操作必须位于标记为 unsafe 的块中，保持它尽可能小，这将方便错误的排查。

解引用裸指针

unsafe Rust 的 裸指针(raw pointer) 是类似于引用的类型，不可变与可变形式分别写为 *const T、*mut T，其中 * 不是解引用运算符，而是类型名称的一部分。

裸指针相当于引用 / 智能指针的特性：

允许忽略借用规则，可以同时拥有不可变和可变的指针，或多个指向相同位置的可变指针。
不保证指向有效的内存。
允许为空。
不能实现任何自动清理功能。

从引用创建裸指针：

let mut num = 5;
let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;

这里无需 unsafe，因为在安全代码中，可以创建裸指针，只是不能对其解引用。

在一个 unsafe 块中解引用裸指针：

unsafe {
	println!("r1={}", *r1);
	println!("r2={}", *r2);
}

不过，如果将裸指针改为引用：

let mut num = 5;
let r1 = &num;
let r2 = &mut num;
println!("r1={}", *r1);
println!("r2={}", *r2);

无法过编，因为不能同时存在不可变和可变引用。

裸指针则允许这样做，通过可变的裸指针修改数据有潜在的数据竞争风险。

裸指针的主要应用场景：

调用 C 代码的接口。
构建借用检查器无法理解的安全抽象。

调用不安全函数或方法

在函数/方法前部添加 unsafe 以创建一个不安全的函数/方法：

fn main() {
	unsafe {
		dangerous();
	}
}
 
unsafe fn dangerous() {
	// ...
}

此处的 unsafe 表明该函数 / 方法具有调用时需要满足的要求，必须通过 unsafe 块调用一个 unsafe 函数 / 方法，表明调用者已经阅读了该函数 / 方法的文档，并保证满足了调用要求。

unsafe 函数 / 方法体也是 unsafe 块，无需内置更多 unsafe 块。

创建 `unsafe` 代码的安全抽象

仅仅因为函数 / 方法包含不安全代码并不意味着其要整个标注 unsafe。

标准库函数 split_at_mut() 获取一个 slice 并从给定的索引处将其一分为二：

let mut v = vec![1, 2, 3, 4, 5, 6];
let r = &mut v[..];
let (a, b) = r.split_at_mut(3);
assert_eq!(a, &mut [1, 2, 3]);
assert_eq!(b, &mut [4, 5, 6]);

这个函数的实现用到了 unsafe 特性，现在，尝试避开 unsafe 并实现：

fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    assert!(mid <= len);
    (&mut values[..mid], &mut values[mid..])
}

无法过编：

error[E0499]: cannot borrow `*values` as mutable more than once at a time
  --> src\main.rs:12:31
   |
9  | fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
   |                         - let's call the lifetime of this reference `'1`      
...
12 |     (&mut values[..mid], &mut values[mid..])
   |     --------------------------^^^^^^--------
   |     |     |                   |
   |     |     |                   second mutable borrow occurs here
   |     |     first mutable borrow occurs here
   |     returning this value requires that `*values` is borrowed for `'1`

编译器不允许借用同一个 slice 两次，虽然实际上这两次借用了 slice 中完全不重叠的两个部分（事实上安全，属于“我们比编译器知道更多情况”）。

只好用 unsafe 让编译器闭嘴了：

use std::slice;
 
fn main() {
    let mut v = vec![1, 2, 3, 4, 5, 6];
    let r = &mut v[..];
    let (a, b) = split_at_mut(r, 3);
    assert_eq!(a, &mut [1, 2, 3]);
    assert_eq!(b, &mut [4, 5, 6]);
}
 
fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();
    assert!(mid <= len);
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

slice 是一个指向一些数据的指针，并带有该 slice 的长度，调用 as_mut_ptr() 将其裸指针存储在 ptr。

unsafe 块中调用了 slice::from_raw_parts_mut() 和 add()，它们都是 unsafe 的，它们要求的前提是接收的参数（裸指针，索引）是有效的，这已经由 assert!(mid <= len) 手动验证了。

使用 `extern` 函数调用外部代码

关键字 extern 用于创建和使用外部函数接口，即其他编程语言的函数。

集成 C 语言标准库中的 abs()：

extern "C" {
    fn abs(input: i32) -> i32;
}
 
fn main() {
    let x = -2;
    unsafe {
        println!("In C, abs({x})={}", abs(x));
    }
}

其他语言的实现无法由 Rust 检查，因此都需要放在 unsafe 块中调用，不过 extern 本身无需搭配 unsafe 使用。

访问或修改可变静态变量

Rust 其实支持全局变量，不过其对于所有权规则存在问题：如果两个线程访问同一个可变的全局变量，潜在有数据竞争。

全局变量在 Rust 中称为静态(static) 变量，声明一个不可变的 PI：

static PI: f64 = 3.14; // 必须标注类型
 
fn main() {
    println!("Pi = {PI}");
}

静态变量只能存储拥有 'static 生命周期的引用，因此编译器能够自行计算生命周期。

访问不可变静态变量是安全的。

区分静态变量和常量：

不可变静态变量的值有固定的内存地址，而常量的值可以被复制。
静态变量可以是可变的（可变静态变量），对其访问和修改都是 unsafe 的。

声明一个可变静态变量，随后访问并修改：

static mut COUNT: u32 = 0; // 指明可变
 
fn main() {
    add_to_count(3);
    unsafe { // 访问必须在 unsafe 块内
        println!("COUNT={COUNT}");
    }
}
 
fn add_to_count(inc: u32) {
    unsafe { // 修改也必须在 unsafe 块内
        COUNT += inc;
    }
}

COUNT=3

这段单线程代码怎么看都很安全，但在多线程上下文中，多个线程同时进行操作时可能出错，因此访问和修改的代码必须在 unsafe 块内。

实现不安全 trait

在 trait 前部添加 unsafe 以创建一个不安全 trait：

unsafe trait Danger {
	// ...
}
 
unsafe impl Danger for Ous {
	// ...
}

对不安全 trait 进行实现时，也需在前部添加 unsafe。

访问 `union` 的字段

union 主要用于与 C 代码中的联合体交互，也属于不安全操作。

何时使用 `unsafe`

要先确保 unsafe 中的代码是正确的，不过不管怎么说，显式标注 unsafe 使得问题发生时更容易排查。

高级 trait

关联类型在 trait 定义中指定占位符类型

关联类型(associated types) 是一个将类型占位符与 trait 相关联的方式，这样 trait 的方法签名中就可以使用这些占位符类型。

标准库提供的 Iterator trait 的定义中就有一个关联类型 Item：

pub trait Iterator {
    type Item;
    
    fn next(&mut self) -> Option<Self::Item>;
}

关联类型是 trait 契约之一，实现时必须提供具体类型代替它。看起来类似于泛型，为 Counter struct 实现 Iterator：

impl Iterator for Counter {
	type Item = u32;
	
	fn next(&mut self) -> Option<Self::Item> {
		// ...
	}
}

看起来就是泛型，那么为什么不如下定义 Iterator：

pub trait Iterator<T> {
	fn next(&mut self) -> Option<T>;
}

如果是这样，假设有多个 impl Iterator for Counter 的实现，每个都有泛型参数 T 的不同的具体类型，那么每次调用 Counter::next() 时，都要提供类型注解以指明使用的实现。

如果通过关联类型实现 Iterator，则无需每次都标注类型，因为根本无法多次实现 Iterator，也即只能有一个 impl Iterator for Counter。

默认泛型类型参数和运算符重载

上个代码块中，可以为泛型类型参数指定默认值：

pub trait Iterator<T = u32> {
    fn next(&mut self) -> Option<T>;
}

Rust 不允许创建自定义运算符或重载运算符，不过 std::ops 中所列的运算符可以通过实现其相关的 trait 进行重载。

为 Point struct 实现 Add trait，重载 +，以实现两个 point 相加：

use std::ops::Add;
 
#[derive(Debug, Copy, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}
 
impl Add for Point {
    type Output = Point;
    
    fn add(self, other: Point) -> Point {
        Point {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}
 
fn main() {
    assert_eq!(
        Point { x: 1, y: 0 } + Point { x: 2, y: 3 },
        Point { x: 3, y: 3 }
    );
}

Add trait 有一个关联类型 Output，决定了 add() 的返回值类型：

trait Add<Rhs = Self> {
    type Output;
    
    fn add(self, rhs: Rhs) -> Self::Output;
}

泛型类型参数 Rhs(right hand side) 的默认值为 Self，为 Point 实现 Add 时没有指定泛型类型，因此采用了默认值 Self，此即为 Point。

另一个不采用默认值的示例：

use std::ops::Add;
 
struct Millimeters(u32);
 
struct Meters(u32);
 
impl Add<Meters> for Millimeters {
    type Output = Millimeters;
    
    fn add(self, other: Meters) -> Millimeters {
        Millimeters(self.0 + (other.0 * 1000))
    }
}

将 Millimeters 与 Meters 相加，并指定结果类型为 Meters。

完全限定语法与消歧义：调用相同名称的方法

Rust 既无法避免两个 trait 拥有同名方法，也无法阻止为同一类型同时实现这两个 trait，甚至直接在类型上实现本已存在的同名方法也是可能的。

调用这样的同名方法时，需要指定具体是哪一个：

trait Pilot {
    fn fly(&self);
}
 
trait Wizard {
    fn fly(&self);
}
 
struct Bird;
 
impl Bird {
    fn fly(&self) {
        println!("A bird can fly, of course!");
    }
}
 
impl Pilot for Bird {
    fn fly(&self) {
        println!("This is your captain speaking.");
    }
}
 
impl Wizard for Bird {
    fn fly(&self) {
        println!("abracadabra!");
    }
}

Pilot 和 Wizard 两个 trait 有同名方法 fly()，Bird struct 已经事先实现 fly()，再为 Bird 分别实现两个 trait。现在，对 bird 调用 fly()：

let bird = Bird;
bird.fly();

A bird can fly, of course!

编译器会默认调用 impl Bird 块中的 fly() 实现。

要调用 Pilot::fly() 和 Wizard::fly()，需要指明：

let bird = Bird;
bird.fly();
Pilot::fly(&bird);
Wizard::fly(&bird);

A bird can fly, of course!
This is your captain speaking.
abracadabra!

这对于接收 &self 参数的方法足够了。

但是函数并不接收 &self，对于同名函数需要使用完全限定语法：

trait Animal {
    fn name() -> String;
}
 
struct Dog;
 
impl Dog {
    fn name() -> String {
        String::from("Lucky")
    }
}
 
impl Animal for Dog {
    fn name() -> String {
        String::from("puppy")
    }
}

Animal trait 有函数 name()，Dog struct 实现了同名函数，又实现了 Animal::name()。

println!("Here's {} the dog.", Dog::name());

Here's Lucky the dog.

这没问题，现在，尝试调用 Animal::name()：

println!("A baby dog is often called a {}", Animal::name());

无法过编：

error[E0790]: cannot call associated function on trait without specifying the corresponding `impl` type
  --> src\main.rs:21:49
   |
2  |     fn name() -> String;
   |     -------------------- `Animal::name` defined here
...
21 |     println!("A baby dog is often called a {}", Animal::name());
   |                                                 ^^^^^^^^^^^^ cannot call associated function of trait
   |
help: use the fully-qualified path to the only available implementation
   |
21 |     println!("A baby dog is often called a {}", <Dog as Animal>::name());     
   |                                                 +++++++       +

由于 Animal::name() 不接收 &self，同时它可能同时被其他类型实现，Rust 不知道应该调用哪一个 Animal::name() 实现。

为了消歧义，需要指定调用的是 Dog 的 Animal 实现：

println!("Here's {} the dog.", Dog::name());
println!("A baby dog is often called a {}", <Dog as Animal>::name());

Here's Lucky the dog.
A baby dog is often called a puppy

完全限定语法的形式：

<Type as Trait>::function(receiver_if_method, next_arg, ...);

🥑 EmberNecro

Explorer

17 高级特征

`unsafe` Rust

`unsafe` 的超能力

解引用裸指针

调用不安全函数或方法

创建 `unsafe` 代码的安全抽象

使用 `extern` 函数调用外部代码

访问或修改可变静态变量

实现不安全 trait

访问 `union` 的字段

何时使用 `unsafe`

高级 trait

关联类型在 trait 定义中指定占位符类型

默认泛型类型参数和运算符重载

完全限定语法与消歧义：调用相同名称的方法

TODO：父 trait 用于在另一个 trait 中使用某 trait 的功能

TODO：newtype 模式用以在外部类型上实现外部 trait

Graph View

Table of Contents

Backlinks

🥑 EmberNecro

Explorer

17 高级特征

unsafe Rust

unsafe 的超能力

解引用裸指针

调用不安全函数或方法

创建 unsafe 代码的安全抽象

使用 extern 函数调用外部代码

访问或修改可变静态变量

实现不安全 trait

访问 union 的字段

何时使用 unsafe

高级 trait

关联类型在 trait 定义中指定占位符类型

默认泛型类型参数和运算符重载

完全限定语法与消歧义：调用相同名称的方法

TODO：父 trait 用于在另一个 trait 中使用某 trait 的功能

TODO：newtype 模式用以在外部类型上实现外部 trait

Graph View

Table of Contents

Backlinks

`unsafe` Rust

`unsafe` 的超能力

创建 `unsafe` 代码的安全抽象

使用 `extern` 函数调用外部代码

访问 `union` 的字段

何时使用 `unsafe`