Tutorial 0: Rust for PyTorch Users
You don’t need to learn all of Rust to use floDl. This tutorial covers the 10 patterns you’ll actually encounter in training scripts. If you’ve written PyTorch, you already have the mental model — Rust just makes a few things explicit that Python leaves implicit.
Time: ~15 minutes. After this you can follow every floDl tutorial.
Don’t want to translate by hand? floDl ships with an AI porting skill that reads your PyTorch script and generates a complete floDl project. Run
/port my_model.py in Claude Code, or see the Porting Guide. For a side-by-side API reference, see the PyTorch Migration Guide. You can also run fdl api-ref to get the full, up-to-date API surface in your terminal.
1. Variables: let and let mut
let x = 5; // immutable — like Python, most things don't change
let mut y = 10; // mutable — opt-in, like declaring "I will modify this"
y += 1; // ok
// x += 1; // compile error — x is immutable
PyTorch equivalent: Python variables are always mutable. Rust defaults to
immutable and makes you say mut when you need mutation. This prevents
accidental overwrites — the compiler catches them for you.
Optimizers need mut because step() updates internal state:
let mut optimizer = Adam::new(&params, 0.001);
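Without mut, the compiler rejects any call that mutates the optimizer. A minimal sketch, assuming the Adam API above (frozen is just an illustrative name):
let frozen = Adam::new(&params, 0.001);
// frozen.step()?;   // compile error: cannot borrow `frozen` as mutable
optimizer.step()?;   // ok: the `mut` binding above allows mutation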
2. References: &x
let a = Tensor::zeros(&[2, 3], opts)?;
let b = Tensor::ones(&[2, 3], opts)?;
let c = a.add(&b)?; // &b = "borrow b without taking ownership"
// b is still usable here
PyTorch equivalent: In Python, everything is passed by reference
automatically. Rust makes you write & to say “I’m borrowing this, not
consuming it.” Most floDl methods take &self or &Tensor — you’ll see &
everywhere, and it always means the same thing: “use it, don’t move it.”
3. The ? Operator and Result<T>
GPU operations can fail (shape mismatch, out of memory, etc.). Python raises
exceptions. Rust returns Result<T> — either Ok(value) or Err(error):
// The ? operator: "if this failed, return the error immediately"
let y = x.matmul(&w)?.add(&b)?.relu()?;
This is equivalent to:
# Python — exceptions propagate implicitly
y = (x @ w + b).relu()
The only difference: Rust makes you acknowledge each fallible call with ?.
Chain them freely — a single ? per call is all you need.
Your main function returns Result<()> to enable ?:
fn main() -> Result<()> {
    // ... use ? freely ...
    Ok(()) // "everything succeeded"
}
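If you ever wonder what ? is doing, it roughly expands to an explicit match on the Result. A sketch using the relu() call from above:
// `let y = x.relu()?;` is roughly shorthand for:
let y = match x.relu() {
    Ok(value) => value,      // success: keep the value and continue
    Err(e) => return Err(e), // failure: return the error from the current function
};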
4. Closures: || {}
// Rust closure — like Python's lambda, but multi-line
let make_batch = || {
    let x = Tensor::randn(&[16, 2], opts).unwrap();
    let y = Tensor::randn(&[16, 2], opts).unwrap();
    (x, y)
};
let (x, y) = make_batch();
You’ll see closures in no_grad, iterators, and data generation:
// No-grad inference (like `with torch.no_grad():`)
let pred = no_grad(|| model.forward(&input))?;
// Generate batches
let batches: Vec<_> = (0..32).map(|_| make_batch()).collect();
5. Vectors, Slices, and Iteration
// Vec<T> — like Python's list, but typed
let mut losses: Vec<f64> = Vec::new();
losses.push(0.25); // like list.append
// Slices &[T] — a view into contiguous data
let shape: &[i64] = &[2, 3];
let t = Tensor::zeros(shape, opts)?;
// Iteration — like Python's for loop
for (input, target) in &batches {
    let pred = model.forward(&input)?;
    // ...
}
// Ranges
for epoch in 0..100 {
    // epoch goes 0, 1, 2, ..., 99
}
Shape arguments are always &[i64] slices: &[2, 3], &[batch, features].
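Because a Vec<i64> borrows as a &[i64], you can also build shapes at runtime and pass them the same way. A sketch, reusing Tensor::zeros from above:
let batch: i64 = 16;
let features: i64 = 2;
let shape = vec![batch, features];    // Vec<i64>
let t = Tensor::zeros(&shape, opts)?; // &Vec<i64> coerces to &[i64]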
6. Traits: Like Abstract Classes
A trait defines behavior that types can implement. Module is the key trait:
// floDl's Module trait (simplified)
trait Module {
    fn forward(&self, input: &Variable) -> Result<Variable>;
    fn parameters(&self) -> Vec<Parameter>;
}
Every layer (Linear, GELU, LayerNorm) and every Graph implements
Module. You call .forward() on any of them the same way:
let pred = model.forward(&input)?; // works for Linear, Graph, anything
PyTorch equivalent: nn.Module — same concept, Rust just enforces the
interface at compile time instead of duck-typing it.
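You can implement the trait for your own types too. A minimal sketch, assuming the simplified trait above and the Linear and GELU layers used later in this tutorial (the real trait may require more methods):
struct TinyMLP {
    fc1: Linear,
    fc2: Linear,
}

impl Module for TinyMLP {
    fn forward(&self, input: &Variable) -> Result<Variable> {
        // Chain the child layers' forward passes
        let h = self.fc1.forward(input)?;
        let h = GELU.forward(&h)?;
        self.fc2.forward(&h)
    }

    fn parameters(&self) -> Vec<Parameter> {
        // Expose every child layer's parameters to the optimizer
        let mut params = self.fc1.parameters();
        params.extend(self.fc2.parameters());
        params
    }
}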
7. Types You’ll See But Don’t Need to Understand
These appear in signatures and error messages. You don’t need to construct them:
| Type | What it means | When you see it |
|---|---|---|
| Result<T> | “Might fail” — use ? | Every tensor/module operation |
| Option<T> | “Might be absent” — Some(v) or None | x.grad() returns Option<Tensor> |
| Vec<T> | Growable array | parameters(), shape data |
| &[T] | Slice (borrowed view) | Shape arguments: &[2, 3] |
| Box<dyn Module> | Any module, heap-allocated | Inside modules![...] macro |
| Rc<RefCell<...>> | Shared mutable state | Variable internals (you never touch this) |
| f64, f32, i64 | Number types | Loss values, shapes, indices |
| usize | Unsigned index | Loop counters, .len() |
| () | “Nothing” (like Python’s None) | Result<()> = “succeeds or fails, no value” |
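The one you will handle directly most often is Option<T>, for example when reading gradients. A minimal sketch, assuming x is a Variable and backward() has already run:
// x.grad() returns Option<Tensor>
match x.grad() {
    Some(grad) => println!("gradient: {:?}", grad), // Some(tensor) once backward() has run
    None => println!("no gradient yet"),            // None before backward()
}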
8. Common Gotchas
Move semantics
let a = Tensor::zeros(&[2, 3], opts)?;
let b = a; // a is "moved" into b — a is no longer valid
// println!("{:?}", a); // compile error: a was moved
// Fix: clone if you need both
let a = Tensor::zeros(&[2, 3], opts)?;
let b = a.clone(); // b is a shallow copy (shared storage, like PyTorch)
// both a and b are valid
In practice, floDl methods take references (&self, &Tensor), so moves are
rare in training code. The compiler tells you when it happens.
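The one move you will routinely see is constructing a Variable, which (as used in the full training loop at the end of this tutorial) is handed an owned tensor, so clone if you still need the original:
// A sketch: Variable::new is given the tensor by value here,
// so pass a clone to keep using x afterwards
let input = Variable::new(x.clone(), true);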
Mutable borrow rules
let mut v = vec![1, 2, 3];
// Can't borrow mutably and immutably at the same time:
// let r = &v[0];
// v.push(4); // compile error — v is borrowed by r
// println!("{}", r);
// Fix: copy the value out (or drop the immutable borrow) before mutating
let r = v[0]; // copies the element
v.push(4); // now fine
You’ll rarely hit this in floDl code. When you do, the compiler error message tells you exactly what’s conflicting.
Semicolons matter
fn add_one(x: i64) -> i64 {
    x + 1 // no semicolon = this is the return value
}
fn add_one_v2(x: i64) -> i64 {
    return x + 1; // explicit return also works
}
The last expression without a semicolon is the return value. Add a semicolon
and it becomes a statement that returns () — the compiler will tell you if
you get this wrong.
Turbofish ::<Type>
Occasionally Rust needs a type hint:
let data: Vec<f32> = t.to_f32_vec()?;
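The same hint can be written on the call itself with the turbofish. A sketch in plain Rust, building a shape vector:
// ::<Type> pins the generic type at the call site
let shape = (0..3).collect::<Vec<i64>>();
// equivalent to: let shape: Vec<i64> = (0..3).collect();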
In floDl code, type annotations on let bindings are usually enough.
9. String Formatting
// println! with {} placeholders (like Python's f-strings)
println!("Epoch {}: loss = {:.4}", epoch, loss_val);
// ^^^^ 4 decimal places
// Format strings
let msg = format!("loss: {:.6}", loss_val);
| Format | Output | Python equivalent |
|---|---|---|
| {} | Default | {} |
| {:.4} | 4 decimal places | {:.4f} |
| {:>10} | Right-align, width 10 | {:>10} |
| {:?} | Debug output | repr() |
10. The Full Picture: PyTorch vs floDl Training Loop
# PyTorch
import torch
import torch.nn as nn
model = nn.Sequential(
    nn.Linear(1, 32),
    nn.GELU(),
    nn.Linear(32, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
model.train()
for epoch in range(200):
    for x, y in batches:
        optimizer.zero_grad()
        pred = model(x)
        loss = nn.functional.mse_loss(pred, y)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch}: loss={loss.item():.4f}")
// floDl
use flodl::*;
fn main() -> Result<()> {
    let model = FlowBuilder::from(Linear::new(1, 32)?)
        .through(GELU)
        .through(Linear::new(32, 1)?)
        .build()?;
    let params = model.parameters();
    let mut optimizer = Adam::new(&params, 0.01);
    model.train();
    for epoch in 0..200 {
        let mut last_loss = 0.0;
        for (x, y) in &batches {
            let input = Variable::new(x.clone(), true);
            let target = Variable::new(y.clone(), false);
            optimizer.zero_grad();
            let pred = model.forward(&input)?;
            let loss = mse_loss(&pred, &target)?;
            last_loss = loss.item()?;
            loss.backward()?;
            optimizer.step()?;
        }
        println!("Epoch {}: loss={:.4}", epoch, last_loss);
    }
    Ok(())
}
What’s different:
- use flodl::* instead of import torch
- Shapes are slices: &[1, 32] not (1, 32)
- Every fallible call has ?
- Tensors become Variable for gradient tracking
- model.forward(&input) not model(input)
- fn main() -> Result<()> + Ok(()) at the end
What’s the same:
- Build model, create optimizer, training loop, zero_grad/forward/backward/step
- The structure is identical — Rust just makes error handling and ownership explicit
You’re ready. The patterns above cover 95% of what you’ll write in floDl.