Intro to Rust for JS developers - Part 1
Rust has become increasingly popular in the JavaScript ecosystem, where it is known for its focus on performance and memory safety. You can spot its use in tools such as Turbo, Biome JS, Volta, LightningCSS... There is even an initiative to rewrite TypeScript in Rust!
As a JavaScript developer, I decided it's time to give it a shot, and the best way to learn is by doing. So in this series of tutorials, we're going to build a fun project together: a CLI named audiofy that turns online articles into podcasts.
It will use the text-to-speech capabilities provided by the OpenAI API to achieve this. For instance, once the tool is built, the following command in the terminal would yield an .mp3 file of the article, narrated as if by a professional podcaster:
$ audiofy http://some-interesting-blog-post.com
Pretty cool, isn't it? Let's get started!
1. Init the audiofy project: Cargo and Crate
Installing Rust is straightforward and takes just one command, as outlined in the official documentation at rust-lang.org/tools/install. The only extra step worth mentioning is adding the appropriate extension to your IDE for a better DX (rust-analyzer for VSCode).
Once it's installed (running rustc --version prints the installed version), you can create a new project with the cargo new command. It sets up a new directory with the necessary structure for a Rust project and initializes it as a new Git repository.
cargo new audiofy
Cargo is Rust's build system and package manager. It does a lot: creating a new project, downloading dependencies, compiling your project and its packages, running your code, building distributable packages, and uploading them to crates.io, among other things.
Crate is the term used for a package of Rust code. It refers to a module or a group of modules (think of it as a JavaScript library). The package manager, cargo, downloads and compiles your crate's dependencies. Each crate contains a Cargo.toml "manifest" file at the root, which describes the package.
Crates.io is the public registry where you download (and share) crates. It functions as a centralized hub for publishing and sharing open-source Rust libraries.
Cargo.toml: This file at the root of the directory is known as the "manifest". It contains metadata such as the package name, version, and dependencies. It is the equivalent of the package.json file in a JavaScript project and is written in the TOML (Tom's Obvious, Minimal Language) format.
main.rs: Inside the src directory, a main.rs file is created with a simple "Hello, world!" program as a starting point. This file is the main entry point for your Rust application.
Run your code to check the output:
$ cargo run
2. Display a welcome message to our CLI: Print, Dbg, and Panic
Just like console.log in JavaScript, Rust uses the print macros to output text to the console. The println!() macro adds a new line automatically.
fn main() {
    println!("Hello, world!");
}
A macro is code that writes other code, a technique known as metaprogramming. Macros are used for a variety of purposes, such as code generation or reducing boilerplate. The exclamation mark ! following the macro's name, as in println!, distinguishes a macro call from a regular function call.
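To make the macro family a bit more concrete, here is a minimal sketch of a few standard-library relatives of println!. The helper name greeting and its message are made up for illustration; only the macros themselves come from Rust's standard library.

```rust
// Hypothetical helper: builds the text instead of printing it.
// format! accepts the same placeholder syntax as println!.
fn greeting(name: &str, count: u32) -> String {
    format!("Hello, {name}! You have {count} new articles.")
}

fn main() {
    // print! does not add a newline, println! does.
    print!("Hello, ");
    println!("world!");

    // Positional placeholders are filled in order.
    println!("{} + {} = {}", 1, 2, 1 + 2);

    // eprintln! writes to standard error instead of standard output.
    eprintln!("this line goes to stderr");

    println!("{}", greeting("Ada", 3));
}
```

Keeping the formatting in a function that returns a String, rather than printing directly, also makes the output easy to check in a test.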
Another handy macro for quickly printing the value of an expression is dbg!. It's particularly useful because it lets you see the value of a variable or expression without much setup and without significantly altering the flow of your code.
Let's say you have a variable x and you want to see a value derived from it at a certain point in your program. You can use dbg! like this:
fn main() {
    let x = 5;
    let y = dbg!(x * 2) + 1;
    println!("y: {}", y);
}
// Output:
// [src/main.rs:3] x * 2 = 10
// y: 11
While println! can be used for similar purposes, dbg! provides additional context such as the file and line number, and doesn't require a format string.
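Two details of dbg! are easy to miss: it returns the value it was given, and it takes ownership of its argument. The sketch below illustrates both; the function name double_plus_one and the sample data are assumptions made for this example.

```rust
// Hypothetical helper used only for this illustration.
fn double_plus_one(x: i32) -> i32 {
    // dbg! prints the expression and its value to standard error,
    // then hands the value back, so it can sit inside a larger expression.
    dbg!(x * 2) + 1
}

fn main() {
    let names = vec![String::from("turbo"), String::from("biome")];

    // dbg! takes ownership of its argument. Passing a reference
    // lets us keep using `names` afterwards.
    dbg!(&names);

    println!("{} names, result = {}", names.len(), double_plus_one(5));
}
```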
If you need your code to crash with an error message when something goes wrong, you can use panic!. It's a similar idea to throwing an Error in JavaScript.
fn main() {
    let x = 5;
    let y = dbg!(x * 2 + 12) + 1;
    panic!("There's been an error!");
    println!("y: {}", y); // Displays a warning for unreachable statement
}
// Output:
// [src/main.rs:3] x * 2 + 12 = 22
// thread 'main' panicked at src/main.rs:4:5:
// There's been an error!
Let's add a friendly message to our CLI:
fn main() {
    let msg = "Audiofy: turn your favorite articles into a podcast!";
    println!("{}", msg);
}
Single quotes, double quotes, String and &str (string slice)
Unlike JavaScript, Rust differentiates between single and double quotes. Single quotes are used for character literals, which represent a single Unicode character and have the type char. For example, 'a' or 'β'.
On the other hand, double quotes are used for string literals, which are sequences of characters. A string literal such as "Hello, world!" has the type &str (a string slice).
This distinction is crucial in Rust as it enforces correct data types and usage at compile time, preventing errors that could arise from inadvertently confusing characters with strings or vice versa.
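Here is a small sketch of the difference in practice. The helper first_char is a made-up name for illustration; everything it calls is part of the standard library.

```rust
// Hypothetical helper: returns the first character of a string, if any.
fn first_char(text: &str) -> Option<char> {
    text.chars().next()
}

fn main() {
    let letter: char = 'a'; // single quotes: one Unicode character
    let word: &str = "a";   // double quotes: a string slice

    // A char always occupies 4 bytes, whatever the character is.
    println!("size of a char in bytes: {}", std::mem::size_of::<char>());

    // A string is stored as UTF-8, so "a" needs just 1 byte.
    println!("bytes in {:?}: {}", word, word.len());

    println!("{letter} {:?}", first_char("hello"));

    // let wrong: char = "a"; // would not compile: expected `char`, found `&str`
}
```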
Be mindful that the type of "Hello, world!" is not String, but &str (a string slice).
Rust has two primary types for working with text, &str and String, and they are used differently because of their distinct characteristics:
- String is a growable, heap-allocated data structure (we'll explore the heap in an upcoming article). It's the more complex of the two types and is typically used when you need to modify your string data, such as adding characters, concatenating strings, or changing the contents in any other way.
- &str is a string slice. It is an immutable view into a string and is essentially a reference to some UTF-8 data. It doesn't own the data it points to. The actual data is stored elsewhere, such as in a String or in a static string in the program.
You can turn a string slice &str into a String using to_string() or String::from().
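As a minimal sketch of moving between the two types, the example below converts a &str into a String, modifies it, and borrows it back as a &str. The function name shout is hypothetical.

```rust
// Hypothetical helper: accepts a borrowed slice and returns a new owned String.
fn shout(text: &str) -> String {
    let mut owned = text.to_string(); // &str -> String (allocates on the heap)
    owned.push('!');                  // a String can grow
    owned.to_uppercase()
}

fn main() {
    let literal: &str = "hello";

    // Both conversions produce the same owned String.
    let a: String = literal.to_string();
    let b: String = String::from(literal);
    println!("same contents: {}", a == b);

    // A String can be borrowed as a &str without copying the data.
    let view: &str = &a;
    println!("{}", shout(view));
}
```

Taking &str as the parameter type is a common choice because the function then works with both string literals and borrowed String values.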
Constants, Variables, and Statics
In Rust, there are let and const keywords, but they behave differently from their JavaScript counterparts.
let is used to declare a variable, but with a key difference from JavaScript: variables are immutable by default. This means that once a variable is bound to a value using let, it cannot be reassigned.
This immutability is a core aspect of Rust's design, prioritizing safety and predictability in code.
// This will cause a compile-time error
let x = 5;
x = 6;
// To change the value you must explicitly declare it as mutable
// using the mut keyword
let mut x = 5;
x = 6;
const declares a constant value, which is not just immutable but also requires a value that can be computed at compile time.
Constants can never be mutable, and they must always include a type annotation:
const MAX_POINTS: u32 = 100000;
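The sketch below puts let mut and const side by side. The constant and function names are invented for this example; the point is that a constant may be built from other constants as long as everything can be evaluated at compile time.

```rust
// Constants need a type annotation and a compile-time value.
const SECONDS_PER_MINUTE: u32 = 60;
// A constant expression may refer to other constants.
const SECONDS_PER_DAY: u32 = SECONDS_PER_MINUTE * 60 * 24;

// Hypothetical helper for illustration.
fn days_to_seconds(days: u32) -> u32 {
    days * SECONDS_PER_DAY
}

fn main() {
    // `mut` is required because the value changes after the first assignment.
    let mut total = 0;
    total += days_to_seconds(2);
    println!("{total}");
}
```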
Static variables are variables that have a static lifetime, meaning they exist for the entire duration of a program's execution. This is similar to global variables in other languages, but with some unique characteristics due to Rust's focus on safety and concurrency.
A static variable's initial value is computed at compile time, just like a constant's, and the variable can be accessed from anywhere in the program. Unlike a constant, it has a single fixed memory location throughout the program's lifetime.
Static variables can be mutable or immutable, but reading or writing a mutable static requires unsafe code because of the potential for data races.
// Immutable static variable
static LANGUAGE: &str = "Rust";
// Mutable static variable (every access is unsafe)
static mut COUNTER: i32 = 0;
fn main() {
    unsafe {
        COUNTER += 1;
    }
    println!("{}", LANGUAGE);
}
Static variables are used in the following cases:
- Values with Complex Initialization: A plain static, like a constant, needs an initializer that can be evaluated at compile time. When a global value has to be computed at run time, a static that holds a lazily initialized wrapper such as std::sync::OnceLock can be useful.
- Global State or Configuration: They can be used to maintain global state or configuration settings that need to be accessible across different parts of a program.
- Long-Lived Values: For values that are needed throughout the program's lifetime and don't fit the constant model, static variables are appropriate.
- Performance Optimization: Sometimes, using static variables can optimize performance by avoiding repeated initialization of data.
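As a hedged sketch of the first two use cases, the example below uses two standard-library types rather than static mut: std::sync::OnceLock (available since Rust 1.70) for a value computed on first access, and an atomic integer for a shared counter that needs no unsafe code. This is one possible approach, not the only one, and the names GREETING, greeting, and next_count are made up for illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::OnceLock;

// A static whose contents are computed at run time, on first access.
static GREETING: OnceLock<String> = OnceLock::new();

// A shared counter that can be updated without `unsafe`.
static COUNTER: AtomicU32 = AtomicU32::new(0);

// Hypothetical accessor: initializes the value once, then reuses it.
fn greeting() -> &'static str {
    GREETING.get_or_init(|| format!("Welcome to {}!", "audiofy"))
}

// Hypothetical helper: increments the counter and returns the new value.
fn next_count() -> u32 {
    COUNTER.fetch_add(1, Ordering::Relaxed) + 1
}

fn main() {
    println!("{}", greeting());
    println!("{} {}", next_count(), next_count());
}
```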
3. Read CLI arguments: The type system
To build our CLI, we need to handle command-line arguments to get the URL for the article we want to turn into a podcast.
We can do so by collecting the arguments with std::env::args, a function provided by Rust's standard library.
use std::env;
fn main() {
    let msg = "Audiofy: turn your favorite articles into a podcast!";
    println!("{}", msg);
    let args: Vec<String> = env::args().collect();
    // args[0] is the path to the program
    // Further elements are the passed command-line arguments
    println!("Command-line arguments: {:?}", args);
}
Let's break down and analyze this code:
The use keyword imports items from modules or crates into the current scope. The statement use std::env brings the env module from the standard library std into scope.
It's also possible to avoid typing env at each function call by importing the functions you need, like this:
use std::env::args;
// Or use std::env::{args, var} to import multiple functions (e.g. args and var)
fn main() {
    // Accessing command-line arguments
    let cli_args: Vec<String> = args().collect(); // Rust uses the snake_case convention for naming
    println!("Command-line arguments: {:?}", cli_args);
}
These lines of code bring us to one of the main features of the language: its type system.
Rust features a rich and robust type system that gives the developer fine-grained control over how memory is used and managed. Its types are broadly divided into two categories: scalar and compound.
- Scalar types represent a single value, crucial for basic data operations and control structures.
- Compound types, on the other hand, can group multiple values into one type.
We won't delve into each of the available types here. Instead, we'll gradually explore some of them as we progress through this project.
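To give a feel for the two categories, here is a short sketch with a handful of the built-in scalar types (integers, floats, booleans, characters) and the two primitive compound types (tuples and arrays). The variable names and the helper sum_scores are assumptions made for this example.

```rust
// Hypothetical helper: adds up a fixed-size array of small integers.
fn sum_scores(scores: [u8; 3]) -> u32 {
    scores.iter().map(|&s| u32::from(s)).sum()
}

fn main() {
    // Scalar types: a single value each.
    let count: i32 = -42;   // signed 32-bit integer
    let size: u64 = 1_000;  // unsigned 64-bit integer
    let ratio: f64 = 0.75;  // 64-bit floating point
    let ready: bool = true;
    let initial: char = 'R';

    // Compound types: several values grouped together.
    let pair: (i32, f64) = (count, ratio); // tuple, types may differ
    let scores: [u8; 3] = [10, 20, 30];    // array, fixed length, one type

    // Tuples can be destructured, much like arrays in JavaScript.
    let (first, second) = pair;
    println!("{first} {second} {size} {ready} {initial} {}", sum_scores(scores));
}
```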
In JavaScript, we are used to using the array data structure to hold a collection of elements. JS arrays are flexible in size and can hold elements of different types.
Arrays in Rust, on the other hand, have a fixed size known at compile time and hold elements of the same type. For example, an array of five 32-bit signed integers: let my_array: [i32; 5] = [1, 2, 3, 4, 5];
Rust's vectors (Vec<T>), meanwhile, are dynamic, heap-allocated collections that can grow or shrink at runtime, similar to JavaScript arrays in terms of flexibility, but with a uniform type for all elements and more control over memory layout and performance.
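The following sketch contrasts the two: a fixed-size array is passed to a function that builds a growable vector from it. The function name evens is hypothetical.

```rust
// Hypothetical helper: collects only the even numbers into a growable Vec.
fn evens(input: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    for &n in input {
        if n % 2 == 0 {
            out.push(n); // a Vec grows at run time
        }
    }
    out
}

fn main() {
    // The length (5) is part of the array's type.
    let fixed: [i32; 5] = [1, 2, 3, 4, 5];

    // A reference to an array can be used where a slice (&[i32]) is expected.
    let mut list: Vec<i32> = evens(&fixed);
    list.push(6); // grow
    list.pop();   // shrink

    println!("{} items in the array, {:?} in the vector", fixed.len(), list);
}
```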
Another concept in Rust that goes hand in hand with collections is the Iterator. Iterator is not a specific type but rather a trait.
A trait in Rust is similar to an interface in other programming languages: it defines a set of methods that types must implement.
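Here is a minimal sketch of defining and implementing a trait. The trait Summary and the struct Article are invented for this example and are not part of the audiofy project.

```rust
// A hypothetical trait: anything that can describe itself in one line.
trait Summary {
    // Required method: every implementor must provide it.
    fn title(&self) -> String;

    // Default method, built on top of the required one.
    fn summarize(&self) -> String {
        format!("Now playing: {}", self.title())
    }
}

struct Article {
    headline: String,
}

impl Summary for Article {
    fn title(&self) -> String {
        self.headline.clone()
    }
}

fn main() {
    let article = Article {
        headline: String::from("Intro to Rust"),
    };
    println!("{}", article.summarize());
}
```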
Most collection types in Rust, like vectors and arrays, can be turned into iterators using methods like iter() for immutable access or iter_mut() for mutable access.
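As a quick sketch of the difference, the first function below only reads through iter(), while the second changes each element in place through iter_mut(). Both function names are made up for illustration.

```rust
// Hypothetical helper: reads every element without modifying it.
fn total_length(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

// Hypothetical helper: rewrites every element in place.
fn uppercase_all(words: &mut [String]) {
    for w in words.iter_mut() {
        *w = w.to_uppercase();
    }
}

fn main() {
    let mut words = vec![String::from("turn"), String::from("articles")];
    println!("{}", total_length(&words));
    uppercase_all(&mut words);
    println!("{:?}", words);
}
```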
In our example, let args: Vec<String> = args().collect();, the env::args() function returns an iterator of type std::env::Args. This iterator yields values of type String. We call collect(), a method provided by the Iterator trait, to transform it into a vector of String (Vec<String>).
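collect() can build more than vectors: the target type decides what comes out. The sketch below, with a made-up helper unique_lengths and sample words, shows the type being supplied by an annotation, by the "turbofish" syntax, and by a function's return type.

```rust
use std::collections::HashSet;

// Hypothetical helper: the return type tells collect() to build a HashSet.
fn unique_lengths(words: &[&str]) -> HashSet<usize> {
    words.iter().map(|w| w.len()).collect()
}

fn main() {
    let words = ["a", "bb", "cc"];

    // The annotation on the left tells collect() what to build.
    let upper: Vec<String> = words.iter().map(|w| w.to_uppercase()).collect();

    // The same information can be given with the "turbofish" syntax.
    let joined = words.iter().copied().collect::<Vec<&str>>().join("-");

    println!("{:?} {} {:?}", upper, joined, unique_lengths(&words));
}
```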
Understanding {:?}
The {:?} in the println! macro is a format specifier that prints a value using the Debug trait. The Debug trait is part of Rust's standard library and is implemented for most standard types. It's intended to output a developer-facing representation of a value, which is particularly useful for debugging purposes.
When you use {:?}, you're asking println! to use the Debug implementation of the value's type to format it. This is different from {}, which uses the Display trait, meant for user-facing output. In this case, args is a vector, and Vec does not implement the Display trait.
If you try to print it with {}, you get a compile-time error.
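To see both traits side by side, here is a sketch with a hypothetical Episode struct: Debug is derived automatically, while Display is written by hand. The {:#?} variant is the pretty-printed form of Debug.

```rust
use std::fmt;

// Debug can be derived; Display has to be implemented by hand.
#[derive(Debug)]
struct Episode {
    title: String,
    minutes: u32,
}

impl fmt::Display for Episode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} ({} min)", self.title, self.minutes)
    }
}

fn main() {
    let ep = Episode {
        title: String::from("Rust"),
        minutes: 12,
    };
    println!("{}", ep);   // Display: user-facing
    println!("{:?}", ep);  // Debug: developer-facing, on one line
    println!("{:#?}", ep); // Debug, pretty-printed over several lines

    // Vec implements Debug but not Display.
    let args = vec!["a", "b"];
    println!("{:?}", args);
}
```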
Looping over an iterator
We may need to loop over the arguments to process multiple links provided by the user. To do so, we call the .iter() method on the vector. To get the index of each value as well, we chain .enumerate() onto the iterator.
use std::env;
fn main() {
    let msg = "Audiofy: turn your favorite articles into a podcast!";
    println!("{}", msg);
    let args: Vec<String> = env::args().collect();
    for (index, arg) in args.iter().enumerate() {
        println!("- Arg at index {}: {}", index, arg);
    }
}
Let's give it a try by running:
$ cargo run arg1 arg2 arg3
The result is almost as expected. There is an extra argument at index 0:
- Arg at index 0: target/debug/audiofy
- Arg at index 1: arg1
- Arg at index 2: arg2
- Arg at index 3: arg3
The extra argument at index 0 given by env::args().collect() is typically the path to the executable that's running.
This behavior is standard in many programming environments, where the first element of the array of command-line arguments (argv[0] in C and C++, for instance) is the program name or the full path to the executable.
We want to ensure we don't process this initial argument. To achieve this, Rust provides the convenient .skip(n) method, which allows us to skip a specified number of items in the iterator.
use std::env;
fn main() {
    let msg = "Audiofy: turn your favorite articles into a podcast!";
    println!("{}", msg);
    let args: Vec<String> = env::args().collect();
    for (index, arg) in args.iter().skip(1).enumerate() {
        println!("- Arg at index {}: {}", index, arg);
    }
}
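One detail worth noticing: because enumerate() comes after skip(1), the numbering restarts at 0 for the first real argument. The sketch below isolates that logic in a hypothetical helper, user_args, which accepts any iterator of String values so that it can be tried without a real command line.

```rust
// Hypothetical helper: drops the program path and numbers the remaining arguments.
fn user_args(raw: impl Iterator<Item = String>) -> Vec<(usize, String)> {
    // skip(1) runs first, so enumerate() starts counting at the first user argument.
    raw.skip(1).enumerate().collect()
}

fn main() {
    for (index, arg) in user_args(std::env::args()) {
        println!("- Arg at index {}: {}", index, arg);
    }
}
```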
4. Validate URL format: add dependencies
At this point, we can read the arguments, but we still need to check that each one is a well-formed URL. To do so, we use a crate called url.
As in a JavaScript project, Cargo offers two main ways to add a dependency:
- Manual addition: directly edit the Cargo.toml file and include the desired crate under the [dependencies] section.
- Using the cargo add command: for a more automated approach.
Cargo provides several options for customized dependency management. You can add a crate only for the development environment or choose a specific version, among other options detailed in the Cargo documentation.
In our case, to add the url crate, simply run cargo add url. This updates the Cargo.toml file by adding url as a dependency.
After adding a new dependency, it's important to run cargo build to download and compile the new crate along with any of its dependencies, ensuring everything is correctly set up for your project.
We want to use the Url struct and its parse method. If you explore its documentation, you might be wondering:
- What is a struct?
- What is pub fn?
- What is &str?
- What is Result<Url, ParseError>?
Struct (short for "structure") is a data structure that groups related data together. It's similar to a record, structure, or class in other programming languages, but without built-in notions of inheritance. For example: struct Person { name: String, age: u32 }.
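Building on the Person example, here is a small sketch of how a struct gets behavior through an impl block. The methods new and birthday are assumptions added for illustration.

```rust
#[derive(Debug)]
struct Person {
    name: String,
    age: u32,
}

impl Person {
    // Associated function, conventionally used as a constructor.
    fn new(name: &str, age: u32) -> Person {
        Person {
            name: name.to_string(),
            age,
        }
    }

    // Method that modifies the instance, so it takes &mut self.
    fn birthday(&mut self) {
        self.age += 1;
    }
}

fn main() {
    let mut ada = Person::new("Ada", 36);
    ada.birthday();
    println!("{} is {} ({:?})", ada.name, ada.age, ada);
}
```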
pub fn means the function is public, so it can be accessed from outside the module it's defined in. By default, functions are private to their module.
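A minimal sketch of this visibility rule, using an invented module named validation with one public and one private function:

```rust
mod validation {
    // Public: callable from outside the module.
    pub fn is_long_enough(input: &str) -> bool {
        trimmed_len(input) >= 3
    }

    // Private: only code inside `validation` can call this.
    fn trimmed_len(input: &str) -> usize {
        input.trim().len()
    }
}

fn main() {
    println!("{}", validation::is_long_enough("  ab  "));
    // validation::trimmed_len("x"); // would not compile: the function is private
}
```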
&str represents a string slice in Rust, which is a view into a string. It's a reference to a part of a String (or another string slice), denoted by its start and end positions. It's a more efficient way to pass strings around, as it avoids copying the string data.
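To illustrate a slice as a view into existing data, the sketch below returns the part of a string before the first colon without copying anything. The helper scheme is hypothetical and deliberately naive; it is not how the url crate parses URLs.

```rust
// Hypothetical helper: returns the text before the first ':' as a slice
// of the input, or an empty slice when there is no ':'.
fn scheme(url: &str) -> &str {
    match url.find(':') {
        Some(end) => &url[..end], // a view into `url`, no copy
        None => "",
    }
}

fn main() {
    // Works with a string literal...
    println!("{}", scheme("http://some-interesting-blog-post.com"));

    // ...and with a borrowed String.
    let owned = String::from("https://example.org");
    println!("{}", scheme(&owned));
}
```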
Result is an enum in Rust used for error handling. It represents either a success, Ok, or a failure, Err. Here's a simplified definition to illustrate its structure: enum Result<T, E> { Ok(T), Err(E) }.
In Result<Url, ParseError>, Url is the type returned in the case of success, and ParseError is the type of error returned in case of failure.
The Result enum has two helpful methods, is_ok() and is_err(), which are used to check whether the Result is an Ok variant or an Err variant, respectively.
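Result is not specific to the url crate. As a sketch that relies only on the standard library, the example below parses a port number with str::parse, which also returns a Result, and handles both variants with match. The helper name parse_port is made up for illustration.

```rust
use std::num::ParseIntError;

// Hypothetical helper: Ok(port) on success, Err(error) when the text is not a valid u16.
fn parse_port(input: &str) -> Result<u16, ParseIntError> {
    input.trim().parse::<u16>()
}

fn main() {
    for candidate in ["8080", "not-a-port"] {
        // match forces us to handle both variants.
        match parse_port(candidate) {
            Ok(port) => println!("{candidate} -> port {port}"),
            Err(err) => println!("{candidate} -> error: {err}"),
        }
    }

    // When only success or failure matters, is_ok() and is_err() are enough.
    println!("{}", parse_port("443").is_ok());
}
```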
Let's use is_ok to build a helper function is_valid_url.
use std::env::args;
use url::Url;
fn is_valid_url(url: &str) -> bool {
    let result = Url::parse(url);
    result.is_ok() // Without ; at the end, the value is returned automatically. It's shorthand for return result.is_ok();
}
fn main() {
    let args: Vec<String> = args().collect();
    println!("Audiofy: Transform your favorite articles into a podcast");
    for (index, arg) in args.iter().skip(1).enumerate() {
        if is_valid_url(arg) {
            println!("- Valid URL at index {}: {}", index, arg);
        } else {
            println!("- Invalid argument at index {}: {}", index, arg);
        }
    }
}
Let's try this code with two arguments, one with a valid URL format and one without:
$ cargo run http://some-interesting-blog-post.com some-invalid-url
// Output:
Audiofy: Transform your favorite articles into a podcast
- Valid URL at index 0: http://some-interesting-blog-post.com
- Invalid argument at index 1: some-invalid-url
5. To be continued
To wrap up, our current implementation works as intended. However, it can't yet verify that a given URL actually exists. Adding this feature will be the focus of the next article...