An Introduction to Anchor: A Beginner’s Guide to Building Solana Programs
What’s this Article About?
Rust is commonly described as the lingua franca of Solana program development. However, it is more accurate to apply that description to Anchor, as most Rust-based Solana development uses this framework. Anchor is an opinionated and powerful framework designed to build secure Solana programs quickly. It streamlines the development process by reducing boilerplate for areas such as account (de)serialization and instruction data, conducting essential security checks, generating client libraries automatically, and providing an extensive test environment.
This article explores how to develop Anchor programs. It covers installing Anchor, using Solana Playground, and creating, building, and deploying a simple Hello, World! program. Then, we’ll dive deeper into how Anchor streamlines the development process by examining IDLs, macros, the structure of Anchor programs, account types and constraints, and error handling. We’ll also cover Cross-Program Invocations and Program Derived Addresses briefly. This article will provide everything you need to know to get started with Anchor, today.
This article assumes knowledge of Solana’s programming model. I recommend reading my previous blog post The Solana Programming Model: An Introduction to Developing on Solana, if you’re new to building on Solana.
Don’t fret if you’re new to Rust - advanced knowledge is not required to start Anchor development. The Anchor documentation points out that developers only need to be comfortable with the basics of Rust (i.e., the first nine chapters of the Rust Book). I recommend watching The Rust Survival Guide for a nice breakdown of essential Rust programming concepts. It is also crucial to understand Rust’s memory, ownership, and borrowing rules.
To ease the learning curve, I recommend developers new to low-level programming languages review different systems programming-specific concepts Rust resources tend to skip over. For example, I’d recommend looking into topics such as variable sizes, pointers, and memory leaks. I also recommend Rust By Example and my repository on various data structures and algorithms written in Rust for examples of Rust in action.
This article focuses on Anchor development and Anchor development only. We will not cover how to develop programs in Native Rust, nor will this article assume knowledge thereof. Also, this article will not cover client-side development with Anchor - we will cover how to test and interact with Anchor programs via TypeScript in a future article.
Let’s get started with Anchor.
Setting up Anchor involves a few straightforward steps to install the necessary tools and packages. This section covers installing these tools and packages (i.e., Rust, the Solana Tool Suite, Yarn, and the Anchor Version Manager).
Installing Rust
Rust can be installed from the official Rust website or via the command line:
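At the time of writing, the installer one-liner published on the Rust website for Unix-like systems is:

```bash
# Download and run rustup, the official Rust toolchain installer
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Verify the installation afterwards
rustc --version
```

Windows users can download and run rustup-init.exe from the same website instead.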
Installing the Solana Tool Suite
Anchor also requires the Solana Tool Suite. The latest release (1.17.16 - at the time of writing this article) can be installed with the following command for macOS and Linux:
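The command, substituting v1.17.16 for whichever release you need, is:

```bash
sh -c "$(curl -sSfL https://release.solana.com/v1.17.16/install)"
```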
For Windows users, it is possible to install the Solana Tool Suite using the following command:
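At the time of writing, the documented approach downloads the installer binary and then runs it (the exact file name comes from the Solana release page, so double-check it against the official docs):

```bash
cmd /c "curl https://release.solana.com/v1.17.16/solana-install-init-x86_64-pc-windows-msvc.exe --output C:\solana-install-tmp\solana-install-init.exe --create-dirs"
C:\solana-install-tmp\solana-install-init.exe v1.17.16
```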
However, it is strongly recommended that you use Windows Subsystem for Linux (WSL) instead. This will allow you to run a Linux environment on your Windows machine without needing to dual boot or spin up a separate virtual machine. If you take this route, refer back to the installation instructions for Linux (i.e., the curl command).
Developers can also replace v1.17.16 with the release tag of the version they wish to download, or use the stable, beta, or edge channel names. Once installed, run solana --version to confirm the desired version of Solana is installed.
Anchor also requires Yarn. It can be installed using Corepack, which is included with all official Node.js releases starting from Node.js 14.19 / 16.9. However, Corepack is currently opt-in during its experimental stage, so we need to run corepack enable before it's active. Some third-party distributors may not include Corepack by default, so you may need to run npm install -g corepack before corepack enable.
Installing Anchor Using AVM
The Anchor documentation advises installing Anchor via the Anchor Version Manager (AVM). The AVM simplifies managing and selecting multiple installations of the anchor-cli binary. This may be required to produce verifiable builds, or to work with alternate versions across different programs. It can be installed using Cargo with the command: cargo install --git https://github.com/coral-xyz/anchor avm --locked --force. Then, install and use the latest version:
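That is:

```bash
# Install the latest anchor-cli release via AVM
avm install latest

# Make it the active version
avm use latest
```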
For a list of anchor-cli’s available versions, use the avm list command. Developers can use avm use <version> to use a specific version. This version will remain in use until it is changed. Developers can uninstall a specific version using the avm uninstall <version> command.
Installing Anchor Using Binaries and Building From Source
On Linux, Anchor binaries are available via the npm package @coral-xyz/anchor-cli. Currently, only x86_64 Linux is supported. So, developers must build from source for other operating systems. Developers can use Cargo to install the CLI directly. For example:
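Here using the v0.29.0 release tag as an example; substitute whichever version you need:

```bash
cargo install --git https://github.com/coral-xyz/anchor --tag v0.29.0 anchor-cli --locked
```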
Modify the --tag argument to install another desired Anchor version. Additional dependencies may need to be installed if the Cargo installation fails. For example, on Ubuntu:
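The dependency list below follows the Anchor documentation's Ubuntu instructions:

```bash
sudo apt-get update && sudo apt-get upgrade && \
  sudo apt-get install -y pkg-config build-essential libudev-dev libssl-dev
```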
Developers can then verify their Anchor installation with the anchor --version command.
Alternatively, developers can start with Anchor using Solana Playground (Solpg). Solana Playground is a browser-based IDE that facilitates the quick development, testing, and deployment of Solana programs.
Developers must create a Playground Wallet for their first time using Solana Playground. Click the red status indicator labeled Not connected at the bottom left of the screen. The following modal will pop up:
It is recommended to save the wallet’s keypair file as a backup before clicking Continue. This is because the Playground Wallet is saved in the browser’s local storage. Clearing the browser cache will remove the wallet.
Click Continue to create a devnet wallet ready to be used in the IDE.
To fund the wallet, developers can run solana airdrop <amount> in the Playground terminal, where <amount> is replaced with the desired amount of devnet SOL. Alternatively, visit this faucet for devnet SOL. I recommend checking out the following guide on how to get devnet SOL.
Note that you may encounter the following error:
This is often due to the devnet faucet being drained and/or requesting too much SOL. The current limit is 5 SOL, which is more than enough to deploy this program. It is therefore recommended to request 5 SOL from the faucet or execute the command solana airdrop 5. Requesting smaller amounts incrementally can potentially lead to rate-limiting.
Hello, World! programs are regarded as an excellent introduction to new frameworks or programming languages because of their simplicity: developers of all skill levels can understand them. These programs also elucidate the new programming model's basic structure and syntax without introducing complex logic or functions. Since Hello, World! has become the standard beginner program in coding, it's only natural that we write one ourselves for Anchor. This section covers how to build and deploy a Hello, World! program with a local Anchor setup as well as with Solana Playground.
Creating a new Project with a Local Anchor Setup
Creating a new project with Anchor installed is as easy as:
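Namely:

```bash
anchor init hello-world
cd hello-world
```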
These commands will initialize a new Anchor project called hello-world, and will navigate into its directory. In this directory, navigate to hello-world/programs/hello-world/src/lib.rs. This file contains the following starter code:
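At the time of writing, the generated template looks roughly like the following (the program ID in declare_id! will differ for your project):

```rust
use anchor_lang::prelude::*;

declare_id!("Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS");

#[program]
pub mod hello_world {
    use super::*;

    // An empty instruction that simply succeeds
    pub fn initialize(_ctx: Context<Initialize>) -> Result<()> {
        Ok(())
    }
}

// No accounts are required by the starter instruction
#[derive(Accounts)]
pub struct Initialize {}
```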
Anchor has prepared a number of files and directories for us. Namely,
- An empty app directory for the program’s client
- A programs folder that will house all of our Solana programs
- An Anchor.toml configuration file. If you’re new to Rust, a TOML file is a minimal configuration file format that’s easy to read due to its semantics. The Anchor.toml file is used to configure how Anchor will interact with the program. For example, what cluster the program should be deployed to.
Creating a New Project with Solana Playground
Creating a new project on Solana Playground is very straightforward. Navigate to the top left corner and click Create a New Project:
The following modal will pop up:
Name your program, select Anchor(Rust), and click Create. This will create a new Anchor project directly in your browser. Under the Program section on the left, you’ll see a src directory. It holds lib.rs, which has the following starter code:
Notice how Solana Playground only generates client.ts and anchor.test.ts files. I’d recommend reading through the section on creating a program with Anchor locally to see a breakdown of what is usually generated for a new Anchor project.
Writing Hello, World!
Regardless of whether you’re using Anchor locally or via Solana Playground, for a very simple Hello, World! program, replace the starter code with the following:
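A minimal version looks like this (keep the program ID that was generated for your project in declare_id!):

```rust
use anchor_lang::prelude::*;

// Placeholder; replace with your program's generated ID
declare_id!("Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS");

#[program]
pub mod hello_world {
    use super::*;

    pub fn hello(_ctx: Context<Hello>) -> Result<()> {
        // Log a message to the program's output
        msg!("Hello, World!");
        Ok(())
    }
}

// No accounts need to be passed, since we only log a message
#[derive(Accounts)]
pub struct Hello {}
```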
We’ll go over the exact specifics of each part in the subsequent sections. For now, it is important to notice the use of macros and traits to simplify the development process. The declare_id! macro sets the public key for the program. For local development, running anchor init to set up the program will generate a keypair in the target/deploy directory and populate this macro automatically. Solana Playground will also do this for us.
In our main hello_world module, we create a function that logs Hello, World! It also returns Ok(()) to signal successful program execution. Notice that we prefix ctx with an underscore to avoid unused variable warnings in our console. Hello is an account struct that does not require any accounts to be passed since the program only logs a new message.
That’s it! There’s no need to take in any accounts or do some complex logic. The code presented above creates a program that logs Hello, World!
Building and Deploying Locally
This section will focus on deploying to Localhost. Although Solana Playground defaults to devnet, a local development environment offers a significantly improved developer experience. Not only is it faster, but it also circumvents several issues commonly encountered when testing against devnet, such as insufficient SOL for transactions, slow deployments, and the inability to test when devnet is down. In contrast, developing locally guarantees a fresh state with each test, allowing for a more controlled and efficient development environment.
Configuring Our Tools
First, we want to ensure that the Solana Tool Suite is configured correctly for Localhost development. Run the solana config set --url localhost command to ensure all configurations point to Localhost URLs.
Also, ensure you have a local key pair to interact with Solana locally. You must have a Solana wallet with a SOL balance to deploy a program with the Solana CLI. Run the solana address command to check if you already have a local key pair. If you come across an error, run the solana-keygen new command. A new file system wallet will be created at the ~/.config/solana/id.json path by default. It will also provide a recovery phrase that can be used to recover the public and private keys. It is recommended to save this key pair, even though it is being used locally. Also note, if you already have a file system wallet saved at the default location, the solana-keygen new command will not override it unless specified with the --force command.
Configuring the Anchor.toml
Next, we want to ensure our Anchor.toml file correctly points to Localhost. Ensure it contains the following code:
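A minimal sketch of the relevant sections follows; the program ID shown is a placeholder, so use the one from your own build (note that Anchor.toml keys use the crate's snake_case name):

```toml
[programs.localnet]
hello_world = "Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS"

[provider]
cluster = "Localnet"
wallet = "~/.config/solana/id.json"
```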
Here, [programs.localnet] refers to the program’s ID on localnet (i.e., Localhost). The program ID is always specified in relation to the cluster. This is because the same program can be deployed to a different address on a different cluster. From a developer experience perspective, declaring new program IDs for programs deployed across different clusters can be annoying.
The program ID is public. However, its keypair is stored in the target/deploy folder, following a naming convention based on the program’s name. For example, if the program is named hello-world, Anchor will look for a keypair at target/deploy/hello_world-keypair.json (note the snake_case file name, which follows the crate name). Anchor will generate a new keypair if it does not find this file during deployment, which will result in a new program ID. Thus, updating the program ID after the first deployment is crucial. The hello_world-keypair.json file serves as proof of ownership for the program. If the keypair is leaked, malicious actors can make unauthorized changes to the program.
With [provider], we are telling Anchor to use Localhost and the specified wallet to pay for storage and transactions.
Building, Deploying, and Running a Local Ledger
Use the anchor build command to build the program. To build a specific program by its name, use the anchor build -p <program name> command, replacing <program name> with the program’s name. Since we’re developing on localnet, we can use the Anchor CLI’s localnet commands to streamline the development process. For example, anchor localnet --skip-build is particularly useful for skipping the build step for the programs in the workspace. This can save time when running tests where the program’s code has not been altered.
If we try to run the anchor deploy command now, we’ll get back an error. This is because we don’t have a Solana cluster running on our own machine that we can test against. We can run a local ledger to simulate a cluster on our machine. The Solana CLI comes with a test validator already built in. Running the solana-test-validator command will start a full-featured, single-node cluster on your workstation. This is beneficial for a number of reasons, such as no RPC rate limits, no airdrop limits, direct on-chain program deployment, loading accounts from files, and cloning accounts from a public cluster. The test validator must run in a separate open terminal window and remain running for the localhost cluster to stay online and be available for interaction.
We can now successfully run anchor deploy to deploy the program to our local ledger. Any data transmitted to the local ledger will be saved in a test-ledger folder generated in the current working directory. Adding this folder to your .gitignore file is recommended to avoid committing this folder to your repository. Also, exiting the local ledger (i.e., hitting Ctrl + C in the terminal) will not remove any data sent to the cluster. Removing the test-ledger folder or running solana-test-validator --reset will.
Congratulations! You’ve just deployed your first Solana program to Localhost!
Developers can also configure the Solana Explorer with their local ledger. Navigate to the Solana Explorer. In the navbar, click on the green button stating the current cluster:
This will open up a sidebar allowing you to choose a cluster. Click on Custom RPC URL. This should auto-fill with http://localhost:8899. If not, fill it in to have the explorer point to your machine at port 8899:
This is invaluable for several reasons:
- It allows developers to inspect transactions on their local ledger in real time, mirroring the capabilities they would normally have with a block explorer analyzing devnet or mainnet
- It is easier to visualize the state of accounts, tokens, and programs as if they were operating on a live cluster
- It provides detailed information regarding errors and transaction failures
- It provides a consistent development experience across clusters as it is a familiar interface
Deploying to Devnet
Although this article advocates for Localhost development, developers can also deploy to devnet if they wish to test against that cluster specifically. The process is largely the same, except that there is no need to run a local ledger (devnet is a fully-fledged Solana cluster that we can interact with!).
Run the command solana config set --url devnet to change the selected cluster to devnet. Any solana command run in the terminal will now be executed on devnet. Then, in the Anchor.toml file, duplicate the [programs.localnet] section and rename it to [programs.devnet]. Also, change [provider] so it now points to devnet:
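The result looks something like this (again, the program ID is a placeholder for your own):

```toml
[programs.devnet]
hello_world = "Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS"

[provider]
cluster = "Devnet"
wallet = "~/.config/solana/id.json"
```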
Developers must ensure they have devnet SOL to deploy the program. Use the solana airdrop <amount> command to airdrop to the default keypair location at ~/.config/solana/id.json. A wallet address can also be specified using solana airdrop <amount> <wallet address>. Alternatively, visit this faucet for devnet SOL. I recommend checking out the following guide on how to get devnet SOL.
Note that you may encounter the following error:
This is often due to the devnet faucet being drained and/or requesting too much SOL at once. The current limit is 5 SOL, which is more than enough to deploy this program. It is therefore recommended to request 5 SOL from the faucet or execute the command solana airdrop 5. Requesting smaller amounts incrementally can potentially lead to rate-limiting.
Now, build and deploy the program using the following commands:
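Namely:

```bash
anchor build
anchor deploy
```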
Congratulations! You’ve just deployed your first Solana program to devnet from a local setup!
Building and Deploying on Solana Playground
On Solana Playground, navigate to the Tools icon on the left sidebar. Click Build. In the console, you should see the following:
Notice how the ID in the declare_id! macro was overwritten. This new address is where we’ll be deploying the program. Now, click Deploy. You should have something similar to this in your console:
Congratulations! You’ve just deployed your first Solana program to devnet via Solana Playground!
Effective Abstraction: IDLs and Macros
Anchor simplifies program development through effective abstraction. That is, Anchor simplifies complex blockchain programming concepts, making them more accessible and easier to work with. For example, Anchor employs an Interface Definition Language (IDL) to define the program’s interface. When building a program, Anchor will generate a JSON file representing the program’s IDL. Essentially, this structure can be used on the client side, defining how to interact with the program’s functions and data structures. Anchor also provides higher-level abstractions for dealing with state management. Anchor allows developers to define the state of their program using Rust structs, which can be more intuitive than working with raw byte arrays or manual serializations. Thus, developers can define the state as they normally would with any typical Rust data structure, and then Anchor handles the underlying serialization and storage into accounts.
It is also very straightforward to publish an IDL on-chain. Developers can publish an IDL with the following command:
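Assuming our hello_world example and the IDL path Anchor generates on build, the invocation looks like this (substitute your own program ID):

```bash
anchor idl init --filepath target/idl/hello_world.json <program-id>
```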
Anchor's macros are one of, if not the most, important abstractions the framework offers. In Rust, a macro is a piece of code that generates another piece of code, a form of metaprogramming. Declarative macros are the most widely used form of macros in Rust. They allow developers to write something similar to a match expression via the macro_rules! construct. Procedural macros act more like a function, accepting some code as input, operating on that code, and producing some output. In Anchor, for example, the #[account] macro defines and enforces constraints on Solana accounts. This helps to reduce complexity and potential errors surrounding account management. Covering Anchor’s macros inevitably warrants a discussion of Anchor’s program structure.
Anchor Program Structure
Anchor’s program structure is designed to leverage a combination of macros and traits to generate boilerplate code and enforce program logic. This design philosophy plays a big part in streamlining the development process and ensuring consistency and reliability in program behavior.
use declarations are found at the top of the file. Note that they are a general Rust language semantic and are not specific to Anchor. These declarations create one or more local name bindings synonymous with some other path - use declarations shorten the path required to refer to a module item. They can appear in modules or blocks. Also, the self keyword can bind a list of paths with a common prefix and their common parent module. For example, these are all valid use declarations:
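For example (these are generic Rust imports rather than anything Anchor-specific):

```rust
// A single item
use std::collections::HashMap;

// Several items sharing a common prefix; `self` also binds the
// common parent module (`fmt`) itself
use std::fmt::{self, Display};

// A glob import bringing in everything a module exports
use anchor_lang::prelude::*;
```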
The first Anchor macro a developer will encounter is declare_id!. It is used to declare the program’s address (the program ID), ensuring all interactions are correctly routed to the program. Anchor will generate a new keypair when a developer builds an Anchor program for the first time. This is the key pair used to deploy the program unless stated otherwise. The key pair’s public key should be supplied as the program ID for the declare_id! macro:
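For example (the key shown here is the well-known placeholder that Anchor templates use; substitute your program's generated public key):

```rust
use anchor_lang::prelude::*;

declare_id!("Fg6PaFpoGXkYsidMpWTK6W2BeZ7FEfcYkg476zPFsLnS");
```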
The #[program] attribute macro denotes the program’s instruction logic module. It acts as an entry point, defining how the program interprets and executes incoming instructions. This macro simplifies the routing of these instructions to the appropriate function within the program, making the program’s code more organized and manageable. Each function within this module is treated as a separate instruction, and each takes a context parameter (ctx) of type Context as its first argument. Through this context, developers can access the instruction’s accounts, the program ID of the executing program, and the remaining accounts.
The Context type is defined as:
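Paraphrased from the Anchor source (field ordering and doc comments abbreviated):

```rust
pub struct Context<'a, 'b, 'c, 'info, T: Bumps> {
    /// Currently executing program ID
    pub program_id: &'a Pubkey,
    /// Deserialized accounts
    pub accounts: &'b mut T,
    /// Remaining accounts given but not deserialized or validated
    pub remaining_accounts: &'c [AccountInfo<'info>],
    /// Bump seeds found during constraint validation
    pub bumps: T::Bumps,
}
```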
This helps provide non-argument inputs to a given program. The program_id field is of type Pubkey and represents the currently executing program’s ID. accounts refers to the serialized accounts, whereas remaining_accounts refers to the remaining accounts that were given but not deserialized or validated - be very careful when using these directly. The bumps field is of type Bumps, generated by #[derive(Accounts)]. It represents the bump seeds found during constraint validation. We’ll cover account constraints in a later section. For now, it is important to know that this is provided as a convenience so handlers don’t have to recalculate bump seeds or pass them in as arguments.
Note that Context is a generic type. In Rust, generics allow developers to write flexible, reusable code that works with any data type. They enable type definitions for structs, enums, functions, and methods without specifying the exact type they will work with. Instead, a placeholder is used for those types, usually denoted as T. Generics help reduce repetitive code and increase clarity. For example, an enum can be defined to hold generic data types:
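Here is a simplified, runnable sketch of how the standard library defines Option<T> (defining it locally like this shadows the prelude's own Option):

```rust
// A simplified sketch of the standard library's Option<T> enum
#[derive(Debug, PartialEq)]
pub enum Option<T> {
    // Encapsulates a value of any concrete type T
    Some(T),
    // Encapsulates no value at all
    None,
}

fn main() {
    // The same generic definition works for any concrete type
    let number: Option<i32> = Option::Some(5);
    let absent: Option<&str> = Option::None;
    assert_eq!(number, Option::Some(5));
    assert_eq!(absent, Option::None);
    println!("{:?} and {:?}", number, absent); // prints "Some(5) and None"
}
```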
The code snippet above showcases the Option<T> enum. It is a standard Rust enum that can encapsulate a value of any type (i.e., Some(T)) or no type (None).
For our purposes, Context is a generic type with T specifying the required accounts for an instruction (i.e., whatever type a developer wants to create to store data). Developers can define T as a struct implementing the Accounts trait when using Context. For example, Context<SetData>. Developers can access fields within the Context type using dot notation. For instance, ctx.accounts accesses the accounts field of the Context struct.
As mentioned, the #[account] macro defines custom account types. In the next sections, we’ll investigate account types and constraints using #[account(...)]. For now, it is important to note that the Accounts struct is where a developer defines which accounts an instruction should expect and which constraints these accounts should adhere to.
The Account type is used when an instruction wants to access an account’s deserialized data. The Account struct is generic over T and is defined as:
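Paraphrased from the Anchor source, the wrapper holds the deserialized data alongside the raw account info:

```rust
pub struct Account<'info, T: AccountSerialize + AccountDeserialize + Clone> {
    // The deserialized account data
    account: T,
    // The underlying raw account
    info: AccountInfo<'info>,
}
```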
It is a wrapper for AccountInfo that verifies program ownership and deserializes the underlying data into a Rust type. It checks program ownership such that Account.info.owner == T::owner(). That is, it checks that the data owner is the same as the ID (the one created earlier with declare_id!) of the crate #[account] is used in. This means the data type that Account wraps around (=T) must implement the Owner trait. The #[account] attribute implements the trait for a struct using the crate::ID declared by declare_id! in the same program. Most of the time, developers can simply use the #[account] attribute to add the necessary traits and implementations to their data. The #[account] attribute generates implementations for the following traits:
The initial 8 bytes are allocated for a unique account discriminator when implementing traits for account serialization. This discriminator is determined by the first 8 bytes of the SHA-256 hash of the account’s Rust identifier. Any calls to AccountDeserialize’s try_deserialize will check this discriminator and exit account deserialization with an error if an invalid account was provided.
There will be instances where developers need to interact with non-Anchor programs. Here, developers can get all the benefits of Account if they create their own custom wrapper type instead of using #[account]. Take the following code snippet as an example:
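The following is a sketch in the spirit of the Anchor documentation's example. MyData and its mint field are hypothetical, while TokenAccount comes from the anchor_spl crate:

```rust
use anchor_lang::prelude::*;
use anchor_spl::token::TokenAccount;

// Hypothetical Anchor-defined state for this example
#[account]
pub struct MyData {
    pub mint: Pubkey,
}

#[derive(Accounts)]
pub struct SetData<'info> {
    #[account(mut)]
    pub my_account: Account<'info, MyData>,
    // TokenAccount is a custom wrapper around the token program's
    // account state; using it here means Anchor verifies this account
    // is owned by the token program before deserializing it, and its
    // fields can be used inside constraints
    #[account(constraint = my_account.mint == token_account.mint)]
    pub token_account: Account<'info, TokenAccount>,
}
```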
Most account validation is done via account constraints, which we’ll cover in the next section. But for now, see how the TokenAccount type is used to ensure that the incoming account is owned by the token program. TokenAccount wraps around the token program’s Account struct and adds the necessary functions. This ensures that Anchor can deserialize the account, and developers can use its fields inside account constraints and with the instruction function.
Also note in the code snippet above that the derive macro encapsulates the entire struct. This implements an Accounts deserializer on SetData and is used to validate incoming accounts.
Several Account types can be used in the account validation struct, including:
- Account<'info, T>: an account container that checks ownership on deserialization
- AccountInfo<'info>: an unchecked account that can be used as a type. However, UncheckedAccount should be used instead, as AccountInfo will likely go away in a future release
- AccountLoader<'info, T>: a type that facilitates on-demand zero-copy deserialization. This is different from using Account, as a developer must call load_init after initializing an account, load when the account is not mutable, and load_mut when the account is mutable
- Box<Account<'info, T>> or Box<InterfaceAccount<'info, T>>: a box type to save stack space, as sometimes accounts are too large for the stack and can lead to stack violations - boxing the account can help remedy this
- Interface<'info, T>: a type that wraps over Program and is used to validate that the account is one of a set of given programs. It checks if the expected program set contains the account's key and whether the account is executable
- InterfaceAccount<'info, T>: an account container that checks program ownership and deserializes the underlying data into a Rust type
- Option<Account<'info, T>>: an option type for optional accounts
- Program<'info, T>: a type that validates whether the account is the given program
- Signer<'info>: a type that validates whether the account signed the transaction
- SystemAccount<'info>: a type that validates whether the account is owned by the System Program
- Sysvar<'info, T>: a type that validates whether the account is a sysvar. That is, whether the account is a special type that contains dynamically updated data regarding the network cluster, the blockchain history, and the executing transaction. The clock, epoch_schedule, instructions, and rent sysvars are useful for program development
- UncheckedAccount<'info>: an account container that specifically emphasizes that no checks are performed on the specified account
Account constraints are vital to developing secure Anchor programs. In future articles, we’ll cover Solana program security and hacking Anchor programs more in-depth. However, it is important here to cover constraints. Constraints allow developers to verify certain accounts or the data they hold match some predefined requirements. Several different types of constraints can be applied using the #[account(...)] attribute, which can also refer to other data structures. The format is as follows:
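In general form, it looks like this (SomeData and the specific constraints are placeholders; constraints are comma-separated inside #[account(...)] and may refer to other fields of the struct):

```rust
use anchor_lang::prelude::*;

#[account]
pub struct SomeData {
    pub authority: Pubkey,
}

#[derive(Accounts)]
pub struct ExampleAccounts<'info> {
    // Multiple constraints can be combined on one account
    #[account(mut, has_one = authority)]
    pub some_account: Account<'info, SomeData>,
    pub authority: Signer<'info>,
}
```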
It is also important to note that, within the Accounts macro, developers can access the instructions arguments using the #[instruction(...)] attribute. Developers need to list the instruction arguments in the same order as in the instruction but can omit all arguments after the last one needed. For example, from the Anchor documentation:
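That example is along the following lines (reconstructed here with struct bodies elided in spirit; the instruction takes bump first, so the struct can reference it and omit the later arguments):

```rust
pub fn initialize(ctx: Context<Initialize>, bump: u8, authority: Pubkey, data: u64) -> Result<()> {
    Ok(())
}

#[derive(Accounts)]
// Only `bump` is needed by the constraints below, so `authority`
// and `data` can be omitted from the attribute
#[instruction(bump: u8)]
pub struct Initialize<'info> {
    #[account(seeds = [b"example"], bump = bump)]
    pub pda: SystemAccount<'info>,
}
```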
Account constraints can be divided into Normal Constraints and SPL Constraints. We will go over specific constraints throughout the remainder of this article. In these examples, <expr> represents an arbitrary expression that may be passed in so long as it evaluates to a value of the expected type. For example, owner = token_program.key().
Analyzing a Program’s Constraints
I recommend looking into the Anchor documentation regarding accounts for a more comprehensive list of possible constraints. It would be too arduous to loop through every constraint, providing a formal definition in some sort of table format. For our purposes, it is more beneficial to analyze the following program to get a feel for account constraints in action:
This is Helium’s Fanout program, a reasonably complex program that distributes tokens to token holders proportionally based on their holdings. At first glance, it may not look that useful for our purposes, as there aren’t any constraints in the code shown so far. However, if we analyze the stake_v0 instruction’s StakeV0 struct, we have a multitude of constraints to explore.
The first constraint in this instruction is the mut account constraint. mut is defined as #[account(mut)] or #[account(mut @ <custom_error>)], supporting custom errors with the @ notation. This constraint checks whether a given account is mutable and makes Anchor persist any state changes. In Helium's program, the constraint ensures that the payer account is mutable:
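In StakeV0, the payer is declared along these lines:

```rust
#[account(mut)]
pub payer: Signer<'info>,
```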
The has_one constraint is defined as #[account(has_one = <target_account>)] or #[account(has_one = <target_account> @ <custom_error>)]. It checks that the target_account field stored on the checked account’s data matches the key of the target_account field in the Accounts struct. Custom errors are supported via the @ notation.
Within the context of the StakeV0 struct, the has_one constraint is used to check whether the account has a membership_mint, token_account, and membership_collection:
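Paraphrased from Helium's source (the account and type names are abbreviated), the relevant declaration looks roughly like this:

```rust
#[account(
    mut,
    has_one = membership_mint,
    has_one = token_account,
    has_one = membership_collection,
)]
pub fanout: Box<Account<'info, FanoutV0>>,
```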
Notice how there are multiple has_one constraints, and the mut constraint is also being used. Multiple account constraints can be applied to a single account simultaneously.
The seeds and bump constraints are used to check that a given account is a PDA derived from the currently executing program, the seeds, and, if provided, the bump:
- #[account(seeds = <seeds>, bump)]
- #[account(seeds = <seeds>, bump, seeds::program = <expr>)]
- #[account(seeds = <seeds>, bump = <expr>)]
- #[account(seeds = <seeds>, bump = <expr>, seeds::program = <expr>)]
If the bump isn’t provided, Anchor will use the canonical bump. seeds::program = <expr> can be used to derive the PDA from a program other than the currently executing one.
In Helium’s fanout program, the seeds constraint checks whether the text “metadata”, token_metadata_program key, membership_collection key, and the text “edition” are seeds used to derive this PDA. The seeds::program constraint ensures that the token_metadata_program is being used to derive the PDA, instead of the current program:
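Paraphrased from the StakeV0 struct (the account name here is abbreviated from Helium's source), the constraint looks roughly like this:

```rust
#[account(
    seeds = [
        "metadata".as_bytes(),
        token_metadata_program.key().as_ref(),
        membership_collection.key().as_ref(),
        "edition".as_bytes(),
    ],
    // Derive the PDA from the token metadata program,
    // not the currently executing program
    seeds::program = token_metadata_program.key(),
    bump,
)]
pub master_edition: UncheckedAccount<'info>,
```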
The token::mint and token::authority constraints are defined as follows:
- #[account(token::mint = <target_account>, token::authority = <target_account>)]
- #[account(token::mint = <target_account>, token::authority = <target_account>, token::token_program = <target_account>)]
The mint and authority token constraints are used to verify the mint address and authority of a TokenAccount. These constraints can be used as a check or with the init constraint to create a token account with the given mint address and authority. When used as a check, it is possible to only specify a subset of the constraints.
Within the context of Helium’s program, these constraints are used to check whether the associated_token’s mint is equal to the membership_mint, and if the token’s authority is set to the staker:
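A sketch of such a check, with the account names taken from the prose above (the exact types in Helium's source may differ):

```rust
#[account(
    mut,
    token::mint = membership_mint,
    token::authority = staker,
)]
pub associated_token: Account<'info, TokenAccount>,
```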
init, payer, space
At this point, it makes sense to jump ahead a bit in the code to analyze the init, payer, and space constraints. The init constraint is defined as #[account(init, payer = <target_account>, space = <num_bytes>)]. This constraint creates the account via a CPI to the System Program and initializes it by setting its account discriminator. This marks the account as mutable and is mutually exclusive with mut. For accounts larger than 10 Kibibytes, use #[account(zero)].
The init constraint has to be used with some additional constraints. It requires the payer constraint, which specifies the account that will pay for the account creation. It also requires the System Program to exist on the struct and be called system_program. The space constraint must also be defined. In the Account Space section, we’ll dive deeper into this constraint and space requirements.
Regarding Helium’s fanout program, the init constraint creates a new account. The payer is set to the payer account, established earlier in the struct as pub payer: Signer<'info>. The account’s space is set to the size of FanoutVoucherV0, plus 8 bytes for the discriminator and an additional 61 bytes of space:
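A minimal sketch matching that description (the struct and field names are assumed for illustration):

```rust
#[derive(Accounts)]
pub struct InitializeFanoutVoucherV0<'info> {
    #[account(mut)]
    pub payer: Signer<'info>,
    #[account(
        init,
        payer = payer,
        // 8-byte discriminator + struct size + 61 extra bytes
        space = 8 + std::mem::size_of::<FanoutVoucherV0>() + 61,
    )]
    pub voucher: Account<'info, FanoutVoucherV0>,
    // init requires the System Program to be present under this exact name
    pub system_program: Program<'info, System>,
}
```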
The init_if_needed constraint is defined as #[account(init_if_needed, payer = <target_account>)] or #[account(init_if_needed, payer = <target_account>, space = <num_bytes>)]. This constraint has the exact same functionality as init. However, it only runs if the account does not exist yet. If the account does exist, init_if_needed still verifies that all the initialization constraints are met, such as the account having the correct amount of space allocated or, in the case of a PDA, the correct seeds.
init_if_needed should be used cautiously, as it’s gated behind a feature flag due to potential risks. To enable it, import anchor-lang with the init-if-needed cargo feature. It is crucial to safeguard against re-initialization attacks when using init_if_needed. Developers must ensure that their code includes checks to prevent the account from being reset to its initial state after its initialization unless this behavior is intended. It is considered best practice to keep instruction execution paths straightforward to mitigate these attacks. Consider dividing instructions into one for initialization and all others for subsequent operations.
Helium’s Fanout program uses the init_if_needed constraint to initialize the recipient_account, if the account does not already exist:
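A hedged sketch of what this might look like; the associated token constraints and account names here are assumptions for illustration, not Helium's exact source:

```rust
#[account(
    init_if_needed,
    payer = payer,
    associated_token::mint = membership_mint,
    associated_token::authority = recipient,
)]
pub recipient_account: Box<Account<'info, TokenAccount>>,
```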
The constraint constraint is defined as #[account(constraint = <expr>)] or #[account(constraint = <expr> @ <custom_error>)]. It checks whether the expression provided evaluates to true. This is useful when no other constraint fits the intended use case. It also supports custom errors via the @ annotation.
The Fanout program uses constraint to check whether the mint’s supply is set to zero:
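A sketch of what such a check might look like, combining constraint with the mint constraints discussed in the surrounding text (account names assumed):

```rust
#[account(
    mut,
    // Arbitrary boolean expression: the mint must have zero supply
    constraint = mint.supply == 0,
    mint::decimals = 0,
    mint::authority = voucher,
    mint::freeze_authority = voucher,
)]
pub mint: Box<Account<'info, Mint>>,
```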
mint::authority, mint::decimals, mint::freeze_authority
In the code snippet above, the mint::decimals, mint::authority, and mint::freeze_authority constraints are used to check that the mint’s decimals are set to zero and that the voucher is both the mint authority and the freeze authority.
For context, the mint::authority, mint::decimals, and mint::freeze_authority constraints are defined as:
- #[account(mint::authority = <target_account>, mint::decimals = <expr>)]
- #[account(mint::authority = <target_account>, mint::decimals = <expr>, mint::freeze_authority = <target_account>)]
These constraints are self-evident - that is, they check the mint’s authority, decimals, and freeze authority, respectively. They can be used as a check or with init to create a mint account with the given decimals and mint authority. The freeze authority is entirely optional when used with init. When used as a check, it is possible to only specify a subset of these constraints.
Every account a program uses on Solana must have its storage space explicitly allocated. This allocation is crucial for efficient resource management, ensuring that only necessary data is stored on-chain. It also contributes to predictable transaction costs and enhances the efficiency of transaction execution - transactions can be processed without the need to dynamically allocate or resize account storage. Moreover, pre-allocating data guarantees the account has enough space to store all its requisite data, reducing the risk of failed transactions or potential security vulnerabilities.
Different data types have different space requirements. Here’s a simplified guide to help estimate space requirements:
- Basic Types: simple data types like bool, u8, i8, u16, i16, u32, i32, u64, i64, u128, and i128 all have fixed sizes. This ranges from 1 byte for a bool (although it only uses 1 bit) to 16 bytes for u128 / i128
- Arrays: for an array [T;amount], the space is calculated as the size of T multiplied by the number of elements (i.e., amount). For example, an array of 16 u16 would require 32 bytes
- Pubkey: a public key always occupies 32 bytes on Solana
- Dynamic Types: String and Vec<T> require careful consideration. They both require 4 bytes to store their length, plus the space for the actual content. It’s essential to allocate enough space for the maximum expected size. For a String, this looks like 4 bytes plus the length of the String in bytes. For a Vec<T>, this looks like 4 bytes plus the space of the given type multiplied by the number of expected elements (i.e., 4 + space(T) * amount)
- Options and Enums: an Option<T> type requires 1 byte plus the space for the type T. Enums require 1 byte for the enum discriminator plus the required space for the largest variant
- Floating Points: types such as f32 and f64 occupy 4 and 8 bytes, respectively. Be careful with NaN values, as they can cause serialization to fail
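As a concrete illustration (a hypothetical account layout, not one from the article), the space for an account holding a Pubkey, a u64, an Option<u8>, and a String capped at 32 bytes can be tallied like this:

```rust
// Hypothetical account layout:
//   authority: Pubkey, balance: u64, flag: Option<u8>, name: String (max 32 bytes)
const DISCRIMINATOR: usize = 8;      // Anchor's internal discriminator
const PUBKEY: usize = 32;            // a public key is always 32 bytes
const U64: usize = 8;                // fixed-size integer
const OPTION_U8: usize = 1 + 1;      // 1-byte tag + space for the inner type
const STRING_MAX_32: usize = 4 + 32; // 4-byte length prefix + max content

fn main() {
    let space = DISCRIMINATOR + PUBKEY + U64 + OPTION_U8 + STRING_MAX_32;
    println!("{space}");
}
```

The resulting total (86 bytes here) is what would be passed to the space constraint when initializing the account.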
The guide above only applies to accounts that do not use zero-copy serialization. Zero-copy serialization is denoted by the #[zero_copy] attribute. It leverages the #[repr(C)] attribute for memory layout, which enables direct pointer casting to access data. This is an efficient way to work with on-chain data without the overhead of traditional deserialization. #[zero_copy] is shorthand for applying the #[derive(Copy, Clone)], #[derive(bytemuck::Zeroable)], #[derive(bytemuck::Pod)], and #[repr(C)] attributes. These attributes ensure the account can be safely treated as a sequence of bytes and is compatible with zero-copy deserialization. Zero-copy deserialization is crucial for accounts that demand significantly large sizes - accounts that cannot be efficiently serialized using Borsh or Anchor’s default serialization mechanisms without running into heap or stack limits.
Anchor’s Internal Discriminator
Developers must add 8 to the space constraint for Anchor’s internal discriminator. For example, if an account requires 32 bytes, it’ll need 40. It is considered good practice to set the space constraint as space = 8 + <account size> to make it known the internal discriminator is being considered in the space calculation.
As an aside, a discriminator is a unique identifier used to distinguish between different data types. This is useful for differentiating between different types of account data structures at runtime. It is also used to prefix instructions, assisting in routing these instructions to their corresponding methods within an Anchor program. The discriminator is an 8-byte array representing the data type’s unique identifier.
Calculating Initial Space
Calculating the initial space requirement for an account can be challenging. The InitSpace derive macro, applied to the account’s structure, adds an INIT_SPACE constant to it. The structure does not need to carry the #[account] macro for the constant to be generated. The Anchor documentation provides the following example:
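The example below is reconstructed from memory of the Anchor documentation; the exact field names and #[max_len] values may differ from the current docs:

```rust
#[account]
#[derive(InitSpace)]
pub struct ExampleAccount {
    pub data: u64,
    // Dynamic types need a #[max_len] attribute so InitSpace
    // knows the maximum size to reserve
    #[max_len(50)]
    pub string_one: String,
    #[max_len(10, 5)]
    pub nested: Vec<Vec<u8>>,
}

#[derive(Accounts)]
pub struct Initialize<'info> {
    #[account(mut)]
    pub payer: Signer<'info>,
    #[account(init, payer = payer, space = 8 + ExampleAccount::INIT_SPACE)]
    pub data: Account<'info, ExampleAccount>,
    pub system_program: Program<'info, System>,
}
```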
In this example, ExampleAccount::INIT_SPACE automatically calculates the necessary space for ExampleAccount. Note that INIT_SPACE does not include Anchor’s internal discriminator, which is why the space constraint in the example is written as 8 + ExampleAccount::INIT_SPACE.
Resizing Program Space
The realloc constraint is used to adjust the space of a program account at the start of an instruction. It requires the account to be mutable (i.e., mut) and is applicable to either the Account or AccountLoader types. It is defined as #[account(realloc = <space>, realloc::payer = <target>, realloc::zero = <bool>)]. When increasing the account data length, lamports are transferred from the realloc::payer to the program account to maintain rent exemption. If the data length decreases, the lamports are moved back from the program account to the realloc::payer. The realloc::zero constraint decides if the newly allocated memory should be zero-initialized. Zero-initialization ensures the new memory is clean and free from any leftover or unwanted data.
Using AccountInfo::realloc manually is not recommended over the realloc constraint. The manual approach lacks the runtime checks ensuring that reallocation does not go beyond the MAX_PERMITTED_DATA_INCREASE limit, which could lead to overwriting data in other accounts. The constraint also checks for and prevents repeated reallocation within a single instruction.
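A minimal sketch of the realloc constraint in use, assuming a hypothetical DataAccount type and a new total size of 8 + 64 bytes:

```rust
#[derive(Accounts)]
pub struct UpdateData<'info> {
    #[account(mut)]
    pub payer: Signer<'info>,
    #[account(
        mut,                     // realloc requires the account to be mutable
        realloc = 8 + 64,        // new total size in bytes, discriminator included
        realloc::payer = payer,  // pays (or receives) the rent-exemption difference
        realloc::zero = false,   // don't zero-initialize the newly allocated memory
    )]
    pub data_account: Account<'info, DataAccount>,
    pub system_program: Program<'info, System>,
}
```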
Error handling is a vital aspect of program development. It is a mechanism to identify and manage errors that may halt a program’s execution. Handling errors must be deliberate and planned to ensure code quality, maintenance, and functionality. Anchor simplifies this with robust error-handling mechanisms. Errors in Anchor programs can be divided into AnchorErrors and non-Anchor errors. This section will focus on AnchorErrors as non-Anchor errors encompass a wide array of Rust errors. For non-Anchor errors, I suggest looking at the Rust Book’s chapter on Error Handling and Rust By Example’s section on Error Handling.
The following struct defines AnchorError:
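Sketched from recent versions of anchor-lang; field order and the exact helper types may vary between releases:

```rust
pub struct AnchorError {
    pub error_name: String,
    pub error_code_number: u32,
    pub error_msg: String,
    pub error_origin: Option<ErrorOrigin>,
    pub compared_values: Option<ComparedValues>,
}
```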
These fields are relatively straightforward. error_name is a string that represents the name of the error. error_code_number is a unique identifier (i.e., a unique unsigned integer that takes up 32 bits of space) for the error. error_msg is a descriptive message explaining the error. error_origin is an optional field that provides information about where the error originated, such as the source file or account involved. compared_values is an optional field that details the values being compared when the error occurred. This is extremely useful for debugging.
AnchorError implements a log method. This includes information about the error’s origin and the values involved, which is useful for debugging and error resolution. This method uses the error_origin and compared_values to provide this information.
AnchorErrors can be broken down further into Anchor internal errors and custom errors. Anchor has a long list of internal error codes that can be returned. These internal errors are not meant to be used by users. However, knowing the mappings between the codes and their causes is useful. They are usually thrown when a constraint has been violated. Internal error codes follow this schema:
- >= 100 are instruction error codes
- >= 1000 are IDL error codes
- >= 2000 are constraint error codes
- >= 3000 are account error codes
- >= 4100 are miscellaneous error codes
- = 5000 are deprecated error codes.
Custom errors start at the ERROR_CODE_OFFSET (i.e., 6000).
Developers can implement their own custom errors by using the error_code attribute. This attribute is used on an enum, and the enum’s variants can be used as errors throughout the program. A message can be added for each variant. The client can display this message if the error occurs. For example:
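A minimal sketch of a custom error enum (MyError and its variants are hypothetical names):

```rust
#[error_code]
pub enum MyError {
    #[msg("Insufficient funds for this operation")]
    InsufficientFunds,
    #[msg("The provided authority does not match the expected one")]
    InvalidAuthority,
}
```

A variant can then be returned from an instruction with err!(MyError::InsufficientFunds) or used as the error argument of the require! family of macros.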
It is vital to note that there are several require macros to choose from. A vast majority of these macros concern non-public key values. For example, the require_gte macro checks whether the first non-public key value is greater than or equal to the second non-public key value:
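A sketch of require_gte in an instruction handler; the Withdraw context, vault.balance field, and MyError type are hypothetical:

```rust
pub fn withdraw(ctx: Context<Withdraw>, amount: u64) -> Result<()> {
    // Fails with MyError::InsufficientFunds unless vault.balance >= amount
    require_gte!(ctx.accounts.vault.balance, amount, MyError::InsufficientFunds);
    // withdrawal logic would follow here
    Ok(())
}
```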
There are also a few caveats when comparing public keys. For example, developers should use require_keys_eq instead of require_eq as the latter is more expensive.
All programs will return a ProgramError. This error type includes a field specifically for a custom error number, which Anchor uses to store its internal and custom error codes. However, this is only a single number, so it isn’t as useful. Anchor’s aforementioned logging with AnchorErrors is much more helpful. Anchor clients are designed to parse these logs. However, there are scenarios where this can be challenging. For instance, retrieving logs for processed transactions with preflight checks turned off is not as straightforward. Similarly, Anchor also employs a fallback mechanism for non-Anchor or legacy programs that do not log AnchorErrors in the standard way. Here, Anchor checks if the error number returned by the transaction corresponds to an Anchor internal error code or an error number defined in the program’s IDL. Anchor enriches the error information to provide more context when a match is found. Anchor will also try to parse the program error stack whenever possible to trace back to the original cause of the program error. ProgramError serves as a foundational error type, with its utility enhanced by Anchor’s logging and parsing mechanisms to provide detailed error information.
Cross-Program Invocations (CPIs)
Cross-Program Invocations (CPIs) have been alluded to throughout this article, so it is only right that we have a dedicated section on them. CPIs are fundamental to the composability of Solana as they facilitate programs calling directly into other programs. This turns the Solana ecosystem into a vast, interconnected API, if you will, for developers. For the sake of brevity, I recommend reading through the Anchor documentation on CPIs as they provide a useful example of CPIs in action with a puppet and puppet master program.
Nevertheless, a CPI can be defined as a call from one program to another, targeting a specific instruction in the called program. The invoking program is halted until the invoked program finishes processing the instruction.
CPIs enable a calling program to extend their signer privileges to the callee. Privilege extension is convenient but has the potential to be very dangerous. If a CPI accidentally targets a malicious program, that program gains the same privileges as the caller. Anchor mitigates this risk with two safeguards:
- The Program<’info, T> type ensures that the specified account matches the expected program (T)
- Even if the Program type isn’t used, the auto-generated CPI function will verify that the cpi_program argument corresponds to the expected program
Executing a CPI
The invoke function is used when a PDA signer is not required. In this case, the runtime extends the original signature from the calling program to the callee. The function is defined as:
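The signature, as found in solana_program:

```rust
pub fn invoke(
    instruction: &Instruction,
    account_infos: &[AccountInfo],
) -> ProgramResult
```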
Invoking another program involves creating an Instruction that includes the program’s ID, instruction data for the callee program, and a list of accounts the callee will access. A program only receives AccountInfo values from the runtime at its program entrypoint. Any account the callee program needs for its invocation must be included and provided by the program that calls it. For instance, if the callee program needs to modify a specific account, the caller program must include that account in the list of AccountInfo values. This also applies to the program ID of the callee (i.e., the caller must explicitly specify which program it is calling by including the callee’s program ID).
The Instruction is typically constructed within the calling program, although it can be deserialized from an external output.
The entire transaction will fail immediately if the callee program encounters an error or aborts. This is because the invoke function only returns on success. Use the set_return_data and get_return_data functions to return data as the result of a CPI. Note that the type being returned must implement the AnchorSerialize and AnchorDeserialize traits. Alternatively, have the callee write to a dedicated account to store the data.
While a program can call itself recursively, indirect recursive calls (i.e., reentrancy) by another program will immediately cause the transaction to fail.
For example, if we had a program that transferred tokens via CPI, we’d use invoke as:
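A sketch of such a call inside an instruction handler, assuming source, destination, authority, and amount are already in scope:

```rust
use solana_program::program::invoke;

// Build the SPL Token transfer instruction; the empty slice means
// no additional multisig signers
let ix = spl_token::instruction::transfer(
    &spl_token::id(),
    source.key,
    destination.key,
    authority.key,
    &[],
    amount,
)?;

// Every account the callee touches, including none other than these,
// must be passed along as AccountInfo values
invoke(
    &ix,
    &[source.clone(), destination.clone(), authority.clone()],
)?;
```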
invoke_signed is used for CPIs that require a PDA as a signer. It allows a calling program to act on behalf of a PDA by providing the seeds necessary to derive it:
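The signature, as found in solana_program; the only difference from invoke is the trailing signers_seeds parameter:

```rust
pub fn invoke_signed(
    instruction: &Instruction,
    account_infos: &[AccountInfo],
    signers_seeds: &[&[&[u8]]],
) -> ProgramResult
```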
PDAs can also act as signers in a CPI. The runtime will use the provided seeds and the calling program’s program_id to generate the PDA internally via create_program_address. The PDA is then validated against the addresses passed in with the instruction (i.e., account_infos) to confirm it is a valid signer.
With this function, an invocation can sign on behalf of one or more PDAs controlled by the calling program. This allows the callee to interact with the given accounts as if they were cryptographically signed. The signer_seeds consists of seed slices used to derive the PDA. During invocation, the runtime considers any matching account in account_info as “signed”. For example, if we had a program that creates an account for a PDA, we’d call invoke_signed as:
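A sketch of such a call; payer, pda, system_program, lamports, space, program_id, and bump are assumed to be in scope, and the b"vault" seed is hypothetical:

```rust
use solana_program::{program::invoke_signed, system_instruction};

let ix = system_instruction::create_account(
    payer.key,
    pda.key,
    lamports,
    space as u64,
    program_id, // the program that will own the new account
);

// The runtime re-derives the PDA from these seeds and program_id;
// any matching account in the list is treated as "signed"
invoke_signed(
    &ix,
    &[payer.clone(), pda.clone(), system_program.clone()],
    &[&[b"vault", payer.key.as_ref(), &[bump]]],
)?;
```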
Anchor provides CpiContext as a more simplified way to make CPIs, instead of using invoke or invoke_signed. This struct specifies the necessary non-argument inputs for CPIs, closely mirroring the functionality of Context. It provides information about the accounts required for the instruction, any additional accounts involved, the program ID being invoked, and the seeds for deriving PDAs if necessary. Use CpiContext::new for CPIs without PDAs, and CpiContext::new_with_signer for CPIs that require PDA signers.
CpiContext is defined as follows, with T being a generic type that encompasses any object that implements the ToAccountMetas and ToAccountInfos<’info> traits:
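Sketched from anchor-lang; the lifetime names may differ slightly between versions:

```rust
pub struct CpiContext<'a, 'b, 'c, 'info, T>
where
    T: ToAccountMetas + ToAccountInfos<'info>,
{
    pub accounts: T,
    pub remaining_accounts: Vec<AccountInfo<'info>>,
    pub program: AccountInfo<'info>,
    pub signer_seeds: &'a [&'b [&'c [u8]]],
}
```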
The accounts field is generic, allowing for any object that implements the ToAccountMetas and ToAccountInfos<'info> traits. This is enabled by the #[derive(Accounts)] attribute macro to facilitate code organization and enhanced type safety.
CpiContext streamlines invoking Anchor and non-Anchor programs. For Anchor programs, simply declare a dependency in the project’s Cargo.toml file and use the cpi module generated by Anchor:
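A sketch of the dependency declaration, where "callee" stands in for the actual crate name of the program being invoked:

```toml
[dependencies]
callee = { path = "../callee", features = ["cpi"] }
```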
Setting features = ["cpi"] grants the program access to the callee::cpi module. Anchor generates this module automatically and exposes the program’s instructions as Rust functions. These functions take in a CpiContext and any additional instruction data, mirroring the format of regular instruction functions in Anchor programs but with CpiContext replacing Context. The cpi module also provides the necessary account structs for calling the instructions.
For example, if the callee program has an instruction called hello_there that requires specific accounts defined in the GeneralKenobi struct, invoke it as follows:
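A hedged reconstruction of that example; the module, struct, and field names here follow the prose but are assumptions about the original snippet:

```rust
use anchor_lang::prelude::*;
// Generated by Anchor when the callee is imported with the "cpi" feature
use jedi::cpi::accounts::GeneralKenobi;
use jedi::program::Jedi;
use jedi::GreetingParams;

#[program]
pub mod fight_on_utapau {
    use super::*;

    pub fn call_hello_there(
        ctx: Context<CallGeneralKenobi>,
        params: GreetingParams,
    ) -> Result<()> {
        // Build the CPI context from the callee's program account info
        // and the accounts its GeneralKenobi struct requires
        let cpi_ctx = CpiContext::new(
            ctx.accounts.jedi_program.to_account_info(),
            GeneralKenobi {
                // field assumed for illustration
                authority: ctx.accounts.authority.to_account_info(),
            },
        );
        jedi::cpi::hello_there(cpi_ctx, params)
    }
}

#[derive(Accounts)]
pub struct CallGeneralKenobi<'info> {
    pub authority: Signer<'info>,
    // Program<'info, T> guarantees this is really the jedi program
    pub jedi_program: Program<'info, Jedi>,
}
```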
In the fight_on_utapau module, a CPI is performed using CpiContext. The function call_hello_there is designed to interact with the jedi program. It creates a CpiContext with the necessary account information required for the GeneralKenobi account struct from the jedi program and the jedi program’s account info. This context invokes hello_there, passing in any additional required parameters specified by the GreetingParams struct. The CallGeneralKenobi struct defines the accounts needed for this function, streamlining the process.
Lastly, when invoking instructions from non-Anchor programs, check if the program maintainers have published their own crate with helper functions for calling into their program. If there aren’t any helper functions for the program whose instruction(s) must be invoked, fall back to using invoke and invoke_signed to organize and prepare CPIs.
Program Derived Addresses (PDAs)
Remember, PDAs are off-curve and do not have an associated private key. They allow programs to sign instructions and developers to build hashmap-like structures on-chain. A PDA is derived using a list of optional seeds, a bump seed, and a program ID.
To reiterate, the following constraints are used to check that a given account is a PDA derived from the currently executing program, the seeds, and, if provided, the bump:
- #[account(seeds = <seeds>, bump)]
- #[account(seeds = <seeds>, bump, seeds::program = <expr>)]
- #[account(seeds = <seeds>, bump = <expr>)]
- #[account(seeds = <seeds>, bump = <expr>, seeds::program = <expr>)]
If the bump isn’t provided, Anchor will use the canonical bump. The seeds::program = <expr> constraint can be used to derive the PDA from a program other than the currently executing one.
Using the seeds and bump constraints streamlines the derivation process:
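A minimal sketch with a hypothetical b"state" seed and StateAccount type:

```rust
#[derive(Accounts)]
pub struct UpdateState<'info> {
    // Anchor re-derives the PDA from these seeds and the program ID,
    // defaulting to the canonical bump, and checks it against the
    // account passed into the instruction
    #[account(seeds = [b"state"], bump)]
    pub state: Account<'info, StateAccount>,
}
```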
Here, the seeds constraint is used to derive the PDA. Anchor automatically verifies that the account passed into the instruction matches the PDA derived from the seeds. Anchor defaults to the canonical bump when the bump constraint is used without a specific value.
Anchor also allows for dynamic seeds based on other account fields or instruction data. This is done by referencing other fields within the struct or using the #[instruction(...)] attribute macro to include deserialized instruction data. For example, in the following struct, example_pda is constrained to use a combination of a static seed, instruction data, and the signer’s public key:
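A sketch of such a struct; the entry_id parameter and the b"entry" seed are hypothetical:

```rust
#[derive(Accounts)]
// Makes the instruction's deserialized arguments available to constraints
#[instruction(entry_id: u64)]
pub struct CreateEntry<'info> {
    #[account(
        // Static seed + instruction data + the signer's public key
        seeds = [b"entry", &entry_id.to_le_bytes(), signer.key().as_ref()],
        bump,
    )]
    /// CHECK: validated by the seeds constraint
    pub example_pda: UncheckedAccount<'info>,
    pub signer: Signer<'info>,
}
```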
It is an understatement to call Anchor a powerful framework. Its ability to streamline the development process is evident in our exploration of the various macros and traits Anchor employs to reduce code. It is supported by well-kept documentation and a robust ecosystem of related tutorials and crates. Anchor is loved and used by the vast majority of Solana developers.
This article is a very, very comprehensive guide on developing programs in Anchor. It covered installing Anchor, using Solana Playground, as well as creating, building, and deploying a Hello, World! program. Then, we explored Anchor’s methods of effective abstraction, the structure of a typical Anchor program, and the many account types and constraints available. It also covered the importance of allocating account space and handling errors. Then, we ended with an exploration of CPIs and PDAs. This is the Anchor article - it has everything you need to start developing programs on Solana, today.
If you’ve read this far, thank you anon! Be sure to enter your email address below so you’ll never miss an update about what’s new on Solana. Ready to dive deeper? Join our Discord to get started developing Anchor programs.