Having a reliable ownership system changed the web as we knew it and started what we now call the Era of Blockchain. Now anyone can access the information stored there and rely on what they see. But because of its complexity, even though this information is available to everyone, it is often not simple to find what you’re looking for.
Our mission here is to find out who owned a token during a specific time frame. To accomplish this, we need to find the historical owners of the token as well as the current ones, and determine how long each of them held it in their wallet.
This is the process we used to track all non-listed Rakkudos and airdrop preKudos to the holders’ wallets for that specific time frame.
Finding current owners is relatively easy: you can use RPC nodes to get the account data of an existing token account and extract the owner. Finding historical owners is much more difficult.
Closed accounts can’t be downloaded from RPC endpoints, so they can’t be parsed. In this case we must scrape the historical data from the chain and look for “hints” in older transactions involving the account. Keep in mind that typical Solana nodes only hold a couple of epochs of data, so to get all transactions we need a source of archive data.
Quick overview of the Solana Data Structures
First, let’s go through the details of how the Solana data is being put together inside the chain.
The Solana blockchain is organized into an ordered series of Slots, each of which can have an associated Block.
Each Block has a set of Transactions in it.
Each Transaction has a set of Instructions and an array of Account addresses.
Instructions are associated with a Program ID (Address), and take a set of inputs and an array of input Accounts. These input accounts are drawn from the transaction-wide account list.
Each Account has an amount of Lamports (SOL) and a Data Bucket attached to it, and can be associated with a Program.
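To make the hierarchy above concrete, here is a minimal Python sketch of it. The class and field names are our own illustration, not SDK types; the point is only that an Instruction stores indices into its Transaction's account list rather than the addresses themselves.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Instruction:
    program_id_index: int        # index of the Program Address in the account list
    account_indices: List[int]   # indices of the input Accounts
    data: bytes                  # opaque inputs, decoded by the Program


@dataclass
class Transaction:
    account_keys: List[str]      # transaction-wide account list
    instructions: List[Instruction]


@dataclass
class Block:
    slot: int
    transactions: List[Transaction] = field(default_factory=list)


def resolve(tx: Transaction, ix: Instruction) -> Tuple[str, List[str]]:
    """Turn an Instruction's indices back into a Program Address and input Accounts."""
    program = tx.account_keys[ix.program_id_index]
    accounts = [tx.account_keys[i] for i in ix.account_indices]
    return program, accounts
```

Every later step that inspects an Instruction relies on this lookup, since the Instruction on its own only holds numbers.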
Now that we know how the Solana data is distributed inside the chain, the next step is finding SPL (Solana Program Library) Token Accounts.
Finding SPL Token Accounts
The SPL Token Program is a set of instructions for operations such as creating and destroying accounts, transferring tokens, and minting tokens, and it is stored in an Account on chain.
Since a Solana Account is a generic data bucket, the Solana SDK doesn’t present the mint or token amount of a Token Account by default, so the data must be parsed using the SPL Token Program’s unpack method. Some data sources do this for you, such as RPC nodes with the “jsonParsed” encoding.
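As a rough sketch of the “jsonParsed” route, the snippet below builds a getAccountInfo request and pulls the owner out of the reply. The helper names are hypothetical and the RPC URL is whatever node you use; the response shape is the one we would expect from a jsonParsed Token Account, so treat the field access as an assumption to check against your node.

```python
import json
import urllib.request
from typing import Optional


def account_info_request(address: str, request_id: int = 1) -> dict:
    """JSON-RPC body asking the node to parse the account data for us."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "getAccountInfo",
        "params": [address, {"encoding": "jsonParsed"}],
    }


def current_owner(response: dict) -> Optional[dict]:
    """Extract owner, mint and raw amount; None for closed or non-token accounts."""
    value = (response.get("result") or {}).get("value")
    if value is None:                      # closed, or never existed
        return None
    data = value.get("data")
    if not isinstance(data, dict) or data.get("program") != "spl-token":
        return None                        # node could not parse it as SPL Token data
    parsed = data.get("parsed") or {}
    if parsed.get("type") != "account":    # e.g. a mint rather than a Token Account
        return None
    info = parsed["info"]
    return {
        "owner": info["owner"],
        "mint": info["mint"],
        "amount": int(info["tokenAmount"]["amount"]),
    }


def post(rpc_url: str, payload: dict) -> dict:
    """Send the request to an RPC node (rpc_url is supplied by the caller)."""
    req = urllib.request.Request(
        rpc_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `current_owner(post(rpc_url, account_info_request(address)))` gives the current owner of a live account; a closed account simply comes back as None.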
After parsing the data array, we get the Owner, Mint, and Balance. Each unique token has an ID called the “mint” in the SPL Program structure. For example, the mint for preKUDOS is pkudoFxGVV76UpZcjdXH9hn46ECbUn3VzNAcEphWox9.
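If you only have the raw data array, the same three fields can be read by hand. The sketch below is not the SPL unpack method itself. It assumes the classic 165-byte Token Account layout, where the mint, the owner and a little-endian u64 amount occupy the first 72 bytes, and it uses a small hand-written base58 encoder so it runs without extra packages.

```python
import struct
from dataclasses import dataclass

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
TOKEN_ACCOUNT_LEN = 165  # assumption: classic SPL Token account size


def b58encode(raw: bytes) -> str:
    """Encode raw bytes as a base58 address string."""
    n = int.from_bytes(raw, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    leading_zeros = len(raw) - len(raw.lstrip(b"\0"))
    return "1" * leading_zeros + out


@dataclass
class TokenAccount:
    mint: str
    owner: str
    amount: int


def unpack_token_account(data: bytes) -> TokenAccount:
    """Read mint, owner and balance from a Token Account's data array."""
    if len(data) != TOKEN_ACCOUNT_LEN:
        raise ValueError(f"expected {TOKEN_ACCOUNT_LEN} bytes, got {len(data)}")
    # 32-byte mint, 32-byte owner, u64 amount, no padding, little endian
    mint, owner, amount = struct.unpack_from("<32s32sQ", data, 0)
    return TokenAccount(b58encode(mint), b58encode(owner), amount)
```

The remaining bytes (delegate, state, close authority and so on) are ignored here because ownership tracking does not need them.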
This program is used to define both currencies and NFTs on the Solana chain, which means that our methodology works for both. NFT data, such as the link to the image and the features, is stored in other accounts attached to the Mint. For NFTs, each NFT is considered its own “mint” with exactly 1 unit of the token in existence, so each of the 10k Rakkudos we minted has a unique mint ID.
Now that we know how to find the accounts we need, the next step is to find the owners.
Finding the Owner
While deserializing the accounts would get us the current owner, it will not help with past owners. Token accounts are typically closed once they are empty, to get the rent SOL back. Once an account is closed, its data array is deleted from the nodes, so we can’t deserialize it anymore. This is where we need to start using the historical data. There is also still the problem of getting a list of Account IDs to check.
Our solution here is to crawl the chain for other indicators that tell us the owner. We will need a list of Transactions, which we can get by requesting them from RPC by slot number.
RPC nodes don’t usually hold more than a couple epochs of data, so for most data we will need a trusted source of Solana archive data, such as a BigTable instance or SQL database populated by a Geyser plugin.
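Requesting Transactions by slot number can be sketched as follows. This assumes a source that speaks the standard JSON-RPC getBlock method (an archive-backed RPC endpoint, for instance); the helper names are ours, and sending the payload is left to whichever HTTP client you use.

```python
from typing import Iterator, List


def block_request(slot: int, request_id: int = 1) -> dict:
    """JSON-RPC body asking for every Transaction in one slot."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "getBlock",
        "params": [
            slot,
            {
                "encoding": "json",
                "transactionDetails": "full",
                "rewards": False,
                "maxSupportedTransactionVersion": 0,
            },
        ],
    }


def requests_for_slots(first: int, last: int) -> Iterator[dict]:
    """One request per slot in [first, last]."""
    for slot in range(first, last + 1):
        yield block_request(slot, request_id=slot)


def transactions_in(response: dict) -> List[dict]:
    """Transactions from a getBlock reply; empty when the slot has no block."""
    if "error" in response:      # skipped or unavailable slot (assumption: treat as empty)
        return []
    result = response.get("result") or {}
    return result.get("transactions") or []
```

Treating every error as an empty slot is a simplification. A real crawl would want to tell a skipped slot apart from a slot the node has already pruned, since the latter means data is being missed.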
Parsing the Instructions
Finding the Instructions starts with filtering out any Instruction with the wrong Program Address (anything that isn’t the Token Program).
Then we parse each remaining Instruction by running its data through a different unpack method in the SPL Token Program, which tells us whether it is the Instruction type we want.
Since there are a few ways to represent an Initialize Account event, we need to look for InitializeAccount, InitializeAccount2 and InitializeAccount3. Thankfully, these all give us the data we want: the mint, the owner address, and the address of the token account.
Typically when interpreted this way, only the data array is decoded.
Account inputs are usually not in the deserialized data structure. They can instead be found in the accounts input array. The meaning of each index is typically specified in the documentation.
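The filtering and parsing steps above might look like this in Python. It is a hand-rolled sketch rather than the SPL unpack method, and it assumes the instruction tags (1, 16, 18) and account orderings of the classic Token Program as we understand them. For InitializeAccount the owner is the third input account, while for InitializeAccount2 and InitializeAccount3 it sits in the data array right after the tag byte.

```python
from typing import List, Optional

TOKEN_PROGRAM_ID = "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Assumed instruction tags (first byte of the data array)
INIT_ACCOUNT, INIT_ACCOUNT2, INIT_ACCOUNT3 = 1, 16, 18


def b58encode(raw: bytes) -> str:
    """Encode raw bytes as a base58 address string."""
    n = int.from_bytes(raw, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    return "1" * (len(raw) - len(raw.lstrip(b"\0"))) + out


def parse_initialize(program_id: str, accounts: List[str], data: bytes) -> Optional[dict]:
    """Return account, mint and owner for any InitializeAccount variant, else None.

    `accounts` must already be resolved to addresses via the transaction's account list.
    """
    if program_id != TOKEN_PROGRAM_ID or not data:
        return None                                  # wrong program: filtered out
    tag = data[0]
    if tag == INIT_ACCOUNT:
        if len(accounts) < 3:
            return None
        # inputs: [token account, mint, owner, rent sysvar]
        return {"account": accounts[0], "mint": accounts[1], "owner": accounts[2]}
    if tag in (INIT_ACCOUNT2, INIT_ACCOUNT3):
        if len(accounts) < 2 or len(data) < 33:
            return None
        # inputs: [token account, mint, ...]; owner is 32 bytes in the data array
        return {"account": accounts[0], "mint": accounts[1], "owner": b58encode(data[1:33])}
    return None                                      # some other Token instruction
```

Running this over every Instruction and keeping the results whose mint matches gives the list of Token Accounts created for that mint, together with each account's initial owner.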
An Account has an owner field if you are using the Rust SDK (and possibly others). This is *not* the owner we are looking for. This “owner” is the Address of the Program that is allowed to manipulate the account, so Token Accounts are all “owned” by TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA (the Address of the Token Program).
The account owner we want is stored in the “data” array, and is not available on closed accounts.
Up till this point
So far we have a list of accounts that were created for the mint. For some applications, like getting the approximate total number of users that have entered the system, this is sufficient, because we can also extract the initial owner of each account from these InitializeAccount Instructions. However, to track ownership over time, especially of NFTs, we need to dig deeper. There are also edge cases that this approach won’t capture, such as transferring the ownership of an existing token account for the mint. Since no new accounts are created in this case, these owners are missed.
Finding the Owner Over Time
We can expand the scope of our search to look for more indications of token ownership over time. There are two more broad cases we need to track:
- Change in Token Account ownership.
- Transfer of tokens between accounts.
Now let’s get the data about any possible transfers of tokens between accounts.
Getting the transfers
So we can expand the search to include these 5 Instructions, but we are still missing some data. SetAuthority has no mint data, so without more data we don’t know what kind of account was transferred, and Transfers have no ownership data for the receiver, so we don’t know who the token was transferred to.
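A short sketch makes the gaps visible. The two parsers below assume the classic Token Program encoding as we understand it: Transfer is tag 3 followed by a u64 amount, and SetAuthority is tag 6 followed by an authority type (2 meaning the account owner) and an optional 32-byte new authority. The function names are ours. Notice that neither result contains a mint, and the Transfer result has no owner for the destination.

```python
from typing import List, Optional

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
TRANSFER, SET_AUTHORITY = 3, 6     # assumed instruction tags
AUTHORITY_ACCOUNT_OWNER = 2        # assumed authority type for "owner of a Token Account"


def b58encode(raw: bytes) -> str:
    """Encode raw bytes as a base58 address string."""
    n = int.from_bytes(raw, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    return "1" * (len(raw) - len(raw.lstrip(b"\0"))) + out


def parse_transfer(accounts: List[str], data: bytes) -> Optional[dict]:
    """Transfer: inputs are [source, destination, authority]; data holds the amount."""
    if len(data) < 9 or data[0] != TRANSFER or len(accounts) < 3:
        return None
    return {
        "source": accounts[0],
        "destination": accounts[1],
        "authority": accounts[2],          # signer for the source, not the receiver
        "amount": int.from_bytes(data[1:9], "little"),
    }


def parse_set_authority(accounts: List[str], data: bytes) -> Optional[dict]:
    """SetAuthority on a Token Account's owner: inputs are [account, current authority]."""
    if len(data) < 3 or data[0] != SET_AUTHORITY or len(accounts) < 2:
        return None
    if data[1] != AUTHORITY_ACCOUNT_OWNER:
        return None                        # mint, freeze or close authority: not ownership
    has_new = data[2] == 1 and len(data) >= 35
    return {
        "account": accounts[0],
        "old_owner": accounts[1],
        "new_owner": b58encode(data[3:35]) if has_new else None,
    }
```

Both functions expect Instructions that have already been filtered down to the Token Program, with their input accounts resolved to addresses.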
Now, we can try to find the mint data in the transaction in a few ways.
First, we can download the account from an RPC node, parse it, and get the mint that way. Unfortunately this will not work for closed accounts, as their data is not held on chain after being closed, so we must look for other evidence. We can’t get the owner that way either, for the same reason, so we have to grab it from the instructions we find.
A Transaction has another piece of data that we skipped over before, called the pre and post Token balance arrays. These hold a record of any SPL Token Accounts involved in the transaction, their mint, and their balance before and after the Transaction. We can look in here to see if there are Transfers with the mint we are looking for.
Account transfers (SetAuthority instructions) don’t have this either. We can try to fill in the data based on anything we’ve seen in the same batch of the job, but ultimately we won’t be able to fill in these events until we have a record of all Accounts created for the mint.
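Reading the balance arrays could look like the sketch below. It assumes a Transaction in the shape returned by getBlock with “json” encoding, where each balance entry points at an account by index. The function name is hypothetical, and the owner field is only copied when the record happens to carry one, which older records may not.

```python
from typing import Dict, List


def token_accounts_for_mint(tx: dict, mint: str) -> List[dict]:
    """Token Accounts of `mint` touched by this Transaction, with pre/post raw balances."""
    meta = tx.get("meta") or {}
    keys = list(tx["transaction"]["message"]["accountKeys"])
    # Versioned transactions append addresses loaded from lookup tables (assumed order)
    loaded = meta.get("loadedAddresses") or {}
    keys += list(loaded.get("writable") or []) + list(loaded.get("readonly") or [])

    seen: Dict[str, dict] = {}
    for side, label in (("preTokenBalances", "pre"), ("postTokenBalances", "post")):
        for bal in meta.get(side) or []:
            if bal.get("mint") != mint:
                continue
            address = keys[bal["accountIndex"]]
            entry = seen.setdefault(
                address,
                {"account": address, "owner": bal.get("owner"), "pre": 0, "post": 0},
            )
            entry[label] = int(bal["uiTokenAmount"]["amount"])
    return list(seen.values())
```

An account whose balance goes from 1 to 0 next to one that goes from 0 to 1 is a strong hint of an NFT moving between them, which is exactly the mint evidence a bare Transfer lacks. An account that is missing from one side is reported with a balance of 0 for that side.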
Loop over all the Blocks, checking all the Transactions. For each Transaction, check the account array for the SPL Token Program Address, and examine the Instructions if it’s present. On the first pass, filter for these instructions for owner and mint data:
- InitializeAccount
- InitializeAccount2
- InitializeAccount3
- (Optionally) CloseAccount
And these instructions for mint and transfer data:
- Transfer
Record the Account Address, Owner Address, Slot, Mint and amount transferred. This is an example of data that we can extract from the Transfer instructions, and how.
The ToOwner is missing, so it has to be filled in using evidence such as a later Transfer from Account 4 to a possible Account 5, or an InitializeAccount or SetAuthority found for Account 4. SetAuthority and InitializeAccount are parsed in a similar manner.
On a second pass, using the list of account IDs that match the mint, find all SetAuthority instructions and record the old authority, new authority, and account ID. Alternatively, you could record all SetAuthority instructions during the first pass, then filter them here. Using this filtered data, we can then fill in the missing ToOwner fields.
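One way to sketch the fill-in step: collect every ownership hint as a (slot, account, owner) record, index the hints per account, and look up the most recent one at or before each Transfer's slot. The record shapes and function names here are illustrative assumptions, and ordering inside a single slot is ignored for simplicity.

```python
import bisect
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

OwnerEvent = Tuple[int, str, str]  # (slot, token account, owner) from InitializeAccount / SetAuthority


def build_owner_index(events: Iterable[OwnerEvent]) -> Dict[str, List[Tuple[int, str]]]:
    """Per-account history of (slot, owner), sorted by slot."""
    index: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for slot, account, owner in sorted(events):
        index[account].append((slot, owner))
    return index


def owner_at(index: Dict[str, List[Tuple[int, str]]], account: str, slot: int) -> Optional[str]:
    """Owner of `account` as of `slot`, or None if we have no evidence that early."""
    history = index.get(account, [])
    slots = [s for s, _ in history]
    pos = bisect.bisect_right(slots, slot) - 1
    return history[pos][1] if pos >= 0 else None


def fill_to_owner(transfers: List[dict], index: Dict[str, List[Tuple[int, str]]]) -> List[dict]:
    """Return copies of the transfers with a missing 'to_owner' filled in where possible."""
    filled = []
    for transfer in transfers:
        row = dict(transfer)
        if row.get("to_owner") is None:
            row["to_owner"] = owner_at(index, row["destination"], row["slot"])
        filled.append(row)
    return filled
```

Transfers that still have no ToOwner after this step point at accounts whose creation was never seen, which usually means the crawl started too late in the chain's history.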
Wrapping it up
From the parsed Transfers and SetAuthorities, we can form a timeline of ownership transfers for an NFT. We can distill each Transfer and SetAuthority into an “OwnershipTransfer” that just records the previous owner, the new owner, the mint, and the slot. By sorting these by Slot number, we get a timeline of ownership of the NFT.
We can also put together the InitializeAccount, CloseAccount and SetAuthority instructions to form a sort of “timeline” for each account and its ownership over time. For some projects, this might be sufficient.
For tracking NFT ownership, we need to go over the list of transfers and use these timelines to figure out who owns the token account at each Transfer, noting who the token was transferred to and when. We should also use the SetAuthority data to populate this transfer list, treating each one as a transfer of ownership of the NFT.
In the end, we have a list of transfer events with dates and the owning Wallet at the time of the transfer. This can be used, for example, to calculate how long a user held the NFT.
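As a closing illustration, here is a minimal sketch of the last step for a single NFT mint. The OwnershipTransfer fields mirror the ones named above; the helper functions and the choice to measure holding time in slots are our assumptions. Converting slots to dates would additionally need the block time of each slot.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class OwnershipTransfer:
    slot: int
    mint: str
    previous_owner: Optional[str]   # None for the first known owner
    new_owner: str


def holding_periods(transfers: List[OwnershipTransfer], end_slot: int) -> List[Tuple[str, int, int]]:
    """(owner, from_slot, to_slot) spans for one mint, ordered by slot.

    The last owner is treated as holding until `end_slot`.
    """
    ordered = sorted(transfers, key=lambda t: t.slot)
    periods = []
    for i, current in enumerate(ordered):
        until = ordered[i + 1].slot if i + 1 < len(ordered) else end_slot
        periods.append((current.new_owner, current.slot, until))
    return periods


def slots_held(periods: List[Tuple[str, int, int]]) -> Dict[str, int]:
    """Total number of slots each wallet held the token."""
    totals: Dict[str, int] = {}
    for owner, start, end in periods:
        totals[owner] = totals.get(owner, 0) + (end - start)
    return totals
```

For a snapshot or airdrop, `end_slot` would be the last slot of the time frame of interest, and wallets can then be filtered by a minimum number of slots held.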