The Problem
This week we are going to look at an under-appreciated corner of the digital asset ecosystem: reference data. This blocking-and-tackling work does not have the glamour of building a high-performance layer 2 blockchain, but for digital assets to become the basis for finance and the broader economy, these things matter. Well-written and broadly-followed standards enable disparate parties to work together and promote clarity. In a fast-moving space like digital assets it’s easy to dismiss them as overhead, too slow-moving to matter, but the cost gets paid elsewhere.
Cloudwall is building a digital asset risk system called Serenity, and for risk calculations the foundation is the data: historical prices and other market data, but also reference data. In our experience, digital asset investors pay a great cost for the lack of standards: either in solving the same problems again and again, or in a lack of clarity about the risks they carry. To make sense of the risk exposures in a large portfolio you need reliable reference data for three big purposes:
linkage: if you don’t group like-with-like, e.g. recognizing the relationship between ETH and WETH, or between BTC and a BTC option, or that two exchange hot wallets hold balances in the same token, you may inadvertently split a single large risk into two smaller ones
aggregation: the more distinct exposures you have — 20, 50, 100, 1000+ tokens — the more important it becomes to identify commonalities, not just to summarize risk but also because those commonalities often manifest as correlation: a given metaverse token often behaves similarly to other metaverse tokens
valuation: this applies mostly to complex products with lots of details — in fixed income, you might think of all the terms of a particular municipal bond, or in options the contract specification — but details matter; if a structured product has a barrier and the model doesn’t know this, the valuation can be dramatically different
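To make the barrier point concrete, here is a minimal Monte Carlo sketch in Python. All the contract parameters are hypothetical, and this is an illustration rather than a production pricer: the same simulated paths value a vanilla call and a down-and-out call very differently, so a model whose reference data omits the barrier will misprice the position.

```python
import numpy as np

def simulate_paths(s0, mu, sigma, t, steps, n_paths, seed=42):
    """Simulate geometric Brownian motion price paths."""
    rng = np.random.default_rng(seed)
    dt = t / steps
    z = rng.standard_normal((n_paths, steps))
    log_returns = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    return s0 * np.exp(np.cumsum(log_returns, axis=1))

# Hypothetical ETH-like contract: spot 2000, strike 2100, 90-day expiry,
# 80% annualized vol, and a down-and-out barrier at 1600.
s0, strike, barrier = 2000.0, 2100.0, 1600.0
r, sigma, t = 0.0, 0.8, 90 / 365

paths = simulate_paths(s0, r, sigma, t, steps=90, n_paths=100_000)
payoff = np.maximum(paths[:, -1] - strike, 0.0)

vanilla_value = payoff.mean()                   # model ignores the barrier
knocked_out = (paths <= barrier).any(axis=1)    # barrier breached on any step
barrier_value = np.where(knocked_out, 0.0, payoff).mean()

print(f"vanilla value: {vanilla_value:8.2f}")
print(f"barrier value: {barrier_value:8.2f}")
```

With a barrier this close to spot and vol this high, a meaningful fraction of paths knock out, so the barrier product is worth substantially less than the vanilla one.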
What does this have to do with digital assets?
In traditional finance, standards have had decades to evolve for things like symbologies, sector taxonomies, classifications and formats for representing asset economics. There are many, mature vendors in the space, consortiums, accepted conventions and more. Not all of these standards are open, in some cases they have associated licenses and commercial owners, they are not universally applied, and as with all things in life, it’s never perfect, but there are established foundations.
This is not the case in digital assets. In digital assets you have Kraken referring to Bitcoin’s symbol as XBT while most other exchanges use BTC: even for the largest and oldest cryptocurrency, the most basic fact about the asset, its symbol, is a matter of some dispute. There is, to my knowledge, no equivalent of the MIC code for exchanges: every vendor refers to Coinbase, Binance, FTX, etc. differently. There is no broadly-accepted sector taxonomy like GICS. We may see the ultimate heat death of the universe before we have FpML for crypto OTC derivatives, though some might argue that is a blessing.
In our work, we ended up having to build a solution which we will be continuing to grow in the coming years, but this is really a call for development and broader support for standards to help the whole digital asset ecosystem.
Linkage
This part of the problem requires a lot of in-house work for anyone hoping to piece together the risk in a portfolio that spans many blockchains, exchanges, DeFi protocols and more. We work with Digital Asset Research to get our base set of symbols, and then join in symbols from other sources around the ecosystem. This is still a highly manual and fragile process. The rapid change in the space does not help: although tokens do not, strictly speaking, have a concept like corporate actions in equities, there is a time-varying component to linkage: blockchains fork, and a token like Bitcoin Cash (BCH) retains, at least technically, a historical connection to its predecessor, in this case Bitcoin (BTC).
This problem gets compounded by three other dimensions:
derivatives and DeFi: you not only need to consistently identify the tokens, you also need to consistently identify the underliers of derivatives and DeFi protocols, e.g. the underlier for a Bitcoin option or perpetual (which may not in fact be Bitcoin, but rather an index or reference rate unique to the exchange), and various DeFi protocols involving pools and vaults have similar indirection.
wrapping & bridging: WETH, or wrapped ETH, deals with the fact that the native ETH token is not an ERC-20 token, and so smart contracts which only work with ERC-20 tokens cannot handle ETH except as a special case. But wait, there’s more! A token may also be non-native to the blockchain it’s sitting on at the moment. We thus need to unpack at least two layers to get to a uniquely identifiable base asset, and this is important, because ETH on Ethereum from the perspective of certain types of risks — smart contract risk, especially as it relates to hacking of bridges — is not the same as WETH bridged onto Avalanche. stETH, or staked ETH, falls into a similar ambiguous category. Linkages matter.
multiple symbologies: to be fair, traditional finance never got this fully right despite many worthy attempts — some of which extended to crypto, like OpenFIGI, where Kaiko is the official FIGI allocator for digital assets — but many sources use only their own symbols and generally don’t agree, so if you need to get data from many places, it is very hard to join them up.
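The unwrapping step above can be sketched in a few lines of Python. The asset identifiers and the wrap graph here are purely illustrative, not real reference data:

```python
# Hypothetical wrap/bridge graph: each entry points one layer down toward
# the base asset. Identifiers are illustrative ("symbol@chain").
WRAPS = {
    "weth@ethereum": "eth@ethereum",    # wrapped -> native
    "weth@avalanche": "weth@ethereum",  # bridged -> origin chain
    "steth@ethereum": "eth@ethereum",   # staked -> base
}

def base_asset(asset_id: str) -> str:
    """Follow wrap/bridge links until reaching an asset that wraps nothing."""
    seen = set()
    while asset_id in WRAPS:
        if asset_id in seen:
            raise ValueError(f"cycle in wrap graph at {asset_id}")
        seen.add(asset_id)
        asset_id = WRAPS[asset_id]
    return asset_id

print(base_asset("weth@avalanche"))  # two hops down -> eth@ethereum
```

Note that the identifiers deliberately keep the chain-level distinction: for smart contract and bridge risk, WETH bridged onto Avalanche is not interchangeable with native ETH, even though both link back to the same base asset.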
The solution here is not hard to describe, but in practice not available anywhere: a big set of lookup tables which connects the various symbol types and then correctly links together all the related assets. And all these relationships need to be as-of a point in time, because they will change.
Aggregation
In equities there are a number of standards for grouping stocks into sectors, sub-sectors and more detailed industry codes. In this space the most comprehensive solution we have seen so far is Wilshire’s DATS, the Digital Asset Taxonomy System, which is available through DAR’s API along with their base reference data. DATS is what’s called a sector taxonomy, a grouping of tokens into various levels. This lets us aggregate, which is useful for summarizing risk at higher levels, but just as importantly it helps you look at an asset’s returns in terms of how it performs relative to its “peers”, if you will: the related and similar tokens. There are other taxonomies like Coindesk’s DACS for the top 500 tokens, but Wilshire’s has great breadth and is thoughtfully constructed, so it is well worth a look.
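Taxonomy-driven aggregation is simple to sketch. The positions and the two-level sector tree below are toy data, not the actual DATS hierarchy:

```python
from collections import defaultdict

# Hypothetical positions (USD notional) and an illustrative taxonomy.
positions = {"BTC": 500_000, "ETH": 300_000, "MANA": 50_000, "SAND": 40_000}
taxonomy = {
    "BTC":  ("Store of Value",),
    "ETH":  ("Smart Contract Platform",),
    "MANA": ("Consumer", "Metaverse"),
    "SAND": ("Consumer", "Metaverse"),
}

def aggregate(positions, taxonomy, level=0):
    """Sum exposures at a given depth of the taxonomy tree."""
    totals = defaultdict(float)
    for symbol, notional in positions.items():
        path = taxonomy[symbol]
        key = path[min(level, len(path) - 1)]  # clamp to deepest level
        totals[key] += notional
    return dict(totals)

print(aggregate(positions, taxonomy, level=1))
```

At the deeper level the two metaverse tokens roll up together, surfacing a $90k concentrated exposure that a flat token-by-token view would show as two smaller, unrelated positions.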
We are also a member of the International Token Standardization Association (ITSA) which has done good work building a sophisticated classification standard. They are interesting in that they are less a sector taxonomy than a multi-dimensional classification scheme, with a structured approach to breaking the tokens down in functional, technical and legal dimensions. We think this is also an important element, especially if you wish to consider regulatory risk in investment decisions.
Valuation
This piece is going to be of increasing importance, though at least for basic crypto derivatives there has been a lot of good work done by the various data vendors to extract future and option contract details from exchanges and make them available via API. But as DeFi has rapidly evolved, it is increasingly hard to keep up, and if tokenization extends to reference an ever-wider range of real-world assets, we think it’s going to require a lot of work on how best to capture the necessary economic details of various tokens so they can be valued properly. As noted above, this is critically important to get right: valuation models can be very sensitive to the details. If, for instance, the expiration date of an underlying future for an option is incorrect, or some other material element of how the token works is not adequately captured, the overall valuation going into a risk model will be off.
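The expiry sensitivity is easy to see with a standard Black-Scholes calculation. The contract parameters are hypothetical, and the point is only the comparison: a one-month error in the expiry stored in reference data moves the model value substantially.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(s, k, t, r, sigma):
    """Black-Scholes European call price."""
    d1 = (log(s / k) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)

# Hypothetical BTC option: spot 30,000, strike 32,000, 70% annualized vol.
s, k, r, sigma = 30_000.0, 32_000.0, 0.0, 0.7
correct = bs_call(s, k, 30 / 365, r, sigma)  # true expiry: 30 days out
wrong = bs_call(s, k, 60 / 365, r, sigma)    # reference data off by a month

print(f"value with 30d expiry: {correct:,.0f}")
print(f"value with 60d expiry: {wrong:,.0f}")
```

The extra month of optionality inflates the model value by roughly half again, and that error flows straight through into any risk number built on top of it.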
Conclusion
I hope this week’s edition has given you a peek into just some of the challenges of getting risk right in digital assets. Our entire focus in the coming years is on providing the models, but we hope to see further development and open standards evolve to help the digital asset ecosystem grow.