Risk Database Example
This example addresses the challenges of modeling and reporting market risk for a portfolio of diverse financial products. The examples/risk subdirectory in the API distribution contains the relevant code.
A common technology problem in the investment banking industry is how to design and maintain a common risk reporting platform for the many and varied financial products that are traded.
The data is multidimensional and structurally diverse. Valuation and risk metrics need to be rolled up by elements of operational responsibility such as portfolio, desk and strategy, as well as by common economic factors such as currency, term and credit rating. Risk for products such as bonds is computed on a per-symbol basis and reported on a position basis. Risk for over-the-counter (OTC) products such as interest rate swaps is computed and reported on a per-trade basis.
exstreamspeed offers solutions both for efficient modeling and common reporting.
The example focuses on four major product groups: index credit products, bonds, interest-rate derivatives and single-name credit derivatives.
The design consists of separate node hierarchies by product group, where each node hierarchy is queried by a separate aggregator instance with respect to a common result set.
Valuation and risk metrics include PV (present value), PV01 (interest rate curve risk), Convexity, RefPV01 (credit curve risk) and RefPV01Recovery (recovery rate risk). Risk can be rolled up by Product, Portfolio, Desk, Strategy, Trader, RefEntity (Reference Entity), Rating, Industry, Counterparty, Currency and Maturity.
Portfolio and Reference Entity dimensions are represented by two root classes:
Designs that have fewer node instances and more tuples per node are generally more efficient, due to the efficiency of the (filtered) iterator implementation. Dimension data is, therefore, often more efficiently represented as relations than deep hierarchies.
Bond risk is computed on a per-symbol basis and reported on a position basis. The BondRisk root node rolls up unit risk (i.e. risk per $1 of notional). Each child BondPosition node maintains the aggregate position (i.e. BondNotional) for that symbol by counterparty and portfolio.
Actual PV by symbol, counterparty and portfolio is BondPV * BondNotional and is derived using an aggregator formula as part of the query.
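The derivation can be illustrated outside the aggregator with a minimal plain-C sketch. The structs and function names below are hypothetical stand-ins for the BondRisk and BondPosition nodes, not the exstreamspeed API; they only show how unit PV and aggregate notional combine into actual PV and how that value rolls up to a portfolio.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the BondPosition and BondRisk nodes. */
typedef struct {
    int    counterparty_id;
    int    portfolio_id;
    double bond_notional;          /* aggregate position for the symbol */
} BondPosition;

typedef struct {
    int                 symbol_id;
    double              bond_pv;     /* unit PV: PV per $1 of notional */
    const BondPosition *positions;   /* child positions for this symbol */
    size_t              n_positions;
} BondRisk;

/* Actual PV of one position: BondPV * BondNotional. */
static double position_pv(const BondRisk *risk, const BondPosition *pos)
{
    return risk->bond_pv * pos->bond_notional;
}

/* Roll actual PV up to a single portfolio across all symbols. */
static double portfolio_pv(const BondRisk *risks, size_t n_risks,
                           int portfolio_id)
{
    double total = 0.0;
    assert(risks != NULL || n_risks == 0);
    for (size_t i = 0; i < n_risks; i++) {
        for (size_t j = 0; j < risks[i].n_positions; j++) {
            const BondPosition *pos = &risks[i].positions[j];
            if (pos->portfolio_id == portfolio_id)
                total += position_pv(&risks[i], pos);
        }
    }
    return total;
}
```

Because unit risk is stored once per symbol, a revaluation changes a single bond_pv value, and every position under that symbol picks up the new actual PV the next time it is queried.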
Single-name credit derivatives compute risk on a per-trade basis. Currency, product and portfolio are factored out of the CRTrade node into the CRRisk parent. This not only reduces redundancy but also makes queries on these commonly filtered dimensions more efficient: a query selecting a subset of portfolios, for example, can skip entire nodes of trades.
Index products are challenging to model. Like bonds, risk is computed on a per-symbol basis and reported on a per-position basis. Unlike bonds, however, the risk for an index product is then distributed equally among the set of reference entities that make up the index.
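The effect of factoring the filtered dimensions into the parent can be sketched in plain C. The types and the trades_visited counter below are hypothetical and only illustrate the idea; the real aggregator and its filtered iterators are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CRTrade tuple. */
typedef struct {
    int    ref_entity_id;
    double pv;
} CRTrade;

/* Hypothetical stand-in for a CRRisk parent holding the shared dimensions. */
typedef struct {
    int            currency_id;
    int            product_id;
    int            portfolio_id;
    const CRTrade *trades;         /* child trades sharing the values above */
    size_t         n_trades;
} CRRisk;

/* Sum PV for one portfolio and report how many trades were visited. */
static double sum_pv_for_portfolio(const CRRisk *nodes, size_t n_nodes,
                                   int portfolio_id, size_t *trades_visited)
{
    double total = 0.0;
    assert(trades_visited != NULL);
    *trades_visited = 0;
    for (size_t i = 0; i < n_nodes; i++) {
        if (nodes[i].portfolio_id != portfolio_id)
            continue;              /* the whole node of trades is skipped */
        for (size_t j = 0; j < nodes[i].n_trades; j++) {
            total += nodes[i].trades[j].pv;
            (*trades_visited)++;
        }
    }
    return total;
}
```

The filter is tested once per parent rather than once per trade, so the cost of rejecting a portfolio does not depend on how many trades it holds.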
Risk is partitioned by index and reference entity by the Index node. Risk is further partitioned by Symbol in IndexRisk and by counterparty and portfolio in IndexPosition. This design implies, however, that an IndexRisk child and corresponding IndexPosition descendants would need to be replicated for each reference entity in the index, since the risk and position information is identical for each.
exstreamspeed gets around this by allowing all the Index tuples corresponding to a single IndexId to point to the same IndexRisk child node instance.
Actual PV by symbol, counterparty and portfolio is IndexPV * IndexNotional and is derived using an aggregator formula as part of the query.
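As a rough illustration of the shared-child idea, the plain-C sketch below lets several Index tuples point at one IndexRisk instance. All names are hypothetical and this is not the exstreamspeed API. The entity_pv helper additionally assumes that "distributed equally" means each reference entity receives a 1/n share of IndexPV * IndexNotional.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the IndexRisk child node. */
typedef struct {
    int    symbol_id;
    double index_pv;               /* unit PV per $1 of notional */
} IndexRisk;

/* Hypothetical stand-in for one Index tuple (index, reference entity). */
typedef struct {
    int              index_id;
    int              ref_entity_id;
    const IndexRisk *risk;         /* child shared by all tuples of the index */
} IndexTuple;

/* Point every tuple of index_id at the same IndexRisk instance.
 * Returns the number of tuples linked. */
static size_t link_shared_risk(IndexTuple *tuples, size_t n, int index_id,
                               const IndexRisk *shared)
{
    size_t linked = 0;
    assert(shared != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tuples[i].index_id == index_id) {
            tuples[i].risk = shared;
            linked++;
        }
    }
    return linked;
}

/* PV attributed to one reference entity, assuming equal 1/n weights. */
static double entity_pv(const IndexTuple *t, double index_notional,
                        size_t n_entities)
{
    assert(t->risk != NULL && n_entities > 0);
    return t->risk->index_pv * index_notional / (double)n_entities;
}
```

In this sketch the risk data for an index exists once in memory regardless of how many reference entities the index contains, and only the pointer is repeated per tuple.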
Like single-name credit products, OTC interest rate derivatives compute risk on a per-trade basis. Currency, product and portfolio are again factored out of IRTrade into IRRisk for reasons of query efficiency.
Interest rate products have no credit risk component. To generate reports that combine interest rate risk for both credit and interest-rate products, a placeholder reference entity and a corresponding credit risk of zero need to be associated with them.
An IRRef node, containing only one tuple and one child IRRisk node, is created for this purpose.
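The placeholder approach can be sketched in plain C as follows. The RiskRow and Totals types and the PLACEHOLDER_REF_ENTITY id are hypothetical and chosen only for illustration; the point is that interest-rate rows carry a reference entity and a credit risk of zero, so the same roll-up code handles both product families.

```c
#include <assert.h>
#include <stddef.h>

enum { PLACEHOLDER_REF_ENTITY = 0 };  /* hypothetical id for the placeholder */

/* Hypothetical common result row shared by credit and rates products. */
typedef struct {
    int    ref_entity_id;
    double pv01;                   /* interest rate curve risk */
    double ref_pv01;               /* credit curve risk */
} RiskRow;

typedef struct {
    double pv01;
    double ref_pv01;
} Totals;

/* Build a row for an interest-rate trade: placeholder entity, zero credit risk. */
static RiskRow ir_row(double pv01)
{
    RiskRow r = { PLACEHOLDER_REF_ENTITY, pv01, 0.0 };
    return r;
}

/* Roll up both metrics over a mixed set of credit and rates rows. */
static Totals roll_up(const RiskRow *rows, size_t n)
{
    Totals t = { 0.0, 0.0 };
    assert(rows != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        t.pv01     += rows[i].pv01;
        t.ref_pv01 += rows[i].ref_pv01;
    }
    return t;
}
```

Since the placeholder contributes zero to RefPV01, credit totals are unaffected while PV01 totals include every product.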
Each of the Id fields (PortfolioId, CurrencyId, RefEntityId, etc.) is a string-id that corresponds to an entry in a string dictionary. Using user-defined fields, a string dictionary can be stored in a database along with all the risk data. A separate node, known as StringDict, is constructed to store all the string dictionaries used in the database.
The example is split into two programs: genrisk.c (genriskcpp.C) and qryrisk.c (qryriskcpp.C).
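For readers unfamiliar with string-ids, the following is a minimal plain-C sketch of a string dictionary that maps strings to small integer ids and back. The StringDict struct here, the fixed DICT_CAPACITY, and the function names are hypothetical; the sketch does not show how exstreamspeed stores dictionaries in user-defined fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DICT_CAPACITY 64           /* arbitrary limit for this sketch */

/* Hypothetical in-memory dictionary: the array index is the string-id. */
typedef struct {
    const char *strings[DICT_CAPACITY];  /* caller keeps the storage alive */
    int         count;
} StringDict;

/* Return the id of s, adding it if it is new; -1 if the dictionary is full. */
static int dict_intern(StringDict *d, const char *s)
{
    assert(d != NULL && s != NULL);
    for (int i = 0; i < d->count; i++)
        if (strcmp(d->strings[i], s) == 0)
            return i;
    if (d->count == DICT_CAPACITY)
        return -1;
    d->strings[d->count] = s;
    return d->count++;
}

/* Map an id back to its string for reporting; NULL if the id is unknown. */
static const char *dict_lookup(const StringDict *d, int id)
{
    assert(d != NULL);
    return (id >= 0 && id < d->count) ? d->strings[id] : NULL;
}
```

Risk tuples then store only the integer id, which keeps them compact and makes filters cheap integer comparisons; strings are resolved only when results are reported. A linear search is used for brevity, and a hash table would be the usual choice for large dictionaries.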
genrisk.c generates and stores risk for 1 million trades in under a second: 400,000 index trades, 50,000 bond trades, 300,000 OTC interest-rate derivatives and 250,000 single-name OTC credit products.
qryrisk.c loads the database from disk and uses the aggregator to run user-defined risk roll-up queries that are fed from stdin. qryrisk.c implements a simple parser for interpreting key, result, filter and formula field specifications provided on stdin.
The distribution includes an example input file (qryrisk.inp) containing five queries of varying complexity. These include:
qryrisk.txt contains the expected output for qryrisk.inp. It includes the query plans from the aggregator and the query results in comma-separated format. String-ids are mapped to strings for reporting.
Copyright © 2012 by Richard Brooks