Published: Nov 06, 2019
Updated Nov 06, 2019

The Xilinx® Vitis™ Quantitative Finance library is a fundamental library that provides comprehensive FPGA acceleration for quantitative finance. It is free and open source, and it targets a variety of real use cases, such as modeling, trading, evaluation, and risk management.

The Vitis Quantitative Finance library provides tools for quantitative finance at three levels, from the bottom up: low-level modules and functions, pre-defined kernels at the middle level, and pure software APIs that work with pre-defined hardware overlays at the top level.

- At the lowest level (L1), it offers useful statistical functions, numerical methods and linear algebra functions to support user implementation of advanced modeling, such as RNG, Monte Carlo, SVD, and specialist matrix solvers.
- At the middle level (L2), pricing engine kernels are provided to evaluate common financial derivatives, such as equity products, interest-rate products, FX products, and credit products.
- The software API level (L3) wraps the details of offloading acceleration with a pre-built binary (overlay) and allows users to accelerate supported pricing tasks on Alveo cards without hardware development.

This article covers two common option pricing engines at the L2 level: the European option pricing engine and the American option pricing engine.

In quantitative finance, option pricing means calculating the premium for purchasing or selling options.

In options trading, option pricing models give finance professionals a fair value that they can use to adjust their trading strategies and portfolios.

The value of an option is calculated from the current value of the underlying asset, the volatility of the underlying asset, the dividends paid on the underlying asset, the strike price of the option, the time to expiration of the option, and the riskless interest rate.

European and American options are the most common kinds of options. A European option may be exercised only at expiry. An American option may be exercised on any trading day on or before expiration.

The European and American option pricing engines estimate the value of an option using the Monte Carlo simulation method and the Black-Scholes model.
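
To make the method concrete, here is a minimal software sketch of Monte Carlo pricing under the Black-Scholes model. It is a plain C++ reference model, not the library's FPGA kernel. The function names `mcEuropeanReference` and `blackScholesClosedForm`, the `isCall` flag, and the use of `std::mt19937` are assumptions made for illustration. The closed-form price is included only as a sanity check on the simulation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

// Hypothetical reference model: simulate terminal prices under Black-Scholes
// dynamics and return the discounted mean payoff.
double mcEuropeanReference(double underlying, double volatility, double dividendYield,
                           double riskFreeRate, double timeLength, double strike,
                           bool isCall, unsigned int samples, std::uint32_t seed) {
    assert(samples > 0);
    std::mt19937 rng(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    const double drift =
        (riskFreeRate - dividendYield - 0.5 * volatility * volatility) * timeLength;
    const double diffusion = volatility * std::sqrt(timeLength);
    double payoffSum = 0.0;
    for (unsigned int i = 0; i < samples; ++i) {
        const double terminal = underlying * std::exp(drift + diffusion * gauss(rng));
        payoffSum += isCall ? std::max(terminal - strike, 0.0)
                            : std::max(strike - terminal, 0.0);
    }
    return std::exp(-riskFreeRate * timeLength) * payoffSum / samples;
}

// Standard normal cumulative distribution function.
double normalCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Closed-form Black-Scholes price with a continuous dividend yield.
double blackScholesClosedForm(double underlying, double volatility, double dividendYield,
                              double riskFreeRate, double timeLength, double strike,
                              bool isCall) {
    const double sqrtT = std::sqrt(timeLength);
    const double d1 = (std::log(underlying / strike) +
                       (riskFreeRate - dividendYield + 0.5 * volatility * volatility) *
                           timeLength) / (volatility * sqrtT);
    const double d2 = d1 - volatility * sqrtT;
    const double discountedSpot = underlying * std::exp(-dividendYield * timeLength);
    const double discountedStrike = strike * std::exp(-riskFreeRate * timeLength);
    return isCall ? discountedSpot * normalCdf(d1) - discountedStrike * normalCdf(d2)
                  : discountedStrike * normalCdf(-d2) - discountedSpot * normalCdf(-d1);
}
```

With enough samples the simulated price should approach the closed-form value, with an error that shrinks roughly as one over the square root of the sample count.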

As the figure below shows, the Monte Carlo simulation process is accelerated by multiple Monte Carlo model (MCM) units operating in dataflow, with the sub-modules inside each MCM working as a pipeline.

All sub-modules in an MCM are connected by HLS streams.

The latency of a pricing engine based on Monte Carlo simulation is:

Number of cycles = requiredSamples * timeSteps / MCM

where MCM is the number of MCM units running in parallel.
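
As a worked example of the latency formula, the sketch below evaluates it for given parameters and converts the result to time. The helper names, the rounding up to whole cycles, and the clock frequency passed by the caller are assumptions for illustration. The estimate ignores pipeline fill, memory transfers, and host overhead.

```cpp
#include <cassert>
#include <cstdint>

// Ideal cycle count: requiredSamples * timeSteps / MCM, rounded up.
std::uint64_t estimateCycles(std::uint64_t requiredSamples, std::uint64_t timeSteps,
                             std::uint64_t mcmUnits) {
    assert(mcmUnits > 0);
    const std::uint64_t work = requiredSamples * timeSteps;
    return (work + mcmUnits - 1) / mcmUnits;
}

// Convert a cycle count to milliseconds for a kernel clock given in MHz.
double cyclesToMilliseconds(std::uint64_t cycles, double clockMHz) {
    assert(clockMHz > 0.0);
    return static_cast<double>(cycles) / (clockMHz * 1000.0);
}
```

For example, 1024 samples with 100 time steps on 10 MCM units gives 10240 cycles, which would be about 0.034 ms at a hypothetical 300 MHz clock.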

Compared to European option pricing, American option pricing is more complex. It is divided into three functions based on the Longstaff-Schwartz algorithm.

- MCAmericanEnginePreSamples function generates a small set of samples for calibration.
- MCAmericanEngineCalibrate function uses the least-squares method to calibrate the coefficients of the regression model with samples from MCAmericanEnginePreSamples.
- MCAmericanEnginePricing function looks up the best exercise point by using the calibrated regression model.
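
The three stages above can be sketched in plain C++ for an American put. This is a simplified software illustration of the Longstaff-Schwartz idea, not the library's kernels. All function names, the quadratic basis in `S / strike`, and the use of `std::mt19937` are assumptions. The library's coefficient layout and basis functions may differ.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

using Paths = std::vector<std::vector<double>>;  // paths[p][t] for t = 0..steps
using Coef = std::array<double, 3>;              // assumed basis: 1, x, x^2 with x = S / strike

// Stage 1 (pre-samples): simulate price paths under Black-Scholes dynamics.
Paths simulatePaths(double s0, double r, double q, double sigma, double T,
                    std::size_t steps, std::size_t nPaths, std::uint32_t seed) {
    assert(steps >= 1 && nPaths >= 1);
    std::mt19937 rng(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    const double dt = T / steps;
    const double drift = (r - q - 0.5 * sigma * sigma) * dt;
    const double vol = sigma * std::sqrt(dt);
    Paths paths(nPaths, std::vector<double>(steps + 1, s0));
    for (auto& path : paths)
        for (std::size_t t = 1; t <= steps; ++t)
            path[t] = path[t - 1] * std::exp(drift + vol * gauss(rng));
    return paths;
}

// Solve the 3x3 normal equations by Gaussian elimination (no pivoting; matrix is symmetric PSD).
Coef solve3(std::array<Coef, 3> a, Coef b) {
    for (int i = 0; i < 3; ++i) {
        if (a[i][i] < 1e-12) return Coef{0.0, 0.0, 0.0};  // too few points to fit
        for (int k = i + 1; k < 3; ++k) {
            const double f = a[k][i] / a[i][i];
            for (int j = i; j < 3; ++j) a[k][j] -= f * a[i][j];
            b[k] -= f * b[i];
        }
    }
    Coef c{};
    for (int i = 2; i >= 0; --i) {
        double s = b[i];
        for (int j = i + 1; j < 3; ++j) s -= a[i][j] * c[j];
        c[i] = s / a[i][i];
    }
    return c;
}

// Fitted continuation value for normalized price x.
double continuation(const Coef& c, double x) { return c[0] + c[1] * x + c[2] * x * x; }

// Stage 2 (calibrate): backward induction with a least-squares fit at each step.
std::vector<Coef> calibratePut(const Paths& paths, double strike, double r, double T) {
    assert(!paths.empty() && paths[0].size() >= 2);
    const std::size_t steps = paths[0].size() - 1;
    const double disc = std::exp(-r * T / steps);
    std::vector<Coef> coef(steps + 1, Coef{0.0, 0.0, 0.0});
    std::vector<double> cash(paths.size());
    for (std::size_t p = 0; p < paths.size(); ++p)
        cash[p] = std::max(strike - paths[p][steps], 0.0);
    for (std::size_t t = steps - 1; t >= 1; --t) {
        std::array<Coef, 3> a{};
        Coef b{};
        for (std::size_t p = 0; p < paths.size(); ++p) {
            cash[p] *= disc;  // value at step t of the cash flow received later
            const double x = paths[p][t] / strike;
            if (x >= 1.0) continue;  // regress on in-the-money paths only
            const double basis[3] = {1.0, x, x * x};
            for (int i = 0; i < 3; ++i) {
                b[i] += basis[i] * cash[p];
                for (int j = 0; j < 3; ++j) a[i][j] += basis[i] * basis[j];
            }
        }
        coef[t] = solve3(a, b);
        for (std::size_t p = 0; p < paths.size(); ++p) {
            const double exercise = strike - paths[p][t];
            if (exercise > 0.0 && exercise > continuation(coef[t], paths[p][t] / strike))
                cash[p] = exercise;
        }
    }
    return coef;
}

// Stage 3 (pricing): on fresh paths, exercise at the first step where payoff beats the fit.
double pricePut(const Paths& paths, const std::vector<Coef>& coef, double strike, double r, double T) {
    const std::size_t steps = paths[0].size() - 1;
    const double dt = T / steps;
    double sum = 0.0;
    for (const auto& path : paths) {
        std::size_t t = 1;
        for (; t < steps; ++t) {
            const double exercise = strike - path[t];
            if (exercise > 0.0 && exercise > continuation(coef[t], path[t] / strike)) break;
        }
        sum += std::exp(-r * dt * t) * std::max(strike - path[t], 0.0);
    }
    return sum / paths.size();
}
```

A small calibration set followed by a larger, independent pricing set mirrors the split between the three functions: the calibration samples only need to be good enough to fit the exercise rule, while the pricing samples determine the accuracy of the final estimate.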

The Vitis Quantitative Finance library provides optimized pricing engine APIs with template parameters for the data type and the number of MCMs running in parallel, which affect resource utilization, performance, and the quality of the result.

Monte Carlo European pricing engine APIs:

    template <typename DT = double, int UN = 10, bool Antithetic = false>
    void MCEuropeanEngine(
        DT underlying,
        DT volatility,
        DT dividendYield,
        DT riskFreeRate,
        DT timeLength,
        DT strike,
        bool optionType,
        ap_uint<32>* seed,
        DT* output,
        DT requiredTolerance = 0.02,
        unsigned int requiredSamples = 1024,
        unsigned int timeSteps = 100,
        unsigned int maxSamples = 134217727
    );
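
The `Antithetic` template parameter and the `requiredTolerance` and `maxSamples` arguments suggest variance reduction and a tolerance-based stopping rule. The article does not describe how the engine implements them, so the following is only a software sketch of one common approach: antithetic variates combined with a loop that stops once the standard error falls below the tolerance or the sample cap is reached. The function `mcEuropeanCallWithTolerance`, the `McResult` struct, and the `batchSize` argument are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

struct McResult {
    double price;          // discounted mean payoff
    double standardError;  // standard error of the estimate
    unsigned int samples;  // samples drawn (each antithetic pair counts once)
};

// Hypothetical sketch: European call priced in batches until the standard
// error is at most requiredTolerance or maxSamples would be exceeded.
template <typename DT = double, bool Antithetic = false>
McResult mcEuropeanCallWithTolerance(DT underlying, DT volatility, DT dividendYield,
                                     DT riskFreeRate, DT timeLength, DT strike,
                                     std::uint32_t seed, DT requiredTolerance = 0.02,
                                     unsigned int batchSize = 1024,
                                     unsigned int maxSamples = 134217727) {
    assert(batchSize > 1 && maxSamples >= batchSize);
    std::mt19937 rng(seed);
    std::normal_distribution<DT> gauss(0, 1);
    const DT drift =
        (riskFreeRate - dividendYield - DT(0.5) * volatility * volatility) * timeLength;
    const DT diffusion = volatility * std::sqrt(timeLength);
    const DT discount = std::exp(-riskFreeRate * timeLength);
    auto payoff = [&](DT z) {
        return discount * std::max(underlying * std::exp(drift + diffusion * z) - strike, DT(0));
    };
    double sum = 0.0, sumSq = 0.0, stdErr = 0.0;
    unsigned int n = 0;
    do {
        for (unsigned int i = 0; i < batchSize; ++i) {
            const DT z = gauss(rng);
            // Antithetic: average the payoffs of z and -z into one lower-variance sample.
            const double sample = Antithetic ? 0.5 * (payoff(z) + payoff(-z)) : payoff(z);
            sum += sample;
            sumSq += sample * sample;
        }
        n += batchSize;
        const double mean = sum / n;
        const double variance = std::max(sumSq / n - mean * mean, 0.0) * n / (n - 1.0);
        stdErr = std::sqrt(variance / n);
    } while (stdErr > requiredTolerance && n + batchSize <= maxSamples);
    return {sum / n, stdErr, n};
}
```

Because a call payoff is monotonic in the random draw, the antithetic version typically reaches the same tolerance with fewer draws, which is the kind of quality-versus-cost trade-off the template parameters expose.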

Monte Carlo American pricing engine APIs:

    template <typename DT = double, int UN = 2>
    void MCAmericanEnginePreSamples(
        DT underlying,
        DT volatility,
        DT riskFreeRate,
        DT dividendYield,
        DT timeLength,
        DT strike,
        bool optionType,
        ap_uint<32>* seed,
        ap_uint<8 * sizeof(DT) * UN>* priceOut,
        ap_uint<8 * sizeof(DT)>* matOut,
        unsigned int calibSamples = 4096,
        unsigned int timeSteps = 100
    );

    template <typename DT = double, int UN = 2, int UN_STEP = 2>
    void MCAmericanEngineCalibrate(
        DT timeLength,
        DT riskFreeRate,
        DT strike,
        bool optionType,
        ap_uint<8 * sizeof(DT) * UN>* priceIn,
        ap_uint<8 * sizeof(DT)>* matIn,
        ap_uint<8 * sizeof(DT) * 4>* coefOut,
        unsigned int calibSamples = 4096,
        unsigned int timeSteps = 100
    );

    template <typename DT = double, int UN = 2>
    void MCAmericanEnginePricing(
        DT underlying,
        DT volatility,
        DT dividendYield,
        DT riskFreeRate,
        DT timeLength,
        DT strike,
        bool optionType,
        ap_uint<32>* seed,
        ap_uint<8 * sizeof(DT) * 4>* coefIn,
        DT* output,
        DT requiredTolerance = 0.02,
        unsigned int requiredSamples = 4096,
        unsigned int timeSteps = 100,
        unsigned int maxSamples = 134217727
    );

The Monte Carlo American option pricing implementation connects the three functions through external memory (DDR/HBM). On the host side, the XRT runtime schedules them to work in pipeline mode, as elaborated in the benchmark.

Monte Carlo European Option Pricing. Workload size: 1 timestep, 47K paths

| | Cold Run | Warm Run |
|---|---|---|
| QuantLib | 20.155 ms | 20.155 ms |
| Vitis Quantitative Finance Library | 0.053 ms | 0.01325 ms |
| Acceleration | 380X | 1521X |

Notes:

- Cold run: the end-to-end execution time of a single option pricing run.
- Warm run: the mean end-to-end execution time over 100 option pricing runs executed continuously.
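
The two measurements can be expressed as a small timing harness. This sketch assumes the cold run is the first call and the warm figure is the mean over the following calls. The names `measureColdAndWarm`, `RunTimes`, and `speedup`, and the callable standing in for one option pricing run, are hypothetical and not part of the library or its benchmark code.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

struct RunTimes {
    double coldMs;  // time of the first run
    double warmMs;  // mean time over the following runs
};

// Time one cold run, then average warmRuns back-to-back runs.
RunTimes measureColdAndWarm(const std::function<void()>& priceOneOption,
                            unsigned int warmRuns = 100) {
    assert(warmRuns > 0);
    using Clock = std::chrono::steady_clock;
    using Ms = std::chrono::duration<double, std::milli>;
    auto start = Clock::now();
    priceOneOption();
    const double cold = Ms(Clock::now() - start).count();
    start = Clock::now();
    for (unsigned int i = 0; i < warmRuns; ++i) priceOneOption();
    const double warm = Ms(Clock::now() - start).count() / warmRuns;
    return {cold, warm};
}

// Acceleration ratio: baseline time divided by accelerated time.
double speedup(double baselineMs, double acceleratedMs) {
    assert(acceleratedMs > 0.0);
    return baselineMs / acceleratedMs;
}
```

Applied to the European figures above, `speedup(20.155, 0.053)` is about 380 and `speedup(20.155, 0.01325)` is about 1521, matching the acceleration row.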

The resource utilization is listed in the following table. There are 4 PUs, which can price 4 options in parallel to implement kernel-level parallelism. Each PU has the same resource utilization.

| | LUTs | FFs | BRAMs | URAMs | DSPs |
|---|---|---|---|---|---|
| 1 PU | 234072 | 376207 | 49 | 0 | 1594 |
| 4 PUs | 936288 | 1504828 | 196 | 0 | 6376 |
| Device total | 1728000 | 3456000 | 2688 | 1280 | 12288 |
| Utilization ratio | 54.18% | 43.54% | 7.29% | 0 | 51.89% |
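
The utilization ratio follows directly from the per-PU figures: multiply by the number of PUs and divide by the device total. A minimal sketch of that arithmetic, with a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>

// Percentage of a device resource consumed by puCount identical PUs.
double utilizationPercent(std::uint64_t usedPerPu, unsigned int puCount,
                          std::uint64_t deviceTotal) {
    assert(deviceTotal > 0);
    return 100.0 * static_cast<double>(usedPerPu * puCount) /
           static_cast<double>(deviceTotal);
}
```

For example, `utilizationPercent(234072, 4, 1728000)` gives about 54.18 for LUTs, and `utilizationPercent(1594, 4, 12288)` gives about 51.89 for DSPs.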

- American Option Pricing Engine

The performance is shown in the table below. Compared to the QuantLib baseline, the cold run is 176X faster and the warm run is 529X faster.

Monte Carlo American Option Pricing. Workload size: 100 timesteps, 24K paths

| | Cold Run | Warm Run |
|---|---|---|
| QuantLib | 1038.105 ms | 1038.105 ms |
| Vitis Quantitative Finance Library | 5.87 ms | 1.96 ms |
| Acceleration | 176X | 529X |

The resource utilization is listed in the following table.

| | LUTs | FFs | BRAMs | URAMs | DSPs |
|---|---|---|---|---|---|
| PU_0 (pre-sample) | 120756 | 185169 | 43 | 0 | 416 |
| PU_1 (calibrate) | 181793 | 267405 | 68 | 0 | 462 |
| PU_2 (pricing) | 251370 | 268839 | 71 | 0 | 911 |
| PU_3 (pricing) | 251370 | 268839 | 71 | 0 | 911 |
| Total of PUs | 805289 | 990252 | 253 | 0 | 2700 |
| Device total | 1728000 | 3456000 | 2688 | 1280 | 12288 |
| Utilization ratio | 46.6% | 28.65% | 9.41% | 0 | 21.97% |

Alvin Clark is a Sr. Technical Marketing Engineer working on Software and AI Platforms at Xilinx, helping usher in the ACAP era of programmable devices. Alvin has spent his career working and supporting countless FPGA and embedded designs in many varied fields ranging from Consumer to Medical to Aerospace & Defense. He has a degree from the University of California, San Diego and is working on his graduate degree at the Georgia Institute of Technology.

Zhenhong Guo is a software development section manager working on foundational library development at Xilinx, providing an excellent FPGA acceleration solution to customers. Zhenhong received a Master's degree in Microelectronics and Solid-State Electronics from the Institute of Electronics, Chinese Academy of Sciences, and has spent her career working on supporting quantitative finance and database libraries.