
Software Reliability at Optiver: Strategy

The first two software reliability concepts we examine concern our overall strategy for pursuing reliability. The foundation of that strategy is prudence and judgment over mechanical rule-following. No single methodology, set of best practices, or regulatory rule can guarantee correctness or is appropriate in all situations.

Not all errors are created equal 

The extent to which errors in different parts of the software ecosystem are liable to expose the firm to automation risk varies significantly. For example, errors in the order management logic of an automated trading system are far more likely to be dangerous than errors in a desktop tool displaying market turnover figures to the trader. We therefore explicitly admit a varied appetite for errors in our systems. This means we need not adopt the same strategy for assuring software reliability across all software components.

To be clear, admitting a tolerance for some errors does not mean we open the door to a laissez-faire attitude or to sloppy software development. Rather, it allows Technology leadership to focus time and resources on assuring the reliability of the most critical software components. Nor does it mean a complete absence of reliability concerns in the remaining software systems: it merely lets us adopt different practices that better balance rapid innovation and correctness.

Testing is just part of a software reliability strategy

Especially in light of new rules and guidelines emerging from regulatory bodies, software testing is receiving a good deal of attention. We acknowledge the importance of testing, but we take the firm view that it is just one of a number of angles of attack on the software reliability problem. To be clear, building reliable and robust software has always been a challenge in the industry. To date, no known method provides firm guarantees about correctness. In particular, with a huge number of failed projects, some of them very prominent, software testing methodologies cannot claim to be uniformly successful. In short, software reliability is not a solved problem: both industry and academia are still trying to tackle it.

Nevertheless, we acknowledge that modern best practices are available, and we seek always to use them to guide our own thinking. We stress the continuity of that thinking: best practices continue to evolve, and our approaches to quality assurance in general, and to testing specifically, will change accordingly.

A great example of these principles comes from our Automated Trading Systems (ATS) engineering teams. A few years ago we were in the midst of releasing some new trading applications. We made a few successive releases with small bugs in our risk-checking code.

Our risk-checking code is one of the most critical components in our system. Bugs in this part of the system are taken very seriously, and this spate of incidents was quite concerning.

To deal with this problem we took a very aggressive approach:

  • For every trading strategy the ATS engineers created a list of all functionality related to risk checks. This list was then vetted by our Technical Operations team to ensure it covered the areas they expected.
  • For every release, no matter how small or insignificant, functionality in this list would be tested by ATS engineers.
  • The testing needed to be “end-to-end” with a production-ready version of the binary. Unit tests or tests which “mocked” functionality of the application with stub code were not sufficient.
  • Each release would include an attestation that this testing was performed.
  • The Technical Operations team checked the attestation to ensure the testing was completed prior to each release.
  • In addition to pre-release testing, our Technical Operations team “fire drills” risk-related functionality in production to see it working in the real world. They do this on a regular basis, and especially when major functional changes are made to our systems. This gives us more assurance that the functionality of our risk-checking limits system continues to behave as expected.
  • Finally, every year Optiver’s offices perform a “peer assessment”, in which a group of engineers travels to each Optiver location to perform a deep-dive analysis of selected aspects of the office’s risk procedures.

As you can see, testing is only one part of a broad, company-wide, reliability strategy around risk limits.
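The pre-release part of that process (a vetted list of risk checks, end-to-end testing, and an attestation verified before release) can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of Optiver's actual tooling: the names `Attestation`, `missing_checks`, and `may_release`, as well as the example check names, are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Attestation:
    """Hypothetical record submitted with a release."""
    release_id: str
    tested_checks: FrozenSet[str]  # risk checks the engineers exercised
    end_to_end: bool               # True only if run against the production-ready binary


def missing_checks(required: Set[str], attestation: Attestation) -> Set[str]:
    """Return the vetted risk checks that the attestation does not cover."""
    return set(required) - set(attestation.tested_checks)


def may_release(required: Set[str], attestation: Attestation) -> bool:
    """Allow a release only if every vetted check was tested end-to-end.

    Unit tests or mocked runs do not count, so end_to_end must be True.
    """
    return attestation.end_to_end and not missing_checks(required, attestation)
```

In this sketch, a release whose attestation omits even one item from the vetted list, or whose tests were not run end-to-end, is blocked regardless of how small the change is.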

Our approach to software reliability is quite different, however, in other areas such as our user interfaces. Because there is little, if any, automated trading risk in these systems, the primary risk of error is loss of productivity, so our strategy is more accepting of errors and need not be as all-encompassing. We still adhere to our overall development principles and practice a variety of disciplines such as TDD, code reviews, and beta testing.

David Kent, Chief of Staff – Technology

David is a Stanford Computer Science alum and spent several years as a developer at Amazon.com. He joined Optiver as a Software Engineering Lead in 2009 and has led many of Optiver’s software development teams. He is presently Chief of Staff for the Optiver US Technology Group.
