Prior Work

Here are some accomplishments of Designing Patterns software engineers prior to joining Designing Patterns:


Ported a significant portion of a Ticker Plant (market data processing, storage, and distribution engine) from 32-bit to 64-bit software on both Solaris/UltraSPARC and AIX/PowerPC

Because the system was large and could not safely be cut over all at once, the 64-bit rollout had to be planned very carefully and had to proceed program by program. During the rollout, some of the system's processes were 32-bit and some were 64-bit. This required making the code compatible with both 32-bit and 64-bit builds: shared structures and wire formats whose layouts differed between the two had to be fixed, and platform-specific incompatibilities had to be worked around (on AIX, for example, the 32-bit pthread_mutex_t structure differs from the 64-bit one, so 32-bit and 64-bit processes cannot share a mutex placed in shared memory). In addition, unsafe pointer usage (such as assuming that a pointer is four bytes wide) and dependencies on libraries that could not be ported to 64-bit had to be removed.
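
The layout problem can be illustrated with a minimal C++ sketch. The struct and field names below are hypothetical, not the Ticker Plant's actual definitions; the point is only that fixed-width fields give a shared record the same layout in 32-bit and 64-bit builds, and that compile-time checks can enforce it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical legacy record: its size and field offsets depend on the build,
// because `long` and pointers are 4 bytes under ILP32 and 8 bytes under LP64.
struct LegacyQuote {
    long  sequence;
    char* symbol;
    int   price;
};

// Dual-compatible version (an assumption about one possible fix): fixed-width
// fields only, ordered so that no implicit padding is needed, and an offset
// into a shared string table in place of the raw pointer.
struct PortableQuote {
    std::uint64_t sequence;       // always 8 bytes
    std::uint32_t symbol_offset;  // always 4 bytes
    std::int32_t  price;          // always 4 bytes
};

// These checks fail at compile time if a build lays the record out differently.
static_assert(sizeof(PortableQuote) == 16, "unexpected record size");
static_assert(offsetof(PortableQuote, sequence) == 0, "unexpected offset");
static_assert(offsetof(PortableQuote, symbol_offset) == 8, "unexpected offset");
static_assert(offsetof(PortableQuote, price) == 12, "unexpected offset");
```

Compiling the same header in both the 32-bit and the 64-bit build is what makes the assertions useful: a mismatch stops the build instead of corrupting shared memory at run time.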


Refactored a Ticker Plant's volume-weighted average price (VWAP) calculation code

Prior to this work, the Ticker Plant had several different legacy code bases that did VWAP calculations for different forms of output. This work consisted of creating a clean, testable VWAP calculation library and, one at a time, changing the different VWAP clients to call into the new code base. At the end of this process, the Ticker Plant had a single VWAP calculation library. This paid dividends when an upgrade of the US equity feeds dramatically changed the way that VWAP was calculated for those feeds: the older code bases were not flexible enough to handle the new specification, and each would have had to be changed separately, which would have been inefficient and dangerous because they were not easily testable.
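
For readers unfamiliar with the calculation, VWAP is the sum of price times volume divided by the sum of volume. The following is a minimal sketch of a small, testable accumulator; the class name and the use of double for prices are assumptions made for brevity, and the feed-specific rules mentioned above are not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Accumulates VWAP = sum(price * volume) / sum(volume) over a stream of trades.
class VwapCalculator {
public:
    // Adds one trade to the running totals.
    void addTrade(double price, std::uint64_t volume) {
        notional_ += price * static_cast<double>(volume);
        volume_ += volume;
    }

    bool hasTrades() const { return volume_ != 0; }

    // Returns 0.0 when no volume has traded, to avoid dividing by zero.
    double vwap() const {
        return volume_ == 0 ? 0.0 : notional_ / static_cast<double>(volume_);
    }

private:
    double notional_ = 0.0;     // running sum of price * volume
    std::uint64_t volume_ = 0;  // running sum of volume
};
```

With trades of 100 shares at 10.0 and 300 shares at 20.0, the totals are 7000 and 400, giving a VWAP of 17.5.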


Refactored a Ticker Plant's US equity feed processing

Prior to this work, the US equity feed processing logic was commingled with the logic for many other exchanges. This meant that changes for other exchanges' feeds could inadvertently break the US equity feed, and changes for the US equity feed could inadvertently break other exchanges' feeds. As the US equity feed was both the company's most critical feed and one of its most frequently changed feeds, this was an enormous problem. The massive US feed changes associated with Reg NMS necessitated refactoring the processing logic, as the original infrastructure could not be changed to accommodate the new specifications cleanly or safely. After an extensive period of study to understand the legacy code, the US equity feed processing was completely rewritten within a new infrastructure that separated the US equity code base from that for other exchanges. This work required extensive collaboration and testing with other groups and was done under extreme time pressure in order to meet the Federal Government's Reg NMS enactment date.


Created an extensible interface around the records in a Ticker Plant database

Prior to this work, all of the Ticker Plant's code directly accessed fields in the database records with no abstraction layer, which was an enormous problem as there were many different record formats (hundreds of different formats serving thousands of exchanges). This led to much code of the form:

  if (is one kind of record)
    get high price from the first word
  else if (is another kind of record)
    get high price from the second word
  else if (is a third kind of record)
    get high price from the 56th word

The new abstraction layer allowed a given data item to be accessed by name ("high price"), with the access polymorphically dispatched to a class appropriate to the kind of record being manipulated. The layer was deployed for only a few record types initially but, over the course of a couple of years, was extended to provide access to all record types. Similarly, it initially provided access to only a few fields, and access to all fields was added over time. This refactoring project allowed the number of fields supported by the Ticker Plant to increase dramatically. Prior to this work, adding new fields to a particular record was a difficult and dangerous task because the code that accessed the record was spread across many different files and subsystems. The abstraction layer, by contrast, localized all code for a given record in a single file and allowed most of the configuration for a record to be done with a simple domain-specific language. An extensive unit testing framework was also built to test the new record accessor classes. In addition, as all Ticker Plant developers needed to access database records programmatically, substantial documentation about the new system was created to educate the group, which helped the new system quickly achieve widespread adoption.
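
The idea of by-name access with polymorphic dispatch can be sketched in C++ as follows. This is an illustration under stated assumptions, not the actual design: the class names, the two record kinds, and the word offsets are all hypothetical, and the domain-specific language that configured real records is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Record = std::vector<std::int64_t>;  // a raw record as an array of words

// Callers ask for a field by name and never see word offsets.
class RecordAccessor {
public:
    virtual ~RecordAccessor() = default;

    std::int64_t get(const Record& record, const std::string& field) const {
        return record.at(offsetOf(field));  // at() throws if the record is too short
    }

protected:
    // Each record format supplies its own name-to-offset mapping.
    virtual std::size_t offsetOf(const std::string& field) const = 0;
};

// Hypothetical record kind that keeps the high price in the first word.
class EquityAccessor final : public RecordAccessor {
protected:
    std::size_t offsetOf(const std::string& field) const override {
        if (field == "high price") return 0;
        throw std::out_of_range("unknown field: " + field);
    }
};

// Hypothetical record kind that keeps the high price in the second word.
class FutureAccessor final : public RecordAccessor {
protected:
    std::size_t offsetOf(const std::string& field) const override {
        if (field == "high price") return 1;
        throw std::out_of_range("unknown field: " + field);
    }
};
```

Client code holds a RecordAccessor reference and calls get(record, "high price"); the if/else chain over record kinds disappears from the call sites, and each format's offsets live in one place.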


Cofounded and managed a team of developers in Tokyo, within the context of a larger group spread between New York, London, and Tokyo

This team was created in order to respond quickly to outages during the Asian day and to develop Asian products more efficiently. Being the team leader entailed securing projects for the team, guiding the work of the team's developers, supervising the handling of any outages, conducting local hiring processes (which succeeded in hiring two developers in Tokyo), and communicating with management in New York and colleagues in London. Maintaining communication despite time zone differences and geographical separation was especially critical: the group shared a single code base and operational responsibility, so any changes had to be coordinated carefully with the other teams.


Took a lead role in planning, coordinating, and executing the refactoring of a Ticker Plant's delayed feed processing code base

Market data feeds generally are offered on a real-time basis for a fee and freely on a delayed basis (this is why the CSCO quote shown by Google is delayed by 15 minutes, for example). The Ticker Plant distributed both real-time and delayed updates to its clients. Unfortunately, real-time and delayed information for a given security was shoehorned into a single database record, which meant that the real-time and delayed versions of a given field (high price, for instance) were located at different offsets in the record. This difference and a few others resulted in completely separate code bases being developed for real-time and delayed processing. This was a very poor design decision: the logic underlying the processing was the same, so almost identical processing changes always had to be made in two different places, which naturally led to discrepancies between the real-time and delayed code bases. The decision was baked into many different subsystems within the Ticker Plant, and changing the system so that the real-time code base could be used for delayed processing (eliminating the separate delayed code base) required the work of many different members of the group. Once the basic changes to the infrastructure had been made, each exchange feed had to be tested on and changed over to the new infrastructure separately in order to minimize the risk of outages.
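
One way a single code base can serve both cases is to pass the field offsets in as data rather than hard-coding them. The sketch below illustrates that general idea only; it is not a description of the actual change, and the layout struct, the offsets, and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Record = std::vector<std::int64_t>;  // a raw record as an array of words

// Where each field lives within the record for one flavor of data.
struct FieldLayout {
    std::size_t last_price;
    std::size_t high_price;
};

// Hypothetical offsets: real-time and delayed fields share one record.
constexpr FieldLayout kRealTime{0, 1};
constexpr FieldLayout kDelayed{2, 3};

// One processing routine for both flavors; only the layout differs.
inline void applyTrade(Record& record, const FieldLayout& layout, std::int64_t price) {
    assert(layout.last_price < record.size() && layout.high_price < record.size());
    record[layout.last_price] = price;
    if (price > record[layout.high_price]) {
        record[layout.high_price] = price;
    }
}
```

A logic change made in applyTrade then applies to real-time and delayed processing alike, which removes the opportunity for the two to drift apart.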


Scaled a Ticker Plant so that it could handle over 400,000 incoming messages per second on a single machine

A proprietary, in-memory database lay at the core of this Ticker Plant. The dramatic recent increase in market data bandwidth, particularly that of the US Equity Options feed (OPRA) (this article discusses some numbers), necessitated scaling the Ticker Plant so that it could handle the expected throughput increase. To do this, the performance of the existing system was analyzed by building an extensive throughput measurement and reporting system and a stress testing framework. This led to targeted profiling and identification of the system's bottlenecks. The project produced a document describing the current performance and proposing projects that could be undertaken to improve it. For each suggested project, the document discussed the projected performance payoff, the estimated effort, and possible implementation strategies. These plans formed the basis of the group's efforts to deal with the increasing bandwidth, and the proposed projects were completed successfully, allowing the software to handle the bandwidth increases.


Reduced the latency associated with tick processing in a Ticker Plant

Providing very low latency data is an important way in which market data vendors distinguish themselves from the competition. Low latency in this context might mean an update flowing through the vendor's data center in under 10 ms, for example. Prior to this work, the Ticker Plant's software was not able to provide this level of real-time service, as it had not been built to these specifications. Bounding the time required to process a tick required locating the processing actions that took an indeterminate amount of time (those in which the processing task might block, for instance). This information was gathered by instrumenting the entire system and measuring how long each key call took, using high-precision timers that were cheap enough to deploy in production. The timing measurements led to the realization that, while the processing was quite fast on average, it sometimes exhibited pathological slowness (sampling profilers had missed this, and commercial instrumenting profilers distorted the operation of the system too much to be useful). The timing measurements were linked to the kind of update being processed, allowing all offending code to be tracked down quickly.
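
The kind of instrumentation described can be sketched as a scoped timer that records both the average and the worst case for a call site. This is a minimal illustration, not the system's actual code: the names are hypothetical, and std::chrono::steady_clock stands in for whatever high-precision timer was really used.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>

// Running latency statistics for one instrumented call site.
struct LatencyStats {
    std::uint64_t count = 0;
    std::uint64_t total_ns = 0;
    std::uint64_t max_ns = 0;

    void record(std::uint64_t ns) {
        ++count;
        total_ns += ns;
        max_ns = std::max(max_ns, ns);
    }

    std::uint64_t mean_ns() const { return count == 0 ? 0 : total_ns / count; }
};

// Measures the enclosing scope and records the elapsed time on destruction.
class ScopedTimer {
public:
    explicit ScopedTimer(LatencyStats& stats)
        : stats_(stats), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        stats_.record(static_cast<std::uint64_t>(
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count()));
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    LatencyStats& stats_;
    std::chrono::steady_clock::time_point start_;
};
```

Tracking the maximum is what exposes pathological cases: with 999 calls of 1 μs and a single call of 50 ms, the mean is only about 51 μs while the maximum is 50 ms.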


Eliminated process contention in a Ticker Plant data distribution system

The Ticker Plant was responsible for distributing real-time updates to client applications. Prior to this work, the distribution system had four input queues, and producer tasks would put new real-time updates on these queues. Unfortunately, this system did not scale to hundreds of thousands of updates per second, as the producer tasks contended heavily with each other on the queue locks. This project gave each producer task its own lockless queue to the consumer task, eliminating all contention between the producers and allowing the system to scale to over 1,000,000 updates per second.
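
A per-producer lockless queue can be sketched as a single-producer/single-consumer ring buffer. The following is a textbook version written with C++11 atomics, offered as an illustration under that assumption; the original system's queue implementation is not described here, and the class name is hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Single-producer/single-consumer ring buffer. Exactly one thread may call
// push() and exactly one other thread may call pop(); no locks are taken.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2, "one slot is always left empty");

public:
    // Producer side. Returns false if the queue is full.
    bool push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % Capacity;
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        slots_[tail] = value;
        tail_.store(next, std::memory_order_release);  // publish the filled slot
        return true;
    }

    // Consumer side. Returns false if the queue is empty.
    bool pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = slots_[head];
        head_.store((head + 1) % Capacity, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // written only by the consumer
    std::atomic<std::size_t> tail_{0};  // written only by the producer
};
```

Because each index has exactly one writer, producers never touch shared state belonging to another producer. The consumer polls one such queue per producer, so adding producers adds queues rather than adding contention.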


Rewrote a Ticker Plant's in-memory database, to make it more scalable and more robust

Prior to this work, there was significant contention on the database locks. Under heavy update loads, update times of over 50 ms were observed, more than 50,000 times the goal of 1 μs. At heavy loads, when the machine had little spare CPU, processes would acquire internal database locks and then be scheduled off their CPUs, causing other processes to block until the operating system scheduled the original process back onto a CPU. This work replaced all internal database locks with a lockless database update algorithm, fixing this problem (afterwards, a process getting scheduled off did not impact the performance of other processes because it could not hold a lock needed by them). This work also made the database more robust: previously, the entire system would hang if a process died while holding an internal database lock, whereas after the rewrite a dying process could not impact other processes by holding any internal database resources.
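
The update algorithm itself is not described here. As a general illustration of why a lockless update avoids the problem, the sketch below uses a compare-and-swap retry loop, a common building block for such algorithms; the function name and the choice of a high-price field are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Raises `high` to `price` if `price` is larger, without taking a lock.
// Returns true if this call changed the stored value.
inline bool updateHigh(std::atomic<std::int64_t>& high, std::int64_t price) {
    std::int64_t current = high.load(std::memory_order_relaxed);
    while (price > current) {
        // On failure, compare_exchange_weak stores the latest value in
        // `current`, and the loop condition is re-evaluated against it.
        if (high.compare_exchange_weak(current, price,
                                       std::memory_order_release,
                                       std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;  // another writer already stored an equal or higher price
}
```

A writer that is descheduled, or that dies, between the load and the compare-and-swap holds nothing that other writers need: they simply succeed, and the interrupted writer retries against the new value if it ever resumes.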


Improved a multicast data distribution system

This work was done in multiple projects over several years. One aspect of this work was improving the proprietary multicast protocol so that it would scale properly with the increasing throughput. Another aspect was creating a library for iterating over the contents of the multicast stream. Prior to this library, this logic was duplicated in multiple client code bases, making it very difficult to change the wire format. Supporting the library required distributing software to different groups and porting the library to new platforms, such as Linux/i386, when the need arose. All of the work described above required rewriting legacy code into object-oriented, testable units and building unit testing suites, which was especially important because this system was extremely critical and, unfortunately, gradual releases were impossible.
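
The value of a shared iteration library can be seen in a small sketch. The framing below is invented for illustration and is not the proprietary wire format: each message is assumed to be a 2-byte big-endian payload length, a 1-byte type, and then the payload. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A non-owning view of one message inside a received packet.
struct MessageView {
    std::uint8_t type;
    const std::uint8_t* payload;
    std::size_t length;
};

// Walks the messages in a packet, assuming the hypothetical framing
// [length: 2 bytes, big-endian][type: 1 byte][payload: length bytes].
class PacketIterator {
public:
    PacketIterator(const std::uint8_t* data, std::size_t size)
        : cur_(data), remaining_(size) {}

    // Fills `out` with the next message and advances past it. Returns false
    // at the end of the packet or if the remaining bytes are truncated.
    bool next(MessageView& out) {
        constexpr std::size_t kHeader = 3;
        if (remaining_ < kHeader) return false;
        const std::size_t length =
            (static_cast<std::size_t>(cur_[0]) << 8) | cur_[1];
        if (remaining_ - kHeader < length) return false;  // truncated payload
        out = MessageView{cur_[2], cur_ + kHeader, length};
        cur_ += kHeader + length;
        remaining_ -= kHeader + length;
        return true;
    }

private:
    const std::uint8_t* cur_;
    std::size_t remaining_;
};
```

When every client walks the stream through one class like this, a change to the framing is made in one place instead of in each client code base.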


Created a low latency TCP data distribution protocol and server

An instance of this server accepted a data feed and distributed it to several clients, connected over either TCP or UNIX domain sockets. The server was designed from the ground up for high-throughput, low-latency data distribution. Every aspect of the server was fully unit tested.


Developed and deployed software for a continuously available Ticker Plant

Downtime was not acceptable for this Ticker Plant, and any partial loss or reduction in service had a huge impact on the client base. Large projects were nonetheless executed within this demanding framework. Enormous effort was put into constructing and testing software in order to minimize the chance of failures and to raise alerts when failures did occur. Releases were staged carefully and coordinated with other groups; outages were handled as quickly as possible, even when that required logging into the system late at night or on the weekend.


Created a Java spreadsheet program

The program could undo and redo an arbitrary number of actions. In addition, it supported Excel-style formulas in cells and allowed different formatting in different cells (bold, underline, italics, etc.). It also supported having multiple spreadsheets open at once and could save spreadsheets to disk and load them back.
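
Unbounded undo and redo is commonly built from two stacks of reversible actions. The sketch below shows that general technique; it is written in C++ rather than the program's Java, it is not the program's actual code, and all names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// A reversible action: `apply` performs it and `revert` undoes it.
struct Action {
    std::function<void()> apply;
    std::function<void()> revert;
};

class History {
public:
    // Performs a new action. Anything previously undone can no longer be redone.
    void perform(Action action) {
        action.apply();
        undo_.push_back(std::move(action));
        redo_.clear();
    }

    // Reverts the most recent action. Returns false if there is nothing to undo.
    bool undo() {
        if (undo_.empty()) return false;
        Action action = std::move(undo_.back());
        undo_.pop_back();
        action.revert();
        redo_.push_back(std::move(action));
        return true;
    }

    // Reapplies the most recently undone action. Returns false if there is none.
    bool redo() {
        if (redo_.empty()) return false;
        Action action = std::move(redo_.back());
        redo_.pop_back();
        action.apply();
        undo_.push_back(std::move(action));
        return true;
    }

private:
    std::vector<Action> undo_;  // most recent action at the back
    std::vector<Action> redo_;  // most recently undone action at the back
};
```

Because the stacks grow as needed, the number of actions that can be undone or redone is limited only by available memory.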