Just wanted to give a heads-up that I’ll be speaking at the 6th Annual Microsoft Financial Services Developer Conference next week in New York. Here’s the abstract:

Distributed Caches: A Developer’s Guide to Unleashing Your Data in High-Performance Applications

With the advent of high-performance computing, application developers have begun to deliver computation on a massive scale by distributing it across multiple processors and machines. While distributed computing has given performance-critical applications more processor headroom, it has also shifted attention to previously latent bottlenecks: the storage, replication, and transmission of application data. The scalable power of compute grids is ultimately bound to the data grids that feed them. Without data caching and dissemination tools like distributed caches, it can be difficult to fully leverage a high-performance computing solution like Microsoft Windows HPC Server 2008 in data-intensive financial services applications.

In this talk, Lab49 will present a developer-centric guide to distributed caches, including what services they provide, how they function, how they affect application performance, and how they can be integrated into an application architecture. Lab49 will present test data showing the impact of several different caching strategies (such as compression and data segmentation) on distributed cache performance and will also demonstrate a methodology for testing distributed cache functionality. Lab49 will also discuss other application considerations as well, such as object naming conventions, notifications, and read-through/write-through to an underlying persistent data store.
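As a taste of the read-through/write-through behavior mentioned above, here is a minimal, illustrative sketch (not any vendor's API); a plain dict stands in for the underlying persistent data store, and all names are invented for the example.

```python
import threading

class ReadThroughCache:
    """Illustrative read-through/write-through cache sketch."""

    def __init__(self, backing_store):
        self._store = backing_store  # stands in for a persistent data store
        self._cache = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key in self._cache:        # cache hit: served from memory
                return self._cache[key]
            value = self._store[key]      # read-through: fetch from store on a miss
            self._cache[key] = value
            return value

    def put(self, key, value):
        with self._lock:
            self._cache[key] = value      # write-through: cache and store are
            self._store[key] = value      # updated together
```

Real distributed caches layer replication, partitioning, and notifications on top of this basic contract, but the read-through/write-through idea is just this: misses fall through to the store, and writes land in both places.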

Lab49 will also be showing off some cool work we’ve done for the Microsoft Windows HPC Server 2008 product team in the area of computational finance. Ssssh.

Lastly, we’ll be announcing the winner of the Lab49 WPF in Finance Innovation Contest. Somebody is going to take home some really cool prizes. Note that if you have been working on your submission (or you are still toying with the idea), the deadline for submissions has been extended to March 10, 2008. Just enough time to add some extra XAML bling.

Look forward to seeing you there!

The Marc Jacobs Utilization Meter has been pegged for at least two weeks now on a combination of client work, internal projects, recruiting, and writing (hence the appearance of my blog having fallen down a well). It’s great to be busy, but I hate seeing the blog go stale.

In any event, I had an article published in GRIDtoday this morning entitled, “Grid in Financial Services: Past, Present, and Future”. Derrick Harris, the editor of GRIDtoday, reached out for an article after reading my multi-part series on “High Performance Computing: A Customer’s Perspective”. A big thanks to Derrick for giving me this opportunity.

Over the past few months at Lab49, we’ve thrown ourselves into complex event processing (CEP) — aka event stream processing (ESP) — and have been formulating exactly how and when it fits into the larger, more comprehensive technology stack found in global financial services institutions. We’ve formed a number of interesting vendor partnerships, attended product training, sampled, compared, and teased apart many of the popular products, and we’ve created several CEP-based demo applications that have been shown at recent events like SIFMA.

Along the way, we’ve all learned a lot about CEP, and the more I learn, the more I dig it. The more I put CEP into practice, the more I foresee its ultimate dominance as an architectural design pattern for everyday development.

What’s fascinating to me about CEP isn’t that it’s a new idea, despite how vendors may tout it. Regardless of the hype, CEP isn’t the most revolutionary technology you’ve never heard of. What’s fascinating is that, from a decades-old primordial soup of ideas, research, and trial-and-error (in and beyond academia) aimed at building architectural models around complex data problems with real-time constraints, enough best practices and design patterns have emerged to evolve an ecosystem of market entrants, seemingly all at once.

It’s not the first time that a bundle of quality design patterns took concrete form as a technology. Object pooling, lifetime management, transaction enlistment, and crash domains begat COM+, Microsoft Component Services, and J2EE application servers. Logging levels, external configuration, and adaptable logging sinks begat log4j, syslogd, and the Logging Application Block from the Microsoft Enterprise Library. Unit testing and test-driven development begat JUnit and its children.

These transformations have been crucial. Once developers accepted these patterns and solutions as sufficiently solved and commoditized, they were saved considerable time and attention. Freed of coding logging libraries and unit testing frameworks for the umpteenth time, developers could focus more on the business problem being solved rather than the infrastructure details required to solve it.

But these transformations didn’t really upset the gross architecture of applications. They may have changed some of the design decisions and simplified the implementations, but they didn’t fundamentally change the abstraction you would use to model a problem and architect a solution.

CEP, on the other hand, does.

Instead of storing and indexing miles of cumulative data in a persistent store to service complex queries in batch/polling fashion, CEP inverts the whole shebang, storing and indexing the complex queries before streaming data across queries without storing a lick. The transformation of a business problem from tables, rows, and polling intervals into events, filters, triggers, and real-time reactions is not only quite enabling, it changes the very way you think about how business problems can be solved and which problems may have viable solutions.
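The inversion is easy to sketch. In the toy engine below (illustrative names only, not any real CEP product), the standing queries are registered up front and each event streams past them once, with nothing persisted.

```python
class StreamProcessor:
    """Toy CEP-style engine: the queries are stored; the data just streams by."""

    def __init__(self):
        self._queries = []  # (predicate, action) pairs registered up front

    def register(self, predicate, action):
        self._queries.append((predicate, action))

    def push(self, event):
        # Each event is evaluated against every standing query and then
        # discarded: only the queries are "stored and indexed".
        for predicate, action in self._queries:
            if predicate(event):
                action(event)

alerts = []
engine = StreamProcessor()
# Standing query: react whenever a trade prints above a threshold.
engine.register(lambda e: e["price"] > 100,
                lambda e: alerts.append(e["symbol"]))
for event in [{"symbol": "MSFT", "price": 95},
              {"symbol": "GOOG", "price": 105}]:
    engine.push(event)
```

Commercial engines add windowing, joins across streams, and query languages on top, but the structural flip is exactly this: the query outlives the data instead of the data outliving the query.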

Over the next few weeks, I’ll delve a bit more into CEP and how it relates to technologies you might be more familiar with. In the meantime, check out some of the in-depth blog entries other folks from Lab49 have been writing about CEP.

Excerpted from a paper I delivered on January 16, 2007 at the Microsoft High-Performance Computing in Financial Services event in New York.

In Closing

It’s a very exciting time to be a proponent of high-performance computing in finance. Right now, it’s still a rather rugged task, and evangelizing such rough solutions can sometimes result in sour impressions, but overall it’s getting easier to make it work all the time. With all the new products and vendors entering the market right now, I’m convinced we’ll be scaling out with ease in the coming years. But in the meantime, we have to be vigilant in ensuring that vendors understand our business and our developers, and that they bring to market the tools and guidance that allow us to keep prioritizing business first and technology second.


Help Me Help You Help Me

To drive home the point, our trading and portfolio generation systems at Bridgewater have been parallelized and distributed for some time, based on a series of proprietary technologies that a) were not that great, b) lacked many features, and c) probably shouldn’t have been written in the first place. Along the way, we used DCOM, COM+, and .NET Remoting. We wrote custom job schedulers and custom deployment processes. We leveraged virtualization, disk imaging, multicast networks, message queues, even Microsoft Application Center. Each time, we managed to stack up the available Lego pieces and make a nice little tower out of it. But, typical of enterprise development projects that supply infrastructure rather than specific line-of-business value, they lacked for amenities. The APIs were never sufficiently developed or documented, the monitoring and administration tools often required black art skills, and the user interfaces, if present at all, bordered on sadistic.

Each time we revisited this situation, we knew that we shouldn’t be writing this stuff. We shouldn’t have had to. We knew that it wasn’t our core expertise and that we would never devote enough developer resources to give it professional polish. In reality, there are always just too many real projects to work on. The problem was that, until recently, there just weren’t any off-the-shelf packages for developing distributed applications on the Microsoft platform. For various reasons, we weren’t going to start invading our IT infrastructure with Linux just to use half-baked open-source solutions. So we rolled our own. Again. And again.

Then, a little over a year and a half ago, I read a news brief about Digipede Network in the now defunct Software Developer Magazine. It advertised a commercial grid computing solution built entirely on Microsoft .NET and running on Microsoft Windows. We downloaded the eval, read the APIs. After a brief meeting among lead engineers, we decided to do a test port, just a dip of the toe to see how much effort it would take to switch to a commercial solution.

Let me tell you. The whole port to Digipede, not just the acid test but the whole port, took one developer (me!) exactly three days from start to finish. After just an additional two weeks of procurement, deployment, and testing, we went live. And it has been working great.

That’s the kind of help we need. We need books and articles aimed at the broader market, not at the ACM or IEEE, that show just how easy it can be. We need our vendors to wrap up the hard stuff and leave the samples, tools, and guidance so that we can just mark up our business logic and plug it into a high-performance computing infrastructure in a week.

Introducing the Windows Execution Foundation

Microsoft .NET Framework 3.0, despite the unfortunate naming confusion, brings with it a tantalizing mix of technologies that are just waiting to be composed into a high-performance computing framework for .NET. Building on the power of the Windows Communication Foundation and the Windows Workflow Foundation, we could solve the four big technical vacuums in financial high-performance computing:

  1. Job Deployment
  2. Job Security
  3. Pool Management
  4. Scalable I/O

Such a framework, let’s call it a Windows Execution Foundation, would have several features:

  1. Declarative Parallelism
  2. Distributed Concurrency and Coordination Constructs
  3. Distributed Shared Memory and Object Caches
  4. Lightweight File Swarming
  5. Lightweight Message Bus
  6. On-demand Pool Construction and Node Addressing

Armed with this kind of technology, the financial industry could focus on the business, not the technology. We could achieve high-performance computing without having to understand every relevant implementation detail. We could wrestle back control of our own scalability story from our IT departments and solve our scalability problems with software.
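To make the first feature on that list concrete, here is a purely hypothetical sketch of what declarative parallelism might feel like to a developer: a decorator marks a function as parallelizable, and the framework decides how to fan the work out. Every name here is invented, and a local thread pool stands in for what a real execution framework would schedule across machines.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps

def parallel(max_workers=4):
    """Hypothetical declarative-parallelism marker. The developer declares
    that calls over a batch of inputs may run concurrently; the framework
    (here, just a local thread pool) decides how to execute them."""
    def decorate(fn):
        @wraps(fn)
        def run_all(inputs):
            with ThreadPoolExecutor(max_workers=max_workers) as pool:
                # Results come back in input order, as if called serially.
                return list(pool.map(fn, inputs))
        run_all.serial = fn  # the undecorated function remains reachable
        return run_all
    return decorate

@parallel(max_workers=4)
def risk(trade):
    return trade * 2  # stand-in for a real per-trade computation
```

The appeal is that `risk([1, 2, 3])` reads like business logic while the scheduling, pooling, and data movement live in the framework, which is precisely the division of labor argued for above.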



Know Thy Developer

To make us customers and help us drive high-performance computing through our infrastructure, you have to understand that our engineers prioritize business first and technology second. It’s a mandate. The technology serves the business goal, not the reverse. We attract and retain brilliant developer talent. We shower them with education and learning opportunities. At the end of the day, though, we are grooming them to be generalists, not specialists. We care more that they understand their menu of options and know how to choose a solution appropriate to the problem than that they become experts on the inner workings of any one technology. If we demand any specialized knowledge at all, it’s usually of finance and economics, not distributed computing, algorithms, program analysis, or detailed performance optimization. I know that they can learn those things, but in practice they can’t be their emphasis. Instead, they need to use that mindshare to ensure that we’re doing right by our customers every step of the way.

Yet this emphasis can give rise to rather pathological situations. Last week I led a code review of an important class in our data infrastructure. It was about 300 lines long, and it was written by one of our most senior and productive engineers. The class takes in a large matrix of Excel-like data extraction and manipulation formulas, evaluates each formula, and passes back the same matrix with each formula overwritten by its result. It’s used widely throughout the company by both end-users and automated processes.

In his attempt to improve the performance of this class on multiprocessor hosts, the engineer decided to parallelize evaluation of the formulas. He constructed a list of threads, appended onto it one thread per matrix element, and then used a semaphore to ensure that only n threads were running at a time. In other words, if you passed in a matrix with 10,000 cells, he’d create 10,000 threads, only eight of which (by default) would be runnable. He had locks in all the right places, test cases, a strong public interface, and copious comments. It even worked. But his poorly thought-out design would bring a server-class system to its knees in seconds, and he didn’t know why. After I showed him how to rewrite it in a more conventional producer/consumer pattern with a fixed number of threads, calls to this class that used to take ten minutes were taking less than ten seconds apiece.
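The rewrite can be sketched as follows. The actual code isn’t reproduced here, so a stand-in `evaluate` function and illustrative cell keys are assumed; the point is that the thread count is fixed up front while the work queue absorbs any number of cells.

```python
import queue
import threading

def evaluate_matrix(formulas, evaluate, workers=8):
    """Evaluate every cell with a fixed pool of worker threads.

    `formulas` maps (row, col) -> formula string; `evaluate` stands in for
    the real formula evaluator. Only `workers` threads ever exist, no matter
    how many cells are passed in.
    """
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    # Producer: enqueue one task per cell before the workers start.
    for cell, formula in formulas.items():
        tasks.put((cell, formula))

    def worker():
        # Consumer: pull cells until the queue is drained, then exit.
        while True:
            try:
                cell, formula = tasks.get_nowait()
            except queue.Empty:
                return
            value = evaluate(formula)
            with lock:
                results[cell] = value

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A 10,000-cell matrix passed to this version still creates only eight threads; the queue, not the thread list, grows with the input.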

Now this guy is smart. He’s a great coder. He is excellent at picking appropriate technology for a given problem. In fact, he even designed and implemented most of the data infrastructure this class was part of. But when it comes to threading, he’s a rank beginner. He just didn’t know of a better way to do it.

That is the paradox of our engineers. They’re wicked smart. They are as capable as anyone of pulling off a MacGyver with duct tape and baling wire and making a workable system out of non-integrated pieces. But our developers need frameworks, patterns, and comprehensive tools for parallelizing and distributing their business logic. Without them, they’ll start making it up on their own. We all know that’s not where they should be placing their efforts. With all the other things they need to do, they won’t do it with excellence, and they won’t think through all the things they don’t know.



What’s Hard, What Isn’t

Part of the problem is that there is an unclear separation between what is hard and what isn’t, and the information out there isn’t helping much at all. Implementing MPI or distributed synchronization objects or job scheduling algorithms is reasonably hard and should probably be left to experts. But naively distributing a command-line executable or a method on a serializable class is cake.

Currently, it’s too easy to get distracted by the watchworks behind a high-performance computing solution. Just as you don’t need to solder a north bridge or lay out a motherboard PCB to write a blog entry, you don’t need to drive yourself mad generating different portfolios on different machines. What you need is guidance: guidance to know which problems are proudly parallel, which units of parallelization are appropriate, which data should be shared versus copied, which configurations create bottlenecks, how to get your code and data out to the compute nodes, and when to keep things simple versus when to roll up your sleeves and get dirty. In many ways, distributed computing is simpler than multithreading, since you’ve got better insulation between your processes and have to be more explicit about moving state around.
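The easy half of that spectrum really is easy. The sketch below distributes a pure, serializable unit of work across local processes using only the standard library; a real grid product swaps remote nodes in for local processes, but the shape of the developer’s job is the same. The portfolio-pricing function is a made-up stand-in.

```python
from multiprocessing import Pool

def price_portfolio(portfolio):
    # A pure function over an explicit, serializable argument: exactly the
    # shape of work that distributes naively. The arithmetic is a stand-in
    # for a real pricing model.
    return sum(position["qty"] * position["price"] for position in portfolio)

if __name__ == "__main__":
    portfolios = [
        [{"qty": 10, "price": 5.0}],
        [{"qty": 2, "price": 7.5}, {"qty": 1, "price": 1.0}],
    ]
    with Pool(processes=2) as pool:
        # Each portfolio is serialized, shipped to a worker process, priced,
        # and the result shipped back. State movement is entirely explicit.
        values = pool.map(price_portfolio, portfolios)
    print(values)
```

No locks, no shared memory, no synchronization subtleties: because the processes share nothing, the naive version is also the correct one.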

At Bridgewater, we’ve actually created a number of intern projects lately out of distributing existing .NET systems using a product called the Digipede Network from Digipede Technologies. I wouldn’t trust a single one of these interns to write high-quality multithreaded code or design caching strategies or implement distributed matrix multiplication using MPI, but on the other hand they’re able to roll out incredible distributed applications that work great with less than a week’s exposure to Digipede. They can do this because the problems are well-suited to the solution, the product is appropriately targeted to our company’s platform, and the appropriate samples, tools, and guidance exist to get the job done.

A Horrible Clang

If anything, that’s the reason why high-performance computing has such a bad rap. Until recently, high-performance computing meant Unix and clusters and fancy interconnects, all allied with a masochistic appreciation for open-source, thesis projects, and outdated PostScript documentation. The samples, tools, and guidance we’ve inherited have not been aimed at us, the predominant enterprise developers on Windows shifting wholesale from COM to .NET, VB to C#. MPI and OpenMP don’t target this audience. They target the hardcore C++ set. As much as I personally love C++ (almost as much as Python), it’s counterproductive for me to introduce it to my organization just to take advantage of vectorizing compilers, OpenMP, and MPI. I’d sooner settle for NullReferenceExceptions and reference semantics than GPFs and copy constructors any day of the week. Products like Digipede are a step in the right direction, but overall the message of high-performance computing on Windows is muddled, aimed at a narrow market that may not be interested.

At Ellington, my previous firm that specialized in mortgage-backed securities, we had a 256-node high-performance cluster built on Linux, GCC, and MPI. It would be a plum for Compute Cluster Server. Yet, I can’t think of one reason why I, as a principal software engineer, would recommend it to them. There is no point. They have too much invested in their current infrastructure, and there aren’t enough clear-cut savings and advantages that might warrant the costs and resultant risks. Perhaps if they were starting fresh or integrating some third-party analytics packages that only offered support for C++ on Windows, it would make sense. But that isn’t the case. Target their nascent .NET trading desk analytics with something other than MPI, though, and maybe you’ve got a customer.

The Excel Services for SharePoint 2007 story is another mixed bag from a high-performance computing perspective. It’s a fabulous product from the perspective of centrally sharing and managing workbooks at the enterprise level. I can guarantee that it will play a major part in Bridgewater’s Excel-heavy infrastructure. However, from the perspective of integrating your quantitative analysts into your engineering process, it’s a miss.

Another Microsoft product, Microsoft Expression Blend (formerly Expression Interactive Designer or “Sparkle”), demonstrates a great way to directly integrate non-engineering contributors (such as illustrators and UI designers) into the Microsoft Visual Studio 2005 development process. The project artifacts they create are full-fledged members of the solution. Engineers and illustrators work in parallel, and the solution is constantly updated.

We need the analog for our analysts. They need to work in their development environment of choice, Microsoft Excel, and we need to have their work immediately accessible as compiled libraries. The UI is irrelevant; it’s the math and models we want. The UI is just the vehicle for our analysts to develop and test their methods. I don’t want my engineers shoehorning distributed computing code into an analyst’s spreadsheet. I want analysts to compile their spreadsheets and have our engineers reference them as class libraries that can be inserted into our broader high-performance computing infrastructure.

Call it Microsoft Visual Excel.

Imagine if an analyst could declare one or more worksheets as a class, highlight particular cells as class properties, function inputs and return values, with method bodies filled in by Excel worksheet functions and calls to methods or macros written in the .NET programming language of your choice. Imagine a PowerShell-like metaphor, where everything is an object. And now imagine that you can compile the whole thing into an assembly that can be directly referenced by a .NET project. That would be a better building-block for our high-performance computing applications than Excel Services, as it better addresses our engineers and our engineering process.
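As a thought experiment only (no such compiler exists), the artifact such a Visual Excel might emit could look like an ordinary class: input cells become properties, formula cells become computed values, and the whole thing is referenceable like any other library. The sheet name, cells, and formula below are all invented for illustration.

```python
class YieldSheet:
    """Sketch of what a compiled worksheet might look like as a class.
    The cell mappings and the formula are illustrative, not from any
    real workbook."""

    def __init__(self, principal=0.0, rate=0.0):
        # Input cells exposed as plain properties.
        self.principal = principal  # e.g. cell B1
        self.rate = rate            # e.g. cell B2

    @property
    def annual_interest(self):
        # Formula cell, e.g. B3 = B1 * B2, recomputed on demand just as
        # the worksheet would recalculate it.
        return self.principal * self.rate

sheet = YieldSheet(principal=1000.0, rate=0.05)
```

An engineer never touches the spreadsheet; the analyst never touches the engineering project. The compiled class is the handshake between them, which is exactly what Excel Services, for all its strengths, doesn’t provide.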