High-Energy Spam Filter

By Sean Carroll | November 16, 2007 12:33 pm

Monica Dunford, at the US/LHC Blog, has a great metaphor: thinking of the “trigger” in a particle detector as a spam filter. The trigger, you will remember, is the combination of hardware and software that works to separate potentially interesting events from boring old background. If my rough numbers are anywhere near right (experts should chime in if not), the LHC will create about a billion collisions per second, and only about 100 of them will actually get stored on hard disk. Doesn’t sound like much, but we’re talking about a megabyte of data per event, so you’re writing a gigabyte to disk every ten seconds. Just not practical to keep every piece of data, so the trigger makes some snap judgments about what events are fun (like the simulated supersymmetric ATLAS event below) and which are just the usual workings of the Standard Model.
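
For anyone who wants to check the arithmetic, here is a minimal sketch in Python using only the rough numbers quoted above. The constant and function names are made up for illustration, decimal units are assumed (1 GB = 1000 MB), and the last function rests on an extra assumption that is not in the numbers above: that every collision, kept or not, would take about a megabyte to store.

```python
# Rough numbers quoted in the post; none of these are official figures.
COLLISIONS_PER_SECOND = 1_000_000_000  # about a billion collisions per second
STORED_PER_SECOND = 100                # events written to disk each second
MEGABYTES_PER_EVENT = 1                # about a megabyte of data per event


def stored_rate_mb_per_s():
    """Megabytes written to disk per second after the trigger."""
    return STORED_PER_SECOND * MEGABYTES_PER_EVENT


def seconds_per_gigabyte():
    """Seconds needed to write one gigabyte (taken as 1000 MB)."""
    return 1000 / stored_rate_mb_per_s()


def keep_fraction():
    """Fraction of collisions that survive the trigger."""
    return STORED_PER_SECOND / COLLISIONS_PER_SECOND


def unfiltered_rate_tb_per_s():
    """Terabytes per second if every collision were stored.

    Assumption: each collision would also take about a megabyte.
    """
    return COLLISIONS_PER_SECOND * MEGABYTES_PER_EVENT / 1_000_000


if __name__ == "__main__":
    print(stored_rate_mb_per_s(), "MB/s written to disk")
    print(seconds_per_gigabyte(), "seconds per gigabyte")
    print(keep_fraction(), "of collisions kept")
    print(unfiltered_rate_tb_per_s(), "TB/s with no trigger")
```

With these inputs the trigger keeps one collision in ten million, and the stored stream comes to 100 MB per second, which is the gigabyte every ten seconds mentioned above.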


The spam-filter analogy is pretty apt: you’re being deluged with data, and most of it is irrelevant, and you can’t afford to look at all of it individually. So you have to come up with some automated system that decides what to keep and what to toss out. And of course you have exactly the same concern that you would have with any spam filter: the worry that you’re tossing out interesting stuff! You don’t want any job offers to get lost amidst the ads for C!al1$.
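
To make the worry concrete, here is a toy sketch in Python of a filter that keeps or tosses events based on a single cut. It is not how the ATLAS trigger works: the `energy` field, the `interesting` label, the threshold value, and the function names are all hypothetical, chosen only to show how a simple automated rule can throw away something you would have wanted.

```python
def toy_trigger(event, energy_threshold=100.0):
    """Keep an event only if its (made-up) energy exceeds the threshold."""
    return event["energy"] > energy_threshold


def run_filter(events, energy_threshold=100.0):
    """Return (kept, lost): events that pass, and interesting ones that fail."""
    kept = [e for e in events if toy_trigger(e, energy_threshold)]
    lost = [
        e for e in events
        if e["interesting"] and not toy_trigger(e, energy_threshold)
    ]
    return kept, lost


# Hand-written toy events; the "interesting" flag is something a real
# experiment would not know in advance, which is the whole problem.
EVENTS = [
    {"energy": 20.0, "interesting": False},   # boring, tossed
    {"energy": 150.0, "interesting": True},   # interesting, kept
    {"energy": 90.0, "interesting": True},    # interesting, but tossed
    {"energy": 300.0, "interesting": False},  # boring, but kept
]

if __name__ == "__main__":
    kept, lost = run_filter(EVENTS)
    print(len(kept), "events kept;", len(lost), "interesting events lost")
```

The third event is the lost job offer: interesting, but on the wrong side of the cut. Lowering the threshold rescues it at the price of keeping more junk, which is the same trade-off any spam filter faces.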

A great deal of work, therefore, goes into deciding what the trigger should keep and what it should toss out. Perhaps that helps explain the graph in a previous post of Monica’s:


That would be “meetings as a function of time,” in case it wasn’t obvious. Over 4500 scheduled meetings in 2007, and that’s just for the ATLAS collaboration. The other general-purpose LHC experiment, CMS, has a similar graph, but they only had about 1000 meetings. Whether it is a tribute to their greater efficiency or a mismatch in accounting procedures remains an open question.


