If you read my previous post on using deep learning as a parton shower, you may have seen in the introduction that my original inspiration for creating that model was wavelet analysis of LHC collisions. Just before Christmas, Craig Wiglesworth and I put out another paper describing how to use wavelets to remove pile-up from LHC collisions at high luminosity.
The vast majority of proton-proton collisions at the LHC are quite boring[1] and do not produce any high energy electroweak particles – like the Higgs boson – or even any high energy jets of hadrons. Most collisions produce perhaps 10–100 relatively low energy particles spread around the detector. In order to increase the probability of observing something interesting, the intensity of the proton beams is high enough that many proton collisions occur each time the beams cross inside one of the detectors. This is great because it means you have a decent chance of something worth recording happening, but it also means that all the other collisions you’re not interested in come along for the ride and get superimposed on top of the single collision you are interested in.
These many overlaid events are colloquially called “pile-up.” Pile-up will be an increasingly serious problem for the LHC because future runs of data-taking are expected to have up to 200 collisions per bunch crossing, and even the run that just ended had around 60. Each pile-up collision may only produce a handful of particles, but when there are 200 of them it adds up to quite an inconvenience and can hide many of the signals one might hope to see. An example of where pile-up can really spoil the physics signature is vector boson fusion (VBF) production of a Higgs, which is one of the marquee processes at the LHC. In that process the Higgs is produced by the fusion of a pair of electroweak bosons, rather than a pair of gluons, and the lack of QCD interactions shows up as a region in the detector that is empty(ish) of particles. Of course, if you have another 200 collisions on top of that it doesn’t look empty anymore, and while you can still see the Higgs, the event looks like boring old gluon-fusion production, the original discovery channel back in 2012. The kicker is that weak interactions are weak (duh) and don’t happen very often, so to get a decent number of these VBF events you need a high beam luminosity and can’t avoid the pile-up problem.
There are quite a few pile-up correction methods already in use at the LHC, but most of them are designed to correct localised objects like jets, not the large empty regions that characterise things like VBF. Jet substructure[2] also tends to be quite robust against pile-up, but again you need a hard(ish) jet in the first place. What we (maybe) want for VBF is a way of analysing the structure of the event – how stuff is connected to other stuff – that doesn’t require jets. This might also prove beneficial for other quantities, like missing transverse energy, that are not based on jets.
This is where wavelets come in. Some of you will know that I’ve been nerding around with wavelets for quite a while. This started out as an approach to identifying diffractive processes, which also have similar regions depleted of particles, but occur much more often. Wavelets are functions that are like little pieces of a wave, but in their discrete form they have specific scaling relations. Wavelet decomposition is a bit like a Fourier decomposition, except the basis functions are finite in size and each wavelet term corresponds not only to some physical scale (or frequency, if you prefer the Fourier terminology), but also to a location within whatever signal it is you are decomposing.
If you decompose some structure – which might be the radiation pattern in a collision, or the cosmic microwave background, or a photograph you want to edit – you end up with wavelet coefficients that tell you how much large-scale and small-scale stuff is in that structure and where that stuff is. You can then, for example, treat large-scale and small-scale structures differently. Contrast this with most existing pile-up techniques, which tend to target only a single resolution. Wavelets are multi-resolution, which makes them great for analysing the sort of self-similar structures we get in physics. The self-similarity and scaling behaviour is very close to the ideas behind the deep learning parton shower, and just like in that case, we can apply the wavelet decomposition by first converting the arrangement of particles in an event into a pixel array. Once we’ve wavelet analysed the array, we can convert back to particles by taking the after/before ratio of each pixel and using that as either a discriminator or a multiplier on each particle in the pixel.
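To make the pixelisation step concrete, here is a minimal numpy sketch. The function names, grid size and η range are my own illustrative choices, not the paper’s: particles are binned into an η–φ image weighted by transverse momentum, and after some wavelet-based filtering of that image the per-pixel after/before ratio is looked up for each particle.

```python
import numpy as np

def pixelise(eta, phi, pt, n_eta=64, n_phi=64, eta_max=5.0):
    """Bin particle pT into an (eta, phi) pixel array."""
    image, _, _ = np.histogram2d(
        eta, phi, bins=(n_eta, n_phi),
        range=[[-eta_max, eta_max], [-np.pi, np.pi]], weights=pt)
    return image

def particle_weights(eta, phi, before, after, n_eta=64, n_phi=64, eta_max=5.0):
    """Look up the after/before pixel ratio for each particle; use it as a
    multiplier on the particle's momentum, or cut on it as a discriminator."""
    # Map each particle back to its pixel indices
    i = np.clip(((eta + eta_max) / (2 * eta_max) * n_eta).astype(int), 0, n_eta - 1)
    j = np.clip(((phi + np.pi) / (2 * np.pi) * n_phi).astype(int), 0, n_phi - 1)
    # Ratio of filtered to unfiltered pixel energy; empty pixels get zero
    ratio = np.divide(after, before, out=np.zeros_like(after), where=before > 0)
    return ratio[i, j]
```

Any wavelet filter that returns a modified image of the same shape can slot in between these two calls.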
A simple pile-up killing approach using wavelets is to zero any wavelet coefficient below some threshold and then reconstruct the event with the new set of coefficients.
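As a toy illustration of that recipe, here is a hand-rolled one-dimensional orthonormal Haar transform in numpy. The real analysis works on the 2D event image, and all names here are my own; the point is just the decompose, zero-below-threshold, reconstruct pattern:

```python
import numpy as np

def haar_decompose(x):
    """Multi-level orthonormal Haar transform of a length-2^n array.
    Returns [approximation, coarsest detail, ..., finest detail]."""
    coeffs = []
    a = np.asarray(x, dtype=float)
    while len(a) > 1:
        even, odd = a[0::2], a[1::2]
        coeffs.append((even - odd) / np.sqrt(2))  # detail at this scale
        a = (even + odd) / np.sqrt(2)             # smoothed approximation
    coeffs.append(a)
    return coeffs[::-1]

def haar_reconstruct(coeffs):
    """Invert haar_decompose exactly."""
    a = coeffs[0]
    for d in coeffs[1:]:
        out = np.empty(2 * len(a))
        out[0::2] = (a + d) / np.sqrt(2)
        out[1::2] = (a - d) / np.sqrt(2)
        a = out
    return a

def denoise(signal, threshold):
    """Zero every detail coefficient below threshold, then reconstruct."""
    coeffs = haar_decompose(signal)
    kept = [coeffs[0]] + [np.where(np.abs(d) >= threshold, d, 0.0)
                          for d in coeffs[1:]]
    return haar_reconstruct(kept)
```

A sharp spike sitting on a flat background survives the thresholding in a few large coefficients, while low-level fluctuations spread across many small coefficients get wiped out – exactly the signal/pile-up separation described below.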
You can see this in Figure 1, which shows some of the wavelet coefficients from the decomposition of a jet event that had 100 pile-up events overlaid. Since this is a simulation, we can do the decomposition for only the pile-up particles, or only the signal particles. Pile-up is basically white noise – it has a flat frequency spectrum. Pile-up jets arise when fluctuations in the many independent collisions happen to lie on top of each other in the detector, but since the pile-up particles aren’t generally correlated with each other, these fluctuations are random. On the other hand, the particles in true signal jets are very correlated with the rest of the signal event – which is not just the jet they are in. These correlations mean the signal can be encoded in a small number of wavelet coefficients, while the noisy nature of pile-up means it fills all of the coefficients. It is a general feature of noise that it cannot be compressed[3] and that signals can be. You can see in Figure 1 that the green noise is present everywhere at all scales, but at a low level, while the signal is in a handful of spikes. If we simply zero out any coefficient below the orange threshold we will kill off the noise, but retain all of the signal spikes.
It turns out that the threshold for the pile-up noise is in the ballpark of 100 MeV times the square root of the number of pile-up collisions. This makes sense because the wavelet coefficients encode fluctuations away from the mean activity in the event, and Poisson statistics mean these fluctuations grow as the square root of the number of collisions. 100 MeV doesn’t seem an insane multiplying factor, since it is roughly the scale of soft-QCD-ish phenomena.
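That rule of thumb is trivial to write down; assuming my reading of the scaling is right:

```python
import math

def pileup_threshold_mev(n_pileup, scale_mev=100.0):
    """Noise threshold for the wavelet coefficients: the ~100 MeV
    soft-QCD scale times the Poisson sqrt(N) growth of pile-up
    fluctuations."""
    return scale_mev * math.sqrt(n_pileup)
```

So for 100 overlaid collisions the threshold sits around 1 GeV, and around 1.4 GeV for the 200-collision scenario.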
Amazingly, this noise thresholding approach appears to be quite effective. We put a simple cut on how much signal content each particle must have in order to be selected. This is the r-cut label on the x-axis of Figure 2, and as you increase r-cut you remove more particles from the event, since you are being more stringent in your requirements. Figure 2 shows the sum of particle energies for known signal and pile-up particles (again, this is simulation, so we know for sure which particles come from pile-up), and you can see that the signal contribution to the sum is pretty robust up to very high r-cut values, while the pile-up drops quite quickly. This is only for particles above 4 GeV in transverse momentum, and the performance is obviously worse for soft particles. Still, 4 GeV is quite soft for the LHC, and in the paper we go into a few more details, separate the behaviour at low and high momentum, and show that it may be possible to use charged tracking information to push the technique to lower particle energies.
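The r-cut selection itself is just a threshold on each particle’s wavelet signal ratio. A sketch, using the truth labels that only a simulation provides (the names and the 4 GeV default are my own, not the paper’s):

```python
import numpy as np

def energy_sums(pt, signal_ratio, is_pileup, r_cut, pt_min=4.0):
    """Summed pT of selected signal and pile-up particles, the two
    quantities plotted against r_cut in Figure 2."""
    # Keep particles above the pT floor whose signal content passes r_cut
    sel = (pt > pt_min) & (signal_ratio >= r_cut)
    return pt[sel & ~is_pileup].sum(), pt[sel & is_pileup].sum()
```

Scanning r_cut from 0 to 1 and plotting the two sums reproduces the shape of the Figure 2 comparison: the signal sum stays flat while the pile-up sum falls away.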
Apart from the fun of playing around with a new technique, what I find interesting here is the extent to which jets exist only as a convenience for the analysing physicist rather than a necessary part of nature. Jets are of course a reflection of our models of particle production, but once one drops the requirement to use them, other approaches to understanding the structure of collisions become available and may reveal other aspects of QCD.
[1] Unless you happen to be one of the people who gets a kick out of understanding these soft events and the complicated dynamics that go on inside them.
[2] Jet substructure means looking at small jets inside bigger jets. Pile-up jets are produced differently to signal jets, so their internal structure looks more fuzzy.
[3] Conversely, compressed signals start looking more and more like noise as the compression algorithm becomes more and more efficient. This has been mooted as one solution to the Fermi paradox: alien civilisations may be using advanced compression that makes their communications indistinguishable from the cosmic background.