Re-thinking reprocessing

12 May 2010

Data flow around ATLAS


With data-taking now in full flow, thoughts are turning towards the summer conferences. Results from two current reprocessing campaigns, and a change in approach at Tier0, should together allow ATLAS to present the very latest high-energy collision data to the physics community.

The April 2010 reprocessing campaign, the results of which will go to the Physics at the LHC conference in June, is just wrapping up. Hot on its heels, production jobs for the May 2010 campaign are due to launch imminently, targeting the ICHEP meeting at the end of July.

These campaigns, along with the one that looked at the first data in December and its February successor, which took a fresh look at that same data, have all gone pretty smoothly so far, according to ATLAS Reprocessing Coordinator Adam Gibson:

“We were taking cosmics for so long that we got a lot of chance to do dry runs where the stakes were kind of low, and had time to do things slowly and fix them when they broke,” he considers. “Just like for the detector, this was good for the software community. Based on that experience, we learned a lot.”

Each campaign has looked at a cumulative dataset, but since the volume of data is still modest and the Grid so vast, the bulk of the jobs are completed within a matter of days. The May campaign is expected to take around three weeks altogether, factoring in the ‘long tail’ – a small fraction of tricky jobs that take a little longer to complete – and data quality assessments.

But, as Adam points out, “You don’t really want to go to ICHEP in July and present the data you took back in May.” Until recently, after the prompt reconstruction that happens at the Tier0 within a few hours of data being taken, data wouldn’t be revisited until weeks or months later, when a reprocessing campaign was launched to apply new calibrations, alignments, bad channel lists, and software improvements to the whole dataset.
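As a very rough illustration of that two-pass model, the sketch below shows prompt reconstruction using whatever conditions exist at data-taking time, and a later reprocessing pass re-running the full dataset with updated conditions and newer software. Every name in it (Conditions, reconstruct, the tags and release labels) is a hypothetical stand-in, not the actual ATLAS production code.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Conditions:
        # A snapshot of the inputs that define how to reconstruct an event:
        # calibration constants, detector alignment, and channels to mask out.
        calibration_tag: str
        alignment_tag: str
        bad_channels: frozenset

    def reconstruct(raw_event, conditions, release):
        # Stand-in for event reconstruction: the result depends on both the
        # software release and the conditions it was given.
        return {"event": raw_event, "release": release,
                "calibration": conditions.calibration_tag,
                "alignment": conditions.alignment_tag}

    raw_run = list(range(5))  # a toy "run" of five events

    # Prompt reconstruction: done at Tier0 within hours of data-taking, with
    # whatever conditions and software release are current at that moment.
    prompt = [reconstruct(e, Conditions("calib-v1", "align-v1", frozenset({42})),
                          "release-A") for e in raw_run]

    # Reprocessing: weeks or months later, the whole accumulated dataset is
    # re-run with improved conditions and newer software, so that every event
    # is treated consistently.
    reprocessed = [reconstruct(e, Conditions("calib-v2", "align-v2", frozenset({42, 108})),
                               "release-B") for e in raw_run]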

Because the software at the Tier0 was being updated so rapidly – sometimes on a daily basis back in February – the promptly reconstructed data was difficult to do physics with; if the answer to the question ‘what constitutes a track?’ is dynamic, comparing like with like over a period of days or weeks becomes impossible.

“Right now we’re in the middle of implementing a more stable model,” says Adam. Effectively freezing the physics content of the software running at Tier0 “should make it easier to do physics without having to wait months for a reprocessing.”

The prompt calibration loop, reported in e-News in March this year, factors a delay into the initial bulk reconstruction at Tier0, so that calibration constants and bad channels can be updated before the data is processed, ensuring it is as high quality as possible, first time around.
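Assuming the loop simply means “hold the bulk reconstruction until the express stream has been used to refresh the conditions”, a toy version of the idea might look like the sketch below; again, every function and field name here is illustrative rather than real ATLAS code.

    def derive_updated_conditions(express_results, initial):
        # Stand-in for the calibration work done during the built-in delay.
        # A real loop would analyse express_results; this stub just fakes an
        # updated calibration tag and one extra (made-up) bad channel.
        return {"calibration": initial["calibration"] + "-updated",
                "bad_channels": initial["bad_channels"] | {1042}}

    def reconstruct(events, conditions):
        # Stand-in reconstruction: just records which conditions each event saw.
        return [{"event": e, "conditions": conditions} for e in events]

    def process_run(express_events, bulk_events, initial_conditions):
        # 1. A small express stream is reconstructed straight away.
        express_results = reconstruct(express_events, initial_conditions)

        # 2. During the delay window, calibration constants and the
        #    bad-channel list are refreshed before the bulk is touched.
        updated = derive_updated_conditions(express_results, initial_conditions)

        # 3. Only then is the bulk of the run reconstructed, so the first full
        #    pass over the data already uses the improved conditions.
        return reconstruct(bulk_events, updated)

    bulk = process_run(express_events=[1, 2], bulk_events=[3, 4, 5],
                       initial_conditions={"calibration": "calib-v1",
                                           "bad_channels": {42}})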

“The idea is, you’ve already had one update to the calibration, so this data’s pretty good. Maybe you’re happy to take it to a conference, for example,” says Adam. “But you’d still like to come back after a few months and maybe change it more fundamentally. Perhaps once you have hundreds of millions of tracks you’d make some super-good alignments, or if you have smarter software you’d want to come back and apply it later.”

Freezing the Tier0 shifts the burden of software validation over to the reprocessing stage, but the time available for these big campaigns is scarce. Ideally, they’re done carefully over months, but for the collision data, says Adam, “it’s been more like we’d like that reprocessed yesterday!”

Although the Analysis Model for the First Year task force initially recommended a full reprocessing two to three times in the 2009-10 period, there have already been four campaigns in the last six months. With the new Tier0 freeze, the plan is to hold off on the next campaign until somewhere around September.

“The software developers aren’t used to having to wait months before their software changes can become effective, though,” cautions Adam, and there may come a point where the number of people with important changes to implement becomes too great to delay any longer.

“In the meantime, better integration of the physics community, and better integration of the data and Monte Carlo software, are hot topics now that real collision data exists. Eventually, a formal sign-off procedure will give all groups, including physics groups, a chance to communicate whether new software releases meet their needs or not, and the Monte Carlo and data groups are gradually converging,” says Adam.

“We’ll see how much data it takes before we can fully integrate, but I think that’s the direction we’re going in. It’s a big organisation and planning challenge, so we’re still ironing out the kinks,” he considers. “If you count all the software developers, there must be hundreds of people who have some sort of role in this, and then there are lots of dedicated groups that put everything together. This touches some significant fraction of ATLAS; it’s really a huge team.”

 

Ceri Perkins

ATLAS e-News