 
		ATLAS e-News
23 February 2011
Tier-0 on task
1 December 2008



Figure: Recent Tier-0 performance
As you read this sentence, cosmic events in ATLAS are being fed through the trigger and data acquisition system to the Tier-0. They may seem like a trickle compared with the flash of the collisions that ATLAS is designed to capture, but the almost 500 million events recorded in the last four months have generated 1.1 petabytes (about 1.1 million gigabytes) of raw data, not much less than ATLAS is supposed to record in that amount of time during a run with beam.
		      
“We are more or less up to the nominal rates,” Armin Nairz, ATLAS Tier-0 Operations Coordinator, confirms. The cosmic rays provide a complete practice run for the system. Data from the detectors goes through the trigger system and gets assembled online. It is written to files and recorded to CASTOR, CERN’s mass storage system, and there is a “handshake” database in place to notify the Tier-0 about the new data.
Between the arrival of the cosmic data and the point when processing can begin, “there is a delay of usually not more than an hour,” says Luc Goossens, ATLAS Tier-0 Software Development Coordinator. Once there are collisions, only part of the data (the so-called “express” and “calibration” streams) will be processed that promptly. The bulk “physics streams” will have to wait until suitable calibration and alignment constants have been calculated, a process that is expected to take about one day.
“We pick up the data and start processing them on our batch farm,” says Armin. The farm currently contains 1,500 processing cores, but it is just a subset of the common farm available to CERN users, which contains around 10,000 cores in about 1,600 machines.
After picking up the raw data from CASTOR, the Tier-0 runs the first-pass event reconstruction, producing, among many other data products, event summary data (ESD), combined n-tuples (CBNT), and analysis object data (AOD). The software also produces histograms for the offline data quality monitoring team and publishes the information to the web.
On September 10th, the first beam splash events rushed through the system, and the Tier-0 team was able to post results after a mere two hours. “We have been preparing for taking data for a couple of years already,” says Armin. The ATLAS Tier-0 team performed standalone throughput exercises to make sure that the system could handle the intended amount of data.
The Tier-0 hardware resources have been set up to cope with a raw data rate of about 300 megabytes per second, at an event rate of 200 per second. Those are the nominal rates agreed between the online and offline communities in the so-called “computing model”. There are contingencies, but the available Tier-0 bandwidth also has to be shared with many other activities besides data taking, like the reading and writing of processed products, tape archiving and, most notably, the data export to the Tier-1 centers.
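As a rough back-of-the-envelope check of the nominal figures quoted above, the sketch below derives the average event size they imply and the raw data volume for one day of uninterrupted data taking. It assumes decimal units (1 terabyte = 1 million megabytes) and a hypothetical 24 hours of continuous recording; the function and variable names are illustrative only and are not part of any ATLAS software.

```python
# Back-of-the-envelope arithmetic for the nominal Tier-0 input rates.
# Assumptions (not from the article): decimal units and 24 hours of
# continuous data taking with no downtime.

RAW_RATE_MB_PER_S = 300.0        # nominal raw data rate quoted in the article
EVENT_RATE_HZ = 200.0            # nominal event rate quoted in the article
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def average_event_size_mb(raw_rate_mb_per_s: float, event_rate_hz: float) -> float:
    """Average raw event size implied by a data rate and an event rate."""
    return raw_rate_mb_per_s / event_rate_hz


def daily_volume_tb(raw_rate_mb_per_s: float, seconds: int = SECONDS_PER_DAY) -> float:
    """Raw data volume in terabytes for `seconds` of continuous recording."""
    return raw_rate_mb_per_s * seconds / 1e6  # 1 TB = 10^6 MB


event_size = average_event_size_mb(RAW_RATE_MB_PER_S, EVENT_RATE_HZ)
per_day = daily_volume_tb(RAW_RATE_MB_PER_S)

print(f"average event size: {event_size:.1f} MB")
print(f"raw data per day:   {per_day:.2f} TB")
```

Under these assumptions the nominal rates correspond to about 1.5 megabytes per event and roughly 26 terabytes of raw data per day of continuous running.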
Apart from the bandwidth: “The bottleneck is really the  reconstruction time,” Luc adds. “The time it takes to reconstruct an event  (about ten seconds) is limiting the rate, as we have only a limited amount of  CPU resources.”
The cosmic events we have been recording can be up to ten  times the expected size of collision events, and the average reconstruction  time is about twice the time foreseen for reconstruction of an average  collision event. This leads to the counter-intuitive effect that often the  maximal event rate the Tier-0 can handle is lower for the “simple” cosmic  events than for real collisions.  
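The effect can be illustrated with a simple estimate: if reconstruction is limited by CPU, the sustainable event rate is roughly the number of cores divided by the reconstruction time per event. The sketch below is an illustration only. It assumes that all 1,500 Tier-0 cores are devoted to reconstruction with perfect efficiency, that the quoted ten seconds applies to cosmic events, and that an average collision event therefore takes about half as long; none of these assumptions is confirmed by the article.

```python
# Illustrative CPU-bound throughput estimate.
# Assumptions (not confirmed by the article): every core reconstructs events
# full time, the ~10 s figure refers to cosmic events, and collision events
# take half as long, following the "about twice" comparison in the text.

def max_event_rate_hz(cores: int, reco_seconds_per_event: float) -> float:
    """Events per second a farm can sustain when each core handles one event at a time."""
    return cores / reco_seconds_per_event


CORES = 1500                          # Tier-0 share of the batch farm
COSMIC_RECO_S = 10.0                  # assumed reconstruction time for a cosmic event
COLLISION_RECO_S = COSMIC_RECO_S / 2  # assumed reconstruction time for a collision event
NOMINAL_EVENT_RATE_HZ = 200.0         # nominal event rate from the computing model

cosmic_rate = max_event_rate_hz(CORES, COSMIC_RECO_S)
collision_rate = max_event_rate_hz(CORES, COLLISION_RECO_S)

print(f"cosmic events:    {cosmic_rate:.0f} per second")
print(f"collision events: {collision_rate:.0f} per second")
print(f"nominal rate:     {NOMINAL_EVENT_RATE_HZ:.0f} per second")
```

With these illustrative numbers, the farm would sustain about 150 cosmic events per second but about 300 collision events per second, so only the cosmic case falls below the nominal 200 per second. The real limits depend on how the cores are shared and on the actual reconstruction times.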
As we approach December, most of the subsystems will stop  taking data as hardware work begins. However, the Inner Detector had limited  time to calibrate and align prior to September 10th, so it will continue to run  as long as possible.  In any case, the  Tier-0 will keep running and support all detector commissioning activities as  needed. 
On the software end of Tier-0, Luc describes the current  activities as the “icing on the cake”. They’re working on automating the system  as much as possible. As he envisions it: “It will be like a big factory which  normally works by itself…and then there are people on shifts looking at a big  screen, hopefully with a lot of green stuff on it.” However, if a “red alert”  appears, the shifter will have to take the necessary actions to get it  resolved.
This graphical interface is one of the last missing pieces  of the ATLAS Tier-0 software puzzle, and the team already has prototypes.  “We will have it finished by the time the  first beams come back,” Luc says.
Katie McAlpine, ATLAS e-News