Documentation - CASTOR at CERN

At CERN, several instances of CASTOR serve the different user communities. The disk resources are specific to each instance, but the name server and tape infrastructure are shared. The five major functional modules are:

  • Stager - this disk pool manager allocates and reclaims space; it also controls client access and maintains the disk pool's local catalogue
  • Name Server - this implements the CASTOR name space (files and directories), including the corresponding file metadata (size, dates, checksum, ownership and ACLs, tape copy information). Command-line tools modelled on their Unix counterparts enable manipulation of the name space (e.g. nsls corresponds to ls); see the examples after this list
  • Tape Infrastructure - under certain conditions, CASTOR saves files onto tape to provide data safety and to manage data volumes larger than the available disk space. At CERN, the high-capacity tape units in use are Oracle StorageTek T10000C (5 TB) and IBM TS1140 (4 TB). Cartridges are housed in tape libraries, and access to them is fully automated. The libraries used by CASTOR in production are 4 x Oracle SL8500 and 3 x IBM TS3500. The current total tape archive capacity is ~100 PB (January 2013).

    The CASTOR Volume Manager database contains information about each tape's characteristics, capacity and status. The Name Server database contains information about the files (sometimes referred to as segments) on a tape:

    • ownership
    • permission details
    • file offset location on tape

    User commands are available to display information in both the Name Server and Volume Manager databases.
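
    As a hedged illustration, the nsls and vmgrlisttape commands can be used to query the two databases; the option names, path and volume ID shown are examples and may differ between CASTOR releases:

        # Show a file's tape copy/segment information from the Name Server database
        nsls -T /castor/cern.ch/user/j/jdoe/datafile

        # Show the characteristics and status of one tape volume from the Volume Manager database
        vmgrlisttape -V I10234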

    The mounting and dismounting of cartridges in tape drives is managed by the Volume and Drive Queue Manager (VDQM) in conjunction with library control software specific to each model of tape library.

    The cost of storage per terabyte on tape is much lower than on hard disk, and tape has the advantage of consuming no electricity when not being accessed. However, access times on tape are longer, on the order of minutes rather than seconds.

  • Client - this allows the user to upload, download, access and manage data stored in CASTOR. A file held on a disk server or a tape server can be retrieved using RFIO, ROOT, GRIDFTP or XROOTD. The client can be a native CASTOR client (command-line tools or API) or SRM; see the transfer examples after this list.
  • Storage Resource Management - this is a middleware component for managing shared storage resources on the Grid using the SRM protocol. It provides dynamic space allocation for file management and enables uniform access to heterogeneous storage elements, of which CASTOR is one. The SRM interface interacts with CASTOR on behalf of a user or of other services, such as FTS (the File Transfer Service used by the LHC community to export data); a hedged SRM transfer example is given below.
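
A few examples of the tools described above follow. First, the Unix-style name server commands; the paths are illustrative and exact options may vary between CASTOR releases:

    # List a directory in the CASTOR name space, as ls would locally
    nsls -l /castor/cern.ch/user/j/jdoe

    # Create a directory and remove a file, as mkdir and rm would
    nsmkdir /castor/cern.ch/user/j/jdoe/newdir
    nsrm /castor/cern.ch/user/j/jdoe/obsolete.dat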
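
Next, a sketch of typical stager and client operations; the instance name (castorpublic), service class and paths are placeholders:

    # Select the stager instance and service class via environment variables
    export STAGE_HOST=castorpublic
    export STAGE_SVCCLASS=default

    # Upload a local file with the RFIO copy command
    rfcp localfile.dat /castor/cern.ch/user/j/jdoe/localfile.dat

    # Ask the stager whether the file is currently on disk
    stager_qry -M /castor/cern.ch/user/j/jdoe/localfile.dat

    # Download the same file via the XROOTD protocol
    xrdcp root://castorpublic.cern.ch//castor/cern.ch/user/j/jdoe/localfile.dat .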
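
Finally, a hedged example of an SRM-mediated copy; the lcg-cp client and the srm-public.cern.ch endpoint are assumptions, and any SRM-capable client could be used instead:

    # Copy a file out of CASTOR through its SRM interface to local disk
    lcg-cp -b -D srmv2 \
        srm://srm-public.cern.ch/castor/cern.ch/user/j/jdoe/localfile.dat \
        file:/tmp/localfile.dat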
