Aeolus is managed following an enhanced community condominium model. Investors (individual researchers or groups of affiliated researchers) purchase compute nodes (or storage) from a catalogue of nodes (and storage options). The Aeolus Investment Catalogue offers compute resources that have been verified for compatibility. The Catalogue will be updated as vendor prices change, and at least annually. Once purchased, such nodes are integrated into Aeolus and administered by the Aeolus systems administrators. For a given investment of funds and of faculty and staff time, the VCEA Aeolus enhanced community condominium model for HPC gives individual investors and groups access to more computing resources and storage than they could obtain by purchasing and managing a comparable standalone system themselves.
To support the unique needs of research computing, VCEA has committed to supporting the creation of both an annual budget and a service center to help supplement infrastructure costs.
Computer components have a finite useful life. For purposes of Aeolus management, compute nodes and disk drives are assumed to have useful lifetimes of 5 years. Some individual components will fail sooner and some will last longer. These limited lifetimes have implications for Aeolus management, investment timing, and user data management. After five years, compute nodes will be deprecated; they may be used for parts or for low-priority computing, and will eventually be retired. The investor is not responsible for node disposal after the useful life of that node has expired; Aeolus management assumes responsibility for such e-waste disposal. Because components have predictably limited useful lifetimes, long-term projects must plan in their proposal budgets for intermittent re-investment.
To recognize and honor investments made in Aeolus before July 1, 2016, the following describes the policy for grandfathering equipment and access to the cluster. New investment practices allow for an expected useful lifetime of 5 years for compute nodes and storage.
Currently, Aeolus employs a round-robin scheduler via Torque and Maui.
With the implementation of the new virtual/redundant login and management servers, Aeolus will migrate scheduling to the Simple Linux Utility for Resource Management (SLURM). This will be a major step forward and may enable sharing compute resources with other peer clusters, leading to potential grid computing.
The following defines the policy for investing in Aeolus by purchasing compute nodes.
Standard compute nodes can be purchased from our Aeolus Investment Catalogue. Non-standard compute nodes can be purchased for special computing needs, based on the Aeolus Catalogue, in coordination with and after approval by the Aeolus systems administrators.
Investment in Aeolus through the purchase of a compute node guarantees access to Aeolus HPC for the duration of the 5-year hardware lifetime. Additional investments over time can renew the investor’s continuous access to the cluster.
With an investment in a compute node, 10 GB of storage per core will be made available to that project, user, or group for project storage, for the duration of the 5-year hardware lifetime (see Storage below).
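The per-core storage rule above is simple arithmetic. The following is a minimal sketch of it, not an official Aeolus tool; the function name `project_storage_gb` and the 24-core node in the usage line are hypothetical and chosen only for illustration.

```python
# Project storage granted with a compute-node investment.
# The 10 GB-per-core figure comes from the policy text; everything else
# (function name, example core counts) is an illustrative assumption.
GB_PER_CORE = 10


def project_storage_gb(cores_per_node: int, nodes: int = 1) -> int:
    """Return the project storage, in GB, for `nodes` nodes of `cores_per_node` cores each."""
    if cores_per_node <= 0 or nodes <= 0:
        raise ValueError("cores_per_node and nodes must be positive")
    return GB_PER_CORE * cores_per_node * nodes


if __name__ == "__main__":
    # A hypothetical 24-core node yields 240 GB of project storage.
    print(project_storage_gb(24))
```

The actual core counts depend on the nodes listed in the Aeolus Investment Catalogue.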
The storage system design for Aeolus envisions multiple tiers; the following defines the tiers and corresponding rules of storage on the cluster.
Tier 0, Fast Scratch – high speed, volatile (short lived) storage
Tier 1, Compute Storage – storage designed for large projects and compute input/output
Tier 2, General Storage – user home directories, modules, and logs
Tier 3, Archival Storage – long-term, slow archival storage
Tier 4, Backup Storage – offsite, tape, and disk backups
All users are provided with a 100 GB home directory (see Tier 2) thanks to the generous contributions of the Voiland College of Engineering and Architecture as well as other initial investors.
All users may use high-performance fast scratch file space to store data temporarily, for up to two weeks (see Tier 0).
Additional storage can be purchased in increments of 100 GB. Beyond the 5-year lifetime for such storage, there will be a grace period of 6 months for migration of user data to a resource provided by the user. After that, data will be irrevocably purged.
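To make the storage timeline concrete, here is a hedged sketch that computes the end of the 5-year lifetime and the end of the 6-month grace period. It assumes the 5-year clock starts on the purchase date, which the policy text does not state explicitly; the names `add_months` and `storage_milestones` are hypothetical.

```python
import calendar
from datetime import date
from typing import Tuple

# Figures taken from the policy text.
STORAGE_LIFETIME_YEARS = 5
GRACE_PERIOD_MONTHS = 6


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the target month's length."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def storage_milestones(purchased: date) -> Tuple[date, date]:
    """Return (end_of_lifetime, end_of_grace_period) for purchased storage.

    Assumption: the lifetime is counted from the purchase date.
    """
    end_of_lifetime = add_months(purchased, 12 * STORAGE_LIFETIME_YEARS)
    end_of_grace = add_months(end_of_lifetime, GRACE_PERIOD_MONTHS)
    return end_of_lifetime, end_of_grace


if __name__ == "__main__":
    lifetime_end, grace_end = storage_milestones(date(2016, 7, 1))
    print(lifetime_end.isoformat(), grace_end.isoformat())
```

Under that assumption, storage purchased on 2016-07-01 would reach the end of its lifetime on 2021-07-01, and data not migrated by 2022-01-01 would be subject to purging.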
By using the resources associated with the Aeolus Cluster, a user acknowledges and agrees to comply with the user policies stated in the Aeolus User Policy.
We distinguish between three types of Aeolus users: Investor, User, and Trial User. The responsibilities of each user type with respect to Aeolus use are outlined below.
This section defines the responsibilities of all users on Aeolus.
All three types of Aeolus users shall:
Investor: An Investor may be either an active user, making actual direct interactive use of Aeolus to run jobs, or a passive user, investing in Aeolus in order to provide Aeolus access for their laboratory’s students and staff.
User: With the approval of an Investor, an individual can be provided an account for use of Aeolus. With investment or sponsorship by an Investor, a trial account will be switched to a regular user account, for the lifetime of the enabling investment.
Trial User: Upon approval by the HPC Committee, a new user may be provided an account for a 6-month trial period, to “test out” using the Aeolus HPC in their work and/or to generate data for a proposal.
The VCEA HPC committee will review and revise these policies as necessary, and at least annually. This is the July 2016 version.