Getting a Grid Certificate for ATLAS Use To get a new certificate or renew an existing one, you can use the CERN CA to request the grid certificate: https://ca....
Grid Certificate Distribution at AGLT2 The certificates in /etc/grid-security/certificates are used by the OSG authentication stack. It is a regularly updated, st...
Atlas Great Lakes Tier2 Web How to contact us * For problems, contact us through our signatures here: * Main.WenjingWu AGLT2 Manager and University of M...
General steps to follow when migrating data from one OST to another 1. Set the source OST to read-only status from the MDS server, if you do not want the files to be mi...
DKMS on Lustre servers spl (required by zfs), zfs and lustre-zfs all use DKMS on all OSS to build their kernel modules automatically once the system has a ne...
This describes deploying a test Lustre file system, assuming all software repositories are installed. I deploy a test file system on 2 nodes with Lustre 2.10.4, then I will t...
Update Lustre on a testbed from 2.10.4 (SL7.6, zfs 0.7.9) to 2.12.3 (SL7.7, zfs 0.7.13) We are trying to upgrade the Lustre server from 2.10.4 (SL7.6, zfs 0.7.9) to 2...
How to update OSG and condor ce on gatekeepers The following steps should first be tested on gate03; if they work, repeat them on gate01/02. Please note: the gate ke...
Fill the umatlas repository Download the rpms Download the relevant rpms into the umatlas repo from condor repo http://research.cs.wisc.edu/htcondor/yum/stable ...
* InstallUpgradeOSG Modified April 5, 2011, for OSG 1.2.19 install, B.Ball * UpdateOSGOnGatekeeper How to update the OSG and condor ce on gatekeepers ...
Setting up SSH Keys for AGLT2 SSH is able to use a variety of methods for authenticating users. Each method has security strengths and weaknesses. The normal user...
Some info on Oracle setup at AGLT2 * Oracle Installation on linux for the ATLAS Muon Calibration/Alignment centers. * Oracle MuonDB updated (new) schema Feb...
Replacing disk in zpool (potentially with larger disks) ZFS documentation Replacing disk in zpool (potentially with larger disks) Note: use "parted" "mklabel ...
Manuals Cobbler manual: http://www.cobblerd.org/manuals/ For information on the Cheetah template language used in kickstart templates: http://www.cheetahtemplate....
Lustre Basics The Lustre file system is made of three types of servers: the management server (MGS), metadata servers (MDS), and object storage servers (OSS). Ea...
Overview of MSU's Tier3 HTCondor Setup Intro Video for Admins Types of Machines on HTCondor HTCondor consists of three types of machines: submit nodes, worker n...
Configuration of the UM CERN Computing Cluster in BAT 188 In November 2014, the UM CERN Computing Cluster was upgraded to SLC6. Some old hardware was retired, new...
How to update Visio 1 Log into senna. 1 Open up a terminal and run the command "rdesktop -g 1280x1000 hepwin.pa.msu.edu" 1 Log into this machine with the...
How to update the tier3 rack spreadsheet The spreadsheet that holds all of the rack information is located here. In order to edit this, you will need a google acc...
Workflows for modifying HTCondor configuration When modifying condor, any steps taken fall into one of two broad phases: the testing phase and the impleme...
This document describes how to install amanda on CentOS 7, and also connect it to a new tape library, the EMC ML3. About EMC ML3 It has 2 drivers. It has 32 usabl...
Installing new drives in a login node 1 Check that no jobs are running on the login/submit node and that no users are currently logged on (you can check who is...
Setting up a new login/submit node Hardware 1 Pick a machine that will host the new login node if one has not already been picked. (Discuss with Philippe) ...
Starting HTCondor on a login/submit node To start condor on a login/submit node, do the following: 1 As a super user/root, use the following command: $ service...
Setting up the Bypass Queue The users requested a queue that would bypass the timed queues, i.e., a queue with no limits on it. The agreed upon way to denote such...
Setting up the timed queues The users requested several timed queues that would hold a job after it had exceeded a certain amount of runtime. These queues each h...
This document describes how a user with a grid certificate can access the files stored in the AGLT2 dCache system. All files in dCache need to be copied to a local fi...
Log in to any of the interactive machines (unt3int01 05) and run the following commands: #localSetupATLAS or run #source /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/u...
IDRAC web interface The IDRAC web interface provides the virtual console and the log files of the system. How to access the idrac web interface URL: https://idrac ...
This section is already implemented for new users when the account is set up. Protecting SSH Keys (or X509 Certificates) on AFS This section applies to users with...
Raritan Dominion MSU MSU has a Dominion KX132. This is a 32 port model. In 2008, the Dominion KX series was replaced in Raritan's line with the Dominion KX ...
This document helps UM Tier3 users diagnose their condor job problems. Submitting Machines Tier 3 users can submit their condor jobs from the following ma...
IO with EOS command lines Users can access (list and read) files in CERN EOS from non-lxplus nodes without being authenticated. In order to get full permission (wr...
AFS Tape Backups with Amanda Amanda Commands For operations with amanda, you should be the amanda user on bambi: "su amanda". The exception is "amrecover". Her...
The instructions use the c6 1 24 1 (Dell C6420) as an example Switch ports Available ports Look through all the switches for available ports. The c6420 nodes need 2 ...
UMATLAS yum repository NOTE: As of January, 2018, sysprov02, an SL7 VM, has replaced sysprov01, and sysprov01 has been shut down. All refs to sysprov01 below have...
Installing and Using X2Go We are dropping the Remote Desktop machine aglt2rd, and replacing it with a Linux machine set, starting with bridge um at the UM site, a...
Installing and Using X2Go for UM T3 Users This page describes how to get the X2Go client software, install it on Windows (explicitly, it will likely also go on a ...
Local AGLT2 Monitors There are many monitors we've implemented. These include both AGLT2 and general USATLAS pages. Summaries * AGL Compute Summary page of Ph...
yum-cron Configuration in SL7 Unmodified yum-cron ALWAYS sends emails upon completion. This is an overwhelming flood given the number of systems we have. We th...
* RebuildComputeNode Rebuilding a ROCKS compute node * RespondToDownNode What to do with a down node * ControlledShutdown How to bring the cluster down nice...
Upgrading Postgresql from 9.5 to 10.5 We want to go to the most recent Postgresql for use by dCache, at least on head01.aglt2.org. Currently Postgresql is version...
Postgresql on ZFS AGLT2 has been running Postgresql on top of ZFS on our head01.aglt2.org (dCache headnode) for more than 1 year. Recently we came across an inter...
Lustre 2.10 with ZFS 0.7.1 from standard repo This page documents building the Lustre 2.10 RPMs on CentOS 7.3 using the default yum install of ZFS 0.7.1. The ste...
Install or Upgrade OSG at AGLT2 The main difference between these instructions and the usual documentation is that we use worker node and wlcg client installation...
Installing Google Chrome on SL7 for use with VMWare * Create the google-chrome repo on umt3int05 [google-chrome] name=google-chrome baseurl=http://dl.google.co...
HS06 Measurements Performed at AGLT2 We have made a variety of measurements at AGLT2 during September of 2009 in preparation for the upcoming purchase cycle. We p...
Issues Fixed in CFEngine for SL7.3 Upgrade Known Issues Need to limit yum output on overnight updates so that fewer emails are sent. The update_dell_firmwa...
Notes on setting up and configuring Lustre version 2.7 Index of Sections Source rpms We have chosen to use the kernel distributed with the rpms from the Lustre ...
HS06 Measurements Performed at the Dell Innovations Lab in August/September, 2017 32 bit Results, Summary Machine/Model ChipSet Speed BIOS Settings RA...
Transition from CFEngine v2 to v3, and Build dCache Pool Servers Introduction As documented elsewhere in this Wiki, cfengine2 is currently (Oct 2012) in use to c...
Building Lustre RPMs for a new kernel These are very old (version 1.8) directions When we move to a new kernel on a machine where lustre must also be mounted, ne...
Directions on Draining then Removing a Pool Set pool readonly. Can start drain right away, but will likely miss a few files ssh to admin domain \c PoolManager \c...
Resizing LVM Partitions Some CERN systems were built with little space in /, with the bulk of the space in /home. However, this means HTCondor, which wants at lea...
Numpy and Scipy at AGLT2 The numpy and scipy software packages are in common use at AGLT2, but the installed versions are somewhat old, having to do with the dea...
AGLT2 Web Preferences The following settings are web preferences of the AGLT2 web. These preferences overwrite the site level preferences in . and , and c...
How to empty all OSTs on an OSS, then re-create the underlying Lustre file systems Motivation The underlying striping for a Lustre OST, as seen in the mail list, ...
Extending LVM Disks on VMware VMs We sometimes have partitions fill during operations, and when those partitions are on VMs and use LVM, we can easily extend them...
How to Add New Storage to dCache We will look at the example of umfs16, where 12 new pools were added. Following the xfs file system creation, all disks were mou...
Installing a Mainline Kernel on Scientific Linux 6.4 or CentOS 7.2 64-bit Installing a mainline kernel is as simple as putting in place the correct elrep...
Installation and Configuration of Dell MD3460 Storage Basic Hardware This page refers specifically to hardware purchased in August 2016 using RBD 2016 funds. A s...
Manual Replication of Hot Files in dCache Particularly for the Health Check, we need multiple copies of the source file All work is performed in either a browser ...
Upgrading dCache at AGLT2 from 2.10.55-1 to 2.13.23-1 We are upgrading to the next golden release of dCache on February 23, 2016. We have set up CFEngine to have t...
Upgrading Postgresql on CentOS/RHEL/SL with Hot standby Systems This Wiki topic covers upgrading our existing PostgreSQL version 9.3.11 on Scientific Linux 6.7 64...
Video Conferencing Help Asking for help or suggestions Email: aglt2 umich@umich.edu 348 West Hall Howto Guides Set outputs for each screen On the "HDMI ...
Recovering from a Lost Pool When we lose a pool we need to do a number of things to recover. Once we determine we have really lost the pool we will need to find t...
Michigan/AGLT2 SuperComputing 2015 Network Demonstrations This year the University of Michigan and AGLT2 are again participating in SuperComputing 2015. The venue...
Replaced Disks That Show "Foreign" Status Such a simple thing, but such a pain. We've all seen this, replace a failed disk in a RAID array with a salvaged disk, ...
Hardware Transition Planning from head01 (old R610) to head01 temp (new R630) We purchased a new Dell R630 to act as replacement hardware for our existing head01 ...
Test results comparing zfs to ldiskfs The tests below run a test Lustre system (mgs umdist10) through its paces, starting with a zfs 0.6.4.2 straight up install...
Setup and Configuration of the AGLT2 MD3820i This details our installation and configuration of our new MD3820i (UMVMSTOR03). We received both units on August 4th...
VMWare Setup and Updates This page should keep track of VMware related setup/updates and information. Update to vSphere 5.1 This section will document the detail...
In order to setup your GRID Certificate, you need to have already completed the initial steps of requesting the certificate, registering for membership in the ATL...
Squid rpm Installation or Update Follow directions at this OSG Twiki page for installing. More directions are available at General CERN Twiki page. Summary step...
Install Postgresql on CentOS/RHEL/SL with Replication for Esmond This Wiki topic covers installing Postgresql with replication to support the Esmond DB. You will ...
Checking out and editing CFEngine policy Some general notes and information Policies are exported from /var/cfengine/policy on umcfe or msucfe. Any directory un...
Using ZFS on Linux for AGLT2 AFS Fileservers Recently ZFS on Linux became available. ZFS has lots of nice features including Copy On Write (COW), data integrity v...
MultiCore Condor Set UP Introduction AGLT2 implements a mix of static and dynamic job slots for MultiCore jobs. At the time of this writing, we use 10 static sl...
Setting up Condor CE Condor CE is a replacement for globus on our gatekeepers. Condor-G can still be used to submit jobs to the gatekeeper, but then the JobRoute...
Cleaning Up the srmspacefile Table (SRM Space token Allocations) We recently found out that our srmspacefile table in dcache was inconsistent with our actual spac...
Athena can be tricky to set up and run under your user account. These are some minimal directions to follow. The ATLAS Computing Workbook is chock full of helpfu...
Using OMD and GLPI for AGLT2 We have some nice tools installed to monitor our systems and software (OMD/Check_MK) and track the resolution of problems (GLPI). It ...
Planning Condor Configuration Updates for AGLT2 Now that AGLT2 is running on an SL6.4 OS we can plan on implementing some new features in Condor that will take ad...
HowTo Shutdown a Pool Node While In Production We sometimes need to restart/reboot pool nodes and would like to make this as minimally disruptive as possible to the production sy...
CPLD Firmware Updates The CPLD has to be updated outside the OS environment using a bootable USB drive or creating a bootable ISO image and uploading the image to...
Useful Links This page will link to many pages useful for day to day administration Monitoring * Ganglia Monitoring * PerfSonar (latency) (bandwidth) * ...