List of Foswiki users Below is a list of users with accounts. If you want to edit topics or see protected areas of the site then you can get added to the list by ...
Download openafs kernel source rpm and install on build system. The SRPM from https://linat05.grid.umich.edu/pub/SLC/4x/custom/SRPMS/openafs 1.4.6 1.1_AGLT2.src....
Building Rolls ROCKS includes the roll mechanism for packaging software for distribution. A detailed developer guide is available; it contains all the info I nee...
MSU 2008 May BNX2 DKMS Ganglia Found that the existing bnx2 network driver was the cause of the large spikes in the ganglia network plots. It intermittently pu...
Condor Batch System This is the main page for administrative info about the Condor batch system(s) in use at AGLT2. User info is at CondorUser. A description of t...
MSU Raritan sh /usr/local/Raritan/Raritan MPC/5.0.3.5.36/start.sh The first time running it, you will need to tell this client where the KVM is: create "new prof...
dCache Config Overview dCache from Database to Filesystem Below is a view of the chain of configuration for the dCache system starting with the PostgreSQL databa...
Just starting this out... Tom, May 21 Data Storage Locations What locations are in use and how they are used. We have a number of storage locations for AGLT2: ...
AGLT2/Dell.OmreportOmconfig There is a Dell ROCKS Roll, do we want that? AGLT2/Dell.DellOrderStatusMSU check status of a Dell order for MSU AGLT2/Dell.DellService...
PC6248 Reformat Note that using the command show dir seems to be a good stress test. Even switches that pass the check disk commands below can fail running the show...
Power Connect SNMP Lots of info is available via SNMP from the PowerConnects. References * http://wiki.xdroop.com/space/snmp/Switching Tables * http://for...
Dell Poweredge x950 Hardware Notes Information about the Dell 1950 and 2950 nodes. Dell Docs BIOS The Fall '07 order arrived with BIOS v1.5.1. This was release...
Provisioning the Dell 2950/MD1000 Storage Servers Recipe for getting a new PE2950 and MD1000 combination going as an fss nfs appliance in our ROCKS cluster. Refer...
Installing DQ2 References * https://twiki.cern.ch/twiki/bin/view/Atlas/PandaDataService Procedure Shawn created a host cert from doegrids.org for umfs02.grid...
Drive Replacement on the MD1000 Obviously, this needs some testing... A drive has failed and XFS got errors and took the filesystem offline. * unmounted filesy...
Extend Compute Need to update extend compute for use at MSU and for changes in ROCKS 4.3. Actions in version from early Nov 2007 * install libgfortran with an...
Dear Atlas US grid participant: REMINDER The next USATLAS Facilities and Operations phone meeting will occur Wed 1200 1330 CDT This is a reminder about a c...
Monitor Disk Activity with iostat and Ganglia The iostat utility from the sysstat package provides information about disk operations and throughput. It works in ...
A Plan for the ROCKS Graph ROCKS Graphs ROCKS uses the Redhat Anaconda installer to do installs. Using the Anaconda installer provides many advantages: * It ...
AGLT2 IP Addresses Information on IP addresses for the Tier2 For a detailed list of IPs at MSU, ask Tom for msu ips.ods or see configs/msu/network/msu ips.csv For ...
See also: * MSUDZeroOsgSE about the storage element * MSUDZeroOsgStartup Restarting the system * MSUDZeroOsgTests Testing the OSG site * MSUDZeroOsgJo...
Monitoring D0 Jobs Samgrid monitoring is at http://samgrid.fnal.gov:8080/ The list of recent jobs for the samgrid scheduler that is used for MSU jobs is here. Jo...
Storage Element An SRM/dCache instance is added to the site as a grid accessible Storage Element. dCache is a very flexible package for combining multiple filesy...
Restarting the MSU OSG Grid How to restart the system after an outage. Bring Up and Check Services Cluster Services General cluster services are required, for in...
Testing That OSG Site is Functional Central tests OSG centrally tests all sites a few times a day. * List of all sites * Current result for MSU OSG * C...
MSU Hardware Catalog This page lists hardware at MSU. Subpages provide more details and link to hardware documentation. Rack View * WesternSciRack2005 The ra...
This page is obsolete Hardware maintenance is now logged at http://glpi.aglt2.org/ MSU Hardware Repairs Until we have a better system, I'm recording hardware rep...
Room/Site Infrastructure Monitoring at MSU Liebert Air Handlers The two Liebert System/3 Air Handler units have Intellislot Web / 485 cards. See: * http://ww...
for big three phase PDUs The rearmost PDU is 1. In these racks the rearmost PDU is inverted (its cord comes out the top). * place label like "MAC 00:00:00:00...
Condor Monitoring Commands To see a view of the available job slots, use the command "condor_status". To see a view of the jobs in the system submitted from your c...
User Info for MSU Tier3 Regulations Your usage of the cluster must conform with MSU's acceptable use statement http://www.msu.edu/au/ Privacy The cluster is a m...
Nov 2008 T2 Hardware Things that can happen whenever: * configure PDUs power strips * install power cords to PDUs * label needed network cables * get ...
MSU Tripp Lite UPS The storage racks at MSU have Tripp Lite SU6000RT3UHV UPSs. These are 6KVA models. One of the two PDUs (power strips) in the rack is fed from ...
pe2950 Utility Node Install Have a pe2950 with 2x 250GB drives and 4x 750 GB drives. Want to set it up to support a variety of cluster services including running ...
Backing up and moving VMs If the VM is running you need to pull a snapshot and backup, otherwise the .vmdk may not be consistent. Spaces in VM names for some back...
* RebuildComputeNode Rebuilding a ROCKS compute node * RespondToDownNode What to do with a down node * ControlledShutdown How to bring the cluster down nice...
MSU OSG OSG site information and policy. Currently the MSU OSG site is 100% allocated to "SAMGrid" processing for the DZero Experiment. An SRM/dCache v2.2 SE is l...
Some Addresses At U M 198.32.43.193 an interface on Nile All UM Networks and purposes:http://www.itcom.itd.umich.edu/backbone/umnet/Tool to list all known IP ass...
Potentially Useful Network Equipment Info about hardware we are considering using. Dell Powerconnect 6248 This is one of the new (Fall 2006) fixed switches that s...
Planning for the production network. NetworkHardwareInfo Near term To Do List Here is a list of network related items that need doing as of February 4, 2011: ...
How msurxx was set up Create config files in SVN In the ROCKS SVN repo, below hostconfigs, copy msurxii.aglt2.org to msurxx.aglt2.org. Checkout (nominal location...
Re Adding to Database If a host needs to be re added to the database (it was erased, or the front end is being rebuilt), get the host info from SVN. root@msurxi ...
Configuring the Frontends Record of configuration done to frontends. Updated Dec 16th for msurx build. Connecting You can ssh to the frontend as root to perform...
Installing the Frontends The frontends are installed on VMware clients. Note that you must have a valid resolvable IP address and name or the install will fail. ...
Host Configs in SVN Have modified host config files tracked in SVN. Also have ROCKS DB entries tracked. Pull these out and put on the frontend. Storage area We...
ROCKS 5 Routing ROCKS 5 has a flexible scheme for setting routing rules. Similar to the attribute scheme, routes can be set on a global, OS, appliance and host l...
Frontend Config in SVN Have a scheme to track frontend config in SVN. A directory structure is created at /var/svn, below here modified configurations are copied....
VMware Hosting of Frontend Wish to run the frontends in VMware ESXi. The primary benefit is server consolidation. This also provides a good way to make a full b...
Raritan Dominion MSU MSU has a Dominion KX132. This is a 32 port model. As of 2008, the Dominion KX series has been replaced in Raritan's line with the Dominion KX ...
Procedure for rebuilding a compute node In general, compute node rebuilding is fairly easy and the ROCKS should be maintained so that compute nodes can be rebuild...
MSU Hummm... Normal procedure is to plug keyboard/monitor into node and see if there are any kernel messages on screen. On Dells also note errors from LCD. U M ...
Node down in Ganglia If a node is down in ganglia do this (assumes ganglia config is ok...): * From frontend, ping private interface * OK: try ssh to nod...
ROCKS 5.3 * R53abMSURXX Setup of msurxx * R5abFileServer Building a file server and configuring with cfengine * UMRocksiSetup Taking umrocksi.aglt2.org f...
Manually Adding Nodes to Database The insert-ethers command used to support manually adding nodes; it no longer does. However, this can be performed using the rock...
Intro Here are listed releases or "tags" of the ROCKS installation. Issues with each can be added here. Summer 2011 making an attempt to maintain this page go...
The lighttpd service is running on the client during install. It matches URLs that have HOST == 127.0.0.1, then does a redirect of "/install/(. )$" = rocks by.py?filena...
ROCKS Node Info A feature that is weak or missing from ROCKS is a way to add user defined parameters on a per node basis. There is a mechanism for adding user de...
Building a ROCKS client worker node Build Server Status The software revision for ROCKS and CFEngine that are active on the ROCKS frontend are shown here, the ne...
DNS in ROCKS References: * http://rscott.org/dns/ DNS Oversimplified This page written for ROCKS 4.3 with the update given below. ROCKS will manage the config ...
ROCKS5 has a whole new scheme for user customization of partitioning. The annoyances with getting custom partitioning done in ROCKS4 seem to be gone; we no longer ne...
ROCKS FAQs Introduction FAQs are divided into groups... Database management using ROCKS command Add an appliance This will add new entries in the membership an...
ROCKS 4.3 Install From Scratch Install log of ROCKS 4.3 and SLC45 on a Dell Poweredge 2950 server and 1950 client. The client will be installed over the network ...
Generic Kickstart Want to perform a network install of a node that won't be a ROCKS client but using the ROCKS frontend as the kickstart server. Have tried and f...
ROCKS Graphs Default graph in ROCKS52 Initial new AGL graph structure in ROCKS52 How To make them View on frontend web page (under "misc admin"), or... rocks lis...
ROCKS Installer This page is a description of how the ROCKS installer boots, how it requests the kickstart file for the node, how the server generates the kicksta...
ROCKS This is the main local page for the ROCKS cluster software. Subpages: * BuildingRocksRolls * RocksAglReleases Notes on configs used in production ...
ROCKS MySQL Database ROCKS stores configuration information in a MySQL database. Normal operations on configuration are performed with the rocks command, but thi...
ROCKS Frontend on VMware Running ROCKS frontend on VMware can simplify some management and recovery operations. The ROCKS frontend functions don't require much c...
Install on Frontend Setup EPEL yum repo On test ROCKS5 frontend, install puppet server from EPEL repo. Note that redhat.com is not reachable from aglt2.org. But...
Using RCS in ROCKS and How ROCKS Uses RCS ROCKS uses RCS on files that are written or appended with the file tag in kickstart xml. This provides some possibili...
Recovery Roll The ROCKS server can automatically build a recovery roll that allows the server's configuration to be reproduced on a new ROCKS server install. Thi...
ROCKS Site Sync We have two ROCKS clusters and wish to keep their configurations synchronized. This page will describe how to do that. Note that a closely relat...
ROCKS Kickstart XML Style Guide and How To The kickstart XML and the accompanying scripts (extras directory) specify much of the configuration for nodes on our cl...
Managing the ROCKS Installer with Subversion See local subversion pages at Subversion Creating a Branch or Tag SVN root@msurox /home/install # svn copy m "c...
Test Frontend Wish to be able to separate production and test use of frontend. The idea is that the normal production frontend can be maintained with a well defi...
ROCKS Test Server Wish to have a second server on cluster to enable testing of new ROCKS configurations. It seems that it will be simpler to have an entirely sep...
Directory /home/install/tools under SVN control. Directory /home/install/tools/bin intended for adding to PATH as desired. Main.TomRockwell 29 May 2009
Update Installer Kernel Warning: this is a kludge. Darn, it seems to work fine with the r610 hardware, but on the existing pe1950s the hard disk doesn't get mounted for rei...
Security Planning Config Changes to Tighten Security Ideas from June 26 meeting: * Firewall changes, see SystemInstallChecklist * See below...implement...
See http://www.sensatronics.com/index.php/industrial monitors/model e4.html Need to connect using serial port to make IP configuration. It has a web server and ...
Subversion Subversion is a software revision control system designed to be an improvement on CVS. It generally replicates the features of CVS. References * T...
MSU Two switches are at msu sw1.local and msu sw2.local. To find a given node's switch port(s): Option 1 access the switch web interface and browse for the port ...
Getting the MSU site up in ROCKS Had some initial difficulty getting the compute nodes installed. Restarted from scratch with the following plan: * Clean up /...
Trouble Atlas Atlas Analysis Job mishandled OSG APP Paul, Bob, OSG_APP should be "/atlas/data08/OSG/APP". "atlas_app/atlas_rel" are subdirectory created when i...
ROCKS Upgrade Procedures It would be helpful to write down how the following are performed: * Upgrade ROCKS version (minor and major) * Upgrade OS version (...
Want to do test installs of nodes in a VMware ESXi guest. Expect that more things can be made to work similarly to an install on a physical host, but expect that th...
These nodes use the MSI barebones chassis, OEM for the IBM 325. We purchased 20 of them from Western Scientific Spring 2005. They were all upgraded to dual core...
Rack Equipment * msi05 compute node WesternNode2005 * sm06 compute node TeamHPCNode2006 Rack Layout Notes: * RU Slot 42 is at the top of rack; 1 is at t...
Introduction xCAT is a cluster management tool originally developed at IBM and now Open Source. xCAT v1 was rewritten with much of the same functionality but a n...
Installing Follow the Install chapter of "Top Document" xCAT2top.pdf xCAT is installed with the command: yum install xCAT (There's some stuff to do before and aft...
Bootstrapping a Client CFE3.2 cf agent includes an option for bootstrapping the client. This uses a basic "update.cf" that is built into the cf agent. Essentia...
CFEngine Policy Path The CFEngine policy server is used to provide configuration scripts and various application config files to clients as needed. Clients alway...
Notes on Migrating from CFEngine community 3.2.1 to 3.3 References: * http://cfengine.com/blog/cfengine 330 release notes * https://cfengine.com/bugtracker/...
Comparing Blades vs. 1U Systems for Compute Node Use We use fairly basic compute node configurations dual CPU, 2 GB per core RAM (which might be considered larg...
Dell Deployment Toolkit Dell provides a toolkit for automating configuration of their systems. A Linux environment can be booted from CD or via PXE which includes...
Updating Dell Firmware Dell provides (officially unsupported?) yum OMSA (OpenManage Server Administrator) repo. This includes the omreport/omconfig and other "srv...
Links * http://support.dell.com/support/edocs/storage/Storlink/ Dell Manuals Physical Inspection Have suffered shipping damage on an MD1200, check that mounting...
BIOS Settings Turbo Mode http://kolbusa.livejournal.com/71066.html ROCKS Install Issues In ROCKS 4.3, the LOM for these is not supported, see ??? for info on upd...
Calling Dell Regardless of the apocryphal tech support phone call horror stories, calling Dell for hardware service is quite easy. For these systems you will spea...
Lifecycle Controller The 11th generation (and newer) PowerEdge servers have a built in system designed to assist with managing firmware updates, hardware diagnost...
Fun with omconfig and omreport These programs allow reporting and controlling many hardware functions of newer Dell hardware. Install the srvadmin all rpm to get ...
Build/Rebuild a ROCKS Worker Node Recipe for rebuilding a ROCKS worker node. Includes defining the node on frontend using "nodeinfo" setup. ROCKS default way of a...
Build of msurx ROCKS 5.5 frontend This page gives a procedural view of the build of msurx with ROCKS 5.5 and SL58, some differences in the use of ROCKS 5.5 vs. 5....
Changing the 411 Server for a Client Example of moving a client from 5.3 411 server to 5.5 411 server. We had a login node that had been built from ROCKS 5.3 (ms...
Making ROCKS Rolls ROCKS is designed with pluggable "rolls" that allow features to be added to the cluster. There are hooks for doing pretty much any frontend or ...
Updating SL Roll The ROCKS roll agl update sl58 contains the current set of security updates for the base OS (SL58) used in the ROCKS builds. The rolls mechanism ...
PowerVault Management Dell calls this software "PowerVault Modular Disk Storage Manager". It is used to manage the PowerVault shelves that have internal RAID cont...
SysBench Tarball Install Get it from http://sourceforge.net/projects/sysbench/ Have 0.4.12, to build, in unpacked source directory: yum install libtool libtooliz...
Notes on Applying the vSphere 5.1a Update Applied the vSphere 5.1 update to the ESXi hosts and vCenter server. In theory, this is a relatively straightforward pro...
VMware CLI VMware offers a wide range of management tools, with overlapping capabilities. ESXi v5 includes a pretty useful shell on the actual VM host. VMware al...