Bi-weekly AGLT2 site meeting notes
Thursday, June 26, 2008
- Need to check syslog-ng configuration for all hosts and base central logging on a more reliable machine
- Need to check firewall config on all hosts, determine common config. See also SystemInstallChecklist for some notes on this
- Packet logging, Snort, other
- Start scanning ourselves (Nessus, nmap).
- Keep public services up to date; scan regularly to confirm, and check auto-update configurations
- Use Kerberos tickets for ssh logins, integrate MSU Trash/Tier3 into Kerberos, slave KDC at MSU
- Consider removing C compilers from systems where not needed, other extraneous tools to remove?
- Configuration management tools - Cfengine (http://www.cfengine.org), Node Configuration Management (http://quattor.web.cern.ch/quattor/), other?
- STOP USING ROOT TO LOGIN (and use sudo)
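A minimal sketch of the "scan ourselves" item above, using only the Python standard library. It is not a substitute for Nessus or nmap; it only shows the idea of comparing the TCP ports a host actually answers on against the ports we expect to be open. The function names and the notion of a per-host "allowed" list are assumptions made for illustration.

```python
import socket


def open_tcp_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


def unexpected_ports(host, ports, allowed):
    """Ports that are open on `host` but not on its allowed list (hypothetical policy)."""
    return sorted(set(open_tcp_ports(host, ports)) - set(allowed))
```

Run regularly against each host with the common firewall config as the allowed list, anything returned by `unexpected_ports` would be a service to investigate.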
Thursday, Sept 19, 2007
o "Press release" -- was this ever released?
Was linked on MSU homepage; press releases don't really occur as such. So... yes?
o Any and all items that occur in regards to the new purchase
o Discussion of configuration for multi-location running
- Storage organization
- We need to test cross-site storage data transfer. We should have enough bandwidth that it doesn't matter: 10G WAN links, and 10G links to local servers anyway.
o Gatekeepers -- more than 1?
- backup gatekeeper at MSU
- virtual IP failover? round robin addressing?
- LVS (Linux Virtual Server)
Networking setup, should meet with Roy, etc.
- will have two stacks, failover link disabled by spanning tree
- tunneling private networks between sites for Rocks provisioning?
- setting up routing so that we take advantage of WSU link.
o Condor flocking between UM/MSU
- Bob is testing on new test nodes
o UM will begin to run tests on this with arrival of new equipment
o Possible distributed disk configurations
Thursday, Aug 23, 2007
Press release - looks like progress. Some materials were requested from Chip last week, and something was supposed to be released Monday of last week. No contact since; Chip is contacting again now.
RFQ - some vendor response, looks good. We might want to discuss ongoing changes to subcontract with relevant university people.
Network problems - MSU is buying same brand of switch as the ones causing problems at Merit lately, which might be a concern.
General question for us: what kind of monitoring can we do to catch failures of our own equipment? What kind of alerts, and can we monitor all the disparate services/networks we depend on?
We should set up separate submission gateways for our 2 locations. BNL, for example, has hundreds. Some concerns about jobs submitted to one site reading data over WAN links from another - can we dynamically modify submit files?
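One way the "dynamically modify submit files" idea could look, as a rough sketch only. It assumes Condor-style submit descriptions (key = value lines followed by a queue statement) and assumes, hypothetically, that worker nodes advertise a custom attribute named Site; neither the attribute nor the function name pin_to_site comes from our current setup.

```python
def pin_to_site(submit_text, site):
    """Return a copy of a submit description restricted to machines at `site`.

    Assumption: machines advertise a custom ClassAd attribute `Site`
    (e.g. "UM" or "MSU"). This attribute is hypothetical.
    """
    clause = '(TARGET.Site == "%s")' % site
    out = []
    seen_requirements = False
    for line in submit_text.splitlines():
        words = line.split()
        key = line.split("=", 1)[0].strip().lower()
        if "=" in line and key == "requirements":
            # AND the site clause onto an existing requirements expression
            existing = line.split("=", 1)[1].strip()
            line = "requirements = (%s) && %s" % (existing, clause)
            seen_requirements = True
        elif words and words[0].lower() == "queue" and not seen_requirements:
            # No requirements line yet: add one just before the first queue
            out.append("requirements = %s" % clause)
            seen_requirements = True
        out.append(line)
    return "\n".join(out) + "\n"
```

A submission gateway could apply this to each incoming file with the site that holds the job's input data, so the job runs where it can read locally rather than over the WAN.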
Lead time on equipment - Liebert quotes a 6-week lead time; we're on schedule for the Oct 15 deadline.
Schedule meeting about our site services, probably after bids (due Aug 30).
Meet about bids when we have them, normal meeting Sep 6, 9 am.
Thursday, Aug 2, 2007
Discussed call for bids, draft circulating over email.
MSU is planning to purchase approximately 60TB of storage (2 x 30TB) and 50 compute nodes.
For planning, looking at about 10 MB/s sustained bandwidth per job.
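As a quick sanity check on the per-job figure, the arithmetic below estimates how many jobs a single link could feed at 10 MB/s each. It assumes decimal units (1 Gbit/s = 125 MB/s) and ignores protocol overhead; the 1G case is an assumed worker-node link speed for comparison, not a figure from the discussion.

```python
def max_jobs(link_gbit_per_s, per_job_mbyte_per_s=10.0):
    """Number of jobs a link can sustain at the planned per-job rate."""
    # Convert Gbit/s to MB/s using decimal units: 1 Gbit/s = 1000 Mbit/s = 125 MB/s
    link_mbyte_per_s = link_gbit_per_s * 1000.0 / 8.0
    return int(link_mbyte_per_s // per_job_mbyte_per_s)


print(max_jobs(10))  # 10G link: 1250 MB/s / 10 MB/s = 125 jobs
print(max_jobs(1))   # assumed 1G link: 125 MB/s / 10 MB/s = 12 jobs
```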
- 14 Aug 2007