LFC SQL Queries

Below are some potentially useful SQL queries to check the status of the LFC. NOTE These are my test queries and I don't guarantee they are correct at this point!

Find "orphan" replicas

select fileid,sfn from Cns_file_replica where fileid NOT IN (select fileid from Cns_file_metadata);

Normally this query should not return anything. If it did there would be a replica record without a corresponding metadata record in the LFC.

Find LFC metadata records without replicas

delete from Cns_file_metadata where fileid NOT IN (select fileid from Cns_file_replica) and filesize>0;

Note in this case we expect "directories" in LFC to not have replicas. The filesize>0 helps isolate "files" but doesn't remove directories which are 0 size.

When we clean up entries in the LFC the 'nlink' parameter may not always be properly updated. Sarah Williams (MWT2_IU) provided the following SQL to update/repair the nlink values:

create temporary table if not exists fscknlink select name,fileid, nlink AS oldnlink, (select count(*) from Cns_file_metadata AS ChildFiles where
Cns_file_metadata.fileid=ChildFiles.parent_fileid) AS newnlink from Cns_file_metadata where guid is NULL having oldnlink!=newnlink;

Then we need to apply the new values of nlink to our Cns_file_metadata table:

update fscknlink,Cns_file_metadata set Cns_file_metadata.nlink=fscknlink.newnlink where Cns_file_metadata.fileid=fscknlink.fileid;

An example run on our site:
mysql> create temporary table if not exists fscknlink select name,fileid, nlink AS oldnlink, (select count(*) from Cns_file_metadata AS ChildFiles where
    -> Cns_file_metadata.fileid=ChildFiles.parent_fileid) AS newnlink from Cns_file_metadata where guid is NULL having oldnlink!=newnlink;
Query OK, 51915 rows affected (1 min 18.93 sec)
Records: 51915  Duplicates: 0  Warnings: 0

mysql> update fscknlink,Cns_file_metadata set Cns_file_metadata.nlink=fscknlink.newnlink where Cns_file_metadata.fileid=fscknlink.fileid;
Query OK, 51915 rows affected (6.91 sec)
Rows matched: 51915  Changed: 51915  Warnings: 0

Find Files with more than one Replica

select fileid from Cns_file_replica GROUP BY fileid HAVING count(fileid)>1;

Once you have a list of fileids it is easy to check them: select sfn from Cns_file_replica where fileid=;

Query for all files in /pnfs and output to file

select rowid,SUBSTRING(sfn,LOCATE('/pnfs',sfn),LENGTH(sfn)-LOCATE('/pnfs',sfn)+1) into outfile '/tmp/lfc_sfn_pnfs.dat' from Cns_file_replica where locate('/pnfs',sfn)>0;

This will find the /pnfs located files and output the results into /tmp/lfc_sfn_pnfs.dat. It records the rowid (which is the key) and the PNFS path to the file.

Update Cns_file_replica 'setname' file with PNFSID

I have some perl/DBI code to do this assuming you have already generated the list of PNFSIDs for the LFC entries. The steps are:
  • First "dump" all LFC pnfs paths via: select rowid,SUBSTRING(sfn,LOCATE('/pnfs',sfn),LENGTH(sfn)-LOCATE('/pnfs',sfn)+1) into outfile '/tmp/lfc_sfn_pnfs2.dat' from Cns_file_replica where locate('/pnfs',sfn)>0 and (setname='' or setname is null); The output file /tmp/lfc_sfn_pnfs.dat holds the rowid and PNFS path to each entry.
  • Run a simple perl script to get the 'pnfsid' from the /pnfs mount point and create a new output file containing "rowid pnfspath pnfsid" (just append pnfsid to the above info and put it into a new file. NOTE: if the pnfsid can't be found set it to -1.
  • Run the script below to put these values into the 'setname' field of the Cns_file_replica table.

#!/usr/bin/perl
#
#  Set the PNFSID for each LFC entry replica into the 'setname' field
#
use DBI;
#use Time::Format;

my $verbose=1;

$infile=$ARGV[0];
chomp($infile);

my $dbh=DBI->connect("DBI:mysql:dbname=cns_db;host=lfc.aglt2.org","<user>","<password>",{ RaiseError => 1}) or die "Open of cns_db on lfc.aglt2.org failed: $DBI::errstr\n";
my $sth=$dbh->prepare('update Cns_file_replica set setname=? where rowid=?') or die "Couldn't prepare statement: " . $dbh->errstr;

open(IN,"<$infile") or die "Unable to open $infile: $!";

# Make sure the input matches our PNFS space
$cnt=0;
while (<IN>) {
    if ( ($cnt++%10000)  == 0 ) {
        print " Did $cnt entries at ".localtime(time());
    }
    ($rowid,$pnfspath,$pnfsid)=split(/\s+/);
#    print " Updating rowid=$rowid to use setname=$pnfsid...\n";
    $sth->execute($pnfsid,$rowid) or die "Couldn't execute statement: " .$sth->errstr;
#    last;
}
$dbh->disconnect();
close(IN);
print "Done!\n";
exit;

Count the Number of Files in a Specific Directory Path

select count(*) from Cns_file_replica where locate('/atlasmcdisk/',sfn)>0;

This example counts the number of files in our AGLT2_MCDISK area on /pnfs.

-- ShawnMcKee - 01 Apr 2009
Topic revision: r9 - 16 Oct 2009, TomRockwell
This site is powered by FoswikiCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Foswiki? Send feedback