LFC SQL Queries
Below are some potentially useful SQL queries to check the status of the LFC.
These are my test queries and I don't guarantee they are correct at this point!
Find "orphan" replicas
select fileid,sfn from Cns_file_replica where fileid NOT IN (select fileid from Cns_file_metadata);
Normally this query should not return anything. If it did there would be a replica record without a corresponding metadata record in the LFC.
delete from Cns_file_metadata where fileid NOT IN (select fileid from Cns_file_replica) and filesize>0;
Note in this case we expect "directories" in LFC to not have replicas. The
filesize>0 helps isolate "files" but doesn't remove directories which are 0 size.
Fix the 'nlink' Field in the LFC
When we clean up entries in the LFC the 'nlink' parameter may not always be properly updated. Sarah Williams (MWT2_IU) provided the following SQL to update/repair the
nlink values:
create temporary table if not exists fscknlink select name,fileid, nlink AS oldnlink, (select count(*) from Cns_file_metadata AS ChildFiles where
Cns_file_metadata.fileid=ChildFiles.parent_fileid) AS newnlink from Cns_file_metadata where guid is NULL having oldnlink!=newnlink;
Then we need to apply the new values of
nlink to our Cns_file_metadata table:
update fscknlink,Cns_file_metadata set Cns_file_metadata.nlink=fscknlink.newnlink where Cns_file_metadata.fileid=fscknlink.fileid;
An example run on our site:
mysql> create temporary table if not exists fscknlink select name,fileid, nlink AS oldnlink, (select count(*) from Cns_file_metadata AS ChildFiles where
-> Cns_file_metadata.fileid=ChildFiles.parent_fileid) AS newnlink from Cns_file_metadata where guid is NULL having oldnlink!=newnlink;
Query OK, 51915 rows affected (1 min 18.93 sec)
Records: 51915 Duplicates: 0 Warnings: 0
mysql> update fscknlink,Cns_file_metadata set Cns_file_metadata.nlink=fscknlink.newnlink where Cns_file_metadata.fileid=fscknlink.fileid;
Query OK, 51915 rows affected (6.91 sec)
Rows matched: 51915 Changed: 51915 Warnings: 0
Find Files with more than one Replica
select fileid from Cns_file_replica GROUP BY fileid HAVING count(fileid)>1;
Once you have a list of fileids it is easy to check them:
select sfn from Cns_file_replica where fileid=;
Query for all files in /pnfs and output to file
select rowid,SUBSTRING(sfn,LOCATE('/pnfs',sfn),LENGTH(sfn)-LOCATE('/pnfs',sfn)+1) into outfile '/tmp/lfc_sfn_pnfs.dat' from Cns_file_replica where locate('/pnfs',sfn)>0;
This will find the
/pnfs
located files and output the results into
/tmp/lfc_sfn_pnfs.dat
. It records the
rowid (which is the key) and the PNFS path to the file.
Update Cns_file_replica 'setname' file with PNFSID
I have some perl/DBI code to do this assuming you have already generated the list of PNFSIDs for the LFC entries. The steps are:
- First "dump" all LFC pnfs paths via:
select rowid,SUBSTRING(sfn,LOCATE('/pnfs',sfn),LENGTH(sfn)-LOCATE('/pnfs',sfn)+1) into outfile '/tmp/lfc_sfn_pnfs2.dat' from Cns_file_replica where locate('/pnfs',sfn)>0 and (setname='' or setname is null);
The output file /tmp/lfc_sfn_pnfs.dat holds the rowid and PNFS path to each entry.
- Run a simple perl script to get the 'pnfsid' from the /pnfs mount point and create a new output file containing "rowid pnfspath pnfsid" (just append pnfsid to the above info and put it into a new file. NOTE: if the pnfsid can't be found set it to -1.
- Run the script below to put these values into the 'setname' field of the Cns_file_replica table.
#!/usr/bin/perl
#
# Set the PNFSID for each LFC entry replica into the 'setname' field
#
use DBI;
#use Time::Format;
my $verbose=1;
$infile=$ARGV[0];
chomp($infile);
my $dbh=DBI->connect("DBI:mysql:dbname=cns_db;host=lfc.aglt2.org","<user>","<password>",{ RaiseError => 1}) or die "Open of cns_db on lfc.aglt2.org failed: $DBI::errstr\n";
my $sth=$dbh->prepare('update Cns_file_replica set setname=? where rowid=?') or die "Couldn't prepare statement: " . $dbh->errstr;
open(IN,"<$infile") or die "Unable to open $infile: $!";
# Make sure the input matches our PNFS space
$cnt=0;
while (<IN>) {
if ( ($cnt++%10000) == 0 ) {
print " Did $cnt entries at ".localtime(time());
}
($rowid,$pnfspath,$pnfsid)=split(/\s+/);
# print " Updating rowid=$rowid to use setname=$pnfsid...\n";
$sth->execute($pnfsid,$rowid) or die "Couldn't execute statement: " .$sth->errstr;
# last;
}
$dbh->disconnect();
close(IN);
print "Done!\n";
exit;
Count the Number of Files in a Specific Directory Path
select count(*) from Cns_file_replica where locate('/atlasmcdisk/',sfn)>0;
This example counts the number of files in our AGLT2_MCDISK area on /pnfs.
--
ShawnMcKee - 01 Apr 2009