Migrating files to newly added OST so as to balance content
It is desirable to distribute access over as many Lustre OSS as possible, so when a new OSS, umdist04.aglt2.org, was added with its 8 new OSTs, we set out to move files from the two existing OSS, umfs05 and umfs06, onto the newly installed system.
On 9/3/2010 a 4th file server, umdist03, was added to the mix, and the migration was repeated.
Migration from umfs06, umfs05 to umdist04
- All scripts used are saved in SVN
- Total occupied space in Lustre is 66TB (47TB free, 118TB total)
- umdist04 capacity is 32.9TB, 28% of the total, so plan on moving 1/4 of 66TB, or 16TB
- Choose all files within Lustre that are stored on the existing OSTs and whose names match six patterns. These generally correspond to large data files.
- Patterns
- *.AOD.*
- DESD_COLLCAND.*
- NTUP_MUONCALIB*
- AOD.*
- *.ESD.*
- *.RAW.*
- Script
- The script executes once per pattern, restricting the search to all OSTs (obd devices) that already contain data; a sketch of these list-preparation steps appears after this list
- lfs find /lustre/umt3 -type f -name "pattern" -obd umt3-OST0000 -obd .... > log_file
- Concatenate the per-pattern lists, sort them uniquely, and add up the space occupied by the files in the resulting list (roughly an hour or more to do)
- The selected files occupy 43TB, of which we will move 16TB, or 37%
- Randomly choose 37% of the files in the prepared list, and split those into 5 listings
- 139000 files selected, split into 5 lists of 28368 lines each, i.e., files to move
- Set the OSTs on the umfs05/umfs06 OSS to read-only, e.g.
- lctl --device umt3-OST0012-osc deactivate
- Run lfs_migrate on 5 worker nodes, one per listing, where the worker nodes are idle in Condor and already upgraded to the 1.8.4 version of Lustre (a sketch of the migration commands appears after this list)
- Bugs in Lustre 1.8.3 cause problems with this procedure
- Observed copy rate into umdist04 is about 112MB/s
- Set the read-only OSTs back to read-write, e.g.
- lctl --device umt3-OST0012-osc activate
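As an illustration of the list-preparation steps above, here is a minimal bash sketch. It assumes hypothetical working-file names (candidates_raw.txt, candidates.txt, selected.txt, movelist_*), uses shuf for the random selection, and abbreviates the source OST list to two entries; the actual scripts are the ones saved in SVN.

#!/bin/bash
# Sketch only -- the production scripts live in SVN.
# File names and the source OST list below are illustrative.
FS=/lustre/umt3
PATTERNS=('*.AOD.*' 'DESD_COLLCAND.*' 'NTUP_MUONCALIB*' 'AOD.*' '*.ESD.*' '*.RAW.*')
SRC_OSTS="-obd umt3-OST0000 -obd umt3-OST0001"   # extend with every OST already holding data

# One "lfs find" pass per pattern, restricted to the source OSTs
> candidates_raw.txt
for pat in "${PATTERNS[@]}"; do
    lfs find $FS -type f -name "$pat" $SRC_OSTS >> candidates_raw.txt
done

# A file can match more than one pattern, so keep each name only once
sort -u candidates_raw.txt > candidates.txt

# Randomly pick ~37% of the candidates and split them into 5 work lists
nsel=$(( $(wc -l < candidates.txt) * 37 / 100 ))
shuf candidates.txt | head -n "$nsel" > selected.txt
split -d -l $(( (nsel + 4) / 5 )) selected.txt movelist_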
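Likewise, a minimal sketch of the deactivate / migrate / reactivate cycle, assuming one of the hypothetical work lists (movelist_00) from the sketch above; the lctl commands are repeated for every OST on umfs05 and umfs06, and one lfs_migrate runs per worker node.

# Mark the source OSTs inactive so no new objects land on them while
# files are being rewritten (the "read-only" step above); repeat per OST
lctl --device umt3-OST0012-osc deactivate

# On each worker node, rewrite every file in its work list; the new copies
# are striped onto the still-active (new) OSTs.  -y answers yes to the
# lfs_migrate safety prompt.
cat movelist_00 | lfs_migrate -y

# When all work lists have been processed, reactivate the source OSTs
lctl --device umt3-OST0012-osc activate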
Migration from umfs05, umfs06 and umdist04 to umdist03
During the first migration, UM worker nodes were used to drive the migration. However, these nodes were observed to run at only about 30MB/s input data rate, which may well have throttled how quickly the migration completed.
For this migration, we will run the scripts only on machines with 10Gb NICs.
Details
- Total occupied space in Lustre is 95TB (68TB free, 171TB total)
- umdist03 capacity is 53.4TB, 31% of the total, so plan on moving 31% of 95TB, or 29TB
- Choose all files within Lustre that are stored on the existing OSTs and whose names match eight patterns: the six above plus two additional patterns noted during preparation for this move. These generally correspond to large data files.
- Because of difficulties encountered relating to files from the crash of backups, etc., a few months back, exclude:
- Any file with the pattern umt3/data17* in its name
- 442886 files match the above criteria, occupying 44TB of disk
- 29TB is 66% of 44TB, so move 66% of the files, roughly 292k
- Split the selection 4 ways for use on the 10Gb-NIC machines noted above (see the sketch after this list)
- Moving 292k files, 74172 per machine
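The selection for this pass follows the same recipe as the first migration; the main differences are the two extra patterns, the data17 exclusion, and the 4-way split. A minimal sketch of the exclusion and split steps, again with hypothetical file names:

# Drop the problem files left over from the backup crash, then take ~66% at random
grep -v 'umt3/data17' candidates.txt > candidates_clean.txt
nsel=$(( $(wc -l < candidates_clean.txt) * 66 / 100 ))
shuf candidates_clean.txt | head -n "$nsel" > selected.txt
# One work list per 10Gb-NIC migration host
split -d -l $(( (nsel + 3) / 4 )) selected.txt movelist_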
[root@c-3-16 lustre]# ./total_size_from_list.sh fs04_outfin.txt
151801 files OK to move, 1775 files dropped from list
Size occupied is 17896829107509 Bytes
Size occupied is 17067746 Mega-Bytes
Size occupied is 16667 Giga-Bytes
Size occupied is 16 Tera-Bytes
[root@c-3-16 lustre]# time ./total_size_from_list.sh fs05_outfin.txt
120506 files OK to move, 1494 files dropped from list
Size occupied is 13118269525674 Bytes
Size occupied is 12510556 Mega-Bytes
Size occupied is 12217 Giga-Bytes
Size occupied is 11 Tera-Bytes
real 15m35.470s
user 0m45.006s
sys 5m5.461s
[root@c-3-16 lustre]# time ./total_size_from_list.sh fs06_outfin.txt
170579 files OK to move, 1892 files dropped from list
Size occupied is 19358442940449 Bytes
Size occupied is 18461649 Mega-Bytes
Size occupied is 18028 Giga-Bytes
Size occupied is 17 Tera-Bytes
real 32m59.336s
user 1m4.954s
sys 8m24.519s
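The "OK to move" counts above sum to 442886 (151801 + 120506 + 170579), matching the candidate-file total noted earlier. For reference, a rough sketch of what a script like total_size_from_list.sh might do to produce that output; this is a hypothetical reconstruction, not the actual script (which is in SVN). The conversions use binary (1024-based) units, which is why roughly 17.9e12 bytes reports as 16 Tera-Bytes.

#!/bin/bash
# Hypothetical reconstruction of total_size_from_list.sh (real script in SVN).
# Sums the size of every file named in the given list, dropping entries
# that no longer exist on disk.
list=$1
total=0; ok=0; dropped=0
while read -r f; do
    if [ -f "$f" ]; then
        total=$(( total + $(stat -c %s "$f") ))
        ok=$(( ok + 1 ))
    else
        dropped=$(( dropped + 1 ))
    fi
done < "$list"
echo "$ok files OK to move, $dropped files dropped from list"
echo "Size occupied is $total Bytes"
echo "Size occupied is $(( total / 1024 / 1024 )) Mega-Bytes"
echo "Size occupied is $(( total / 1024 / 1024 / 1024 )) Giga-Bytes"
echo "Size occupied is $(( total / 1024 / 1024 / 1024 / 1024 )) Tera-Bytes"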
--
BobBall - 29 Aug 2010