This is now obsolete: thor01 now runs FreeBSD. See SunX4540ConfigFreeBSD.
Zpool configuration
Destroy the existing pools (pool1 through pool4), one command per pool:
zpool destroy pool1
The plan is to use 4 raidz2 pools of 12 disks each. Each pool takes 2 disks from each of the 6 controllers, so we can suffer the loss of one controller (2 disks per pool, which raidz2 tolerates) and still keep all pools alive.
So, the final commands are:
zpool create pool1 raidz2 c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0 c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0
zpool create pool2 raidz2 c1t2d0 c2t2d0 c3t2d0 c4t2d0 c5t2d0 c6t2d0 c1t3d0 c2t3d0 c3t3d0 c4t3d0 c5t3d0 c6t3d0
zpool create pool3 raidz2 c1t4d0 c2t4d0 c3t4d0 c4t4d0 c5t4d0 c6t4d0 c1t5d0 c2t5d0 c3t5d0 c4t5d0 c5t5d0 c6t5d0
zpool create pool4 raidz2 c1t6d0 c2t6d0 c3t6d0 c4t6d0 c5t6d0 c6t6d0 c1t7d0 c2t7d0 c3t7d0 c4t7d0 c5t7d0 c6t7d0
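The four commands above follow a regular pattern: pool N takes targets 2(N-1) and 2(N-1)+1 on each of the six controllers c1 through c6. As a sanity check, a small dry-run sketch like the following prints the same four commands without running them. The function name gen_zpool_cmds is made up for illustration.

```shell
#!/bin/bash
# Dry run only: prints the zpool create commands, does not execute them.
# gen_zpool_cmds is a hypothetical helper name, not part of ZFS.
gen_zpool_cmds() {
    for pool in 1 2 3 4; do
        disks=""
        # Each pool uses two consecutive target numbers...
        for t in $(( (pool - 1) * 2 )) $(( (pool - 1) * 2 + 1 )); do
            # ...on every one of the six controllers.
            for c in 1 2 3 4 5 6; do
                disks="$disks c${c}t${t}d0"
            done
        done
        echo "zpool create pool${pool} raidz2${disks}"
    done
}

gen_zpool_cmds
```

Comparing the output against the hand-typed commands is a cheap way to catch a mistyped disk name before creating a pool.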
HOWEVER... we have to do this in a certain order, so that we first have a place to move /var to. That frees up c1t0d0 and c2t0d0, which currently form a mirrored disk (metadisk) for /var and root (I already moved root to the flash drive). As you might guess, it's as easy as starting with the second pool instead of the first, then moving /var and making the remaining pools.
Go back to SunX4540SolarisOnCF to read about moving /var.
Before making pools on the disks that were formerly used for the system mirrors, I cleared the old metadb information (forcing is required to delete the last copy of the db):
metaclear -a
metadb -d /dev/dsk/c1t0d0s7
metadb -f -d /dev/dsk/c2t0d0s7
Then I relabeled the disks using the "format" command. This may not be strictly necessary, but I wanted them consistent with the other disks in zpools. The first time, you will have to choose "auto-configure"; after that you will have the option of choosing the HITACHI type on the next run.
Specify disk (enter its number): 1
selecting c1t0d0
[disk formatted]
format> type
AVAILABLE DRIVE TYPES:
0. Auto configure
1. ATA-HITACHIHUA7210S-A90A
2. other
Specify disk type (enter its number)[1]:
That's it, now make pools according to the commands above. We'll make filesystems under each pool for dcache and remove the top-level mount (optional, can leave it mounted).
zfs unmount /pool1
.... etc
zfs create -o mountpoint=/dcache pool1/dcache
zfs create -o mountpoint=/dcache1 pool2/dcache
zfs create -o mountpoint=/dcache2 pool3/dcache
zfs create -o mountpoint=/dcache3 pool4/dcache
We should set some quotas and reservations too:
zfs set reservation=10G pool1/var
zfs set quota=10G pool1/var
zfs set reservation=8.85T pool1/dcache
zfs set quota=8.85T pool1/dcache
zfs set reservation=8.86T pool2/dcache
zfs set quota=8.86T pool2/dcache
etc...for pool3/pool4. This leaves about 5GB to play with in each pool.
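Since each dataset gets the same value for its reservation and its quota, a tiny helper that takes the dataset name once avoids mismatched pairs. This is only a sketch that prints the commands rather than running them; set_fixed_size is a made-up name, and only the sizes listed above are used.

```shell
#!/bin/bash
# Print (not run) a matching reservation/quota pair for one dataset.
# set_fixed_size is a hypothetical helper name.
set_fixed_size() {
    dataset=$1
    size=$2
    echo "zfs set reservation=${size} ${dataset}"
    echo "zfs set quota=${size} ${dataset}"
}

set_fixed_size pool1/var 10G
set_fixed_size pool1/dcache 8.85T
set_fixed_size pool2/dcache 8.86T
```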
Installing dCache
This is pretty easy. dCache has a Solaris package, and the included install.sh handles Solaris setup fine. So get it and do: pkgadd -d name.pkg
Then copy files from existing dCache hosts. Cfengine will do this and I maintain copies in /var/cfengine/config-dist/dcache.
Please see InstallDcache for more info.
Note for ccc.py
Must alter "ls" as follows or the ccc.py script fails here. The wrapper replaces /usr/bin/ls, with the original binary kept as /usr/bin/ls.orig:
root@thor01 # cat /usr/bin/ls
#!/bin/bash
# Translate --full-time to -E on Solaris
NEWOPTS=${@/'--full-time'/'-E'}
# Issue the old 'ls' with the translated options
/usr/bin/ls.orig $NEWOPTS
Then 'chmod a+x /usr/bin/ls'
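The wrapper above passes $NEWOPTS unquoted, so a file name containing spaces gets split into several arguments. A possible variant that keeps each argument intact is sketched below. run_translated is a made-up function name, and this version has not been tried with ccc.py.

```shell
#!/bin/bash
# Run a command with every "--full-time" argument replaced by "-E",
# keeping all other arguments exactly as given (spaces included).
# run_translated is a hypothetical name used for this sketch.
run_translated() {
    cmd=$1
    shift
    n=$#
    # Append the translated arguments after the originals...
    for arg in "$@"; do
        case $arg in
            --full-time) set -- "$@" -E ;;
            *)           set -- "$@" "$arg" ;;
        esac
    done
    # ...then drop the originals, leaving only the translated list.
    shift "$n"
    "$cmd" "$@"
}

# Only hand off to the real ls where the renamed binary exists.
if [ -x /usr/bin/ls.orig ]; then
    run_translated /usr/bin/ls.orig "$@"
fi
```

The function takes the command to run as its first argument so that the translation can be tried with a harmless command such as printf before it is put in place of ls.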
The latest Recommended patchset from Sun was applied August 25 2009. The latest J2SE recommended patchset was also applied.
To install packages from sunfreeware.com (OSS packages) you can use /opt/csw/bin/pkg-get. This should be in the default path. Config at /opt/csw/etc/pkg-get.conf.
pkg-get install wget
Syslog-ng:
http://opensystems.wordpress.com/2006/06/01/replacing-syslog-on-solaris-10-with-syslog-ng/
http://www.campin.net/download/NCsysng-1.6.7-1.pkg.gz
NIS is configured with "ypinit -c". Set um-atlas-grid in /etc/defaultdomain first. Do not use IP addresses for the yp servers or it will not work; you must use hostnames, and those NIS server hostnames absolutely must be set in /etc/hosts.
The OpenAFS tarball from openafs.org is extracted as /opt/openafs and the bin dir is in the path (follow the Solaris instructions in the OpenAFS manual). /etc/pam.conf was modified to use pam_krb5.so.1, and the pam-afs-session module was installed:
http://www.eyrie.org/~eagle/software/pam-afs-session/
pam.conf lines are (following pam_unix.so line):
other auth sufficient pam_krb5.so.1
other session required /usr/lib/security/pam_afs_session.so
Modifications to /etc/profile:
PATH=${PATH}:/opt/csw/bin:/opt/openafs/dest/bin
stty erase \^h #for backspace
stty erase \^? #for delete
PS1="\u@\h \w\\$ "
Final /etc/vfstab with disabled swap, flash root, and ramdisk based /tmp:
#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
#
fd - /dev/fd fd - no -
/proc - /proc proc - no -
/dev/dsk/c0d0s0 /dev/rdsk/c0d0s0 / ufs 1 no noatime
/devices - /devices devfs - no -
ctfs - /system/contract ctfs - no -
objfs - /system/object objfs - no -
dcache1/var - /var zfs - no -
swap - /tmp tmpfs - yes -
swap - /var/tmp tmpfs - yes -
sharefs - /etc/dfs/sharetab sharefs - no -
Cfengine 2.2.10 is installed from source. Our policy is modified to work. Compiling it successfully requires:
- Install gcc3 (I used sunfreeware.com pkg-get to do this)
- Set compiler as gcc: export CC=/opt/csw/gcc3/bin/gcc
- (maybe, not sure): export LDFLAGS=-L/usr/local/BerkeleyDB.4.2/lib
- Configure: ./configure --with-berkeleydb=/usr/local/BerkeleyDB.4.2 --prefix /usr --with-openssl=/usr/local/ssl (there might be a different usable location installed)
- Edit Makefile, src/Makefile, pub/Makefile and change occurrences of -pthread to -pthreads (same option, different spelling in gcc for solaris)
- ln -s /usr/sbin/cf* /var/cfengine/bin/
- Grab cfexecd and cfservd from cfengine config-dist and put in /etc/init.d.
- cfengine will make the runlevel startup/shutdown links on its first run (and may also be doing the previous two steps by the time anyone reads this)
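The -pthread to -pthreads edit in the list above can be done with sed instead of by hand. The sketch below is one way to do it. fix_pthread_flag and patch_makefiles are made-up names, and the temporary file is there because the stock Solaris 10 sed has no in-place option.

```shell
#!/bin/bash
# Rewrite -pthread as -pthreads on stdin. Existing -pthreads are
# normalized first so running the filter twice is harmless.
fix_pthread_flag() {
    sed -e 's/-pthreads/-pthread/g' -e 's/-pthread/-pthreads/g'
}

# Apply the filter to the three Makefiles named in the list above.
# Not called here; run it from the top of the cfengine source tree.
patch_makefiles() {
    for f in Makefile src/Makefile pub/Makefile; do
        fix_pthread_flag < "$f" > "$f.new" && mv "$f.new" "$f"
    done
}

echo 'CFLAGS = -g -O2 -pthread' | fix_pthread_flag
```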
Myricom drivers from myri.com. Used the package downloaded from:
http://www.myri.com/scs/download-Myri10GE.html
Then replaced the kernel module /kernel/drv/amd64/myri10ge with the gldv3 version here:
http://www.myri.com/ftp/pub/Myri10GE/gldv3/binary/
Edited /kernel/drv/myri10ge.conf to set mtu=1500 and enable LRO.
Solaris "infers" network information from /etc/inet/networks, /etc/hostname.interfacename, and /etc/hosts. Make sure you have your hostname/networks/hostip where they belong.
ifconfig -a to see interfaces.
Setting up network failover (ipmp, aggregation):
Switchports as with other systems. This one is opposite because the primary 10G port is on nile. We'll pretend that nile isn't 800 times more reliable than a pc6248 (redundancy is good anyways if we have to unplug or move). Also, the x4540 has 4 network interfaces built in so we are going to aggregate our backup connection on sw2-unit1.
We end up with a port channel using 4 ports with LACP enabled on both ends.
The switch configuration is basically like the others. Untagged 4001 for speed, tagged 4010. It's achieved by different means on Dell and Cisco. I can't find a Cisco equivalent to "mode general" but setting a native vlan lets you have one that is untagged on the port while still carrying others tagged.
Switch config for the port-channel. Hashing mode 6 means "Source/Destination IP and Source/Destination TCP/IP Port":
sw2#show running-config interface port-channel 1
description 'thor01 aggr1 L3,L4 vlan 4001 4010'
hashing-mode 6
switchport mode general
switchport general pvid 4001
no switchport general acceptable-frame-type tagged-only
switchport general ingress-filtering disable
switchport general allowed vlan add 4001
switchport general allowed vlan add 4010 tagged
vlan priority 1
Port config for 4 ports in port channel (not including unused switchport lines, just set any mode or none):
sw2#show running-config interface ethernet 1/g40
negotiation 1000f
channel-group 1 mode auto
mdix on
lacp timeout short
description 'thor01 nge0 vlan bonded'
spanning-tree portfast
mtu 9216
lldp transmit-tlv port-desc sys-name sys-desc sys-cap
lldp transmit-mgmt
lldp notification
lldp med transmit-tlv location
lldp med transmit-tlv inventory
Port on nile:
interface TenGigabitEthernet8/4
description "thor01 myri10ge0 vlan bonded 4001,4010"
switchport
switchport trunk encapsulation dot1q
switchport trunk native vlan 4001
switchport trunk allowed vlan 4001,4010
switchport mode trunk
mtu 9216
hold-queue 4096 in
hold-queue 4096 out
NOTE: Updated 01 Dec 2010. The THOR01 Myricom 10GE was moved to SW9-1/XG11 from Nile Te8/4. The new configuration required there was:
description 'thor01 aggr1 L3,L4 vlan 4001 4010'
mtu 9216
switchport mode general
switchport general pvid 4001
switchport general ingress-filtering disable
switchport general allowed vlan add 4001
switchport general allowed vlan add 4010 tagged
lldp transmit-tlv port-desc sys-name sys-desc sys-cap
lldp transmit-mgmt
lldp notification
lldp med transmit-tlv location
lldp med transmit-tlv inventory
vlan priority 1
Network configs on thor01
Sun IP multipathing requires a dedicated probe address on the interface. The probe address never migrates to another interface, so the system can see when the "down" interface comes back to life. It is also possible to omit the probe IP and use link-based failover. The probe address must be set to "deprecated" so it is not used as a source address for normal traffic, and "-failover" so it doesn't move to another interface. IP addresses fail over within a group, and each IP address should have its own group.
Some prerequisites:
- Must set up hostnames in /etc/hosts. In particular, NIS server hostnames must be there if using NIS, and the system hostnames used in the iface config files must always be there
- Must put netmasks in /etc/netmasks
- Must set gateway in /etc/defaultrouter
/etc/hosts:
127.0.0.1 localhost
::1 localhost
192.41.230.181 thor01.aglt2.org thor01
10.10.1.181 thor01.local loghost
192.41.230.182 thor01-probe.aglt2.org
10.10.1.182 thor01-probe.local
141.211.43.103 linat03.grid.umich.edu linat03
141.211.43.102 linat02.grid.umich.edu linat02
141.211.43.104 linat04.grid.umich.edu linat04
141.211.43.109 atgrid.grid.umich.edu atgrid
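A hostname missing from /etc/hosts is an easy mistake here, so a quick check along the following lines can help before rebooting. It is only a sketch: missing_hosts is a made-up helper, and on Solaris 10 nawk would likely be needed in place of awk, since the stock /usr/bin/awk does not accept -v.

```shell
#!/bin/bash
# Print each given name that has no entry in the hosts file.
# missing_hosts is a hypothetical helper name.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then compare every field after the address.
        awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" || echo "$name"
    done
}

# Names taken from the interface config files on this page.
missing_hosts /etc/hosts thor01.aglt2.org thor01-probe.aglt2.org \
    thor01.local thor01-probe.local
```

No output means every name was found; any name printed still needs an /etc/hosts entry.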
/etc/defaultrouter:
192.41.230.1
/etc/netmasks (really only need first two):
192.41.230.0 255.255.255.0
10.10.0.0 255.255.254.0
141.211.43.96 255.255.255.224
141.211.96.0 255.255.252.0
192.84.86.0 255.255.255.0
141.211.100.0 255.255.255.0
141.211.101.0 255.255.255.0
Creating the aggregate link aggr1 and setting up LACP and hashing policy:
dladm create-aggr -d nge0 -d nge1 -d nge2 -d nge3 1
dladm modify-aggr -l active -P L3,L4 -T short 1
(LACP active, timeout short, hashing policy L3 = source/dest IP, L4 = source/dest TCP/IP port; the trailing 1 is the aggregation key)
root@thor01 # dladm show-aggr -L
key: 1 (0x0001) policy: L3,L4 address: 0:14:4f:f2:bb:8c (auto)
LACP mode: active LACP timer: short
device activity timeout aggregatable sync coll dist defaulted expired
nge0 active short yes yes yes yes no no
nge1 active short yes yes yes yes no no
nge2 active short yes yes yes yes no no
nge3 active short yes yes yes yes no no
VLANs are configured through the interface name used in the hostname file name.
Note that "hostname" below means the word "hostname" - it does NOT mean to replace with the system hostname.
(in /etc)
hostname.aggr1:
group bond1 failover standby up
hostname.aggr4010001:
group bond0 failover standby up
hostname.myri10ge0:
thor01.aglt2.org netmask + broadcast + group bond1 failover up \
addif thor01-probe.aglt2.org netmask + broadcast + deprecated -failover up
hostname.myri10ge4010000:
thor01.local netmask + broadcast + group bond0 failover up \
addif thor01-probe.local netmask + broadcast + deprecated -failover up
If this is the first setup and you are using ifconfig to do this at runtime, you have to plumb the interface first (otherwise the system does it at startup time based on the config files): ifconfig myri10ge4010000 plumb
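The odd-looking interface names above encode the VLAN: Solaris 10 names a VLAN interface as the driver name followed by VLAN ID * 1000 + device instance. A small sketch of that arithmetic, with vlan_ifname as a made-up helper name:

```shell
#!/bin/bash
# Build a Solaris 10 VLAN interface name: driver + (VLAN ID * 1000 + instance).
# vlan_ifname is a hypothetical helper name.
vlan_ifname() {
    driver=$1
    instance=$2
    vlan=$3
    echo "${driver}$(( vlan * 1000 + instance ))"
}

vlan_ifname aggr 1 4010       # VLAN 4010 on aggr1
vlan_ifname myri10ge 0 4010   # VLAN 4010 on myri10ge0
```

The two calls print aggr4010001 and myri10ge4010000, matching the hostname.* file names used for the tagged VLAN 4010.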
Security
Security? Can I call them when someone hacks me? Will they beat the guy up?
Services
Now that we have a network we need to disable a bunch of services with inetadm. In fact all inet services get shut off. So just do this for now:
svcadm disable inetd
This will persist through a reboot. Use "svcadm disable -t" for temporary disable.
Unfortunately, the day will come when you need inetd for something. When it does, disable all these services:
inetadm -d finger telnet stlisten stdiscover rpc_ticotsord xfs stfsloader ktkt_warn rlogin rquota cde-calendar-manager \
cde-ttdbserver rusers cde-spc smserver shell:default
Disable NFS related services:
svcadm disable nfs/client nfs/nlockmgr nfs/status nfs/cbd nfs/mapid
Disable XDMCP listening:
mkdir -p /etc/dt/config
cp /usr/dt/config/Xconfig /etc/dt/config
Uncomment line in Xconfig:
Dtlogin.requestPort: 0
Disable Xserver listening on tcp/6000:
cp /usr/dt/config/Xserver /etc/dt/config
Add to Xserver startup line:
-nolisten tcp
svcadm disable cde-login ; svcadm enable cde-login
Sendmail
Sendmail listens on port 25 by default. We can either disable it (no sending mail from the host) or set it to listen on localhost only. Find this line in /etc/mail/sendmail.cf and modify it as follows:
O DaemonPortOptions=Addr=127.0.0.1,Name=MTA-v4, Family=inet
Can also optionally set a relay host. I did not.
Firewall (/etc/ipf/ipf.conf)
Coming soon....see proto.ipf.conf.
--
BenMeekhof - 21 Aug 2009