Create Configuration

Please run the following commands. The first set captures some of the current configuration settings and saves them as environment variables.

export IFACE=$(curl --silent http://169.254.169.254/latest/meta-data/network/interfaces/macs/)
export VPC_ID=$(curl --silent http://169.254.169.254/latest/meta-data/network/interfaces/macs/${IFACE}/vpc-id)
export AWS_REGION=$(curl --silent http://169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/[a-z]$//')
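
The sed expression in the last command derives the region by dropping the trailing zone letter from the availability zone name. A minimal sketch of that step on a fixed sample value, so it runs without a metadata query; az_to_region is a hypothetical helper name and not part of the workshop:

```shell
# Hypothetical helper (not part of the workshop): derive the region
# from an availability zone name by removing the trailing zone letter.
az_to_region() {
    printf '%s\n' "$1" | sed 's/[a-z]$//'
}

# Sample value instead of a live metadata query.
az_to_region us-east-1a   # prints us-east-1
```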

Subnet

We will pick a subnet in a certain AZ to have more control over where the cluster is placed.

The AZ ID prefix use1 corresponds to us-east-1. Please check which region your event is running in.

# build the AZ ID prefix from the region, e.g. us-east-1 -> use1
export AZ="$(echo $AWS_REGION |cut -d\- -f 1)$(echo $AWS_REGION |cut -d\- -f 2|cut -c1)$(echo $AWS_REGION |cut -d\- -f 3)"
# pass both filters to a single --filters option so that both apply
export SUBNET_ID=$(aws ec2 describe-subnets --region=$AWS_REGION \
        --filters Name=availability-zone-id,Values=${AZ}-az1 \
                  Name=vpc-id,Values=${VPC_ID} \
            |jq -r '.Subnets[0].SubnetId')
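
The AZ prefix is built by abbreviating the region name: the first field, the first letter of the second field, and the third field. The sketch below repeats that rule as a function so it can be tried on sample region names; region_to_az_prefix is a hypothetical name. The rule fits regions such as us-east-1, but regions with a compound direction appear to use longer prefixes (for example apse2 for ap-southeast-2), so it is worth comparing the result against the ZoneId values listed by aws ec2 describe-availability-zones.

```shell
# Hypothetical helper: abbreviate a region name the same way as the
# one-liner above (first field, first letter of second field, third field).
region_to_az_prefix() {
    region="$1"
    part1=$(printf '%s\n' "$region" | cut -d- -f1)
    part2=$(printf '%s\n' "$region" | cut -d- -f2 | cut -c1)
    part3=$(printf '%s\n' "$region" | cut -d- -f3)
    printf '%s%s%s\n' "$part1" "$part2" "$part3"
}

region_to_az_prefix us-east-1   # prints use1
region_to_az_prefix eu-west-1   # prints euw1
```

It is also worth checking that SUBNET_ID does not contain the string null afterwards, which is what jq -r prints when no subnet matched.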

SSH Key

Next, generate a new SSH key pair.

# generate a new key-pair
export SSH_KEY=pc-sarus-key
aws ec2 create-key-pair --key-name ${SSH_KEY} --query KeyMaterial --output text --region=${AWS_REGION} > ~/.ssh/${SSH_KEY}
chmod 600 ~/.ssh/${SSH_KEY}
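
If create-key-pair fails, for instance because a key with that name already exists, the output redirect still leaves an empty file behind. A minimal sketch of a check for that case and for the file mode; key_file_ok is a hypothetical helper, and the demonstration uses a throwaway file rather than a real key:

```shell
# Hypothetical helper: succeed only if the key file is non-empty and
# readable/writable by its owner alone (mode 600).
key_file_ok() {
    [ -s "$1" ] || return 1
    mode=$(ls -l "$1" | cut -c1-10)
    [ "$mode" = "-rw-------" ]
}

# Demonstration on a throwaway file rather than a real key.
demo_key=$(mktemp)
echo "sample key material" > "$demo_key"
chmod 600 "$demo_key"
key_file_ok "$demo_key" && echo "key file looks fine"
rm -f "$demo_key"
```

For the key created above, the call would be key_file_ok ~/.ssh/${SSH_KEY}.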

Create Configuration

Finally, create a configuration for ParallelCluster.

Let us define some settings:

You can skip this setup step if the defaults are acceptable, as the config will fall back to them.
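
The fallback relies on the shell's ${VAR:-default} expansion, which the configuration template uses for the two instance type settings. A small sketch of how that expansion behaves; pick_instance is a hypothetical helper and c5.xlarge is only an illustrative value:

```shell
# Hypothetical helper: return the first argument, or the second one
# when the first is empty, using the ${VAR:-default} expansion.
pick_instance() {
    printf '%s\n' "${1:-$2}"
}

pick_instance "" g4dn.2xlarge          # prints g4dn.2xlarge
pick_instance c5.xlarge g4dn.2xlarge   # prints c5.xlarge
```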

Instance Types

Let’s define what instance type the pcluster headnode is going to be.

export HEADNODE_INSTANCE=g4dn.2xlarge

Similarly, for the compute nodes:

export COMPUTE_INSTANCE=c5n.18xlarge

Cluster Size

For the workshop we’ll set MIN==MAX; make sure to adjust this for production environments.

export CLUSTER_MIN=3
export CLUSTER_MAX=3

The cluster size can only be maintained when MIN!=0. Maintaining it prevents downscaling, which is useful for the workshop; make sure to adjust this for production.
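
The size constraints described here (MIN no greater than MAX, and a non-zero MIN when the size is maintained) can be checked before the values are used. A minimal sketch, assuming the third argument carries the maintain flag as the string true or false; validate_cluster_size is a hypothetical helper:

```shell
# Hypothetical helper: validate the size settings before they are used.
# $1 = minimum, $2 = maximum, $3 = "true" if the size should be maintained.
validate_cluster_size() {
    for n in "$1" "$2"; do
        case "$n" in
            ''|*[!0-9]*) echo "sizes must be non-negative integers"; return 1 ;;
        esac
    done
    if [ "$1" -gt "$2" ]; then
        echo "minimum ($1) exceeds maximum ($2)"
        return 1
    fi
    if [ "${3:-false}" = "true" ] && [ "$1" -eq 0 ]; then
        echo "cannot maintain the size with a minimum of 0"
        return 1
    fi
    echo "ok: min=$1 max=$2 maintain=${3:-false}"
}

validate_cluster_size 3 3 true   # prints ok: min=3 max=3 maintain=true
```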

export MAINTAIN_SIZE=true

Now write the cluster configuration file:

cd ~/environment
cat > cluster-config.conf << EOF
[aws]
aws_region_name = ${AWS_REGION}

[global]
cluster_template = default
update_check = true
sanity_check = false

[cluster default]
base_os = alinux2
scheduler = slurm
key_name = ${SSH_KEY}
vpc_settings = public
compute_root_volume_size = 100
queue_settings = c5n
scaling_settings = custom
master_instance_type=${HEADNODE_INSTANCE:-g4dn.2xlarge}
master_root_volume_size = 100
tags = {"Name" : "amzn2-pcluster"}
dcv_settings = hpc-dcv
ebs_settings = shared
fsx_settings = fsxshared
s3_read_resource = arn:aws:s3:::*
post_install=s3://${BUCKET_NAME}/post-install.sh

[queue c5n]
compute_resource_settings = c5n
compute_type = ondemand
disable_hyperthreading = true
enable_efa = true
placement_group = DYNAMIC

[compute_resource c5n]
instance_type = ${COMPUTE_INSTANCE:-c5n.18xlarge}
min_count = 0
max_count = 3
initial_count = 1

[ebs shared]
shared_dir = /shared
volume_type = gp2
volume_size = 250

[fsx fsxshared]
shared_dir = /fsx
storage_capacity = 1200
deployment_type = SCRATCH_2

[dcv hpc-dcv]
enable = master

[scaling custom]
scaledown_idletime = 120

[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}

[vpc public]
vpc_id = ${VPC_ID}
master_subnet_id = ${SUBNET_ID}
EOF
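
Because the heredoc delimiter is unquoted, the shell substitutes every ${...} reference when the file is written, and an unset variable silently becomes an empty string. A minimal sketch of a check that flags settings left empty; find_empty_settings is a hypothetical helper, and the demonstration runs on a throwaway file in which DEMO_EMPTY_VAR stands in for a variable that was never exported:

```shell
# Hypothetical helper: print "line-number:line" for every setting whose
# value is empty; succeeds only if no such setting exists.
find_empty_settings() {
    if grep -nE '^[A-Za-z_0-9]+[[:space:]]*=[[:space:]]*$' "$1"; then
        return 1
    fi
    return 0
}

# Demonstration on a throwaway file, not the real cluster config.
DEMO_EMPTY_VAR=""
demo_conf=$(mktemp)
cat > "$demo_conf" << EOF
[vpc public]
vpc_id = ${DEMO_EMPTY_VAR}
master_subnet_id = subnet-example
EOF
find_empty_settings "$demo_conf" || echo "fix the empty settings listed above"
rm -f "$demo_conf"
```

For the file written above, the call would be find_empty_settings ~/environment/cluster-config.conf. A value that is only partly built from a variable, such as the post_install URL, is not caught by this check.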