...

  • No NVMe
  • No persistent memory
  • The server uses SCM only (tmpfs, about 90 GB in size)
  • Nodes are available through the following queues:

    Queue Name    Node Type  Max Nodes per Job          Max Duration  Max Jobs in Queue*  Charge Rate
                             (assoc'd cores)*                                             (per node-hour)
    skx-dev       SKX        4 nodes (192 cores)*       2 hrs         1*                  1 SU
    skx-normal    SKX        128 nodes (6,144 cores)*   48 hrs        25*                 1 SU
    skx-large**   SKX        868 nodes (41,664 cores)*  48 hrs        3*                  1 SU
  • Stampede2 SKX Compute Node Specifications

    Model: Intel Xeon Platinum 8160 ("Skylake")
    Total cores per SKX node: 48 cores on two sockets (24 cores/socket)
    Hardware threads per core: 2
    Hardware threads per node: 48 x 2 = 96
    Clock rate: 2.1GHz nominal (1.4-3.7GHz depending on instruction set and number of active cores)
    RAM: 192GB (2.67GHz) DDR4
    Cache: 32KB L1 data cache per core; 1MB L2 per core; 33MB L3 per socket. Each socket can cache up to 57MB (sum of L2 and L3 capacity).
    Local storage: 144GB /tmp partition on a 200GB SSD. Size of /tmp partition as of 14 Nov 2017.
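
The core, thread, and cache figures quoted above follow from a few multiplications. The short Python sketch below recomputes them from the per-socket and per-core values in the specification list and the node limits in the queue table. The constant and function names are illustrative only and are not part of any Stampede2 or DAOS tooling.

```python
# Figures copied from the SKX specification list and the queue table.
CORES_PER_SOCKET = 24
SOCKETS_PER_NODE = 2
THREADS_PER_CORE = 2
L2_MB_PER_CORE = 1
L3_MB_PER_SOCKET = 33

# Derived per-node and per-socket totals.
cores_per_node = CORES_PER_SOCKET * SOCKETS_PER_NODE
threads_per_node = cores_per_node * THREADS_PER_CORE
cache_mb_per_socket = CORES_PER_SOCKET * L2_MB_PER_CORE + L3_MB_PER_SOCKET

# Maximum nodes per job for each queue, from the table.
MAX_NODES = {"skx-dev": 4, "skx-normal": 128, "skx-large": 868}


def max_cores(queue: str) -> int:
    """Return the core count that corresponds to a queue's node limit."""
    return MAX_NODES[queue] * cores_per_node


if __name__ == "__main__":
    print(f"cores/node={cores_per_node}, threads/node={threads_per_node}, "
          f"L2+L3 per socket={cache_mb_per_socket} MB")
    for queue, nodes in MAX_NODES.items():
        print(f"{queue}: {nodes} nodes = {max_cores(queue):,} cores")
```

The results match the values listed: 48 cores and 96 hardware threads per node, 57MB of combined L2 and L3 per socket, and 192, 6,144, and 41,664 cores for the three queues.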
  • DAOS Server yaml
    # single server instance per config file for now
    servers:
    -
      targets: 16                    # Confirm the number of targets
      first_core: 0                  # offset of the first core for service xstreams
      nr_xs_helpers: 1               # count of offload/helper xstreams per target
      fabric_iface: ib0              # map to OFI_INTERFACE=ib0
      fabric_iface_port: 31416       # map to OFI_PORT=31416
      log_mask: ERR                  # map to D_LOG_MASK=ERR
      log_file: /tmp/daos_server.log # map to D_LOG_FILE=/tmp/daos_server.log
    
      # Environment variable values should be supplied without encapsulating quotes.
      env_vars:                 # influence DAOS IO Server behaviour by setting env variables
      - CRT_TIMEOUT=120
      - CRT_CREDIT_EP_CTX=0
      - PSM2_MULTI_EP=1
      - CRT_CTX_SHARE_ADDR=1
      - PMEMOBJ_CONF=prefault.at_open=1;prefault.at_create=1;  # Do we need this?
      - PMEM_IS_PMEM_FORCE=1                                   # Do we need this?
    
    
      # Storage definitions
    
      # When scm_class is set to ram, tmpfs will be used to emulate SCM.
      # The size of ram is specified by scm_size in GB units.
      scm_mount: /dev/shm   # map to -s /dev/shm
      scm_class: ram
      scm_size: 90
    
      # When scm_class is set to dcpm, scm_list is the list of device paths for
      # AppDirect pmem namespaces (currently only one per server supported).
      # scm_class: dcpm
      # scm_list: [/dev/pmem0]
    
      # If using NVMe SSD (will write /mnt/daos/daos_nvme.conf and start I/O
      # service with -n <path>)
      # bdev_class: nvme
      # bdev_list: ["0000:81:00.0"]  # generate regular nvme.conf
    
      # If emulating NVMe SSD with malloc devices
      # bdev_class: malloc  # map to VOS_BDEV_CLASS=MALLOC
      # bdev_size: 4                # malloc size of each device in GB.
      # bdev_number: 1              # generate nvme.conf as follows:
                  # [Malloc]
                  #   NumberOfLuns 1
                  #   LunSizeInMB 4000
    
      # If emulating NVMe SSD over kernel block device
      # bdev_class: kdev            # map to VOS_BDEV_CLASS=AIO
      # bdev_list: [/dev/sdc]       # generate nvme.conf as follows:
                  # [AIO]
                  #   AIO /dev/sdc AIO2
    
      # If emulating NVMe SSD with backend file
      # bdev_class: file            # map to VOS_BDEV_CLASS=AIO
      # bdev_size: 16           # file size in GB. Create the file if it does not exist.
      # bdev_list: [/tmp/daos-bdev] # generate nvme.conf as follows:
                  # [AIO]
                  #   AIO /tmp/aiofile AIO1 4096
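
Because the server relies on tmpfs alone, it may be worth checking that the configured scm_size actually fits on the node before starting it. The Python sketch below is a rough pre-flight check under two assumptions: the "GB" in scm_size is read as GiB (the stricter interpretation), and the free space reported for the mount point is a usable proxy for what tmpfs can hold. The helper names scm_fits and check_mount are hypothetical and are not part of DAOS.

```python
import shutil

# Assumption: "GB" in scm_size is treated as GiB, the stricter reading.
GIB = 1024 ** 3


def scm_fits(scm_size_gb: int, free_bytes: int) -> bool:
    """Return True if free_bytes is enough to hold scm_size_gb."""
    return scm_size_gb * GIB <= free_bytes


def check_mount(path: str, scm_size_gb: int) -> bool:
    """Compare scm_size_gb with the free space reported for path."""
    usage = shutil.disk_usage(path)
    return scm_fits(scm_size_gb, usage.free)


if __name__ == "__main__":
    # /dev/shm and 90 are the scm_mount and scm_size values from the yaml.
    if check_mount("/dev/shm", 90):
        print("scm_size fits in /dev/shm")
    else:
        print("scm_size exceeds the free space in /dev/shm")
```

Run on a compute node, this reports whether the 90 GB requested by scm_size is currently available under /dev/shm. It does not account for memory that other processes may claim later.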

    Environment variables (if any are set):



...