Idbool

From Digitalis


Revision as of 21:43, 6 February 2015


Overview

Idbool is a 192-core ccNUMA system built on the Numascale NumaConnect interconnect.

Technically, the machine is composed of 4 nodes, each equipped with 3 AMD Opteron(tm) 6376 processors (Abu Dhabi, 16 cores each). The nodes are connected to each other through the NumaConnect interconnect in a torus configuration with double links.

The NumaConnect interconnect provides a full hardware Single System Image (SSI): a single, cache-coherent memory space. As a result, the whole machine appears as one Linux system.
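
The core count follows directly from the topology described above, and because the machine is a single system image, ordinary Linux tools should see all cores at once. The following is a minimal sketch, assuming a POSIX shell and the coreutils `nproc` command; the function names are illustrative and not part of any Numascale tooling:

```shell
#!/bin/sh
# Topology figures taken from the overview above.
NODES=4
SOCKETS_PER_NODE=3
CORES_PER_SOCKET=16

# Print the number of cores the whole machine should expose.
expected_cores() {
    echo $((NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET))
}

# Compare the expected count with what the running system reports.
# nproc counts the processing units available to the current process,
# so inside a partial reservation it may legitimately report fewer.
check_cores() {
    seen=$(nproc)
    if [ "$seen" -eq "$(expected_cores)" ]; then
        echo "ok: $seen cores visible"
    else
        echo "mismatch: $seen cores visible, $(expected_cores) expected"
    fi
}
```

With the figures above, `expected_cores` prints 192, that is 48 cores per node. Calling `check_cores` from a full-machine session would be expected to print the `ok` line.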

The system currently runs Ubuntu 14.04 LTS.

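Technical documentation and other resources

  • https://resources.numascale.com/
  • https://wiki.numascale.com

For questions related to the performance achievable on this machine, please look at:

  • https://wiki.numascale.com/tips
  • https://resources.numascale.com/numascale-scaling-best-practice.pdf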

Installation notes

Instructions for installing the system with Ubuntu 14.04 LTS:

  • Alter /etc/sysctl.conf and the kernel parameters (taken from the Numascale wiki: https://wiki.numascale.com/tips/os-tips)
  • Apply the patch from http://askubuntu.com/questions/468466/why-this-occurs-error-diskfilter-writes-are-not-supported (needed because of the software RAID)
  • Remove irqbalance, as suggested by Numascale
  • Disable SELinux and AppArmor in /etc/default/grub, then run update-grub. The AppArmor startup script was also disabled
  • Blacklist the EDAC drivers in /etc/modprobe.d/blacklist.conf, because they caused an error during boot, visible in dmesg
    • Blacklisting is not recommended by Numascale, so it was reverted. The traces can be treated as warnings
    • The traces are due to a scalability issue in the kernel, which should be fixed by the Numascale-provided kernel
  • Install the linux-image-3.15.10-numascale17+_3.15.10-numascale17+-2_amd64.deb package:
    • Works perfectly and scales well, but swap support is not built into this kernel, so no swap space is usable. Swapping makes little sense on a Numascale system anyway, because it slows things down even more than on a normal system
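
The package-level steps of the list above can be sketched as a dry-run script. This is only an illustration under stated assumptions: the Numascale kernel .deb is assumed to be already present in the current directory, the edits to /etc/sysctl.conf and /etc/default/grub are left to be made by hand (their exact values are not reproduced here), and the `run` and `install_steps` helpers are hypothetical names. By default the script only prints the commands; setting DRY_RUN=0 and running as root would execute them.

```shell
#!/bin/sh
KERNEL_DEB='linux-image-3.15.10-numascale17+_3.15.10-numascale17+-2_amd64.deb'

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_steps() {
    # Reload kernel parameters after /etc/sysctl.conf has been edited by hand.
    run sysctl -p /etc/sysctl.conf
    # Remove irqbalance, as suggested by Numascale.
    run apt-get remove -y irqbalance
    # Install the Numascale-provided kernel package.
    run dpkg -i "$KERNEL_DEB"
    # Regenerate the GRUB configuration after /etc/default/grub has been edited.
    run update-grub
}
```

Calling `install_steps` with the default settings prints the four commands prefixed with `+`, which makes it easy to review them before running anything on the machine.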

Reserving and accessing idbool

By default, OAR allocates only 1 of the 4 nodes:

[pneyron@digitalis ~]$ oarsub -I -p "machine='idbool'"
Properties: machine='idbool'
[ADMISSION RULE] Modify resource description with type constraints
Import job key from file: /home/pneyron/.ssh/id_rsa
OAR_JOB_ID=8348
Interactive mode : waiting...
Starting...

Connect to OAR job 8348 via the node idbool.grenoble.grid5000.fr
[OAR] OAR_JOB_ID=8348
[OAR] Your nodes are:
      idbool-1.grenoble.grid5000.fr*48

[pneyron@idbool ~](8348-->60mn)$ 

To reserve the complete machine, one must specify -l machine=1:

[pneyron@digitalis ~]$ oarsub -I -p "machine='idbool'" -l machine=1
Properties: machine='idbool'
[ADMISSION RULE] Modify resource description with type constraints
Import job key from file: /home/pneyron/.ssh/id_rsa
OAR_JOB_ID=8349
Interactive mode : waiting...
Starting...

Connect to OAR job 8349 via the node idbool.grenoble.grid5000.fr
[OAR] OAR_JOB_ID=8349
[OAR] Your nodes are:
      idbool-1.grenoble.grid5000.fr*48
      idbool-2.grenoble.grid5000.fr*48
      idbool-3.grenoble.grid5000.fr*48
      idbool-4.grenoble.grid5000.fr*48

[pneyron@idbool ~](8349-->59mn)$
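
Inside a job, the allocation can be checked from the node file that OAR exports. The sketch below assumes that the `OAR_NODEFILE` environment variable points to a file listing one host name per reserved core, which is the usual OAR behaviour but has not been verified on idbool; the `summarize_nodes` helper is a hypothetical name.

```shell
#!/bin/sh
# Print "<host>*<cores>" for every host in an OAR node file,
# mirroring the "Your nodes are:" listing shown at job start.
summarize_nodes() {
    sort "$1" | uniq -c | awk '{ print $2 "*" $1 }'
}

# Typical use inside a job: summarize_nodes "$OAR_NODEFILE"
```

For a full-machine reservation, this would be expected to list the four idbool nodes with 48 cores each.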