Idbool

From Digitalis

Current revision as of 13:43, 31 August 2016

Overview

Idbool is a CC-NUMA system of 192 cores using the Numascale Numaconnect interconnect.

Technically, the machine is composed of 4 chassis/motherboards, each equipped with 3 AMD Opteron(tm) 6376 processors (Abu Dhabi, 16 cores each), for a total of 4 x 3 x 16 = 192 cores. The motherboards are interconnected by the Numaconnect interconnect in a torus configuration with double links.

The Numaconnect interconnect turns these motherboards into a single hardware SMP machine running one Single System Image (SSI) OS, with a single cache-coherent memory space. As a result, the whole system appears as a single Linux system.

The system currently runs Ubuntu 14.04 LTS.

Technical documentation and other resources

For questions related to the performance achievable on this machine, please look at:

Installation notes

Instructions for installing the system with Ubuntu 14.04 LTS:

  • Alter /etc/sysctl.conf and the kernel parameters (settings taken from the Numascale Wiki: https://wiki.numascale.com/tips/os-tips)
  • Apply the patch from http://askubuntu.com/questions/468466/why-this-occurs-error-diskfilter-writes-are-not-supported, needed because of the software RAID
  • Remove irqbalance, as suggested by Numascale
  • Disable SELinux and AppArmor in /etc/default/grub, then run update-grub. The AppArmor startup script was also disabled
  • Blacklist the edac drivers, because they caused an error during boot, seen in dmesg ( /etc/modprobe.d/blacklist.conf )
    • This is not recommended by Numascale, so the blacklisting was reverted. The traces can be considered warnings
    • They are due to scalability issues in the kernel, which should be fixed by the kernel provided by Numascale
  • Install the linux-image-3.15.10-numascale17+_3.15.10-numascale17+-2_amd64.deb package:
    • It works perfectly and scales quite well, but swap support is not built into this kernel, so no swap space is usable. Swapping on a NUMA system makes little sense anyway, because it slows things down even more than on a conventional system

How to experiment

Reserving and accessing idbool

By default, OAR only gives access to 48 of the 192 cores of the machine (1 of its 4 motherboards), as you can check once logged in on the machine:
[pneyron@digitalis ~]$ oarsub -I -p "machine='idbool'"
Properties: machine='idbool'
[ADMISSION RULE] Modify resource description with type constraints
Import job key from file: /home/pneyron/.ssh/id_rsa
OAR_JOB_ID=8348
Interactive mode : waiting...
Starting...

Connect to OAR job 8348 via the node idbool.grenoble.grid5000.fr
[OAR] OAR_JOB_ID=8348
[OAR] Your nodes are:
      idbool-1.grenoble.grid5000.fr*48

[pneyron@idbool ~](8348-->60mn)$ 

Then check which CPUs and memory nodes the cpuset of the job allows:

[pneyron@idbool ~](8348-->57mn)$ cat /dev/cpuset/$(grep -o "/oar/.*" /proc/self/cgroup)/cpus
0-47
[pneyron@idbool ~](8348-->57mn)$ cat /dev/cpuset/$(grep -o "/oar/.*" /proc/self/cgroup)/mems
0-5

This job only gives access to the resources of the first host (motherboard) of the machine: logical CPUs (cores) 0 to 47 and NUMA nodes 0 to 5. The other resources of the machine are visible (e.g. in top) but cannot be used, because the Linux cpuset of your job isolates them.
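To check a reservation quickly, the CPU list printed above can be turned into a core count. The following is a minimal sketch; the function name count_cpus is hypothetical and not part of OAR or of the machine's tooling. It only assumes the usual cpuset list syntax (comma-separated values and ranges, such as 0-47).

```shell
# Count the logical CPUs named by a cpuset-style list such as "0-47" or "0-3,8,10-11".
# count_cpus is a hypothetical helper name, introduced here for illustration only.
count_cpus() {
    local list="$1" total=0 part lo hi
    local IFS=','
    for part in $list; do
        lo=${part%-*}    # text before the dash (the whole value if there is no dash)
        hi=${part#*-}    # text after the dash (the whole value if there is no dash)
        total=$(( total + hi - lo + 1 ))
    done
    echo "$total"
}

# Possible use inside a job, reusing the cpuset path shown above:
#   count_cpus "$(cat /dev/cpuset/$(grep -o "/oar/.*" /proc/self/cgroup)/cpus)"
```

For the default job shown above, the list 0-47 corresponds to the 48 cores of one motherboard.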


To reserve the entire machine, you must add -l machine=1 to your oarsub command.

Furthermore, we request a 4-hour job in the example below:

[pneyron@digitalis ~]$ oarsub -I -p "machine='idbool'" -l machine=1,walltime=4
Properties: machine='idbool'
[ADMISSION RULE] Modify resource description with type constraints
Import job key from file: /home/pneyron/.ssh/id_rsa
OAR_JOB_ID=8349
Interactive mode : waiting...
Starting...

Connect to OAR job 8349 via the node idbool.grenoble.grid5000.fr
[OAR] OAR_JOB_ID=8349
[OAR] Your nodes are:
      idbool-1.grenoble.grid5000.fr*48
      idbool-2.grenoble.grid5000.fr*48
      idbool-3.grenoble.grid5000.fr*48
      idbool-4.grenoble.grid5000.fr*48

[pneyron@idbool ~](8349-->239mn)$
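If you reserve the whole machine often, the command line above can be generated by a small helper. This is only a sketch: the function name idbool_full_cmd is hypothetical, it prints the command instead of running it, and it uses nothing beyond the oarsub options already shown on this page. The 1-hour default is an assumption based on the 60mn shown in the prompt of the default job.

```shell
# Print the oarsub command that reserves the entire idbool machine.
# idbool_full_cmd is a hypothetical helper name; the argument is the walltime in hours.
# Assumption: default to 1 hour, matching the 60mn of the default job shown earlier.
idbool_full_cmd() {
    local hours="${1:-1}"
    printf '%s\n' "oarsub -I -p \"machine='idbool'\" -l machine=1,walltime=${hours}"
}

# Example: print the command for a 4-hour job, then paste it or pass it to eval.
#   idbool_full_cmd 4
```

Printing rather than executing keeps the helper harmless when sourced, and lets you review the reservation before submitting it.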

Privileged commands

Currently, the following commands can be run via sudo in exclusive jobs:

  • sudo /usr/bin/whoami (provided for testing the mechanism, should return "root")
  • sudo /usr/bin/schedtool
  • sudo /usr/bin/opcontrol
  • sudo /usr/bin/perf
  • sudo /usr/bin/lstopo
  • sudo /usr/local/bin/likwid-perfctr

Commands in the following directories can also be run via sudo. For now this does not require an exclusive job, so please mind what you are doing with regard to other concurrent users:

  • /opt/nc-utils/os/nc_test/nc_perf/
  • /opt/nc-utils/os/nc_test/nc_log_d/
  • /opt/nc-utils/os/nc_test/nc_stat/
  • /opt/nc-utils/os/nc_test/nc_stat_d/
  • /opt/nc-utils/os/numaplace/
  • /opt/nc-utils/tools/

Keep in mind that these commands might have side effects, so be careful and please inform the other users via the mailing list if disturbances might occur.

(For information on those commands, see: https://github.com/numascale/nc-utils, and https://github.com/numascale/nc-utils/blob/master/os/doc/NC_PERF_USER_GUIDE.pdf)

Performance

To get good performance when using the whole machine (see the "machine=1" case above), special care must be taken with the placement of data in memory relative to the CPUs that access it. The NUMA factor between NUMA nodes located on different motherboards is very high: bandwidth may be as low as 90 MB/s when a CPU accesses the memory of a remote NUMA node. Numascale strongly advises reading https://resources.numascale.com/numascale-scaling-best-practice.pdf.
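For OpenMP programs, one common way to keep threads close to their data is to pin the threads and rely on the default first-touch policy of Linux, which places a memory page on the NUMA node of the thread that first writes to it. The sketch below uses the standard OpenMP 4.0 environment variables OMP_PLACES and OMP_PROC_BIND. The thread count and the binary name are assumptions for illustration, and this is not taken from the Numascale guide.

```shell
# Sketch: pin OpenMP threads to cores so that each thread stays next to
# the memory pages it initialized (first-touch placement).
export OMP_NUM_THREADS=192   # assumption: one thread per core of the whole machine
export OMP_PLACES=cores      # one place per core
export OMP_PROC_BIND=close   # bind threads to places near the parent thread, no migration

# Then run the program inside the job (hypothetical binary name):
#   ./my_openmp_app
# Alternative for non-OpenMP programs, assuming numactl is installed on the machine:
#   numactl --cpunodebind=0 --membind=0 ./my_app
```

Pinning only helps if the program also initializes its arrays in parallel, with the same loop distribution as the compute loops. Otherwise all pages end up on the node of the initializing thread.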

Use recent gcc installation

gcc 5.1 has been compiled with the Numascale patch applied and is installed in /opt/gcc-5.1.0+local-stack. This patch improves the performance of libgomp (used by OpenMP programs).

To use this compiler, you have to select it explicitly by setting the following environment variables:

export PATH=/opt/gcc-5.1.0+local-stack/bin:$PATH
export LIBRARY_PATH=/opt/gcc-5.1.0+local-stack/lib64:$LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/gcc-5.1.0+local-stack/lib64:$LD_LIBRARY_PATH

You can then call it with: gcc-5.1numa hello_world.c -o hello_world
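The setup above can be wrapped in a function to source from your shell profile. This is a minimal sketch: the names GCC_NUMA_PREFIX and use_numa_gcc are hypothetical. Unlike the plain exports, it avoids leaving a trailing colon when LIBRARY_PATH or LD_LIBRARY_PATH is initially empty (an empty entry in these variables is interpreted as the current directory).

```shell
# Installation prefix taken from this page; the variable name is hypothetical.
GCC_NUMA_PREFIX=/opt/gcc-5.1.0+local-stack

# use_numa_gcc (hypothetical name): put the patched gcc first in the search paths.
use_numa_gcc() {
    export PATH="$GCC_NUMA_PREFIX/bin:$PATH"
    # ${VAR:+:$VAR} appends ":$VAR" only when VAR is set and non-empty.
    export LIBRARY_PATH="$GCC_NUMA_PREFIX/lib64${LIBRARY_PATH:+:$LIBRARY_PATH}"
    export LD_LIBRARY_PATH="$GCC_NUMA_PREFIX/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Usage on idbool, with the standard GCC flag -fopenmp to enable OpenMP:
#   use_numa_gcc
#   gcc-5.1numa -fopenmp hello_world.c -o hello_world
```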

Acknowledgment

The idbool machine was funded by a Grenoble INP project, led by the Mescal, Moais and Erods teams of LIG.
