Performance Modelling

As we move towards the exascale era, HPC systems are becoming increasingly complex, with, for example, co-processors (e.g. GPUs and Intel Xeon Phi), non-uniform memory access (NUMA) nodes and various network structures, alongside the trend towards increasing numbers of cores with lower speeds and less memory per core. This makes it challenging to predict the performance of real applications on new and forthcoming machines.

Similarly, scientific software is evolving in order to solve larger and more intricate problems, which involves exploring more scalable model formulations, numerical methods and algorithms. An example of this is Gung-Ho!, the joint Met Office and NERC project to develop the next-generation Unified Model weather and climate prediction system.

Therefore, we want to know how current and future models will perform on current and emerging systems, in terms of time-to-solution, resource efficiency and problem scalability. Performance modelling can be used as a tool to explore answers to these questions.

Here, a performance model encapsulates information about the software and hardware in order to predict the elapsed wall-clock run time for a given scenario.
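To make the idea concrete, the following is a minimal sketch of one possible form such a model could take, not the model used in the work described on this page. It predicts the time per timestep of a 2D grid code as a compute term plus a latency/bandwidth term for halo exchange. The function name, the parameters (t_cell, latency, bandwidth) and the assumption of a near-square process grid with four halo messages per step are all hypothetical.

```python
import math


def predict_timestep_time(nx, ny, ranks, t_cell, latency, bandwidth,
                          bytes_per_cell=8):
    """Predict wall-clock seconds per timestep (illustrative assumptions only).

    nx, ny     : global grid dimensions in cells
    ranks      : number of processes
    t_cell     : assumed compute time per cell per step (seconds)
    latency    : assumed per-message latency (seconds)
    bandwidth  : assumed network bandwidth (bytes per second)
    """
    # Choose the most nearly square px * py decomposition of `ranks`.
    px = int(math.sqrt(ranks))
    while ranks % px != 0:
        px -= 1
    py = ranks // px

    # Size of the largest local subdomain.
    local_nx = math.ceil(nx / px)
    local_ny = math.ceil(ny / py)

    # Compute cost: every local cell is updated once per step.
    compute = local_nx * local_ny * t_cell

    if ranks == 1:
        return compute  # no halo exchange on a single rank

    # Communication cost: four halo messages (one per edge), each one
    # cell deep, so 2 * (local_nx + local_ny) cells are sent in total.
    halo_bytes = 2 * (local_nx + local_ny) * bytes_per_cell
    comm = 4 * latency + halo_bytes / bandwidth
    return compute + comm
```

Calling this function over a range of values for ranks gives a predicted strong-scaling curve, which can then be compared against measured timings to see where the assumptions break down.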

Shallow water model

Annette Osprey has been developing a benchmark-driven performance modelling approach to evaluate deployment choices on multi-core architectures. A shallow-water model has been benchmarked on the Cray XE6 to provide inputs to a performance model designed to predict performance on different HPC architectures. It is hoped that this approach will provide an alternative to expensive and time-consuming trial-and-error testing to find the configuration of architectural parameters that gives the best performance. The model successfully predicts the strong scaling up to 16,000 cores and has been used to choose optimal rank-core affinities.
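The general idea of driving a model with benchmark measurements can be illustrated with a deliberately simple sketch. It is not the method from the paper listed below: it fits a two-parameter strong-scaling model, T(p) = serial + parallel / p, to benchmark timings by ordinary least squares and then extrapolates to a larger core count. The function names are hypothetical and the timings are synthetic values made up for the example.

```python
def fit_scaling_model(cores, times):
    """Least-squares fit of T(p) = serial + parallel / p.

    Returns (serial, parallel), both in seconds.
    """
    # The model is linear in x = 1/p, so simple linear regression applies.
    xs = [1.0 / p for p in cores]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times))
    parallel = sxt / sxx
    serial = mean_t - parallel * mean_x
    return serial, parallel


def predict_time(serial, parallel, p):
    """Predicted run time in seconds on p cores under the fitted model."""
    return serial + parallel / p


# Synthetic benchmark data (not measurements from any real machine).
bench_cores = [16, 32, 64, 128]
bench_times = [62.0, 32.0, 17.0, 9.5]

serial, parallel = fit_scaling_model(bench_cores, bench_times)
predicted_256 = predict_time(serial, parallel, 256)
```

A model this simple ignores communication and memory effects, which is why a realistic benchmark-driven approach would use more detailed inputs than whole-run timings.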

Paper: A. Osprey, G. D. Riley, M. Manjunathaiah, and B. N. Lawrence. "A Benchmark-Driven Modelling Approach for Evaluating Deployment Choices on a Multicore Architecture". Proceedings of the International Conference on Parallel & Distributed Processing Techniques & Applications. 2013. pdf

Poster: A. Osprey, G. D. Riley, B. N. Lawrence, and M. Manjunathaiah. "Benchmark-driven performance modelling for multi-core architectures". NCAS staff meeting, June 2012. pdf

Poster: A. Osprey, G. D. Riley, B. N. Lawrence, and M. Manjunathaiah. "Modelling the Performance of a Shallow Water Code". NCAS staff meeting, July 2010. pdf

Unified Model

We looked at the performance of a HadGEM3 configuration at UM 7.3 for N96 and N216 global problem sizes. At the time this work was carried out, there was concern that the UM New Dynamics would be limited in its ability to scale to very high-resolution global grids.

We explored:

  • How the computational and communication performance of different parts of the model scaled with core count and resolution.
  • How the time to complete a timestep varied over a 3-day run.
  • How the number of solver iterations required to converge changed with resolution.

This information was also used to outline an analytical application performance model for the UM configuration.
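As a rough illustration of how measurements like those listed above could feed an analytical model, the sketch below treats the time for one timestep as the sum of per-section compute and communication costs, plus a solver cost that is proportional to the number of iterations needed to converge. This structure, the section names and all numbers are assumptions made for the example and are not taken from the UM study.

```python
from dataclasses import dataclass


@dataclass
class SectionCost:
    """Assumed cost of one model section, in seconds."""
    compute: float
    comm: float

    @property
    def total(self):
        return self.compute + self.comm


def timestep_time(sections, solver_per_iteration, solver_iterations):
    """Predicted seconds per timestep.

    sections             : dict mapping section name to SectionCost,
                           each executed once per timestep
    solver_per_iteration : SectionCost of a single solver iteration
    solver_iterations    : iterations needed to converge in this timestep
    """
    fixed = sum(cost.total for cost in sections.values())
    solver = solver_iterations * solver_per_iteration.total
    return fixed + solver


# Hypothetical per-section costs at one core count and resolution.
example_sections = {
    "dynamics": SectionCost(compute=0.30, comm=0.05),
    "physics": SectionCost(compute=0.40, comm=0.01),
}
example_solver = SectionCost(compute=0.004, comm=0.002)

step_seconds = timestep_time(example_sections, example_solver,
                             solver_iterations=50)
```

Separating the solver term makes it possible to explore how a change in iteration count with resolution would affect the total, independently of how each section scales with core count.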

Poster: A. Osprey, L. Steenman-Clark, M. Manjunathaiah, and G. D. Riley. "Performance Modelling for Climate Models". NCAS staff meeting, July 2013. pdf
