Writing your own code

Programming Resources

Introduction to various programming languages and libraries

Table of Contents: Python, R, Julia, Chapel, MPI, OpenMP, Debugging and profiling, MATLAB

Python

We give many Python-based workshops and webinars – you can find quite a few of them in the visualization section.

Click on each webinar for its recording and materials.

  Running parallel Ray workflows across multiple cluster nodes (2025-Feb-11)

Ray is a unified framework for scaling AI and general Python workflows. Outside of machine learning (ML), its core distributed runtime and data libraries can be used for writing parallel applications that launch multiple processes, both on the same node and across multiple cluster nodes. These processes can subsequently execute a variety of workloads, e.g. Numba-compiled functions, NumPy calculations, and even GPU-enabled codes.

In this webinar, we will focus on scaling Ray workflows to multiple HPC cluster nodes to speed up various (non-ML) numerical workflows. We will look at both a loosely coupled (embarrassingly parallel) problem and a tightly coupled parallel problem.
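
As a flavour of what this webinar covers, here is a minimal sketch of an embarrassingly parallel Ray workflow in Python. It is illustrative only: it assumes Ray is installed, and the partial_sum function and the chunk sizes are invented for this example.

    import ray

    ray.init()  # on a cluster with a running Ray head node, use ray.init(address="auto")

    @ray.remote
    def partial_sum(start, stop):
        # Illustrative CPU-bound task: partial sum of the series 1/i^2.
        return sum(1.0 / i**2 for i in range(start, stop))

    # Launch independent tasks; Ray schedules them across all available cores
    # and, when connected to a multi-node Ray cluster, across nodes.
    chunks = [(i * 100_000 + 1, (i + 1) * 100_000 + 1) for i in range(8)]
    futures = [partial_sum.remote(a, b) for a, b in chunks]
    print(sum(ray.get(futures)))  # approaches pi^2/6

The same pattern scales from a laptop to multiple cluster nodes without code changes; only the ray.init() call (or the job script that starts the Ray cluster) differs.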


  Working with multidimensional datasets in xarray (2020-Sep-30)

  Working with the Python DASK library (2019-Oct-16)



R

Click on each webinar for its recording and materials.

  Introduction to high-performance research computing in R (2023-Jan-31)

The programming language R is not known for its speed. However, with some code optimization, R can be used for relatively heavy computations. Additional speedup can be achieved through various parallel techniques, both with multi-threading and distributed computing. This workshop introduces you to working with R from the command line on the Alliance clusters with a focus on performance. We discuss code profiling and benchmarking, various packages for parallelization, as well as using C++ from inside R to speed up your calculations.

  • Speaker: Marie-Hélène Burle
  • Online slides (use ←/→ keys to navigate)



Julia

You can also browse some of our Julia programming materials here.

Click on each webinar for its recording and materials.

  High-level parallel stencil computations on CPUs and GPUs (2025-Jan-21)

In this webinar, we cover parallel stencil computations in Julia using the ParallelStencil.jl package. This package enables you to write high-level code for fast computations on CPUs and GPUs. These computations are common in all numerical simulations involving the solution of discretized partial differential equations (PDEs) on a grid. ParallelStencil.jl provides high-level functions for computing derivatives and updating arrays. You can execute the same code on a single CPU, on multiple CPUs with multithreading via Base.Threads, or on GPUs using CUDA.jl (NVIDIA GPUs), AMDGPU.jl (AMD GPUs), or Metal.jl (Apple Silicon GPUs).

Regardless of the underlying parallel hardware, all low-level communication between threads is hidden behind ParallelStencil.jl's macro calls and stays out of the simulation code, which makes the framework highly accessible to domain scientists.

Furthermore, you can extend this framework to multiple processes, integrating ParallelStencil.jl with ImplicitGlobalGrid.jl (built upon MPI.jl). This combination facilitates easy scaling to multiple cluster nodes, with further parallelization on multiple cores and GPUs on each node. This architecture has been shown to scale efficiently to hundreds of GPUs and hundreds of cluster nodes.


  Nextflow and Julia for scalable computation (2024-Nov-12)

Large-scale numerical experiments are central to much of contemporary scientific and mathematical research. Performing these numerical experiments in a valid, reproducible and scalable fashion is not easy. In this webinar I provide an introduction and pointers to two tools my research group uses to perform numerical experiments:

  • Nextflow: can be thought of as an "operating system" for coordinating numerical experiments.
  • Julia: a programming language to unlock full access to high-performance computation on both CPUs and GPUs.

  Julia at full tilt: profiling and optimizations (2024-Apr-30)

  ThreadsX.jl: easier multithreading in Julia (2022-Feb-02)

  Easier parallel Julia workflow with Dagger.jl (2021-Oct-27)

Designed specifically for HPC and inspired by the Python library Dask, Dagger is a distributed framework with a scheduler built on top of Distributed.jl for efficient parallel and out-of-core execution of tasks represented by a directed acyclic graph (DAG). Dagger supports computing with multiple threads, multiple processes, and on GPUs. Checkpoints are easy to create if you need to interrupt and resume computations. Finally, Dagger provides some debugging and runtime profiling tools.

  • Speaker: Marie-Hélène Burle
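
Since Dagger's task-graph model is directly inspired by Dask, readers more comfortable in Python may find a minimal Dask sketch of the same idea useful for comparison. It is illustrative only: it assumes the dask package is installed, and the load/process/combine functions are invented for this example.

    from dask import delayed

    @delayed
    def load(i):
        # Each call becomes a node in the task graph (a DAG).
        return list(range(i * 10, (i + 1) * 10))

    @delayed
    def process(chunk):
        return sum(chunk)

    @delayed
    def combine(parts):
        return sum(parts)

    # Building the graph is lazy; compute() hands the DAG to the scheduler,
    # which runs independent tasks in parallel.
    result = combine([process(load(i)) for i in range(4)])
    print(result.compute())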

  Parallel programming in Julia (2021-Mar-17)

In this webinar, we start with a quick review of Julia's multi-threading features but focus primarily on the Distributed standard library and its large array of tools. We show parallelization using three problems: a slowly converging series, a Julia set, and an N-body solver. We run the examples on a multi-core laptop and an HPC cluster.

  • Speakers: Alex Razoumov and Marie-Hélène Burle
  • PDF slides

  High-performance research computing with Julia (2020-Mar-04)



Chapel

Click on each webinar for its recording and materials.

  GPU computing with Chapel (2024-Oct-01)

Chapel is a parallel programming language for scientific computing designed to exploit parallelism across a wide range of hardware, from multi-core computers to large HPC clusters. Recently, Chapel introduced support for GPUs, allowing the same code to run seamlessly on both NVIDIA and AMD GPUs, without modification. In addition, for testing and development, Chapel offers a "CPU-as-device" mode, which lets you prototype GPU code on a regular computer without a dedicated GPU.

Programming GPUs in Chapel is significantly easier than using CUDA or ROCm/HIP, and more flexible than OpenACC, as you can run fairly generic Chapel code on GPUs. Naturally, you will benefit most from GPU acceleration with calculations that can be broken into many independent, identical pieces. In Chapel, data transfer to/from a GPU (and between GPUs) is straightforward, thanks to a well-defined coding model that associates both calculations and data with a clear concept of locality.

As of this writing, on the Alliance systems, you can run multi-locale (multiple nodes) GPU Chapel natively on Cedar, and single-locale GPU Chapel on all other clusters with NVIDIA cards via a container. Efforts are underway to expand native GPU support to more systems.

In this webinar, we guide you through Chapel's key GPU programming features with live demos.


  Working with data files and external C libraries in Chapel (2020-Mar-18)

  Working with distributed unstructured data in Chapel (2019-Apr-17)
  • Speaker: Alex Razoumov
  • ZIP file with slides and sample codes

  Intro to Parallel Programming in Chapel (3-part series, early 2018)

In this three-part online webinar series, we introduce the main concepts of the Chapel parallel programming language. Chapel is a relatively new language for both shared- and distributed-memory programming, with easy-to-use, high-level features that make it ideal for a novice HPC user learning parallel programming.

Unlike other high-level data-processing languages and workflows, Chapel is aimed primarily at numerical modelling and simulation codes, so this workshop is ideal for anyone who wants to learn how to write efficient large-scale numerical codes.

  • Speaker: Alex Razoumov

Part 1: Basic language features (2018-Feb-28)


Part 2: Task parallelism in Chapel (2018-Mar-07)

Part 3: Data parallelism in Chapel (2018-Mar-14)

HPC Carpentry Course

As part of their contribution to HPC Carpentry, WestGrid staff authored a Parallel programming in Chapel course. The materials and exercises in this course can be delivered as a full-day workshop. If you have questions about the materials, please contact Alex Razoumov at alex.razoumov@westgrid.ca.

MPI

Click on each webinar for its recording and materials.

  A Brief Introduction to the Boost MPI Library (2018-May-09)

OpenMP

Click on each webinar for its recording and materials.

  Intro to Parallel Programming for Shared Memory Machines (2019-Oct)

This online workshop explores how to use OpenMP to improve the speed of serial jobs on multi-core machines. We review how to add OpenMP constructs to a serial program in order to run it using multiple cores. Viewers are led through a series of hands-on, interactive examples, focusing on multi-threaded parallel programming.

The topics covered include:

  • Basic OpenMP operations
  • Loops
  • Reduction variables

Debugging and profiling

Click on each webinar for its recording and materials.

  Memory debugging with Valgrind (2019-Feb-20)

MATLAB

Click on each webinar for its recording and materials.

  Data Analytics and Machine Learning with MATLAB (2018-Oct-31)