Moving your research to Linux and the command line on remote servers
This workshop is a hands-on introduction to the Linux command line and to interacting with a remote server. We review basic Linux commands, file management (editing, copying, removing and remote-transferring files), directories and the file system, remote access, basic version control (Git, GitHub), Bash scripts and basic Bash programming.

Scientific Python

This is a one- or two-day workshop introducing scientific programming in Python to beginners. We start with basic concepts such as variables, lists, dictionaries, flow control, conditionals, loops, working with libraries, and writing functions. We then move on to more advanced topics: speeding up your calculations with numpy (and working with numpy arrays in general), plotting with matplotlib or plot.ly, geospatial data processing and maps with cartopy, pandas dataframes, working with images, multidimensional arrays in xarray, working with 3D multi-resolution data in yt, running Python scripts from the command line (including processing arguments and standard input), and other topics.
We can customize this workshop to address your specific Python workflows.
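As a flavour of the material, here is a minimal sketch (our own illustration, not a verbatim course example) of the numpy vectorization idea: the same arithmetic written as an interpreted Python loop and as a single array operation.

```python
# The same computation as a pure-Python loop and as a vectorized
# numpy expression; the latter runs in compiled C code.
import numpy as np

x = np.linspace(0, 1, 1_000_000)

# Pure-Python loop: one interpreted multiply-add per element
y_loop = [2 * v + 1 for v in x]

# Vectorized: the whole array in one call
y_vec = 2 * x + 1

assert np.allclose(y_loop, y_vec)
```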

Introduction to HPC: using clusters to speed up your research

We start with an overview of the hardware of common HPC clusters and a quick description of the resources available on Compute Canada's national systems (Cedar / Graham / Niagara / Béluga). We then continue with the basic tools and techniques for working on a cluster: the software environment and modules, an overview of installed programming languages and compilers, working with makefiles, and installing new software locally. Finally, we take a look at the Slurm job scheduler: why use it, fairshare and priority, submitting serial jobs and job arrays, submitting OpenMP / MPI / hybrid / GPU jobs, working inside interactive jobs, and tracking your job's memory usage. We also take a quick look at working with common packages such as R, Python and Matlab on the clusters, as well as best practices in cluster workflows.

High-performance Python

In scientific computing, Python is the most popular programming/scripting language. While known for its high-level features, hundreds of fantastic libraries and ease of use, Python is slow compared to traditional (C, C++, Fortran) and newer (Julia, Chapel) compiled languages. In this course we'll focus on speeding up your Python workflows using a number of different approaches. In Part 1 we will start with traditional vectorization with NumPy, talk about Python compilers (Numba) and profiling, and cover parallelization. We'll do a little bit of multithreading (possible via numexpr, despite the global interpreter lock) but will primarily target multiprocessing.
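To illustrate the Numba approach mentioned above, here is a minimal sketch (our own toy reduction, not a course example) of compiling a plain Python function with a JIT compiler:

```python
# The same reduction, once as interpreted Python and once JIT-compiled
# with numba.njit into machine code.
import numpy as np
import numba

def python_sum(a):
    total = 0.0
    for v in a:          # interpreted: bytecode dispatch per element
        total += v
    return total

fast_sum = numba.njit(python_sum)  # compiled on first call

a = np.random.rand(10_000_000)
fast_sum(a)   # first call includes compilation time
fast_sum(a)   # subsequent calls run at near-C speed
```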
In Part 2 we will study Ray, a unified framework for scaling AI and Python applications. Since this is not a machine learning workshop, we will not touch most of Ray's AI capabilities, but will focus on its core distributed runtime and data libraries. We will learn several different approaches to parallelizing purely numerical (and therefore CPU-bound) workflows, both with and without reduction. If your code is I/O-bound, you will also benefit from this course, as I/O-bound workflows can be easily processed with Ray.
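A minimal sketch of the Ray remote-task pattern this part builds on (illustrative only; `partial_sum` is our own toy function):

```python
# Ray's core API: turn a function into a remote task, launch several
# copies in parallel, then gather and reduce the results.
import ray

ray.init()  # local runtime; on a cluster you would connect to an existing one

@ray.remote
def partial_sum(start, stop):
    return sum(i * i for i in range(start, stop))

futures = [partial_sum.remote(i * 1_000_000, (i + 1) * 1_000_000)
           for i in range(4)]      # four tasks run concurrently
total = sum(ray.get(futures))      # block on the futures and reduce
```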

Introduction to scientific visualization with ParaView

We start with simple 1D/2D/3D plotting using plot.ly. For the rest of the day we study 3D scientific visualization with ParaView, an open-source, multi-platform data analysis and visualization tool designed to run on a variety of hardware, from an individual laptop to large supercomputers. With ParaView users can interactively visualize 2D and 3D datasets defined on structured, adaptive and unstructured meshes or particles, animate these datasets in time, and manipulate them with a variety of filters. ParaView supports both interactive (GUI) and scripted (including offscreen) visualization, and is an easy and fun tool to learn.
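As a taste of the scripted side, here is a minimal sketch using ParaView's paraview.simple module (run with pvpython; the built-in sphere source is just a stand-in dataset):

```python
# Scripted (offscreen-capable) ParaView visualization with paraview.simple.
from paraview.simple import Sphere, Show, Render, SaveScreenshot

sphere = Sphere(ThetaResolution=32, PhiResolution=32)  # stand-in dataset
Show(sphere)                  # add the source to the render view
Render()                      # draw the scene
SaveScreenshot("sphere.png")  # write the frame to disk
```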

Introduction to scientific visualization with VisIt

This is a VisIt-flavoured version of the previous workshop.

Large-scale 3D remote visualization

This is an advanced version of the ParaView-based visualization course, focusing on parallel rendering, interactive client-server remote visualization, and batch workflows using both a cluster's CPUs and GPUs.

3D visualization for humanities and social sciences

What would you like to do with your data in 3D? 3D visualization has been used in traditional scientific computing domains for the past several decades to visualize the results of multidimensional numerical simulations. In the humanities, 3D visualization has been mostly restricted to specialized areas such as game engines, architectural renderings, virtual environments, photogrammetric processing, and visualization of point-cloud data; these workflows tend to use very specific tools. In this full-day course we will approach 3D visualization from a more general perspective, treating it as an extension of interactive 2D plotting into the third dimension. In the first 80% of the workshop we will teach you how to use general-purpose 3D scientific visualization tools for interactive 3D analysis of humanities data, walking through a series of simple hands-on problems designed specifically for this course. In the remaining 20% we will show you how to put these (and more general polygon-based) visualizations on the web, using state-of-the-art in-browser visualization techniques. No prior visualization experience is needed.

Introduction to programming in Julia

R and Python are interpreted languages: an interpreter executes the code directly, without pre-compilation. This is extremely convenient: it is what allows you to type and execute code in a Python or R interactive shell. The price to pay is low performance. To overcome this limitation, researchers often write C/C++ functions for the most computation-intensive parts of their algorithms. But the need to use multiple languages and the non-interactive nature of compiled languages can make this approach somewhat tedious.
Julia uses just-in-time (JIT) compilation: the code is compiled at run time. This means that it feels like running R or Python, while it is almost as fast as C. This makes Julia particularly well suited for big data analysis, machine learning, and heavy modelling. Julia shines with its extremely clean and concise syntax, making it easy to learn and really enjoyable to use.
In this workshop, which does not require any prior experience in Julia (though experience in another language such as R or Python would be ideal), we will start with the basics of Julia's syntax and its packaging system, and then we will look at running Julia in parallel for large-scale problems.

Introduction to multi-threading and multi-processing in Julia

Julia is a high-level programming language well suited for scientific computing and data science. Just-in-time compilation, among other things, makes Julia really fast yet interactive. For heavy computations, Julia supports multi-threaded and multi-process parallelism, both natively and via a number of external packages. It also supports memory arrays distributed across multiple processes, either on the same node or on different nodes. In this hands-on workshop, we will start with a detailed look at multi-threaded programming in Julia, with many hands-on examples. We will next study multi-processing with the Distributed standard library and its large array of tools. Finally, we will work with large data structures on multiple processes using the DistributedArrays and SharedArrays libraries. We will demo parallelization using several problems: a slowly converging series, a Julia set, a linear algebra solver, and an N-body solver. We will run examples on a multi-core laptop and an HPC cluster.

Foundations of parallel programming and the Chapel programming language

This course is a general introduction to the main concepts of parallel programming and the Chapel programming language. Chapel is a relatively new language for both shared- and distributed-memory programming, with easy-to-use, high-level abstractions for both task and data parallelism that make it an ideal language for a novice HPC user to learn parallel programming in. Chapel is incredibly intuitive, striving to merge the ease of use of Python with the performance of traditional compiled languages such as C and Fortran. Parallel constructs that typically take tens of lines of MPI code can be expressed in only a few lines of Chapel code. Chapel is open source and can run on any Unix-like operating system, with hardware support from laptops to large HPC systems.

Introduction to Machine Learning with PyTorch

This is a full-day workshop introducing the basic principles of machine learning and the first steps with PyTorch.
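To give a sense of those first steps, here is a minimal sketch (our own toy example, not taken from the workshop) of training a tiny model with PyTorch's autograd:

```python
# Fit a one-parameter linear model to noisy synthetic data
# by gradient descent; autograd computes the gradients.
import torch

x = torch.linspace(0, 1, 100).unsqueeze(1)
y = 3 * x + 0.5 + 0.05 * torch.randn_like(x)   # noisy line

model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()      # backpropagate the loss
    optimizer.step()     # update the weight and bias
```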

Version control with Git

This two-day workshop introduces version control with Git and covers the most common operations. It places particular emphasis on explaining how Git works: understanding what the commands really do brings the confidence to go beyond the limited "add, commit, push" usage so common in data science fields.

Singularity / Apptainer containers and overlays

This full-day course is a hands-on introduction to working with Singularity/Apptainer containers in an HPC environment, as well as working with data stored inside container overlays.

Join our training webinars every second Tuesday at 11am Pacific / noon Mountain. For more details, check our events page.