Research Data Management


“How to create and access MySQL and PostgreSQL databases on DRI systems”

Webinar (2023-Feb-28) with Gemma Hoad


“Data management with DataLad”

Webinar (2023-Feb-14) with Ian Percel

This talk briefly introduces version control for data and data-processing workflows. Three illustrative use cases, drawn from neuroimaging, geophysics, and housing-data analysis, introduce the main concepts of Git-based file management, collaboration, and analysis.


“Hiding large numbers of files in container overlays”

Webinar (2023-Jan-17) by Alex Razoumov

Many unoptimized HPC cluster workflows write large numbers of files to distributed filesystems, which can significantly degrade the performance of these shared filesystems. One way to alleviate this is to direct write operations into a persistent overlay directory attached to an immutable, read-only container holding your scientific software. The output files are stored separately from the base container image, and to the host filesystem an overlay appears as a single large file. In this presentation, we demo running parallel OpenFOAM simulations in which all output goes into overlay images, reducing the total number of files on the host filesystem from several million to several dozen or fewer. The same approach can be used in post-processing and visualization, where you can read simulation data from multiple overlays both in serial and in parallel. In this webinar we walk you through all stages of creating and using overlays. We assume no prior knowledge of container technology.
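
As a rough illustration of the workflow described above, the sketch below only builds the command lines one might run; it does not execute anything. It assumes Apptainer as the container runtime (the abstract does not name one), and the file names openfoam.sif and run01.img, the overlay size, and the helper function names are all hypothetical.

```python
import shlex


def overlay_create_cmd(overlay, size_mib):
    """Command that creates an empty overlay image of the given size (MiB).

    Assumes the Apptainer CLI; the overlay file name is hypothetical.
    """
    return ["apptainer", "overlay", "create", "--size", str(size_mib), overlay]


def exec_with_overlay_cmd(image, overlay, command, read_only=False):
    """Command that runs `command` inside `image` with `overlay` attached.

    With read_only=True the overlay is mounted with the ':ro' suffix,
    which suits post-processing that only reads simulation output.
    """
    target = f"{overlay}:ro" if read_only else overlay
    return ["apptainer", "exec", "--overlay", target, image, *command]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    steps = [
        overlay_create_cmd("run01.img", 2048),
        exec_with_overlay_cmd("openfoam.sif", "run01.img", ["ls", "/"]),
        exec_with_overlay_cmd("openfoam.sif", "run01.img", ["ls", "/"], read_only=True),
    ]
    for argv in steps:
        # Print each command as a shell-quoted string instead of running it.
        print(shlex.join(argv))
```

Files written inside the container while the overlay is attached end up in the single overlay image, so the host filesystem sees one file per overlay rather than every individual output file.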


“Linking databases to code repositories with Throughput”

Webinar (2021-Mar-03) by Simon Goring


“Automating your backups in Linux and MacOS”

Webinar (2021-Feb-17) by Alex Razoumov


“Working with multidimensional datasets in xarray”

Webinar (2020-Sep-30) by Alex Razoumov


“File access control approaches and best practices”

Webinar (2019-Oct-30) by Sergiy Stepanenko


“Managing many files with Disk ARchiver (DAR)”

Webinar (2019-May-01) by Alex Razoumov


“Research Data Management Tools, Platforms, and Best Practices for Canadian Researchers”

Webinar (2019-Mar-20) by Alex Garnett and Adam McKenzie