GlobusWorld Tour Agenda

Research Computing and Data Management Workshop

Logos: NYSERNet, Globus, EPOC, and Rensselaer Polytechnic Institute

Rensselaer Polytechnic Institute - 110 8th Street, Troy, NY 12180
(all times EST)

February 27, 2024 – for researchers and sysadmins
8:15—8:45 breakfast & informal discussion

Jackie Stampalia, Director of Client Information Services at RPI, will lead the introduction and acknowledge the contributors who played a pivotal role in organizing this workshop. She will also give an overview of the workshop's key topics and introduce the featured speakers.


Jay McGlothlin, the Associate Director for Research Computing at Rensselaer Polytechnic Institute (RPI), will explore RPI's transition to Globus and its implications for research.


We will provide an overview of the Globus platform and demonstrate several of its data management features. In this introductory session, suitable for new users, we’ll use the Globus web app to show data transfer and sharing, demonstrate Globus Connect Personal for laptop/desktop access, and introduce the Globus Command Line Interface for interactive use and scripting. This introduction will provide important context for subsequent sessions.

10:30—10:45 break

We will present an overview of Globus services for automating research computing and data management tasks, to accelerate research process throughput. This session is aimed at researchers who wish to automate repetitive data management tasks (such as backup and data distribution to collaborators), as well as those working with instruments (cryoEM, next-gen sequencers, fMRI, etc.) who wish to streamline data egress, downstream analysis, and sharing at scale. The material in this session will serve as an introduction to the more advanced concepts that will be covered in detail later in the workshop.


The Science DMZ is a scalable network design pattern that optimizes the exchange of research and education data. It is a portion of the network, built at or near the campus or laboratory's local network perimeter, in which the equipment, configuration, and security policies are optimized for high-performance scientific applications rather than for general-purpose business systems or “enterprise” computing. This talk will give the background of the approach and cite examples that workshop participants can implement.

12:15—13:15 lunch

The Science DMZ architecture optimizes the network path to support high-performance scientific applications. A core part of this approach is the use of purpose-built, dedicated computer systems for wide area data transfer. Data Transfer Nodes (DTNs) are servers built with high-quality components and configured specifically for wide area data transfer. A DTN has access to local storage and runs software tools designed for high-speed data transfer to remote systems. This talk will give the background of DTNs, show ways they can be integrated into the network, and show effective build and test strategies.


The core idea behind the Science DMZ architecture is a targeted security policy. There are two approaches to security:

  • Identifying risks and creating mitigation strategies based on that analysis
  • Implementing broad controls

The Science DMZ approach does both of these. It emphasizes segmenting the network to mitigate risk and installing appropriate controls for the risks that have been identified. This talk will give a background on securing the Science DMZ architecture and its components, and offer suggestions on sensible user policies that facilitate innovation while securing critical components.

14:45—15:00 break

We will review the Globus Connect Server v5 (GCSv5) architecture and deployment model, and describe the process for creating a Globus endpoint on your HPC cluster, lab server, or other multi-user storage system. You will experiment with installing Globus Connect Server and configuring a number of common options on the endpoint. We will also demonstrate how to monitor and manage user activity, and review options for optimizing file transfer performance.

February 28, 2024 – for sysadmins and developers
8:15—9:00 breakfast & informal discussion

We will present various use cases for applying the Globus data sharing capability, and discuss use of service accounts for automating data access and transfer tasks.


The function-as-a-service (FaaS) model is well established in commercial cloud offerings but less so in research computing environments. The Globus Compute service enables remote computing using the FaaS model, and allows users to execute functions on any compute resource to which they have access. We will provide an overview of the Globus Compute service, and demonstrate how to install an endpoint and execute a simple function on a remote system.

10:30—10:45 break

Measuring network performance is critical to ensuring proper operation and debugging anomalous behavior. perfSONAR is an infrastructure for network performance monitoring that makes it easier to solve end-to-end performance problems on paths crossing several networks. It contains a set of services delivering performance measurements in a federated environment. This talk will give a brief history of perfSONAR, suggest ways it can be integrated into network environments, and discuss ways it can be leveraged to solve problems and ensure proper operational performance.


Attendees will have the opportunity to join a guided tour of the first IBM Quantum System One on a university campus. The IBM Quantum System One at RPI is equipped with the 127-qubit IBM Quantum Eagle processor. IBM recently demonstrated the processor's capacity for utility-scale calculations, where utility scale is the threshold at which quantum computers become valuable scientific instruments for tackling challenges that classical methods find insurmountable.

12:15—13:15 lunch

We will provide a brief introduction to the Globus platform-as-a-service for developers, with emphasis on building simple web applications for data distribution and discovery. We will describe how to register an application with Globus and access platform APIs using the Globus Python SDK and a Jupyter Notebook. We will also introduce the Globus Search service and demonstrate how it is used by an open source web portal framework that can jumpstart research application development.


We will dive deeper into the Globus automation platform and describe how common instrument-based scenarios may be streamlined. We will examine the various components of a Globus flow that takes data from the point of capture on an instrument through to distribution/publication of resulting products to collaborators.

14:45—15:00 break

This session will cover topics of interest to system administrators (such as managing multi-DTN endpoints, mapping user identities, and using custom domains for data access), as well as other advanced topics of interest to the audience (these will be identified throughout the course of the preceding workshop sessions).


We will be available for 1:1 discussions and consultation about specific use cases.

Who Should Attend?

  • Researchers, Research Software Engineers (RSEs), research computing facilitators, and system admins who are interested in learning more about managing research data
  • Developers building web applications for research
  • Anyone interested in learning more about Globus for research data management

Why Attend?

  • Learn how Globus can simplify your data management
  • Expand your knowledge of Globus administration
  • Experiment with new Globus services
  • Learn how you can use Globus services in your research applications
  • Exchange ideas with peers on ways to apply Globus capabilities
