Past Event – 2019 Program

Wednesday, May 1, 2019 Sessions will be held in the Chagall Ballroom
7:30—17:00	registration desk open Chagall Foyer
7:30—8:30	breakfast Van Gogh
8:30—10:00	GlobusWorld 2019 - Opening Keynote Steve Tuecke and Ian Foster Globus Co-founders Rachana Ananthakrishnan Head of Products Vas Vasiliadis Chief Customer Officer \| slides We will review notable events in the evolution of the Globus service over the past year, and provide an update on future product direction and sustainability.
10:00—10:30	beverage break Chagall Foyer
10:30—11:30	Integrating Globus into LRZ's Data Science Storage Service Stephan Peinkofer Lead Developer Storage Architectures, Leibniz Supercomputing Centre \| slides LRZ's Data Science Storage (DSS) is a novel approach at LRZ to solve the demands and requirements of data intensive science. Therefore, DSS implements a data centric management approach, which gives our researchers the ability to store vast amounts of data for as long as the data is important to them or the science community, access this data from the whole LRZ computing ecosystem, share this data between arbitrary users of the LRZ computing ecosystem and access, transfer and share this data world wide via Globus. This talk will give an overview of LRZ's Data Science Storage Service and will outline how we integrated Globus into our own Management Portal using the Globus REST API.
	Globus: Enabling the Open Storage Network Brian Mohr Sr. Systems Engineer, Johns Hopkins University \| slides The goal of the Open Storage Network (OSN) project is to create a robust national storage substrate that can impact 80% of the NSF research community, and offer a way to build a common basis for the Cyberinfrastructure of the Major Research Equipment and Facilities Construction (MREFC) projects. The program will create a storage appliance which should read and write from disks at speed, with a capacity of about 1.5PB at a price point of $130K. In conjunction with the National Data Service and Johns Hopkins University, Globus is helping to build a distributed storage platform for OSN, based on object storage with Globus Auth federated identity authorization to promote cross-institutional data sharing for OSN users.
	Petabytes on the Prairie: The South Dakota Data Store Doug Jennewein Director of Research Computing, University of South Dakota \| slides Funded by the National Science Foundation and the South Dakota Board of Regents, the South Dakota Data Store (SDDS) was deployed in 2018 and currently provides over 1.2PB of capacity across two service tiers. The Sharing Tier provides high-reliability, high-availability, network-accessible storage for research requiring persistent access to large quantities of data. The Archival Tier is hosted on a magnetic tape library for long-term offsite archival-grade storage. SDDS will serve all faculty, staff, postdocs, students, and graduate students in South Dakota. Globus provides the necessary authentication, data sharing, and transfer capabilities to make SDDS a truly statewide resource.
	Materials Data Facility as Community Database to Share Nano-manufacturing Recipes Ben Galewsky Research Programmer, National Center for Supercomputing Applications \| slides In this talk, we will describe how we used the Materials Data Facility (MDF) and its associated Globus Tooling to implement a global, community-curated database of graphene growth recipes. The University of Illinois' Nano-manufacturing hub has labs that are attempting to scale up growth of graphene using chemical vapor deposition. So far this effort has involved a great deal of trial and error. There are more than 1,000 research groups around the world involved with this same exploration. We created an application inside a HubZero instance which allows users to capture their recipes along with SEM images and spectrographic analysis of the sample. These samples are submitted to MDF where additional metadata is extracted and the dataset indexed with Globus Search. Researchers can mine recipes using Forge to find areas for their own exploration. Analysis based on this data can also be submitted to MDF where related datasets can be linked, published, and assigned a common DOI.
	Introducing the HACC Simulation Data Portal Katrin Heitmann Physicist and Computational Scientist, Argonne National Laboratory \| slides In this talk we will discuss a public data release of cosmological simulations carried out with HACC, the Hardware/Hybrid Accelerated Cosmology Code. Our release platform uses Petrel, a research data service, located at the Argonne Leadership Computing Facility. Petrel offers fast data transfer mechanisms and authentication via Globus, enabling simple and efficient access to stored datasets. Easy browsing of the available data products is provided via a web portal that allows the user to navigate simulation products efficiently. The data hub will be extended by adding more types of data products and by enabling computational capabilities to allow direct interactions with simulation results.
11:30—12:15	Enabling Science with Trust and Security – Guest Keynote Tom Barton, Sr Consultant for Cybersecurity and Data Privacy, University of Chicago and Internet2 \| slides Security is sometimes seen as an inhibitor rather than an enabler of science. I hope to convince you that this is not inherently true. We'll consider two scientific contexts, a simpler one and a far more complex one, in which substantial risk is associated with the research and how at least some of the risk can be addressed. Along the way we'll see that real risk reduction can, and sometimes must, happen by means quite unlike applying security controls from a catalog, and that it can take the form of a service rather than a constraint. I'll leave attendees with a few services they may wish to follow up with to help them address risk in their own circumstances.
12:15—13:30	lunch and roundtable discussions Van Gogh Please join a table for informal conversation on a topic of interest. Globus staff will be spread across tables to participate in discussions.
13:30—14:15	Building the Connectome – Guest Keynote Bobby Kasthuri, Neuroscience Researcher, Argonne National Laboratory, Assistant Professor in Neurobiology, University of Chicago Rafael Vescovi, Postdoctoral Scholar, Argonne National Laboratory \| slides The Kasthuri lab at the University of Chicago and Argonne National Laboratory is pioneering new techniques for brain mapping of the fine structure of the nervous system – 'connectomics' and 'projectomics'. I will describe these developments including: large volume automated electron microscopy for mapping neuronal connections, synchrotron source X-ray microscopy for mapping the cellular composition of entire brains, and combining both with cell type specific labeling for multi-scale, multi-modal brain maps. We have applied these tools to brains from octopuses and squids, to primates and mice, to the enteric nervous system and how stem cells integrate into the brain. We hope to help answer questions like: how do brains learn as they grow up? And how do brains differ across individuals and across species? And how can we reverse engineer brain function in our own computers and robots?
14:15—15:00	Data Sharing via Globus in the NIH Intramural Program Susan Chacko Senior Staff Scientist, National Institutes of Health \| slides The HPC facility at the National Institutes of Health's Intramural campus has had a Globus endpoint since 2012. Globus is used routinely for data transfer along with other methods, but the Globus data-sharing capabilities in particular have been enthusiastically embraced by the NIH HPC users. Statistics and discussion of data sharing from the user and administrative point of view will be presented.
	Health Sciences Research Informatics, Powered by Globus Jonathan Silverstein Chief Research Informatics Officer, Health Sciences and Institute for Precision Medicine, University of Pittsburgh Michael Davis Software Integration Architect, Department of Biomedical Informatics, University of Pittsburgh \| slides The Research Informatics Office (rio.pitt.edu) is responsible for UPMC's clinical data extraction, transformation, honest brokering, and provisioning for research. Neptune Research Data Warehouse and Health Record Research Request (R3) are RIO data and policy resources serving hundreds of large and small research projects across discrete data, text, imaging, and limited -omics. In partnership with the Pittsburgh Supercomputing Center, RIO is also responsible for the infrastructure for the HuBMAP consortium. To efficiently support these many activities, particularly including protected data sharing with many investigators across multiple institutions, RIO has adopted, as a key strategy, the Federated Identity, Data Movement, Search and Group Management features of Globus. In this talk we will use multiple examples to describe how Globus underpins these projects in usable and re-usable ways to achieve Secure Data Liquidity. We will also highlight a specific project, the Cancer Registry Records for Research (CR3), whose goal is to advance cancer research by providing tools that facilitate appropriate governance and dissemination of cancer-specific data to the research community. Using Globus, we have developed a portal providing cloud-based services for authentication, searching and data transfer. Using data extracted from the UPMC Network Cancer Registry, which is based on the North American Association of Central Cancer Registries (NAACCR) data standards for cancer registration, we constructed a set of "limited" data (i.e., site, stage, grade/morphology, outcomes) to store in Globus Search. This enables researchers to query and visualize aggregate cancer data for preparatory to research purposes.
	Globus For Cancer Research Mohamad Qayoom IT Consultant, LSU Health Sciences Due to a schedule conflict, this talk was not presented.
	Globus in European Life Science Steven Newhouse Head of Technical Services, European Bioinformatics Institute \| slides The European Bioinformatics Institute (EMBL-EBI) is part of the European Molecular Biology Laboratory and one of the world's leading providers of life-science data to a global community. A key aspect of our work is for individual users to deposit their data, for their data to be processed, and made available to the global community alongside added value knowledge derived from that data. Moving data from where it is generated to EMBL-EBI and from our archives to where the user wishes to analyze the data is a key part of our contribution to ELIXIR - a European Research Infrastructure for life-science. The integration of the ELIXIR AAI within Globus and the exposure of our data through Globus endpoints, will enable researchers within ELIXIR to seamlessly move data across Europe.
15:00—15:30	beverage break Chagall Foyer
15:30—17:00	Recent Upgrades to ARM Data Transfer and Delivery Using Globus Giri Prakash ARM Data Center Director and Research Staff, Environmental Sciences Division, Oak Ridge National Laboratory \| slides The Atmospheric Radiation Measurement Data Center (ADC) is a long-term archive and distribution facility for various ground-based, aerial and model data products in support of atmospheric and climate research. The ADC Archive currently holds over 11,000 data products totaling over 1.7 petabytes of data dating back to 1992, including data from instruments, value added products, model outputs, field campaigns and PI contributed data. ADC’s data discovery and delivery use a modern and scalable architecture, with data access and delivery options including THREDDS/OpenDAP, Globus, a near real-time data access API, automated data access via web services, advanced visualizations and a big data analysis platform. In this talk, we will discuss how users are using Globus to transfer terabytes of data from ADC to their home institutions and also how ADC is using Globus for its operations, including transferring data between clusters.
	The NCAR RDA–Globus Integration: Experiences Developing a Modern Research Data Portal Riley Conroy Software Engineer, National Center for Atmospheric Research \| slides Since late 2014, the Research Data Archive (RDA; https://rda.ucar.edu) at the National Center for Atmospheric Research (NCAR) has used the Globus data management and publication services to support its online research data portal. During this time, users have transferred more than 1.1 petabytes of data from the RDA collections, and the services developed and enabled by the Globus platform have been a tremendous benefit to the RDA user community. This presentation will provide a retrospective look at our experiences building these services into the RDA portal and how we use them to simplify our data management strategies and enhance the user experience. Using the Globus Python SDK, the RDA portal improves data access and enables scalable workflows for its large community of users. Shared endpoint access to the full RDA data catalog is user-driven and fully automated, and user-delegated transfers of curated file lists and custom delayed mode data products are initiated directly from the RDA portal and facilitated by the Globus Auth and Transfer APIs. Additional highlights include the NCAR RDA alternate identity service, which allows users to log into Globus with their RDA credentials, and a complete history of Globus data transfer usage is harvested via the Transfer API. The key ingredient in this integration is the Science DMZ network in place at NCAR, which allows Globus to deliver scalable, efficient, and reliable data transfers out of the RDA.
	A Data Ecosystem to Support Machine Learning in Materials Science Ben Blaiszik Research Scientist - University of Chicago, Globus and Argonne National Laboratory Data Science and Learning Division \| slides In this talk, we describe two related materials data infrastructure systems built on the Globus platform that work to build an ecosystem to support machine learning in materials science: the Materials Data Facility (MDF) and the Data and Learning Hub for Science (DLHub). MDF serves as an automated facilitator and interconnection point for materials data producers and consumers. Its services allow data to flow in from many sources, be enriched via a variety of tools (e.g., via automated metadata extraction, quality control), and flow onwards to many destinations, including not only MDF-operated services (e.g., the MDF repository, for storage of data with no other home, and the MDF search engine, for integration navigation and search of any and all data known to MDF) but also to the growing number of other materials-related data infrastructure components. DLHub provides similar functions for ML models and associated data transformation and analysis tools, allowing researchers to describe and publish such tools in ways that support discovery and reuse; run published tools over the network (with tools executed on a scalable hosted infrastructure); and link models, other tools, and data sources into complete ML/AI pipelines that can themselves be published, discovered, and run.
	SGCI and Globus: Partners for Acceleration of Science Sandra Gesing Research Assistant Professor, Notre Dame \| slides SGCI (Science Gateways Community Institute) provides services to the academic community to achieve sustainable science gateways—end-to-end solutions that allow researchers and educators to solve research questions within easy-to-use user interfaces hiding underlying complex research infrastructures. In this talk, I will present examples of the successful use of Globus technologies with gateways including the QUBES science gateway, the COSMIC2 science gateway, PlantingScience and CitSciBio. As diverse as the projects are—from modeling and simulation integration to citizen science technologies—so is the use of Globus technologies, from authentication features to data transfer to fully applied Globus data management features.
	Globus @ Stanford: A Web Site for Globus Users Karl Kornel System Administrator, Research Computing, Stanford University \| slides At Stanford University, we have a Standard Globus subscription covering all of campus. That includes over 18,000 faculty and students, and over 1,000 IT providers. We want everyone at Stanford to use Globus, yet our team is less than 20 people, with only three people well-versed in Globus. Our response to this need will be the “Globus @ Stanford” web site. In this talk, we will explain why we decided to create an entire site for Globus. We will also talk about the decision to use GitHub Pages as the platform, and describe some of the major sections of the site. Although the site is not public, it is published, and the audience will be invited to check out the site after the talk.
	Easy Object Storage Import/Export Using the S3 Connector on Jetstream Lee Liming Technical Communications Manager, University of Chicago - Globus \| slides Jetstream (www.jetstream-cloud.org) is a self-service OpenStack cloud for researchers, available via NSF's research allocation process and XSEDE. Jetstream offers object storage that’s compatible with Amazon Web Services’ Simple Storage Service (S3). Although the research community is learning to use object storage—and important research applications have been adapted to use the S3 API—it's cumbersome to move large research datasets into and out of object storage services using the API. This lightning talk shows how researchers can easily and reliably move research datasets into and out of Jetstream’s object storage using Globus Connect Server with the AWS S3 storage connector. This illustrates the value of the S3 storage connector to campuses that have AWS S3 or OpenStack object storage.
	Globus Labs: Forging the Next Frontier Kyle Chard Research Fellow, University of Chicago \| slides Abstract will be posted shortly.
17:00—19:00	Reception Van Gogh Enjoy refreshments and light appetizers before heading out to explore some of the amazing dining options that Chicago has to offer.

Thursday, May 2, 2019 - TUTORIALS Tutorials will be held in the Chagall Ballroom
7:30—17:00	registration desk open Chagall Foyer
7:30—8:30	breakfast Van Gogh
9:00—16:00	Office Hours Gauguin Led by: Globus Team Need more personal attention? Do you have a particularly thorny issue that you're unable to resolve? Stop by during "office hours" where Globus developers will be on hand to answer your toughest questions.
8:30—12:15	What's New with Globus Led by: Greg Nawrocki, Rachana Ananthakrishnan \| slides We will demonstrate new and updated Globus capabilities from the perspective of a researcher, systems administrator, and application developer. This is a high-level introduction to all aspects of the Globus service, including the recently refreshed web application, command line interface, and new terminology introduced by the launch of protected data management features.
	Introduction to Globus for System Administrators Led by: Vas Vasiliadis \| slides We will demonstrate how to install and configure a Globus endpoint. We will also review deployment configurations such as multi-server data transfer nodes, using the management console with pause/resume rules, and integrating campus identity systems for streamlined user authentication. You will get to experiment with server endpoint installation using a virtual machine.
	Managing Protected Data with Globus Connect Server v5 Led by: Rachana Ananthakrishnan, Greg Nawrocki \| slides We will provide a detailed walkthrough of installing and configuring Globus Connect Server v5 for high assurance endpoints. We will demonstrate the key aspects of managing protected data on such endpoints, including how to configure additional authentication assurance, enforcing data encryption, and accessing audit logs.
	Leveraging Globus in your Research Applications Led by: Greg Nawrocki \| slides We will use a Jupyter notebook to demonstrate how you can incorporate Globus capabilities into your own data portals, science gateways, and other web applications to easily manage large datasets in diverse research use cases.
12:15—13:00	lunch Van Gogh
13:00—17:00	Maximizing Performance and Network Utility with a Science DMZ Led by: Jason Zurawski, ESnet \| slides We will provide an introduction to the Science DMZ concept and illustrate best practices for leveraging modern, high-speed networks in data-intensive research.
	Best Practices for Data Sharing Led by: Rachana Ananthakrishnan, Greg Nawrocki \| slides We will present various use cases that illustrate the power of Globus data sharing capabilities, and provide hands-on experience with the Globus file sharing APIs.
	Automating Research Data Workflows Led by: Greg Nawrocki, Rachana Ananthakrishnan \| slides We will review common use cases and demonstrate how the Globus command line interface (CLI) and API may be used to automate repetitive data management tasks.
	Globus Integrations (JupyterHub, Django, ...) Led by: Vas Vasiliadis \| slides We will demonstrate how Globus integrates with interactive platforms such as JupyterHub and web frameworks such as Django. We will also describe how to leverage the Globus platform—and Globus Auth in particular—to secure APIs when building your own web services.
17:00	conference adjourns

Friday, May 3, 2019 – Customer Forum (by invitation only)
8:30—9:00	check-in and breakfast Globus Office
9:00—12:00	Customer Forum Discussion Led by: Globus Team The Customer Forum is an opportunity for Globus subscribers to discuss their experiences with the service, to learn about our product development plans, and to provide input on future product directions. Attendance at the customer forum is by invitation only. If you would like to represent your institution/community please contact us for an invitation.
