UKOLN
Raising Awareness

"A centre of excellence in digital information management, providing advice and services to the library, information and cultural heritage communities."

UKOLN is based at the University of Bath.

Events

Scaling Up to Integrated Research Data Management Workshop, IDCC 2010, 6th December 2010, Chicago

Organisers: Manjula Patel & Liz Lyon, DCC & UKOLN, University of Bath, UK.

Overview: Structural Science incorporates a number of disciplines including Chemistry, Physics, Materials, Earth, Life, Medical, Engineering, and Technology. Within these disciplinary communities, scientific research is conducted at a range of differing scales, from small laboratory equipment, through institutional installations, to large-scale facilities such as the synchrotron facilities at CERN, the DIAMOND Light Source (DLS) and ISIS, based at the Science and Technology Facilities Council (STFC). With improvements in technology there is an increasing demand to make raw, processed and derived data available for validation and reanalysis purposes, necessitating management of these types of data as well as the final results data. It is, however, apparent that many research teams capture, manage, discuss and disseminate their data in relative isolation, with highly fragmented data infrastructures and poorly integrated software applications. In addition, a low awareness of data curation and preservation issues leads to data loss and reduced productivity. On the other hand, large centralised facilities have a responsibility to provide a data management infrastructure for their users and have spent considerable effort designing and implementing such systems. The outcome is that each large-scale facility has its own, often insular, approach to data management, resulting in vast ‘data silos’. This workshop, organised by the JISC-funded I2S2 (Infrastructure for Integration in Structural Sciences) Project, aims to explore and highlight a variety of ways currently under investigation to alleviate data management problems resulting from working at differing scales of science and across organisational boundaries.

Aims: The purpose of this workshop is to explore a variety of issues relating to scale and integration in research data management, from science conducted at the local bench-top level to large-scale facilities such as those at CERN and STFC.

Audience: The workshop will be of relevance to a variety of stakeholders interested in ways to improve research data management over differing scales of science and across organisational boundaries; this includes individual research scientists and large-scale facilities managers, as well as computing services and funding agencies.

Programme

12:30-13:30 Lunch
13:30-13:35 Introduction & Welcome,
Liz Lyon, DCC & UKOLN, University of Bath, UK
13:35-14:00 Integrated research data management in the Structural Sciences [PPT],
Manjula Patel, I2S2 Project, DCC & UKOLN, University of Bath, UK
14:00-14:30 A Federated Repository for Large Scientific Datasets [PPT],
Steve Androulakis, Monash University, Australia
14:30-15:00 Data: A Legacy of NEES [PPT],
Shirley Dyke, Professor of Mechanical Engineering and Civil Engineering
& Director of the Intelligent Infrastructure Systems Laboratory, Purdue University
15:00-15:20 Tea & Coffee Break
15:20-15:50 Integrating Data Management into Climate Change Science Research [PPT],
Bruce E. Wilson, ORNL Climate Change Science Institute
15:50-16:20 DataONE: Preserving Data and Enabling Innovation in the Biological and Environmental Sciences [PPT],
William Michener, Professor and Director of e-Science Initiatives for University Libraries, University of New Mexico
16:20-16:50 Discussion,
facilitated by Liz Lyon, DCC & UKOLN, University of Bath, UK
16:50-17:00 Conclusion & Closing Remarks,
Simon Hodson, JISC Managing Research Data Programme Manager, UK

Welcome and Panel/Discussion session
Liz Lyon, DCC & UKOLN, University of Bath, UK

Dr Liz Lyon is the Director of UKOLN at the University of Bath, UK, where she leads work to promote synergies between digital libraries and open science environments. She is Associate Director of the UK Digital Curation Centre, in which UKOLN is a partner. She is also the author of a number of major direction-setting reports, including Open Science at Web-Scale: Optimising Participation and Predictive Potential (2009), Scaling Up (2008) and Dealing with Data (2007). Liz has led a series of pioneering research data management projects: eBank, eCrystals Federation, Infrastructure for Integration in Structural Sciences (I2S2) and SageCite, all of which have explored links between research data, scholarly communications and open science. She has a doctorate in cellular biochemistry and has worked in various university libraries.

Integrated research data management in the Structural Sciences
Manjula Patel, I2S2 Project, DCC & UKOLN, University of Bath, UK

Dr Manjula Patel has been working in the R&D team at UKOLN since 1997. She has worked on numerous projects in various areas of digital information management, including resource discovery, schemas, metadata, linked data and virtual museums. Since 2004, she has been involved with the JISC-funded Digital Curation Centre, where her major role is to liaise with projects concerned with curation and preservation issues within specific disciplines. These include: eBank-UK Phase 3 (Crystallography); the eCrystals Federation (Crystallography); Knowledge & Information Management Through Life (Engineering); and Infrastructure for Integration in Structural Sciences (Chemistry and Earth science). Manjula holds a PhD in computer graphics, an MSc in systems design and a BSc (Hons) in computational science and economics. Further information is available from her staff page at:
http://www.ukoln.ac.uk/ukoln/staff/m.patel/

Abstract: This presentation will explore the work of the JISC-funded I2S2 (Infrastructure for Integration in Structural Sciences) project, which forms part of the Research Data Management Infrastructure strand of JISC's Managing Research Data Programme. I2S2 aims to identify requirements for a data-driven research infrastructure in Structural Science, focusing on the domain of Chemistry, but with a view towards inter-disciplinary application. The project is exploring the perspectives of scale and complexity, and disciplinary research issues throughout the data lifecycle.

A Federated Repository for Large Scientific Datasets
Steve Androulakis, Monash University, Australia

Steve Androulakis is a senior software developer based at Monash University, Australia. His areas of work include eResearch and bioinformatics solutions relating to digital curation, the sharing and collaboration of research data, and applying high performance computing approaches to research problems. He is the creator of the TARDIS repository for large scientific datasets.

Abstract: The Australian Synchrotron generates terabytes of scientific data daily. Although services exist at the facility that allow scientists to push their collected data back to their home institution, the majority of scientists in the past have opted for the traditional method of bringing their own drives to store generated data and then physically transporting them back to their lab for processing.

MyTARDIS has been created as a federated solution for cataloguing, managing, and assisting the sharing of large datasets from scientific instruments in a private and secure way, including routing data to researchers' home institutions. TARDIS is its counterpart, providing a centralised index for public datasets.

TARDIS/MyTARDIS has been selected as the metadata store for all instruments at the Australian Synchrotron, and at ANSTO (Australia's nuclear research facility). This workshop will detail and demonstrate the functionality of TARDIS/MyTARDIS including information on future development and an explanation of its underlying data management philosophy.

Data: A Legacy of NEES
Shirley Dyke, School of Mechanical Engineering, Purdue University

Dr. Shirley J. Dyke is Professor of Mechanical Engineering and Civil Engineering, School of Mechanical Engineering, Purdue University and the Director of the Intelligent Infrastructure Systems Laboratory. Dr. Dyke investigates ways to reduce losses and property damage from earthquakes. She also studies the use of structural control and monitoring systems for improving the behaviour and lifetime of structural systems. Dr. Dyke earned her B.S. in Aeronautical and Astronautical Engineering at the University of Illinois and her Ph.D. at the University of Notre Dame.

Abstract: The George E. Brown Network for Earthquake Engineering Simulation (NEES) has been engaged in large-scale experimental and numerical research and education for nearly a decade. The network consists of 14 large-scale testing facilities, including centrifuges, structural testing facilities, shake table facilities, and tsunami wave basins. A key aspect of the legacy of the NEES network lies in the data acquired during physical testing performed on new engineering systems and components. This data, including measurements, video and images from the experiments performed, is being collected within a curated repository known as the NEES Project Warehouse. This rapidly growing repository will provide the earthquake engineering community with data to demonstrate new design approaches, understand the behaviour of new and existing components, and validate numerical models. The repository will also be of great benefit to engineering practitioners and educators. This presentation will discuss the challenges involved in the development and management of the data repository and the approaches taken to offer this tremendous resource to the earthquake engineering community.

Integrating Data Management into Climate Change Science Research
Bruce E. Wilson, ORNL Climate Change Science Institute

Bruce Wilson is the Group Leader for the Environmental Data Science and Systems Group, the Manager for the ORNL Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC), and an Adjunct Professor of Information Sciences at the University of Tennessee. His general research interests are in scientific informatics, particularly enabling data-intensive science in ecology and climate science through advancing data storage, curation, distribution, analysis, and visualization technologies and practices.

Bruce serves on the Board of Directors for the USA-National Phenology Network (USA-NPN), the Finance Committee for the Federation of Earth Science Information Partners, and the Core Cyberinfrastructure Team for the DataONE (Observation Network for Earth) project. He also serves as a peer-reviewer for several journals and for grant programs at NSF, NASA, and DOE.

DataONE: Preserving Data and Enabling Innovation in the Biological and Environmental Sciences
William Michener, Professor and Director of e-Science Initiatives for University Libraries, University of New Mexico

Conclusion & Closing Remarks
Simon Hodson, JISC Managing Research Data Programme Manager, UK