3:30 pm Main Conference Registration

4:40 Welcoming Remarks

Stan Gloss, Founding Partner, Chief Executive Officer, BioTeam, Inc.

4:45 Moderator’s Remarks

Allison Proffitt, Editorial Director, Bio-IT World & Clinical Informatics News

4:50 OPENING KEYNOTE PRESENTATION: Convergence, Culture, and the Acceleration of Cancer Research

Matthew Trunnell, Vice President and Chief Information Officer, Fred Hutchinson Cancer Research Center

As we seek to incorporate larger and more diverse data into cancer research, and to shorten the effective distance between research and clinic, we face issues of interoperability at levels ranging from the technical to the cultural. These issues have drawn the attention of the Vice President’s Cancer Moonshot program, which has raised both the urgency of and the opportunity for achieving new levels of collaboration. This talk will discuss the Hutch Data Commonwealth, a new organization being established within Fred Hutch to accelerate the convergence of data science capabilities and competencies that will propel the work of developing cancer cures and preventions.

5:50 Exascale Opportunities for Healthcare

Patricia Kovatch, Associate Dean for Scientific Computing, Icahn School of Medicine at Mount Sinai

High-performance computing has already been enlisted in the quest to better understand, diagnose, and treat human disease. Through the expert guidance of computational scientists and advanced computing and data analytic infrastructures, advances have been made in areas such as drug discovery and genomic sequencing. However, enormous scientific challenges lie ahead to realize the promise of personalized medicine, and commensurate advances toward exascale computing are needed to achieve its full potential. This talk will outline the specific scientific challenges and impacts for three areas of medicine: personalized cardiac therapy, precision medicine, and real-time accurate imaging diagnosis. It will then discuss the limitations of existing HPC, along with the expected computational and data parameters and the new capabilities needed for each of these areas in 2025.

6:20 Welcome Reception with Exhibit Viewing

7:20 Close of Day


7:30 am Continental Breakfast


8:00 Welcoming Remarks

Stan Gloss, Founding Partner, Chief Executive Officer, BioTeam, Inc.

8:05 Moderator’s Remarks

Allison Proffitt, Editorial Director, Bio-IT World & Clinical Informatics News

8:10 KEYNOTE PRESENTATION: A Space Odyssey: One Decade of Scaling Research Computing from 200 to Over 60,000 Processors

James Cuff, Assistant Dean and Distinguished Engineer for Research Computing, Harvard University

Since 2006, Harvard University has been scaling its computing environment to support the demands and requirements of its advanced scientific research, and has seen unprecedented growth in research storage, from 20TB to over 55,000TB. This isn’t slowing down any time soon; for example, additional demands for GPGPU computing have forced scaling to over 1.4 million CUDA cores. Harvard has not been alone: four other research-intensive universities in the Northeast (Boston University, MIT, Northeastern, and the University of Massachusetts), together with the state government and private industry, joined Harvard to build a state-of-the-art LEED Platinum data center dedicated to research computing. James will tell the story of this voyage, how they got to where they are today, and what lies ahead for converged data, compute, and the people and skill sets needed to continue to support the world’s very best research, science, and scholarly output.

9:10 Cyberinfrastructure Architecture: Designing a Framework for Science Progress

Dan Stanzione, Ph.D., Executive Director, Texas Advanced Computing Center, The University of Texas at Austin

Modern cyberinfrastructure is a large collection of enormously complicated parts and partially overlapping disciplines. Successfully delivering science results requires not only combining bioinformatics, algorithms, data, data integration, libraries, APIs, storage, cloud, and high-performance computing, but combining them in a systematic way. This talk will examine lessons from TACC in evolving an architecture and ecosystem for cyber-enabled solutions to modern large scientific challenges, at the level of systems, software, and, most importantly, people. Examples will be drawn from the iPlant/CyVerse project, the Araport information resource, the DesignSafe CI, and other projects, along with organizing principles that can be extracted across them. In addition, some recent data will be presented on the deployment of a new large-scale supercomputer using Intel’s latest many-core technology, and early experiences exploiting the huge numbers of available cores on scientific applications.

9:40 Science Gateways and Today’s Research Landscape

Nancy Wilkins-Diehr, Associate Director, San Diego Supercomputer Center

Science gateways, also known as web portals, virtual research environments, and virtual laboratories, are a fundamental part of today’s research landscape. In this talk, I will provide several examples of science gateways that are having a tremendous impact on how research is conducted. I will highlight major NSF investments such as the Science Gateways Community Institute, which will further the development of sustainable gateways. Finally, I will highlight international activities such as the recently launched International Coalition on Science Gateways.

10:10 Coffee Break with Exhibit Viewing


10:45 Both Sides, Now: Joni Mitchell Had It Right Almost 50 Years Ago

Ruth Marinshaw, Chief Technology Officer, Research Computing, Stanford University

It’s all about the cloud ... public cloud, private cloud, your cloud, my cloud, any cloud, every cloud. Or is it? Containers are the next Velcro, they’re better than Velcro, institutional data centers are dead ... or maybe they aren’t. As a participant in these discussions at various levels and as a leader helping shape institutional direction and policy, Ruth will discuss her team’s explorations of cloud resources and capabilities, from both sides, as part of a continuum of resources to enable scientific discovery.

11:15 Atomistic Simulations Open Up New Vistas into Biomolecular Function and Rational Drug Design

Markus Dittrich, Senior Scientific Consultant, BioTeam, Inc.

Scientists have long wanted to use computer simulations to study the function of important biomolecules such as enzymes, as well as large-scale biomolecular complexes ranging from the DNA transcription machinery to whole viruses. Over the past decade, molecular simulation technologies such as molecular dynamics (MD) have become a powerful and increasingly popular tool for investigating molecular function and for the rational design of drugs based on detailed molecular knowledge. MD simulations in particular have been able not just to leverage but also to push the state of the art in high-performance computing to deliver ever more detailed studies of biomolecular systems. MD simulations now routinely run on the largest supercomputers in the world, taking advantage of the newest accelerator technologies such as GPGPUs and Intel Xeon Phi coprocessors. In addition, special-purpose supercomputers created specifically for running MD, such as the Anton machine developed by D.E. Shaw Research, have been designed and built to increase the capability of MD even further. In this presentation, I will review the state of the art in molecular simulation, current and future computational requirements, where the field is heading, and finally how these new methods will impact the way in which we will study biomolecules and design new drug treatments in the future.

11:45 Scientific Computing in the Cloud: Speeding Access for Cancer Drug Discovery

Bret Martin, Principal Research Computing Architect, Data Science and Information Technology, H3 Biomedicine

Jeff Tabor, Senior Director, Product Management and Marketing, Avere Systems

H3 Biomedicine has built a cloud infrastructure that reduces latency and provides storage flexibility, and does so in a way that helps save money and supports their business strategy. The speakers will discuss the cloud technology and cloud services that have enabled application migration to the cloud in a hybrid IT environment.

12:15 pm Luncheon Presentation (Sponsored by Aspera): Time for Better Things: How to Spend Less Time Transferring Research Data

Charles Shiflett, Senior Software Engineer, Cloud, Aspera, an IBM Company

In this presentation, you’ll learn how Aspera can speed the movement of genomic research data, enabling researchers to spend more time working with it. Charles will also explore new ways Aspera can be used to support file and streaming data transfers, and review on-premises and cloud models for sharing, collaborating on, and distributing genomic data.

1:00 Dessert Break with Exhibit Viewing

1:30 Roundtable Discussions I

Join one of these interactive sessions designed to provoke thought and discussion in connection with specific topics facing IT and Life Science professionals. Each group will have a moderator to ensure focused conversations around key issues within the topic. This small group format allows participants to informally meet potential collaborators, share examples from their own work and discuss ideas with peers. Discussion topics may include:

  • IT Organizational Challenges
  • Molecular Modeling
  • Collaborative Science
  • Cybersecurity
  • Data Management Solutions
  • Infrastructure
  • Data Centers
  • Science Gateways
  • Networking
  • Cloud


2:15 Chairperson’s Remarks

2:20 Harnessing Data: Executing on an Idea to Make Data an Enterprise Asset

Pragati Mathur, Senior Vice President, Enterprise Architecture and Business Intelligence, Biogen

Jason Tetrault, Director, Data and Advanced Analytics, Biogen

Big Data, Data Science, and Analytics are common topics being discussed or used in every organization. Our goal is to share a few case studies on how to integrate Technology, Big Data, Cloud, Data Science, and Analytics to drive insights across Biogen, to enable collaborative problem solving, and to transform interactions to accelerate innovation. We will discuss our journey and review the architecture, technologies, governance, and lessons learned in making it happen.

2:50 NSF Cybersecurity Center of Excellence: Cybersecurity Resources for Scientific Research

James Marsteller, Security Officer, Pittsburgh Supercomputing Center

The Center for Trustworthy Scientific Cyberinfrastructure (CTSC) comprises cybersecurity experts who have spent decades working with science and engineering communities and who have an established track record of usable, high-quality solutions suited to the needs of those communities. This overview will highlight CTSC’s mission, past work with large facilities and cyberinfrastructure projects, key resources, and events of interest to the scientific research community. The session will share resources to advance the state of cybersecurity practice across the community: analyzing gaps in cybersecurity technology to guide researchers and developers, addressing the application of software assessment to complicated cyberinfrastructure software stacks, and broadly fostering the transition of cybersecurity research to practice.

3:20 Accelerating the Analysis of High-Throughput Sequencing

Bryce Olson, Global Marketing Director, Health and Life Sciences, Intel

People suffering from disease should be able to receive a diagnosis based on their genome, among other data, and receive a targeted treatment plan, all in one day. More collaboration and data sharing will be required to achieve this goal. Learn about new platforms, tools, and innovations that Intel is developing with leading research centers and cancer institutions to simplify and speed up data sharing and collaboration while protecting institutional IP and patient privacy.

3:50 Transition to Buses

4:30 Reception and Tour at San Diego Supercomputer Center

Visit the San Diego Supercomputer Center, a leader in advanced computation and all aspects of “Big Data,” for a complimentary reception and facility tour. With its two newest supercomputers, a data-intensive system called Gordon and Comet, a petascale system that entered production in 2015, SDSC is a partner in XSEDE (eXtreme Science and Engineering Discovery Environment), a National Science Foundation (NSF) program comprising the most advanced collection of integrated digital resources and services in the world. SDSC has also pioneered advances in data storage and cloud computing, and now houses several “centers of excellence” in the areas of large-scale data management, predictive analytics, health IT services, workflow automation, and Internet analysis.

Transportation between the conference venue and SDSC will be provided for all registered attendees. See website for the most up-to-date details.

6:00 Close of Day


7:15 am Breakfast Presentation (Sponsorship Opportunity Available) or Morning Coffee


8:00 Moderator’s Remarks

Allison Proffitt, Editorial Director, Bio-IT World & Clinical Informatics News

8:10 KEYNOTE PRESENTATION: Trends from the Trenches

Chris Dagdigian, Co-Founder and Principal Consultant, BioTeam, Inc.

Chris delivers a candid assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. He’ll cover what has changed (or not) in the past year around infrastructure, storage, computing, and networks. This presentation will help you understand IT to build and support data intensive science.

9:10 The NIH Data Commons – Digital Ecosystems for Using and Sharing Biomedical FAIR Data at Scale

Vivien Bonazzi, Ph.D., Senior Advisor for Data Science Technologies, National Institutes of Health

The challenges of using biomedical big data are now blocking scientists’ ability to do research and to replicate and build on previous work. We need to consider a digital ecosystem approach where biomedical big data is the central currency that can be easily accessed, shared and reused by others. The Data Commons is a platform that allows producers and consumers of scientific data to connect, interact, exchange, create value and generate new discoveries, creating the basis for a digital ecosystem that can support scientific discovery in the era of biomedical big data.

10:00 Coffee Break with Exhibit Viewing

10:30 Roundtable Discussions II

Join one of these interactive sessions designed to provoke thought and discussion in connection with specific topics facing IT and Life Science professionals. Each group will have a moderator to ensure focused conversations around key issues within the topic. This small group format allows participants to informally meet potential collaborators, share examples from their own work and discuss ideas with peers. Discussion topics may include:

  • Data Centers
  • Scientific Analytics
  • Cloud
  • Data Science
  • Storage
  • Data Transfers
  • Science Gateways
  • Cybersecurity

11:30 am Enjoy Lunch on Your Own

12:45 Dessert Break with Exhibit Viewing


1:15 Chairperson’s Remarks

1:20 Making Scientific Data 100x Easier to Use: Transforming Pharmaceutical R&D with Scalable Approaches to Data Stewardship and Data Integration

Nicole Glazer, Ph.D., Director, Scientific Information Management, Merck

Arguably, the vast majority of time and resources in scientific analytics and informatics projects is dedicated to finding, accessing, understanding, curating, and integrating the input data assets. While scientific data is generally managed effectively for its primary use, it often lacks the context that facilitates secondary uses and cross-functional integration. As a result, much of the research informatics effort across pharma is focused on creating solutions to these challenges for specific sets of use cases within a particular problem space. As the use of predictive modeling and analytics grows to address declining R&D productivity and increasing pressure to demonstrate product value, scalable approaches are required to handle the ever-increasing variety of data types, data sources, data models, and analytics patterns. To address these challenges at scale, we are combining data stewardship tools and capabilities that leverage crowdsourcing across the community of data creators with a platform approach to solving data-variety problems. Built on the Big Data technology stack, this platform enables an ecosystem of agile, fit-for-purpose datasets and informatics solutions.

1:50 DevOps Equals Freedom: Winning the War in Hybrid Clouds

Adam Kraut, Senior Scientific Consultant, BioTeam, Inc.

The Public Cloud is a necessary point of convergence for big science. Large-scale translational medicine will create a complex and dynamic battleground for IT operators. We need to build sustainable infrastructure and continue to leverage breakthrough services for life sciences. This presentation will focus on the factors leading to successful missions in Public Cloud and Hybrid Cloud landscapes. Adam will discuss DevOps as a discipline and the dichotomy of Hybrid Clouds.

2:20 Networking Refreshment Break


2:40 Roundtable Report-Outs

3:20 NCI Genomic Data Commons, Cloud Pilots, and FAIR

Warren Kibbe, Director, NCI Center for Biomedical Informatics and Information Technology (CBIIT), National Cancer Institute

The NCI Genomic Data Commons went live on June 6th with genomic data on 14,000 cancer tumors and associated (but limited) clinical phenotype data. The NCI has also been exploring the use of commercial clouds to create a very different access, curation, analytics, and visualization model for these data, with the intention of democratizing access and providing recognition and credit for data submitters, curators, algorithm creators, and software developers. Central to this model is support for the FAIR principles and broad sharing of patient-level data.

3:50 CLOSING KEYNOTE PRESENTATION: Data Bloat Spectrum Disorder: Home Remedies and Alchemy for Life Sciences

Ari Berman, Vice President & General Manager, Consulting Services, BioTeam, Inc.

Data generation throughout the life sciences research and healthcare domains has risen at a rate far beyond that predicted by Moore’s Law. As a result, organizations are accumulating tens to hundreds of petabytes (PB) of data, spending millions on storage systems, and doing it all in a manner consistent with IT practices and policies from 2005. These practices include little to no data management, ineffective or nonexistent data lifecycle policies, no metadata standards, and very few automated analysis pipelines that lead to better understanding of the data. The result is a mass of unannotated data whose content is known only to the researcher who created it. This leads to tribal knowledge of the data and renders merging it with other datasets for large-scale modeling and discovery nearly impossible. In this presentation, we will discuss the general scope of the data bloat problem, how organizations have been approaching it, current solutions to data bloat, and how it might be approached in the future: through the convergence and abstraction of storage, network, and compute infrastructure into common analytics platforms that enable discovery and ease data management.

4:35 Close of Converged IT Summit


Founding Sponsors