Monday, January 12, 2015

Wide-angle lens view of workflows

Contributors: Sarah Poon, Nan-Chen, Cecilia Aragon, Lavanya Ramakrishnan

The goal of our project is to focus on a couple of use cases in depth. However, we used the initial few months of the project to get a wide view of a number of different workflows that run on NERSC systems today. We talked to application scientists, as well as the computer engineers and scientists who support these projects' infrastructure needs, across the Advanced Light Source, Materials Project, Climate Science, MODIS Pipeline, JGI, and Cosmology. A few common themes came up during these discussions.

What is a workflow?
In all our conversations, we found that the word "workflow" was used with different meanings. Computer scientists in the community traditionally use it to refer to the pipeline that runs on a cluster or HPC resource. Scientists, however, often use "workflow" in their larger scientific context, i.e., the process that takes them from a scientific hypothesis to a publication. This is a longer discussion that we will cover in a future blog post.

Additionally, it was interesting that the majority of the groups we talked to had developed their scripts or workflow systems to manage their jobs through the batch queue system and the associated policies at the centers. We also noted that "workflow" and "workflow system" were used interchangeably in some cases. This is largely because, in many cases, a single script or a set of scripts both represents the workflow and manages it. Thus, the evolution of these scripts and workflow systems points to some of the challenges users face today and are concerned about.

Visibility of the machinery

A very common theme in our discussions was the visibility of the machinery. We often work very hard to build transparent interfaces for scientists, i.e., to hide the complexities of the underlying system from the end-user. In our conversations, we heard the whole breadth of requirements in this space. While transparency is generally very important, it is also true that the system works well until it doesn't. In situations where there was a "breakdown" of the mental model, users, especially advanced users, really wanted more control over or more visibility into the machinery. For example, these breakdowns happened when failures occurred and the user was unable to diagnose what happened, or when the user's mental model of how the software worked did not match what they saw. In other cases, scientists mentioned that the users of the analyses or the data on an HPC resource were unfamiliar with the environment and needed to be shielded from the details. As we move to exascale systems, where failures are common and the efficient use of hardware is critical, we need to carefully balance the visibility of the machinery against the needs of a wide range of users (computational scientists, data consumers, analysis users).

Level of effort
Many of the users discussed the level of effort it took them to "manage" workflows in HPC environments. The users had often developed scripts or infrastructure to monitor and manage their job submissions. Even so, they still spent considerable time manually bookkeeping and/or managing their jobs and data on the systems.
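To make this concrete, the scripts users described typically poll the batch scheduler's queue listing and keep their own records of which jobs are still active. The sketch below is purely illustrative (the column layout mimics Slurm's `squeue` output, but the function names and field positions are our assumptions, not any user's actual script): it parses a queue listing and flags submitted jobs that have dropped out of the queue, which is exactly the kind of manual bookkeeping users currently do by hand.

```python
# Illustrative sketch (not an actual user script): track batch jobs by
# parsing the text output of a queue-status command such as Slurm's
# `squeue`. Field positions follow squeue's default columns
# (JOBID PARTITION NAME USER ST TIME NODES ...) and are an assumption.

def parse_queue(queue_text):
    """Parse whitespace-delimited queue output into {job_id: state}."""
    jobs = {}
    lines = queue_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 5:
            job_id, state = fields[0], fields[4]
            jobs[job_id] = state
    return jobs

def needs_attention(submitted_ids, queue_text):
    """Return submitted job IDs no longer in the queue: they finished,
    failed, or were killed, and need manual follow-up either way."""
    active = parse_queue(queue_text)
    return [j for j in submitted_ids if j not in active]

# Example with canned queue output (in practice this text would come
# from running the scheduler's status command):
sample = """\
JOBID PARTITION NAME USER ST TIME NODES
1001  debug     sim1 alice R  0:10 4
1002  debug     sim2 alice PD 0:00 4
"""
print(needs_attention(["1001", "1002", "1003"], sample))  # ['1003']
```

Even this toy version hints at why the effort adds up: the script only knows a job is *gone*, not *why*, so distinguishing success from failure still falls to the human.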

This raises some important questions: how much of the "workflow management" can be automated, and how can automation support the human-in-the-loop situations that remain, for both usability and efficiency?

About the UDA Blog

We will use this blog to disseminate early results and news items from our project “Usable Data Abstractions for Next-Generation Scientific Workflows”.

A critical component of our research is ethnography-based user studies. Ethnography is the systematic study of people and culture. We employ ethnography-based user research to understand how scientists use existing applications for data management, analysis, and visualization. Ethnographic research involves a researcher observing the community from the point of view of the subjects of the study, and usually results in a case study or field report.

In our project, we use this knowledge to design usable data management software for exascale workflows, software that balances abstraction against transparency of the optimization choices users may need to make on next-generation hardware and software infrastructure. We believe that early insights from our user research can benefit the community, and we will use this blog to disseminate these results.

Why ethnography-based user studies?

In order to better understand users’ work and work practices in context, we will be conducting contextual inquiries with our users, which involve both interviews and observations of work occurring in its actual environment. While work is the set of tasks used to accomplish work goals, work practices comprise all the patterns of tasks, norms, communication, and routines used to carry out this work (Hartson and Pyla, 2012).

Interviews alone are not enough to uncover this level of knowledge. Many details may be implicit, or deemed unimportant or uninteresting by the users during an interview. In addition, users' opinions are often shaped by the limitations of existing tools. Observing work done in context allows us to gain a less biased view of existing work practices.

Our goal is not only to understand the various tasks used to carry out users' work, but also to uncover the intentions and strategies hidden in this observable work, and to integrate knowledge spread across various users into a unified understanding of these work practices. This knowledge will enable us to design a system that supports users' work practices and improves their effectiveness.


References

Rex Hartson and Pardha Pyla. 2012. The UX Book: Process and Guidelines for Ensuring a Quality User Experience (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.