Tuesday, December 1, 2015

Evaluating the proposed capabilities to be supported by an API

In our experience, designing an application with usability as a key goal leads to software that is more readily adopted by a scientific collaboration. Usability applies not only to GUIs but also to APIs and CLIs. There has been some work on evaluating the usability of already implemented APIs [1,2,3,4,5]. A few studies have focused on taking a user-centered approach to getting early feedback on an API before implementation [6,7].

We built on this body of work by developing a user experience study to test the proposed features of a future API. This study revealed which functions would have the greatest impact on the users’ work as well as missing functionality.

Methodology
We conducted the study with three domain scientists, with each session lasting approximately one hour. We first defined each feature and then asked the study participants to perform two tasks.

How one user filled in the valence/arousal ratings for the feature list

The first task involved asking the participants how they felt about the feature. Each participant placed the feature number on a valence/arousal chart. Valence represents how positively or negatively the participant feels about a feature, while arousal indicates how strongly they feel about it. For example, feeling excited about a feature would register as high valence, high arousal, whereas feeling ambivalent about a feature would be neutral valence, low arousal. Once they completed this task, we asked them to explain their ratings.

For the second task, we asked the participants to rank the features according to what they most wanted to see implemented. Additionally, we asked them for a cutoff point in the ranked list, separating the features they considered necessary from those they would be okay with the development team not getting to.

Finally, we asked users if there were any missing features or if there were any mismatches between the concepts and how they thought about their work.

The importance heatmap

To analyze the results, we visualized the importance of each feature as a heatmap. In categorizing the data into bins, we realized we had to make some judgment calls about where to split the list. Therefore, we asked participants to verify the results of the binning afterwards. We then further categorized the features into implementation priority recommendations based on the following criteria:

A - Very important (team should look at it immediately) - any time 2 or more people considered it "critical".
B - Important (team should look at it soon) - any time 2 or more people considered it "should have" or higher.
C - Important (team should consider this in more detail) - any time at least one person had a score of "would be good to have" or higher.
D - Don't worry about it now.
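
The criteria above can be read as a small decision rule. The following is a minimal sketch in Python, not the procedure we actually ran: it assumes the ratings form an ordered scale, and the lowest label ("not needed") and the function name priority_bin are hypothetical names introduced only for illustration.

```python
from typing import List

# Assumed ordinal scale, lowest to highest. The top three labels are quoted
# in the criteria above; "not needed" is a hypothetical lowest level.
SCALE = ["not needed", "would be good to have", "should have", "critical"]


def priority_bin(ratings: List[str]) -> str:
    """Map one feature's ratings (one per participant) to a priority bin A-D."""
    levels = [SCALE.index(r) for r in ratings]

    def count_at_least(label: str) -> int:
        # Number of participants who rated the feature at `label` or higher.
        threshold = SCALE.index(label)
        return sum(1 for level in levels if level >= threshold)

    if count_at_least("critical") >= 2:
        return "A"
    if count_at_least("should have") >= 2:
        return "B"
    if count_at_least("would be good to have") >= 1:
        return "C"
    return "D"


if __name__ == "__main__":
    print(priority_bin(["critical", "critical", "should have"]))
    print(priority_bin(["critical", "should have", "not needed"]))
```

The checks run in order, so a feature lands in the highest bin whose condition it meets.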

Methodological Takeaways
Gathering both emotional-response feedback and ranking feedback gave us the relevant context for discussing the features with the domain scientists. However, once we concluded the study and tried to analyze the data, we realized that we didn’t have quite the right level of information to develop specific recommendations for which features should be implemented. We performed the binning with the information we had on hand but felt the need to verify our binning results with the users. In future studies, we plan to include a binning exercise, using the scale shown in the above heatmap, as one of the study tasks.

A future blog post will discuss our thoughts on evaluating a “paper” API, where most of the function definitions and signatures are designed but not yet implemented.

References

[1] Grill, Thomas, Ondrej Polacek, and Manfred Tscheligi. "Methods towards api usability: A structural analysis of usability problem categories." Human-Centered Software Engineering. Springer Berlin Heidelberg, 2012. 164-180.

[2] Rama, Girish Maskeri, and Avinash Kak. "Some structural measures of API usability." Software: Practice and Experience 45.1 (2015): 75-110.

[3] Cataldo, Marcelo, et al. "The impact of interface complexity on failures: an empirical analysis and implications for tool design." School of Computer Science, Carnegie Mellon University, Tech. Rep (2010).

[4] Robillard, Martin P., and Robert Deline. "A field study of API learning obstacles." Empirical Software Engineering 16.6 (2011): 703-732.

[5] Stylos, Jeffrey, et al. "A case study of API redesign for improved usability." Visual Languages and Human-Centric Computing, 2008. VL/HCC 2008. IEEE Symposium on. IEEE, 2008.

[6] Clarke, Steven. "Measuring API usability." Doctor Dobbs Journal 29.5 (2004): S1-S5.

[7] Ramakrishnan, Lavanya, et al. "Experiences with user-centered design for the Tigres workflow API." e-Science (e-Science), 2014 IEEE 10th International Conference on. Vol. 1. IEEE, 2014.


Wednesday, November 11, 2015

AnalyzeThis Paper at SC

A UDA paper on AnalyzeThis will be presented at SC|15 next week. The paper describes AnalyzeThis, an analysis-aware appliance, and its implementation atop an emulation platform. Various members of the UDA team will be at SC!



Wednesday, October 21, 2015

Time Paper accepted at CSCW 2016

We are back after a good summer of research with our students and staff.

Earlier we described some of our early work on the time aspects of HPC. Our work examining the HPC sociotechnical system through a time lens was accepted at CSCW 2016.


Link to paper

Thursday, June 4, 2015

Accounts: A different approach to abstractions

Supercomputers usually have a variety of storage options including large scratch space, which is local to the system, and global storage for user and project directories. Additionally, users have access to tape storage for long term archival needs. There are various reasons why users may need to move files between these storage systems, including staging input data for faster access and clearing disk space for new files.

One domain scientist shared a story about a middleware solution that provided an abstraction layer to help manage the reading, writing, and transferring of files between these systems. The idea was to free users from thinking about which storage system to use. For example, users could simply issue a read command and let the software figure out where the file was actually stored. Although this made the system easier to use, a problem was quickly identified. Users had no way of knowing whether the file was stored local to the machine or on the tape archive. The difference in latency between the two options was significant, but without knowing where the file was, the users had difficulty estimating this latency. The ability to estimate time for such processes is key in helping users plan their own activities (e.g., deciding whether to wait for a process to finish or to go home instead).

Dourish and Button have written about this common problem with computational abstractions:
“[O]ne source of interactional problems in HCI lies in the way that computational abstractions are black boxes, in which the activity they encapsulate is obscured from view, and so unavailable to users as a resource around which they might be able to organise their action.” [1]
They argue that instead, systems should be designed such that they are observable and reportable:
“[I]n order to manage the relationship between the user’s work and the system’s action more effectively, we need to provide users with more information about how the system goes about performing the activities that have been requested; and that the place to look for this information is within the implementation, below the abstraction barrier.” [2]
They advocate not doing away with abstractions but having systems create accounts of their behaviors:
“An account ... is a model of the system’s activity offered by the system to account for and cast light on its own action.” [2]
Frustration with the lack of transparency in HPC systems has been a common theme in our research. Thus, incorporating the idea of accounts into system design seems worthy of further investigation.
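
To make the idea concrete for the storage story above, here is a minimal sketch in Python of a read abstraction that offers an account of what it is about to do. It is an illustration only, not the middleware the scientist described: the class names, the tier names, and the latency figures are all hypothetical assumptions.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical storage tiers with rough, made-up latency estimates (seconds).
TIER_LATENCY_SECONDS: Dict[str, float] = {
    "scratch": 0.1,
    "global": 1.0,
    "tape": 600.0,
}


@dataclass
class Account:
    """What the system reports about how it will carry out a read."""
    path: str
    tier: str
    estimated_latency_s: float

    def describe(self) -> str:
        return (
            f"{self.path} is on {self.tier} storage; "
            f"expect roughly {self.estimated_latency_s:g} s before data is available"
        )


class StorageCatalog:
    """Keeps the single read abstraction but exposes where each file lives."""

    def __init__(self, locations: Dict[str, str]) -> None:
        self._locations = dict(locations)

    def account_for_read(self, path: str) -> Account:
        # Look below the abstraction barrier: which tier holds the file?
        tier = self._locations[path]
        return Account(path, tier, TIER_LATENCY_SECONDS[tier])


if __name__ == "__main__":
    catalog = StorageCatalog({"run1.h5": "tape", "run2.h5": "scratch"})
    print(catalog.account_for_read("run1.h5").describe())
```

The user still issues a single kind of read request, but the account lets them decide whether to wait or come back later.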

References:
  1. Dourish, P. 2001. Where the Action Is: The Foundations of Embodied Interaction. Cambridge: MIT Press.
  2. Dourish, P. and Button, G. 1998. On "Technomethodology": Foundational Relationships between Ethnomethodology and System Design. Human-Computer Interaction, 13(4), 395-432.

Wednesday, May 27, 2015

Roles in the HPC Ecosystem


In our previous post, we discussed the different roles within a scientific collaboration. Here, we take a different view by describing the people surrounding HPC machines. We refer to this sociotechnical system as the HPC Ecosystem. The people using and supporting these machines fall into the following roles:

Domain Scientists

Domain scientists are researchers in specific areas of basic science, such as cosmology, microbial biology, materials science, and climate science. Most domain scientists conduct their work as part of one or more scientific collaborations. One or more principal investigators (PIs) lead the collaboration. As discussed in our previous post, team members often include senior scientists, mid-career scientists, and early-career postdocs and students. In order to answer fundamental science questions, they run codes written either by their scientific community or by their collaborators. The scale of the codes and/or the data associated with them requires these to be run in computing environments larger than the typical workstation, such as in cloud, cluster, and HPC environments. Domain scientists are the primary users of the HPC system we studied.

Computer Engineers

Computer engineers are members of the scientific collaboration who help scientists run codes on the HPC system. This work may involve modifying existing software or ‘codes’ to utilize the systems, installing community codes, writing custom codes to aid in the running and analysis of scientific codes, and developing cyberinfrastructure to support the scientific workflows.

HPC Facility Staff

HPC facility staff comprises people in various roles surrounding the acquisition and support of HPC systems. This includes people who act as liaisons between the users of the system and the facility to understand users’ needs, people who manage the allocation of system resources among its users, and people who work directly with users to help them troubleshoot their system usage.

Future work will discuss the effects that are revealed when examining the interplay of the people within these roles and the systems.

Monday, April 6, 2015

Do you have another minute?




By Nan-Chen Chen


In modern society, time is a critical factor in people’s daily lives. People plan their work and tasks based on time, or evaluate their efforts in terms of their personal time input. In research fields that study work, temporality is a topic that has thrived for over a decade. Studies and discussions around temporality consider time organic and socially constructed. For example, an hour spent with family is not perceived as a lost hour if spending time with family is a strong personal value. Another long-debated example is whether the invention of clocks changed the way we experience time, or whether our experience of time naturally led to the invention of clocks. As there are many different perspectives on time and several existing studies that demonstrate the impact of time on interactions and behaviors in work environments, it is important to further understand the role of time in scientific workflows in high performance computing (HPC) systems.

In the fieldwork we conducted in the past months, we uncovered several different types of time, such as human time--when a person waits for a job in a queue to run; or machine time--the time it takes to execute a job or to move data. Our results reveal a need to consider the cost and value of different types, scales, and contexts of time. For example, what is the cost and value of the time to run an analysis? The cost could be in terms of allocation spent on computation or I/O, and the value might be the scientific knowledge that is generated from the analysis. A more complex example might be human wait time for a job to run. At first glance, the cost in human time of code running could be pegged to the amount of wait time and clock time, which might be seen as pure inefficiency. But is the cost as high as it appears? Perhaps during this wait time, the scientist has time to reflect upon a previous run, generate the type of creative insight that often occurs during downtime, or accomplish other critical tasks. Thus there may be unexpected hidden value in this “wasted” waiting time.

The examples above demonstrate that time in the context of scientific workflow in HPC systems needs to be approached from diverse perspectives. As more HPC systems and workflow tools are designed to save time, it will be essential to consider human perceptions and judgments rather than just absolute numbers. Maybe in the future when we ask scientists, “Do you have another minute?” they will tell us, “Yes, but it is a more valuable minute than the one that just passed.”

Tuesday, March 10, 2015

Workflow Tools and the Coding Culture of Scientists

Sarah Poon, Nan-Chen Chen, Cecilia Aragon, Lavanya Ramakrishnan

In this project, our team consists of both computer scientists and HCI researchers who sit at the intersection of computer science and the domain sciences. We interact with the domain scientists to understand how they use computation to achieve their science goals and work with the computer scientists who are developing the tools necessary to efficiently run codes on supercomputers or analyze the data.

In many cases, the domain scientists are the ones developing the science codes, the algorithms used to produce and analyze scientific data, while computer scientists develop tools and technologies to help the scientists run their codes on HPC systems.

Workflow Tools
Workflow libraries and systems offer many benefits to scientists to aid in the instrumentation of their codes to run on HPC. Staff members at the NERSC supercomputing facility in Oakland recently formed a workflows working group, with the aim of evaluating a subset of workflow tools that can be run and supported on HPC systems to serve the needs of scientific users.

The discussion of workflows and the qualities to look at when evaluating workflow tools deserves a more detailed analysis than what can be addressed here. Instead, we will briefly discuss one dimension: whether a scientist is able to design and author a workflow that will run efficiently at scale without the aid of a computer scientist or workflow tool expert. In the cases where workflow tools are used to support complex workflows (e.g., handling of very large amounts of data), careful workflow design is needed to effectively use HPC resources. Therefore, it is usually not the scientists themselves but workflow experts who write these workflows. In many of the use cases discussed at the workshop, these workflows were considered production workflows meant to be run over and over without much modification. Thus, the upfront investment of careful workflow design made sense to these groups.

Scientific work that is more iterative often has workflows that are much more experimental in nature and require constant tweaking and revisions. In these cases, scientists have expressed the desire to be able to author the workflow themselves, without the aid of a computer scientist or workflow expert, usually in ways that fit naturally into their current coding environment. We have seen examples of high throughput workflows, where each run submits hundreds or thousands of application variants that are only run once. We have also seen examples of workflows that need to go into production but are provisional and require refinement. For these types of workflows, workflow tools that were expressly designed to be easy for people to self-author were a better fit. Like workflow tools, workflow types deserve a blog post of their own, which we will explore in the future.

Workflow tools often expect domain scientists to encapsulate their code into black boxes with well-defined inputs and outputs. In other words, the science codes need to be written in a clean, modular way. However, this expectation might be at odds with some of the realities that scientists face when they write their code, and it becomes problematic when scientists are writing the workflows without the aid of a computer scientist.
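
As a rough illustration of this gap, the sketch below contrasts an exploratory-style function with the same calculation wrapped as a black box. It is a hypothetical example in Python, not code from any project we studied: the function names, the data values, and the threshold are assumptions made up for illustration.

```python
from typing import List


def exploratory_style() -> float:
    # Typical of code written for personal use: the data and the threshold
    # are hard-coded (hypothetical values), so nothing can be varied from
    # outside the function.
    values = [0.2, 1.5, 3.1, 0.7]
    threshold = 1.0
    selected = [v for v in values if v > threshold]
    return sum(selected) / len(selected)


def mean_above_threshold(values: List[float], threshold: float) -> float:
    """Black-box version: explicit inputs, one explicit output."""
    selected = [v for v in values if v > threshold]
    if not selected:
        raise ValueError("no values above threshold")
    return sum(selected) / len(selected)


if __name__ == "__main__":
    # The second form is something a workflow tool could call as one step,
    # with different inputs on every run.
    print(mean_above_threshold([0.2, 1.5, 3.1, 0.7], threshold=1.0))
```

Both functions compute the same thing here. The difference is that only the second can be handed new inputs by a workflow tool without editing the code.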

How domain scientists code

To illustrate the culture surrounding scientific code among scientists, we will explore different roles* of scientists in collaborative projects that heavily use HPC resources.

[* These descriptions of roles and coding culture are generalizations based on interviews as well as years of working with scientific collaborations. There are fuzzy boundaries and overlap in these roles when it comes to the characterizations described, and roles vary across projects.]

Principal Investigators/Senior Scientists: As the lead of a science project working with a number of scientists, students, and postdocs, the PI on a project not only has quite a lot of knowledge of the various ways computation is used in the project but also often has a hand in the running and coding of some pieces of this software. They often enjoy a certain amount of the day-to-day running of software and coding of analysis tasks, and desire a high level of transparency on how the libraries work. However, they generally do not mind offloading certain tasks to a computer scientist if and when available, such as tuning their code to run well at scale. Despite a lot of experience in computation, several PIs and senior scientists have expressed concerns they are not doing things “the right way”, especially whenever an HPC facility does a systems upgrade or procures a new machine and their codes no longer compile or perform as expected. This sometimes stems from the fact that, although they are aware that APIs for parallelism or multi-threading are available, these APIs are under-utilized due to the burden of learning them and refactoring their code to use them.

Mid-Career Scientists: Mid-career scientists in science collaborations have often gained a lot of experience using supercomputers as postdocs and graduate students. They have sometimes adopted many software engineering best practices by working with groups of computer scientists over the years. They do a large chunk of the day-to-day operations and analysis, and often have strong opinions about the types of tools and libraries they want to use, favoring tools that are simple to use, that don’t take a large amount of effort to learn, and that accommodate sometimes daily iterations on their code. Some mid-career scientists have built computational tools, even workflow tools, despite the fact that existing tools could often have been utilized by their collaborations. In some cases, they simply didn’t realize such tools existed or had a difficult time seeing a match between a tool and their needs. Other times, due to the complexity of the software and the steep learning curve, these scientists felt it would be easier to write their own. Most scientists start by writing the software primarily for their personal use to solve a specific problem. This can result in code that is at times poorly documented and riddled with hard-coded variables. The task of then making this code production quality can seem like a huge burden compared to the act of writing the original code. In the end, all software written by these scientists is primarily a means to an end, since their career advancement is based on science results and publications, not git commits.

Early Career Scientists: Early career scientists often haven’t had much programming experience beyond what is taught in introductory classes or coding bootcamps. Early career scientists can range from graduate students to postdocs to junior scientists who often will spend a somewhat limited amount of time on a project (sometimes several years, sometimes only a few months). They usually need to learn these skills quickly and on the go, in order to get the science results they need to produce papers during the limited time on the project. An example task they might perform is to add a piece of analysis code into an inherited pipeline that may not be entirely functional. If their software runs inefficiently, they may not have enough background knowledge to recognize, diagnose, and fix the issues. Some of these early career scientists have expressed a feeling of being overwhelmed by the amount of knowledge needed to run seemingly simple code and of intimidation by being surrounded by senior scientists with so much computational knowledge. Like mid-career scientists, their primary goal is not to become an expert coder but to publish in their field.

Here are some early thoughts about the scientific coding culture that we can extract from these roles:

All of these scientist types are probably not going to be spending as much time planning and thinking out their code as perhaps a computer scientist would.

They may not even know all the available techniques and tools out there, may find them too cumbersome to learn, or may not be able to define their problems in such a way that appropriate tools and libraries can be found.

Adding in any new tool or library will potentially be an afterthought - something that will be added in after a piece of code is already written and running (possibly poorly) in production.

They are looking for ways to solve their immediate pain points, not necessarily for ways for their code to be more efficient. Science results are the goal, not efficient code.

All of these factors in the scientific coding culture mean that scientists often think and write code in ways that are different from computer scientists. They have different constraints and different motivations. This leads to a potential mismatch between what workflow tools are expecting and what domain scientists can provide. Since there is such a strong push to increase computational competency in the sciences, one may wonder if these differences dissolve over time. But when you look at the three roles, you see that there will typically be a range of skills, and often technology moves exponentially while people tend to learn incrementally**. Any individual scientist may increase their computational competency over time but at a slower rate than technological advances. Although we cannot anticipate how this coding culture will change over time, it is likely to continue to be true that tools aimed at computer scientists will not always align well with the way domain scientists code. Therefore, tools that are specifically targeted towards scientific coding should aim to empower these scientists while respecting their coding culture.

** Law of Disruption by Larry Downes

Monday, January 12, 2015

Wide-angle lens view of workflows

Contributors: Sarah Poon, Nan-Chen Chen, Cecilia Aragon, Lavanya Ramakrishnan

The goal of our project is to focus on a couple of use cases in depth. However, we used the initial few months of the project to get a wide view of a number of different workflows that run on NERSC systems today. We talked to a number of application scientists and computer engineers and scientists who support these projects in their infrastructure needs. We talked to the Advanced Light Source, Materials Project, Climate Science, MODIS Pipeline, JGI and Cosmology. There were a few common themes that came up during the discussions.

What is a workflow?
In all our conversations, we found that the word workflow was used with different meanings. Computer scientists in the community traditionally use the word workflow to refer to the pipeline that runs on a cluster or HPC resource. However, the scientists often refer to workflow in their larger science context, i.e., the process that takes them from a scientific hypothesis to a publication. This is a longer discussion that we will cover in one of our future blog posts.

Additionally, it was interesting that the majority of the groups that we talked to have developed their own scripts or workflow systems to address the problem of managing their jobs through the batch queue system and associated policies at the centers. We also noted that workflow and workflow system were used interchangeably in some cases. This might largely be due to the fact that in many cases, a single script or a set of scripts is used both to represent the workflow and to manage it. Thus, the evolution of these scripts and workflow systems points to some of the challenges users face today and are concerned about.

Visibility of the machinery

A very common theme that came across in our discussions was the visibility of the machinery. We often work very hard to build transparent interfaces for scientists, i.e., we work towards hiding the complexities of the underlying system from the end user. In our conversations, we heard the whole breadth of requirements in this space. While hiding complexity is generally very important, it is also true that the system works well until it doesn’t. In situations where there was a “breakdown” of the mental model, the users, especially advanced users, really wanted more control over or more visibility into the machinery. For example, these breakdowns happened when failures occurred and the user was not able to diagnose what happened, or when the user’s mental model of how the software worked didn’t match what he or she saw. In other cases, scientists mentioned that the users of the analyses or the data on an HPC resource were unfamiliar with the environment and needed to be shielded from the details. As we move to exascale systems, where failures are common and the efficient use of hardware is critical, we need to carefully consider balancing the visibility of the machinery while addressing a wide range of users (computational scientists, data consumers, analysis users).

Level of effort
Many of the users discussed the level of effort it took them to “manage” workflows in HPC environments. The users had often developed scripts or infrastructure to monitor and manage their job submission. However, the users still spent considerable time manually book-keeping and/or managing their jobs and data on the systems.

This raises some important questions about how much of the “workflow management” can be automated and how it can facilitate human-in-the-loop situations for both usability and efficiency.

About the UDA Blog

We will use this blog to disseminate early results and news items from our project “Usable Data Abstractions for Next-Generation Scientific Workflows”.

A critical component of our research work is ethnography-based user studies. Ethnography is the systematic study of people and culture. We employ ethnography-based user research to understand how scientists use existing applications for data management, analysis, and visualization. Ethnographic research involves a researcher observing the community from the point of view of the subject of the study and usually results in a case study or field report.

In our project, we use this knowledge to help design easy-to-use data management software for exascale workflows that can balance abstraction and transparency of the optimization choices that the user might need to make when using next-generation hardware and software infrastructure. We believe that early insights from our user research can benefit the community and will use this blog to disseminate these results.

Why ethnography-based user studies?

In order to better understand users’ work and work practices in context, we will be conducting contextual inquiries with our users, which involve both interviews and observations of work occurring in its actual environment. While work is the set of tasks used to accomplish work goals, work practices comprise all the patterns of tasks, norms, communication, and routines used to carry out this work (Hartson and Pyla, 2012).

Interviews alone are not enough to uncover this level of knowledge. Many details may be implicit or deemed unimportant or uninteresting by the users during an interview. In addition, users’ opinions are often shaped by the limitations of existing tools. Observing work done in context allows us to gain a less biased view of existing work practices.

Our goal is not only to understand the various tasks used to carry out users’ work, but also to uncover the intentions and strategies hidden in this observable work and to integrate knowledge spread across various users to get a unified understanding of these work practices. This knowledge will enable us to design a system that supports users’ work practices and improves their effectiveness.


References

Rex Hartson and Pardha Pyla. 2012. The UX Book: Process and Guidelines for Ensuring a Quality User Experience (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.