Diversity and Metrics in Software Communities — Notes from the DISC Unconference

This is a contributed post by participants in the 2017 Diversity and Inclusion in Scientific Computing (DISC) Unconference.


Authors*: Kelle Cruz, Daniel S. Katz, Yasamin Khorramzadeh, Tiziano Zito

*All authors are equal contributors. They met and began working together at the NumFOCUS Diversity and Inclusion in Scientific Computing (DISC) Unconference at the 2017 PyData NYC Conference.

This post summarizes discussions the authors held at the NumFOCUS Diversity and Inclusion in Scientific Computing (DISC) Unconference at the 2017 PyData NYC Conference. Our goal was to think about how to assess the current demographics and “diversity” of the NumFOCUS community and to come up with a way to quantify any changes in these demographics.

Why is diversity important?

People with different backgrounds, skill sets, attitudes, and experiences bring different perspectives and thus a broader range of ideas to the table. Diversity of thought, opinion, and background yields more innovative solutions to complex problems, leading to more productive collaborations and richer learning experiences.

What do we mean by diversity?

Dictionary.com defines diversity in a broad sense as “the state or fact of being diverse; difference; unlikeness” and in the more specific inclusion context as “the inclusion of individuals representing more than one national origin, color, religion, socioeconomic stratum, sexual orientation, etc.” Groups and communities that are concerned with diversity, and that want to improve it or simply track it, first need to define what they mean by it and then need to measure it.

Diversity is very context-specific (to a place or community, for example), which makes defining it difficult. Questions one might ask are: How close are your demographics to those of larger groups? Are you representative? But most often, diversity is tied to an effort to enable those with less privilege to have an equal voice. This makes context even more important, as a key question is: Who are the people who do or do not have privilege? And this implies another question: What changes do you want to make?

Once diversity and the objectives of measuring it have been defined, actual measurement can be discussed. Measuring diversity is difficult.

Diversity metrics in action – the ASPP summer school

The ASPP summer school offers a case study of measuring diversity and using the results to reach a goal. Ten years ago, the German Neuroinformatics Node (G-Node) launched a yearly summer school, ASPP, to teach Advanced Scientific Programming in Python to students, postdocs, and senior scientists. The intent was to give the scientific community the opportunity to learn techniques and tools, derived from the software development industry, for developing efficient and sustainable code, and to close the gap in typical scientific academic curricula, which usually do not go beyond some very basic programming courses.

The summer school grew quickly, and it has recently received over 300 applications a year for the 30 places typically available. Each year, this self-funded effort attracts students who want to acquire software engineering skills to tackle new challenges in their areas of expertise, ranging from integrating several levels of data, to re-analyzing old datasets in a new light, to developing open source software for the benefit of the community.

It became clear after the first instance of the summer school that there was a problem attracting female applicants: fewer than 15% of the applications were from women (gender was guessed from applicant names). Given that the application review process was performed blindly, i.e., name and gender information were erased from the application data presented to the reviewers, it was clear that the problem was not just some reviewers’ unconscious bias, and that the organizers had to take action to correct this.

Starting the next year, the organizers began applying a “female bias” to the applications. They asked for gender information in the application form, and after the reviewers had assigned their scores to the applications, applications from female applicants automatically had their scores increased. The first year that the “female bias” was introduced, the scores for female applicants were increased by 10%. With this change the gender balance started improving, at least in terms of the number of female participants. But, interestingly enough, the number of female applicants also increased steadily over time, and in the last editions 40% of the applications were from women. The steady increase in female applicants was paired with a steady decrease of the “female bias” in the review process; in 2016 it was down to 5%.

Given that in 2017 gender was almost balanced in the applications, the organizers decided to remove the “female bias” altogether. This most recent year, 60% of the attendees were female!
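The score adjustment described above can be sketched in a few lines of Python. This is only an illustration, not the organizers' actual code: the `Application` fields, the function names, and the sample scores are hypothetical, and we assume the increase is multiplicative (a boost of 0.10 means +10%) and that places go to the highest adjusted scores.

```python
from dataclasses import dataclass


@dataclass
class Application:
    applicant_id: str    # hypothetical identifier; names are hidden from reviewers
    gender: str          # self-reported on the application form
    review_score: float  # score assigned during the blind review


def adjusted_score(app: Application, boost: float) -> float:
    """Return the review score, raised by `boost` for female applicants.

    Assumption: the boost is multiplicative, so boost=0.10 means +10%.
    """
    if app.gender == "female":
        return app.review_score * (1.0 + boost)
    return app.review_score


def select(applications, places: int, boost: float):
    """Rank by adjusted score (highest first) and keep the top `places`."""
    ranked = sorted(
        applications,
        key=lambda app: adjusted_score(app, boost),
        reverse=True,
    )
    return ranked[:places]


# Made-up scores, for illustration only.
applications = [
    Application("a1", "male", 8.0),
    Application("a2", "female", 7.5),
    Application("a3", "male", 7.9),
    Application("a4", "female", 7.0),
]

with_boost = [a.applicant_id for a in select(applications, places=2, boost=0.10)]
without_boost = [a.applicant_id for a in select(applications, places=2, boost=0.0)]
```

With these made-up numbers, a 10% boost moves applicant `a2` into the selected group, while a boost of zero reproduces the plain reviewer ranking. Lowering `boost` year by year, as the organizers did, is then a one-parameter change.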

However, there are several open issues with this generally successful story. First, applicants were asked to identify as male or female. This is of course a problem because it assumes binary gender identification. There are two possible ways forward: stop asking about gender information, given that the problem is “fixed”, or, more sensibly perhaps, continue asking about gender information, but in a more inclusive way, for example by allowing a third, non-binary gender identification label, so that some sort of “bias” can still be implemented if needed. A second issue is that it is not clear why the number of female applicants increased over the years: it may be that female applicants felt more welcome after seeing that a significant proportion of participants in previous years were female, or it may be that this is a general trend in scientific computing and the school is benefiting from it, or it may be a result of word-of-mouth advertising from past female participants. Most likely, it is a combination of these effects.

Detailed data and plots about this case study are available at https://python.g-node.org/python-summerschool-2017/archives.html#stats

Surveys

Surveys are a means to measure the diversity of participants in a community. But as with other diversity issues, it is important for a survey to have an intended outcome; it should not be done simply because “we are curious.”

One example of a question that a survey may answer is how the diversity of various projects and events compares. These results could be used to identify the more successful diversity and inclusion measures and practices each project has used and to encourage their implementation in other projects.

The survey we have planned starts with questions meant to identify the role of the respondent in NumFOCUS projects and conferences, and which projects or activities the respondent most identifies with. This last question is meant to make the results useful to individual projects, and to make the other data analyzable both overall and in a project-specific manner.

Then we will ask about the respondent’s current professional status, current occupation (e.g., academia, industry, government, non-profit), career stage, and country of residence.

Finally, we will ask questions related to the respondent’s identity as a member of marginalized groups. We plan to ask these questions because we want a way to assess whether our efforts to implement inclusive practices are actually making a difference. We will first ask if they consider themselves a member of a marginalized group in their own field. If they say yes, they will then have the opportunity to identify one or more dimensions in which they meant this, such as: gender identity; ethnicity, nationality, skin color, or race; sexual orientation; religion; age; disability; and/or another dimension.

Second, we will ask if there are other factors that prevent them from fully participating in their chosen NumFOCUS activities. If they say yes, we will ask them which factors they meant, such as: nationality, religion, age, disability, socioeconomic status, English language proficiency, family care responsibilities, and/or another factor. If they say yes to this second question, we will also ask what would make it easier for them to participate in NumFOCUS activities, such as: travel support (e.g., funds, childcare), monetary compensation for project work, mentoring infrastructure (e.g., buddy system or office hours to facilitate community integration), career recognition (e.g., NumFOCUS-sponsored prizes/awards, titles that recognize both contributions and responsibilities), training opportunities (e.g., remote or in-person workshops such as on GitHub, coding), or career advancement support (e.g., help convincing administrators of the value of open source contributions).
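To make the results analyzable “both overall and in a project-specific manner,” responses could be tallied per project roughly as follows. This is a minimal sketch under our own assumptions: the record layout, the field names (`project`, `marginalized`, `barriers`), and the sample responses are hypothetical and do not reflect a finalized survey format.

```python
from collections import Counter, defaultdict

# Hypothetical responses, for illustration only.
responses = [
    {"project": "project_a", "marginalized": True,
     "barriers": ["family care responsibilities"]},
    {"project": "project_a", "marginalized": False, "barriers": []},
    {"project": "project_b", "marginalized": True,
     "barriers": ["English language proficiency", "socioeconomic status"]},
    {"project": "project_b", "marginalized": True,
     "barriers": ["socioeconomic status"]},
]


def summarize(responses):
    """Tally respondents, self-identified marginalized share, and barriers per project."""
    totals = Counter()
    marginalized = Counter()
    barriers = defaultdict(Counter)
    for response in responses:
        project = response["project"]
        totals[project] += 1
        if response["marginalized"]:
            marginalized[project] += 1
        # Follow-up answers exist only if the respondent reported barriers.
        for barrier in response.get("barriers", []):
            barriers[project][barrier] += 1
    return {
        project: {
            "respondents": totals[project],
            "marginalized_share": marginalized[project] / totals[project],
            "barriers": dict(barriers[project]),
        }
        for project in totals
    }


summary = summarize(responses)
overall_share = sum(r["marginalized"] for r in responses) / len(responses)
```

Repeating the same tally on a later round of the survey would give one way to quantify changes over time, which is the goal stated at the start of this post.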

Conclusion

Diversity is important, but it’s also context-specific. Measuring diversity should have a reason and a goal. In any particular case, it’s important to define the context and objectives, as well as the methods that can be used to achieve the objectives, based on measured data. We have shown how measuring diversity for the summer school allowed the organizers to achieve their objectives in the areas they cared about. In order to measure diversity across NumFOCUS projects, we have drafted a set of potential survey questions. Measuring diversity is hard, and different groups in different fields have previously done it in different ways. It’s not clear which of these methods, if any, apply to the way we are studying and trying to improve diversity. But we are eager to learn!


To stay in the loop on this and other DISC initiatives, join our email discussion list.
