This is a guest post by Matthew Rocklin, cross-posted from his personal blog. You can find him on Twitter at @mrocklin.

As general purpose open source software displaces domain-specific all-in-one solutions, many institutions are re-assessing how they build and maintain software to support their users. This is true across for-profit enterprises, government agencies, universities, and home-grown communities.

While this shift brings opportunities for growth and efficiency, it also raises questions and challenges about how these institutions should best serve their communities as they grow increasingly dependent on software developed and controlled outside of their organization.

  • How do they ensure that this software will persist for many years?
  • How do they influence this software to better serve the needs of their users?
  • How do they transition users from previous all-in-one solutions to a new open source platform?
  • How do they continue to employ their existing employees who have historically maintained software in this field?
  • If they have a mandate to support this field, what is the best role for them to play, and how can they justify their efforts to the groups that control their budget?

This blogpost investigates this situation from the perspective of large organizations that serve the public good, such as government funding agencies or research institutes like the National Science Foundation, NASA, DoD, and so on. I hope to write separately about this topic from both enterprise and community perspectives in the future.

This blogpost provides context, describes a few common approaches and their outcomes, and draws some general conclusions.

Example

To make this concrete, place yourself in the following situation:

You manage a software engineering department within a domain-specific institution (like NASA). Your group produces the de facto standard software suite that almost all scientists in your domain use; or at least that used to be true. Today, a growing number of scientists use general-purpose software stacks, like Scientific Python or R-Stats, that are maintained by a much wider community outside of your institution. While the software solution that your group maintains is still very much in use today, you think it is unlikely to remain relevant in five to ten years.

What should you do? How should your institution change its funding and employment model to reflect this new context, where the software it depends on is relevant to many other institutions as well?

Common approaches that sometimes end poorly

We list a few common approaches, and some challenges or potential outcomes.

1. Open your existing stack for community development

You’ve heard that by placing your code and development cycle on GitHub, your software will be adopted by other groups and you’ll be able to share maintenance costs. You don’t consider your software to be intellectual property, so you go ahead and hope for the best.

Positive Outcome: Your existing user-base may appreciate this and some of them may start submitting bugs and even patches.

Negative Outcome: Your software stack is already quite specific to your domain. It’s unlikely that you will see the same response as a general purpose project like Jupyter, Pandas, or Spark.

Additionally, maintaining an open development model is hard. You have to record all of your conversations and decision-making online. When people from other domains come with ideas you will have to choose between supporting them and supporting your own mission, which will frequently come into conflict.

In practice your engineers will start ignoring external users, and as a result your external users will stay away.


2. Start a new general purpose open source framework

Positive: If your project becomes widely adopted, you get some additional development help and, perhaps more importantly, your institution gains a reputation as an interesting place to work. This helps tremendously with hiring.

Negative: This is likely well beyond the mandate of your institution.

Additionally, it’s difficult to produce general purpose software that attracts a broad audience. This is for both technical and administrative reasons:

  1. You need to have feedback from domains outside of your own. Let’s say you’re NASA and you want to make a new image processing tool. You need to talk to satellite engineers (which you have), and also microscopists, medical researchers, astronomers, ecolo