Affiliated Projects

NumFOCUS Affiliated Projects are focused on open source data science, make meaningful use of NumFOCUS-sponsored tools, have an active community of contributors, and have a Code of Conduct, either adopted from our own or similar in spirit. Affiliated Projects are not fiscally sponsored by NumFOCUS.

We highlight affiliated projects to encourage the community to contribute to, promote, and support these open source tools!


Conda is an open source package management system and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them.


It works on Linux, OS X and Windows, and was created for Python programs but can package and distribute any software.


Cython is an optimising static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex). It makes writing C extensions for Python as easy as Python itself.


Dash is a Python framework for building analytical web applications. No JavaScript required. Built on top of Plotly.js, React, and Flask, Dash ties modern UI elements like dropdowns, sliders, and graphs to your analytical Python code.


Dask enables parallel computing through task scheduling and blocked algorithms.


This allows developers to write complex parallel algorithms and execute them in parallel either on a modern multi-core machine or on a distributed cluster.
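As a rough illustration of a blocked algorithm, the sketch below splits an array into chunks and sums it with Dask. The array shape and chunk size are arbitrary values chosen for this example, and it assumes Dask is installed with its array support.

```python
import dask.array as da

# A 1000 x 1000 array of ones, stored as a 4 x 4 grid of 250 x 250 blocks.
# Shape and chunk size are illustrative choices, not recommendations.
x = da.ones((1000, 1000), chunks=(250, 250))

# This only builds a task graph; no arithmetic has happened yet.
total = x.sum()

# The scheduler sums each block (possibly in parallel) and combines the results.
value = total.compute()
print(x.numblocks, value)
```

The same code can run on a single multi-core machine or, with a distributed scheduler configured, on a cluster.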

Data Retriever

The Data Retriever is a package manager for data. It downloads, cleans, and stores publicly available data, so that analysts spend less time cleaning and managing data, and more time analyzing it.


DyND is a C++ library for dynamic, multidimensional arrays.


It is inspired by NumPy, the Python array programming library at the core of the scientific Python stack, but tries to address a number of obstacles encountered by some of its users. Examples of this are support for variable-sized strings, ragged array types, and convenient usage from C++. The library is in a preview development state, and can be thought of as a sandbox where features are being tried and tweaked to gain experience with them.


Gensim is a Python library providing scalable statistical semantics, analysis of plain-text documents for semantic structure, and retrieval of semantically similar documents.


MDAnalysis is a Python library to analyze trajectories from molecular dynamics (MD) simulations.


It can read and write most popular formats, and provides a flexible and fast framework for writing custom analysis through making the underlying data easily available as NumPy arrays.


Numba gives you the power to speed up your applications with high performance functions written directly in Python.


With a few annotations, array-oriented and math-heavy Python code can be just-in-time compiled to native machine instructions, similar in performance to C, C++ and Fortran, without having to switch languages or Python interpreters.


Open source data visualization and data analysis for novice and expert. Interactive workflows with a large toolbox.


pomegranate is a Python module for fast and flexible probabilistic modeling inspired by the design of scikit-learn.


A primary focus of pomegranate is to abstract away the intricacies of a model from its definition, allowing users to easily prototype with complex models and training strategies. Its modular implementation allows for probability distributions to be swapped in or out for each other with ease and for models to be stacked within each other, yielding such delights as a mixture of Bayesian networks or a Gaussian mixture model Bayes classifier.


Free scientific and engineering development software used for numerical computations, and analysis and visualization of data using the Python programming language.


QuTiP is software for simulating quantum systems. QuTiP aims to provide tools for user-friendly and efficient numerical simulations of open quantum systems.


It can be used to simulate a wide range of physical phenomena in areas such as quantum optics, trapped ions, superconducting circuits and quantum nanomechanical resonators. In addition, it contains a number of other modules to simplify the numerical simulation and study of many topics in quantum physics such as quantum optimal control, quantum information, and computing.


SciPy is open-source software for mathematics, science, and engineering.


It is also the name of a very popular conference on scientific programming with Python. The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation. The SciPy library is built to work with NumPy arrays, and provides many user-friendly and efficient numerical routines such as routines for numerical integration and optimization.
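For a sense of what those routines look like in practice, here is a small sketch using `scipy.integrate.quad` and `scipy.optimize.minimize_scalar`. The integrand and the objective function are toy examples chosen because their exact answers are known.

```python
import math

from scipy import integrate, optimize

# Numerical integration: integrate sin(x) from 0 to pi (exact value is 2).
area, abs_error = integrate.quad(math.sin, 0.0, math.pi)

# Optimization: minimize (x - 3)^2, whose minimum lies at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3.0) ** 2)

print(area, result.x)
```

`quad` returns both the estimate and an error bound, and `minimize_scalar` returns a result object whose `x` attribute holds the minimizer.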


Free high-quality and peer-reviewed volunteer produced collection of algorithms for image processing.


scikit-bio is an open-source, BSD-licensed Python package providing data structures, algorithms, and educational resources for bioinformatics.


Module designed for scientific Python that provides accessible solutions to machine learning problems.


Statsmodels is a Python package that provides a complement to SciPy for statistical computations including descriptive statistics and estimation of statistical models.


Spack is a flexible package manager that builds multiple versions of packages for different configurations, platforms, and compilers. It was created to deploy large-scale scientific simulations on HPC systems, but it can also deploy software on Linux and macOS machines.


Spyder is a powerful scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts.


It features a unique combination of the advanced editing, analysis, debugging, and profiling functionality of a comprehensive development tool with the data exploration, interactive execution, deep inspection, and beautiful visualization capabilities of a scientific package. Furthermore, Spyder offers built-in integration with many popular scientific packages, including NumPy, SciPy, Pandas, IPython, QtConsole, Matplotlib, SymPy and more.


Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.


Yellowbrick is a Python package that visualizes the data science workflow, allowing users to visually steer the feature, algorithm, and hyperparameter selection process by directly extending the scikit-learn API.

Benefits of being an Affiliated Project include:

  • a listing on the NumFOCUS website
  • membership in the NumFOCUS internal project developers mailing list
  • eligibility for NumFOCUS small development grants
  • eligibility to participate in Google Summer of Code under the NumFOCUS umbrella
  • eligibility to apply for infrastructure support such as web hosting and server space

NumFOCUS Affiliated Projects:

  • are focused on open source data science,
  • make meaningful use of NumFOCUS-sponsored tools,
  • have an active community of contributors, and
  • have a Code of Conduct, either adopted from our own or similar in spirit.

If your project meets the above criteria and you would like to become a NumFOCUS Affiliated Project, please .


