
About

!!! abstract ""

    Just as attrs and dataclasses use type hints to simplify data type definition, scinexus uses them to simplify writing best-practice scientific algorithms.

scinexus (pronounced ‘sigh-nexus’) is a Python framework for rapid development of data processing applications. It enables interoperability between apps through defined data types, which supports the development of app ecosystems for a scientific domain (for examples, see cogent3 and piqtree).

Many scientific problems require repeating calculations across many files or database records. Such tasks suit data-level parallelism on multi-core CPUs, but writing robust, maintainable code for them is often tedious and quickly becomes complex.
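To make the pattern concrete, here is a minimal sketch of data-level parallelism written with only Python's standard library. It is not scinexus code: `mean_of_line`, `_safe_call`, and `run_all` are hypothetical names chosen for illustration. Even at this size, the code has to manage a worker pool and per-record failures by hand.

```python
from concurrent.futures import ProcessPoolExecutor


def mean_of_line(line: str) -> float:
    """Toy per-record calculation (hypothetical stand-in for real work)."""
    values = [float(tok) for tok in line.split()]
    return sum(values) / len(values)


def _safe_call(task):
    """Run func(item), returning the exception instead of raising it."""
    func, item = task
    try:
        return func(item)
    except Exception as err:  # one bad record should not stop the whole run
        return err


def run_all(func, items, parallel=False, max_workers=None):
    """Apply func to every item, serially or across worker processes."""
    tasks = [(func, item) for item in items]
    if not parallel:
        return [_safe_call(task) for task in tasks]
    # Tasks are pickled and sent to worker processes; map keeps input order.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(_safe_call, tasks))
```

Calling `run_all(mean_of_line, lines, parallel=True)` from a script, under an `if __name__ == "__main__":` guard, spreads the records across CPU cores. Failed records come back as exception objects in the same position as their input, so the caller still has to filter and report them.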

With scinexus apps, you can write your application in a functional programming style. Combined with scinexus app composition, this greatly simplifies your program logic, making it easier to understand and thus easier to explain. And, as we know:

!!! quote

    If the implementation is easy to explain, it may be a good idea.

    -- Tim Peters, "Zen of Python"
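As a rough illustration of the functional, composed style described above, the following sketch chains small type-hinted functions into one pipeline in plain Python. It does not use the scinexus API: `Failure`, `guarded`, and `compose` are hypothetical names, and `Failure` only loosely mirrors the idea of a NotCompleted record that is passed along instead of raising.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable


@dataclass(frozen=True)
class Failure:
    """Hypothetical failure record: which step failed and why."""

    step: str
    message: str


def guarded(func: Callable) -> Callable:
    """Wrap a step so failures are recorded and passed along, not raised."""

    def step(value):
        if isinstance(value, Failure):  # an earlier step failed: skip this one
            return value
        try:
            return func(value)
        except Exception as err:
            return Failure(step=func.__name__, message=str(err))

    return step


def compose(*funcs: Callable) -> Callable:
    """Chain steps left to right into a single callable."""
    steps = [guarded(f) for f in funcs]
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def parse(line: str) -> list[float]:
    return [float(tok) for tok in line.split(",")]


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


def describe(value: float) -> str:
    return f"mean={value:.2f}"


pipeline = compose(parse, mean, describe)
print(pipeline("1,2,3"))  # a formatted mean
print(pipeline("1,x,3"))  # a Failure naming the 'parse' step
```

Each step stays a small pure function that can be tested on its own, and the pipeline reads as a list of what happens to the data. A bad input yields a `Failure` value that flows through the remaining steps untouched, so a batch run can continue and the failures can be inspected afterwards.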

What you get

Standalone utilities

scinexus also provides generally useful utilities for developers of data analysis applications. Utilities for file IO, parallel execution, and progress tracking are usable independently of the app framework.


The scinexus origin story

The app infrastructure code was originally developed within cogent3, where it accumulated over seven years of development, testing, and real-world use in computational genomics before being extracted into scinexus. The design is mature and has underpinned analyses in published studies.

We acknowledge the many members of the cogent3 community who contributed to the code that now lives in scinexus, including @GavinHuttley, @rmcar17, @Nick-Foto, @KatherineCaley, @fredjaya, and @khiron.


1. Failures are automatically recorded as NotCompleted records, which are propagated and stored in data stores. These records capture salient details that help you identify the cause of the failure.

2. tqdm is the default because of its robustness in notebooks, but you can choose rich instead.

3. The default is Python’s standard library multiprocessing module. If you’re using Jupyter Notebooks, however, we recommend loky. It is an installation option, and configuration is easy.