Principles
The guiding principles below are those shaped and informed by the core values.
Wherever possible, re-use existing and openly licensed material
Software that isn’t open source is by definition not transparent, FAIR, or open and conflicts with our values. We aim to use open source software as the basis for our software and products. In the open source software world, there exists many great tools, methods, frameworks, and resources used widely in industry that haven’t been incorporated into common research data practices. Combined with being open source, a more widely used tool or method means that more people are or have examined its codebase. As more people examine the code, there is a higher likelihood that issues are found and fixed faster. So we aim to make use of and/or modify open software and resources for our projects wherever possible.
Build using tools that are familiar to or used by both specialists and non-specialists
For highly technical specialists like software engineers or other full-time IT professionals, there is a rich ecosystem of tools they can draw on and use for nearly any purpose. However, non-specialists who may also be technically knowledgeable may not have the same resources, skills, or knowledge to know how to use these tools. We want to build software and products in a way that encourages more contributions from people outside our team, from contributors with a diverse range of skills and knowledge.
By building in this way, it can help us maintain and develop our products as there may be more potential contributors, who will also (hopefully) use and share our products, leading to greater use. So we aim to use development tools and knowledge that as many potential contributors and future team members are at least familiar with, or soon to be familiar with, when building our products.
At the same time, we recognise that there will always be tools and knowledge that are not as established or are newer, but that provide a substantial improvement over existing or more widely used tools. So while we aim to use more familiar tools, we are also open to using newer tools that are not too difficult to learn and use, and that provide us with substantial benefits.
This way we can balance being accessible, while also always seeking to improve how we work given our resources, time, and knowledge.
Use and build software that integrates easily with other software
Software that can be easily integrated with other software is more likely to be used and maintained. When it is more easily integrated with other systems, it also likely means that it has a good design, such as being modular, well-documented, and using established conventions. We aim to design and build software:
- That can be used in a variety of contexts, with a range of other systems.
- That fits well into our own and others’ software ecosystems and workflows.
By following many of the principles in this page, we will already have achieved this. What this means is that we will likely build smaller, focused software packages that each can connect well together into a larger ecosystem.
Use software that has characteristics of stability and reliability
There are always new software being built, but often though not always, maintenance and development is easier when using tools that are more well established. New technology progresses quite quickly, with new things coming or going, and we want to use tools that are more likely to still be used in the not-too-distant future. Combined with the pace of change, while there are many organisations that develop or support open source software, not all of them have openness as a core value and mission that aligns with ours. So we aim to:
- Use tools made by organisations that align with our values.
- Use tools that have a strong history of stability and reliability to minimise the risk of having to refactor or change our processes too often.
- Use tools that are designed and built in a way that makes switching to other tools easier (e.g. more interoperable).
However, this aim does not mean we won’t be looking for and willing to use new tools or technologies that are substantially better than the tools we are currently using.
Build software that can be used in diverse computing environments
Not all institutions or companies have the funding or access to more powerful computing environments. Nor do many data needs depend on using a server for legal or privacy reasons. Some groups may be easily able to have their data be locally or be put up on popular hosting services like GitHub or Zenodo. To be equitable and inclusive, we aim to design and build for computing environments with less powerful read/write speeds, limits in memory and storage, local rather than remote/server environments, lower Internet access and connectivity, and in different operating systems.
Build software that is code-based rather than GUI-based
Building software with a graphical user interface (GUI) requires extensive expertise, such as UI/UX knowledge, frontend development, and effective API design. GUI’s also require more maintenance and are more likely to have bugs because they can be harder to test. They also aren’t as powerful and flexible as code-based interfaces. So, following our principle of building for us first, we aim to build software that is code-based, such as a Python packages or command-line interfaces.
Be conservative and careful about adding dependencies
Dependencies are other software that our software relies on to work. A fundamental purpose of building software is to build on the work of others, which means depending on others’ work. But, the more dependencies we rely on, the higher the risk of a security breach or vulnerability happening with one of them. So we aim to:
- Keep the number of dependencies to a minimum, to keep risk exposure low.
- Use popular and established software as dependencies. A larger user-base will mean that the dependency has more exposure to different use cases, more opportunity to identify and fix issues, and more people examining the code.
- Be very careful when selecting a software to make as a dependency. Use our principles as a guide to help us decide and write decision posts to document our rationale and make informed decisions.
Develop content closer to where it is used or relevant
The further apart related content is from each other, the higher chance that there will be drift between them. In the context of software, what this means is that when documentation related to a piece of code is in another file or folder, the less likely developers will remember to keep the two updated. So documentation on the code is written in the same file as the code itself.
This also applies to design documentation and how-to guides. When they are in another repository, they are less likely to be updated with changes in the software (and vice versa). So we aim to keep related content as close as possible to each other. Anything related to a software product will be found within the same repository.
Prioritise and seek out simplicity in all aspects of developing software
Simplicity comes in many forms and is much hard to design and build than complex software. For instance, we aim to:
- Favour simple packaging over more complex ones, for instance, building Python packages instead of Docker images.
- Use simpler, less nested folder structures and hierarchies.
- Build smaller, more focused packages instead of larger more complex ones with more features.
- Write simpler, clearer interfaces (like functions) with clear actions, inputs, outputs, and without side-effects.
- Build smaller functions that chain together to build larger ones.
- Use fewer and simpler programming data structures.
Use a functional programming paradigm
Functional programming tends to be easier to reason about, develop, and most importantly test. This is largely because it avoids or makes it difficult to modify existing objects when run within the function, emphasises clear input and output without side effects, and has features that are more natural to humans to think about. We aim to make use of more functional programming features and where absolutely necessary use object-oriented programming.