programming guidelines
jkanche committed Jan 11, 2024
1 parent b7973c4 commit 83406e8
Showing 2 changed files with 115 additions and 0 deletions.
Empty file removed chapters/development_guide.qmd
Empty file.
115 changes: 115 additions & 0 deletions chapters/philosophy.qmd
@@ -0,0 +1,115 @@
# Programming philosophy {.unnumbered}

As we developed BiocPy packages, we established standards and contribution
guidelines to ensure code quality and consistency across all our packages.

## Class design

Our objective is to provide a consistent user experience in Python for each Bioconductor class.
In most cases, this is achieved by directly re-implementing the class and its
associated methods in Python.
Occasionally, the Bioconductor implementation contains historical baggage
(e.g., the storage of `rowData` in a `RangedSummarizedExperiment`,
`MultiAssayExperiment` harmonization); developers should use their own discretion
to decide whether that really needs to be replicated in Python.

## Naming

We highly recommend adhering to
[Google's Python style guide](https://google.github.io/styleguide/pyguide.html)
for consistency.
In summary, classes should use `PascalCase` and follow Bioconductor's
class names.
Methods should use `snake_case` and take the form of `<verb>[_<details>]`, such as
`get_start()`, `set_names()` and so on.
Method arguments should also use `snake_case` format.

Usability has been another crucial objective, to facilitate an easy
transition for users between R and Python classes.
For instance, when computing flanking regions in R:

```r
flank(gr, width=2, start=FALSE, both=TRUE)
```

In the [GenomicRanges](https://github.com/BiocPy/GenomicRanges) Python package, we maintain consistency by using the same method name.
The only difference lies in the shift from a functional to an object-oriented
programming paradigm:

```python
gr.flank(width=2, start=False, both=True)
```

### Functional discipline

The existence of mutable types in Python introduces the potential danger of
modifying complex objects.
If a mutable object has a user-visible reference and is also a member of a
larger Bioconductor object, a user-specified modification to that object may
violate the constraints of the parent object.
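
A minimal sketch of this hazard, assuming a hypothetical `Frame` class (not part of any BiocPy package) whose only constraint is that it holds one name per value:

```python
class Frame:
    """Hypothetical container: requires one name per value."""

    def __init__(self, names: list, values: list):
        if len(names) != len(values):
            raise ValueError("'names' and 'values' must have the same length")
        # Stored by reference, so the caller still holds the very same lists.
        self._names = names
        self._values = values

    def is_valid(self) -> bool:
        return len(self._names) == len(self._values)


names = ["a", "b", "c"]
frame = Frame(names, [1, 2, 3])
valid_before = frame.is_valid()

# The user mutates their own list; the parent object is never consulted.
names.append("d")
valid_after = frame.is_valid()
```

Here the constraint was checked once in the constructor, yet `frame` ends up invalid without any of its methods being called.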

To address these issues, we enforce a functional programming discipline in all
class methods.
By default, all methods should avoid side effects that mutate the object.
This simplifies reasoning around the effects of methods and/or mutations in large
complex objects.

The most notable application of this philosophy is in ***setter*** methods.
Instead of mutating the object directly, they should return a new copy of the object
with the desired modification.
The *"depth"* of the copy depends on the nature of the field being set; the aim
should be to avoid any modification of the contents in `self`.
Implementations may offer an `in_place=` option to apply the modification to the
original object, but this should default to `False`.
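
The setter convention could be sketched as follows. The `Ranges` class and its fields are hypothetical and only illustrate the pattern; they are not the actual GenomicRanges implementation.

```python
from copy import copy


class Ranges:
    """Hypothetical class used only to illustrate the setter convention."""

    def __init__(self, starts: list, names: list):
        self._starts = starts
        self._names = names

    def get_names(self) -> list:
        return self._names

    def set_names(self, names: list, in_place: bool = False) -> "Ranges":
        # A shallow copy is deep enough here: only the '_names' reference
        # is replaced, so nothing stored inside 'self' is modified.
        output = self if in_place else copy(self)
        output._names = list(names)
        return output


x = Ranges([1, 5, 9], ["a", "b", "c"])
y = x.set_names(["A", "B", "C"])  # 'x' keeps its original names
```

Because only one reference is swapped, `y` shares the untouched `_starts` list with `x`, which keeps the copy cheap.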

To avoid performance issues, ***getter*** methods may return mutable objects
without copying, on the assumption that callers treat the return values as
read-only and do not mutate them directly. (Setter methods that operate on a copy are still allowed.)
In some cases, the return value of a getter method may be directly mutated,
e.g., because a copy was already created in the getter; this should be clearly
stated in the documentation but should not be treated as the default.
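
From the caller's side, the read-only assumption might look like the sketch below. The `Annotated` class and its `get_metadata()`/`set_metadata()` methods are hypothetical names chosen for illustration.

```python
from copy import copy


class Annotated:
    """Hypothetical class with a mutable metadata dictionary."""

    def __init__(self, metadata: dict):
        self._metadata = metadata

    def get_metadata(self) -> dict:
        # No copy is made: callers are expected to treat this as read-only.
        return self._metadata

    def set_metadata(self, metadata: dict, in_place: bool = False) -> "Annotated":
        output = self if in_place else copy(self)
        output._metadata = metadata
        return output


obj = Annotated({"genome": "hg19"})

meta = obj.get_metadata()  # same dict as stored in 'obj'
new_meta = dict(meta)      # copy first, then modify the copy
new_meta["genome"] = "hg38"
updated = obj.set_metadata(new_meta)
```

The caller pays for a copy only when a modification is actually needed, and `obj` is left untouched.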

#### Property based getters and setters

Direct access to class members (via properties or the `@property` decorator) should
generally be avoided, as it is too easy to perform modifications via one-liners
with the `class.property` notation on the left-hand side of an assignment.

The default assumption is that property-based setters will mutate the object in-place.
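
As an illustration of that difference, the sketch below places a property-based setter next to a functional one. The class is hypothetical and is not taken from any BiocPy package.

```python
from copy import copy


class Ranges:
    """Hypothetical class contrasting property and functional setters."""

    def __init__(self, names: list):
        self._names = list(names)

    def get_names(self) -> list:
        return self._names

    def set_names(self, names: list, in_place: bool = False) -> "Ranges":
        output = self if in_place else copy(self)
        output._names = list(names)
        return output

    @property
    def names(self) -> list:
        return self.get_names()

    @names.setter
    def names(self, names: list):
        # Assignment syntax cannot return a new object, so it mutates 'self'.
        self.set_names(names, in_place=True)


x = Ranges(["a", "b"])
alias = x  # second reference to the same object

y = x.set_names(["A", "B"])  # functional: 'x' and 'alias' are unchanged
x.names = ["p", "q"]         # property: 'alias' sees the change as well
```

Any other holder of a reference to `x`, such as `alias` here, observes the property assignment, which is why the functional setter is the safer default.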

## Type hints

As the term suggests, type hints are "hints" used to enhance the developer
experience; they should not dictate how we write our code.

For this reason, we prefer simple types in these hints, usually corresponding to
base Python types with minimal nesting.
For example, if a function is expected to operate on any arbitrary list, the
basic list type hint should suffice.

```python
def find_element(arr: list, query: int):
    pass
```

If the function expects a list of strings,

```python
from typing import List

def find_element(arr: List[str], query: str):
    pass
```

If the function accepts multiple types as inputs,

```python
from typing import List, Union

def find_element(arr: List[str], query: Union[int, str, slice]):
    pass
```

Additionally, we provide recommendations on setting up the package, different
testing environments, documentation, and publishing workflows.
These details can be found in the [developer guide](https://github.com/BiocPy/developer_guide).
