Contributing¶

Extending the framework¶

There are several ways to extend the framework. This section outlines its structure so that the basic modules can be extended easily. Five types of modules can be added with little effort; they are listed below. Each module must define a module-level `load` method that passes the hyperparameters from the sacred configuration to the constructor of the class.

dataset:	The datasets live in the `deep_bottleneck.dataset` folder and require a `load` method returning a training and a test dataset.
model:	The models live in the `deep_bottleneck.model` folder and require a `load` method as well. In this case the `load` method returns a trainable Keras model.
estimator:	The mutual information estimators live in the `deep_bottleneck.mi_estimator` folder and require a `load` method as well. The `load` method should return an estimator that can compute the mutual information from a dataset; the estimator is further specified by a hyperparameter called `discretization_range`.
callback:	Callbacks can be used for different kinds of tasks. They live in the `deep_bottleneck.callbacks` folder and are used to save the required information during training or to influence the training process (e.g. early stopping). They need to inherit from `keras.callbacks.Callback`.
plotter:	Plotters use the data saved by the callbacks to create the different plots. They live in the `deep_bottleneck.plotter` folder and need a `load` method returning a plotter class inheriting from `deep_bottleneck.plotter.base.BasePlotter`.
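
As a rough illustration of the `load` convention, a dataset module might look like the sketch below. The module name `toy_parity`, the `Dataset` container, and the hyperparameters `n_bits` and `test_fraction` are assumptions made for this example and are not part of the framework; the return types the framework actually expects are described in the API documentation.

```python
"""Hypothetical dataset module, e.g. deep_bottleneck/dataset/toy_parity.py."""
import itertools
from typing import List, NamedTuple, Tuple


class Dataset(NamedTuple):
    """Illustrative container; the framework may expect a different type."""
    examples: List[Tuple[int, ...]]
    labels: List[int]


def load(n_bits: int = 3, test_fraction: float = 0.25) -> Tuple[Dataset, Dataset]:
    """Module-level load function: hyperparameters in, (training, test) out."""
    # Enumerate all bit patterns and label each one with its parity.
    examples = list(itertools.product([0, 1], repeat=n_bits))
    labels = [sum(bits) % 2 for bits in examples]

    # Hold out the last `test_fraction` of the patterns as the test set.
    n_test = max(1, int(len(examples) * test_fraction))
    split = len(examples) - n_test
    training = Dataset(examples[:split], labels[:split])
    test = Dataset(examples[split:], labels[split:])
    return training, test
```

The point of the sketch is only the shape of the interface: the hyperparameters arrive as keyword arguments of `load`, and the caller receives a training and a test dataset.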

To add a new module, place it in the respective folder. Then set the corresponding configuration parameter to the import path of the module. If the path is defined correctly and the module has a matching interface, it is imported automatically in experiment.py and carries out its task. The API documentation describes the interfaces and the existing methods in more detail.
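
The lookup by import path can be pictured roughly as follows. This is a sketch of the general mechanism using the standard `importlib` module, not the actual code in experiment.py. The helper `load_from_path`, the module name `toy_constant_model`, and its `scale` hyperparameter are hypothetical, and a stand-in module is registered in memory so that the snippet runs without the framework installed.

```python
import importlib
import sys
import types


def load_from_path(import_path: str, **hyperparameters):
    """Import the module at `import_path` and call its module-level `load`."""
    module = importlib.import_module(import_path)
    return module.load(**hyperparameters)


# Stand-in for a module file inside one of the framework folders
# (hypothetical name; registered in memory only for this demonstration).
stand_in = types.ModuleType("toy_constant_model")
stand_in.load = lambda scale=1.0: {"kind": "constant", "scale": scale}
sys.modules["toy_constant_model"] = stand_in

# The configuration only needs to carry the import path and the hyperparameters.
config = {"model": "toy_constant_model", "scale": 2.0}
model = load_from_path(config["model"], scale=config["scale"])
print(model)
```

Because only a string is stored in the configuration, swapping one module for another does not require touching the code that runs the experiment.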

Git workflow¶

This workflow describes the process of adding code to the repository.

Describe what you want to achieve in an issue.
Pull master to get up to date.
1. git checkout master
2. git pull
Create a new local branch with git checkout -b <name-for-your-branch>. It can make sense to prefix your branch with a description like feature or fix.
Solve the issue, most probably in several commits.
In the meantime there might have been changes on the master branch. So you need to merge these changes into your branch.
1. git checkout master
2. git pull to get the latest changes.
3. git checkout <name-for-your-branch>
4. git merge master. This might lead to conflicts that you have to resolve manually.
Push your branch to GitHub with git push origin <name-for-your-branch>.
Go to GitHub and switch to your branch.
Send a pull request from the web UI on GitHub.
After you have received comments on your code, you can simply update your pull request by pushing to the same branch again.
Once your changes are accepted, merge your branch into master. This can also be done by the last reviewer who accepts the pull request.

Git commit messages¶

Have a look at this guideline.

Most important:

Single-line summary starting with a verb (at most 50 characters)
Longer description if necessary (wrapped at 72 characters)

Editors like vim enforce these constraints automatically.
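
The two limits can also be checked mechanically. The following is a small sketch of such a check; the function name `check_commit_message` is made up for illustration, and the script is not part of the repository.

```python
from typing import List

SUMMARY_LIMIT = 50  # maximum length of the first line
BODY_LIMIT = 72     # maximum length of every later line


def check_commit_message(message: str) -> List[str]:
    """Return a list of problems; the list is empty if the message is fine."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["missing summary line"]

    problems = []
    if len(lines[0]) > SUMMARY_LIMIT:
        problems.append(
            f"summary has {len(lines[0])} characters (limit {SUMMARY_LIMIT})"
        )
    # Every line after the summary must respect the wrapping limit.
    for number, line in enumerate(lines[1:], start=2):
        if len(line) > BODY_LIMIT:
            problems.append(
                f"line {number} has {len(line)} characters (limit {BODY_LIMIT})"
            )
    return problems
```

The check covers line lengths only; whether the summary starts with a verb still has to be judged by the author.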

Style Guide¶

Follow the PEP 8 style guide. It is worth reading through the entire style guide, but the most important points are summarized here.

Naming¶

Functions and variables use snake_case
Classes use CamelCase
Constants use CAPITAL_SNAKE_CASE
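
A short snippet showing the three conventions together (all names are invented for illustration):

```python
MAX_EPOCHS = 100  # constant: CAPITAL_SNAKE_CASE


class EarlyStoppingTracker:  # class: CamelCase
    """Tracks the best loss seen so far."""

    def __init__(self) -> None:
        self.best_loss = float("inf")  # variable: snake_case

    def update_best_loss(self, current_loss: float) -> bool:  # function: snake_case
        """Store `current_loss` if it is an improvement and report whether it was."""
        improved = current_loss < self.best_loss
        if improved:
            self.best_loss = current_loss
        return improved
```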

Spacing¶

Spaces around infix operators and assignment

a + b not a+b
a = 1 not a=1

Keyword arguments are an exception

some_function(arg1=a, arg2=b) not some_function(arg1 = a, arg2 = b)

Use one space after separating commas

some_list = [1, 2, 3] not some_list = [1,2,3]

In general, PyCharm’s auto format (Ctrl + Alt + L) should be good enough.

Type annotation¶

Type annotations are supported since Python 3.5. They make sense for public interfaces that should be kept consistent.

def add(a: int, b: int) -> int:
    return a + b

Docstrings¶

Use Google Style for docstrings in everything that has a somewhat public interface.
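
For reference, a Google Style docstring has the following general shape. The function `normalize` is an invented example, not framework code.

```python
from typing import List


def normalize(values: List[float], epsilon: float = 1e-8) -> List[float]:
    """Scale values so that they sum to approximately one.

    Args:
        values: Non-negative numbers to normalize.
        epsilon: Small constant added to the denominator to avoid
            division by zero.

    Returns:
        A new list with the same length as `values`.

    Raises:
        ValueError: If `values` is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    total = sum(values) + epsilon
    return [value / total for value in values]
```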

Clean code¶

Here is our non-exhaustive list of guidelines for writing cleaner code.

Use meaningful variable names
Keep your code DRY (Don’t repeat yourself) by abstracting into functions and classes.
Keep everything at the same level of abstraction
Write functions without side effects
Functions should have a single responsibility
Be consistent, stick to conventions, use a style guide
Use comments only for what cannot be described in code
Write comments with care, correct grammar and correct punctuation
Write tests if you write a module
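
Two of these points, functions without side effects and tests for new code, can be illustrated with a toy example. The names are invented, and the use of the standard `unittest` module is an assumption; the project may rely on a different test runner.

```python
import unittest
from typing import List


def append_loss_in_place(history: List[float], loss: float) -> None:
    """Side effect: mutates the caller's list, which is easy to overlook."""
    history.append(loss)


def with_loss(history: List[float], loss: float) -> List[float]:
    """No side effect: returns a new list and leaves the input untouched."""
    return history + [loss]


class WithLossTest(unittest.TestCase):
    def test_input_is_not_modified(self) -> None:
        history = [0.9, 0.7]
        updated = with_loss(history, 0.5)
        self.assertEqual(updated, [0.9, 0.7, 0.5])
        self.assertEqual(history, [0.9, 0.7])
```

If the snippet is saved as a module, the test case can be run with `python -m unittest <module-name>`.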

Experiment workflow¶

Define a hypothesis
Define the set of parameters that is going to stay fixed
Define the parameter to change (including possible values for the parameter)
Create a meaningful name for the experiment (group of experiments, name of the parameter tested)
Make sure you set a seed (PyCharm: in run options append: “with seed=0”)
Program the experiment (set parameters) using our framework
Commit your changes locally to obtain a commit hash; this is going to be logged by sacredboard
Make sure your experiment is logged to the database
Start the experiment
Interpret and document results in a notebook. Include relevant plots using the artifact viewer. Make sure the notebook is completely executed.
Move your notebook to docs/experiments, so it will be automatically included in the documentation.
Push your local branch to GitHub to make all commits available to everyone
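
The planning steps of this workflow (fixed parameters, the parameter under test, and a meaningful name) can be written down before any training code is touched. The sketch below is plain Python and deliberately independent of sacred; the parameter names, their values, and the naming scheme are invented for illustration.

```python
from typing import Any, Dict, List

# Parameters that stay fixed for the whole experiment group (invented values).
FIXED_PARAMETERS: Dict[str, Any] = {"epochs": 100, "batch_size": 256, "seed": 0}

# The single parameter under test and the values to try.
TESTED_PARAMETER = "activation_fn"
TESTED_VALUES = ["tanh", "relu"]


def build_configurations(group: str) -> List[Dict[str, Any]]:
    """Create one named configuration per value of the tested parameter."""
    configurations = []
    for value in TESTED_VALUES:
        configuration = dict(FIXED_PARAMETERS)  # copy, so the constant stays untouched
        configuration[TESTED_PARAMETER] = value
        configuration["name"] = f"{group}/{TESTED_PARAMETER}={value}"
        configurations.append(configuration)
    return configurations


for configuration in build_configurations("activation_comparison"):
    print(configuration["name"])
```

Keeping the fixed and the varied parameters in separate places makes it easy to see afterwards which single factor a group of runs was meant to test.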

Documentation¶

To build the documentation run:

$ cd docs
$ make html

A short reStructuredText reference is available. There is also a longer video tutorial.

If you added new packages and want to add them to the API documentation use:

$ sphinx-apidoc -o docs/api_doc/ deep_bottleneck deep_bottleneck/credentials.py deep_bottleneck/experiment.py deep_bottleneck/demo.py

Make sure to change the header of modules.rst back to “API Documentation”.