Contributing

Extending the framework

The framework can be extended in several ways. This section outlines its structure so that the basic modules are easy to extend. There are five types of modules that can be added easily; they are listed below. Each module must define a module-level load method that passes the hyperparameters from the sacred configuration to the constructor of the class.

dataset: The datasets live in the deep_bottleneck.dataset folder and require a load method returning a training and a test dataset.
model: The models live in the deep_bottleneck.model folder and also require a load method. In this case, the load method returns a trainable Keras model.
estimator: The mutual information estimators live in the deep_bottleneck.mi_estimator folder and also require a load method. It should return an estimator that can compute the mutual information based on a dataset; the estimator is specified in more detail by a hyperparameter called discretization_range.
callback: Callbacks can be used for different kinds of tasks. They live in the deep_bottleneck.callbacks folder and are used to save the information needed during training or to influence the training process (e.g. early stopping). They need to inherit from keras.callbacks.Callback.
plotter: Plotters use the data saved by the callbacks to create the different plots. They live in the deep_bottleneck.plotter folder and need a load method returning a plotter class inheriting from deep_bottleneck.plotter.base.BasePlotter.
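
As a rough illustration of the load convention, a dataset module might look like the following sketch. The module name, the Dataset container, and the parameters n_samples, test_fraction and seed are hypothetical and not taken from the framework; the return types that deep_bottleneck actually expects are described in the API documentation.

```python
"""Hypothetical module deep_bottleneck/dataset/toy_parity.py (illustration)."""
import random
from typing import List, NamedTuple, Tuple


class Dataset(NamedTuple):
    """Minimal stand-in container; the framework's own type may differ."""
    examples: List[List[int]]
    labels: List[int]


def load(n_samples: int = 100, test_fraction: float = 0.2,
         seed: int = 0) -> Tuple[Dataset, Dataset]:
    """Module-level load function returning a training and a test dataset."""
    rng = random.Random(seed)  # local generator keeps the data reproducible
    examples = [[rng.randint(0, 1) for _ in range(4)]
                for _ in range(n_samples)]
    labels = [sum(example) % 2 for example in examples]  # parity of the bits

    n_test = int(n_samples * test_fraction)
    training = Dataset(examples[n_test:], labels[n_test:])
    test = Dataset(examples[:n_test], labels[:n_test])
    return training, test
```

The hyperparameters arrive as plain keyword arguments, so the same function can be called from a configuration or directly in a test.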

To add a new module, place it in the respective folder and set the corresponding configuration parameter to the import path of the module. If the path is defined correctly and the module has a matching interface, it is imported automatically in experiment.py and carries out its task. The API documentation describes the interfaces and the existing methods in more detail.
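
Resolving an import path from the configuration and calling the module's load function can be sketched with the standard library as follows. This is an assumption about how such a lookup might work, not the code in experiment.py; the helper load_from_path and the in-memory module toy_dataset are hypothetical.

```python
import importlib
import sys
import types


def load_from_path(import_path: str, **hyperparameters):
    """Import the module at import_path and call its load function."""
    module = importlib.import_module(import_path)
    if not hasattr(module, "load"):
        raise AttributeError(
            f"Module {import_path!r} does not define load().")
    return module.load(**hyperparameters)


# Register a tiny in-memory module so the sketch runs without the framework.
toy_module = types.ModuleType("toy_dataset")
toy_module.load = lambda n_samples=3: (list(range(n_samples)), [0])
sys.modules["toy_dataset"] = toy_module

training, test = load_from_path("toy_dataset", n_samples=5)
print(training, test)
```

With a lookup of this kind, swapping a dataset or a model only requires changing a string in the configuration.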

Git workflow

This workflow describes the process of adding code to the repository.

  1. Describe what you want to achieve in an issue.

  2. Pull the master branch to get up to date.

    1. git checkout master
    2. git pull
  3. Create a new local branch with git checkout -b <name-for-your-branch>. It can make sense to prefix your branch with a description like feature or fix.

  4. Solve the issue, most probably in several commits.

  5. In the meantime there might have been changes on the master branch. So you need to merge these changes into your branch.

    1. git checkout master
    2. git pull to get the latest changes.
    3. git checkout <name-for-your-branch>
    4. git merge master. This might lead to conflicts that you have to resolve manually.
  6. Push your branch to GitHub with git push origin <name-for-your-branch>.

  7. Go to GitHub and switch to your branch.

  8. Send a pull request from the web UI on GitHub.

  9. After you have received comments on your code, you can simply update your pull request by pushing to the same branch again.

  10. Once your changes are accepted, merge your branch into master. This can also be done by the last reviewer who accepts the pull request.

Git commit messages

Have a look at this guideline.

Most important:

  • Single-line summary starting with a verb (at most 50 characters)
  • Longer description if necessary (wrapped at 72 characters)

Editors like vim enforce these constraints automatically.
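
For editors without such support, the length limits can be checked with a few lines of Python. The function check_commit_message below is a hypothetical helper, not part of the repository. It also checks for a blank line between summary and body, a common Git convention that the list above does not mention, and it cannot check whether the summary starts with a verb.

```python
from typing import List

SUMMARY_LIMIT = 50  # maximum length of the summary line
BODY_LIMIT = 72     # maximum length of each body line


def check_commit_message(message: str) -> List[str]:
    """Return the violations of the length limits (empty if there are none)."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["Missing summary line."]

    problems = []
    if len(lines[0]) > SUMMARY_LIMIT:
        problems.append(
            f"Summary has {len(lines[0])} characters (limit {SUMMARY_LIMIT}).")
    if len(lines) > 1 and lines[1].strip():
        # Assumed convention: summary and body are separated by a blank line.
        problems.append("Second line should be blank.")
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > BODY_LIMIT:
            problems.append(
                f"Line {number} has {len(line)} characters "
                f"(limit {BODY_LIMIT}).")
    return problems
```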

Style Guide

Follow the PEP 8 style guide. It is worth reading through the entire style guide, but the most important points are summarized here.

Naming

  • Functions and variables use snake_case
  • Classes use CamelCase
  • Constants use CAPITAL_SNAKE_CASE
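
The three naming conventions side by side, in a made-up example (none of these names exist in the framework):

```python
MAX_EPOCHS = 100  # constant: CAPITAL_SNAKE_CASE


class LearningRateSchedule:  # class: CamelCase
    def __init__(self, initial_rate: float):
        self.initial_rate = initial_rate  # variable: snake_case

    def rate_at_epoch(self, epoch: int) -> float:  # function: snake_case
        """Linearly decay the learning rate to zero over MAX_EPOCHS."""
        remaining_fraction = max(0.0, 1 - epoch / MAX_EPOCHS)
        return self.initial_rate * remaining_fraction
```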

Spacing

Use spaces around infix operators and assignments

  • a + b not a+b
  • a = 1 not a=1

Keyword arguments are an exception

  • some_function(arg1=a, arg2=b) not some_function(arg1 = a, arg2 = b)

Use one space after separating commas

  • some_list = [1, 2, 3] not some_list = [1,2,3]

In general, PyCharm’s auto-format (Ctrl + Alt + L) should be good enough.

Type annotation

Type annotations have been supported since Python 3.5. They make sense for public interfaces that should be kept consistent.

def add(a: int, b: int) -> int:
    return a + b

Docstrings

Use Google Style for docstrings in everything that has a somewhat public interface.
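
A Google-style docstring combined with type annotations might look like this. The function moving_average is a made-up example and not part of the framework.

```python
from typing import List


def moving_average(values: List[float], window: int) -> List[float]:
    """Compute the moving average of a sequence.

    Args:
        values: The numbers to average.
        window: Number of consecutive values per average. Must be positive.

    Returns:
        One average per full window, so the result has
        len(values) - window + 1 entries (none if values is too short).

    Raises:
        ValueError: If window is not positive.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]
```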

Clean code

Here is our non-exhaustive list of guidelines for writing cleaner code.

  1. Use meaningful variable names
  2. Keep your code DRY (Don’t repeat yourself) by abstracting into functions and classes.
  3. Keep everything at the same level of abstraction
  4. Write functions without side effects
  5. Functions should have a single responsibility
  6. Be consistent, stick to conventions, use a style guide
  7. Use comments only for what cannot be described in code
  8. Write comments with care, correct grammar and correct punctuation
  9. Write tests if you write a module
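
To make the points on side effects and tests concrete, here is a small made-up example (the function names are hypothetical). The first function modifies its argument, the second returns a new list, and the test pins down that difference.

```python
from typing import List


def normalize_in_place(values: List[float]) -> None:
    """Side effect: silently modifies the caller's list."""
    total = sum(values)
    for index, value in enumerate(values):
        values[index] = value / total


def normalized(values: List[float]) -> List[float]:
    """No side effect: returns a new list and leaves the input untouched."""
    total = sum(values)
    return [value / total for value in values]


def test_normalized_leaves_input_unchanged():
    original = [1.0, 3.0]
    result = normalized(original)
    assert result == [0.25, 0.75]
    assert original == [1.0, 3.0]
```

The version without side effects is easier to test and to reuse, because callers do not need to know what happens to their data.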

Experiment workflow

  1. Define a hypothesis
  2. Define the set of parameters that is going to stay fixed
  3. Define the parameter to change (including possible values for the parameter)
  4. Create a meaningful name for the experiment (group of experiments, name of the parameter tested)
  5. Make sure you set a seed (PyCharm: in the run options append “with seed=0”)
  6. Program the experiment (set parameters) using our framework
  7. Commit your changes locally to obtain a commit hash; it is going to be logged by sacredboard
  8. Make sure your experiment is logged to the database
  9. Start the experiment
  10. Interpret and document results in a notebook. Include relevant plots using the artifact viewer. Make sure the notebook is completely executed.
  11. Move your notebook to docs/experiments, so it will be automatically included in the documentation.
  12. Push your local branch to GitHub to make all commits available to everyone
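
Steps 2 to 5 can be sketched in plain Python as follows. The sketch is independent of sacred and of the framework; the helper build_runs, the parameter names and the naming scheme are assumptions chosen for illustration.

```python
from typing import Any, Dict, List

# Step 2: parameters that stay fixed, including the seed from step 5.
FIXED_PARAMETERS = {"epochs": 100, "batch_size": 32, "seed": 0}


def build_runs(group: str, parameter: str, values: List[Any],
               fixed: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Create one configuration per value of the parameter under test."""
    if parameter in fixed:
        raise ValueError(f"{parameter!r} cannot be both fixed and varied.")
    runs = []
    for value in values:
        config = dict(fixed)  # copy so the fixed parameters stay untouched
        config[parameter] = value  # step 3: the parameter that changes
        config["name"] = f"{group}/{parameter}={value}"  # step 4
        runs.append(config)
    return runs


runs = build_runs("activation_functions", "activation",
                  ["tanh", "relu"], FIXED_PARAMETERS)
for run in runs:
    print(run["name"], run)
```

Keeping the fixed and the varied parameters in separate places makes it obvious which single factor an experiment is testing.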

Documentation

To build the documentation run:

$ cd docs
$ make html

A short reStructuredText reference. There is also a longer video tutorial.

If you added new packages and want to add them to the API documentation use:

$ sphinx-apidoc -o docs/api_doc/ deep_bottleneck deep_bottleneck/credentials.py deep_bottleneck/experiment.py deep_bottleneck/demo.py

Make sure to change the header of modules.rst back to “API Documentation”.