User guide¶
Installation¶
Environment¶
To run the experiment you need to install the required dependencies. We highly recommend that you use a virtual environment as provided by conda or pipenv.
Then in your environment run:
$ pip install -r requirements/dev.txt
Sacred setup¶
When running experiments, the hyperparameters, metrics and plots are managed through Sacred and are stored in a mongoDB database. Though you can setup your mongoDB instance however you want, it is most conveniently done through the provided Docker files. This will not only get you started with mongoDB in no time, but will also set up a mongo-express interface to conveniently manage your database and sacredboard to monitor your runs. In order to use them you need to
- Install Docker Engine.
- Install Docker Compose.
- Navigate to the directory with the setup files.
$ cd infractructure/sacred_setup
Edit the
.env
file. This file is hidden by default, but you can still edit it with any text editor, e.g. byvi .env
. Replace all values in angle brackets with meaningful and secure values.Run docker-compose:
docker-compose up -d
This will pull the necessary containers from the internet and build them. This may take several
minutes.
Afterwards mongoDB should be up and running. mongo-express
should now be available on port 8081
,
accessible by the user and password you set in the .env
file (ME_CONFIG_BASICAUTH_USERNAME
and ME_CONFIG_BASICAUTH_PASSWORD
). Sacredboard should be available on port 5000
.
The current setup is optimized for a team that collaboratively stores results on a remote server.
When running the experiments locally for yourself, you should change the port mapping in the
docker-compose.yml
file to only map to localhost, such that you do not expose your database to
the internet. Simply prefix all port mappings with localhost, e.g. replace:
ports:
- 5000:5000
by
ports:
- 127.0.0.1:5000:5000
5. In a final step, you need to tell sacred how to connect to the database. Edit file
deep_bottleneck/credentials.py
, again replacing all values in angle brackets by the
values you actually set in the .env
file. Additionally, you have to provide the IP
address of the server your database is running on, which is either the address given
by your server provider or 127.0.0.1
when running mongo locally.
- You are ready to run some exciting experiments!
Importing and exporting from mongoDB¶
The following section is meant to help you migrate your data from one server to another. If you are just starting you can skip this section.
To export data from your mongo container run
$ docker run --rm --link <container_id>:mongo --network <network_id> -v /root/dump:/backup mongo bash -c 'mongodump --out /backup --uri mongodb://<username>:<password>@mongo:27017/?authMechanism=SCRAM-SHA-1'
make sure you you create the output folder, in this case /root/dump
beforehand. You also need
to look up the id of your current mongo container using docker ls
and find the id
of the network is running is using docker network ls
. Then replace <username>
and <password>
by the values you originally set in your .env
.
To import data again run following the same steps as above.
$ docker run --rm --link <container_id>:mongo --network <network_id> -v /root/dump:/backup mongo bash -c 'mongorestore /backup --uri mongodb://<username>:<password>@mongo:27017/?authMechanism=SCRAM-SHA-1'
How to use the framework¶
Running experiments¶
The idea of the project is based on the concepts presented by Tishby.
To reproduce the basic setup of the experiments one can simply start experiment.py
.
If all the required packages are installed properly and the program is started, different things should happen.
- First the required modules of the framework are imported based on the defined configuration (more about configurations in “Adding new Experiments”).
- A neural network is trained using the defined dataset. The progress of this process is also logged in the console.
- During the training process the required data is saved in regular time-steps to the local filesystem.
- Given the saved data (e.g. the activations) it is possible to compute the mutual information of the different layer and the input/output.
- Using this different plots as e.g. the information plane plot are created and saved simultaneously in the filesystem and in the database.
The results of the experiments can be looked up either in the
deep_bottleneck/plots
folder (only the plots of the last runs are saved) or usingeval_tools
as described below.
Evaluation tools¶
To make the rich results generated by the experiments accessible, we created an evaluation tool. It lets you query experiments based on id, name or other configuration parameters and lets you view the generated plots, metrics and videos conveniently in Jupyter notebooks. To get you started have a look at deep_bottleneck/eval_tools_demo.ipynb.
Adding new experiments (config)¶
Configuration¶
During the exploration of Tishby’s idea already a lot of experiments have been done, but there are still many things
one can do using this framework. To define a new experiment a new configuration needs to be added.
The existing configurations are saved in the deep_bottleneck/configs
folder.
To add a new configuration a new JSON
file is required.
The currently relevant parts of the configuration and their effects are explained in the following table.
epochs: | Number of epochs the model is trained for. Most of the experiments for the harmonics dataset used 8000 epochs. |
---|---|
batch_size: | Batch size used during the training process. Most dominant batch size in our experiments was 256. |
architecture: | Architecture of the trained model. Defined as a list of integers, where every integer defines the number of neurons in one layer. It is important to notify that an additional readout layer is automatically added (with the number of neurons corresponding to the number of classes in the dataset). The basic architecture for the harmonics dataset is [10, 7, 5, 4, 3]. |
optimizer: | The optimizer used for the training of the neural network. Possible values are “sgd”, or “adam”. |
learning_rate: | The learning rate of the optimizer. Default values are 0.0004 for harmonics and 0.001 for mnist. |
calculate_mi_for: | |
The calculate_mi_for parameter defines the dataset that is used for the mutual information computation. It can be done either for the training data (value: “training”), test data (“test”) or the full dataset (“full_dataset”). | |
activation_fn: | The activation-function used to train the model. The following activation function are implemented:
tanh , relu , sigmoid , softsign , softplus , leaky_relu ,
hard_sigmoid , selu , relu6 , elu and linear . |
model: | The parameter which defines the basic model-choice. Currently only different architectures of feed-foreward-networks can be used.
So the possible choices right now are models.feedforward and models.feedforward_batchnorm , the actual architecture is defined by the architecture parameter. |
dataset: | The parameter which defines the dataset used for training.
Currently implemented datasets are harmonics , mnist , fashion_mnist and mushroom . |
estimator: | The estimator used for the computation of the mutual information. Because mutual information cannot
be computed analytically for more complex networks, it is necessary to estimate it.
Possible estimators are mi_estimator.binning , mi_estimator.lower , mi_estimator.upper . |
discretization_range: | |
The different estimators have a different hyperparameter to add artificial noise to the estimation.
This parameter is used as a placeholder for the different hyperparameter.
A typical value is 0.07 for binning and 0.001 for upper and lower . |
|
callbacks: | A list of additional callbacks as for example early stopping.
Needs to defined as a list of paths to the callbacks, as e.g. [callbacks.early_stopping_manual] . |
n_runs: | Number of runs the experiment is repeated. The results will be averaged over all runs to compensate for outliers. |
Executing multiple experiments¶
Using these parameters one should be able to define experiments as desired. To execute the experiment(s)
one could simply start des experiment.py but mainly due to our usage of external hardware resources
(Sun grid engine) we had to develop another way to execute experiments.
We created two python files: run_experiment.py
and run_experiment_local.py
, which can run
either a single experiment or a group of experiments.
For the local execution of experiments with run_experiment_local.py
one needs to switch to the
deep_bottleneck folder by:
$ cd deep_bottleneck
and then execute experiments by either pointing to a specific JSON
file defining the experiment, e.g.:
$ python run_experiments_local.py -d configs/basic.json
or pointing at a directory containing all the experiments one wants to execute, e.g.:
$ python run_experiments_local.py -d configs/mnist
In that case all the JSONs
in the folder and in its sub-folders are recursively executed.
Running experiment on the Sun grid engine¶
In case one uses a sun grid engine to execute the experiments it is possible to start
run_experiments.py
on the engine in the same way with as described above.
The experiments will get submitted to the engine using qsub
.
In that case it is important to make sure that an /output/-folder exists on the directory-level
of the experiment.sge
file.
Additionally it might be important to run experiments that are repeatable and will return the same results in every run. Because the basic step of the framework is to train a neural network, including some kind of randomness the results of two runs might be different even though they are based on the same configuration. To avoid misconceptions it is possible to set a seed for each experiment, simply by using:
$ python experiment.py with seed=0
(the exact seed is arbitrary, it just needs to be consistent). In case that one of the
run_experiment
files is used this step is done for you,
but even in the other cases some IDEs allow to set script-parameters for normal executions of a
specific file, such that it is not required to start the experiment.py
out of the command-line.