
NN Template

PyTorch Lightning | Conf: hydra | Logging: wandb | UI: streamlit | Code style: black

Generic template to bootstrap your PyTorch project. Use it to avoid writing boilerplate code for:

  • PyTorch Lightning, lightweight PyTorch wrapper for high-performance AI research.
  • Hydra, a framework for elegantly configuring complex applications.
  • DVC, track large files, directories, or ML models. Think "Git for data".
  • Weights and Biases, organize and analyze machine learning experiments. (educational account available)
  • Streamlit, turns data scripts into shareable web apps in minutes.

nn-template is opinionated so you don't have to be. If you use this template, please add its badge to your README.

Usage Examples

Check out the mwe branch to view a minimal working example on MNIST.

Structure

.
├── .cache              
├── conf                # hydra compositional config 
│   ├── data
│   ├── default.yaml    # current experiment configuration        
│   ├── hydra
│   ├── logging
│   ├── model
│   ├── optim
│   └── train
├── data                # datasets
├── .env                # system-specific env variables, e.g. PROJECT_ROOT
├── requirements.txt    # basic requirements
├── src
│   ├── common          # common modules and utilities
│   ├── pl_data         # PyTorch Lightning datamodules and datasets
│   ├── pl_modules      # PyTorch Lightning modules
│   ├── run.py          # entry point to run current conf
│   └── ui              # interactive streamlit apps
└── wandb               # local experiments (auto-generated)

Streamlit

Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning and data science.

In just a few minutes, you can build and deploy powerful data apps to:

  • Explore your data
  • Interact with your model
  • Analyze your model behavior and input sensitivity
  • Showcase your prototype with awesome web apps

Moreover, Streamlit enables interactive development with automatic rerun on file changes.

Launch a minimal app with PYTHONPATH=. streamlit run src/ui/run.py. The template includes a built-in function to restore a model checkpoint stored on W&B, downloading it automatically if the checkpoint is not present on the local machine.
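The "download only if missing" behavior described above can be sketched as follows. This is a minimal illustration, not the template's actual helper: the function name restore_checkpoint, the cache_dir argument, and the download callback are all hypothetical, and the callback stands in for whatever call fetches the file from W&B.

```python
from pathlib import Path
from typing import Callable


def restore_checkpoint(
    name: str,
    cache_dir: Path,
    download: Callable[[str, Path], None],
) -> Path:
    """Return a local path to checkpoint `name`, downloading it only if missing.

    `download(name, destination)` is a hypothetical callback that must write
    the checkpoint file to `destination` (e.g. a thin wrapper around W&B).
    """
    local_path = cache_dir / name
    if not local_path.exists():
        # Create the cache folder lazily, then fetch the file exactly once.
        local_path.parent.mkdir(parents=True, exist_ok=True)
        download(name, local_path)
    return local_path
```

Keeping the download step behind a callback means the caching logic can be exercised without network access, and the remote store can be swapped without touching the Streamlit app.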

Data Version Control

DVC runs alongside git and uses the current commit hash to version control the data.

Initialize the dvc repository:

$ dvc init

To start tracking a file or directory, use dvc add:

$ dvc add data/ImageNet

DVC stores information about the added file (or a directory) in a special .dvc file named data/ImageNet.dvc, a small text file with a human-readable format. This file can be easily versioned like source code with Git, as a placeholder for the original data (which gets listed in .gitignore):

$ git add data/ImageNet.dvc data/.gitignore
$ git commit -m "Add raw data"

Making changes

When you make a change to a file or directory, run dvc add again to track the latest version:

$ dvc add data/ImageNet

Switching between versions

The regular workflow is to use git checkout first to switch branches or check out a commit or a revision of a .dvc file, and then run dvc checkout to sync the data:

$ git checkout <...>
$ dvc checkout

Read more in the docs!

Weights and Biases

Weights & Biases helps you keep track of your machine learning projects. Use tools to log hyperparameters and output metrics from your runs, then visualize and compare results and quickly share findings with your colleagues.

This is an example of a simple dashboard.

Quickstart

Log in to your wandb account by running wandb login once. Configure the logging in conf/logging/*.


Read more in the docs. The log method is particularly useful; it is accessible from inside a PyTorch Lightning module via self.logger.experiment.log.

W&B is our logger of choice, but that is a purely subjective decision. Since we are using Lightning, you can replace wandb with the logger you prefer (you can even build your own). More about Lightning loggers here.

Hydra

Hydra is an open-source Python framework that simplifies the development of research and other complex applications. The key feature is the ability to dynamically create a hierarchical configuration by composition and override it through config files and the command line. The name Hydra comes from its ability to run multiple similar jobs - much like a Hydra with multiple heads.
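To make the idea of overriding a hierarchical configuration concrete, here is a toy model in plain Python. It is not Hydra's implementation and does not use Hydra's API: the apply_override function and the example keys are hypothetical, and it only mimics how a dotted command-line override such as optim.optimizer.lr=0.02 addresses a nested value.

```python
import ast
import copy


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one `a.b.c=value` override applied."""
    dotted_key, raw_value = override.split("=", 1)
    try:
        # Interpret numbers, booleans, lists, etc.; fall back to a plain string.
        value = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        value = raw_value

    result = copy.deepcopy(config)  # leave the base configuration untouched
    node = result
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        node = node.setdefault(key, {})  # walk (or create) nested groups
    node[leaf] = value
    return result


base = {"optim": {"optimizer": {"lr": 0.001, "weight_decay": 0}}}
updated = apply_override(base, "optim.optimizer.lr=0.02")
print(updated["optim"]["optimizer"]["lr"])  # 0.02
print(base["optim"]["optimizer"]["lr"])     # 0.001
```

In the real framework the base dictionary corresponds to the configuration composed from the files under conf/, and overrides are validated against it.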

The basic functionality is intuitive: it is enough to change the configuration files in conf/* according to your preferences. Everything will be logged in wandb automatically.

Consider creating new root configurations conf/myawesomeexp.yaml instead of always using the default conf/default.yaml.

Sweeps

You can easily perform hyperparameter sweeps, which override the configuration defined in conf/*.

The simplest one is grid search, which executes the code with every possible combination of the specified hyperparameters:

PYTHONPATH=. python src/run.py -m optim.optimizer.lr=0.02,0.002,0.0002 optim.lr_scheduler.T_mult=1,2 optim.optimizer.weight_decay=0,1e-5
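To see how many runs the command above launches, the grid can be expanded by hand. The sketch below uses only the standard library; expand_grid is a hypothetical helper for illustration, and the order in which it enumerates combinations is not claimed to match Hydra's.

```python
from itertools import product


def expand_grid(sweep: dict) -> list:
    """Return one {key: value} dict per combination of the swept values."""
    keys = list(sweep)
    return [dict(zip(keys, values)) for values in product(*(sweep[k] for k in keys))]


# The same values as in the multirun command.
sweep = {
    "optim.optimizer.lr": [0.02, 0.002, 0.0002],
    "optim.lr_scheduler.T_mult": [1, 2],
    "optim.optimizer.weight_decay": [0, 1e-5],
}

jobs = expand_grid(sweep)
print(len(jobs))  # 3 * 2 * 2 = 12 runs
```

The number of runs grows multiplicatively with every swept parameter, so it is worth checking the count before launching a large sweep.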

You can explore aggregate statistics or compare and analyze each run in the W&B dashboard.


We recommend going through at least the Basic Tutorial and the docs about instantiating objects with Hydra.

PyTorch Lightning

Lightning makes coding complex networks simple. It is not a high-level framework like Keras, but it enforces neat code organization and encapsulation.

You should be somewhat familiar with PyTorch and PyTorch Lightning before using this template.

Environment Variables

System-specific variables (e.g. absolute paths to datasets) should not be under version control, otherwise there will be conflicts between different users.

The best way to handle system-specific variables is through environment variables.

You can define new environment variables in a .env file in the project root. A copy of this file (e.g. .env.template) can be under version control to ease new project configurations.

To define a new variable write inside .env:

export MY_VAR=/home/user/my_system_path

You can dynamically resolve the variable name from Python code with:

get_env('MY_VAR')

and in the Hydra .yaml configuration files with:

${oc.env:MY_VAR}
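
As a rough illustration of the mechanism, the sketch below loads export KEY=value lines from a .env file and looks a variable up with an optional default. Both functions are hypothetical stand-ins: the template's own get_env may have a different signature and behavior, and a library such as python-dotenv handles .env parsing more robustly than this minimal parser.

```python
import os
from pathlib import Path
from typing import Optional


def load_env_file(path: Path) -> None:
    """Load `export KEY=value` or `KEY=value` lines into os.environ.

    Variables that are already set in the environment are not overwritten.
    """
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep:
            os.environ.setdefault(key.strip(), value.strip().strip("'\""))


def get_env(name: str, default: Optional[str] = None) -> str:
    """Return the value of `name`, or `default`; raise if neither is available."""
    value = os.environ.get(name, default)
    if value is None:
        raise KeyError(f"Environment variable {name} is not set and no default was given")
    return value
```

Failing loudly on a missing variable is a deliberate choice in this sketch: a misconfigured path surfaces at startup instead of deep inside a training run.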

MIT License

Copyright (c) 2021 Valentino Maiorca, Luca Moschella

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.