Convolutional Architectures¶
This page lists the contributed convolutional architectures.
GPT-2¶
class pl_bolts.models.vision.GPT2(embed_dim, heads, layers, num_positions, vocab_size, num_classes)

Bases: pytorch_lightning.LightningModule
GPT-2 from the paper Language Models are Unsupervised Multitask Learners.
Paper by: Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever
Implementation contributed by:
Example:

import torch
from pl_bolts.models.vision import GPT2

seq_len = 17
batch_size = 32
vocab_size = 16

x = torch.randint(0, vocab_size, (seq_len, batch_size))

model = GPT2(embed_dim=32, heads=2, layers=2, num_positions=seq_len, vocab_size=vocab_size, num_classes=4)
results = model(x)
forward(x, classify=False)

Expects input of shape [sequence_len, batch]. If classify is True, returns classification logits.
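For classification, the same forward pass can be called with classify=True. A minimal sketch, reusing model and x from the example above and assuming the classification head emits one logit vector of size num_classes per sequence:

# Sketch only: reuses `model` and `x` from the example above.
# With classify=True the forward pass returns classification logits
# (assumed shape [batch_size, num_classes], here [32, 4]).
logits = model(x, classify=True)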
Image GPT¶
class pl_bolts.models.vision.ImageGPT(embed_dim=16, heads=2, layers=2, pixels=28, vocab_size=16, num_classes=10, classify=False, batch_size=64, learning_rate=0.01, steps=25000, data_dir='.', num_workers=8, **kwargs)

Bases: pytorch_lightning.LightningModule
Paper: Generative Pretraining from Pixels [original paper code].
Paper by: Mark Chen, Alec Radford, Rewon Child, Jeff Wu, Heewoo Jun, Prafulla Dhariwal, David Luan, Ilya Sutskever
Implementation contributed by:
Original repo with results and more implementation details:
Example Results (Photo credits: Teddy Koker):
Default arguments:

Argument          Default   iGPT-S (Chen et al.)
--embed_dim       16        512
--heads           2         8
--layers          8         24
--pixels          28        32
--vocab_size      16        512
--num_classes     10        10
--batch_size      64        128
--learning_rate   0.01      0.01
--steps           25000     1000000
Example:

import pytorch_lightning as pl

from pl_bolts.datamodules import MNISTDataModule
from pl_bolts.models.vision import ImageGPT

dm = MNISTDataModule('.')
model = ImageGPT()

pl.Trainer(gpus=4).fit(model, datamodule=dm)
As script:

cd pl_bolts/models/vision/image_gpt
python igpt_module.py --learning_rate 1e-2 --batch_size 32 --gpus 4
Parameters

    embed_dim (int) – embedding dimension of the transformer
    heads (int) – number of attention heads
    layers (int) – number of transformer layers
    pixels (int) – side length of the square input images
    vocab_size (int) – number of values in the pixel vocabulary
    num_classes (int) – number of classes for the classification head
    classify (bool) – whether to train a classifier instead of the generative objective
    batch_size (int) – batch size
    learning_rate (float) – learning rate
    steps (int) – number of training steps
    data_dir (str) – directory for the dataset
    num_workers (int) – number of dataloader workers
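To illustrate how the table above maps onto the constructor, here is a sketch of an iGPT-S-sized configuration taken directly from the iGPT-S column (illustrative only; this is far heavier to train than the defaults):

from pl_bolts.models.vision import ImageGPT

# Sketch only: iGPT-S-sized settings copied from the table above.
igpt_s = ImageGPT(
    embed_dim=512,
    heads=8,
    layers=24,
    pixels=32,
    vocab_size=512,
    num_classes=10,
    batch_size=128,
    learning_rate=0.01,
    steps=1000000,
)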
Pixel CNN¶
class pl_bolts.models.vision.PixelCNN(input_channels, hidden_channels=256, num_blocks=5)

Bases: torch.nn.Module
Implementation of Pixel CNN.
Paper authors: Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu
Implemented by:
William Falcon
Example:

>>> from pl_bolts.models.vision import PixelCNN
>>> import torch
>>> model = PixelCNN(input_channels=3)
>>> x = torch.rand(5, 3, 64, 64)
>>> out = model(x)
>>> out.shape
torch.Size([5, 3, 64, 64])
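Width and depth are controlled by hidden_channels and num_blocks; as the example shows, the output keeps the input's channel count and spatial size. A small illustrative sketch with a lighter configuration:

import torch
from pl_bolts.models.vision import PixelCNN

# Sketch only: a smaller PixelCNN; the output shape matches the input shape.
model = PixelCNN(input_channels=1, hidden_channels=64, num_blocks=3)
x = torch.rand(2, 1, 28, 28)
out = model(x)
assert out.shape == x.shape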
UNet¶
class pl_bolts.models.vision.UNet(num_classes, input_channels=3, num_layers=5, features_start=64, bilinear=False)

Bases: torch.nn.Module
Paper: U-Net: Convolutional Networks for Biomedical Image Segmentation
Paper authors: Olaf Ronneberger, Philipp Fischer, Thomas Brox
Implemented by:
Parameters

    num_classes (int) – Number of output classes
    input_channels (int) – Number of channels in input images (default 3)
    num_layers (int) – Number of layers in each side of the U-Net (default 5)
    features_start (int) – Number of features in the first layer (default 64)
    bilinear (bool) – Whether to use bilinear interpolation for upsampling; if False (default), transposed convolutions are used
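A minimal usage sketch (illustrative; the output has one channel per class at the input resolution, and with num_layers=5 the input height and width should be divisible by 16):

import torch
from pl_bolts.models.vision import UNet

# Sketch only: 19-class segmentation over a 3-channel input.
model = UNet(num_classes=19)
x = torch.rand(1, 3, 128, 128)
out = model(x)  # assumed shape [1, 19, 128, 128]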
Semantic Segmentation¶
Model template to use for semantic segmentation tasks. The model uses a UNet architecture by default. Override any part of this model to build your own variation.
import pytorch_lightning as pl

from pl_bolts.datamodules import KittiDataModule
from pl_bolts.models.vision import SemSegment

dm = KittiDataModule('path/to/kitti/dataset/', batch_size=4)
model = SemSegment()

trainer = pl.Trainer()
trainer.fit(model, datamodule=dm)
class pl_bolts.models.vision.SemSegment(lr=0.01, num_classes=19, num_layers=5, features_start=64, bilinear=False)

Bases: pytorch_lightning.LightningModule
Basic model for semantic segmentation. Uses the UNet architecture by default.

The default parameters in this model are for the KITTI dataset. Note that if you'd like to use this model as-is, you will first need to download the KITTI dataset yourself from the KITTI website.
Implemented by:
Parameters

    lr (float) – learning rate (default 0.01)
    num_classes (int) – number of output classes (default 19)
    num_layers (int) – number of layers in each side of the U-Net (default 5)
    features_start (int) – number of features in the first layer (default 64)
    bilinear (bool) – whether to use bilinear interpolation for upsampling; if False (default), transposed convolutions are used
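After training, per-pixel class predictions can be obtained by taking the argmax over the class dimension of the logits. A minimal sketch (illustrative; assumes model is a trained SemSegment instance and the image size is divisible by 16):

import torch

# Sketch only: `model` is assumed to be a trained SemSegment instance.
model.eval()
x = torch.rand(1, 3, 192, 640)      # illustrative KITTI-like RGB input
with torch.no_grad():
    logits = model(x)               # assumed shape [1, num_classes, H, W]
preds = logits.argmax(dim=1)        # [1, H, W], one class index per pixel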