Commit 5b34400

Merge branch 'main' into port-eurosat
2 parents 14f8610 + 01f07ee commit 5b34400

21 files changed (+833, -149 lines)

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -83,7 +83,7 @@ Instead of relying directly on `black` however, we rely on
 [ufmt](https://github.com/omnilib/ufmt), for compatibility reasons with Facebook
 internal infrastructure.

-To format your code, install `ufmt` with `pip install ufmt` and use e.g.:
+To format your code, install `ufmt` with `pip install ufmt==1.3.2 black==21.9b0 usort==0.6.4` and use e.g.:

 ```bash
 ufmt format torchvision

docs/source/models.rst

Lines changed: 13 additions & 1 deletion
@@ -38,7 +38,7 @@ architectures for image classification:
 - `ResNeXt`_
 - `Wide ResNet`_
 - `MNASNet`_
-- `EfficientNet`_
+- `EfficientNet`_ v1 & v2
 - `RegNet`_
 - `VisionTransformer`_
 - `ConvNeXt`_
@@ -70,6 +70,9 @@ You can construct a model with random weights by calling its constructor:
 efficientnet_b5 = models.efficientnet_b5()
 efficientnet_b6 = models.efficientnet_b6()
 efficientnet_b7 = models.efficientnet_b7()
+efficientnet_v2_s = models.efficientnet_v2_s()
+efficientnet_v2_m = models.efficientnet_v2_m()
+efficientnet_v2_l = models.efficientnet_v2_l()
 regnet_y_400mf = models.regnet_y_400mf()
 regnet_y_800mf = models.regnet_y_800mf()
 regnet_y_1_6gf = models.regnet_y_1_6gf()
@@ -122,6 +125,9 @@ These can be constructed by passing ``pretrained=True``:
 efficientnet_b5 = models.efficientnet_b5(pretrained=True)
 efficientnet_b6 = models.efficientnet_b6(pretrained=True)
 efficientnet_b7 = models.efficientnet_b7(pretrained=True)
+efficientnet_v2_s = models.efficientnet_v2_s(pretrained=True)
+efficientnet_v2_m = models.efficientnet_v2_m(pretrained=True)
+efficientnet_v2_l = models.efficientnet_v2_l(pretrained=True)
 regnet_y_400mf = models.regnet_y_400mf(pretrained=True)
 regnet_y_800mf = models.regnet_y_800mf(pretrained=True)
 regnet_y_1_6gf = models.regnet_y_1_6gf(pretrained=True)
@@ -238,6 +244,9 @@ EfficientNet-B4 83.384 96.594
 EfficientNet-B5 83.444 96.628
 EfficientNet-B6 84.008 96.916
 EfficientNet-B7 84.122 96.908
+EfficientNetV2-s 84.228 96.878
+EfficientNetV2-m 85.112 97.156
+EfficientNetV2-l 85.810 97.792
 regnet_x_400mf 72.834 90.950
 regnet_x_800mf 75.212 92.348
 regnet_x_1_6gf 77.040 93.440
@@ -439,6 +448,9 @@ EfficientNet
 efficientnet_b5
 efficientnet_b6
 efficientnet_b7
+efficientnet_v2_s
+efficientnet_v2_m
+efficientnet_v2_l

 RegNet
 ------------

hubconf.py

Lines changed: 3 additions & 0 deletions
@@ -13,6 +13,9 @@
     efficientnet_b5,
     efficientnet_b6,
     efficientnet_b7,
+    efficientnet_v2_s,
+    efficientnet_v2_m,
+    efficientnet_v2_l,
 )
 from torchvision.models.googlenet import googlenet
 from torchvision.models.inception import inception_v3

references/classification/README.md

Lines changed: 21 additions & 1 deletion
@@ -88,7 +88,7 @@ Then we averaged the parameters of the last 3 checkpoints that improved the Acc@
 and [#3354](https://github.com/pytorch/vision/pull/3354) for details.


-### EfficientNet
+### EfficientNet-V1

 The weights of the B0-B4 variants are ported from Ross Wightman's [timm repo](https://github.com/rwightman/pytorch-image-models/blob/01cb46a9a50e3ba4be167965b5764e9702f09b30/timm/models/efficientnet.py#L95-L108).

@@ -114,6 +114,26 @@ torchrun --nproc_per_node=8 train.py --model efficientnet_b7 --interpolation bic
 --val-resize-size 600 --val-crop-size 600 --train-crop-size 600 --test-only --pretrained
 ```

+
+### EfficientNet-V2
+```
+torchrun --nproc_per_node=8 train.py \
+--model $MODEL --batch-size 128 --lr 0.5 --lr-scheduler cosineannealinglr \
+--lr-warmup-epochs 5 --lr-warmup-method linear --auto-augment ta_wide --epochs 600 --random-erase 0.1 \
+--label-smoothing 0.1 --mixup-alpha 0.2 --cutmix-alpha 1.0 --weight-decay 0.00002 --norm-weight-decay 0.0 \
+--train-crop-size $TRAIN_SIZE --model-ema --val-crop-size $EVAL_SIZE --val-resize-size $EVAL_SIZE \
+--ra-sampler --ra-reps 4
+```
+Here `$MODEL` is one of `efficientnet_v2_s` and `efficientnet_v2_m`.
+Note that the Small variant had a `$TRAIN_SIZE` of `300` and a `$EVAL_SIZE` of `384`, while the Medium variant used `384` and `480`, respectively.
+
+Note that the above command corresponds to training on a single node with 8 GPUs.
+For generating the pre-trained weights, we trained with 4 nodes, each with 8 GPUs (for a total of 32 GPUs),
+and `--batch-size 32`.
+
+The weights of the Large variant are ported from the original paper rather than trained from scratch. See the `EfficientNet_V2_L_Weights` entry for their exact preprocessing transforms.
+
+
 ### RegNet

 #### Small models
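
The EfficientNet-V2 note in the README hunk above gives two setups: one node with 8 GPUs at `--batch-size 128`, and 4 nodes with 8 GPUs each at a batch size of 32. A minimal sketch of the arithmetic, assuming the batch-size flag counts images per GPU (the helper name `global_batch_size` is hypothetical and not part of the commit):

```python
def global_batch_size(nodes: int, gpus_per_node: int, per_gpu_batch: int) -> int:
    """Images consumed per optimizer step across all data-parallel workers.

    Assumption: the batch-size flag is applied per GPU, so the global batch
    is the product of all three factors.
    """
    return nodes * gpus_per_node * per_gpu_batch


# Setup from the command shown in the README: 1 node x 8 GPUs x 128 images.
single_node = global_batch_size(nodes=1, gpus_per_node=8, per_gpu_batch=128)

# Setup used for the released weights: 4 nodes x 8 GPUs x 32 images.
multi_node = global_batch_size(nodes=4, gpus_per_node=8, per_gpu_batch=32)

print(single_node, multi_node)  # prints "1024 1024"
```

Under that assumption both setups process the same number of images per step, which would explain why the same `--lr 0.5` applies to either one.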

test/builtin_dataset_mocks.py

Lines changed: 28 additions & 0 deletions
@@ -878,6 +878,34 @@ def celeba(info, root, config):
     return CelebAMockData.generate(root)[config.split]


+@register_mock
+def country211(info, root, config):
+    split_name_mapper = {
+        "train": "train",
+        "val": "valid",
+        "test": "test",
+    }
+    split_folder = pathlib.Path(root, "country211", split_name_mapper[config["split"]])
+    split_folder.mkdir(parents=True, exist_ok=True)
+
+    num_examples = {
+        "train": 3,
+        "val": 4,
+        "test": 5,
+    }[config["split"]]
+
+    classes = ("AD", "BS", "GR")
+    for cls in classes:
+        create_image_folder(
+            split_folder,
+            name=cls,
+            file_name_fn=lambda idx: f"{idx}.jpg",
+            num_examples=num_examples,
+        )
+    make_tar(root, f"{split_folder.parent.name}.tgz", split_folder.parent, compression="gz")
+    return num_examples * len(classes)
+
+
 @register_mock
 def dtd(info, root, config):
     data_folder = root / "dtd"
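
The `country211` mock above builds one folder per class under a split directory, archives the result, and returns the expected sample count. The sketch below reproduces that layout with the standard library only. It is an illustration, not the test suite's code: `make_mock_country211` is a hypothetical name, and empty placeholder files stand in for the images that the real `create_image_folder` helper generates.

```python
import pathlib
import tarfile
import tempfile

# Values copied from the mock in the diff.
SPLIT_DIRS = {"train": "train", "val": "valid", "test": "test"}
EXAMPLES_PER_CLASS = {"train": 3, "val": 4, "test": 5}
CLASSES = ("AD", "BS", "GR")


def make_mock_country211(root: pathlib.Path, split: str) -> int:
    """Create country211/<split dir>/<class>/<idx>.jpg and archive it as country211.tgz."""
    split_folder = root / "country211" / SPLIT_DIRS[split]
    for cls in CLASSES:
        class_folder = split_folder / cls
        class_folder.mkdir(parents=True, exist_ok=True)
        for idx in range(EXAMPLES_PER_CLASS[split]):
            # Empty placeholder files stand in for real JPEG images.
            (class_folder / f"{idx}.jpg").touch()

    # Archive the whole country211 folder, gzip-compressed.
    with tarfile.open(str(root / "country211.tgz"), "w:gz") as archive:
        archive.add(str(split_folder.parent), arcname="country211")

    # The mock reports how many samples a dataset built from it should yield.
    return EXAMPLES_PER_CLASS[split] * len(CLASSES)


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    count = make_mock_country211(root, "val")
    with tarfile.open(str(root / "country211.tgz")) as archive:
        archived_images = [name for name in archive.getnames() if name.endswith(".jpg")]

print(count, len(archived_images))  # prints "12 12"
```

For the `val` split this gives 4 examples for each of the 3 classes, so the returned count and the number of archived `.jpg` entries are both 12.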
Binary file not shown.
Binary file not shown.
Binary file not shown.

test/test_prototype_transforms.py

Lines changed: 1 addition & 1 deletion
@@ -126,7 +126,7 @@ def test_auto_augment(self, transform, input):
 (
     transforms.Normalize(mean=[0.0, 0.0, 0.0], std=[1.0, 1.0, 1.0]),
     itertools.chain.from_iterable(
-        fn(color_spaces=["rgb"], dtypes=[torch.float32])
+        fn(color_spaces=[features.ColorSpace.RGB], dtypes=[torch.float32])
         for fn in [
             make_images,
             make_vanilla_tensor_images,

test/test_prototype_transforms_functional.py

Lines changed: 0 additions & 2 deletions
@@ -14,8 +14,6 @@
 def make_image(size=None, *, color_space, extra_dims=(), dtype=torch.float32):
     size = size or torch.randint(16, 33, (2,)).tolist()

-    if isinstance(color_space, str):
-        color_space = features.ColorSpace[color_space]
     num_channels = {
         features.ColorSpace.GRAYSCALE: 1,
         features.ColorSpace.RGB: 3,

torchvision/_utils.py

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+import enum
+
+
+class StrEnumMeta(enum.EnumMeta):
+    auto = enum.auto
+
+    def from_str(self, member: str):
+        try:
+            return self[member]
+        except KeyError:
+            # TODO: use `add_suggestion` from torchvision.prototype.utils._internal to improve the error message as
+            #  soon as it is migrated.
+            raise ValueError(f"Unknown value '{member}' for {self.__name__}.") from None
+
+
+class StrEnum(enum.Enum, metaclass=StrEnumMeta):
+    pass
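
A short sketch of how the new `StrEnum.from_str` lookup behaves. The metaclass is copied from the diff above so that the block runs on its own. The `ColorSpace` enum here is a hypothetical stand-in defined only for this example, not the `features.ColorSpace` class from the prototype package.

```python
import enum


class StrEnumMeta(enum.EnumMeta):
    auto = enum.auto

    def from_str(self, member: str):
        # Look the member up by name; unknown names become a ValueError.
        try:
            return self[member]
        except KeyError:
            raise ValueError(f"Unknown value '{member}' for {self.__name__}.") from None


class StrEnum(enum.Enum, metaclass=StrEnumMeta):
    pass


class ColorSpace(StrEnum):  # hypothetical enum, for illustration only
    GRAYSCALE = enum.auto()
    RGB = enum.auto()


# A name that matches a member exactly resolves to that member.
rgb = ColorSpace.from_str("RGB")

# The lookup is case-sensitive, so a lowercase name is rejected.
try:
    ColorSpace.from_str("rgb")
except ValueError as error:
    message = str(error)

print(rgb)
print(message)  # prints "Unknown value 'rgb' for ColorSpace."
```

Because `from_str` is defined on the metaclass, it is called on the enum class itself, and it replaces the bare `KeyError` of `ColorSpace["rgb"]` with a `ValueError` that names the enum.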

torchvision/csrc/io/decoder/gpu/gpu_decoder.cpp

Lines changed: 1 addition & 2 deletions
@@ -29,8 +29,7 @@ torch::Tensor GPUDecoder::decode() {
   unsigned long videoBytes = 0;
   uint8_t* video = nullptr;
   at::cuda::CUDAGuard device_guard(device);
-  auto options = torch::TensorOptions().dtype(torch::kU8).device(torch::kCUDA);
-  torch::Tensor frame = torch::zeros({0}, options);
+  torch::Tensor frame;
   do {
     demuxer.demux(&video, &videoBytes);
     decoder.decode(video, videoBytes);
