Add SwiftFormer, SHViT, StarNet, FasterNet and GhostNetV3 #2499

brianhou0208 · 2025-05-21T18:34:05Z

New Model

Model Request

resolve #2450

Result

Model	Acc@1	MACs	Params	GPU Throughput	CPU Throughput	iOS NPU Latency	Android NPU Latency
swiftformer_xs	75.586	603.77M	3.48M	3356.03	146.25	0.63	3.288
swiftformer_s	78.466	986.19M	6.09M	2519.29	123.62	0.76	3.386
swiftformer_l1	80.902	1.60G	12.06M	1832.04	100.77	1.07	3.810
swiftformer_l3	83.000	4.01G	28.49M	969.61	43.39	1.88	4.700
fasternet_t0	71.904	335.69M	3.91M	8924.14	461.35	0.50	0.256
fasternet_t1	76.210	850.94M	7.60M	4939.89	262.55	0.67	0.384
fasternet_t2	78.896	1.90G	14.98M	2548.34	133.69	1.04	0.641
fasternet_s	81.290	4.55G	31.18M	1219.00	62.62	1.88	1.257
fasternet_m	82.966	8.72G	53.52M	605.40	28.72	-	2.456
fasternet_l	83.506	15.50G	93.47M	381.80	17.43	-	4.505
shvit_s1	72.778	241.47M	6.31M	13566.14	494.77	0.60	0.461
shvit_s2	75.192	365.58M	11.45M	10214.81	413.26	0.81	0.523
shvit_s3	77.340	600.83M	14.21M	6676.91	299.91	0.96	0.625
shvit_s4	79.378	986.63M	16.55M	4149.87	177.38	1.09	0.798
starnet_s1	73.564	422.89M	2.87M	3859.64	233.59	0.48	0.471
starnet_s2	74.660	545.05M	3.68M	3888.33	203.43	0.51	0.438
starnet_s3	77.380	755.38M	5.75M	2760.39	150.84	0.66	0.548
starnet_s4	78.838	1.05G	7.48M	1938.80	112.58	0.81	0.720
ghostnetv3_100	76.928	168.00M	6.15M	3467.852	168.138	-	0.668

Only SHViT-S4 uses a (256, 256) input image size.
The SwiftFormer weights are hosted on Google Drive.

Param / MACs / Throughput

NPU Latency

iOS NPU: Apple iPhone 14 Pro Max / iOS 18.5 / A16
Android NPU: Samsung S24 / Android 14 / Snapdragon® 8 Gen 3 | SM8650

iOS latency is measured on an iPhone 14 Pro Max (iOS 18.5) with the benchmark tool from Xcode 16.3.
Android latency is measured on a Samsung Galaxy S24 (Android 14) with the benchmark tool from Qualcomm® AI Hub Models.
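One way to read the latency columns is to shortlist models under a latency budget. The sketch below copies a few rows (Acc@1 and Android NPU latency) from the results table and picks the most accurate model that fits a budget. The helper `best_under_budget` and the budget values are hypothetical, and the latencies are assumed to be in milliseconds, which matches what the Android measurement script in this PR prints.

```python
# (model, Acc@1, Android NPU latency) copied from the results table.
# Assumption: latency values are in milliseconds.
ROWS = [
    ("swiftformer_s", 78.466, 3.386),
    ("fasternet_t2", 78.896, 0.641),
    ("shvit_s3", 77.340, 0.625),
    ("shvit_s4", 79.378, 0.798),
    ("starnet_s3", 77.380, 0.548),
    ("starnet_s4", 78.838, 0.720),
    ("ghostnetv3_100", 76.928, 0.668),
]


def best_under_budget(rows, budget_ms):
    """Return the highest-accuracy row whose latency fits the budget, or None."""
    candidates = [row for row in rows if row[2] <= budget_ms]
    return max(candidates, key=lambda row: row[1], default=None)


# Hypothetical budgets, chosen only to illustrate the selection.
for budget in (0.5, 0.7, 0.8):
    print(budget, best_under_budget(ROWS, budget))
```

With these rows, no model fits a 0.5 budget, `fasternet_t2` is selected at 0.7, and `shvit_s4` at 0.8.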

Measure Android Latency

import torch
import timm
from timm.utils.model import reparameterize_model
import qai_hub as hub


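def latency_android(name, img_size=None):
    torch_model = timm.create_model(name)
    torch_model.eval()
    torch_model = reparameterize_model(torch_model)
    # Fall back to the model's configured input size (e.g. 256 for shvit_s4).
    if img_size is None:
        img_size = torch_model.default_cfg['input_size'][-1]
    input_shape = (1, 3, img_size, img_size)
    example_input = torch.rand(input_shape)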
    traced_model = torch.jit.trace(torch_model, example_input)
    compile_job = hub.submit_compile_job(
        model=traced_model,
        device=hub.Device("Samsung Galaxy S24 (Family)"),
        input_specs=dict(image=input_shape),
    )
    target_model = compile_job.get_target_model()
    profile_job = hub.submit_profile_job(
        model=target_model,
        device=hub.Device("Samsung Galaxy S24 (Family)"),
    )
    profile_result = profile_job.download_profile()
    latency_ms = profile_result["execution_summary"]["estimated_inference_time"] / 1000
    print(f"{name} 📊 Latency: {latency_ms:.3f} ms")


if __name__ == '__main__':
    name_list = timm.list_models('shvit*')
    print(name_list)
    for name in name_list:
        latency_android(name)
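The script divides `estimated_inference_time` by 1000 before printing milliseconds, which suggests the profile reports that field in microseconds. A minimal sketch of that conversion in isolation is shown below; the helper name `latency_ms_from_profile` and the sample dictionary are made up for illustration and are not output from a real profile job.

```python
def latency_ms_from_profile(profile):
    """Convert the profile's estimated inference time to milliseconds.

    Assumption: the value is stored in microseconds, as implied by the
    division by 1000 in the measurement script.
    """
    return profile["execution_summary"]["estimated_inference_time"] / 1000


# Hypothetical profile payload with only the field the script reads.
sample_profile = {"execution_summary": {"estimated_inference_time": 461}}
print(f"Latency: {latency_ms_from_profile(sample_profile):.3f} ms")
```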

HuggingFaceDocBuilderDev · 2025-05-21T19:18:55Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

rwightman · 2025-05-25T22:16:50Z

@brianhou0208 there aren't any weights set up to test for swiftformer in the pretrained cfgs; I tested all the others and they appear to be working as expected.

brianhou0208 · 2025-05-26T16:10:21Z

Hi @rwightman ,

Since the SwiftFormer weights are stored on Google Drive and cannot be linked directly in the pretrained cfgs, I downloaded them locally for testing, and all the tests pass.

Below are the Acc Top1 & Top5 results for reference.

Acc Top1 & Top5

Model	Acc@1	Acc@5	FLOPs	MACs	Params
swiftformer_xs	75.586	92.326	1.21G	603.77M	3.48M
swiftformer_s	78.466	93.972	1.97G	986.19M	6.09M
swiftformer_l1	80.902	95.378	3.20G	1.60G	12.06M
swiftformer_l3	83.000	96.238	8.02G	4.01G	28.49M
fasternet_t0	71.904	90.234	671.38M	335.69M	3.91M
fasternet_t1	76.210	92.822	1.70G	850.94M	7.60M
fasternet_t2	78.896	94.386	3.81G	1.90G	14.98M
fasternet_s	81.290	95.590	9.10G	4.55G	31.18M
fasternet_m	82.966	96.410	17.45G	8.72G	53.52M
fasternet_l	83.506	96.594	30.99G	15.50G	93.47M
starnet_s1	73.564	91.508	845.78M	422.89M	2.87M
starnet_s2	74.660	92.146	1.09G	545.05M	3.68M
starnet_s3	77.380	93.624	1.51G	755.38M	5.75M
starnet_s4	78.838	94.306	2.10G	1.05G	7.48M
shvit_s1	72.778	91.020	482.93M	241.47M	6.31M
shvit_s2	75.192	92.304	731.15M	365.58M	11.45M
shvit_s3	77.340	93.332	1.20G	600.83M	14.21M
shvit_s4	79.378	94.374	1.97G	986.63M	16.55M
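In this table the FLOPs column is about twice the MACs column, consistent with counting one multiply-accumulate as two floating-point operations. The sketch below checks that ratio on a few rows copied from the table; `parse_count` and `flops_to_macs_ratio` are hypothetical helpers, and small deviations from 2 come from the rounding of the printed values.

```python
def parse_count(text):
    """Convert strings such as '603.77M' or '1.60G' to a float."""
    scale = {"M": 1e6, "G": 1e9}
    return float(text[:-1]) * scale[text[-1]]


def flops_to_macs_ratio(flops, macs):
    """Ratio of the FLOPs column to the MACs column for one row."""
    return parse_count(flops) / parse_count(macs)


# (model, FLOPs, MACs) copied from the table above.
COMPUTE_ROWS = [
    ("swiftformer_xs", "1.21G", "603.77M"),
    ("fasternet_t0", "671.38M", "335.69M"),
    ("fasternet_t2", "3.81G", "1.90G"),
    ("shvit_s3", "1.20G", "600.83M"),
]

for name, flops, macs in COMPUTE_ROWS:
    print(f"{name}: FLOPs / MACs = {flops_to_macs_ratio(flops, macs):.3f}")
```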

Short Test Code

from typing import Any, Dict, Union, List
from tqdm import tqdm
import torch
from torch.utils.data import DataLoader
import torchvision.datasets as datasets
import torchvision.transforms as transforms
import timm
from timm.utils.metrics import AverageMeter, accuracy
from timm.utils.model import reparameterize_model
from timm.models.swiftformer import checkpoint_filter_fn

device = torch.device('mps')
torch.mps.empty_cache()

def get_model_acc(model: torch.nn.Module):
    cfg: Dict[str, Any] = model.default_cfg
    # Prefer the test-time input size and crop ratio when the config defines them.
    _, height, width = cfg['input_size'] if 'test_input_size' not in cfg else cfg['test_input_size']
    crop_pct = cfg['crop_pct'] if 'test_crop_pct' not in cfg else cfg['test_crop_pct']
    imgsz = height if height == width else (height, width)
    # Square inputs resize the shorter side; non-square inputs scale each side.
    resize_size = int(imgsz / crop_pct) if height == width else tuple(int(s / crop_pct) for s in imgsz)
    val_dataset = datasets.ImageFolder(
        './imagenet/val',
        transforms.Compose([
            # InterpolationMode accepts the config string directly, e.g. "bicubic".
            transforms.Resize(resize_size, interpolation=transforms.InterpolationMode(cfg['interpolation'])),
            transforms.CenterCrop(imgsz),
            transforms.ToTensor(),
            transforms.Normalize(cfg['mean'], cfg['std'])])
    )
    val_loader = DataLoader(
        val_dataset, batch_size=64, shuffle=False, pin_memory=False, prefetch_factor=4, num_workers=4,
        persistent_workers=True#, pin_memory_device='mps'
    )

    top1 = AverageMeter()
    top5 = AverageMeter()

    model.eval()
    model = reparameterize_model(model)
    model.to(device)

    torch.mps.synchronize()
    with torch.inference_mode():
        for images, target in tqdm(val_loader):
            images = images.to(device)
            target = target.to(device)
            output = model(images)
            acc1, acc5 = accuracy(output, target, topk=(1, 5))
            top1.update(acc1, images.size(0))
            top5.update(acc5, images.size(0))
    torch.mps.synchronize()
    return {"ACC@1": round(top1.avg.item(), 4), "ACC@5": round(top5.avg.item(), 4)}

model_weight_list = {
    "swiftformer_xs": "./SwiftFormer_XS.pth",
    "swiftformer_s": "./SwiftFormer_S.pth",
    "swiftformer_l1": "./SwiftFormer_L1.pth",
    "swiftformer_l3": "./SwiftFormer_L3.pth",
}


if __name__ == "__main__":
    model_list = timm.list_models('swiftformer*', pretrained=False)
    print(model_list)
    for name in model_list:
        torch.mps.empty_cache()
        model = timm.create_model(name, pretrained=False).eval()
        weight = torch.load(model_weight_list[name], map_location='cpu', weights_only=True)
        weight = checkpoint_filter_fn(weight, model)
        model.load_state_dict(weight)
        result = get_model_acc(model)
        print(name, result)

Output

['swiftformer_l1', 'swiftformer_l3', 'swiftformer_s', 'swiftformer_xs']
swiftformer_l1 {'ACC@1': 80.902, 'ACC@5': 95.378}
swiftformer_l3 {'ACC@1': 83.0, 'ACC@5': 96.238}
swiftformer_s {'ACC@1': 78.466, 'ACC@5': 93.972}
swiftformer_xs {'ACC@1': 75.586, 'ACC@5': 92.326}
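The test script updates each meter with the batch accuracy and the batch size, so the final average is weighted by the number of images rather than by the number of batches. The pure-Python sketch below illustrates that idea; `RunningAverage` and `topk_accuracy` are simplified stand-ins written for this example, not timm's `AverageMeter` or `accuracy`, and the scores are made up.

```python
class RunningAverage:
    """Sample-weighted running mean (a stand-in, not timm's AverageMeter)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value, n=1):
        # Weight each batch value by the number of samples it covers.
        self.total += value * n
        self.count += n

    @property
    def avg(self):
        return self.total / self.count


def topk_accuracy(scores, targets, k=1):
    """Percentage of samples whose target is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += target in topk
    return 100.0 * hits / len(targets)


# Two made-up batches of different sizes (3 samples, then 1 sample).
batches = [
    ([[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]], [1, 0, 0]),
    ([[0.3, 0.4, 0.3]], [1]),
]

meter = RunningAverage()
for scores, targets in batches:
    meter.update(topk_accuracy(scores, targets, k=1), len(targets))

print(f"Top-1 over all samples: {meter.avg:.3f}")
```

Here three of the four samples are classified correctly, so the weighted result is 75%, whereas an unweighted mean of the two batch accuracies would overstate it.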

rwightman · 2025-05-26T23:01:49Z

@brianhou0208 woops, missed those drive links... okay, all sorted. Weights on the hub waiting for final checks

rwightman · 2025-05-27T01:50:17Z

@brianhou0208 thanks, all merged

brianhou0208 added 13 commits April 29, 2025 00:58

add swiftformer

7a87199

update swiftformer

081e6c2

add starnet

b60a4fb

add shvit

77788f4

add test_models

91e6e17

add fasternet

a3e66b1

update fasternet

7fc0692

update group_matcher

89d2952

Merge branch 'main' into fast_model

cb98094

fix fasternet

1922ca5

add ghostnetv3

236b00c

Merge branch 'main' into fast_model

bea1137

Merge branch 'main' into fast_model

75823ab

Updated faster models w/ hub weight locations, commented out some checkpoint filter fns and minor renames

4209788

rwightman merged commit 3114a9f into huggingface:main May 27, 2025
22 checks passed
