Copyright (c) 2024 Baltsat Konstantin, Tarasov Artem, Vandanov Sergey, Serov Alexandr
The trained model weights distributed through this project are licensed under the Creative Commons Attribution-NonCommercial 4.0 International licence — the same licence that governs the upstream training data (Happy Whale). A model is a derivative of its training set; we cannot relax the upstream restriction.
Earlier drafts of this repository labelled the models as Apache 2.0. That
labelling was inconsistent with the upstream Happy Whale dataset’s
CC-BY-NC-4.0 terms, and the issue was flagged during expert review of the
intermediate scientific and technical report (NTO, round 4). The correct canonical licence is CC-BY-NC-4.0.
The Hugging Face mirror at 0x0000dead/ecomarineai-cetacean-effb4 matches
this file.
⚠️ COMMERCIAL USE RESTRICTIONS
The trained models in this repository were developed using datasets that include data licensed under CC-BY-NC-4.0 (Creative Commons Attribution-NonCommercial 4.0 International). As a result:
- The trained models may NOT be used for commercial purposes without explicit permission from the original data providers (Happy Whale and the Ministry of Natural Resources of the Russian Federation).
- This restriction applies specifically to the trained model weights (.pt, .pth, .onnx files) and any derivatives thereof.
- Under EU law, models trained on non-commercial data inherit those restrictions when the model is sold or used commercially.
- Pretrained Models (Transfer Learning): This project uses pretrained models (ResNet, EfficientNet, ViT, Swin Transformer) that were originally trained on ImageNet. ImageNet has non-commercial, research-only terms, which adds another layer of restriction on top of the Happy Whale and Ministry of Natural Resources data.
⚠️ CRITICAL: ImageNet Pretrained Weights Restrictions
Our fine-tuned models are built upon pretrained models from the following sources, all of which use ImageNet for initial training:
| Pretrained Model | Source | Code License | Pretrained Weights | ImageNet Terms |
|---|---|---|---|---|
| ResNet-50/101 | torchvision | BSD-3-Clause | ImageNet-1k | Non-commercial |
| EfficientNet-B0/B5 | TIMM (Google) | Apache 2.0 | ImageNet-1k | Non-commercial |
| tf_efficientnet_b0_ns | Google Noisy Student | Apache 2.0 | ImageNet + JFT-300M | Non-commercial |
| ViT-B/16, ViT-L/32 | Google ViT | Apache 2.0 | ImageNet-21k | Non-commercial |
| Swin-T, Swin-L | Microsoft Swin | MIT | ImageNet-22k | Non-commercial |
| ConvNeXt-L | Facebook ConvNeXt | CC-BY-NC 4.0 | ImageNet-22k | Non-commercial |
ImageNet Dataset Terms:
Implications for This Project:
Our models face triple restrictions on commercial use:
Any ONE of these restrictions is sufficient to prohibit commercial use. All three apply simultaneously.
When using our models, you must also acknowledge the pretrained model sources:
Pretrained Model Attributions:
- ResNet: torchvision (BSD-3-Clause), https://github.com/pytorch/vision, trained on ImageNet-1k
- EfficientNet: TIMM/Google (Apache 2.0), https://github.com/huggingface/pytorch-image-models, trained on ImageNet-1k
- Vision Transformer: Google Research (Apache 2.0), https://github.com/google-research/vision_transformer, trained on ImageNet-21k
- Swin Transformer: Microsoft (MIT), https://github.com/microsoft/Swin-Transformer, trained on ImageNet-22k
- ConvNeXt: Facebook/Meta (CC-BY-NC 4.0), https://github.com/facebookresearch/ConvNeXt, trained on ImageNet-22k
All pretrained weights subject to ImageNet non-commercial terms (https://www.image-net.org/download.php).
✅ Research and Educational Use
✅ Personal and Non-Commercial Use
✅ Government and Conservation Organizations
The following trained models are subject to this license:
| Model Name | Architecture | Version | Status | File Location |
|---|---|---|---|---|
| efficientnet_b4_512_fold0.ckpt | EfficientNet-B4 (ArcFace head, 13 837 / 15 587 slots) | effb4-arcface-v1 | Production | whales_be_service/src/whales_be_service/models/efficientnet_b4_512_fold0.ckpt |
| encoder_classes.npy | Label encoder for ArcFace head | effb4-arcface-v1 | Production | whales_be_service/src/whales_be_service/models/encoder_classes.npy |
| resnet101.pth | ResNet-101 (ArcFace, fallback backbone) | v1.0 | Fallback | whales_be_service/src/whales_be_service/models/resnet101.pth |
| model-e15.pt | Vision Transformer L/32 (legacy Stage 1) | v1.0 (epoch 15) | Deprecated | models/model-e15.pt (not auto-downloaded; Yandex Disk only) |
| Other experimental models | Various | - | Research | models/*.pt, models/*.pth |
Note: ONNX-optimized models (.onnx files) are also subject to the same license terms.
The anti-fraud gate uses OpenCLIP ViT-B/32 LAION-2B pretrained weights, which are released under their own upstream licence (MIT / permissive). The EcoMarineAI calibrated threshold file (anti_fraud_threshold.yaml) is an artefact of this project and inherits CC-BY-NC-4.0 from the training data.
The production models are distributed through:
Download via ./scripts/download_models.sh — the script automatically verifies SHA256 checksums against models/checksums.sha256 and retries up to 3 times on network errors.
Models are NOT stored directly in the GitHub repository due to size constraints (.gitignore exclusion).
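
The checksum verification and retry behaviour described above can be sketched in Python. This is an illustrative sketch, not the contents of `download_models.sh`: the function names (`parse_checksums`, `sha256_of`, `fetch_verified`) are hypothetical, the checksum file is assumed to use the `sha256sum` line format (`<hex digest>  <filename>`), "up to 3 times" is read as three attempts in total, and a checksum mismatch is assumed to abort rather than retry.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Dict


def parse_checksums(text: str) -> Dict[str, str]:
    """Parse sha256sum-style lines ('<hex digest>  <filename>') into a dict.

    Assumption: models/checksums.sha256 uses this format.
    """
    expected: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary-mode entries with a leading '*'.
        expected[name.lstrip("*")] = digest.lower()
    return expected


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints are never fully in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def fetch_verified(
    fetch: Callable[[Path], None],
    dest: Path,
    expected_digest: str,
    max_attempts: int = 3,
    delay_s: float = 1.0,
) -> Path:
    """Call fetch(dest), retrying on network errors, then verify SHA256.

    `fetch` is a caller-supplied download function (hypothetical here).
    Network failures are assumed to surface as OSError subclasses.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            fetch(dest)
        except OSError as err:
            last_error = err
            if attempt < max_attempts:
                time.sleep(delay_s)
            continue
        if sha256_of(dest) == expected_digest.lower():
            return dest
        # Assumption: a corrupted file is removed and reported, not retried.
        dest.unlink(missing_ok=True)
        raise ValueError(f"SHA256 mismatch for {dest.name}")
    raise RuntimeError(
        f"download failed after {max_attempts} attempts"
    ) from last_error
```

A caller would read `models/checksums.sha256` with `parse_checksums`, look up the digest for each file, and pass its own download function as `fetch`. Deleting a file that fails verification keeps a corrupted checkpoint from being loaded later by mistake.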
When using these models, you must provide proper attribution:
@misc{whales-identification-2024,
author = {Baltsat, Konstantin and Tarasov, Artem and Vandanov, Sergey and Serov, Alexandr},
title = {EcoMarineAI: Automated Whale and Dolphin Identification from Aerial Photography},
year = {2024},
publisher = {GitHub},
url = {https://github.com/0x0000dead/whales-identification},
note = {Models trained on Happy Whale and Ministry of Natural Resources RF data}
}
Additionally, you must acknowledge the original data sources:
The following uses are explicitly prohibited:
- ❌ Commercial exploitation without data provider consent
- ❌ Applications that harm marine mammals or their habitats
- ❌ Surveillance or tracking of marine mammals for hunting purposes
- ❌ Misrepresentation of model capabilities or accuracy
- ❌ Use in contexts that violate local wildlife protection laws
Model versions follow semantic versioning: vMAJOR.MINOR.PATCH
Current Stable Version: v1.0 (January 2025)
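
As a rough illustration of the `vMAJOR.MINOR.PATCH` scheme, the sketch below parses a version tag into a comparable tuple. The helper name `parse_model_version` is hypothetical, and treating a missing PATCH component as 0 (so that `v1.0` is read as `v1.0.0`) is an assumption made here, not something this licence specifies. Identifiers such as `effb4-arcface-v1` fall outside this scheme and are rejected.

```python
import re
from typing import Tuple

# 'v' followed by MAJOR.MINOR and an optional .PATCH component.
_VERSION_RE = re.compile(r"v(\d+)\.(\d+)(?:\.(\d+))?")


def parse_model_version(tag: str) -> Tuple[int, int, int]:
    """Return (major, minor, patch) for a tag such as 'v1.0' or 'v1.2.3'.

    Assumption: an omitted PATCH component is treated as 0.
    """
    match = _VERSION_RE.fullmatch(tag.strip())
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR[.PATCH] tag: {tag!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)
```

Because the result is a tuple of integers, versions compare numerically: `parse_model_version("v1.10.0") > parse_model_version("v1.9.0")` holds, whereas comparing the raw strings would order them the other way.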
Model Card: See MODEL_CARD.md for detailed performance metrics, training data specifications, and evaluation results.
For inquiries regarding commercial use, custom licensing, or partnerships:
THE MODELS ARE PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MODELS OR THE USE OR OTHER DEALINGS IN THE MODELS.
The models’ predictions should not be the sole basis for critical conservation decisions. Always validate model outputs with expert marine biologist review.
This license may be updated to reflect changes in:
Last Updated: January 2025
Version: 1.0
Effective Date: January 2025