EcoMarineAI touches three regulated areas: data licenses, software licenses, and environmental / ethical use. This document consolidates all three so that a legal or ethics reviewer can audit them in one place.
LICENSE in the repo root).

The combined training corpus has two sources; the derivative model inherits the most restrictive terms of both.

support@happywhale.com).

LICENSE_DATA.md.

| Use case | Permitted? | Notes |
|---|---|---|
| Academic research | ✓ | Attribute both sources |
| Educational use | ✓ | Accredited institutions |
| Non-profit conservation | ✓ | — |
| Scientific publications | ✓ | Cite both datasets |
| Government monitoring (RF) | ✓ | With ФСИ approval |
| Commercial products | ✗ | Blocked by CC-BY-NC and gov restrictions |
| Open-source tools (MIT) | ✓ | For non-commercial downstream use |
| Startups / for-profit orgs | ✗ | Unless they obtain a commercial licence |
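
The use-case table above can also be encoded as a small lookup, which may be convenient for an automated pre-check before a human review. This is a minimal sketch: the dictionary keys, the `UseCaseRule` type, and the `check_use_case` helper are hypothetical names, and treating unlisted use cases as not permitted is an assumption rather than a rule stated in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCaseRule:
    permitted: bool
    note: str


# Rows copied from the use-case table; the keys are hypothetical identifiers.
USE_CASE_RULES = {
    "academic_research": UseCaseRule(True, "Attribute both sources"),
    "educational_use": UseCaseRule(True, "Accredited institutions"),
    "nonprofit_conservation": UseCaseRule(True, ""),
    "scientific_publications": UseCaseRule(True, "Cite both datasets"),
    "government_monitoring_rf": UseCaseRule(True, "With ФСИ approval"),
    "commercial_products": UseCaseRule(False, "Blocked by CC-BY-NC and gov restrictions"),
    "open_source_tools_mit": UseCaseRule(True, "For non-commercial downstream use"),
    "startups_for_profit": UseCaseRule(False, "Unless they obtain a commercial licence"),
}

# Assumption: anything not listed in the table is refused by default.
DEFAULT_RULE = UseCaseRule(False, "Not listed in the table; ask the upstream providers")


def check_use_case(key: str) -> UseCaseRule:
    """Return the table row for a use case, or the default refusal if unlisted."""
    return USE_CASE_RULES.get(key, DEFAULT_RULE)


rule = check_use_case("commercial_products")
print(rule.permitted, "-", rule.note)
```

A lookup like this only mirrors the table; it does not replace the review of the licence texts themselves.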

| Component | Upstream | Licence |
|---|---|---|
| OpenCLIP ViT-B/32 | laion/CLIP-ViT-B-32-laion2B-s34B-b79K | Apache 2.0 |
| EfficientNet-B4 (ImageNet) | timm efficientnet_b4 pre-trained weights | Apache 2.0 |
| ArcFace head (fine-tuned) | ktakita/happywhale-exp004-effb4-trainall | MIT (Kaggle user content — CC0/MIT mixture) |
| ResNet-101 (legacy) | baltsat/Whales-Identification | MIT |
| rembg background removal | danielgatis/rembg | MIT |

All permissive, but each component’s attribution requirement is preserved in LICENSE_MODELS.md.
0x0000dead/ecomarineai-cetacean-effb4 on Hugging Face (HF) carries the combined licence, CC-BY-NC-4.0, which is the strictest of the inputs. The HF model card lists all upstream sources in the datasets front-matter and in the ## Licensing section.
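
The "strictest of the inputs" rule can be illustrated with a short sketch. The `RESTRICTIVENESS` ranking and the `strictest_licence` helper are hypothetical and exist only for illustration: real licence compatibility is not a simple total order, and the numeric ranks below (in particular MIT versus Apache 2.0) are an assumption, not something this document defines.

```python
# Hypothetical restrictiveness ranking (higher = more restrictive).
# Only licence identifiers that appear in this document are ranked.
RESTRICTIVENESS = {
    "CC0-1.0": 0,
    "MIT": 1,
    "Apache-2.0": 2,
    "CC-BY-NC-4.0": 3,
}


def strictest_licence(licences):
    """Return the most restrictive licence among the inputs under the ranking above."""
    licences = list(licences)
    if not licences:
        raise ValueError("at least one licence is required")
    unknown = [name for name in licences if name not in RESTRICTIVENESS]
    if unknown:
        # Refuse to guess: an unranked licence needs a human decision.
        raise KeyError(f"no ranking defined for: {unknown}")
    return max(licences, key=lambda name: RESTRICTIVENESS[name])


# Component licences from the table above, plus the non-commercial data licence.
inputs = ["Apache-2.0", "Apache-2.0", "MIT", "MIT", "MIT", "CC-BY-NC-4.0"]
print(strictest_licence(inputs))  # CC-BY-NC-4.0
```

Raising on an unranked identifier, rather than defaulting to a rank, keeps the sketch from silently classifying a licence nobody has reviewed.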
See LICENSES_ANALYSIS.md for the full 159-dependency license breakdown. Summary:
individual_id column refers to whales, not people).

DOCS/RESEARCH_NOTES.md.

LICENSE_MODELS.md (“Applications that harm marine mammals”, “Surveillance for hunting purposes”).

The source code alignment is complete. The accompanying research report (НТО, the scientific and technical report) must still follow GOST (ГОСТ) rules for its Russian-language deliverable. A non-exhaustive checklist of what the code side enables:

| GOST (ГОСТ) requirement | Source artifact |
|---|---|
| Bibliographic references to methods | RESEARCH_NOTES.md §6 |
| Reproducibility of results | scripts/compute_metrics.py, scripts/benchmark_* |
| Structured description of the architecture | ML_ARCHITECTURE.md |
| User interface documentation | USER_GUIDE_BIOLOGIST.md |
| API documentation | API_REFERENCE.md |
| Test plan | TESTING_STRATEGY.md |
| Deployment and operation | DEPLOYMENT.md, MLOPS_PLAYBOOK.md |
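
One way to confirm that the artifacts in the checklist table above are actually present is a small script like the following. It is a sketch under stated assumptions: the `missing_artifacts` helper is hypothetical, and because this document does not say which directory each file lives in, the search covers the whole repository tree by file name.

```python
import tempfile
from pathlib import Path

# File names copied from the checklist table; their directories are an
# assumption, so each name is searched for anywhere under the repo root.
CHECKLIST_ARTIFACTS = [
    "RESEARCH_NOTES.md",
    "compute_metrics.py",
    "benchmark_*",
    "ML_ARCHITECTURE.md",
    "USER_GUIDE_BIOLOGIST.md",
    "API_REFERENCE.md",
    "TESTING_STRATEGY.md",
    "DEPLOYMENT.md",
    "MLOPS_PLAYBOOK.md",
]


def missing_artifacts(repo_root):
    """Return the checklist patterns that match no file under repo_root."""
    root = Path(repo_root)
    return [p for p in CHECKLIST_ARTIFACTS if not any(root.rglob(p))]


# Demo on a throwaway directory that contains only one of the artifacts.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "DOCS"
    docs.mkdir()
    (docs / "RESEARCH_NOTES.md").touch()
    demo_missing = missing_artifacts(tmp)

print(demo_missing)
```

Running `missing_artifacts(".")` from the repository root would list any checklist entry that has no matching file; an empty list means every entry was found.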

For data-license questions, contact the upstream providers (Happy Whale and the Ministry of Natural Resources of the Russian Federation). For code-license questions, open a GitHub issue tagged legal.