EcoMarineAI — Solution overview
Audience: project sponsors, grant reviewers, environmental-agency stakeholders, marine biologists without ML background.
What is EcoMarineAI?
EcoMarineAI is an open-source AI library and web service that identifies individual whales and dolphins from aerial photographs. Upload a photo → get back the species, individual ID (when the animal is in the training set), confidence score, and — critically — an answer to “is this even a cetacean at all?”.
It is designed to close three concrete gaps in current marine-mammal monitoring:
- Time cost. Manual identification of individual whales from aerial surveys takes trained biologists minutes per frame. A drone flight generates thousands of frames.
- Consistency. Different labellers disagree on hard cases. A single model applied uniformly gives reproducible results.
- Error visibility. Traditional identification pipelines silently return something even on an image that isn’t a whale at all. EcoMarineAI’s anti-fraud gate explicitly rejects non-cetacean inputs with a documented reason.
Who uses it?
| User |
How they interact |
Value |
| Marine biologist (field) |
Web UI or whales-cli predict |
Drop a photo, get species + confidence in < 1 second |
| Research lab |
Python CLI batch mode + CSV/SQL exports |
Process thousands of images into a searchable database |
| Conservation NGO |
REST API /v1/predict-single |
Integrate predictions into a dashboard or mobile survey app |
| Government monitoring agency |
/metrics Prometheus + /v1/drift-stats |
Continuous oversight of system availability and model drift |
| ML researcher |
Pretrained checkpoint on Hugging Face |
Fine-tune on their own species, publish derivative work |
| Citizen scientist |
Web UI |
Contribute photos, learn about cetaceans |
What it does (user-facing)
- Accepts any RGB image (JPEG/PNG/WEBP/BMP) or a ZIP archive of many images.
- Filters out anything that isn’t a photo of a whale or dolphin (CLIP zero-shot anti-fraud gate, TNR ≥ 90.2%).
- Identifies the individual animal from 13 837 known cetacean individuals across 30 species.
- Returns a structured response with species name, individual ID, confidence, and an optional background-removed mask.
- Logs every prediction to an in-memory drift monitor so operators see degradation before users complain.
- Exports results to CSV, SQLite, or PostgreSQL via the
integrations/ scripts.
What’s under the hood
Two stages running in series:
- Anti-fraud gate: OpenCLIP ViT-B/32 trained on LAION-2B. Zero-shot classification against 10 positive prompts (whale / dolphin / cetacean descriptors) and 14 negative prompts (text / buildings / fish / sharks / landscapes). Calibrated threshold chosen to give TNR ≥ 0.90 while preserving TPR ≥ 0.85.
- Identification model: EfficientNet-B4 backbone + ArcFace head on 13 837 individuals. Cosine similarity between the image embedding and the class centroids gives a naturally interpretable confidence score.
Full technical details in ML_ARCHITECTURE.md.
All numbers below are computed by scripts/compute_metrics.py on a reproducible in-repo test split (100 positives from Happy Whale + 102 negatives from Intel Image Dataset, total 202 images).
| Metric |
Measured |
ТЗ target |
Status |
| Sensitivity / TPR |
0.9500 |
> 0.85 |
✓ |
| Specificity / TNR |
0.9020 |
> 0.90 |
✓ |
| Precision |
0.9048 |
≥ 0.80 |
✓ |
| F1 |
0.9268 |
> 0.60 |
✓ |
| Latency (p95, CPU) |
519 ms |
≤ 8 000 ms |
✓ |
| Linear time complexity |
R² = 1.000 |
linear |
✓ |
| Noise robustness |
0.0 % drop |
≤ 20 % |
✓ |
| Identified individuals |
13 837 |
≥ 1 000 |
✓ |
Why it matters beyond this one project
- Open data × open models. Code is MIT, models inherit CC-BY-NC-4.0 from the upstream Happy Whale dataset, everything is reproducible from Kaggle + HuggingFace mirrors.
- Scientific rigour. Every number in this document comes from a script that any reviewer can re-run on their own laptop in under a minute.
- Extensibility. Adding a new species means adding rows to the training CSV and re-fitting the ArcFace head — the rest of the pipeline doesn’t change. Adding a new export integration is ~80 lines of Python (see
integrations/sqlite_sink.py).
- Failure visibility. The CLIP anti-fraud gate makes the system explicit about the edges of its knowledge. When you feed it a photo of your cat, it says so, loudly.
What’s next
See ROADMAP.md for the detailed plan by ФСИ-grant milestone.