Base URL: http://localhost:8000 (or whatever you set via ALLOWED_ORIGINS).
Interactive OpenAPI documentation is auto-served at GET /docs (Swagger UI) and GET /redoc (ReDoc).

GET /health

Liveness probe. Returns status: ok iff the service process is up; device reports the inference device (cuda:0 when the GPU is passed through, e.g. via docker-compose.gpu.yml, otherwise cpu).
curl http://localhost:8000/health
Response:
{"status": "ok", "device": "cpu"}
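
A deployment script might wait on this probe before sending traffic. The following is a minimal sketch using only the standard library; `is_ready`, `wait_until_ready`, and the `require_gpu` option are hypothetical client-side helpers, not part of the service.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # base URL given in this document


def is_ready(payload: dict, require_gpu: bool = False) -> bool:
    """Return True when a /health payload reports a live service.

    require_gpu is a hypothetical client-side option: when set, the
    reported device must start with "cuda" (e.g. "cuda:0").
    """
    if payload.get("status") != "ok":
        return False
    if require_gpu:
        return str(payload.get("device", "")).startswith("cuda")
    return True


def wait_until_ready(base_url: str = BASE_URL, attempts: int = 10,
                     delay_s: float = 1.0, require_gpu: bool = False) -> bool:
    """Poll GET /health until it reports ready or the attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if is_ready(json.load(resp), require_gpu):
                    return True
        except (OSError, ValueError):
            pass  # service not reachable yet, or the body was not JSON
        time.sleep(delay_s)
    return False
```

Calling `wait_until_ready(require_gpu=True)` would additionally treat a CPU-only start as not ready, which may be useful when the GPU compose file is expected to be in effect.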

GET /metrics

Prometheus-compatible plain-text metrics. A scrape interval of 15 s is fine.
Exposed counters and gauges:
| Name | Type | Meaning |
|---|---|---|
| uptime_seconds | counter | Seconds since process start |
| availability_percent | gauge | (requests − errors) / requests × 100 |
| requests_total | counter | All HTTP requests |
| errors_total | counter | Requests that returned a status ≥ 400 |
| predictions_total | counter | Successful (not rejected) predictions |
| rejections_total | counter | Rejections (either anti-fraud or low-confidence) |
| rejections_by_reason{...} | counter | Rejection breakdown by not_a_marine_mammal/low_confidence |
| latency_avg_ms | gauge | Mean HTTP latency |
| cetacean_score_avg | gauge | Rolling mean of the CLIP positive score |
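
To make the availability formula concrete, here is a small sketch that parses a metrics payload and recomputes availability_percent from requests_total and errors_total. It assumes simple `name value` lines without timestamps; `parse_metrics` and `availability_percent` are illustrative helpers, and the value returned when no requests have been seen yet is an assumption, since the document does not say what the service reports in that case.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus-style 'name value' lines into a dict.

    Assumes no trailing timestamps; comment lines (starting with '#')
    and lines whose last token is not a number are skipped.
    """
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            continue
    return metrics


def availability_percent(requests_total: float, errors_total: float) -> float:
    """(requests - errors) / requests x 100, as defined in the table above."""
    if requests_total <= 0:
        return 100.0  # assumption: no traffic yet counts as fully available
    return 100.0 * (requests_total - errors_total) / requests_total


sample = "requests_total 200\nerrors_total 5\n"
parsed = parse_metrics(sample)
print(availability_percent(parsed["requests_total"], parsed["errors_total"]))  # 97.5
```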

POST /v1/predict-single

Identify a single image. Returns a Detection object whether the image is accepted or rejected. A response with rejected: true is still HTTP 200, because rejection is a successful classification (“this is not a whale”).
- Content-Type: multipart/form-data
- file: one image (image/jpeg, image/png, image/webp, image/bmp).

curl -X POST \
-F 'file=@whale.jpg;type=image/jpeg' \
http://localhost:8000/v1/predict-single
Response (accepted):
{
  "image_ind": "whale.jpg",
  "bbox": [0, 0, 512, 341],
  "class_animal": "a6e325d8e924",
  "id_animal": "bottlenose_dolphin",
  "probability": 0.0756,
  "mask": "iVBORw0KGgoAAAANS...",
  "is_cetacean": true,
  "cetacean_score": 0.9997,
  "rejected": false,
  "rejection_reason": null,
  "model_version": "effb4-arcface-v1",
  "candidates": [
    {"class_animal": "a6e325d8e924", "id_animal": "bottlenose_dolphin", "probability": 0.0756},
    {"class_animal": "208b91b1ca2b", "id_animal": "bottlenose_dolphin", "probability": 0.0625},
    {"class_animal": "e5b92928d76e", "id_animal": "bottlenose_dolphin", "probability": 0.0301}
  ]
}
Response (rejected, not a marine mammal):
{
  "image_ind": "text_screenshot.png",
  "bbox": [0, 0, 800, 600],
  "class_animal": "",
  "id_animal": "unknown",
  "probability": 0.0,
  "mask": null,
  "is_cetacean": false,
  "cetacean_score": 0.08,
  "rejected": true,
  "rejection_reason": "not_a_marine_mammal",
  "model_version": "effb4-arcface-v1"
}
Response (rejected, corrupted image):
{
  "image_ind": "broken.jpg",
  "bbox": [0, 0, 0, 0],
  "class_animal": "",
  "id_animal": "unknown",
  "probability": 0.0,
  "mask": null,
  "is_cetacean": false,
  "cetacean_score": 0.0,
  "rejected": true,
  "rejection_reason": "corrupted_image",
  "model_version": "effb4-arcface-v1"
}
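
In the accepted example, every entry in candidates is a different individual of the same species, and each individual probability is low. One thing a client might do with this, purely as an illustration and not something the service itself is documented to do, is sum candidate probabilities per id_animal to get a rough species-level view. `species_totals` is a hypothetical helper.

```python
from collections import defaultdict


def species_totals(detection: dict) -> dict[str, float]:
    """Sum candidate probabilities per species (id_animal).

    Rejected responses may carry no 'candidates' key, in which case the
    result is an empty dict.
    """
    totals: dict[str, float] = defaultdict(float)
    for cand in detection.get("candidates", []):
        totals[cand["id_animal"]] += cand["probability"]
    return dict(totals)


accepted = {
    "rejected": False,
    "candidates": [
        {"class_animal": "a6e325d8e924", "id_animal": "bottlenose_dolphin", "probability": 0.0756},
        {"class_animal": "208b91b1ca2b", "id_animal": "bottlenose_dolphin", "probability": 0.0625},
        {"class_animal": "e5b92928d76e", "id_animal": "bottlenose_dolphin", "probability": 0.0301},
    ],
}
for species, total in species_totals(accepted).items():
    print(f"{species}: {total:.4f}")
```

The summed value only covers the candidates that were returned, so it should be read as a lower bound on whatever species-level mass the model assigns, not as a calibrated probability.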
Errors:
| Code | Condition | Example body |
|---|---|---|
| 415 | Missing / non-image content type | {"detail": "Только изображения."} ("Images only.") |
| 400 | Empty upload | {"detail": "Пустой файл."} ("Empty file.") |
| 415 | PIL can’t decode the payload | {"detail": "Не удалось распознать изображение."} ("Could not recognize the image.") |
| 429 | Rate-limit exceeded (60 req / 60 s / IP) | {"detail": "Превышен лимит запросов. Повторите позже."} ("Request limit exceeded. Try again later.") |
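
Clients that may exceed the 60 requests per 60 s limit can retry on 429. Below is a minimal sketch of exponential backoff; `call_with_retry` is a hypothetical helper, the `send` callable stands in for whatever HTTP call the client makes, and the delays are arbitrary defaults. Whether the service sends a Retry-After header is not stated here, so the sketch does not rely on one.

```python
import time
from typing import Callable


def call_with_retry(
    send: Callable[[], tuple[int, dict]],
    max_attempts: int = 4,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, dict]:
    """Call send() and retry with exponential backoff while it returns 429.

    send() must return (status_code, json_body). Any status other than
    429 is returned immediately, including 4xx errors such as 415 or 400,
    which retrying would not fix.
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        sleep(base_delay_s * 2 ** (attempt - 1))  # 1 s, 2 s, 4 s, ...
        status, body = send()
    return status, body
```

With `requests`, `send` could be a small closure that posts the file and returns `(r.status_code, r.json())`; passing a custom `sleep` makes the helper easy to exercise without real delays.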

POST /v1/predict-batch

Identify every image in a ZIP archive. Returns a list of Detection objects (one per readable image).
- Content-Type: multipart/form-data
- archive: one ZIP (application/zip or application/x-zip-compressed).

zip batch.zip whale1.jpg whale2.jpg cat.jpg
curl -X POST \
-F 'archive=@batch.zip;type=application/zip' \
http://localhost:8000/v1/predict-batch
[
{ "image_ind": "whale1.jpg", "is_cetacean": true, "rejected": false, ...},
{ "image_ind": "whale2.jpg", "is_cetacean": true, "rejected": false, ...},
{ "image_ind": "cat.jpg", "is_cetacean": false, "rejected": true,
"rejection_reason": "not_a_marine_mammal", ...}
]
The batch endpoint omits the mask field by default (rembg is slow).

Errors:
| Code | Condition | Example body |
|---|---|---|
| 415 | Non-ZIP content type | {"detail": "Ожидается ZIP-архив."} ("A ZIP archive is expected.") |
| 400 | Malformed ZIP | {"detail": "Не удаётся распаковать архив."} ("Cannot unpack the archive.") |
| 429 | Rate-limit | {"detail": "Превышен лимит запросов..."} ("Request limit exceeded...") |
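
Because the batch response contains one Detection per readable image, a client may want to check which archive entries came back and which did not. The sketch below builds the ZIP in memory with the standard library and compares entry names against image_ind values; `build_zip`, `split_batch`, and `missing_entries` are hypothetical helpers, and it assumes image_ind equals the ZIP entry name, as the Detection schema comment suggests.

```python
import io
import zipfile


def build_zip(files: dict[str, bytes]) -> bytes:
    """Pack {entry_name: raw_bytes} into an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def split_batch(detections: list[dict]) -> tuple[list[str], dict[str, str]]:
    """Return (accepted image names, {rejected image name: reason})."""
    accepted = [d["image_ind"] for d in detections if not d["rejected"]]
    rejected = {d["image_ind"]: d["rejection_reason"]
                for d in detections if d["rejected"]}
    return accepted, rejected


def missing_entries(sent_names: list[str], detections: list[dict]) -> list[str]:
    """Entries that were uploaded but have no Detection in the response."""
    returned = {d["image_ind"] for d in detections}
    return [name for name in sent_names if name not in returned]
```

The bytes from `build_zip` can be posted as the archive form field with content type application/zip, mirroring the curl call above.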

GET /v1/drift-stats

Rolling-window summary of CLIP cetacean_score values seen by the service. Useful as a lightweight drift signal.
{
"n": 5,
"alarms_total": 0,
"score_mean": 0.2111,
"score_std": 0.3947,
"probability_mean": 0.0151
}
- n: number of predictions in the rolling window (max 1000).
- alarms_total: number of times the window mean dropped more than 10 percentage points below the calibrated baseline.

POST /predict-single and POST /predict-batch (without the /v1 prefix) delegate to the v1 versions so legacy clients don’t break during upgrades.
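
The rolling-window statistics reported by GET /v1/drift-stats can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's implementation: `DriftWindow` is a hypothetical class, the baseline value is supplied by the caller, the standard deviation is the population form, and the alarm counter is incremented on every observation for which the window mean is more than 0.10 below the baseline.

```python
from collections import deque
from statistics import fmean, pstdev


class DriftWindow:
    """Rolling window over cetacean_score values with a simple mean-drop alarm."""

    def __init__(self, baseline: float, max_size: int = 1000, drop: float = 0.10):
        self.baseline = baseline             # calibrated mean score (caller-supplied)
        self.drop = drop                     # 10 percentage points on a 0-1 scale
        self.scores = deque(maxlen=max_size) # oldest scores fall out automatically
        self.alarms_total = 0

    def observe(self, cetacean_score: float) -> None:
        """Add one score and count an alarm if the window mean is too low."""
        self.scores.append(cetacean_score)
        if fmean(self.scores) < self.baseline - self.drop:
            self.alarms_total += 1

    def stats(self) -> dict:
        """Summary shaped like the drift-stats response (subset of fields)."""
        n = len(self.scores)
        return {
            "n": n,
            "alarms_total": self.alarms_total,
            "score_mean": round(fmean(self.scores), 4) if n else 0.0,
            "score_std": round(pstdev(self.scores), 4) if n else 0.0,
        }
```

A bounded deque keeps memory constant and makes n saturate at the window size, which matches the documented maximum of 1000.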
from typing import Literal

from pydantic import BaseModel, Field


# Candidate is defined first so that Detection can reference it.
class Candidate(BaseModel):
    class_animal: str    # 12-hex individual_id
    id_animal: str       # species name
    probability: float   # 0.0–1.0


class Detection(BaseModel):
    image_ind: str                  # filename or ZIP entry
    bbox: list[int]                 # [x1, y1, x2, y2]
    class_animal: str               # 12-hex individual_id, "" on reject
    id_animal: str                  # species name or "unknown"
    probability: float              # 0.0–1.0 identification confidence
    mask: str | None = None         # base64 PNG, optional
    is_cetacean: bool = True        # CLIP gate decision
    cetacean_score: float = Field(ge=0, le=1, default=1.0)  # gate positive softmax
    rejected: bool = False          # true if gate or low-confidence fired
    rejection_reason: Literal[
        "not_a_marine_mammal", "low_confidence", "corrupted_image"
    ] | None = None
    model_version: str = "effb4-arcface-v1"
    candidates: list[Candidate] = []  # top-k alternatives
All new fields (is_cetacean onwards) have defaults, so the response is a strict superset of the v1.0 shape; old clients continue to parse new responses without changes.
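
To illustrate the superset claim, here is a sketch of how a client written against the original six fields could keep parsing newer responses. `LegacyDetection` is a hypothetical client-side type, and the sketch assumes the old client ignores keys it does not know; a parser configured to reject unknown keys would not be covered by this guarantee.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class LegacyDetection:
    """Fields an older client would know about (those before is_cetacean)."""
    image_ind: str
    bbox: list[int]
    class_animal: str
    id_animal: str
    probability: float
    mask: Optional[str] = None

    @classmethod
    def from_response(cls, payload: dict) -> "LegacyDetection":
        # Keep only known keys; newer fields such as 'rejected' are ignored.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known})


new_response = {
    "image_ind": "whale.jpg",
    "bbox": [0, 0, 512, 341],
    "class_animal": "a6e325d8e924",
    "id_animal": "bottlenose_dolphin",
    "probability": 0.0756,
    "mask": None,
    "is_cetacean": True,
    "cetacean_score": 0.9997,
    "rejected": False,
    "rejection_reason": None,
    "model_version": "effb4-arcface-v1",
    "candidates": [],
}
legacy = LegacyDetection.from_response(new_response)
print(legacy.id_animal, legacy.probability)
```

Note that such a client cannot see rejected, so on a rejection it would read class_animal "" and id_animal "unknown" with probability 0.0, as in the rejected examples above.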
import requests

# Upload one image and print either the rejection reason or the match.
with open("whale.jpg", "rb") as f:
    r = requests.post(
        "http://localhost:8000/v1/predict-single",
        files={"file": ("whale.jpg", f, "image/jpeg")},
        timeout=30,
    )
r.raise_for_status()
det = r.json()
if det["rejected"]:
    print(f"Rejected: {det['rejection_reason']} (score={det['cetacean_score']})")
else:
    print(f"{det['id_animal']} — {det['class_animal']} @ {det['probability']:.2%}")