Department of Computer Science and Engineering, University of Bologna, Cesena (FC), Italy
Computer Science and Engineering Department, The Ohio State University, Columbus (OH), USA
Wildlife behavior acquisition
A paramount tool for ethologists and biologists to gather insights into nature and to inform conservation efforts for endangered species.
Animal health monitoring
Behavioral changes induced by climate change or human activity
Current population level
Insights into future population levels
background image: Abujoy, License CC BY-SA
background image derived from: Abujoy, License CC BY-SA
GPS collars
Great position tracking
Possibly equipped with further sensors (temperature, accelerometer…)
Long battery life
No video
Invasive (requires capture and release) $\Rightarrow$ Limited sample size
author: Arddu, License CC Attribution 2.0 Generic
author: Winterline, License CC Attribution-Share Alike 3.0 Unported
author: Kalyan Varma, License CC BY-SA
author: Prashanthns, License CC BY-SA
Camera traps
Photos and potentially videos
Non-invasive
Multiple species
Static and with limited range
False triggers
Subject to vandalism and theft
Generally fragile (the tiger in the first picture destroyed the camera)
Fixed-wing drone aerial views
Very large area coverage
Long flights
Nadir imagery: good for mapping, bad for individual behavior
Requires specialized training
Predefined flight paths
Non-nadir perspective
Quadcopters and similar drones
Large area coverage
Although much smaller than that of fixed-wing drones
Non-nadir view is great for individual behavior
Multiple drones can get different perspectives
Dynamic trajectories
Noise may disturb wildlife
Relatively short battery life
Skilled pilots required
Practically impossible to coordinate multiple drones effectively by hand
$\Rightarrow$ Multi-Drone Coordination
No need for human pilots
Similar to well-known problems in the literature!
A special OMOkC
In the Online Multi-Object k-Coverage (OMOkC) problem,
drones coordinate to cover each interesting target with at least $k$ points of view.
Our problem is a variant of OMOkC, in which:
The focus is on animal groups rather than single animals $\Rightarrow$ Herd tracking
Drones have a blind zone due to their non-nadir point of view $\Rightarrow$ Blind zone
The position of animals within the Field-of-View dramatically changes the quality of the result $\Rightarrow$ FoV centrality
The angle at which a subject is being observed matters, lateral views are more informative than frontal ones $\Rightarrow$ Observation angle
Observers emit noise that may alter the behavior of the observed animals $\Rightarrow$ Noise pollution
Observation is performed in contexts with limited infrastructure $\Rightarrow$ Decentralized coordination
Contribution
A methodology to evaluate the performance in wildlife video acquisition
We define metrics for:
The centrality in the Field-of-View of each camera
The overall angles of observation of each animal
The noise pollution generated by the drones
We build simulations on top of a novel herd simulation algorithm derived from the KABR dataset
(Jenna presented the algorithm at SISSY on Monday)
$\Rightarrow$ We observe that pre-existing OMOkC algorithms do not perform as well as expected in our context,
and thus we propose to extend the current SOTA with:
A herd-aware decentralized multi-drone coordination algorithm
FoV Centrality
Let $P_c$ be the center of the FoV $\mathcal{V}$ of camera $c$.
Let $F(c)$ be the maximum distance from the center of the FoV: $F(c) = \max_{P \in \mathcal{V}} \left| P - P_c \right|$.
For any camera $c$, $F(c)$ is the distance of the worst possible position in its FoV.
For an animal $z$ located in $P_z$, a normalized estimate of how poorly it is positioned in the FoV of $c$ is the ratio between its distance to the center and $F(c)$: $\frac{\left| P_z - P_c \right|}{F(c)}$
The normalized FoV centrality for a target animal $z$ and a drone $c$ is then: $Q(z, c) = 1 - \frac{\left| P_z - P_c \right|}{F(c)}$
Generalized for a set of cameras $C$ observing a target $z$: $\Gamma(z) = \max_{c \in C} Q(z, c)$
TL;DR: the closer to the center, the better
find the worst possible position, to be used as a bound
use it to estimate how good the animal's position is for each camera
for each animal, consider only the best camera (as sketched below)
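For concreteness, a minimal NumPy sketch of these three steps, assuming a convex polygonal FoV footprint (so that $F(c)$ is attained at a vertex); all names are illustrative, not the implementation used for the experiments:

```python
import numpy as np

def fov_centrality(animal_pos, fov_center, fov_vertices):
    """Q(z, c) = 1 - |P_z - P_c| / F(c), where F(c) is the distance from
    the FoV center to its farthest point (a vertex, for a convex FoV)."""
    f_c = max(np.linalg.norm(v - fov_center) for v in fov_vertices)
    return 1.0 - np.linalg.norm(animal_pos - fov_center) / f_c

def best_centrality(animal_pos, cameras):
    """Gamma(z): for each animal, keep only the best camera."""
    return max(fov_centrality(animal_pos, c, v) for c, v in cameras)

# Example: one camera with a square FoV footprint centered at the origin.
fov = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])
print(best_centrality(np.array([0.5, 0.0]), [(np.zeros(2), fov)]))  # ~0.65
```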
Observation angle: body coverage
Ideas
the best observation comes from a perfectly perpendicular angle
the “longer” the side of the animal that is being observed, the better the observation
that’s why observations from the side are more valuable than frontal or back ones
small deviations from perpendicularity are not that bad
approximate the animal’s body with a polygon
for each segment $s$, find the camera $c$ observing the segment midpoint from the widest angle $\alpha_s \in [-\frac{\pi}{2}, \frac{\pi}{2}]$ (measured between the line of sight and the segment, so $|\alpha_s| = \frac{\pi}{2}$ is a perpendicular view): $c$ has the best available view for $s$
normalize $\alpha_s$ to $[0, 1]$ as $\frac{|\alpha_s|}{\pi/2}$
use a logistic function $\Phi: [0, 1] \rightarrow [0, 1]$, which penalizes the angles far from perpendicular more heavily: $\Phi(x;\mu,\nu)=\left[1+\left(\frac{x(1-\mu)}{\mu(1-x)}\right)^{-\nu}\right]^{-1}$, with $\mu=\frac{1}{2}$, $\nu=5$
get the observation quality for $s$: $\xi(s) = \Phi\left(\frac{|\alpha_s|}{\frac{\pi}{2}}; \frac{1}{2}, 5\right)$.
repeat for every “side” of the animal in $S_z$ to get the body coverage $\Diamond(z) = \frac{\sum_{s \in{} S_z} |s| \cdot \xi(s)}{|S_z|}$, where $|s|$ is the length of side $s$ and $|S_z| = \sum_{s \in S_z} |s|$ is the body perimeter
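A minimal sketch of the computation under the reading above ($\alpha_s$ measured between the line of sight and the segment, $|S_z|$ being the perimeter); names are illustrative:

```python
import numpy as np

def phi(x, mu=0.5, nu=5.0):
    """Generalized logistic on [0, 1]: flat near the extremes, steep around mu."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 1.0 / (1.0 + (x * (1.0 - mu) / (mu * (1.0 - x))) ** -nu)

def segment_quality(p0, p1, cameras):
    """xi(s): quality of the best available view of the side s = (p0, p1)."""
    mid, side = (p0 + p1) / 2.0, p1 - p0
    best = 0.0
    for cam in cameras:
        ray = mid - cam
        # |alpha_s| in [0, pi/2]: angle between the line of sight and the
        # segment; pi/2 corresponds to a perfectly perpendicular (lateral) view.
        cos_a = abs(np.dot(side, ray)) / (np.linalg.norm(side) * np.linalg.norm(ray))
        alpha = np.arccos(np.clip(cos_a, 0.0, 1.0))
        best = max(best, phi(alpha / (np.pi / 2.0)))
    return best

def body_coverage(polygon, cameras):
    """Diamond(z): length-weighted average of xi(s) over the body polygon."""
    sides = list(zip(polygon, np.roll(polygon, -1, axis=0)))
    lengths = [float(np.linalg.norm(p1 - p0)) for p0, p1 in sides]
    xis = [segment_quality(p0, p1, cameras) for p0, p1 in sides]
    return float(np.dot(lengths, xis) / sum(lengths))

# Example: a 2 x 1 rectangular body observed by one camera due "north":
# the long sides are seen laterally (xi ~ 1), the short ones head-on (xi ~ 0).
body = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 1.0]])
print(body_coverage(body, [np.array([1.0, 10.0])]))  # ~0.67
```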
Noise pollution
We need the Sound Pressure Level $L_P$ at the position of the animal.
Of course, manufacturers only provide the Sound Power Level $L_W$, a measure of the sound energy emitted by the drone.
To convert into the SPL at distance $r$ from the drone, we need a directivity factor $Q$:
$L_P = L_W - \left| 10 \log_{10} \left(\frac{Q}{4 \pi r^{2}}\right) \right| $
We assume $Q=1$ (spherical propagation) and $r=1\,m$ (a typical distance at which manufacturers measure drone noise).
The $L_P$ perceived by an animal $z$ at distance $d$ from the drone, accounting for the attenuation in air, is:
$L_{P_d}(z) = L_{P}(z) + 20 \log_{10} \left(\frac{r}{d}\right)$
For multiple drones $C$, their contributions sum:
$L_{P_T}(z) = 10 \log_{10} \left(\sum_{c \in{C}} 10^{\frac{L_{P_c}(z)}{10}} \right)$
To normalize in $[0, 1]$, we assume that a noise below $20dB$ (~ a ticking watch) can’t be distinguished from the background,
and a noise above $80dB$ (~ police car siren) will always disturb the animal.
Since noise is perceived non-linearly, we use a sigmoid with
$\mu=40dB$ (~ refrigerator hum, our proxy for the background noise).
The final normalized noise metric is thus $\rho(z) = \Phi\left(h(L_{P_T}(z)); h(\mu), 4\right)$, where $h$ linearly rescales $[20\,dB, 80\,dB]$ onto $[0, 1]$.
TL;DR
we assume noise propagates in air without major obstacles or reflections
we set silence at the sound of a ticking watch, and maximum noise at the level of a police siren
we sum the contribution of every drone and consider non-linear perception
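A minimal end-to-end sketch, assuming each drone's $L_P$ at the reference distance $r = 1\,m$ is already known, and that $h$ is the linear rescaling of $[20\,dB, 80\,dB]$ onto $[0, 1]$:

```python
import math

SILENCE_DB, MAX_DB, BACKGROUND_DB = 20.0, 80.0, 40.0  # watch, siren, fridge

def phi(x, mu, nu=4.0):
    """Same generalized logistic on [0, 1] used for the observation angle."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 1.0 / (1.0 + (x * (1.0 - mu) / (mu * (1.0 - x))) ** -nu)

def h(spl_db):
    """Linearly map [20 dB, 80 dB] onto [0, 1], clamping outside the range."""
    return min(1.0, max(0.0, (spl_db - SILENCE_DB) / (MAX_DB - SILENCE_DB)))

def noise_metric(lp_at_1m, distances):
    """rho(z): normalized noise perceived by animal z from a set of drones;
    lp_at_1m[i] is drone i's SPL at r = 1 m, distances[i] its distance (m)."""
    # Attenuate each drone's SPL over distance: L_P + 20 * log10(r / d).
    levels = [lp + 20.0 * math.log10(1.0 / d) for lp, d in zip(lp_at_1m, distances)]
    # Sum the contributions in the energy domain, then go back to decibels.
    total_db = 10.0 * math.log10(sum(10.0 ** (l / 10.0) for l in levels))
    return phi(h(total_db), h(BACKGROUND_DB))

# Example: two drones at ~70 dB SPL (1 m), hovering 30 m and 50 m away.
print(noise_metric([70.0, 70.0], [30.0, 50.0]))  # ~0.63
```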
[Figure: plain LinPro]
Herd-sensitive tracking
Running state-of-the-art OMOkC algorithms¹ on our setup highlighted some issues:
OMOkC algorithms are designed to cover individual targets, not groups
Current SOTA algorithms are meant to quickly react to changes in target interestingness, but in our scenario all animals are equally interesting
Usual setups have enough drones to provide $k$ views for each target, but with herds, targets largely outnumber drones
$\Rightarrow$ We alter the general structure of OMOkC algorithms to track herd centroids instead of individual targets.
Identification and localization: each drone identifies and localizes the animals in its FoV as best it can
we accept localization and identification errors
Information exchange and consensus: local information is exchanged among drones to reach consensus on the herd composition,
then each drone, locally, performs a recursive hierarchical agglomerative clustering² to find the herd centroids (see the sketch after this list)
we accept limited communication ranges and network segmentation
we accept that different drones may have different information and compute different centroids
Prioritization: we feed the locally-computed herd centroids to the original OMOkC algorithms
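The exact recursive clustering is the one referenced above²; as a minimal stand-in, single-linkage agglomerative clustering with a distance cutoff (the herd_radius parameter is illustrative) already yields per-herd centroids:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def herd_centroids(animal_positions, herd_radius):
    """Estimate herd centroids from the (possibly partial and noisy)
    animal positions a drone agreed upon with its neighbors."""
    points = np.asarray(animal_positions)
    if len(points) < 2:
        return points
    # Cut the single-linkage dendrogram where clusters are farther apart
    # than herd_radius: each resulting cluster is treated as one herd.
    labels = fcluster(linkage(points, method="single"),
                      t=herd_radius, criterion="distance")
    return np.array([points[labels == l].mean(axis=0) for l in np.unique(labels)])

# Example: two herds, around (0, 0) and (100, 100) -> two centroids.
rng = np.random.default_rng(0)
herds = np.vstack([rng.normal(0, 5, (10, 2)), rng.normal(100, 5, (8, 2))])
print(herd_centroids(herds, herd_radius=30.0))
```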
[Plots: global metric; $\nu$ = drones per herd, $\zeta$ = herd count]
Force-Field LinPro+Clustering (ff_linpro_c) is the best across the board
Plain Force-Field LinPro, which outperforms all other algorithms in “classic” OMOkC scenarios, is the worst in our context
The higher the drone:herd ratio, and the more herds, the larger the gap between ff_linpro_c and the other algorithms, showing better adaptation
Coverage results
1-, 2-, and 3-coverage, all algorithms configured to achieve 3-coverage ($k=3$)
Force-Field LinPro+Clustering (ff_linpro_c) is the best, except for 1-coverage with too few drones
Smooth-Available (sm_av) achieves good 1-coverage, but performance degrades with higher coverages
It is likely that ff_linpro_c configured with $k=1$ would perform better