REU Site: Animal Language Processing and Understanding
What animals are talking about is a fascinating research topic. Previous work in bioacoustics has studied only a handful of species and has suffered from limited scale and high cost. Furthermore, no previous research has attempted a systematic approach to associating animal vocal sounds with written symbols and meanings, as we do with human languages.
Our REU Site projects are based on the leading hypothesis that animal languages are similar to human languages to some degree. In particular, we assume that the languages of animals evolve through interaction with other creatures, and that animals can express their distinct feelings about changes in their surroundings by differentiating their sounds. From this, two hypotheses are put forward:
Structural Similarities to Human Language: We hypothesize that animal communication may possess structures akin to human language, including phonemes, lexicon, and syntax, indicating the presence of a language-like system.
Contextual Semantics: We propose that animals employ consistent communication patterns in specific contexts, reflecting semantics. This suggests that animal vocalizations are associated with distinct activities or intentions.
In our preliminary study, we have built a pipeline (see the figure above) that acquires dog videos from YouTube, cleans and segments the audio tracks of the videos, transcribes the vocalizations into a sequence of predefined phonetic symbols, and then discovers phonemes and words, the meaningful linguistic units, in the dog vocalizations. Based on this pipeline, the undergraduate participants of the REU Site will pursue two research thrusts: i) to collect high-quality, partially annotated multimedia animal communication data for other species; and ii) to experiment on this data using our animal language processing pipeline to make new discoveries about the languages of those species.
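As a rough illustration of two of the stages described above (segmenting audio and discovering recurring units in a symbol sequence), here is a minimal sketch. It is not the project's actual pipeline: the function names `segment_by_energy` and `frequent_ngrams`, the energy threshold, and the use of simple n-gram counting as a stand-in for word discovery are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def segment_by_energy(
    samples: Sequence[float], frame_len: int, threshold: float
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges whose mean frame energy exceeds threshold.

    Hypothetical stand-in for the audio segmentation step; a real system
    would likely use a more robust voice-activity detector.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold and start is None:
            start = i * frame_len  # a loud region begins
        elif energy <= threshold and start is not None:
            segments.append((start, i * frame_len))  # the loud region ends
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments


def frequent_ngrams(
    sequences: Sequence[Sequence[str]], n: int, min_count: int
) -> Dict[Tuple[str, ...], int]:
    """Count symbol n-grams across transcripts and keep those seen at least min_count times.

    Hypothetical stand-in for the word-discovery step: recurring symbol
    patterns are treated as candidate units for further analysis.
    """
    counts: Counter = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    # Toy signal: silence, a burst of sound, silence.
    signal = [0.0] * 4 + [1.0] * 4 + [0.0] * 4
    print(segment_by_energy(signal, frame_len=2, threshold=0.5))

    # Toy transcripts using made-up phonetic symbols.
    transcripts = [["a", "b", "c"], ["a", "b", "d"]]
    print(frequent_ngrams(transcripts, n=2, min_count=2))
```

Candidate units found this way would still need to be checked against the context in which they occur, in line with the contextual semantics hypothesis, before being treated as meaningful.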
To achieve these goals, we will recruit student animal lovers with computing, biology, psychology, or linguistics backgrounds. Each year, participants will spend 8 weeks away from UTA on a part-time basis (Phase 1), learning the basics (including basic programming, natural language processing, linguistics, and animal communication) and producing a concrete research proposal to discover one or more linguistic properties in a selected animal species. They will then spend 8 weeks of the summer full-time at UTA (Phase 2), conducting the planned research and submitting a research paper to a top-tier conference or journal.
Qualifications
Required:
Preferred:
Application Materials
Important REU Dates
| Capacity | Up to 10 students |
|---|---|
| Tentative Program Dates | Phase 1 (remote): March 23 - May 12 (~8 weeks, up to 5 hrs a week); Phase 2 (on-site): May 25 - July 17 (8 weeks full-time) |
| Applications Open | Jan 1, 2026 |
| Application Deadline | Feb 20, 2026 |
| Acceptance Notification | Mid-March 2026 |
Benefits
Animal Video Samples:
These sample animal videos showcase the intricate vocalizations and behaviors of the study subjects.
