Research Experience for Undergraduates Site
REU Site: Animal Language Processing and Understanding
CSE Department, The University of Texas at Arlington

Activity Timetable

The project consists of two phases: Phase 1 (March 11 – May 19, part-time) and Phase 2 (May 23 – July 31, full-time). Phase 1 features data collection, knowledge acquisition, and some online sync-ups; you are expected to spend up to 5 hours a week on it. Phase 2 is a full-time, on-site commitment at UTA for 10 weeks during the summer.

Phase 1:

Self-study: All students will be provided with a list of book chapters, papers, and tutorial videos on animal communication, along with an introduction to natural language processing, to prepare them theoretically for the upcoming summer program. Computer science students who will be developing the web interface will also take online courses to acquire the skills necessary for this research and become familiar with the AniVoice code base.

Data collection: Students will be organized into several research groups, each working on an animal species of its interest. They will be instructed on how to record animal communication videos and will be provided with equipment if necessary.

Bi-weekly sync-ups: Every two weeks, each group will meet with its mentors and report its progress. These meetings aim at checking the quality of the data and correcting the methodology, including adjusting the recording approaches or optimizing the scraping procedures. In addition to these meetings, participating students will communicate with their mentors on an ad hoc basis to promptly resolve questions or issues.

Phase 2:

Orientation and Lab Tours (Week 1-2): Graduate mentors will help participants settle into the UTA house and assign them workspaces in the PI’s and Co-PI’s labs. Participants will meet mentors’ graduate students and become familiar with the campus and surroundings. Workshops on research conduct and safety will be held. Participants will be given a demonstration and a walk-through of the existing AniVoice pipeline and learn how to use it.

Data Filtering, Cleansing, and Annotation (Week 3-4): Participants will focus on consolidating all data collected in Phase 1 and applying preprocessing code to extract the audio/visual segments that contain actual animal communication scenes. Participants may also need to annotate both audio and video data for word and subword segmentation, video scene recognition, object recognition, etc. In the first year of the project, the group tasked with web development will develop a web interface that allows internet users to upload their own recordings and annotate their video clips on the web. In subsequent years, participants will maintain this web interface and integrate the data collected through the web platform with their own data.

Vocal Transcription, Semantics Discovery (Week 5-6): After fine-tuning the word/subword segmentation models and acoustic feature extraction models, participants will transcribe the animal vocalizations into sequences of distinct symbols. Participants will then fine-tune the video understanding and alignment models to align the words in the transcripts with the video scenes and assign them semantics.
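
To make the idea of transcribing vocalizations into "sequences of distinct symbols" concrete, here is a minimal sketch. It is not the AniVoice method: it assumes each segmented vocalization has already been reduced to a small feature vector, and it assigns each segment the label of the nearest reference centroid. The function names, the symbol labels, and the toy 2-D features are all hypothetical.

```python
import math

def nearest_symbol(feature, centroids):
    """Return the label of the centroid closest to `feature` (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))

def transcribe(segments, centroids):
    """Map each segment's feature vector to a symbol, producing a transcript."""
    return [nearest_symbol(feature, centroids) for feature in segments]

# Hypothetical symbol inventory: one reference point per symbol (made-up values).
centroids = {"A": (0.0, 0.0), "B": (1.0, 1.0), "C": (0.0, 2.0)}

# Hypothetical features for four consecutive vocal segments (made-up values).
segments = [(0.1, -0.1), (0.9, 1.2), (0.2, 1.9), (0.8, 0.9)]

transcript = transcribe(segments, centroids)
print(transcript)  # ['A', 'B', 'C', 'B']
```

In practice the features would come from the fine-tuned acoustic models and the symbol inventory would be learned from data rather than fixed by hand.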

Evaluation and Analysis (Week 7-8): Participants will test the fine-tuned AniVoice pipeline to evaluate its ability to accurately segment, recognize, and understand animal communications within the data. These tests will help ensure the reliability and robustness of the respective components. Statistical analysis will be performed to infer the possible meanings of the animal words, which will be validated against the animal science literature. These activities will also provide feedback on the usability and effectiveness of AniVoice for annotating and contributing data, which will guide refinements and enhancements to AniVoice.
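
As one possible illustration of the statistical analysis mentioned above, the sketch below computes pointwise mutual information (PMI) between transcribed symbols and the scene labels they co-occur with. This is an assumed choice of statistic, not necessarily the one the project uses, and the symbols, scene labels, and counts are invented for the example.

```python
import math
from collections import Counter

def pmi_table(pairs):
    """Compute PMI (in bits) for each observed (symbol, scene) pair.

    PMI = log2( p(symbol, scene) / (p(symbol) * p(scene)) ).
    Positive values mean the symbol and scene co-occur more often than chance.
    """
    n = len(pairs)
    joint = Counter(pairs)
    symbol_counts = Counter(symbol for symbol, _ in pairs)
    scene_counts = Counter(scene for _, scene in pairs)
    return {
        (symbol, scene): math.log2(
            (count / n) / ((symbol_counts[symbol] / n) * (scene_counts[scene] / n))
        )
        for (symbol, scene), count in joint.items()
    }

# Hypothetical aligned observations: (transcribed symbol, annotated scene label).
pairs = (
    [("A", "feeding")] * 3 + [("A", "alarm")] * 1
    + [("B", "alarm")] * 3 + [("B", "feeding")] * 1
)

table = pmi_table(pairs)
for key in sorted(table):
    print(key, round(table[key], 3))
```

A symbol with high PMI for a scene is only a candidate meaning; as the text notes, such candidates still need to be validated against the animal science literature.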

Documentation and Presentation (Week 9-10): In the final two weeks, participants will compile comprehensive documentation that covers the entire research process, including data collection, preprocessing, model development, testing results, and best practices. This documentation will serve as a resource for future researchers. At the same time, participants will prepare and deliver final research presentations to showcase their accomplishments and the insights gained through the program. Participants will also produce a detailed research report summarizing their work, findings, and contributions to the AniVoice project. If a report is deemed publication-worthy, we will guide the participants through the paper writing and submission process.