Language Documentation and Archiving for Northeast Indian Languages


This is an accelerated course for community language documenters, language revivalists, folklore and oral literature specialists, digital archivists, and linguists interested in documenting and describing the languages of the Tibeto-Burman family. The curriculum trains participants to take a digital object from its creation to a structured archival package that can be readily ingested into a state-of-the-art digital archive. The four modules involve hands-on activities that result in a documentary language collection ready for archiving and for use by communities and researchers.

Format: hybrid; sessions in person or via Zoom
10 class sessions of 2 hours each; 10 lab sessions of 2 hours each
Office hours once per week to address questions

Schedule: each entry lists Date, Lead, Topic, Objectives, and Readings and Assignments.

Date: 25 March
Lead: Prof. Ramesh C Gaur
Topic: Inaugural
Objectives:
Introduce the Certificate course
Book release of CoRSAL occasional publications
Introduce Tetso College (Principal Khing)
Introduce the resource persons (Shobhana Chelliah will lead)

MS, IGNCA, Chief Guest

Date: 26 March
Lead: MB, SLC
Topic: Introduction; Creating a language collection; Facilitating the creation of a regional or community repository
Objectives:
To use a language documentation kit for audio and video recording
To understand what to collect and why
To understand project file management vs. archival file management
To explore repositories for archiving
To understand legal and ethical considerations for collection of and access to indigenous materials
To understand the difference between archiving repositories and websites
Readings and Assignments: Collaborative Language Archiving Curriculum, Modules 1 and 2

Date: 27 March
Lead: MH
Topic: Installing software
Objectives:
To become familiar with technologies for processing and providing intellectual access to data
To review the installation process for Keyman, SayMore, and FLEx
Readings and Assignments: Why Language Documentation Matters, Chapters 1-2; Collaborative Language Archiving Curriculum, Module 2

Date: 29 March-02 April
Lead: MB
Topic: Data management and metadata creation
Objectives:
To understand the basic types of metadata
To learn how to use a metadata editor
To understand additional metadata needs for archive users
To use software (SayMore) to manage your files
Readings and Assignments: Why Language Documentation Matters, Chapters 3-4; Collaborative Language Archiving Curriculum, Module 3

Date: 05-09 April
Lead: SLC
Topic: Making materials accessible for use: transcription
Objectives: Transcribing connected text. Discuss the uses of connected text for language revitalization, pedagogy, and description. Set up and transcribe using SayMore. Discuss IPA versus practical orthography; introduce Keyman.
Readings and Assignments: From Source to Analysis, Chapter 1; Collaborative Language Archiving Curriculum, Module 3

Date: 11-15 April
Lead: MH, BH, SLC
Topic: Making materials accessible for use: word lists
Objectives: Install FLEx and review its basic features. Practice adding words through “Collect Words”. Discuss semantic domains and rapid word collection. Add words through Lexicon Edit. Discuss fields and dictionary formats. Import words from existing databases. Add video, audio, and images. File naming and data management for these resources.
Readings and Assignments: From Source to Analysis, Chapter 2

Date: 19-23 April
Lead: SLC, BH
Topic: Making materials accessible for use: translation
Objectives: Translating words, translating clauses. Annotating connected text. Use FLEx Texts baseline. Discuss what constitutes a clause and determines a line. Discuss punctuation versus diacritics. Discuss standardization of orthography for analysis (word breaks and representing allomorphy on the baseline).
Readings and Assignments: Collaborative Language Archiving Curriculum, Module 4; From Source to Analysis, Chapter 3

Date: 26-30 April
Lead: BH, SLC
Topic: Making materials accessible: morphological annotation
Objectives: Prepping for annotation. What to read, how to create lists of abbreviations, what to understand about the grammar of related languages. What are the major categories? What kinds of morphology will you encounter? What zones should you expect? Create your own lists and possible abbreviations. Pass one of glossing: using the Gloss tab, enter lexeme glosses and free translations; discuss WordNet hierarchies; discuss translation quagmires. Discuss functional: semantic glossing. Discuss what is ready for archiving, the metadata needed, and file naming.
Readings and Assignments: From Source to Analysis, Chapters 4-6

Date: 03-07 May
Lead: MD, SLC
Topic: Disseminate your corpus: Use the corpus
Objectives: Writing a guide to your IGT. Applying the Leipzig Glossing Rules: principles and implementation. Discuss hierarchical glossing rules: principles and implementation. Use your corpus: discuss uses of the corpus for dictionary creation and grammatical description. Use the Concordance feature. Using examples for grammatical discovery. Moving examples to a text document. Discuss how to cite your corpus.
Readings and Assignments: From Source to Analysis, Chapters 7-8

Date: 10-14 May
Lead: MD, SLC, MH
Topic: Disseminate: IGT collections at CoRSAL
Objectives: Format, front matter, agreements, the final look
Readings and Assignments: Languages of the Barak Valley

Date: 17-21 May
Lead: team
Topic: Closing ceremony
Objectives: Where to from here…

Texts: Chelliah, Shobhana. (2021). Why Language Documentation Matters. SpringerBriefs in Linguistics. Dordrecht: Springer Academic Press.

Chelliah, Shobhana and Samson Lotven. (2021). From Source to Analysis: A fieldworker’s guide to annotation. UNT Open Books.

Computational Resource for South Asian Languages. (2020). Collaborative Language Archiving Curriculum. Retrieved [03 02 2023] from https://corsal.unt.edu/curriculum.

Haokip, Pauthang; Chelliah, Shobhana Lakshmi (series editor); Burke, Mary & Heaton, Marty (volume editors).  Annotated Texts of the Languages of the Barak Valley: Thadou, Saihriem, Hrangkhol, Ranglong, book, 2021; Denton, Texas. (https://digital.library.unt.edu/ark:/67531/metadc1808476/: accessed February 11, 2022), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu.