Minimal Checklist for the Preservation of Digital Language Documentation Materials
This checklist, which is published at http://hdl.handle.net/10125/55829 and copied here for easy access, was developed by DELAMAN as a guide to the minimal level of digital data preservation generally accepted under the professional standards of language documentation. Going beyond this baseline level of preservation is desirable and encouraged, and may even be required by granting agencies, particular archives, and/or professional expectations.
- Materials are deposited with a digital repository with an institutional commitment to long-term preservation and access (e.g., a DELAMAN archive or an institutional repository). Furthermore, deposits are made on a regular and frequent basis, as materials are created.
- Materials are in digital formats that are recommended by the international archival standards (e.g. IASA). This typically means they are non-proprietary, well-documented, and/or open source.
- Audio files are minimally 48 kHz/16-bit, ideally 96 kHz/24-bit BWF. Digitization should use high-quality analog-to-digital converters. Playback machines should allow azimuth adjustment to maximize the signal captured from the tape.
- Text files are .txt, .xml, .rtf, or .pdf.
- Video files are in an uncompressed format (ideally JPEG2000).
- Materials are additionally available in formats that are easy to access and download (e.g. compressed as mp3, mp4).
- Materials have been created on recording equipment that has been selected with an eye toward quality.
- Materials are described using standardized metadata (e.g., OLAC, IMDI, Dublin Core, MODS).
- A description of the deposited collection has been included in the collection.
- A significant portion of the collection is public access, or a clear procedure for requesting access is indicated. If public access is impossible, a statement about why should be included in the collection description.
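The audio minimum in the checklist (48 kHz/16-bit) can be checked automatically before deposit. The following is only a minimal sketch, assuming the files are PCM WAV or BWF files that Python's standard `wave` module can read; the function name `meets_audio_minimum` and the constant names are illustrative and are not part of the DELAMAN checklist. Some 24-bit files use a header variant that older Python versions reject with `wave.Error`, so a failure to open a file is not necessarily a sign of a bad recording.

```python
import wave

# Minimum values taken from the checklist above (48 kHz/16-bit).
MIN_SAMPLE_RATE_HZ = 48_000
MIN_BIT_DEPTH = 16


def meets_audio_minimum(path):
    """Return True if the WAV file at `path` meets the checklist minimum.

    Hypothetical helper: it inspects only the sample rate and bit depth,
    not the quality of the recording or of the digitization.
    """
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        bit_depth = wav.getsampwidth() * 8  # sample width is given in bytes
    return sample_rate >= MIN_SAMPLE_RATE_HZ and bit_depth >= MIN_BIT_DEPTH
```

A depositor might run this over every audio file in a collection and re-export anything that falls short before uploading.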
Report of the DELAMAN Costing Case Study
“The cost of not archiving”, presentation at the INNET workshop 2014 by Nick Thieberger.
Minimal Checklist for the Preservation of Digital Language Documentation Materials, same as above.
Useful links on:
- Help and general information
- Software for linguists
- Software for musicologists
- Audio & video recording
- Texts: fonts, Unicode, XML, etc.
- Intellectual property rights
- Metadata and standards organizations
- Digital archives information
The EMELD School of Best Practice
A wealth of information about technologies, methods, tools, and more, from experts across the spectrum of language documentation.
Ask An Expert
Get advice from a panel of experts for your particular problem.
Resource Network for Linguistic Diversity
This network aims to provide information and share expertise among the community of linguists, especially those working with endangered languages, particularly about tools for doing language work, such as tools for recording, transcription, media-linking, corpus construction and analysis, interlinearising, lexicography, and so on.
Documentation of Endangered Languages (DOBES)
Information about the DOBES program and also information about language documentation in general.
Endangered Languages & Cultures Blog
A blog about linguistic documentation, fieldwork, technology, language maintenance and education, and issues that pertain to endangered languages and cultures around the world.
Interactive bibliography of technology
The Reading Room at the School of Best Practice.
Interactive software database
Also at the School of Best Practice.
Ethnologue: Languages of the World
From Summer Institute of Linguistics International.
An encyclopedic reference work cataloging all of the world’s 6,912 known living languages.
The Language Archive, Max Planck Institute for Psycholinguistics – Nijmegen
Information about archiving, software and services related to language resources.
Linguistics Education Resource Guide
Page with links to online resources for different linguistic disciplines
Including Elan (transcription, annotation, and alignment of texts with audio & video recordings);
Arbil (corpus management tool); and several other useful tools. Versions for Mac, Windows, Linux.
Transana (transcription of audio & video)
Paid software, Windows and Mac versions.
TranscriberAG: a tool for segmenting, labeling and transcribing speech.
Versions for Mac, Windows, Linux.
Praat (phonetic analysis)
Versions for Mac, Windows, Linux.
The Field Linguist’s Toolbox, from the Summer Institute of Linguistics
Toolbox is a data management and analysis tool for field linguists. It is especially useful for maintaining lexical data, and for parsing and interlinearizing text, but it can be used to manage virtually any kind of data.
Windows only, Mac via Windows emulator
Successor to Toolbox
Audacity (audio editing)
Simple and free program for digitizing, editing, and converting audio data.
Versions for Mac, Windows, and Linux.
Shareware music notation assistance software developed by Andy Robinson. For Windows, Mac and Linux. There is an active user group (Yahoo Groups) and Andy has incorporated lots of useful features at user request. Features include control of playback speed, frequency filters, flexible text markup, pitch analysis of a selected waveform with a piano keyboard to check pitch, tempo computation, and, in the latest version, integration with USB foot pedals.
The Vermont Folklife Center, Audio Field Recording Equipment Guide
A very useful and easy to understand guide to different recording technologies and their use in ethnographic recording.
The Transom: Jay Allison, The Basics (of field recording)
Aimed at reporters, but relevant to linguists.
Microsoft MovieMaker for Beginners
Aimed at vacationers, but with a few very good ideas for making a decent video recording.
Apple iMovie (free with new Mac, paid from app store)
Microsoft Movie Maker (free download)
Doulos SIL Unicode IPA
Keyboard layouts: General explanations
Programs for defining your own keyboard
Windows: Microsoft Keyboard Layout Creator (MSKLC)
How to install MSKLC: http://www.languagegeek.com/keyboard_general/msklc_installation.html
Mac OSX: http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=ukelele
A Gentle Introduction to XML
By Sperberg-McQueen, C. M. and Lou Burnard, 2001.
Chapter 2 of TEI P4: Guidelines for Electronic Text Encoding and Interchange, XML-compatible edition. TEI Consortium.
Berkeley Digital Library SunSite
A large collection of links to resources about copyrights and intellectual property rights.
World Intellectual Property Organization
Cultural and Intellectual Property Rights: A pathfinder for Native People, Students, Educators, and the General Public
American Anthropological Association Ethics Blog
Open Language Archives Community (OLAC)
International Standards for Language Engineering
Component MetaData Infrastructure
Dublin Core Metadata Initiative
Metadata Object Description Schema (MODS)
International Standards Organization
National Initiative for a Networked Cultural Heritage (NINCH)
NINCH is a diverse coalition of organizations created to assure leadership from the cultural community in the evolution of the digital environment.
Especially useful is the NINCH Guide to Good Practice (link on this page.)
Open Language Archives Community (OLAC)