000 02899nam a22003737a 4500
001 103595
003 KnowledgeUnlatched
005 20210303104814.0
006 m o d
007 cr u||||||||||
008 210129p20182019gw o u00| u eng d
037 _5BiblioBoard
245 0 4 _aThe Unicode cookbook for linguists
_bManaging writing systems using orthography profiles /
_cSteven Moran, Michael Cysouw.
020 _a9783961100903
024 8 _a10.5281/zenodo.1296780
029 1 _ahttps://library.biblioboard.com/ext/api/media/9cac473a-6c46-4158-bd09-f2fff0eb0fbb/assets/thumbnail.jpg
040 _aScCtBLL
_cScCtBLL
100 1 _aMoran, Steven
_eauthor.
506 0 _aAccess copy available to the general public.
_fUnrestricted
_2star
700 1 _aCysouw, Michael
_eauthor.
264 1 _bLanguage Science Press,
300 _a1 online resource (149 p.)
520 _aThis text is a practical guide for linguists and programmers who work with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together at the intersection between the Unicode Standard and the International Phonetic Alphabet. Although these standards are often met with frustration by users, they nevertheless provide language researchers and programmers with a consistent computational architecture needed to process, publish and analyze lexical data from the world's languages. Thus we bring to light common but not always transparent pitfalls that researchers face when working with Unicode and IPA. Having identified and overcome the pitfalls involved in making writing systems and character encodings syntactically and semantically interoperable (to the extent that they can be), we created a suite of open-source Python and R tools to work with languages using orthography profiles that describe author- or document-specific orthographic conventions. In this cookbook we describe a formal specification of orthography profiles and provide recipes using open-source tools to show how users can segment text, analyze it, identify errors, and transform it into different written forms for comparative linguistics research.
588 0 _aDescription based on print version record.
590 _aLanguage Science Press 2018-2020
650 7 _aLanguage Arts & Disciplines / Linguistics
_2bisacsh
650 0 _aLanguage arts
655 0 _aElectronic books.
758 _iIs found in:
_aKnowledge Unlatched
_1https://openresearchlibrary.org/module/2774bc74-146a-484f-a7ba-ab1d6a09bbfb
856 4 0 _uhttps://openresearchlibrary.org/content/9cac473a-6c46-4158-bd09-f2fff0eb0fbb
_zView this content on Open Research Library.
_70
999 _c20000
_d20000