GB/T 36472-2018

Specification on Tibetan phrase classification and tagging for information processing (English Version)

Standard No.: GB/T 36472-2018
Language: Chinese, available in English version
Release Date: 2018
Published By: General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China
Latest: GB/T 36472-2018
Scope
The categories in this standard specifically refer to the categories of Tibetan phrases used in information processing, such as noun phrases (NP), verb phrases (VP), adjective phrases (AP), etc.
Introduction

Introduction and background of standard formulation

GB/T 36472-2018, "Specification on Tibetan phrase classification and tagging for information processing", was formulated to meet the needs of developing Tibetan information technology and to unify the classification and labeling system for Tibetan phrases. The standard builds on traditional Tibetan grammar and combines it with modern natural language processing techniques.

Comparative Analysis of Standard Frameworks

Phrase Category | Major Category | Minor Categories | Markup Code
Noun Phrase (NP) | One of 8 major categories | 6 minor categories, including noun complement structure, suffix structure, etc. | NP
Verb Phrase (VP) | One of 8 major categories | 8 minor categories, including object-verb structure, complement structure, etc. | VP
Adjective Phrase (AP) | One of 8 major categories | 2 minor categories, including parallel structure and modifier-predicate structure | AP
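The three rows listed above could be held in a small lookup structure when building tooling around the tag set. The following is a minimal sketch in Python. The class name, field names, and the describe helper are hypothetical and are not defined by the standard. Only the three categories listed here are included, and the remaining five major categories are left out because they are not given on this page.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PhraseCategory:
    """One major phrase category as listed in the table (hypothetical structure)."""
    code: str                         # markup code, e.g. "NP"
    name: str                         # English name of the category
    minor_count: int                  # number of minor categories stated in the table
    example_minors: Tuple[str, ...]   # only the minor categories named in the table


# The table states there are 8 major categories; only 3 are listed here.
TOTAL_MAJOR_CATEGORIES = 8

CATEGORIES = {
    "NP": PhraseCategory("NP", "Noun Phrase", 6,
                         ("noun complement structure", "suffix structure")),
    "VP": PhraseCategory("VP", "Verb Phrase", 8,
                         ("object-verb structure", "complement structure")),
    "AP": PhraseCategory("AP", "Adjective Phrase", 2,
                         ("parallel structure", "modifier-predicate structure")),
}


def describe(code: str) -> str:
    """Return a one-line summary for a listed markup code."""
    cat = CATEGORIES[code]
    return f"{cat.name} ({cat.code}): {cat.minor_count} minor categories"


if __name__ == "__main__":
    for code in CATEGORIES:
        print(describe(code))
```

Keying the dictionary by markup code keeps lookups direct when an annotation carries only the code.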

Professional term analysis and actual cases

Practical application of noun phrases (NP)

For example, the Tibetan phrase "nomal bka'" is a noun phrase (NP) with a complement structure, used to describe a specific person or thing.


Technology evolution analysis

This standard combines the traditional Tibetan grammar system with the needs of modern computer processing, and systematically classifies and labels Tibetan phrases for the first time. Compared with earlier, scattered research results, it moves the work from theoretical description to practical application.

Implementation Suggestions

  • Phrase classification and annotation should be carried out strictly in accordance with this standard when developing Tibetan information processing systems
  • Relevant research institutions are recommended to establish a unified Tibetan phrase annotation database
  • In the future, a more fine-grained semantic analysis framework can be built on this basis
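As an illustration of the first suggestion, an annotation tool might reject labels that fall outside the tag set before they enter a shared database. The sketch below assumes a simple span-based record of start index, end index, and markup code. That record format, the PhraseSpan type, and the validate_spans function are all hypothetical and do not come from the standard. Only the three codes named on this page (NP, VP, AP) are treated as known.

```python
from typing import List, NamedTuple

# Markup codes named on this page; the standard defines 8 major categories,
# so a real tool would need the full set from the standard itself.
KNOWN_CODES = {"NP", "VP", "AP"}


class PhraseSpan(NamedTuple):
    """Hypothetical annotation record: tokens[start:end] labeled with a code."""
    start: int
    end: int
    code: str


def validate_spans(tokens: List[str], spans: List[PhraseSpan]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no problems."""
    problems = []
    for span in spans:
        if span.code not in KNOWN_CODES:
            problems.append(f"unknown code {span.code!r}")
        if not (0 <= span.start < span.end <= len(tokens)):
            problems.append(f"span {span.start}-{span.end} out of range")
    return problems


if __name__ == "__main__":
    tokens = ["w1", "w2", "w3"]  # placeholder tokens, not real Tibetan text
    print(validate_spans(tokens, [PhraseSpan(0, 2, "NP")]))
    print(validate_spans(tokens, [PhraseSpan(0, 5, "XP")]))
```

Returning a list of problems rather than raising on the first one lets an annotator see every issue in a record at once.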

Conclusion

The formulation and implementation of GB/T 36472-2018 is expected to advance Tibetan-language information processing and provides a foundation for applications such as computer-assisted translation and information retrieval.

GB/T 36472-2018 history

  • 2018 GB/T 36472-2018 Specification on Tibetan phrase classification and tagging for information processing


