articles:chdict
Differences
This shows you the differences between two versions of the page.
articles:chdict [2007/01/08 18:52] – gabor | articles:chdict [2008/06/10 16:53] – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== CHDICT project ====== | ||
+ | ===== What is it? ===== | ||
+ | |||
+ | As of December 2006, the **CHDICT project** is a work in progress. Its objectives are: | ||
+ | |||
+ | - Generate a raw Chinese => Hungarian dictionary by combining the existing resources of [[: | ||
+ | - Create a website where the Hungarian dictionary, and possibly also the [[: | ||
+ | - Provide an online editing interface where users can change existing entries and contribute new ones. | ||
+ | - Build a community of contributors and editors to improve and extend the dictionary. | ||
+ | - Systematically extend the dictionary. | ||
+ | |||
+ | |||
+ | ===== CHDICT features ===== | ||
+ | |||
+ | **XML.** With no legacy data to maintain, CHDICT will be stored in XML from the very beginning. Besides preventing codepage issues, this is also expected to: | ||
+ | |||
+ | * Enforce editorial rigor and improve the quality of the data. | ||
+ | * Facilitate machine processing of the data for future purposes. | ||
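To illustrate the codepage point, here is a minimal sketch of one entry serialized with an explicit encoding declaration and read back by a standard parser. The element names (`entry`, `headword`, `pinyin`) are assumptions made for this example only; they are not taken from the CHDICT schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; the real CHDICT schema may differ.
entry = ET.Element("entry")
ET.SubElement(entry, "headword").text = "你好"
ET.SubElement(entry, "pinyin").text = "ni3 hao3"

# Serialize to bytes with an XML declaration that names the encoding,
# so a consumer never has to guess the codepage.
data = ET.tostring(entry, encoding="utf-8", xml_declaration=True)

# Any standard XML parser can now read the bytes back unchanged.
parsed = ET.fromstring(data)
headword = parsed.findtext("headword")
pinyin = parsed.findtext("pinyin")
```

Because the encoding travels with the data, the same bytes can be handed to other tools without a separate agreement on character sets.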
+ | |||
+ | **Annotation and structure.** A few key features of CHDICT' | ||
+ | |||
+ |   * Senses are formally separated from each other, and within each sense the glosses are separated from free-text annotation. | ||
+ | * Chinese part of speech must be indicated on a per-sense basis for all manually revised entries. | ||
+ | * Senses can optionally contain additional information including measure words; field, region, style; synonyms and antonyms; and example sentences. | ||
+ | * Many items that are headwords in [[: | ||
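The structure described in the list above could be modeled roughly as follows. This is only a sketch: the class names, field names, part-of-speech label and the `manually_revised` flag are hypothetical and do not come from the CHDICT draft schema. The check at the end mirrors the rule that manually revised entries must carry a part of speech on every sense.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    # Glosses and free-text annotation are kept apart within a sense.
    glosses: List[str]
    annotation: Optional[str] = None
    pos: Optional[str] = None  # Chinese part of speech, per sense
    # Optional extras mentioned in the feature list.
    measure_words: List[str] = field(default_factory=list)
    fields: List[str] = field(default_factory=list)
    regions: List[str] = field(default_factory=list)
    styles: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)
    antonyms: List[str] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)


@dataclass
class Entry:
    headword: str
    senses: List[Sense]
    manually_revised: bool = False  # hypothetical flag


def senses_missing_pos(entry: Entry) -> List[int]:
    """Return indexes of senses without a part of speech.

    Only manually revised entries are checked; raw generated
    entries are allowed to lack the information.
    """
    if not entry.manually_revised:
        return []
    return [i for i, sense in enumerate(entry.senses) if not sense.pos]


book = Entry(
    headword="书",
    senses=[
        Sense(glosses=["könyv"], pos="noun", measure_words=["本"]),
        Sense(glosses=["levél"], annotation="written message"),
    ],
    manually_revised=True,
)
problems = senses_missing_pos(book)
```

Here `problems` lists the second sense (index 1), which has no part of speech yet, so an editing form could flag it before the entry is accepted.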
+ | |||
+ | **Editing.** The only way to edit entries will be through a web-based form that mirrors CHDICT' | ||
+ | |||
+ | **Version control.** Two user roles, contributor and editor, will be distinguished, | ||
+ | |||
+ | ===== Status ===== | ||
+ | |||
+ | **December 29, 2006** -- 5800 entries have been generated, excluding proper nouns. Work on the website, version control and dictionary engine is in progress. | ||
+ | |||
+ | I expect the website to go live in the first half of 2007. | ||
+ | |||
+ | |||
+ | ===== Discussion ===== | ||
+ | |||
+ | You can download the current working draft of CHDICT' | ||
+ | |||
+ | Many of my decisions have been based on [[HanDeDict]] (e.g., fields of application). I believe a common convention for parts of speech, fields, styles and regions could benefit all of our projects. | ||
+ | |||
+ | I would also like to suggest creating a shared resource of Chinese example sentences and their translations in English, German, French and Hungarian. | ||
+ | |||
+ | ===== Maintainer ===== | ||
+ | |||
+ | The CHDICT project is maintained by [[ugray@mail.datanet.hu|Gábor Ugray]]. | ||
+ | |||
+ | ---- | ||
+ | |||
+ | See also: | ||
+ | |||
+ | * [[: | ||
+ | * [[HanDeDict]] | ||
+ | * [[CFDICT]] | ||
+ | |||
+ | |||