View change log entry

Change log entry 79049
Processed by: richwarm (2023-08-16 00:55:38 GMT)
Comment: << review queue entry 74165 - submitted by 'cws' >>
The first four compound words each have exactly one meaning, and each of their component words has exactly one meaning too. I therefore think it will help to prune the dictionary of words that are merely AB = A + B. For example:

电脑病毒

电脑 has exactly one meaning, "computer". Likewise, 病毒 has only one meaning, "virus".

However, for a compound word like 电脑语言, although both of its component words (电脑 and 语言) have exactly one meaning ("computer" and "language" respectively), the compound itself has two possible interpretations, namely "programming language" and "computer language". I therefore excluded it from the list of words to be pruned:

電腦語言 电脑语言 [dian4 nao3 yu3 yan2] /programming language/computer language/

Or do "programming language" and "computer language" mean the same thing? If so, we need to group them with a semicolon:

電腦語言 电脑语言 [dian4 nao3 yu3 yan2] /programming language; computer language/
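
To make the difference between the two versions of this entry concrete, here is a minimal Python sketch that splits an entry line into its fields. The names `ENTRY_RE` and `parse_entry` are hypothetical, and the reading of "/" as separating senses and ";" as grouping glosses within one sense is inferred from the comment above, not taken from a format specification.

```python
import re

# Assumed line shape: traditional, simplified, [pinyin], /definitions/
ENTRY_RE = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/$")

def parse_entry(line):
    """Split one entry line into its fields (hypothetical helper)."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognisable entry line: {line!r}")
    trad, simp, pinyin, defs = match.groups()
    # Assumption: "/" separates senses, ";" separates glosses within a sense.
    senses = [[gloss.strip() for gloss in sense.split(";")]
              for sense in defs.split("/")]
    return {"trad": trad, "simp": simp, "pinyin": pinyin, "senses": senses}

before = parse_entry(
    "電腦語言 电脑语言 [dian4 nao3 yu3 yan2] /programming language/computer language/")
after = parse_entry(
    "電腦語言 电脑语言 [dian4 nao3 yu3 yan2] /programming language; computer language/")

print(len(before["senses"]), len(after["senses"]))  # prints: 2 1
```

Under that reading, the slash form records two senses, while the semicolon form records a single sense with two glosses.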

By the way, there's a part of my code that segments dian4 nao3 bing4 du2 into dian4 nao3_bing4 du2. I can write a script that scans compound words (e.g., 电脑病毒) and checks whether the compound has only one meaning. If both of its component words (电脑 and 病毒) also have exactly one meaning, the compound will be tagged for pruning. I will then manually check the resulting pruning list and submit those compound words for pruning.
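
The scan described above could look roughly like the following Python sketch. It is not the submitter's actual code: the function names, the in-memory dictionary layout, and the rule of one pinyin syllable per character are all assumptions made for illustration. The sample data only mirrors the claims made in this comment, and the segmentation shown for 电脑语言 is assumed.

```python
def split_compound(word, segmented_pinyin):
    """Split a word at the "_" marks in its segmented pinyin.

    Assumes one pinyin syllable per character.
    """
    parts, start = [], 0
    for chunk in segmented_pinyin.split("_"):
        syllables = len(chunk.split())
        parts.append(word[start:start + syllables])
        start += syllables
    if start != len(word):
        raise ValueError("syllable count does not match word length")
    return parts

def prune_candidates(senses_by_word, segmentations):
    """Return compounds where the compound and every part have one sense."""
    candidates = []
    for word, segmented in segmentations.items():
        parts = split_compound(word, segmented)
        if len(parts) < 2:
            continue  # not a compound
        # A word missing from the dictionary never counts as single-sense.
        if all(len(senses_by_word.get(w, [])) == 1 for w in [word, *parts]):
            candidates.append(word)
    return candidates

# Sample data reflecting the comment's claims, not the full dictionary.
senses_by_word = {
    "电脑": ["computer"],
    "病毒": ["virus"],
    "语言": ["language"],
    "电脑病毒": ["computer virus"],
    "电脑语言": ["programming language", "computer language"],
}
segmentations = {
    "电脑病毒": "dian4 nao3_bing4 du2",
    "电脑语言": "dian4 nao3_yu3 yan2",  # assumed segmentation
}

print(prune_candidates(senses_by_word, segmentations))  # prints: ['电脑病毒']
```

The output is only a list of candidates; the manual check mentioned above would still decide what actually gets submitted.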
----------------------------------------------

Editor: You say it will help to prune the dictionary that way, but you don't say *how* it will help.

电脑病毒 is a term that many dictionaries include, but you propose to delete it. This suggests your criteria for deletion may be too weak – i.e., it may result in needless deletions.

Suppose we delete 电脑病毒, and then, a few years later, we refine the definition of 病毒 as follows:
/(medicine) virus/computer virus/(fig.) harmful idea/
By the standard you propose, 电脑病毒 needs to be deleted now, but later on the very same standard would dictate that it be reinstated!
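
The instability can be shown with a small Python sketch. `is_prunable` is a hypothetical stand-in for the proposed rule (the compound and each component have exactly one sense), and the two dictionaries are toy snapshots built from the definitions quoted above.

```python
def is_prunable(compound, parts, senses_by_word):
    """Proposed rule: the compound and every component have exactly one sense."""
    return all(len(senses_by_word[w]) == 1 for w in [compound, *parts])

# Snapshot as the submitter describes it today.
today = {
    "电脑": ["computer"],
    "病毒": ["virus"],
    "电脑病毒": ["computer virus"],
}

# Same snapshot after the hypothetical refinement of 病毒.
later = dict(today)
later["病毒"] = ["(medicine) virus", "computer virus", "(fig.) harmful idea"]

print(is_prunable("电脑病毒", ["电脑", "病毒"], today))  # prints: True
print(is_prunable("电脑病毒", ["电脑", "病毒"], later))  # prints: False
```

Nothing about 电脑病毒 itself changes between the two snapshots; only the entry for 病毒 does, yet the verdict flips.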
Diff:
# - 電腦軟件 电脑软件 [dian4 nao3 ruan3 jian4] /computer software/
# - 電腦病毒 电脑病毒 [dian4 nao3 bing4 du2] /computer virus/
# - 電腦系統 电脑系统 [dian4 nao3 xi4 tong3] /computer system/
# - 電腦網路 电脑网路 [dian4 nao3 wang3 lu4] /computer network/
- 電腦語言 电脑语言 [dian4 nao3 yu3 yan2] /programming language/computer language/
+ 電腦語言 电脑语言 [dian4 nao3 yu3 yan2] /programming language; computer language/
By MDBG 2024