Faculty of Humanities Researchers - Dr. Lau Chaak Ming (劉擇明)

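- Assistant Professor, Department of Linguistics and Modern Language Studies
- Director, Centre for Research on Linguistics and Language Studies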
- Linguistics
I work at the intersection of linguistics and digital humanities, with much of my effort devoted to computational linguistics and the construction of language resources, focusing on Cantonese and regional endangered languages. I believe this work provides a solid foundation for linguistic and sociolinguistic investigation and is valuable in other areas, including language revitalization and reclamation efforts and social sciences research. As a linguist, I have a strong interest in the syntax, semantics, and pragmatics of discourse particles, a common feature of East Asian languages.
I am enthusiastic about supervising and collaborating on projects that explore any aspect of corpus linguistics, tool development, or the linguistic investigation of Cantonese and nearby Sinitic varieties.
- Words.hk, a lexical resource of Cantonese lexical items and sentences
- TypeDuck, a Cantonese Jyutping phonemic keyboard for Ethnic Minority Students and Heritage Learners
- Hakka and Waitau Text-to-Speech System
- Lau, C. M. (forthcoming). Ideologically driven divergence in Cantonese vernacular writing practices. In Dupre, J.-F. (ed.) Politics of Language in Hong Kong.
- Lau, C. M., Lau, M., To, A. W. H. (2024). The Extraction and Fine-grained Classification of Written Cantonese Materials through Linguistic Feature Detection. In Proceedings of the 2nd Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia (EURALI) @ LREC-COLING 2024, pages 24–29, Torino, Italia. ELRA and ICCL.
- Lam, C., Lau, C. M., Lee, J. (2024). Multi-Tiered Cantonese Word Segmentation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 11993–12002, Torino, Italia. ELRA and ICCL.
- Lau, C. M., Chan, G. W.-Y., Tse, R. K.-W., & Chan, L. S.-Y. (2022). Words.hk: A comprehensive Cantonese dictionary dataset with definitions, translations and transliterated examples. In Proceedings of the 1st Workshop on Dataset Creation for Lower-Resourced Languages (DCLRL) @LREC2022 (pp. 53-62). France: European Language Resources Association.
- Lau, C.-M. (2019). Building Cantonese dictionaries using crowdsourcing strategies: The words.hk project. In A. W.-B. Tso (Ed.), Digital humanities and new ways of teaching (pp. 89-107). Singapore: Springer.