
CRLLS: Bridging Technology and Linguistic Heritage

  • 28 March 2025
  • Feature Story
  • Faculty of Humanities

Interviewers: Andy Ng, Kinki Mak and Eric Lam

Author: Eric Lam

 

The Faculty of Humanities (FHM) operates four specialised research centres devoted to advancing scholarship across diverse fields including languages, linguistics, literature, history, cultural studies, music, and visual arts. These centres actively pursue interdisciplinary initiatives in sustainable humanities, digital humanities, and comparative culture of care. Following our coverage of The Centre for Research on Chinese Language and Education and The Research Centre for Chinese Literature & Literary Culture, this March issue of FHM Research Bulletin turns its spotlight to The Centre for Research on Linguistics and Language Studies (CRLLS).

 

CRLLS is committed to building language resources and conducting core linguistic research to support linguistic and cultural sustainability and education for Hong Kong and surrounding regions. The Centre has achieved remarkable success with its flagship project TypeDuck, a Cantonese input keyboard system that has garnered over 20,000 users globally. Its commitment to supporting graduate students and early-career researchers has borne fruit, as exemplified by individual awards, such as the Young Scholar Award received by Dr Lai Yik-po from the Li Fang-Kuei Society for Chinese Linguistics. Through its dedication to leveraging research strengths for positive societal change, the Centre has established itself as a crucial bridge between academic research and community needs.

 

Speaking with Research Bulletin, CRLLS Director Dr Lau Chaak-ming shared insights into the Centre's innovative work, from its research priorities and distinctive capabilities to its vision for cultural preservation and educational advancement.

 

CRLLS Director Dr Lau Chaak-ming

 

Key Foci: Corpus Linguistics and Language Resource Development

 

Two key strategic areas define CRLLS's research priorities: corpus linguistics for language acquisition research and language resource development for social impact.

 

I.  Corpus Linguistics for Language Acquisition Research

The Centre's nine-year trajectory in corpus linguistics represents a comprehensive approach to understanding and enhancing language acquisition. Its material collection, tool development, and pedagogical innovations all use corpus data to improve language acquisition and to understand its underlying mechanisms. "Our data collection process and end products are designed to serve not only Hong Kong but also Chinese/Cantonese learners globally," says Dr Lau.

 

The TypeDuck project stands as a prime example of this mission in action. Supported by the SCOLAR Language Fund, this innovative keyboard system breaks down barriers to Chinese learning for Hong Kong's ethnic minorities. The system enables users to input Chinese characters through Cantonese Jyutping romanisation, while providing translation support in multiple languages including English, Indonesian, Nepali, and Urdu. The Centre’s engagement with the education community has been extensive: it has conducted on-site teaching and maintained meaningful exchanges with teachers across 40 schools. TypeDuck's user base now exceeds 20,000, with approximately half in Hong Kong and the remainder distributed globally.

 

II. Language Resource Development for Social Impact

 

CRLLS champions sustainable language resource development through a robust infrastructure designed for maximum societal benefit. The Centre's commitment to open-source initiatives reflects its innovative approach to resource constraints. "We focus on developing extremely cost-effective operational models," Dr Lau explains. "Our strategic decision to make our code and data open source whenever possible enables us to create widely accessible resources - from dictionaries to language corpora - that serve global needs." This open-access philosophy creates a foundation for future research and development. Scholars studying Hong Kong languages (Cantonese, Hong Kong English, Hakka, etc.) can freely build upon these materials, accelerating research advancement and fostering collaborative innovation.

 

The Centre's influence extends through its growing network of local, regional, and international partnerships in linguistic resource development. The Director emphasises that the strength of these partnerships lies in their commitment to open-source principles: "Rather than focusing on exclusive institutional partnerships, we prioritise broader community impact. This marks a departure from traditional models where resources were confined within institutional boundaries, limiting access and creating proprietary content silos."

 

Remarkable Competencies: Leveraging Infrastructure and Expertise for Meaningful Change

 

CRLLS is dedicated to harnessing its strengths to address pressing social challenges by seamlessly integrating linguistic research with community needs to drive meaningful change.

 

I. Comprehensive Language Resource Infrastructure
 

The Centre has built an extensive collection of language resources, software platforms, and experimental data that drive innovative linguistic solutions. Three outstanding projects showcase this technological leadership:

 

1.  DOLD
 

The Digital Platform for Collecting Online Language Data (DOLD) enhances remote language data collection. "DOLD enables us to gather complex linguistic data without physical presence at research sites," Dr Lau explains. "By simply sharing links with participants, we've significantly streamlined our data collection process." What started as a psycholinguistic research tool has evolved into a vital platform for documenting minority languages with unprecedented efficiency.

 

The innovative DOLD platform

 

2.  TypeDuck

 

The platform's significant impact has led to numerous invitations for knowledge exchange across international venues. The system has gained particular traction among heritage speakers in North America and Europe, who report that its tools and materials have been instrumental in helping them preserve and transmit their language to future generations.

 

3.  LearnDuck

 

The innovative TypeDuck input system has grown into a comprehensive learning ecosystem, LearnDuck. "We've expanded beyond basic input functionality to create a targeted Chinese-as-a-second-language learning platform," Dr Lau notes. "The system now encompasses educational materials, interactive worksheets, and engaging online games." This single initiative has produced two valuable educational tools: an intuitive input system and an integrated learning platform that supports diverse learning needs.

 

II. Expertise in Corpus Methodologies


CRLLS advances language education through a twin-focused approach: tailoring learning experiences to student needs while empowering educators with corpus-based tools for customised teaching solutions.

 

1. Tailoring Learning Experiences
 

CRLLS's mastery of corpus methodologies, particularly that of its core members Dr Angel Ma and Dr Rebecca Chen, distinguishes it as a leader in language education innovation. The Centre champions data-driven learning principles that transform traditional teaching paradigms. "Modern language education offers unprecedented access to learning materials," Dr Lau adds. "We believe optimal learning occurs when students engage with content that resonates with their interests and aligns with their specific needs."

 

2.  Empowering Educators
 

The Centre's approach prioritises educator empowerment through practical solutions. "In Dr Ma’s projects, rather than attempting to create universal teaching materials," he notes, "we equip teachers with the tools to leverage language corpora effectively. By facilitating access to comprehensive language data, we enable frontline educators to curate materials that best serve their students' unique requirements." This innovative methodology allows teachers to address diverse learning needs by developing customised educational content while upholding rigorous academic standards.

 

Positive Social Impact: Supporting Language Development


CRLLS’s tools supporting the use of standard Cantonese Jyutping romanisation can have far-reaching impact. The Hong Chi Association, a non-profit organisation dedicated to quality service in educating, training and empowering people with intellectual disabilities and their families, reports that it developed a strategy to facilitate speech production in children who had previously been non-verbal, not because of poor speech skills but because they were unable to produce vocal sounds, and that it used the Centre’s tools in its teaching. This unanticipated development is one of many contributions CRLLS has made to the local community. "We witnessed a transformative breakthrough when students with moderate intellectual disabilities began engaging with TypeDuck's keyboard-based phonetic input system," Dr Lau explains.

 

The impact of this work became evident during a recent school visit. "I observed firsthand as a previously non-verbal student engaged in direct conversation," he recalls emotionally. "The approach combines TypeDuck's text input capabilities with Poe's AI platform to facilitate interactive dialogue." This unexpected application of tools originally designed for non-Chinese speaking learners has opened new horizons in special education. In early 2025, the Centre added features to TypeDuck that are specifically designed for autistic children’s educational needs. Looking ahead, the Director and his colleague Dr Jesse Yip will continue their corpus-based analysis of language data to support autism education and improve public perception of autism.

 

The flagship input system TypeDuck

 

Trends and Challenges: Navigating Cultural Preservation in the AI Era


As CRLLS looks towards the future, it faces both challenges and opportunities in the rapidly evolving landscape of linguistic research and technology. The Centre identifies two key areas that shape its strategic direction.

 

 

I. Shrinking Cultural Diversity
 

"Today's emphasis on efficiency often overshadows the importance of diversity, pushing us toward a more homogeneous culture," Dr Lau observes. This trend disproportionately affects minority communities, who face mounting challenges in accessing quality education and maintaining their cultural heritage. The phenomenon of language shift poses a particular threat to the rich linguistic mosaic that has long characterised Hong Kong and its surrounding regions. "Our research centre holds significant potential in addressing these challenges, especially through sustained documentation," the Director emphasises.

 

 

II. The Promise and Limitations of AI
 

While artificial intelligence (AI) offers transformative potential for linguistic research and preservation, its current capabilities reveal notable constraints. "AI technology can significantly advance our mission," Dr Lau notes, "particularly in supporting vulnerable populations - the elderly, individuals with speech impairments, and those facing mental health challenges."

 

However, CRLLS has identified a critical disparity in AI development: while tools for English are sophisticated, support for minority and low-resource languages remains inadequate. This gap creates significant challenges in language education and assessment for these communities. In response, the Centre is pioneering innovative solutions that maximise impact with minimal resource investment. "Our development of Text-to-Speech engines for Waitau and Hakka languages demonstrates the potential of our approach," Dr Lau expounds. "While Cantonese is often considered resource-poor for development, our success with Waitau - working with just 1% of Cantonese's resources - proves that effective solutions are possible even with limited means."

 

Advancing SDGs: Fostering Quality Education


CRLLS's work aligns closely with the United Nations' Sustainable Development Goals (SDGs), particularly in promoting quality education. The Centre's commitment to quality education is demonstrated through its innovative approach to language learning. "Our corpus methodologies and data-driven learning tools directly address educational accessibility and efficiency," says Dr Lau. "When we develop resources like TypeDuck or implement corpus-based teaching methods, we're creating pathways for more effective and inclusive education."

 

Local school outreach on TypeDuck

 

Future Directions: Advancing Language Documentation and Preservation


Looking ahead, CRLLS is charting a dual-focused strategy to advance its mission of language documentation and preservation. The Centre's roadmap emphasises comprehensive local documentation initiatives while fostering international collaborations to enhance its reach and impact.

 

I. Language Documentation and Promotion

 

CRLLS is positioning itself as a leading organisation in promoting language and cultural sustainability through digital humanities work in Hong Kong. This work continues Dr Lau’s Lord Wilson Heritage Trust project, which published a story collection and textbook for Waitau and Hakka. He says, “Publication of printed materials represents our initial phase. Our long-term strategy involves continued data collection and promotion efforts. Our work has revealed that we've underestimated the extent of cultural diversity.” Dr Lau and Postdoctoral Fellow Dr Liu Yanting, both CRLLS members and Co-Investigators of Fieldwork for the Project for the Protection of Language Resources of China, will continue to document regional languages in Hong Kong.

 

A story collection for Waitau and Hakka


II. Global Initiatives for Language Preservation
 

The Centre is expanding its reach through strategic collaborations with local and international partners to develop innovative approaches to linguistic preservation. A significant milestone has been the initiative with UCLA and other leading institutions focused on Asian language conservation. "Since our initial meeting last year, we've fostered ongoing dialogue with researchers studying languages across Asia - from Hong Kong and Taiwan to Thailand, Japan, South Korea, and Indonesia," Dr Lau explains. "This collaborative network has now culminated in a joint book project exploring regional language preservation efforts."
 

Conclusion


As CRLLS evolves in the rapidly changing landscape of linguistic research and digital innovation, it remains steadfast in its mission to preserve and promote linguistic diversity. Through innovative solutions and international partnerships, the Centre transforms how communities connect with their linguistic heritage in the digital age. As the Director stresses, "Language preservation is not just about documentation—it's about empowering communities to maintain their cultural identity and creating sustainable pathways for future generations." This vision propels CRLLS forward as it works to ensure diverse languages thrive in an interconnected world, making linguistic and cultural sustainability a reality for communities both locally and globally.