The SWiP project uses language, data and knowledge technologies to promote equality among all of South Africa's official languages. The hegemonic status of English (and, to a lesser extent, Afrikaans) has made English the language of learning and teaching,[1] which downplays an African epistemology;[2] as a result, local African languages are commonly under-resourced.[3] The acronym "SWiP" names the three main partners in a national collaboration: SADiLaR, the free encyclopedia Wikipedia, and PanSALB. They work alongside local speech and language communities within academia to address language equality using digital technologies, especially Wikipedia.[4]
Under apartheid, certain languages were marginalised, including isiNdebele, Siswati, Xitsonga and Tshivenda.[5] To address the underrepresentation of South Africa's indigenous languages, three organisations are collaborating to build better corpora for low-resource languages: SADiLaR, Wikipedia and PanSALB.[6]
Wikipedia is a common source of language data for natural language processing (NLP).[7] Low-resource languages have limited corpora (speech data, annotated text and other forms of linguistic data) that large language models (LLMs) can draw on for NLP. The SWiP project has introduced several alternative ways to collect and compile corpora of suitable text for low-resource languages, and has rolled these out on a national scale. These corpora can be used to create corpus-based dictionaries or to support semi-automatic translation.[8]
This collaborative project is also intended to promote, preserve and digitise South Africa's indigenous languages and cultural knowledge by enhancing their presence on digital platforms such as Wikipedia.[9] By partnering with cultural and linguistic organisations, the project aims to close the digital gap and ensure that local languages and cultural narratives are preserved and shared online.[6]