Multilingualism in cyberspace

From 5th to 9th June 2017 the Information for All Programme (IFAP), part of UNESCO, organised a “Global Expert Meeting” on Multilingualism in Cyberspace in Khanty-Mansiysk, Russia. The UK representative was Dr Drew Whitworth of the MA: Digital Technologies, Communication and Education team.

View over the city of Khanty-Mansiysk towards the river

The city of Khanty-Mansiysk lies in the Ugra region of western Siberia, at the confluence of the rivers Ob and Irtysh. Given its size and relative isolation, it might seem surprising that Khanty-Mansiysk can support a national league ice hockey team, a large conference centre, a brand-new concert hall and a world-renowned chess tournament centre, but the explanation is a simple one: oil revenue. Ugra is the centre of the Russian oil industry, and the marshes and forests of the region may hold the world’s largest remaining oil reserves.

However, at least some of the money is being put to good educational use. The region is predominantly Russian-speaking, but it has two living minority languages, Khanty and Mansi (hence the name of the city), with about 50,000 speakers between them. From the name of the region, Ugra, linguists drew the name of the Finno-Ugric group of languages, to which these tongues belong. Khanty and Mansi are therefore related to Finnish, Estonian and Hungarian, all quite different from the Indo-European languages spoken in most of the rest of Europe. The semi-autonomous government of Ugra quite generously funds initiatives to preserve these languages, with the Ob-Ugric Institute engaged in academic research and practical educational programmes.

Not all minority languages are so well backed by state funding, and many are disappearing. Many of the projects reported on at the conference, organised by the UNESCO Information for All Programme (IFAP), aimed to use digital, Internet and mobile technologies to help preserve linguistic diversity, both within particular regions and across the world as a whole. Much evidence suggests this diversity is not innately protected in cyberspace.

For example, though there is technically a ‘free market’ in what content is created for the World Wide Web, the distribution of languages on it does not reflect the diversity of mother tongues across the world. Several delegates argued that W3C figures on the use of different languages online were based on dubious methods that overestimated the amount of English content and underestimated other languages, particularly Chinese. But whatever methods are used to survey the Web, languages such as English, French, German and Spanish take a greater share of the total than their numbers of speakers would suggest. On the other hand, some languages with many speakers, notably Arabic, are significantly under-represented. The problem here is not just language, but script. The conference heard about the phenomenon of ‘Arabezi’, the transliteration of Arabic into the Roman alphabet, and Arabic keypads for mobile phones are a fairly recent introduction.

Simultaneous translation at the conference: Russian – English

More recently, automatic translation has become more widely available, spearheaded (in terms of public tools anyway) by Google Translate. It is an open question whether computers will ever match the best human translators, like the two simultaneous Russian/English translators who worked throughout the conference with impressive skill, but automated translation has certainly improved in recent years. Yet Google Translate remains English-centred, and does not recognise many minority languages. The conference was told that language preservation cannot succeed from the top down: by their very nature languages grow from local community roots, and the most successful initiatives, while they may use ICTs in various cutting-edge ways (e.g. crowdsourcing), have to be rooted in the local physical and informational landscapes that gave rise to the language itself.

The question of why the world needs this linguistic and cultural diversity in the first place was addressed by Drew Whitworth in his contribution to the final session of the conference. He introduced the concept of xenophilia, the “love of difference”, as not just a moral philosophy but a potential active approach to the design of information systems. Should a perfect information system deliver only what the user requests?

Drew Whitworth (left) speaking at the conference

What if, for instance, a future visitor’s search for information about a region like Ugra brought up not only English- or Russian-language content (and thus perspectives), but also locally generated, Khanty-language information, translated into a language the reader could comprehend while also serving as an introduction to local views on the area and to the indigenous language itself? Difference and diversity are essential to knowledge formation, an idea central to social theories of learning such as those of Etienne Wenger, or George Siemens’s connectivism. Knowledge can be formed within communities, but it is only at the boundaries, the zones where one community or group must interact with another, that ‘translation’ takes place, and with it dialogue between the groups. (For a longer version of Drew’s presentation see this Slideshare page.)

In short, linguistic diversity is just as essential to sustainable development as biodiversity is to the health of our broader ecologies. We must learn how to preserve it as an integral part of all our work with technologies, whether ‘educational’ or not.
