Saturday, December 27, 2014

Can you guess the correct language by geographical distribution?

There's a quiz game where you have to match languages to their geographic distribution. Needless to say, we all need to play it. It's fun, go go go :D!

Tell me your score and comments :)! I got 24/25, for some stupid reason I couldn't get no. 17.

Thursday, December 25, 2014

Mele Kalikimaka me ka Hauʻoli Makahiki Hou

'Tis Christmas time, so let's watch Tom Scott's video on the phonology of Hawaiian and why "Merry Christmas" becomes "Mele Kalikimaka". After that, let's learn how to greet each other on these Christian/Western holidays of Christmas and New Year's in lots of other languages, courtesy of Omniglot of course.

Luxembourgish: E schéine Chrëschtdag an e glécklecht neit Joer
Ukrainian: Різдвом Христовим
Standard Arabic: أجمل التهاني بمناسبة الميلاد و حلول السنة الجديدة
Quechua: Sumaq kausay kachun Navidad qampaq
Romani: Baxtalo Krećuno thaj Baxtalo Nevo Berš
Tibetan: ༄༅།།ལོ་གསར་ལ་བཀྲ་ཤིས་བདེ་ལེགས་ཞུ།
Xhosa: Siniqwenelela Ikrisimesi Emnandi Nonyaka Omtsha Ozele Iintsikelelo

Från oss alla, till er alla: En Riktig God Jul!

Friday, December 19, 2014

Really good radio show episode on nativeness of languages, go listen!

The excellent radio/podcast show Talk the Talk just did an episode on nativeness of language that is really, really good. They've interviewed Dr Vyvyan Evans, who just came out with a book about why language isn't an instinct, and they ask really good questions and get really good answers.

I know I'm saying "really good" a lot, but that's just because I'm very excited and this is awesome. I'll calm down in a bit. In the meantime, enjoy this gif as a display of my excitement which cannot be expressed fully in words and go listen to the show!

"Just thinking"... #1: Complexity and Junk DNA

I have a lot of thoughts and ideas but I always like to fact-check and double check before I say anything. I thought I'd share some of the stranger ones with you for the sake of entertainment. These are not serious thoughts, just strange thoughts. This one is one of the strangest and might fit better in an edition of the Speculative Grammarian.

What can we learn by comparing mature/complex language features and junk DNA in biological organisms?

I.e. something that entities that have been around for a long time acquire and have a hard time getting rid of. Mature features or frills in languages can be things that take a long time to evolve and that sometimes bleach down so much that they barely have any function anymore, except to mark group membership. "Complex features" is an umbrella term for lots of things; they're not necessarily the same.

I don't know a lot about junk DNA/non-coding DNA, but I do know that you can get it from not deleting copies (bleaching..?) and from retroviruses. Organisms that have little of it are more likely to be younger, but not all old organisms have a lot of it. In other words, you gotta be old to have it, but some old organisms delete a lot. This is my poor understanding of it.

The comparison seems to end badly when we have to consider how "they get rid of it". Languages might get rid of frills/mature patterns through intense contact situations (pidgins -> creoles etc) or perhaps through large proportions of L2-speakers. As I understand it, we're not entirely sure why and when biological organisms delete junk DNA; some do it a lot (carnivorous bladderwort plant, bacteria) and others not at all (pine trees).

This is probably not a very good idea at all, but when you're around people working with Neanderthals and in a strange mood it's hard not to get inspired.

So.. that's the strange thought for this time. Lemme know if you liked it and then I'll share more with you :).

What should we call our app game about linguistic diversity?

Hello! Wanna help us find a name for a smartphone app game about linguistic diversity?
I'm working with Seán Roberts (of MPI Nijmegen and the blog Replicated Typo), Mark Dingemanse (also of MPI Nijmegen but of the blog the Ideophone) and Peter Withers & Pashiera Barkhuysen at the Language in Interaction project (WP7) to create a game for smartphones that lets you compete in recognising and exploring languages of the world.
We need a name for this game and we'd like your help in figuring it out. Since cultural evolution and iterated learning are very cool, Seán's set up a very quick iterated learning game that you can play that will help us. Help us evolve an app name! The more that play the merrier!
It's a very simple game and takes very little time. As Seán says "We’ll throw some app names at you, you try to remember them, then we throw your names at someone else." 

Thursday, December 18, 2014

Listen to the world's languages! 2: Sinitic and other languages on Phonemica

Following on from the preceding post on listening to the languages of the world by navigating audio samples on maps, here I will introduce Phonemica 乡音苑/ 鄉音苑, a website with user-contributed audio files of Sinitic (and some other nearby) languages, mapped by pins that are (supposedly) colour coded by linguistic affiliations. Even if you are not into Sinitic languages, it is fun to listen to, and familiarise yourself with, the various Sinitic languages. (You know, just in case, e.g., the next pub quiz asks you to distinguish the Cantonese vs. Mandarin vs. Taiwanese Mandarin versions of 'Let It Go'.)

 Sample 1
Version A
Version B

(And just for fun, unofficial versions of 'Let It Go' in Taiwanese, Shanghainese, and 26 Sinitic lects.)

The level of diversity amongst the Sinitic languages is similar to that amongst the Romance languages. At around the same time that Vulgar Latin was spread around by Roman soldiers, Late Archaic/ Early Medieval Chinese was spread around by soldiers of the expanding Qín 秦 (221 – 206 BCE) and Hàn 漢 (206 BCE – 9 CE; 25 – 220 CE) Empires. The Sinitic languages are also commonly known as 'Chinese dialects'. This 'dialect' is a (mis)translation of the Chinese term 方言 (fāngyán in Mandarin). In terms of the dialect vs. language distinction, Chinese linguistics takes an approach that is considerably more 'lumpist' than Western linguistics. Mutually unintelligible but demonstrably related speech varieties are often called 方言 fāngyán of each other in Chinese linguistics. The difference between 'further apart' Sinitic languages like Hokkien and Mandarin is perhaps similar to that between English and Icelandic, French and Romanian, or Russian and Bulgarian. (I like the term 'topolect' as a translation of 方言 fāngyán; see, e.g., Language Log.)

Coming back to Phonemica. The first item that you might find useful is the language selection drop-down list in the top right of the page. Currently there is a selection of English, 中文正體 'Chinese Traditional', 中文简体 'Chinese Simplified', and 한국어 'Korean'. (Ultimately, most things are in Chinese.) After that, you might want to click on the map and click on a pin. Last year when I checked out this website, the pins were nicely colour coded according to the classification of the Sinitic languages in the Language Atlas of China (1987). However, the colour coding is now (Dec 2014) somewhat all over the place, especially for Hakka. (Someone needs to fix this.)

When you click on a pin, a pop-up box appears, giving you brief information about the speaker and the language variety. For instance, the pop-up box of the pin located at Seoul gives you a photo, the speaker ID of 华侨先生 [huáqiáo xiānshēng 'Mister Overseas Chinese'], age, gender, first order dialect group of 官话 'Mandarin', and subdivision of 胶辽官话 'Jiaoliao Mandarin'.

The following are the labels of the first order Sinitic dialect groups used in Phonemica (largely identical to the classification used in the Language Atlas of China):

官话 Mandarin
晋语 Jin
瓦乡话 Waxiang
湘语 Xiang
赣语 Gan
徽语/徽洲 Hui
吴语 Wu (including, e.g., Shanghainese)
闽语 Min (including, e.g., Hokkien)
客语 Hakka
粤语 Yue (including, e.g., Cantonese)
平话和土话 Pinghua and Tuhua

(This map and this article from Wikipedia might be useful. See also (*teehee*) de Sousa 2015.)

There are also recordings labeled as 多种方言 'multiple dialects', and recordings of non-Sinitic languages like Atayal, 壮语 Zhuang and 诺苏 Nuosu. (There is also Ong Be mislabeled as Danzhou Yue on Hainan Island.)

When you click further, it leads you to a more-detailed page about the speaker, with links to the recordings. When you click further and get to the page of the recording, you can play the recording, you see waveforms, and underneath you get (maximally) vernacular in Chinese characters, romanisation, IPA, translation in Mandarin (i.e. Standard Written Chinese), and translation in English.

The following are samples of some better known Sinitic languages: Cantonese, Hakka, Hakka, Teochew, Taiwanese, Hokchew, Wenzhou, Shanghainese, Shanghainese, Beijing Mandarin.

At the bottom of the homepage are links to various media reports in English and Chinese on the Phonemica project.


de Sousa, Hilário. 2015. “The Far Southern Sinitic Languages as part of Mainland Southeast Asia”. In Enfield, N.J. and Bernard Comrie (eds.). Languages of Mainland Southeast Asia – The State of the Art: 352–435. Berlin; New York: De Gruyter Mouton.

Tuesday, December 9, 2014

3 x Conlangs!

I've got three things on constructed languages for ya. Be it for fiction, peace or just plain fun - constructed languages fascinate. They vary in thoroughness and in the devotion of their fan communities, and we all have our favourite. My favourite is the same as that of my teacher Mikael Parkvall, i.e. Solrésol. Here's the wiki on Solrésol, Romeo and Juliet in Solrésol and there is also a linguistic problem from the Swedish contest based on Solrésol but I can't find it right now.

For your enjoyment

Imma big fan of constructed languages that do not have any aim of being optimal or of improving upon communication or the world, and preferably ones that aren't solely based on European assumptions and categories. Do you have any preferences or favourites?

Languages of Nerdfighteria - help me reach John and Hank!

I tried to figure out how many languages are spoken in the online community of Nerdfighteria once, and now I'd like to see if I can get a hold of John and Hank Green with your help. Wanna join?

How is this interesting to a non-nerdfighter reader of this blog? Well, this is a good example of young people's enthusiasm for language and by encouraging it we're making the world better and possibly also linguistics better - just like the Olympiads of Linguistics or all the tumblrs, blogs, youtube-channels etc on linguistics out there on the interwebs. We in academia need to connect with the surrounding world, and enthusiastic young people are a great place to start.

Imma nerdfighter, this means that I form a part of the community around the vlogbrothers, brainscoop, scishow, the Art Assignment, crashcourse and many other excellent youtube-channels. It's an interactive community formed by nerds interested in knowledge and the world in general - that's the brief and simple way of putting it. The originators are Hank and John Green. Nerdfighteria is an environment that suits many geeks, nerds and academics very well. They've even done a very good video on linguistics that I highly recommend.

Anyway, a long time ago Hank Green asked the question "How many languages does Nerdfighteria speak?" in a video, which quickly led to a forum thread where everyone started listing what languages they know. Since I'm a big fan of systematicity (being a scientist and all) I tried to bring some order into it all quickly by introducing a large collaborative spreadsheet with an accompanying section on how to deal with the spreadsheet, since things quickly became chaotic there too.

This is by far not the best way of doing it, but I had to act quickly to ride the wave of what was going on. Had I done it today I woulda done things differently for sure. As you can tell when you enter the spreadsheet, there's been a lot of people there editing and I've "let" them enter things such as programming languages, conlangs, "unclear things" etc. However, I've not officially counted those entities as "languages". Needless to say, a more well-constructed survey would be in order.

Anywho, there are still interesting facts one can derive from the information in the spreadsheet. In Nerdfighteria, a community made up of nerdy (predominantly) young people from all over the world, there are at least 127 languages spoken by at least one person. That is roughly 1.79% of the languages of the world, but it means that we as a community could communicate with 66.99% of the population of our planet. This is because a few languages are spoken by many; you can read more about this distribution here. We can also learn that we have such interesting things as Australian Sign Language, Hawaiian Creole English and Chickasaw in our midst. Yay, go us!
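
As a rough check of the share quoted above, here is a small sketch of the arithmetic. It assumes a world total of 7,106 living languages, the count usually cited from the 2014 edition of Ethnologue; the post does not say which total was used, so treat that number (and the function name) as an assumption.

```python
# Assumption: world total of living languages; NOT stated in the post.
WORLD_LANGUAGES = 7106
# Count of languages reported in the collaborative spreadsheet.
NERDFIGHTERIA_LANGUAGES = 127


def share_percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


print(share_percent(NERDFIGHTERIA_LANGUAGES, WORLD_LANGUAGES))  # 1.79
```

The much larger population share cannot be reproduced this way, since it depends on speaker numbers per language rather than on the number of languages.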

Now, at the time of all of this I tried to get in contact with Hank, John and their crew, but it proved very hard. They're constantly bombarded with messages, I'm sure, and a well-meaning timid linguist who wants to suggest a new survey and possibly contribute some linguistics to the crash courses isn't always gonna cut through the noise.

What I want to tell them is this:
  • this is what happened after Hank asked the question
  • these are some of the things we can learn from that forum thread and the collaborative spreadsheet (more than what is brought up in this post)
  • in the next census of nerdfighteria, would you like some assistance in figuring this out in a better way?
  • would you like to include linguistics and other non-natural sciences more often in scishow and crashcourse? If so, I'd be happy to help with linguistics ^^!
Can y'all help me reach John and Hank? You can retweet this (click here), reblog it on tumblr (click here), whatever you like. Let's see if we can get their attention and try and provide an answer to Hank's old question. If they want to get in touch, or if you or anyone else wants to pose a question about languages and/or linguistics, you're all very welcome to do so here :)!

When exploring the languages of Nerdfighteria one mustn't forget the non-English part of the forum, please redirect people there if they're asking to get to know non-English nerdfighters. There are also lots of non-English nerdfighter facebook groups, I cannot list them all. And there's also the linguistics-forum thread.

Alright, let's do this!

P.S. I do realize that Project for Awesome is coming up, one of the busiest times of the year for Nerdfighteria. I have reasons to think that this is going to become relevant..

Ping @fishingboatproceeds (John Green) @edwardspoonhands (Hank Green) and @scishow

Talking to linguists - on podcast?

Like I've said before, you know what I most often like better than reading grammars? Talking to speakers/signers and experts! You know, it's not a good idea to say "so, you're a linguist - what languages do you speak?", but at the same time linguists do speak and do research into a lot of languages and by collaborating we can make more effective use of our time investigating linguistic diversity. See my old posts here about "how many languages do linguists speak?".
These past two weeks I've been out on a little tour doing just that. I've been to several places in Germany: Hillerse, Berlin, Potsdam, Leipzig, Freiburg and then over to the UK to meet people at SOAS. These are some of the people I've met and had time to sit down with:
Ulrike Mosel - retired professor of the University of Kiel, renowned specialist on Oceanic languages, especially Samoan, Teop & Tolai, and also originally an orientalist, as it was called then, with a specialty in Arabic.
Tom Güldemann - professor of African languages at Humboldt University in Berlin with a specialty in Khoisan and Bantu
Lee James Pratchett - PhD student at the Humboldt University working on ǂKx'ao||'ae
Victoria Apel - PhD student at the Humboldt University working on Fulfulde [fuf]
Lutz Marten - Professor of General and African Linguistics at SOAS in London with a specialty in Bantu
Andrew Harvey - PhD student at SOAS working on Gorwaa
Philip Jaggar - Emeritus prof at SOAS, specialist in Hausa
Cephas Delalorm - PhD student at SOAS working on Sekpele
Abbie Hantgan - Post-Doc in the Crossroads project and specialist in Bangime

And then there are many more that I've talked to and started collaborating with, like the rest of the Crossroads of Multilingualism-gang.

They're excited and want to share, I'm excited and want to learn. It was awesome and I'm very thankful to each and every one. Field workers are awesome and generous, I love them.

I wish there was a better way of introducing these amazing people and their knowledge to the world. Perhaps through a public outreach podcast with 15-minute segments for each language? I usually sit down for longer and ask questions that are not of interest to the general public (is the syntactic pivot S = A or S = O?), but I could do a short summary and interview. If the Lexicon Valley or the Speculative Grammarian podcast would like a segment of 15 minutes where one linguist presents one language, let me know... just saying.

Would that be something you as readers of this blog would listen to? Tell me.

Monday, December 8, 2014

Listen and watch the world's languages!

There is a site called Language Landscape where you can listen to and see languages of the world by navigating audio samples on a map. They're currently offering 357 samples, both audio and video (because remember: sign language!!), go check it out and contribute your own! It's super awesome.
Do you remember some time ago when Jeremy Collins wrote a post here in connection with a workshop on cross-linguistic databases and wrote that he would like to see data points tied to specific speakers and geographical coordinates, as one possible solution?

Well, guess what the Language Landscape does? Exactly that! I've just come back from SOAS in London where I met a certain Samantha Goodchild who works on Language Landscape and who told me about the site. Such cool stuff, y'all go contribute, listen and watch now!

While you're at it you should also go to Paul Heggarty and Colin Renfrew's site Sound Comparisons where you can hear cognate sets from lots of languages. Cognate sets are words with a shared origin, like "dotter" in Swedish and "daughter" in English. Here's a little taste of Sound Comparisons, cognates of the word "right".

I've got a third source for you to listen to the world's languages. Do you remember the Great Language Game? It's a very cool game where you can compete in recognizing languages of the world by labeling audio samples. I wrote a blog post about it with Seán Roberts at Replicated Typo some time ago. The creator of that game, Lars Yencken, has also made it possible to listen to all the languages freely here.

And a fourth! You can get better at recognizing languages by ear with Langscape's Language Familiarization game!

In short, there are lots of ways for you to improve your skills in recognizing languages in your local public transport and impress your friends. Also, you'll become a better person from it. Go do it, do it now.

Linguistic diversity, important things to think about concerning maps and hot research topics

The tumblr The land of Maps recently reblogged a map from the 2004 edition of Ethnologue displaying the linguistic diversity of Africa. I thought I'd just add some brief commentary and information about the kind of research questions this touches upon. Now, Africa is super-diverse, don't get me wrong, but this image does not show the full picture. I know Imma party pooper, I've accepted this about myself.

Ethnologue © 2005 SIL International
This map is based offa Ethnologue and displays each language of Africa as a polygon shape covering the area where the language is spoken. Ethnologue is a catalog of the world's languages administered by SIL. They're also the keepers of certain standards of languages, such as the ISO 639-3 codes for language names. The most recent edition, from 2014, does not feature maps of this kind. They do have maps of smaller areas though, like Nigeria which is one of the most diverse regions of Africa.

In order to see the diversity of Africa we can look at such maps as the one above, but there are a couple of things that are good to keep in mind for a correct reading of the map: family relations, polygons versus dots, population size, contact areas and multilingualism.

a) we're not getting the genealogical dimension clearly, i.e. which languages belong to the same family? What kind of diversity are we interested in? Is just speaking different "languages" enough? What about being similar due to shared genealogy or contact? In fact, we can speak of diversity in at least four different ways:
  • diversity of how many languages are currently spoken in an area (see original map above)
  • how that diversity is distributed across speakers/signers, i.e. sure there are lots of languages spoken/signed in a certain area, but most people actually speak one and the same language. This can be represented by the Greenberg Diversity Index, which measures how likely it is that two randomly chosen speakers have different mother tongues, read more here.
  • genealogical diversity, how are these languages distributed across families? And here we might want to split this up into top-level families and genera since language families can vary greatly in time depth whereas genera groups are always of max 3500-4000 years making them more suitable for direct comparison
  • typological diversity, i.e. when it comes to comparing grammar, phonology, semantics etc cross-linguistically - how similar are the languages? We get contact areas where lots of languages from different families are nevertheless similar in their typological profile. We can measure this for example with the Dahl-distance (Dahl 2006).
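
To make the second kind of diversity in the list above concrete, here is a minimal sketch of the Greenberg Diversity Index in its usual formulation: one minus the sum of squared population shares, i.e. the chance that two randomly picked people do not share a mother tongue. The speaker counts and the function name are made up for illustration; they are not taken from Ethnologue or any other source.

```python
from typing import Iterable


def greenberg_diversity(speaker_counts: Iterable[float]) -> float:
    """Return 1 - sum(p_i ** 2), where p_i is each language's population share."""
    counts = [c for c in speaker_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one speaker")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical areas, each with four languages and 1000 people.
even_area = [250, 250, 250, 250]   # speakers spread evenly
skewed_area = [970, 10, 10, 10]    # almost everyone shares one language

print(greenberg_diversity(even_area))    # 0.75
print(greenberg_diversity(skewed_area))  # much lower, close to 0.06
```

Both hypothetical areas have the same number of languages, yet the index differs a lot, which is exactly the difference between the first and the second kind of diversity.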
So what does Africa look like in terms of genealogical diversity? Well, according to Glottolog we've got 57 language families and isolates, compared to Eurasia's 45 and South America's 117. Here's a screenshot from another site called Langscape where they supply interactive maps that also display family relations (colors) and multilingualism (overlapping shapes). There are plenty of problems with those maps too, missing information etc. But, for showing genealogy and multilingualism they're often better than Ethnologue's. Go there and click away!
Screen grab from Langscape, © 2014 University of Maryland
If you're interested in the distribution of typological features across not only areas, but also language families, why not check out the World Atlas of Language Structures sunburst explorer? It's a handy, easy-to-use online tool that lets you explore the 2500+ languages over 166 features (resulting in 69 000+ datapoints) of WALS in a clear and concise way. I've written a post explaining more here.
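
A quick back-of-the-envelope calculation shows how sparse that language-by-feature table is. The sketch below uses the rounded figures quoted above as lower bounds, so the result is only approximate; the function name is made up for illustration.

```python
# Rounded figures as quoted in the text ("2500+", "166", "69 000+").
LANGUAGES = 2500
FEATURES = 166
DATAPOINTS = 69_000


def coverage(datapoints: int, languages: int, features: int) -> float:
    """Fraction of the language-by-feature grid that actually has a value."""
    return datapoints / (languages * features)


print(f"{coverage(DATAPOINTS, LANGUAGES, FEATURES):.1%}")  # 16.6%
print(DATAPOINTS / LANGUAGES)  # 27.6 coded features per language on average
```

In other words, on these figures only about one cell in six is filled in, which is worth remembering when comparing languages across features.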

b) polygons versus dots and populations. There are different opinions on how to best display languages on a map, the main question being: do we use polygon shapes or dots? They are good for different purposes; it depends on whether we want to treat all languages as equally interesting (dots) or whether we want to grasp the geographical area over which they are spoken and therefore see contact better (polygons). Both of these fail to represent population; for that we can either modify the size of the dots (like Gapminder does) or distort the size of the areas like Worldmapper does.

I wanted to show how this works, but I couldn't find a map where the polygon of each language was distorted according to the number of speakers/signers per language. What I could find though are maps of the number of languages spoken in an area, one from Ethnologue and one from Worldmapper.

Both are using map projections similar to Gall-Peters. Now, we're going from dots to polygons. In the first map each dot represents a language; it is from Ethnologue's previous edition from 2009. Ethnologue assigns each language in their catalog to one country, and then lists countries where the language "is also spoken". They also divide languages into those spoken indigenously and those labeled as immigrant languages, read more here. We're dealing with the indigenous here, and one country per language (as far as we know). This is comparable to all CLLD-maps where each language has one dot in one location, sometimes in the same as Ethnologue but sometimes not. This is important to know for a correct reading of the map.
Ethnologue © 2009 SIL International 
In the second map Worldmapper has taken the same data, hopefully even from the same edition of Ethnologue (they don't say explicitly), with some modifications, read more here. What they've done now is distort the size of countries/territories to show the number of indigenous languages spoken there.
Worldmapper © Copyright Sasi Group (University of Sheffield) and Mark Newman (University of Michigan).
Now, if we compare this map to one where the countries/territories are distorted to represent human population we get a very different view; in particular we get fewer people and more languages in Nigeria and Papua New Guinea, and more people but fewer languages in China and India.
Worldmapper © Copyright Sasi Group (University of Sheffield) and Mark Newman (University of Michigan).
Why are there so many languages in certain areas? Sure, it needs to be said that some of it might be due to overly excited linguists indulging in too much splitting, dividing languages too much. But that doesn't fully answer the question, and even if it does it needs to be proven. What about self-sufficiency, exogamy, political organization, multilingualism, trade, isolation and time depth of settlement? There are lots of questions concerning the emergence, maintenance and decline of linguistic diversity that need answering, and actually this is a very hot research topic at the moment. One project that seeks to answer such questions is the Wellsprings of Linguistic Diversity project at the Australian National University in Canberra. In fact, yours truly will become a part of that group. If you're into this you should also read this awesome free book by prof Enfield that just-just came out!

c) multilingualism. In the original map that spawned this post there is no clear information on how multilingualism is treated. Do we only see the languages with the most speakers in a certain area? Many of the areas with the most languages are also rife with multilingualism. People speak the languages of their parents, the languages of their spouse's parents, the languages of the neighboring villages, the regional lingua franca etc. And not only do they do this now, they've been at this extreme multilingualism for a very long time. How does this work? Well, that also just so happens to be a hot research topic tied into the other question of why there is such diversity in certain areas and not in others. This question is addressed by the Wellsprings project too, but also by the Crossroads of Multilingualism project of SOAS and the Babel Problem PhD of the Language in Interaction consortium. The Langscape map that we looked at above does a better job of displaying multilingualism, but is still missing loads of data.

Alright, that's it for now. I hope you've learned something about data visualization of languages on maps and different kinds of linguistic diversity. Be sure to write if you wanna tell us something or ask something.

Dahl, Ö., 2006. An exercise in "a posteriori" language sampling. Ms, Stockholm University Linguistics Department.

Registers and styles

In Samoan (Oceanic, Austronesian) there are two styles: T-language or 'o le tautala lelei (the good language) and K-language or 'o le tautala leaga (the bad language). Styles refer to intra-language variation that is not dialectal (bound to geography), but rather changes depending on factors such as formality and intimacy of the context.

 These two are quite interesting since they are marked by a switch of [t] to [k], [r] to [l] and [n] to [ŋ], among other markers. There is a PhD dissertation on this from 2001 (Mayer), in which we can find this quote that very clearly explains what "styles" are. I like it and I want to share it with you.

Because the Samoans almost always use K's instead of T's, a few elders have been inclined to remark, "Why should we be so careful about using T’s? Why shouldn’t we speak the language the same as the Samoans do?" The most common reply is that anyone with a true love for languages will adhere to its pure form and avoid this corruption. But much more important than this is the fact that there is a standard of appropriateness that is expected of missionaries. Even in English, whether you realize it or not, you change your way of speaking to fit the situation. You hail your friends with something like "Hiya Gang!" To a good friend of your father's you say "Hi. Mr. Smith." Your bishop gets a "Good morning, Bishop Olson." Why not "Hiya Oley!" to your bishop? Why not "Good morning, Gang!" to your friends? The answer is that you use the kind of language appropriate for each situation. In Samoa there is a standard expected of all ministers, whether they be L.D.S. or Methodist, American or Samoan. A missionary will lose the respect of the people if he speaks in K's.

This is advice on the Samoan language to missionaries of the Church of Jesus Christ of Latter-Day Saints from Johnson and Harmon (1972) as cited in Mayer (2001).
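
The sound switches mentioned above lend themselves to a tiny illustration. The sketch below is a toy that covers only the three switches named in this post ([t] to [k], [r] to [l], [n] to [ŋ]) and treats its input as a broad transcription; the function name is made up, and real K-style involves other markers that are not modelled here.

```python
# Toy mapping: only the three switches named in the text.
T_TO_K = str.maketrans({"t": "k", "r": "l", "n": "ŋ"})


def to_k_style(t_style: str) -> str:
    """Apply the t->k, r->l and n->ŋ switches to a transcribed T-style form."""
    return t_style.translate(T_TO_K)


print(to_k_style("tautala"))  # kaukala
```

Note that the mapping only runs one way: since K-style merges sounds that T-style keeps apart, a K-style form cannot be converted back to T-style with a simple table like this.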

Alongside "style" there is also the term "register", as Mayer writes (2001:18-19):

A register may be said to be situation-specific and is concerned with the aim of the communication. It consists of structural and linguistic features that set it apart from other registers within the speech community. Style, as distinct from register, refers to varieties of speech which focus on the relationships between the participants. 
 Style has been defined as a variety of language with shared features among its speakers that are used to express dimensions such as intimacy/distance, casualness/formality, and peremptoriness/politeness. On the other hand, register has been defined as a variety of language with shared features among its speakers that are used in specific situations and for specific functions. For both style and register, these features may include vocabulary, syntactic patterns, prosodic features, and phonology. Genre is concerned with form and is often structured by extralinguistic rules that are concerned with external format. That these subcategories of language variation occur on a continuum rather than in highly defined and discrete forms has contributed to the difficulty in distinguishing between them. 

So, there is a difference between register and style and yet they will often be hard to tease apart.

Johnson, A. P., and L. E. Harmon. 1972. Let’s speak Samoan. Apia: Church of Jesus Christ of the Latter-Day Saints Press.

Mayer, J. F. 2001. Code-switching in Samoan: T-style and K-style. PhD dissertation, University of Hawai'i.

Why should we care about languages that are dying?

Linguist John McWhorter wrote recently in the New York Times on the topic of why we should care if 90% of the languages that are alive today perish within a very short time span. Go read the article, it's a very good read.

I'll do a brief sum-up here and then present my own thoughts. The text is 911 words and I doubt that McWhorter has exhausted all his reflections on this topic, let's just keep that in mind. In this text he answers a frequently asked question "if indigenous people want to give up their ancestral language to join the modern world, why should we consider it a tragedy?". This is a common question faced by linguists, and a very important one that should be addressed with great care and respect.

He dismisses the earlier linguistic-relativist argument he used to put forward, i.e. that each language represents a different world view and therefore another way for us as a collective human society of viewing the world.

Now, he rather puts forward two other arguments:

1) Languages are important markers of community and identity; when people lose their languages they lose their culture and history.

2) He writes languages are scientifically interesting even if they don’t index cultural traits. They offer variety equivalent to the diversity of the world’s fauna and flora.

In other words, even if the linguistic relativism part of linguistic diversity isn't interesting, the basic question of linguistic typology is. I've made an attempt at formulating that question myself, based offa Hjelmslev and Boas. I can't know if this is what McWhorter is interested in. Here we go:

Linguistic typology aims to map out the logically possible options of linguistic diversity and then see how actual languages are distributed across that design space, correlate that with genealogy, contact, known grammaticalization paths and the like, and then, from the resulting knowledge about what is more probable and what is less probable, possibly discern something interesting about human nature and cognitive capacity.

In order to understand ourselves as a species better we need to study what we can do, what diversity we are capable of.

McWhorter then finishes with:
These are the arguments I have ready for the “Why should we care?” fellow these days. We should foster efforts to keep as many languages spoken as possible, and to at least document what the rest of them are like. Cultures, to be sure, show how we are different. Languages, however, are variations on a worldwide, cross-cultural perception of this thing called life.

To me, there are two very important things missing here. Firstly, the agency of the communities. We cannot as linguists, and certainly not as non-members of the communities, make predictions about what they want. We can let them know that their language is
a) a language (at all), this is sometimes misunderstood
b) equal to other languages
c) interesting
Just like all other languages it deserves a dictionary, a grammar, pedagogical books, a place in the political system etc. That we can do, that we can say. The public often seems ignorant of the fact that all languages are equal.

Languages with more speakers or more written literature are by no means more valuable or better. This is a fact I find that I need to explain often, once to a Frenchman who claimed that French was the only language with which philosophy could be discussed. Sigh. Another time it was someone who thought that the reason that English was spoken by so many was because it had traits that were just more optimal for communication. In answer to that question I recommend looking at this map of "Countries the UK has not invaded" and then having a long hard think.

If explaining this and showing interest gives communities more power and confidence, great. But we cannot tell them that they should be doing this or that. It is not our right. In Swedish we would call this removing someone's "myndighet": their right and competence to govern themselves. Imma Trekkie, so I can't help but think of some of the moral dilemma episodes where humans get put in zoos to be studied by much more advanced alien beings. We can't be guilting people for having to live in a world where their standard of life, and the standard of life for their families, will often increase significantly if they learn the majority language of their region.

I don't think this is what linguists do, nor what McWhorter is proposing, but I think it's worth keeping in mind. The fact that languages are dying is tied together with larger patterns of urbanisation, cultural colonialism etc. You might wanna read this study on correlations between linguistic diversity and economic growth. We can do what we can in stressing the fact that all languages are equal and that we, as well-educated people with authority, find them all interesting to study. By doing that perhaps we can instill confidence and/or have an effect on the political struggle of that community. It's sad and crass, but by discussing matters like this as privileged people our voices sometimes can make a greater difference compared to the voices of the actual communities than we perhaps wish they would.

As for the second argument, if that was true then we need just document and study all languages currently spoken and then we need not care whether they live or die. That is not an answer to the question "why should we care that they are dying?", it's answering the question "why should we care that they are dying before we can extract all new information from them?".

Add to this line of thought the fact that 7 000 languages is actually not enough if we want to answer the larger typological question outlined above. Let's just do a quick count. Humans have spoken languages for roughly 100 000 years. Let's pretend that a language is "intact", i.e. mutually intelligible, over at most 1000 years. Then let's say that independent of population growth during this time we've kept at a constant 5 000 communities, at least. That means half a million languages, and we're counting generously. That means that less than 2% are alive today, and we want to study the extent of our capabilities as humans on this. It is not futile and this count is not very accurate (by far). However, just. Keep this in mind will ya?
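
For those who like to see the sums written out, here's the same quick count as a tiny Python sketch. All the numbers are just the rough guesses from the paragraph above (the variable names are mine), so treat the result as an order-of-magnitude illustration and nothing more.

```python
# Back-of-the-envelope count from the paragraph above.
# Every figure is a rough assumption from the post, not measured data.
YEARS_OF_LANGUAGE = 100_000   # how long humans have spoken languages
LANGUAGE_LIFESPAN = 1_000     # years a language stays "intact"
COMMUNITIES = 5_000           # assumed constant number of communities
LIVING_LANGUAGES = 7_000      # languages alive today

# Number of successive 1000-year "slots" in 100 000 years.
slots = YEARS_OF_LANGUAGE // LANGUAGE_LIFESPAN

# One language per community per slot.
languages_ever = slots * COMMUNITIES

# Fraction of all those languages that we can still observe today.
share_alive = LIVING_LANGUAGES / languages_ever

print(f"Languages ever spoken (rough estimate): {languages_ever:,}")
print(f"Share alive today: {share_alive:.1%}")
```

With these guesses the estimate lands at half a million languages, and the living ones make up well under 2% of that.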

This is a rather trivial and obvious observation, but I think that this text is not meant to be read as what linguists should do. It's rather an answer to that question "why should we care", asked by someone so unempathetic as to not understand that people identify with their language, and that not caring when it is taken away from them is just wrong and enforces colonial ideas that certain languages are better than others. This to me is a reminder that we still need to tell language communities, and members of the privileged communities asking such questions as the one McWhorter receives, that:

all languages are equal

and we apparently need to be saying this over, and over and over again. Perhaps exactly because humans associate culture and identity so much with languages, and we already seem capable of viewing certain cultures as better than others. This question is also often asked by people who do not speak many languages and have found language learning hard, giving them a skewed comparison of languages. This question is closely tied to the other very frequently asked question: which languages are most difficult? Well, all languages are learned by infants in roughly the same time, so they can't really be that different. However, not all languages are equally easy for an adult Germanic speaker to learn.

Perhaps it needs to be said, I think diversity is great and it's very sad that languages are dying. But, the picture is larger and we mustn't let ourselves as linguists fall into guilting people just because we want to answer privileged people who cannot understand the fact that all languages are equal.

Btw, y'all should give the UNESCO's Universal declaration of Linguistic Rights a read.

Friday, December 5, 2014

Rejected language families

There are a lot of different ideas on what languages are related and how. Some are well-received by most linguists, and others are very controversial. There's a Wikipedia entry for all suggested language families that Glottolog has rejected. (**EDIT I've seen some errors in there, I'm gonna improve it.)

There are three main sites if you're into historical linguistics: Ethnologue, MultiTree and Glottolog. Ethnologue is SIL's big catalogue, MultiTree is Linguist List's initiative to make lots of hypotheses of language history accessible and Glottolog is a huge database of bibliographical information by Harald Hammarström, Sebastian Nordhoff, Robert Forkel and Martin Haspelmath of the MPI Society.

Most often I prefer using Glottolog. Glottolog gives a reference for each point of the tree and classification, i.e. you can see which sources have been used to argue a particular relationship or grouping. Glottolog is more "splitting" than Ethnologue, i.e. more prone to not postulate deep groupings. You can read more about that here.

Btw, remember. If you're not sure if what you're looking for has been classified as a languoid, dialect, top-level-family, isolate or some node in between, or if you're not sure of the ISO 639-3 code, glottocode or the name Glottolog has chosen as default, then search Glottolog here and not in the languoids "table". Then you'll search all levels and all alternate names.

Pakistan National Linguistics Olympiad! Hurray!

As some of you already know, I organise the national olympiad of linguistics in Sweden and am a board member of the international contest. Today I'd like to take the opportunity to celebrate the Pakistan National Linguistics Olympiad (PLO). A linguistic olympiad is a chance for students of secondary school to learn about the wonders of linguistics.  
The Pakistani contest writes the following on their site, very well-put!

Whether it’s telling a joke, naming a baby, using voice recognition software, or helping a relative who’s had a stroke, you’ll find the study of language reflected in almost everything you do. When you study linguistics at any level, you gain insight into one of the most fundamental parts of being human - the ability to communicate through language. You can study every aspect of language from functional theory to language acquisition to psycholinguistics. Studying linguistics enables you to understand how language works, how it is used and how it is developed and preserved over time. Linguists are not only polyglots, grammarians, and word lovers. They are researchers dedicated to the systematic study of language who apply the scientific method by making observations, testing hypotheses, and developing theories. The science of language encompasses more than sounds, grammar, and meaning. When you study linguistics, you are at the crossroads of every discipline.

They're one of our newest members, I'm very excited to welcome them into the international contest!

Navigating the world by language

This is me in Leiden navigating the canals with an old world map. It's a Swedish map, made by Docenten Friherren Sten De Geer in 1924. The upper section shows the distribution of languages across the world and the lower one religion. I'll make a longer post about this map and linguistics in the 20's later, for now I just wanted to share this image.

Thursday, December 4, 2014

Linguistic relativism and data visualisation

This is a sweet little story by a data visualiser by the name of Muyueh Lee (李慕約). It's about comparing the entries in Wikipedia for different color terms between Chinese and English. The link will take you to a site which will elegantly tell you a little story about the data and what he's found.

He starts off with:

Language represents our view of the world, and knowing its limits helps us understand how our perception works. 

Now, this is true and very interesting. There are some problems with this study, but as a way of showing data visualisation and what one can do with freely available data online it's great. There is so much data out there that we can do interesting things with.

The story he tells is about color terms, and it just so happens to be one of the most well-researched areas of lexical typology (systematic cross-linguistic comparison of words). It all started with Berlin and Kay's work on basic color categories, and there's been lots of good and interesting research since then. I'm afraid I cannot provide you with a satisfactory history of all that research here now. Down here is a little simplified representation of their research, it's an implicational chain where languages with the categories to the right also have all the categories of the previous stages to the left. I.e. you won't have "orange" if you don't also have "red". If you're interested in basic color categories read this and look at the surveys website here.
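
If you think better in code, here's a minimal Python sketch of what an implicational chain means in practice. The stage list is my own simplification of the commonly cited 1969 version of Berlin and Kay's sequence (it is not taken from the visualisation above), and the function name is made up for this example. Later research has revised the sequence a lot, so don't read this as the final word.

```python
# Simplified stages, loosely following the commonly cited 1969 version
# of Berlin and Kay's sequence. This list is an assumption for the sake
# of illustration, not something taken from the post's sources.
STAGES = [
    {"black", "white"},
    {"red"},
    {"green", "yellow"},  # either of these two may come first
    {"blue"},
    {"brown"},
    {"purple", "pink", "orange", "grey"},
]


def is_consistent(terms):
    """Return True if a set of color terms respects the chain.

    Having any term from a stage implies having every term from all
    earlier stages. Terms not listed in STAGES are simply ignored.
    """
    terms = set(terms)
    for i, stage in enumerate(STAGES):
        if terms & stage:
            for earlier in STAGES[:i]:
                if not earlier <= terms:
                    return False
    return True


print(is_consistent({"black", "white", "red"}))     # True
print(is_consistent({"black", "white", "orange"}))  # False, no "red"
```

So a made-up language with "orange" but no "red" breaks the chain, which is exactly the kind of system the hierarchy says we shouldn't expect to find.
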
If you're curious about linguistic relativism in general, I'd also like to recommend this short quote and these articles:

Levinson, S. C., & Majid, A. (2009). The role of language in mind. In S. Nolen-Hoeksema, B. Fredrickson, G. Loftus, & W. Wagenaar (Eds.), Atkinson and Hilgard's introduction to psychology (15th ed., pp. 352). London: Cengage learning. (free PDF here) 

Majid, A., & Levinson, S. C. (2011). The senses in language and culture. The Senses & Society, 6(1), 5-18 (free PDF here)

More linguistics all the time yeah!

Then, as Jan Wohlgemuth of Linguisten also pointed out we should not forget enterprises like Glottopedia and others that deal with linguistic terminology. For more on resources of linguistic terminology you can go to this text here.

Now, soon it'll be time to start adding proper descriptions for all of these sources and posts. Some are news feeds from institutions like the MPI in Nijmegen or PARADISEC, others are funny tumblrs with serious topics mixed with humour, others are educational channels with carefully produced manuscripts. 

Want more linguistics in your life?

Do you know of any other sites, blogs or similar where people can learn about linguistics and the diversity of the languages of our planet?

I know of loads, I try to keep track of them all but sometimes it becomes a tad bit overwhelming. I like lists, so I made a list. It's only of sources that are in English and I haven't written down  a description for each one (yet). If there's anything you think I should add, please add a comment, reblog on tumblr and tell me, or tell us here.

There's also this list here of free online databases of languages.

I used to be jealous of my partner who's in physics because they've got lots of online lectures etc that they can use to complement their education. We're not there yet, but there are more initiatives than one might think. It's awesome. Let's spread the word! Check some of these blogs and sites out and if you like them, share them onwards. Because you know:

Wednesday, December 3, 2014

Natural Causes of Language by Enfield - New awesome free book with video introduction

N.J. Enfield of the University of Sydney and the Max Planck Institute for Psycholinguistics has just published a book with the open access publisher Language Science Press. It's called "Natural causes of language: Frames, biases, and cultural transmission".

Whoop whoop, you can download it and read it right now and the author has made a video introduction. If you're stressed and think you can't deal with this right now, know that the book is 64 pages and the video is 2.21 minutes. It's well within your capabilities and it is very much worth it if you have any interest in language whatsoever.

The best part of the video is when he talks about the kinds of doubts that we have in linguistics, for example "that language is a real thing" and "that tree diagrams are useful representations of language history". Couldn't agree more.

This book is a part of a new series they're launching called "conceptual foundations of language science". I'm already excited about the next one.

What causes a language to be the way it is? Some features are universal, some are inherited, others are borrowed, and yet others are internally innovated. But no matter where a bit of language is from, it will only exist if it has been diffused and kept in circulation through social interaction in the history of a community. This book makes the case that a proper understanding of the ontology of language systems has to be grounded in the causal mechanisms by which linguistic items are socially transmitted, in communicative contexts. A biased transmission model provides a basis for understanding why certain things and not others are likely to develop, spread, and stick in languages. Because bits of language are always parts of systems, we also need to show how it is that items of knowledge and behavior become structured wholes. The book argues that to achieve this, we need to see how causal processes apply in multiple frames or 'time scales' simultaneously, and we need to understand and address each and all of these frames in our work on language. This forces us to confront implications that are not always comfortable: for example, that "a language" is not a real thing but a convenient fiction, that language-internal and language-external processes have a lot in common, and that tree diagrams are poor conceptual tools for understanding the history of languages. By exploring avenues for clear solutions to these problems, this book suggests a conceptual framework for ultimately explaining, in causal terms, what languages are like and why they are like that.