iñagural post


I'm Jeff, and I'm here in an effort to boost readership and controversy on this greatest of all Pseudo-Philological Blogs, possibly by some ACTUAL philology, though that would probably chase off the few and the faithful who have remained.

Style over substance

I prefer my phonological grammars not to be polluted with phonetic substance. That being the case, I'm always interested (read disturbed) when I run into phonological patterns that do not look abstract and cannot be easily explained as a result of systematic misperception or systematic failures in production. One of these is the peculiar cross-linguistic tendency towards a particular vowel pattern in coordinate compounds and echo reduplication. Take the English cases tick-tock or flip-flop (English doesn't have these things very productively, but I'm fishing for a familiar example). In both of these cases, and hundreds of others, the first vowel is high (and front) and the second vowel is low (and back). These could be treated either as coordinate compounds (both flip and flop exist as verbs in the English lexicon) or as echo-reduplication. Similar patterning can be found in A-Hmao (a Hmongic language of China), Jingpho (a Tibeto-Burman language of China, Burma, and India), and many other languages. In A-Hmao nominal reduplication, the first instance of the reduplicated element (the “reduplicant”) must have either /i/ or /u/ as the vowel in the root. In Jingpho coordinate compounds, if one root contains a high vowel, then that root always occurs first in the sequence.
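For the programmatically inclined, the English half of this generalization can be sketched in a few lines of Python. This is only a toy: it uses spelling as a crude stand-in for vowel quality (the letter i for a high front vowel, the letters a and o for low or back ones), and the names `HIGH`, `LOW`, `first_vowel`, and `follows_high_low_pattern` are invented here for illustration.

```python
# Toy check of the high-before-low vowel pattern in English echo pairs.
# ASSUMPTION: orthography is used as a rough proxy for vowel quality;
# a serious test would need phonemic transcriptions.
HIGH = set("i")   # letter standing in for a high (front) vowel
LOW = set("ao")   # letters standing in for low (back) vowels


def first_vowel(word):
    """Return the first vowel letter in the word, or None if there is none."""
    for ch in word.lower():
        if ch in "aeiou":
            return ch
    return None


def follows_high_low_pattern(compound):
    """True if the first half has a 'high' vowel and the second a 'low' one."""
    first, second = compound.split("-", 1)
    return first_vowel(first) in HIGH and first_vowel(second) in LOW


for item in ["tick-tock", "flip-flop", "flop-flip"]:
    print(item, follows_high_low_pattern(item))
```

The reversed order flop-flip fails the check, which is the point: the ordering, and not just the choice of vowels, is what the pattern fixes.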

I find it comforting, on the one hand, that sonority-driven stress cannot explain this kind of pattern. The stress pattern of English echo-reduplication constructions is trochaic (stress goes on the high [low sonority] vowel) while in Jingpho coordinate compounds, stress is iambic and falls on the lower (higher sonority) vowels. I find it distressing, on the other hand, that this kind of thing appears to be part of the grammars of some languages (as is clearly the case in both Jingpho and A-Hmao). Don't get me wrong: as a Berkeleyan, I'm just fine with sound symbolism and all the allied stuff--I just don't want it in my grammar.

My modest (read unoriginal) proposal is this: some devices characteristic of a given speech style may be misinterpreted by language users as aspects of grammar. Thus, there need be no direct interface between phonetics and phonology even in these cases. The fact that there is a robust cross-linguistic pattern describable only in phonetic terms is a fact about stylistics, rather than grammar; the incorporation of stylistic patterns into grammars is a fact of history. The grammar itself is left as a device that performs uniform computations over abstract symbols. In other words, I choose style over substance.

Spelling reform

Apropos all the recent chatter about (German) spelling reform, I have to admit that I have mixed feelings about orthographic reform for English. On the one hand, I should benefit, since I am a terrible speller. In fact, my interest in phonology originated in my endless attempts (during high school) to devise a more phonetically transparent writing system for English. On the other hand, the words that really give me problems, such as “Liberman” and “Phonoloblog” (reduced to phonoblog, apparently through long-distance haplology) are unlikely to be affected by any spelling reform.

Getting pedantic with pedantry I

During my dark time in internet isolation, I seem to have missed a very interesting post by Scott Martens of FFOE and Pedantry which responds--in part--to my “Language is misunderstanding” post. In it, Martens argues for a particular lexicalist model of “grammar” (although he seems not to like that word) and has some interesting things to say about a number of other topics. Although it has been some time since this post was made, I still have a few things to say about it (which will come in installments).

Central to Scott's post is the idea that language is not a collection of words and rules that reside in the minds of speakers, but rather a set of social conventions that reside in a community:
What this means is that English is defined not by a body of rules and a set of words, but as the protocol English speakers use to communicate when they believe they are speaking English. This shifts the definition of English to a definition first of the English-speaking community, and second an explanation of why they identify some communicative acts as speaking English and other communicative acts otherwise. This makes language not a property of individuals but a communal property. It sets the boundaries of what is and isn't English, and what is and isn't language, where it belongs: in the field of socially constructed categories.

I think it is undeniable that languages are social constructs, just like any other part of culture. The idea that language is purely a social phenomenon is problematic, however.

Supposing that we define English as Scott does, consider the following thought-experiment: take a group of monolingual pre-adolescents from the American heartland, who are not aware of a distinction between speaking and speaking English (anecdotally, I know that such kids exist). Transport them to a desert island. Leave them there for a generation, where they form their own isolated community. Will the language spoken by their children be English? Certainly, it will share many of the formal properties that we recognize as characteristics of English, and would be mutually intelligible with most varieties of English. However, the speakers would neither be part of the larger English-speaking community, nor would they be aware that they were speaking English.

Now take a more extreme example: send a single individual of the type we have described to a desert island, sentenced to a life of isolation. Would language cease to exist in this individual? If personal experience is any guide, they are likely to continue speaking to themselves in their native language, both vocally and subvocally. This language is likely to retain almost all of the properties we identify as “English” even though it has ceased to function as a social medium. I do not believe that the social definition of a language that Scott gives us quite captures what we mean when we talk about languages. In fact, I have to concede that Chomsky has a valid point when he points out that most of language is not spoken or written, but instead is the substance of internal monologues. There is something about language that is not social, but cognitive (a fact which Scott accepts as well). Scott denies, however, that there is any body of knowledge that can be called language, labelling the information about language stored by speakers as the lexicon. However, I think that his species of lexicon (strikingly similar to the Constructicon in some versions of Construction Grammar) is closer to the conventional sense of the word language than his social definition. I think we both agree that the solipsist-idealist view of language that characterizes much of Chomsky's (and his fellow travellers') thought on the subject offers little hope of really understanding why languages are structured the way they are, or why they change in the ways that they do.

What am I saying, then? Arguing whether language is a social phenomenon or a property of the minds of individuals is pointless, because it is clearly both, in the same sense that all social phenomena are constrained by the facts of individual cognition. Our theories about linguistics are constrained by these same facts, which facts explain, for example, why we don't do linguistic description with neural nets. Linguistics, as a discipline, will only be mature when the majority of its practitioners realize that there can be no single, God's-eye model that answers the ends of all linguistic investigations, be they social, formal-computational, or cognitive.

Not my job

My recent move thrust me into a broadband-free wilderness, and my resulting internet withdrawals had become so severe (after a couple of days) that I now write to you through the mediation of a stolen AOL CD. As I attempted to get a handle on what had happened in the lingosphere during my days of submersion, I noticed a very interesting discussion on overgeneration over at phonoblog. To this, I would add my two bits.

I have to agree with Charles Reiss's terminology: simply waving one's hands over overgeneration is a “cop-out”. The goal of phonologists should be to explain all of the sound patterns in language. As a linguist who prefers his phonology “hold-the-substance”, I believe that much of this explanation must reside in domains outside of the grammar (but certainly, not outside of language). As Reiss rightly notes, this note has been pounded by John Ohala for years, and as John Ohala will tell you, this idea predates Ohala's work by many generations. Perceptual and articulatory factors feed sound changes and ensure that certain types of patterns will be very common, while others will be quite rare. The inherent computational properties of the grammar must also constrain the set of possible phonological patterns, making certain conceivable grammars uncomputable or unlearnable. Given these two modes of explanation, it should be acceptable to excuse one's grammatical model for the sin of overgeneration only when some extragrammatical factor can explain the gaps.

On the other hand, I do not think that evidence from typological gaps is as compelling as is conventionally assumed. First of all, the number of human languages for which we have reliable and complete phonological descriptions is not large. Secondly, there are very good reasons--discussed by Johanna Nichols among others--for believing that the human languages that currently exist, and those that have existed historically, do not form a representative sample of possible human languages. The idea is that there are evolutionarily plausible and computationally possible languages that have never existed simply because human linguistic history has not yet played out all of its options. If we accept these two propositions, we should assume that many typological gaps are purely accidental. Frankly, we have to believe in the possibility of a lot of things for which we have no direct evidence, just as we have to believe that Proto-Indo-European was a possible human language even though it is completely unattested.

Chomsky and Moving

I have been blogging lightly (not at all) and will continue to do so until I finish moving (from within siren-shot of People's Park to quiet Kensington). In the meantime, you should definitely check out this post (about a new paper by Pinker and Jackendoff) over at Semantic Compositions if you haven't already (by way of Ryan Gabbard at the Audhumlan Conspiracy).

I don't think any self-respecting Minimalist could respect the way I'm moving, but we can agree that my doing so has nothing to do with communication ;-)

Good news for this blog

In celebration of the one week anniversary of this blog, it will change from a solo blog to a group blog. Jeff Pynes, a classics student turned linguist, and one of the originators of the “It's Ablaut Time” label, has returned from two years in Honduras and will start blogging here. I hope you'll extend as warm a welcome to Jeff as I have received.

Good news for Berkeley

...and sad news for Ohio. Official rumor has it that Keith Johnson will be joining the Berkeley linguistics department in January of 2005. That means we'll be getting two amazing new faculty members this year: Line Mikkelsen this fall and Keith Johnson in the spring. It's almost enough to make me want to stick around Berkeley for a while ;-)

Morphologically Conditioned Sound Change II

A few posts back, I looked at a new case of what looks, on the surface, like morphologically conditioned sound change. Homophonous noun and verb prefixes *m- and *m- developed into Moyon Naga bʌ- and ŋ- respectively (the verb prefix is a syllabic nasal that assimilates to the following consonant in place of articulation, but which is realized as a velar nasal before glottals and vowels). What gives?
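The allomorphy of the verb prefix described above (assimilate in place to a following consonant, surface as a velar nasal elsewhere) is simple enough to state as a small function. What follows is a minimal sketch, not a description of Moyon: the consonant table is a made-up illustrative inventory, the roots other than khow are hypothetical, and the names `PLACE`, `NASAL`, and `prefix_nasal` are invented here.

```python
# Sketch of the syllabic nasal prefix: place assimilation before oral
# consonants, velar nasal before glottals and vowels.
# ASSUMPTION: this consonant table is illustrative only, not Moyon's
# actual inventory.
PLACE = {
    "p": "labial", "b": "labial", "m": "labial",
    "t": "coronal", "d": "coronal", "n": "coronal", "s": "coronal",
    "k": "velar", "g": "velar", "ŋ": "velar",
    "ʔ": "glottal", "h": "glottal",
}
NASAL = {"labial": "m", "coronal": "n", "velar": "ŋ"}


def prefix_nasal(root):
    """Return the surface shape of the nasal prefix before the given root."""
    place = PLACE.get(root[0])
    # Glottals and vowels (and anything else missing from the toy table)
    # fall through to the elsewhere case, the velar nasal.
    return NASAL.get(place, "ŋ")


for root in ["khow", "pa", "ta", "ha", "a"]:
    print(prefix_nasal(root) + root)
```

Treating the velar nasal as the elsewhere form, rather than listing glottals and vowels as a special environment, is a design choice of the sketch; the description above is compatible with either statement.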

Well, as entangledbank pointed out, there are at least three possible culprits (assuming we want to avoid the idea that sound changes are morphologically conditioned): stress differences, analogy, and affixation.

We could suppose, for example, that nouns received stress on the penultimate syllable while verbs received stress on the final syllable (not unlike some word pairs in English). Stress might have served as an environment for the denasalization or fortition of *m-, as well as the strengthening of the anaptyctic vowel in the prefix. Unstressed syllabic *m- might, around the same time, have become *ŋ-. This stress would have to be traced, originally, either to analogy or some sort of stress-shifting affixation. Such a state of affairs should never come about independently through Neogrammarian sound change. One problem with this hypothesis is the fact that related languages show no evidence for such a stress difference, so the whole argument would have to rest upon an internal argument based upon these facts. The second problem is that *m doesn't become b in roots of monosyllabic words, where it would invariably fall in the stressed syllable (e.g. PTB *mik > Moyon mik). This fact ends up being a problem for several possible hypotheses, and tends to make sound change look less attractive and analogy more attractive.

So what do I think? I believe that affixation is ultimately to blame. In a number of languages related to Moyon (including several of the Tangkhul languages) all nouns receive a prefix ʔ- or ʔa- except in certain morphosyntactic environments (e.g. the obligatory possession construction). While the glottal stop is usually followed by an anaptyctic vowel, the loss of this vowel would result in the creation of glottalized sonorants where that prefix immediately preceded liquids or nasals. Stress then acted as a conditioning environment: in stressed roots, glottalized sonorants became plain sonorants, but in unstressed prefixes, glottalized sonorants became voiced plosives. Later, anaptyctic vowels were lost after sonorants (if they ever existed). Thus *m-ka > **ʔəməkha > **ʔməkha > Moyon *bʌkha ‘chin’ but *m-kaw (approximately) > **məkhow > Moyon ŋkhow ‘cough’.

Stupid Linguist Jokes

If you've found your way to my blog, you've probably heard some version of the “all odd numbers are prime” joke before, but here is an expanded version.

UPDATE: Mark Liberman over at Language Log has disclosed the true purpose of this posting: to get linguists more lunch invitations.

UPDATE: Just realized I mispelled Prof. Liberman's name.


Lately, as part of my research project on the languages of the Tangkhul people (Tangkhuls are a Naga group residing mostly in Northeastern Manipur, India) I've been reading a lot of documents written by British officers and colonial administrators who were working in what is now northeastern India during the 19th century. While I cannot approve of the imperialist agenda which they sought to advance, I cannot help but be impressed by the erudite scholarship, insatiable curiosity, and undaunted resourcefulness with which they pursued it. Officers like Captain Gordon and Major W. McCulloch energetically gathered ethnographies and linguistic data from a dizzying array of people groups in the Valley of Manipur and elsewhere. Most of these peoples and languages are still intact, but many have remained unstudied since the Victorian era.

While Political Agent, McCulloch put this ethnolinguistic knowledge to use during the Kuki invasion of Manipur (knowing that the Kukis were motivated by pressure from more powerful Chin tribes to the south) and, through some skillful negotiations, saved the throne of Manipur from crisis while at the same time enlisting the invading Kukis in the service of the British Empire. Those who worry that the United States is trying to bring back those old colonial days may draw some comfort from the fact that my nation has never been known for producing scholar-officer-administrators of McCulloch's cast.

Language is misunderstanding

I used to think we could talk intelligently about the grammars of languages by starting with the assumption that grammars are designed for communication. The more I look at actual languages, the less I believe that this is the case. While languages obviously serve as media of communication, they are in many ways ill-suited to this task. Grammars are too complex, too byzantine, too intricate, and indeed too beautiful, to be optimal codes for communication. Think of the convoluted tone rules of some Bantu languages, or nominalizations of Kuki-Chin languages that express with complicated systems of ablaut, consonant mutation, and tone alternations what other languages express as well with a simple affix. These formal filigrees, I would argue, are not there to serve some communicative function, but simply because speakers of languages assume that any pattern they can detect in their language is an essential part of the linguistic code which, if not maintained in their own speech, would lead to them being outed as the language bluffers that they are. These include not only robust patterns that form the backbone of the shared linguistic code, but also chimerical patterns that speakers apprehend based upon their own misperceptions and mistaken inferences. It is in these understandable misunderstandings that the seeds of language change lie, or so I and my ilk would like you to believe.

But what about salutary innovations? Can changes that improve languages (by allowing them to express some new semantic or pragmatic distinction, for example) be traced to failures? While this point of view seems pessimistic, the role of mistaken inferences in adding diversity to the linguistic pool is essentially analogous to the role of mutations in adding diversity to the genetic pool (a point made powerfully by Juliette Blevins's new book Evolutionary Phonology). In both cases, the mistransmission of a code adds to a pool of choices from which other factors (environmental factors/learnability) differentially select. The take-home message is that even optimizing changes in language are a product of our inability to completely understand one another.

UPDATE: Languagehat adds:
for many of us it is precisely the intricate, byzantine bits that are a primary attraction. I've never been able to work up any interest in Esperanto and the other simplified languages, despite their theoretical value for easy communication, because they're too damn boring. If I can't have irregular verbs, I'd rather grunt and point.
I like irregular verbs too, but not as much as noun-classes ;-) Not everyone is agreed on this point, however, and Zizka comments
So anyway, I ended up loving Chinese, with its pidgin-like grammar. And I will always blame German noun declensions for Hitler. So I guess I disagree.
I'll agree that (Mandarin) Chinese is pidgin-like in its way (not surprising given its semi-artificial status) but that it nevertheless has a certain charm. However, I wouldn't want to spend my career studying it. Further, Zizka may be right that morphological complexity produces genocidal maniacs: Hitler may have had to learn German, but Stalin had to learn Georgian. By this rule of thumb, we had better keep our eyes on those Athabaskans.

Harmless Drudges

As Languagehat pointed out in his welcome post for “It's Ablaut Time”, I work for the Sino-Tibetan Etymological Dictionary and Thesaurus (STEDT). It's fun, but during my time at STEDT, I have learned that one of the worst responses you can give to the date question “So what do you do?” is, “I work on dictionaries.” Lexicographers are, I think, basically seen as accountants, only without the glamorous and quick-paced lifestyle. Oh, and with smaller incomes.

Morphologically Conditioned Sound Change?!?

There has been a relatively long and deep consensus (excusing certain malcontents like Yakov Malkiel) that sound changes are never morphologically conditioned. If a sound change appears only in a certain morphological environment, there must have been some phonological conditioning factor there at the time the sound change took place, even if we can't see it now. Knowing this, I'm a bit troubled by something I discovered today. I was comparing data from a language I just did some elicitation on (Sorbung, a previously undocumented Tibeto-Burman language of Manipur, India) and Moyon Naga (which appears to be more closely related to Sorbung than any of the other languages I've looked at). These languages both have lexical prefixes (usually with no transparent function) associated with noun and verb roots. In nouns, we find the following correspondence:
We find a different correspondence in verbs:
Looking at these data, the default assumption would be that these two prefixes (one a noun prefix and the other a verb prefix) were originally phonologically distinct, but have become homophonous through some set of sound changes in Sorbung, while in Moyon, they have remained distinct. The problem with this assumption is that at least some of these prefixes are supposed to reflect homophonous Proto-Tibeto-Burman prefixes, as in *m-ka ‘jaw’ and *m-nwəy ‘laugh’. So what's the deal? Are the reconstructions wrong? Is this really a morphologically conditioned sound change?

I think I might have an answer. If anyone's interested, I'll share it later.

Contour contexts

I've been talking to Larry Hyman recently about contour tones. Actually, talking to Larry about tones is not new, but one recent topic is: what is the “optimal” environment for a contour tone? Larry's reasoning goes as follows: a falling tone is optimally perceived between a L and a H (L.HL.H)--that the middle syllable bears a contour is completely unambiguous in such cases. However, L.HL.H sequences are suboptimal from the standpoint of production, since they require the speaker to do lots of glottal acrobatics (two jumps in pitch, and one glide). A H.HL.L sequence is much easier to articulate--optimal from the standpoint of production, since only one pitch modulation is required--but is perceptually quite ambiguous. The question Larry poses is, “Which consideration wins when?”

I have some thoughts on this. It is possible to recast Larry's proposal in Ohalaesque terms. Speakers are likely to misarticulate L.HL.H sequences quite often in ordinary speech, producing perhaps [L.L.H], or [L.H.H] while intending /L.HL.H/. If this happens often enough, listeners are apt to “undercorrect”, assume that this new pattern is the correct one, and thus start producing tokens of this type intentionally.

For sequences of the type H.HL.L, speakers are given lots of room to assume that the contours in correctly produced tokens are the result of timing errors. [H.HL.L] could be either /H.L.L/, with the H lasting longer than the syllable with which it is associated or /H.H.L/, with the final L being anticipated on the preceding syllable. If speakers are aware that lags in tone production are more common than anticipation effects, then we would predict that the first scenario would be more common: intended /H.HL.L/ is hypercorrected as /H.L.L/.

So who wins? Note that neither of the kinds of changes predicted by this model results in an increase in the number of pitch modulations. That is to say, in some sense, that production wins, since misperception is as likely as not to result in articulatorily more optimal forms by reinterpreting contours as timing errors. What if this prediction turns out to be empirically wrong? What if there are, in fact, many cases where tonal alternations add additional transitions just in case they enhance the perceptibility of contours? I think this would be important evidence that the substance-free version of phonology that I favor is on the wrong track.
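The modulation counts behind this argument are easy to check mechanically. Here is a minimal sketch, assuming a tone sequence is written as dot-separated syllables over the symbols H and L, and that every change between adjacent pitch targets (jump or glide) counts as one modulation; the function names `pitch_targets` and `modulations` are invented for illustration.

```python
# Count pitch modulations in a tone sequence such as "L.HL.H".
# ASSUMPTION: each change between adjacent pitch targets counts once,
# whether it is a jump across a syllable boundary or a glide within one.
def pitch_targets(sequence):
    """Flatten 'L.HL.H' into the list of targets ['L', 'H', 'L', 'H']."""
    return [tone for syllable in sequence.split(".") for tone in syllable]


def modulations(sequence):
    """Number of adjacent target pairs that differ in pitch."""
    targets = pitch_targets(sequence)
    return sum(1 for a, b in zip(targets, targets[1:]) if a != b)


# Source sequence -> reinterpretations discussed above.
changes = {
    "L.HL.H": ["L.L.H", "L.H.H"],
    "H.HL.L": ["H.L.L", "H.H.L"],
}
for source, outcomes in changes.items():
    for outcome in outcomes:
        print(source, modulations(source), "->", outcome, modulations(outcome))
```

On these assumptions L.HL.H has three modulations and H.HL.L has one, and every reinterpretation listed has exactly one, so none of the predicted changes adds a modulation.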

Unicode IPA Input

Just saw the new Unicode IPA Input tool Charwrite mentioned on Language Log. This tool looks very promising, and I can think of one web application, in particular, in which I plan to incorporate this nice javascript. As Mark Liberman mentions, the handling of diacritics is still not all one could hope for, and various types of browser font-weirdness complicate the design of these things (Charwrite works great for me in Firefox, but has font troubles in IE).

Speaking of Unicode IPA input, since being forced to become a Windows user (Linux, I still love thee), I have developed a great fondness for Tavultesoft's Keyman keyboard manager and the Unicode IPA keymap for Keyman that comes with SIL's Doulos Unicode font (beta). Doing linguistics on Linux and other Unix-like operating systems would be much easier, I think, if there were an input system of this kind for X11.

You may notice that there is nothing popular about this blog, and that it has featured little if any philology (at least in the modern sense of the term). Too bad: we had already settled on the subtitle before I even knew what a blog was. ;)

Incorporation Alert

I like compounding so it follows that I love Southeast Asian languages. Many of the Hmong-Mien and Tibeto-Burman languages I work on have nifty verb-object compounds that look, in some ways, suspiciously like noun incorporation. However, few people seem to have made this connection, and there is probably a good reason for this. Given that these languages are not exactly champions when it comes to affixal morphology, it could be quite difficult to distinguish noun-incorporation from less interesting types of compounding or even simple collocation.

Be that as it may, in Southeast Asian languages, one does not say a person is deaf. Instead, one says that they are ‘ear deaf’ or ‘deaf-eared’. Thus, in Hmong (Mong Leng):
‘That person is deaf.’

And it's not just Hmong. In a fairly normal Tangkhul language called Kachai, the (nominalized) citation form for ‘be deaf’ is nɐ́-kə̄-pā ear-NOM-deaf ‘deaf-eared’. Here, the fact that the nominalization prefix attaches directly to the verbal root, rather than the compound as a whole, makes this look less compound-like (and probably even less like incorporation). On the other hand, the bare root nɐ́ never occurs in isolation--it always bears a noun class prefix outside of compounds. In another Tangkhul language called Huishu, something interesting happens: the word for ‘be deaf’ is khə̀-nì nì-kə̀-khɐ́ CLASS-ear ear-NOM-deaf ‘ears are deaf-eared’. Here, the same compound or collocation contains both the bound root form for ‘ear’ and the free form for ‘ear’ (class prefix + root). It will be interesting to see if this is simply an idiosyncrasy of this particular form, or if this type of construction is found more generally in Huishu.

Language is bluffing

It strikes me that a huge number of insights into linguistic phenomena can be derived from a few relatively simple propositions. One of these is the observation that language is a code employed only by code-breakers: that none of us knows the language we speak as a fully explicit system. Instead, we bluff our way through, filling in the gaps in our knowledge of the code with an inference here and a leap of logic there. This capacity to extrapolate from the known to the unknown is, in essence, grammar. If these inferences follow naturally enough from the parts of the code everyone around us agrees upon, they are incorporated into it. If they don't follow at all from shared knowledge of the code, we come off looking inarticulate. The interesting thing is that the parts of the code we all agree upon were, at some point in the past, somebody's bluff.

Over the lips, through the gums...

It now seems to be a requirement of the field that linguists--like law professors, philosophers, and people with too much spare time--have to have a blog. The title I've chosen is one that a couple of friends of mine once kicked around for a (humorous) magazine. It expresses at least three themes that will come up on this blog: historical linguistics, non-concatenative morphological processes, and stupid puns.