On Monday 15 and Tuesday 16 October, CLAS will host a workshop on computer-supported lemmatisation of Middle French texts. The workshop will bring together specialists in French language, medieval literature, information technology and corpus linguistics from the UK, France and Belgium. Dr Godfried Croenen, who is organising the workshop, explains what it is about:
Lemmatisation is the identification of the root word (the dictionary entry, if you will) for any word form that can be found in a text. It includes linking various verb forms to the infinitive, and recognising plural and/or feminine forms of nouns and adjectives. It also means distinguishing between forms that are spelled the same but are actually different words, such as a noun and a verb. Being able to do this reliably with the help of computers is a major challenge, but it has a number of research applications. It will help us to provide automatic, context-sensitive help with medieval French for users of the Online Froissart website. My colleagues and I will also be looking at how lemmatisation can help us in the large-scale computer analysis of the spelling habits of medieval scribes. This in turn will help us to identify scribes whose handwriting has survived in two or more manuscripts. These manuscripts are often held in different libraries, even in different countries, so it is not easy to compare them visually unless one has high-quality reproductions. Being able to use our knowledge of the spelling habits of scribes will be invaluable.
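For readers who would like to see the idea in concrete terms, the following is a minimal sketch in Python of dictionary-based lemmatisation. It is an illustration only and does not represent the methods to be discussed at the workshop or the tools used by the Online Froissart. The tiny lexicon, the part-of-speech labels and the function name `lemmatise` are all assumptions made for the example.

```python
from typing import List, Optional

# Toy lexicon (hypothetical, for illustration only): each word form maps to
# its possible analyses as (lemma, part of speech) pairs.
LEXICON = {
    "fist": [("faire", "verb")],              # verb form -> infinitive
    "firent": [("faire", "verb")],
    "chevaliers": [("chevalier", "noun")],    # plural noun -> singular
    "bonne": [("bon", "adjective")],          # feminine adjective -> masculine
    "porte": [("porte", "noun"), ("porter", "verb")],  # same spelling, two words
}


def lemmatise(form: str, pos: Optional[str] = None) -> List[str]:
    """Return the candidate lemmas for a word form.

    If a part of speech is given, only the analyses with that part of
    speech are kept; this is how ambiguous forms are narrowed down.
    Forms missing from the lexicon yield an empty list.
    """
    analyses = LEXICON.get(form.lower(), [])
    return [lemma for lemma, tag in analyses if pos is None or tag == pos]


if __name__ == "__main__":
    print(lemmatise("firent"))             # one candidate lemma
    print(lemmatise("porte"))              # ambiguous: two candidate lemmas
    print(lemmatise("porte", pos="verb"))  # narrowed down by part of speech
```

A real system has to cope with the highly variable spelling of medieval texts and has to work out the part of speech from the surrounding context, which is where the real difficulty lies.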