Projects

Projects developed by osca.dev

Calelh

Domenge published on
4 min, 748 words

https://calelh.osca.dev

Calelh presentation

The Calelh application is a digitized version of Louis Alibert's dictionary. A digital edition gives new shine to the linguist's remarkable work. His dictionary is currently available only in a facsimile edition; however, a digital version of the second part (entries and definitions) was typed up by the Paul Valery University of Montpellier, and that data was used to populate the database.

The first part of the dictionary is the booklet, which we typed up ourselves as part of the Calelh project.

Alibert's work concentrates on listing lemmas as entries. Each lemma is expanded into all the terms produced from it by derivation or composition. Data processing starts from the lemma and develops the terms it produces into an ontological structure.
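
As a minimal sketch of this idea, assuming a simple tree model, a lemma can be represented as a root node whose children are the terms produced from it by derivation or composition. The `Term` class, its fields, and the placeholder forms are hypothetical and are not taken from the Calelh code base or from the dictionary.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Term:
    """A node in the lemma tree (hypothetical model, not Calelh's schema)."""

    form: str
    relation: str = "lemma"  # "lemma", "derivation" or "composition"
    children: List["Term"] = field(default_factory=list)

    def add(self, form: str, relation: str) -> "Term":
        """Attach a term produced from this one and return it."""
        child = Term(form, relation)
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Term"]:
        """Yield this term, then every term produced from it, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


# Placeholder forms; real entries would come from the dictionary data.
lemma = Term("lemma-a")
derived = lemma.add("lemma-a-derived", "derivation")
derived.add("lemma-a-derived-compound", "composition")
lemma.add("lemma-a-compound", "composition")

forms = [term.form for term in lemma.walk()]
```

Walking the tree from the lemma lists every term it produces, which is one possible way to turn an entry into a structure that can be stored and queried.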

The booklet (first introductory pages)

There is plenty of information in the first pages of a dictionary.

Those pages are organized according to an ensenhador (a table of contents). The site mimics the book by following the same organization.

The booklet has four parts:

  • Phonetic mutations of the Lengadocian dialect, a part that also sets out its linguistic terminology;
  • Morphology (how popular Occitan words are formed);
  • How Greek and Latin words are used to form scientific and scholarly Occitan words;
  • The list of abbreviations, grouped by type.

Throughout the booklet, the text is enriched to ease reading by highlighting the recommended forms and distinguishing them from the other forms in use. Internally, the markup language and CSS help to isolate and mark the terms for easy, reliable extraction.
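
To illustrate how class-based markup can support extraction, here is a minimal sketch using Python's standard `html.parser` module. The class names `recommended` and `attested`, the sample HTML, and the `TermExtractor` helper are assumptions made for this example; the actual markup used by Calelh is not shown here.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class TermExtractor(HTMLParser):
    """Collect the text of elements carrying one of the target CSS classes."""

    def __init__(self, classes):
        super().__init__()
        self.classes = set(classes)
        self.current: Optional[str] = None  # class of the element being read
        self.depth = 0  # nesting depth inside the marked element
        self.buffer: List[str] = []
        self.terms: List[Tuple[str, str]] = []  # (class, text) pairs

    def handle_starttag(self, tag, attrs):
        if self.current is not None:
            # Nested element inside a marked term. Void tags written
            # without a closing slash are not handled in this sketch.
            self.depth += 1
            return
        for name in (dict(attrs).get("class") or "").split():
            if name in self.classes:
                self.current, self.depth, self.buffer = name, 1, []
                break

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if self.current is None:
            return
        self.depth -= 1
        if self.depth == 0:
            self.terms.append((self.current, "".join(self.buffer).strip()))
            self.current = None


# Hypothetical markup: class names and forms are placeholders.
sample = (
    '<p>Use <span class="recommended">form-a</span> rather than '
    '<span class="attested">form-b</span>.</p>'
)
extractor = TermExtractor({"recommended", "attested"})
extractor.feed(sample)
extractor.close()
terms = extractor.terms
```

Because each term is wrapped in an element with a dedicated class, the extractor only needs the class names to separate recommended forms from the others, without inspecting the surrounding prose.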

Types of abbreviation

  • POS for part of speech,
  • LOC for the word's location,
  • STRUCT to qualify the structure of the definitions,
  • ACCEPTION to disambiguate the different meanings,
  • META for contextual information not yet exploited.
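
The five types above could be modelled as an enumeration, so that each abbreviation is tagged with exactly one type. This is only a sketch: the `AbbreviationType` enum, the `group_by_type` helper, and the placeholder abbreviations are hypothetical, and the real list of abbreviations is the one given in the booklet.

```python
from collections import defaultdict
from enum import Enum
from typing import Dict, List


class AbbreviationType(Enum):
    """The five abbreviation types listed in the booklet."""

    POS = "part of speech"
    LOC = "location of the word"
    STRUCT = "structure of the definitions"
    ACCEPTION = "disambiguation of meanings"
    META = "contextual information"


# Placeholder abbreviations, for illustration only.
ABBREVIATIONS: Dict[str, AbbreviationType] = {
    "abbr-1": AbbreviationType.POS,
    "abbr-2": AbbreviationType.LOC,
    "abbr-3": AbbreviationType.POS,
}


def group_by_type(
    table: Dict[str, AbbreviationType]
) -> Dict[AbbreviationType, List[str]]:
    """Group abbreviations under their type, keeping insertion order."""
    groups: Dict[AbbreviationType, List[str]] = defaultdict(list)
    for abbreviation, kind in table.items():
        groups[kind].append(abbreviation)
    return dict(groups)


groups = group_by_type(ABBREVIATIONS)
```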