LLOD (Linguistic Linked Open Data) is a generic name for a set of language resources that are mutually connected through ontological relations. The connections between concepts, and between concepts and their expressions in natural language, make these resources suitable for both research and industrial applications in content analysis, natural language understanding, (language- and knowledge-based) inferencing and other tasks. The concrete work in this task is to convert the SynSemClass project dataset (in part a result of a previous Humane AI Net microproject called META-O-NLU) into LLOD, connecting it to the huge amount of interlinked data already available. One partner is involved in the Prêt-à-LLOD H2020 project, which makes this project synergistic: it builds on and multiplies the results of previous projects. Partners are also involved in the COST Action “European network for Web-centered linguistic data science” (NexusLinguarum).

Output

– SynSemClass (at least 1000 classes) in the LLOD / PreMON / OntoLex-Lemon ontological model, to be integrated into the LLOD, covering 4 languages (CZ, EN, DE and ES)

– Tools for converting the original XML format to RDF and OWL, and for editing and checking the result

– Paper at a major 2023 conference (*ACL, *AI, LREC, ISWC, LDK…) or dedicated workshop (LAW, *SEM, Linked Data in Linguistics (LDL)), or in the LRE Journal or the Semantic Web Journal

Project Partners:

  • Charles University Prague, Jan Hajič
  • German Research Centre for Artificial Intelligence (DFKI), Thierry Declerck

 

Primary Contact: Jan Hajič, Charles University