- Funded by:
- Enterprise Ireland
- Project Leaders:
- John Dunnion, Joe Carthy, Dr. Nick Kushmerick (Dept of Computer Science, University College Dublin)
- Principal Researcher:
- Shazia Akhtar
- Supervisors:
- Prof. Ronan Reilly (Dept of Computer Science, NUI Maynooth)
- John Dunnion (Dept of Computer Science, University College Dublin)
- Description:
- The XML markup system is fully automatic and is inspired by the WEBSOM method and the C5/See5 machine learning algorithm. Using the WEBSOM method, the system clusters marked-up documents so that semantically similar documents lie close together on a Self-Organizing Map (SOM). It then employs the inductive learning algorithm (C5/See5) to automatically learn mark-up rules from the nearest SOM neighbours of an unmarked document and apply them to that document. The system also learns from its mark-up errors in order to improve accuracy. The automatically marked-up documents it produces are placed on the Self-Organizing Map for better management and retrieval.
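
The neighbour-selection step described above can be illustrated with a small sketch. This is not the project's code and not the WEBSOM method itself: it assumes a plain rectangular SOM trained on bag-of-words counts, and every name in it (`vectorize`, `train_som`, `nearest_marked_up`, the file names and the sample texts) is hypothetical. The rule-learning step with C5/See5 is not shown; the sketch only finds the marked-up documents from which such rules would be learned.

```python
import math
import random


def vectorize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary (an assumed encoding)."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(grid, vec):
    """Grid cell whose weight vector is closest to vec."""
    return min(grid, key=lambda cell: sq_dist(grid[cell], vec))


def train_som(vectors, rows, cols, epochs=50, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                             # learning rate decays
        radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5   # neighbourhood shrinks
        for vec in vectors:
            bmu = best_matching_unit(grid, vec)
            for cell, weights in grid.items():
                grid_d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                influence = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for i in range(dim):
                    # Pull this cell's weights toward the input vector.
                    weights[i] += rate * influence * (vec[i] - weights[i])
    return grid


def nearest_marked_up(grid, marked, vocab, unmarked_text, k=2):
    """Names of the k marked-up documents whose map cells lie closest
    to the cell of the unmarked document."""
    target = best_matching_unit(grid, vectorize(unmarked_text, vocab))

    def map_distance(name):
        cell = best_matching_unit(grid, vectorize(marked[name], vocab))
        return (cell[0] - target[0]) ** 2 + (cell[1] - target[1]) ** 2

    return sorted(marked, key=map_distance)[:k]


# Hypothetical text content of already marked-up documents.
marked = {
    "recipe_a.xml": "mix flour sugar butter then bake the cake",
    "recipe_b.xml": "knead flour water yeast then bake the bread",
    "memo_a.xml": "meeting agenda budget report for the board",
    "memo_b.xml": "budget report and staff meeting minutes",
}
vocab = sorted({word for text in marked.values() for word in text.split()})
grid = train_som([vectorize(t, vocab) for t in marked.values()], rows=3, cols=3)
neighbours = nearest_marked_up(grid, marked, vocab, "bake bread with flour and butter")
print(neighbours)
```

In a fuller pipeline, the documents named in `neighbours` would serve as training examples for the rule learner, and the newly marked-up document would then be added to the map.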