Semantic tagging of and enhancements to published texts
Please visit the jubilee issue ZooKeys 50, dedicated to semantic approaches in biodiversity publishing and dissemination.
The Internet, and especially Web 2.0 technologies and the related semantic Web (http://en.wikipedia.org/wiki/Semantic_Web), have stimulated the development of radical new models of publication, dissemination, reading and analysis of scientific content. Semantic tagging is generally considered to be a method of assigning markers, or tags, to text strings to identify their meaning, so that the string and its meaning can be made discoverable and readable not only by humans but also by computers.
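As a rough illustration of what assigning a tag to a text string means in practice, the sketch below wraps known taxon names in an element so that software can locate them. The function name, the `<taxon-name>` tag and the sample sentence are assumptions made for this example; they are not taken from PMT or from any particular schema.

```python
import re

def tag_taxon_names(text, taxon_names):
    """Wrap each known taxon name found in `text` in a <taxon-name> element.

    The tag name is illustrative only (an assumption for this sketch).
    """
    if not taxon_names:
        return text
    # Try longer names first so a full binomial wins over a bare genus name.
    names = sorted(taxon_names, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: f"<taxon-name>{m.group(1)}</taxon-name>", text)

if __name__ == "__main__":
    sentence = "Specimens of Apis mellifera were collected in Bulgaria."
    print(tag_taxon_names(sentence, ["Apis mellifera"]))
```

Once the name is wrapped in a tag, a program can extract every taxon mentioned in a paper without having to guess which words are names.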
Pensoft makes continuous efforts to develop and implement innovative methods of mark up in academic publishing. We have designed and developed the Pensoft Mark Up Tool (PMT). The tool provides the following operations:
- Importing and retrieval of XML, HTML and Adobe InDesign files
- Interlinking options between PMT and InDesign allowing simultaneous mark up and editorial work
- Tagging and autotagging at different levels of granularity, according to TaxPub or any other XML schema designed for such purpose
- Cross-linking of citations within the text and reference list
- Finding and linking taxon names through www.uBio.org and PMT’s own web harvester
- Providing several links to various external sources
- Exporting the text to a semantically enhanced HTML version of the paper, visualizing some of the important tag elements, as well as the literature references cited in the text and external links to them (when available)
- Mapping localities listed in the papers or within separate taxon treatments
- Generating the Pensoft Taxon Profile page for each taxon name cited in a paper, providing the reader with a quick, up-to-date summary of information on a taxon from certified external sources
- Exporting to a TaxPub XML file (http://sourceforge.net/projects/taxpub), validated for archiving in PubMed Central and indexing in PubMed
- XML export of new species descriptions to Encyclopedia of Life
- XML export of treatments or any other tagged information in various formats accepted by aggregators and indexers, e.g. www.Plazi.org
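One of the operations listed above, cross-linking citations in the text with the reference list, can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of how PMT does it: the citation pattern ("Surname Year" or "Surname, Year"), the function name and the anchor ids are all hypothetical.

```python
import re

# Assumed citation style for this sketch: "Smith 2009" or "Smith, 2009".
CITATION = re.compile(r"\b([A-Z][A-Za-z'-]+),? (\d{4})\b")

def link_citations(text, references):
    """Turn in-text citations into HTML links to reference-list anchors.

    `references` maps (surname, year) to a hypothetical anchor id.
    Citations with no matching reference are left unchanged.
    """
    def replace(match):
        anchor = references.get((match.group(1), match.group(2)))
        if anchor is None:
            return match.group(0)
        return f'<a href="#{anchor}">{match.group(0)}</a>'
    return CITATION.sub(replace, text)

if __name__ == "__main__":
    refs = {("Smith", "2009"): "ref1"}
    print(link_citations("As noted by Smith 2009 and (Jones, 2010).", refs))
```

Leaving unmatched citations untouched keeps the text readable even when the reference list is incomplete.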
The Semantic Web could also be called a “linked Web” because most semantic enhancements are in fact provided through various kinds of links to external resources. The results of these linkages will be visualized in the HTML versions of the published papers through various cross-links within the text and more particularly through the Pensoft Taxon Profile (PTP) (http://ptp.pensoft.eu). PTP is a web-based harvester that automatically links any taxon name mentioned within a text to external sources and creates a dynamic web-page for that taxon. PTP saves readers a great amount of time and effort by gathering for them the relevant information on a taxon from leading biodiversity sources in real time.
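The idea of linking a taxon name to several external sources can be illustrated with a short sketch. It only builds one URL per source from a name; the source keys, the example.org URL templates and the function name are placeholders assumed for this example and do not describe the actual sources or implementation of PTP.

```python
from urllib.parse import quote

# Hypothetical URL templates; real services would have their own patterns.
SOURCE_TEMPLATES = {
    "source_a": "https://source-a.example.org/search?q={name}",
    "source_b": "https://source-b.example.org/taxon/{name}",
}

def build_profile_links(taxon_name, templates=SOURCE_TEMPLATES):
    """Return a mapping of source key to a URL for the given taxon name."""
    # Percent-encode the name so spaces and special characters are URL-safe.
    encoded = quote(taxon_name)
    return {source: template.format(name=encoded)
            for source, template in templates.items()}

if __name__ == "__main__":
    for source, url in build_profile_links("Apis mellifera").items():
        print(source, url)
```

A profile page could then present these links, or the content retrieved from them, grouped by source.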
A substantial feature of the semantic Web is open data publishing, where not only analysed results but also original datasets can be published as citable items, so that the data authors may receive academic credit for their efforts. For more information, please visit our detailed Data Publishing Policies and Guidelines for Biodiversity Data.