The Tanl Spanish Pipeline is a Web service for:

  • extracting named entities from Spanish texts;
  • parsing Spanish texts and producing parse trees according to the Tanl Dependency Notation.

The service uses tools from the Tanl Suite to:

  • split text into sentences;
  • tokenize sentences;
  • extract lemma, Part-of-Speech and morphology for each token;
  • extract named entities;
  • build the dependency trees.

The tools are chained together, each consuming the output of the previous one, to form a Tanl Data Pipeline.
The pipeline uses the Tanl NER for named entity extraction and the DeSR parser for dependency parsing. The pipeline architecture was developed as part of the SemaWiki project.
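
To illustrate the idea of connecting stages into a data pipeline, here is a minimal sketch in plain Python. It does not use the Tanl Suite API: every name in it (split_sentences, tokenize, tag_tokens, run_pipeline) is hypothetical, the splitter and tokenizer are naive regular expressions, and the tagging stage is only a placeholder that leaves lemma and Part-of-Speech empty. It shows only how each stage can consume the stream produced by the previous one.

```python
import re

def split_sentences(texts):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    for text in texts:
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                yield sentence

def tokenize(sentences):
    """Naive tokenizer (assumption): runs of word characters or single punctuation marks."""
    for sentence in sentences:
        yield re.findall(r"\w+|[^\w\s]", sentence)

def tag_tokens(sentences):
    """Placeholder for the lemma/POS stage: attaches empty fields to each token."""
    for tokens in sentences:
        yield [{"form": tok, "lemma": None, "pos": None} for tok in tokens]

def run_pipeline(source, stages):
    """Chain the stages lazily: each one wraps the stream of the previous one."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream

# Usage: two Spanish sentences flow through three hypothetical stages.
result = list(run_pipeline(
    ["Hola mundo. ¿Cómo estás?"],
    [split_sentences, tokenize, tag_tokens],
))
for sentence in result:
    print([token["form"] for token in sentence])
```

Because every stage is a generator, sentences are processed one at a time as they are requested, and a named entity or parsing stage could be appended to the list of stages in the same way.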

You may look at the Python source code for this pipeline.