Extract text and metadata from a range of text and presentation templates on the Java platform using the GroupDocs.Parser for Java API. The following template formats are supported:
- dotx (Word template)
- dotm (Word macro-enabled template)
- ott (OpenDocument Text Template)
- potx (PowerPoint template)
- potm (PowerPoint macro-enabled template)
- ppsm (Macro-enabled slideshow)
- pptm (Macro-enabled presentation)
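Before handing a file to the parser, it can be convenient to check its extension against the list above. The following is a minimal sketch of such a guard using only the Java standard library. The class `TemplateFormats` and its methods are hypothetical helpers written for illustration and are not part of GroupDocs.Parser; the check looks only at the file name, not at the file's contents.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of GroupDocs.Parser): checks a file name
// against the template extensions listed above.
final class TemplateFormats {
    // Extensions copied from the list of supported template formats.
    static final Set<String> SUPPORTED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(
                    "dotx", "dotm", "ott", "potx", "potm", "ppsm", "pptm")));

    private TemplateFormats() {
    }

    // Returns the lower-cased extension, or "" when the name has none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        int sep = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        // A dot that appears before the last path separator belongs to a folder name.
        if (dot < 0 || dot < sep) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // True when the extension is one of the listed template formats.
    static boolean isSupported(String fileName) {
        return SUPPORTED.contains(extensionOf(fileName));
    }
}
```

A caller might test `TemplateFormats.isSupported(fileName)` first and skip or report files that fail the check, rather than passing every file to the extractor.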
The code samples below demonstrate how to extract text and metadata from templates.
// Extracting Text
void extractText(String fileName) {
    // Extract the text from the file
    String text = Extractor.DEFAULT.extractText(fileName);
    // Print the extracted text
    System.out.println(text);
}

// Extracting Metadata
void extractMetadata(String fileName) {
    // Extract metadata from the file
    MetadataCollection metadata = Extractor.DEFAULT.extractMetadata(fileName);
    // Print each metadata entry as "key: value"
    for (String key : metadata.getKeys()) {
        // Print the metadata key
        System.out.print(key);
        System.out.print(": ");
        // Print the metadata value
        System.out.println(metadata.get_Item(key));
    }
}
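The metadata sample prints each pair straight to the console. When the same "key: value" output is needed as a string, for example for logging or for comparison in a test, the formatting can be separated from the extraction. The sketch below assumes the pairs have already been copied into a plain `java.util.Map`; `MetadataFormatter` is a hypothetical helper that stands in for the loop above and does not call GroupDocs.Parser.

```java
import java.util.Map;

// Hypothetical helper (not part of GroupDocs.Parser): renders metadata pairs
// in the same "key: value" layout that the sample above prints.
final class MetadataFormatter {
    private MetadataFormatter() {
    }

    // Builds one "key: value" line per entry, in the map's iteration order.
    static String format(Map<String, String> metadata) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            out.append(entry.getKey())
               .append(": ")
               .append(entry.getValue())
               .append('\n');
        }
        return out.toString();
    }
}
```

Passing a `LinkedHashMap` keeps the lines in insertion order, so the output stays stable from run to run.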
In addition, the parsing API supports retrieving tables from PDF documents and identifying the media type of secure Office Open XML documents: http://bit.ly/2CCy7bX
Original article: How to extract text and metadata from text and presentation templates