Answer: Using PDFbox to determine the coordinates of words in a document

I’m working on extracting data from PDF files. This post helped me determine the coordinates of a word by searching for it.

answer re: Using PDFbox to determine the coordinates of words in a document
Apr 12 ’15
2

Take a look at this; I think it’s what you need.

https://jackson-brain.com/using-pdfbox-to-locate-text-coordinates-within-a-pdf-in-java/

Here is the code:

import java.io.File;
import java.io.IOException;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.pdfbox.exceptions.InvalidPasswordException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDStream;
import org.apache.pdfbox.util.PDFTextStripper;
import org.apache.pdfbox.util.TextPosition;

public class PrintTextLocations extends PDFTextStripper {

public static StringBuilder tWord

…
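
Since the listing here is cut off, below is a minimal sketch of the general idea behind finding a word's position: collect characters into a word buffer, remember where the first character of the word sits, and report that position when the finished word matches the search term. This is not the code from the linked answer. `WordLocator`, `Glyph`, `WordHit` and `find` are hypothetical names made up for illustration, and `Glyph` is only a stand-in for the values a PDFBox 1.8 `TextPosition` exposes (as far as I know, `getCharacter()`, `getXDirAdj()` and `getYDirAdj()`), so the sketch runs without PDFBox on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class WordLocator {

    // Hypothetical stand-in for one character and its position, as a
    // PDFTextStripper subclass would receive it from PDFBox.
    static final class Glyph {
        final String text;
        final float x;
        final float y;

        Glyph(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // A matched word together with the position of its first character.
    static final class WordHit {
        final String word;
        final float x;
        final float y;

        WordHit(String word, float x, float y) {
            this.word = word;
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return word + " at (" + x + ", " + y + ")";
        }
    }

    // Splits the glyph stream into words on whitespace and returns one hit
    // for every word that equals the target exactly (case-sensitive).
    static List<WordHit> find(List<Glyph> glyphs, String target) {
        List<WordHit> hits = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        float startX = 0f;
        float startY = 0f;
        for (Glyph g : glyphs) {
            if (g.text.trim().isEmpty()) {
                // Whitespace ends the current word.
                flush(word, startX, startY, target, hits);
            } else {
                if (word.length() == 0) {
                    // First character of a new word: remember its position.
                    startX = g.x;
                    startY = g.y;
                }
                word.append(g.text);
            }
        }
        // The last word may not be followed by whitespace.
        flush(word, startX, startY, target, hits);
        return hits;
    }

    // Records the buffered word if it matches, then clears the buffer.
    private static void flush(StringBuilder word, float x, float y,
                              String target, List<WordHit> hits) {
        if (word.length() > 0 && word.toString().equals(target)) {
            hits.add(new WordHit(word.toString(), x, y));
        }
        word.setLength(0);
    }

    public static void main(String[] args) {
        // Made-up sample data: each character is 5 units wide on one line.
        String line = "Hi PDF Hi";
        List<Glyph> glyphs = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            glyphs.add(new Glyph(String.valueOf(line.charAt(i)), 10f + 5f * i, 700f));
        }
        for (WordHit hit : find(glyphs, "Hi")) {
            System.out.println(hit);
        }
    }
}
```

In a real stripper subclass the `Glyph` values would be filled in from each `TextPosition` as it arrives. Real PDFs do not always contain explicit space characters between words, so splitting on whitespace alone is a simplification.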