This repository contains the source code of the Ballerina pdfbox library package.
This module offers two core APIs: one for converting PDF documents into images and another for extracting text from PDF documents.
import ballerina/io;
import xlibb/pdfbox;

public function main() returns error? {
    // Convert the PDF located at a file path into an array of Base64-encoded images.
    string[] base64ImagesForFilePath = check pdfbox:toImagesFromFile("C://path/to/file/myfile.pdf");

    // Convert the PDF available at a URL into an array of Base64-encoded images.
    string[] base64ImagesForURL = check pdfbox:toImagesFromURL("https://url/to/file/myfile.pdf");

    // Convert the PDF represented as a byte array into an array of Base64-encoded images.
    // Here the bytes are read from a file; any byte[] holding a PDF works.
    byte[] pdfBytes = check io:fileReadBytes("C://path/to/file/myfile.pdf");
    string[] base64ImagesForByteArr = check pdfbox:toImagesFromBytes(pdfBytes);
}
import ballerina/io;
import xlibb/pdfbox;

public function main() returns error? {
    // Extract text from the PDF located at a file path.
    string[] textFromFilePath = check pdfbox:toTextFromFile("C://path/to/file/myfile.pdf");

    // Extract text from the PDF available at a URL.
    string[] textFromURL = check pdfbox:toTextFromURL("https://url/to/file/myfile.pdf");

    // Extract text from the PDF represented as a byte array.
    // Here the bytes are read from a file; any byte[] holding a PDF works.
    byte[] pdfBytes = check io:fileReadBytes("C://path/to/file/myfile.pdf");
    string[] textFromByteArr = check pdfbox:toTextFromBytes(pdfBytes);
}
The pdfbox library provides practical examples illustrating usage in various scenarios.
- Download and install Java SE Development Kit (JDK) version 17.
  Note: After installation, remember to set the JAVA_HOME environment variable to the directory where the JDK was installed.
- Download and install Ballerina Swan Lake.
- Download and install Docker.
  Note: Ensure that the Docker daemon is running before executing any tests.
- Export a GitHub personal access token with read package permissions as follows:
  export packageUser=<Username>
  export packagePAT=<Personal access token>
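Before building, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of this repository: the helper names `missing_vars` and `missing_commands` are hypothetical, and it assumes that `java`, `bal` (the Ballerina CLI), and `docker` are expected on the `PATH`.

```shell
# Print the name of each listed environment variable that is unset or empty.
missing_vars() {
    for name in "$@"; do
        # Indirect lookup through eval keeps this portable to POSIX sh.
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

# Print the name of each listed command that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Report anything from the prerequisite list that still needs attention.
missing_vars JAVA_HOME packageUser packagePAT
missing_commands java bal docker
```

If the script prints nothing, the variables are set and the tools were found. It does not verify the JDK version or that the Docker daemon is running.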
Execute the commands below to build from the source.
- To build the package:
  ./gradlew clean build
- To run the tests:
  ./gradlew clean test
- To build the package without the tests:
  ./gradlew clean build -x test
- To run tests against different environments:
  ./gradlew clean test -Pgroups=<Comma separated groups/test cases>
- To debug the package with a remote debugger:
  ./gradlew clean build -Pdebug=<port>
- To debug with the Ballerina language:
  ./gradlew clean build -PbalJavaDebug=<port>
- To publish the generated artifacts to the local Ballerina Central repository:
  ./gradlew clean build -PpublishToLocalCentral=true
- To publish the generated artifacts to the Ballerina Central repository:
  ./gradlew clean build -PpublishToCentral=true
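The `-Pgroups` option above takes a single comma-separated value. As a small sketch of how that value might be assembled from individual names, the hypothetical helper `join_groups` below joins its arguments with commas. The group names `unit` and `integration` are placeholders for illustration, not groups defined by this repository.

```shell
# Join the arguments with commas, e.g. "a b c" becomes "a,b,c".
join_groups() {
    # The subshell keeps the IFS change local; "$*" joins on the first IFS character.
    (IFS=,; echo "$*")
}

# "unit" and "integration" are hypothetical group names.
groups_arg="$(join_groups unit integration)"

# Print the resulting command instead of running it.
echo "./gradlew clean test -Pgroups=$groups_arg"
```

Replace the `echo` with the actual invocation once the group names match those used by the tests.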
As an open-source project, Ballerina welcomes contributions from the community.
For more information, go to the contribution guidelines.
All the contributors are encouraged to read the Ballerina Code of Conduct.