Packaging Guide
Bundling Streaming Connectors
Section titled “Bundling Streaming Connectors”Java / Scala
Section titled “Java / Scala”The stream connectors are expected to be bundled as a Single JAR (Fat JAR). We recommend using some thing like maven-shade-plugin to build the final JAR file.
Additionally, we recommend using maven-assembly-plugin to bundle all required files and JAR into a single distribution.
Here is a sample using maven-shade-plugin and maven-assembly-plugin
<build> ... <plugins> ... <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>3.2.1</version> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> <configuration> <shadedArtifactAttached>false</shadedArtifactAttached> <artifactSet> <excludes> <exclude>com.google.code.findbugs:jsr305</exclude> </excludes> </artifactSet> <filters> <filter> <!-- Do not copy the signatures in the META-INF folder. Otherwise, this might cause SecurityExceptions when using the JAR. --> <artifact>*:*</artifact> <excludes> <exclude>META-INF/*.SF</exclude> <exclude>META-INF/*.DSA</exclude> <exclude>META-INF/*.RSA</exclude> </excludes> </filter> </filters> <transformers> <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"> <mainClass>org.sunbird.obsrv.connector.ExampleSourceConnector</mainClass> </transformer> <!-- append default configs --> <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer"> <resource>reference.conf</resource> </transformer> </transformers> </configuration> </execution> </executions> </plugin> <plugin> <artifactId>maven-assembly-plugin</artifactId> <executions> <execution> <id>distro-assembly</id> <phase>package</phase> <goals> <goal>single</goal> </goals> <configuration> <descriptors> <descriptor>src/main/assembly/src.xml</descriptor> </descriptors> </configuration> </execution> </executions> </plugin> ... </plugins></build>And here is a sample of src/main/assembly/src.xml for building the distribution
<?xml version="1.0" encoding="UTF-8"?>
<assembly xmlns="http://maven.apache.org/ASSEMBLY/2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/ASSEMBLY/2.2.0 https://maven.apache.org/xsd/assembly-2.2.0.xsd"> <id>distribution</id> <formats> <format>tar.gz</format> </formats> <fileSets> <fileSet> <directory>${basedir}/src/main/resources</directory> <outputDirectory>/</outputDirectory> <includes> <include>*</include> </includes> <excludes> <exclude>obsrv-connector-config.conf</exclude> </excludes> </fileSet> <fileSet> <directory>target</directory> <outputDirectory>/</outputDirectory> <includes> <include>*.jar</include> </includes> </fileSet> </fileSets></assembly>Run mvn clean package to generate the distribution
The contents of the distribution should be in the following format
example-connector-0.1.0-distribution.tar.gz├── example_connector.jar├── alerts.yaml├── metadata.json├── metrics.yaml├── ui-config.json└── icon.svgBundling Batch Connectors
Section titled “Bundling Batch Connectors”Java / Scala
Section titled “Java / Scala”The batch connectors are bundled as a standalone JAR file, where the dependent JARs are to be included in the libs folder in the bundle, by using the maven-assembly-plugin.
Here is a sample of including the maven-assembly-plugin to your build
<build> ... <plugins> ... <plugin> <artifactId>maven-assembly-plugin</artifactId> <executions> <execution> <id>distro-assembly</id> <phase>package</phase> <goals> <goal>single</goal> </goals> <configuration> <descriptors> <descriptor>src/main/assembly/src.xml</descriptor> </descriptors> </configuration> </execution> </executions> </plugin> ... </plugins></build>And here is a sample of src/main/assembly/src.xml for building the distribution
<?xml version="1.0" encoding="UTF-8"?>
<assembly xmlns="http://maven.apache.org/ASSEMBLY/2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/ASSEMBLY/2.2.0 https://maven.apache.org/xsd/assembly-2.2.0.xsd"> <id>distribution</id> <formats> <format>tar.gz</format> </formats> <fileSets> <fileSet> <directory>${basedir}/src/main/resources</directory> <outputDirectory>/</outputDirectory> <includes> <include>*</include> </includes> <excludes> <exclude>obsrv-connector-config.conf</exclude> </excludes> </fileSet> <fileSet> <directory>target</directory> <outputDirectory>/</outputDirectory> <includes> <include>*.jar</include> </includes> </fileSet> </fileSets> <dependencySets> <dependencySet> <outputDirectory>libs</outputDirectory> <useTransitiveFiltering>true</useTransitiveFiltering> </dependencySet> </dependencySets></assembly>Run mvn clean package to generate the distribution
The contents of the distribution should be in the following format
example-connector-0.1.0-distribution.tar.gz├── libs └── sample-dependency.jar├── example_connector.jar├── alerts.yaml├── metadata.json├── metrics.yaml├── requirements.txt├── ui-config.json└── icon.svgPython
Section titled “Python”Since we use PySpark for building the batch connectors in Python, we are required to include the dependent JARs as a part of the distribution.
Here are the steps if you are using poetry as a package manager for Python Connectors.
Add a build_dist.py script to the scripts folder with the following contents. Briefly the script does the following
- It exports the python requirements to
requirements.txtfile, so that these can be installed on the runtime. - It downloads all the dependent JARs that are to be included in the package using
mvn
import subprocessimport os
def main(): # remove dirs recursively subprocess.Popen("rm -rf dist", shell=True).wait()
# Path to the directory containing the JAR files jar_dir = os.path.join(os.path.dirname(__file__), '..', 'libs') os.makedirs(jar_dir, exist_ok=True)
subprocess.Popen("""poetry export --without-hashes --format=requirements.txt | awk '{split($0,a,"; "); print a[1]}' > requirements.txt""", shell=True).wait() # TODO: Only uncomment the below line, if you have Java dependencies for your script, which have to be included in the libs folder # subprocess.Popen("mvn dependency:copy-dependencies -DrepoUrl=http://repo1.maven.org/maven2/ -DexcludeTrans -DoutputDirectory=libs", shell=True).wait() subprocess.Popen("poetry build -f sdist", shell=True).wait() subprocess.Popen("rm -rf requirements.txt libs", shell=True).wait()
if __name__ == '__main__': main()The packaging instructions have to be included in the pyproject.toml file.
packages = [ { include = "example_connector", from = ".", format = "sdist" },]
include = [ "requirements.txt", # "libs/*.jar", # TODO: Only uncomment the below line, if you have Java dependencies for your script, which have to be included in the libs folder "ui-config.json", "metadata.json", "alerts.yaml", "metrics.yaml", "icon.svg" # Add additional files like the icons that are to be included in the bundle]
[build-system]requires = ["poetry-core"]build-backend = "poetry.core.masonry.api"
[tool.poetry.scripts]package = "scripts.build_dist:main"Add a pom.xml file to the root and specify the dependent JARs required by the connector
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>in.sanketika.connectors</groupId> <artifactId>example-connector</artifactId> <version>1.0.0</version> <dependencies> <dependency> <groupId> </groupId> <artifactId> </artifactId> <version> </version> </dependency> ... </dependencies></project>Run poetry run package to build the distribution that can be installed on Obsrv.
The contents of the distribution should be in the following format
example-connector-0.1.0-distribution.tar.gz├── libs └── sample-dependency.jar├── example_connector └── main.py├── alerts.yaml├── metadata.json├── metrics.yaml├── requirements.txt├── ui-config.json└── icon.svg