Most users will be able to use Mothra from a standard Maven repository, such as Maven Central. The Mothra libraries may be accessed using the following Maven coordinates:
org.cert.netsa:mothra_2.12:1.6.0
If you are using Scala 2.11 (the default for most Spark 2 versions), Scala 2.12 with Spark 2, or Scala 2.13 with Spark 3, you will need the appropriate version of the artifact:
org.cert.netsa:mothra_spark-2_2.11:1.6.0
org.cert.netsa:mothra_spark-2_2.12:1.6.0
org.cert.netsa:mothra_2.13:1.6.0
This same change may be needed for other artifacts that you wish to use.
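The mapping above can be sketched as a small shell helper. This is only an illustration: `mothra_coord` is a hypothetical function, not part of Mothra, and it assumes that the plain `mothra_2.12` artifact is the one intended for Scala 2.12 with Spark 3, which the text implies but does not state outright.

```shell
# Sketch: choose a Mothra artifact coordinate from the Scala and Spark versions.
# mothra_coord is a hypothetical helper, not something shipped with Mothra.
# Usage: mothra_coord <scala-binary-version> <spark-major-version>
mothra_coord() {
    scala_ver="$1"
    spark_major="$2"
    mothra_ver="1.6.0"
    case "${scala_ver}/${spark_major}" in
        # Spark 2 builds carry the "spark-2" marker in the artifact name.
        2.11/2|2.12/2) artifact="mothra_spark-2_${scala_ver}" ;;
        # Assumption: the unmarked artifacts target Spark 3.
        2.12/3|2.13/3) artifact="mothra_${scala_ver}" ;;
        *)
            echo "no known Mothra artifact for Scala ${scala_ver} with Spark ${spark_major}" >&2
            return 1
            ;;
    esac
    echo "org.cert.netsa:${artifact}:${mothra_ver}"
}

mothra_coord 2.11 2   # prints org.cert.netsa:mothra_spark-2_2.11:1.6.0
```

Other artifacts would likely follow the same naming pattern, but check the repository listing before relying on that.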
spark-shell and spark-submit examples
When running spark-shell, you can specify additional packages to
use via the --packages option. For example, the following should
work to load the Mothra libraries (and demonstrate the available
version):
$ spark-shell --packages "org.cert.netsa:mothra_2.12:1.6.0"
...
scala> org.cert.netsa.util.versionInfo("mothra")
Some(1.6.0)
The same technique can be used to specify dependencies for jar
files used in spark-submit. For example:
$ spark-submit --packages "org.cert.netsa:mothra_2.12:1.6.0" some-job.jar
This would run the executable jar file some-job.jar with Mothra
and all of its dependencies fetched and added to the job's
classpath.
You may also retrieve tool wrappers for the Mothra packing tools,
which automatically download the necessary libraries from Maven
Central. These wrappers are in the archive named
mothra-1.6.0-tools.tar.gz, and are configured to run on UNIX-style
systems (including macOS and Linux).
The tool wrappers automatically download the needed artifacts into
your home directory when run (under ~/.cache/coursier), then
execute them in the JVM.
Files are always retrieved from Maven Central. If you need to use
a proxy not otherwise configured in your JVM, you can use the
JAVA_OPTS environment variable to specify the proxy using Java
properties.
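As a minimal sketch of the proxy setup, the following sets JAVA_OPTS using the standard Java networking properties https.proxyHost and https.proxyPort. The host name and port are placeholders chosen for illustration, not values from the Mothra documentation; substitute those of your own proxy.

```shell
# Sketch: route the tool wrappers' downloads through an HTTPS proxy.
# proxy.example.com and 3128 are placeholders, not values from Mothra's docs.
PROXY_HOST="proxy.example.com"
PROXY_PORT="3128"

# https.proxyHost and https.proxyPort are standard Java networking properties.
# Maven Central is served over HTTPS, so the https.* pair is the relevant one.
JAVA_OPTS="-Dhttps.proxyHost=${PROXY_HOST} -Dhttps.proxyPort=${PROXY_PORT}"
export JAVA_OPTS

echo "JAVA_OPTS=${JAVA_OPTS}"
```

With the variable exported, tool wrappers started from the same shell session should pick up the proxy settings.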
If for some reason you need to download files from a source other
than Maven Central (for example, an internal caching repository or
a local repository of known good packages), you can use Coursier
yourself to build bootstrap scripts that either fetch from your
own repository or build the jars directly into the script. See the
documentation for Coursier's cs bootstrap command for more details.
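A rough sketch of such a bootstrap invocation follows. `build_launcher` is a hypothetical helper, and the repository URL, artifact coordinate, and main class are all arguments you must supply yourself, since the right values depend on your repository and on the tool you want to wrap. The flags used are, as far as I know, standard `cs bootstrap` options, but verify them against the Coursier version you have installed.

```shell
# Sketch: build a launcher with Coursier from a repository other than Maven Central.
# build_launcher is a hypothetical helper; all four arguments are supplied by you.
# Usage: build_launcher <repository-url> <artifact-coordinate> <main-class> <output-file>
build_launcher() {
    repo_url="$1"
    coordinate="$2"
    main_class="$3"
    output="$4"
    # --no-default  skip Coursier's default repositories (including Maven Central)
    # -r            resolve from the given repository instead
    # -M            main class the launcher should run
    # --standalone  embed the jars in the launcher rather than fetching at run time
    cs bootstrap --no-default -r "$repo_url" "$coordinate" \
        -M "$main_class" --standalone -o "$output"
}
```

Dropping `--standalone` should instead produce a launcher that downloads the jars when it is first run.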
If you're using Jupyter notebooks, you may be able to configure additional external libraries on a per-notebook basis. The exact mechanism to use for this depends upon which Jupyter kernel you're using for Scala. Some likely methods are described below.
In the Apache Toree kernel:
%AddDeps org.cert.netsa mothra_2.12 1.6.0
In the spylon kernel:
%%init_spark
launcher.packages = ["org.cert.netsa:mothra_2.12:1.6.0"]
In the Almond kernel:
import $ivy.`org.cert.netsa::mothra:1.6.0`
There are a variety of Jupyter kernels available, and these mechanisms can vary depending on version. Please make sure to check your Jupyter kernel's documentation for more details on how to configure additional external packages.
If you're using Apache Zeppelin, you should be able to configure it to load Mothra libraries using something like:
%dep z.load("org.cert.netsa:mothra_2.12:1.6.0")
As noted above for Jupyter, check the documentation for the version of Zeppelin you're using to confirm that the method has not changed.