Using Mothra via Maven

Most users will be able to use Mothra from a standard Maven repository, such as Maven Central. The Mothra libraries may be accessed using the following Maven coordinates:


    org.cert.netsa:mothra_2.12:1.6.0
    

If you are using Scala 2.11 (the default for most Spark 2 versions), Scala 2.12 with Spark 2, or Scala 2.13 with Spark 3, you will need the appropriate version of the artifact:


    org.cert.netsa:mothra_spark-2_2.11:1.6.0
    org.cert.netsa:mothra_spark-2_2.12:1.6.0
    org.cert.netsa:mothra_2.13:1.6.0
    

This same change may be needed for other artifacts that you wish to use.
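The artifact selection described above can be sketched as a small shell helper. This is only an illustration: the function name mothra_coordinate is hypothetical, and it assumes that the plain mothra_2.12 artifact is the one intended for Scala 2.12 with Spark 3, which the list above implies but does not state outright.

```shell
# Print the Mothra Maven coordinate for a Scala/Spark pairing.
# Usage: mothra_coordinate SCALA_VERSION SPARK_MAJOR [MOTHRA_VERSION]
# (hypothetical helper, not part of Mothra)
mothra_coordinate() {
  scala_version="$1"
  spark_major="$2"
  mothra_version="${3:-1.6.0}"
  case "${scala_version}/${spark_major}" in
    2.12/3) artifact="mothra_2.12" ;;          # assumed default pairing
    2.13/3) artifact="mothra_2.13" ;;
    2.11/2) artifact="mothra_spark-2_2.11" ;;
    2.12/2) artifact="mothra_spark-2_2.12" ;;
    *)
      # Only the pairings listed on this page are known here.
      echo "no artifact listed for Scala ${scala_version} with Spark ${spark_major}" >&2
      return 1
      ;;
  esac
  echo "org.cert.netsa:${artifact}:${mothra_version}"
}

mothra_coordinate 2.11 2
```

The printed coordinate can then be passed wherever a coordinate is expected, for example spark-shell --packages "$(mothra_coordinate 2.12 2)".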

spark-shell and spark-submit examples

When running spark-shell, you can specify additional packages to use via the --packages option. For example, the following should load the Mothra libraries and show which version was loaded:


    $ spark-shell --packages "org.cert.netsa:mothra_2.12:1.6.0"
    ...
    scala> org.cert.netsa.util.versionInfo("mothra")
    Some(1.6.0)
    

The same technique can be used to specify dependencies for jar files used in spark-submit. For example:


    $ spark-submit --packages "org.cert.netsa:mothra_2.12:1.6.0" some-job.jar
    

This would run the executable jar file some-job.jar with Mothra and all of its dependencies fetched and added to the job's classpath.

Command-line tools using Coursier

You may also retrieve tool wrappers for the Mothra packing tools that automatically download the necessary libraries from Maven Central. These wrappers are in the archive named mothra-1.6.0-tools.tar.gz, and are configured to run on UNIX-style systems (including macOS and Linux).

The tool wrappers automatically download the needed artifacts into your home directory when run (under ~/.cache/coursier), then execute them in the JVM. Files are always retrieved from Maven Central. If you need to use a proxy not otherwise configured in your JVM, you can use the JAVA_OPTS environment variable to specify the proxy using Java properties.
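As a minimal sketch of the proxy setup, the following sets JAVA_OPTS using the JVM's standard networking properties. The proxy host and port are hypothetical placeholders, and this assumes the wrappers pass JAVA_OPTS through to the JVM as described above.

```shell
# Hypothetical proxy host and port: substitute your own.
PROXY_HOST="proxy.example.com"
PROXY_PORT="3128"

# Standard JVM networking properties for HTTP and HTTPS proxies.
# Maven Central is served over HTTPS, so the https.* pair matters most.
JAVA_OPTS="-Dhttp.proxyHost=${PROXY_HOST} -Dhttp.proxyPort=${PROXY_PORT} -Dhttps.proxyHost=${PROXY_HOST} -Dhttps.proxyPort=${PROXY_PORT}"
export JAVA_OPTS

# Show what the tool wrappers will see in their environment.
echo "JAVA_OPTS=${JAVA_OPTS}"
```

Any tool wrapper started from the same shell session afterwards inherits this setting.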

If you need to download files from a source other than Maven Central (for example, an internal caching repository or a local repository of known good packages), you can use Coursier yourself to build bootstrap scripts that either fetch from your own repository or build the jars directly into the script. See the documentation for Coursier's cs bootstrap command for more details.
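A rough sketch of such a bootstrap invocation follows. The repository URL, main class, and output name are hypothetical placeholders (the real main class of each Mothra tool is not given here), and the script only assembles and prints the command so it can be reviewed before running. Check the flags against the cs bootstrap documentation for your Coursier version.

```shell
# Hypothetical values: replace with your own repository and tool details.
REPO_URL="https://repo.example.com/maven2"   # internal repository (assumed)
COORD="org.cert.netsa:mothra_2.12:1.6.0"     # coordinate from this page
MAIN_CLASS="com.example.Main"                # placeholder, not a real Mothra class
OUTPUT="./mothra-tool"                       # launcher script to generate

# --no-default drops the default repositories (including Maven Central),
# -r adds your own repository, and --standalone embeds the jars in the
# generated script instead of fetching them at run time.
BOOTSTRAP_CMD="cs bootstrap --no-default -r ${REPO_URL} ${COORD} -M ${MAIN_CLASS} -o ${OUTPUT} --standalone"

# Print the command for review; once the values are real, run it with:
#   eval "$BOOTSTRAP_CMD"
echo "$BOOTSTRAP_CMD"
```

Omitting --standalone should instead yield a launcher that fetches the artifacts from the given repository when it runs.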

Using Maven artifacts with Jupyter notebooks

If you're using Jupyter notebooks, you may be able to configure additional external libraries on a per-notebook basis. The exact mechanism depends on which Jupyter kernel you're using for Scala. Some likely methods are described below.

In the Apache Toree kernel:


    %AddDeps org.cert.netsa mothra_2.12 1.6.0
    

In the spylon kernel:


    %%init_spark
    launcher.packages = ["org.cert.netsa:mothra_2.12:1.6.0"]
    

In the Almond kernel:


    import $ivy.`org.cert.netsa::mothra:1.6.0`
    

A variety of Jupyter kernels are available, and these mechanisms can vary by version. Check your Jupyter kernel's documentation for details on how to configure additional external packages.

Using Maven artifacts with Apache Zeppelin

If you're using Apache Zeppelin, you should be able to configure it to load Mothra libraries using something like:


    %dep z.load("org.cert.netsa:mothra_2.12:1.6.0")
    

As noted above for Jupyter, check the documentation for the version of Zeppelin you're using to confirm that the method has not changed.