This page provides reference material for the DataWave Quickstart

Global Bootstrap Functions

The functions below are implemented in bin/common.sh

Function Name    Description
allStart         Start up all services in the appropriate sequence
allStop          Stop all services gracefully. Use --hard flag to kill -9
allStatus        Display current status of all services, including PIDs if running
allInstall       Install all services
allUninstall     Uninstall all services, leaving tarballs in place. Optional --remove-binaries flag
allPrintenv      Display current state of all service configurations
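
As a minimal sketch of how the global functions can be combined, the wrapper below restarts everything and then reports status. The name quickstartRestart is hypothetical and not part of the quickstart; the sketch assumes the quickstart functions are already loaded in the current shell session:

```shell
# Hypothetical convenience wrapper, not part of the quickstart itself
quickstartRestart() {
    # Confirm the quickstart functions are defined before calling them
    for _fn in allStop allStart allStatus; do
        if ! command -v "$_fn" >/dev/null 2>&1; then
            echo "$_fn is not defined; load the quickstart environment first" >&2
            return 1
        fi
    done
    allStop     # graceful stop; 'allStop --hard' is the kill -9 variant
    allStart    # start services in the appropriate sequence
    allStatus   # display status, including PIDs of running services
}
```

The steps are run unconditionally, one after another, because this page does not state what exit codes allStop and allStart return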

Service Bootstrap Functions

The functions below are implemented for each service, where {servicename} can be one of…

  • hadoop
  • accumulo
  • zookeeper
  • datawaveWeb, datawaveIngest, or simply datawave for both

Function Name               Description
{servicename}Start          Start the service
{servicename}Stop           Stop the service
{servicename}Status         Display current status of the service, including PIDs if running
{servicename}Install        Install the service
{servicename}Uninstall      Uninstall but leave tarball(s) in place. Optional --remove-binaries flag
{servicename}IsRunning      Returns 0 if running, non-zero otherwise. Mostly for internal use
{servicename}IsInstalled    Returns 0 if installed, non-zero otherwise. Mostly for internal use
{servicename}Printenv       Display current state of the service configuration, bash variables, etc
{servicename}PidList        Display all service PIDs on a single line, space-delimited
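
Because {servicename}IsRunning returns 0 when the service is running, it can be used directly in a shell conditional. The sketch below starts a service only when it is not already running. The helper name ensureRunning is hypothetical, and the sketch assumes the quickstart functions are loaded in the current shell:

```shell
# Hypothetical helper, not part of the quickstart: start a service only
# if its {servicename}IsRunning function reports that it is not running
ensureRunning() {
    _svc="$1"
    # Confirm the per-service functions exist for the given service name
    for _fn in "${_svc}IsRunning" "${_svc}Start" "${_svc}PidList"; do
        if ! command -v "$_fn" >/dev/null 2>&1; then
            echo "$_fn is not defined; load the quickstart environment first" >&2
            return 2
        fi
    done
    if "${_svc}IsRunning"; then
        # PidList prints the PIDs on a single space-delimited line
        echo "${_svc} is already running (PIDs: $("${_svc}PidList"))"
    else
        "${_svc}Start"
    fi
}
```

For example, ensureRunning accumulo would call accumuloStart only when accumuloIsRunning returns non-zero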

Accumulo Shell Alias (ashell)

To quickly launch the Accumulo Shell and authenticate as the root user, use the quickstart’s ashell alias

  $ ashell

  Shell - Apache Accumulo Interactive Shell
  -
  - version: 2.1.2
  - instance name: my-instance-01
  - instance id: cc3e8158-a94a-4f2e-af9e-d1014b5d1912 
  -
  - type 'help' for a list of available commands
  -
  root@my-instance-01>

Nuclear Options

Quick Uninstall

To quickly kill any running services and uninstall everything (leaving downloaded *.tar.gz files in place):

  $ allStop --hard ; allUninstall

Same as above, but also remove any downloaded *.tar.gz files:

  $ allStop --hard ; allUninstall --remove-binaries

Quick Reinstall

Same as above, but re-download and reinstall everything:

  $ allStop --hard ; allUninstall --remove-binaries && allInstall

DataWave Functions

Scripts

DataWave’s features are exposed primarily through configs and functions defined within the scripts listed below

Script Name            Description
query.sh               Query-related functions for interacting with DataWave Web’s REST API
bootstrap.sh           Common functions. Parent wrapper for web & ingest bootstraps
bootstrap-web.sh       Bootstrap for DataWave Web and associated functions
bootstrap-ingest.sh    Bootstrap for DataWave Ingest and associated functions
bootstrap-user.sh      Configs for defining DataWave Web test user’s identity, roles, auths, etc

A few noteworthy functions and their descriptions are listed by category below

DataWave Web Functions

datawaveWebStart [ --debug ]
  • Start up DataWave’s web services in Wildfly. Pass the --debug flag to start Wildfly in debug mode
  • Implementation: bootstrap-web.sh

datawaveQuery --query <query-expression>
  • Submit queries on demand and inspect results. Use the --help flag for information on query options
  • Query syntax guidance is available in the DataWave documentation
  • Implementation: query.sh

datawaveWebTest
  • Wrapper function for the test-web/run.sh script. Runs a suite of curl-based tests against DataWave Web
  • Supports several options. Use the --help flag for more information
  • Implementation: bootstrap-web.sh
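
The web functions above can be chained with the per-service functions. As a rough sketch, the hypothetical helper below (queryWhenReady is not part of the quickstart) starts DataWave Web if needed and then submits a single query. It assumes that datawaveWebStart returns only once the service is able to accept requests, which this page does not confirm:

```shell
# Hypothetical helper: make sure DataWave Web is up, then submit one query
queryWhenReady() {
    _expr="$1"
    if [ -z "$_expr" ]; then
        echo "usage: queryWhenReady <query-expression>" >&2
        return 2
    fi
    # datawaveWebIsRunning returns 0 when the service is running;
    # otherwise try to start it, and give up if the start fails
    datawaveWebIsRunning || datawaveWebStart || return 1
    datawaveQuery --query "$_expr"
}
```

The query expression is passed through unchanged, so any expression accepted by datawaveQuery --query can be used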

DataWave Ingest Functions

datawaveIngestJson /path/to/some/tvmaze.json
  • Kick off an M/R job to ingest a raw JSON file containing TV show data from http://tvmaze.com/api
  • Ingest config file: myjson-ingest-config.xml
  • File ingested automatically by the DataWave Ingest installer (install-ingest.sh): tvmaze-api.json
  • Use the ingest-tv-shows.sh script to download & ingest more of your favorite shows
  • Implementation: bootstrap-ingest.sh

datawaveIngestWikipedia /path/to/some/enwiki.xml
  • Kick off an M/R job to ingest a raw Wikipedia XML file. Any standard enwiki-flavored file should suffice
  • Ingest config file: wikipedia-ingest-config.xml
  • File ingested automatically by the DataWave Ingest installer (install-ingest.sh): enwiki-20130305*.xml
  • Implementation: bootstrap-ingest.sh

datawaveIngestCsv /path/to/some/file.csv
  • Kick off an M/R job to ingest a raw CSV file similar to my.csv
  • Ingest config file: mycsv-ingest-config.xml
  • File ingested automatically by the DataWave Ingest installer (install-ingest.sh): my.csv
  • Implementation: bootstrap-ingest.sh
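
Each ingest function above takes a single file path. To ingest several files, one option is a small loop such as the sketch below. The helper name ingestEach is hypothetical, and the failure count assumes the ingest functions return non-zero on failure, which this page does not state:

```shell
# Hypothetical helper: run one ingest function over several input files.
# $1 is the ingest function name (e.g., datawaveIngestJson); the rest are files
ingestEach() {
    _ingest="$1"; shift
    _failed=0
    for _file in "$@"; do
        if [ ! -r "$_file" ]; then
            echo "skipping unreadable file: $_file" >&2
            _failed=$((_failed + 1))
            continue
        fi
        # One ingest call (and so one M/R job) per file
        "$_ingest" "$_file" || _failed=$((_failed + 1))
    done
    # Succeed only if every file was handed off without error
    [ "$_failed" -eq 0 ]
}
```

For example, ingestEach datawaveIngestCsv /path/to/some/first.csv /path/to/some/second.csv, where both paths are placeholders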

Build/Deploy Functions

datawaveBuild
  • Rebuild DataWave as needed (i.e., after the initial install/deploy)
  • Implementation: bootstrap.sh

datawaveBuildDeploy
  • Redeploy DataWave as needed (i.e., after the initial install/deploy)
  • Implementation: bootstrap.sh

DataWave Build Notes

The quickstart performs the following steps at build time in order to reliably configure DataWave for deployment under your DW_SOURCE/contrib/datawave-quickstart directory. Note that, by default, it leverages the dev profile as defined in DW_SOURCE/pom.xml

  1. Copies the existing DW_SOURCE/properties/dev.properties file to datawave-quickstart/data/datawave/build-properties/dev.properties
  2. Appends to the copied dev.properties file any property overrides that are necessary for deployment
  3. Creates ~/.m2/datawave/properties/dev.properties symlink that points to the copied dev.properties file
  4. Executes the Maven build command given by $DW_DATAWAVE_BUILD_COMMAND

Note that you may select a different Maven profile for this purpose by simply overriding $DW_DATAWAVE_BUILD_PROFILE in your environment
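
A minimal sketch of such an override follows. The profile name myprofile is a placeholder, and the helper showBuildSettings is hypothetical. Both are illustrative only:

```shell
# Placeholder profile name; it must match a profile defined in DW_SOURCE/pom.xml.
# Set this before the quickstart environment is loaded (e.g., in ~/.bashrc)
export DW_DATAWAVE_BUILD_PROFILE=myprofile

# Hypothetical helper: print the build-related settings visible in this shell
showBuildSettings() {
    # Falls back to "dev", the default profile named in the notes above
    echo "profile: ${DW_DATAWAVE_BUILD_PROFILE:-dev}"
    echo "command: ${DW_DATAWAVE_BUILD_COMMAND:-<not set in this shell>}"
}
```

Running showBuildSettings after the quickstart environment is loaded is one way to confirm which profile and Maven command the next datawaveBuild would use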


PKI Notes

In the quickstart environment, DataWave Web is PKI enabled and uses two-way authentication by default. Moreover, the following self-signed materials are used…

File Name         Type      Description
ca.jks            JKS       Truststore for the Wildfly JEE Application Server
testServer.p12    PKCS12    Server Keystore for the Wildfly JEE Application Server
testUser.p12      PKCS12    Test user client cert

  • Passwords for all of the above: ChangeIt

  • To access DataWave Web endpoints in a browser, you’ll need to import the client cert into the browser’s certificate store

  • The goal of the quickstart’s PKI setup is to demonstrate DataWave’s ability to be integrated easily into an organization’s existing public key infrastructure and user auth services. See datawave/bootstrap-user.sh for more information on configuring the test user’s roles and associated Accumulo authorizations

  • To test with your own certificate materials, override the keystore & truststore variables from datawave/bootstrap.sh within your ~/.bashrc prior to installing the quickstart
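
Before importing a client cert into a browser, it can help to confirm what a PKCS12 file contains. The sketch below uses the standard openssl pkcs12 command. The helper name showP12Certs is hypothetical, and the location of the .p12 files is not given on this page, so the path must be supplied by the caller:

```shell
# Hypothetical helper: print the certificates stored in a PKCS12 file.
# $1 = path to a .p12 file, $2 = its password (defaults to the
# quickstart password listed above)
showP12Certs() {
    _p12="$1"
    _pass="${2:-ChangeIt}"
    if [ ! -r "$_p12" ]; then
        echo "cannot read PKCS12 file: $_p12" >&2
        return 1
    fi
    # -nokeys omits private keys so that only certificates are printed
    openssl pkcs12 -in "$_p12" -nokeys -passin "pass:$_pass"
}
```

For example, showP12Certs /path/to/testUser.p12 should print the test user’s certificate, including its subject and issuer lines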