Striker Agent API

From Alteeve Wiki

Warning: This API is not complete!

This is the Striker Agent API definition.

Striker Agents are stand-alone agents that scan and monitor specific hardware or software associated with Anvil! clusters and foundation pack equipment.

Version

Name: Striker-Agent_API
Version: 1.0.0
Date: 2014-09-17

The Striker Agent API version structure is 'x.y.z' where:

x Major revisions. Striker Agents are not required to support previous major revisions.
y Minor revisions. Existing specifications may be extended and new, optional features may be added. Existing specifications will not be changed and compatibility will not be broken.
z Minor Releases to fix bugs, no new features added.
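
The version structure above can be illustrated with a small sketch. The function names are hypothetical, and the compatibility rule (same major version, agent's minor version no newer than the scanner's) is an assumption drawn from the descriptions of 'x' and 'y'; this document does not define such a check.

```python
from typing import Tuple


def parse_api_version(text: str) -> Tuple[int, ...]:
    """Split an 'x.y' or 'x.y.z' version string into a tuple of integers."""
    parts = tuple(int(part) for part in text.strip().split("."))
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 'x.y' or 'x.y.z', got {text!r}")
    return parts


def agent_is_compatible(scanner_version: str, agent_version: str) -> bool:
    """Assumed rule: same major, and the agent's minor is not newer than the scanner's."""
    scanner = parse_api_version(scanner_version)
    agent = parse_api_version(agent_version)
    return scanner[0] == agent[0] and agent[1] <= scanner[1]
```

The 'z' component is ignored in the comparison because bug-fix releases add no features.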

Overview

This document describes how a Striker Agent is expected to operate as part of the "Striker Scanner" monitoring program. Programs that conform to this API are known as "Striker Agents".

General Conventions

Striker Agents can have any name the author wishes. However, it is recommended that the name reflect the purpose of the agent. By convention, the name of a Striker Agent is in the format '<subsystem>_<component>'. Examples are:

storage_mdadm Provides monitoring for "mdadm" (software) RAID arrays
storage_lsi Provides monitoring for LSI (Avago) based hardware RAID controllers.
storage_hp Provides monitoring for HP based hardware RAID controllers.
software_drbd Provides monitoring of LINBIT's "Distributed, Replicated Block Device" software application.
software_rgmanager Provides monitoring of Red Hat's "Resource Group Manager" cluster software.
network_bond Provides monitoring of bonded network interfaces.
network_brocade Provides monitoring of Brocade based ethernet network switches.

There is no official list of subsystem prefixes. If you plan to develop a Striker Agent, you are free to select any subsystem name you wish. However, it is requested that you consult the Anvil! community to help decide what subsystem your agent might best fit.

Striker Agents may have associated libraries, reference SQL schemas, READMEs and so on. These files MUST have the same name as the Striker Agent, plus a file extension. There are no requirements on what the extension(s) can be. By convention, certain extensions are common:

.lib Library of functions, subroutines or similar.
.sql Reference SQL schema
.xml File containing language "strings".
.README Document describing your Striker Agent.

Programming language specific shared libraries are allowed. These shared libraries must be named:

scan-common-<language>.<library_extension>

For example, the shared perl library is named:

scan-common-perl.lib
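
As a rough illustration of the naming conventions above, the following sketch splits an agent name into its subsystem and component, tests whether a file name belongs to an agent, and builds a shared library name. All three function names are hypothetical, and splitting on the first underscore is an assumption.

```python
from typing import Tuple


def split_agent_name(name: str) -> Tuple[str, str]:
    """Split '<subsystem>_<component>' on the first underscore (assumed)."""
    subsystem, separator, component = name.partition("_")
    if not separator or not subsystem or not component:
        raise ValueError(f"{name!r} is not in '<subsystem>_<component>' form")
    return subsystem, component


def is_associated_file(agent_name: str, file_name: str) -> bool:
    """True if file_name is the agent's name plus a non-empty extension."""
    prefix = agent_name + "."
    return file_name.startswith(prefix) and len(file_name) > len(prefix)


def shared_library_name(language: str, extension: str = "lib") -> str:
    """Build 'scan-common-<language>.<library_extension>'."""
    return f"scan-common-{language}.{extension}"
```

For example, 'storage_mdadm.sql' is associated with the 'storage_mdadm' agent, while 'storage_mdadm_device.sql' is not.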

New functions can easily be added to existing shared language libraries, and new language libraries are welcome additions to Striker. To contribute, contact the community; all reasonable additions and improvements will be accommodated.

Command Line Switches

All Striker Agents MUST support the following list of command line arguments. The supported output of the argument will be described along with the command line switch. All acceptable return codes will be specified. If a switch will return output, it must be XML formatted.

Command line switches will be passed independently and will not be merged. Some switches will have a short version with a single-hyphen prefix. All switches will have a long form with a double-hyphen prefix. Where a switch takes a value, the value will follow the switch, separated by either a space or an equal sign with no surrounding spaces. Examples:

Equivalent command line switches, short and long form:

-f, --foo

Equivalent command line switches with a value:

--foo bar, --foo=bar

A bare double-hyphen will mark the end of switch processing. Anything read in after a bare '--' can be ignored.

Striker Agents must exit with return code '0' on success. Unless specified, any non-0 return code will be treated as a failure, regardless of textual output.
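
The switch conventions above could be handled along these lines. This is a minimal sketch using Python's standard 'argparse' module, not a reference implementation. The function names are hypothetical, and '--foo' is only the placeholder value switch from the examples above. Because 'argparse' would treat text after a bare '--' as positional arguments, the sketch discards it before parsing.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Declare the required switches, each with a short and a long form."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("-c", "--check", action="store_true")
    parser.add_argument("-i", "--info", action="store_true")
    parser.add_argument("-s", "--sql", action="store_true")
    parser.add_argument("-v", "--version", action="store_true")
    # Placeholder value switch; accepts '--foo bar' and '--foo=bar'.
    parser.add_argument("-f", "--foo")
    return parser


def parse_switches(argv: List[str]) -> argparse.Namespace:
    """Parse switches, ignoring everything after a bare '--'."""
    if "--" in argv:
        argv = argv[: argv.index("--")]
    return build_parser().parse_args(argv)
```

An agent would typically call 'parse_switches(sys.argv[1:])' and then act on whichever switch was set.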

-c, --check

Run a check. Exit with one of the following exit codes:

0 No changes detected at all. Use if all sensors/data showed no change at all since the previous scan.
1 Low importance changes detected and logged. No alert necessary. Commonly used for logging variable fan speeds, minor temperature changes, etc.
2 Moderate changes detected. Non-critical/debug alert generated. Commonly used for noteworthy events like RAID FBU/BBU relearn cycle related changes. This should be used when an event is most likely unimportant, but has the potential to be important and you want technical people to be notified as a safety step.
3 Important event, general alert generated. All unexpected changes, changes that exceed thresholds (static or delta), etc.
9 Scan failed. The first scan failure should generate an alert. Subsequent failures should not. The "last scan failed" flag should be cleared the first time the scan completes successfully.
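
The exit codes above might be modelled as follows. The names are hypothetical, and reporting the most severe result when several sensors are checked is an assumption; this document does not say how multiple results are combined. The second function sketches the "last scan failed" flag described for exit code 9.

```python
from enum import IntEnum
from typing import Iterable, Tuple


class CheckResult(IntEnum):
    NO_CHANGE = 0    # nothing changed since the previous scan
    LOW = 1          # logged only, no alert
    MODERATE = 2     # non-critical/debug alert
    IMPORTANT = 3    # general alert
    SCAN_FAILED = 9  # the scan itself failed


def overall_result(results: Iterable[CheckResult]) -> CheckResult:
    """Assumed aggregation: report the most severe individual result."""
    return max(results, default=CheckResult.NO_CHANGE)


def failure_alert(scan_ok: bool, last_scan_failed: bool) -> Tuple[bool, bool]:
    """Return (send_alert, new_last_scan_failed_flag).

    Only the first failure alerts; a successful scan clears the flag.
    """
    if scan_ok:
        return (False, False)
    return (not last_scan_failed, True)
```

An agent could then finish a check with 'sys.exit(int(overall_result(results)))'.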

All events that generate alerts should return the alert message in XML format (described below [not yet complete]), with the alert message available in all supported languages.

-i, --info

Display information about the agent in XML format.

-s, --sql

Agents must return the SQL schema the agent will require, in the XML format defined below. The Striker Scanner will create or modify tables as needed before calling the agent to run. Each table will have a corresponding table created in the 'history' schema which can later be used by the agent to query historical values, generate reports, etc.

Note: New tables, and new columns for existing tables, may be added, but existing columns may not be modified. If an agent provides a schema that would cause an existing table's column to be changed, an error alert email will be generated and the agent will not be run. An administrator will need to manually adjust the table's columns to conform to the schema returned by this call for the agent to be used again. The scanner will examine the schema compatibility on every pass, though only one alert email will be sent. If the schema is later found to be compatible, a follow-up alert will be dispatched and any future compatibility issue will again trigger an alert.
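
The compatibility rule in the note above can be sketched as a comparison of two schema descriptions. The function name and the dictionary layout (table name to column name to column type) are hypothetical. Columns that exist in the database but are absent from the agent's schema are not treated as conflicts here; that is an assumption, as this document does not cover the case.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, Dict[str, str]]


def schema_conflicts(existing: Schema, proposed: Schema) -> List[Tuple[str, str, str, str]]:
    """List (table, column, existing_type, proposed_type) for changed columns.

    New tables and new columns are allowed and therefore not reported.
    """
    conflicts = []
    for table, columns in proposed.items():
        current = existing.get(table)
        if current is None:
            continue  # a new table is always acceptable
        for column, column_type in columns.items():
            if column in current and current[column] != column_type:
                conflicts.append((table, column, current[column], column_type))
    return conflicts
```

An empty list means the agent may run; a non-empty list corresponds to the case where an alert is sent and the agent is skipped.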

Schema format:

<striker-scanner>
	<sql>
		<table name="storage_mdadm_sequence" type="sequence" />
		<table name="storage_mdadm" type="data">
			<column name="sm_id" type="integer" is_id="true">
				<primary_key table="storage_mdadm_sequence" value="nextval" />
			</column>
			<column name="x" type="y" not_null="true" default="true" />
			<column name="sm_md_created" type="text" />
			<column name="sm_md_raid_level" type="text" />
			<column name="sm_md_uuid" type="text" comment="delete ':'" />
			<column name="sm_md_device" type="text" />
			<column name="sm_md_size" type="bigint" comment="Store in bytes, provided as both base 2 and 10" />
			<column name="sm_md_state" type="text" />
			<column name="sm_md_devices_raid" type="integer" />
			<column name="sm_md_devices_total" type="integer" />
			<column name="sm_md_devices_active" type="integer" />
			<column name="sm_md_devices_working" type="integer" />
			<column name="sm_md_devices_failed" type="integer" />
			<column name="sm_md_devices_spare" type="integer" />
		</table>
		<table name="storage_mdadm_device" type="data">
			<column name="smd_id" type="integer">
				<primary_key table="storage_mdadm_device_sequence" value="nextval" />
			</column>
			<column name="smd_sm_id" type="integer">
				<foreign_key name="smd_sm_id" table="storage_mdadm" column="sm_id" />
			</column>
			<column name="smd_md_raid_level" type="text" />
			<column name="smd_md_uuid" type="text" comment="delete ':'" />
			<column name="smd_md_device" type="text" />
			<column name="smd_md_size" type="bigint" comment="Store in bytes, provided as both base 2 and 10" />
			<column name="smd_md_state" type="text" />
			<column name="smd_md_devices_active" type="integer" />
			<column name="smd_md_devices_working" type="integer" />
			<column name="smd_md_devices_failed" type="integer" />
			<column name="smd_md_devices_spare" type="integer" />
		</table>
	</sql>
</striker-scanner>

General:

All 'table' and 'column' elements must have a unique 'name' attribute. This attribute name must be a valid table or column name in PostgreSQL.
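
A uniqueness check of this kind might look like the sketch below, which uses Python's standard 'xml.etree.ElementTree' module. The function name is hypothetical. It assumes table names must be unique across the schema and column names unique within their table; this document does not state the scope of uniqueness.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List


def duplicate_names(xml_text: str) -> List[str]:
    """Report table names and per-table column names that appear more than once."""
    root = ET.fromstring(xml_text)  # expected root: <striker-scanner>
    tables = root.findall("./sql/table")
    problems = []
    table_counts = Counter(table.get("name") for table in tables)
    problems += [f"table {name}" for name, count in table_counts.items() if count > 1]
    for table in tables:
        column_counts = Counter(column.get("name") for column in table.findall("column"))
        problems += [
            f"column {table.get('name')}.{name}"
            for name, count in column_counts.items()
            if count > 1
        ]
    return problems
```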

Element Descriptions

<table ...>

Valid attribute values:

<table type="...">

data Table is a normal data table. One or more '<column ... />' child elements must be provided.
sequence Table is a sequence. New sequences will always start at '0'.

Type 'sequence'

Sequence tables are used by most tables for automatically generating unique sequence numbers for 'data' type table entries. These help ensure uniqueness among entries when all other attributes might change. It's not required, but it is recommended, that a sequence is created for all 'data' type tables.

Type 'data'

These are the main tables used to store data.

<column type="...">

The 'type' attribute is required. Valid attribute values:

<column type="...">

This may be any of the PostgreSQL supported data types. The most common are:

boolean This can have three states:
  • true
  • false
  • null ("unknown", not 'true' or 'false')
text Variable-length, free-form text string.
integer Stores a signed whole number (no decimal place). Acceptable range is -2147483648 to +2147483647.
bigint Similar to 'integer' but with a wider range. Stores a signed whole number (no decimal place). Acceptable range is -9223372036854775808 to +9223372036854775807.
decimal Stores numbers with decimal places. Range is any number with up to 131072 digits before the decimal place and up to 16383 digits after the decimal place.

The 'not_null' attribute is optional. Its value must be 'true' or 'false'. When set to 'true', the associated column will never be allowed to be empty.

The 'default' attribute is optional. Its value must be compatible with the column's 'type'. Compatibility is defined by PostgreSQL. The value given will be assigned to the row when no other value is specified. This is often combined with 'not_null', but it can be used without it, too.

The 'comment' attribute is optional; its value is ignored by the agent.
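
The attribute rules for '<column ... />' can be summarised in a small validation sketch. The function name is hypothetical. It checks only what is stated above (required 'name' and 'type', and 'true' or 'false' for 'not_null'); whether a 'type' or 'default' is acceptable is left to PostgreSQL.

```python
from typing import Dict, List


def column_problems(attributes: Dict[str, str]) -> List[str]:
    """Return a list of rule violations for one column's attributes."""
    problems = []
    if not attributes.get("name"):
        problems.append("missing 'name'")
    if not attributes.get("type"):
        problems.append("missing 'type'")
    not_null = attributes.get("not_null")
    if not_null is not None and not_null not in ("true", "false"):
        problems.append("'not_null' must be 'true' or 'false'")
    # 'default' and 'comment' are optional and not validated here.
    return problems
```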

Child elements

Optional child elements:

<primary_key ... />

Defines the parent column as a SQL Primary Key. Other table columns can later reference an entry in this column. Once referenced, the primary key entry will not be removable so long as any foreign keys still reference it.

Required attributes are 'table' and 'value'. The format is:

<primary_key table="x" value="y" />

Where 'x' is the SQL compliant name of the primary key and 'y' is a SQL compatible value.

<foreign_key ... />

Defines a Foreign Key. It must reference a defined '<primary_key ... />'. Once referenced, the referenced primary key row will not be removable for as long as there are one or more foreign keys referencing it. Further, attempting to reference a non-existent primary key will cause an error.

Required attributes are 'name', 'table' and 'column'. The format is:

<foreign_key name="x" table="y" column="z" />

Where 'x' is the name of the foreign key, 'y' is the <table name="..."> containing the '<primary_key ... />' and 'z' is the <column name="..."> that the foreign key references.
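
The requirement that every foreign key reference a defined primary key could be checked as in the sketch below, which again uses Python's standard 'xml.etree.ElementTree' module. The function name is hypothetical and the approach is only one way such a check might be done.

```python
import xml.etree.ElementTree as ET
from typing import List


def dangling_foreign_keys(xml_text: str) -> List[str]:
    """Return the names of foreign keys whose target has no '<primary_key ... />'."""
    root = ET.fromstring(xml_text)
    # Collect every (table, column) pair that declares a primary key.
    primary_keys = set()
    for table in root.iter("table"):
        for column in table.findall("column"):
            if column.find("primary_key") is not None:
                primary_keys.add((table.get("name"), column.get("name")))
    # Any foreign key pointing elsewhere is dangling.
    missing = []
    for foreign_key in root.iter("foreign_key"):
        target = (foreign_key.get("table"), foreign_key.get("column"))
        if target not in primary_keys:
            missing.append(foreign_key.get("name"))
    return missing
```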

-v, --version

Agents must return the numerical Striker-Agent_API version they conform to. Output must be a single XML element named 'striker-scanner' with a 'variable' attribute of 'SA API' and a 'value' attribute in the format 'x.y', where 'x' and 'y' are integers representing the API major and minor versions supported by the agent, respectively.

Example output:

<striker-scanner variable="SA API" value="1.0" />
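
An agent written in Python might produce this element as follows. This is a minimal sketch using the standard 'xml.etree.ElementTree' module, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def version_output(major: int, minor: int) -> str:
    """Build the single 'striker-scanner' element reporting the API version."""
    element = ET.Element(
        "striker-scanner",
        {"variable": "SA API", "value": f"{major}.{minor}"},
    )
    return ET.tostring(element, encoding="unicode")
```

Calling 'print(version_output(1, 0))' and then exiting with return code '0' would satisfy the switch as described.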