General Build Tags

A Spectre Build script is defined using a build tag with optional tags between open and close { } braces. This topic describes general tags that can be used to qualify the input. Additional topics describe tags for the various input types: text, cBase, Dive, tunnel, and ODBC. For more information, see the links at the end of this topic.

The tags are listed in alphabetical order in each of the following tables.

General Build Tags

Tags	Examples	Notes
build	build { ... }	Defines the block for the Build definition for the cBase. Parameters can be declared here. See the take -parameter tag below.
calendar	calendar "modified-standard" type="standard" { period "year-month" format="YYYY-MMM" } calendar "fiscal" type="custom" { cbase-input "fiscal.cbase" } calendar "fiscal" file="fiscal.calendar"	Adds a calendar for the Build. The first sample declaration reformats the standard display from YYYY/MM to YYYY-MMM. The second sample creates a fiscal calendar from the indicated cBase. The last sample builds the fiscal calendar from the named calendar text file. See Calendar Declarations.
current date	property "Current Date" "2021/02/25"	Defines the system property Current Date. The string value must use the format "YYYY/MM/DD". Use the Spectre function table_system_property() to reference this table property. Available starting with Spectre 7.1(18).
custom-property	custom-property "color" "blue" custom-property "footer" "Confidential"	Defines custom properties for the cBase. Custom properties are not used by Spectre but are available for users to add their own metadata to Spectre objects. Custom properties can be referenced in Spectre expressions using functions like column_custom_property(), table_custom_property(), or calc_custom_property(), depending on the context. TIP: Custom properties in a cBase are available to ProDiver Reports using the variable $MODEL_VAR() (for example, $MODEL_VAR(color)). The custom-property tag, if used to define a footer, can be accessed in ProDiver via report macros using either $MODEL_VAR() or $MODEL_FOOTER. See Spectre Properties.
date dimension	property "Date Dimension" "Dat"	Defines a system property for the cBase where Dat is the name of a column. In Spectre scripts, use table_system_property() to examine the Date Dimension. Available starting with Spectre 7.1(18).
default-calendar	default-calendar "offset november" default-calendar "modified-standard"	Specifies a default calendar. The default calendar can be set to any declared calendar (including custom calendars), or standard, offset <month>, or iso 8601. The default calendar is standard unless declared otherwise.
description	property description "YTD cBase"	Includes text information in the cBase. This system property can be referenced by using the Spectre function table_system_property(). The description property is available to ProDiver Reports through the variable $COMMENTS. For more information, see the ProDiver Help > About Text Variables.
extract-time	extract-time "2010-05-23 11:34:12" extract-time "2010-05-25" extract-time "now" build { take-parameter "Extract Time" default="now" text-input "basic.txt" extract-time "$(Extract Time)" ... }	Defines an extract-time that describes when the input data was extracted from the source data. The string value can be passed in as parameter to the Build script, computed beforehand in the Production script that runs the build. The extract-time should be given a date in the form of "YYYY-MM-DD" or a timestamp in the format of "YYYY-MM-DD HH:MM:SS", or the string "now". If only a date is given, the time will be set to 12 noon of that date. The extract-time value is stored as the `Extract Timestamp` system property in the cBase. It is used in Diver Reports or Report Palettes to populate the $EXTRACT_DATE and $EXTRACT_TIME string variables. For more information, see the ProDiver Help > About Text Variables. To use the extract time in Spectre scripts, see the extract_timestamp() Spectre function. NOTE: The build time is recorded automatically as the `Build Timestamp` system property in the cBase. This build time is available in Diver Reports using $BUILD_DATE and $BUILD_TIME, and in Spectre scripts using the build_timestamp() function. See Miscellaneous Functions.
file		See the calendar tag.
locale	locale "sv_SE" build { text-input { ... } locale "en_US" output "<file name>.cbase" }	Ensures correct handling of monetary and date data. Include this tag if you are working in a non-English setting, or to ensure consistency for multiple cBase inputs. You can reference this system property using the Spectre function table_system_property(). NOTE: Combining cBases with different locales is not possible in 7.0, but can be done in Build scripts starting with Spectre 7.1. When `en_US_POSIX` is the specified or default locale, Spectre uses `en_US`. The `locale` tag setting overrides the system locale or environment locale and builds the month names into the cBase using the month names for the designated language.
<operations>	build { text-input "/common/basic.txt" add "Weight %" `value("Weight") / sum(value("Weight"))` add "Source Index" `row_number()` output "basicplus.cbase" }	Applies operations to the data. Once the input source for the cBase has been specified, operations such as lookups, filtering, sorting and adding can be defined on the data to create new columns. See the Build Operation Tags table below.
output	output "/cbases/sales.cbase"	Defines the location and file name for the cBase output. If the path string is more than a file name, then the sub directory must already exist relative to the current working directory. For example, a script such as `<project>/build/xyz.build` with the line output “cbases/xyz.cbase” will write to `<project>/build/cbases/xyz.cbase`.
property	build { text-input "/common/basic.txt" { ... } property "Description" "This is a description for the cBase"	Defines a system property to be associated with the cBase. A property defined on the build level (not the column level), appears in the Spectre Build Editor as a cBase property. You can refer to the property in Spectre scripts using the table_system_property() function. NOTE: cBase properties are not related to Project Settings, that is user properties. See Spectre Properties.
take-parameter	build { take-parameter "Input File" text-input "$(Input File)" { column ... } }	Declares the parameters as needed so that their values can be used in subsequent lines.

Operations are generally listed after the input type has been specified.

Build Operation Tags

Tags

Examples

Notes

add

add "Double Value" `value("Price") * 2`

add "Weight %" `value("Weight") / sum(value("Weight"))`

add "Source Index" `row_number()`

add "Invoice Month" `lpad(string(extract(value("Invoice Date"), "month")),2,"0")`

add "Order YearMo" `period("gregorian month", value("Order Date"))`

add "YearMo" `period("Fiscal", "year-month", value("Date"))`

Adds a new column as calculated by the expression. The column type is determined by the value returned from the expression. Appears as "Calculated Column" in the Spectre Build GUI. Add operations can include the following tags: label, format, required-dimension, summary-type, suggested-dimensions, sort-by, sort-natural, and custom-property. (See Text, cBase, and Dive Input Build Tags.)

NOTE: You can use cPlan calcs rather than cBase adds to minimize the cBase size and the number of unique columns that would need to be loaded off of disk to handle queries. On the other hand, cPlan calcs and lookups are always slower than accessing a cBase column.

breaks

build {

text-input "../data/basic/basic.txt" {

column "Mood" type="string"

column "Color" type="string"

}

sort {

column "Mood"

column "Color"

}

breaks {

column "Mood"

column "Color"

first "First"

last "Last"

index "Index"

level "Level"

}

output "test_break.cbase"

}

Identifies break levels in the input flow. A break level is a set of rows that have identical values for a set of break columns. The Break object creates up to four new columns that describe the position of each row in the break level. This action allows subsequent calculations to process a row of a break column knowing whether it is the first or last row of the break level. The four new columns are:

First—Indicates the beginning of a break level. This column has a value of "1" on the first row of a break level, and "0" otherwise.
Last—Indicates the end of a break level. This column has a value of "1" on the last row of a break level, and "0" otherwise.
Index—Indexes each break level. This column has a value of "1" on the first row of a break level, "2" on the second row, and continues as such.
Level—Counts the number of break columns that have the same values as the previous row (in order).
- If the value of the first break column is different from the previous value, this column has a value of "0".
- If the first break column has the same value as the previous value, but the second break column is different, this column has a value of "1". This column can be used to detect intermediate break levels.

In the example, the output cBase has two columns, Mood and Color, and four columns, First, Last, Index, and Level. The following table contains sample data:

Mood	Color
Happy	Yellow
Happy	Yellow
Sad	Blue
Sad	Blue
Sad	Blue

When the breaks tag processes this table, the results are as follows:

Mood	Color	First	Last	Index	Level
Happy	Yellow	1	0	1	0
Happy	Yellow	0	1	2	2
Sad	Blue	1	0	1	0
Sad	Blue	0	0	2	2
Sad	Blue	0	1	3	2

expand

expand {

start "Start Month"

end "End Month"

output "Project Month"

}

Expands one row into multiple rows, using an inclusive range defined by two columns. In the example, two columns containing months ("Start Month" and "End Month") are defined as the inclusive range for expansion. Given a table such as the following:

Project	Start Month	End Month	Employees
Project A	2020/06	2020/08	2
Project B	2021/01	2021/03	4

The expand tag processes and builds out one row for each month in between "Start Month" and "End Month" and consolidates the two input columns as the output column "Project Month". Note that other columns ("Employees" in this example) fill the newly expanded rows with identical values.

Project	Project Month	Employees
Project A	2020/06	2
Project A	2020/07	2
Project A	2020/08	2
Project B	2021/01	4
Project B	2021/02	4
Project B	2021/03	4

NOTES:

The expand tag supports integer-backed data types (integer and fixed100) and date types. Fixed100 columns iterate by .01 each row.
Datetime columns are not supported.
Rows where the start value is greater than the end value are excluded from the output, as are null and unknown values.
Period data types must have the same calendar to properly expand.

filter

filter `row_number() <= 80`

filter `param_match("Car Filter", "Category")`

filter `value("Color") = "Red" or value("Direction") = "North"`

text-input "companies.txt" {

filter `value("Sales Region") = "Boston"`

}

cbase-input "districts.cbase"

filter `value("Sales Region") = "Boston"`

Filters the input based on a Spectre expression. The first example pulls in the first 80 rows. In the second example, "Car Filter" is the parameter passed in and "Category" is the column, so the data is filtered accordingly. The third example filters 2 columns on the values in the expression.

Filter statements can be placed inside of individual inputs as well as outside. When inside, the filter is limited to just that input; otherwise it applies to all the inputs.

include-other-left-columns

build {

text-input "my-input.txt"

join "right outer" {

text-input "your-input.txt"

key "A" "A" "A"

include-other-left-columns suggested-dimension=false

}

output "my-output.cbase"

}

Specifies that all columns from the left flow of a join are to be included in the output flow. Key columns are not included. Similar to a column tag, include-other-left-columns has an optional suggested-dimension tag. The include-other-left-columns tag can be used with include-other-right-columns.

include-other-right-columns

Specifies that all columns from the right flow of a join are to be included in the output flow. Same as include-other-left-columns but for the right flow.

join

join "left outer" {

text-input "/data/Customer2.txt"{

column "Customer ID" type="integer"

column "Customer Name" type="string"

column "Location" type="string"

}

key "Customer ID" "Customer ID" "Cust ID"

column "Invoice Number"

column "Product"

column "Quantity"

column "Unit Price"

column "Customer Name"

column "Location"

column "Cust ID"

}

Joins two input flows. Join type can be:

inner—Only rows that have matching keys are returned.
left outer—All rows from the left flow are returned.
outer—All rows are returned in the output flow.
right outer—All rows from the right flow are returned.

keep

build {

cbase-input "budget.cbase" {

keep {

column "Account"

column "Fee Total"

}

Keeps only the specified columns from the input. Useful when there are many columns and only a select few are needed. Keep must be part of the input block.

key

key "Customer ID" "Customer ID" "Cust ID"

Names the left and right flow columns to match on, plus the column to include in the output. This attribute is part of a join block.

keys-are-required-dimensions

Indicates that all columns brought in via the lookup are flagged as Infos off of the key dimension. This attribute is part of the lookup block.

lookup

build {

text-input "basic.txt"

lookup {

text-input "lookup-color.txt"

key "Color" "LU_Color"

column "Wavelength"

column "Hex"

}

output "lookup.cbase"

}

lookup {

text-input "lookup.txt"

key "A" "A"

column "B" suggested-dimension=false

}

lookup {

text-input "lookup.txt" {

include-other-columns suggested-dimension=false

}

key "A" "A"

column "B"

}

lookup {

consolidate {

cbase-input "lookup-pre.cbase"

cbase-input "lookup-post.cbase"

}

key "A" "A"

column "B"

}

Includes supplementary data in the cBase. The first example lookup supplements the columns in basic.txt by matching the basic.txt's "Color" column against the lookup-color.txt's "LU_Color" column, to retrieve two additional columns: Wavelength and Hex.

NOTE:

It is a best practice to specify which columns are wanted from the lookup.
Lookup keys need to have the same data type defined in the source and lookup.
Multiple keys can be declared, allowing for a multi-key join.
Lookup columns can be set as optional by using the suggested-dimension attribute with either column or include-other-columns.
When there is no child block after text-input, all columns are brought in and their types are inferred.
Use of a child block allows you to limit columns and apply other build operations (that is, filter, remove, lookup, and resolve-duplicate).
Avoid operations inside a lookup block.
If <unknown> appears as a key for a row, then <unknown> is used for all lookup columns for that row.
Null is a valid key.
By default, Spectre allows duplicate keys as long as the values are the same; duplicate keys with different values trigger a warning. See resolve-duplicates. Also note that having duplicate keys, even with equal values, is slower than proper lookup files.
Lookups are case sensitive. To simulate case-insensitivity, add a column with the lower-cased version of your actual key column on either side, join on it and (optionally) remove it. See String Functions.
If a row in a lookup's base table has a value <unknown> in a key column, then the lookup columns are assigned the value <unknown> at that row. Best practice is to not have key columns in the base table that contain the value <unknown>.

prefix

dive {

cplan {

text-input "/data/basic.txt" {

add "C" `value("Color")`

prefix "prefix-"

}

window {

dimension "prefix-C"

}

Prefixes all the outputted column names with the specified string.

NOTE: The prefix is applied after the operations. In this example, the value() call uses the column name "Color" un-prefixed, while the output column "C" is prefixed to be called "prefix-C".

range-key

build {

text-input "base.txt"

lookup {

text-input "lookup.txt"

range-key "Date" "Start Date" "End Date"

key "Code" "Code"

}

output "result.cbase"

}

cplan {

text-input "some_integers.txt"

lookup {

text-input "neighborhoods.txt"

range-key "number" "small" "big"

}

Performs lookups using the range-key tag to connect one or more of the key columns to pairs of range columns in the lookup. A range-key takes three arguments:

Name of the key column in the main table
Name of the begin range column in the lookup table
Name of the end range column in the lookup table.

For a lookup row to match a main row, the main table's key value must be between the begin-range and end-range values, inclusively.

The begin-range and end-range values can be null, in which case the range is open on either side (or both sides).

Regular lookup keys can be any type except for double. Range lookup keys can be any type except for string.

Build Range Lookup Example

remove

remove "Account"

remove "Fee Total"

Removes the column named.

rename

rename "Mood" "Attitude"

Renames a column. In this example, the "Mood" column is renamed "Attitude." In the Spectre Build GUI this appears as an Alias.

replace

replace "Weight" `value("Weight") * 100`

replace "Customer Phone" `concat("(",substr(value("Customer Phone"),1,3),") ", substr(value("Customer Phone"),4,3), "-",substr(value("Customer Phone"),7,4))`

column "Profit" type="double"

replace "Profit" `fixed100(value("Profit"))`

Replaces a column's contents based on a Spectre expression. In the first example, the weight column contents is replaced by the result of multiplying it by 100. The second example replaces the phone number with a formatted version. The third example replaces the double value in the input with a fixed100 value in the output.

Replace is a type of add operation. See the description for add earlier in this table.

resolve-duplicates

lookup {

text-input "presidents.csv"

key "Name" "Name"

column "Rank"

resolve-duplicates "first"

}

Controls behavior for lookups. The tag value can be strict, any, first, or last. The selection becomes relevant whenever a lookup table has more than one row with the same set of keys (or, in the case of range-keys, when the keys cannot resolve some row in the base table unambiguously to one row in the lookup table). The specifications are:

strict—lookup succeeds when there is exactly one match (default)
any—any matching row in the lookup table can be used
first—the earliest match in the lookup table is used
last—the last match in the lookup table is used

NOTE:

By default (the strict option), when a build encounters duplicate lookup keys with identical values, that value is used. Otherwise, a warning is issued due to the ambiguity, and <unknown> is used instead. There is only one warning issued per lookup. Use the resolve-duplicate tag set to any, first, or last to change this behavior.
Best practice is to avoid duplicate or <unknown> keys.

rotate

rotate {

rotate-output-names "Party" "First Name" "Last Name"

row-out "Democrat" "DEMOCRAT FIRST" "DEMOCRAT LAST"

row-out "Republican" "REPUBLICAN FIRST" "REPUBLICAN LAST"

}

Defines a rotate operation on the data flow.

rotate-output-names

rotate-output-names "heading col name" "rotated data 1" "rotated data 2"

Identifies a heading column that indicates which columns are rotated to create a row, and value columns for the rotated column values. This tag names the columns that the respective row-out arguments refer to.

TIP: Elements for the rotate-output-names are labeled Row Name Column and Value Columns in the GUI.

row-out

row-out "head" "start col" "end col"

Refines the rotation. The row-out tags specify the data that populates the heading column with the first argument and the columns that are rotated with the remaining arguments.

TIP: Elements for row-out appear in the GUI as Input columns :DimensionValue (first arg) and whatever is specified for Column.

sort

cbase-input "demo_drs*.cbase"{

sort {

column "Customer"

column "City"

}

sort column "Profit" reverse=true

sort {

column "Account ID"

column "ED Arrival Date" reverse=true

}

add "Last ED Visit for Account" `if(on_previous_row(value("Account ID"))=value("Account ID"),"","Y")`

Sorts the data on one or more columns; reverse is optional.

The sort as part of the build block sorts the union of multiple inputs; as part of an input block, sorts the inputs separately. For example, in this first snippet the list of cBases (indicated by the wildcard character) are sorted individually, not as one input.

The third snippet uses the on_previous_row function, in combination with the sort, to find the last ED visit for each account.

NOTE: Spectre uses "numeric collation" when sorting strings. When a subset of a string looks like a number, then the numeric value is considered during the sort. For example, Spectre sorts "a2" before "a10". See the sort-natural tag in Text, cBase, and Dive Input Build Tags.

TIP: Using a sort with a filter expression can allow you to squash your data—eliminate detail.

squash

build {

cbase-input "../data/basic/basic.cbase"

squash {

dimension "Mood"

dimension "Lower Taste" `lower(value("Taste"))`

column "Height"

column "Weight"

column "X" `calc("Height") + calc("Weight")`

}

output "test_output.cbase"

}

Combines rows that are identical for a set of dimension columns. This tag adheres to the following syntax:

squash {

dimension "Dim1"

...

column "Calc1"

...

}

Columns may be explicitly referenced for inclusion or calculated as a constant or summary value. In the example, the Mood column and a column called Lower Taste are the dimensions that are evaluated to remove duplicate values. Lower Taste is calculated to contain the lowercase contents of the Taste column. The included columns are the existing columns of Height and Weight as well as a column called X. Each row of the X column contains that same row's Height and Weight values added together.

use-script-order

rotate use-script-order=false {

rotate-output-names "Size" "Count"

row-out "Small"

row-out "Medium"

row-out "Large"

}

Refines the rotation by allowing the programmer to avoid using script order for strings in a rotate. If not specified, defaults to true.

With the example in the snippet, the headings column, when sorted, will sort in string-sort order: "Large", "Medium", "Small". If use-script-order were omitted or set to true, then that column would sort in script-order—that is, the order they appear in the script: "Small", "Medium" then "Large". Available starting with Spectre 7.1(19).