Working with JSON File Format in Spectre
JSON is a structured data format that can be used as a data input. Because the JSON format does not necessarily conform to the structure of a cBase, certain values must be set in the json-input object to properly import values from a JSON file. Consider the following JSON file as an example:
{
"type": "collection",
"entries": [
{
"resource": {
"id": "1",
"name": {
"first": "Jane",
"last": "Doe"
},
"birthDate": "1973-05-31"
},
...
}]
}
To read this data, each record in the JSON file needs to be defined in Spectre. In this example, that is the entries array. Spectre calls that the parent. Next, Spectre creates a single row for each element in the parent. Those elements can be complex JSON objects or arrays. By configuring the columns in the json-input object, users can control which values are imported. The following build script returns the ID, First Name, Last Name, and Birth Date from the JSON file:
build {
json-input "people.json"parent="entries" {
column "ID"key="resource.id"
column "First Name"key="resource.name.first"
column "Last Name"key="resource.name.last"
column "Birth Date"key="resource.birthDate"
}
output "people.cbase" }
JSON value types are:
-
null
-
Boolean
-
number
-
string
-
object
-
array
Object and array values cannot be mapped to Spectre values. Instead, they can be deconstructed into multiple columns using the key attribute. If no type is given for a column, Spectre determines the column type while reading the file. For string JSON values, Spectre determines if the string is an integer, fixed100, double, date, datetime, or period before setting the overall column type.
Tag | Description |
---|---|
path |
Defines a .json file as a data source. This field must be a relative path to the json file to read. The file must match standard JSON specifications. TIP: You can use JSON Lint to verify the validity of your file. |
parent |
If it is provided, identifies where the parent value can be found in the file. See JSON path(below). The parent contains the values that are mapped to rows. The parent value must be an object or an array. If a parent tag is absent, the root of the JSON document is used as the parent. |
name | Specifies the name of the json-input object. |
limit-rows | Directs the build to use a specific number of rows from the JSON file. TIP: Using limit-rows=1 is an easy way to ensure calculations find all needed columns in the input. |
trim | Removes any leading or trailing spaces from the data. This is the default behavior for Spectre. Set to false to disable the trimming of white space. |
property |
Defines a system property to be associated with the cBase. A property defined on the build level (not the column level), appears in the Spectre Build Editor as a cBase property. You can refer to the property in Spectre scripts using the table_system_property() function. NOTE: cBase properties are not related to project settings, those are user properties. See Spectre Properties. |
custom-property |
Defines custom properties for the cBase. Custom properties are not used by Spectre but are available for users to add their own metadata to Spectre objects. Custom properties can be referenced in Spectre expressions using functions such as column_custom_property(), table_custom_property(), or calc_custom_property(), depending on the context. TIP:
|
prefix |
Prefixes all outputted column names with the specified string. NOTE: The prefix is applied after all operations. |
column |
Defines the columns for the cBase. The key attribute controls how Spectre finds the value for the column. If a key is not provided, Spectre uses the first JSON field name that matches the column name (ignoring case). If a key is provided, the column is considered a JSON Path and identifies where to find the value for this column in the JSON file. |
drop-key |
Specifies the JSON field names to ignore when using the include-other-columns tag. NOTE: JSON field names are case sensitive. It is possible for a JSON object to have both an ID field and an id field. To drop both fields, you must specify both of them: drop-key "ID" drop-key "id" |
id-column |
Defines a column where each row is the name of a field in a JSON object. For example, given the following input: Copy
Declare a column as: id-column "Abbrev" The Abbrev column has the values CA, TX, FL. NOTE: When using the include-other-columns attribute, the default behavior for object parent values is to create a column named ID for the field names. You can use the drop-id attribute to avoid creating that column. |
drop-id | When the parent value is an object and you are using the include-other-columns attribute, this prevents the creation of an ID column for the field names. |
include-other-columns |
Automatically includes columns not explicitly listed. NOTE: When using the include-other-columns attribute, only fields that have null, Boolean, number, or string values are included. If the incoming values have fields of type object or array, they are not automatically made into columns. For example, given the following input: Copy
Spectre creates id and date columns, but not info or color columns. To read the color field, declare it using the key attribute: column "color" key="info.color". |
Some options are interpreted as JSON Paths (parent in json-input, and key in column). A JSON Path is a sequence of field names or array indexes, identifying where to find a value in a complex object or array.
For instance, in the following input:
[{
"name": "Boston",
"info": {
"state": "Massachusetts",
"lat long": [
42.361145,
-71.057083
]
}
},
...
]
Here are some sample JSON paths and the values they would extract:
- name—Boston
-
info.state—Massachusetts
-
info.'lat long'.0—42.361145
-
info.'lat long'.1— -71.057083
The tokens in a JSON path are separated by periods. Each token in a JSON path is either a plain string or a quoted string.
For object values, the token is a field name. For array values, the token is the array name followed by an integer representing the index of the array item. Array indexes are zero-based. In the above example, the first element of the lat long array is referenced as info.'lat long'.0, where the zero represents the first element of the array. For each subsequent element of the array, the index iterates by one.
Use quoted strings for field names that contain characters other than letters, numbers or underscores. Within a quoted string, use backslashes to escape the quotation mark character or other backslashes.
NOTE: Because the JSON path is already in double quotation marks in the Spectre script, you need an extra layer of backslashes for double quotation marks in quoted strings. For this reason, DI recommends that you use single quotation marks.
As of Spectre 7.2(14), the asterisk (*) character is used to retrieve all values or objects from an input JSON array.
For instance, consider the following input:
[{
"cust": "A",
"addr": [ "A-1", "A-2" ]
},
{
"cust": "B",
"addr": [ "B-1" ]
}]
Using a wildcard character in the key attribute for addr retrieves all of the values of the addr array.
json-input "test.json" { column "Customer"key="cust" column "Address"key="addr.*" }
This produces the following output:
Customer | Address |
---|---|
A | A-1 |
A | A-2 |
B | B-1 |
Note that a row is created for each element in the array, not for each element in the JSON object.
Multiple columns can be pulled from the same sub-array.
For instance, consider the following input:
[ {
"cust": "A",
"addr":
[
{"city": "A-1", "state": "AA-1" },
{ "city": "A-2", "state": "AA-2" }
]
}, {
"cust": "B",
"addr": [ {"city": "B-1", "state": "BB-1" } ]
}]
Using the wildcard character followed by the name of an object in a sub-array specifies that the input column is populated only with values from that object.
json-input "test.json" { column "Customer"key="cust" column "City"key="addr.*.city" column "State"key="addr.*.state" }
This produces consider the following output:
Customer | City | State |
---|---|---|
A | A-1 | AA-1 |
A | A-2 | AA-2 |
B | B-1 | BB-1 |
NOTE:The wildcard character cannot be used in the parent tag.
For more information about JSON, see JSON Reference.