Directory Input Object

The Integrator Directory input object processes file system directories as input data, allowing Integrator to list files that exist on the system and process their names and attributes as data. This data is an input flow for other Integrator objects and can be processed like any other data source.

By default, this input object will list files in a single directory. When the walk flag is set to "true", this input object will act on all files in a sub-tree of the given directory, processing all files that are under it in the file system hierarchy.
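The difference between the two listing modes can be illustrated outside Integrator. The following Python sketch is only an analogy for the behavior described above, not Integrator's implementation; the function name `list_entries` and its `walk` parameter are hypothetical names chosen to mirror the walk flag.

```python
import os

def list_entries(top=".", walk=False):
    """Return sorted paths of directory entries under `top`.

    walk=False: only the entries directly inside `top`.
    walk=True:  entries in `top` and in every directory below it.
    """
    if not walk:
        return sorted(os.path.join(top, name) for name in os.listdir(top))
    found = []
    # os.walk visits `top` and then each subdirectory in the sub-tree.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

With a directory that contains one file and one subdirectory holding a second file, the default mode returns two entries, and the walk mode returns three. Filtering by entry type is a separate step in this sketch.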

Because the Directory object only creates a list of files, another input object is required to read the contents of those files. For example, set the file_list_input attribute of a Filein object to the name of the Directory object. The Filein object can then choose which files are to be included in the input flow.

Directory Attributes

Attribute Type Description
input_type (required) String Identifies the object as a Directory input object. The value of this string is "directory".
directory String Defines the top-level directory to be listed. If this attribute is not present, the current directory is used.
file_type String Defines the type of directory entries that are listed by the Directory object. This attribute can be "file", "directory", or "link". Only directory entries matching the given type are returned. Either file_type or file_types is required. The two attributes are mutually exclusive; use one or the other.
file_types Array of Strings Defines the types of directory entries that are listed by the Directory object. Each string can be "file", "directory", or "link". Only directory entries matching the given types are returned. Either file_types or file_type is required. The two attributes are mutually exclusive; use one or the other.
walk Boolean Determines whether to process files in a single directory or in an entire sub-tree. Values include:
  • true—The directory and its subdirectories are listed.
  • false—Only the directory is listed. (default)

starname String Defines a "star name" (file match string) for selecting a set of input files based on a wildcard. The string is case-insensitive. This attribute does not restrict which subdirectories are walked when the walk flag is set. This attribute is optional; if it is not used, all files are selected. The following wildcard characters are supported:
  • ? (question mark)—Matches any single character.
  • * (asterisk)—Matches a sequence of characters.

starnames Array of Strings Allows multiple starname strings to be used to specify filenames. If a starname string does not match any files, it is ignored. The attributes starname and starnames are mutually exclusive. Use one or the other.
first Integer Determines the number of records to be read from the Directory object. If present, Integrator reads up to the specified number of records. This limit is particularly useful for script testing on a small number of input records. If this attribute is not used, all rows are returned.
hidden Boolean Determines whether hidden directory entries are returned by the Directory object. This attribute is optional; if it is not present, hidden directory entries are not listed. Values include:
  • true—Hidden directories are returned by the Directory object and hidden subdirectories are listed if the walk attribute is true. For UNIX and OS/400 platforms, any directory entry starting with "." is considered hidden. On Windows platforms, any directory entry with the Hidden attribute set is considered hidden.
  • false—Hidden directory entries are not listed. (default)

error_action String Determines the behavior of the Directory object when it encounters a directory that cannot be listed or a file that cannot be examined. Values include:
  • error—Listing problems will stop Integrator with an error. (default)
  • warn—Listing problems are noted, but processing continues.
  • ignore—Listing problems will not be reported and processing continues.

If this attribute is not specified, it defaults to "error".

NOTE: This attribute is Error Action in Visual Integrator.

aliases Array of Strings

Defines new column names for the columns already defined in the input data. The format is "oldname=newname". Blanks before or after the column names are ignored. Spaces within a column name are acceptable. If newname is blank, the given column is deleted from the output flow.

NOTE: This attribute is Alias Lines in Visual Integrator.

prefix String Defines a prefix that is prepended to all columns in the flow that are not aliased using the aliases array. If you want a space between the prefix and the column name, include that space in the prefix string definition.
keep_columns Array of Strings Defines a list of columns to be kept by the object. If this attribute is not used, all columns are kept. The output flow of the object is limited to those columns that are listed, and no excluded columns are available to subsequent process objects. Column names in the keep_columns array should be given after they are aliased or prepended with the prefix string.
encoding String

Defines how file names are read and interpreted in terms of character encoding. Values include:

  • auto—The input object sets the encoding based on the file signature and the Unicode state of other objects in the same task.
  • ascii—The characters in the file are interpreted as ISO-8859-1 or Latin1 characters.
  • gb18030—The file is interpreted as Chinese National Standard 18030-2000 characters. The gb18030 encoding option is supported on Windows platforms only.

  • latin1—The characters in the file are interpreted as ISO-8859-1 or Latin1 characters.
  • utf-8—The file is interpreted as UTF-8 Unicode characters.
  • unicode—The file is interpreted as 2-byte Unicode characters (UCS-2) with native byte swapping, unless overridden by a UCS-2 file signature.
  • unicode-be—The file is interpreted as UCS-2 characters in a big-endian fashion.

  • unicode-le—The file is interpreted as UCS-2 characters in a little-endian fashion.

UCS-2 and UTF-8 files can include a Byte Order Mark (BOM) at the beginning of the file to denote the file encoding. These file signatures are defined as follows:

  • UCS-2 Big Endian: FE FF
  • UCS-2 Little Endian: FF FE
  • UTF-8: EF BB BF

File signatures are common for Unicode files on Windows operating systems. If the file input object reads multiple files, the signature of each file determines its encoding.

If the encoding attribute is auto and no signature is found, the encoding is assumed to be latin1 if no other object in the task handles Unicode data and the VI file is not encoded as utf-8 (using the charset 1208 directive). Otherwise, the encoding is assumed to be utf-8.

See also Integrator Unicode Data Support.
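The signature check and the fallback rule for the auto setting can be sketched in Python. This is an illustration of the rules stated above, not Integrator's own code. The function `detect_encoding` and the flag `task_uses_unicode` are hypothetical; the flag stands in for both conditions in the fallback rule (another object in the task handles Unicode data, or the VI file is encoded as utf-8).

```python
import codecs

# File signatures (BOMs) paired with the matching encoding attribute values.
SIGNATURES = [
    (codecs.BOM_UTF8, "utf-8"),            # EF BB BF
    (codecs.BOM_UTF16_BE, "unicode-be"),   # FE FF
    (codecs.BOM_UTF16_LE, "unicode-le"),   # FF FE
]

def detect_encoding(head: bytes, task_uses_unicode: bool = False) -> str:
    """Pick an encoding from the first bytes of a file, as 'auto' would.

    task_uses_unicode is an assumed stand-in for the task-level Unicode
    state described in the text.
    """
    for bom, name in SIGNATURES:
        if head.startswith(bom):
            return name
    # No signature found: fall back according to the task's Unicode state.
    return "utf-8" if task_uses_unicode else "latin1"
```

For example, `detect_encoding(b"\xff\xfea\x00")` selects "unicode-le", while a file with no signature falls back to "latin1" unless the task already handles Unicode data.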

trace_after Sub-object

Traces data flows leaving the specified object, which makes debugging scripts easier. This is equivalent to adding a Trace process object immediately after the current object.

See Embedded Trace Object for more on using trace sub-objects.
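As a rough illustration of how the starname, starnames, hidden, and first attributes interact, the following Python sketch filters a list of names. It is an analogy under stated assumptions, not Integrator's implementation: the helper names are hypothetical, "*" is assumed to match zero or more characters, and hidden entries are detected with the UNIX rule (a leading "."), since the Windows Hidden attribute cannot be read from a name alone.

```python
import re

def starname_to_regex(starname):
    """Translate a star name into a case-insensitive regular expression."""
    parts = []
    for ch in starname:
        if ch == "*":
            parts.append(".*")      # assumed: any sequence, including empty
        elif ch == "?":
            parts.append(".")       # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)

def select(names, starnames=None, hidden=False, first=None):
    """Filter names the way the attributes above are described."""
    patterns = [starname_to_regex(s) for s in (starnames or [])]
    picked = []
    for name in names:
        if first is not None and len(picked) >= first:
            break                   # 'first' caps the number of records
        if not hidden and name.startswith("."):
            continue                # skip hidden entries by default
        if patterns and not any(p.fullmatch(name) for p in patterns):
            continue                # no star name matched this entry
        picked.append(name)
    return picked
```

A pattern that matches nothing, such as "*.zip" in a directory without ZIP files, simply contributes no names, which mirrors the statement that unmatched starname strings are ignored.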

The Directory object generates an output flow containing the following columns.

NOTE: When running within a project, all platform-specific fields (Drive, Access Mode, Inode number) will be blank.

Directory Object Output Flow Columns

Column Description
Filename Contains the single-level filename within the directory (for example, "test.int").
Path Contains the full pathname of the directory entry, including the filename.
Drive On Windows platforms, contains the drive name for the given file. On other platforms, or when the directory is a UNC path (for example, \\machine\...), this column is blank.
File Extension Contains the file extension for the given directory entry. This is defined as the string following the last '.' in the filename.
Relative Path Contains the relative path of the directory entry, relative to the top-level directory. For example, if /di_solution/diveline is listed with the walk flag set to true, the relative path for "/di_solution/diveline/config/atlcfg.cfg" will be "config/atlcfg.cfg".
File Type Contains the file type of the given directory entry. On Windows platforms, this will be one of "file" or "directory". On other platforms, this will be one of "file", "directory", "link", "named pipe", "block device", "character device", or "socket".
File Size Contains the file size, in bytes, of the given directory entry.
Modified Date Contains the date (of the local machine) the directory entry was last modified, as a standard Dimensional Insight date (YYYY/MM/DD).
Accessed Date Contains the date (of the local machine) the directory entry was last accessed, as a standard Dimensional Insight date (YYYY/MM/DD).
Modified Time Contains the time (of the local machine) the directory entry was last modified, as a standard Dimensional Insight time (HH:MM:SS).
Accessed Time Contains the time (of the local machine) the directory entry was last accessed, as a standard Dimensional Insight time (HH:MM:SS).
C Date

On Windows platforms, contains the date (of the local machine) the directory entry was created, as a standard Dimensional Insight date (YYYY/MM/DD).

On UNIX and Linux platforms, contains the date on which the content or metadata of the directory entry last changed (UNIX ctime).

C Time

On Windows platforms, contains the time (of the local machine) the directory entry was created, as a standard Dimensional Insight time (HH:MM:SS).

On UNIX and Linux platforms, contains the time at which the content or metadata of the directory entry last changed (UNIX ctime).

File Attributes On Windows platforms, contains the file attributes for the given directory entry, as listed by the DOS DIR command. The following attributes are returned: Archive (A), Hidden (H), Read-only (R), and System (S). On non-Windows platforms, this column is blank.
Access Mode On non-Windows platforms, contains the file mode as shown by the UNIX "ls" command, which describes the file type and access bits for the given directory entry. The access mode is a 10-character string, with "-" representing null access, for example, "drwxr-xr-x".
Owner On non-Windows platforms, contains the owner of the file as a string, for example, "root". On Windows platforms, this column is blank.
Group On non-Windows platforms, contains the file’s group as a string, e.g., "staff". On Windows platforms, this column is blank.
UID On non-Windows platforms, contains the UID (user id) of the owner of the file as a number. On Windows platforms, this column is blank.
GID On non-Windows platforms, contains the GID (group ID) of the file's group as a number. On Windows platforms, this column is blank.
Inode Number On non-Windows platforms, contains the i-node number of the directory entry. On Windows platforms, this column is blank.
Link Target On non-Windows platforms, contains the target of a symbolic link. On Windows platforms, or for non-link directory entries, this column is blank.
File Blocks Contains the allocation size of the given directory entry, in kilobyte blocks.
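To make a few of these column definitions concrete, the following Python sketch derives comparable values for a single directory entry. It is an approximation for illustration only, not the way Integrator computes its columns; the function `describe` and its parameters are hypothetical. The relative path uses the separator of the local operating system, and the extension follows the literal definition above (the string after the last '.').

```python
import os
import stat
import time

def describe(path, top):
    """Build a dict resembling part of one Directory output row."""
    st = os.lstat(path)                    # lstat does not follow links
    name = os.path.basename(path)
    modified = time.localtime(st.st_mtime) # local machine time
    return {
        "Filename": name,
        "Path": os.path.abspath(path),
        "File Extension": name.rsplit(".", 1)[1] if "." in name else "",
        "Relative Path": os.path.relpath(path, top),
        "File Size": st.st_size,
        "Modified Date": time.strftime("%Y/%m/%d", modified),
        "Modified Time": time.strftime("%H:%M:%S", modified),
        # 10-character mode string in the style of "ls -l"
        "Access Mode": stat.filemode(st.st_mode),
    }
```

For a directory with mode 755, `stat.filemode` produces "drwxr-xr-x", which matches the 10-character format of the Access Mode column.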

NOTE: When Integrator is run within a project, only the following information is returned:

  • Filename
  • Path
  • File Extension
  • Relative Path
  • File Type
  • File Size
  • Modified Date
  • Modified Time

Other file information is beyond the project abstraction and will not be returned.