FACT Extracts from IRI Workbench
Eclipse Plug-In Unloads & Loads 'Big Data'
Users of IRI's FAst extraCT (FACT) tool can now design and launch large table extracts from the IRI Workbench, an Eclipse Plug-In that already supports CoSort® SortCL transformation and FieldShield® data masking functions.
The new job wizard, syntax-aware configuration file editor, and graphical execution options for FACT in the Workbench provide all the front-end support necessary for rapidly off-loading table data into portable flat files. In addition to creating and launching the FACT configuration (.ini) file, the Workbench now supports target table creation and loader control file specification for faster loading.
Fast extracts from, and pre-CoSorted loads to, very large database (VLDB) tables play key performance roles in: high-volume data warehouse and operational data store (ODS) acquisition, off-line reorgs, database migration and replication, archive and retrieval, data franchising (data preparation for BI tools), ad hoc reporting, and search-based applications (SBAs). FACT uses proprietary connection protocols, multiple threads, and standard SQL select syntax to extract data from Oracle, DB2, Sybase, SQL Server, and Altibase tables on Unix, Linux and Windows.
To support subsequent (batch) or concurrent (piped) transformation and load operations, FACT automatically creates the extract file metadata in both CoSort/SortCL Data Definition File (DDF) and the database's load utility metadata formats. Using FACT in the IRI Workbench lets you utilize that functionality in a broader visual and operational context, where you can:
see and work with data in source and target tables via the Data Source Explorer
use the Data Definition File (DDF) metadata FACT creates in CoSort SortCL and FieldShield jobs
run CoSort SortCL data transformations and reports, plus direct path loads, in-line with FACT (batch/piped ETL)
feed FACT, CoSort or FieldShield output to other Eclipse plug-ins like BIRT for advanced reporting
work on unload, load, reorg, and ETL projects in teams with version control
Additional details on, and screen images from, FACT within the IRI Workbench can be found here. If you have any questions about FACT, or would like to arrange a webinar or obtain an on-site evaluation copy, please email firstname.lastname@example.org.
RowGen Development Update
IRI's Test Data Tool is Being Upgraded
Along with FieldShield, IRI's data masking and encryption solution, RowGen can be part of data loss prevention and privacy law compliance initiatives by replacing the need for production data for testing, outsourcing, and application development. Using realistic, referentially correct test data is also a safe way to prototype ETL and database operations, and to benchmark new hardware and software platforms.
The current RowGen release, 2.11, uses the same syntax as the CoSort SortCL program to create big data in custom formats suitable for testing. On the Windows side, the product ships with a data model-parsing interface created by RapidACE LLC.
IRI is now in the process of re-developing its data model parsers using both newer technology from RapidACE and the Eclipse Data Tools Plug-In (DTP) expressed through the Data Source Explorer window in the IRI Workbench. In addition to more ergonomic RowGen job script creation in the GUI, IRI Workbench users will be able to send pre-sorted test data directly to ODBC-connected tables and bulk database load utilities.
IRI intends to add the updated RowGen functionality in the IRI Workbench next quarter. Meanwhile, if you have specific feature/function requests for the next RowGen release, please email email@example.com.
CoSort Expands in Hospitality Sector
Mereo in France Leverages IRI's Big Data Engine
Mereo is a leading provider in the field of revenue optimization and business intelligence (BI) for Hospitality, Entertainment and Travel sector clients.
Since 2000, Mereo has supported these efforts -- from profit assessment to process improvement, tools integration, implementation, maintenance and support, as well as staff training. Based in Paris, the company has assembled an expert consulting and engineering team experienced in yield and revenue management solution implementations in the travel, leisure and media industries.
Mereo has worked with IRI and CoSort Solutions France to integrate CoSort into their core applications. With high volumes of client sales data to be analyzed, Mereo uses CoSort to manipulate and manage their data sets, and to calculate different sets of key performance indicators (KPI) from that data.
CoSort was selected because its SortCL program suited Mereo's data transformation requirements perfectly, and ran across the various Unix, Linux and Windows platforms Mereo customers are using. Mereo's integration of SortCL in ETL-related operations is typical because data warehouse and BI architects can easily leverage the sort, join, aggregation, cross-calculation, and reformatting functions through simple 4GL job scripts.
For more information on the data transformation functions that SortCL can perform in a single pass for high volume data staging environments, click here.
SortCL can perform similar bulk data preparation for BI tools (also known as data franchising); SortCL creates CSV and XML file targets, as well as ODBC row inserts, as hand-offs to those tools. For more information on the business intelligence functions SortCL can perform natively as a report generator, or as a data franchising tool, click here.
Tech Tip - Field Predicate Feature
A New Short-Cut for SortCL Script Writers
For those CoSort users moving to version 9.5.1 and still using text editors to create and modify job scripts (rather than the IRI Workbench GUI that automates script creation), SortCL's new /FIELD_PREDICATE statement can help reduce the size and complexity of job scripts -- and thus the time needed to manually create and edit them -- by making field definitions as simple as a field name.
The predicate allows you to specify one or more repeating attributes of input or output fields only once at the beginning of the /INFILE or /OUTFILE section of a job script, rather than having to specify those attributes in every /FIELD statement. Once specified, SortCL will use the same attributes in every field statement that follows the predicate statement until the predicate attribute is either manually overridden in a specific field, or replaced by a subsequent predicate statement.
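The inheritance rule described above can be modeled outside SortCL. The following Python sketch is only an illustration of that rule, not SortCL syntax: the tuple format, the `resolve_fields` helper, and the attribute names `size` and `fill` are hypothetical, and it assumes a later predicate replaces only the attributes it names.

```python
def resolve_fields(statements):
    """Expand predicate defaults into fully specified field definitions.

    `statements` is a list of hypothetical tuples:
      ("predicate", {attr: value, ...})
      ("field", name, {attr: value, ...})   # per-field overrides, may be empty
    """
    defaults = {}
    resolved = []
    for statement in statements:
        if statement[0] == "predicate":
            # Assumption: a later predicate replaces only the attributes it names.
            defaults = {**defaults, **statement[1]}
        else:
            _, name, overrides = statement
            # Field-level attributes win over the predicate for this field only.
            resolved.append({"name": name, **defaults, **overrides})
    return resolved


statements = [
    ("predicate", {"size": 15, "fill": "^"}),
    ("field", "name", {}),
    ("field", "street", {}),
    ("predicate", {"size": 17, "fill": "x"}),
    ("field", "city", {}),
    ("field", "state", {"fill": "-"}),  # manual override in a single field
]

fields = resolve_fields(statements)
for field in fields:
    print(field)
```

Only the last field carries an explicit attribute; every other field is written as a bare name and picks up whatever predicate is in effect at that point.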
In addition to storing one or more repeating field attributes, the predicate will also automatically calculate byte offsets for fixed-position fields and augment the ordinal positions of delimited fields. This makes it easier to add or remove fields because their positions will be re-calculated automatically, and eliminates the possibility of specifying a wrong position number.
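To make the position arithmetic concrete, here is a minimal Python sketch of the two calculations. The function names are hypothetical, positions are assumed to be 1-based, and fixed-position fields are assumed to be laid out back to back with no separator; SortCL's own rules may differ in detail.

```python
def fixed_offsets(sizes, start=1):
    """Return (first_byte, last_byte) pairs, 1-based and inclusive,
    for consecutive fixed-width fields of the given sizes."""
    positions = []
    cursor = start
    for size in sizes:
        positions.append((cursor, cursor + size - 1))
        cursor += size
    return positions


def delimited_ordinals(names, start=1):
    """Return the ordinal position of each delimited field, in order."""
    return {name: start + index for index, name in enumerate(names)}


# Two 15-byte fields followed by two 17-byte fields.
print(fixed_offsets([15, 15, 17, 17]))

# Inserting a 10-byte field after the first one shifts every later
# field; nothing has to be renumbered by hand.
print(fixed_offsets([15, 10, 15, 17, 17]))

print(delimited_ordinals(["name", "street", "city", "state"]))
```

Because each position is derived from the fields that precede it, adding or removing a field changes only the list of sizes or names, which is the maintenance benefit the predicate is meant to provide.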
The more field attributes that can be specified in the predicate statement, the smaller each subsequent field statement needs to be.
By way of example, consider this input file, called 'addresses':
Dick Jones,1234 Maple St.,Philadelphia,Pennsylvania
Sam Henderson,1400 Highway A1A,Satellite Beach,Florida
Harry James,50 Elm Ave.,Boston,Massachusetts
Sarah Smith,300 Thornton Rd.,Frankfurt,Kentucky
This SortCL job script with predicate statements on input and output:
# delimited field, starting at position 1
# ASCII output assumed, uniform field width
# new SIZEs and FILL characters follow
produces the output file 'fixed-output.txt'
Dick Jones^^^^^ 1234 Maple St.^ Philadelphiaxxxxx Pennsylvaniaxxxxx
Sam Henderson^^ 1400 Highway A1 Satellite Beachxx Floridaxxxxxxxxxx
Harry James^^^^ 50 Elm Ave.^^^^ Bostonxxxxxxxxxxx Massachusettsxxxx
Sarah Smith^^^^ 300 Thornton Rd Frankfurtxxxxxxxx Kentuckyxxxxxxxxx
The first two output fields have a fixed size of 15 and pad out with the ^ character, while the next two fields are both 17 bytes and pad with an 'x'.
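For readers who want to check the layout by hand, the fixed-width records shown for 'fixed-output.txt' can be reproduced with a few lines of Python. This is an emulation of the result only, not of how SortCL works internally: the sizes (15 and 17) and fill characters (^ and x) come from the example, while the helper names, the truncation of over-long values, and the single space between fields are assumptions read off the sample output.

```python
ADDRESSES = """\
Dick Jones,1234 Maple St.,Philadelphia,Pennsylvania
Sam Henderson,1400 Highway A1A,Satellite Beach,Florida
Harry James,50 Elm Ave.,Boston,Massachusetts
Sarah Smith,300 Thornton Rd.,Frankfurt,Kentucky
"""

# (size, fill character) for each output field, in order.
OUTPUT_LAYOUT = [(15, "^"), (15, "^"), (17, "x"), (17, "x")]


def format_record(line, layout=OUTPUT_LAYOUT, delimiter=","):
    """Split one comma-delimited record and emit fixed-width fields."""
    values = line.split(delimiter)
    cells = [
        # Truncate values longer than the field, pad shorter ones on the right.
        value[:size].ljust(size, fill)
        for value, (size, fill) in zip(values, layout)
    ]
    # Assumption: fields are separated by one space, as in the sample output.
    return " ".join(cells)


def format_file(text):
    """Format every non-empty line of the input text."""
    return [format_record(line) for line in text.splitlines() if line]


output_lines = format_file(ADDRESSES)
for record in output_lines:
    print(record)
```

Note that values longer than their field, such as "1400 Highway A1A" and "300 Thornton Rd.", lose their final character in the 15-byte street field, which matches the sample output.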
If you have specific questions or feedback on this SortCL feature, or are interested in testing the latest release of CoSort, please email firstname.lastname@example.org.