FACT v3 Available
DB Extracts Up to 3X Faster
IRI is pleased to announce the third generation of Fast Extract (FACT) software for very large database (VLDB) unloads. FACT is a key component in data warehouse ETL (extract-transform-load) and ELT operations, offline reorgs, archive projects, and database migrations.
FACT v3 extracts big data from DB tables into flat files far faster than prior versions and other unload methods. On a test workstation, FACT v3 unloaded a 260GB Oracle test table in 54 minutes, versus 161 minutes with v2.5, or roughly three times faster. In addition to parallel hints and max row specifications, FACT v3 supports three proprietary query splitting methods for multi-threaded extraction.
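The arithmetic behind the "up to 3X" headline can be checked from the figures quoted above. This is a minimal sketch in Python; the variable names are illustrative only, and the throughput figures are derived simply by dividing table size by elapsed time, not taken from any additional measurement.

```python
# Figures quoted in the announcement above
TABLE_GB = 260        # size of the Oracle test table
V3_MINUTES = 54       # FACT v3 unload time
V25_MINUTES = 161     # FACT v2.5 unload time

# Derived values (simple division, not separately measured)
speedup = V25_MINUTES / V3_MINUTES
v3_rate = TABLE_GB / V3_MINUTES      # GB per minute
v25_rate = TABLE_GB / V25_MINUTES    # GB per minute

print(f"speedup:   {speedup:.2f}x")
print(f"v3 rate:   {v3_rate:.2f} GB/min")
print(f"v2.5 rate: {v25_rate:.2f} GB/min")
```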
FACT v3 on Unix and Linux servers currently extracts from Oracle 8.1 (and above) and DB2 UDB 8.2 (and above), and will eventually also support:
Sybase (ASE and IQ) – 12.5 and above
Altibase – 4.2 and above
Tibero – 4.0 and above
MySQL – 5.0 and above
MS SQL – 2003 and above
Oracle – 8.2.0 and above
DB2 – 8.2.0 and above
MS SQL – 2003 and above
Please contact firstname.lastname@example.org if you are interested in testing FACT v3. For more information, see also:
CoSort Speeds Top BI Tools
BOBJ, Cognos, MicroStrategy
A senior BI/DW architect in Singapore recently demonstrated the performance benefit of using CoSort for centralized pre-transformation (data franchising) with three of the top data visualization applications (BI tools) on the market. Time-to-delivery tests conducted with and without CoSort in SAP's BusinessObjects (BOBJ), IBM's Cognos, and MicroStrategy reporting environments showed a dramatic performance difference, even with small amounts of input data.
Flat file sources must be sorted, joined, and aggregated before reports can be produced in each BI tool. The comparison tests therefore ran these transforms twice: first inside the tools (i.e. within the BI layer), and then externally, via CoSort's Sort Control Language (SortCL) program (i.e. outside the BI layer). In every case, transforming the data with CoSort first proved worthwhile. It was at least 2-3 times faster, and it also eliminated the need to repeat the transformations and maintain the data and metadata in every reporting project.
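The idea of transforming once outside the BI layer can be illustrated with a small sketch. The Python below is not SortCL and is not how CoSort is implemented. It is a hypothetical stand-in, with invented file contents and field names (order_id, cust_id, amount, region), that joins, aggregates, and sorts two flat files in one pass so that every reporting tool could read the same prepared result.

```python
import csv
import io
from collections import defaultdict

# Hypothetical flat-file sources (inline here so the sketch is self-contained)
ORDERS = "order_id,cust_id,amount\n1,C2,40.0\n2,C1,10.5\n3,C2,9.5\n"
CUSTOMERS = "cust_id,region\nC1,West\nC2,East\n"

def pre_transform(orders_csv, customers_csv):
    """Join orders to customers, total the amounts per region, sort by region."""
    # Build the lookup side of the join
    region_of = {row["cust_id"]: row["region"]
                 for row in csv.DictReader(io.StringIO(customers_csv))}
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(orders_csv)):
        region = region_of.get(row["cust_id"])
        if region is None:
            continue  # inner join: skip orders with no matching customer
        totals[region] += float(row["amount"])  # aggregate
    return sorted(totals.items())               # sort

def to_csv(rows):
    """Write the shared result once; each BI tool would read this file."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["region", "total_amount"])
    writer.writerows(rows)
    return buf.getvalue()

result = pre_transform(ORDERS, CUSTOMERS)
print(to_csv(result))
```

Because the prepared file is produced once, none of the reporting projects needs to carry its own copy of the sort, join, and aggregation logic.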
For more information on the benchmarks, refer to the individual news announcements at:
and to see benchmark details and screen shots, see the blog articles on each under:
Virtual IRI Workbench Demos
VMware and IBM Virtual Appliances
Those who have installed and configured the IRI Workbench to run on their own systems know that preparing the GUI environment takes some work. Demo installations have required downloading the GUI and CLI packages, licensing CoSort and FACT, connecting to Oracle over different protocols, and importing the demo archive. IRI has now made that entire process much easier, and can deliver a demo-to-market package far faster, through a VMware image. Those with VMware Player can play the image and find everything ready to go.
IRI is also in the process of preparing a similar deliverable based on Linux for the IBM Kernel-based Virtual Machine (KVM) and PowerVM® environments via IBM's Virtual Appliance Factory (VAF). This will also enable IRI to certify and market CoSort, FACT, FieldShield, RowGen, and NextForm within the IRI Workbench "Ready for PureSystems."
Also under consideration are online versions of these images that would remove the need for downloads and installation. Cloud users would eventually be able to log in to transform and protect their data under a Software as a Service (SaaS) business model with IRI.
Tech Tip: Secure National IDs
FieldShield Data Masking Examples
Among the field-level data protection functions offered in FieldShield and CoSort (SortCL) are character masks for all or part of fields containing personally identifiable information (PII). Users can define the replacement character and the specific byte locations within each datum to permanently obscure certain parts of the field value.
Standard masks for US Social Security (SSN) and credit card number (CCN) values are provided in the FieldShield data masking dialog in the IRI Workbench GUI, built on Eclipse. The dialog also allows custom character replacements for any string, and then creates target /FIELD specifications in FieldShield (or SortCL) job scripts to apply the 'replace_chars' or 'mask' function at runtime.
Below are examples of /FIELD statements for commonly used data masks of other national ID numbers that FieldShield or CoSort (SortCL) users can specify in the /OUTFILE sections of their job scripts:
# Canada SIN, e.g. 459-238-962 => 459-xxx-xxx
# Chile RUT, e.g. 320505-13-3565 => xx05xx-xxx5
# Hong Kong NID, e.g. 27224729 => 2722xxxx
# Korea SSN, e.g. 640907-1031419 => xxxxxx-1031419
# Malaysia NRIC, e.g. => xx05xx-xxx5
# Nordic countries PIN/CPR, e.g. => 211099-xxxx
# Spain DNI, e.g. 29.572.047C => xx.xxx.xx7C
# Taiwan NIC, e.g. F250286893 => F25xxxx893
# UK NINO, e.g. SP-123456-D => SP-xxxxxx-D
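The before-and-after values listed above can be reproduced outside FieldShield with a short sketch. The Python below is not FieldShield or SortCL syntax. The function name mask_chars, its parameters, and the choice to leave "-" and "." separators untouched are assumptions made for illustration. It simply replaces the characters at given 1-based positions with a chosen replacement character.

```python
def mask_chars(value, start, end, repl="x", keep="-."):
    """Replace characters at 1-based positions start..end (inclusive) with
    repl, leaving any separator characters listed in keep unchanged."""
    out = []
    for pos, ch in enumerate(value, start=1):
        if start <= pos <= end and ch not in keep:
            out.append(repl)
        else:
            out.append(ch)
    return "".join(out)

# Sample values taken from the list above: (label, input, first, last)
EXAMPLES = [
    ("Canada SIN",    "459-238-962",    5, 11),
    ("Hong Kong NID", "27224729",       5, 8),
    ("Korea SSN",     "640907-1031419", 1, 6),
    ("Spain DNI",     "29.572.047C",    1, 9),
    ("Taiwan NIC",    "F250286893",     4, 7),
    ("UK NINO",       "SP-123456-D",    4, 9),
]

for label, raw, first, last in EXAMPLES:
    print(f"{label}: {raw} => {mask_chars(raw, first, last)}")
```

Note that this kind of masking is one-way: the replaced characters cannot be recovered from the output, which matches the "permanently obscure" behavior described earlier.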
If you have any questions or need help implementing data masking or other content-aware data loss prevention functions (like encryption, pseudonymization, randomization, de-identification, hashing, etc.), please email email@example.com