Data Translation And Transformation Challenges For Mainframe Application Modernisation

Mainframe-based production applications are both widespread and mission critical. Yet numerous organizations – motivated by opportunities to reduce hardware and software costs, expand skill sets, and leverage the scalability of emerging technologies – have begun to modernize their mainframe applications. Specifically, they are migrating applications to open systems such as Windows and UNIX.

It is widely accepted that more than 70 percent of the world’s transactional production applications are running on mainframe platforms, with a typical enterprise having millions of lines of legacy code in production. Even so, technology and workforce trends all lean towards open systems.

Efforts to modernize mainframe applications are often complex and include many phases, each of which has unique organizational and technical challenges. Migrating the data between the platforms is one of the earliest phases and a critical prerequisite to all other phases. Although it is seemingly the simplest phase, it is filled with hidden complexity and is frequently a point of failure. Success rates have been estimated to be below 20 percent.

These metrics are not surprising for several reasons. First, most mainframe applications have been accumulating data for years, even decades, resulting in volumes of legacy data that can exceed hundreds of terabytes. That’s a lot of information to handle. Additionally, mainframe data sources and formats are notoriously difficult to interpret, manipulate and convert.

There is also the daunting requirement to perform equivalent data integration and transformation tasks, such as sorting, merging, copying, and joining, in the open systems environment while maintaining high levels of performance, scalability and reliability. This must take place without requiring significant or complex development efforts. Organizations must plan for and properly address these data considerations in order to successfully accomplish their mainframe application modernization initiatives.
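As a rough illustration of what "equivalent" sort, merge, and join steps look like on an open system, the sketch below uses only the Python standard library. The account and payment data and the merge_join helper are hypothetical, and it works in memory; a production job would use external sorting for data sets that do not fit in RAM.

```python
import heapq
from operator import itemgetter

# Hypothetical extracts: two account files and one payment file, keyed by account id.
accounts_a = [("1003", "Lee"), ("1001", "Kim")]
accounts_b = [("1002", "Roy")]
payments = [("1001", 250), ("1003", 75), ("1003", 20)]

key = itemgetter(0)

# SORT: order each input by its key.
sorted_a = sorted(accounts_a, key=key)
sorted_b = sorted(accounts_b, key=key)

# MERGE: combine two already-sorted inputs into one sorted stream.
merged = list(heapq.merge(sorted_a, sorted_b, key=key))

def merge_join(left, right):
    """Inner join of two key-sorted lists; keys on the left are assumed unique."""
    out = []
    i = 0
    for k, value in right:
        # Advance the left cursor until it reaches or passes the right key.
        while i < len(left) and left[i][0] < k:
            i += 1
        if i < len(left) and left[i][0] == k:
            out.append((k, left[i][1], value))
    return out

# JOIN: match each payment to its account.
joined = merge_join(merged, sorted(payments, key=key))
```

Because both inputs are sorted on the join key, the join makes a single pass over each list, which is the same idea a mainframe sort/merge step relies on.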

Requirements for Data Migration and Integration

A critical and challenging part of a mainframe modernization initiative is converting the massive amounts of mainframe EBCDIC data into ASCII format while preserving packed-decimal and binary numeric values. Because the volume of data to be migrated can be huge, the conversion must be able to process large data sets without failing. Ideally, data needs to be processed both quickly (short elapsed time) and efficiently, with minimal use of hardware resources.
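To make the distinction between character and numeric fields concrete, here is a minimal Python sketch of a field-aware conversion. The 16-byte record layout (a 10-byte name, a 4-byte packed-decimal balance, a 2-byte binary count) and the function names are assumptions made for illustration, as is the choice of code page cp037; real data may use a different EBCDIC code page.

```python
from decimal import Decimal

def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode an IBM packed-decimal (COMP-3) field into a Decimal."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)      # high nibble is one digit
        digits.append(byte & 0x0F)    # low nibble is the next digit
    digits.append(raw[-1] >> 4)       # last byte: one digit plus the sign nibble
    sign_nibble = raw[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    if sign_nibble in (0x0B, 0x0D):   # 0xB and 0xD mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)

def convert_record(record: bytes) -> dict:
    """Convert one record using the hypothetical layout described above."""
    # Character data: translate EBCDIC to text.
    name = record[0:10].decode("cp037").rstrip()
    # Packed decimal: decode the nibbles, never translate as characters.
    balance = unpack_comp3(record[10:14], scale=2)
    # Binary integer: mainframe binary fields are big-endian.
    count = int.from_bytes(record[14:16], byteorder="big", signed=True)
    return {"name": name, "balance": balance, "count": count}

sample = (
    "SMITH".ljust(10).encode("cp037")
    + bytes([0x01, 0x23, 0x45, 0x6D])          # digits 0123456, sign D (negative)
    + (42).to_bytes(2, "big", signed=True)
)
row = convert_record(sample)
```

Only the name field goes through a character translation; the numeric fields are decoded from their raw bytes, which is what keeps their values intact.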

The mainframe data must not only be translated from the mainframe format to ASCII but also transformed into the appropriate format so it can be loaded into the open systems-based application. Transformation of the records is not a process to be taken lightly, as mainframe production processing often uses complex, hierarchical record structures to pack significant amounts of information into a single data set.

Several levels of information are typically bundled into a single data set through the use of multiple record types, multiple composite field groupings and arrays, sometimes varying in repetition from record to record. This hierarchical data organization often needs to be broken up and normalized for relational storage. The software used to process these complex records must not only be able to understand the complicated record compositions, but must also be able to leverage the existing metadata that describes these complex structures.
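The normalization step described above can be sketched as follows. The record layout is hypothetical: an order id, a customer name, a two-digit item count, and then that many repeating (sku, quantity) groups, similar in spirit to a COBOL array whose length varies from record to record. The input is assumed to have been translated to text already, and handling of multiple record types is left out for brevity.

```python
def normalize(record: str):
    """Split one hierarchical order record into a parent row and child rows."""
    order_id = record[0:6]
    customer = record[6:16].rstrip()
    item_count = int(record[16:18])   # controls how many groups follow

    order_row = {"order_id": order_id, "customer": customer}
    item_rows = []
    offset = 18
    for line_no in range(1, item_count + 1):
        sku = record[offset:offset + 5]
        qty = int(record[offset + 5:offset + 8])
        # Each repeating group becomes its own row, linked back by order_id.
        item_rows.append(
            {"order_id": order_id, "line_no": line_no, "sku": sku, "qty": qty}
        )
        offset += 8                   # each group is 5 + 3 characters wide
    return order_row, item_rows

record = "000042" + "ACME".ljust(10) + "02" + "AB123" + "004" + "CD456" + "010"
order, items = normalize(record)
```

The parent row would load into an orders table and the child rows into an order-items table, with order_id serving as the foreign key.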

Mainframe applications store these data structures in a variety of highly specialized formats, requiring software that can both recognize and process the data. Without built-in support for these mainframe sources, record formats and data types, projects would require costly, time-consuming, and error-prone manual intervention and custom coding.
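Legacy metadata of this kind is often held in COBOL copybooks. As a sketch of how such metadata can drive a conversion rather than hand-coded offsets, the snippet below derives field offsets and lengths from a tiny copybook fragment. The fragment, the field names, and the layout function are hypothetical, and the parser understands only two clause forms, PIC X(n) and PIC S9(n)V9(m) COMP-3; real copybooks use many more.

```python
import re

# Hypothetical copybook fragment.
COPYBOOK = """
    05 CUST-NAME     PIC X(10).
    05 CUST-BALANCE  PIC S9(5)V9(2) COMP-3.
    05 CUST-REGION   PIC X(2).
"""

FIELD = re.compile(
    r"^\s*\d+\s+(?P<name>[A-Z0-9-]+)\s+PIC\s+"
    r"(?:X\((?P<chars>\d+)\)"
    r"|S?9\((?P<int>\d+)\)(?:V9\((?P<frac>\d+)\))?\s+COMP-3)\.\s*$"
)

def layout(copybook: str) -> list:
    """Return name, kind, offset, length and scale for each recognized field."""
    fields, offset = [], 0
    for line in copybook.splitlines():
        m = FIELD.match(line)
        if not m:
            continue                      # skip anything outside the supported subset
        if m["chars"]:
            kind, length, scale = "text", int(m["chars"]), 0
        else:
            scale = int(m["frac"] or 0)
            digits = int(m["int"]) + scale
            # Packed decimal stores two digits per byte plus a sign nibble.
            kind, length = "comp3", digits // 2 + 1
        fields.append(
            {"name": m["name"], "kind": kind, "offset": offset,
             "length": length, "scale": scale}
        )
        offset += length
    return fields

fields = layout(COPYBOOK)
```

A conversion routine could then loop over these field descriptions and pick the right decoder for each slice of the record.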

Mainframe Application Modernization Implementation Approaches

Various implementation approaches exist for modernizing mainframe-based applications, including:

  • Re-architecting mainframe applications for the open systems environment
  • Replacing mainframe applications with open systems-based packaged applications
  • Re-hosting existing mainframe code on open systems

For all of these approaches, the requirements surrounding an accurate and efficient handling of the data and the performance of equivalent transformations are critical for success.

Many data integration software solutions available today provide deep functional capabilities and extremely efficient processing to support the specific and complex requirements associated with mainframe application modernization projects.

For instance, at the core of a highly functional mainframe application modernization solution is the use of proprietary sorting algorithms, I/O optimization, parallel processing, and dynamic environmental monitoring techniques. Data integration software that leverages the benefits of high efficiency and low resource utilization, combined with the latest architectural innovations of open systems platforms, will equate to the shortest elapsed time and the least amount of required hardware resources for processing what can be massive amounts of data.

Data integration software generally provides the ability to handle massive data volumes and deep support for mainframe data, as well as support for the variety of data migration scenarios that are likely to be encountered in mainframe modernization projects.

When data integration software is functioning optimally, it can help organizations do the following:

  • Perform EBCDIC-to-ASCII conversion with deep support for various mainframe sources, formats, and data types
  • Perform robust data transformations using complex legacy metadata
  • Provide open systems equivalents of mainframe data integration processes
  • Provide deep support for re-hosting environments
  • Support bi-directional translations and transformations
  • Provide support for changed data capture processing
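As one simple illustration of the last item, changed data capture can be approximated by comparing two keyed snapshots of the same data set. This is only a sketch under that assumption; the capture_changes function and the sample data are hypothetical, and commercial tools may detect changes in other ways, for example by reading logs.

```python
def capture_changes(previous: dict, current: dict) -> dict:
    """Classify records as inserted, updated or deleted between two snapshots."""
    inserts = {k: current[k] for k in current.keys() - previous.keys()}
    deletes = {k: previous[k] for k in previous.keys() - current.keys()}
    updates = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return {"insert": inserts, "update": updates, "delete": deletes}

# Hypothetical snapshots keyed by account id.
yesterday = {"1001": ("Kim", 250), "1002": ("Roy", 80), "1003": ("Lee", 75)}
today = {"1001": ("Kim", 300), "1003": ("Lee", 75), "1004": ("Ann", 10)}

changes = capture_changes(yesterday, today)
```

Only the changed records then need to be translated and loaded, instead of the full data set.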

Utilizing data integration software that is specifically optimized for handling mainframe migration projects can help eliminate the risks associated with using other techniques like utilities or custom coding and deliver greater productivity and higher quality results.

Alternatives for Mainframe Migration Data Conversion

We’ve spent time discussing some of the advantages of data integration software. Here we take a look at other options available for converting data as part of a mainframe migration initiative.

Custom Coding

Custom coded solutions are reported to be the most common method for converting data as part of mainframe migration initiatives, often employing COBOL on the mainframe. Custom coded scripts and programs are attractive, due to low initial costs and the availability of developer skills. However, these solutions can become saddled with problems, and have been known to be error-prone, difficult and expensive to debug and maintain. COBOL solutions on the mainframe for data migration can be particularly costly in terms of elapsed time, data latency, and MIPS. Yet for some organizations, custom coding will be an option that meets their needs.

Mainframe Utilities

Most mainframe-based file transfer programs have an option to automatically convert a file from EBCDIC to ASCII. These utilities can be used to translate or unpack binary data to ASCII. However, users should be aware of some of the limitations. Mainframe processing can be expensive when compared to open systems processing. If they are not careful, organizations can also run into issues such as limits on the number of job steps, or record layouts that no longer match the translated data.

UNIX Utilities

UNIX-based conversion utilities, such as dd, are another alternative for mainframe application modernization initiatives. However, there are some important limitations to be aware of before choosing this option. For example, dd cannot convert packed decimals, has no support for COBOL copybooks, and has limitations in speed and scale. The limitation that presents the greatest risk, however, is that a single typographical error in the command syntax can corrupt some or all of the data on the disk.
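The packed-decimal limitation follows from how a byte-for-byte translation works: it treats every byte as a character. The short Python sketch below imitates that behavior with the cp037 code page, which is an assumption made for illustration (the translation table dd itself uses differs in detail), on a hypothetical record holding a name followed by a packed-decimal amount.

```python
# Hypothetical record: 10-byte EBCDIC name followed by a 4-byte packed decimal (+1234.56).
packed = bytes([0x01, 0x23, 0x45, 0x6C])
record = "SMITH".ljust(10).encode("cp037") + packed

# Blanket translation: every byte is mapped as if it were an EBCDIC character.
blanket = record.decode("cp037").encode("latin-1")

text_part = blanket[:10]      # character field survives the translation
numeric_part = blanket[10:]   # packed field has been altered

# For example, the sign byte 0x6C is read as the EBCDIC percent sign and rewritten as 0x25.
corrupted = numeric_part != packed
```

The name comes through correctly, but the numeric bytes no longer hold a valid packed-decimal value, which is why numeric fields have to be decoded according to the record layout instead.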

Reducing Complexity Breeds Success

Multiple business and technology drivers are motivating companies to modernize or re-platform their mainframe applications to run on open systems such as Windows and UNIX. These mainframe application modernization initiatives are often complex and include many phases. Effectively handling the data translation and transformation processing is critical to the success of these projects. Organizations need to carefully review and understand their requirements and data volumes before proceeding with an approach that is both simple and robust enough to capture all mission-critical data. Only then will the business benefits organizations are seeking, such as cost savings and greater agility, become achievable.

Stephanie Best is the Director of Product Marketing at Syncsort. Stephanie is a 17-year product marketing veteran who has held worldwide marketing leadership roles in the IBM InfoSphere group and previously with Systems Research & Development (SRD). In her role at Syncsort, Best is responsible for product strategy, positioning, and deliverables for the company’s data integration and data protection portfolios.