Using ZIP and UNZIP on VM/CMS ============================= Installing executables ---------------------- The following CMS MODULEs are available: ZIP ZIPNOTE ZIPCLOAK ZIPSPLIT UNZIP In addition to these, each MODULE file also has an EXEC with the same name. These EXECs are front-ends to the MODULES that will attempt to set up the required runtime libraries before running the MODULE. All the EXECs are identical. Only their names are different. They are stored as plain text files. The CMS MODULE files have been packed using the COPYFILE command to allow their file format to be properly restored, since variable length binary files will not currently unzip properly (see below for details). The MODULEs are shipped with a filetype or extension of CMO (for CMS MODULE). Their names may vary on the distribution disk to indicate their level, etc. To restore them to executable MODULEs on CMS, do the following: 1. Upload them to CMS with a Fixed record length with LRECL 1024. Example, from a DOS or OS/2 window, type this: SEND unzip.cmo A:unzip module a (RECFM F LRECL 1024 Example, using FTP from CMS, type this: BINARY FIXED 1024 GET unzip.cmo unzip.module.a Note: Replace "unzip.cmo" with the actual name. 2. Use COPYFILE to unpack the file. Example, in CMS type this: COPYFILE UNZIP MODULE A (UNPACK REPLACE OLDDATE 3. Repeat steps 1-2 for each of the programs. 4. Build the ZIPINFO module by typing this: COPYFILE UNZIP MODULE A ZIPINFO MODULE A (OLDDATE 5. Upload the EXECs to CMS as text files (with ASCII-to-EBCDIC translation). Example, from a DOS or OS/2 window, type this: SEND unzip.exc A:unzip exec a (CRLF Example, using FTP from CMS, type this: GET unzip.exc unzip.exec.a 6. Repeat steps 4 for each of the EXECs. Preparing the environment ------------------------- The executables provided were compiled with IBM C 3.1.0 and require the the Language Environment (LE) runtime libraries. To provide access to the runtime libraries: 1. Link to the disk containing the Language Environment files, if necessary. 2. Use the command "GLOBAL LOADLIB SCEERUN" These commands can be placed in your PROFILE EXEC. Note: EXECs have been provided called ZIP, UNZIP, etc. that issue the GLOBAL LOADLIB statement. This was done to alleviate frustration of users that don't have the GLOBAL LOADLIB statement in their PROFILE EXEC. These EXECs may require changing for your system. Unfortunately, there is no way, using IBM C, to produce a MODULE that doesn't require a runtime library. Testing ------- To test the MODULEs, just type ZIP or UNZIP. They should show help information on using the commands. If you see something like this: DMSLIO201W The following names are undefined: CEEEV003 DMSABE155T User abend 4093 called from 00DCD298 reason code 000003EB Then you don't have access to the proper runtime libraries, as described above. Here is additional information on the ZIP and UNZIP programs that may assist support personnel: - Compiled with IBM C V3R1M0 on VM/ESA 2.2.0 with CMS level 13 Service Level 702. - Require the SCEERUN LOADLIB runtime library. This is part of the Language Environment (LE). - Linked with options RMODE ANY AMODE ANY RLDSAVE. If you continue to have trouble, report the problem to Zip-Bugs (see the bottom of this document). Compiling the source on VM/CMS ------------------------------ The source has been successfully compiled previously using C/370 2.1 and 2.2. The source has been recently compiled using IBM C 3.1.0 on VM/ESA 2.2.0 with CMS level 13. I don't have access to an MVS system so the code hasn't been tested there in a while. 1. Unzip the source files required for CMS. The root-level files inside the ZIP file and the files in the CMSMVS subdirectory are needed. Example (use both commands): unzip -aj zip23.zip -x */* -dc unzip -aj zip23.zip cmsmvs/* -dc This example unzips the files to the C-disk, while translating character data and ignoring paths. If you don't already have a working UNZIP MODULE on CMS you will have to unzip the files on another system and transport them to CMS. All the required files are plain text so they can be transferred with ASCII-to-EBCDIC translations. 2. Repeat step 1 with the zip file containing the UNZIP code. Unzip the files to a different disk than the disk used for the ZIP code. 3. To compile the ZIP code, run the supplied CCZIP EXEC. To compile the UNZIP code, run the supplied CCUNZIP EXEC. NOTE: Some of the ZIP and UNZIP source files have the same name. It is recommended that you keep the source from each on separate disks and move the disk you are building from ahead of the other in the search order. For example, you may have a 192 disk with the ZIP source code and a 193 disk with the UNZIP source code. To compile ZIP, access the 192 disk as B, then run CCZIP. This will create the following modules: ZIP, ZIPNOTE, ZIPSPLIT, ZIPCLOAK. To compile UNZIP, access 193 as B, then run CCUNZIP. This will create the following modules: UNZIP, ZIPINFO (a copy of UNZIP). ========================================================================= Using ZIP/UNZIP --------------- Documentation for the commands is in MANUAL NONAME (for ZIP) and in UNZIP DOC UNZIP. INFOZIP DOC describes the use of the -Z option of UNZIP. The rest of this section explains special notes concerning the VM/CMS version of ZIP and UNZIP. Filenames and directories ------------------------- 1. Specifying filenames a. When specifying CMS files, use filename.filetype.filemode format (separate the three parts of the name with a period and use no spaces). Example: profile.exec.a Unfortunately, this prevents you from using ZIP from FILELIST. To unzip a zip file, however, you can type something like this next to it in FILELIST: unzip /n -d c This will unzip the contents of the current file to a C-disk. b. It is possible to use DD names with ZIP and UNZIP on CMS, though it can be cumbersome. Example: filedef out disk myzip zip a zip dd:out file1.txt file2.txt While you can also use a DD name for the input files, ZIP currently does not correctly resolve the filename and will store something like "dd:in" inside the ZIP file. A file stored in this manor cannot easily be unzipped, as "dd:in" is an invalid filename. c. In places where a directory name would be used on a PC, such as for the ZIP -b (work path) option or the UNZIP -d (destination path) options, use a filemode letter for CMS. For example, to unzip files onto a C-disk, you might type something like this: unzip myzip.zip -d c Currently, ZIP uses the A-disk for work files. When zipping large files, you may want to specify a larger disk for work files. This example will use a C-disk for work files. zip -b C myzip.zip.c test.dat.a 2. Filename conversions a. Filemode letters are never stored into the zip file or take from a zip file. Only the filename and filetype are used. ZIP removes the filemode when storing the filename into the zip file. UNZIP assumes "A" for the filemode unless the -d option is used. b. When unzipping, any path names are removed from the fileid and the last two period-separated words are used as the filename and filetype. These are truncated to a maximum of eight characters, if necessary. If the filetype (extension) is missing, then UNZIP uses "NONAME" for the filetype. Any '(' or ')' characters are removed from the fileid. c. All files are created in upper-case. Files in mixed-case cannot currently be stored into a ZIP file. d. Shared File System (SFS) directories are not supported. Files are always accessed by fn.ft.fm. To use an SFS disk, Assign it a filemode, then it can be used. 3. Wildcards in file names a. Wildcards are not supported in the zip filename. The full filename of the zip file must be given (but the .zip is not necessary). So, you can't do this: unzip -t *.zip b. Wildcards CAN be used with UNZIP to select (or exclude) files inside a zip file. Examples: unzip myzip *.c - Unzip all .c files. unzip myzip *.c -x z*.c - Unzip all .c files but those starting with Z. c. Wildcards cannot currently be used to select files with ZIP. So, you can't do this: zip -a myzip *.exec I expect to fix this for CMS in the future. 4. File timestamps a. The dates and times of files being zipped or unzipped are not currently read or set. When a file is zipped, the timestamp inside the zip file will always be the current system date and time. Likewise, when unzipping, the date and time of files being unzipped will always be the current system date/time. b. Existing files are assumed to be newer than files inside a zip file when using the -f freshen option of UNZIP. This will prevent overwriting files that may be newer than the files inside the zip file, but also effectively prevents the -f option from working. 5. ASCII, EBCDIC, and binary data Background ---------- Most systems create data files as just a stream of bytes. Record breaks happen when certain characters (new line and/or carriage return characters) are encountered in the data. How to interpret the data in a file is up to the user. The system must be told to either notice new line characters in the data or to assume that the data in the file is binary data and should be read or written as-is. CMS and MVS are record-based systems. All files are composed of data records. These can be stored in fixed-length files or in variable length files. With fixed-length files, each record is the same length. The record breaks are implied by the LRECL (logical record length) attribute associated with the file. With variable-length files, each record contains the length of that record. The separation of records are not part of the data, but part of the file structure. This means you can store any type of data in either type of file structure without having to worry about the data being interpreted as a record break. Fixed-length files may have padding at the end of the file to make up a full record. Variable-length files have no padding, but require extra record length data be stored with the file data. Storing fixed-length files into a zip file is simple, because all the data can just be dumped into the zip file and the record format (RECFM) and logical record length (LRECL) can be stored in the extra data area of the zip file so they can be restored when UNZIP is used. Storing variable-length data is harder. There is no place to put the record length data needed for each record of the file. This data could be written to the zip file as the first two bytes of each record and interpreted that way by UNZIP. That would make the data unusable on systems other than CMS and MVS, though. Currently, there isn't a solution to this problem. Each record is written to the zip file and the record length information is discarded. Binary data stored in variable-length files can't be put into a zip file then later unzipped back into the proper records. This is fine for binary data that will be read as a stream of bytes but not OK where the records matter, such as with CMS MODULEs. If the data is text (character data), there is a solution. This data can be converted into ASCII when it's stored into a zip file. The end of each record is now marked in the file by new line characters. Another advantage of this method is that the data is now accessible to non-EBCDIC systems. When the data is unzipped on CMS or MVS, it is converted back into EBCDIC and the records are recreated into a variable-length file. So, here's what we have... a. To store readable text data into a zip file that can be used on other platforms, use the -a option with ZIP to convert the data to ASCII. These files will unzip into variable-length files on CMS and should not contain binary data or corruption may occur. b. Files that were zipped on an ASCII-based system will be automatically translated to EBCDIC when unzipped. To prevent this (to unzip binary data on CMS that was sent from an ASCII-based system), use the -B option with UNZIP to force Binary mode. To zip binary files on CMS, use the -B option with ZIP to force Binary mode. This will prevent any data conversions from taking place. c. When using the ZIP program without specifying the "-a" or "-B" option, ZIP defaults to "native" (EBCDIC) mode and tries to preserve the file information (RECFM, LRECL, and BLKSIZE). So when you unzip a file zipped with ZIP under CMS or MVS, UNZIP restores the file info. The output will be fixed-length if the original was fixed and variable-length if the original was variable. If UNZIP gives a "write error (disk full?)" message, you may be trying to unzip a binary file that was zipped as a text file (without using the -B option) Summary ------- Here's how to ZIP the different types of files. RECFM F text Use the -a option with ZIP to convert to ASCII for use with other platforms or no options for use on EBCDIC systems only. RECFM V text Use the -a option with ZIP to convert to ASCII for use with other platforms or no options for use on EBCDIC systems only. RECFM F binary Use the -B option with ZIP (upper-case "B"). RECFM V binary Use the -B option with ZIP. Can be zipped OK but the record structure is destroyed when unzipped. This is OK for data files read as binary streams but not OK for files such as CMS MODULEs. 6. Character Sets If you are used to running UNZIP on systems like UNIX, DOS, OS/2 or Windows, you will may have some problems with differences in the character set. There are a number of different EBCDIC code pages, like there are a number of different ASCII code pages. For example, there is a US EBCDIC, a German EBCDIC, and a Swedish EBCDIC. As long as you are working with other people who use the same EBCDIC code page, you will have no trouble. If you work with people who use ASCII, or who use a different EBCDIC code page, you may need to do some translation. UNZIP translates ASCII text files to and from Open Systems EBCDIC (IBM-1047), which may not be the EBCDIC that you are using. For example, US EBCDIC (IBM-037) uses different character codes for square brackets. In such cases, you can use the ICONV utility (supplied with IBM C) to translate between your EBCDIC character set and IBM-1047. If your installation does not use IBM-1047 EBCDIC, messages from UNZIP may look a little odd. For example, in a US EBCDIC installation, an opening square bracket will become an i-acute and a closing square bracket will become a u-grave. The supplied ZIP and UNZIP EXECs attempt to correct this by setting CMS INPUT and OUTPUT translations to adjust the display of left and right brackets. You may need to change this if brackets don't display correctly on your system. 7. You can unzip using VM/CMS PIPELINES so unzip can be used as a pipeline filter. Example: 'PIPE COMMAND UNZIP -p test.zip george.test | Count Lines | Cons' Please report all bugs and problems to: Zip-Bugs@lists.wku.edu ----------------------------------------------------------------------- Original CMS/MVS port by George Petrov. e-mail: c888090@nlevdpsb.snads.philips.nl tel: +31-40-781155 Philips C&P Eindhoven The Netherlands ----------------------------------------------------------------------- Additional fixes and README re-write (4/98) by Greg Hartwig. e-mail: ghartwig@ix.netcom.com ghartwig@vnet.ibm.com ----------------------------------------------------------------------- Additional notes from Ian E. Gorman. e-mail: ian@iosphere.net