Using ZIP and UNZIP on VM/CMS
=============================


Installing executables
----------------------

The following CMS MODULEs are available:
   ZIP
   ZIPNOTE
   ZIPCLOAK
   ZIPSPLIT
   UNZIP

In addition to these, each MODULE file also has an EXEC with the same
name.  These EXECs are front ends to the MODULEs that attempt to set
up the required runtime libraries before running the MODULE.
All the EXECs are identical; only their names differ.
They are stored as plain text files.

The CMS MODULE files have been packed using the COPYFILE command to
allow their file format to be properly restored, since variable length
binary files will not currently unzip properly (see below for details).
The MODULEs are shipped with a filetype or extension of CMO (for CMS
MODULE).  Their names may vary on the distribution disk to indicate
their level, etc.

To restore them to executable MODULEs on CMS, do the following:
   1. Upload them to CMS as fixed-length records (RECFM F) with LRECL 1024.
      Example, from a DOS or OS/2 window, type this:
         SEND unzip.cmo A:unzip module a (RECFM F LRECL 1024

      Example, using FTP from CMS, type this:
         BINARY FIXED 1024
         GET unzip.cmo unzip.module.a

      Note:  Replace "unzip.cmo" with the actual name.

   2. Use COPYFILE to unpack the file.
      Example, in CMS type this:
         COPYFILE UNZIP MODULE A (UNPACK REPLACE OLDDATE

   3. Repeat steps 1-2 for each of the programs.

   4. Build the ZIPINFO module by typing this:
         COPYFILE UNZIP MODULE A ZIPINFO MODULE A (OLDDATE

   5. Upload the EXECs to CMS as text files (with ASCII-to-EBCDIC
      translation).
      Example, from a DOS or OS/2 window, type this:
         SEND unzip.exc A:unzip exec a (CRLF

      Example, using FTP from CMS, type this:
         GET unzip.exc unzip.exec.a

   6. Repeat step 5 for each of the EXECs.


Preparing the environment
-------------------------

The executables provided were compiled with IBM C 3.1.0 and
require the Language Environment (LE) runtime libraries.

To provide access to the runtime libraries:
   1. Link to the disk containing the Language Environment files,
      if necessary.

   2. Use the command "GLOBAL LOADLIB SCEERUN"

   These commands can be placed in your PROFILE EXEC.

   Note:  EXECs have been provided called ZIP, UNZIP, etc. that
   issue the GLOBAL LOADLIB statement.  This was done to alleviate
   frustration of users that don't have the GLOBAL LOADLIB statement
   in their PROFILE EXEC.  These EXECs may require changing for
   your system.

   Unfortunately, there is no way, using IBM C, to produce a MODULE
   that doesn't require a runtime library.


Testing
-------

To test the MODULEs, just type ZIP or UNZIP.  They should
show help information on using the commands.

If you see something like this:
   DMSLIO201W The following names are undefined:
    CEEEV003
   DMSABE155T User abend 4093 called from 00DCD298 reason code 000003EB

then you don't have access to the proper runtime libraries.  Set them
up as described in "Preparing the environment" above.

Here is additional information on the ZIP and UNZIP programs that
may assist support personnel:
   - Compiled with IBM C V3R1M0 on VM/ESA 2.2.0 with
     CMS level 13 Service Level 702.

   - Require the SCEERUN LOADLIB runtime library.  This is
     part of the Language Environment (LE).

   - Linked with options RMODE ANY AMODE ANY RLDSAVE.

If you continue to have trouble, report the problem to Zip-Bugs
(see the bottom of this document).



Compiling the source on VM/CMS
------------------------------

The source has been successfully compiled previously using
C/370 2.1 and 2.2.  The source has been recently compiled using
IBM C 3.1.0 on VM/ESA 2.2.0 with CMS level 13.  I don't have
access to an MVS system so the code hasn't been tested there
in a while.

 1. Unzip the source files required for CMS.  The root-level files
    inside the ZIP file and the files in the CMSMVS subdirectory are
    needed.  Example (use both commands):
       unzip -aj zip23.zip -x */*   -dc
       unzip -aj zip23.zip cmsmvs/* -dc

    This example unzips the files to the C-disk, while translating
    character data and ignoring paths.

    If you don't already have a working UNZIP MODULE on CMS you will
    have to unzip the files on another system and transport them
    to CMS.  All the required files are plain text so they can
    be transferred with ASCII-to-EBCDIC translations.

 2. Repeat step 1 with the zip file containing the UNZIP code.
    Unzip the files to a different disk than the disk used for the ZIP
    code.

 3. To compile the ZIP code, run the supplied CCZIP EXEC.
    To compile the UNZIP code, run the supplied CCUNZIP EXEC.

NOTE:
Some of the ZIP and UNZIP source files have the same name.  It is
recommended that you keep the source from each on separate disks and
move the disk you are building from ahead of the other in the search
order.

For example, you may have a 192 disk with the ZIP source code and
a 193 disk with the UNZIP source code.  To compile ZIP, access
the 192 disk as B, then run CCZIP.  This will create the following
modules:  ZIP, ZIPNOTE, ZIPSPLIT, ZIPCLOAK.

To compile UNZIP, access 193 as B, then run CCUNZIP.  This will create
the following modules:  UNZIP, ZIPINFO (a copy of UNZIP).


=========================================================================


Using ZIP/UNZIP
---------------

Documentation for the commands is in MANUAL NONAME (for ZIP) and in
UNZIP DOC UNZIP.  INFOZIP DOC describes the use of the -Z option of
UNZIP.

The rest of this section explains special notes concerning the VM/CMS
version of ZIP and UNZIP.


Filenames and directories
-------------------------

 1. Specifying filenames

    a. When specifying CMS files, use filename.filetype.filemode format
       (separate the three parts of the name with a period and use no
       spaces).  Example:  profile.exec.a

       Unfortunately, this prevents you from using ZIP from
       FILELIST.  To unzip a zip file, however, you can type something
       like this next to it in FILELIST:
          unzip /n -d c

       This will unzip the contents of the current file to a C-disk.

    b. It is possible to use DD names with ZIP and UNZIP on CMS, though
       it can be cumbersome.  Example:
          filedef out disk myzip zip a
          zip dd:out file1.txt file2.txt

       While you can also use a DD name for the input files, ZIP
       currently does not correctly resolve the filename and will
       store something like "dd:in" inside the ZIP file.  A file stored
       in this manner cannot easily be unzipped, as "dd:in" is an invalid
       filename.

    c. In places where a directory name would be used on a PC, such as
       for the ZIP -b (work path) option or the UNZIP -d (destination
       path) options, use a filemode letter for CMS.  For example,
       to unzip files onto a C-disk, you might type something like this:
          unzip myzip.zip -d c

       Currently, ZIP uses the A-disk for work files.  When zipping
       large files, you may want to specify a larger disk for work files.
       This example will use a C-disk for work files.
          zip -b C myzip.zip.c test.dat.a


 2. Filename conversions

    a. Filemode letters are never stored into the zip file or taken from
       a zip file.  Only the filename and filetype are used.
       ZIP removes the filemode when storing the filename into the
       zip file.  UNZIP assumes "A" for the filemode unless the -d
       option is used.

    b. When unzipping, any path names are removed from the fileid
       and the last two period-separated words are used as the
       filename and filetype.  These are truncated to a maximum of
       eight characters, if necessary.  If the filetype (extension)
       is missing, then UNZIP uses "NONAME" for the filetype.
       Any '(' or ')' characters are removed from the fileid.

    c. All files are created in upper-case.  Files in mixed-case
       cannot currently be stored into a ZIP file.

    d. Shared File System (SFS) directories are not supported.
       Files are always accessed by fn.ft.fm.  To use an SFS directory,
       first assign it a filemode; it can then be used.
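
    The conversion rules in 2b and 2c can be sketched in a few lines of
    Python.  This is only an illustration of the rules as described, not
    the code UNZIP uses.  The function name cms_fileid is hypothetical,
    and two details are assumptions: paths are taken to use "/" as the
    separator, and empty words (such as those left by a trailing period)
    are ignored.

```python
def cms_fileid(zip_name):
    """Return (filename, filetype) for a name stored in a zip file."""
    # Drop any path; this sketch assumes '/' separators.
    base = zip_name.rsplit("/", 1)[-1]
    # Remove parentheses, which the rules say are stripped from the fileid.
    base = base.replace("(", "").replace(")", "")
    # Split on periods; ignoring empty words is an assumption of this sketch.
    words = [w for w in base.split(".") if w]
    if not words:
        raise ValueError("no usable name in %r" % zip_name)
    if len(words) == 1:
        # No filetype (extension): fall back to NONAME.
        fn, ft = words[0], "NONAME"
    else:
        # Only the last two period-separated words are kept.
        fn, ft = words[-2], words[-1]
    # Truncate each part to eight characters and fold to upper case.
    return fn[:8].upper(), ft[:8].upper()


print(cms_fileid("src/unix/Makefile"))
print(cms_fileid("docs/verylongfilename.text"))
```

    Under these assumptions the first call yields MAKEFILE NONAME and
    the second yields VERYLONG TEXT.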


 3. Wildcards in file names

    a. Wildcards are not supported in the zip filename.  The full
       filename of the zip file must be given (but the .zip is not
       necessary).  So, you can't do this:
          unzip -t *.zip

    b. Wildcards CAN be used with UNZIP to select (or exclude) files
       inside a zip file.  Examples:
          unzip myzip *.c          - Unzip all .c files.
          unzip myzip *.c -x z*.c  - Unzip all .c files but those
                                     starting with Z.

    c. Wildcards cannot currently be used to select files with ZIP.
       So, you can't do this:
          zip -a myzip *.exec

       I expect to fix this for CMS in the future.


 4. File timestamps

    a. The dates and times of files being zipped or unzipped are not
       currently read or set.  When a file is zipped, the timestamp
       inside the zip file will always be the current system date and
       time.  Likewise, when unzipping, the date and time of files
       being unzipped will always be the current system date/time.

    b. Existing files are assumed to be newer than files inside a zip
       file when using the -f freshen option of UNZIP.  This will prevent
       overwriting files that may be newer than the files inside the
       zip file, but also effectively prevents the -f option from working.


 5. ASCII, EBCDIC, and binary data

    Background
    ----------
    Most systems create data files as just a stream of bytes.  Record
    breaks happen when certain characters (new line and/or carriage
    return characters) are encountered in the data.  How to interpret
    the data in a file is up to the user.  The system must be told
    to either notice new line characters in the data or to assume
    that the data in the file is binary data and should be read or
    written as-is.

    CMS and MVS are record-based systems.  All files are composed
    of data records.  These can be stored in fixed-length files or
    in variable length files.  With fixed-length files, each record
    is the same length.  The record breaks are implied by the
    LRECL (logical record length) attribute associated with the file.
    With variable-length files, each record contains the length of
    that record.  The separation of records is not part of the
    data, but part of the file structure.

    This means you can store any type of data in either type of file
    structure without having to worry about the data being interpreted
    as a record break.  Fixed-length files may have padding at the
    end of the file to make up a full record.  Variable-length files
    have no padding, but require extra record length data be stored
    with the file data.
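
    The difference between the two structures can be modelled roughly
    as follows.  This Python sketch is not how CMS lays files out on
    disk; the 2-byte big-endian length prefix and the zero pad byte are
    assumptions chosen for illustration, and the function names are
    hypothetical.

```python
import struct


def to_fixed(data, lrecl, pad=b"\x00"):
    """Split a byte stream into fixed-length records, padding the last one."""
    records = []
    for i in range(0, len(data), lrecl):
        records.append(data[i:i + lrecl].ljust(lrecl, pad))
    return records


def to_variable(records):
    """Store each record with its length in front (assumed 2 bytes)."""
    return b"".join(struct.pack(">H", len(r)) + r for r in records)


def from_variable(blob):
    """Recover the records by reading each length, then that many bytes."""
    records, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from(">H", blob, pos)
        records.append(blob[pos + 2:pos + 2 + n])
        pos += 2 + n
    return records


print(to_fixed(b"abcdefg", 3))
print(from_variable(to_variable([b"a\nb", b"", b"xyz"])))
```

    Note that the variable-length record containing a new line byte
    comes back intact, because the record break is carried by the
    stored length and not by the data.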

    Storing fixed-length files into a zip file is simple, because all
    the data can just be dumped into the zip file and the record
    format (RECFM) and logical record length (LRECL) can be stored
    in the extra data area of the zip file so they can be restored
    when UNZIP is used.

    Storing variable-length data is harder.  There is no place to put
    the record length data needed for each record of the file.  This
    data could be written to the zip file as the first two bytes of
    each record and interpreted that way by UNZIP.  That would make
    the data unusable on systems other than CMS and MVS, though.

    Currently, there isn't a solution to this problem.  Each record is
    written to the zip file and the record length information is
    discarded.  Binary data stored in variable-length files can't be put
    into a zip file then later unzipped back into the proper records.
    This is fine for binary data that will be read as a stream of bytes
    but not OK where the records matter, such as with CMS MODULEs.

    If the data is text (character data), there is a solution.
    This data can be converted into ASCII when it's stored into
    a zip file.  The end of each record is now marked in the file
    by new line characters.  Another advantage of this method is
    that the data is now accessible to non-EBCDIC systems.  When
    the data is unzipped on CMS or MVS, it is converted back into
    EBCDIC and the records are recreated into a variable-length file.
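
    A short Python sketch shows why the record lengths matter for
    binary data and why text escapes the problem.  It is a simplified
    model only: the function names are hypothetical and the
    EBCDIC/ASCII translation step is left out.

```python
def flatten(records):
    """Concatenate records and discard their lengths (binary case)."""
    return b"".join(records)


def text_to_stream(records):
    """Mark the end of each text record with a new line character."""
    return b"".join(r + b"\n" for r in records)


def stream_to_text(stream):
    """Rebuild the text records from the new line markers."""
    return stream.split(b"\n")[:-1]


# Two different record layouts produce the same byte stream, so the
# original layout cannot be recovered from the stream alone.
print(flatten([b"ab", b"c"]) == flatten([b"a", b"bc"]))

# Text records survive the round trip because the breaks are in the data.
print(stream_to_text(text_to_stream([b"ab", b"c"])))
```

    The text round trip works only if the records themselves contain
    no new line characters, which is why binary data must not be
    handled this way.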


    So, here's what we have...

    a. To store readable text data into a zip file that can be used
       on other platforms, use the -a option with ZIP to convert the
       data to ASCII.  These files will unzip into variable-length
       files on CMS and should not contain binary data or corruption
       may occur.

    b. Files that were zipped on an ASCII-based system will be
       automatically translated to EBCDIC when unzipped.  To prevent
       this (to unzip binary data on CMS that was sent from an
       ASCII-based system), use the -B option with UNZIP to force Binary
       mode.  To zip binary files on CMS, use the -B option with ZIP to
       force Binary mode.  This will prevent any data conversions from
       taking place.

    c. When using the ZIP program without specifying the "-a" or "-B"
       option, ZIP defaults to "native" (EBCDIC) mode and tries to
       preserve the file information (RECFM, LRECL, and BLKSIZE).  So
       when you unzip a file zipped with ZIP under CMS or MVS, UNZIP
       restores the file info.  The output will be fixed-length if the
       original was fixed and variable-length if the original was
       variable.

    If UNZIP gives a "write error (disk full?)" message, you may be
    trying to unzip a binary file that was zipped as a text file
    (without using the -B option).


    Summary
    -------
    Here's how to ZIP the different types of files.

    RECFM F text
       Use the -a option with ZIP to convert to ASCII for use with other
       platforms or no options for use on EBCDIC systems only.

    RECFM V text
       Use the -a option with ZIP to convert to ASCII for use with other
       platforms or no options for use on EBCDIC systems only.


    RECFM F binary
       Use the -B option with ZIP (upper-case "B").

    RECFM V binary
       Use the -B option with ZIP.  Can be zipped OK but the record
       structure is destroyed when unzipped.  This is OK for data files
       read as binary streams but not OK for files such as CMS MODULEs.
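
    The summary can also be written as a small decision helper.  This
    Python sketch only restates the table above; the function and
    parameter names are hypothetical and nothing here is part of ZIP
    itself.

```python
def zip_mode_option(is_text, cross_platform=False):
    """Return the ZIP option suggested by the summary ('' means none)."""
    if not is_text:
        # Binary data: force Binary mode so no conversion takes place.
        return "-B"
    # Text: convert to ASCII only when other platforms must read it.
    return "-a" if cross_platform else ""


def keeps_records(recfm, is_text):
    """False only for variable-length binary files, whose record
    structure is lost when they are unzipped."""
    return is_text or recfm.upper() == "F"


for recfm in ("F", "V"):
    for is_text in (True, False):
        print(recfm, "text" if is_text else "binary",
              repr(zip_mode_option(is_text, cross_platform=True)),
              keeps_records(recfm, is_text))
```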


 6. Character Sets

    If you are used to running UNZIP on systems like UNIX, DOS, OS/2 or
    Windows, you may have some problems with differences in the
    character set.

    There are a number of different EBCDIC code pages, like there are a
    number of different ASCII code pages.  For example, there is a US
    EBCDIC, a German EBCDIC, and a Swedish EBCDIC.  As long as you are
    working with other people who use the same EBCDIC code page, you
    will have no trouble.  If you work with people who use ASCII, or who
    use a different EBCDIC code page, you may need to do some
    translation.

    UNZIP translates ASCII text files to and from Open Systems EBCDIC
    (IBM-1047), which may not be the EBCDIC that you are using.  For
    example, US EBCDIC (IBM-037) uses different character codes for
    square brackets.  In such cases, you can use the ICONV utility
    (supplied with IBM C) to translate between your EBCDIC character set
    and IBM-1047.
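
    The effect of mixing EBCDIC code pages can be seen with Python's
    standard codecs.  This sketch uses the cp037 (US EBCDIC) and cp500
    codecs.  IBM-500 stands in for a second EBCDIC code page purely
    for illustration; it is not the IBM-1047 page that UNZIP uses, so
    the specific byte values differ from what you would see with UNZIP.

```python
text = "a[1]"

# Encode the same text with two different EBCDIC code pages.
us = text.encode("cp037")
other = text.encode("cp500")

# Letters and digits share the same codes; the brackets do not.
print(us.hex(), other.hex())

# Decoding with the wrong code page leaves letters and digits intact
# but turns the brackets into other characters.
wrong = us.decode("cp500")
print(wrong == text)
```

    The same kind of mismatch is what makes brackets display
    incorrectly when data in one EBCDIC code page is shown on a system
    that uses another.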

    If your installation does not use IBM-1047 EBCDIC, messages from
    UNZIP may look a little odd.  For example, in a US EBCDIC
    installation, an opening square bracket will become an i-acute and a
    closing square bracket will become a u-grave.

    The supplied ZIP and UNZIP EXECs attempt to correct this by setting
    CMS INPUT and OUTPUT translations to adjust the display of left and
    right brackets.  You may need to change this if brackets don't
    display correctly on your system.


 7. You can unzip using VM/CMS PIPELINES so unzip can be used as
    a pipeline filter.  Example:
       'PIPE COMMAND UNZIP -p test.zip george.test | Count Lines | Cons'




Please report all bugs and problems to:
   Zip-Bugs@lists.wku.edu


-----------------------------------------------------------------------
Original CMS/MVS port by George Petrov.
e-mail:  c888090@nlevdpsb.snads.philips.nl
tel:     +31-40-781155

Philips C&P
Eindhoven
The Netherlands

-----------------------------------------------------------------------
Additional fixes and README re-write (4/98) by Greg Hartwig.
e-mail:  ghartwig@ix.netcom.com
         ghartwig@vnet.ibm.com

-----------------------------------------------------------------------
Additional notes from Ian E. Gorman.
e-mail:  ian@iosphere.net