KY REGISTER

JUNE 1989

TABLE OF CONTENTS

   Conversion to VM/XA SP on the 3090
   New CMS Batch Facility on the 3090
   Graphics under VM/XA SP and CMS 5.5
   Dialup Modem Numbers Changing
   UKCC Short Courses
   Maintain Current Backups
   VIEW: Your Information Service
   PHOENIX Online Demo Available
   Holiday Schedule
   INFO/EXPO
   New List for TeX Users
   BMDPCA: Simple Correspondence Analysis
   Suggestions
   Service Directory

*************************************************************************

CONVERSION TO VM/XA SP ON THE 3090

All IBM 3090 users were switched to VM/XA SP Release 2 and CMS 5.5 on Tuesday, May 9, 1989. This affected users on UKCCS and UKCCXA. IBM 3084 (UKCC) users were not affected.
The 3090 is now called UKCCS for mail and file transfer (RSCS and BITNET) purposes and for requesting service through PASSTHRU (PVM) and TCP/IP.

The conversion to CMS 5.5 includes a new CMS batch facility, and some graphics packages were also affected by the conversion.

EXTENDED ARCHITECTURE

The traditional 370 architecture allowed only 24 bits for storage addressing, limiting programs to a 16 megabyte address space. The new extended architecture (XA) uses 31 bits for storage addressing, providing a potential for 2 gigabytes (about 2 billion bytes) of addressability. In addition to this, XA introduces many other changes, most significantly in the input/output system and the format and use of reserved storage locations, control registers, and status indicators.

VM/XA SP RELEASE 2

VM/XA SP makes the new features of XA available to CMS and programs running under it. XA SP supports two modes of virtual machine operation: 370 and XA. In 370 mode (the initial setting at logon) the old 370 architecture and its constraints apply. This increases compatibility with previous releases of VM, but restricts programs to 16 megabytes. In XA mode all of the new features of XA are used, but at the cost of compatibility. The important point to remember is that if you don't need a larger virtual machine, you should keep below 16 megabytes and run in 370 mode to minimize difficulties during conversion. The STORAGE command automatically handles mode switching as necessary based on your machine size.

VM/XA SP introduces a number of other changes that are unrelated to the new architecture. Changes in the spool system are perhaps the most obvious. There have been changes in the syntax of some commands and in the formats of many messages. VM/XA SP does not yet provide all the features of earlier, non-XA versions of VM.

CMS STORAGE

CMS uses the address space from about 12 megabytes to 16 megabytes for its nucleus, disk directories, and shared code. This is true regardless of the size of your virtual machine.
Thus a 12 megabyte machine provides about the same amount of space for user programs and data as a 16 megabyte machine. The initial storage size for most users at logon is 2 megabytes.

FORTRAN

The current VS FORTRAN (2.3) is supported under XA SP. Large arrays that will extend above the 16 megabyte boundary must be declared in dynamic common. If these arrays are passed to subroutines through the calling sequences, then only the calling program needs to declare them as dynamic common. GRAB VSF2 to get the current version of FORTRAN.

CONVERSION RECOMMENDATIONS

In general, IBM-supplied commands work in all modes. Most other commands work in 370 mode (16 megabytes or less), but may not work in XA mode (greater than 16 megabytes). Commands that haven't been converted will often fail with "operation exception" or "specification exception" errors. Some of these commands are used by other commands. Many other commands, for example, depend on MENUEXEC for display management and are not XA-capable.

* Use storage under 12 megabytes whenever possible.

* Use the STORAGE command to define your storage size and machine mode.

* If you need a function that isn't supported in XA mode, switch back to 370 mode by defining a storage size less than 12 megabytes.

* The number of simultaneous users of the SESSION command is severely restricted under VM/XA.

* The CMS batch system is considerably different from the system used under HPO. The functions provided are about the same, but batch jobs must be written as REXX EXECs. The syntax of the BATCH command has changed, too. See HELP NEWBATCH for more information.

GETTING HELP

UKCCS users (using VM/SP HPO 4.2) should consult these HELP files:

   Help CMSXA      (for all users)
   Help NEWBATCH   (for users of CMS BATCH)

For more help with CMS BATCH, contact Trent Fraebel, SYSTRENT@UKCC.UKY.EDU, 257-2277, 206 McVey Hall. If you have questions about graphics under XA, contact Charles Fisher at 257-2268, SYSCHUCK@UKCC.UKY.EDU, 206 McVey Hall.
For other questions about CMS 5.5 or XA, contact a Consultant in 110 McVey Hall, 257-2249, SUGGEST@UKCC.UKY.EDU.

*************************************************************************

NEW CMS BATCH FACILITY ON THE 3090

The new CMS batch facility on the 3090 provides VM users with the ability to submit, schedule, query, and execute batch jobs in a VM environment. It replaces the batch program that ran on the HPO system. The old system will not run under XA.

The new batch facility is controlled by a supervisory virtual machine that dispatches, controls, and monitors other virtual machines in which the batch jobs are processed. Some of the highlights of the new facility include

* An autolog option that directs the program to automatically log on your userid at a particular time.

* An easy-to-use set of commands that allows you to submit a job, check its status, change job parameters, and cancel the job.

* The ability to change job options after the job has been submitted.

* The ability to run jobs at any time, including nights or weekends.

* Enhanced function and additional flexibility in job management.

The major differences between the batch system on UKCC and UKCCS include:

* Job Control statements are now options of the BATCH functions.

* All jobs must be written in REXX.

* /SET control cards for CARD, LINES, PRINT, PUNCH, SIZE, and TIME are now included in the CLASS option of the SUBMIT function. See CLASS options below.

There are three classes of jobs available: small, medium, and large. Following are the default and maximum settings.

   Class     CPU sec.                 Print/Punch      Virt. Storage
                                      (in 1,000s)
             Def.   Max.              Def.   Max.      Def.

   Small     60     3600  (1 hr)      10     200       4 megs
   Medium    60     57600 (16 hrs)    10     200       12 megs
   Large     60     57600 (16 hrs)    10     200       32 megs (machine mode XA)

The default class is SMALL. To change default settings use the options on the SUBMIT command.
For example,

   BATCH SUBMIT fn ft fm ( CLASS LARGE SECONDS 50000 PRINT 25

With the RATE option you can set DAY, NIGHT, WEEKEND, or HOLIDAY. The default rate is DAY.

More detailed information is available in the IBM publication, VM Batch Facility Users' Guide SC34-4094-1. Online help is available by typing HELP BATCH on UKCCS. If you have questions about the new batch facility, contact Trent Fraebel, 257-2277, SYSTRENT@UKCC.UKY.EDU, 206 McVey Hall.

*************************************************************************

GRAPHICS UNDER VM/XA SP AND CMS 5.5

While most graphics programs will run in 370 mode automatically, the conversion of the 3090 to VM/XA SP has affected some graphics programs. Following are the graphics commands and packages which may be affected by the conversion. Most of these commands were written by the UKCC, and some were written at other universities or are from non-IBM commercial sources.

XA indicates that a command or package can be used in either an XA-mode machine or a 370-mode machine. 370 indicates that a 370-mode machine is required. No indicates that the command is not available under VM/XA SP.

   CADAM      No       GK-2000    370      PAINTSHO   370
   CATIA      No       GraPHIGS   370      PICSURE    370
   CBDS       No       LINEMODE   370      PLOT       No
   DI3000     370      LWGDDM     370      PLOTLW     XA
   I-DEAS     No       LWPLOT     XA       PLOTTEK    370
   GDDM       370      LWPRINT    XA       TKPLOT     370

CONVERSION PROGRESS

We're working to make more graphics packages available in 370 mode and to make more of the graphics functions already available in 370 mode work in XA mode. Engineering graphics packages, in particular, should be working at least in 370 mode soon. Following are some specific notes on various graphics commands.

DI3000

DI3000 should work with no noticeable alterations except that it may be necessary to use the dynamic device drivers in some cases. Due to several assembler subroutines dependent on 370 operating instructions, the programs using DI3000 are usable in 370 mode only. We are working to rectify this.
GDDM

Due to the arrangement of shared segments under CMS 5.5 it may be necessary to issue

   GLOBAL TXTLIB ADMGLIB PLILIB

before using some utilities written with GDDM. The GDDM <-> REXX interface available under CMS 5.5 is not RXGDDM, but a newer IBM product, GDDM-REXX. Conversion between the two is very simple. Help on using the new interface is available; just type HELP GDDMREXX. Currently, no program using GDDM runs successfully in XA mode. A new release of the software package (GDDM-XA) may fix this problem.

I-DEAS

A new version incorporating support for CMS 5.5 has arrived and will be installed when the user community on UKCCS has been moved over to CMS 5.5 and VM/XA SP 2.

PLOT

This command cannot send files to be plotted directly from CMS 5.5. Send your files to UKCC and plot from there.

GETTING HELP

If you need help coping with the conversion, contact Chuck Fisher, 257-2268, SYSCHUCK@UKCC.UKY.EDU, 206 McVey Hall; or Bob Williamson, 257-2227, ROBERTT@UKCC.UKY.EDU, 207 McVey Hall.

*************************************************************************

DIALUP MODEM NUMBERS CHANGING

On July 1, 1989 the dialup modem phone numbers will be changed. The phone numbers affected and what they will become are:

   257-2400 changing to 258-2400 for 2400 baud access.
   257-9200 changing to 258-1200 for 1200 baud access.

All lines will be at 8 bits, no parity, 1 stop bit. After the conversion there will be twenty 2400 baud modems and sixteen 1200 baud modems.

If you have questions or need help, contact Robert Lee at 257-2201, SYSBOB@UKCC.UKY.EDU, 9 McVey Hall.

*************************************************************************

UKCC SHORT COURSES

The following short courses are free to all UK faculty, staff, and students, but preregistration is required. If you register for a course and then find that you will be unable to attend, please cancel your registration by calling 257-UKCC. Failure to do so may jeopardize your right to register for future UKCC short courses.
You can register online. Enter PUBLIC, and then type SHORTCOUrse, or enter VIEW UKCC SHORTCOURSE.

If there are prerequisites for a particular class, they'll be listed in the class description. If you have questions about class content or bypassing prerequisites, call the instructor for that class.

Introduction to VM/CMS and XEDIT
Tuesday, June 20 and Thursday, June 22
Noon to 2:00 p.m.
104 King Library

A basic introduction to interactive use of the IBM mainframe systems, this class presumes no previous knowledge of the IBM systems or any other computer system. You'll learn how to access the computer, how to create and manage files on your account, and how to use online tools such as CALENDAR and VIEW. You'll also learn how to use the CMS text editor, XEDIT, to create and modify individual files. This course will be taught in two two-hour sessions. Both sessions will provide hands-on practice of the commands that are covered. You will be given a class computer account which will remain active for the duration of the course. Your instructor will be Pat Murphy (257-2244).

Introduction to Electronic Mail on the IBM
Thursday, June 29
Noon to 2:00 p.m.
104 King Library

An introduction to the MAIL command on the IBM 3084, this class is for the beginner. You'll learn how to create mail files and send them to other IBM system users, to WANG system users, or to PRIME system users. We'll also cover the use of BITNET to communicate with individuals at other academic centers around the world. You'll learn how to read incoming mail and some techniques for storing old mail. We'll discuss how to create and maintain a NAMES file of individuals with whom you frequently correspond. You will be taught the logon sequence and some basic CMS background before we begin the discussion of MAIL. This class presumes no previous knowledge of the IBM systems or any other computer system.
You'll be given a CMS account for the duration of the course and will receive hands-on instruction for all the commands covered. Your instructor will be Pat Murphy (257-2244).

Introduction to PHOENIX
Monday and Tuesday, July 17 and 18
Noon to 2:00 p.m.
104 King Library

The UKCC has purchased PHOENIX, a courseware authoring and presentation system, to run on the IBM 3084. This software greatly simplifies the task of creating computer-based training packages and computer test bank applications. The system provides a powerful full screen editor for creating presentation screens. Standard question types which are supported through a complex answer analysis feature are short answer, fill-in-the-blank, multiple choice, and true-false. The entire system from sign-on to sign-off is menu-driven, making it relatively easy even for non-programmers to develop quality computer courseware to supplement or enhance existing classroom instruction. Students can access courseware written for the IBM 3084 from any of the terminal cluster sites on campus. This introductory course is intended for anyone who has an interest in developing computer-aided instruction. No previous computer experience is required. Pat Murphy will be your instructor (257-2244).

Introduction to SAS
Monday through Thursday, June 26 through 29
3:00 p.m. to 5:00 p.m.
103 McVey Hall

SAS is a collection of powerful and flexible data management and statistical analysis procedures that allow you to create and analyze libraries of data files on the 3084. The course will emphasize simple data manipulation and general syntax and is designed for new and inexperienced SAS users. CMS and XEDIT knowledge is prerequisite. Your instructor will be Steve Thomson (257-2259).

*************************************************************************

MAINTAIN CURRENT BACKUPS

If your files are important, back them up! While the UKCC has very few problems with loss of data stored on tapes and disks, it does happen occasionally.
You are responsible for maintaining backup of critical information stored on tape or disk. Be sure this backup is recent enough to satisfy your needs in the event the data are no longer readable.

Data loss can occur when you accidentally erase a file or when a user writes over another user's tape. We can restore data; that is, we can copy data from a backup source, but we cannot recreate or reconstruct data if no backup exists, even if the loss results from faulty equipment or mishandling.

We recommend you keep a backup copy of data on a duplicate tape or disk. Two backups are recommended for data that are very difficult or impossible to recreate. You can keep your backup copy on a hard disk or on diskette. A computer printout can also serve as backup, but if the data on both tape and disk are destroyed, it may be inaccurate, time-consuming, and expensive to recreate the information into a machine-readable form by keying or scanning.

Maintaining backups is the most effective safeguard against a virus attack. Without adequate backups, the discovery of a virus within a file would entail examining each element of the data within every file to detect and eliminate the infection. This could be a very time-consuming and costly process.

WHAT TO BACK UP

Keep backups for any data and programs that would be difficult to replace. Backups are essential for complex research or experiment data, for outdated or very old data files, and for data that cannot be recreated.

You are also responsible for identifying files which contain data required to be retained for archive purposes. (There is an environmentally controlled and secured archive storage area in the King Library. Archivist Terry Birdwhistell in Special Collections has more information.)

Keep in mind that, over time, the ability to retrieve data stored on magnetic media decreases. This is most often the result of deterioration of the physical media and sometimes due to a loss of strength of the magnetic signal.
Thus, files which are archived should be periodically renewed.

We recommend that all archive data files be created in two copies with one being retained at the UKCC and the other copy being retained in an off-site storage area. This will ensure that the data would not be lost if either the UKCC or the off-site storage area suffered a disaster.

To provide a high degree of assurance that the file can be read after it has been stored for an extended period of time, we suggest:

Twelve months from the time the original file was created, use the duplicate to create a second duplicate. Use the same vendor-supplied utility program for the duplication that you used to create the first duplicate. The first duplicate should be rotated to the off-site storage area and the copy which was in the off-site storage area returned to the UKCC for use. By repeating this twelve-month cycle you ensure that no copy of your data becomes older than two years without being replaced.

Don't use a non-commercial program, a locally written program, or a user-written program to back up your files. Use a vendor-supplied utility program such as FATAR, FDRDSF, SYNCGENR, IEBGENER, or IEBCOPY to create a second copy of your files. This will ensure that the original file can be read and will provide a reasonable degree of assurance that the file will not be modified during the copy operation.

A Consultant can help you copy to tape. A temporary tape is limited to two days of storage. For long term storage, expect to use a 6250 BPI standard labeled tape with a volume serial number selected by the Operations staff.

UNAUTHORIZED USE

We strongly recommend that all magnetic tapes be secured from unauthorized use. To accomplish this, contact Jack Coffman, UKA051@UKCC.UKY.EDU, 257-2253, 218 McVey Hall for the necessary forms.

A Consultant in 110 McVey Hall can give you more information and help on utility programs and backups for your data stored on tapes and disks.
A Consultant in the Micro Lab, 107 McVey Hall, can help you with downloading to a microcomputer. A copy of the FATS/FATAR user manual is available for reference in the Consulting Room, 110 McVey Hall.

For more information about tape protection and authorization, contact Jack Coffman, UKA051@UKCC.UKY.EDU, 257-2273, 218 McVey Hall.

*************************************************************************

VIEW: YOUR INFORMATION SERVICE

Want to hear what scientists are saying about cold fusion, or chart this year's sunspot activity? VIEW has the answers. With VIEW you can keep up with the arts on the Arts list or learn more about viruses on the Virus list. Other lists carry information on such subjects as desktop publishing, laser printers, risks, space, vector processing, veterinary medicine, and video technology. Primarily designed for campuswide announcements, VIEW is a convenient way to keep yourself informed about what's going on around the world.

VIEW is easy to use. Just enter VIEW from your CMS userid, and VIEW will display a menu. Move the cursor to an item, and press ENTER to select it. If you know the name of an item, you can bypass the menus. For example, to see the campus news, enter VIEW NEWS and go directly to it.

If your department has campus news suitable for VIEW, contact Dave Elbon at 257-2230, SYSDAVE@UKCC.UKY.EDU, 211B McVey Hall.

*************************************************************************

PHOENIX ONLINE DEMO AVAILABLE

PHOENIX, the courseware authoring and presentation system, has been available for a year now, and several departments around campus are making use of this exciting software package. PHOENIX greatly simplifies the task of creating computer-based training courses and computer test bank applications.
The demo of PHOENIX, which was shown at INFO/EXPO, is now available to all CMS users and can be accessed from any of the terminal cluster sites throughout the campus. To view the demo, log on and enter

   GRAB PHOENIX

Then enter

   CBEIUCV PHOENIX

to bring up the PHOENIX course sign-on screen. When this screen appears, type S102 next to the prompt for sign-on ID and EXPO next to the course name. Press the ENTER key.

Press the ENTER key to advance from one screen to the next except when a menu is displayed. When a menu is displayed, choose your options from that menu. Once you enter your menu choice, press ENTER. Almost all PHOENIX presentation screens have a command arrow at the bottom of the screen. You may exit PHOENIX at any time by entering SIGNOFF or SIGN OFF on that line.

PHOENIX authoring is menu-driven, from sign-on to sign-off, so that even nonprogrammers can develop quality computer courseware to supplement or enhance existing classroom instruction. Its powerful full screen editor can create presentation screens, with such features as centering single lines or entire blocks of text, text wrapping, moving or copying single lines or blocks of text, and drawing boxes and lines for additional visual impact.

The presentation screens are already formatted and can be used to present textual information or questions. Standard question types which are supported through a complex answer analysis feature are short answer, fill-in-the-blank, multiple choice, and true-false. Presentation screens are automatically linked in the order in which they are entered, but you can use a branching option which will branch to different locations in the course based on the student's response to individual questions. In addition to simplifying the production of courseware, PHOENIX also provides accurate, timely tracking of student performance.

The UKCC will offer the short course Introduction to PHOENIX in July.
If you have questions about PHOENIX, contact Pat Murphy, UKC103@UKCC.UKY.EDU, 257-2244; Wayne Beech, 257-2238, WAYNE@UKCC.UKY.EDU; or Peggy Akridge, 257-2237, PEGGY@UKCC.UKY.EDU.

*************************************************************************

HOLIDAY SCHEDULE

Tuesday, July 4 is an official UK holiday. The UKCC offices, Consulting Room, and Micro Lab will be closed. The Data Center and Users' Rooms in 103 and 111 McVey Hall will be open from Noon until 12:30 a.m. The IBM and PRIME systems will be in operation, as usual.

*************************************************************************

INFO/EXPO

The two-day information fair held in April was well-received by the University community. Many departments on campus participated by showing how they use computer-based services for research and instruction. Online electronic databases, CD-ROM products, electronic courier systems, printing services, using local, national and international networks, document scanning, and UK cable TV were a few of the systems displayed.

Plans are under way for another fair in the fall. All areas of the University are invited to participate. Details will be forthcoming.

-- Lavine Thrailkill

*************************************************************************

NEW LIST FOR TeX USERS

A campuswide list has been created to discuss TeX-related problems, suggestions, and helpful macros. To add yourself to the list, send

   SUBscribe TEXUSERS your first name your last name

to LISTSERV@UKCC.UKY.EDU from any machine on campus that uses TeX. You'll receive electronic notification. Once you're on the list, you can send mail to

   TEXUSERS@UKCC.UKY.EDU

The list is an open discussion list. When you reply to mail received from the list, it will reach all members of the list. If you want to send a private reply, send mail to the original sender.

For more information about the list or about TeX, contact Shashi Sathaye, 257-2247, SYSSHASH@UKCC.UKY.EDU, 210 McVey Hall.
-- Shashi Sathaye

*************************************************************************

BMDPCA: SIMPLE CORRESPONDENCE ANALYSIS

Simple correspondence analysis is an exploratory data analysis technique to map the cell frequencies of a two-way cross classification table into a graphical display, where the rows and columns of the table are represented as points in the display. Devotees of this technique claim that the plots and tables generated by the correspondence analysis will contain most of the salient features of your data. The BMDP-88 statistical library on MVS/370 includes BMDPCA, a program to perform such simple correspondence analyses.

Correspondence analysis is very popular in Europe, particularly France, where the term "exploratory data analysis" is apparently synonymous with correspondence analysis. Apparently, when faced with continuous variables, Europeans will often group them into convenient categories and proceed with a usual correspondence analysis.

If you have a two-way frequency table, with rows defining different populations, and two columns, it's very easy to compare the row percentages to get a clear picture, at least for the rows, of the data layout. Row percentages add to 100, so it's easy to look at one column of the table and compare the various row profiles. But this is more difficult, perhaps impossible, with more than two column levels and more than two row levels in the table. For example, if there are 7 column categories in each row, you have to group rows by their similarity in 6 dimensions. Correspondence analysis takes both row and column category profiles (all values in that row or column) and represents them in a lower dimensional space where it is easier to judge similarity.

Recall that if O represents an observed frequency and E represents an expected frequency under some model, then, summing over all cells in the table, the Pearson

   chi-square = SUM (O-E)^2 / E.
A correspondence analysis starts by computing the square root of the cell contribution to the Pearson chi-square statistic to test independence (or equivalently homogeneity) of the variables defining the margins of the table. That is, for i indexing rows of the table, and j columns, then with N denoting total frequency, and O and E the observed and expected frequencies in cell (i,j),

   d(i,j) = (1/sqrt(N)) (O-E) / sqrt(E)

If there are, say, I rows and J columns in the table, this defines an IxJ matrix, say D. A correspondence analysis is then essentially principal components of the matrices defined by crossproducts of the matrix D (D'D or DD', where D' is the transpose of D). The traces of these two crossproducts matrices are equal, as are their positive eigenvalues. In either case, the trace is called the total inertia. It is just the Pearson chi-square statistic for testing independence (or homogeneity) divided by N.

Note that the sum of the eigenvalues of the matrix is the inertia. If most of the inertia can be attributed to a few dimensions, corresponding to the largest eigenvalues, then, presumably, little information in the table is lost by considering only the derived factor space associated with those largest eigenvalues. Often one or two dimensions represent virtually all the total inertia (say close to 90%).

By taking principal components from D'D, you get components for rows while taking principal components of DD' gives you components for columns. Both sets of components generate "canonical variates" that can be represented in the same derived space. (Mathematically, this is just a so-called "singular value decomposition" of D.) In fact, correspondence analysis is just a canonical correlation analysis of the matrix of standardized deviates.

If little information is lost (the first two or so eigenvalues are large), then most row or column profiles should be well represented in this derived factor space.
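
The relationship between the standardized deviates and the total inertia can be checked numerically. The following is a minimal sketch in Python, not part of BMDPCA: the function names and the 2 x 2 table are hypothetical and chosen only for illustration, and the deviates are assumed to be scaled as (O-E)/sqrt(N*E) so that the trace of the crossproducts matrix equals chi-square divided by N.

```python
import math


def standardized_deviates(table):
    """Return (D, chi_square, N) for a two-way table of counts.

    D[i][j] = (O - E) / sqrt(N * E), where E = row total * column total / N
    is the expected frequency under independence.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    d = []
    chi_square = 0.0
    for i, row in enumerate(table):
        d_row = []
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi_square += (observed - expected) ** 2 / expected
            d_row.append((observed - expected) / math.sqrt(n * expected))
        d.append(d_row)
    return d, chi_square, n


def total_inertia(d):
    """Trace of D'D (and of DD'): the sum of the squared entries of D."""
    return sum(x * x for row in d for x in row)


# Hypothetical 2 x 2 table, used only for illustration.
toy = [[10, 20], [30, 40]]
D, chi2, N = standardized_deviates(toy)
inertia = total_inertia(D)
print(chi2, inertia, chi2 / N)
```

The last two printed values should agree, since summing the squared deviates is the same as dividing the Pearson chi-square by N.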
If a column or row profile is well represented in the factor space, most of its contribution to the total inertia is represented by the distance to the origin in that factor space. So distances in the derived factor space will roughly correspond to chi-square values. To aid in interpretation of such distances, BMDPCA provides a printer plot of the canonical variate values corresponding to row and column profiles in this factor space. Similarities are probably most easily judged from this plot. Axes will usually not be scaled equally, so some care must be taken when interpreting apparent distances on the plot, but the plots can still be informative.

In general, canonical variate values of similar rows (or columns) will be close together in the derived factor space. Thus, the distance can be used as a measure of similarity. Also, the closer the canonical variate values of a row (or column) are to the origin, then, generally, the closer will be the row (or column) profile to the "average" row (or column) profile.

Much more roughly, the i-th row profile and j-th column profile will generally be close when the ij-th cell is much larger than expected under independence, and will be far apart when the ij-th cell is much smaller than expected. Thus, very small distances or very large distances between a row profile and a column profile might indicate positive or negative association respectively.

Hence, from the points in the derived factor space, it will often be possible to group similar rows, similar columns, and possibly get some idea of unusually large or small cell frequencies.

A number of statistics are printed to help judge whether or not a column or row profile is well represented in the factor space. Probably the two most useful are the "Quality," labeled "QLT" on the output, and the correlation, labeled "COR." The Quality is essentially an R-squared statistic for that row or column.
Values close to 1 indicate the representation is good, while small values indicate the representation is bad. Provisionally, it would seem that quality values below .6 are very bad, and probably they should be above .7 or .8. The Correlation can be interpreted as the contribution of each derived dimension to the Quality, and serves as an indicator of how well a profile is represented by that axis.

SOME SPECIFIC CA COMMANDS

/INPUT
   The following commands are useful for direct input of frequency tables.

   TABLE=#1,#2,...
      Specifies direct input of a multi-way frequency table. #1, #2, ... define the number of levels of each categorical variable defining the table. The input data would correspond to cell frequencies.

CORRESpondence.../
   Performs two-way correspondence analysis.

   ROW=varlist.
      Lists variables to be combined into a row factor.
   COLumn=varlist.
      Lists variables to be combined into a column factor.
   AXIS=#.
      Number of axes (factors) to be used. Default is as many as needed to accumulate 90% of inertia.
   CONSTant=#.
      Construct axes (factors) until # of inertia used. Default is #=0.9.
   SELect(var)=list.
      Restricts analysis to listed strata of the variable. Var cannot appear in ROW or COL command.
   COUNT=varname.
      Indicates variable containing cell frequencies.

SUPPLEMentary.../
   Analyzes supplemental profiles in the derived factor space from the previous CORRESpondence paragraph. This is useful to indicate various reference groups, e.g. known population profiles, in the derived factor space.

PRINT ... /
   Controls printed output.

   OBServed.
      Default. Prints the observed 2-way frequency table. To suppress state NO OBServed.
   EXCluded.
      Default. Prints a table of missing, out of range values. To suppress state NO EXCluded.
   PERCENT=NONE|ROW|COLumn|TOTal.
      One or more. Prints tables of specified percentages.
   EXPected.
      Prints table of expected cell frequencies under independence.
   DIFFerence.
      Prints table of "residuals" ((O-E)/E).
        Note these are not the standardized deviates used in the
        correspondence analysis.
   DICTionary.
        Indicates how labels are generated. Only appropriate when
        more than one variable defines a margin of the observed
        frequency table.

PLOT .../
   Generates one- and two-dimensional plots of profiles.

   BOTH.
        Plot both row and column profiles on one plot.
   ROWS.
        Plot row profiles on one plot.
   COLumns.
        Plot column profiles on one plot.
   AXIS=1|2|3.
        Number of axes in plot.

AN EXAMPLE

Suppose we have a response (dumping syndrome) which takes levels none, slight, or moderate (denoted N, S, M respectively) to one of four surgical treatments at two different hospitals. Note that moderate dumping syndrome is an extreme response. Defining rows by hospital and treatment combinations, and columns by the response, suppose the following frequency table is observed:

                       RESPONSE
   HOSP   TREAT      N     S     M    Total
    A       a       23     7     2      32
            b       23    10     5      38
            c       20    13     5      38
            d       24    10     6      40
    B       a       18     6     1      25
            b       18     6     2      26
            c       13    13     2      28
            d        9    15     2      26
   Total           148    80    25     253

A possible way to help interpret the structure in the data set would be to perform a simple correspondence analysis. If we define columns as the response, there are only two possible axes (since there are three possible responses), explaining 100% of the inertia. The following BMDPCA program would probably be appropriate.

   /PROBLEM   TITLE IS 'Severity of Dumping Syndrome'.
   /INPUT     VARIABLES ARE 3.
              FORMAT IS FREE.
              TABLE IS 3,2,4.
   /VARIABLE  NAMES ARE RESPONSE,HOSP,TREAT.
   /CATEGORY  NAMES(RESPONSE) ARE NONE,SLIGHT,MODERATE.
              NAMES(TREAT) ARE a,b,c,d.
              NAMES(HOSP) ARE A,B.
   /END
   23  7  2   18  6  1
   23 10  5   18  6  2
   20 13  5   13 13  2
   24 10  6    9 15  2
   CORRES     ROW IS HOSP, TREAT.
              COL IS RESPONSE./
   PLOT       ROWS. COLS. BOTH. /
   PRINT      DICT./
   END /

Rows are defined by the hospital-treatment combinations, so DICT. is a useful command. We get the following correspondence plot.

   [Printer plot: BMDPCA, "Severity of Dumping Syndrome." Row profiles
   Aa through Bd and column profiles N, S, M plotted on AXIS 1
   (horizontal, tick marks from -.6 to .4) against AXIS 2 (vertical,
   tick marks from -.4 to .2).]

   eigenvalues:  L1 = .0628 ( 79.7%)   L2 = .0160 ( 20.3%)

This plot suggests rows Ab and Ad have similar profiles, as do rows Aa, Bb, and, to a lesser extent, Ba. On axis 1, which explains 79.7% of the inertia, the rows fall into 5 groups. The column corresponding to moderate response is essentially equivalent to the average of all columns on this axis; that is, it is close to the origin. Looking again at both dimensions, there is evidence that treatment d at hospital A is positively associated with a moderate response, while all treatments at hospital B show some negative association with moderate response. So, overall, hospital B seems to be better. However, treatment c at hospital B is positively associated with slight occurrence of dumping. Since any occurrence of the dumping syndrome is potentially harmful, such observations may be useful information.

When interpreting the plot, remember that because of the scaling problems in the printer plots mentioned above, it is not really the distance between points on the plot that indicates similarity, but a comparison of their horizontal and vertical distances to the origin.

You can observe other patterns from the plot, but it is valuable to note that these are only observed patterns and will be very sensitive to a few observations in the data. Correspondence analysis is an exploratory technique. To draw conclusions, we would probably have to resort to some hypothesis testing procedures such as BMDP4F, CATMOD in SAS, or LOGLINEAR in SPSSx, preferably applied to a different subset of the data.

BMDPCA is described in volume 2 of the new BMDP Statistical Software Manual, University of California Press.
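
As a check on the example, recall that the total inertia equals the Pearson chi-square statistic for the table divided by the grand total, and that it is also the sum of the eigenvalues. The short Python sketch below is not BMDPCA output. It assumes the frequency table is exactly as printed in the example, and the names `table` and `total_inertia` are illustrative only. If the table is transcribed correctly, the result should agree with .0628 + .0160 = .0788 to rounding.

```python
# Observed frequencies from the example; columns are N, S, M.
# Row labels are hospital (A, B) followed by treatment (a-d).
table = {
    "Aa": [23, 7, 2],
    "Ab": [23, 10, 5],
    "Ac": [20, 13, 5],
    "Ad": [24, 10, 6],
    "Ba": [18, 6, 1],
    "Bb": [18, 6, 2],
    "Bc": [13, 13, 2],
    "Bd": [9, 15, 2],
}


def total_inertia(rows):
    """Return (total inertia, chi-square contribution of each row)."""
    row_totals = {label: sum(counts) for label, counts in rows.items()}
    n_cols = len(next(iter(rows.values())))
    col_totals = [sum(c[j] for c in rows.values()) for j in range(n_cols)]
    n = sum(row_totals.values())

    contrib = {}
    for label, counts in rows.items():
        chi2 = 0.0
        for j, observed in enumerate(counts):
            # Expected frequency under independence of rows and columns.
            expected = row_totals[label] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
        contrib[label] = chi2

    # Total inertia is the Pearson chi-square divided by the grand total.
    return sum(contrib.values()) / n, contrib


inertia, contrib = total_inertia(table)
print(f"total inertia = {inertia:.4f}")
for label, chi2 in contrib.items():
    share = chi2 / sum(contrib.values())
    print(f"{label}: {share:.1%} of total inertia")
```

The per-row shares show which hospital-treatment combinations depart most from the average profile. They do not replace the QLT and COR statistics, which describe how well each profile is represented on the retained axes.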
BMDPCA is also described in the BMDP technical report #87, CA: Correspondence Analysis, available from BMDP Statistical Software, Inc., 1440 Sepulveda Blvd., Suite 316, Los Angeles, CA 90025, for $6.50 (including postage and handling). A reference copy of each is available in the Consulting Room, 110 McVey Hall. For more information or help with BMDPCA, contact Steve Thomson, STEVE@UKCC, 120 McVey Hall, 257-2259.

-- Steve Thomson

*************************************************************************

SUGGESTIONS

1. LINKC exec on the WATC disk has a bug in it. It tries to generate a load module by using the GENMOD command. In the enclosed exec it shows that instead of GENMOD &1 it uses GEN &1. CMS doesn't recognise the GEN and doesn't generate the load module, thereby frustrating the programmer (me). It's a very easy bug to fix; just replace GEN with GENMOD.

   We made the suggested correction, and the LINKC works correctly now. Thanks for the suggestion!

*************************************************************************

UKCC SERVICE DIRECTORY

Service
   Contact                 E-Mail Address  Phone      McVey Hall

Vice President, Information Services
   Eugene R. Williams      DPS128@UKCC     257-3609
Director, University Computing Services
   Dr. Douglas Hurley      HURLEY@UKCC     257-2900   128
Director, Communications & Distributed Systems
   Doyle Friskney          DOYLE@UKCC      257-6225
Director, Computational Sciences
   Dr. John Connolly       CONNOLLY@UKCC   257-8737   324

Academic Consulting Services
   Lavine Thrailkill       UKC105@UKCC     257-2257   121
CMS Consulting
   Bob Crovo               CROVO@UKCC      257-2258   109
Complaints
   Carol Lotz              LOTZ@UKCC       257-2213   129
Consultant for Remote Sites
   Wanda Dixon Spisak      WANDA@UKCC      257-2206   115
Consulting
   Consultant on Duty      SUGGEST@UKCC    257-2249   110
Contingency Planning & Security
   Jack L. Coffman         UKA051@UKCC     257-2273   218
Database - IDMS
   Rick Chlopan            DBA003@UKCC     257-2211   230E
Data Center
                                           257-2222   61
Data Entry
   Frank McCormick         OPFRANK@UKCC    257-2216   72
Disk Rental
   Janet Hyatt             HYATT@UKCC      257-2212   130
   Larry Johnson           JOHNSON@UKCC    257-2217   130
Facilities Operations
   Joe Williams            UKA048@UKCC     257-2231   122
Graphics Consultation
   Bob Williamson          ROBERTT@UKCC    257-2227   207
Information Center
   Judy Kisil              UKA041@UKCC     257-2241   222
Information Resources
   Dr. Jon Hesseldenz      UKA045@UKCC     257-3904   230D
Instructional Software
   Wayne Beech             WAYNE@UKCC      257-2238   100
Machine Room
                                           257-2222   59
Management Information Systems
   Forrest Hahn            UKA006@UKCC     257-2260   123
Memos and Manuals
   Consulting Room                         257-2249   110
Micro Lab
                                           257-2207   107
Network/Telecommunications
                           UKT101@UKCC     257-2229
New Accounts
   Janet Hyatt             HYATT@UKCC      257-2212   130
   Larry Johnson           JOHNSON@UKCC    257-2217   130
Numerical Analysis Consulting
   Anne Leigh              ANNE@UKCC       257-2205   109B
Optical Scanner - NCS
   Chris Corman            CHRIS@UKCC      257-2243   109
   Bob Crovo               CROVO@UKCC      257-2258   109
Passwords
   Janet Hyatt             HYATT@UKCC      257-2212   130
   Larry Johnson           JOHNSON@UKCC    257-2217   130
PRIME Information
   Peggy Akridge           PEGGY@UKCC      257-2237   100
Program Documentation/Libraries
   Consulting Room                         257-2249   110
Publications Office
   Marguerite Floyd        EDITOR@UKCC     257-2219   200
Refunds
   Consulting Room                         257-2249   110
SAS and SPSS Consulting
   Steve Thomson           STEVE@UKCC      257-2259   120
   Lorinda Wang            UKC333@UKCC     257-2204   109B
Statistical Consulting
   Steve Thomson           STEVE@UKCC      257-2259   120
Tapes to Borrow, Tape Storage
   Data Center                             257-2222   61
Tours of UKCC
   Lavine Thrailkill       UKC105@UKCC     257-2257   121
User Account Services
   Janet Hyatt             HYATT@UKCC      257-2212   130
   Larry Johnson           JOHNSON@UKCC    257-2217   130
Vectorization Consulting
   Tom Faller              TOMFAL@UKCC     257-2236   314

*************************************************************************

UNIVERSITY COMPUTING ADVISORY COMMITTEE

   Douglas E. Hurley, Central Administration
   H. Clay Owen, Central Administration
   A.J. Hauselman, Community Colleges
   James W. Phillips, Community Colleges
   Raphael Finkel, Lexington Campus
   Leonard K. Peters, Lexington Campus
   N. Clare Detraz, Medical Center
   David A. Nash, Medical Center
   T. Earle Bowen, Ex Officio
   Ben W. Carr, Ex Officio
   Wimberly C. Royster, Ex Officio
   Donald E. Sands, Ex Officio
   Eugene R. Williams, Ex Officio

*************************************************************************