_________________________________________________________________ _________________________________________________________________ Using the AMPL Under UNIX This booklet supplements AMPL: A Modeling Language for Math- ematical Programming for users of the AMPL software on computers running the UNIX operating system. Part I of this booklet describes the installation and use of the AMPL under UNIX. Parts II introduce the CPLEX Solver solver for linear, integer, and network optimization. Licenses for AMPL and CPLEX are available from CPLEX Optimization. CPLEX Optimization, Inc. 930 Tahoe Blvd #802-279 Incline Village, NV 89451 Ph: (702) 831-7744 Fax: (702) 831-7755 email: info@cplex.com I. Using AMPL The AMPL software described in this booklet runs under any version of the UNIX operating system. It requires approximately 2 megabytes of disk space for the Commercial Edition program files. The Commercial Edition is essentially the full AMPL system described in AMPL: A Modeling Language for Mathematical Program- ming (the ``AMPL book''). I.1 Installation The AMPL Commercial Edition software consists of two exe- cutable files, ampl, CPLEX and a directory Models of models and data from the AMPL book. These files are distributed on floppy disk in tar format. To extract the files, put the dis- tribution disk into a floppy drive, cd to a suitable directory (for example, ampl or your own bin), and type the command tar xvf /dev/rfd0c The tar command extracts the AMPL software into the current directory. (The name of the floppy disk drive, here shown as /dev/rfd0c, may be different on different systems.) The material on the distribution disk is compressed. To uncompress, type the command uncompress *.Z To test that the installation is complete, try solving one of the small sample problems from the distribution disk: % ampl ampl: model Models/steel.mod; ampl: data Models/steel.dat; ampl: option solver cplex; ampl: solve; CPLEX 3.0: optimal solution; objective 192000 1 simplex iterations ampl: quit; % Eventually you will want to install these programs in some more public directory, such as /usr/local/bin; in any case, you may wish to add your new AMPL directory to your search path, so that you can invoke AMPL from any directory. I.2 Running AMPL Once the installation is complete, you can run AMPL by typ- ing ampl at the prompt, as shown in the preceding example. (We have shown the prompt as %, which is typical if you use the C shell; if you use the Bourne shell or the Korn shell, the prompt is likely to be $.) The AMPL command quit returns you to the UNIX prompt. For many modeling projects, this is all you need to know about invoking AMPL. The ampl command has longer forms that are convenient or necessary in certain circumstances. The general form of the com- mand is: ampl switch-listopt filename-listopt The optional switch-list contains items that turn certain AMPL features on or off. These items begin with the - character. Such items are often called ``command-line options'', but here we call them ``switches'' to distinguish them from the ``options'' controlled by AMPL's option command. The items in the filename- list are filenames, or the - character alone. One or more spaces must separate each item from the next. We describe the switches first, and then the use of the filenames. AMPL can also be instructed through UNIX environment vari- ables to initialize certain options, or to read an initialization file automatically upon invocation. These features are explained at the end of this section. Switches Many command-line switches are simply synonyms for options that can be set within the AMPL command environment. For exam- ple, the -P switch, % ampl -P ampl: has the same effect as setting the presolve option to zero: % ampl ampl: option presolve 0; ampl: Table 1 lists all of these alternatives. _________________________________________________________________ Switch Option Interpretation -Cn Cautions n n = 0: suppress caution messages n = 1: report caution messages (default) n = 2: treat cautions as errors -en eexit n n > 0: abandon a command after n errors n < 0: abort AMPL after |n| errors n = 0: report any number of errors -L linelim 1 when using ``defining'' equations to substitute a linear expression for a variable, make an explicit substitution, so that the resulting model can be recognized as linear -P presolve 0 turn off the presolve phase -s randseed '' use current time for random number seed -sn randseed n use n for random number seed -S substout 1 use ``defining'' equations to eliminate variables -T gentimes 1 show the time taken to generate each model component -t times 1 show the time taken in each model translation phase Table 1: Command-line switches. _________________________________________________________________ The -v switch reports the version of AMPL you are running, and the -? switch provides a summary of command-line switches. Filenames If you give one or more filenames on the command line, AMPL will read model declarations, data statements and commands from these files in the order given, instead of entering the usual command environment. This provides a way of running AMPL as a ``batch'' program, rather than interactively. For example, the command % ampl diet.mod case2/diet.dat /tmp/diet.run % accomplishes the same thing as % ampl ampl: include diet.mod; ampl: include case2/diet.dat; ampl: include /tmp/diet.run; ampl: quit; % We assume that diet.mod contains declarations for an AMPL model, case2/diet.dat contains a data statement followed by data for a particular case, and /tmp/diet.run contains AMPL commands for solving the resulting mathematical program and reporting the results. This is only one possible arrangement, however; the contents of all three files might instead be concatenated into one, and in general any command-line file may contain a combina- tion of declarations, data and commands. (We have used filenames compatible with the ones employed on MS-DOS systems, but ampl accepts any valid UNIX filenames.) To combine reading from command-line files with interactive operation of AMPL, place the character - in the filename-list to indicate where interactive operation is to begin. Most commonly, it appears at the end; to read files diet.mod and case2/diet.dat, and then give further instructions interactively, you would type: % ampl diet.mod case2/diet.dat - ampl: A - character can precede another filename, however, if you want to type a few lines interactively before proceeding to read the next file: % ampl diet.mod case2/diet.dat - /tmp/diet.run ampl: let n_max['NA'] := 50000; ampl: end; % The end command terminates the interactive environment, after which commands are read from /tmp/diet.run. The quit command, by contrast, would terminate AMPL before the last file could be read. Environment variables and initialization files Within a UNIX shell you may define ``environment variables'' that retain their values as long as the shell is active. When you invoke AMPL, each environment variable is interpreted as an AMPL option. For example, you could set the AMPL options display_width and cplex_options by defining the corresponding envi- ronment variables like this in a C shell session, % setenv display_width 60 % setenv cplex_options 'dual scale 1' % ampl or like this in a Bourne or Korn shell session: $ display_width=60; export display_width $ cplex_options='dual scale 1'; export cplex_options $ ampl AMPL's option command confirms that the desired settings have been made: ampl: option display_width, cplex_options; option display_width 60; option cplex_options 'dual scale 1'; Option values ``inherited'' from environment variables in this way always override any default values that the options would otherwise have. This feature can provide a convenient means to redefine a few AMPL option defaults. You need only define the appropriate environment variables once, when you log in or start up a new shell window. After that, their values will be inherited each time you invoke AMPL. The inheritance only works in one direc- tion, so that changing an option's value in AMPL never affects any environment variable in UNIX. If you have many option settings or other initializations that you want to carry out each time AMPL is invoked, you may wish to keep a list of startup commands in a file. AMPL takes the name of this file from the environment variable OPTIONS_IN, if such a variable has been defined. In the C shell, the command is % setenv OPTIONS_IN amplinit With the Bourne and Korn shells, it is $ OPTIONS_IN=amplinit; export OPTIONS_IN AMPL will execute the contents of the file amplinit before read- ing any other files or prompting for any commands. The effect is the same as if include amplinit were the first command processed by AMPL. If you want AMPL to remember all your option settings from one invocation to the next, first set up an options command file by running a short AMPL session like this: % ampl ampl: option OPTIONS_INOUT 'amplopt'; ampl: quit; % AMPL writes to the specified file, amplopt in this case, a list of option commands that set all options to the values they had when you typed quit. To use this file, set the corresponding environment variable to the same filename; for the C shell: % setenv OPTIONS_INOUT amplopt and for Bourne and Korn shells: $ OPTIONS_INOUT=amplopt; export OPTIONS_INOUT At each subsequent invocation of AMPL, the previous option set- tings will be read from the specified file; and each time there- after that you quit AMPL, the current option settings - including any changes you might have made - will be written back to the file. The effect is to preserve all option values from one invo- cation to the next. To save the trouble of typing the commands to set your AMPL environment variables each time you log in, place the commands in your initialization file (normally .cshrc for the C shell or .profile for other shells). The OPTIONS_IN file is read before the OPTIONS_INOUT file, if both are specified. I.3 Memory The amount of memory (not disk space) required by AMPL includes an initial memory region upon invocation, plus addi- tional memory for generating instances and running solvers. The total memory requirement thus varies according to the problem that you are trying to solve, and tends to increase as an AMPL session proceeds. Most obviously, a larger problem instance will require more memory to generate and solve. The size depends upon several fac- tors, including the number of variables, the number of objectives and constraints, and the number of terms in the objective and constraint expressions. AMPL's memory requirements also depend in a less straight- forward way upon the complexity of a model. Memory space is needed for every model component, including sets that are not directly used to index anything, parameters that are defined in terms of other data, and variables that are later removed by the presolve phase. The memory requirement for processing a model can thus be much larger than the size of the resulting instance would suggest. For example, when a var declaration is indexed over a Cartesian product of many sets, var Ship {i in ORIG, j in DEST, p in PROD, q in AREA[p], t in 1..T} ... some memory must be set aside for each resulting variable, even if the great majority of these variables are later fixed to zero by the constraints and eliminated by presolve. If the total mem- ory is insufficient, you may need to reformulate the model so that the variable is declared over a subset of tuples (i,j,p,q,t), as explained in Chapter 6 of the AMPL book. Since the need for memory ultimately depends upon the model, the data, and any commands you give, AMPL cannot determine in advance how much memory it might require. Most often, insuffi- cient memory is reflected in slow performance, since the operat- ing system must page information in and out. If the system imposes limits on process size, you might also encounter an out- of-memory message, either directly after solve or another AMPL command (if AMPL cannot get enough memory), or after some solver output (if the solver cannot get enough memory). Certain algo- rithms within MINOS and OSL may have particularly large addi- tional requirements for memory; for details, see the discussions of the algorithms in Parts II and III of this booklet. Clearly there is no single best way to remedy a problem of insufficient memory. We suggest that you first review your model and data, to ensure that you are not creating unnecessarily great requirements for memory. If the out-of-memory message comes from the solver, you should also investigate whether solver-specific directives might reduce the memory needed. If no reformulations or new directives seem to help, try scaling back your data to see how large a problem does fit in your computer's memory. Such an exercise may provide some esti- mate of how much more memory you would need, or may help you to discover which components of your model are making the greatest demands for memory allocations. I.4 Files When you type solve, AMPL generates an instance in memory, then writes a description to one or more temporary files. The solver reads these files, applies an algorithm, and writes its results to another temporary file, which is in turn read by AMPL. Finally the temporary files are all deleted. In normal use these steps occur automatically, and you need not give any thought to them. This section describes two circumstances in which you may want to be more aware of the files that AMPL is creating: when the temporary files do not fit in the file system, and when you want to save a solution for later inspection. Relocating temporary files If your file system is getting full, it may not have enough space to hold AMPL's temporary files, and you will get an error message to that effect after typing solve. To get around this problem, you will have to either free more space, or direct the temporary files to another file system. The option TMPDIR specifies a directory to which temporary files will be written. When it is left at its default setting, the temporary files are written to /tmp. If there were also a filesystem /moretmp having more free space, for example, you could send the temporary files there by typing ampl: option TMPDIR "/moretmp"; You could also send the files to one of your own directories, say mytemp, by setting TMPDIR to mytemp. Saving solutions Since a solver called from AMPL normally writes its results into a temporary file, no permanent record is kept. You can save any portion of the results by redirecting the output of display, print or printf to a file (simply append > filename to the com- mand), but the solution is otherwise inaccessible once you spec- ify reset or quit. To save a solution for subsequent examination, you may use the write command to override the creation of temporary files. For example, if you have a model in diet.mod and data in diet.dat, you could type the following: % ampl ampl: model diet.mod; data diet.dat; ampl: write bdietrun; ampl: solve; CPLEX 3.0: optimal solution found. 6 iterations, objective 88.2 ampl: quit; % ls -l dietrun* -rw-r--r-- 1 bwk 1221 Nov 16 15:40 dietrun.nl -rw-r--r-- 1 bwk 252 Nov 16 15:40 dietrun.sol % The first character of the string following write is interpreted specially, as explained below. The rest of the string is com- bined with different extensions to produce the names for the files that AMPL uses. The ls -l listing above shows that two files have been created in this case: dietrun.nl, which contains the problem description that was sent to the solver, and dietrun.sol, which contains the result description that was sent back. Because write was used explicitly, these files take the place of the temporary ones that would normally be generated, and they are not automatically deleted. To view the results in dietrun.sol at a later time, you would use the solution command: % ampl ampl: model diet.mod; data diet.dat; ampl: solution dietrun.sol; CPLEX 3.0: optimal solution found. 6 iterations, objective 88.2 ampl: display Buy; Buy [*] := BEEF 0 CHK 0 FISH 0 HAM 0 MCH 46.6667 MTL -1.07823e-16 SPG -1.32893e-16 TUR 0 ; You do have to repeat the model and data statements, so that AMPL sets up the appropriate instance again. But solution then gets the old results directly from the specified file, without running any solver. When b is the first character of the string that follows write, as above, AMPL uses a compact and efficient binary format for the files. When g appears instead, AMPL uses a compact text format that may be easier to transfer between computers. There is also an m option that causes AMPL to write linear problems in the widely recognized ``MPS format''; the resulting filename ends in mps rather than nl. You can use these MPS files with CPLEX, but the names in the MPS files will be generic. To identify the correspondance between the MPS file names and AMPL names, use AMPL's auxiliary files feature to generate .row and .cols files. Also, if you use AMPL to write out a MIP problem be aware that if you have an AMPL model with [0,infinity) integer variables, you will get different results if you write out an MPS file or solve it directly. This is because of inconsistencies among various solvers regarding the treatment of integer variables with no upper bounds specified in an MPS file. Some solvers view such variables as binary, while others view them as general integer. AMPL writes out an MPS file without any bounds for such general integer variables; CPLEX assumes such variables are binary. versions of MINOS and OSL supplied in the AMPL Commercial Edition, but they may be useful for communicating test problems to other solvers. The AMPL option auxfiles lets you request creation of sev- eral auxiliary files by the write command: key extension file contents a .adj adjustment to objective, e.g., to compensate for fixed variables eliminated by presolve c .col names of the variables (columns) sent to the solver f .fix names of variables fixed by presolve, and the values to which they are fixed r .row names of the constraints (rows) sent to the solver s .slc names of ``slack'' constraints eliminated by presolve because they can never be binding u .unv names of variables dropped by presolve because they are never used in the problem instance If you set auxfiles to a string containing one or more of the specified key letters, write creates the corresponding file with the specified extension (unless it would be empty). For example, if you type ampl: option auxfiles "cr"; before solve in our example above, the files dietrun1.col and dietrun1.row are created and the names of the variables and con- straints, respectively, are written to them. II. Using CPLEX CPLEX is an optimization package for linear, network and integer programming. This supplement to AMPL: A Modeling Lan- guage for Mathematical Programming summarizes the most important features of CPLEX for AMPL. Section II.1 describes the kinds of problems to which CPLEX is applicable, and Section II.2 explains in general terms how solver-specific directives can be passed from AMPL to CPLEX. Sections II.3 and II.4 describe CPLEX's facilities for linear and network optimization and for pure and mixed integer program- ming, respectively; each section briefly introduces the algo- rithms that CPLEX uses, then categorizes and explains the rele- vant directives. II.1 Applicability CPLEX is designed to solve linear programs as described in Chapters 1-8 and 11-12 of AMPL: A Modeling Language for Mathemat- ical Programming, as well as the integer programs described in Chapter 15. Integer programs may be pure (all integer variables) or mixed (some integer and some continuous variables); integer variables may be binary (taking values 0 and 1 only) or may have any lower and upper bounds. For the network linear programs described in Chapter 12, CPLEX also incorporates an especially fast network optimization algorithm. CPLEX does not solve nonlinear programs as described in Chapter 13: ampl: model nltransd.mod; data nltrans.dat; ampl: option solver cplex; ampl: solve; Sorry, cplex can't handle nonlinearities. CPLEX 3.0: exit code 1 This restriction applies if your model uses any function of vari- ables that AMPL identifies as ``not linear'' - even a function such as abs or min that shares some properties of linear func- tions. CPLEX does solve piecewise-linear programs, as described in Chapter 14, if AMPL transforms them to problems that CPLEX's solvers can handle. The transformation is to a linear program, if the following conditions are met. Any piecewise-linear term in a minimized objective must be convex, its slopes forming an increasing sequence as in: <<-1,1,3,5; -5,-1,0,1.5,3>> x[j] Any piecewise-linear term in a maximized objective must be con- cave, its slopes forming a decreasing sequence as in: <<1,3; 1.5,0.5,0.25>> x[j] Any piecewise-linear term in the constraints must be either con- vex and on the left-hand side of a < constraint (or equivalently, the right-hand side of a > constraint), or else concave and on the left-hand side of a > constraint (or equivalently, the right-hand side of a < constraint). In all other cases, the transformation is to a mixed-integer program. AMPL automatically performs the appropriate conversion, sends the resulting linear or mixed-integer program to CPLEX, and converts the solution back. The conversion has the effect of adding a variable to cor- respond to each linear piece; when the above rules are not satis- fied, additional integer variables and constraints must also be introduced. II.2 Controlling CPLEX from AMPL In many instances, you can successfully apply CPLEX by sim- ply specifying a model and data, setting the solver option to cplex, and typing solve. For larger linear programs and espe- cially the more difficult integer programs, however, you may need to pass specific directives to CPLEX to obtain the desired results. To give directives to CPLEX, you must first assign an appro- priate character string to the AMPL option called cplex_options. When CPLEX is invoked by solve, it breaks this string into a series of individual directives. Here is an example: ampl: model diet.mod; ampl: data diet.dat; ampl: option solver cplex; ampl: option cplex_options 'crash=0 dual \ ampl? feasibility=1.0e-8 scale=1 \ ampl? iterations=100'; ampl: solve; CPLEX 3.0: crash 0 dual feasibility 1e-08 scale 1 iterations 100 CPLEX 3.0: optimal solution; objective 88.2 1 iterations (0 in phase I) CPLEX confirms each directive; it will display an error message if it encounters one that it does not recognize. CPLEX directives consist of an identifier alone, or an iden- tifier followed by an = sign and a value; a space may be used as a separator in place of the =. Unlike AMPL, CPLEX treats upper- case and lower-case letters as being the same. You may store any number of concatenated directives in cplex_options. The example above shows how to type all the directives in one long string, using the \ character to indicate that the string continues on the next line. Alternatively, you can list several strings, which AMPL will automatically concate- nate: ampl: option cplex_options 'crash=0 dual' ampl? ' feasibility=1.0e-8 scale=1' ampl? ' iterations=100'; In this form, you must take care to supply the space that goes between the directives; here we have put it before feasibility and iterations. If you have specified the directives above, and then want to try setting, say, optimality to 1.0e-8 and changing crash to 1, you might think to type: ampl: option cplex_options 'optimality=1.0e-8 crash=1'; This will replace the previous cplex_options string, however; the other previously specified directives such as feasibility and iterations will revert to their default values. (CPLEX supplies a default value for every directive not explicitly specified; defaults are indicated in the discussion below.) To append new directives to cplex_options, use this form: ampl: option cplex_options $cplex_options ampl? ' optimality=1.0e-8 crash=1'; A $ in front of an option name denotes the current value of that option, so this statement just appends more directives to the current directive string. As a result the string contains two directives for crash, but the new one overrides the earlier one. II.3 Using CPLEX for linear programming For linear programs, CPLEX employs either the simplex algorithm or a barrier algorithm to solve the problem. Refer to a linear programming textbook for more information on these algorithms. Four distinct methods of optimization are incorporated in the CPLEX package: o A primal simplex algorithm that first finds a solution feasible in the constraints (Phase I), then iterates toward optimality (Phase II). o A dual simplex algorithm that first finds a solution sat- isfying the optimality conditions (Phase I), then iterates toward feasibility (Phase II). o A network primal simplex algorithm that uses logic and data structures tailored to the class of pure network linear programs. o A hybrid algorithm that first obtains an optimal solution with a primal-dual barrier algorithm, then efficiently obtains an optimal basic solution using the simplex method. CPLEX normally chooses one of these algorithms for you, but you can override its choice by the directives described below. The simplex algorithm maintains a subset of basic variables (or, a basis) equal in size to the number of constraints. A basic solution is obtained by solving for the basic variables, when the remaining nonbasic variables are fixed at appropriate bounds. Each iteration of the algorithm picks a new basic vari- able from among the nonbasic ones, steps to a new basic solution, and drops some basic variable at a bound. The coefficients of the variables form a constraint matrix, and the coefficients of the basic variables form a nonsingular square submatrix called the basis matrix. At each iteration, the simplex algorithm must solve certain linear systems involving the basis matrix. For this purpose CPLEX maintains a factorization of the basis matrix, which is updated at most iterations, and is occasionally recomputed. The sparsity of a matrix is the proportion of its elements that are not zero. The constraint matrix, basis matrix and fac- torization are said to be relatively sparse or dense according to their proportion of nonzeros. Most linear programs of practical interest have many zeros in all the relevant matrices, and the larger ones tend also to be the sparser. The amount of RAM memory required by CPLEX grows with the size of the linear program, which is a function of the numbers of variables and constraints and the sparsity of the coefficient matrix. The factorization of the basis matrix also requires an allocation of memory; the amount is problem-specific, depending on the sparsity of the factorization. When memory is limited, CPLEX automatically makes adjustments that reduce its require- ments, but that usually also reduces its optimization speed. The following CPLEX directives apply to the solution of lin- ear programs, including network linear programs. The letters i and r denote integer and real values, respectively. Directives for problem specification While AMPL completely specifies the problem and its objective sense, it is possible to change the objective sense after specifying the model. minimize maximize These directives instruct CPLEX to set the objective sense to be minimize or maximize, respectively. Directives for problem and algorithm selection CPLEX consults several directives to decide how to set up and solve a linear program that it receives. The default is to apply the primal simplex method to the linear program as given, substituting the network variant if network nodes and arcs were declared in the AMPL model. The following discussion indicates situations in which you should consider experimenting with alter- natives. dualthresh=i (default 32000) primal dual Every linear program has an equivalent ``opposite'' linear program; the original is customarily referred to as the primal LP, and the equivalent as the dual. For each variable and each constraint in the primal there are a corresponding constraint and variable, respectively, in the dual. Thus when the number of constraints is much larger than the number of variables in the primal, the dual has a much smaller basis matrix, and CPLEX may be able to solve it more efficiently. The primal and dual directives instruct CPLEX to set up the primal or the dual formulation, respectively. The dualthresh directive makes a choice: the dual LP if the number of con- straints exceeds the number of variables by more than i, and the primal LP otherwise. primalopt dualopt baropt The primalopt, dualopt and baropt directives instruct CPLEX to apply the primal simplex algorithm, the dual simplex algorithm, or the barrier algorithm respectively. The simplex algorithms two variants use basis matrices of the same kind and dimension, but employ opposite strategies in constructing a path to the optimum. Any can be applied regardless of whether the primal or the dual LP is set up as explained above; in general the six combinations of primalopt/dualopt/baropt and primal/dual all perform differently. CPLEX uses primalopt by default if neither dualopt nor baropt is specified. Linear programs that are highly degenerate (many basic variables at their bounds) and that have little variability in the righthand sides (constant terms) of their constraints often solve faster using the dual simplex. Also consider trying the dual sim- plex if CPLEX's primal simplex reports problems of numerical inaccuracy; few linear programs exhibit poor numerical performance in both the primal and the dual algorithms. The barrier algorithm tends to work well when the product of the constraint matrix and its transpose remains sparse. netopt=i (default 1) CPLEX incorporates an optional heuristic procedure that looks for ``pure network'' constraints in your linear program. If this procedure finds sufficiently many such constraints, CPLEX applies its fast network simplex algorithm to them. Then, if there are also non-network constraints, CPLEX uses the network solution as a start for solving the whole LP by the general pri- mal or dual simplex algorithm, whichever you have chosen. The default value of i=1 invokes the network-identification procedure if and only if your model uses node and arc declara- tions, and CPLEX sets up the primal formulation as discussed above. Setting i=0 suppresses the procedure, while i=2 requests its use in all cases. You can have CPLEX display the number of network nodes (constraints) and arcs (variables) that it has extracted, by setting the prestats directive (described with the preprocessing options below) to 1. CPLEX's network simplex algorithm can achieve dramatic reductions in optimization time for ``pure'' network linear pro- grams defined entirely in terms of node and arc declarations. (For a pure network LP, every arc declaration must contain at most one from and one to phrase, and these phrases may not spec- ify optional coefficients.) In the case of linear programs that are mostly defined in terms of node and arc declarations, but that have some ``side'' constraints defined by subject to decla- rations, the benefit is highly dependent on problem structure; it is best to try experimenting with both i=0 and i=1. relax This directive instructs CPLEX to ignore any integrality restrictions on the variables. The resulting linear program is solved by whatever algorithm the above directives specify. Directives for preprocessing Prior to applying any simplex algorithm, CPLEX modifies the linear program and initial basis in ways that tend to reduce the number of iterations required. The following directives select and control these preprocessing features. aggregate=i1 (default 1) agglim=i2 (default 10) When i1 is left at its default value of 1, CPLEX looks for constraints that (possibly after some rearrangement) define a variable x in terms of other variables: two-variable constraints of the form x = y + b; constraints of the form x = sum {j} yj, where x appears in less than i2 other constraints. Under certain conditions, both x and its defining equation can be eliminated from the linear program by substitution. In CPLEX's terminology, each such elimination is an aggregation of the linear program. Aggregation can yield a substantial reduction in the size of some linear programs, such as network flow LPs in which many nodes have only one incoming or one outgoing arc. If i2 > 2, however, aggregation may also increase the number of nonzero constraint coefficients, resulting in more work at each simplex iteration. The default setting of i2=10 usually makes a good tradeoff between reduction in size and increase in nonzeroes, but you may want to experiment with lower values if CPLEX reports that many aggregations have been made. If CPLEX consistently reports that no aggregations can be performed, on the other hand, you can set i1 to 0 to turn off the aggregation routine and save a little memory and processing time. To request a report of the number of aggregations, see the prestats directive below. presolve=i (default 1) Prior to invoking any simplex algorithm, CPLEX attempts to apply various simple and fast transformations that reduce the size of the linear program without changing its optimal solution. In this presolve phase, constraints that involve only one non-fixed variable are removed; either the variable is fixed and also dropped (for an equality constraint) or a simple bound for the variable is recorded (for an inequality). Each inequality constraint is subjected to a simple test to determine if there exists any setting of the variables (within their bounds) that can violate it; if not, it is dropped as nonconstraining. Further iterative tests attempt to tighten the bounds on primal and dual variables, possibly causing additional variables to be fixed, and additional constraints to be dropped. AMPL's presolve phase, as described in Section 10.2 of the AMPL book, also performs many (but not all) of these transformations. To see how many variables and constraints are eliminated by AMPL's presolve, set option show_stats to 1. To suppress AMPL's presolve, so that all presolving is done in CPLEX, set option presolve to 0. CPLEX's presolve can be suppressed by changing i to 0 from its default of 1. In rare cases the presolved linear program, although smaller, is actually harder to solve; thus if CPLEX reports that many variables and constraints have been eliminated by presolve, you may want to compare runs with and without presolve. On the other hand, if CPLEX consistently reports that presolve eliminates no variables or constraints, you can save a little processing time by turning presolve off. To request a report of the number of eliminations performed by presolve, see the prestats directive below. prestats=i (default 0) When this directive is changed to 1 from its default of 0, CPLEX reports on the activity of the aggregation and presolve routines: Presolve eliminated 1645 rows and 2715 columns in 3 passes. Aggregator did 22 substitutions. Presolve Time = 1.70 sec. During the development of a large or complex model, it is a good idea to monitor this report, and to turn on its AMPL counterpart by setting option show_stats to 1. An unexpectedly large number of eliminated variables or constraints may indicate that the formulation is in error or can be substantially simplified. scale=i (default 0) This directive governs the scaling of the coefficient matrix. The default value of i=0 implements an equilibration scaling method, which is generally very effective. A value of 1 invokes a modified, more aggressive scaling method that can pro- duce improvements on some problems. A value of -1 will disable scaling. Since CPLEX has internal logic that determines when it need not scale a problem, setting the scale directive to -1 rarely improves performance. Directives for algorithmic control of the simplex method Several key strategies of the primal and dual simplex algo- rithms can be changed through CPLEX directives. If you are repeatedly solving a class of linear programs that requires sub- stantial computer time, some experimentation with alternative strategies can be worthwhile. crash=i (default 1) This directive governs CPLEX's procedure for choosing an initial basis, except when the basis is read from a file as spec- ified by the directive startbasis described below. A value of i=0 causes the objective to be ignored in choosing the basis, while values of -1 and 1 select two different heuristics for tak- ing the objective into account. The best setting for your pur- poses will depend on the specific characteristics of the linear programs you are solving, and must be determined through experi- mentation. pgradient=i (default 0) This directive governs the primal simplex algorithm's choice of a "pricing" procedure that determines which variable is selected to enter the basis at each iteration. Your choice is likely to make a substantial difference to the tradeoff between computational time per iteration and the number of iterations. As a rule of thumb, if the number of iterations to solve your linear program exceeds three times the number of constraints, you should consider experimenting with alternative pricing procedures. The recognized values of i are as follows: -1 Reduced-cost pricing 0 Hybrid reduced-cost/devex pricing 1 Devex pricing 2 Steepest-edge pricing 3 Steepest-edge pricing with slack initial norms 4 Full reduced-cost pricing The "reduced cost" procedures are sophisticated versions of the pricing rules most often described in textbooks. The "devex" and "steepest edge" alternatives employ more elaborate computations, which can better predict the improvement to the objective offered by each candidate for entering variable. Compared to the default of i = 0, the less compute-intensive reduced-cost pricing (i=-1) may be preferred if your problems are small or easy, or are unusually dense -- say, 20 to 30 nonzeroes per column. Conversely, if you have more difficult problems which take many iterations to complete Phase I, consider using devex pricing (i=1). Each iteration may consume more time, but the lower number of total iterations may lead to a substantial overall reduction in time. Do not use devex pricing if your problem has many variables and relatively few constraints, however, as the number of calculations required per iteration in this situation is usually too large to afford any advantage. If devex pricing helps, you may wish to try steepest-edge pricing (i=2). This alternative incurs a substantial initialization cost, and is computationally the most expensive per iteration, but may dramatically reduce the number of iterations so as to produce the best results on exceptionally difficult problems. The variant using slack norms (i=3) is a compromise that sidesteps the initialization cost; it is most likely to be advantageous for relatively easy problems that have a low number of iterations or time per iteration. Full reduced-cost pricing (i=4) is a variant that computes a reduced cost for every variable, and selects as entering variable one having most negative reduced cost (or most positive, as appropriate). Compared to CPLEX's standard reduced-cost pricing (i=-1), full reduced-cost pricing takes more time per iteration, but in rare cases reduces the number of iterations more than enough to compensate. This alternative is supplied mainly for completeness, as it is proposed in many textbook discussions of the simplex algorithm. dgradient=i (default 0) This directive governs the dual simplex algorithm's choice of a "pricing" procedure that determines which variable is selected to leave the basis at each iteration. Your choice is likely to make a substantial difference to the tradeoff between computational time per iteration and the number of iterations. As a rule of thumb, if the number of iterations to solve your linear program exceeds three times the number of constraints, you should consider experimenting with alternative pricing procedures. The recognized values of i are as follows: 0 Pricing procedure determined automatically 1 Standard dual pricing 2 Steepest-edge pricing 3 Steepest-edge pricing in slack space 4 Steepest-edge pricing with unit initial norms Standard dual pricing (i=1), described in many textbooks, selects as leaving variable one that is farthest outside its bounds. The three "steepest edge" alternatives employ more elaborate computations, which can better predict the improvement to the objective offered by each candidate for leaving variable. The default (i=0) lets CPLEX choose a dual pricing procedure through an internal heuristic based on problem characteristics. Steepest-edge pricing involves an extra initialization cost, but its extra cost per iteration is much less in the dual simplex algorithm than in the primal. Thus if you find that your problems solve faster using the dual simplex, you should consider experimenting with the steepest-edge procedures. The standard procedure (i=2) and the variant "in slack space" (i=3) have similar computational costs; often their overall performance is similar as well, though one or the other can be advantageous for particular applications. The variant using "unit initial norms" (i=4) is a compromise that sidesteps the initialization cost; it is most likely to be advantageous for relatively easy problems that have a low number of iterations or time per iteration. pricing=i (default 0) To promote efficiency, CPLEX considers only a subset of the nonbasic variables as candidates to enter the basis. The default of i=0 selects a heuristic that dynamically determines the size of the candidate list, taking problem dimensions into account. You can manually set the size of this list to i>0, but only very rarely will this improve performance. refactor=i (default 0) This directive specifies the number of iterations between refactorizations of the basis matrix. At the default setting of i=0, CPLEX automatically calcu- lates a refactorization frequency by a heuristic formula. You can determine the frequency that CPLEX is using by setting the display directive (described below) to 1. Since each update to the factorization uses more memory, CPLEX may reduce the factor- ization frequency if memory is low; in extreme cases, the basis may have to be refactored every few iterations and the algorithm will be very slow. Given adequate memory, CPLEX's performance is relatively insensitive to changes in refactorization frequency. For a few extremely large, difficult problems you may be able to improve performance by reducing i from the value that CPLEX chooses. netfinder=i (default 1) This directives governs the method used by the CPLEX network optimizer to extract a network from the linear program. It is only pertinent for networks with side constraints. In that case, the choice of the netfinder directive may influence the size of the network extracted, thus reducing optimization time. The default value (i=1) extracts only the natural network from the problem. CPLEX then invokes its network simplex method on the extracted network. In some cases, a larger network can be extracted by allowing CPLEX to multiply rows by -1 (reflection scaling) and rescaling constraints and variables so that more matrix coefficients are plus or minus 1. Setting the netfinder directive to 2 enables reflection scaling only, while setting it to 3 allows reflection scaling and general scaling. Directives for algorithmic control of the barrier method Several key strategies of the barrier algorithm can be changed through CPLEX directives. If you are repeatedly solving a class of linear programs that requires substantial computer time, some experimentation with alternative strategies can be worthwhile. barfactor=i This directive specifies the type of Cholesky factorization used in the barrier algorithm. The default setting is machine dependent. For i=0, a leftward (pulling) Cholesky factorization is used. For i=1, a rightward (pushing) Cholesky factorization is used. barobjrange=r (default 1e+15) This directive sets the maximum absolute value of the objective function. CPLEX's barrier algorithm looks at this value to detect unbounded problems. Any positive value is acceptable input. However, care should be taken to avoid choosing a value so small that CPLEX will conclude a problem is unbounded when it is not. barvarup=r (default 1e+20) This directive sets the upper bound for all variables that have infinite upper bounds. It is used to prevent difficulties associated with unbounded optimal faces. Any positive value less then or equal to 1e+20 is acceptable input. However, care should be taken to avoid choosing a value so small that CPLEX will conclude a problem has an unbounded optimal face when it does not. comptol=r (default 1e-8) This directive specifies the complementarity tolerance used by the barrier algorithm to test convergence. The barrier algorithm will terminate with an optimal solution if the relative complementarity is smaller than this value. Any positive number larger than 1e-10 is acceptable input. dense=i (default 0) CPLEX uses this directive to distinguish dense columns in the constraint matrix. Because barrier algorithm performance can improve dramatically if dense columns are treated separately, changing this value may improve optimization time. Columns with more nonzeros than this setting are considered to be dense. If left at the default value, CPLEX will automatically determine a value, considering factors such as the size of the problem. Any nonnegative integer is acceptable input. growth=r (default 1e+6) This directive is used to detected unbounded optimal faces. At higher values, the barrier algorithm will be less likely to conclude that the problem has an unbounded optimal face, but more likely to have numerical difficulties if the problem does have an unbounded face. Any positive number is acceptable input. ordering=i (default 1) This directive sets the method used to permute the rows of the constraint matrix in order to reduce fill in the Cholesky factor. There is a trade off between ordering speed and sparsity of the Cholesky factor. The default multiple minimum degree algorithm tends to be faster but produces more fill. The alternate setting uses a minimum local fill algorithm. This tends to be slower but produces less fill. Directives for improving stability CPLEX is highly robust and has been designed to avoid prob- lems such as degenerate stalling and numerical inaccuracy that can occur in the simplex algorithm. However, some linear pro- grams can benefit from adjustments to the following directives if difficulties are encountered. perturbation=r (default 0.0001) doperturb=i (default 0) The simplex algorithm tends to make very slow progress when it encounters solutions that are highly degenerate (in the sense of having many basic variables lying at one of their bounds, rather than between them). When CPLEX detects degenerate stal- ling, it automatically introduces a perturbation that expands the bounds on every variable by a small amount, thereby creating a different but closely related problem. Generally, CPLEX can make faster progress on this less constrained problem; once optimality is indicated, the perturbation is removed by resetting the bounds to their original values. The value of r determines the size of the perturbation. If you receive messages from CPLEX indicating that the linear pro- gram has been perturbed more than once, r is probably too large; reduce it to a level where only one perturbation is required. The default doperturb value of i=0 selects CPLEX's automatic perturbation strategy. If an automatic perturbation occurs early in the solution process, consider setting i=1 to select perturba- tion at the outset. This alternative will save the time of first allowing the optimization to stall before activating the pertur- bation mechanism, but is useful only rarely, for extremely degen- erate problems. feasibility=r1 (default 1.0e-6) markowitz=r2 (default 0.01) optimality=r3 (default 1.0e-6) If a problem is making slow progress through Phase I, or repeatedly becomes infeasible during Phase II, numerical diffi- culties have arisen. Adjusting the algorithmic tolerances con- trolled by these directives may help. Decreasing the feasibility tolerance, increasing the optimality tolerance and/or increasing the Markowitz tolerance will typically improve numerical behav- ior. The feasibility tolerance r1>0 specifies the degree to which a linear program's basic variables may violate their bounds. You may wish to lower r1 after finding an optimal solution if there is any doubt that the solution is truly optimal; but if it is set too low, CPLEX may falsely conclude that the problem has no fea- sible solution. The Markowitz threshold r2<1 influences the order in which variables are eliminated during basis factorization. Increasing r2 may yield a more accurate factorization, and consequently more accurate computations during iterations of the simplex algorithm. Too large a value may produce an inefficiently dense factoriza- tion, however. The optimality tolerance r3>0 specifies how closely the optimality (or dual feasibility) conditions must be satisfied for CPLEX to declare an optimal solution. Directives for starting and stopping Normally CPLEX uses an internal procedure to determine a starting point for the simplex algorithm, then iterates to opti- mality. The following directives override these conventions so that you can start from a saved basis, and can stop when a cer- tain criterion is satisfied. bariterlim=i (default 100) CPLEX stops after i barrier algorithm iterations and returns its current solution, whether or not it has determined that the solution is optimal. endbasis f1 startbasis f2 CPLEX always sends its final solution back to AMPL, where you can use commands such as display to view the results. If the endbasis directive is specified, CPLEX also writes a record of the final simplex basis to the file named f1, in the standard MPS basis format. Normally this is an optimal basis, but it may be otherwise if an optimum does not exist or could not be found by the chosen algorithm, or if the iterations were terminated prema- turely by one of the directives described below. By default, CPLEX employs a heuristic procedure to determine a starting basis for the simplex algorithm, as indicated in the discussion of the crash directive above. If the startbasis directive is specified, then the initial basis is instead read from the file f2, which must also be in the standard MPS basis format. (This basis determines the initial solution; CPLEX ignores any initial values of variables from AMPL.) This feature can be useful for quickly solving a series of linear programs that differ only in the data. Before the first one is solved, endbasis is set to the name of a temporary file: ampl: model steelT3.mod; data steelT3.dat; ampl: option solver cplex; ampl: option cplex_options 'endbasis /tmp/steelT3.bas'; ampl: solve; CPLEX 3.0: endbasis /tmp/steelT3.bas CPLEX 3.0: optimal solution; objective 514521.7143 30 iterations (0 in phase I) Then a data value is changed, and startbasis is set to the same filename so that the previously optimal basis is used as a start: ampl: let avail[1] := 32; ampl: option cplex_options 'startbasis /tmp/steelT3.bas'; ampl: solve; CPLEX 3.0: startbasis /tmp/steelT3.bas CPLEX 3.0: optimal solution; objective 493648.7143 1 iterations (1 in phase I) Only 1 iteration is necessary to step to an adjacent basis that is optimal for the modified linear program. Although CPLEX will try to make use of any previously saved basis specified by startbasis, this approach should be relied upon only when the change to the data does not affect the numbers of variables and constraints. (These numbers can be reduced by simplifications carried out in AMPL's presolve phase, even when it would seem that a change in the data should have no effect. If the number of iterations is unexpectedly high, use the AMPL command option show_stats 1 to see what presolve has done, or use option presolve 0 to turn presolve off.) Another use for this feature is to restart CPLEX, possibly with some directives changed, after it has reached an iteration limit short of optimality due to one of the directives described next. endvector f1 startvector f2 These directives are used to take a barrier algorithm solution and write it to or read it from a CPLEX .vec file. Because AMPL always instructs CPLEX to take it's barrier method solution and apply a hybrid method to obtain a basic solution, this feature can only be used if a barrier iteration limit is exceeded. If the endvector directive is specified, CPLEX will write out .vec file named f1. Similarly, if the startvector directive is specified, CPLEX will read in a .vec file named f2 and use it to initiate the hybrid crossover method that results in an optimal basic solution. Note that CPLEX will not perform additional barrier iterations after reading in the .vec file. iterations=i CPLEX stops after i simplex method iterations and returns its current solution, whether or not it has determined that the solution is optimal. The default is machine dependent. The respective default values for personal computers, workstations and mainframes are 1000000, 5000000 and 10000000. lowerobj=r1 (default -1.0e+75) upperobj=r2 (default +1.0e+75) CPLEX stops at the first iteration where the solution is feasible in the constraints, and the objective value is below r1 or above r2. At their default values these directives have no practical effect. Setting r1 (for a minimization) or r2 (for a maximization) to a ``good'' value for the objective will cause CPLEX to stop as soon as it achieves this value. singular=i (default 10) CPLEX will attempt to repair the basis matrix up to i times when it finds evidence that the matrix is singular. Once this limit is exceeded, CPLEX terminates with the current basis set to the best factorizable basis that has been found. time=r (default 1.0e+75) CPLEX stops after r seconds of computation time and returns its current solution, whether or not it has determined that the solution is optimal. Directives for controlling output When invoked by solve, CPLEX normally returns just a few lines to your screen to summarize its performance. The following directives let you choose a greater amount of output, which may be useful for monitoring the progress of a long run, or for com- paring the effects of other directives on the detailed behavior on CPLEX's algorithms. Output normally comes to the screen, but may be redirected to a file by specifying solve >filename. bardisplay = i (default 1) The default of i=1 causes CPLEX to print a ``log line'' recording the barrier iteration number, primal and dual objective values, and infeasibility information after each barrier iteration: ampl: model sched.mod; ampl: data sched.dat; ampl: option solver cplex; ampl: option cplex_options 'baropt'; ampl: solve; CPLEX 3.0: baropt No LP presolve or aggregator reductions. Number of nonzeros in lower triangle of A*A' = 105 Total time for multiple-min-degree ordering = 0.01 sec. Summary statistics for Cholesky factor: Rows in Factor = 17 Size of dense window = 14 Integer space required = 53 Total non-zeros in factor = 144 Itn Primal Obj Dual Obj Prim Inf Upper Inf Dual Inf 0 3.5610114e+02 0.0000000e+00 4.25e+02 0.00e+00 1.70e+01 1 2.7132333e+02 2.0271095e+02 5.61e-13 0.00e+00 2.17e+00 2 2.7009625e+02 2.3854931e+02 7.53e-13 0.00e+00 6.28e-01 3 2.6635347e+02 2.6489118e+02 2.85e-11 0.00e+00 2.74e-02 4 2.6560129e+02 2.6559065e+02 6.82e-13 0.00e+00 9.20e-15 5 2.6560000e+02 2.6559995e+02 7.18e-13 0.00e+00 9.98e-15 6 2.6560000e+02 2.6560000e+02 8.60e-13 0.00e+00 8.69e-15 Barrier Time = 0.05 sec. Crossover. Primal: Fixing 111 variables. 0 PMoves: Infeasibility 0.00000000e+00 Objective 2.65600000e+02 111 PMoves: Infeasibility 0.00000000e+00 Objective 2.65600000e+02 Primal: Pushed 92, Exchanged 19. Dual: Fixed no variables. Total Crossover Time = 0.03 sec. CPLEX 3.0: optimal solution; objective 265.6 0 iterations When i=0, CPLEX produces a minimal few lines of output summarizing the results of a barrier algorithm run. When i=2, additional information about the barrier run is provided. This level of output is occasionally useful for diagnosing problems of degeneracy or instability in the simplex algorithm. display=i (default 1) The default of i=1 causes CPLEX to print a ``log line'' recording the iteration number and the scaled infeasibility or objective value after each refactorization of the basis matrix: ampl: model dist.mod; data dist13.dat; ampl: option solver cplex; ampl: option cplex_options 'display=1'; ampl: solve; CPLEX 3.0: display 1 Iteration Log . . . Iteration: 1 Scaled Infeas = 1130.332363 Iteration: 60 Scaled Infeas = 348.567681 Iteration: 120 Scaled Infeas = 163.223410 Iteration: 170 Objective = 913550.609174 Iteration: 229 Objective = 906659.378426 Iteration: 289 Objective = 904373.319517 CPLEX 3.0: optimal solution; objective 903202.0326 326 iterations (169 in phase I) Additional information on the operation of the network simplex algorithm is also provided, if applicable. This is usually the appropriate setting for tracking the progress of a long run. When i=0 CPLEX produces a minimal few lines of output summarizing the results of a simplex algorithm run. When i=2, a log line is displayed after each iteration. This level of output is occasionally useful for diagnosing prob- lems of degeneracy or instability in the simplex algorithm. logfile f1 This directive instructs CPLEX to create a logfile named f1 that will contain output from the optimization. The amount of output in the log file will depend on other directives, such as the display directive described above. timing=i (default 0) When this directive is changed to 1 from its default value of 0, a summary of processing times is displayed: Input = 0.06 Solve = 6.42 Output = 0.05 Input is the time that CPLEX takes to read the problem from a file that has been written by AMPL. Solve is the time that CPLEX spends trying to solve the problem. Output is the time that CPLEX takes to write the solution to a file for AMPL to read. II.4 Using CPLEX for integer programming For problems that contain integer variables, CPLEX uses a branch-and-bound approach. The optimizing algorithm maintains a hierarchy of related linear programming subproblems, referred to as the search tree, and usually visualized as branching downward. There is a subproblem at each node of the tree, represented by a circle in the diagram. The algorithm starts with just a top (or root) node, whose associated subproblem is the relaxation of the integer program - the LP that results when all integrality restrictions are dropped. If this relaxation happened to have an integer solu- tion, then it would provide an optimal solution to the integer program. Normally, however, the optimum for the relaxation has some fractional-valued integer variables. A fractional variable is then chosen for branching, and two new subproblems are gener- ated, each with more restrictive bounds for the branching vari- able; for example, if the branching variable is binary (or 0-1), one subproblem will have the variable fixed at zero, the other node will have it fixed at one. In the search tree, the two new subproblems are represented by two new nodes connected to the root. Most likely each of these subproblems also has fractional-valued integer variables, in which case the branching process must be repeated; successive branchings produce the sort of tree structure shown in our diagram above. If there are more than a few integer variables, the branch- ing process has the potential to create more nodes than any com- puter can hold. There are two key circumstances, however, in which branching from a particular node can be discontinued: o The node's subproblem has no fractional-valued integer variables. It thus provides a feasible solution to the original integer program. If this solution yields a better objective value than any other feasible solution found so far, it becomes the incumbent, and is saved for future com- parison. o The node's subproblem has no feasible solution, or has an optimum that is worse than a certain cutoff value. Since any subproblems under this node would be more restricted, they would also either be infeasible or have an optimum value that is no better. Thus none of these subproblems need be considered. In these cases the node is said to be fathomed. Because subprob- lems become more restricted with each branching, the likelihood of fathoming a node becomes greater as the algorithm gets deeper into the tree. So long as nodes are not created by branching too much faster than they are inactivated by fathoming, the tree can be kept to a reasonable size. When no active nodes are left, CPLEX is finished, and it reports the final incumbent solution back to AMPL. If the cutoff value has been set throughout the algorithm to the objective value of the current incumbent - CPLEX's default strategy - then the reported solution is declared optimal. Other cutoff options, described below, cannot provide a provably optimal solution, but may allow the algorithm to finish much faster. CPLEX's memory requirement for solving linear subproblems is about the same as its requirement for linear programs discussed in the previous section. In the branch-and-bound algorithm, how- ever, each active node of the tree requires additional memory. The total memory that CPLEX needs to prove optimality for an integer program can thus be much larger and less predictable than for a linear program of comparable size. Because a single integer program generates many LP subprob- lems, even small instances can be very compute-intensive and require significant amounts of memory. In contrast to solving linear programming problems using CPLEX, where little user inter- vention is required to obtain optimal results, you will typically have to set some of the following directives to get satisfactory results on integer programs. You can either change the way that the branch-and-bound algorithms work, or relax the conditions for optimality, as explained in the two corresponding subsections below. When experimenting with these possibilities, it is also a good idea to include directives that set a stopping criterion and that display some informative output; these are described in the next two subsections. If you consistently fail to receive any useful solution in response to solve after a reasonable amount of time, and are in doubt as to how to proceed, consult the troubleshoot- ing tips at the end of this section. If you wish to drop all integrality requirements and solve the integer program as a linear program, use the relax directive described in Section II.3. Directives for preprocessing All of the preprocessing directives described in Section II.3 above are also applicable to problems that specify integer-valued variables. The following directives control additional preprocessing steps that are applicable to certain mixed-integer programs only. cliques (default 0) Integer programming solve times can often be improved by generating clique cuts. These additional constraints tighten the feasible region, reducing the number of fractional variables to choose from when CPLEX needs to select a branching variable. The default value of 0 for this directive lets CPLEX decide whether it will generate clique cuts. A value of 1 instructs CPLEX to always try to generate these cuts, while a value of -1 deactivates this feature. covers (default 0) Integer programming solve times can often be improved by generating cover cuts. These additional constraints tighten the feasible region, reducing the number of fractional variables to choose from when CPLEX needs to select a branching variable. The default value of 0 for this directive lets CPLEX decide whether it will generate cover cuts. A value of 1 instructs CPLEX to always try to generate these cuts, while a value of -1 deactivates this feature. sos2=i (default 1) An optimization problem containing piecewise-linear terms may have to be converted to an equivalent mixed-integer program, as explained in Section II.1. When i is at its default value of 1, this conversion results in only one extra variable per piecewise-linear breakpoint. All of the extra variables associated with a particular piecewise-linear term are marked as belonging together, so that CPLEX's branch-and-bound procedure knows to treat them specially. Variables so marked have come to be known as a "special ordered set of type 2," whence the name sos2 for this directive. When i is changed to 0 from its default of 1, the conversion creates a larger number of variables, but does not employ the special ordered set feature. This alternative has no known advantages, and is supplied for completeness only. sosscan=i (default 0) When i is changed to 1 from its default of 0, CPLEX searches for constraints of the form sum {j in S0} xj - sum {j in S1} xj = 1 - | S1 | , where all of the variables xj, j in (S0 union S1), are binary (restricted to the values 0 and 1). These variables are marked as belonging together, and CPLEX's branch-and-bound procedure uses special branching rules to handle them efficiently. (Variables so marked are known as a "special ordered set of type 3" or sos3 sets for short.) An AMPL constraint is recognized as having the above form even if some terms have been moved to the other side of the = sign. Thus a sos3 constraint may be written as, for example, sum {j in S0} x[j] + sum {j in S1} (1-x[j]) = 1; here it is clear that x[j] must be zero for all j in S0 and 1 for all j in S1, with exactly one exception. If we imagine S1 being empty, we get the common special case sum {j in S0} x[j] = 1; this says that x[j] must be 1 for exactly one of the indices j in S0. You can see the number of sos3 sets found by CPLEX, by setting the prestats directive (described in Section II.3 above) to 1. If few or none are found, then i is best left at its default of 0. Even when many sos3 constraints are found, you may have to experiment to determine whether CPLEX's special branching rules offer an advantage for your problem. Directives for algorithmic control In contrast to the case of linear programming, the algorithmic control directives for solving pure and mixed integer programs are less likely to produce the best possible performance at their default values. You may wish to specify alternative values for one or more of the following directives to improve solution times. You can view each of these directives as corresponding to a particular decision faced at each step in the branch-and-bound procedure. To be specific, imagine that an LP subproblem has just been solved. The sequence of decisions and the corresponding directives are then as follows: o Branch next from which node in the tree? (backtrack, nodesel) o Branch by constraining which fractional variable at the selected node? (varsel; also optionally mip_priorities, plobjpri, plconpri) o Investigate which of a fractional variable's two resulting branches first? (branch, heuristic, reducecostfix) o Solve the resulting new subproblem by which LP algorithm? (startalgorithm, mipalgorithm) It is often hard to predict which combination of directives will work best. Some experimentation is usually required; your knowledge of the problem structure may also suggest certain choices of branch-and-bound strategy. backtrack=r (default .85) nodesel=i (default 1) The settings of these directives determine the criterion for choosing the next node from which to branch, once some feasible integer solution has been found. When i is set to 1 or 2, CPLEX associates a value with each node, and chooses a node based on these values. For i=1, a node's value is the bound on the integer optimum that is given by solving the LP subproblem at that node. For i=2, a node's value is an estimate of the best integer objective that can be achieved by branching from that node; estimates are derived from so-called pseudocosts, which are in turn derived from the solutions to the LP subproblems. Depending on the value at the current (most recently created) active node, CPLEX either branches from that node, or else backtracks to the node that has the best bound (i=1) or best estimate (i=2) among all active nodes. The backtracking decision is made by comparing the value (bound or estimate) at the current node with the values at parent nodes in the tree. If the value of the current node has degraded (increased for a minimization, decreased for a maximization) by at least a certain amount relative to the values at parent nodes, then a backtrack is performed. The cutoff for degradation is determined by an internal heuristic that is regulated by the value of r. Lower values of r > 0 favor backtracking, resulting in a strategy that is more nearly "breadth first". The search jumps around fairly high in the tree, solving somewhat dissimilar subproblems. Good solutions are likely to be found sooner through this strategy, but the processing time per node is also greater. Higher values of r discourage backtracking, yielding a strategy that is more nearly "depth first". Successive subproblems are more similar, nodes are processed faster, and integer solutions are often quickly found deep in the search tree. Considerable time may be wasted in searching the descendants of one node, however, before backtracking to a better part of the tree. The default value of .85 gives a moderately breadth-first search and represents a good compromise. Lower values often pay off when the LP subproblems are expensive to solve. Setting i to 0 chooses a pure depth-first strategy, regardless of r. CPLEX automatically uses this strategy to search for an initial feasible integer solution at the outset of the branch-and-bound procedure. varsel=i (default 0) Once a node has been selected for branching, this directive determines how CPLEX chooses a fractional-valued variable to branch on. By default (i=0) the choice is made by an internal heuristic based on the problem and its progress. The maximum infeasibility rule (i=1) chooses the variable with the largest fractional part. This forces larger changes earlier in the tree, and tends to produce faster overall times to reach the optimal integer solution. The minimum infeasibility rule (i=-1) chooses the variable with the smallest fractional part. This may lead more quickly to a first integer feasible solution, but will usually be slower overall to reach the optimal integer solution. A pseudocost rule (i=2) estimates the worsening of the objective that will result by forcing each fractional variable to an adjacent integer, and uses these "degradations" in an internal heuristic for choosing a variable to branch on. This setting tends to be most effective when the problem embodies complex tradeoffs, and the dual variables have an economic interpretation. Strong branching (i=3) considers several different branches by actually solving subproblems for different choices of branching variable. The variable yielding the best results is then chosen. Strong branching requires more time for each node, but usually fewer nodes to solve the problem. This strategy works especially well on binary problems where the number of binary variables is signficantly greater than the number of rows. It is also useful when memory is limited; creating fewer nodes requires less memory. option mip_priorities 'v1 i1 v2 i2 ...' ; If your model contains more than one declaration of integer variables, you should consider the assignment of branching prior- ities. Each time that CPLEX must choose a fractional-valued integer variable on which to branch, it gives preference to the fractional variables that have the highest priorities. A judi- cious choice of priorities can guide the search in a way that dramatically reduces the number of nodes generated. Priorities are not specified via cplex_options like other directives, but are instead read from the AMPL option mip_priorities. The content of mip_priorities should be a string consisting of space separated pairs vk ik, where vk is a name that appears after var in the model, and 1ceil(r) and one that has the constraint x 0, the cutoff is instead the incumbent's value -r (if mini- mizing) or +r (if maximizing). This forces the mixed-integer optimization to ignore integer solutions that are not at least r better than the one found so far. As a result there tend to be fewer nodes generated, and the algorithm terminates more quickly; but the true integer optimum may be missed if its objective value is within r of the best integer objective found. integrality=r (default 1.0e-5) In the optimal solution to a subproblem, a variable is con- sidered to have an integral value if it lies within r of an inte- ger. For some problems, increasing r may give an acceptable solution faster. Directives for halting and resuming the search. There is usually no need to make exhaustive runs to determine the impact of different search strategies or optimality criteria. While you are experimenting, consider using one of the directives below to set a stopping criterion in advance. In each case, the best solution found so far is returned to AMPL. You can arrange to save the entire search tree when CPLEX halts, so that the search may be resumed from where it left off. Directives for this purpose are also listed below. mipsolutions=i (default 10000) The search is terminated after i feasible solutions satisfy- ing the integrality requirements have been found. nodes=i (default 20000) The search is terminated after i linear programming subprob- lems have been solved. time=r (default 1.0e+75) The search is terminated after r seconds of computing time. treememory=r (default 1e75) The search is terminated when the branch-and-bound search tree reaches a size of r megabytes. Since the search tree is kept entirely in memory, you can use this directive to assure that CPLEX will return with a (possibly less than optimal) solution rather than terminating with an out-of-memory error. In general, r should be set to a few megabytes less than the amount of memory available, to allow for memory used by the operating system, by other programs, and by CPLEX and its other data structures. For example, if your MS-DOS computer has 16 megabytes of real memory, you might start by setting r to 13. If you can use virtual memory, you may be able to set r substantially higher than the number of megabytes of physical memory, although performance will be slower as a result. endtree=f1 starttree=f2 CPLEX progressively allocates more memory for the search tree as the branch-and-bound procedure creates new nodes; it frees all this memory at termination. If the endtree directive is specified, CPLEX also writes a record of the final tree to the file named f1, in a compact binary format. CPLEX normally starts the branch-and-bound procedure from a tree that consists only of the root node, as explained at the beginning of this section. If the starttree directive is specified, then CPLEX instead starts from the search tree stored in the file named f2. This file must be one that was previously written, for the same problem, by the endtree directive. These directives are particularly useful for large and difficult problems that may take hours or days to solve to optimality. If you would like to look at the first integer solution that CPLEX finds, for example, you can set mipsolutions=1 together with endtree and any other directives you like: ampl: model multmip3.mod; ampl: data multmip3.dat; ampl data: option solver cplex; ampl: option cplex_options 'mipsolutions=1 varsel=-1' ampl? ' endtree=multmip.tre'; ampl: solve; CPLEX 3.0: mipsolutions 1 varsel -1 endtree multmip.tre CPLEX 3.0: mixed-integer solutions limit; objective 238225 251 simplex iterations 64 branch-and-bound nodes ampl: display Trans >multmip.so1; ampl: A display of the Trans variables, at the values they take in the first integer solution, has been directed to the file multmip.so1 for future examination. You could also browse through the values interactively at this point. When you are ready to continue, you need only set starttree to the same file as endtree, and make any other changes to the branch-and-bound directives that you wish. Then give the solve command again: ampl: option cplex_options 'mipsolutions=1 varsel=1' ampl? ' starttree=multmip.tre endtree=multmip.tre'; ampl: solve; CPLEX 3.0: mipsolutions 1 varsel 1 starttree multmip.tre endtree multmip.tre CPLEX 3.0: mixed-integer solutions limit; objective 235625 596 simplex iterations 124 branch-and-bound nodes ampl: display Trans >multmip.so2; ampl: CPLEX's counts of the numbers of iterations and nodes are cumulative. Since this is a minimization problem, the objective at this second solution is lower than at the first. To continue past this point with all the same CPLEX directives, you need only type solve: ampl: solve; CPLEX 3.0: mipsolutions 1 varsel 1 starttree multmip.tre endtree multmip.tre CPLEX 3.0: optimal integer solution; objective 235625 601 simplex iterations 142 branch-and-bound nodes ampl: We see here that the objective value from the second solution (235625) was optimal, but that CPLEX had to process an additional 18 nodes to prove optimality. (This is not a fluke. The branch-and-bound procedure must often examine many nodes to prove optimality after it has found an optimal solution.) Directives for controlling output When invoked by solve, CPLEX normally returns just a few lines to your screen to summarize its performance. The following directives let you choose more output, which may be useful for monitoring the progress of a long run, or for comparing the effects of other directives on the behavior of the branch-and- bound algorithm. Output normally comes to the screen, but may be redirected to a file by specifying solve >filename. mipdisplay=i1 (default 0) mipinterval=i2 (default 1) The default of i1=0 produces a minimal few lines of output from CPLEX, summarizing the results of the run. When i1=1, a single ``log line'' is displayed for every integer solution found: ampl: option solver cplex; ampl: option cplex_options 'mipdisplay=1'; ampl: model mod/multmip1.mod; data dat/multmip1.dat; ampl: solve; CPLEX 3.0: mipdisplay 1 Node Log . . . Best Integer = 2.368000e+05 Node = 17 Best Node = 2.235040e+05 Best Integer = 2.335500e+05 Node = 29 Best Node = 2.239300e+05 Best Integer = 2.332750e+05 Node = 71 Best Node = 2.247610e+05 Best Integer = 2.313750e+05 Node = 105 Best Node = 2.255300e+05 Best Integer = 2.300500e+05 Node = 163 Best Node = 2.269560e+05 Best Integer = 2.298500e+05 Node = 204 Best Node = 2.281340e+05 CPLEX 3.0: optimal integer solution within mipgap; objective 229850 786 integer iterations The information includes the number of nodes processed, and the objective values of the best integer solution found so far and of the best unprocessed node subproblem. (The optimal value lies between these two.) When i1=2, a more detailed log line is displayed once every i2 nodes, as well as for each node where an integer solution is found. A * indicates lines of the latter type. The default of i2=1 gives a complete picture of the branch-and-bound process, which may be instructive for small examples. With a larger choice of i2, this setting can be very useful for evaluating the progress of long runs; the log line includes a count of the num- ber of active nodes, which gives an indication of the rate at which the search tree is growing or shrinking in memory. When i1=3, CPLEX also prints log lines from the simplex method (as described in the previous section) for the solution of each subproblem. timing=i (default 0) This directive can be used to display a summary of processing times. It works the same for integer programming as for linear programming, as described in Section II.3 above. Common difficulties The following troubleshooting tips address the two difficulties most often encountered in solving integer programs with CPLEX. The most common failure when solving integer problems is running out of memory. This problem arises when the branch-and-bound tree becomes so large that insufficient memory is available to solve an LP subproblem. As memory gets tight, you may observe frequent warning messages while CPLEX attempts to navigate through various operations within limited memory. Eventually the solution process will be terminated with a message indicating an unrecoverable integer programming error. The tree information saved in memory can be substantial. CPLEX saves a basis for every unexplored node. When you use the best-bound or best-estimate strategy of node selection, the list of such nodes can become very long for large or difficult problems. Moreover, once a run has failed because of insufficient memory, there is no reliable way to predict how much additional memory might be necessary for success. To avoid out-of-memory failures, we recommend setting the treememory directive described above. CPLEX will then return to AMPL the best integer solution it has been able to find within the available memory. You may also be able to reduce CPLEX's memory requirements by modifying the search strategy to generate a smaller tree and use less memory. Switching to a more nearly depth-first search selection strategy (by increasing the backtrack directive) often works; depth-first search rarely generates a large unexplored node list since it dives deep into the branch-and-bound tree rather than jumping around at higher levels. A second common difficulty is failure to prove optimality. One frustrating aspect of the branch-and-bound technique for integer programming is that the solution process can continue long after the optimal solution has been found. In these situations the branch-and-bound tree is being exhaustively searched in an effort to guarantee that the current integer feasible solution is indeed optimal. Remember that the branch-and-bound tree for a problem in n binary variables may be as large as 2^n nodes; a problem containing only 30 such variables could produce a tree having over one billion nodes! Although CPLEX's search strategies avoid visiting the vast majority of nodes, the search can still easily take more computer memory or time than you have available. In general you should set at least one limit on the number of nodes processed, number of improved solutions found, or total processing time, using the CPLEX directives given above. Setting limits ensures that the tree search will terminate in reasonable time. You can then inspect the solution and, if necessary, re-run the problem using different directive settings. Consider some of the shortcuts described above for improving performance, particularly those for relaxing optimality. They may provide you with an optimal or very nearly optimal solution, even though a proof of optimality would require more computer resources than you have available. Acknowledgement Much of this material is based on text provided by Janet Lowe.