Function approximation for large-scale reinforcement learning
We are developing new techniques to solve large-scale, high-dimensional machine learning problems, such as multi-agent optimization. We use sparse distributed memory to reduce the size of the tables that must be stored during the learning process. We show that adaptive prototype generation, Kanerva coding, and fuzzy function approximation can be used to improve the performance of the solver. 
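To give a rough flavor of the idea (this is a minimal sketch, not the project's actual implementation), a Kanerva-coding value approximator replaces a full state-value table with a small set of prototype states; a state activates the prototypes within a Hamming radius, and its value is the sum of their learned weights. The state size, prototype count, and radius below are all invented for illustration:

```python
import random

random.seed(0)

STATE_BITS = 16      # binary state dimensionality (illustrative)
NUM_PROTOTYPES = 50  # far fewer prototypes than the 2**16 table entries
RADIUS = 6           # Hamming-distance activation threshold

# Prototypes are random binary states; one learned weight per prototype.
prototypes = [tuple(random.randint(0, 1) for _ in range(STATE_BITS))
              for _ in range(NUM_PROTOTYPES)]
weights = [0.0] * NUM_PROTOTYPES

def active(state):
    """Indices of prototypes within RADIUS Hamming distance of state."""
    return [i for i, p in enumerate(prototypes)
            if sum(a != b for a, b in zip(state, p)) <= RADIUS]

def value(state):
    """Value estimate: sum of the weights of the active prototypes."""
    return sum(weights[i] for i in active(state))

def update(state, target, alpha=0.1):
    """Move the estimate toward target, sharing credit across active prototypes."""
    idx = active(state)
    if not idx:
        return
    err = target - value(state)
    for i in idx:
        weights[i] += alpha * err / len(idx)

# Repeated updates drive the estimate for a state toward its target.
s = prototypes[0]
for _ in range(100):
    update(s, 1.0)
```

Because nearby states share active prototypes, updates generalize across the state space, which is what shrinks the memory footprint relative to a full table.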

Computational infrastructure for seamless, inter-site Grid computing
We are developing and implementing a new Grid software layer that allows users to deploy parallel MPI applications across hosts and clusters located at different computing sites. Using message relaying and dynamic scheduling, the Grid layer lets users view the resources as if they were all located within the same cluster. This approach has been applied to parallelize a tomographic mammography application. (Joint work with D. Kaeli, CenSSIS, and T. Wu at Mass General Hospital) 

Applications of Combinatorial Optimization to Switching, Testing, Reconfigurable Computing, and Cache Layout (Joint work with M. Leeser, D. Kaeli, F. Lombardi, and Z. Navabi). 

Microprocessor-aware scheduling algorithms for modern compilers
We have developed high-performance algorithms that maximize instruction-level parallelism in the presence of scheduling restrictions, including complex resource constraints, branch delay slots, non-fully-pipelined functional units, and limits on the number of available registers. Problems of interest have included scheduling for superblocks and hyperblocks, developing tight lower bounds on schedule length, backtracking acyclic schedulers, and combined scheduling and register allocation with spill code. (Joint work with S. Abraham and A. Eichenberger) 
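As a sketch of the flavor of such schedulers (not the published algorithms), the following greedy list scheduler issues a toy dependence DAG under an issue-width resource constraint, giving priority to operations with the longest latency-weighted critical path. The DAG, latencies, and issue width are invented for illustration:

```python
from functools import lru_cache

# Hypothetical toy dependence DAG: op -> (latency, successor ops).
dag = {
    "load1": (2, ["add"]),
    "load2": (2, ["add"]),
    "add":   (1, ["store"]),
    "mul":   (3, ["store"]),
    "store": (1, []),
}
ISSUE_WIDTH = 2  # resource constraint: at most two ops issue per cycle

@lru_cache(maxsize=None)
def critical_path(op):
    """Longest latency-weighted path from op to a leaf (the priority)."""
    lat, succs = dag[op]
    return lat + max((critical_path(s) for s in succs), default=0)

preds = {op: 0 for op in dag}
for _, succs in dag.values():
    for s in succs:
        preds[s] += 1

ready_at = {op: 0 for op in dag}  # earliest cycle each op's operands are ready
schedule, cycle = {}, 0
unscheduled = set(dag)
while unscheduled:
    # Ops whose predecessors have all issued and whose operands are
    # available this cycle, highest critical path first (ties by name).
    ready = sorted((op for op in unscheduled
                    if preds[op] == 0 and ready_at[op] <= cycle),
                   key=lambda op: (-critical_path(op), op))
    for op in ready[:ISSUE_WIDTH]:
        schedule[op] = cycle
        unscheduled.remove(op)
        lat, succs = dag[op]
        for s in succs:
            preds[s] -= 1
            ready_at[s] = max(ready_at[s], cycle + lat)
    cycle += 1

print(schedule)  # each op mapped to its issue cycle
```

The restrictions the project studies (delay slots, non-fully-pipelined units, register limits) show up in practice as extra constraints on which ops count as ready in a given cycle.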

Lower bounds on schedule length
We developed tight lower bounds on schedule length that guide compilers in making accurate scheduling decisions. We performed a comprehensive study of the tightness of lower bounds on basic-block execution time, and developed and evaluated new bounds. We developed the first tight lower bounds on superblock execution time that specifically account for the dependence and resource conflicts between groups of branches, gave a fast implementation suitable for inclusion in a production compiler, and derived the tightest known lower bound for weighted completion time scheduling. (Joint work with A. Eichenberger) 
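The two classical bounds that such work typically refines can be sketched as follows; the tight bounds described above go well beyond this by modeling interactions between dependences and resources. The DAG and machine parameters below are invented for illustration:

```python
from math import ceil

# Hypothetical basic block as a dependence DAG: op -> (latency, successors).
dag = {
    "a": (2, ["c"]),
    "b": (2, ["c"]),
    "c": (1, ["e"]),
    "d": (3, ["e"]),
    "e": (1, []),
}
ISSUE_WIDTH = 2  # machine issues at most two ops per cycle

def dependence_bound():
    """Critical-path bound: no schedule beats the longest latency chain."""
    memo = {}
    def longest(op):
        if op not in memo:
            lat, succs = dag[op]
            memo[op] = lat + max((longest(s) for s in succs), default=0)
        return memo[op]
    return max(longest(op) for op in dag)

def resource_bound():
    """Resource bound: ops cannot issue faster than the issue width allows."""
    return ceil(len(dag) / ISSUE_WIDTH)

# Taking the max of the two gives a valid (but generally loose) lower bound.
lower_bound = max(dependence_bound(), resource_bound())
print(lower_bound)
```

A scheduler that produces a schedule matching such a bound has a proof of optimality, which is why tighter bounds translate directly into better compile-time decisions.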

Parallel and Scalable Processing Systems and Programming Toolsets
We developed a methodology for increasing the performance of four classes of finite-difference computational applications: finite difference time domain (FDTD), finite difference frequency domain (FDFD), Steepest Descent Fast Multipole Method (SDFMM), and Semi-analytic Mode Matching (SAMM). By defining processor and communication models using synthetic benchmarks for a parallel architecture, and performing detailed performance analysis of a target application, we developed parallelization strategies for each class of applications. These strategies were tailored both to the characteristics of each application and to the computational environment of the underlying parallel architectures. (Joint work with D. Kaeli, C. Rappaport and T. Wu) 
