Sarat Sreepathi
 

Cyberinfrastructure for Threat Detection in Water Distribution Networks


This page provides an outline for my Master's thesis project. Updated information about the project is available at http://www.secure-water.org


Abstract

Threat management in drinking water distribution systems involves real-time characterization of any contaminant source and plume, design of control strategies, and design of incremental data sampling schedules. This requires dynamic integration of time-varying measurements along with analytical modules that include simulation models, adaptive sampling procedures, and optimization methods. These modules are compute-intensive, requiring multi-level parallel processing via compute clusters. Since real-time responses are critical, the computational needs must also be adaptively matched with available resources. This requires a software system to facilitate this integration via a high performance computing architecture such that the measurement system, the analytical modules and the computing resources can mutually adapt and steer each other. In this project, we are developing such an adaptive cyberinfrastructure system facilitated by a dynamic workflow design.

Background

  • EPANET is an extended-period hydraulic and water-quality simulation code developed by the U.S. Environmental Protection Agency (EPA).
  • We currently use EPANET to solve the single-source water contamination problem.
  • In this scenario, we assume that the contaminant is present at a single location in the water distribution network.
  • We run the simulation with the true source parameters and obtain the concentration profiles at several predetermined sensor locations.
  • The simulation is also run for several guess sources, and the resulting sensor readings are compared to the true-source observations.
  • The source identification problem is thus posed as an inverse problem: the goal is to find a guess source that generates concentration profiles similar to those from the true source.
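The comparison between a guess source and the true source can be sketched as a mismatch score over the sensor profiles. This is a minimal illustration only: the page does not state which error measure the project uses, so the sum of squared differences, the function name profile_mismatch, and the sensor labels are all assumptions.

```python
from typing import Dict, List


def profile_mismatch(observed: Dict[str, List[float]],
                     simulated: Dict[str, List[float]]) -> float:
    """Sum of squared differences between two sets of concentration profiles.

    Each dict maps a sensor label to its concentration time series.
    A score of 0.0 means the guess reproduces the observations exactly.
    The squared-error measure is an assumption, not the project's metric.
    """
    total = 0.0
    for sensor, obs_series in observed.items():
        sim_series = simulated[sensor]
        if len(sim_series) != len(obs_series):
            raise ValueError(f"profile length mismatch at sensor {sensor}")
        total += sum((o - s) ** 2 for o, s in zip(obs_series, sim_series))
    return total


# Illustrative values only (not measurements from the project).
observed = {"S1": [0.0, 0.5, 1.0], "S2": [0.0, 0.0, 0.25]}
guess = {"S1": [0.0, 0.0, 1.0], "S2": [0.0, 0.0, 0.25]}
print(profile_mismatch(observed, guess))
```

A lower score means the guess source explains the sensor readings better, which is what an optimization method would minimize.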

Coarse Grain Parallelization of Water Quality Simulation Code (EPANET)

  • The original version of EPANET that we obtained could simulate only one source per run.
  • We developed a wrapper around the original EPANET that takes several sources and simulates them successively within one run.
  • This amortizes the startup cost (i.e., the setup time incurred when the program starts), since more computation is packed into each run.
  • The wrapper was then parallelized using MPI. This version is referred to as 'pepanet'.
  • Within the MPI program, the trial sources are divided among all the processes. Each process then simulates its assigned group of sources successively.
  • The root process collects the fitness values from all the member processes and writes them to an output file.
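The division of trial sources among processes might look like the sketch below. It assumes a contiguous block distribution, which the page does not confirm, and it emulates the MPI ranks with an ordinary loop so that it runs without an MPI installation. In a real MPI program each rank would compute only its own block and the root would collect the results with a gather operation. The names block_partition and simulate_all are hypothetical.

```python
from typing import Callable, List, Sequence


def block_partition(n_items: int, n_procs: int, rank: int) -> range:
    """Indices of the items assigned to `rank` under a block distribution.

    The first (n_items % n_procs) ranks receive one extra item each,
    so block sizes differ by at most one.
    """
    base, extra = divmod(n_items, n_procs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def simulate_all(sources: Sequence, n_procs: int,
                 fitness: Callable[[object], float]) -> List[float]:
    """Serial emulation: each 'rank' evaluates its block in order,
    and the 'root' concatenates the results in rank order."""
    gathered: List[float] = []
    for rank in range(n_procs):
        block = block_partition(len(sources), n_procs, rank)
        local = [fitness(sources[i]) for i in block]
        gathered.extend(local)
    return gathered


print([list(block_partition(10, 3, r)) for r in range(3)])
```

Because the blocks are contiguous and gathered in rank order, the collected fitness values line up with the original order of the trial sources.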

Optimization Methods Using Evolutionary Computing Techniques and Concepts from Graph Theory.

  • In EPANET, the water distribution network is represented as a connected graph, but the node numbering is not well suited to optimization methods such as genetic algorithms (GA).
  • Hence, the nodes are reordered using the Cuthill-McKee algorithm to create an equivalent, more suitable representation.
  • An optimization method that implements a genetic algorithm (GA) has been developed.
  • The GA implementation uses real encoding for the parameters and the simulated binary crossover (SBX) operator for crossover.
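The two named techniques can be sketched as follows. This is a textbook-style illustration, not the project's code: cuthill_mckee is a plain breadth-first Cuthill-McKee ordering on an adjacency dictionary, and sbx_pair is the basic unbounded form of simulated binary crossover for one real-valued gene. The function names, the adjacency format, and the distribution index eta = 2.0 are assumptions.

```python
import random
from collections import deque


def cuthill_mckee(adjacency):
    """Return the nodes in Cuthill-McKee order.

    adjacency maps each node to an iterable of its neighbours.
    Traversal starts at a minimum-degree node and visits unvisited
    neighbours in order of increasing degree (ties broken by label).
    """
    degree = {n: len(set(nbrs)) for n, nbrs in adjacency.items()}
    key = lambda n: (degree[n], n)
    order, visited = [], set()
    for start in sorted(adjacency, key=key):  # also covers disconnected graphs
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in sorted(set(adjacency[node]) - visited, key=key):
                visited.add(nbr)
                queue.append(nbr)
    return order


def sbx_pair(p1, p2, eta=2.0, rng=random):
    """Simulated binary crossover for a single real-valued gene.

    Larger eta keeps the children closer to the parents.
    The children always satisfy c1 + c2 == p1 + p2 (up to rounding).
    """
    u = rng.random()  # u lies in [0, 1), so 1 - u is never zero
    if u <= 0.5:
        beta = (2.0 * u) ** (1.0 / (eta + 1.0))
    else:
        beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2


# A path network 0-3-1-2 whose labels do not follow the topology.
network = {0: [3], 3: [0, 1], 1: [3, 2], 2: [1]}
print(cuthill_mckee(network))
```

Relabeling each node by its position in the returned order gives connected nodes nearby indices, which is presumably what makes a real-encoded node parameter more meaningful to the GA.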

Workflow Encompassing Interactions Between the Optimization Methods and Parallelized Water Quality Simulation Code.

  • Glue code has been developed to facilitate interaction between the optimization methods and the parallelized EPANET (pepanet).
  • The wrapper (pepanet) is designed to work with a variety of optimization method implementations. It has been tested with optimization method implementations in Java, C and Matlab.
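One way such language-neutral glue could work is through plain text files, as in the sketch below. Everything here is hypothetical: the file names, the one-row-per-candidate format, and the helper names are illustrative assumptions, and the simulator is replaced by an ordinary Python function rather than a call to pepanet.

```python
import tempfile
from pathlib import Path


def write_rows(path, rows):
    """Write one row per line as whitespace-separated floats."""
    text = "\n".join(" ".join(repr(float(v)) for v in row) for row in rows)
    Path(path).write_text(text + "\n")


def read_rows(path):
    """Read back rows written by write_rows, skipping blank lines."""
    rows = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            rows.append([float(tok) for tok in line.split()])
    return rows


def exchange_once(candidates, evaluate, workdir):
    """One optimizer/simulator round trip through two files in workdir."""
    cand_file = Path(workdir) / "candidates.txt"  # hypothetical file name
    fit_file = Path(workdir) / "fitness.txt"      # hypothetical file name
    # Optimizer side: publish the candidate sources.
    write_rows(cand_file, candidates)
    # Simulator side: stand-in for launching the parallel simulator.
    results = [[evaluate(c)] for c in read_rows(cand_file)]
    write_rows(fit_file, results)
    # Optimizer side: read one fitness value per candidate.
    return [row[0] for row in read_rows(fit_file)]


with tempfile.TemporaryDirectory() as workdir:
    print(exchange_once([[1.0, 2.0], [3.0, 4.5]], sum, workdir))
```

A text format of this kind can be read and written from Java, C, or Matlab alike, so the optimizer and the simulator need to agree only on the file layout.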

Framework for Deployment on Large Scale Distributed Systems like Teragrid and SURAgrid.

  • Preliminary work has been done in this area. Shell scripts and other glue code have been developed to deploy the current framework on Teragrid.
  • This included configuring passwordless ssh, among other things, to ensure seamless file-based communication.

Performance Analysis and Optimization.

  • During the initial development of parallel EPANET (pepanet), a random search method was implemented to solve the source identification problem.
  • This random search implementation was then parallelized using MPI. Extensive benchmarking has been performed to study scalability issues and to compare the serial and parallel versions.
  • Performance analysis is currently being undertaken for the Teragrid deployment scenario.
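A random search of the kind mentioned above can be sketched in a few lines. This is a generic illustration under stated assumptions, not the project's implementation: candidates are sampled uniformly within per-parameter bounds, and the fitness function in the usage example is a simple stand-in rather than an EPANET simulation.

```python
import random


def random_search(fitness, bounds, n_trials, rng=None):
    """Sample n_trials candidates uniformly within bounds; keep the best.

    bounds is a list of (low, high) pairs, one per parameter.
    Lower fitness is treated as better.
    """
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    if rng is None:
        rng = random.Random()
    best_x, best_f = None, float("inf")
    for _ in range(n_trials):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        f = fitness(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f


# Stand-in fitness: squared distance to a made-up "true" parameter pair.
target = (2.0, 3.0)
score = lambda x: (x[0] - target[0]) ** 2 + (x[1] - target[1]) ** 2
best_x, best_f = random_search(score, [(0.0, 5.0), (0.0, 5.0)], 2000,
                               rng=random.Random(42))
print(best_x, best_f)
```

Since the trials are independent of one another, they can be split across processes with no communication until the final comparison, which makes this method a convenient baseline for scalability studies.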

Visualization

  • A visualization tool has been developed to provide a better understanding of the search patterns of the optimization method.

 
 
 

 

© 2008 Sarat Sreepathi