- Computer Scientist in the Future Technologies Group at Oak Ridge National Laboratory.
Deputy Lead, Performance Group, E3SM (Energy Exascale Earth System Model)
12+ years of experience in the design and development of efficient parallel scientific applications on leadership-class supercomputers.
High Performance Computing, Data/Performance Analytics, Exascale Co-design, Optimization Algorithms, Computational Intelligence, Parallel I/O, Performance Analysis and Optimization.
MODELING DOMAINS: Water Distribution Systems, Groundwater, Energy Economy Optimization.
- [ January 2007 - November 2012 ] North Carolina State University, Raleigh, NC, USA.
PhD in Computer Science.
Minor in Civil, Construction and Environmental Engineering
Dissertation: Optimus: A Scalable Parallel Metaheuristic Optimization Framework with Environmental Engineering Applications.
- [ August 2004 - December 2006] North Carolina State University, Raleigh, NC, USA.
M.S. in Computer Science.
Thesis: Cyberinfrastructure for Contamination Source Characterization in Water Distribution systems.
- [ June 2000 - May 2004] V. R. Siddhartha Engineering College (Nagarjuna University), Vijayawada, AP, India.
Bachelor of Technology in Computer Science and Engineering.
Project: High Performance Cluster Computing Using Beowulf Clusters.
- [ June 2013 - Present] Computer Scientist, Future Technologies Group, Oak Ridge National Laboratory, Oak Ridge, TN.
E3SM (Energy Exascale Earth System Model)
Deputy Lead, Performance Group: Performance engineering and optimization efforts on leadership class supercomputers.
Exascale Computing Project (ECP)
Design and development of E3SM (Energy Exascale Earth System Model) Multiscale Modeling Framework (MMF) targeted towards next generation supercomputers like Summit at ORNL.
Lead co-design activities in collaboration with interested software groups and compute vendors.
Oxbow: Performance Analytics for Exascale Co-design
Architected a data analytics platform for exascale application characterization called Performance Analytics Data Store (PADS) using MongoDB.
Design and development of a visual analytics portal for data exploration and analysis.
Exploring machine learning techniques for novel insights into experiment data.
SUPER: Institute for Sustained Performance, Energy and Resilience
Communication characterization and optimization of large scale scientific applications, Fusion (XGC) and Climate (CESM).
Development and performance optimization of techniques for mapping processes to processor cores at runtime so as to minimize communication overhead by taking advantage of network topology.
SUPER-EFRC (Energy Frontier Research Center) Pilot
Optimization of WastePD applications and workflow.
CESAR: Center for Exascale Simulation of Advanced Reactors
Characterization and validation of various proxy apps from the co-design centers.
- [ October 2014 - October 2017] Adjunct Assistant Professor, Civil, Construction and Environmental Engineering Department,
North Carolina State University, Raleigh, NC.
Collaborate on NSF project: Cyber-Enabled Water and Energy Systems Sustainability Utilizing Climate Information.
- [ January - June 2013] Postdoctoral Research Scholar, North Carolina State University, Raleigh, NC.
- [ February - July 2009] Research Fellow with the Blue Brain Project, EPFL, Lausanne, Switzerland.
Designed scalable multi-objective optimization algorithms to perform model fitting for single cell neuron models. Conducted experiments on 8192 cores of an IBM BlueGene/L supercomputer. The overall project aims to reverse-engineer the mammalian brain through detailed simulations on supercomputers.
- [ August - September 2005] Research Intern at Oak Ridge National Laboratory, Oak Ridge, USA.
Evaluated the performance of a nuclear fusion application, Gyro, under the supervision of Dr. Patrick Worley.
- [ May - August 2005] Software Engineer Intern at Microsoft, Redmond, USA.
Developed and extended the framework and automation tools for testing the Password Management application for the Microsoft Identity Integration Server (MIIS) team.
- [ August 2004 - March 2005] Database Programmer at Systems - NCSU Libraries, Raleigh, USA.
Developed scripts for importing large volumes of data into library database from external data sources. Developed an Enterprise Resource Management tool for the Acquisitions and Collection Management departments using Oracle and MySQL.
- [2007 - 2012] Ph.D. Candidate - Advisor: Dr. Kumar Mahinthakumar.
Project: Optimization Methods for Universal Simulators (Optimus)
Designed a scalable parallel metaheuristic optimization framework for integration of a desired population-based search method with a target scientific application. It effectively scaled to a quarter of a million cores on the Cray XK6 supercomputer (Jaguar) at the Oak Ridge Leadership Computing Facility.
- Developed a parallel middleware component, PRIME (Parallel Reconfigurable Iterative Middleware Engine) for scalable deployment on emergent supercomputing architectures.
- Designed a new technique, TAPSO (Topology Aware Particle Swarm Optimization) for network based optimization problems with applications to Water Distribution Systems problems.
- Sustained 84.82% of baseline performance at 200,000 cores, measured relative to 1,000 cores, in a weak-scaling scenario.
- Environment: C, C++, MPI, HDF5, Python, Matplotlib.
- Project: Scalable Parallel I/O module for Environmental Management applications
Developed a parallel I/O library as part of a larger Department of Energy initiative, Advanced Simulation Capability for Environmental Management (ASCEM), for reading and writing of structured and unstructured scientific datasets.
- Developed a software module for supporting checkpoint/restart of simulations.
- Achieved over 25X speedup in HDF5 I/O read performance and 3X speedup in write performance for a subsurface simulator (PFLOTRAN) at over 100K processor cores on the Lustre file system (Jaguar supercomputer).
- Environment: C, Fortran, MPI, HDF5.
- Project: Tools for Energy Model Optimization and Analysis (TEMOA)
Founding member of TEMOA, an open source Energy-Economy Optimization modeling framework. The energy system is described algebraically as a network of linked processes that convert a raw energy commodity into an end-use demand through a series of one or more intermediate commodities. Technologies are linked to one another in a network via model constraints representing the allowable flow of energy commodities. The model objective is to minimize the present cost of energy supply by deploying and utilizing energy technologies and commodities over time to meet a set of exogenously specified end-use demands.
- Designed a conceptual model to balance energy flows across the system while meeting constraints (demand, capacity, etc.).
- Visualization of system-level flows from optimization results using Graphviz.
- Environment: Python, Pyomo, Graphviz, Sphinx.
- Project: Performance Engineering Research Institute - A Dept. of Energy SciDAC (Scientific Discovery through Advanced Computing) Project
Conducted performance analysis and optimization of scientific applications on the IBM BlueGene/P and Cray XT4/5 supercomputers at ANL and ORNL respectively. Achieved an improvement of 18.77% for PFLOTRAN over baseline results on 4096 cores of Cray XT4.
- [Dec 2004 - Dec 2006] Graduate Research Assistant.
- Project: Adaptive Cyberinfrastructure for Threat Management in Water Distribution Systems - An NSF DDDAS (Dynamic Data-Driven Application Systems) Project.
- Parallelized the Water Distribution Systems simulator (EPANET) and ported it to Blue Gene/P and Cray XT4.
- Developed a framework encompassing interactions between the optimization methods and parallel EPANET.
- Designed a grid-enabled workflow for efficient execution of water quality simulations and deployment on large-scale distributed systems like TeraGrid and SURAgrid.
- Developed a visualization tool for analysis of search patterns of the optimization method.
- Environment: C, MPI, Python, Tk.
- Project: Performance Evaluation Research Center (DOE SciDAC project)
Analyzed performance of Nuclear Fusion (Gyro) and Astrophysics (GenASiS) applications. Evaluated performance analysis tools at several supercomputing facilities (Oak Ridge National Laboratory, TeraGrid, etc.).
- Cluster Administration
Planned, deployed, benchmarked, and maintained research group clusters. Analyzed and optimized the performance of the High Performance Linpack (HPL) benchmark on multiple compute clusters.
High Level Languages: C, C++, Python, Fortran.
Operating Systems: Linux (RedHat, SuSE, etc.), IBM AIX, Solaris.
Supercomputing Platforms: GPU, KNL, Cray XT4/5, XK6/7, IBM BlueGene/P, x86-64/x64, IA-64, IBM P-690, SGI Altix, Cray X1.
Compilers: IBM, Pathscale, Intel, Portland Group, GNU.
- (Worked on supercomputers/clusters with above architectures and programming environments).
Libraries: MPI, OpenMP, MPI-IO, HDF5, MKL.
Databases: MySQL, SQLite, MongoDB.
Development Tools: GNU Binutils, Subversion, Mercurial, Git, Buildbot, Valgrind, TotalView, Allinea DDT, Doxygen, Sphinx, Intel Pin.
HPC Tools: TAU, CrayPAT, HPCToolkit, SvPablo, PerfSuite, mpiP, PAPI.
Sys Admin: Bash, DNS, DHCPD, BOOTP/PXE, NIS, NFS, Modules, Cluster Management.
Visualization: Graphviz, Matplotlib, gnuplot, Tk, Mayavi, jqPlot, D3.
Recognition and Service
- Awarded Gold medal in the ACM Graduate Student Research Competition at Supercomputing 2012 conference.
- Member, OLCF User Group Executive Board.
- Technical Program Committee, Supercomputing conference, 2017.
- Selected for Early Career Program at Supercomputing conference 2016.
- Review Editor, Data-driven Climate Sciences, Frontiers in Big Data journal.
- Co-convener, Eighth Workshop on Data Mining in Earth System Science, IEEE International Conference on Data Mining 2018.
- Reviewer, DOE SBIR (Small Business Innovation Research) proposals.
- Reviewer, Parallel Computing Journal.
- Reviewer, ASCE Journal of Computing in Civil Engineering.
- Reviewer, Journal of Water and Climate Change.
- Reviewer, Journal of Environmental Modelling and Software.
- Reviewer, Computers and Geosciences.
- Reviewer, IEEE Transactions on Evolutionary Computation.
- Member of IEEE, ACM (+ SIGHPC, SIGEVO), AGU
- Kevin Hunter, Sarat Sreepathi, Joseph F. DeCarolis, Modeling for insight using Tools for Energy Model Optimization and Analysis (Temoa), Energy Economics, Volume 40, November 2013, Pages 339-349, ISSN 0140-9883.
- Joseph DeCarolis, Kevin Hunter and Sarat Sreepathi, The Case for Repeatable Analysis with Energy Economy Optimization Models, Energy Economics, Volume 34, Issue 6, November 2012, Pages 1845–1853.
- Jacqueline Chame, Chun Chen, Mary Hall, Jeffrey K. Hollingsworth, Kumar Mahinthakumar, Gabriel Marin, Shreyas Ramalingam, Sarat Sreepathi, Vamsi Sripathi, Ananta Tiwari, PERI Autotuning of PFLOTRAN, In Journal of Physics, Proceedings of SciDAC July 2011.
- Sarat Sreepathi, Kumar Mahinthakumar, Emily Zechman, Ranji Ranjithan, Downey Brill, Xiaosong Ma, and Gregor von Laszewski, Cyberinfrastructure for Contamination Source Characterization in Water Distribution systems, Lecture Notes in Computer Science, Volume 4487/2007 (International Conference on Computational Science (1) 2007: 1058-1065)
- G. Mahinthakumar, G. von Laszewski, S. Ranjithan, E. D. Brill, J. Uber, K. W. Harrison, S. Sreepathi, and E. M. Zechman, An Adaptive Cyberinfrastructure for Threat Management in Urban Water Distribution Systems, Lecture Notes in Computer Science, Springer-Verlag, pp. 401-408, 2006. (International Conference on Computational Science (3) 2006: 401-408)
- P. Worley, J. Candy, L. Carrington, K. Huck, T. Kaiser, G. Mahinthakumar, A. Malony, S. Moore, D. Reed, P. Roth, H. Shan, S. Shende, A. Snavely, S. Sreepathi, F. Wolf, and Y. Zhang, Performance Analysis of GYRO: A Tool Evaluation, Journal of Physics: Conference Series, 16 (2005), pp. 551-555. (Proceedings of the 2005 SciDAC Conference, San Francisco, CA, June 26-30, 2005.)
- Sarat Sreepathi, Jitendra Kumar, Richard T. Mills, Forrest M. Hoffman, Vamsi Sripathi, William W. Hargrove, Parallel Multivariate Spatio-Temporal Clustering of Large Ecological Datasets on Hybrid Supercomputers, IEEE International Conference on Cluster Computing (CLUSTER), Sept 5-8, 2017. DOI: 10.1109/CLUSTER.2017.88
- Philip C. Roth, Hongzhang Shan, David Riegner, Nikolas Antolin, Sarat Sreepathi, Leonid Oliker, Samuel Williams, Shirley Moore, and Wolfgang Windl. Performance Analysis and Optimization of the RAMPAGE Metal Alloy Potential Generation Software, In Proceedings of 4th ACM SIGPLAN International Workshop on Software Engineering for Parallel Systems (SEPS’17). ACM, New York, NY, USA, 10 pages. DOI: 10.1145/3141865.3141868
- Sarat Sreepathi, Ed D’Azevedo, Bobby Philip, and Patrick Worley, Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers, In Proceedings of the ACM/SPEC International Conference on Performance Engineering (ICPE). ACM, March 12-18, 2016. DOI: 10.1145/2851553.2851575
- M.L. Grodowitz, Sarat Sreepathi, Hierarchical Clustering and K-means Analysis of HPC Application Kernels Performance Characteristics, Nineteenth Annual IEEE High Performance Extreme Computing Conference (HPEC ’15), Waltham, MA, September 15-17, 2015. DOI: 10.1109/HPEC.2015.7322484
- Sarat Sreepathi, Megan Grodowitz, Robert Lim, Philip Taffet, Philip Roth, Jeremy Meredith, Seyong Lee, Dong Li, and Jeffrey Vetter, Application Characterization using Oxbow Toolkit and PADS Infrastructure, First International Workshop on Hardware-Software Co-Design for High Performance Computing (Co-HPC 2014), In conjunction with SC’14, November 17, 2014, New Orleans, LA.
- Sarat Sreepathi, Vamsi Sripathi, Richard Mills, Glenn Hammond, G. Kumar Mahinthakumar, SCORPIO: A Scalable Two-Phase Parallel I/O Library With Application to a Large Scale Subsurface Simulator, IEEE Conference on High Performance Computing (HiPC) 2013, Bengaluru, India.
- Sarat Sreepathi, Downey Brill, Ranji Ranjithan and Gnanamanikam Mahinthakumar, Parallel Multi-Swarm Optimization Framework for Search Problems in Water Distribution Systems, Proceedings of World Environmental and Water Resources Congress 2012, Albuquerque, NM.
- Joseph DeCarolis, Kevin Hunter and Sarat Sreepathi, The TEMOA Project: Tools for Energy Model Optimization and Analysis, International Energy Workshop 2010, Stockholm, Sweden, June 21-23, 2010.
- Jitendra Kumar, Sarat Sreepathi, Downey Brill, G. Kumar Mahinthakumar, S. Ranjithan, Detection of leaks in water distribution systems using routine water quality measurements, Proceedings of World Environmental and Water Resources Congress 2010, Providence, Rhode Island, USA.
Conference Talks and Posters
- Sarat Sreepathi, Matthew Norman, Anikesh Pal, Walter Hannah and Carl Ponder, Development of a cloud resolving model for heterogeneous supercomputers, American Geophysical Union Fall 2017 Meeting, New Orleans, Dec 11-15, 2017.
- Sarat Sreepathi, Optimus: A Parallel Optimization Framework With Topology Aware PSO and Applications (Poster), ACM Student Research Competition (SRC) Poster at SC '12: 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, Salt Lake City, UT, November 10-16, 2012. (Won Gold medal)
- Sarat Sreepathi and G. Kumar Mahinthakumar, Optimus: A Scalable Parallel Metaheuristic Optimization Framework With Topology Aware PSO and Applications, Talk at Swarmfest, Charlotte, NC, July 29-30, 2012.
- Sarat Sreepathi and G. Kumar Mahinthakumar, Parallel Metaheuristic Optimization Framework for Population-Based Search Algorithms with Environmental Applications, Lecture at SIAM Conference on Parallel Processing for Scientific Computing, Savannah, GA, February 15-17, 2012.
- Richard T. Mills, Sarat Sreepathi, Vamsi Sripathi, Kumar G. Mahinthakumar, Glenn Hammond, Peter Lichtner, Barry F. Smith, Jitendra Kumar, and Gautam Bisht. Engineering the PFLOTRAN subsurface flow and reactive transport code for scalable performance on leadership-class supercomputers, 15th SIAM Conference on Parallel Processing for Scientific Computing, Savannah, Georgia, February 15-17, 2012.
- Joseph DeCarolis, Kevin Hunter and Sarat Sreepathi, Multi-stage stochastic optimization of a simple energy system, International Energy Workshop 2012, Cape Town, South Africa.
- Sarat Sreepathi, Vamsi Sripathi, Glenn Hammond, Richard Mills, and G. Kumar Mahinthakumar. Poster: A Scalable Two-Phase Parallel I/O Library with Application to a Large Scale Subsurface Simulator, In Proceedings of the 2011 companion on High Performance Computing Networking, Storage and Analysis Companion (SC '11 Companion). ACM, New York, NY, USA, 67-68.
- Sarat Sreepathi and G. Kumar Mahinthakumar, Poster: Parallel Multi-Swarm Optimization Framework for Search Problems in Water Distribution Systems, EAP Workshop at Supercomputing 2011 Conference.
- Joe DeCarolis, Kevin Hunter, Sarat Sreepathi, Designing Robust Hedging Strategies for U.S. Electric Sector Planning Under an Uncertain Climate Policy, USAEE/IAEE (International Association for Energy Economics) Conference, Calgary, Canada, October 2010.
- Sarat Sreepathi, Simulation-Optimization for Threat Management in Urban Water Systems, Application Driven Design for a Large-Scale, Multi-Purpose Grid Infrastructure, Demo at Fall 2006 Internet2 Meeting, Chicago, IL, December 2006.
- Patrick Worley, Jeff Candy, Laura Carrington, Kevin Huck, Tim Kaiser, Kumar Mahinthakumar, Allen Malony, Dan Reed, Philip Roth, Hongzhang Shan, Sameer Shende, Allan Snavely, Sarat Sreepathi, Ying Zhang, Tools or No tools?: A parallel performance analysis case study (extended abstract), Poster Presentation at Supercomputing 2006.