Hi! I’m Kunj Sinha.
A big thanks to MDAnalysis and WESTPA for accepting my proposal for Google Summer of Code 2026.
I’ll be mentored by Jeremy Leung, Lillian Chong and Nilay Verma.
I will be working on the project “Interface for post-simulation analysis (“crawling”) of WESTPA simulations”.
Summary:
This project aims to implement a WESTPAParser and a WESTPAReader in WESTPA that expose the simulation data stored in WESTPA’s HDF5 framework as a standard MDAnalysis Universe. Currently, users must write extensive boilerplate code for WESTPA’s w_crawl to navigate the HDF5 simulation data manually before any structural observable can be computed. This project replaces that manual process by letting users obtain an MDAnalysis Universe directly from a west.h5 simulation file. The integration will be accessible through both a Python API and a new w_mdacrawl CLI tool, which is intended as a high-performance drop-in replacement for existing workflows. Users will also be able to run analyses in parallel through MDAnalysis’s AnalysisBase backend. Finally, the project will implement a method to save analysis results back into the HDF5 framework as auxdata, so that computed properties remain compatible with the broader WESTPA ecosystem for use in future simulations or downstream analysis tools such as w_ipa.
Rough Timeline
May 1 - May 24
- Interact with mentors and set up regular communication times.
- Discuss and make a detailed roadmap for project outcomes.
- Get involved with the community.
May 25 - June 1
- Set up developer environment and start basic h5py traj_segs/ reads.
- Register format ‘WESTPA’ and add entry points to pyproject.toml.
June 1 - June 7
- Build WESTPAParser skeleton, implement __init__.
- Build parse() with configuration detection.
June 8 - June 14
- Return fully populated Topology object after detecting config.
- Add parser test cases and verify that they pass without issues.
June 15 - June 21
- Build WESTPAReader skeleton and flat frame index.
- Implement __init__ using westpa.analysis API to build flat frame_index list.
- Implement n_frames and n_atoms properties.
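The flat frame index planned for this week can be illustrated with a small standalone sketch. It is not the planned implementation: the FrameRef tuple, the build_frame_index helper, and the segment counts are all hypothetical, and a real reader would obtain these numbers from west.h5 through the westpa.analysis API. The sketch also assumes every segment stores the same number of frames.

```python
from typing import NamedTuple


class FrameRef(NamedTuple):
    """Hypothetical record locating one trajectory frame in a WESTPA run."""
    iteration: int  # WESTPA iteration number
    segment: int    # segment (walker) id within that iteration
    frame: int      # frame offset within that segment's trajectory


def build_frame_index(segments_per_iter, frames_per_segment):
    """Flatten (iteration, segment, frame) triples into one ordered list.

    Assumes every segment holds the same number of frames.
    """
    index = []
    for n_iter in sorted(segments_per_iter):
        for seg_id in range(segments_per_iter[n_iter]):
            for frame in range(frames_per_segment):
                index.append(FrameRef(n_iter, seg_id, frame))
    return index


# Made-up run: 3 iterations with 2, 3 and 1 segments, 2 frames per segment.
segments_per_iter = {1: 2, 2: 3, 3: 1}
frame_index = build_frame_index(segments_per_iter, frames_per_segment=2)

# With a flat list, n_frames is just its length, and a global frame
# number resolves to (iteration, segment, frame) by plain indexing.
n_frames = len(frame_index)
```

With such a list, a global frame number maps to its iteration, segment, and local frame with a single index lookup, which is what a _read_frame() call needs.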
June 22 - June 28
- Create the Timestep object and verify that len(u.trajectory) returns the correct frame count.
- Implement _read_frame(). Add filter to verify the sequence of trajectory frames using the pointer dataset.
- Resolve frame_index[i] and open traj_segs file with iteration level caching.
- Read coordinates via h5py hyperslab, set ts.positions.
June 29 - July 5
- Write coordinate correctness tests via direct h5py reads for multiple frames across multiple iterations.
- ts.metadata and reader completion. Populate weight, pcoord, iteration, walker, parent_id, endpoint_type on every _read_frame call.
- Implement _reopen() and close().
July 6 - July 12
- Write metadata tests by asserting against west.h5 seg_index fields directly.
- Expose any existing auxdata datasets in ts.data.
- Run RMSD and RadiusOfGyration directly on the constructed Universe.
- Submit work for Midterm Evaluation.
July 13 - July 19
- Verify RMSD and RadiusOfGyration results against Tutorial 7.5 reference outputs.
- Set up skeleton for parallelization support.
- Document completed work so far.
July 20 - July 26
- Implement parallelization by adding __getstate__ and __setstate__ for h5py handle management.
- Test that pickling the reader works without TypeError.
- Run analysis and check whether results are identical to a serial run. Test with n_workers=4.
- Implement save_to_west_h5() completely and handle overwrite cases.
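The __getstate__/__setstate__ pattern mentioned above can be sketched with the standard library alone. This is a toy illustration, not the planned WESTPAReader: the HandleOwningReader class is hypothetical, and an ordinary binary file stands in for the h5py file handle. The idea is to drop the open handle when the object is pickled and reopen it from the stored path when it is unpickled in a worker.

```python
import os
import pickle
import tempfile


class HandleOwningReader:
    """Toy reader owning an open file handle (stand-in for an h5py handle)."""

    def __init__(self, path):
        self.path = path
        self._handle = open(path, "rb")

    def read(self, offset, size):
        self._handle.seek(offset)
        return self._handle.read(size)

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None

    def __getstate__(self):
        # Copy the instance dict and drop the handle, which cannot be pickled.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state

    def __setstate__(self, state):
        # Restore plain attributes, then reopen the file from its path.
        self.__dict__.update(state)
        self._handle = open(self.path, "rb")


# Write a small scratch file to read from.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    data_path = tmp.name

reader = HandleOwningReader(data_path)
# Without __getstate__, pickling the open handle would raise TypeError.
clone = pickle.loads(pickle.dumps(reader))

# The clone has its own handle and reads the same bytes as the original.
same_bytes = clone.read(2, 4) == reader.read(2, 4)

reader.close()
clone.close()
os.remove(data_path)
```

Each unpickled copy opens its own handle, so parallel workers never share file state with the parent process.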
July 27 - August 2
- Verify w_ipa compatibility.
- Verify written auxdata is readable without errors.
August 3 - August 9
- Implement w_mdacrawl CLI tool via argparse and set up the entire pipeline from Universe construction to save_to_west_h5().
- Register as console script entry point.
- Verify the entire project works as intended with all the test cases passing.
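As a rough sketch of what an argparse-based entry point could look like, the snippet below builds a parser and parses a sample command line. Every flag name, default, and analysis choice here is a hypothetical placeholder; the actual w_mdacrawl interface has not been designed yet and may look quite different.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical w_mdacrawl command line."""
    parser = argparse.ArgumentParser(
        prog="w_mdacrawl",
        description="Run an MDAnalysis analysis over a WESTPA simulation.",
    )
    # All options below are illustrative assumptions, not a final interface.
    parser.add_argument("--west-h5", default="west.h5",
                        help="path to the WESTPA HDF5 file")
    parser.add_argument("--analysis", choices=["rmsd", "rgyr"], required=True,
                        help="which analysis to run")
    parser.add_argument("--n-workers", type=int, default=1,
                        help="number of parallel workers")
    parser.add_argument("--overwrite", action="store_true",
                        help="replace an existing auxdata dataset")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Placeholder: a real tool would construct the Universe, run the
    # analysis, and write the results back with save_to_west_h5().
    return args


args = main(["--analysis", "rmsd", "--n-workers", "4"])
```

Taking argv as a parameter keeps main() testable without a subprocess, and a console script entry point can still call it with no arguments.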
August 10 - August 16
- Create Jupyter notebook tutorial by reproducing the entire Tutorial 7.5 w_crawl workflow.
- Write documentation for all three public APIs: WESTPAParser, WESTPAReader, save_to_west_h5().
- Project Completion and Submission
- Work on future enhancements if time permits.
I will be posting updates of my progress here.