Obituary - John D. Westbrook Jr. (1957-2021)


John D. Westbrook Jr. (1957-2021), Research Professor at Rutgers University and Data & Software Architect Lead for the RCSB PDB, passed away on October 18, 2021.

He was incredibly beloved and respected by his colleagues at Rutgers and throughout the world, known for his dry wit and endless enthusiasm for thinking about all aspects of data and data management.

John had a long and highly successful career developing ontologies, tools, and infrastructure in data acquisition, validation, standardization, and mining in the structural biology and life science domains. His work established the PDBx/mmCIF data dictionary and format as the foundation of the modern Protein Data Bank (PDB) archive.

More than twenty-five years ago, while still a graduate student, John recognized the importance of a well-defined data model for ensuring delivery of high quality and reliable structural information to data users. He was the principal architect of the mmCIF data representation for biological macromolecular data. Based on a simple, context-free grammar (without column width constraints), data are presented in either key-value or tabular form. All relationships between common data items (e.g., atom and residue identifiers) are explicitly documented within the PDBx Exchange Dictionary. Use of the PDBx/mmCIF format enables software applications to evaluate and validate referential integrity within any PDB entry. A key strength of the mmCIF technology is the extensibility afforded by its rich collection of software-accessible metadata.

The current PDBx/mmCIF dictionary contains more than 6,200 definitions relating to experiments involved in macromolecular structure determination and descriptions of the structures themselves. The first implementation of this schema was used for the Nucleic Acid Database, a data resource of nucleic acid-containing X-ray crystallographic structures. Today, this dictionary underpins all data management of the PDB. Since 2014, it has served as the Master Format for the PDB archive. It also forms the basis of the Chemical Component Dictionary, which is used to maintain and distribute small molecule chemical reference data in the PDB.

John Westbrook, Jeffrey Deschamps, Judith Flippen-Anderson in 2010 (courtesy of Christine Zardecki)


In 2011, the Worldwide Protein Data Bank (wwPDB) PDBx/mmCIF Working Group was established to enable direct use of PDBx/mmCIF format files within major macromolecular crystallography software tools and to provide recommendations on format extensions required for deposition of larger macromolecule structures to the PDB. This was a key step in the evolution of the PDB archive, which enabled studies of macromolecular machines, such as the ribosome, as single PDB structures (instead of split entries with atomic coordinates distributed among different entry files). In 2019, mandatory submission of PDBx/mmCIF format files for deposition was announced (Adams et al. Acta Crystallographica D75, 451-454).


To ensure the success of the PDBx/mmCIF dictionary and format, John worked with a wide range of community experts to extend the framework to encompass descriptions of macromolecular X-ray crystallographic experiments, 3D cryo-electron microscopy experiments, NMR spectroscopy experiments, protein and nucleic acid structural features, diffraction image data, and protein production and crystallization protocols. Most recently, these efforts have been focused on developing compatible data representations for X-ray free-electron laser (XFEL) methods, and for integrative or hybrid methods (I/HM). I/HM structures, currently stored in the prototype PDB-Dev archive, presented new challenges for data exchange among rapidly evolving and heterogeneous experimental repositories. Proper management of I/HM structures in PDB-Dev also required extension of the PDBx/mmCIF data dictionary to include coarse-grained or multiscale models, which will be essential for studying macromolecular structures in situ using cryo-electron tomography and other bioimaging methods.

John Westbrook at the 2017 Congress and General Assembly of the
International Union of Crystallography in Hyderabad, India

John contributed broadly to community data standards enabling interoperation and data integration within the biology and structural biology domains. His efforts have included (i) describing the increasing molecular complexity of macromolecular structure data, (ii) representing new experimental methodologies, including I/HM techniques, and (iii) expanding the biological context required to facilitate broader integration with a spectrum of biomedical resources. John’s work has been central to connecting crystallographic and related structural data for biological macromolecules to key resources across scientific disciplines. His efforts have been described in more than 120 peer-reviewed publications, one of which has been cited more than 21,000 times according to the Web of Science (Berman et al. Nucleic Acids Research 28, 235-242). Eight of his most influential published papers have appeared in the International Tables for Crystallography.

John has also done yeoman service to the crystallographic community over many years and was recognized with the inaugural Biocuration Career Award from the International Society for Biocuration in 2016.