About the Genomics Tool
Zang’s new tool adapts a model from number theory and cryptology called “simplex encoding.” He and his colleagues used it to encode DNA sequences as mathematical objects, ultimately converting the complex genome sequence into a much simpler mathematical form. They can then compare these forms to detect bias and noise in the sequence data that conventional approaches cannot easily find.
“The DNA sequences’ complexity increases exponentially when they get longer. They are difficult to model because a typical dataset has millions of sequences from thousands of cells,” said Shengen Shawn Hu, a research scientist in Zang’s lab and the lead author of this work. “But the simplex encoding model can give an accurate estimation of sequence biases because of its beautiful mathematical property.”
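For readers curious what such an encoding can look like, here is a minimal Python sketch of one common textbook form of simplex encoding, in which each of the four DNA bases is mapped to a vertex of a regular tetrahedron. The coordinates, the function names `encode_kmer` and `dot`, and the choice to simply concatenate per-base vectors are illustrative assumptions; the article does not describe the researchers' exact formulation, which may differ.

```python
# Assumption: each base is a vertex of a regular tetrahedron centred on the
# origin. This is an illustrative choice, not taken from the published tool.
SIMPLEX = {
    "A": (1, -1, -1),
    "C": (-1, 1, -1),
    "G": (-1, -1, 1),
    "T": (1, 1, 1),
}


def encode_kmer(kmer):
    """Concatenate the simplex coordinates of every base in a short sequence."""
    vector = []
    for base in kmer.upper():
        vector.extend(SIMPLEX[base])
    return vector


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    # Every pair of distinct bases has the same dot product (-1), and every
    # base has the same dot product with itself (3), so no base is favoured.
    print(dot(SIMPLEX["A"], SIMPLEX["A"]))
    print(dot(SIMPLEX["A"], SIMPLEX["G"]))
    print(encode_kmer("ACGT"))
```

The appeal of this kind of encoding is its symmetry: the four vertices sum to zero and are all equally far apart, so a sequence becomes a short numeric vector without any one base being treated as "closer" to another.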
Tests of the tool showed it was significantly better at analyzing complex single-cell data to characterize different cell types. This is important for both basic biology research and disease diagnosis, in which doctors must detect tiny numbers of disease cells within much larger specimens, ranging from tens of thousands to millions of cells.
“The biases were not easy to find because they were tangled with real signals and hidden in the big data. It might not be a big deal if people are only going to pick the strongest signals from a large number of cells,” said Zang, who recently co-led several other single-cell genomics studies of coronary artery disease and gut development. “But when you look at single-cell data, there are no low-hanging fruits anymore. The signals are always weak on the individual cell level, and the effect of noise and biases can be catastrophic. Bias correction is often ignored but can be vital in single-cell data analysis.”
To make their new tool widely available, the researchers have created free, open-source software and posted it online.
“We hope this tool can benefit the biomedical research community in studying chromatin biology and genomics, and eventually help disease research,” Zang said. “It is always exciting to see our peers use the tools we developed to make important scientific discoveries in their own research.”
The researchers have published their findings in the scientific journal Nature Communications. The article is open access, meaning it is free to read. The team consisted of Shengen Shawn Hu, Lin Liu, Qi Li, Wenjing Ma, Michael J. Guertin, Clifford A. Meyer, Ke Deng, Tingting Zhang and Chongzhi Zang.
Zang is part of UVA’s departments of Public Health Sciences, Biochemistry and Molecular Genetics, and Biomedical Engineering. The Department of Biomedical Engineering is a collaboration of UVA’s School of Medicine and School of Engineering.
The work was supported by the National Institutes of Health, grants R35GM133712, K22CA204439 and R35GM128635; the National Science Foundation, grant 2048991; the University of Pittsburgh Center for Research Computing; UVA Cancer Center; and the NIH’s National Cancer Institute, Cancer Center Support Grant P30 CA44579.