When remote learning became an inevitable reality, Oak Professor of Biological Sciences Judy Stone was most concerned about her Evolutionary Analysis course. To her, advanced classes are where students can learn how to develop good questions, and interaction is at the heart of that process.

“I thought—I'm going to make really enticing things for them, really enticing datasets, enticing exercises, and I'm going to try to get some interaction, but also just to give them tools ... that would allow them to explore,” Stone said about revising her curriculum for the rest of the semester.

As she adapted, a colleague pointed her to a database documenting the COVID-19 virus’s own evolution as it swept around the globe. The platform, built by Nextstrain, an open-source project to harness the scientific and public health potential of pathogen genome data, brings together COVID-19 samples from across the world and uses data visualization to present the virus’s genetic epidemiology in real time. It shows the novel coronavirus’s phylogeny, which establishes the genetic relationships among the samples; maps its worldwide transmission; and breaks down the virus’s entire genome, revealing its genetic diversity and mutations. For Stone’s evolutionary biology course, the site was a perfect vehicle for scientific inquiry.

“We’re interested in natural selection,” said Stone, whose research focuses on evolution, ecology, and botany. Viruses accumulate many mutations, she explained, and natural selection is constantly eliminating some and favoring others. By looking at DNA sequences and carrying out different tests, scientists can detect a “signature of selection,” meaning “whether natural selection has been acting on those genes,” she said.

So she created an assignment that had her students take a deep dive into the data from a single day, April 7. “There are plenty of things we can learn directly by looking at the phylogenetic tree and/or the map, such as the fact that the second big wave of U.S. cases came from China via Europe,” said Stone. “But I asked my students to go beyond that, to explain how they would use the tools of population genetics to seek evidence about how natural selection may be operating on these sequences.”

Why is this important?

“Because they may be relevant for learning about modes of transmission, infection, morbidity, or vaccine design,” said Stone. “The ‘signature of selection’ won’t tell you directly what to do, but it could help focus the attention of bench scientists to figure out which leads are most worthy of pursuing.”

As they examined the data, Stone’s students raised critical questions of their own. Would we need different vaccines for different mutations? Could certain strains be more contagious? When the virus moved to colder climates, was there any evidence that it had changed to better adapt?

[Image: Genomic epidemiology of novel coronavirus, transmission map]

This map shows how COVID-19 traveled from one place to the next, spreading across the globe. (Used under a CC-BY-4.0 license from nextstrain.org)

[Image: Genomic epidemiology of novel coronavirus, phylogeny]

This is a phylogenetic tree, where each dot represents a sample and each color corresponds to a country. The purple dots, representing China, are at the root of the tree. As the virus proliferated, it formed separate lineages. (Used under a CC-BY-4.0 license from nextstrain.org)

The exercise also put the students a step ahead of the news on COVID-19, such as the April 8 New York Times story reporting that the coronavirus had entered the U.S. first from China into Washington State and then from Europe into the East Coast. “It was really funny because all of my students knew that before it was in the Times,” Stone said, “because they all looked at this [data] and they know how to read the [phylogenetic] tree.”

Laura Sokoloski ’21, an environmental science and biology double major with a creative writing minor, connected the news she had been reading with the database. “I was seeing all this news coming out about how the [virus’s] origin was probably Europe, whereas this [data] points towards China one hundred percent and then later emergence in Europe,” she said. That struck her, as did finding a canine host for the virus among the samples. “[In] all of the indexed points in the data set, all the hosts are human, except for one, which is a dog,” she said, wondering whether that was a matter of testing or a rare situation.

What was not rare, though, was COVID-19’s spread. “When you can see all of the different connections and it’s totally covered by colors, you could just see how it happens—now everything is interconnected,” Sokoloski said. “It’s just very striking.”

Jonathan Pankauski ’21, a chemistry: biochemistry major with a minor in mathematics, was also intrigued to see the virus’s movement and the relationship between the samples in different geographic locations. But as an EMT and CER member with family and friends in the medical profession, Pankauski was most curious—and worried—about COVID-19’s attachment to surfaces.

“Which of these portions of the genome are evolving super quickly? What proteins are they in?” he asked. “Is that part of the reason why so many disinfectants, which are supposed to work against coronaviruses, just don’t seem to work against COVID-19?” To explain how he would answer these questions, he pointed to the tests he had learned in class.

“It was … a practical way to apply what we’ve been learning,” Pankauski said. “I’m sure there are people out there right now … using these tests, trying to answer these questions.”