From artificial intelligence to bioinformatics, Clare Bates Congdon senses new applications for computer science

By Alicia Nemiccolo MacLeay '97

For Assistant Professor of Computer Science Clare Bates Congdon, computer science is about much more than programming languages and microchips. It is a means to many ends.

While her main areas of expertise are artificial intelligence, machine learning and data mining, her research and teaching often are far more interdisciplinary, incorporating fields as diverse as art, biology and mathematics.
One such foray is into the emerging field of bioinformatics, which uses computer technology to manage biological information. Bioinformatics may not be in your spellchecker yet, but it has broad implications for advancing our understanding of biology, genetics and medicine. Bioinformatics provides the computer science applications that allow geneticists to study the human genome and microbiologists to select HIV strains for vaccine development, for example.

Last spring Congdon teamed up with Judy Stone, Clare Boothe Luce Assistant Professor of Biology, to offer a bioinformatics course. "Bioinformatics is a unique thing for a school of this size," said Congdon. Often professors would need training just to offer it, but Stone and Congdon were more than up to speed: "Judy and I both did bioinformatics theses before anyone was using that word."

Back in the early and mid-'90s Congdon was a graduate student working in the University of Michigan's artificial intelligence lab. Her thesis compared genetic algorithms (step-by-step sequences of actions that can evolve a solution to a problem) to other machine-learning approaches to complex epidemiological problems. Specifically, Congdon used data mining (a term she avoids since it raises some biologists' hackles) to look for patterns in the genetic and biochemical characteristics of people who did and didn't have a family history of heart attacks. Her conclusion? Genetic algorithms are a superior approach.
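To make the idea of "evolving" a solution concrete, here is a minimal sketch of a genetic algorithm in Python. It is a toy illustration, not the method from Congdon's thesis: the problem (evolving a string of bits until every bit is 1), the function names and all of the parameter values are assumptions chosen for brevity.

```python
import random


def fitness(bits):
    """Score a candidate: the number of 1s it contains (higher is better)."""
    return sum(bits)


def evolve(length=20, population_size=30, generations=200,
           mutation_rate=0.02, seed=0):
    """Evolve a bit string toward all 1s and return the best one found.

    Toy problem and parameters are illustrative assumptions only.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        if fitness(best) == length:
            break  # a perfect candidate has been found
        # Elitism: carry the best candidate forward unchanged.
        next_population = [best[:]]
        while len(next_population) < population_size:
            # Selection: each parent is the fittest of three random candidates.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            # Crossover: splice the parents together at a random cut point.
            cut = rng.randrange(1, length)
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: occasionally flip a bit.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)

    return best
```

Calling `evolve()` returns the fittest bit string found. Selection, crossover and mutation are the "step-by-step" actions; no one tells the program what the answer looks like, only how to score a candidate.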

Bioinformatics wasn't even a term a decade ago, let alone its own discipline. Now, the subject is incredibly hot, says Congdon. "Bioinformatics really took off with the success of the Human Genome Project," she said. Biologists collected the necessary data, but there was too much to sift through with standard approaches. "Up till then, there was just a handful of computer scientists who thought that biology was an interesting area to apply their skills to," said Congdon.

Now biologists recognize that science is increasingly about data, and computer science students are excited to have a real application for their subject. "Biology has appeal to people," said Congdon. "It's about us, it's about our lives."

Chris Blomberg '04, a biology major and research assistant for Stone, was one of the six computer science students and six biology students who enrolled in Colby's first bioinformatics class. Like most of his fellow biology majors he had no formal computer science background, and the reverse was true for the computer science students. "There was an awful lot of learning to come together in the middle," said Congdon, but it "was a great dynamic and they really embraced it."

Blomberg had noticed in his student research that while software programs for biologists were helpful, they didn't always satisfy a researcher's needs. "I was inspired to take this class to see if there was any way I could learn to improve them," he said.

For his class project Blomberg learned how to write in the programming language Perl, which can be used to manage the data of DNA and RNA sequences. Now Blomberg wants to try to write a program for his research that would look for genetic sequences, rather than counting nucleotides and aligning sequences by hand, as he has done.
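As a rough sense of what such a program involves, the sketch below counts nucleotides and searches for a short sequence in a DNA string. It is written in Python rather than Perl, it is not Blomberg's program, and the function names and sample sequence are hypothetical.

```python
from collections import Counter


def count_nucleotides(sequence):
    """Return how many times each of A, C, G and T appears in a DNA string."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def find_motif(sequence, motif):
    """Return the 0-based start position of every match, overlaps included."""
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    sample = "ATGCGATACGCTTGA"  # made-up sequence for illustration
    print(count_nucleotides(sample))
    print(find_motif(sample, "CG"))
```

Real sequence data usually arrives in files with thousands of entries, which is where automating these steps saves the most time over working by hand.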

"The most important thing that I learned," said Blomberg, "was what computer science can actually do to make the job or research of a biologist easier."

Making genetic research easier with the help of genetic algorithms is Congdon's current research. Geneticists used to look at physical traits of species to find phylogenies, or evolutionary relationships, among scores of species, says Congdon. Now one can program genetic algorithms to look at data and reevaluate how species might be related through evolutionary history or even how individual organisms are related. Since HIV mutates fairly rapidly, phylogenetics can be used to trace its transmission by looking at DNA sequences of HIV collected from different people.

The system Congdon is developing has found some phylogenies, tree-like structures of 40 species, not found by the most commonly used phylogenetic software, Phylip. "It's not necessarily true that Phylip can't find these phylogenies but that it couldn't find them in comparable time," said Congdon, who sees promise in her approach.

Genetic algorithms have found solutions to some problems that surpass those devised by humans, says Congdon: "This is a big thing for computer scientists because it means that sometimes you should let the computer 'evolve' or 'learn' the best solution rather than trying to engineer it yourself."

It's not all hard science for Congdon, or her students, though. With a background in art (including a minor in studio art as an undergraduate at Wesleyan), she has been uniquely positioned to help students combine art and computer science into independent majors. She proudly describes how one student created an abstract world that responded to emotion, a project featured in the Museum of Art's annual senior art exhibit. And she is eager to show off another recent graduate's laser touch screen. "He had the potential to be a computer artist and show his work in galleries," Congdon said.

Congdon's field is wide and ever expanding, but whether it's in new art media or bioinformatics, taking new approaches is essential. That's true, whether it's in teaching, helping computers evolve or studying evolution.