Putting the Pieces Together: Sequencing the Blueberry Genome

Dr. Allan Brown, a researcher with N.C. State’s Plants for Human Health Institute at the N.C. Research Campus in Kannapolis, is leading a team that is sequencing the blueberry genome. The work is a major step toward understanding the genetic information of the blueberry, specifically which genes are responsible for making the health-protective natural components in the fruit. It is expected to yield new discoveries in both medical and agricultural research.

Video Full Text:

Blueberries are one of the food crops under the microscope at the N.C. State University Plants for Human Health Institute. Located at the North Carolina Research Campus in Kannapolis, the institute has led the effort to sequence the blueberry genome. Dr. Allan Brown, a molecular geneticist with the institute and a member of the department of horticultural science, served as the lead researcher.

The blueberry genome project, funded by the University of North Carolina General Administration, is one of the first major collaborative efforts of scientists at the North Carolina Research Campus. A genome contains an organism’s hereditary information, encoded in DNA. A plant’s DNA contains important clues that have the potential to help solve health problems, such as preventing or curing diseases. The sequenced genome provides a blueprint that is useful for health and medical researchers as well as plant breeders. It is expected to yield new discoveries.

The core team, led by Dr. Brown, includes bioinformatics experts Mark Burke with the David H. Murdock Research Institute, known as DHMRI; Dr. Cory Brouwer with the UNC Charlotte Bioinformatics Research Center; and Dr. Michael Wang, with the DHMRI Genomics Laboratory. The project demonstrates how on-site collaboration of researchers working across disciplines can lead to significant scientific advances. This group and their lab members have met weekly to map strategy and keep the project on track.

The DHMRI Genomics Laboratory played a central role in the sequencing process by using specialized equipment and protocols to decipher the DNA and generate raw data. Let’s take a look in the lab.

It all starts with a one-milliliter test tube containing DNA. The USDA Agricultural Research Service lab in Beltsville, Maryland, supplied the DNA from a parent plant that is related to the Northern Highbush blueberry, Vaccinium corymbosum.

The first step is to physically shear a long strand of DNA, a process called fragmentation. Researchers then construct a library of the fragmented pieces by adding an adaptor, a piece of known DNA, to the ends of each fragment. They then load millions of DNA fragments into eight channels in a flow cell. Each fragment is amplified by replication to generate a thousand copies, a strategy that improves sequencing accuracy.
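As a rough illustration of the library-construction step described above, the following Python sketch cuts a toy DNA string into pieces and attaches a known adaptor sequence to both ends of each piece. The function names, the adaptor sequence and the fixed fragment size are assumptions made for this example only. Real shearing produces fragments of varying length, and real adaptors are specific to the sequencing equipment.

```python
# Toy model of fragmentation and adaptor attachment.
# ADAPTOR and the fixed fragment size are made-up values for illustration.

ADAPTOR = "ACGT"  # hypothetical adaptor sequence, not a real one


def fragment(dna: str, size: int) -> list[str]:
    """Cut a DNA string into consecutive pieces of at most `size` bases."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [dna[i:i + size] for i in range(0, len(dna), size)]


def add_adaptors(fragments: list[str], adaptor: str = ADAPTOR) -> list[str]:
    """Attach the known adaptor sequence to both ends of every fragment."""
    return [adaptor + piece + adaptor for piece in fragments]


if __name__ == "__main__":
    library = add_adaptors(fragment("GGATCCTTAGGCA", 5))
    print(library)
```

Because the adaptor sequence is known in advance, it gives every fragment in the library the same recognizable ends, whatever the unknown DNA in the middle happens to be.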

There are two methods for sequencing. Both were used for the blueberry genome project. “Short read sequencing” generates data for fragments that consist of 300 base pairs – the building blocks of DNA. The lab equipment reads the nucleotide on a single strand and attaches the complementary nucleotides simultaneously, labeling the A’s, T’s, C’s and G’s each with a different color dye.
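The base-pairing idea behind this step can be shown in a few lines of Python. This is only a sketch: it applies the standard pairing rules (A with T, C with G) to a toy template strand. The dye colors are placeholders, since the text does not say which color marks which base, and the function names are invented for the example.

```python
# Standard base-pairing rules: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical dye colors; the source does not give the real assignments.
DYE_COLOR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def complement_strand(template: str) -> str:
    """Return the base-by-base complement of a single-stranded template."""
    return "".join(COMPLEMENT[base] for base in template.upper())


def read_by_color(template: str) -> list[tuple[str, str]]:
    """Pair each attached complementary base with its (assumed) dye color."""
    return [(base, DYE_COLOR[base]) for base in complement_strand(template)]


if __name__ == "__main__":
    print(complement_strand("ATCG"))
    print(read_by_color("AT"))
```

Note that the sketch returns the complement in the same left-to-right order as the template. It does not reverse the strand, which tools that work with real sequence data often do.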

“Long read sequencing” uses fragments that are 800 base pairs in length. The fragments are attached to beads placed in an oil bubble. More than a million copies are made on the bead. The double strand is denatured and nucleotides are added one at a time. The result is four layers of information that are then sandwiched together. The blueberry genome has about 500 million base pairs coding for more than 25,000 genes.
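To give a sense of scale, the short calculation below estimates how many fragments of each length it would take to cover a genome of about 500 million base pairs. It uses the common average-coverage relation (coverage = number of reads × read length ÷ genome size). The coverage depths passed in are assumptions chosen for illustration; the text does not state what depth the project used.

```python
import math

GENOME_SIZE_BP = 500_000_000  # approximate blueberry genome size from the text


def reads_for_coverage(read_length_bp: int,
                       coverage: float = 1.0,
                       genome_size_bp: int = GENOME_SIZE_BP) -> int:
    """Reads needed so that reads * read_length / genome_size >= coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # 300 bp short reads and 800 bp long reads, as described in the text.
    # The tenfold depth is an assumed value, not a figure from the project.
    for length in (300, 800):
        print(length, reads_for_coverage(length), reads_for_coverage(length, 10))
```

Even at a single pass, the 300-base-pair fragments number in the millions, which is why the fragments are loaded and read in parallel rather than one at a time.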

The raw data generated by the genomics lab is transferred to the bioinformaticists for sequence assembly using specialized computer software.

Brown’s team, which used a hybrid approach, likens the sequence assembly process to putting a puzzle together. The longer fragments were used to build a ‘scaffold’ by overlapping matched segments, similar to putting together a puzzle frame. The shorter fragments were then used to fill in the distinguishable objects of the puzzle, a process referred to as “backfilling.” The final puzzle pieces, those of similar color or pattern, are akin to the repetitive DNA. They may seem like filler, but they are important to get a complete picture.
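The idea of joining fragments by their matching ends can be sketched with a simple greedy merge. This is a toy illustration of overlap-based assembly in general, not the software or the hybrid method the team used. The function names and the minimum overlap of three bases are assumptions, and the sketch ignores sequencing errors and repetitive DNA, which are what make real assembly difficult.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two pieces that share the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left that overlaps enough to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


if __name__ == "__main__":
    pieces = ["ATGGCGT", "GCGTACC", "TACCTGA"]
    print(greedy_assemble(pieces))
```

In the example, each piece shares four bases with the next, so the three pieces merge into a single longer sequence. Pieces that look alike, like puzzle pieces of similar color, can overlap in more than one place, which is why repetitive DNA is the hardest part to place.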

Researchers expect the sequenced blueberry genome to boost the North Carolina blueberry industry, which is valued at $58.2 million annually. North Carolina ranks sixth in the nation in blueberry production. Future improved varieties – developed with the help of the recently sequenced blueberry genome – have the potential to bolster yield and revenue for North Carolina farmers. Breeding efforts will be more efficient with the availability of the sequenced genome and the newly identified molecular markers.

In addition to the economic value, it’s the berry’s value to health that has scientists at the North Carolina Research Campus excited about publishing the first blueberry genome sequence. Blueberries are rich in health-promoting phytonutrients that reduce the symptoms of several chronic diseases. For example, Dr. Mary Ann Lila, director of the N.C. State Plants for Human Health Institute, studies bioactive compounds in many fruits and vegetables, including blueberries. The sequenced genome will help her home in on the health-protective properties of blueberries.

The sequencing of the genome has revealed helpful information for several areas of research. One unexpected finding was the genetic similarity between blueberry and grape. Both fruits are revered for their anthocyanin content. Anthocyanins, which Dr. Lila studies for their health-protective properties, are the pigments responsible for the red, blue and purple color of the fruit.

Another blueberry breakthrough resulting from the sequence will help plant breeders. The team at the North Carolina Research Campus has identified more than 20,000 molecular markers that will be added to the genetic linkage map for blueberry. Molecular markers are short sequences of DNA that serve as “road signs,” or reference points, along the genetic linkage map.

The science also helps cultivate the next generation of scientists. Students at Davidson College in Davidson, North Carolina, analyzed the data in a biology course under the direction of Dr. Malcolm Campbell. At the end of the semester, the students presented their findings to the lead scientists from the Research Campus. A local high school science teacher, April Baucom, spent the summer interning with Dr. Brown. Her experiences in the lab and in the blueberry field will help her share the world of scientific discovery with her students.

Sequencing the blueberry genome is a major step toward understanding the genetic information of the blueberry. It is expected to yield new discoveries in both medical and agricultural research. The results of the sequencing and annotation are accessible to scientists and plant breeders worldwide through the website www.vaccinium.org.