[CCICADA-announce] DIMACS/CCICADA Interdisciplinary Seminar Series - Tuesday, March 25, 2014

Linda Casals lindac at
Mon Mar 17 11:52:55 EDT 2014


DIMACS/CCICADA Interdisciplinary Seminar Series Presents

Title: Constructing and Clustering of Similarity Graphs from
        Large-Scale Metagenomic Collections

Speaker: Jaroslaw Zola, Rutgers Discovery Informatics Institute 

Date: Tuesday, March 25, 2014 11:00am - 12:00pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University                 
             Busch Campus, Piscataway, NJ


Metagenomics is the study of a population of organisms by fragmenting
and sequencing their collective DNA. With the advent of
next-generation high-throughput DNA sequencing, large-scale
metagenomic studies have become routine, producing data collections
with millions of DNA reads. Metagenomic clustering is a strategy to
organize such data collections by identifying the taxonomic units
from which the reads were obtained.

In this talk, I will present a parallel graph-based approach to
metagenomic clustering. The method exploits sketching techniques to
construct large-scale similarity graphs while alleviating the
prohibitive cost of all-pairs comparisons. It also employs carefully
designed dynamic load balancing techniques to scale to parallel
machines with thousands of cores. I will then show how the metagenomic
clustering problem can be posed as that of identifying dense
sub-graphs, and will describe a MapReduce realization of the
corresponding heuristic.
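
As a rough illustration of how sketching can avoid all-pairs comparisons
when building a similarity graph, here is a minimal single-machine sketch
in Python. The abstract does not say which sketching technique the talk
uses; this example assumes MinHash over k-mers with banding to select
candidate pairs. All function names and parameter values (k, num_hashes,
bands, threshold) are hypothetical and chosen only for illustration, and
the parallel load balancing and MapReduce parts are not shown.

```python
import hashlib
from itertools import combinations


def kmers(read, k=4):
    """Return the set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def _hash(kmer, seed):
    """Deterministic 64-bit hash of a k-mer, salted by an integer seed."""
    digest = hashlib.blake2b(
        kmer.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def sketch(read, k=4, num_hashes=64):
    """MinHash sketch: the minimum hash of the read's k-mers per seed.

    Assumes the read has at least k characters.
    """
    ks = kmers(read, k)
    return tuple(min(_hash(x, s) for x in ks) for s in range(num_hashes))


def estimate_jaccard(a, b):
    """Fraction of sketch positions that agree (estimates Jaccard similarity)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_graph(reads, threshold=0.5, k=4, num_hashes=64, bands=16):
    """Return {(i, j): similarity} for read pairs at or above the threshold.

    Only reads that share at least one sketch band are compared, so the
    full set of n*(n-1)/2 pairs is never enumerated.
    """
    sketches = [sketch(r, k, num_hashes) for r in reads]
    rows = num_hashes // bands

    # Candidate generation: bucket reads by each band of their sketch.
    candidates = set()
    for b in range(bands):
        buckets = {}
        for idx, s in enumerate(sketches):
            buckets.setdefault(s[b * rows:(b + 1) * rows], []).append(idx)
        for members in buckets.values():
            candidates.update(combinations(members, 2))

    # Verification: keep candidate pairs whose estimated similarity is high.
    edges = {}
    for i, j in candidates:
        sim = estimate_jaccard(sketches[i], sketches[j])
        if sim >= threshold:
            edges[(i, j)] = sim
    return edges


if __name__ == "__main__":
    reads = ["ACGTACGTACGTACGT", "ACGTACGTACGTACGT", "TTTTGGGGCCCCAAAA"]
    print(similarity_graph(reads))
```

The resulting edge dictionary could then serve as input to a clustering
step such as dense sub-graph identification; that step is not sketched
here because the abstract does not describe the heuristic.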

DIMACS/CCICADA Interdisciplinary Series 