Crosspost from microBEnet: Collection of papers on "The Science of Science Communication"
Wednesday, October 1, 2014

Crossposting this from microBEnet.

Just got pointed to this by Sharon Strauss, the chair of the Evolution and Ecology department here at UC Davis: The Science of Science Communication II Sackler Colloquium. This is a collection of papers from a colloquium held in September 2013. Slides and videos of the talks are available online. The papers and links (copied from the PNAS site) are listed below. Many of the papers here are relevant to work done at microBEnet and are also likely of general interest to many:

Today on "Express Yourself" Teen Radio - me - being interviewed about #Microbes & #OpenScience
Tuesday, September 30, 2014

Just a little self-centered plug. I was interviewed recently for the Express Yourself! teen radio show on VoiceAmerica™. The teens interviewing me included Henna Hundal, who worked in my lab this summer as an intern on our "Seagrass Microbiome" project. See a post from Cassie Ettinger about Henna's work. Also see:

It was a fun interview and I love the idea of teens doing a science radio show. From their site:
Science is everywhere. From the stars that light the night sky to the intricate patterns on a butterfly’s wings, science is at play in all parts of our world and is continually making our lives so great. Hosts Henna Hundal and Courtney Chung discuss how science shapes our perspective on life from cell phones to lawn mowers, from cures for diseases to prosthetic limbs. Global Youth Talk reporter, Ryan Sim, talks about science careers in the United Nations, and how this international community is looking at science innovation to create solutions for the next generation. Special guest Dr. Jonathan Eisen, a Full Professor at the University of California, Davis, with appointments in the UC Davis Genome Center, the School of Medicine, and the College of Biological Sciences focuses on communities of microbes and how they provide new functions - to each other or to a host. Dr. Eisen is entertaining with his study systems of boiling acid pools, surface ocean waters, agents of many diseases, and the microbial ecosystems in and on plants and animals. In Health with Henna, Henna Hundal reports on how we can prevent the negative effects of prolonged sitting. It’s important to take those “stretch breaks” every hour. Whether it’s writing scientific articles, thoughtful science reporting, or even talking about science on the radio, integrating humanities with science is key to reaching a mass audience.
So - I recommend everyone listen: 12 noon Pacific Time on the VoiceAmerica Kids Channel. Express Yourself! | VoiceAmerica™






Personalized Medicine World Conference 2015: 55 speakers 7 of which are women #YAMMM #StemWomen
Monday, September 29, 2014

Well, umm, Ralph Snyderman, despite the email invitation I will not be attending PMWC 2015 Silicon Valley. Why not? Well, how about the fact that you have 55 speakers listed, only 7 of whom are women?






Previous years' meetings are not much better. For example, for the 2014 meeting in Silicon Valley the Track 1 session (which they call the premier session or something like that) has a ratio of 52:5 Male:Female.


#YAMMM Alert: Drug Discovery and Therapy World Congress, a meeting made for @realDonaldTrump & other men
Wednesday, September 24, 2014

Note - see update at bottom of post

Elizabeth Bik sent me a link to this meeting: DRUG DISCOVERY & THERAPY WORLD CONGRESS 2015 with a comment about the ratio of males to females in the keynote speakers. And it is painful. Of the plenary and keynote speakers, 15 are male and 1 is female. Below I show pics of the plenary and keynote speakers:

Plenary and Keynote Speakers at Drug Discovery and Therapy World Congress

Female Plenary and Keynote Speakers at Drug Discovery and Therapy World Congress
Two bonus people who could have been giving keynote talks but who actually are not.

The gender bias at this meeting puts into perspective the push by the NIH to get drug researchers to include more female subjects in their studies. See, for example, Why Are All the Lab Rats Boys? NIH Tells Drug Researchers to Stop Being Sexist Pigs. Here is a thought: maybe we can get some of these speakers to cancel speaking at the meeting, and also maybe we can get nobody to attend the meeting. Sigh. Yet another mostly male meeting. Also known as "YAMMM".

--------------------
UPDATE October 14, 2014.

Well, this is one of the strangest and lamest things I have seen associated with a conference in a while.  Elizabeth Bik just emailed me to show me an invite she received to the "Global Biotechnology Congress 2015."  And here is the bizarre thing.  It is at the same time as the Drug Discovery meeting discussed here.  Same place.  Same speakers.  It is apparently the same meeting with a new name.


Same bad gender ratio of course too.

Did they do this to avoid people discovering my post about the awful gender ratio? I don't know, but it seems like it might be so. What a joke. Well, I can guarantee people will associate this meeting name with the previous one.


Triclosan in toothpaste: potential risks are not a "rumor" as arrogant Colgate official argues, but are something to worry about
Tuesday, September 23, 2014

Triclosan in my toothpaste (and maybe yours too)
I was reading some posts of a friend and went down a bit of a rabbit hole that led me to a place that did not make me happy.

First I saw a post about some issues with Crest toothpastes containing polyethylene: Dentist calls Crest toothpaste dangerous; Now P&G changing ingredients. This seemed a bit disturbing. But then I saw a "Related Link": Shoppers Ditching Colgate Total Amid Triclosan Fears. And I thought - holy cra*## - really? I had no idea triclosan was in toothpaste. And why did I react strongly? Well, triclosan, which is an antimicrobial agent, though it has its potential benefits, has some potential risks associated with its antibacterial activity (see also this discussion from the EU). What are these possible risks? Risks like increasing the frequency and spread of antimicrobial resistance. And risks like messing with microbial ecosystems.

And due to these potential risks, I have been blogging and writing and complaining about the use of triclosan in various building materials for some time now.  For example

And also see:

You see, I thought, for reasons that are unclear to me right now, that the main issue with agents like triclosan was their use in kitchen counters and clothing and building materials.  Well, it never even occurred to me that it would be in oral care products and thus purposefully introduced into the human body.

So I decided to check to see if my toothpaste had any in it.  And, well, $*##.  It did.

Well, that is disturbing. So I decided to Google around to see what else there was out there on triclosan in toothpaste. And I discovered this gem from Colgate in response to the news story I mentioned above: Colgate officials have responded to such concerns by saying they think it is perfectly safe. The piece is by Patricia Verduin, PhD, Head of Colgate-Palmolive Research & Development.

Here is a quote from that "article":
We all know that a rumor travels half-way around the world before the truth even has a chance to be heard.  But we want the truth to have a chance to catch up. We encourage consumers to look at the facts.
And here is another.
I know the science and I know how it works.  It is the only toothpaste I use.
What a condescending, arrogant response. I am looking at facts. As I presume are others. And what we see does not make us happy. Just as we as a society are freaking out (justifiably) about overuse of traditional antibiotics (I use the term traditional antibiotics here to refer to things commonly called antibiotics), we should also be worried - probably really worried - about overuse of agents like triclosan. Don't let the "biocidal" or "antimicrobial" or "antiseptic" terminology fool you. If one of the major effects of a chemical is to kill microbes, it is something to worry about. See, for example, Triclosan Promotes Staphylococcus aureus Nasal Colonization, which shows some evidence that these worries are not just rumors that travel around the world. Here is a quote from that article:
These findings are significant because S. aureus colonization is a known risk factor for the development of several types of infections. Our data demonstrate the unintended consequences of unregulated triclosan use and contribute to the growing body of research demonstrating inadvertent effects of triclosan on the environment and human health.
In the end, I would like to say - shame on Colgate and Procter & Gamble. Yes, triclosan in their toothpaste may lower the risk for some oral health problems. But consider traditional antibiotics as an analogy. Yes, they can lower the risk of dying from an infection and are very important tools for all sorts of cases. But at the same time, even with their benefits, we as a society are looking to reduce their use as much as possible and to reserve them for cases where they are truly needed. Prophylactic use of traditional antibiotics is now generally frowned upon. Similarly, prophylactic use of triclosan in toothpaste, even with some health benefits, has way way too many potential, unknown, health risks to be continued. This notion is not a rumor. It is not "overselling the microbiome". It is simply following the precautionary principle until we know more. It is certainly much more "the truth" right now than idiotic statements like "I know the science and I know how it works."

Story Behind the Paper: Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads (by Rogan Carr and Elhanan Borenstein)

Here is another post in my "Story Behind the Paper" series where I ask authors of open access papers to tell the story behind their paper.  This one comes from Rogan Carr and Elhanan Borenstein.  Note - this was crossposted at microBEnet.  If anyone out there has an open access paper for which you want to tell the story -- let me know.


We’d like to first thank Jon for the opportunity to discuss our work in this forum. We recently published a study investigating direct functional annotation of short metagenomic reads that stemmed from protocol development for our lab. Jon invited us to write a blog post on the subject, and we thought it would be a great venue to discuss some practical applications of our work and to share with the research community the motivation for our study and how it came about.

Our lab, the Borenstein Lab at the University of Washington, is broadly interested in metabolic modeling of the human microbiome (see, for example, our Metagenomic Systems Biology approach) and in the development of novel computational methods for analyzing functional metagenomic data (see, for example, Metagenomic Deconvolution). In this capacity, we often perform large-scale analysis of publicly available metagenomic datasets as well as collaborate with experimental labs to analyze new metagenomic datasets, and accordingly we have developed extensive expertise in performing functional, community-level annotation of metagenomic samples. We focused primarily on protocols that derive functional profiles directly from short sequencing reads (e.g., by mapping the short reads to a collection of annotated genes), as such protocols provide gene abundance profiles that are relatively unbiased by species abundance in the sample or by the availability of closely-related reference genomes. Such functional annotation protocols are extremely common in the literature and are essential when approaching metagenomics from a gene-centric point of view, where the goal is to describe the community as a whole.

However, when we began to design our in-house annotation pipeline, we pored over the literature and realized that each research group and each metagenomic study applied a slightly different approach to functional annotation. When we implemented and evaluated these methods in the lab, we also discovered that the functional profiles obtained by the various methods often differ significantly. When we discussed these findings with colleagues, some further expressed doubt that such short sequencing reads even contained enough information to map back unambiguously to the correct function. Perhaps the whole approach was wrong!

We therefore set out to develop a set of ‘best practices’ for our lab for metagenomic sequence annotation and to prove (or disprove) quantitatively that such direct functional annotation of short reads provides a valid functional representation of the sample. We specifically decided to pursue a large-scale study, performed as rigorously as possible, taking into account both the phylogeny of the microbes in the sample and the phylogenetic coverage of the database, as well as several technical aspects of sequencing like base-calling error and read length. We have found this evaluation approach and the results we obtained quite useful for designing our lab protocols, and thought it would be helpful to share them with the wider metagenomics and microbiome research community. The result is our recent paper in PLoS One, Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads.

The performance of BLAST-based annotation of short reads across the bacterial and archaeal tree of life. The phylogenetic tree was obtained from Ciccarelli et al. Colored rings represent the recall for identifying reads originating from a KO gene using the top gene protocol. The 4 rings correspond to varying levels of database coverage. Specifically, the innermost ring illustrates the recall obtained when the strain from which the reads originated is included in the database, while the other 3 rings, respectively, correspond to cases where only genomes from the same species, genus, or more remote taxonomic relationships are present in the database. Entries where no data were available (for example, when the strain from which the reads originated was the only member of its species) are shaded gray. For one genome in each phylum, denoted by a black dot at the branch tip, every possible 101-bp read was generated for this analysis. For the remaining genomes, every 10th possible read was used. Blue bars represent the fraction of the genome's peptide genes associated with a KO; for reference, the values are shown for E. coli, B. thetaiotaomicron, and S. pneumoniae. Figure and text adapted from: Carr R, Borenstein E (2014) Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads. PLoS ONE 9(8): e105776. doi:10.1371/journal.pone.0105776. See the manuscript for full details.


To perform a rigorous study of functional annotation, we needed a set of reads whose true annotations were known (a “ground truth”). In other words, we had to know the exact locus and the exact genome from which each sequencing read originated and the functional classification associated with this locus. We further wanted to have complete control over technical sources of error. To accomplish this, we chose to implement a simulation scheme, deriving a huge collection of sequence reads from fully sequenced, well annotated, and curated genomes. This scheme allowed us to have complete information about the origin of each read and allowed us to simulate various technical factors we were interested in. Moreover, simulating sequencing reads allowed us to systematically eliminate variations in annotation performance due to technological or biological effects that would typically be convoluted in an experimental setup. For a set of curated genomes, we settled on the KEGG database, as it contained a large collection of consistently functionally curated microbial genomes and it has been widely used in metagenomics for sample annotation. The KEGG hierarchy of KEGG Orthology groups (KOs), Modules, and Pathways could then serve as a common basis for comparative analysis. To control for phylogenetic bias in our results, we sampled broadly across 23 phyla and 89 genera in the bacterial and archaeal tree of life, using a randomly selected strain in KEGG for each tip of the tree from Ciccarelli et al. From each of the selected 170 strains, we generated either *every* possible contiguous sequence of a given length or (in some cases) every 10th contiguous sequence, using a sliding window approach. We additionally introduced various models to simulate sequencing errors. This large collection of reads (totaling ~16Gb) was then aligned to the KEGG genes database using a translated BLAST mapping.
To control for phylogenetic coverage of the database (the phylogenetic relationship of the database to the sequence being aligned) we also simulated mapping to many partial collections of genomes. We further used four common protocols from the literature to convert the obtained BLAST alignments to functional annotations. Comparing the resulting annotation of each read to the annotation of the gene from which it originated allowed us to systematically evaluate the accuracy of this annotation approach and to examine the effect of various factors, including read length, sequencing error, and phylogeny.
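As a rough illustration of the sliding window idea described above (not the authors' actual pipeline), here is a minimal Python sketch that generates every possible contiguous read of a fixed length from a sequence, or every Nth one. The function name, the toy sequence, and the choice to treat the genome as linear rather than circular are all assumptions made for this example; the 101-bp read length and the "every 10th read" setting mentioned in the post would simply be `read_length=101` and `step=10`.

```python
def sliding_window_reads(genome, read_length=101, step=1):
    """Yield (start, read) for each contiguous window of `read_length`.

    `step=1` gives every possible read; `step=10` gives every 10th read.
    Assumption: the genome is treated as a linear string (no wrap-around).
    """
    for start in range(0, len(genome) - read_length + 1, step):
        yield start, genome[start:start + read_length]


# Hypothetical toy sequence, far shorter than any real genome.
toy_genome = "ACGTACGTAC"

# Every possible 4-bp read, and every 3rd 4-bp read.
all_reads = list(sliding_window_reads(toy_genome, read_length=4, step=1))
sparse_reads = list(sliding_window_reads(toy_genome, read_length=4, step=3))
```

Because each read carries its start position, the true origin of every simulated read is known, which is the "ground truth" property the simulation relies on.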

First and foremost, we confirmed that direct annotation of short reads indeed provides an overall accurate functional description of both individual reads and the sample as a whole. In other words, short reads appear to contain enough information to identify the functional annotation of the gene they originated from (although not necessarily the specific taxa of origin). Functions of individual reads were identified with high precision and recall, yet the recall was found to be clade dependent. As expected, recall and precision decreased with increasing phylogenetic distance to the reference database, but generally, having a representative of the genus in the reference database was sufficient to achieve a relatively high accuracy. We also found variability in the accuracy of identifying individual KOs, with KOs that are more variable in length or in copy number having lower recall. Our paper includes an abundance of data on these results, a detailed characterization of the mapping accuracy across different clades, and a description of the impact of additional properties (e.g., read length, sequencing error, etc.).
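For readers less familiar with precision and recall in this setting, the sketch below shows one plausible way to compute them per read when the true KO of each simulated read is known. The definitions used here (precision = correct calls / all calls made; recall = correct calls / reads that truly come from a KO gene) are common ones and are an assumption for this example; the manuscript should be consulted for the exact definitions the authors used. The KO identifiers are made-up placeholders.

```python
def precision_recall(true_kos, predicted_kos):
    """Compute read-level precision and recall.

    `None` in `true_kos` means the read does not come from a KO gene;
    `None` in `predicted_kos` means the protocol made no call for that read.
    """
    pairs = list(zip(true_kos, predicted_kos))
    correct = sum(1 for t, p in pairs if p is not None and p == t)
    predicted = sum(1 for _, p in pairs if p is not None)
    annotatable = sum(1 for t, _ in pairs if t is not None)
    precision = correct / predicted if predicted else 0.0
    recall = correct / annotatable if annotatable else 0.0
    return precision, recall


# Hypothetical ground truth and predictions for five simulated reads.
true_kos = ["K00001", "K00001", "K00002", None, "K00003"]
predicted_kos = ["K00001", None, "K00002", "K00004", "K00001"]

precision, recall = precision_recall(true_kos, predicted_kos)
```

In this toy case two of the four calls are correct and two of the four KO-derived reads are recovered, so both values come out to 0.5.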

A principal component analysis of the pathway abundance profiles obtained for 15 HMP samples and by four different annotation protocols. HMP samples are numbered from 1 to 15 according to the list that appears in the Methods section of the manuscript. The different protocols are represented by color and shape. Note that two outlier protocols for sample 14 are not shown but were included in the PCA calculation. Figure and text adapted from: Carr R, Borenstein E (2014) Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads. PLoS ONE 9(8): e105776. doi:10.1371/journal.pone.0105776. See the manuscript for full details.
Importantly, while the obtained functional annotations are in general representative of the true content of the sample, the exact protocol used to analyze the BLAST alignments and to assign functional annotation to each read could still dramatically affect the obtained profile. For example, in analyzing stool samples from the Human Microbiome Project, we found that each protocol left a consistent “fingerprint” on the resulting profile and that the variation introduced by the different protocols was on the same order of magnitude as biological variation across samples. Differences in annotation protocols are thus analogous to batch effects from variation in experimental procedures and should be carefully taken into consideration when designing the bioinformatic pipeline for a study.

Generally, however, we found that assigning each read the annotation of the top E-value hit (the ‘top gene’ protocol) had the highest precision for identifying the function from a sequencing read, and only slightly lower recall than methods enriching for known annotations (such as the commonly used ‘top 20 KOs’ protocol). Given our lab interests, this finding led us to adopt the ‘top gene’ protocol for functionally annotating metagenomic samples. Specifically, our work often requires high precision for annotating individual reads for model reconstruction (e.g., utilizing the presence and absence of individual genes) and the most accurate functional abundance profile for statistical method development. If your lab has similar interests, we would recommend this approach for your annotation pipelines. If, however, you have different or more specific needs, we encourage you to make use of the datasets we have published along with our paper to help you design your own solution. We would also be very happy to discuss such issues further with labs that are considering various approaches for functional annotation, to assess some of the factors that can impact downstream analyses, or to assist in such functional annotation efforts.
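To make the ‘top gene’ idea concrete, here is a small Python sketch of how such a protocol might look: for each read, keep the alignment with the lowest E-value and assign the read whatever KO that gene carries. This is an illustration under stated assumptions, not the authors' code. The hit tuples, gene names, KO identifiers, and the tie-breaking rule (first hit seen wins) are all hypothetical.

```python
def top_gene_annotation(hits, gene_to_ko):
    """Assign each read the KO of its single best (lowest E-value) hit.

    `hits` is an iterable of (read_id, gene_id, evalue) tuples.
    `gene_to_ko` maps gene_id -> KO; genes without a KO yield None.
    Assumption: on exact E-value ties, the first hit encountered is kept.
    """
    best = {}
    for read_id, gene_id, evalue in hits:
        if read_id not in best or evalue < best[read_id][1]:
            best[read_id] = (gene_id, evalue)
    return {read_id: gene_to_ko.get(gene_id)
            for read_id, (gene_id, _) in best.items()}


# Hypothetical alignments for two reads.
hits = [
    ("read1", "geneA", 1e-20),
    ("read1", "geneB", 1e-5),
    ("read2", "geneC", 1e-3),
    ("read2", "geneB", 1e-10),
]
# Hypothetical gene-to-KO table; geneC has no KO assignment.
gene_to_ko = {"geneA": "K00001", "geneB": "K00002"}

annotations = top_gene_annotation(hits, gene_to_ko)
```

Note that a read whose best hit is a gene without a KO ends up unannotated here, even if a weaker hit has one. That trade-off is one reason a protocol like this can favor precision over recall.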

Notes from 2007 for a blog post I should have written: How many microbial cells in humans?

Well sometimes you just screw up.  In 2007 I attended some planning meetings for the human microbiome project (see for example A human microbiome program? a post I wrote from one of the meetings in 2007).  And at those meetings I kept asking one question.  Where did this "fact" everyone kept citing that there were "10 times as many microbial cells in the human body as there were human cells" come from?  I could not find a citation.  So I started taking some notes for a blog post about this.  Here are those notes:

Wikipedia link
Online textbook here
Sears paper from Arizona site. She discusses only gut bacteria and cites a Gordon paper from 2001.
Seems to not be from this paper but really from here:
This in turn is not from there but apparently here

But, alas, I got distracted. And I did keep asking people - where did this "fact" come from? And most people just brushed me off (and probably thought I was a bit of a crank ...). And nobody had a good answer. Well, I was both pleased and sad (because I should have done it) to see Is your body mostly microbes? Actually, we have no idea by Peter Andrey Smith in the Boston Globe, who addresses this issue in much much more detail than I ever could have done. Everyone who works on the human microbiome and who is interested in "facts" and how they can get misreported should read this. As a side note, Smith reports in the article that this is even given as a fact in TED talks. Sadly mine was one of them. This is despite the fact (yes, the fact) that I swore to myself that I would NOT say that in my talk since I have been such a crank about this issue at meetings. OMG - such truisms are so pervasive that even someone who actively questioned the truism still used it. Uggh. Oh well. I really should have finished that draft post.

Total Pageviews

Popular Posts

الأربعاء، 1 أكتوبر 2014

Crosspost from microBEnet: Collection of papers on "The Science of Science Communication"

Crossposting this from microBEnet 

 Just got pointed to this by Sharon Strauss, the chair of the Evolution and Ecology department here at UC Davis: The Science of Science Communication II Sackler Colloquium.  This is a collection of papers from a colloquium held in Septment 2013.  Slides and videos of the talks are available online. The papers and links (copied from the PNAS site) are listed below.  There are many papers here of relevance to work done at microBEnet and are also likely of general interest to many:

الثلاثاء، 30 سبتمبر 2014

Today on "Express Yourself" Teen Radio - me - being interviewed about #Microbes & #OpenScience

Just a little self-centered plug.  I was interviewed recently for Express Yourself! | VoiceAmerica™ teen radio show.  The teens interviewing me included Henna Hundal who worked in my lab this summer as an intern on our "Seagrass Microbiome" project. See a post from Cassie Ettinger about Henna's work.  Also see:

It was a fun interview and I love the idea of teens doing a science radio show.   From their site
Science is everywhere. From the stars that light the night sky to the intricate patterns on a butterfly’s wings, science is at play in all parts of our world and is continually making our lives so great. Hosts Henna Hundal and Courtney Chung discuss how science shapes our perspective on life from cell phones to lawn mowers, from cures for diseases to prosthetic limbs. Global Youth Talk reporter, Ryan Sim, talks about science careers in the United Nations, and how this international community is looking at science innovation to create solutions for the next generation. Special guest Dr. Jonathan Eisen,a Full Professor at the University of California, Davis, with appointments in the UC Davis Genome Center, the School of Medicine, and the College of Biological Sciences focuses on communities of microbes and how they provide new functions - to each other or to a host. Dr. Eisen is entertaining with his study systems of boiling acid pools, surface ocean waters, agents of many diseases, and the microbial ecosystems in and on plants and animals. In Health with Henna, Henna Hundal reports on how we can prevent the negative effects of prolonged sitting. It’s important to take those “stretch breaks” every hour. Whether it’s writing scientific articles, thoughtful science reporting, or even talking about science on the radio, integrating humanities with science is key to reaching a mass audience.
So - I recommend everyone listen ...12 noon Pacific Time on VoiceAmerica Kids Channel. Express Yourself! | VoiceAmerica™


.




الاثنين، 29 سبتمبر 2014

Personalized Medicine World Conference 2015: 55 speakers 7 of which are women #YAMMM #StemWomen

Well, umm, Ralph Snyderman, despite the email invitation I will not be attending PMWC 2015 Silicon Valley.  Why not?  Well how about the fact that you have 55 speakers listed, only 7 of which are women.






Previous year's meetings are not much better.  For example, for the 2014 Meeting in Silicon Valley the Track 1 session (which they call the premier session or something like that) has a ratio of 52:5 Male:Female.


الأربعاء، 24 سبتمبر 2014

#YAMMM Alert: Drug Discovery and Therapy World Congress, a meeting made for @realDonaldTrump & other men

Note - see update at bottom of post

Elizabeth Bik sent me a link to this meeintg: DRUG DISCOVERY & THERAPY WORLD CONGRESS 2015 with a comment about the ratio of males to females in the keynote speakers.  And it is painful.  Of the plenary and keynote speakers, 15 are male and 1 is female.  Below I show pics of the plenary and keynote speakers:

Plenary and Keynote Speakers at Drug Discovery and Thearpy World Congress

Female Plenary and Keynote Speakers at Drug Discovery and Thearpy World Congress
Two bonus people who could have been giving keynote talks but who actually are not.

The gender bias at this meeting puts into perspective the push by the NIH to get drug researchers to inlcude more female subjects in their studies.  See for example, Why Are All the Lab Rats Boys? NIH Tells Drug Researchers to Stop Being Sexist Pigs.  Here is a thought, maybe we can get some of these speakers to cancel speaking at the meeting and also maybe we can get nobody to attent the meeting.  Sigh.  Yet another mostly male meeting.  Also known as "YAMMM".

--------------------
UPDATE October 14, 2014.

Well, this is one of the strangest and lamest things I have seen associated with a conference in a while.  Elizabeth Bik just emailed me to show me an invite she received to the "Global Biotechnology Congress 2015."  And here is the bizarre thing.  It is at the same time as the Drug Discovery meeting discussed here.  Same place.  Same speakers.  It is apparently the same meeting with a new name.


Same bad gender ratio of course too.

Did they do this to avoid people discovering my post about the awful gender ratio?  I don't know but seems like it might be so.  What a joke.  Well, I can guarantee people will associated this meeting name with the previous one.  


الثلاثاء، 23 سبتمبر 2014

Triclosan in toothpaste: potential risks are not a "rumor" as arrogant Colgate official argues, but are something to worry about

Triclosan in my toothpaste (and maybe yours too)
I was reading some posts of a friend and went down a bit of a rabbit hole that led me to a place that did not make me happy.

First I saw a post about some issues with Crest Toothpastes containing polyethylene: Dentist calls Crest toothpaste dangerous; Now P&G changing ingredients.  This seemed a bit disturbing.  But then I saw a "Related Link": Shoppers Ditching Colgate Total Amid Triclosan Fears.  And I thought - holy cra*## - really? I had no idea triclosan was in toothpaste.  And why did I react strongly?  Well triclosan, which is antimicrobial agent, though it has it's potential benefits, has some potential risks associated with it's antibacterial activity (see also this discussion from the EU).  What are these possible risks?  Risks like increasing the frequency and spread of antimicrobial resistance.  And risks like messing with microbial ecosystems.

And due to these potential risks, I have been blogging and writing and complaining about the use of triclosan in various building materials for some time now.  For example

And also see:

You see, I thought, for reasons that are unclear to me right now, that the main issue with agents like triclosan was their use in kitchen counters and clothing and building materials.  Well, it never even occurred to me that it would be in oral care products and thus purposefully introduced into the human body.

So I decided to check to see if my toothpaste had any in it.  And, well, $*##.  It did.

Well, that is disturbing.  So I decided to Google around to see what else there was out there on Triclosan in toothpaste.  And I discovered this gem from Colgate in response to the news story I mentioned above: Colgate officials have responded to such concerns by saying they think it is perfect safe.  The piece is by Patricia Verduin, PhD., Head of Colgate-Palmolive Research & Development.

Here is a quote from that "article":
We all know that a rumor travels half-way around the world before the truth even has a chance to be heard.  But we want the truth to have a chance to catch up. We encourage consumers to look at the facts.
And here is another.
I know the science and I know how it works.  It is the only toothpaste I use.
What a condescending, arrogant response.  I am looking at facts.  As I presume are others.  And what we see does not make us happy.  Just as we as a society are freaking out (justifiably) about overuse of traditional antibiotics (I use the term traditional antibiotics here to refer to things commonly called antibiotics), we should also be worried - probably really worried - about overuse of agents like triclosan.  Don't let the "biocidal" or "antimicrobial" or "antiseptic" terminology fool you.  If one of the major effects of a chemical is to kill microbes, it is something to worry about.  See for example  Triclosan Promotes Staphylococcus aureus Nasal Colonization which shows some evidence that these worries are not just rumors that travel around the world.  Here are some quotes from that article
These findings are significant because S. aureus colonization is a known risk factor for the development of several types of infections. Our data demonstrate the unintended consequences of unregulated triclosan use and contribute to the growing body of research demonstrating inadvertent effects of triclosan on the environment and human health.
In the end, I would like to say - shame on Colgate and Procter & Gamble.  Yes, triclosan in their toothpaste may lower the risk for some oral health problems.  But consider traditional antibiotics as an analogy.  Yes, they can lower the risk of dying from an infection and are very important tools for all sorts of cases.  But at the same time, even with their benefits, we as a society are looking to reduce their use as much as possible and to reserve them for cases where they are truly needed.  Prophylactic use of traditional antibiotics is now generally frowned upon.  Similarly, prophylactic use of triclosan in toothpaste, even with some health benefits, has way way too many potential, unknown health risks to be continued.  This notion is not a rumor.  It is not "overselling the microbiome".  It is simply following the precautionary principle until we know more.  It is certainly much more "the truth" right now than idiotic statements like "I know the science and I know how it works."

Story Behind the Paper: Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads (by Rogan Carr and Elhanan Borenstein)

Here is another post in my "Story Behind the Paper" series where I ask authors of open access papers to tell the story behind their paper.  This one comes from Rogan Carr and Elhanan Borenstein.  Note - this was crossposted at microBEnet.  If anyone out there has an open access paper for which you want to tell the story -- let me know.


We’d like to first thank Jon for the opportunity to discuss our work in this forum. We recently published a study investigating direct functional annotation of short metagenomic reads that stemmed from protocol development for our lab. Jon invited us to write a blog post on the subject, and we thought it would be a great venue to discuss some practical applications of our work and to share with the research community the motivation for our study and how it came about.

Our lab, the Borenstein Lab at the University of Washington, is broadly interested in metabolic modeling of the human microbiome (see, for example our Metagenomic Systems Biology approach) and in the development of novel computational methods for analyzing functional metagenomic data (see, for example, Metagenomic Deconvolution). In this capacity, we often perform large-scale analysis of publicly available metagenomic datasets as well as collaborate with experimental labs to analyze new metagenomic datasets, and accordingly we have developed extensive expertise in performing functional, community-level annotation of metagenomic samples. We focused primarily on protocols that derive functional profiles directly from short sequencing reads (e.g., by mapping the short reads to a collection of annotated genes), as such protocols provide gene abundance profiles that are relatively unbiased by species abundance in the sample or by the availability of closely-related reference genomes. Such functional annotation protocols are extremely common in the literature and are essential when approaching metagenomics from a gene-centric point of view, where the goal is to describe the community as a whole.

However, when we began to design our in-house annotation pipeline, we pored over the literature and realized that each research group and each metagenomic study applied a slightly different approach to functional annotation. When we implemented and evaluated these methods in the lab, we also discovered that the functional profiles obtained by the various methods often differ significantly. When we discussed these findings with colleagues, some further expressed doubt that such short sequencing reads even contained enough information to map back unambiguously to the correct function. Perhaps the whole approach was wrong!

We therefore set out to develop a set of ‘best practices’ for our lab for metagenomic sequence annotation and to prove (or disprove) quantitatively that such direct functional annotation of short reads provides a valid functional representation of the sample. We specifically decided to pursue a large-scale study, performed as rigorously as possible, taking into account both the phylogeny of the microbes in the sample and the phylogenetic coverage of the database, as well as several technical aspects of sequencing like base-calling error and read length. We have found this evaluation approach and the results we obtained quite useful for designing our lab protocols, and thought it would be helpful to share them with the wider metagenomics and microbiome research community. The result is our recent paper in PLoS One, Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads.

The performance of BLAST-based annotation of short reads across the bacterial and archaeal tree of life. The phylogenetic tree was obtained from Ciccarelli et al. Colored rings represent the recall for identifying reads originating from a KO gene using the top gene protocol. The 4 rings correspond to varying levels of database coverage. Specifically, the innermost ring illustrates the recall obtained when the strain from which the reads originated is included in the database, while the other 3 rings, respectively, correspond to cases where only genomes from the same species, genus, or more remote taxonomic relationships are present in the database. Entries where no data were available (for example, when the strain from which the reads originated was the only member of its species) are shaded gray. For one genome in each phylum, denoted by a black dot at the branch tip, every possible 101-bp read was generated for this analysis. For the remaining genomes, every 10th possible read was used. Blue bars represent the fraction of the genome's peptide genes associated with a KO; for reference, the values are shown for E. coli, B. thetaiotaomicron, and S. pneumoniae. Figure and text adapted from: Carr R, Borenstein E (2014) Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads. PLoS ONE 9(8): e105776. doi:10.1371/journal.pone.0105776. See the manuscript for full details.


To perform a rigorous study of functional annotation, we needed a set of reads whose true annotations were known (a “ground truth”). In other words, we had to know the exact locus and the exact genome from which each sequencing read originated and the functional classification associated with this locus. We further wanted to have complete control over technical sources of error. To accomplish this, we chose to implement a simulation scheme, deriving a huge collection of sequence reads from fully sequenced, well annotated, and curated genomes. This scheme allowed us to have complete information about the origin of each read and allowed us to simulate various technical factors we were interested in. Moreover, simulating sequencing reads allowed us to systematically eliminate variations in annotation performance due to technological or biological effects that would typically be convoluted in an experimental setup. For a set of curated genomes, we settled on the KEGG database, as it contained a large collection of consistently functionally curated microbial genomes and it has been widely used in metagenomics for sample annotation. The KEGG hierarchy of KEGG Orthology groups (KOs), Modules, and Pathways could then serve as a common basis for comparative analysis. To control for phylogenetic bias in our results, we sampled broadly across 23 phyla and 89 genera in the bacterial and archaeal tree of life, using a randomly selected strain in KEGG for each tip of the tree from Ciccarelli et al. From each of the selected 170 strains, we generated either *every* possible contiguous sequence of a given length or (in some cases) every 10th contiguous sequence, using a sliding window approach. We additionally introduced various models to simulate sequencing errors. This large collection of reads (totaling ~16Gb) was then aligned to the KEGG genes database using a translated BLAST mapping.
To control for phylogenetic coverage of the database (the phylogenetic relationship of the database to the sequence being aligned) we also simulated mapping to many partial collections of genomes. We further used four common protocols from the literature to convert the obtained BLAST alignments to functional annotations. Comparing the resulting annotation of each read to the annotation of the gene from which it originated allowed us to systematically evaluate the accuracy of this annotation approach and to examine the effect of various factors, including read length, sequencing error, and phylogeny.
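
As a rough illustration of the sliding-window scheme described above, here is a minimal Python sketch. It is not the code used in the paper: the function names, the toy genome, and the uniform substitution error model are all assumptions made for this example (the post only says that "various models" of sequencing error were used).

```python
import random


def sliding_window_reads(genome, read_length=101, step=1):
    """Yield (start, read) for every `step`-th contiguous window of the genome.

    step=1 gives every possible read; step=10 gives every 10th read.
    """
    for start in range(0, len(genome) - read_length + 1, step):
        yield start, genome[start:start + read_length]


def add_substitution_errors(read, error_rate, rng):
    """Hypothetical error model: each base is independently swapped for a
    different base with probability `error_rate`."""
    bases = "ACGT"
    out = []
    for base in read:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in bases if b != base]))
        else:
            out.append(base)
    return "".join(out)


# Toy example: a 10-bp "genome" and 4-bp reads.
toy_genome = "ACGTACGTAC"
all_reads = list(sliding_window_reads(toy_genome, read_length=4, step=1))
every_other = list(sliding_window_reads(toy_genome, read_length=4, step=2))

rng = random.Random(0)
noisy_read = add_substitution_errors(all_reads[0][1], error_rate=0.01, rng=rng)
```

Because each read carries its start position, it can be traced back to the gene (and therefore the KO) it came from, which is what makes a ground-truth comparison possible.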

First and foremost, we confirmed that direct annotation of short reads indeed provides an overall accurate functional description of both individual reads and the sample as a whole. In other words, short reads appear to contain enough information to identify the functional annotation of the gene they originated from (although not necessarily the specific taxa of origin). Functions of individual reads were identified with high precision and recall, yet the recall was found to be clade dependent. As expected, recall and precision decreased with increasing phylogenetic distance to the reference database, but generally, having a representative of the genus in the reference database was sufficient to achieve a relatively high accuracy. We also found variability in the accuracy of identifying individual KOs, with KOs that are more variable in length or in copy number having lower recall. Our paper includes an abundance of data on these results, a detailed characterization of the mapping accuracy across different clades, and a description of the impact of additional properties (e.g., read length, sequencing error, etc.).
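
For readers less familiar with the terms, one common way to compute per-read precision and recall is sketched below. This is an assumption about the general definitions, not the exact formulas from the paper, and the read IDs and KO labels are made up for illustration.

```python
def precision_recall(true_ko, predicted_ko):
    """Per-read precision and recall for KO assignment.

    true_ko / predicted_ko map read IDs to a KO string, or None when the
    read has no KO (truth) or received no annotation (prediction).
    """
    # A prediction is correct when it names the KO the read really came from.
    correct = sum(
        1 for read_id, ko in predicted_ko.items()
        if ko is not None and true_ko.get(read_id) == ko
    )
    n_predicted = sum(1 for ko in predicted_ko.values() if ko is not None)
    n_true = sum(1 for ko in true_ko.values() if ko is not None)
    precision = correct / n_predicted if n_predicted else 0.0
    recall = correct / n_true if n_true else 0.0
    return precision, recall


# Hypothetical ground truth and predictions for four simulated reads.
truth = {"r1": "K00001", "r2": "K00002", "r3": None, "r4": "K00003"}
predicted = {"r1": "K00001", "r2": "K00005", "r3": "K00001", "r4": "K00003"}

precision, recall = precision_recall(truth, predicted)
```

In this toy case two of the four annotated reads are correct (precision 0.5), and two of the three reads that truly come from a KO gene are recovered (recall about 0.67).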

A principal component analysis of the pathway abundance profiles obtained for 15 HMP samples and by four different annotation protocols. HMP samples are numbered from 1 to 15 according to the list that appears in the Methods section of the manuscript. The different protocols are represented by color and shape. Note that two outlier protocols for sample 14 are not shown but were included in the PCA calculation. Figure and text adapted from: Carr R, Borenstein E (2014) Comparative Analysis of Functional Metagenomic Annotation and the Mappability of Short Reads. PLoS ONE 9(8): e105776. doi:10.1371/journal.pone.0105776. See the manuscript for full details.
Importantly, while the obtained functional annotations are in general representative of the true content of the sample, the exact protocol used to analyze the BLAST alignments and to assign functional annotation to each read could still dramatically affect the obtained profile. For example, in analyzing stool samples from the Human Microbiome Project, we found that each protocol left a consistent “fingerprint” on the resulting profile and that the variation introduced by the different protocols was on the same order of magnitude as biological variation across samples. Differences in annotation protocols are thus analogous to batch effects from variation in experimental procedures and should be carefully taken into consideration when designing the bioinformatic pipeline for a study.

Generally, however, we found that assigning each read with the annotation of the top E-value hit (the ‘top gene’ protocol) had the highest precision for identifying the function from a sequencing read, and only slightly lower recall than methods enriching for known annotations (such as the commonly used ‘top 20 KOs’ protocol). Given our lab interests, this finding led us to adopt the ‘top gene’ protocol for functionally annotating metagenomic samples. Specifically, our work often requires high precision for annotating individual reads for model reconstruction (e.g., utilizing the presence and absence of individual genes) and the most accurate functional abundance profile for statistical method development. If your lab has similar interests, we would recommend this approach for your annotation pipelines. If, however, you have different or more specific needs, we encourage you to make use of the datasets we have published along with our paper to help you design your own solution. We would also be very happy to discuss such issues further with labs that are considering various approaches for functional annotation, to assess some of the factors that can impact downstream analyses, or to assist in such functional annotation efforts.
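
To make the trade-off concrete, here is a minimal sketch of a 'top gene' style assignment, assuming each read's BLAST hits are available as (gene ID, E-value) pairs and that a gene-to-KO lookup exists. The second function is a hypothetical stand-in for protocols that enrich for known annotations; it is not the paper's 'top 20 KOs' definition. All names and values are illustrative.

```python
def top_gene(hits, gene_to_ko):
    """Assign the KO of the single best hit (lowest E-value).

    Returns None if there are no hits or the best hit has no KO.
    """
    if not hits:
        return None
    best_gene, _ = min(hits, key=lambda hit: hit[1])
    return gene_to_ko.get(best_gene)


def top_annotated_hit(hits, gene_to_ko):
    """Hypothetical variant: ignore hits without a KO, then take the best
    remaining hit. This trades some precision for recall."""
    annotated = [hit for hit in hits if hit[0] in gene_to_ko]
    return top_gene(annotated, gene_to_ko)


# Toy example: the best hit is an unannotated gene, the second best has a KO.
hits = [("geneA", 1e-30), ("geneB", 1e-25)]
gene_to_ko = {"geneB": "K00001"}

strict_call = top_gene(hits, gene_to_ko)
lenient_call = top_annotated_hit(hits, gene_to_ko)
```

Here the strict protocol leaves the read unannotated, while the lenient one assigns K00001. The strict call avoids a possibly wrong label, and the lenient call annotates more reads.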

Saturday, 20 September 2014

Notes from 2007 for a blog post I should have written: How many microbial cells in humans?

Well sometimes you just screw up.  In 2007 I attended some planning meetings for the human microbiome project (see for example A human microbiome program? a post I wrote from one of the meetings in 2007).  And at those meetings I kept asking one question.  Where did this "fact" everyone kept citing that there were "10 times as many microbial cells in the human body as there were human cells" come from?  I could not find a citation.  So I started taking some notes for a blog post about this.  Here are those notes:

Wikipedia link
Online textbook here
Sears paper from Arizona site. She discusses only gut bacteria and cites a Gordon paper from 2001.
Seems to not be from this paper but really from here:
This in turn is not from there but apparently here

But, alas, I got distracted.  And I did keep asking people - where did this "fact" come from?  And most people just brushed me off (and probably thought I was a bit of a crank ...). And nobody had a good answer.  Well, I was both pleased and sad (because I should have done it) to see Is your body mostly microbes? Actually, we have no idea by Peter Andrey Smith in the Boston Globe, who addresses this issue in much much more detail than I ever could have done.  Everyone who works on the human microbiome and who is interested in "facts" and how they can get misreported should read this.  As a side note, Smith reports in the article that this is even given as a fact in TED talks.  Sadly mine was one of them.  This is despite the fact (yes, the fact) that I swore to myself that I would NOT say that in my talk since I have been such a crank about this issue at meetings.  OMG - such truisms are so pervasive that even someone who actively questioned the truism still used it.  Uggh.  Oh well.  I really should have finished that draft post.