I've been toying over a week with writing something based on an interesting Twitter discussion started by Dr. Laura Williams (@MicroWavesSci) of Providence College pondering the best way to approach teaching molecular genetics (really, science in general) at the undergraduate level. In particular, Professor Williams wondered about the dangers of branding various key experiments with the names of the experimenters, such as Hershey-Chase or Meselson-Stahl. The risk she points out is that this can devolve into an exercise in memorizing names and dates without assimilating concepts, or conversely that some students will find the names more of a hindrance than a help. I'm going to play a bit with this, but I do emphasize that for her this is reality and for me it is a hobby (or perhaps a retirement fantasy, if I should ever actually retire). Or in other words, for the academic this is her industry but for this industrial scientist it is academic.
Monday, September 18, 2017
Tuesday, August 29, 2017
High throughput sequencing of genomes is over twenty years old, which demanded the development of automated pipelines for annotating this data. I've worked on such pipelines since the early 1990s, implementing them as a student and at two different corporate stops. Indeed, we were reviewing results from my pipeline versus some of the other ones out there to see what can be done better. And unfortunately, I've found infuriating problems with RefSeq entries annotated with NCBI's bacterial genome annotation pipeline. Now I'm usually one to sing the praises of NCBI -- they are a key resource for biological research and they make available multiple spectacular public services freely to the entire world. But I'm afraid this time I need to vent.
Tuesday, August 15, 2017
Last week's news contained a story sure to raise eyebrows. A group of computer security researchers from the University of Washington claimed to have demonstrated that they could hijack a computer via sequencing a carefully-constructed DNA fragment. Visions of NextSeqs rampaging through the streets immediately sprang to mind. The paper is interesting and has some useful warnings for the bioinformatics community, but certainly the news coverage has been strong on hype and alarmism.
Saturday, August 05, 2017
Over on Quora a common type of question is "Can I be a computational biologist if I am now an X". Personally I take a very broad view and think just about anyone with intellectual curiosity can become any kind of scientist. A related type of question is "how skilled do I need to be in Y to succeed in computational biology", where Y is most often programming, biology or math. I got thinking about this and started wondering whether I am actually at all skilled in math. Here are the results of that analysis.
Friday, July 21, 2017
When Oxford Nanopore announced their GridION X5 instrument in March, I and others attempted to parse the difference between the two pricing plans -- and I made a bit of a hash of it. The X5 runs 5 MinION flowcells independently in parallel from a single desktop instrument, which also includes FPGA-based acceleration of basecalling plus a license to perform sequencing-for-hire. Indeed, Matt Loose tweeted out an image of an "X6" and then mention of an "X7"; the X6 had a MinION plugged into the USB port and apparently the FPGA unit can keep up with seven flowcells all running simultaneously. Now Oxford has launched an interesting third "Starter Pack" plan that offers an even lower price point for the system.
Wednesday, June 28, 2017
Tuesday's Boston Globe carried a piece originating from STAT news on an interesting natural product antibiotic, pleuromutilin. A research group recently published a new total synthesis of this fungal terpene, an advance which promises to enable greater medicinal chemistry around the molecule. That part is cool. Unfortunately, when it gets to the biology of pleuromutilin the piece by Eric Boodman completely spits the bit, trotting out some horribly inaccurate tropes.
Wednesday, June 14, 2017
In my bit on "I'm not dead yet" technologies recently, I included large scale Sanger sequencing. That reflects to a large degree my personal experiences and biases. Targeted Sanger is great for spot checking the occasional junction or misbehaving clone or strain, but I forget that many clinicians still see it as a gold standard. Apparently there are others who disagree with me, as Thermo Fisher recently launched a new Sanger instrument targeted at small labs, and according to GenomeWeb Promega plans an instrument offering in the same space as well.
Tuesday, June 06, 2017
I'm going to step outside the usual topic space here and cover an interesting but frustrating book I read partly on the flight to London Calling (which is about the only connection it has to genomics). Ice Ghosts, by Paul Watson, covers the searches for the lost Franklin Expedition, a mid-1800s British Navy attempt to find the Northwest Passage. It's a pretty good book; after all it did win a Pulitzer Prize. The topic is thrilling: explorers under difficult conditions and a mystery that lasted over a century. There are lessons for science in general, such as the value in carefully evaluating oral histories that some would discard as unreliable. But what is maddening for me is that in a book for which a central theme is poorly understood geographies and their interpretations, the set of supplied maps fails miserably at assisting in the telling of the story.
Monday, May 22, 2017
In the closing talk of the pre-London Calling workshop, Hans Jansen closed his presentation by asking whether at some future date sequence assembly would become obsolete. This was meant to be an aspirational vision for a distant timepoint, but one correspondent on Twitter saw it as hype. I got in a bit of a discussion, constrained by the dreaded 140 character limit, which ended up largely illustrating that I have a somewhat more restricted definition of assembly than some people. I'm going to explore this and you can judge for yourself.
Thursday, May 18, 2017
Okay, I'm desperately behind on writing up the external science from London Calling. Not helpful that I claimed I would not only do so, but in multiple installments. A number of the plenaries focused on large genome assembly, so that's what I'll tackle now -- plus a few other bits. See also my Storify summaries, which include other reports on the conference. Also check out my storifies on the SMRT Leiden conference, which ran at the beginning of the same week and discussed many similar topics.
Sunday, May 14, 2017
Jonathan Jacobs posted his annual reminder that the Sequencing, Finishing and Analysis in the Future Meeting (SFAF) will be this week. Alas, that meeting hasn't had many more tweeters in the past than Jonathan, but perhaps this year there will be more. There's a glut of genomics conferences to track, compile tweets and opine on -- besides London Calling, there's been SMRT Leiden and Biology of Genomes, all in the span of two weeks! This post is going to be a bit short on actual writing and more to just flag some talks at SFAF that grabbed my attention. What I realized is that the talks at SFAF illustrate that a number of technologies I consider effectively dead retain significant attention.
#ImBiased, but… Best conf. of 2017: #SFAF2017 #infectiousdisease #inherited #disease #agrigenomics #human #genomics https://t.co/yTu2MxKc41 pic.twitter.com/FCoSmTp6an— Jonathan Jacobs (@bioinformer) May 10, 2017
Tuesday, May 09, 2017
London Calling 2017 came to a close last Friday. Any excuses of jet lag or nights running up ONT's bar tab won't hold up much longer, so time to finish this post (I really did start the night after Clive's talk!). I'm going to largely divide coverage by who presented: today's piece on Oxford Nanopore presentations, particularly Clive Brown's, and in the near future at least one focusing on the science users presented. For other summaries of the action, I've created a storify of just blog posts and similar summaries of the meeting, as there were a great number (and I am on the hunt for additional ones I've missed).
Thursday, May 04, 2017
I attended on Wednesday the London Calling pre-conference workshop, an add-on for those wishing for help getting started with MinION sequencing. Judging from who I spoke to, many participants were utterly new to nanopore sequencing and more than a few were like me in that they had tried the platform and wanted to do better. My colleague has gotten some very good results recently, which has re-fired my determination to get good at that myself. Below are some limited notes I took that may be of general interest. Large portions of the workshop will go largely uncovered, as I focused on what was surprising or new.
Tuesday, May 02, 2017
Oxford Nanopore's London Calling confab runs Thursday and Friday, with a training workshop on Wednesday. I'll be there -- who can resist a conference nearly at the Tower of London? -- and will also be testing whether my personal "field of nanopore sequencing suppression" can defeat ONT's best trainers. Here's some preview of what I'll be particularly looking for, though being surprised will be lots of fun too. Much more fun than reading (the wrong) patents!
Monday, May 01, 2017
Oxford Nanopore has launched lawsuits in the UK and Germany against Pacific Biosciences, alleging infringement of a European patent licensed from Daniel Branton's lab at Harvard, EP1192453, which is apparently exclusively licensed to Oxford. When I wrote about Pacific Biosciences' first lawsuit against Oxford Nanopore late last year I titled it "PacBio's Quixotic Patent Litigation", as it appeared that Oxford could easily dodge the lawsuit by abandoning the 2D sequencing technology, which Oxford is in the process of doing. I've swapped in "enigmatic" for this title, as I'm not even sure what aspect of PacBio is allegedly infringing the patent.
Wednesday, April 26, 2017
A pretty common question over on Quora is something along the lines of "how do I learn bioinformatics". Great question! Tonight I'm going to outline a project which I think would make a good first bioinformatics project. It is rich in content and keys off an interesting new non-computational result. And since I've left graffiti on multiple Quora threads that I would write something like this in the immediate future, here it is!
Saturday, April 22, 2017
In my recent piece on long read assembly, I laid out part of the case against the N50 statistic. Historically, the issues with the statistic have been around the fact it can be gamed at the expense of assembly correctness or assembly coverage. These are concerns for the typical sort of short read assemblies we've grown used to: lots of contigs and the temptation (perhaps justified) to try to go for higher N50s by more aggressive merging or by filtering out the short contigs. Elin Videvall over at The Molecular Ecologist has a nice ongoing series of posts illustrating the statistic and these commonplace issues:
I'm going to come at the problem from the other end, as a new preprint from 10x Genomics illustrates the problem of using an N50 statistic (or any related Nxx statistic) with good long-read / linked read assemblies -- but doesn't demonstrate this point quite as strongly as I thought when I first started drafting this.
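For anyone who hasn't met the statistic, here is a minimal sketch of how N50 is usually computed and how filtering short contigs inflates it. This is my own illustration, not code from The Molecular Ecologist series or the 10x preprint; the function name `n50` and the contig lengths are invented toy values.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("need at least one contig with non-zero length")


# Hypothetical toy assembly (lengths in arbitrary units).
contigs = [100, 60, 40, 30, 30, 20, 20]

# "Gaming" the statistic: discard every contig shorter than 50.
filtered = [c for c in contigs if c >= 50]

print(n50(contigs), sum(contigs))    # N50 and total bases before filtering
print(n50(filtered), sum(filtered))  # N50 and total bases after filtering
```

In this toy case the N50 jumps from 60 to 100 purely because nearly half of the assembled bases were thrown away, which is exactly the sort of trade against assembly coverage described above.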
Thursday, April 20, 2017
A TV movie produced by and starring American culture mogul Oprah Winfrey is about to hit screens which dramatizes Rebecca Skloot's The Immortal Life of Henrietta Lacks. If you haven't read this remarkable book, you really should. It should certainly be required reading for anyone entering biomedical fields. That's not to claim it is perfect; one of Lacks' sons has objected to the way his family is portrayed. But it is a searing human story of how the most famous cell line in the world came to be. Even if you excuse some of the injustices done as compatible with then contemporary ethical standards, it is a thought-provoking piece on the topic of what our biomedical ethics should be.
Thursday, April 13, 2017
A restaurant I frequented during my grad school days had a map on the wall showing Boston area transit routes from roughly the 1940s. Remarkably, most of those streetcar routes are found largely unchanged in the MBTA's current bus routes. Yes, routes have been altered to account for expansion of the Red Line and shifting of the Orange Line, but most of the routes are little changed and very, very few new ones have been added. Some of that reflects the canalization of routes by the street patterns; there are only so many large streets suitable for buses and Somerville's hills and the various rivers impose further constraints. Much of it lies in the always tight purses at the T and the political difficulty of ever closing an old route to enable moving resources to a new one. Unfortunately, the commuting patterns in Boston are not conserved from the 1940s, with far more workers commuting from distant suburbs and dense developments springing up.
Monday, April 10, 2017
Adaptive immunity is an endlessly fascinating topic which I have not explored very deeply, which is particularly unfortunate given the many parallels to computing. Combinatorial logic is used to construct a vast array of possible antigen readers, expression logic ensures that only one such reader is expressed in a given cell and hypermutation and evolution are used to optimize these readers to match specific antigens. All this not only creates weapons to deploy against foreign invaders, but also a memory which effectively records an individual's history of environmental exposures. Just before I started writing this two tweets highlighted using adaptive immunity profiling to reveal exposure to tuberculosis and cytomegalovirus. Adaptive immunity is responsible for transplant rejection, with new companies looking to more selectively modulate immunity to enable transplants without shutting the immune system down. Adaptive immunity also ties into the white hot field of immunotherapy for oncology, exploring whether differences in antigen response underlie variation in immunotherapy success. To enable profiling adaptive immunity on a mass scale, 10x Genomics has now introduced a single-cell kit for targeted profiling of T-cell receptor variable regions.
Tuesday, April 04, 2017
Advances in optical mapping, linked reads, PacBio and nanopore sequencing are making it possible to generate highly contiguous large genome sequences routinely and inexpensively. However, this in turn is creating intense demand for efficiently and reliably preparing ultra-high molecular weight (uHMW) DNA. By this term, I mean DNA approaching or exceeding a megabase in size. Methods for preparing HMW and uHMW DNA tend to be very old-school, reaching back at least to the 1970s, 80s and 90s for approaches used in the early days. Phenol-chloroform preps with the DNA spooled out onto a glass hook or rod are one popular approach; another is to embed cells in agarose blocks, extract the DNA within the block and then degrade the agarose to retrieve the DNA. Nuclei preps are yet another approach. Any liquid handling must be performed gently and with wide bore pipettes. These techniques tend to be tedious and slow affairs, requiring many manual steps. As an alternative, Sage Science has launched an instrument which automates a process with no hazardous chemicals, the SageHLS.
Thursday, March 30, 2017
A new paper on using Hi-C sequencing appeared in Science recently, demonstrating the generation of chromosome-length scaffolds for human as well as several insect genomes. The authors even provide a cost model, proposing that by processing multiple genomes in parallel the sequencing reagent cost (but not labor) of this approach should be about $10K per human genome. In the case of the insect genomes, the paper enables a look at chromosome evolution which is simply impossible with lower resolution. These findings resonate with a number of pieces I've written over the years, but particularly with my recent criticism of the proposed Earth BioGenome project and a spirited defense of that concept made in the comments of my piece by a member of the steering committee.
Monday, March 27, 2017
I've been contemplating this post for a while, but it can be seen as another angle on my recent post on the challenges of drug discovery, so it finally left the mental queue. We often use other mammalian species in drug development to predict human toxicity. We know animals aren't the same as people, but lacking a better alternative that's what we do. Now, as regular readers know I keep company with a dog, and that sometimes has me wondering: how well do we understand the cases of things we can eat but which are dangerous for our canines?
Saturday, March 25, 2017
My correspondent @datarade shot a tweet my way on his quest to understand drug discovery. He does this despite the fact I've promised posts on previous tweets that are submerged in my mental queue. But the best part of teaching is forcing yourself to rethink what you think you know, so I'm going to actually take this one on in the space of "what is a target, how do we pick them and how do we drug them". Which I've found to be enlightening and frustrating. It's a messy space because so much is empirical, and I keep devising and then discarding taxonomies and explanatory approaches because they all seem unsatisfactory.
Tuesday, March 21, 2017
Pacific Biosciences has made new thrusts in their ongoing intellectual property action against Oxford Nanopore, adding two recently issued patents to the fray. Oxford has publicly brushed these off as "another pore excuse for a lawsuit", but certainly the battle is not over. One of these patents, 9,542,527 "Compositions and methods for nucleic acid sequencing", appears to concern using hairpin linkages to read both strands, much like the 9,404,146 "Compositions and methods for nucleic acid sequencing" patent that PacBio led with. Since Oxford has announced they will abandon their "2D" methods that use such hairpins, this angle would seem to be soon irrelevant (as I predicted back when PacBio originally attacked). But the other, US 9,546,400 "Nanopore sequencing using n-mers", covers basecalling methods, which is a new twist. A route to challenge any patent is to identify "prior art", information which was publicly available at the time of the patent filing which impinges on the claims in the patent application. Not only can exact matches to prior art be an issue, but also anything which would be "obvious" to a skilled practitioner. And that can certainly be a can of worms.
Monday, March 20, 2017
The advent of so-called next generation sequencers, particularly those from Illumina, has brought the price of sequence data down dramatically. However, there is a catch: the cost of preparing DNA to go into the sequencer, the process known as library preparation, has glided downwards on a much shallower trajectory. This means that for projects wishing to sequence very large numbers of small genomes or large constructs the cost of library preparation can be similar to or even exceed the cost of data generation. A small company north of Boston called seqWell Inc™ has a new approach to Illumina library generation which they are on the cusp of making widely available, and not only does this bring the cost per well down but it is designed to yield normalized libraries from relatively unnormalized samples.
Tuesday, March 14, 2017
Clive Brown gave a webcast today with updates on a number of Oxford Nanopore topics, but clearly the flagship announcement was a new instrument, GridION X5. Due to the raging snowstorm in the Boston area I was home with my teammate and we've been doggedly going through the tweets (now storified) and my notes (plus David Eccles' nice set) to retrieve the juiciest bones therein.
Wednesday, March 08, 2017
Last week I posted a piece on some amazing new nanopore data, only to be red-faced to discover the next morning that I had misread the axes. So I re-posted the piece with the offending data and subsequent analysis in strike-thru font. After I did that, I was informed that the same dataset actually did have leviathan reads, bigger than my misinterpretation.
Thursday, March 02, 2017
Oxford Nanopore and its collaborators have shown at least three interesting advances in the last few months which I haven't yet covered, the most astounding of which was announced this week. I'll take these three in an order which works logically for me, though it isn't strictly chronological. I'll also touch on some parts of their platform which have not made the advances that were perhaps expected.
(Morning after: Ugh, ugh, ugh -- I misread an axis, inserting an extra 0 -- so major crossouts in one section; why I shouldn't post late at night during pauses in day job stuff)
Tuesday, February 28, 2017
There's been a bit of buzz recently about an unfunded proposal to ultimately sequence every living species on Earth, warming up by sequencing every eukaryotic species, with a targeted cost of $4.8B. It pains me a bit to write this, but I'm with those who think this is not a wise way to spend money and certainly not likely to work for anywhere near that budget.
Friday, February 17, 2017
I've used my scheme for collecting and organizing tweets to capture most of the feed from this week's AGBT17 conference. I still need to pore over these in detail, so I won't try to distill out many thoughts (other than single-cell sequencing is clearly in exponential growth phase!).
Monday, February 13, 2017
Obtaining a complete genome sequence for a bacterium or archaeon is essentially a solved problem, if you can culture the bug. Grow up biomass, purify the DNA and then use PacBio alone or a combination of long reads (PacBio or Oxford Nanopore) and short reads. These should yield a closed genome with a very low error rate. A few bugs spit at you by repeatedly failing PacBio sequencing or having some monster prophage or other repeat that is longer than the read lengths, but these are very rare. With advances in metagenomics techniques, the solving of uncultured genomes is becoming increasingly easy and many of these remarks also apply to fungi and other eukaryotic microorganisms. Once you have the sequence, then the lack of introns in bacteria and archaea makes gene prediction almost trivial, and you now have a parts list for the organism. But is that a useful parts list? A new paper in Nature Methods makes some progress in improving the utility of those parts lists, though we are still far from actually fully understanding an organism given its genome.
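To illustrate why the absence of introns simplifies things, here is a deliberately naive sketch of an open reading frame scan: with no splicing to worry about, a candidate gene is just an uninterrupted run of codons from a start to an in-frame stop. This is my own toy, not the method from the Nature Methods paper. The function name `find_orfs`, the `min_codons` cutoff and the example sequence are invented, and the scan assumes ATG starts and the forward strand only, whereas real gene finders also handle the reverse strand, alternative start codons and statistical scoring.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) half-open coordinates of forward-strand ORFs.

    An ORF here is the first ATG in a frame followed by codons up to and
    including the next in-frame stop. min_codons counts the codons before
    the stop (start codon included).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# Hypothetical toy sequence containing a single short ORF.
toy = "CCATGAAACCCGGGTAACC"
for begin, end in find_orfs(toy):
    print(begin, end, toy[begin:end])
```

Of course, a list of coordinates like this is only the start of a parts list; it says nothing about what any of the parts do.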
Thursday, February 02, 2017
A bit of a foray into Oxford Nanopore land again. By replacing a bench bumbler with someone competent, we've seen some success with our MinION at Starbase. Highly variable yields though. I've done some looking and discovered this isn't a unique experience. And now Oxford is suggesting that software upgrades alone will give MinION about another 50% boost in yield; it will be interesting to see what this does for variability. Finally, I have a notion of some of the sources of variability and an idea for a troubleshooting tool.
Wednesday, February 01, 2017
At the 2015 AGBT meeting, Illumina launched the NeoPrep, a ~$40K instrument to automate the preparation of up to 16 sequencing libraries at a time, using a technology called electrowetting microfluidics. Now news comes that Illumina is dropping the NeoPrep, halting sales immediately and allowing existing users about a year of reagents. What happened and how does it impact genomics?
Tuesday, January 31, 2017
I'll spend two hours in project meetings tomorrow. Around the table will be a group of scientists who are all at the top of the game and among the best in the world at what they do. We will be trying to push forward new antibiotics to save lives. Yes, we are also trying to be rewarded monetarily with it, but we all share a mission to improve humanity by finding new drugs for important medical needs.
Friday, January 27, 2017
TULIP is a new assembler for long, error-rich reads such as from nanopore. I was a bit stunned to see that TULIP is written in Perl; I was starting to wonder how many holdouts like me there were. Which led to this exchange on Twitter
@hans_j_jansen @github as someone who can't quite kick the habit, I both applaud&grimace with your use of Perl for leading edge bfx— Keith Robison (@OmicsOmicsBlog) January 23, 2017
Tuesday, January 24, 2017
I've been remiss in writing up a piece on 10X Genomics based on a phone discussion last week with Michael Schnall-Levin (VP Computational Biology and Applications) and Anup Parikh (Director, Product Marketing). I always appreciate companies reaching out to me and spending time to educate me on their products and plans, and this was a very interesting and enjoyable conversation.
Saturday, January 21, 2017
Earlier this week one of my colleagues had gotten a somewhat ominous email from the CEO of Gen9 titled "Special Gen9 Announcement", which led off by saying that their holiday shutdown would be followed with a "corporate restructuring period" during which "Gen9 will not be accepting orders". The next day came an article from Scott Kirsner detailing the effective shutdown of Gen9 and sale of its assets to Ginkgo Bioworks for an undisclosed amount of cash and stock. Interestingly, Kirsner reports that only 10 Gen9 employees will make the transition and that most of the Gen9 staff was laid off in mid-December. It is surprising that no gossip of the cutbacks seemed to enter my radar, given a number of personal connections to the company (CEO Kevin Munnelly was a colleague at Millennium; several members of the Gen9 business group were ex-Codon or ex-Infinity and we had done limited business with Gen9)
Tuesday, January 17, 2017
Monday evening brought news that Bio-Rad has further consolidated its grip on the droplet microfluidics space by acquiring RainDance Technologies for an undisclosed price. Bio-Rad had previously acquired droplet digital PCR company QuantaLife back in October of 2011 and targeted sequencing company GnuBio in April of 2014. While the droplet digital PCR has been marketed for many years now, the GnuBio effort had gone relatively quiet since the acquisition. However, Bio-Rad announced at the J.P. Morgan conference that this technology will be launched as OncoDrop late this year.
Monday, January 09, 2017
At today's J.P. Morgan Healthcare Conference Illumina made a number of small announcements -- some new partnerships, Firefly on track for launch later this year, launch of the single cell workflow partnered with Bio-Rad. Then CEO Francis deSouza dropped the big news: a new high-end sequencer architecture to ultimately replace all of the HiSeq instruments. It sounds like an interesting evolution of the Illumina product line, but unfortunately too many headlines and tweets have focused on a distant goal of $100 human genomes. Worse, not only did some commentators misconstrue the announcement as delivering on $100 genomes, but some also touted a sequencing speed of one hour for a genome which isn't remotely true.
Sunday, January 08, 2017
I'm good at acquiring distractions, and a relatively new one is Quora. This site allows users to ask questions which are then answered by members of the community. I lurk in a number of fields, but have answered a few questions related to genomics and related fields of biology. Tackling a question last night required re-learning some details I was disappointed I had forgotten. In researching to regain that knowledge, I skimmed a number of study guides online, which leads to this post.
Saturday, January 07, 2017
With the 2017 J.P. Morgan Conference in Healthcare (#JPM17) starting Monday, I and others have engaged in early reporting or speculation. I've tried to compile a list of presenting companies in the genomics, informatics and synthetic biology tool spaces, but these were filtered quickly from a long list of presenting companies so I may have missed some -- please leave comments and I can add. Also, some of the big conglomerates could speak on these topics but might ignore them, so no promises. For example, Roche has their pharmaceutical CEO speaking, so we may not hear anything about the PacBio breakup or Genia lawsuit. All times are Pacific Standard Time and are from J.P. Morgan, though I've converted to 24-hour time (hopefully successfully!). You may need to register with J.P. Morgan to follow the links I've provided and access the webcasts when they are available.
Thursday, January 05, 2017
2017 is certainly shaping up to be a big year for nanopore news. I touched on Oxford Nanopore's very full plate in my speculation about sequencing platforms and we already know of two different legal actions which will be progressing, PacBio vs. Oxford Nanopore and University of California vs. Genia. James Hadfield's take on possible Illumina announcements at the J.P. Morgan Conference includes an Illumina nanopore device. That's speculation; today we had a pair of tweets from Two Pore Guys previewing their sensing device and that they will be talking more at J.P. Morgan (all videos from 2PG).
2PG Demo Video - HIV from Two Pore Guys on Vimeo.
See the first public demo of our #nanopore device doing a sample-to-result HIV test! https://t.co/SQvRK4QFuh— Two Pore Guys (@TwoPoreGuys) January 4, 2017
Tuesday, January 03, 2017
As I noted in my last post, the University of California has filed suit against Genia claiming that Genia co-founder Roger Chen misappropriated intellectual property from UC Santa Cruz and the laboratory of Mark Akeson (filings include a bunch of other well-known nanopore scientists, including David Deamer and Dan Branton). While the filings are mostly dry, they are enlivened occasionally by such colorful language as "evasive tactics", "aided and abetted" and "stonewalled". Goaded by Mick Watson, I've dug into the court filings and some of the patents (and obtaining those filings apparently cost me some real money, perhaps approaching $1.0e01 dollars).
Monday, January 02, 2017
Another year of blogging is upon us! The J.P. Morgan Conference starts a week from today, and then before long it's time for AGBT. So if one is going to prognosticate, then there's no time to lose, as announcements could start flying at any time.