Posted by: woodforthetrees | February 7, 2010

Let’s help the CIA

On Christmas Day 2009, towards the evening, news began to trickle through of an attempt to blow up a plane bound for Detroit. Over the following days more information surfaced, first about the quick thinking and bravery of the Dutch film-maker Jasper Schuringa, who managed to extinguish the flames apparently coming from the lap of a young Nigerian man named Umar Farouk Abdulmutallab.

But more astonishingly, we found out that the father of the alleged bomber had contacted the US embassy in Nigeria to report his son’s erratic and suspicious behaviour. As a parent this struck me: how agonising it must have been to inform on the son you love, how sure you’d have to be that lives were at stake to give this information to a government that does not have an exemplary record of fair treatment of suspected terrorists.

And yet this key information was lost or simply not deemed important: Umar Farouk Abdulmutallab was not added to the ‘No fly’ list and he duly boarded the plane from Amsterdam to Detroit.

I heard the news, considered it for a few minutes, then shrugged my shoulders and just got on with normal life.

Two weeks later I flew to Denver for a structural biology meeting. I waited patiently in the immigration queue with all the other non-US citizens and I dutifully allowed the US government to take copies of my fingerprints and a photograph of me for their files. And it struck me: they just can’t see the wood for the trees.

The millions and millions of fingerprints, photos and emails from innocent people on file are helping to obscure important data. So the unprecedented access the US government has – and the UK government is hot on its heels – to all sorts of private information is hindering, not helping.

Now ‘big data’ isn’t new to scientists. The human genome project was the watershed, and worldwide more than 1000 genomes have been sequenced. Structural genomics groups have solved the structures of thousands of proteins. Proteomics groups are churning out huge datasets too. As scientists, we’re certainly good at producing data. But how good are we at analysing and using the data?

The idea of this blog is to enlist your help in making sense of all this information. I’m the editor of the Structural Genomics Knowledgebase and also of the Signaling Gateway. I want to explain a bit about these projects, and I’d like your help to understand datasets I can’t get my head around. But first, a question for you.

Imagine you work for the CIA and are responsible for their database about suspected terrorists. You’ve just been passed a note saying that a Nigerian man has rung in expressing his concerns that his son might be involved in terrorist activity. How would you make sure that this information doesn’t get lost amongst all the other data? My guess is, you’d set up your database to weight your information. What would you do?
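To make the idea of weighting concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the source categories, the numeric weights and the names `Report`, `SOURCE_WEIGHTS`, `weight` and `rank` are invented for illustration and say nothing about how any real intelligence database works. It only shows that a report from a close relative can be made to float above the mass of routinely collected records.

```python
from dataclasses import dataclass

# Hypothetical weights per source type, made up for this example.
SOURCE_WEIGHTS = {
    "routine_collection": 0.01,  # e.g. fingerprints and photos taken at immigration
    "anonymous_tip": 0.3,
    "close_relative": 0.9,       # e.g. a parent reporting their own child
}


@dataclass
class Report:
    subject: str
    source: str
    note: str


def weight(report: Report) -> float:
    """Return the weight for a report; unknown source types get 0."""
    return SOURCE_WEIGHTS.get(report.source, 0.0)


def rank(reports: list[Report]) -> list[Report]:
    """Sort reports so the most heavily weighted come first."""
    return sorted(reports, key=weight, reverse=True)


reports = [
    Report("traveller A", "routine_collection", "fingerprints on file"),
    Report("traveller B", "anonymous_tip", "unverified phone call"),
    Report("traveller C", "close_relative", "father reports suspicious behaviour"),
]

for r in rank(reports):
    print(f"{weight(r):.2f}  {r.subject}: {r.note}")
```

A single fixed weight per source is the crudest possible scheme. A real system would presumably combine many signals, but even this much separates one worried father from millions of routine records.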

