Big data is a buzzword commonly tossed around in industry these days. Most simply, big data is a type of statistics—a way of analyzing extremely large data sets with computer algorithms to reveal important patterns and correlations. Corporations in healthcare, retail, and other industry sectors have been trying to leverage big data to help their bottom line for years.
Now large-scale scientific efforts are adopting the approach, including projects funded by the National Science Foundation and the US BRAIN Initiative. Francis Collins, director of the National Institutes of Health, says this important basic science undertaking—developing tools and technologies to help create a “dynamic understanding of brain function”—will be fueled, in part, by big data and analytics approaches.
“The focus here is on circuits and networks, trying to understand how those circuits allow the brain to carry out the remarkable series of complex functions that most of us take for granted,” he said during a special presentation at Neuroscience 2015, the annual meeting of the Society for Neuroscience. “And we have a remarkable amount of data. All of this puts us in a situation that we need to get used to: How do we handle all the data? How do we make sure that we’re accumulating it in the right places, that it’s marked appropriately with metadata so people can figure out what it is, that we have a plan for sustaining the data so it doesn’t disappear, and that it is accessible to people who have good ideas about how to use it?”
His questions give rise to an important challenge to neuroscientists—one that has already been taken up by several researchers who presented big-data-based projects at Neuroscience 2015.
Researchers from École Polytechnique Fédérale de Lausanne in Switzerland highlighted several experiments performed with the Blue Brain Project’s virtual reconstruction of a small slice of working rat brain. The Blue Brain Project is part of the Human Brain Project, Europe’s counterpart to the BRAIN Initiative. Sean Hill, co-director of the project, says it provides a framework for neuroscientists to test how different neuromodulatory factors influence brain function.
“There is a lot of data about the brain that is being collected all over the world. The greatest honor you can give the data is to find a way to understand it, understand what it is really telling you about how the brain works,” he says. “The Blue Brain is a data-driven modeling project that allows us to test different implications and inform us not only how the brain works normally, but how it might be changed in disease. We can start to ask questions that really tell us about how the circuit functions.”
This initial reconstruction of a small rat brain microcircuit is a proof of concept, Hill says. The group plans to continue to use this modeling approach to one day build a human brain circuit.
“It’s hard work—and it takes a long time to do these kinds of approaches. But they have value,” he says. “And by creating an ecosystem where data is valued, we can work collaboratively to better understand the brain and how its different cells and circuits behave under different conditions.”
Modeling disease characteristics
Clive Niels Svendsen, a researcher at Cedars-Sinai Medical Center, is also using a big-data approach to help distinguish disease states, particularly in amyotrophic lateral sclerosis (ALS), a motor neuron disease that is more commonly known as Lou Gehrig’s disease.
Svendsen and colleagues are using induced pluripotent stem cell (iPSC) technology to create neurons from people with ALS, as well as from those with spinal muscular atrophy, a similar motor neuron condition. Once they had created these neurons, they set out to collect as much data from them as they could.
“This project is a big-data project. We use these cells created from patients with ALS that have a genetic deficit, but we don’t know how that gene deficit results in the disease itself,” says Svendsen. “This approach allows us to create a population of motor neurons that will die. And as they die, we interrogate them with a variety of measurements—from robotic imaging to epigenetic measures—that may help us understand how this gene deficit eventually becomes this disease.”
Once this massive data set is assembled, Svendsen and colleagues are partnering with Google to help analyze it. “We are hoping to see new patterns in this big data set that identify not just ALS, but different subtypes of ALS,” he says. “This is a new platform to look at the brain in a petri dish and better understand what’s happening in these disease states—and then perhaps test different drugs that may help move the cells back to a normal state.”
The data will be housed in a publicly accessible database, NeuroLINCS.org (Library of Integrated Network-based Cellular Signatures), for open research use.
A future for brain data
While Collins applauds the advances already made, both in the US and elsewhere, he says that we are still in the early days of the BRAIN Initiative—and we still have a long way to go to fulfill the goal of understanding basic brain function. Still, he says, computational approaches are here to stay, and today’s scientists need to be comfortable with them.
“This is a big challenge. And it’s going to become even more of a challenge as the BRAIN Initiative picks up speed and collects increasingly complex data sets,” he says. “Computational strategies are becoming central to virtually everything we do. So if you’re not in a circumstance where you’re comfortable with that aspect of science, this is a great time to take advantage of training opportunities. Because I don’t think, going forward, that neuroscientists are going to be fully capable of doing all the great things they want to do without a level of comfort with computational analysis.”