|By Rick Morrison||
|December 5, 2012 06:30 AM EST||
There is no doubt that Big Data holds infinite promise for a range of industries. Better visibility into data across various sources enables everything from insight into saving electricity to agricultural yield to placement of ads on Google. But when it comes to deriving value from data, no industry has been doing it as long or with as much rigor as clinical researchers.
Unlike other markets that are delving into Big Data for the first time and don't know where to begin, drug and device developers have spent years refining complex processes for asking very specific questions with clear purposes and goals. Whether using data for designing an effective and safe treatment for cholesterol, or collecting and mining data to understand proper dosage of cancer drugs, life sciences has had to dot every "i" and cross every "t" in order to keep people safe and for new therapies to pass muster with the FDA. Other industries are now marveling at a new ability to uncover information about efficiencies and cost savings, but - with less than rigorous processes in place - they are often shooting in the dark or only scratching the surface of what Big Data offers.
Drug developers today are standing on the shoulders of those who created, tested and secured FDA approval for treatments involving millions of data points (for one drug alone!) without the luxury of the cloud or sophisticated analytics systems. These systems have the potential to make the best data-driven industry even better. This article will outline key lessons and real-world examples of what other industries can and should learn from life sciences when it comes to understanding how to work with Big Data.
What Questions to Ask, What Data to Collect
In order to gain valuable insights from Big Data, there are two absolute requirements that must be met - understanding both what questions to ask and what data to collect. These two components are symbiotic, and understanding both fully is difficult, requiring both domain expertise and practical experience.
In order to know what data to collect, you first must know the types of questions that you're going to want to ask - often an enigma. With the appropriate planning and experience-based guesses, you can often make educated assumptions. The trick to collecting data is that you need to collect enough to answer questions, but if you collect too much then you may not be able to distill the specific subset that will answer your questions. Also, explicit or inherent cost can prevent you from collecting all possible data, in which case you need to carefully select which areas to collect data about.
Let's take a look at how this is done in clinical trials. Say you're designing a clinical study that will analyze cancer data. You may not have specific questions when the study is being designed, but it's reasonable to assume that you'll want to collect data related to commonly impacted readings for the type of cancer and whatever body system is affected, so that you have the right information to analyze when it comes time.
You may also want to collect data unrelated to the specific disease that subsequent questions will likely require, such as information on demographics and medications that the patient is taking that are different from the treatment. During the post-study data analysis, questions on these areas often arise, even though the questions aren't initially apparent. Thus clinical researchers have adopted common processes for collecting data on demographics and concomitant medications. Through planning and experience, you can also identify areas that do not need to be collected for each study. For example, if you're studying lung cancer, collecting cognitive function data is probably unrelated.
How can other industries anticipate what questions to ask, as is done in life sciences? Well, determine a predefined set of questions that are directly related to the goal of the data analysis. Since you will not know all of the questions until after the data collection have started, it's important to 1) know the domain, and 2) collect any data you'll need to answer the likely questions that could come up.
Also, clinical researchers have learned that questions can be discovered automatically. There are data mining techniques that can uncover statistically significant connections, which in effect are raising questions that can be explored in more detail afterwards. An analysis can be planned before data is collected, but not actually be run until afterwards (or potentially during), if the appropriate data is collected.
One other area that has proven to be extremely important to collect is metadata, or data about the data - such as, when it was collected, where it was collected, what instrumentation was used in the process and what calibration information was available. All of this information can be utilized later on to answer a lot of potentially important questions. Maybe there was a specific instrument that was incorrectly configured and all the resulting data that it recorded is invalid. If you're running an ad network, maybe there's a specific web site where your ads are run that are gaming the system trying to get you to pay more. If you're running a minor league team, maybe there's a specific referee that's biased, which you can address for subsequent games. Or, if you're plotting oil reserves in the Gulf of Mexico, maybe there are certain exploratory vessels that are taking advantage of you. In all of these cases, without the appropriate metadata, it'd be impossible to know where real problems reside.
Identifying Touch Points to Be Reviewed Along the Way
There are ways to specify which types of analysis can be performed, even while data is being collected, that can affect either how data will continue to be collected or the outcome as a whole.
For example, some clinical studies run what's called interim analysis while the study is in progress. These interim analyses are planned, and the various courses that can be used afterwards are well defined, but the results afterward are statistically usable. This is called an adaptive clinical trial, and there are a lot of studies that are being performed to determine more effective and useful ways that these can be done in the future. The most important aspect of these is preventing biases, and this is something that has been well understood and tested by the pharmaceutical community over the past several decades. Simply understanding what's happening during the course of a trial, or how it affects the desired outcome, can actually bias the results.
The other key factor is that the touch points are accessible to everybody who needs the data. For example, if you have a person in the field, then it's important to have him or her access the data in a format that's easily consumable to them - maybe through an iPad or an existing intranet portal. Similarly, if you have an executive that needs to understand something at a high level, then getting it to them in an easily consumable executive dashboard is extremely important.
As the life sciences industry has learned, if the distribution channels of the analytics aren't seamless and frictionless, then they won't be utilized to their fullest extent. This is where cloud-based analytics become exceptionally powerful - the cloud makes it much easier to integrate analytics into every user's day. Once each user gets the exact information they need, effortlessly, they can then do their job better and the entire organization will work better - regardless of how and why the tools are being used.
Augmenting Human Intuition
Think about the different types of tools that people use on a daily basis. People use wrenches to help turn screws, cars to get to places faster and word processers to write. Sure, we can use our hands or walk, but we're much more efficient and better when we can use tools.
Cloud-based analytics is a tool that enables everybody in an organization to perform more efficiently and effectively. The first example of this type of augmentation in the life sciences industry is alerting. A user tells the computer what they want to see, and then the computer alerts them via email or text message when the situation arises. Users can set rules for the data it wants to see, and then the tools keep on the lookout to notify the user when the data they are looking for becomes available.
Another area the pharmaceutical industry has thoroughly explored is data-driven collaboration techniques. In the clinical trial process, there are many different groups of users: those who are physically collecting the data (investigators), others who are reviewing it to make sure that it's clean (data managers), and also people who are stuck in the middle (clinical monitors). Of course there are many other types of users, but this is just a subset to illustrate the point. These different groups of users all serve a particular purpose relating to the overall collection of data and success of the study. When the data looks problematic or unclean, the data managers will flag it for review, which the clinical monitors can act on.
What's unique about the way that life sciences deals with this is that they've set up complex systems and rules to make sure that the whole system runs well. The tools associated around these processes help augment human intuition through alerting, automated dissemination and automatic feedback. The questions aren't necessarily known at the beginning of a trial, but as the data is collected, new questions evolve and the tools and processes in place are built to handle the changing landscape.
No matter what the purpose of Big Data analytics, any organization can benefit from the mindset of cloud-based analytics as a tool that needs to consistently be adjusted and refined to meet the needs of users.
Ongoing Challenges of Big Data Analytics
Given this history with data, one would expect that drug and device developers would be light years ahead when it comes to leveraging Big Data technologies - especially given that the collection and analytics of clinical data is often a matter of life and death. But while they have much more experience with data, the truth is that life sciences organizations are just now starting to integrate analytics technologies that will enable them to work with that data in new, more efficient ways - no longer involving billions of dollars a year, countless statisticians, archaic methods, and, if we're being honest, brute force. As new technology becomes available, the industry will continue to become more and more seamless. In the meantime, other industries looking to wrap their heads around the Big Data challenge should look to life sciences as the starting point for best practices in understanding how and when to ask the right questions, monitoring data along the way and selecting tools that improve the user experience.
In his session at 18th Cloud Expo, Bruce Swann, Senior Product Marketing Manager at Adobe, will discuss how the Adobe Marketing Cloud can help marketers embrace opportunities for personalized, relevant and real-time customer engagement across offline (direct mail, point of sale, call center) and digital (email, website, SMS, mobile apps, social networks, connected objects). Bruce Swann has more than 15 years of experience working with digital marketing disciplines like web analytics, social med...
May. 29, 2016 06:00 PM EDT Reads: 1,446
SYS-CON Events announced today that ContentMX, the marketing technology and services company with a singular mission to increase engagement and drive more conversations for enterprise, channel and SMB technology marketers, has been named “Sponsor & Exhibitor Lounge Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York City, New York. “CloudExpo is a great opportunity to start a conversation with new prospects, but what happens after the...
May. 29, 2016 04:45 PM EDT Reads: 1,316
WebRTC is bringing significant change to the communications landscape that will bridge the worlds of web and telephony, making the Internet the new standard for communications. Cloud9 took the road less traveled and used WebRTC to create a downloadable enterprise-grade communications platform that is changing the communication dynamic in the financial sector. In his session at @ThingsExpo, Leo Papadopoulos, CTO of Cloud9, will discuss the importance of WebRTC and how it enables companies to fo...
May. 29, 2016 04:15 PM EDT Reads: 2,592
SYS-CON Events announced today Object Management Group® has been named “Media Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
May. 29, 2016 04:00 PM EDT Reads: 2,658
The IoT is changing the way enterprises conduct business. In his session at @ThingsExpo, Eric Hoffman, Vice President at EastBanc Technologies, discuss how businesses can gain an edge over competitors by empowering consumers to take control through IoT. We'll cite examples such as a Washington, D.C.-based sports club that leveraged IoT and the cloud to develop a comprehensive booking system. He'll also highlight how IoT can revitalize and restore outdated business models, making them profitable...
May. 29, 2016 02:00 PM EDT Reads: 2,972
The IoTs will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform. In his session at @ThingsExpo, Craig Sproule, CEO of Metavine, will demonstrate how to move beyond today's coding paradigm and share the must-have mindsets for removing complexity from the development proc...
May. 29, 2016 01:00 PM EDT Reads: 1,988
Customer experience has become a competitive differentiator for companies, and it’s imperative that brands seamlessly connect the customer journey across all platforms. With the continued explosion of IoT, join us for a look at how to build a winning digital foundation in the connected era – today and in the future. In his session at @ThingsExpo, Chris Nguyen, Group Product Marketing Manager at Adobe, will discuss how to successfully leverage mobile, rapidly deploy content, capture real-time d...
May. 29, 2016 12:45 PM EDT Reads: 1,685
What a difference a year makes. Organizations aren’t just talking about IoT possibilities, it is now baked into their core business strategy. With IoT, billions of devices generating data from different companies on different networks around the globe need to interact. From efficiency to better customer insights to completely new business models, IoT will turn traditional business models upside down. In the new customer-centric age, the key to success is delivering critical services and apps wit...
May. 29, 2016 10:30 AM EDT Reads: 1,296
Join us at Cloud Expo | @ThingsExpo 2016 – June 7-9 at the Javits Center in New York City and November 1-3 at the Santa Clara Convention Center in Santa Clara, CA – and deliver your unique message in a way that is striking and unforgettable by taking advantage of SYS-CON's unmatched high-impact, result-driven event / media packages.
May. 29, 2016 10:00 AM EDT Reads: 2,549
In his keynote at 18th Cloud Expo, Andrew Keys, Co-Founder of ConsenSys Enterprise, will provide an overview of the evolution of the Internet and the Database and the future of their combination – the Blockchain. Andrew Keys is Co-Founder of ConsenSys Enterprise. He comes to ConsenSys Enterprise with capital markets, technology and entrepreneurial experience. Previously, he worked for UBS investment bank in equities analysis. Later, he was responsible for the creation and distribution of life ...
May. 29, 2016 09:45 AM EDT Reads: 2,043
SYS-CON Events announced today that BMC Software has been named "Siver Sponsor" of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2015 at the Javits Center in New York, New York. BMC is a global leader in innovative software solutions that help businesses transform into digital enterprises for the ultimate competitive advantage. BMC Digital Enterprise Management is a set of innovative IT solutions designed to make digital business fast, seamless, and optimized from mainframe to mo...
May. 29, 2016 09:30 AM EDT Reads: 2,314
SYS-CON Events announced today that MobiDev will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. MobiDev is a software company that develops and delivers turn-key mobile apps, websites, web services, and complex software systems for startups and enterprises. Since 2009 it has grown from a small group of passionate engineers and business managers to a full-scale mobile software company with over 200 develope...
May. 29, 2016 08:15 AM EDT Reads: 2,771
SoftLayer operates a global cloud infrastructure platform built for Internet scale. With a global footprint of data centers and network points of presence, SoftLayer provides infrastructure as a service to leading-edge customers ranging from Web startups to global enterprises. SoftLayer's modular architecture, full-featured API, and sophisticated automation provide unparalleled performance and control. Its flexible unified platform seamlessly spans physical and virtual devices linked via a world...
May. 29, 2016 07:00 AM EDT Reads: 2,316
SYS-CON Events announced today that Alert Logic, Inc., the leading provider of Security-as-a-Service solutions for the cloud, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Alert Logic, Inc., provides Security-as-a-Service for on-premises, cloud, and hybrid infrastructures, delivering deep security insight and continuous protection for customers at a lower cost than traditional security solutions. Ful...
May. 29, 2016 06:45 AM EDT Reads: 2,965
Companies can harness IoT and predictive analytics to sustain business continuity; predict and manage site performance during emergencies; minimize expensive reactive maintenance; and forecast equipment and maintenance budgets and expenditures. Providing cost-effective, uninterrupted service is challenging, particularly for organizations with geographically dispersed operations.
May. 29, 2016 06:00 AM EDT Reads: 2,186
As cloud and storage projections continue to rise, the number of organizations moving to the cloud is escalating and it is clear cloud storage is here to stay. However, is it secure? Data is the lifeblood for government entities, countries, cloud service providers and enterprises alike and losing or exposing that data can have disastrous results. There are new concepts for data storage on the horizon that will deliver secure solutions for storing and moving sensitive data around the world. ...
May. 29, 2016 06:00 AM EDT Reads: 1,370
SYS-CON Events announced today TechTarget has been named “Media Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. TechTarget is the Web’s leading destination for serious technology buyers researching and making enterprise technology decisions. Its extensive global networ...
May. 29, 2016 05:15 AM EDT Reads: 3,280
SYS-CON Events announced today that Commvault, a global leader in enterprise data protection and information management, has been named “Bronze Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY, and the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Commvault is a leading provider of data protection and information management...
May. 29, 2016 04:30 AM EDT Reads: 3,267
SYS-CON Events announced today that MangoApps will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. MangoApps provides modern company intranets and team collaboration software, allowing workers to stay connected and productive from anywhere in the world and from any device. For more information, please visit https://www.mangoapps.com/.
May. 29, 2016 03:30 AM EDT Reads: 1,023
The essence of data analysis involves setting up data pipelines that consist of several operations that are chained together – starting from data collection, data quality checks, data integration, data analysis and data visualization (including the setting up of interaction paths in that visualization). In our opinion, the challenges stem from the technology diversity at each stage of the data pipeline as well as the lack of process around the analysis.
May. 29, 2016 02:45 AM EDT Reads: 1,518