How to Get More ROI from Big Data By @NitinBandugula | @BigDataExpo [#BigData]

Expand your use case options with Apache Drill

How to Get More ROI from Big Data and Hadoop with Apache Drill

Many businesses have run a Hadoop Big Data system dedicated to a specific use case. Perhaps they are collecting call center records, analyzing sensor reports from the factory floor, or monitoring tweets to track customer experience in real time.

Confining Big Data-driven projects to select initiatives made sense at first, as many early Big Data analysis solutions were optimized for a limited set of use cases. But solution options have matured and expanded, as have the data sources businesses draw from. To get the best out of your Big Data investment now - and take full advantage of the famous "three Vs of Big Data": volume, variety, velocity - you'll want to begin planning your shift from the single use case stage to a multiple use case scenario.

Expand Your Use Case Options with Apache Drill
Utilizing data-driven intelligence across the enterprise requires solutions that enable interactive, self-service ways of working with historical and near real-time data. The core Hadoop platform has already solved many of the fundamental (legacy) Big Data access and availability problems. With the addition of the standalone query engine Apache Drill, data analysts finally have the freedom to follow their data queries easily across multiple data sources, on demand.

Apache Drill was designed to support a wide range of SQL use cases on Big Data. Drill is particularly well-suited to situations that require low-latency performance, including interactive query environments (OLAP, self-service BI, data visualization), investigative analytics (data science/exploration), and Day Zero analytics on near real-time data. It enables efficient analytics operations across a range of data sources and formats, including JSON, Parquet and HBase tables.
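
To make the "SQL on raw files" idea concrete, here is a minimal sketch of submitting one statement to Drill over its REST interface from Python. It assumes a Drill instance running locally with the web interface on port 8047; the helper names build_drill_request and run_query are hypothetical and exist only for this illustration. The sample statement reads employee.json, a sample file available through Drill's classpath (cp) storage plugin, without any table definition.

```python
import json
import urllib.request

# Assumption: a local Drill instance exposing its REST interface on port 8047.
DRILL_URL = "http://localhost:8047/query.json"


def build_drill_request(sql, url=DRILL_URL):
    """Build (but do not send) an HTTP POST carrying one SQL statement."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the statement to Drill and return the decoded JSON response."""
    with urllib.request.urlopen(build_drill_request(sql, url)) as resp:
        return json.load(resp)


# The JSON file is queried in place; no schema or table is declared first.
SAMPLE_SQL = "SELECT * FROM cp.`employee.json` LIMIT 3"
```

Calling run_query(SAMPLE_SQL) requires a reachable Drill instance; build_drill_request can be inspected without one.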

Drill's efficiency across multiple use cases comes in great part from its architecture. Drill is built on hierarchically organized modules called drillbits, which are responsible for executing SQL statements. A drillbit is installed on each node that holds data, and is capable of executing SQL queries on the data that it manages. When data is stored across many nodes, all applicable drillbits process the query, parallelizing its execution. Applications accessing Drill are "connected" to different drillbits, avoiding availability bottlenecks and ensuring data locality.
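
The parallel execution described above can be pictured with a toy scatter-and-gather model. This is not Drill's implementation, only a sketch of the idea: each hypothetical node runs the same query fragment against the rows it holds, and the partial results are merged afterwards. The node names, the sample rows, and the functions local_fragment and run_distributed are all invented for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cluster: each node name maps to the rows stored on that node.
NODE_DATA = {
    "node-1": [{"region": "east", "calls": 12}, {"region": "west", "calls": 7}],
    "node-2": [{"region": "east", "calls": 5}],
    "node-3": [{"region": "west", "calls": 9}, {"region": "east", "calls": 1}],
}


def local_fragment(rows):
    """Work done on one node: partial SUM(calls) GROUP BY region on local rows."""
    partial = {}
    for row in rows:
        partial[row["region"]] = partial.get(row["region"], 0) + row["calls"]
    return partial


def run_distributed(node_data):
    """Run the fragment on every node in parallel, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(node_data)) as pool:
        partials = list(pool.map(local_fragment, node_data.values()))
    merged = {}
    for partial in partials:
        for region, total in partial.items():
            merged[region] = merged.get(region, 0) + total
    return merged


totals = run_distributed(NODE_DATA)
```

Because each fragment touches only the rows on its own node, the work scales with the number of nodes and the data never has to be moved to a central place before it is aggregated.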

Self-Service Data Exploration On-Demand
Drill is the only SQL engine for Hadoop that doesn't require schemas to be created and maintained, or data to be transformed, before it can be queried. Data analysts can query data in its native formats, including nested data, self-describing data, and data with dynamic schemas, because Drill automatically leverages the structure embedded in the data. Self-service data exploration is finally a reality: data can be worked with immediately upon its arrival, and analysts can change and expand their data sources on the fly without waiting for IT services to structure newly requested data.
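
As a rough illustration of what querying self-describing data without a predefined schema means, the plain Python sketch below reads JSON records whose fields differ from one record to the next and discovers the structure while reading. It does not use Drill and does not show how Drill works internally; the sample records and the functions get_path, select, and discover_columns are hypothetical.

```python
import json

# Hypothetical raw input: one JSON document per line, no schema declared,
# and records that do not all carry the same fields.
RAW_LINES = """
{"id": 1, "customer": {"name": "Ada", "city": "Oslo"}, "tags": ["vip"]}
{"id": 2, "customer": {"name": "Lin"}, "channel": "web"}
{"id": 3, "customer": {"name": "Sam", "city": "Lima"}, "channel": "phone"}
""".strip().splitlines()


def get_path(record, path):
    """Follow a dotted path such as 'customer.city'; missing fields yield None."""
    value = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def select(lines, paths):
    """Project the requested paths from each record, reading structure on the fly."""
    for line in lines:
        record = json.loads(line)
        yield {path: get_path(record, path) for path in paths}


def discover_columns(lines):
    """Collect every dotted path that appears in at least one record."""
    found = set()

    def walk(value, prefix):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{prefix}.{key}" if prefix else key)
        else:
            found.add(prefix)

    for line in lines:
        walk(json.loads(line), "")
    return sorted(found)


rows = list(select(RAW_LINES, ["id", "customer.city", "channel"]))
columns = discover_columns(RAW_LINES)
```

A record that lacks a requested field simply yields None for it, so a new field appearing in tomorrow's data needs no schema change before it can be selected.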

Analysts can also leverage their existing SQL skills and BI tools to directly query self-describing data and process complex data types. Of course, Hadoop hasn't lacked for SQL or SQL-comparable solutions - but many were designed from a historical perspective, reengineering old-school tools for Big Data usage. These projects filled a real need, but solutions must now be built to support the myriad data-producing sources we utilize today, as well as the ways that we transform Big Data into actionable intelligence.

Drill has been tested by the open source community - and it was designed to be extensible. New data sources, file formats, operators, and query languages can be easily added via user-defined functions or custom storage plugins for traditional data sources.

Drill: The Future of Big Data Exploration
Apache Drill was initially inspired by Google's Dremel project, and the open source community has worked hard to develop Drill into the ideal interactive SQL engine for Hadoop. The success of these efforts was recently acknowledged officially by the Apache Software Foundation, which announced in December 2014 that it had promoted Drill to a top-level project at Apache.

As a top-level project, Drill joins other illustrious projects such as Apache Hadoop and httpd (the world's most popular Web server). Drill now has its own Project Management Committee (PMC), and users can be confident that the project has proven itself, has a viable roadmap for development, and can be deployed for mission-critical use in the long term.

If you're ready to test-drive Drill, you can do so using the MapR Sandbox for Hadoop, which runs on PC, Mac, and Linux platforms. MapR Technologies is the provider of the top-ranked distribution for Apache Hadoop.

You can also view a tutorial on analyzing real-world data using Apache Drill.

More Stories By Nitin Bandugula

As a Sr. Product Marketing Manager at MapR, Nitin brings his engineering, business and management skills together to market technology products. At MapR, Nitin focuses on SQL, batch and in-memory frameworks and streaming technologies on Hadoop. Prior to MapR, Nitin worked for enterprise companies and startups in various roles including Engineering, Product Management and Management Consulting. Nitin holds a Master's degree in Computer Science from the Illinois Institute of Technology and an MBA from the Johnson School at Cornell University.
