Welcome!

Apache Authors: Liz McMillan, Pat Romanski, Elizabeth White, Janakiram MSV, Gil Allouche

Blog Feed Post

NoHadoop

NoHadoop is not only Hadoop. Why?

According to the 2014 Big Data & Advanced Analytics Survey conducted by the market research firm Evans Data, only 16% of over 400 developers surveyed worldwide indicated that Hadoop batch processing was satisfactory in all use cases. 71% of developers also expressed a need for real-time complex event processing more than half the time in their applications, and 27% said they use it all the time.

Hadoop has evolved from MapReduce and HDFS in the very beginning to a set of technologies, including Hive, HBase, Sqoop, Flume, Pig, Mahout, etc. Though, Hadoop was originally designed for batch processing. It was based on the map and reduce programming model and it is an overstretch for real-time transactions. Various efforts have been made to enhance Hadoop. For example, YARN was designed to decouple the resource management in the underlying execution engine, so non-MapReduce workloads can be handled. Cloudera rewrote the SQL execution on Hive to roll out Impala. Similarly, Hortonworks has a faster SQL interface in Stinger. MapR took a different route by revamping the file system and the HBase table structure for higher performance. Google also unveiled Cloud Dataflow a couple of months ago, which replaces the algorithms behind Hadoop.

It becomes obvious that Hadoop is not enough for Big Data. Hadoop is not always the best tool for particular business needs. For instance, Hadoop is good for broad exploratory analysis of large data sets, while it makes more sense to leverage a RDBMS for an operational analysis of what was uncovered. Relational databases can better store processed, transformed, aggregated and refined data. Semi-structured data suits well in other document-based alternatives like MongoDB or CouchDB. Graph data stores like Neo4j are the best fit for graphic data.

A rule of thumb is choose wisely to use the right technology for what needs to be done. Better yet, a hybrid architecture of Hadoop and NoHadoop provides the best-of-breed combination to fulfill varying requirements.

For more information, please contact Tony Shan ([email protected]). ©Tony Shan. All rights reserved.

Read the original blog entry...

More Stories By Tony Shan

Tony Shan works as a senior consultant, advisor at a global applications and infrastructure solutions firm helping clients realize the greatest value from their IT. Shan is a renowned thought leader and technology visionary with a number of years of field experience and guru-level expertise on cloud computing, Big Data, Hadoop, NoSQL, social, mobile, SOA, BI, technology strategy, IT roadmapping, systems design, architecture engineering, portfolio rationalization, product development, asset management, strategic planning, process standardization, and Web 2.0. He has directed the lifecycle R&D and buildout of large-scale award-winning distributed systems on diverse platforms in Fortune 100 companies and public sector like IBM, Bank of America, Wells Fargo, Cisco, Honeywell, Abbott, etc.

Shan is an inventive expert with a proven track record of influential innovations such as Cloud Engineering. He has authored dozens of top-notch technical papers on next-generation technologies and over ten books that won multiple awards. He is a frequent keynote speaker and Chair/Panel/Advisor/Judge/Organizing Committee in prominent conferences/workshops, an editor/editorial advisory board member of IT research journals/books, and a founder of several user groups, forums, and centers of excellence (CoE).

@ThingsExpo Stories
The Transparent Cloud-computing Consortium (abbreviation: T-Cloud Consortium) will conduct research activities into changes in the computing model as a result of collaboration between "device" and "cloud" and the creation of new value and markets through organic data processing High speed and high quality networks, and dramatic improvements in computer processing capabilities, have greatly changed the nature of applications and made the storing and processing of data on the network commonplace.
An IoT product’s log files speak volumes about what’s happening with your products in the field, pinpointing current and potential issues, and enabling you to predict failures and save millions of dollars in inventory. But until recently, no one knew how to listen. In his session at @ThingsExpo, Dan Gettens, Chief Research Officer at OnProcess, will discuss recent research by Massachusetts Institute of Technology and OnProcess Technology, where MIT created a new, breakthrough analytics model f...
The Internet of Things can drive efficiency for airlines and airports. In their session at @ThingsExpo, Shyam Varan Nath, Principal Architect with GE, and Sudip Majumder, senior director of development at Oracle, will discuss the technical details of the connected airline baggage and related social media solutions. These IoT applications will enhance travelers' journey experience and drive efficiency for the airlines and the airports. The session will include a working demo and a technical d...
In this strange new world where more and more power is drawn from business technology, companies are effectively straddling two paths on the road to innovation and transformation into digital enterprises. The first path is the heritage trail – with “legacy” technology forming the background. Here, extant technologies are transformed by core IT teams to provide more API-driven approaches. Legacy systems can restrict companies that are transitioning into digital enterprises. To truly become a lea...
Almost two-thirds of companies either have or soon will have IoT as the backbone of their business in 2016. However, IoT is far more complex than most firms expected. How can you not get trapped in the pitfalls? In his session at @ThingsExpo, Tony Shan, a renowned visionary and thought leader, will introduce a holistic method of IoTification, which is the process of IoTifying the existing technology and business models to adopt and leverage IoT. He will drill down to the components in this fra...
Digital transformation is too big and important for our future success to not understand the rules that apply to it. The first three rules for winning in this age of hyper-digital transformation are: Advantages in speed, analytics and operational tempos must be captured by implementing an optimized information logistics system (OILS) Real-time operational tempos (IT, people and business processes) must be achieved Businesses that can "analyze data and act and with speed" will dominate those t...
If you had a chance to enter on the ground level of the largest e-commerce market in the world – would you? China is the world’s most populated country with the second largest economy and the world’s fastest growing market. It is estimated that by 2018 the Chinese market will be reaching over $30 billion in gaming revenue alone. Admittedly for a foreign company, doing business in China can be challenging. Often changing laws, administrative regulations and the often inscrutable Chinese Interne...
I'm a lonely sensor. I spend all day telling the world how I'm feeling, but none of the other sensors seem to care. I want to be connected. I want to build relationships with other sensors to be more useful for my human. I want my human to understand that when my friends next door are too hot for a while, I'll soon be flaming. And when all my friends go outside without me, I may be left behind. Don't just log my data; use the relationship graph. In his session at @ThingsExpo, Ryan Boyd, Engi...
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 19th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world and ThingsExpo Silicon Valley Call for Papers is now open.
Adobe is changing the world though digital experiences. Adobe helps customers develop and deliver high-impact experiences that differentiate brands, build loyalty, and drive revenue across every screen, including smartphones, computers, tablets and TVs. Adobe content solutions are used daily by millions of companies worldwide-from publishers and broadcasters, to enterprises, marketing agencies and household-name brands. Building on its established design leadership, Adobe enables customers not o...
If you’re responsible for an application that depends on the data or functionality of various IoT endpoints – either sensors or devices – your brand reputation depends on the security, reliability, and compliance of its many integrated parts. If your application fails to deliver the expected business results, your customers and partners won't care if that failure stems from the code you developed or from a component that you integrated. What can you do to ensure that the endpoints work as expect...
Just over a week ago I received a long and loud sustained applause for a presentation I delivered at this year’s Cloud Expo in Santa Clara. I was extremely pleased with the turnout and had some very good conversations with many of the attendees. Over the next few days I had many more meaningful conversations and was not only happy with the results but also learned a few new things. Here is everything I learned in those three days distilled into three short points.
WebRTC adoption has generated a wave of creative uses of communications and collaboration through websites, sales apps, customer care and business applications. As WebRTC has become more mainstream it has evolved to use cases beyond the original peer-to-peer case, which has led to a repeating requirement for interoperability with existing infrastructures. In his session at @ThingsExpo, Graham Holt, Executive Vice President of Daitan Group, will cover implementation examples that have enabled ea...
SYS-CON Events announced today that ReadyTalk, a leading provider of online conferencing and webinar services, has been named Vendor Presentation Sponsor at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. ReadyTalk delivers audio and web conferencing services that inspire collaboration and enable the Future of Work for today’s increasingly digital and mobile workforce. By combining intuitive, innovative tec...
Fifty billion connected devices and still no winning protocols standards. HTTP, WebSockets, MQTT, and CoAP seem to be leading in the IoT protocol race at the moment but many more protocols are getting introduced on a regular basis. Each protocol has its pros and cons depending on the nature of the communications. Does there really need to be only one protocol to rule them all? Of course not. In his session at @ThingsExpo, Chris Matthieu, co-founder and CTO of Octoblu, walk you through how Oct...
There is growing need for data-driven applications and the need for digital platforms to build these apps. In his session at 19th Cloud Expo, Muddu Sudhakar, VP and GM of Security & IoT at Splunk, will cover different PaaS solutions and Big Data platforms that are available to build applications. In addition, AI and machine learning are creating new requirements that developers need in the building of next-gen apps. The next-generation digital platforms have some of the past platform needs a...
Smart Cities are here to stay, but for their promise to be delivered, the data they produce must not be put in new siloes. In his session at @ThingsExpo, Mathias Herberts, Co-founder and CTO of Cityzen Data, will deep dive into best practices that will ensure a successful smart city journey.
Businesses are struggling to manage the information flow and interactions between all of these new devices and things jumping on their network, and the apps and IT systems they control. The data businesses gather is only helpful if they can do something with it. In his session at @ThingsExpo, Chris Witeck, Principal Technology Strategist at Citrix, will discuss how different the impact of IoT will be for large businesses, expanding how IoT will allow large organizations to make their legacy ap...
Major trends and emerging technologies – from virtual reality and IoT, to Big Data and algorithms – are helping organizations innovate in the digital era. However, to create real business value, IT must think beyond the ‘what’ of digital transformation to the ‘how’ to harness emerging trends, innovation and disruption. Architecture is the key that underpins and ties all these efforts together. In the digital age, it’s important to invest in architecture, extend the enterprise footprint to the cl...
SYS-CON Events announced today that Numerex Corp, a leading provider of managed enterprise solutions enabling the Internet of Things (IoT), will exhibit at the 19th International Cloud Expo | @ThingsExpo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Numerex Corp. (NASDAQ:NMRX) is a leading provider of managed enterprise solutions enabling the Internet of Things (IoT). The Company's solutions produce new revenue streams or create operating...