@DXWorldExpo: Article

Organizing Big Data

Though your company likely has enough types of data to earn the moniker "big," you can't just throw it all in a single database

The term "big data" is bandied about so much that it seems to have lost what little meaning it had in the first place. What's so special about it, anyway? Some people think it's just a lot of data, more than most companies are used to (Google doesn't count). The real definition isn't much more … well, definite. It is, as TechTarget's glossary says, "… the voluminous amount of unstructured and semi-structured data a company creates - data that would take too much time and cost too much money to load into a relational database for analysis."

How much is "too much"? That's the tricky part; it largely depends on how you set up your database(s) and the hardware you use. For practical purposes, we'll just say that when you move your siloed data to a non-relational database, you're using big data. Many businesses are switching to a big-data-capable infrastructure, so many that Gartner Research predicts the switch will drive $232 billion in spending through 2016. Why are companies spending so much on a single area?

Though your company likely has enough types and volume of data to earn the moniker "big," you can't just throw it all in a single database and expect to see quick, reliable benefits. That is why companies will spend hundreds of billions of dollars in the next few years just to support these massive non-relational databases. You can't dump your data onto a traditional hard-disk NAS and run reports in the time it takes to get yourself a cup of tea. Hard drives are notorious for poor random I/O performance, so your big data solution needs a solid flash NAS or SAN array that can access non-sequential data quickly.

You'll have to beef up your servers, too. Running a report on a relational sales database is easy for most servers compared with running the same sales report on a big data store. Make life easier for everyone in your organization by running your database on a cloud server: public, private, in-house, outsourced, hybrid, whatever type is right for you.

But why is big data so important? What makes it attractive to so many companies? Again, we'll turn to TechTarget, though I won't quote it directly:

· Patterns: Whether you're a B2B shipping container manufacturer or a B2C online retailer, you need more information about your business, your customers and your industry. The more data you collect in one place, the easier it is to see patterns. What's your busiest time of year? Your bestselling product/service? Are most of your customers from the same region? Are people more likely to buy from you after reading 12 or more pages on your website? Having that data available and waiting for you to dig into is invaluable.

· Space: How much duplicate data is your business storing that you could eliminate by combining all your disparate databases? Are you perhaps collecting too much data? I know my previous point harped on collecting more data, but if you're storing information that serves no purpose, it's a lot easier to eliminate if it's all in a single location.

· Legal Compliance: If you don't have a complete view of the data you retain, it's hard to make sure you're complying with all applicable laws and policies. If someone sues your company or you undergo a compliance audit, you need to have easy access to all the information you're storing.
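
To make the first two points a little more concrete, here is a minimal Python sketch of what combining two departmental data sets might look like. Everything in it is hypothetical: the sample records, the field names (customer, region, product, date) and the helper functions combine and top are made up for illustration, and a real big data platform would do this work at a far larger scale with its own tooling.

```python
from collections import Counter
from datetime import date

# Hypothetical records from two siloed departmental databases.
sales_db = [
    {"customer": "acme@example.com", "region": "West", "product": "Widget", "date": date(2014, 11, 28)},
    {"customer": "bolt@example.com", "region": "West", "product": "Gadget", "date": date(2014, 12, 1)},
    {"customer": "core@example.com", "region": "East", "product": "Widget", "date": date(2014, 12, 15)},
]
support_db = [
    # Exact copy of the first sales record, stored a second time.
    {"customer": "acme@example.com", "region": "West", "product": "Widget", "date": date(2014, 11, 28)},
    {"customer": "dyna@example.com", "region": "West", "product": "Widget", "date": date(2014, 6, 3)},
]


def combine(*sources):
    """Merge records from several sources, dropping exact duplicates."""
    seen = set()
    combined = []
    duplicates = 0
    for source in sources:
        for record in source:
            # A hashable fingerprint of the whole record.
            key = tuple(sorted(record.items()))
            if key in seen:
                duplicates += 1
                continue
            seen.add(key)
            combined.append(record)
    return combined, duplicates


def top(records, field):
    """Return the most common value of a field and how often it occurs."""
    return Counter(r[field] for r in records).most_common(1)[0]


records, duplicates = combine(sales_db, support_db)
busiest_month = Counter(r["date"].month for r in records).most_common(1)[0]

print("Records kept:", len(records), "duplicates dropped:", duplicates)
print("Bestselling product:", top(records, "product"))
print("Top region:", top(records, "region"))
print("Busiest month:", busiest_month)
```

Once the records sit in one place, the questions from the list above (busiest time of year, bestselling product, most common region) become simple counts, and the duplicate copy is easy to spot and drop.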

It's easy to dismiss big data as some overhyped phrase people throw around, but when you dive deeper, you realize that it serves a practical purpose. The days of every department having its own siloed database aren't over yet, but they should be. The advantages of big data ensure that.

More Stories By Joseph Parker

Joseph Parker has worked in management, supply chain metrics, and business/marketing strategy with small and large businesses for more than 10 years. His experience in development is personal, stemming from his work in mobile marketing and application technology. He is an avid reader of industry publications and follows the ongoing technological trends stemming from software and product development. He is an inbound marketer, avid blogger, and content provider for many business blogs.
