Click here to close now.

Welcome!

Apache Authors: Yeshim Deniz, Pat Romanski, Carmen Gonzalez, Elizabeth White, Liz McMillan

Blog Feed Post

The PMML Revolution: Predictive analytics at the speed of business

This guest post is by Alex Guazzelli, VP of Analytics at Zementis Inc. -- ed. PMML, the Predictive Model Markup Language, is the de facto standard to represent predictive analytics and data mining models. With PMML, it is extremely easy to move a predictive solution from one system to another, since it avoids proprietary issues and incompatibilities. Companies around the globe are benefiting from PMML to make instant use of their predictive solutions. With PMML, there is no need for custom coding: you can easily move your solution from the scientist’s desktop, where it was built, to the production environment, where it is operationally deployed. Companies also use PMML as the common language between service providers and external vendors. In this way, it defines a single and clear process for the exchange of predictive solutions. It becomes the bridge not only between data analysis, model building, and deployment systems, but also between all the people and teams involved in the analytical process. This is extremely important, since PMML is used to disseminate knowledge and best practices, and to ensure transparency. All the top analytical tools, commercial and open-source, support PMML. And, the language itself has reached a great level of maturity and refinement. PMML 4.1, its latest version, makes it extremely easy for predictive solutions to be represented in an open and standard way. With PMML, you can represent a myriad of pre- and post-processing steps, besides the predictive modeling techniques per se. PMML 4.1 allows for multiple models (model composition, chaining, segmentation, and ensemble, which includes random forest models), to be represented by a single and concise language element. It also allows for model outputs to be transformed into business decisions. Therefore, a PMML file is able to represent the entire solution, from raw data to business decision, with one or multiple predictive models. The availability of a standard such as PMML combined with scoring solutions in the cloud, for Hadoop, and in-database make it possible for predictive analytics to fulfill its promise and crack the big data code. Zementis, Inc. has been in the forefront of PMML-based scoring, first through its ADAPA Scoring Engine, which is available for on-site deployment or as a service on cloud (Amazon and IBM), and lately through its Universal PMML Plug-in which is offered for a range of databases and for Hadoop. Zementis has partnered with Revolution Analytics, so that predictive solutions built in R can benefit from the vast scoring infrastructure already in place. I am proud to be associated with Zementis and excited to be part of an ever-growing PMML community. A PMML package for R that exports all kinds of predictive models is available directly from CRAN. Traditionally, the PMML Package offered support for the following data mining algorithms: ksvm (kernlab): Support Vector Machines nnet: Neural Networks rpart: C&RT Decision Trees  lm & glm (stats): Linear and Binary Logistic Regression Models  arules: Association Rules kmeans and hclust: Clustering Models  Recently, it has been expanded to support:  multinom (nnet): Multinomial Logistic Regression Models; glm (stats): Generalized Linear Models for classification and regression with a wide variety of link functions  randomForest: Random Forest Models for classification and regression (click HERE for examples); rsf (randomSurvivalForest): Random Survival Forest Models; And, this expansion is still on-going as the R community implements support for other packages and techniques. For more on the PMML package, please take a look at the paper we published with Graham Williams from Togaware in “The R Journal”. For that just follow the link below: PMML: An Open Standard for Sharing Models There may be quite a few reasons for you to move your predictive solution from R to an independent deployment platform. Among them, you may want parallel execution on big data or real-time scoring for applications such as fraud detection or recommender systems. With PMML you can easily move your model to the cloud or inside the database for scoring. Or, even have it executed on Hadoop. It is really up to you! On top of that, PMML allows for side-by-side deployment of predictive assets from R as well as other commercial data mining tools, supporting a multi-vendor environment as well as platform independent deployment. More and more companies and individuals are using the PMML standard for the obvious benefits it provides, putting their predictive solutions on the fast track. With PMML, the speed of predictive solutions can be on par with the speed of business. Dr. Alex Guazzelli is the VP of Analytics at Zementis Inc. where he is responsible for developing core technology and predictive solutions under ADAPA, a PMML-based decisioning platform. With more than 20 years of experience in predictive analytics, Dr. Guazzelli holds a PhD in Computer Science from the University of Southern California and has co-authored the book PMML in Action: Unleashing the Power of Open Standards for Data Mining and Predictive Analytics, now in its second edition (paperback and kindle). You can follow him at @DrAlexGuazzelli.

Read the original blog entry...

More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire) overseeing the product management of S-PLUS and other statistical and data mining products.<

David smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on twitter at @RevoDavid

@ThingsExpo Stories
There's Big Data, then there's really Big Data from the Internet of Things. IoT is evolving to include many data possibilities like new types of event, log and network data. The volumes are enormous, generating tens of billions of logs per day, which raise data challenges. Early IoT deployments are relying heavily on both the cloud and managed service providers to navigate these challenges. In her session at Big Data Expo®, Hannah Smalltree, Director at Treasure Data, discussed how IoT, Big Data and deployments are processing massive data volumes from wearables, utilities and other machines...
Buzzword alert: Microservices and IoT at a DevOps conference? What could possibly go wrong? In this Power Panel at DevOps Summit, moderated by Jason Bloomberg, the leading expert on architecting agility for the enterprise and president of Intellyx, panelists will peel away the buzz and discuss the important architectural principles behind implementing IoT solutions for the enterprise. As remote IoT devices and sensors become increasingly intelligent, they become part of our distributed cloud environment, and we must architect and code accordingly. At the very least, you'll have no problem fil...
SYS-CON Events announced today that MetraTech, now part of Ericsson, has been named “Silver Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York, NY. Ericsson is the driving force behind the Networked Society- a world leader in communications infrastructure, software and services. Some 40% of the world’s mobile traffic runs through networks Ericsson has supplied, serving more than 2.5 billion subscribers.
The 4th International Internet of @ThingsExpo, co-located with the 17th International Cloud Expo - to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA - announces that its Call for Papers is open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than 20 years ago.
The worldwide cellular network will be the backbone of the future IoT, and the telecom industry is clamoring to get on board as more than just a data pipe. In his session at @ThingsExpo, Evan McGee, CTO of Ring Plus, Inc., discussed what service operators can offer that would benefit IoT entrepreneurs, inventors, and consumers. Evan McGee is the CTO of RingPlus, a leading innovative U.S. MVNO and wireless enabler. His focus is on combining web technologies with traditional telecom to create a new breed of unified communication that is easily accessible to the general consumer. With over a de...
Disruptive macro trends in technology are impacting and dramatically changing the "art of the possible" relative to supply chain management practices through the innovative use of IoT, cloud, machine learning and Big Data to enable connected ecosystems of engagement. Enterprise informatics can now move beyond point solutions that merely monitor the past and implement integrated enterprise fabrics that enable end-to-end supply chain visibility to improve customer service delivery and optimize supplier management. Learn about enterprise architecture strategies for designing connected systems tha...
Cloud is not a commodity. And no matter what you call it, computing doesn’t come out of the sky. It comes from physical hardware inside brick and mortar facilities connected by hundreds of miles of networking cable. And no two clouds are built the same way. SoftLayer gives you the highest performing cloud infrastructure available. One platform that takes data centers around the world that are full of the widest range of cloud computing options, and then integrates and automates everything. Join SoftLayer on June 9 at 16th Cloud Expo to learn about IBM Cloud's SoftLayer platform, explore se...
SYS-CON Media announced today that 9 out of 10 " most read" DevOps articles are published by @DevOpsSummit Blog. Launched in October 2014, @DevOpsSummit Blog offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce softw...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo in Silicon Valley. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 17th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal an...
15th Cloud Expo, which took place Nov. 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA, expanded the conference content of @ThingsExpo, Big Data Expo, and DevOps Summit to include two developer events. IBM held a Bluemix Developer Playground on November 5 and ElasticBox held a Hackathon on November 6. Both events took place on the expo floor. The Bluemix Developer Playground, for developers of all levels, highlighted the ease of use of Bluemix, its services and functionality and provide short-term introductory projects that developers can complete between sessions.
From telemedicine to smart cars, digital homes and industrial monitoring, the explosive growth of IoT has created exciting new business opportunities for real time calls and messaging. In his session at @ThingsExpo, Ivelin Ivanov, CEO and Co-Founder of Telestax, shared some of the new revenue sources that IoT created for Restcomm – the open source telephony platform from Telestax. Ivelin Ivanov is a technology entrepreneur who founded Mobicents, an Open Source VoIP Platform, to help create, deploy, and manage applications integrating voice, video and data. He is the co-founder of TeleStax, a...
The Internet of Things (IoT) promises to evolve the way the world does business; however, understanding how to apply it to your company can be a mystery. Most people struggle with understanding the potential business uses or tend to get caught up in the technology, resulting in solutions that fail to meet even minimum business goals. In his session at @ThingsExpo, Jesse Shiah, CEO / President / Co-Founder of AgilePoint Inc., showed what is needed to leverage the IoT to transform your business. He discussed opportunities and challenges ahead for the IoT from a market and technical point of vie...
The 3rd International @ThingsExpo, co-located with the 16th International Cloud Expo – to be held June 9-11, 2015, at the Javits Center in New York City, NY – is now accepting Hackathon proposals. Hackathon sponsorship benefits include general brand exposure and increasing engagement with the developer ecosystem. At Cloud Expo 2014 Silicon Valley, IBM held the Bluemix Developer Playground on November 5 and ElasticBox held the DevOps Hackathon on November 6. Both events took place on the expo floor. The Bluemix Developer Playground, for developers of all levels, highlighted the ease of use of...
Grow your business with enterprise wearable apps using SAP Platforms and Google Glass. SAP and Google just launched the SAP and Google Glass Challenge, an opportunity for you to innovate and develop the best Enterprise Wearable App using SAP Platforms and Google Glass and gain valuable market exposure. In his session at @ThingsExpo, Brian McPhail, Senior Director of Business Development, ISVs & Digital Commerce at SAP, outlined the timeline of the SAP Google Glass Challenge and the opportunity for developers, start-ups, and companies of all sizes to engage with SAP today.
Enthusiasm for the Internet of Things has reached an all-time high. In 2013 alone, venture capitalists spent more than $1 billion dollars investing in the IoT space. With "smart" appliances and devices, IoT covers wearable smart devices, cloud services to hardware companies. Nest, a Google company, detects temperatures inside homes and automatically adjusts it by tracking its user's habit. These technologies are quickly developing and with it come challenges such as bridging infrastructure gaps, abiding by privacy concerns and making the concept a reality. These challenges can't be addressed w...
The industrial software market has treated data with the mentality of “collect everything now, worry about how to use it later.” We now find ourselves buried in data, with the pervasive connectivity of the (Industrial) Internet of Things only piling on more numbers. There’s too much data and not enough information. In his session at @ThingsExpo, Bob Gates, Global Marketing Director, GE’s Intelligent Platforms business, to discuss how realizing the power of IoT, software developers are now focused on understanding how industrial data can create intelligence for industrial operations. Imagine ...
SYS-CON Events announced today that Liaison Technologies, a leading provider of data management and integration cloud services and solutions, has been named "Silver Sponsor" of SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York, NY. Liaison Technologies is a recognized market leader in providing cloud-enabled data integration and data management solutions to break down complex information barriers, enabling enterprises to make smarter decisions, faster.
The 17th International Cloud Expo has announced that its Call for Papers is open. 17th International Cloud Expo, to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, APM, APIs, Microservices, Security, Big Data, Internet of Things, DevOps and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal today!
Hadoop as a Service (as offered by handful of niche vendors now) is a cloud computing solution that makes medium and large-scale data processing accessible, easy, fast and inexpensive. In his session at Big Data Expo, Kumar Ramamurthy, Vice President and Chief Technologist, EIM & Big Data, at Virtusa, will discuss how this is achieved by eliminating the operational challenges of running Hadoop, so one can focus on business growth. The fragmented Hadoop distribution world and various PaaS solutions that provide a Hadoop flavor either make choices for customers very flexible in the name of opti...
Cultural, regulatory, environmental, political and economic (CREPE) conditions over the past decade are creating cross-industry solution spaces that require processes and technologies from both the Internet of Things (IoT), and Data Management and Analytics (DMA). These solution spaces are evolving into Sensor Analytics Ecosystems (SAE) that represent significant new opportunities for organizations of all types. Public Utilities throughout the world, providing electricity, natural gas and water, are pursuing SmartGrid initiatives that represent one of the more mature examples of SAE. We have s...