By Bob Gourley
December 7, 2012 02:34 PM EST
Cloudera University is offering a new training course on data science titled Introduction to Data Science – Building Recommender Systems. The course is coming to the Washington, DC area on 20-22 Feb 2013.
If history is any guide, this course will fill up fast. My recommendation: look over the course outline below and register right away. CTOvision readers can attend the course at a 10% discount, so be sure to use the code below when you book.
The code is: ClouderaFE_10
Here is more on the course plus the registration link:
Introduction to Data Science – Building Recommender Systems
Course Summary
This hands-on course is suitable for software engineers, data analysts and statisticians. It is problem-driven and focuses on helping participants understand what a data scientist does, the problems they typically solve and their approach to doing so. By taking a practical approach to the subject, including multiple hands-on exercises, participants will leave the course with skills they can immediately apply to real-world problems.
Download the full agenda for Cloudera’s Introduction to Data Science.
Read the blog post: Training a New Generation of Data Scientists.
Duration
3 days.
You Will Learn
- Describe the role and responsibilities of a data scientist
- Explain several ways in which data scientists create value for organizations across many industries
- Locate and acquire data from diverse sources
- Use transformation and normalization techniques to produce accurate, useful data sets
- Determine the most appropriate type of analysis to perform for a given problem
- Implement an automated recommendation system
- Develop, evaluate and refine scoring systems for recommenders
- Understand the considerations involved in working at scale
- Identify meaningful, actionable and business-oriented results from the analysis
Prerequisites
This course is suitable for software engineers, data analysts and statisticians. A basic knowledge of Hadoop is assumed: use of HDFS and awareness of the MapReduce framework, Hadoop Streaming and Hive. Students should be proficient in a scripting language; Python is strongly preferred, although students familiar with another language such as Perl or Ruby should be able to complete the exercises.
Outline
- Introduction
- Data Science Overview
- Use Cases
- Project Lifecycle
- Data Acquisition
- Evaluating Input Data
- Data Transformation
- Data Analysis and Statistical Methods
- Fundamentals of Machine Learning
- Recommender Overview
- Introduction to Apache Mahout
- Implementing Recommenders with Apache Mahout
- Experimentation and Evaluation
- Production Deployment and Beyond
- Conclusion
- Appendix A: Hadoop Overview
- Appendix B: Mathematical Formulas
- Appendix C: Language and Tool Reference
To register: http://university.cloudera.com/training/data_science/introduction_to_data_science_-_building_recommender_systems.html

Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
About the Author
Bob Gourley, former CTO of the Defense Intelligence Agency (DIA), is Founder and CTO of Crucial Point LLC, a technology research and advisory firm providing fact-based technology reviews in support of venture capital, private equity and emerging technology firms. He has extensive industry experience in intelligence and security and received an intelligence community meritorious achievement award from AFCEA in 2008. He has also been recognized as an InfoWorld Top 25 CTO and as one of the most fascinating communicators in Government IT by GovFresh.