Stanley Black & Decker is a company known for its consumer and professional tools, storage solutions, and other products. It has quite a few well-known brands under its umbrella, including DeWalt, Proto, Lista, and Craftsman. However, tools and storage make up only half of its business. Stanley also offers various solutions in consumer and residential security and … [Read more...] about How To Optimize Product Design With AutoML (H2O Driverless AI Case Study)
Infrastructure
Comparing Machine Learning as a Service: Amazon, Microsoft Azure, Google Cloud AI, IBM Watson
For most businesses, machine learning seems close to rocket science, appearing expensive and talent-demanding. And if you’re aiming at building another Netflix recommendation system, it really is. But the trend of making everything-as-a-service has affected this sophisticated sphere, too. You can jump-start an ML initiative without much investment, which would be the right …
4 Common Pitfalls In Putting A Machine Learning Model In Production
I spoke at a conference recently and one of the talks really resonated with me. It revolved around hosting, securing, and productionizing machine learning models. The speaker asked the audience, “Who in this room has developed a machine learning or artificial intelligence model for their business?” Being a technology conference, 80–90% of the hands shot up. “Now,” he …
How To Crowdsource Labeled Datasets Quickly With Open-Source Tools Like Snorkel
Getting sufficient amounts of labeled training data is a major bottleneck for many machine learning (ML) projects. You can create fancy models, but they will be of little value if domain experts need to spend years labeling the relevant dataset. That is particularly relevant in areas where data labelers need a high level of expertise, such as medical applications …
How Airbnb Solves Enterprise-Scale Data Challenges For Machine Learning
Traditional data warehouses are built for Business Intelligence analytics, CEO Dashboards, and other types of business reporting prepared for “human consumption.” That often implies that data in these warehouses is not ready for “machine consumption,” including machine learning (ML) models. For example, it is mostly sufficient for humans to know the date of a particular event, …
Solving Data Challenges In Machine Learning With Automated Tools
Data is the lifeblood of machine learning (ML) projects. At the same time, the data preparation process is one of the main challenges that plague most projects. According to a recent study, data preparation tasks take more than 80% of the time spent on ML projects. Data scientists spend most of their time on data cleaning (25%), labeling (25%), augmentation (15%), aggregation …
Overview of the Different Approaches to Putting Machine Learning Models in Production
There are different approaches to putting models into production, with benefits that vary depending on the specific use case. Take, for example, the use case of churn prediction. It is beneficial to have a static value that can be easily looked up when someone calls customer service, but there is some extra value that could be gained if, for specific events, the model could …
Everything a Data Scientist Should Know About Data Management*
(*But Was Afraid to Ask)
To be a real “full-stack” data scientist, or what many bloggers and employers call a “unicorn,” you have to master every step of the data science process, all the way from storing your data to putting your finished product (typically a predictive model) in production. But the bulk of data science training focuses on machine/deep learning techniques; …
20 Criteria You Should Use To Choose A Data Catalog
The Roles of a Data Catalog
The difficulties of data management have intensified at a steady pace over the past several years. The management complexities of big data, cloud hosting, self-service analytics, and tightening regulations can’t be ignored. Effective data management has become a top priority for most organizations, but getting there is challenging. Data catalogs …
How to Organize Data Labeling for Machine Learning: Approaches and Tools
If there were a data science hall of fame, it would have a section dedicated to labeling. The labelers’ monument could be Atlas holding that large rock symbolizing their arduous, detail-laden responsibilities. ImageNet, an image database, would deserve its own style. For nine years, its contributors manually annotated more than 14 million images. Just thinking about it makes …