Click to learn more about author David Talby. 2020 has been a year of massive growth for applied natural language processing (NLP). Even in the wake of COVID-19 and stunted IT budgets, a recent study showed that NLP spending increased 10-30 percent across industries, company sizes, and geographies (Gradient Flow). NLP tools can be […]
Ten Common Issues When Using Excel for Data Operations
Click to learn more about co-author Rosaria Silipo. Using Excel to Transform/Analyze Your Data? I know you are still using Excel sheets to transform and/or analyze your data! I know, because most of us still use it to some extent. There is nothing wrong with using Excel. Excel spreadsheets are a great tool to collect […]
What to Look for in a Model Server to Build Machine Learning-Powered Services
Click to learn more about co-author Ion Stoica. Click to learn more about co-author Ben Lorica. Machine learning is being embedded in applications that involve many data types and data sources. This means that software developers from different backgrounds need to work on projects that involve ML. In our previous post, we listed key […]
From Modeling to Scoring: Correcting Predicted Class Probabilities in Imbalanced Datasets
Click to learn more about co-author Maarit Widmann. Click to learn more about co-author Alfredo Roccato. This is the second part of the From Modeling to Scoring series; see Part One here. Wheeling like a hamster in the Data Science cycle? Don’t know when to stop training your model? Model evaluation is an important part […]
What Is Data Discovery?
Data discovery describes the processes of understanding the data sets on hand for data integration and/or data analysis. This step occurs during design and should combine technical search from tools with subject matter expertise from people. During data discovery, a high-level view is taken in assessing data preparation or data quality needs. Data discovery can be broken […]
Five Key Features for a Machine Learning Platform
Click to learn more about co-author Ion Stoica. Click to learn more about co-author Ben Lorica. Machine learning platform designers need to meet current challenges and plan for future workloads. As machine learning gains a foothold in more and more companies, teams are struggling with the intricacies of managing the machine learning lifecycle. […]
Guided Visualization and Guided Exploration
Click to learn more about co-author Scott Fincher. Click to learn more about co-author Paolo Tamagnini. Click to learn more about co-author Maarit Widmann. No matter if we are experienced data scientists or business analysts, one of our daily routines is the easy and smooth extraction of the relevant information from our data regardless of […]
What Are GPUs and Why Do Data Scientists Love Them?
Click to learn more about author Eva Murray. Move over, CPUs. The GPUs have arrived in modern enterprises, and data scientists are eager to use them for their modeling and deep learning applications. Why is this happening, and what are the advantages GPUs bring for Data Science applications? Read on and find out. What Are GPUs? GPUs, or graphics […]
From Modeling to Scoring: Finding an Optimal Classification Threshold based on Cost and Profit
Click to learn more about co-author Maarit Widmann. Click to learn more about co-author Alfredo Roccato. Wheeling like a hamster in the Data Science cycle? Don’t know when to stop training your model? Model evaluation is an important part of a Data Science project and it’s exactly this part that quantifies how good your model is, […]
Mise en Place for Data Science
Click to learn more about author Curt Bergmann. When guests arrive at a great restaurant, the chef and all the cooks have already planned and assembled everything they need to quickly deliver excellence on a plate. Their process, called mise en place, is used by chefs all over the world. Emerging after the introduction of […]