In an enterprise analytics team, different roles exist to fill different needs, and those needs must be met for the team to succeed. Launching an analytics program doesn’t necessarily require a massive influx of personnel before it can produce usable insights from data, yet it’s important that critical roles are filled, whatever the size of the team. […]
Without Improved Business Skills, a Dark Day is Coming for Data Scientists
Click here to learn more about Ted Kwartler. Each semester my Harvard University Extension class is about half business and half computer science students. As you may expect, the business pupils struggle with learning how to code. However, computer scientists feel hindered by business case studies and learning how to interact in a cross-functional group. […]
The POFMU Principle: Process Once for Many Uses
Click to learn more about author Matt Habiger. Data is often not created for purposes that please data scientists. It is often collected for operations or billing, and as such, a significant amount of preparation time is needed to make it ready for data science. This is clearly the case with location data sourced from […]
Ensemble Models: Bagging and Boosting
Click to learn more about author Rosaria Silipo. Ensemble models combine multiple learning algorithms to improve on the predictive performance of any single algorithm alone. There are two main strategies to ensemble models (bagging and boosting) and many examples of predefined ensemble algorithms. Bootstrap aggregation, or bagging, is an ensemble meta-learning technique that trains many […]
Data Architect vs. Data Modeler vs. Data Engineer
Michael Bowers, author and Chief Data Architect at FairCom Corporation, initially set out to research three careers in his presentation titled Data Architect vs. Data Modeler vs. Data Engineer for the DATAVERSITY® Data Architecture Online 2019 Conference. The process brought him to a wealth of information he would have appreciated much earlier in his career, […]
The Chief Data Officer and the Chief Digital Officer: Work Together, Not Apart
Data vs. digital: That’s a big tension within many organizations. Chief Data Officers and Chief Digital Officers don’t always agree about some important things, said Joe Caserta, president of consulting firm Caserta, during his DATAVERSITY® Enterprise Data World Conference presentation titled Building a Foundation for Disruption and Advanced Analytics. What’s the disconnect between the […]
From Modeling to Scoring: Confusion Matrix and Class Statistics
Click to learn more about author Maarit Widmann. Wheeling like a hamster in the Data Science cycle? Don’t know when to stop training your model? Model evaluation is an important part of a data science project and it’s exactly this part that quantifies how good your model is, how much it has improved from the previous […]
The Collective Data Literacy Gap
Every graph, data set, and chart has a story behind it. Knowing how to make sense of that narrative across an organization requires data literacy. Increasing the data literacy of smaller teams can be complicated, but equipping an entire organization is an even bigger task. So many teams bring in so many different skill sets […]
Regularization for Logistic Regression: L1, L2, Gauss or Laplace?
Click to learn more about author Kathrin Melcher. Regularization can be used to avoid overfitting. But what actually is regularization, what are the common techniques, and how do they differ? Well, according to Ian Goodfellow [1]: “Regularization is any modification we make to a learning algorithm that is intended to reduce its generalization error but […]
Need a Push? Agile Thinking Gets You Over the Hump
Click here to learn more about author Jim Sawyer. Sometimes, we just need to start. We need to ignore our internal excuses and just do something. But it can be tough to do that when the business issues we are trying to inform are nebulous and ill-defined. Regardless of our personalities, inertia often confronts us […]