by Angela Guess
Steve Miller recently wrote for Information Management, “My company, Inquidia Consulting, is currently engaged in/completing several predictive analytics and data science projects. While we distinguish PA from DS, there’s often not a hard dividing line between the two with our customers. Indeed, though we demur, some now consider data science to be any application of statistical methods to business problems. For Inquidia, both PA and DS generally involve statistics and machine learning of some sort, often ‘climaxing’ with predictive models trained and validated on existing data. The ultimate goal is to deploy the models to make go-forward predictions in a business process.”
Miller goes on, “Inquidia’s PA work is usually more narrowly focused than its DS cousin, often as not a particular modeling task with relevant data identified in advance for a relatively short-term project. And the PA customer may suggest ‘theories’ on what the final models might look like for us to test. R, Python and SAS are preferred PA platforms. DS projects, in contrast, are more comprehensive but nebulous, with substantial computation/data integration/wrangling, big (and perhaps unstructured) data, and exploration challenges that precede theorizing and subsequent modeling. In many cases, DS work is shaped more by data programming than by modeling. The Cloud, Redshift, Hadoop/Impala, Spark, R and Python are Inquidia’s usual suspect DS platforms.”
photo credit: Flickr/suzi54241