Articles by Steve Miller  -  Page 2
Latest

Important Big Data Additions to the R Analyst’s Toolchest

By Steve Miller  /  February 1, 2017  /  BI / Data Science Blogs, Big Data Blogs, Big Data News, Articles, & Education, Data Blogs | Information From Enterprise Leaders, Data Education, Database Blogs, Database News, Articles, & Education, Smart Data Blogs, Smart Data News, Articles, & Education

After my partner read my last blog, Frequencies in R — Part 2, where I used R’s data.table and dplyr packages to construct performant frequencies procedures on an in-memory 27M+ row, 30 attribute data.table, he asked if I’d compared my results with the equivalent functionality of R’s MonetDB.R package. I have […]


Frequencies in R — Part 2

By Steve Miller  /  January 4, 2017  /  Big Data Blogs, Big Data News, Articles, & Education, Data Blogs | Information From Enterprise Leaders, Data Education, Data Science Blogs, Data Science News, Articles, & Education, Smart Data Blogs, Smart Data News, Articles, & Education

In last month’s blog, I compared several functions that compute frequencies and crosstabs in R. The ones I’ve worked with primarily, and the foci of Part 1, were table from the base package, xtabs from the stats package, and count from Hadley Wickham’s plyr package. Tests were conducted on a data set […]


Frequencies in R — Part 1

By Steve Miller  /  December 7, 2016  /  Big Data Blogs, Big Data News, Articles, & Education, Data Blogs | Information From Enterprise Leaders, Data Education, Data Science Blogs, Data Science News, Articles, & Education, Smart Data Blogs, Smart Data News, Articles, & Education

I’m often asked to name the most common statistical procedure used in my company’s Data Science work. My answer, only partly in jest, is frequencies and crosstabs — to help with the mundane tasks of profiling and exploring data. Indeed, frequency distributions and the dotplots that showcase […]


A Common File Format for Python Pandas and R Data Frames

By Steve Miller  /  November 2, 2016  /  Data Blogs | Information From Enterprise Leaders, Data Education, Data Science Blogs, Data Science News, Articles, & Education

I’ve been doing analysis on a Chicago Crime data set off and on for the last few months, using the now ubiquitous Jupyter Notebook to manage my work. Trouble is, I like to switch between data science language leaders R and Python, using the best of each for data munging, […]


Efficient Machine Learning in H2O with R and Python, Part 1

By Steve Miller  /  October 5, 2016  /  BI / Data Science Blogs, BI / Data Science News, Articles, & Education, Data Blogs | Information From Enterprise Leaders, Data Education, Data Modeling Blogs, Data Modeling News, Articles, & Education, Smart Data Blogs, Smart Data News, Articles, & Education

One of the major benefits of working with R and Python for analytics is that there are always new and freely available treats from their vibrant open source ecosystems. And now, more and more, data scientists are able to reap the benefits of working with data in R, Python […]


Identifying and Deleting “Empty” Columns in R data.frames

By Steve Miller  /  September 7, 2016  /  Data Blogs | Information From Enterprise Leaders, Data Education, Data Science Blogs, Data Science News, Articles, & Education

Toward the end of last month’s blog on SAS, R, Python, and WPS, I mentioned a current project challenge of identifying and eliminating “mostly” null columns from wide SAS data sets. As the team discovered, such columns can impose a significant drag on performance. My take is that while […]


SAS, R, or Python – Enter World Programming System (WPS)

By Steve Miller  /  August 3, 2016  /  Data Blogs | Information From Enterprise Leaders, Data Education, Data Science Blogs, Data Science News, Articles, & Education, Smart Data Blogs, Smart Data News, Articles, & Education

With more than a little serendipity, I came across a report detailing the results of the third annual survey by Burtch Works Executive Recruiting, entitled “SAS, R, or Python Survey 2016: Which Tool Do Analytics Pros Prefer?” The survey asked each respondent to name the single […]


My Stock Market Index Dashboard with R, Plotly, and the Plotly Cloud

By Steve Miller  /  July 6, 2016  /  BI / Data Science Blogs, BI / Data Science News, Articles, & Education, Data Blogs | Information From Enterprise Leaders, Data Education, Data Strategy Blogs, Data Strategy News, Articles, & Education, Enterprise Information Management, Information Management Blogs

It’s been difficult for me to ponder my 2016 stock index dashboard this week, with the markets roiled by the turmoil of Brexit taking a toll on my fragile investment psyche. Alas, I dutifully update the underlying data and run the visualization daily, hoping for the best […]


New Jobs Analysis with Python

By Steve Miller  /  June 1, 2016  /  BI / Data Science Blogs, BI / Data Science News, Articles, & Education, Big Data Blogs, Big Data News, Articles, & Education, Data Blogs | Information From Enterprise Leaders, Data Education

The presidential race is heating up as primaries come to an end. And if it’s Trump vs Clinton, there’ll be no shortage of strong opinion among the electorate as to which offers the best policies for economics, defense, energy, health care, etc. Last year I posted […]


Web Scraping for Data Science — Part 2

By Steve Miller  /  May 4, 2016  /  Data Blogs | Information From Enterprise Leaders, Data Education, Data Science Blogs, Data Science News, Articles, & Education

Between R and Python, analytics pros are covered on most data science bases. In last month’s blog, I discussed simple web scraping using Python in a Jupyter notebook, the nifty CSS selector-generating tool SelectorGadget, and the Python XML and HTML handling package lxml. […]
