Ideally, a machine learning engineer would have both the skills of a software engineer and the experience of a data scientist and data engineer. However, data scientists and software engineers usually come from very different backgrounds, and data scientists should not be expected to be great programmers, nor should software engineers be expected to provide […]
Applying the NIEM Standard to the Opioid Epidemic, COVID-19, and Other Problems
Click to learn more about author Robert Reynolds. If 2020 has taught us anything, it’s the value of data. As the COVID-19 pandemic spread across the U.S. earlier this year, most states were unprepared to deal with it. Those states that have “followed the data” have more successfully managed the epidemic’s toll on public health […]
DataRobot Acquires Paxata to Bolster its End-to-End AI Capabilities
According to a recent press release, “DataRobot, the leader in enterprise AI, today announced that it has entered into a definitive agreement to acquire Paxata, the pioneer of self-service data preparation and leading data fabric provider, to fulfill its mission to build the world’s first automated end-to-end enterprise AI platform. While the massive impact of […]
TigerGraph Improves Its Graph Database-As-A-Service With Enhanced Performance
A recent press release states, “TigerGraph, the only scalable graph database for the enterprise, today announced new functionality and performance for TigerGraph Cloud. TigerGraph Cloud, the industry’s first and only distributed native graph database-as-a-service, is the most intuitive way to build and run applications that work with highly connected and complex datasets. TigerGraph’s latest distributed […]
Recursion Releases Open-Source Data from Largest Ever Dataset of Biological Images
A new press release reports, “Recursion, a Fast Company ‘Most Innovative Company’ and leader in the artificial intelligence for drug discovery movement, today announced it will open-source a glimpse of the massive biological dataset the company has been building for more than five years. At more than two petabytes, and across more than 10 million […]
Making Machine Learning Datasets Unbiased
Click to learn more about authors Dmitry Pozdnyakov and Olga Ezzheva. Machine Learning (ML), a subset of the broader Artificial Intelligence (AI) field, is finding its way into more and more areas of application. From smarter shopping recommendations to better medical diagnosis to more effective fraud detection, businesses are leaning on ML to inject new […]
The Next Wave of Data Regulations: How Businesses Can Navigate the California Consumer Privacy Act of 2018
Click to learn more about author Daniel Wu. The California Consumer Privacy Act of 2018 (CCPA) will take effect on January 1, 2020, and much like the European Union’s (EU’s) General Data Protection Regulation (GDPR) scramble earlier this year, organizations have a lot to do in preparation – or risk paying the price. For each […]
How Big Data Can Improve Data Visualizations and Responsiveness
Click to learn more about author Gilad David Maayan. Modern web design is all about responsiveness, meaning the degree to which designers or webmasters optimize web pages for browsing and navigation across a range of different devices. Mobile traffic accounted for 52.64 percent of total online traffic during 2017, and websites that are not responsive […]
What Is Big Data?
Big Data refers to extremely large data sets of varying types of data – structured, unstructured, and semi-structured – that can be collected, stored, and later analyzed to provide insights for organizations. Big Data’s promise depends on how the data is managed. In the past, data was organized in relational models, sometimes within data warehouses, […]
Machine Learning Algorithms: Introduction to Random Forests
Click to learn more about author Alejandro Correa Bahnsen. There are a variety of Machine Learning algorithms, and each has its own strengths and weaknesses. In this second article in a series on Machine Learning algorithms, I introduce Random Forests, a supervised algorithm used for classification and regression. If you missed my Introduction to Machine […]