Kubernetes (sometimes abbreviated to “kube”) is an open-source platform, originally developed by Google, that organizes containers into logical units for transport and use in the cloud. Containers support the construction of self-contained environments capable of transporting data and the software that supports it. Containers are, ultimately, a way to package software and other application components. It is […]
Blockchain Offers Internet of Things Data Quality and Data Security
The rapid advances of blockchain technology and the Internet of Things are changing how business on the internet gets done. Blockchain provides superior data security and Data Quality and, as a consequence, is changing the way people approach big data. This can be quite useful, as security remains a primary concern for Internet of Things […]
Four Predictions for Natural Language Processing in 2021
Click to learn more about author David Talby. 2020 has been a year of massive growth for applied natural language processing (NLP). Even in the wake of COVID-19 and stunted IT budgets, a recent study showed that NLP spending increased 10-30 percent across industries, company sizes, and geographies (Gradient Flow). NLP tools can be […]
The Emergence of Open Analytics
Click to learn more about author Dipti Borkar. As businesses become more data-driven and need to make faster, more informed decisions, the traditional data warehousing approach for accessing and analyzing data is becoming impractical and time-consuming, and is likely to increase cost, effort, and vendor lock-in. It assumes data needs to be ingested and […]
Ten Common Issues When Using Excel for Data Operations
Click to learn more about co-author Rosaria Silipo. Using Excel to Transform/Analyze Your Data? I know you are still using Excel sheets to transform and/or analyze your data! I know, because most of us still use it to some extent. There is nothing wrong with using Excel. Excel spreadsheets are a great tool to collect […]
Redis: Understanding the Open Source Data Store’s Primary Uses and Challenges
Click to learn more about author Bassam Chahine. Redis, which stands for “REmote Dictionary Server,” is a speed-optimized in-memory data store most often used as a cache. Redis offers versatile data structures, ranging from strings, lists, dictionaries, and sets to support for approximate counting, geolocation, and stream processing. While Redis is configured as a cache […]
Open Source, Cloud, and Data: How to Plan Ahead
Click to learn more about author Matt Yonkovit. For most Data Management projects today, open source will be included in some form or another. From the databases that store the data to the tools that manage and protect that data, or the analytics projects used to interrogate or display that data, open source plays […]
A Brief History of Open Source Data Technologies
Openly sharing information has been a part of human culture since the beginning of civilization. Information would be shared with the general community, and the practice has had a powerful impact on the development of tools and machinery. In opposition to this practice is the concept of ownership and control over new ideas and concepts, […]
NVIDIA Accelerates Apache Spark, World’s Leading Data Analytics Platform
According to a recent press release, “NVIDIA today announced that it is collaborating with the open-source community to bring end-to-end GPU acceleration to Apache Spark 3.0, an analytics engine for big data processing used by more than 500,000 data scientists worldwide. With the anticipated late spring release of Spark 3.0, data scientists and machine learning […]
IBM Announces Elyra AI Toolkit
A new press release reports, “Jupyter Notebooks are now the open standard for data science and artificial intelligence (AI) model development. In keeping with our commitment to open source and the Jupyter community, in particular, IBM is proud to announce Elyra, a set of open source AI-centric extensions to Jupyter Notebooks, and, more specifically, the […]