Data Education View All →

Web Scraping for Data Science — Part 2

By   49 mins ago

Click here to learn more about author Steve Miller. Read Part 1 of this blog series here. Between R and Python, analytics pros are covered on most data science bases. In last month’s blog, I discussed simple web scraping using Python in a Jupyter notebook, the nifty CSS-generating tool SelectorGadget, and the Python XML and HTML handling package lxml. […]

Read More →

The Elephant in the Cloud, According to Ashish Thusoo

By   44 mins ago

by Angela Guess Srini Penchikala recently wrote in InfoQ, “Ashish Thusoo, CEO and co-founder at Qubole, recently spoke at Enterprise Data World Conference (EDW) about ‘The Elephant in the Cloud’, Hadoop as a Service offering. Part of a wider trend of big data as a service category rather than a product category, Hadoop as a […]

Read More →

4 Data Governance Considerations to Always Keep in Mind

By   39 mins ago

by Angela Guess Isabelle Guis recently wrote in ReadWrite, “Data is being collected from more sources than ever and decentralized between cloud and on-premise storage. There’s a lot on the line: organizations who don’t understand how to make use of their internal and external data can impede productivity, put their corporate reputation at risk, miss […]

Read More →

New Report Offers 5 Reasons to Become a Data Scientist

By   34 mins ago

by Angela Guess Katherine Noyes reports in CIO.com, “For the past three years executive recruiter Burtch Works has been surveying data-science professionals about salaries and other related topics. Burtch Works defines data scientists as professionals who can work with enormous sets of unstructured data and use analytics to get meaning out of them. Published on […]

Read More →

Microsoft Moves Deeper Into the Internet of Things

By   29 mins ago

by Angela Guess Todd Bishop reports in GeekWire, “Microsoft has acquired Solair, an Italian company whose technology is used by businesses to connect and monitor their equipment and devices — ranging from production lines to espresso machines — using cloud services. It’s the latest move by Microsoft into the Internet of Things and the concept […]

Read More →

Semantic Computing in Healthcare

By   24 mins ago

by Angela Guess Jennifer Bresnick recently wrote in Health IT Analytics, “Semantic computing relies on the notion that computers can be “taught” to approach concepts and problems in a similar way to humans.  By linking together certain natural language concepts instead of just solving mathematical equations, computers can make inferences about data sets that might […]

Read More →

The Future of the Data Center: Heterogeneous Computing

By   23 hours ago

“The whole point of Heterogeneous Computing is to have the right tools available, so you can use the right processor, in the right place, at the right time,” said Pat McGarry, the VP of Engineering at Ryft in a recent DATAVERSITY® interview. Such a statement certainly sounds both pertinent and beneficial, but in reality what […]

Read More →

MDM and EDW: Easing the Tension Between Data Control and Data Access

By   23 hours ago

by Angela Guess Stephen Baker recently wrote in Data Informed, “There has always been an uneasy truce within large organizations between those who control access to data – the IT group, usually – and those who need that data to improve business performance. In a perfect world, the IT group would like to see a […]

Read More →

A Quick Introduction to Graph Databases

By   23 hours ago

by Angela Guess Scott Carey recently wrote in ComputerWorld UK, “Anyone that read about the Panama Papers leaks in April will have already seen the benefits of using a graph database. The technology played a crucial role in enabling journalists to wade through immense datasets, quickly making connections between individuals, institutions and tax havens. And […]

Read More →

    Data Webinars | Upcoming Live Presentations View All →

    May 4 DGPO Webinar: A Data Governance Reboot – Positioning DG as an Issue Resolution Service for Immediate Value-add to the Business

    By   7 months ago

    DATE: May 4, 2016 TIME: 2 PM Eastern / 11 AM Pacific PRICE: Free to all attendees This Webinar is Hosted by: About the Webinar Jefferson County School District is a large K-12 enterprise with over 85,000 students and 14,000 employees located in Denver, CO. A previous effort to stand up Data Governance that focused on […]

    Read More →

    May 5 CDO Webinar: A Compelling Statement to Corporate Leaders – Why You Must Address EIM and DG

    By   7 months ago

    DATE: May 5, 2016 TIME: 2 PM Eastern / 11 AM Pacific PRICE: Free for all attendees About the Webinar There is a lot of talk about the management of data as an asset. In fact it is getting tiresome. However, the people that see the need for managing data assets are still not the […]

    Read More →

    May 10 Data-Ed Webinar: Metadata Strategies

    By   7 months ago

    DATE: May 10, 2016 TIME: 2 PM Eastern / 11 AM Pacific PRICE: Free to all attendees   About the Webinar Good systems development often depends on multiple data management disciplines. One of these is metadata. While much of the discussion around metadata focuses on understanding metadata itself along with associated technologies, this comprehensive issue […]

    Read More →

    May 12 Smart Data Webinar: Emerging Data Management Options

    By   7 months ago

    DATE: May 12, 2016 TIME: 2 PM Eastern / 11 AM Pacific PRICE: Free to all attendees About the Webinar Everyone talks about the challenges of managing big data, but applications built for the next decade will need more than “bigger” and “faster” versions of the RDBMS systems that dominated at the end of the […]

    Read More →

    May 17 DAMA Webinar: Influencing with Data – Facts Don’t Matter Much!

    By   7 months ago

    DATE: May 17, 2016 TIME: 2 PM Eastern / 11 AM Pacific PRICE: Free to all attendees About the Webinar Business analytics can provide a major stimulus for transformational change. Yet neuroscience and psychology show us that people are typically not “wired” to engage with facts. Analytics leaders seeking to develop a data-driven culture must […]

    Read More →

    May 19 RWDG Webinar: A Data Governance Framework for Smart Data

    By   7 months ago

    DATE: May 19, 2016 TIME: 2 PM Eastern / 11 AM Pacific PRICE: Free to all attendees About the Webinar Does your organization have smart data? How does your company define smart data? Smart data is data that is used in non-traditional ways such as through machine learning, through the semantic web and by taking […]

    Read More →

    May 24 Webinar: Metadata and the Power of Pattern-Finding

    Objectivity Featured Image

    By   7 months ago

    DATE: May 24, 2016 TIME: 2 PM Eastern / 11 AM Pacific PRICE: Free to all attendees This webinar is sponsored by: About the Webinar According to Gartner, “through 2018, 80 percent of data lakes will not include effective metadata management capabilities, making them inefficient.” Tools within the Apache Spark ecosystem, such as SparkSQL, MLlib, […]

    Read More →

      Data Blogs | Information From Enterprise Leaders View All →

      Web Scraping for Data Science — Part 2

      By   49 mins ago

      Click here to learn more about author Steve Miller. Read Part 1 of this blog series here. Between R and Python, analytics pros are covered on most data science bases. In last month’s blog, I discussed simple web scraping using Python in a Jupyter notebook, the nifty CSS-generating tool SelectorGadget, and the Python XML and HTML handling package lxml. […]

      Read More →

      A Data Extraction System for Unstructured Documents

      George Roth

      By   2 days ago

      Click here to learn more about George Roth. Let’s assume that we have a system that extracts information from unstructured documents. There are two types of unstructured documents: Type A: There are unstructured documents with known content (e.g. legal contract documents, SEC filings, etc.). An essential property of these is that they can be classified based […]

      Read More →

      May 2016 DATAVERSITY Letter from the Editor

      By   5 days ago

      We are just returning from our first face-to-face event of the year, Enterprise Data World 2016 (EDW) held this year in San Diego.  It was a great event with over 1,000 attendees present. Always my favorite part of these events is meeting people I’ve been working with online throughout the year. I met so many […]

      Read More →

      The API Economy: A Big Ball of CRUD

      Dave Duggal

      By   5 days ago

      Click this link to learn more about the author Dave Duggal. Quote: “The use of APIs has exploded with the growth of distributed computing, driven by the popularity of the Web, Cloud and now, the Internet of Things (IoT)” Back in 1999 an academic paper, “The Big Ball of Mud” exposed fundamental limitations of ‘modern’ software […]

      Read More →

      Internet of Things: Big Data and Data Security Problems

      Ariel Amster

      By   7 days ago

      Click here to learn more about Ari Amster. In many respects, the Internet of Things (IoT) has already arrived. While many experts are predicting extreme growth over the next decade, for all intents and purposes, the IoT isn’t some far-flung concept of the future; we’re living in it right now. Think of all the devices and […]

      Read More →

      Distributing Machine Intelligence to the Foggy Edge of the IoT

      James Kobielus

      By   1 week ago

      Click here to learn more about author James Kobielus. Machine learning is beginning to sink roots into every node of the Internet of Things (IoT). In this new era IoT endpoints will increasingly be powered by statistically driven algorithms that process real-time sensor data and drive autonomous decisions right then and there. I recently published a […]

      Read More →

      Software Semantic Evolution, Part 6

      Yefim (Jeff) Zhuk

      By   2 weeks ago

      Click to learn more about Yefim (Jeff) Zhuk. Catch up on this blog series with Part 1, Part 2, Part 3, Part 4, and Part 5. SOA and Microservices, RAML and DataSense by MuleSoft, and the Next Step Collaboration of Services and Transformation of “tribal knowledge” Collaboration between people and groups seems to be a thing with a positive sign, although […]

      Read More →

      Selling the Value of Data Science, Governance or Analytics

      Kimberly Nevala

      By   2 weeks ago

      Click here to learn more about Kimberly Nevala. I’ve recently participated in a number of executive forums on the rewards and realities of creating data-savvy and analytically-enabled cultures. Interestingly, one key theme comes up repeatedly in audience Q&A: how to make the case? Seems simple enough, but this step often causes those who understand the […]

      Read More →

      Why Enterprise Data Strategy (EDS)?

      Assad Shaik

      By   2 weeks ago

      Click here to learn more about author Assad Shaik. With a focus on the financial industry: What exactly prompts organizations to adopt an enterprise data strategy? Why is it essential for an organization to define an enterprise data strategy? At a quick look, it is evident that most organizations struggle to understand the importance and […]

      Read More →

        Data Articles | Data Science, Business Intelligence, & More View All →

        The Future of the Data Center: Heterogeneous Computing

        By   23 hours ago

        “The whole point of Heterogeneous Computing is to have the right tools available, so you can use the right processor, in the right place, at the right time,” said Pat McGarry, the VP of Engineering at Ryft in a recent DATAVERSITY® interview. Such a statement certainly sounds both pertinent and beneficial, but in reality what […]

        Read More →

        A Smart Database for a New Age of Enterprise Apps

        By   6 days ago

        Do you remember what it was like the first time you got your hands on an iPhone? When you realized that all the things that you used to have to do on separate devices now could be accomplished on one single device? Well, the minds behind LogicBlox would like you to feel the same way […]

        Read More →

        Streaming Analytics 101: The What, Why, and How

        By   1 week ago

        Stream processing analyzes and performs actions on real-time data through the use of continuous queries. Streaming Analytics connects to external data sources, enabling applications to integrate certain data into the application flow, or to update an external database with processed information. Bloor Research analyst Philip Howard says stream processing is really an evolution of Complex […]

        Read More →

        Natural Language Processing Poised to Have a Big Impact on the Data Economy

        By   2 weeks ago

        The stakes are large in the Natural Language Processing (NLP) market: It’s the high ground in the battle for control of the data economy and the key to turning silicon into gold, according to a report issued this quarter from market intelligence firm Tractica. Report authors Bruce Daley, the Principal Analyst, and Clint Wheelock, the […]

        Read More →

        The Future of the Internet of Things: The Industrial Cloud Drives Change

        By   2 weeks ago

        Remember when your computer just sat on your desk, with its only connection a cable to your printer? It was when computers got connected together on the Internet, and when Wi-Fi made those connections available everywhere, that the computers changed the way business was conducted in offices and how people interacted with each other. For […]

        Read More →

        DAMA’s New Certified Data Management Practitioners (CDMP) Exam

        By   3 weeks ago

        The Data Management Association International® (DAMA) has significantly redesigned their Certified Data Management Practitioners (CDMP) exam. The Associate level is already online and DAMA will be releasing the Practitioner level, along with Data Governance and Data Quality elective exams at the DATAVERSITY® Enterprise Data World 2016 Conference (April 17-22) in San Diego, California. The CDMP […]

        Read More →

        Smart Data Plus Deep Reasoning Equals Business Value from Data Analysis

        By   3 weeks ago

        Coherent Knowledge would like businesses that want to make more effective use of the knowledge they capture and distribute to become more familiar with Rulelog. That is the logic underlying knowledge representation languages such as Vulcan’s Silk and the rule language used by its own Ergo Suite semantic rules and reasoning platform. Smart Data is […]

        Read More →

        How to Be a Data Scientist: Data Science Skill Development

        By   4 weeks ago

        To the wider business community, a Data Scientist is one of those “data magicians” who can acquire disparate data masses from diverse business functions; clean, massage, organize, and prepare the data; and then exploit their inherent skills in mathematics, statistics, and Machine Learning to uncover hidden business insights and intelligence. The data used by a […]

        Read More →

        A Brief History of Artificial Intelligence

        By   4 weeks ago

        The roots of modern Artificial Intelligence, or AI, can be traced back to the classical philosophers of Greece, and their efforts to model human thinking as a system of symbols. More recently, in the 1940s, a school of thought called “Connectionism” was developed to study the process of thinking. In 1950, a man named Alan […]

        Read More →

          Data Daily | Data News View All →

          The Elephant in the Cloud, According to Ashish Thusoo

          By   44 mins ago

          by Angela Guess Srini Penchikala recently wrote in InfoQ, “Ashish Thusoo, CEO and co-founder at Qubole, recently spoke at Enterprise Data World Conference (EDW) about ‘The Elephant in the Cloud’, Hadoop as a Service offering. Part of a wider trend of big data as a service category rather than a product category, Hadoop as a […]

          Read More →

          4 Data Governance Considerations to Always Keep in Mind

          By   39 mins ago

          by Angela Guess Isabelle Guis recently wrote in ReadWrite, “Data is being collected from more sources than ever and decentralized between cloud and on-premise storage. There’s a lot on the line: organizations who don’t understand how to make use of their internal and external data can impede productivity, put their corporate reputation at risk, miss […]

          Read More →

          New Report Offers 5 Reasons to Become a Data Scientist

          By   34 mins ago

          by Angela Guess Katherine Noyes reports in CIO.com, “For the past three years executive recruiter Burtch Works has been surveying data-science professionals about salaries and other related topics. Burtch Works defines data scientists as professionals who can work with enormous sets of unstructured data and use analytics to get meaning out of them. Published on […]

          Read More →

          Microsoft Moves Deeper Into the Internet of Things

          By   29 mins ago

          by Angela Guess Todd Bishop reports in GeekWire, “Microsoft has acquired Solair, an Italian company whose technology is used by businesses to connect and monitor their equipment and devices — ranging from production lines to espresso machines — using cloud services. It’s the latest move by Microsoft into the Internet of Things and the concept […]

          Read More →

          Semantic Computing in Healthcare

          By   24 mins ago

          by Angela Guess Jennifer Bresnick recently wrote in Health IT Analytics, “Semantic computing relies on the notion that computers can be “taught” to approach concepts and problems in a similar way to humans.  By linking together certain natural language concepts instead of just solving mathematical equations, computers can make inferences about data sets that might […]

          Read More →

          MDM and EDW: Easing the Tension Between Data Control and Data Access

          By   23 hours ago

          by Angela Guess Stephen Baker recently wrote in Data Informed, “There has always been an uneasy truce within large organizations between those who control access to data – the IT group, usually – and those who need that data to improve business performance. In a perfect world, the IT group would like to see a […]

          Read More →

          A Quick Introduction to Graph Databases

          By   23 hours ago

          by Angela Guess Scott Carey recently wrote in ComputerWorld UK, “Anyone that read about the Panama Papers leaks in April will have already seen the benefits of using a graph database. The technology played a crucial role in enabling journalists to wade through immense datasets, quickly making connections between individuals, institutions and tax havens. And […]

          Read More →

          Lavastorm and Qlik Team Up to Offer Robust, Easy-to-Use Integrated Analytic Applications

          By   23 hours ago

          by Angela Guess A new article out of the company reports, “Lavastorm, the leading agile analytics company, today announced that it has partnered with Qlik, a leader in visual analytics, to put a powerful, fully-integrated modern analytics platform into the hands of data analysts and business users directly through Qlik Sense. The dynamic, integrated solution […]

          Read More →

          Qualcomm Makes Phones Smarter With New Snapdragon Deep Learning SDK

          By   24 hours ago

          by Angela Guess According to a new article out of the company, “Qualcomm Incorporated today announced at the Embedded Vision Summit in Santa Clara, Calif., that its subsidiary, Qualcomm Technologies, Inc., is offering the first deep learning software development kit (SDK) for devices powered by Qualcomm® Snapdragon™ 820 processors. The SDK, called the Qualcomm Snapdragon […]

          Read More →