StreamSets Data Collector Innovation Powers Fortune 500 Pipelines

by Angela Guess

A new press release states, “StreamSets Inc., a provider of innovative data in motion middleware, today announced the addition of several powerful capabilities to its award-winning StreamSets Data Collector™ software for building and operating modern dataflows. These capabilities help enterprises accelerate time-to-value for their big data initiatives by being able to continually ingest data with high quality and reliability to feed next-generation data-driven applications. After just over a year in the market, StreamSets Data Collector is exhibiting accelerating momentum as companies seek to harness their data in motion. The open source software recently passed 100,000 downloads, with 400% quarter-over-quarter growth. Adoption has been especially strong in the world’s largest companies, with over one third of the Fortune 100 downloading the product, including leaders in financial services, healthcare and technology.”

The release goes on, “Key new features found in the most recent release make it easier than ever to modernize data architecture and take advantage of new data sources for digital products and customer experiences. Data Drift Synchronization: The problem of data drift frustrates enterprise data architects as unexpected semantic and schema changes lead to broken pipelines and lost or squandered data. Data Drift Synchronization takes StreamSets Data Collector’s ability to handle data drift to the next level by detecting schema changes and then generating and updating downstream metadata stores on-the-fly. This feature is applicable for relational databases; big data stores, such as Apache Hive, Apache Impala, Apache Kudu and Presto; as well as cloud stores such as Amazon Athena, Amazon Redshift and Microsoft Azure.”
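To make the idea behind Data Drift Synchronization concrete, here is a minimal sketch of the general technique the release describes: spot fields that a downstream schema does not yet know about, then generate the metadata update for them. It is not StreamSets code and does not use the StreamSets API. The function names, the `SQL_TYPES` mapping and the `events` table are assumptions made for illustration, and the generated statements follow Hive-style `ALTER TABLE ... ADD COLUMNS` syntax.

```python
from typing import Any

# Hypothetical mapping from Python value types to SQL column types (illustrative only).
SQL_TYPES = {bool: "BOOLEAN", int: "BIGINT", float: "DOUBLE", str: "STRING"}


def detect_new_fields(schema: dict[str, str], record: dict[str, Any]) -> dict[str, str]:
    """Return fields present in the record but missing from the known schema."""
    new_fields = {}
    for name, value in record.items():
        if name not in schema:
            # Fall back to STRING when the value type is not in the mapping.
            new_fields[name] = SQL_TYPES.get(type(value), "STRING")
    return new_fields


def sync_schema(table: str, schema: dict[str, str], record: dict[str, Any]) -> list[str]:
    """Update the in-memory schema and return the DDL a downstream store would need."""
    new_fields = detect_new_fields(schema, record)
    schema.update(new_fields)
    return [
        f"ALTER TABLE {table} ADD COLUMNS ({name} {sql_type})"
        for name, sql_type in new_fields.items()
    ]


if __name__ == "__main__":
    known = {"id": "BIGINT"}
    for statement in sync_schema("events", known, {"id": 1, "country": "DE"}):
        print(statement)
```

A real pipeline would also have to handle type changes, dropped fields and concurrent writers, which this sketch leaves out.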
