by Angela Guess
For professionals who are new to the world of data and databases, Klint Finley has written a helpful article that clearly defines some of the most common data terms. Finley writes, “It’s hard to keep track of all the database-related terms you hear these days. What constitutes ‘big data’? What is NoSQL, and why are your developers so interested in it? And now ‘NewSQL’? Where do in-memory databases fit into all of this? In this series, we’ll untangle the mess of terms and tell you what you need to know.”
Finley starts with data itself: “The best definition of data I’ve been able to find so far is from Diffen: ‘Data are plain facts. When data are processed, organized, structured or presented in a given context so as to make them useful, they are called Information… It should be noted that data is plural (for datum), so the correct grammatical usage is ‘Data are misleading.’ However, in practice people tend to use data as a singular form. e.g. ‘This data is misleading.’”
Moving on to Big Data, Finley states, “In short, big data simply means data sets that are large enough to be difficult to work with. Exactly how big is big is a matter of debate. Data sets that are multiple petabytes in size are generally considered big data (a petabyte is 1,024 terabytes). But the debate over the term doesn’t stop there. There are other factors that can make data difficult to work with, such as the speed at which data is updated or the data’s lack of structure. Clive Longbottom of Quocirca suggests the term ‘unbounded data’ for data that is fast or unstructured… Where might you run into big data or unbounded data? Social networks, where of users are adding status updates and comments at a high speed. Or sensor networks [where] data about the surrounding environment is being stored at a fast pace. Or genomics, where huge amounts of genetic data is being processed.”
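To put the quoted size figure in concrete terms, here is a minimal Python sketch of the arithmetic, assuming the binary convention Finley uses (each unit is 1,024 times the previous one). The names `UNITS`, `to_bytes`, and `is_multi_petabyte`, and the two-petabyte default threshold, are illustrative assumptions and do not come from Finley's article, which notes that the exact cutoff is a matter of debate.

```python
# Storage units in ascending order; each step is a factor of 1,024,
# matching the quoted definition (1 petabyte = 1,024 terabytes).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def to_bytes(value, unit):
    """Convert a size in the given unit to bytes."""
    return value * 1024 ** UNITS.index(unit)


def is_multi_petabyte(size_in_bytes, threshold_pb=2):
    """Return True if the size is at least `threshold_pb` petabytes.

    The threshold is a hypothetical stand-in for "multiple petabytes";
    size alone does not capture the speed or structure factors.
    """
    return size_in_bytes >= to_bytes(threshold_pb, "PB")


if __name__ == "__main__":
    print(to_bytes(1, "PB") == 1024 * to_bytes(1, "TB"))
    print(is_multi_petabyte(to_bytes(500, "TB")))
    print(is_multi_petabyte(to_bytes(3, "PB")))
```

A size check like this covers only the volume part of the definition; the update speed and lack of structure that Longbottom groups under "unbounded data" would need separate measures.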

















