As the world’s stored digital data grows at a staggering rate, NoSQL (often read as “not only SQL”) offers a way to navigate massive data stores that SQL systems have trouble with. For instance, between 2006 and 2010, stored data increased over 500% to over 1,000 exabytes (one exabyte equals one billion gigabytes). Large unstructured data sets such as these are often referred to as Big Data. Websites like Facebook and Amazon store user data subsets, message histories, and user preferences based on likes and views. The interrelatedness and potential market value of this kind of data make it big business. To date, NoSQL has proven to be one of the most effective approaches to navigating and retrieving the large data subsets embedded in Big Data storehouses.
Tim Perdue, a former About.com Guide, states: “[O]ne way to define NoSQL is to consider what it’s not. It’s not SQL and it’s not relational. Like the name suggests, it’s not a replacement for a RDBMS but [complements] it. NoSQL is designed for distributed data stores for very large scale data needs. Think about Facebook with its 500,000,000 users or Twitter which accumulates Terabits of data every single day.” Perdue continues, “In a NoSQL database, there is no fixed schema and no joins. A RDBMS ‘scales up’ by getting faster and faster hardware and adding memory. NoSQL, on the other hand, can take advantage of ‘scaling out.’ Scaling out refers to spreading the load over many commodity systems. This is the component of NoSQL that makes it an inexpensive solution for large datasets.” Being able to navigate through these large data subsets, however, is where the NoSQL Engineer earns a place as king of the castle.
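To make the “no fixed schema and no joins” point concrete, here is a minimal sketch in plain Python, with dictionaries standing in for documents. The collection name `users`, the field names, and the helper `order_totals` are hypothetical and are not taken from any particular NoSQL product.

```python
# Illustrative only: plain Python dicts stand in for documents in a
# schema-free store. All names and values here are made up.

users = [
    {
        "_id": "u1",
        "name": "Ada",
        # Related records are embedded in the document, so reading a
        # user's orders needs no join against a separate table.
        "orders": [
            {"item": "keyboard", "price": 45.0},
            {"item": "monitor", "price": 180.0},
        ],
    },
    {
        "_id": "u2",
        "name": "Grace",
        # No fixed schema: this document has a field the first lacks,
        # and it has no "orders" field at all.
        "newsletter": True,
    },
]


def order_totals(collection):
    """Return {user id: total spent}, tolerating missing fields."""
    totals = {}
    for doc in collection:
        orders = doc.get("orders", [])
        totals[doc["_id"]] = sum(order["price"] for order in orders)
    return totals


print(order_totals(users))
```

The trade-off in this style is that the application, not the database, has to cope with documents that lack a field, which is why the helper reads `orders` with a default.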
Leading advocates of NoSQL technology champion its importance in this emerging field:
- “NoSQL solutions are generally designed to manage large amounts of data, more than you would store on any single system, and so all generally have some notion of partitioning (or sharding) data across the storage found on multiple servers rather than expecting a centrally connected SAN or networked file system.” – Greg Burd, Developer Advocate for Basho Technologies.
- “The [NoSQL] pros include the ability to scale out compute and storage capacity horizontally over a wide range of hardware resources, simple and fast queries, and a flexible and simple approach to schema management.” – Dave Segleau, director of product management at Oracle.
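The partitioning (or sharding) described in the quotes above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any specific product works: the node names, the `shard_for` function, and the `ShardedStore` class are hypothetical, and in-memory dictionaries stand in for separate commodity servers.

```python
import hashlib

# Hypothetical node names standing in for commodity servers.
NODES = ["node-a", "node-b", "node-c"]


def shard_for(key, nodes=NODES):
    """Pick a node for a key using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


class ShardedStore:
    """Toy key/value store that spreads keys over in-memory 'nodes'."""

    def __init__(self, nodes=NODES):
        self.nodes = list(nodes)
        # One dictionary per node; a real system would use one server each.
        self.data = {node: {} for node in self.nodes}

    def put(self, key, value):
        self.data[shard_for(key, self.nodes)][key] = value

    def get(self, key):
        return self.data[shard_for(key, self.nodes)].get(key)


store = ShardedStore()
for i in range(9):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:4"))
for node, items in store.data.items():
    print(node, sorted(items))
```

Hashing modulo the node count is the simplest possible scheme. Its weakness is that adding or removing a node reassigns most keys, which is one reason distributed stores often use consistent hashing instead.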
What is a NoSQL Engineer?
Major companies like Google, Amazon, Facebook, Twitter, and LinkedIn utilize NoSQL to organize data. In fact, many of these major Internet-based companies pioneered the NoSQL technologies (Dynamo, BigTable, Voldemort, FlockDB, Cassandra, PNUTS, etc.) that are used across platforms today, out of a need to deal with three problems that traditional Relational Database Management Systems (RDBMS)—such as MySQL, Oracle, and MSSQL—could not accommodate:
“Unprecedented transaction volumes, expectations of low-latency access to massive datasets, and nearly perfect service availability while operating in an unreliable environment.” Greg Burd, Developer Advocate for Basho Technologies, states, “NoSQL is more a rejection of a particular software and hardware architecture for databases than of any single technology, language, or product. Relational databases evolved in a different era with different technological constraints, leading to a design that was optimal for the typical deployment prevalent at that time. But times have changed, and that once successful design is now a limitation.”
NoSQL Engineers have the unique opportunity to scale Web and Application Servers for both relational and non-relational data in order to efficiently serve the purposes of their company. By maintaining and developing NoSQL technologies, the NoSQL Engineer solves problems of data organization and storage strategy and opens up new possibilities for the value of data in the decades to come.
What does it take to be a NoSQL Engineer?
1. Formal Education: NoSQL Engineers should have a B.S. in Computer Science, Engineering, or an equivalent discipline. Engineers will apply problem-solving skills when implementing code. Potential employers will expect a good NoSQL Engineer to have 4-6 years of experience with high-traffic, highly available, and scalable database systems (MySQL and NoSQL).
2. Relational Database Proficiency: Familiarity with relational databases should be priority number one. Be familiar with open source NoSQL technologies like Apache Hadoop, Neo4j, Hive, and HBase. Other technologies in the NoSQL and Big Data ecosystem include MongoDB, CouchDB, Riak, Redis, Voldemort, Cassandra, Dynamo, ZooKeeper, BigTable, SimpleDB, Pig, and MapReduce. A NoSQL Engineer should know operating systems like Linux and UNIX, as well as servers like Apache and Tomcat. Be able to identify the three camps of NoSQL data representation models and the technologies associated with each. Document-oriented databases format data according to the systems and languages they interact with; MongoDB, for example, stores data as BSON, a binary encoding of JavaScript Object Notation (JSON) documents. Graph-based NoSQL databases include Neo4j, which is widely used for social networks and is well suited to modeling public transportation links, road maps, and network topologies. Finally, key/value databases include Riak, Redis, and Cassandra, whose data model is based on Google’s BigTable. Understand the difference between the BASE approach (Basically Available, Soft state, Eventual consistency) common in NoSQL systems and the ACID guarantees (Atomicity, Consistency, Isolation, Durability) of relational databases, and the value of each. Again, as a NoSQL Engineer, you aren’t rejecting SQL technologies; you must understand which situations call for “Not Only SQL,” handling larger datasets in conjunction with smaller relational subsets. Integration of the two kinds of systems (SQL-based and NoSQL-based) is crucial. Be familiar with what a vendor means when they say they support MapReduce.
3. Big Data: According to Helen Sun and Peter Heller’s 2012 publication Oracle Information Architecture: An Architect’s Guide to Big Data: “Big Data is all about finding a needle of value in a haystack of unstructured information. Companies are now investing in solutions that interpret consumer behavior, detect fraud, and even predict the future! McKinsey released a report in May 2011 stating that leading companies are using big data analytics to gain competitive advantage. They predict a 60% margin increase for retail companies who are able to harvest the power of big data.”
Needless to say, unstructured data has changed IT in a major way, and those who know how to navigate and engineer this data are in high demand. Major companies will continue to access Big Data, which will in turn shape the focus of their business models, infrastructure, and the interface of their programs. Their growing relationship with NoSQL will only become more useful as new technologies become marketable in the realm of social media and as the Internet connects products to consumers. Knowledge of SQL systems and the open-source Hadoop framework is a major requirement for employers that utilize unstructured data. Other crucial NoSQL technologies on the market are Dynamo, Cassandra, ZooKeeper, BigTable, SimpleDB, CouchDB, Neo4j, Hive, Pig, MapReduce, Redis, MongoDB, and Riak.
4. Knowledge of Programming Languages: Know a variety of programming languages. NoSQL Engineering requires building applications that are often tied to specific languages and systems across a broad spectrum. Be familiar with JavaScript, C/C++, Python, Perl, PHP, Ruby, Scala, and shell scripting, as well as markup languages such as HTML and XML, among others.
5. Creativity, Adaptability, and Communication: Hiring firms want smart engineers who are open and able to work independently as well as collaboratively with other staff members to complete short- and long-term projects. Employers will ask for updates, status reports, and presentations. Keep track of work and coordinate with team members so the scope of the project is clear to the Business Analysts. A good NoSQL Engineer should be an excellent troubleshooter and a high-performance problem solver. Since the market for unstructured data fluctuates constantly, a NoSQL Engineer must be adaptable and comfortable working in a fast-paced, dynamic environment. After all, NoSQL technology is built around speed and adaptability.
6. Business Sense: A NoSQL Engineer develops tools and shapes data to the benefit of the company they represent. These tools must function to serve that purpose above all else. Read up on the basic concepts of business and market structure to stand out and create a quality program tailored for the project at hand.
7. Analytics: Understanding and transforming data is part of the creative process. A NoSQL Engineer is expected to gather information regarding a client’s Big Data processing requirements and is a problem-solver first. According to various job postings, NoSQL Engineers should multi-task between MySQL and NoSQL technologies “to establish analytics that enable the business, product and technology teams to make data-driven decisions on the best ways to acquire customers and maximize lifetime value.” Take a few classes on data analysis, algorithm development, back-end automation, and/or large-scale open source data processing.
8. Data Modeling: Knowledge of database design and Data Modeling is implicit in the role of the NoSQL Engineer. A thorough comprehension of modeling tools, techniques, and methodologies such as CA ERwin, dbConstructor, DbSchema, Oracle SQL, PowerDesigner, Agile, ORM diagrams, UML class diagrams, Bachman diagrams, CRC cards, DDL, Zachman Framework and others is valuable. Navigation through OLTP and/or OLAP systems is to be expected. Data, in the age of information, is a commodity worth knowing, so know it well.
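Item 2 above names three data representation models: document, graph, and key/value. A rough way to compare them is to hold similar information in each shape using plain Python structures. This is only a sketch: the names (`profile_doc`, `follows`, `kv_store`, `friends_of_friends`) and the data are hypothetical, and real products such as MongoDB, Neo4j, or Redis expose their own APIs, which are not shown here.

```python
import json

# Document model: one self-describing, nested record per entity.
profile_doc = {
    "_id": "alice",
    "name": "Alice",
    "likes": ["cycling", "jazz"],
    "address": {"city": "Oslo"},
}

# Graph model: entities are nodes and relationships are edges.
# Here an adjacency mapping records who follows whom.
follows = {
    "alice": {"bob"},
    "bob": {"carol", "dave"},
    "carol": set(),
    "dave": {"alice"},
}

# Key/value model: the store sees only an opaque value per key,
# so the application serializes and parses the value itself.
kv_store = {
    "session:alice": json.dumps({"cart": ["book"], "ttl": 3600}),
}


def friends_of_friends(graph, person):
    """People two hops from `person`, excluding direct follows and self."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


print(profile_doc["address"]["city"])
print(sorted(friends_of_friends(follows, "alice")))
print(json.loads(kv_store["session:alice"])["cart"])
```

The shape of the question tends to suggest the model: nested lookups fit documents, relationship traversals such as the two-hop query fit graphs, and simple lookups by a known key fit key/value stores.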
As Big Data grows exponentially in Data Management and social media, the role of NoSQL Engineer is steadily gaining ground as one of the most valued positions at the most powerful companies in the world. Smaller companies have not yet looked much into the potential value of NoSQL, but as the Internet connects the world in ever more ways, the value and appeal of NoSQL Engineering continue to grow.


















