The popularity of the MySQL open-source database platform is on the rise. But professionals who fail to convey a solid understanding of database structures and concepts during an interview may limit their appeal to employers, according to Alex Alexander, CEO of San Francisco-based SpringbokSQL, which builds specialty database appliances.
“Someone can always learn another RDBMS as long as they understand the basic working structure of a relational model,” he said. “And myopic professionals tend to overlook a database that could better serve a client’s needs.”
Walk me through the steps in an RDBMS transaction. Specifically, how the data gets inserted into the MySQL RDBMS from the client API, then into the optimizer and file system, and finally how it’s stored on a disk.
- What Most People Say: “To be honest with you, I’m not sure how the process works.”
- What You Should Say: “A transaction is a set of instructions that orders the addition, deletion or modification of a batch of rows in the database. Generally, an SQL statement is submitted to the database server where it is parsed, translated, optimized and evaluated. The optimization process is key because it finds the optimum or most efficient way to process the query. To store an object in a relational database it has to be flattened. To retrieve the object, the system reads the data and then creates or restores the object. Most databases use a storage management module that controls the interaction with the file manager. Some database management systems store the whole database in a single file, some split tables, indexes and other object kinds to separate files, some split files, etc.”
- Why You Should Say It: Knowing how a transaction gets processed is a basic concept, along with how a database interacts with a file system. Employers will examine your knowledge of these basic concepts in order to predict future performance. Plus, developers and DBAs need to educate end users, so it helps to have concepts straight.
Describe the difference between row storage databases and columnar databases. What should developers and DBAs consider when deciding between the two?
- What Most People Say: “As their names imply, a columnar database stores data in columns, while a row-oriented database, like MySQL, stores data in rows. I consider performance and storage requirements when selecting or recommending a database management system.”
- What You Should Say: “The major differences between a columnar database and a traditional row-oriented database have to do with performance, storage and schema modification. Generally speaking, row-oriented storage is more efficient for accessing a single record and inserting and updating data when aggregations aren’t required. For instance, I recently developed a row-oriented database for inserting and retrieving simple banking transactions such as deposits and withdrawals. On the other hand, a column-oriented database stores data in contiguous memory locations. I frequently use a columnar database when a table has a large number of rows, or when columnar operations such as aggregations or summaries or high compression rates are needed. For instance, I often recommend a columnar database like Infobright for BI tasks that query the same item across billions of rows of data. Because the database structure impacts performance, I not only consider the type of data, but the volume and how the data will be used when recommending or selecting a database management system.”
- Why You Should Say It: Databases are designed for different purposes and needs. A developer needs to weigh diverse user requirements, data characteristics and how the organization intends to use the data in order to decide which database structure is best.
What are NoSQL databases? How do they work and how do you decide which one to use?
- What Most People Say: “I’m not sure. I think Cassandra and MongoDB are NoSQL databases.”
- What You Should Say: “A NoSQL database is a non-relational and largely distributed database system that is often used to process and store large volumes of high velocity, structured and unstructured data that is housed in different locations or data centers. Most NoSQL products are built to overcome the performance issues associated with RDBMS. I’m familiar with four different types of NoSQL databases: Cassandra, which is a key-value store that is often used for schemaless data; HBase, which uses wide-column stores to store data rather than rows; MongoDB, which is best for storing, retrieving and managing document-oriented information or semi-structured data; and Polyglot, which is based on graph theory. I typically consider cost, scalability, performance, operational simplicity and the type of data that will be stored and retrieved when selecting a NoSQL database.”
- Why You Should Say It: You can’t develop an affinity or loyalty to a particular vendor since each NoSQL database has pros and cons and is designed for different purposes. The developer or DBA must gather the requirements and consider a number of criteria to select a database that will meet the company’s needs not only today but in the future. Knowing what to consider and how to align a database product with a company’s needs is yet another way to boost your market value if you’re a developer or DBA.