Defensive Database Programming Techniques

Defensive Database Programming Techniques
Dr.NaveenBansal Profile Pic
Dr.NaveenBansal,India,Teacher
Published Date:25-10-2017
Your Website URL(Optional)
Comment
Chapter 1: Basic Defensive Database Programming Techniques The goal of defensive database programming is to produce resilient database code; in other words, code that does not contain bugs and is not susceptible to being broken by unexpected use cases, small modifications to the underlying database schema, changes in SQL Server settings, and so on. If you fail to program defensively, then code that runs as expected on a given standalone server, with a specific configuration, may run very differently in a different environment, under different SQL Server settings, against different data, or under conditions of concurrent access. When this happens, you will be susceptible to erratic behavior in your applications, performance problems, data integrity issues, and unhappy users. The process of reducing the number of vulnerabilities in your code, and so increasing its resilience, is one of constantly questioning the assumptions on which your implementation depends, ensuring they are always enforced if they are valid, and removing them if not. It is a process of constantly testing your code, breaking it, and then refining it based on what you have learned. The best way to get a feel for this process, and for how to expose vulnerabilities in your code and fix them using defensive programming techniques, is to take a look at a few common areas where I see that code is routinely broken by unintended use cases or erroneous assumptions: • unreliable search patterns • reliance on specific SQL Server environment settings • mistakes and ambiguity during data modifications. In each case, we'll identify the assumptions that lead to code vulnerability, and show how to fix them. All the examples in this chapter are as simple as possible, in that there is no concurrency, and the underlying database schema is fixed. 23Chapter 1: Basic Defensive Database Programming Techniques In subsequent chapters, we'll introduce the additional dangers that can arise when exposing the code to changes in the database schema and running it under high concurrency. Programming Defensively to Reduce Code Vulnerability There are four key elements to defensive database programming that, when applied, will allow you to eliminate bugs and make your code less vulnerable to be being subsequently broken by cases of unintended use. 1. Define and understand your assumptions. 2. Test as many use cases as possible. 3. Lay out your code in short, fully testable, and fully tested modules. 4. Reuse your code whenever feasible, although we must be very careful when we reuse T-SQL code, as described in Chapter 5. As noted in the introduction to this book, while I will occasionally discuss the sort of checks and tests that ought to be included in your unit tests (Steps 2 and 3), this book is focused on defensive programming, and so, on the rigorous application of the first two principles. Den fi e your assumptions One of the most damaging mistakes made during the development of SQL and any other code, is a failure to explicitly define the assumptions that have been made regarding how the code should operate, and how it should respond to various inputs. Specifically, we must: • explicitly list the assumptions that have been made • ensure that the these assumptions always hold • systematically remove assumptions that are not essential, or are incorrect. 24Chapter 1: Basic Defensive Database Programming Techniques When identifying these assumptions, there can be one of three possible outcomes. Firstly, if an assumption is deemed essential, it must be documented, and then tested rigorously to ensure it always holds; I prefer to use unit tests to document such assumptions (more on this in Chapter 3). Failure to do so will mean that when the code makes it into production it will inevitably be broken as a result of usage that conflicts with the assumption. Secondly, if the assumption is deemed non-essential, it should, if possible, be removed. Finally, in the worst case, the code may contain assumptions that are simply wrong, and can threaten the integrity of any data that the code modifies. Such assumptions must be eliminated from the code. Rigorous testing As we develop code, we must use all our imagination to come up with cases of unintended use, trying to break our modules. We should incorporate these cases into our testing suites. As we test, we will find out how different changes affect code execution and learn how to develop code that does not break when "something," for example, a language setting or the value of ROWCOUN, changes. T Having identified a setting that breaks one of our code modules, we should fix it and then identify and fix all other similar problems in our code. We should not stop at that. The defensive programmer must investigate all other database settings that may affect the way the code runs, and then review and amend the code again and again, fixing potential problems before they occur. This process usually takes a lot of iterations, but we end up with better, more robust code every time, and we will save a lot of potential wasted time in troubleshooting problems, as well as expensive retesting and redeployment, when the code is deployed to production. Throughout the rest of this chapter, we'll discuss how this basic defensive coding philosophy is applied in practice, by way of some simple practical examples. 25Chapter 1: Basic Defensive Database Programming Techniques Defending Against Cases of Unintended Use All too often, we consider our code to be finished as soon as it passes a few simple tests. We do not take enough time to identify and test all possible, reasonable use cases for our code. When the inevitable happens, and our code is used in a way we failed to consider, it does not work as expected. To demonstrate these points, we'll consider an example that shows how (and how not) to use string patterns in searching. We'll analyze a seemingly working stored procedure that searches a Message tab s le, construct cases of unintended use, and identify an implicit assumption on which the implementation of this procedure relies. We will then need to decide whether to eliminate the assumption or to guarantee that it always holds. Either way, we will end up with a more robust procedure. Listing 1-1 contains the code needed to create a sample Mess tab age ls e, which holds the subject and body of various text messages, and load it with two sample messages. It then creates the stored procedure, SelectMessagesBySubjectBeginn , w in higch will search the messages, using a search pattern based on the kL eywor IKE d. The stored procedure takes one parameter, SubjectBeginni , and is supposed to r ng eturn every message whose subject starts with the specified text. CREATE TABLE dbo.Messages ( MessageID INT IDENTITY(1,1) NOT NULL PRIMARY KEY, Subject VARCHAR(30) NOT NULL , Body VARCHAR(100) NOT NULL ) ; GO INSERT INTO dbo.Messages ( Subject , Body ) SELECT 'Next release delayed' , 'Still fixing bugs' UNION ALL 26Chapter 1: Basic Defensive Database Programming Techniques SELECT 'New printer arrived' , 'By the kitchen area' ; GO CREATE PROCEDURE dbo.SelectMessagesBySubjectBeginning SubjectBeginning VARCHAR(30) AS SET NOCOUNT ON ; SELECT Subject , Body FROM dbo.Messages WHERE Subject LIKE SubjectBeginning + '%' ; Listing 1-1: Creating and populating the Messages table along with the stored procedure to search the messages. Some preliminary testing against this small set of test data, as shown in Listing 1-2, does not reveal any problems. must return one row EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning='Next'; Subject Body - Next release delayed Still fixing bugs must return one row EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning='New'; Subject Body - New printer arrived By the kitchen area must return two rows EXEC dbo.SelectMessagesBySubjectBeginning 27Chapter 1: Basic Defensive Database Programming Techniques SubjectBeginning='Ne'; Subject Body - Next release delayed Still fixing bugs New printer arrived By the kitchen area must return nothing EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning='No Such Subject'; Subject Body - Listing 1-2: A few simple tests against the provided test data demonstrate that results match expectations. Handling special characters in searching In defensive database programming, it is essential to construct cases of unintended use with which to break our code. The test data in Listing 1-1 and the stored procedure calls in Listing 1-2 demonstrate the cases of intended use, and clearly the procedure works, when it is used as intended. However, have we considered all the possible cases? Will the procedure continue to work as expected in cases of unintended use? Can we find any hidden bugs in this procedure? In fact, it is embarrassingly easy to break this stored procedure, simply by adding a few "off-topic" messages to our table, as shown in Listing 1-3. INSERT INTO dbo.Messages ( Subject , Body ) SELECT 'OT Great vacation in Norway' , 'Pictures already uploaded' UNION ALL SELECT 'OT Great new camera' , 28Chapter 1: Basic Defensive Database Programming Techniques 'Used it on my vacation' ; GO must return two rows EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning = 'OT' ; Subject Body - Listing 1-3: Our procedure fails to return "off-topic" messages. Our procedure fails to return the expected messages. In fact, by loading one more mes- sage, as shown in Listing 1-4, we can demonstrate that this procedure can also return incorrect data. INSERT INTO dbo.Messages ( Subject , Body ) SELECT 'Ordered new water cooler' , 'Ordered new water cooler' ; EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning = 'OT' ; Subject Body - Ordered new water cooler Ordered new water cooler Listing 1-4: Our procedure returns the wrong messages when the search pattern contains OT. When using the LIK k Eeyword, square brackets (" " and " "), are treated as wildcard characters, denoting a single character within a given range or set. As a result, while the search was intended to be one for off-topic posts, it in fact searched for "any messages whose subject starts with O or T." Therefore Listing 1-3 returns no rows, since no such messages existed at that point, whereas Listing 1-4 "unexpectedly" returns the message starting with "O," rather than the off-topic messages. 29Chapter 1: Basic Defensive Database Programming Techniques In a similar vein, we can also prove that the procedure fails for messages with the sign % in subject lines, as shown in Listing 1-5. INSERT INTO dbo.Messages ( Subject , Body ) SELECT '50% bugs fixed for V2' , 'Congrats to the developers' UNION ALL SELECT '500 new customers in Q1' , 'Congrats to all sales' ; GO EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning = '50%' ; Subject Body 50% bugs fixed for V2 Congrats to the developers 500 new customers in Q1 Congrats to all sales Listing 1-5: Our stored procedure returns the wrong messages, along with the correct ones, if the pattern contains %. The problem is basically the same: the sign is a wildcar % d character denoting "any string of zero or more characters." Therefore, the search returns the "500 new customers…" row in addition to the desired "50% bugs fixed…" row. Our testing has revealed an implicit assumption that underpins the implementation of the SelectMessagesBySubjectBeginning stored procedure: the author of this stored procedure did not anticipate or expect that message subject lines could contain special characters, such as square brackets and percent signs. As a result, the search only works if the specified SubjectBeginning does not contain special characters. Having identified this assumption, we have a choice: we can either change our stored procedure so that it does not rely on this assumption, or we can enforce it. 30Chapter 1: Basic Defensive Database Programming Techniques Enforcing or eliminating the special characters assumption Our first option is to fix our data by enforcing the assumption that messages will not contain special characters in their subject line. We can delete all the rows with special characters in their subject line, and then add a C constrain HECK t that forbids their future use, as shown in Listing 1-6. The patterns used in the DE command and LETE in the CHECK constraint are advanced, and need some explanation. The first pattern, %%, means the following: • both percent signs denote "any string of zero or more characters" • in this case denotes "opening square bracket, " • the whole pattern means "any string of zero or more characters, followed by an opening square bracket, followed by another string of zero or more characters," which is equivalent to "any string containing at least one opening square bracket." Similarly, the second pattern, %% means " %, any string containing at least one percent sign." BEGIN TRAN ; DELETE FROM dbo.Messages WHERE Subject LIKE '%%' OR Subject LIKE '%%%' ; ALTER TABLE dbo.Messages ADD CONSTRAINT Messages_NoSpecialsInSubject CHECK(Subject NOT LIKE '%%' AND Subject NOT LIKE '%%%') ; ROLLBACK TRAN ; Listing 1-6: Enforcing the "no special characters" assumption. Although enforcing the assumption is easy, does it make practical sense? It depends. I would say that, under most circumstances, special characters in subject lines should be allowed, so let's consider a second, better option – eliminating the assumption. Note that Listing 1-6 rolls back the transaction, so that our changes are not persisted in the database. 31Chapter 1: Basic Defensive Database Programming Techniques Listing 1-7 shows how to alter the stored procedure so that it can handle special characters. To better demonstrate how the procedure escapes special characters, I included some debugging output. Always remember to remove such debugging code before handing over the code for QA and deployment ALTER PROCEDURE dbo.SelectMessagesBySubjectBeginning SubjectBeginning VARCHAR(50) AS SET NOCOUNT ON ; DECLARE ModifiedSubjectBeginning VARCHAR(150) ; SET ModifiedSubjectBeginning = REPLACE(REPLACE(SubjectBeginning, '', ''), '%', '%') ; SELECT SubjectBeginning AS SubjectBeginning , ModifiedSubjectBeginning AS ModifiedSubjectBeginning ; SELECT Subject , Body FROM dbo.Messages WHERE Subject LIKE ModifiedSubjectBeginning + '%' ; GO Listing 1-7: Eliminating the "no special characters" assumption. Listing 1-8 demonstrates that our stored procedure now correctly handles special characters. Of course, in a real world situation, all previous test cases have to be rerun, to check that we didn't break them in the process of fixing the bug. must return two rows EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning = 'OT' ; 32Chapter 1: Basic Defensive Database Programming Techniques SubjectBeginning ModifiedSubjectBeginning OT OT Subject Body OT Great vacation in Norway Pictures already uploaded OT Great new camera Used it on my vacation must return one row EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning='50%'; SubjectBeginning ModifiedSubjectBeginning - 50% 50% Subject Body 50% bugs fixed for V2 Congrats to the developers Listing 1-8: Our search now correctly handles and %. Whether we ultimately decide to enforce or eliminate the assumption, we have created a more robust search procedure as a result. Defending Against Changes in SQL Server Settings A common mistake made by developers is to develop SQL code on a given SQL Server, with a defined set of properties and settings, and then fail to consider how their code will respond when executed on instances with different settings, or when users change settings at the session level. 33Chapter 1: Basic Defensive Database Programming Techniques For example, Chapters 4 and 9 of this book discuss transaction isolation levels, and explain how code may run differently under different isolation levels, and how to improve code so that it is resilient to changes in the isolation level. However, in this chapter, let's examine a few simple cases of how hidden assumptions with regard to server settings can result in vulnerable code. How SET ROWCOUNT can break a trigger Traditionally, developers have relied on the ROSWECTOUNT command to limit the number of rows returned to a client for a given query, or to limit the number of rows on which a data modification statement (UPDA, TDEELETE, MERGE or INSERT ) acts. In either case, SE TROWCOUNT works by instructing SQL Server to stop processing after a specified number of rows. However, the use of SE TROWCOUNT can have some unexpected consequences for the unwary developer. Consider a very simple table, Obje, w cthi s ch stores basic size and weight information about objects, as shown in Listing 1-9. CREATE TABLE dbo.Objects ( ObjectID INT NOT NULL PRIMARY KEY , SizeInInches FLOAT NOT NULL , WeightInPounds FLOAT NOT NULL ) ; GO INSERT INTO dbo.Objects ( ObjectID , SizeInInches , WeightInPounds ) SELECT 1 , 10 , 10 UNION ALL SELECT 2 , 34Chapter 1: Basic Defensive Database Programming Techniques 12 , 12 UNION ALL SELECT 3 , 20 , 22 ; GO Listing 1-9: Creating and populating the Objects table. We are required to start logging all updates of existing rows in this table, so we create a second table, ObjectsChangeLo , in w g hich to record the changes made, and a trigger that will fire whenever data in the Obje tab cts le is updated, record details of the changes made, and insert them into ObjectsChange.Log CREATE TABLE dbo.ObjectsChangeLog ( ObjectsChangeLogID INT NOT NULL IDENTITY , ObjectID INT NOT NULL , ChangedColumnName VARCHAR(20) NOT NULL , ChangedAt DATETIME NOT NULL , OldValue FLOAT NOT NULL , CONSTRAINT PK_ObjectsChangeLog PRIMARY KEY ( ObjectsChangeLogID ) ) ; GO CREATE TRIGGER Objects_UpdTrigger ON dbo.Objects FOR UPDATE AS BEGIN; INSERT INTO dbo.ObjectsChangeLog ( ObjectID , ChangedColumnName , ChangedAt , OldValue 35Chapter 1: Basic Defensive Database Programming Techniques ) SELECT i.ObjectID , 'SizeInInches' , CURRENT_TIMESTAMP , d.SizeInInches FROM inserted AS i INNER JOIN deleted AS d ON i.ObjectID = d.ObjectID WHERE i.SizeInInches d.SizeInInches UNION ALL SELECT i.ObjectID , 'WeightInPounds' , CURRENT_TIMESTAMP , d.WeightInPounds FROM inserted AS i INNER JOIN deleted AS d ON i.ObjectID = d.ObjectID WHERE i.WeightInPounds d.WeightInPounds ; END ; Listing 1-10: Logging updates to the Objects table. Please note that my approach to all examples in this book is to keep them as simple as they can be, while still providing a realistic demonstration of the point, which here is the effect of SET ROWCOUNT. So, in this case, I have omitted: • a "real" key on the ObjectsChangeLo tab g le, enforced by a UNIQU constrain E t (ObjectID, ChangedColumnName, ChangedAt), in addition to the surrogate key on ObjectsChangeLogID • the equivalent insert and delete triggers to log I and NSERDTELETE modifications, as well as UPDATE s. Likewise, there are several ways of logging changes, and the one I chose here may not be the best approach; again, my goal was to keep the example focused and simple. Listing 1-11 shows the code that tests how our trigger logs changes against the tab Obj le e.cts 36Chapter 1: Basic Defensive Database Programming Techniques BEGIN TRAN ; TRUNCATE TABLE can also be used here DELETE FROM dbo.ObjectsChangeLog ; UPDATE dbo.Objects SET SizeInInches = 12 , WeightInPounds = 14 WHERE ObjectID = 1 ; we are selecting just enough columns to demonstrate that the trigger works SELECT ObjectID , ChangedColumnName , OldValue FROM dbo.ObjectsChangeLog ; we do not want to change the data, only to demonstrate how the trigger works ROLLBACK ; the data has not been modified by this script ObjectID ChangedColumnName OldValue - 1 SizeInInches 10 1 WeightInPounds 10 Listing 1-11: Testing the trigger. Apparently, our trigger works as expected However, with a little further testing, we can prove that the trigger will sometimes fail to log Us made to the PDATE Objects table, due to an underlying assumption in the trigger code, of which the developer may not even have been aware 37Chapter 1: Basic Defensive Database Programming Techniques The ROWCOUNT assumption Let's consider what might happen if, within a given session, a user changed the default value for ROWCOUNT and then updated the Object tab s le, without resetting ROWCOU , NT as shown in Listing 1-12. DELETE FROM dbo.ObjectsChangeLog ; SET ROWCOUNT 1 ; do some other operation(s) for which we needed to set rowcount to 1 do not restore ROWCOUNT setting to its default value BEGIN TRAN ; UPDATE dbo.Objects SET SizeInInches = 12 , WeightInPounds = 14 WHERE ObjectID = 1 ; make sure to restore ROWCOUNT setting to its default value so that it does not affect the following SELECT SET ROWCOUNT 0 ; SELECT ObjectID , ChangedColumnName , OldValue FROM dbo.ObjectsChangeLog ; ROLLBACK ; 38Chapter 1: Basic Defensive Database Programming Techniques ObjectID ChangedColumnName OldValue - - 1 SizeInInches 10 Listing 1-12: Breaking the trigger by changing the value of ROWCOUNT. As a result of the change to the ROWCO val UNT ue, our trigger processes the query that logs changes to the SizeInInc co hes lumn, returns one row, and then ceases processing. This means that it fails to log the change to the WeightInPounds column. Of course, there is no guarantee that the trigger will log the change to the SizeInInches column. On your server, the trigger may log only the change of WeightInPounds but fail to log the change in SizeInInc. Whi hes ch column will be logged depends on the execution plan chosen by the optimizer, and we cannot assume that the optimizer will always choose one and the same plan for a query. Although the developer of the trigger may not have realized it, the implied assumption regarding its implementation is that ROWCO is set to its defaul UNT t value. Listing 1-12 proves that that, when this assumption is not true, the trigger will not work as expected. Enforcing and eliminating the ROWCOUNT assumption Once we understand the problem, we can fix the trigger very easily, by resetting ROWCOUNT to its default value at the very beginning of the body of the trigger, as shown in Listing 1-13. ALTER TRIGGER dbo.Objects_UpdTrigger ON dbo.Objects FOR UPDATE AS BEGIN; the scope of this setting is the body of the trigger SET ROWCOUNT 0 ; INSERT INTO dbo.ObjectsChangeLog ( ObjectID , ChangedColumnName , ChangedAt , OldValue ) 39Chapter 1: Basic Defensive Database Programming Techniques SELECT i.ObjectID , 'SizeInInches' , CURRENT_TIMESTAMP , d.SizeInInches FROM inserted AS i INNER JOIN deleted AS d ON i.ObjectID = d.ObjectID WHERE i.SizeInInches d.SizeInInches UNION ALL SELECT i.ObjectID , 'WeightInPounds' , CURRENT_TIMESTAMP , d.WeightInPounds FROM inserted AS i INNER JOIN deleted AS d ON i.ObjectID = d.ObjectID WHERE i.WeightInPounds d.WeightInPounds ; END ; after the body of the trigger completes, the original value of ROWCOUNT is restored by the database engine Listing 1-13: Resetting ROWCOUNT at the start of the trigger. We can rerun the test from Listing 1-12, and this time the trigger will work as required, logging both changes. Note that the scope of our SET ROWC is the trigger OUNT , so our change will not affect the setting valid at the time when the trigger was fired. SET ROWCOUNT is deprecated in SQL Server 2008… …and eventually, in some future version, will have no effect on INSERT , UPDATE or DELETE statements. Microsoft advises rewriting any such statements that rely on ROWCOUNT to use TOP instead. As such, this example may be somewhat less relevant for future versions of SQL Server; the trigger might be less vulnerable to being broken, although still not immune. However, at the time of writing, this example is very relevant. 40Chapter 1: Basic Defensive Database Programming Techniques In this case, one simple step both enforces the underlying assumption, by ensuring that it is always valid, and eliminates it, by ensuring that the code continues to work in cases where ROWCOUNT is not at its default value. Proactively x fi ing SET ROWCOUNT vulnerabilities We have fixed the ROWCOUNT vulnerability in our trigger, but our job is not done. What about other modules in our system? Might they not have the same vulnerability? Having learned of the potential side effects of SREOTWCOUNT, we can now analyze all the other modules in our system, determine if they have the same problem, and fix them if they do. For example, our stored procedure, SelectMessagesBySubjectBe-gin ning (Listing 1-1) has the same vulnerability, as demonstrated by the test in Listing 1-14. SET ROWCOUNT 1 ; must return two rows EXEC dbo.SelectMessagesBySubjectBeginning SubjectBeginning = 'Ne' ; …(Snip)… Subject Body - Next release delayed Still fixing bugs Listing 1-14: SET ROWCOUNT can break a stored procedure just as easily as it can break a trigger. We can apply the same fix, adding SET ROWCOUNT 0; to the very beginning of this stored procedure. Similarly, we should apply this fix to all other modules that need it. If your code is supposed to exist for a considerable time, then it makes perfect sense to fix problems proactively. It is usually faster and easier to do so than to wait until the problem occurs, spend considerable time troubleshooting, and then eventually implement the same fix. 41Chapter 1: Basic Defensive Database Programming Techniques How SET LANGUAGE can break a query Just as the value of ROWCOUN can be changed at the sessi T on level, so can other settings, such as the default language. Many developers test their code only under the default language setting of their server, and do not test how their code will respond if executed on a server with a different language setting, or if there is a change in the setting at the session level. This practice is perfectly correct, as long as our code always runs under the same settings as those under which we develop and test it. However, if or when the code runs under different settings, this practice will often result in code that is vulnerable to errors, especially when dealing with dates. Consider the case of a stored procedure that is supposed to retrieve from our ObjectsChangeLog table (Listing 1-10) a listing of all changes made to the Objects table over a given date range. According to the requirements, only the beginning of the range is required; the end of the range is an optional parameter. If an upper bound for the date range is not provided, we are required to use a date far in the future, December 31, 2099, as the end of our range. CREATE PROCEDURE dbo.SelectObjectsChangeLogForDateRange DateFrom DATETIME , DateTo DATETIME = NULL AS SET ROWCOUNT 0 ; SELECT ObjectID , ChangedColumnName , ChangedAt , OldValue FROM dbo.ObjectsChangeLog WHERE ChangedAt BETWEEN DateFrom AND COALESCE(DateTo, '12/31/2099') ; GO Listing 1-15: Creating the SelectObjectsChangeLogForDateRange stored procedure. 42

Advise: Why You Wasting Money in Costly SEO Tools, Use World's Best Free SEO Tool Ubersuggest.