Web Of Ideas blogs http://www.webofideas.co.uk The latest blogs from the Web Of Ideas website. en-gb Copyright: (C) Web Of Ideas http://www.webofideas.co.uk/Images/GraphicsSets/ecommerce_logo.jpg Ecommerce websites from Web of Ideas http://www.webofideas.co.uk 209 66 C# double question mark (??) syntax <p>Thought I'd share something that I learned today about C#.<br/> Been using it nearly 10 years, and didn't know about the <b>??</b> syntax!<br/></p> <br/> <div class="blogCode"> a = b ?? c; </div> <p><br/><b>a</b> will be assigned the value of <b>b</b> if <b>b</b> is not null. If <b>b</b> is null, <b>a</b> will be assigned the value of <b>c</b>.<br/><br/> i.e. its shorthand for:<br/> <b>if (b == null) a = c else a = b</b></p><br/> <p>Not earth shattering, but a nice time saver, and it follows the same theme of a question mark indicating the null option on system type declarations, like <b>int?</b>, <b>date?</b> etc..</p> <h2>COMMENTS</h2> <div style="border: solid 1px black;margin:10px"> <table style="border:none"> <tr> <td width="200"><img src='http://www.webofideas.co.uk/Images/GraphicsSets/ecommerce_no_image.gif' width='150' height='150' alt='ecommerce no image'></img></td> <td style="vertical-align:top"> <b>17 Dec 2010, 21:20 by Shelley</b> <br/><i>Thanks, great explanation!</i> </td> </tr> </table> </div> http://www.webofideas.co.uk/Blog/c-sharp-double-question-mark-syntax.aspx Fri, 10 Dec 2010 00:00:00 GMT Ctrl Alt Del in nested remote desktop When using a windows remote-desktop session, the <b>Control Alt Delete</b> keyboard combination doesn't necessarily do what you expect. It will display the 'windows security' dialog in the PC you are running remote desktop in. What if you need to display it on the remote desktop PC?<br/><br/> To press <b>ctrl alt del</b> <i>within</i> the remote desktop, use <b>ctrl alt end</b>. <br/><br/> What about nested remote desktops? - e.g. a remote desktop within a remote desktop. In this case, pressing <b>ctrl alt end</b> always takes you to the first remote desktop session (i.e. the 'outer' session).<br/> To get to the 'inner' session press <b> shift ctrl alt end </b> <br/><br/> Update (21 Feb 2012): In a VMWare client, <b>ctrl alt end</b> doesn't work, need to use <b>ctrl alt insert</b>. <h2>COMMENTS</h2> <div style="border: solid 1px black;margin:10px"> <table style="border:none"> <tr> <td width="200"><img src='http://www.webofideas.co.uk/Images/GraphicsSets/ecommerce_no_image.gif' width='150' height='150' alt='ecommerce no image'></img></td> <td style="vertical-align:top"> <b>14 Feb 2012, 14:56 by KTM</b> <br/><i>It is actually shift+CTRL+ALT+DEL. It worked from my laptop while connecting through RDP</i> </td> </tr> </table> </div> <div style="border: solid 1px black;margin:10px"> <table style="border:none"> <tr> <td width="200"><img src='http://www.webofideas.co.uk/Images/GraphicsSets/ecommerce_no_image.gif' width='150' height='150' alt='ecommerce no image'></img></td> <td style="vertical-align:top"> <b>11 Nov 2011, 10:05 by Daniel</b> <br/><i>Sorry, for me it does not work. Nested RDP does not respond to shift ctrl alt end</i> </td> </tr> </table> </div> <div style="border: solid 1px black;margin:10px"> <table style="border:none"> <tr> <td width="200"><img src='http://www.webofideas.co.uk/Images/GraphicsSets/ecommerce_no_image.gif' width='150' height='150' alt='ecommerce no image'></img></td> <td style="vertical-align:top"> <b>09 Sep 2011, 03:52 by Joe</b> <br/><i>thanks, I really needed this</i> </td> </tr> </table> </div> <div style="border: solid 1px black;margin:10px"> <table style="border:none"> <tr> <td width="200"><img src='http://www.webofideas.co.uk/Images/GraphicsSets/ecommerce_no_image.gif' width='150' height='150' alt='ecommerce no image'></img></td> <td style="vertical-align:top"> <b>09 Sep 2011, 03:19 by Joe</b> <br/><i>Thaaaaaaaaaaaaaaank you! I really needed this.</i> </td> </tr> </table> </div> http://www.webofideas.co.uk/Blog/remote-desktop-ctrl-alt-delete-nested.aspx Wed, 29 Dec 2010 00:00:00 GMT How to reset an identity column seed in SQL server <p>Always have to look this up so though it worthy of a blog post. An identity column has an automatically generated value that increments for each row added to the table. The starting (‘seed’) value and increment value can be specified, but are usually both set to 1.</p> <p>To set the identity seed back to 1 for a table:<br/><b>DBCC CHECKIDENT('[table name]', RESEED, 1)</b></p> <p>After running this SQL, the next row to be added will get a value of 1 in the identity column.</p> <p>Alternatively, to delete all rows on the table and reset the identity, do a table truncate. A truncate also resets the identity seed to its starting value:<br/> <b>TRUNCATE TABLE [table name]</b></p> <p>A few gotchyas on identity columns: <ul> <li>There is nothing to guarantee that the value in an identity column will be unique or continuous.</li> <ul> <li>If the seed is reset to a lower value without deleting rows, duplicate values will get created. SQL Server does not prevent this. <li>If rows are deleted (without resetting the seed), gaps will appear in the identity values. Again, SQL server does not prevent this.</li> </ul> <li>Only one identity column is allowed per table (which is why the SQL above relates to a table not a column).</li> <li>A key column can be an identity column. In fact this is a common technique to generate unique keys (bearing in mind the uniqueness point above).</li> </ul> <h2>COMMENTS</h2> http://www.webofideas.co.uk/Blog/sql-server-identity-key-column-reset-seed-reseed.aspx Mon, 06 Dec 2010 00:00:00 GMT Introduction to LINQ for C# <h3>What Is Linq?&nbsp&nbsp</h3> <p>Well, it stands for <b>L</b>anguage <b>IN</b>tegrated <b>Q</b>uery, and in simple terms its SQL for code. It lets you query any data within your code using an SQL-like syntax. Simples.</p> <p>But there’s a lot more to it than that, as I discovered yesterday when I was looking at ‘expression trees’. I won’t go into expression trees here, that’s for another blog, but in the process, I discovered what Linq is all about and how it fits nicely into the C# language.</p> <h3>How is Linq implemented in C# and dotNET</h3> <p>To enable SQL-like syntax in the C# compiler, a lot had to change. A lot of new features were added to C# 3.0, which are mostly useful in their own right, but clearly designed primarily to support Linq.</p> <p>Linq only became properly available in the dotNET framework version 3.5 which includes the new assemblies required (i.e. System.Linq), but the compiler foundations were put in place in C# version 3.0.</p> <p>Take a look at this simple Linq statement:</p> <div class="blogCode"> var robotMonsters = <br/> &nbsp;&nbsp;&nbsp;&nbsp;from m in Monsters <br/> &nbsp;&nbsp;&nbsp;&nbsp;where m.IsRobot == true <br/> &nbsp;&nbsp;&nbsp;&nbsp;select new { m.Name, m.HomePlanet, m.FavouriteColour } </div> <br/> It uses the following features of C# that were added in version 3.0 <ul> <li><b>Query Keywords:</b> <p><i>from</i>, <i>where</i> and <i>select</i>. These are language keywords, known as ‘query keywords’, and built into C#.<br/> There are other query keywords - group, into, orderby, join, let, in, on, equal, by, ascending and descending.</p> <p>Note that these aren’t ‘proper’ keywords. They get converted into a different construct by the compiler (see below).</p> </li><br/> <li><b>Type inference:</b> <p>The <i>var</i> type for the result. When you use var, you are telling the compiler to figure out the type because you can’t be bothered.<br/> By the way, I’m not a fan of var where it isn’t needed, as it makes code more difficult to read – having the type explicitly in the code helps you understand what is going on. There’s too much lazy use of var, in my opinion!<br/> However, in this Linq query, the var type is essential because the result of the query is an anonymous type (see below) so there is no explicit type to use. </li><br/> <li><b>Anonymous types:</b> <p>The result in the query above is a collection of a new type that contains Name, HomePlanet and FavouriteColour members. This type is not explicitly declared anywhere – the compiler creates it on the fly. From the<br/> <b>new { b.Name, b.HomePlanet, b.FavouriteColour }</b><br/> syntax, it knows the type of all the members, so it is able to do this. This is why we have to use the var keyword for the robotMonsters result, as there is not an explicit type. The compiler will create a new anonymous type, and will figure out that the robotMonsters result is a collection of this type. </li> </ul> <br/> <p>I mentioned above that query keywords are not proper keywords, because the compiler pre-processes them into a different construct. The compiler will pre-process the above Linq query into something like this:</p> <div class="blogCode"> var robotMonsters = <br/> &nbsp;&nbsp;&nbsp;&nbsp;Monsters<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Where(m => m.IsRobot == true)<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.Select(m => new {m.Name, m.HomePlanet, m.FavouriteColour}); </div> <br/> <p>Which takes us nicely onto some other C# 3.0 features:</p> <ul> <li><b>Extension methods</b><br/> <p>These give us the ability to extend existing classes without having access to the code or having to derive from them. For example, to extend string to remove rude words, you would create a method (in a utility class), like this:</p> <div class="blogCode"> public static string RemoveRudeWords(this string s)<br/> {<br/> &nbsp;&nbsp;&nbsp;&nbsp;// remove rude words from s and return it<br/> } </div> Then you could use this syntax: <div class="blogCode"> &nbsp;&nbsp;&nbsp;&nbsp;string s = “dash, darn and blast”;<br/> &nbsp;&nbsp;&nbsp;&nbsp;s = s.RemoveRudeWords(); </div> <p>In reality, its just smoke and mirrors – behind the scenes C# calls your helper method; but its nicer to read and more ‘OO’.</p> <p>Linq uses extension methods to add functionality to collection objects in C#, for example it adds <i>Where</i> and <i>Select</i> as extension methods to <i>IEnumerable</i> and all the built-in classes that use it, so the above query works. These extensions (along with all the other Linq stuff not built in to the compiler directly) are in the <i>System.Linq</i> library.</p> </li><br/> <li><b>Lamda expressions:</b> <p>These are a progression of anonymous methods from C# 2.0. It’s basically a concise syntax for declaring a method. In the example, the method we are declaring for the Select would like this if explicitly coded:</p> <div class="blogCode"> public static var GetRobotMonster(Monster m)<br/> {<br/> &nbsp;&nbsp;&nbsp;&nbsp;return new {m.Name. m.HomePlanet, m.FavouriteColour};<br/> } </div> <p>and, if we coded this method, we could pass it in as the delegate to the Select, like this:</p> <div class="blogCode"> .Select(GetRobotMonster) </div> <p>Lamda expressions, like anonymous methods, enable you to create a method ‘on the fly’, but they strip out all unnecessary syntax.</p> <div class="blogCode"> m => new {m.Name, m.HomePlanet, m.FavouriteColour} </div> <p>m is the parameter into the method – the compiler can work out its type, so no need to specify it. The compiler can also work out the return type from the context, so no need to specify it. The return type from the Select (and Where) extension methods is templated - <i>IEnumerable&lt;T&gt;</i>, so T will become the anonymous type.</p> </li> </ul> <br/> <h3>Whats the point of Linq?</h3> Linq doesn’t do anything you couldn’t already do. If you want to query some data in C#, you could write this code: <div class="blogCode"> List<Monster> results = new List<Monster>();<br/> foreach (Monster m in Monsters)<br/> {<br/> &nbsp;&nbsp;if (m.IsRobot)<br/> &nbsp;&nbsp;{<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;results.Add(m);<br/> &nbsp;&nbsp;}<br/> } </div> <p>This is known as imperative programming – you specify exactly how to do it (i.e. you loop through all monsters, check if they are robots, if they are add to the results).</p> <p>Linq is an example of declarative programming. You specify what you want to do only (get all Monsters that are robots), and you don’t care how it is done.<br/> Heads up – common interview question, so remember the difference between imperative and declarative programming!</p> <p>It is something of a paradigm shift to work with Linq, especially for programmers like me who have years of imperative programming under their belt. However, it is cleaner, and its easier to see what’s going on in other peoples code (as we all know, your own code in 2 weeks time is effectively other peoples code). So, Linq is more productive, if used appropriately. Its probably not more technically efficient, as a generic ‘Where’ algorithm will never be as fast as a custom-coded solution; but that isn’t normally a huge concern unless you’re right in the inner loop of a graphics algorithm for XBox.</p> <p>The main point of Linq, from Microsofts perspective anyway, is to have a consistent query language for all kinds of different data sources – relational databases (via ADO.Net), XML, in-program collections, etc.. Any new classes you develop can become ‘Linq compliant’ by implementing a few methods. More interestingly, you can make any existing class ‘Linq compliant’ by implementing a few extension methods, even if you don’t have access to the original code.</p> <p>Microsoft would clearly like Linq to be the query language of choice for programmers – i.e. replace XPath, SQL and other methods of querying data in code. I think they’ve got a pretty good shot at it – it makes sense to have a consistent way of querying, and Linq is well thought out and flexible enough to be the contender.</p> <br/> <p>That’s it for now. Off to do some imperative cooking of my dinner.<br/><br/> <b>Imperative: </b>Chop onions, garlic, chilli, vegetables. Fry for 10 minutes. Add curry paste. Fry another 2 minutes. Add tinned tomatoes. Cook for 20 minutes. Add coriander leaves. Serve.<br/><br/> <b>Declarative: </b>Phone for takeaway. </p> <h2>COMMENTS</h2> http://www.webofideas.co.uk/Blog/linq-introduction-lamda-expressions-query-keywords-anonymous-types-extension-methods-imperative-declarative.aspx Fri, 04 May 2012 00:00:00 GMT Microsoft Access as a website database <br/><p>This blog details how you can successfully use MS Access as the data store for your website.</p> <br/><p>In this blog, I explore the issues you will encounter and explain how to overcome them. A fairly high level of knowledge on web-site and database architecture is assumed. It is also assumed that asp.net is used for the website with ado.net / oledb used to talk to the database, though the techniques described could eaily be fitted to other technologies.</p> <br/><h2>Why?</h2> <br/><p>Access was never intended to be a multi-user high performance database, so why would you want to use it as such? The database behind a modern website needs to be tough – it needs to cope with high volumes, multiple concurrent accesses, transactions, backups. It also needs to be efficient so web page loading times are kept to a minimum. From these requirements, Access is not an obvious choice. It’s great as a single user database for relatively small amounts of data. And contrary to popular belief, the Jet engine is quite a robust and efficient SQL engine, and works nicely with the asp.net technology offerings. But its not <i>SQL Server</i> or <i>Oracle</i>. Its not even <i>My SQL</i>. So why try and use it as a website database? </p> <br/><p>Well, it all comes down to money. The Jet engine is free – free to use and free to host. This means so long as you have a legal copy of Access to develop the database, there will be no hosting costs. Other databases will incur, as a minimum, additional hosting costs (check out you hosting provider – chances are to add an SQL Server or My SQL database will cost extra). There may also be licencing costs, though MySQL is obviously free, and SQL Server Express is free and fine for most small to medium websites. Another consideration may be time and skill set. Many programmers know Access but not SQL Server or My SQL, and if you need to knock out a website by next Tuesday, you haven't got time to learn!</p> <br/><p>There are a lot of obstacles to overcome in using MS Access as a website database, but it can be made to work and provide a cost effective solution for lower-traffic websites. Below I go through the issues you will need to overcome to achieve this.</p> <br/><h2>Connection pooling</h2> <br/><p>Despite certain documentation indicating otherwise, there doesn’t seem to be any connection pooling implemented in the ADO.Net JET engine. Or if it is implemented, it is very poor, because implementing your own dramatically improves performance. In fact, I would say it is essential to use connection pooling in any pages that are heavy on database access, if you want your site to be responsive. It is also prudent to implement some page or fragment caching and/or data object caching (using the dotNet built-in cache), but that is not the focus of this blog. <br/><p> <br/><h3>Access connection pooling code</h3> <br/><p>Its pretty simple to implement connection pooling – just have a class encapsulate your connection object. This class then controls when the connection is opened and closed. Below is some code for a very basic connection pool:</p> <br/><pre>using System; <br/>using System.Data.OleDb; <br/><br/>static public class AccessConnectionPool <br/>{ <br/> static private OleDbConnection _connection = null; <br/> static private object _lock = new object(); <br/><br/> /// <br/> /// Returns an open connection from a connection pool <br/> /// <br/> static public OleDbConnection GetConnection() <br/> { <br/> lock (_lock) <br/> { <br/> if (_connection == null) <br/> { <br/> _connection = new OleDbConnection(); <br/> } <br/><br/> if (_connection.State != System.Data.ConnectionState.Open) <br/> { <br/> _connection.Open(); <br/> } <br/><br/> return _connection; <br/> } <br/> } <br/><br/> /// <br/> /// Closes all open connections <br/> /// <br/> static public void CloseAll() <br/> { <br/> if (_connection.State == System.Data.ConnectionState.Open) <br/> { <br/> _connection.Close(); <br/> } <br/> } <br/>} <br/></pre> <br/> <p> <br/><UL class=bl> <br/><li>The code (above) is a very basic implementation to illustrate the concept. <br/><li>Use the GetConnection() method whenever you need an OleDbConnection object - the connection returned is already open so no need to call Open on it. <br/><li>When you have finished all your data access, use the CloseAll() method to close the connection. You might want to call this, for example, when all page processing is finished (in your Page_Unload event). It is important that this is <i>always</i> called – see below for warnings on what will happen if this is not called! <br/><li>Note that the ‘pool’ is just one connection. You should only maintain one connection per database – remember access is not designed for multiple concurrent access. The Jet OleDb engine does allow you to open multiple connections, but there’s no performance advantage. Also, I can tell you from bitter experience that it is a very bad idea to have more than one open. However exact you are at cleaning up your connections, Access doesn’t like attempted concurrent access in this way, and will eventually start throwing random errors and stop working completely, bringing your website down in the process. You have been warned! <br/><li>What works well for sites I have developed is to have a connection pool (of one connection) that is used only for database read operations. For updates, I create a new connection (i.e. don’t use pooling). This works where update operations are rare relative to read operations (which is pretty much all websites). However, you must always close the open read connection before you open the connection to perform the update – if you don’t, access can get confused and return incorrect data after the update has been committed. <br/><li>You will notice I have serialised access to the connection via a lock. I strongly suggest you serialise all database access – e.g. have all database access contained in a single class and use a common lock object around all methods. <br/><li><b>WARNING: Don’t keep the connection object open over page requests!!</b> If you do keep connections open over page requests, your website will slowly die, as every now and then a connection will get orphaned. You <i>can’t</i> prevent this – if the website has some problem between page requests – for example if an administrator uploads a new file that forces an application re-start, your connection will get orphaned. More advanced databases like SQL Server can cope with this (in SQL Server the connection pooling is external to the application), but Access can’t cope - eventually you will start getting random error message returned by the JET engine and everything will stop working. The only way to recover is to re-set the application pool your website is running in, or re-start IIS. Even keeping a connection open for the length of a request can cause issues if the request dies and the connection doesn’t get closed. Your error handling needs to ensure that connections are closed as soon as there is the slightest hint that something is going wrong. </li></ul> <br/><p></p> <br/><h3>Always close your connections!</h3> <br/><p>I have already mentioned this, but it really is vital if implementing Access connection pooling. You <i>will</i> hit major issues and waste hours trying to figure out what is going on if you don’t get this right. <i>Never</i> leave any orphaned connections. With more tolerant databases, the odd orphaned connection isn’t a big deal – the connection pooling will clean it up. With Access JET OleDb, orphaned connections build up over time and will eventually bring your website down. Your error handling should close all database connections. And take care when doing redirects from code, as these may prevent your connection closing code being called.</p> <br/><h2>SQL limitations</h2> <br/><p>The JET SQL engine has a few limitations in its implementation of SQL – for example you can only have one JOIN per SQL statement (for multiple joins you need to do it the hard way in the WHERE clause). It also does some things in a non-standard way, for example boolean and date syntax is a little strange. None of this is a major concern – there are always subtle differences in SQL syntax and functionality between providers. Just compare Oracle and SQL server!</p> <br/><h2>No stored procedures</h2> <br/><p>This is a far bigger issue, and will be a show-stopper for many. All your SQL will be in code rather than in the database. Personally, I don’t buy the whole ‘stored procedures are the best thing ever and woe betide anyone who runs SQL directly from code’ argument. Yes, they can be more efficient (depending largely on your software and hardware architecture), but its often only marginal, especially if you use SQL parameters in Ado.Net. Yes, security is better (but lets face it, its not that difficult to strip potential hacks from user input – just remove SQL comments, harmful keywords and sort out single quotes). I find stored procedures to be inflexible and arcane compared to a dedicated programming language. When I need to dynamically generate SQL, its so much easier to do it in C# than in T-SQL. For example, I have a whole library of re-usable code in C# that can easily and reliably generate complex SQL statements based on user input and search criteria. Building something similar in T-SQL is horrible. </p> <br/><p>There are, of course, other reasons for using stored procedures, and they are often quite clearly the right way to go. Just venting at the stored procedure police. Lets move on..</p> <br/><h2>Multi concurrent access</h2> <br/><p>This has already been covered to a large extent in the section on connection pooling. Access was designed as a single user database. There are ways for multiple users to access the same database, but they are not ideal and not suited for website access. For a web-based application, the only reliable option is to stick to single user access – the user being your website. I.e. serialise all access to the database.</p> <br/><p>In low traffic, or even medium traffic websites this shouldn’t cause major performance issues so long as the site and database are designed to make data access quick and efficient.</p> <br/><p>This obviously becomes more complex if you’re developing a multi-server website. But lets be honest, you shouldn’t be seriously considering using Access as the database in a server-farm scenario.</p> <br/><h2>File size creep</h2> <br/><p>There is a minor but niggling issue with Access (though it may have been fixed in more recent versions, this is based on Access 2003). Access database files (the mdb files) grow every time the database is opened for an update operation. Access adds a little free-space in anticipation of updates, but this free-space is never re-claimed. Thus, over time, the size of the mdb files grows out of proportion with the amount of data contained within them.</p> <br/><p>This doesn’t seem to cause any performance issues, but it is obviously a bug or design fault. There is an option to automatically shrink the database every time it is opened, but unfortunately this only applies when it is opened in the Access product, not when it is opened via ADO.Net. So, the only options are: <br/><ul> <br/><li>Live with it – its not really a major issue. <br/><li>Regularly open up the database in MS Access and shrink it back down. This is not straightforward when your database is on a remote web server! </li></ul> <br/><p></p> <br/><h2>Summary</h2>Its do-able!. For low-to-medium traffic, single server websites, so long as you follow the guidelines above, you can use MS Access as your website database. This can save the considerable time and/or expense of using a 'grown up' database such as SQL Server or My SQL. Any comments, questions, feel free to post using the comment box below. <b>The important thing is not to stop questioning.</b> <i>Albert Einstein</i> <h2>COMMENTS</h2> <div style="border: solid 1px black;margin:10px"> <table style="border:none"> <tr> <td width="200"><img src='http://www.webofideas.co.uk/Images/GraphicsSets/ecommerce_no_image.gif' width='150' height='150' alt='ecommerce no image'></img></td> <td style="vertical-align:top"> <b>17 Dec 2010, 09:23 by Louis</b> <br/><i>Thanks a million for this! Helped me fix my website issues - my website was slowly getting slower and slower over a period of days until I restarted IIS. Problem was I was keeping access connections open for long periods of time. Closing them at the end of each page request worked a treat!</i> </td> </tr> </table> </div> http://www.webofideas.co.uk/Blog/ms-access-database-website-data-store-jet-ado-net-ole-db-oledb.aspx Fri, 10 Dec 2010 00:00:00 GMT SELECT COUNT(*) in LINQ <p>Another short but sweet one:<br/> This is pretty obvious, but I wasn't thinking straight this morning and it took a while.<br/> To do the equivalent of an SQL SELECT COUNT(*) in LINQ, do this:</p> <div class="blogCode"> (from t in myTable where t.MyColumn == myVariable select t).Count(); </div> <p>Obviously, you can put whatever you like in the <b>where</b> clause, or remove it completely.</p> <h2>COMMENTS</h2> http://www.webofideas.co.uk/Blog/linq-select-count-sql-star.aspx Mon, 13 Dec 2010 00:00:00 GMT Using CHECKSUM and BINARY_CHECKSUM to get hash values from any SQL Server columns I recently had a task to ensure that two databases were identical.<br/> A database was backed up and written to a portable disk in one country, then transported by secure courier to me for further processing.<br/> As a security measure, I needed to come up with a mechanism to determine the data had not been tampered with during transportation.<br/><br/> To do this, I had a series of SQL totals queries run against the data before it was backed up. Then I ran the same queries after I had restored it and compared the results. Identical results meant the data was OK.<br/><br/> Simple totals aren't enough to absolutely guarantee the data has not been tampered with, so I devised some totals that tested the relationships between columns on rows.<br/> For example:<br/><br/> <div class="blogCodeLong"> SUM ( (column1 + column2) / column3 ) </div><br/><br/> However, most of the columns on the database were character data, so I needed a way to achieve this for non-numeric columns.<br/><br/> <b> Enter the SQL Server functions CHECKSUM and BINARY_CHECKSUM.</b><br/><br/> These functions produce a hash for any column, any data type, even if the value is NULL.<br/> So, for non-numeric data, the above query becomes:<br/><br/> <div class="blogCodeLong"> SUM ( (BINARY_CHECKSUM(column1) + BINARY_CHECKSUM(column2)) / BINARY_CHECKSUM(column3) ) </div><br/><br/> What is the difference between CHECKSUM and BINARY_CHECKSUM? The honest answer is I'm not sure. The Microsoft definition is woolly at best, and both produce a valid (but different) hash value. For my purposes, either would have worked. I chose BINARY_CHECKSUM mainly because it 'sounds' more exact than plain old CHECKSUM :)<br/> These are the MSDN definitions :<br/><br/> <b>CHECKSUM: </b>Returns the checksum value computed over a row of a table, or over a list of expressions. CHECKSUM is intended for use in building hash indexes.<br/> <b>BINARY_CHECKSUM: </b>Returns the binary checksum value computed over a row of a table or over a list of expressions. BINARY_CHECKSUM can be used to detect changes to a row of a table.<br/><br/> If anyone can explain this in plain English, please post a comment. Otherwise, I think my selection technique is valid - choose the one that sounds best! <h2>COMMENTS</h2> http://www.webofideas.co.uk/Blog/sql-server-hash-function-value-guid-numeric-character-all-data-types-difference-differences.aspx Thu, 13 Jan 2011 00:00:00 GMT