Friday, November 16, 2012

VLDBs: dbcc checktable, checkalloc, checkcatalog

I like the idea of a full dbcc checkdb run against a space efficient storage level point-in-time copy, or a mounted independent full point-in-time copy at least once within the retnetion period for the oldest full backup and all transaction logs from there to current. But not all of the VLDB systems I am concerned about have that capability. Running a full dbcc checkdb on these systems is concerning because of the total execution time (it would often overlap a high data change ETL), and the potential effects of the private database snapshot maintained by dbcc checkdb.

So I'll be Scripting dbcc checktable, checkalloc, checkcatalog per table for a VLDB with several thousand tables.  A few questions as I determine the scripting design.

1. It looks like, other than stress on the disk subsytem and/or CPU, the private database snapshot maintained by dbcc checktable is a good reason not to run these concurrently for different tables.  Is it true that this is best only attempted in a single thread of activity due to the database snapshot?

2. Do dbcc checktable, checkalloc, checkcatalog each disfavor database blocks they read?

3. If dbcc checktable, checkalloc, checkcatalog  are each run in succession for a given table, will various relevent pages be read from disk 3 times even if not forced by buffer pressure?

4. If a table or index page is already in the buffer pool cache, will dbcc checkdb read it from disk anyway? (some database systems will use cached blocks from cache, and bypass cache for misses/physical reads for integrity check purposes.)

5. Other than the answers to questions 2, 3, and 4 above, are there other considerations when deciding whether to perform operations table by table, or potentially mix-n-match in order to hit an execution time between a low and high watermark?  For example, assume I want only 4 hours per day of consistency check activity.  Should I run select tables for which to run the set of checktable, checkalloc, and checkcatalog based on the sum of their most recent executions being near 4 hours... or should I consider mx-n-match of checktable, checkalloc, and checkcatalog if it results in a more even distribution of execution time?

Background information on per-table dbcc checkdb ~ dbcc checkdb, checkalloc, checkcatalog...  
http://www.sqlskills.com/BLOGS/PAUL/post/CHECKDB-From-Every-Angle-Consistency-Checking-Options-for-a-VLDB.aspx
Not much information on dbcc checkdb disfavoring pagess read, but here's a mention...
http://www.sqlpassion.at/blog/PermaLink,guid,e9deffeb-55f9-4b2d-815d-a9efd523ca41.aspx

2 comments:

  1. My opinions:

    1. No. If a database snapshot already exists for a database, it will use the existing snapshot rather than create multiple snapshots. If you're going to run it concurrently, I would create the snapshot manually, run the commands, and then drop it manually.

    2. Yes.

    3. The three processes read different structures. Reading the smae pages really isn't a concern.

    4. The snapshot is treated like a different database in terms of the buffer cache. So a page in the buffer cache for the live database does not equal a page in cache for the snapshot database. They don't share cache pages.

    ReplyDelete
    Replies
    1. Thanks! I'll update once my scripts are complete, and provide some observations once the scripts are deployed.

      Delete