Possible problem with differential backups

  • Follow


Hi all,
I have a 170GB OLTP database that only processes INSERTs and SELECTs,
no UPDATEs or DELETEs. I add about 1.8GB of data every day, as
verified with my transaction log backups taken every 15 minutes. A
full database backup is created every Friday at 10PM and then
differential backups are created Sa, Su, Mo, Tu, We & Th at 10PM. The
problem I have is that the differential backup file from Saturday
night (the 1st one that is created) is always about 38GB. The
subsequent differential backups add about 1.8GB of data until I return
to Friday night and the process begins anew.

At first I thought that I had some job that was running between Friday
and Saturday, but I cannot find anything. Reviewing a SQL Profiler
trace for that 24 hour period actually showed that the weekend is the
slowest period of data ingestion. To test this I performed a full
database backup at 8AM on Friday morning and then waited 10 minutes to
perform a differential backup. The differential backup file was 8GB.

I opened a case with Microsoft and my tech has stuck with the solution
that this is normal and that I really must have 38GB worth of extent
changes on Saturday and that I must have had 8GB worth of extent
changes during my 10 minute test period. This just doesn't seem
correct to me.

He told me that the differential backup process only keeps track of
extent changes since the last full backup. I agree with this. However,
the example he used does not seem correct.

Given a database with ten extents, all marked as "0" at the time of
full backup/baseline:
0000000000

Let's say that three of those extents are changed to "1":
0001100100

the first differential backup will record those three changes.

Now if those two of the "1"s change back to "0"s:
0000000100

the next differential backup will be smaller that the previous one
because only one extent has been changed from the baseline. I disagree
with him. I believe that there have been five extent changes in total
and just because it happened to change back to the original data it
doesn't matter. I have not found any documentation supporting or
disproving this theory.

Can anyone provide some additional insight?

Thank you.
0
Reply JoshBond 9/8/2010 5:30:35 PM

JoshBond (joshsackett@gmail.com) writes:
> The problem I have is that the differential backup file from Saturday 
> night (the 1st one that is created) is always about 38GB. The subsequent
> differential backups add about 1.8GB of data until I return to Friday
> night and the process begins anew. 

The typical thing would be that you have a reindex job running in that
time frame. However:

> To test this I performed a full database backup at 8AM on Friday morning
> and then waited 10 minutes to perform a differential backup. The
> differential backup file was 8GB. 

I assume that you are not running reindexing jobs that time of day.

Anyway, just to make sure that I get this correctly.

1) You take a full backup.
2) You take a diff backup. This backup is 38 GB in size.
3) You take a new diff backup one day later. This backup is 39.8 GB.
4) And next is 41.6 GB and so on.

Is that right?


> Given a database with ten extents, all marked as "0" at the time of
> full backup/baseline:
> 0000000000
> 
> Let's say that three of those extents are changed to "1":
> 0001100100
> 
> the first differential backup will record those three changes.
> 
> Now if those two of the "1"s change back to "0"s:
> 0000000100

No, that does not make sense. How would SQL Server knows that the
data has reverted back to the original state? 

I don't use differential backups much, but I don't see why the first
diff backup is that big. That's a little funny.

What recovery mode is the database in? Full or Bulk_logged? Are there
any special features in SQL Server you use like replication,full-text
etc?


-- 
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

0
Reply Erland 9/9/2010 8:21:12 PM


By the way, would it be possible for you to post the exact BACKUP command
that you use?
-- 
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

0
Reply Erland 9/9/2010 8:22:05 PM

> 1) You take a full backup.
> 2) You take a diff backup. This backup is 38 GB in size.
> 3) You take a new diff backup one day later. This backup is 39.8 GB.
> 4) And next is 41.6 GB and so on.
> Is that right?

That is correct.

I am not running any reindex jobs during that time period. In fact, no
jobs run between the FULL backup and the first DIFF.

The statements I use are:

BACKUP DATABASE [MyDatabase]
   TO DISK = 'F:\Server1\MyDatabase_FULL_YYYYMMDD.BAK'

-- Time elapses.
BACKUP DATABASE  [MyDatabase]
   TO DISK = 'F:\Server1\MyDatabase_DIFF_YYYYMMDD.DIF'
   WITH DIFFERENTIAL

Also, I typically use RedGate's SQL Backup software. But I have ruled
that out as the problem because the overly large DIFF backup happens
whether I use native SQL backups or RedGate's software.
0
Reply JoshBond 9/10/2010 4:10:17 PM

One possibility is you have something which updates lots of rows, and then 
keeps updating those same rows over again (maybe an UPDATE with an incorrect 
WHERE clause).  That would explain why the diff grows very fast after the 
full backup, but then since it is updating the same rows all the time, the 
diff doesn't grow much, but gets large again very soon after the next full 
backup.

One thing you could try (since it seems to only take about 10 minutes to see 
at least some of this) is to run a SQL trace for a while.  Maybe even only 
just a few minutes if you find it affects performance too much.  In that 
trace you want to include events SQL:StmtCompleted and SP:StmtCompleted. 
Include the column IntegerData.  What is put in IntegerData depends on the 
event, but for those two events, IntegerData is the number of rows affected 
by that statement.  By looking at that column and the Writes column, you may 
be able to see if you have a SQL UPDATE that is affecting more rows than it 
should.

Tom

"JoshBond" <joshsackett@gmail.com> wrote in message 
news:8c683c50-fee9-47fb-8ad2-55324f077e73@m35g2000prn.googlegroups.com...
>> 1) You take a full backup.
>> 2) You take a diff backup. This backup is 38 GB in size.
>> 3) You take a new diff backup one day later. This backup is 39.8 GB.
>> 4) And next is 41.6 GB and so on.
>> Is that right?
>
> That is correct.
>
> I am not running any reindex jobs during that time period. In fact, no
> jobs run between the FULL backup and the first DIFF.
>
> The statements I use are:
>
> BACKUP DATABASE [MyDatabase]
>   TO DISK = 'F:\Server1\MyDatabase_FULL_YYYYMMDD.BAK'
>
> -- Time elapses.
> BACKUP DATABASE  [MyDatabase]
>   TO DISK = 'F:\Server1\MyDatabase_DIFF_YYYYMMDD.DIF'
>   WITH DIFFERENTIAL
>
> Also, I typically use RedGate's SQL Backup software. But I have ruled
> that out as the problem because the overly large DIFF backup happens
> whether I use native SQL backups or RedGate's software. 

0
Reply Tom 9/10/2010 6:03:36 PM

Are you sure you are not backing up the first DIFF into the same file as 
your full backup?

"JoshBond" <joshsackett@gmail.com> wrote in message 
news:8c683c50-fee9-47fb-8ad2-55324f077e73@m35g2000prn.googlegroups.com...
>> 1) You take a full backup.
>> 2) You take a diff backup. This backup is 38 GB in size.
>> 3) You take a new diff backup one day later. This backup is 39.8 GB.
>> 4) And next is 41.6 GB and so on.
>> Is that right?
>
> That is correct.
>
> I am not running any reindex jobs during that time period. In fact, no
> jobs run between the FULL backup and the first DIFF.
>
> The statements I use are:
>
> BACKUP DATABASE [MyDatabase]
>   TO DISK = 'F:\Server1\MyDatabase_FULL_YYYYMMDD.BAK'
>
> -- Time elapses.
> BACKUP DATABASE  [MyDatabase]
>   TO DISK = 'F:\Server1\MyDatabase_DIFF_YYYYMMDD.DIF'
>   WITH DIFFERENTIAL
>
> Also, I typically use RedGate's SQL Backup software. But I have ruled
> that out as the problem because the overly large DIFF backup happens
> whether I use native SQL backups or RedGate's software. 

0
Reply Jeffrey 9/10/2010 8:00:53 PM

Tom Cooper (tomcooper@comcast.net) writes:
> One possibility is you have something which updates lots of rows, and
> then keeps updating those same rows over again (maybe an UPDATE with an
> incorrect WHERE clause).  That would explain why the diff grows very
> fast after the full backup, but then since it is updating the same rows
> all the time, the diff doesn't grow much, but gets large again very soon
> after the next full backup. 

Interesting idea. We'll see what Josh says.

> One thing you could try (since it seems to only take about 10 minutes to
> see at least some of this) is to run a SQL trace for a while.  Maybe
> even only just a few minutes if you find it affects performance too
> much.  In that trace you want to include events SQL:StmtCompleted and
> SP:StmtCompleted. Include the column IntegerData.  What is put in
> IntegerData depends on the event, but for those two events, IntegerData
> is the number of rows affected by that statement.  

There is also the RowCounts column.

Another approach would be to use DBCC PAGE to look at the page with the
DIFF map, and see to which table(s) the updates exents belong to. A bit
tedious though, and not to be done without one Kalen Delaney's books beside
you.

-- 
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

0
Reply Erland 9/10/2010 8:32:42 PM

On Sep 10, 1:00=A0pm, "Jeffrey Williams" <jeff.williams3...@verizon.net>
wrote:
> Are you sure you are not backing up the first DIFF into the same file as
> your full backup?
>

Good idea. However, my first backup is 170GB and the first diff is
38GB. Different files.
0
Reply JoshBond 9/10/2010 10:31:46 PM

7 Replies
241 Views

(page loaded in 0.122 seconds)


Reply: