Wednesday, March 28, 2012

How much storage needed for bit datatypes?

Hi!

I'm investigating if it is an good idea to use some form of "bitmap index",

e.g. I have an normal table with 100-200 columns, but sometimes I just want to know:

"Is there an value in the cell" (...where col20 is not null)

So I will measure if I get some benifit, so If you tried this, let me hear, but otherwise see only the rest:

If I remember correctly SQL Server will group bit-columns in some way minimizing storage needs,

but my questions: how much storage is needed for an bit column:

1 or 2 bit (the second bit, because we can have three states: 0,1 and null)

Will 8 bit-columns need 1 byte or 2 byte?

I tried some tests with a number table (1mio rows), but It didn't give me the answer.

Best Regards

Bjorn

create table tmpbit (a bit)

create table tmpbit8 (a1 bit, a2 bit, a3 bit , a4 bit, a5 bit , a6 bit, a7 bit, a8 bit)

insert into tmpbit

select cast(1 as bit) from numbers

insert into tmpbit8

select cast(1 as bit) , cast(1 as bit) , cast(1 as bit) , cast(1 as bit) ,

cast(1 as bit) , cast(1 as bit) , cast(1 as bit) , cast(1 as bit)

from numbers

select object_id('tmpbit8') -- for input to dbcc

DBCC SHOWCONTIG scanning 'tmpbit' table...

Table: 'tmpbit' (1269891891); index ID: 0, database ID: 6

TABLE level scan performed.

- Pages Scanned................................: 1359

- Extents Scanned..............................: 171

- Extent Switches..............................: 170

- Avg. Pages per Extent........................: 7.9

- Scan Density [Best Count:Actual Count].......: 99.42% [170:171]

- Extent Scan Fragmentation ...................: 1.75%

- Avg. Bytes Free per Page.....................: 1.8

- Avg. Page Density (full).....................: 99.98%

DBCC SHOWCONTIG scanning 'tmpbit8' table...

Table: 'tmpbit8' (1301892005); index ID: 0, database ID: 6

TABLE level scan performed.

- Pages Scanned................................: 1359

- Extents Scanned..............................: 171

- Extent Switches..............................: 170

- Avg. Pages per Extent........................: 7.9

- Scan Density [Best Count:Actual Count].......: 99.42% [170:171]

- Extent Scan Fragmentation ...................: 0.58%

- Avg. Bytes Free per Page.....................: 1.8

- Avg. Page Density (full).....................: 99.98%

both tmp-tables uses 10,617 MB

If your table have 1-8 then the values are strored in 1 Byte. again 1-16 stored in 2 bytes...

ie,

CEILING(n number of BIT columns /8.0) Bytes will be occupied.

|||

Hi!

Yes, but how can db store null without using extra bit?

/Bjorn

|||It can't. In fact, the null bitmap is an entirely separate structure within the row. Inside Microsoft SQL Server 2005: The Storage Engine (Delaney) has a nice diagram on page 216, along with some explanations, but the main details are that the null bitmap lies after the fixed-length column data, but before the variable-length column data. Furthermore, in SQL Server 2005, a row always has a null bitmap, even if there are no nullable columns. The size of the bitmap is 1 byte per 8 columns, rounded up to the nearest byte.

No comments:

Post a Comment