Friday, March 23, 2012

Last() Function?

Greetings,
I'm having some trouble creating a report. In this report I'm supposed to get the last row for each group of customers, ordered by date. Let me explain a bit further using an example:
The table contains these fields, amongst others: CustomerID, Car, LastVisit.
So, for each customer there are many rows, each one with a different value in LastVisit. What I need to get (and before I started it looked quite easy) is a recordset with the last visit of each customer, given a specific car. For example, the last visit of each customer that has a Ford.
Now, I think there's a LAST() function in Access that actually gets the last row in a GROUP BY. That would be great, because the only thing I would have to do is group by CustomerID and get the said last row, but unfortunately it looks like SQL Server 2000 doesn't support that function.
The question is, how can I get that last row in each group? I've already tried using cursors in a stored procedure and temporary tables, but I'm afraid I need a less time-consuming solution as I'm dealing with more than 1.5 million rows.
If that's not possible, I guess I could "mark" the rows that I need and filter them later, but I'm not sure how to do that either...
If anyone around can give me some advice I would really appreciate it. I apologize if I was not very clear. I'll try to clarify it if you have doubts.
Thanks in advance,
Ricardo Lopez.

Two thoughts:

1. Try using MIN and MAX (whichever gives the correct value in this case); they work on non-numeric values as well.

2. Otherwise, use TOP with an ORDER BY (DESC or ASC as appropriate), write a series of SQL statements to get each group, and then combine them using the UNION ALL operator.

|||

This will get you the results. It uses a derived table to get the last visit date for each customer and then joins it back to the same table to get the car of that visit.

select v.CustomerID, v.Car
from Visits v
inner join (select CustomerID, max(LastVisit) as LastVisit
            from Visits
            group by CustomerID) vlast
    on vlast.CustomerID = v.CustomerID and vlast.LastVisit = v.LastVisit

I have assumed that LastVisit is a datetime column.
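If the result should be limited to a specific car (the Ford example from the question), one way to adapt it (a sketch, same assumed table and column names as above) is to filter the derived table as well as the outer query:

select v.CustomerID, v.Car
from Visits v
inner join (select CustomerID, max(LastVisit) as LastVisit
            from Visits
            where Car = 'Ford' -- only consider visits made with this car
            group by CustomerID) vlast
    on vlast.CustomerID = v.CustomerID and vlast.LastVisit = v.LastVisit
where v.Car = 'Ford'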

Wednesday, March 21, 2012

Last row in select question

DECLARE @LastID int
DECLARE @RowsToSelect int

SET ROWCOUNT @RowsToSelect
SELECT @LastID = [id] FROM MyTable ORDER BY [id]
SET ROWCOUNT 0

I think that the value of @LastID will be equal to [id] in the last row
of the select (i.e. the row with number @RowsToSelect). But I don't understand
why it's true. For example, @LastID does not necessarily equal the last row
value in the same select without 'ORDER BY [id]'.

Kurzman (max@virtuman.com) writes:
> DECLARE @LastID int
> DECLARE @RowsToSelect int
> SET ROWCOUNT @RowsToSelect
> SELECT @LastID = [id] FROM MyTable ORDER BY [id]
> SET ROWCOUNT 0
> I think that the value of @LastID will be equal to [id] in the last row
> of the select (i.e. the row with number @RowsToSelect). But I don't understand
> why it's true.

Books Online says:

If the SELECT statement returns more than one value, the variable is
assigned the last value returned.

So your observation is correct.

> For example, @LastID does not necessarily equal the last row value in the
> same select without 'ORDER BY [id]'.

How do you know that it is not the last value? Without an ORDER BY
clause, SQL Server is free to return the rows in any order, so any value
of [id] is correct. Recall that logically tables are unordered sets of
data, so there is no first or last value in a table.

You must always specify an ORDER BY clause to impose a certain order.
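As a minimal sketch of that rule, using the poster's MyTable: the variable keeps the value from the last row the ORDER BY produces, so flipping the sort direction flips the result:

DECLARE @LastID int
SELECT @LastID = [id] FROM MyTable ORDER BY [id] ASC  -- @LastID ends up as the highest [id]
SELECT @LastID = [id] FROM MyTable ORDER BY [id] DESC -- @LastID ends up as the lowest [id]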

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

last row in a table

How can we get the last row in a table

thanks

|||To go along the lines of true database design, you, as the DBA/programmer, must set up a means of determining which record is the "last" record. A database is not necessarily stored in the order you see it, so you must organize your records by either a date or an identity column to determine what the last insert was. The only time you can get the last record inserted is when you're actually doing the insert. Once that transaction is complete, without the proper fields to sort on, you're just out of luck.

|||Actually what I was trying to do is the following:
I was developing a web application on my PC. One of the tables of the database I was using had an ID field that was an autonumber, so I didn't need to insert anything into it; it generated the number by itself. I published the application on europe.webmatrixhosting.net, so I copied my tables to their database. When I copied the table, that field was no longer an autonumber, so when I tried to insert a row it gave me an error telling me that I can't insert null in that field.
So I thought I could just get the last row in that table, increment the id of that row, and make it the id of the new row.

Is there a better way of doing it??

Thanks for the help.

|||Alter the table to make that field an autonumber field again.

Now a big issue is how you "copied" the table over. I'd have generated a script to get the backbone of the tables with all the indexes and such, then executed that on the other server. THEN export the data over.

Hope this helps.

|||I'm sorry, but I'm not into SQL so much. How can I make it an autonumber with SQL?

|||


ALTER TABLE YourTable
ALTER COLUMN IdentityColumn int IDENTITY
|||If you do not supply a seed and an increment, 1 will be used for both, which is probably not what you want. Use 1 greater than the current highest ID in your table for the seed so that you do not create duplicate IDs:

ALTER TABLE YourTable
ALTER COLUMN IdentityColumn int IDENTITY(8765,1)

Terri
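One caveat on the two replies above: SQL Server will not attach the IDENTITY property to an existing column through ALTER COLUMN, so in practice the column has to be rebuilt. A sketch of one workaround, reusing the hypothetical names and seed from the reply (note that existing rows are renumbered in arbitrary order, so this only suits tables where the old values don't matter):

-- add a replacement identity column, then swap it into place
ALTER TABLE YourTable ADD NewId int IDENTITY(8765,1) NOT NULL
GO
ALTER TABLE YourTable DROP COLUMN IdentityColumn
GO
EXEC sp_rename 'YourTable.NewId', 'IdentityColumn', 'COLUMN'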

|||What the other guys suggested is your best bet, but in case that doesn't work for some reason, there's another way you can go here. You can create another table that contains only one column and one row. This row will store the last "autonumber" that you used. So what you can do when you're performing the insert is query this table, get the number, use that number to make your insert, and then call an update on the "autonumber" table to increment the value by 1 (or whatever increment you want to use).

Also, you need to make sure that you lock the "autonumber" table so that nobody else can change that value while you're working with it, which could otherwise cause integrity issues.
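A sketch of that counter-table approach, assuming a hypothetical table NextOrderId with a single int column LastValue:

BEGIN TRAN
DECLARE @NewId int
-- read and increment the counter in one atomic statement, so two
-- concurrent sessions can never grab the same value
UPDATE NextOrderId SET @NewId = LastValue = LastValue + 1
-- ... use @NewId as the id of the new row here ...
COMMIT TRAN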

However, this is simply an alternative, albeit a worse one than what the other two posters provided. If you can, use their suggestion.

|||Thanks guys, you really helped me a lot.

Last Row Got Stuck

Hello,

I have a small table to manage orders in my company. When a new order is entered, the script uses the last row of the table to find the last order, increments it, and creates the new order number. The problem is, a few days ago the last row got stuck. New insertions to the table all get the same order number and are placed above the last row. Does anybody have any idea what's going on?

hey
do you have multiple concurrent applications accessing the table?
any reason you are doing the incrementing 'manually' instead of using an auto-inc int?
des

|||Hello Des,

Below is the portion of the ASP code that assigns the order number. It is in the form "TPD-mmm-nnn" where mmm is the month (like 002 for Feb) and nnn is the order number (like 001, 002, etc). The if clause tries to reset the order number back to 001 when a new month starts.

...
set rs = Server.CreateObject("ADODB.recordset")
rs.CursorType = 3
selectstr="SELECT * FROM konuttekliflist"
rs.open selectstr, conn
rs.movelast
tekno=rs.fields("teklifno")
tekay=mid(tekno,5,3)
teklifindex=right("00" & right(tekno,4)+1,3)
monthnow=right("00" & month(now()),3)
if tekay=monthnow then
teklifno1="TPD-" & monthnow & "-" & teklifindex
else
teklifno1="TPD-" & monthnow & "-001"
end if
sql="INSERT INTO konuttekliflist ("
...

And yes, the application is used concurrently by 4 people.

Thanks for your interest and help.

|||So the ID is generated with the current date... are the four apps running on machines separate from the db server and from each other? Are all their dates/times in perfect sync? Otherwise this could cause problems.

If two orders come in at the same time, they could potentially conflict?

I see you select all rows into a recordset; this could cause locks that conflict with your other apps. Why not SELECT MAX of substring(mmm) + substring(nnn) into a local variable, instead of iterating through a recordset?
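A sketch of that MAX idea in T-SQL, using the table and "TPD-mmm-nnn" format from the post above (the variable names are made up):

DECLARE @LastNo varchar(11), @NextIndex varchar(3)
-- highest order number for the current month; the zero-padded nnn part sorts correctly as text
SELECT @LastNo = MAX(teklifno)
FROM konuttekliflist
WHERE SUBSTRING(teklifno, 5, 3) = RIGHT('00' + CONVERT(varchar(2), MONTH(GETDATE())), 3)

SET @NextIndex = CASE WHEN @LastNo IS NULL THEN '001'
                      ELSE RIGHT('00' + CONVERT(varchar(3), CONVERT(int, RIGHT(@LastNo, 3)) + 1), 3)
                 END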
How does it cross over the year?

how about generating the ID in SQL with a computed column, using its own date and increment? Then at least you don't have four different things competing for the next value, i.e. you generate IDs centrally.

des

Last row ?

Hello Team

How can I get a value from a column in the last row?

for ex: select orderno from orders .......... but the last row ?

Thanks a lot

Greetings,

SQL Server has no concept of last row, or first row or middle row for that matter.

To determine the last row you need to know by what criteria. If your Order Number column is sequential or your table contains a datetime column then you can get a close approximation by using TOP and ORDER BY.

SELECT TOP 1 OrderNo FROM Orders ORDER BY OrderDate DESC

If there is no column in the table from which you can determine the correct ordering then you're stuck.

--

Regards,
Neale NOON

|||

There is one more workaround available to get the last record.

If your table has 3 columns:

declare @col1 as varchar(100);
declare @col2 as varchar(100);
declare @col3 as varchar(100);

Select @Col1=Col1,
@Col2=Col2,
@Col3=Col3
From
Tablename;

Select @Col1 as Col1, @Col2 as Col2, @Col3 as Col3;

So here you do not need any ORDER BY clause. It always gets the last row.

|||If you are really just looking for the last row, reverse your sort order and use SELECT TOP 1 ... That will get you the first row in the reversed order, i.e. the last row.

last row

How do I return the last row in a SQL table? I know it must be easy; that's why I can't figure it out.

SELECT TOP 1 * FROM [mytable] ORDER BY [mycolumn] DESC

last row

Hi!

Another problem:

How can I fetch only the last row in my table?

Thanks,
Darko"Darko Jovisic" <djovisic@.fesb.hr> wrote in message
news:c92khn$spa$1@.bagan.srce.hr...
> Hi!
> Another problem:
> How can I fetch only the last row in my table?
> Thanks,
> Darko

How do you define 'last'? Rows in a table are not in any order, so you have
to use something in the data which means 'last' to you - perhaps a
CreatedDate column, or an OrderNumber, or something else:

select top 1 *
from dbo.MyTable
order by CreatedDate desc

select top 1 *
from dbo.MyTable
order by OrderNumber desc

Simon

|||I understand now how to solve the problem!
Thank you very much!

"Simon Hayes" <sql@.hayes.ch> wrote in message
news:40b4d99d_3@.news.bluewin.ch...

> How do you define 'last'? Rows in a table are not in any order, so you
have
> to use something in the data which means 'last' to you - perhaps a
> CreatedDate column, or an OrderNumber, or something else:
> select top 1 *
> from dbo.MyTable
> order by CreatedDate desc
> select top 1 *
> from dbo.MyTable
> order by OrderNumber desc
> Simon

|||>> How can I fetch only the last row in my table? <<

Let's get back to the basics of an RDBMS; there is no sequential
access or ordering in an RDBMS, so "first", "next" and "last" are
totally meaningless.

Last report item or RowNumber for details grouping

I have a report with details grouping on a table. What I need to do is put a row number only on the parent row and skip the child rows. When I use RowNumber("GroupName") it of course gives me the current RowNumber. Is there a way to count only parents?

This post may help.

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=903428&SiteID=1

cheers,

Andrew

|||

Hello,

Try putting this in the same row as your parent group row:

=RunningValue(Fields!GroupFieldName.Value, CountDistinct, nothing)

Jarret

Last record in 10,000,000 rows

Hey gang, I got a table that has about 10,000,000 rows and I need to test the last 10 rows. Isn't the statement: Select bottom 10 from table?

Rows in a table don't have any order. You can order the result set any way you want. Take the TOP 10 with the order "upside down" and you are in business!

-PatP

|||SELECT TOP 1 * from test order by id desc
OR
select * from TEST a
where 1>(select count(*) from TEST where ID>a.id)
-SS

|||Do you have an ADD_ROW_TS or IDENTITY column?

|||Gentlemen, I don't believe I've been clear on what I'm attempting to do. I have a table with 10,000,000 records (which happen to reflect 6 months of historical data) that I need to confirm is there. I can do a 'select top 10 from table' to get the first 10 records of this new table. I need to see the last 10 to confirm the month, date, time, etc... What I've come up with in the SQL analyzer is
select * from dbo.INCOMING_TEMP
where RECORD_NUMBER > 10453700
according to my records the total number of rows is 10453747

It's running but taking all day to get there.

|||I'm assuming RECORD_NUMBER is defined like

RECORD_NUMBER int IDENTITY(1,1) NOT NULL

Yes?

SELECT TOP 10 * FROM yourTable99 ORDER BY RECORD_NUMBER DESC

Will get you the official last 10 records...

If that's truly the case...

No?

|||Thanks, that works fine. I haven't been myself since I got the damn cast on my foot, so bear with me. Now I get to compare random samples of the data before blowing the original table away.

Monday, March 19, 2012

Last Inserted row

I am trying to access the row which was just inserted into a table. Will the following statements be equivalent to each other?

STATEMENT 1:
SELECT JobNumber
from tblCustServiceHistoryHdr
where ServiceType='On-site' AND ServiceStatus = 'NEW' AND DateModified =(SELECT MAX(DateModified) from tblCustServiceHistoryHdr) /* I thought by using DateModified I would get the last modified row */

STATEMENT 2:
SELECT JobNumber
from inserted
where ServiceType='On-site' AND ServiceStatus = 'NEW' /* I don't need the DateModified criteria here since I am extracting from the INSERTED table */

Would you like to know the last row inserted by your session, or by all sessions?

About statement 2, remember that the pseudo-table "inserted" can contain many rows. In that case, select jobnumber from inserted is still not an exact answer to your question.

Maybe if you give us a little more info about your objective, forum members might have some ideas to help you out.

|||Basically there will be many people inserting into this table and we need to extract the specific rows that they have inserted and then update them to another table. This will be happening later on.

But for now I will only need to know the inserted row for my session.

|||How about introducing a new column to your table which will hold the time stamp for new rows inserted?

|||I thought the DateModified column acted like a timestamp... you know, every time there are updates to that particular row the DateModified column is altered using getdate().

|||Why not put an Identity Seed field on your table?

Then do the insert through a stored procedure and pass back the identity field value.

By the way, @@IDENTITY returns the last generated value for the newly inserted row.

|||What do you want to know the inserted row for?
Might help us with a solution.

|||Originally posted by gayamantra
Basically there will be many people inserting into this table and we need to extract the specific rows that they have inserted and then update them to another table. This will be happening later on.

But for now I will only need to know the inserted row for my session.
Still confused: your objective is to trigger some action for all the rows that were inserted in your table. From my point of view, your STATEMENT 2 is exactly what you need. I'm still confused by your idea of getting the LAST row inserted, as you apparently want to do something for ALL rows inserted.
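For the "insert through a stored procedure and pass back the identity" suggestion above, a minimal sketch could look like this, assuming JobNumber is the identity column of tblCustServiceHistoryHdr (the procedure name and parameter are hypothetical):

CREATE PROCEDURE InsertServiceHeader
    @ServiceType varchar(20),
    @NewJobNumber int OUTPUT
AS
BEGIN
    INSERT INTO tblCustServiceHistoryHdr (ServiceType, ServiceStatus, DateModified)
    VALUES (@ServiceType, 'NEW', GETDATE())

    -- SCOPE_IDENTITY() returns the identity generated in this scope,
    -- unlike @@IDENTITY, which can pick up values generated by triggers
    SET @NewJobNumber = SCOPE_IDENTITY()
END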

Last GASP on "Insert row in table with Identity field, and get new Identity back"?

While I have learned a lot from this thread I am still basically confused about the issues involved.

I wanted to INSERT a record in a parent table, get the Identity back and use it in a child table. Seems simple.

To my knowledge, mine would be the only process running that would update these tables. I was told that there is no guarantee, because the OLEDB provider could write the second destination row before the first, that the proper parent-child relationship would be generated as expected. It was recommended that I create my own variable in memory to hold the Identity value and use that in my SSIS package.

1. A simple SSIS .dts example illustrating the approach of using a variable for identity would be helpful.

2. Suppose I actually had two processes updating these tables, running at the same time. Then it seems the "variable" method will also have its problems. Is there a final solution other than locking the tables involved prior to updating them or doing something crazy like using a GUID for the primary key!

3. We have done the type of parent-child inserts I originally described from t-sql for years without any apparent problems. (Maybe we were just lucky.) Is the entire issue simply a t-sql one or does SSIS add a layer of complexity beyond t-sql that needs to be addressed?

TIA,

Barkingdog

SSIS processes data in a different way than T-SQL. The "pipeline" processes differently than batch-oriented SQL and differently than cursor-based SQL. This is by design, of course, and has certain benefits and trade-offs, and requires a different way of implementing a solution.

Whether you want to characterize that as an additional "layer of complexity" or not is up to you. I guess if you now have differing options for processing data, that have different advantages and disadvantages, with different costs and benefits, where previously you had only one option, you might consider that more complexity.

I consider it more options - and glad to have 'em...

|||

This post includes examples of how to create surrogate keys using SSIS, effectively the same problem as you are trying to solve: http://sqljunkies.com/WebLog/sqlbi/archive/2005/05/30/15684.aspx

We do not have a packaged solution for highly parallel loads in this version, but we will be releasing a paper on this subject in coming months.

Donald

|||

Thanks,

I will read through the paper.

I am starting to think that, because of parallelism in SSIS, the SQL uniqueidentifier may be a more suitable "primary key" for a table than the traditional Identity value (integer) used in the past.

But I do believe the problem relates to parallelism in SSIS. I don't think this problem would likely occur from a stored proc call that used the "insert a new record, get the identity value, populate the child" approach. (Though I guess it could happen if one had a server farm..... Why is everything so complicated?!)

TIA,

Barkingdog

|||

It's not because of parallelism in SSIS. We have seen this same problem with inserting and immediately retrieving an identity over many years. Unfortunately, everyone thinks they can guarantee the result, and sooner or later they trip up because of some glitch. Unfortunately, because these values are your unique identifiers, such a trip up can be quite serious.

Read:

http://www.aspfaq.com/show.asp?id=2499

http://www.databasejournal.com/features/mssql/article.php/3307541

Donald

|||

>> We have seen this same problem with inserting and immediately retrieving an identity over many years <<

The problem I face is that we will be running a single program or SSIS package that is the only program which updates these tables. Nothing else will update them. There are no multiple users, no server farm. I can hear the developers saying that the "problem" mentioned cannot happen in that kind of environment. Are they correct, or can someone come up with a simple t-sql script that forces the problem to happen in any SQL environment?

TIA,

Barkingdog

|||

>The problem I face is that we will be running a single program or SSIS package that is the only program which updates these tables. Nothing else will update them. There are no multiple users, no server farm.

The problem I face is that I have heard this many times from many users - and yet somewhere, somehow, somewhen, the world turns out to be not quite this tidy.

If you think this will work for you, I guess I can only say good luck. But I have an uneasy feeling that in a few months time we'll be reading the "how do I rekey all my child records?" post.

Donald

|||

Thanks Donald. I will follow your sage advice!

I'm just preparing for the bruises/abuse I'm going to get from developers who will think I'm wrong and crazy on this point. I'm sure the URLs you listed will dull their bite.

Barkingdog

|||

Here's what I have learned about this issue so far:

1. This issue has always existed. It is now aggravated by multi-processor systems, server farms, and the parallelism introduced in SSIS.

2. The best thought seems to be to generate one's own counter and use it instead of relying upon an Identity field.

3. While it is better to use SCOPE_IDENTITY than @@IDENTITY or IDENT_CURRENT, even that will not guarantee a correct parent-child linkage. (The problem is that if one t-sql operation starts before another, there is no guarantee it will complete and commit before the other. Forget sequential processing.)

4. I think the only available way to get the proper linkage is to use GUID (from NEWID()) to generate the "identity" key for tables. (But that approach won't be quick for use in JOINS unless I index the GUIDs!)

In summary, I don't think there is a GOOD solution to this fundamental issue yet. It was mentioned that Microsoft is working on a best-practices document regarding surrogate key generation. I can hardly wait for it to be released!

Barkingdog

|||

Actually, I can tell you the technique I have used, and I don't recollect ever having a problem with it:

1: ALL rtables must have a _natural_ key defined, and enforced as a unique index.

2: Now, after inserting a row into an rtable that uses a system-generated surrogate key, you can use the _natural key_ to find the row; it is therefore guaranteed to be the correct row, and you can retrieve whatever surrogate system-generated key was inserted in that row. Obviously there is a performance price to pay for getting the right answer, but typically that's better than a fast wrong answer...

3: Now you can use that _guaranteed correct_ surrogate key in further processing as needed.

It has been many years since I have built an rtable that did not enforce uniqueness based on a _natural_ key. It is the only way to go...
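A sketch of that pattern in T-SQL, with hypothetical tables: Customers has an identity surrogate key plus a unique natural key (TaxNumber), and Orders is the child table:

-- step 2: insert, then find the row again by its natural key
INSERT INTO Customers (TaxNumber, CustomerName)
VALUES ('123-45-6789', 'Acme Ltd')

DECLARE @CustomerID int
SELECT @CustomerID = CustomerID
FROM Customers
WHERE TaxNumber = '123-45-6789' -- the unique index guarantees exactly one row

-- step 3: use the guaranteed-correct surrogate key downstream
INSERT INTO Orders (CustomerID, OrderDate)
VALUES (@CustomerID, GETDATE())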


Monday, March 12, 2012

Last edit time?

How can I mark a row with its last edit time?
With Access it was obvious; how about in SQL Server?

Hi

You can add a datetime column that you update using the getdate() function.
If you only want to version-stamp the row then you can use a rowversion
column.

John

<pinkoPallino@pinkopallonia.it> wrote in message
news:426b078b.748264697@.news.tin.it...
> How can I mark a row with its last edit time?
> With Access it was obvious; how about in SQL Server?
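A minimal sketch of John's datetime suggestion, assuming a table MyTable with primary key MyTableID (both names hypothetical):

ALTER TABLE MyTable ADD LastEdited datetime NOT NULL DEFAULT GETDATE()
GO
-- refresh the column automatically whenever a row is updated
CREATE TRIGGER trg_MyTable_LastEdited ON MyTable
AFTER UPDATE
AS
    UPDATE t
    SET LastEdited = GETDATE()
    FROM MyTable t
    INNER JOIN inserted i ON i.MyTableID = t.MyTableID
GO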

Friday, March 9, 2012

Large(ish) update

I'm updating every row in a 200mb table. I'm trying to work out why it is
running so slowly and have found that it increases the size of the log file
by 625mb and then empties it, I believe, but leaves it at the same size. I've
got the recovery mode set to simple, which I thought was not meant to use the
log file. Can someone tell me what's going on and if there's some method of
speeding this up?
Thanks
Michael

Hi Michael,
Simple recovery mode still uses the log file during the transaction, but
then truncates it after the transaction. The log file is used during the
transaction such that the transaction can be rolled back, if necessary (e.g.
if it hits an error or if you cancel the update halfway through). To battle
transaction log growth during large transactions, many people split the
transactions into batches.
Please refer to the following search for many threads related to this topic:
http://groups.google.com/groups?as_...=2005&safe=off
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
--
"Michael C" <mculley@.NOSPAMoptushome.com.au> wrote in message
news:evIv9OVzFHA.3152@.TK2MSFTNGP10.phx.gbl...
> I'm updating every row in a 200mb table. I'm trying to work out why it
> running so slowly and have found that it increases the size of the log
> file by 625mb an then empties it, I believe, but leaves it at the same
> size. I've got the recovery mode set to simple which I thought was not
> meant to use the log file. Can someone tell me what's going on and if
> there's some method of speeding this up.
> Thanks
> Michael
>

|||Every transaction is logged--even if you use the simple recovery mode. That
way the transaction can be rolled back.
To speed up the update you need to split it into smaller chunks. If the
individual updates are small enough, they will fit into the transaction log
without making it grow. I'm assuming that the growth is probably why the
update is running so slow.
How do you split it up? It depends. If you are setting a column to a
specific value, then you can use something like:
SET ROWCOUNT 1000
AGAIN:
UPDATE tableName SET columnName = <newColumnValue> WHERE columnName !=
<newColumnValue>
IF @@ROWCOUNT > 0 GOTO AGAIN
SET ROWCOUNT 0
(Sometimes it's better to use > or < instead of != to speed things up even
more.)
There are many ways to do this, but I'd need a better description of what
you're trying to do and DDL for the table.
"Michael C" <mculley@.NOSPAMoptushome.com.au> wrote in message
news:evIv9OVzFHA.3152@.TK2MSFTNGP10.phx.gbl...
> I'm updating every row in a 200mb table. I'm trying to work out why it
> running so slowly and have found that it increases the size of the log
> file by 625mb an then empties it, I believe, but leaves it at the same
> size. I've got the recovery mode set to simple which I thought was not
> meant to use the log file. Can someone tell me what's going on and if
> there's some method of speeding this up.
> Thanks
> Michael
>

|||Hi,
it's better to update records in smaller chunks than as a whole; limit the row count per batch to 1000 or as you like, so that you can control the T-LOG file size.
Refer to the following thread:
http://www.sql-server-performance.c...?TOPIC_ID=10706
;-)
Regards

Large value types out of row - objectproperties

Hi,
I know that I can set the "large value types out of row" table option using
"sp_tableoption", but I could not find a way to know if it is enabled or
disabled.
sp_tableoption returns nothing, so I thought I could use
OBJECTPROPERTY, but there doesn't seem to be any property name
documented for "large value types out of row" in MSDN.
So, how can I check if it is enabled or not?
Thanks.

What version of SQL Server? OBJECTPROPERTY in 2000 has the TableTextInRowLimit attribute you can ask for.
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
<antimon@gmail.com> wrote in message news:1141611920.307183.105940@v46g2000cwv.googlegroups.com...
> Hi,
> I know that i can set "large value types out of row" table option using
> "sp_tableoption" but i could not find a way to know if it is enabled or
> disabled.
> sp_tableoption returns nothing so i thought i could use
> objectproperties but there doesn't seem to be any property name
> documented for "large value types out of row" in msdn.
> So, how can i check if it is enabled or not?
> Thanks.
>

|||It is SQL Server 2005. TableTextInRowLimit works fine, but I want to
know whether the option that forces large value types (varchar(MAX),
nvarchar(MAX), ...) to be stored in separate pages, even for a small
piece of data, is turned on or not.

|||> It is sql server 2005.
Please include version number upfront, makes the whole process quicker... :-)
Seems like 2005 doesn't expose text in row for the new datatypes. The sys.tables catalog view does, however...
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
<antimon@gmail.com> wrote in message news:1141656316.186672.224600@j52g2000cwj.googlegroups.com...
> It is SQL Server 2005. TableTextInRowLimit works fine, but I want to
> know whether the option that forces large value types (varchar(MAX),
> nvarchar(MAX), ...) to be stored in separate pages, even for a small
> piece of data, is turned on or not.
>

|||I thought I had it in my first post :) Sorry about that.
And thanks for the solution, sys.tables works great.
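For reference, the sys.tables check reads something like this (table name hypothetical):

SELECT name, large_value_types_out_of_row
FROM sys.tables
WHERE name = 'YourTable'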

Wednesday, March 7, 2012

large table update

hello, I have the following problem: I have got a 130 million row TABLEA
and I need to run this update:
UPDATE TABLEA
SET COLUMNA=(SELECT COLUMNB FROM TABLEB WHERE condition)

I suppose that for every row from TABLEA (80 million times) the select statement must be executed, which is terribly slow. Isn't there a better way to write this update? Thank you

Could you provide the WHERE clause in detail, pls?

|||Originally posted by Coolberg
Could you provide the WHERE clause in detail, pls?

WHERE column from TABLEA=column FROM TABLEB
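Given that WHERE clause, one common rewrite is the UPDATE ... FROM join, which lets the optimizer handle the whole thing as a single join instead of re-running the subquery per row; this is only a sketch, with a hypothetical JoinCol standing in for the actual columns:

UPDATE a
SET COLUMNA = b.COLUMNB
FROM TABLEA a
INNER JOIN TABLEB b
    ON b.JoinCol = a.JoinCol -- the "column from TABLEA = column from TABLEB" condition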

Large table -- split into smaller ones?

Hi All,
I have to import about 26 million rows, each row about 1Kb in size into
our sql server. This data will be READONLY and might be updated every
quarter.
I am yet to import it and do some performance testing.
what should be the best approach to handling this much data...
should I
* have one large table?
OR
*Horizontally partition data into multiple tables?
thanks for your time
GK

It's generally best to implement a single table unless you have a
compelling reason to do otherwise. A single large table with
appropriate indexes often performs quite well without the additional
administrative complexity of partitioning.
The main advantages of horizontal partitioning are related to admin
tasks like data loads, bulk deletes and index creates. For example,
you'll need about 30GB of free space to build a clustered index on your
table but only a fraction of that amount if partitioned. On the other
hand, if you can load data sorted in clustered index sequence, you can
load with the index in place and forego the index create entirely.
--
Hope this helps.
Dan Guzman
SQL Server MVP
--
SQL FAQ links (courtesy Neil Pike):
http://www.ntfaq.com/Articles/Index.cfm?DepartmentID=800
http://www.sqlserverfaq.com
http://www.mssqlserver.com/faq
--
"GK" <ksgkrishremovethis@.yahoo.co.uk> wrote in message
news:edyKf7gqDHA.2496@.TK2MSFTNGP09.phx.gbl...
> Hi All,
> I have to import about 26 million rows, each row about 1Kb in size
into
> our sql server. This data will be READONLY and might be updated every
> quarter.
> I am yet to import it and do some performance testing.
> what should be the best approach to handling this much data...
> should I
> * have one large table?
> OR
> *Horizontally partition data into multiple tables?
> thanks for your time
> GK
>
>

|||Thanks Dan,
I understand your point about administrative overheads, but I guess
they are much reduced if I use Partitioned views in SQL server 2000. Would
you advise using that?
As far as loading the data in clustered index sequence is concerned, are you
advising me to, say, load data sorted on the clustered index column(s)?
Excuse me if this question is stupid but how do I load with index in place?
Also I didn't understand why it takes so much space to create an index for a
large table and less space if the table were to be split up. Does index
creation need that much space just while building the index, and does the finally
created index take less space?
Thank you for your efforts.
GK
"Dan Guzman" <danguzman@.nospam-earthlink.net> wrote in message
news:OmoikLmqDHA.2216@.TK2MSFTNGP12.phx.gbl...
> It's generally best to implement a single table unless you have a
> compelling reason to do otherwise. A single large table with
> appropriate indexes often performs quite well without the additional
> administrative complexity of partitioning.
> The main advantages of horizontal partitioning are related to admin
> tasks like data loads, bulk deletes and index creates. For example,
> you'll need about 30GB of free space to build a clustered index on your
> table but only a fraction of that amount if partitioned. On the other
> hand, if you can load data sorted in clustered index sequence, you can
> load with the index in place and forego the index create entirely.
> --
> Hope this helps.
> Dan Guzman
> SQL Server MVP
> --
> SQL FAQ links (courtesy Neil Pike):
> http://www.ntfaq.com/Articles/Index.cfm?DepartmentID=800
> http://www.sqlserverfaq.com
> http://www.mssqlserver.com/faq
> --
> "GK" <ksgkrishremovethis@.yahoo.co.uk> wrote in message
> news:edyKf7gqDHA.2496@.TK2MSFTNGP09.phx.gbl...
> > Hi All,
> > I have to import about 26 million rows, each row about 1Kb in size
> into
> > our sql server. This data will be READONLY and might be updated every
> > quarter.
> > I am yet to import it and do some performance testing.
> > what should be the best approach to handling this much data...
> > should I
> > * have one large table?
> > OR
> > *Horizontally partition data into multiple tables?
> >
> > thanks for your time
> > GK
> >
> >
> >
Just found this good msdn article on partitioning data.
http://msdn.microsoft.com/library/default.asp?URL=/library/techart/PartitionsInDW.htm
and it says... (although they are talking about a data warehouse; mine will
not be a data warehouse exactly, but it won't be an OLTP either. It'll be a
readonly table with quarterly updates)
<quote>
Query Speed
Query speed should not be considered a reason to partition the data
warehouse relational database. Query performance is similar for partitioned
and non-partitioned fact tables. When the partitioned database is properly
designed, the relational engine will include in a query plan only the
partition(s) necessary to resolve that query. For example, if the database
is partitioned by month and a query is conditioned on January 2000, the
query plan will include only the partition for January 2000. The resulting
query will perform well against the partitioned table, about the same as
against a properly indexed combined table with a clustered index on the
partitioning key.
</quote>
"GK" <ksgkrishremovethis@.yahoo.co.uk> wrote in message
news:%23%23bRH5oqDHA.2772@.TK2MSFTNGP10.phx.gbl...
> Thanks Dan,
> I understand your point about administrative overheads, but I guess
> they are much reduced if I use Partitioned views in SQL server 2000. Would
> you advise using that?
> As far as loading the data in clustered index sequene is concerned, are
you
> advising me to, say load data sorted on the clustered index column(s)?
> Excuse me if this question is stupid but how do I load with index in
place?
> Also I didn't understand why it takes so much space to create index for a
> large table and less space if the table were to be split up. Does index
> creation need thatmuch space just while building the index and the finally
> created index take lesser space?
> Thank you for your efforts.
> GK
> "Dan Guzman" <danguzman@.nospam-earthlink.net> wrote in message
> news:OmoikLmqDHA.2216@.TK2MSFTNGP12.phx.gbl...
> > It's generally best to implement a single table unless you have a
> > compelling reason to do otherwise. A single large table with
> > appropriate indexes often performs quite well without the additional
> > administrative complexity of partitioning.
> >
> > The main advantages of horizontal partitioning are related to admin
> > tasks like data loads, bulk deletes and index creates. For example,
> > you'll need about 30GB of free space to build a clustered index on your
> > table but only a fraction of that amount if partitioned. On the other
> > hand, if you can load data sorted in clustered index sequence, you can
> > load with the index in place and forego the index create entirely.
> >
> > --
> > Hope this helps.
> >
> > Dan Guzman
> > SQL Server MVP
> >
> > --
> > SQL FAQ links (courtesy Neil Pike):
> >
> > http://www.ntfaq.com/Articles/Index.cfm?DepartmentID=800
> > http://www.sqlserverfaq.com
> > http://www.mssqlserver.com/faq
> > --
> >
> > "GK" <ksgkrishremovethis@.yahoo.co.uk> wrote in message
> > news:edyKf7gqDHA.2496@.TK2MSFTNGP09.phx.gbl...
> > > Hi All,
> > > I have to import about 26 million rows, each row about 1Kb in size
> > into
> > > our sql server. This data will be READONLY and might be updated every
> > > quarter.
> > > I am yet to import it and do some performance testing.
> > > what should be the best approach to handling this much data...
> > > should I
> > > * have one large table?
> > > OR
> > > *Horizontally partition data into multiple tables?
> > >
> > > thanks for your time
> > > GK
> > >
> > >
> > >
> >
> >
>|||> I understand your point about administrative overheads, but I guess
> they are much reduced if I use Partitioned views in SQL server 2000.
Would
> you advise using that?
It depends on the problem you are trying to solve. The article you
referenced in your other post does a pretty good job of pointing out the
pros and cons. If your primary concern is query performance, then PVs
aren't the answer. Personally, I wouldn't change my design to accommodate
partitioning unless the additional complexity is more than offset by the
benefits of load speed and maintenance. IMHO, your quarterly data load
probably doesn't justify the use of PVs.
> As far as loading the data in clustered index sequence is concerned, are you
> advising me to, say, load data sorted on the clustered index column(s)?
> Excuse me if this question is stupid but how do I load with index in place?
Just create the table and its clustered index. If you then load (bulk
insert) data in sequence by the clustered index, load performance will
be good and you won't need to create the clustered index afterward.
Non-clustered indexes can be created after the load.
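A sketch of that load-in-sequence approach, with hypothetical table, column, and file names:

CREATE TABLE BigTable (
    RecordID int NOT NULL,
    Payload char(900) NOT NULL
)
CREATE UNIQUE CLUSTERED INDEX IX_BigTable ON BigTable (RecordID)

-- the data file is pre-sorted by RecordID; the ORDER hint tells the engine
-- the input already matches the clustered index, so it can skip the sort
BULK INSERT BigTable
FROM 'C:\data\bigtable.dat'
WITH (ORDER (RecordID ASC), TABLOCK)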
> Also I didn't understand why it takes so much space to create an index for a
> large table and less space if the table were to be split up. Does index
> creation need that much space just while building the index, and does the finally
> created index take less space?
The space requirement to build the clustered index is a consideration
with large tables. You need free space of about 120% of the original
table size. This is because the entire table is rebuilt during the
create so space for old and new data is needed plus some sort work
space. After the create, space for the old data is released and the end
result is that slightly more space than the original heap is allocated
to accommodate the non-leaf nodes of the clustered index.
With multiple smaller tables, you can build each clustered index
separately. Consequently, you'll only need free space to accommodate
120% of the largest table in the lot.
--
Hope this helps.
Dan Guzman
SQL Server MVP
"GK" <ksgkrishremovethis@.yahoo.co.uk> wrote in message
news:%23%23bRH5oqDHA.2772@.TK2MSFTNGP10.phx.gbl...
> Thanks Dan,
> I understand your point about administrative overheads, but I
guess
> they are much reduced if I use Partitioned views in SQL server 2000.
Would
> you advise using that?
> As far as loading the data in clustered index sequene is concerned,
are you
> advising me to, say load data sorted on the clustered index column(s)?
> Excuse me if this question is stupid but how do I load with index in
place?
> Also I didn't understand why it takes so much space to create index
for a
> large table and less space if the table were to be split up. Does
index
> creation need thatmuch space just while building the index and the
finally
> created index take lesser space?
> Thank you for your efforts.
> GK
> "Dan Guzman" <danguzman@.nospam-earthlink.net> wrote in message
> news:OmoikLmqDHA.2216@.TK2MSFTNGP12.phx.gbl...
> > It's generally best to implement a single table unless you have a
> > compelling reason to do otherwise. A single large table with
> > appropriate indexes often performs quite well without the additional
> > administrative complexity of partitioning.
> >
> > The main advantages of horizontal partitioning are related to admin
> > tasks like data loads, bulk deletes and index creates. For example,
> > you'll need about 30GB of free space to build a clustered index on
your
> > table but only a fraction of that amount if partitioned. On the
other
> > hand, if you can load data sorted in clustered index sequence, you
can
> > load with the index in place and forego the index create entirely.
> >
> > --
> > Hope this helps.
> >
> > Dan Guzman
> > SQL Server MVP
> >
> > --
> > SQL FAQ links (courtesy Neil Pike):
> >
> > http://www.ntfaq.com/Articles/Index.cfm?DepartmentID=800
> > http://www.sqlserverfaq.com
> > http://www.mssqlserver.com/faq
> > --
> >
> > "GK" <ksgkrishremovethis@.yahoo.co.uk> wrote in message
> > news:edyKf7gqDHA.2496@.TK2MSFTNGP09.phx.gbl...
> > > Hi All,
> > > I have to import about 26 million rows, each row about 1Kb in
size
> > into
> > > our sql server. This data will be READONLY and might be updated
every
> > > quarter.
> > > I am yet to import it and do some performance testing.
> > > what should be the best approach to handling this much data...
> > > should I
> > > * have one large table?
> > > OR
> > > *Horizontally partition data into multiple tables?
> > >
> > > thanks for your time
> > > GK
> > >
> > >
> > >
> >
> >
>|||That was very helpful.
Thanks Dan.
"Dan Guzman" <danguzman@.nospam-earthlink.net> wrote in message
news:OdauywrqDHA.2568@.TK2MSFTNGP09.phx.gbl...
> > I understand your point about administrative overheads, but I guess
> > they are much reduced if I use Partitioned views in SQL server 2000.
> Would
> > you advise using that?
> It depends on the problem you are trying to solve. The article you
> referenced in your other post does a pretty good job of pointing out the
> pros and cons. If your primary concern is query performance, then PVs
> aren't the answer. Personally, wouldn't change my design to accommodate
> partitioning unless the additional complexity is more than offset by the
> benefits of load speed and maintenance. IMHO, your quarterly data load
> probably doesn't justify the use of PVs.
> > As far as loading the data in clustered index sequene is concerned,
> are you
> > advising me to, say load data sorted on the clustered index column(s)?
> > Excuse me if this question is stupid but how do I load with index in
> place?
> Just create the table and its clustered index. If you then load (bulk
> insert) data in sequence by the clustered index, load performance will
> be good and you won't need to create the clustered index afterward.
> Non-clustered indexes can be created after the load.
> > Also I didn't understand why it takes so much space to create index
> for a
> > large table and less space if the table were to be split up. Does
> index
> > creation need thatmuch space just while building the index and the
> finally
> > created index take lesser space?
> The space requirement to build the clustered index is a consideration
> with large tables. You need free space of about 120% of the original
> table size. This is because the entire table is rebuilt during the
> create so space for old and new data is needed plus some sort work
> space. After the create, space for the old data is released and the end
> result is that slightly more space than the original heap is allocated
> to accommodate the non-leaf nodes of the clustered index.
> With multiple smaller tables, you can build each clustered index
> separately. Consequently, you'll only need free space to accommodate
> 120% of the largest table in the lot.
> --
> Hope this helps.
> Dan Guzman
> SQL Server MVP
>
> "GK" <ksgkrishremovethis@.yahoo.co.uk> wrote in message
> news:%23%23bRH5oqDHA.2772@.TK2MSFTNGP10.phx.gbl...
> > Thanks Dan,
> > I understand your point about administrative overheads, but I
> guess
> > they are much reduced if I use Partitioned views in SQL server 2000.
> Would
> > you advise using that?
> >
> > As far as loading the data in clustered index sequene is concerned,
> are you
> > advising me to, say load data sorted on the clustered index column(s)?
> > Excuse me if this question is stupid but how do I load with index in
> place?
> >
> > Also I didn't understand why it takes so much space to create index
> for a
> > large table and less space if the table were to be split up. Does
> index
> > creation need thatmuch space just while building the index and the
> finally
> > created index take lesser space?
> >
> > Thank you for your efforts.
> > GK
> > "Dan Guzman" <danguzman@.nospam-earthlink.net> wrote in message
> > news:OmoikLmqDHA.2216@.TK2MSFTNGP12.phx.gbl...
> > > It's generally best to implement a single table unless you have a
> > > compelling reason to do otherwise. A single large table with
> > > appropriate indexes often performs quite well without the additional
> > > administrative complexity of partitioning.
> > >
> > > The main advantages of horizontal partitioning are related to admin
> > > tasks like data loads, bulk deletes and index creates. For example,
> > > you'll need about 30GB of free space to build a clustered index on
> your
> > > table but only a fraction of that amount if partitioned. On the
> other
> > > hand, if you can load data sorted in clustered index sequence, you
> can
> > > load with the index in place and forego the index create entirely.
> > >
> > > --
> > > Hope this helps.
> > >
> > > Dan Guzman
> > > SQL Server MVP
> > >
> > > --
> > > SQL FAQ links (courtesy Neil Pike):
> > >
> > > http://www.ntfaq.com/Articles/Index.cfm?DepartmentID=800
> > > http://www.sqlserverfaq.com
> > > http://www.mssqlserver.com/faq
> > > --
> > >
> > > "GK" <ksgkrishremovethis@.yahoo.co.uk> wrote in message
> > > news:edyKf7gqDHA.2496@.TK2MSFTNGP09.phx.gbl...
> > > > Hi All,
> > > > I have to import about 26 million rows, each row about 1Kb in
> size
> > > into
> > > > our sql server. This data will be READONLY and might be updated
> every
> > > > quarter.
> > > > I am yet to import it and do some performance testing.
> > > > what should be the best approach to handling this much data...
> > > > should I
> > > > * have one large table?
> > > > OR
> > > > *Horizontally partition data into multiple tables?
> > > >
> > > > thanks for your time
> > > > GK
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>

Monday, February 20, 2012

Large files(Excel) coming out of Reporting Services how can I compress them for email

I have a 29000 row spreadsheet coming out of Reporting Services that is producing a large file that we email. How can I create the most compact spreadsheet/file for scheduled daily emailing out of Reporting Services?

I'm pretty sure there isn't any built-in way to compress the report attached to an email in Reporting Services. There aren't really any tricks that you can use to reduce the size of the report generated, other than making sure that you use numbers instead of strings whenever possible -- numbers will generally take less space in the Excel format than strings. Depending on how you are sending the emails, you might be able to use some third party code or write your own code to compress the exported report and then mail it, but it isn't something that is part of RS.
