Monday, March 19, 2012
Last non nulls from a set of rows
an extremely large set of information.
The table all of this data is being dumped into is a transaction style table
with heavy null propogation. The only "real data" out of the table is the
last not null value entered (defined by the record time). For example:
Record 1. 1/3/2006 3:00PM, ID2006, Null, Null, 1006, ABC.
Record 2. 1/3/2006 5:00 PM ID2006, 32, Null, Null, Null
Record 3. 1/3/2006 9:00 PM ID2006, Null, Null, 1007, Null
The result set I would want from the above data is:
ID2006, 32, Null, 1007, ABC
Or in English, The values for ID2006 that are not null in order of most
recent record time.
As an aside, I also can't depend on records being inserted sequentially by
date.
I know I can do this with a series of selects using the TOP keyword, but is
there a more elegant way to deal with this problem? The individual selects
are pretty expensive over such a large amount of data.
Many Thanks!Hi Justin Weinberg,
I don't know how useful this will be but, I thought it might be fun to
play around with it. Go ahead and copy and paste the script below. I
added a second 'ID' record to be sure it was rolling the data up and
grouping correctly.
SET nocount ON
SET ansi_warnings off /* we'll be eliminating NULL values with an
aggregate function. don't need to be reminded */
CREATE TABLE #tbl (
entrydate datetime NOT NULL,
rowid CHAR(6) NOT NULL,
f1 INT NULL,
f15 INT NULL,
f2 INT NULL,
f3 CHAR(4) NULL)
INSERT #tbl
VALUES('1/3/2006 3:00:00.000 PM',
'ID2006',
NULL,
NULL,
1006,
'ABC.')
INSERT #tbl
VALUES('1/3/2006 5:00:00.000 PM',
'ID2006',
32,
NULL,
NULL,
NULL)
INSERT #tbl
VALUES('1/3/2006 9:00:00.000 PM',
'ID2006',
NULL,
NULL,
1007,
NULL)
INSERT #tbl
VALUES('1/3/2006 10:00:00.000 PM',
'ID2008',
NULL,
NULL,
1006,
NULL)
INSERT #tbl
VALUES('1/3/2006 6:00:00.000 PM',
'ID2008',
NULL,
NULL,
NULL,
'ACO.')
INSERT #tbl
VALUES('1/3/2006 11:00:00.000 PM',
'ID2008',
1734,
NULL,
NULL,
NULL)
INSERT #tbl
VALUES('1/3/2006 11:00:00.000 PM',
'ID2008',
NULL,
12,
NULL,
NULL)
SELECT #tbl.rowid,
Max(#tbl.f1) AS maxf1,
Max(#tbl.f15) AS maxf15,
Max(#tbl.f2) AS maxf2,
Max(#tbl.f3) AS maxf3
FROM #tbl
LEFT JOIN (SELECT Max(entrydate) AS ed,
rowid
FROM #tbl
WHERE f1 IS NOT NULL
GROUP BY rowid) AS f1results
ON #tbl.rowid = f1results.rowid
AND #tbl.entrydate = f1results.ed
LEFT JOIN (SELECT Max(entrydate) AS ed,
rowid
FROM #tbl
WHERE f15 IS NOT NULL
GROUP BY rowid) AS f15results
ON #tbl.rowid = f15results.rowid
AND #tbl.entrydate = f15results.ed
LEFT JOIN (SELECT Max(entrydate) AS ed,
rowid
FROM #tbl
WHERE f2 IS NOT NULL
GROUP BY rowid) AS f2results
ON #tbl.rowid = f2results.rowid
AND #tbl.entrydate = f2results.ed
LEFT JOIN (SELECT Max(entrydate) AS ed,
rowid
FROM #tbl
WHERE f3 IS NOT NULL
GROUP BY rowid) AS f3results
ON #tbl.rowid = f3results.rowid
AND #tbl.entrydate = f3results.ed
GROUP BY #tbl.rowid
ORDER BY #tbl.rowid
DROP TABLE #tbl|||Justin,
Here is a solution likely to be faster:
create table T (
d datetime,
id char(6),
a int,
b int,
c int,
x char(3)
)
go
insert into T values('20060103 15:00:00','ID2006',null,null,1006,'ABC')
insert into T values('20060103 17:00:00','ID2006',32,null,null,null)
insert into T values('20060103 21:00:00','ID2006',null,null,1007,NULL)
go
select
id,
ltrim(substring(max(convert(char(17),d,1
12)+str(a)),18,12)) as a,
ltrim(substring(max(convert(char(17),d,1
12)+str(b)),18,12)) as b,
ltrim(substring(max(convert(char(17),d,1
12)+str(c)),18,12)) as c,
ltrim(substring(max(convert(char(17),d,1
12)+x),18,3)) as x
from T
group by id
go
It will be even faster if you can afford to make the following
changes:
create table T (
d datetime,
id char(6),
a int,
b int,
c int,
x char(3),
n as case when a+b+c+case when x is not null then 0 end is null then 1 end
)
go
create clustered index T_ci on T(id,n)
<same query>
Steve Kass
Drew University
Justin Weinberg wrote:
>I'm in a situation where I can't alter how data is being collected, and it'
s
>an extremely large set of information.
>The table all of this data is being dumped into is a transaction style tabl
e
>with heavy null propogation. The only "real data" out of the table is the
>last not null value entered (defined by the record time). For example:
>Record 1. 1/3/2006 3:00PM, ID2006, Null, Null, 1006, ABC.
>Record 2. 1/3/2006 5:00 PM ID2006, 32, Null, Null, Null
>Record 3. 1/3/2006 9:00 PM ID2006, Null, Null, 1007, Null
>The result set I would want from the above data is:
>ID2006, 32, Null, 1007, ABC
>Or in English, The values for ID2006 that are not null in order of most
>recent record time.
>As an aside, I also can't depend on records being inserted sequentially by
>date.
>I know I can do this with a series of selects using the TOP keyword, but is
>there a more elegant way to deal with this problem? The individual selects
>are pretty expensive over such a large amount of data.
>Many Thanks!
>
>
>|||Shorter and easier to maintain, to boot!
Now I know why nobody wants to work on my queries :-)
Friday, February 24, 2012
Large numbers of orphaned/expired requests
w3wp!runningjobs!434!3/23/2007-10:12:57:: i INFO: Adding: 8 running jobs to the database
w3wp!runningjobs!434!3/23/2007-10:13:57:: i INFO: RunningJobContext.IsClientConnected; found orphaned request
w3wp!runningjobs!434!3/23/2007-10:13:57:: i INFO: RunningJobContext.IsExpired; found expired request
w3wp!runningjobs!434!3/23/2007-10:13:57:: i INFO: RunningJobContext.IsExpired; found expired request
w3wp!runningjobs!434!3/23/2007-10:13:57:: i INFO: RunningJobContext.IsExpired; found expired request
w3wp!runningjobs!434!3/23/2007-10:13:57:: i INFO: RunningJobContext.IsClientConnected; found orphaned request
w3wp!runningjobs!434!3/23/2007-10:13:57:: i INFO: RunningJobContext.IsClientConnected; found orphaned request
w3wp!runningjobs!434!3/23/2007-10:13:57:: i INFO: RunningJobContext.IsExpired; found expired request
w3wp!runningjobs!434!3/23/2007-10:14:57:: i INFO: RunningJobContext.IsClientConnected; found orphaned request
w3wp!runningjobs!434!3/23/2007-10:14:57:: i INFO: RunningJobContext.IsExpired; found expired request
w3wp!runningjobs!434!3/23/2007-10:14:57:: i INFO: RunningJobContext.IsExpired; found expired request
w3wp!runningjobs!434!3/23/2007-10:14:57:: i INFO: RunningJobContext.IsExpired; found expired request
w3wp!runningjobs!434!3/23/2007-10:14:57:: i INFO: RunningJobContext.IsClientConnected; found orphaned request
w3wp!runningjobs!434!3/23/2007-10:14:57:: i INFO: RunningJobContext.IsClientConnected; found orphaned request
w3wp!runningjobs!434!3/23/2007-10:14:57:: i INFO: RunningJobContext.IsExpired; found expired request
What could be causing this? The reports are making queries through an OLE DB provider. There are no scheduled jobs, and the load doesn't seem that heavy.
Orphaned request indicates that the HTTP request has gone away before we finished processing the report. Generally this is because people request a report, and then close IE.