Update large table (also wide) in SQL Server without indexes
I have a requirement; we have two SQL Server 2012 databases:

MARKETS_DAILY: tables in this database are expected to contain only a day's worth of data.
MARKETS_HISTORY: tables in this database are expected to contain histories for the same set of tables in MARKETS_DAILY (daily data copied over from MARKETS_DAILY).

All tables are fine except one. A table named RE_FEED is 1000 columns wide and has a couple of issues. 12 of its columns are defined as INT in MARKETS_DAILY but as VARCHAR(4) in MARKETS_HISTORY. Unfortunately, when a value reaches a length of 5 digits, it gets copied as '*' into MARKETS_HISTORY (when we copy the daily data from MARKETS_DAILY to MARKETS_HISTORY). So the data differs between MARKETS_DAILY and MARKETS_HISTORY for a particular day. There are no indexes on either of the tables, and strangely we maintain history in both databases, with different values for these 12 columns (the correct version is in DAILY).
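This is standard SQL Server behaviour: converting an integer that does not fit the target character length yields '*' rather than raising an error, which can be reproduced directly:

    -- INT-to-VARCHAR conversion that does not fit produces '*', not an error
    SELECT CAST(12345 AS VARCHAR(4));  -- returns '*'
    SELECT CAST(1234  AS VARCHAR(4));  -- returns '1234'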
My requirement is to sync these two tables. How do I do this? Performance seems to take a hit because there are no indexes, and these columns are not good candidates for indexing anyway.

I need to change the data type to INT in the HISTORY version and then update the values to match the DAILY version. I am searching for a performance-efficient way to do this. Your inputs will be helpful.
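For illustration, a minimal sketch of the fix being asked about, assuming a single affected column col1 and a hypothetical matching key feed_key (the real table's 12 columns and actual key would be substituted):

    -- 1. Clear the '*' placeholders first; ALTER COLUMN fails if any
    --    existing value cannot be converted to INT.
    UPDATE MARKETS_HISTORY.dbo.RE_FEED SET col1 = NULL WHERE col1 = '*';

    -- 2. Change the column type from VARCHAR(4) to INT.
    ALTER TABLE MARKETS_HISTORY.dbo.RE_FEED ALTER COLUMN col1 INT NULL;

    -- 3. Restore the correct values from the DAILY copy, which still
    --    holds the history with the correct INT values.
    UPDATE h
    SET    h.col1 = d.col1
    FROM   MARKETS_HISTORY.dbo.RE_FEED AS h
    JOIN   MARKETS_DAILY.dbo.RE_FEED   AS d
      ON   d.market_process_dt = h.market_process_dt
      AND  d.feed_key          = h.feed_key;  -- feed_key is an assumed identifier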
sql sql-server tsql
edited Nov 12 at 5:11 by marc_s
asked Nov 11 at 22:11 by SQLschooler
What are you using to copy the table? – user1443098, Nov 11 at 22:17

Why are the columns not good for indexing? None of the columns are large (like a varchar(500)), so why do you believe that is the case? If you aren't going to index them, then every query will be forced to run a table scan, which means they'll be as fast (slow) as one. – Larnu, Nov 11 at 22:18

Daily the following statement is run: INSERT INTO MARKETS_HISTORY.RE_FEED SELECT * FROM MARKETS_DAILY.RE_FEED WHERE market_process_dt = (SELECT MAX(market_process_dt) FROM MARKETS_DAILY.RE_FEED) – SQLschooler, Nov 11 at 22:19

You should first add a column list to your INSERT and SELECT statements (see the sketch after these comments), although I suspect at this stage you may not actually know the actual column mappings. – Nick.McDermaid, Nov 11 at 22:29

Why would that be a bad idea? One of the most common candidates for an index is the primary key of a table. By nature primary keys have to be unique, so the range of values is the same as the number of rows you have. In fact, I'd argue that columns that have a small number of distinct values (in a large dataset) are going to provide much lower value for an index than those with a high number of values. A seek on 7M out of 7.1M rows is going to be pointless (the DBMS would probably still do a scan), whereas a seek for 1 row in 5M is going to be a huge benefit. – Larnu, Nov 11 at 22:34
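A sketch of the daily copy rewritten with the explicit column list suggested above (col1 and col2 stand in for RE_FEED's real columns; once the HISTORY side is retyped to INT, the values copy without truncation):

    INSERT INTO MARKETS_HISTORY.dbo.RE_FEED (market_process_dt, col1, col2 /* , ... */)
    SELECT market_process_dt, col1, col2 /* , ... */
    FROM   MARKETS_DAILY.dbo.RE_FEED
    WHERE  market_process_dt = (SELECT MAX(market_process_dt)
                                FROM MARKETS_DAILY.dbo.RE_FEED);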
1 Answer
First, find out what fallout will happen if you fix the data type in MARKETS_HISTORY. That really needs to be done, since you are losing data.

Second, MARKETS_DAILY will have some transactional tables, so not all of them should be copied. You might need several strategies, depending on how data is rolled up for historic records. In the case you have here, adding an index on the market_process_dt field will slow down inserts a bit, but will speed up this query.
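A sketch of the index the answer suggests (the index name is illustrative); the same index on the HISTORY copy would also help the sync update:

    CREATE NONCLUSTERED INDEX IX_RE_FEED_market_process_dt
        ON MARKETS_DAILY.dbo.RE_FEED (market_process_dt);

    CREATE NONCLUSTERED INDEX IX_RE_FEED_market_process_dt
        ON MARKETS_HISTORY.dbo.RE_FEED (market_process_dt);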
answered Nov 12 at 5:54 by Lev