Unable to find entities from table storage after inserting batches of 100
Issue:
We currently have two Azure Functions on the consumption plan, each triggered by Service Bus queue messages.
The first function calls Azure SQL through a stored procedure, gets 500k+ records back, and saves them to Azure Table storage in batches of 100, each batch with a unique partition key. Once that's done, it creates a new queue message so that the next function can read the batch and process it.
Everything works fine when the second function is cold and still needs to warm up. If the second function is already running in memory when it receives the queue message, we do a partition key lookup against table storage, and sometimes the data coming back is empty.
Code that inserts batches into table storage:
    foreach (var entry in partitionKeyGroupinng)
    {
        var operation = new TableBatchOperation();
        entry.ToList().ForEach(operation.Insert);
        if (operation.Any())
        {
            await CloudTable.ExecuteBatchAsync(operation);
        }
    }
This runs inside an async Task method in a shared assembly referenced by all functions.
Code that reads from table storage by partition key:
    TableContinuationToken continuationToken = null;
    var query = BuildQuery(partitionKey);
    var allItems = new List<T>();
    do
    {
        var items = await CloudTable.ExecuteQuerySegmentedAsync(query, continuationToken);
        continuationToken = items.ContinuationToken;
        allItems.AddRange(items);
    } while (continuationToken != null);
    return allItems;
Code that calls it to look up by partition key:
    var batchedNotifications = await _tableStorageOperations.GetByPartitionKeyAsync($"{trackingId.ToString()}_{batchNumber}");
I reckon it's to do with the batch still being written and not yet available to other clients, but I don't know if that's the case. What would be the best way to handle this, given the function processing and eventual consistency?
I have set the following on the table client's service point:
    tableServicePoint.UseNagleAlgorithm = false;
    tableServicePoint.Expect100Continue = false;
    tableServicePoint.ConnectionLimit = 300;
If I look up that same partition key in Storage Explorer as the event happens, I can see the batch, so it does return values. I thought that using an EGT (entity group transaction) for the batching would ensure the data is written and available as soon as possible, because the async Task WriteBatch method shouldn't complete before it has finished writing the batch. However, I don't know how long the table storage back end takes to write the batch to a physical partition and then make it available. I have also batched up all the Service Bus queue messages before sending them, to add some delay before the second function runs.
Question:
How do we deal with this delay in accessing these records out of table storage between two functions using service bus queues?
azure azure-functions azure-table-storage azure-tablequery
edited Nov 11 at 23:04
asked Nov 8 at 22:22
Martin
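One way to cope with an occasionally empty lookup, sketched here under assumptions, is to retry the read a few times with a growing pause before treating the batch as missing. The helper name `ReadUntilNonEmptyAsync`, the attempt count, and the delays are all hypothetical, and the sketch assumes that an empty result always means the batch is not visible yet rather than genuinely empty.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class EmptyReadRetry
{
    // Calls `read` until it returns at least one item or `maxAttempts` is reached.
    // The pause doubles after each empty result, starting at `initialDelay`.
    public static async Task<List<T>> ReadUntilNonEmptyAsync<T>(
        Func<Task<List<T>>> read,
        int maxAttempts,
        TimeSpan initialDelay)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        var delay = initialDelay;
        for (var attempt = 1; ; attempt++)
        {
            var items = await read();
            if (items.Count > 0 || attempt == maxAttempts)
            {
                // May still be empty if every attempt came back empty.
                return items;
            }

            await Task.Delay(delay);
            delay = TimeSpan.FromTicks(delay.Ticks * 2);
        }
    }
}
```

Assuming `GetByPartitionKeyAsync` returns a `Task<List<T>>`, the lookup could be wrapped as `EmptyReadRetry.ReadUntilNonEmptyAsync(() => _tableStorageOperations.GetByPartitionKeyAsync(key), 5, TimeSpan.FromSeconds(1))`. If the list is still empty after the last attempt, the caller has to decide whether to fail the message so that Service Bus redelivers it.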
As it turns out, it is perhaps something to do with the data not yet being available on another connection: it is probably still being saved for replication and thus is not available. The solution was to delay the next function's read from table storage by using Service Bus scheduled messages with a 15-second delay. The 15 seconds was just a number we picked, and it worked. Now we can send 500k records through, batched into 1k batches, using Azure Functions and Service Bus queues.
– Martin
1 hour ago
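The scheduled-message delay described in this comment might look roughly like the following, assuming the `Microsoft.Azure.ServiceBus` NuGet package. The class and method names and the choice of the partition key as the message body are assumptions for illustration; only `Message`, `IQueueClient`, and `ScheduleMessageAsync` come from the library.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus; // assumed client library (NuGet package)

public static class DelayedBatchNotifier
{
    // Pure helper so the enqueue time can be checked without a real queue.
    public static DateTimeOffset EnqueueTimeFor(DateTimeOffset now, TimeSpan delay)
    {
        return now.Add(delay);
    }

    // Sends a message that Service Bus holds back until the delay has elapsed.
    // Returns the sequence number of the scheduled message.
    public static Task<long> ScheduleBatchMessageAsync(
        IQueueClient queueClient, string partitionKey, TimeSpan delay)
    {
        // Hypothetical payload: the partition key of the batch to process.
        var message = new Message(Encoding.UTF8.GetBytes(partitionKey));
        var enqueueAt = EnqueueTimeFor(DateTimeOffset.UtcNow, delay);
        return queueClient.ScheduleMessageAsync(message, enqueueAt);
    }
}
```

Calling `ScheduleBatchMessageAsync(queueClient, key, TimeSpan.FromSeconds(15))` would mirror the 15-second delay. That number was found by trial, so it is a tuning knob rather than a guarantee.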