Unable to find entities from table storage after inserting batches of 100

Issue:
We currently have two Azure Functions on the Consumption plan, each triggered by Service Bus queue messages.
The first function calls Azure SQL through a stored procedure, gets 500k+ records back, and saves them to Azure Table storage in batches of 100, with each batch using a unique partition key. Once that is done, it creates a new queue message so the next function can read the batch and process it.

Everything works fine when the second function is cold and still needs to warm up. If the second function is already warm (running in memory) when it receives the queue message, we do a partition key lookup against Table storage, and sometimes the result comes back empty.

Code that inserts batches into Table storage:

    // Each group shares one partition key and holds one batch of entities.
    foreach (var entry in partitionKeyGroupinng)
    {
        var operation = new TableBatchOperation();
        entry.ToList().ForEach(operation.Insert);

        // Skip empty groups; executing an empty batch would fail.
        if (operation.Any())
        {
            await CloudTable.ExecuteBatchAsync(operation);
        }
    }

This is inside an async Task method in a shared assembly referenced by all the functions.
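
As a rough illustration of the batching described above, the sketch below splits records into groups of at most 100 (the limit for a Table storage entity group transaction) and derives a `{trackingId}_{batchNumber}` partition key for each group. `BatchPartitioner`, `Partition`, and the zero-based batch numbering are hypothetical names and assumptions made for this example; they are not the code from the shared assembly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchPartitioner
{
    // An entity group transaction accepts at most 100 entities,
    // all of which must share the same partition key.
    public const int MaxBatchSize = 100;

    // Lazily splits records into consecutive groups of at most batchSize items.
    // Each group is paired with the key "{trackingId}_{batchNumber}".
    // Assumption: batch numbers start at 0.
    public static IEnumerable<KeyValuePair<string, List<T>>> Partition<T>(
        Guid trackingId, IEnumerable<T> records, int batchSize = MaxBatchSize)
    {
        if (records == null) throw new ArgumentNullException(nameof(records));
        if (batchSize < 1 || batchSize > MaxBatchSize)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batchNumber = 0;
        var current = new List<T>(batchSize);
        foreach (var record in records)
        {
            current.Add(record);
            if (current.Count == batchSize)
            {
                yield return new KeyValuePair<string, List<T>>(
                    $"{trackingId}_{batchNumber}", current);
                batchNumber++;
                current = new List<T>(batchSize);
            }
        }

        // Emit the final, partially filled group if there is one.
        if (current.Count > 0)
        {
            yield return new KeyValuePair<string, List<T>>(
                $"{trackingId}_{batchNumber}", current);
        }
    }
}
```

Each resulting group could then be added to a single `TableBatchOperation`, with the group's key used as the `PartitionKey` of every entity in it.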



Code that reads from Table storage by partition key:

    TableContinuationToken continuationToken = null;

    var query = BuildQuery(partitionKey);

    var allItems = new List<T>();
    do
    {
        // Fetch one segment and keep paging until no continuation token is returned.
        var items = await CloudTable.ExecuteQuerySegmentedAsync(query, continuationToken);
        continuationToken = items.ContinuationToken;
        allItems.AddRange(items);
    } while (continuationToken != null);

    return allItems;

Code that calls it to look up by partition key:

    var batchedNotifications = await _tableStorageOperations.GetByPartitionKeyAsync($"{trackingId.ToString()}_{batchNumber}");

I suspect the batch is still being written and is not yet available to other clients, but I don't know if that's the case. What would be the best way to handle this, given the function processing and eventual consistency?

I have disabled Nagle's algorithm and Expect: 100-Continue on the table service point, and raised the connection limit:

    tableServicePoint.UseNagleAlgorithm = false;
    tableServicePoint.Expect100Continue = false;
    tableServicePoint.ConnectionLimit = 300;

If I look up the same partition key in Storage Explorer as the event happens, I can see the batch, so it does return values. I thought that using an EGT (entity group transaction) for the batching would ensure the data is written and available as soon as possible, because the async Task WriteBatch method shouldn't complete before it has finished writing the batch. However, I don't know how long the Table storage back end takes to write the data to a physical partition and make it available. I have also batched up all the Service Bus queue messages before sending them, to add some delay before the second function runs.

Question:
How do we deal with this delay in accessing these records in Table storage between two functions that communicate through Service Bus queues?

  • As it turns out, it probably has something to do with the data not yet being available on another connection, perhaps because it is still being saved for replication. The solution was to delay the next function's read from Table storage by using Service Bus scheduled messages with a 15-second delay. The 15 seconds was just a number we picked that happened to work. Now we can send 500k records through, batched into 1k batches, using Azure Functions and Service Bus queues.
    – Martin
    1 hour ago
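
The delay described in the comment above can be sketched as a small helper that computes when the follow-up queue message should become visible. `ReadDelay` and its members are hypothetical names invented for this example, and the helper deliberately avoids any Service Bus SDK types; the 15-second default is the empirically chosen value from the comment, not a documented guarantee.

```csharp
using System;

public static class ReadDelay
{
    // Empirical value reported to work in the comment; not a documented bound.
    public static readonly TimeSpan Default = TimeSpan.FromSeconds(15);

    // Returns the UTC time at which the follow-up message should be enqueued,
    // given the time the table batch write completed.
    public static DateTime ScheduledEnqueueTimeUtc(
        DateTime writeCompletedUtc, TimeSpan? delay = null)
    {
        if (writeCompletedUtc.Kind != DateTimeKind.Utc)
            throw new ArgumentException("Expected a UTC timestamp.", nameof(writeCompletedUtc));

        var effectiveDelay = delay ?? Default;
        if (effectiveDelay < TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(delay));

        return writeCompletedUtc + effectiveDelay;
    }
}
```

Assuming the Microsoft.Azure.ServiceBus client is in use (the post does not say which client library it is), the returned value could be assigned to the message's `ScheduledEnqueueTimeUtc` property before sending, for example `message.ScheduledEnqueueTimeUtc = ReadDelay.ScheduledEnqueueTimeUtc(DateTime.UtcNow);`.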

















azure azure-functions azure-table-storage azure-tablequery

asked Nov 8 at 22:22 by Martin
edited Nov 11 at 23:04