Tree search based Game AI: How to avoid AI 'wandering'/'procrastination' with sparse rewards?

My game AI makes use of an algorithm that searches all possible future states based on the moves I can make (minimax / monte carlo esque). It evaluates these states using a scoring system, picks the highest scored final state and follows it.

This works well in most situations, but awfully when rewards are sparse. For example: there's a desirable collectable object that's 3 tiles to the right of me. The natural solution would be to go right->right->right.

But, my algorithm searches 6 turns deep. And it will will find many paths that eventually collect the object, including ones that take longer than 3 turns. It might for example find a path that's: up->right->down->right->right->down, collecting the object on turn 5 instead.

Since in both cases, the final leaf nodes detect the object as collected, it doesn't naturally prefer one or the other. So, instead of going right on turn 1, it might go up, or down, or left. This behavior will be repeated exactly on the next turn, so that it basically ends up dancing randomly in front of the collectable object, only luck will make it step on it.

That's clearly suboptimal and I want to fix it, but have run out of ideas how to handle this appropriately. Are there any solutions for this issue or is there any theoretical work that deals with handling this issue?

Solutions I've tried:

Make it value object collection more on earlier turns. While this works, to beat evaluator 'noise', the difference between turns must be quite high. Turn 1 must be rated higher than 2, turn 2 rated higher than 3, etc. The difference between turn 1 and 6 needs to be so high that it ends up making the behavior extremely greedy, which is not desirable in most situations. In an environment with multiple objects, it might end up choosing the path that grabs an object on turn 1, instead of the much better path that can grab objects on turn 5 and 6.

Assign the object as a target and value distance to it. If not done on a turn to turn basis, the original problem persists. If done on a turn to turn basis, the difference in importance required per turn once again makes it too greedy. This method also decreases flexibility and causes other issues. Target selection is not trivial and kind of ruins the point of a minimax style algorithm

Going much deeper in my searches so that it can always find a second object. This would cost so much computing power that I'd have to make concessions, like pruning paths much more aggressively. If I do so, I'll be back at the same problem since I won't know how to get it to prefer pruning the 5 turn version over the 3 turn version.

Give extra value to the plans laid out last turn. If it can at least follow the suboptimal path, there wouldn't be as much of an issue. Unfortunately, this once again has to be a pretty strong effect for it to work reliably, making it follow sub-optimal paths in all scenarios, hurting overall performance.

asked Nov 13 at 11:52

AnythingElse

396

add a comment |

Solutions I've tried:

Make it value object collection more on earlier turns. While this works, to beat evaluator 'noise', the difference between turns must be quite high. Turn 1 must be rated higher than 2, turn 2 rated higher than 3, etc. The difference between turn 1 and 6 needs to be so high that it ends up making the behavior extremely greedy, which is not desirable in most situations. In an environment with multiple objects, it might end up choosing the path that grabs an object on turn 1, instead of the much better path that can grab objects on turn 5 and 6.

Assign the object as a target and value distance to it. If not done on a turn to turn basis, the original problem persists. If done on a turn to turn basis, the difference in importance required per turn once again makes it too greedy. This method also decreases flexibility and causes other issues. Target selection is not trivial and kind of ruins the point of a minimax style algorithm

Going much deeper in my searches so that it can always find a second object. This would cost so much computing power that I'd have to make concessions, like pruning paths much more aggressively. If I do so, I'll be back at the same problem since I won't know how to get it to prefer pruning the 5 turn version over the 3 turn version.

Give extra value to the plans laid out last turn. If it can at least follow the suboptimal path, there wouldn't be as much of an issue. Unfortunately, this once again has to be a pretty strong effect for it to work reliably, making it follow sub-optimal paths in all scenarios, hurting overall performance.

asked Nov 13 at 11:52

AnythingElse

396

add a comment |

Solutions I've tried:

Make it value object collection more on earlier turns. While this works, to beat evaluator 'noise', the difference between turns must be quite high. Turn 1 must be rated higher than 2, turn 2 rated higher than 3, etc. The difference between turn 1 and 6 needs to be so high that it ends up making the behavior extremely greedy, which is not desirable in most situations. In an environment with multiple objects, it might end up choosing the path that grabs an object on turn 1, instead of the much better path that can grab objects on turn 5 and 6.

Assign the object as a target and value distance to it. If not done on a turn to turn basis, the original problem persists. If done on a turn to turn basis, the difference in importance required per turn once again makes it too greedy. This method also decreases flexibility and causes other issues. Target selection is not trivial and kind of ruins the point of a minimax style algorithm

Going much deeper in my searches so that it can always find a second object. This would cost so much computing power that I'd have to make concessions, like pruning paths much more aggressively. If I do so, I'll be back at the same problem since I won't know how to get it to prefer pruning the 5 turn version over the 3 turn version.

Give extra value to the plans laid out last turn. If it can at least follow the suboptimal path, there wouldn't be as much of an issue. Unfortunately, this once again has to be a pretty strong effect for it to work reliably, making it follow sub-optimal paths in all scenarios, hurting overall performance.

asked Nov 13 at 11:52

AnythingElse

396

Solutions I've tried:

Make it value object collection more on earlier turns. While this works, to beat evaluator 'noise', the difference between turns must be quite high. Turn 1 must be rated higher than 2, turn 2 rated higher than 3, etc. The difference between turn 1 and 6 needs to be so high that it ends up making the behavior extremely greedy, which is not desirable in most situations. In an environment with multiple objects, it might end up choosing the path that grabs an object on turn 1, instead of the much better path that can grab objects on turn 5 and 6.

Assign the object as a target and value distance to it. If not done on a turn to turn basis, the original problem persists. If done on a turn to turn basis, the difference in importance required per turn once again makes it too greedy. This method also decreases flexibility and causes other issues. Target selection is not trivial and kind of ruins the point of a minimax style algorithm

Going much deeper in my searches so that it can always find a second object. This would cost so much computing power that I'd have to make concessions, like pruning paths much more aggressively. If I do so, I'll be back at the same problem since I won't know how to get it to prefer pruning the 5 turn version over the 3 turn version.

Give extra value to the plans laid out last turn. If it can at least follow the suboptimal path, there wouldn't be as much of an issue. Unfortunately, this once again has to be a pretty strong effect for it to work reliably, making it follow sub-optimal paths in all scenarios, hurting overall performance.

artificial-intelligence minimax game-ai monte-carlo-tree-search

asked Nov 13 at 11:52

AnythingElse

396

asked Nov 13 at 11:52

AnythingElse

396

asked Nov 13 at 11:52

AnythingElse

396

asked Nov 13 at 11:52

AnythingElse

396

asked Nov 13 at 11:52

AnythingElse

396

add a comment |

active

oldest

votes

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53280469%2ftree-search-based-game-ai-how-to-avoid-ai-wandering-procrastination-with-sp%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

active

oldest

votes

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Agfdhyk