Assembly 8086: why doesn't a later instruction modify the previous one after execution?

I'm new to assembly and trying to understand this code:

072A:100 mov word ptr [0107], 4567

072A:106 mov ax, 1234

072A:109 add ax, dx

What I understand is that the first instruction puts two bytes with the values 67 45 at address 072A:107. In the end, AX = 4567.

What I don't understand is why the later instruction mov ax, 1234 doesn't change the value at address 072A:107 that was written by the earlier mov word ptr [0107] instruction. Why isn't the dump changed?

Thank you in advance.

edited 22 hours ago

asked 2 days ago

Pooshkis

New contributor

What is your question exactly? Why mov ax, 1234 isn't shown as mov ax, 4567 instead..? Have you tried executing the code once and then generating the disassembly again?
– Michael
yesterday

mind the segments. 'word ptr [107]' isn't necessarily CS:[107]
– Tommylee2k
yesterday

Self-modifying code stopped being practical quite a long time ago. Modern processors prefetch and pre-decode instructions well before they are executed. As-is, this code requires a special instruction between the two, a serializing instruction like cpuid.
– Hans Passant
yesterday

@Pooshkis how about rephrasing the title of the question? Something like "why later instruction, modified by previous one, does not reset back when executed"? Your current one reads more like you are asking why mov ax,1234 does not modify the previous instruction, and that's hopefully clear: it doesn't write to memory, so it can't modify any instruction at all. Or did you have something else in mind that the proposed title doesn't capture?
– Ped7g
yesterday

@HansPassant: Fun fact: actual hardware implementations of x86 have stronger i-cache coherency than what's required on paper, because being exactly as weak as the paper spec would be slow, as Andy Glew explains in "Observing stale instruction fetching on x86 with self-modifying code". I think the most that any x86 has ever required is a taken jump to avoid stale instruction fetch, but modern OoO-exec machines snoop addresses that are already in the pipeline. (Resulting in massively slow machine-clears for self-modifying code.)
– Peter Cordes
yesterday

assembly word instructions mov


1 Answer
accepted

When you look at that disassembly (before executing the first instruction), the memory is already loaded with the machine code. (I will assume this is a DOS COM file, so cs=ds=ss=0x72A and the first mov will self-modify the second mov.)

So the content of memory is already (the middle part is the machine code bytes in hex):

072A:100 C70607016745   (mov word ptr [0107], 4567) <- cs:ip points here

072A:106 B83412         (mov ax, 1234)

072A:109 01D0           (add ax, dx)

After executing the first instruction (C7 06 07 01 67 45: 6 bytes are read by the CPU and decoded as a mov [..],.. instruction), the memory content changes to:

072A:100 C70607016745   (mov word ptr [0107], 4567)

072A:106 B86745         (mov ax, 4567)  <- cs:ip points here

072A:109 01D0           (add ax, dx)

If you disassemble the machine code now, you will see the second instruction as "mov ax, 4567" already. The CPU has no idea that the original source said mov ax, 1234, and as you can see from the machine code in memory, there is no way to reconstruct that: there is no 1234h value anywhere in memory.
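The byte-level effect described above can be sketched as a toy model. This is only an illustration in Python, not real 8086 emulation: the `memory` array, the `decode_mov_ax_imm16` helper, and the cs=ds assumption are mine, and only the one opcode needed here (B8) is decoded.

```python
# Toy model of the 64 KiB segment a DOS .COM program runs in (assumes CS == DS).
memory = bytearray(0x10000)

# Machine code from the dump above, loaded at offset 100h.
code = bytes.fromhex("C70607016745" "B83412" "01D0")
memory[0x100:0x100 + len(code)] = code


def decode_mov_ax_imm16(mem, offset):
    """Decode 'mov ax, imm16' (opcode B8, little-endian immediate) at offset."""
    assert mem[offset] == 0xB8, "not a mov ax, imm16 instruction"
    imm = int.from_bytes(mem[offset + 1:offset + 3], "little")
    return f"mov ax, {imm:04X}"


before = decode_mov_ax_imm16(memory, 0x106)

# Effect of executing 'mov word ptr [0107], 4567': store the word little-endian.
memory[0x107:0x109] = (0x4567).to_bytes(2, "little")

after = decode_mov_ax_imm16(memory, 0x106)
print(before)  # mov ax, 1234
print(after)   # mov ax, 4567
```

The store overwrites the two immediate bytes 34 12 that follow the B8 opcode, so the only copy of 1234h in this memory is gone once the first instruction has run.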

Also, when you reload the code from the executable, it will again be mov ax, 1234, because that is what is stored in the binary after the assembling step, before executing it.

The machine code is not built at runtime from source. The assembler produces the binary machine code at assembly time, so there is nothing to "restore" that second instruction back to mov ax,1234 (the source and the assembler are not relevant at runtime).
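The difference between the file on disk and the copy in memory can be sketched the same way. Again this is a hedged Python illustration: `FILE_IMAGE`, `load_com`, and `run_first_instruction` are hypothetical names, and the model assumes a .COM-style load at offset 100h with ds=cs.

```python
# Bytes as stored in the (hypothetical) .COM file on disk after assembling.
FILE_IMAGE = bytes.fromhex("C70607016745B8341201D0")


def load_com(image):
    """Copy the file image to offset 100h of a fresh 64 KiB segment."""
    segment = bytearray(0x10000)
    segment[0x100:0x100 + len(image)] = image
    return segment


def run_first_instruction(segment):
    """Apply the effect of 'mov word ptr [0107], 4567' (assuming DS == CS)."""
    segment[0x107:0x109] = (0x4567).to_bytes(2, "little")


# First run: the in-memory copy gets patched.
first_run = load_com(FILE_IMAGE)
run_first_instruction(first_run)
patched = first_run[0x106:0x109].hex().upper()

# Reloading copies the untouched file image again.
second_run = load_com(FILE_IMAGE)
reloaded = second_run[0x106:0x109].hex().upper()

print(patched)   # B86745
print(reloaded)  # B83412
```

Only the loaded copy is modified. The file image still holds B8 34 12, which is why a fresh load shows mov ax, 1234 again.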

If this were some kind of interpreted language, preparing every instruction just before execution by assembling it from source, then the first instruction would have to modify the source to cause self-modification at "interpretation time", but most interpreters don't offer any easy way to modify the source currently being interpreted.

And even toy/simulator machines designed to teach assembly (MARS/SPIM, or an 8-bit assembler simulator) operate at "runtime" on binary machine code, not source code (although they may or may not allow self-modification to propagate into the simulation; some simulators may ignore it and protect the original machine code from modification for whatever reason).

Warning for assembly newcomers: while self-modifying code may sound cool at first (at least it did to me), it's strongly discouraged:

1) You can NOT use it by default in modern software (unless you go to quite some lengths to enable it).

2) It hurts performance on modern CPUs a lot. By the time a modern x86 CPU detects the write at 107h, it has already fetched, decoded, and speculatively executed several instructions down the line, so it has to throw all of that "future" work away, clear its internal caches, and start over. An instruction like mov ax,1234, which might have executed in a single cycle or even alongside some other instruction, may instead take 100+ cycles.

3) It makes for difficult-to-find bugs if you are not experienced enough to foresee all the implications of such code.

So it's valuable to understand the concept and what happens, but don't use it unless you are doing something extra niche/specialized, like a 256-byte intro where it saves you two bytes; then it's valid.

edited 17 hours ago

answered yesterday

Ped7g

