What cost does a bloated object file carry?



























While working on an embedded project, I encountered a function that is called thousands of times over the application's lifetime, often in loops, dozens of times per second. I wondered whether I could reduce its cost, and I found out that most of its parameters are known at compile time.



Let me illustrate it with an example.



The original hpp/cpp files can be approximated like this:



original.hpp:




void example(bool arg1, bool arg2, const char* data);


original.cpp:



#include "ex1.hpp"
#include <iostream>

void example(bool arg1, bool arg2, const char* data)
{
if (arg1 && arg2)
{
std::cout << "Both true " << data << std::endl;
}
else if (!arg1 && arg2)
{
std::cout << "False and true " << data << std::endl;
}
else if (arg1 && !arg2)
{
std::cout << "True and false " << data << std::endl;
}
else
{
std::cout << "Both false " << data << std::endl;
}
}


Let's assume that every single time the function is called, arg1 and arg2 are known at compile time. The data argument isn't, and for a variety of reasons its processing cannot be moved into the header file.



However, all those if statements can be handled by the compiler with a little bit of template magic:



magic.hpp:



template<bool arg1, bool arg2>
void example(const char* data);


magic.cpp:



#include "ex1.hpp"    
#include <iostream>

template<bool arg1, bool arg2>
struct Processor;

template<>
struct Processor<true, true>
{
static void process(const char* data)
{
std::cout << "Both true " << data << std::endl;
}
};

template<>
struct Processor<false, true>
{
static void process(const char* data)
{
std::cout << "False and true " << data << std::endl;
}
};

template<>
struct Processor<true, false>
{
static void process(const char* data)
{
std::cout << "True and false " << data << std::endl;
}
};

template<>
struct Processor<false, false>
{
static void process(const char* data)
{
std::cout << "Both false " << data << std::endl;
}
};

template<bool arg1, bool arg2>
void example(const char* data)
{
Processor<arg1, arg2>::process(data);
}

template void example<true, true>(const char*);
template void example<false, true>(const char*);
template void example<true, false>(const char*);
template void example<false, false>(const char*);


As you can see, even in this tiny example the cpp file got significantly bigger than the original. But I did remove a few assembly instructions!
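For context, this is roughly how a call site changes between the two approaches (the calling function below is hypothetical, not part of my code):

#include "original.hpp"
#include "magic.hpp"

void report(const char* payload)
{
    // Original version: arg1 and arg2 are passed at run time,
    // so the if/else chain inside example() executes on every call.
    example(true, false, payload);

    // Template version: the flags are template arguments, so the
    // compiler picks the <true, false> instantiation and no run-time
    // branching on the flags remains.
    example<true, false>(payload);
}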



Now, in my real-life case things are a bit more complex, because instead of two bool arguments I have enums and structures. Long story short, there are about one thousand combinations in total, so I have that many instances of the line template void example<something>(const char*);



Of course I do not generate them manually but with macros, yet the cpp file still gets humongous compared to the original, and the object file is even worse.
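For the simple bool example above, the macro generation could look roughly like this (an illustrative sketch, not my actual macros):

// Illustrative X-macro: list every combination of the template arguments once...
#define EXAMPLE_COMBINATIONS(X) \
    X(true,  true)              \
    X(true,  false)             \
    X(false, true)              \
    X(false, false)

// ...then expand the list into the explicit instantiation definitions.
#define EXAMPLE_INSTANTIATE(A1, A2) template void example<A1, A2>(const char*);
EXAMPLE_COMBINATIONS(EXAMPLE_INSTANTIATE)
#undef EXAMPLE_INSTANTIATE
#undef EXAMPLE_COMBINATIONS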



All this in the name of removing several if statements and one switch.



My question is: is size the only problem with the template-magic approach? I wonder if there is some hidden cost in having so many versions of the same function. Did I really save some resources, or just the opposite?










c++ templates optimization embedded

asked Nov 20 '18 at 21:41 by Darth Hunterix, edited Nov 22 '18 at 14:00

  • Comments are not for extended discussion; this conversation has been moved to chat.

    – Samuel Liew
    Nov 21 '18 at 2:13

1 Answer

The problem with an increased binary size is almost never the storage of the file itself; the problem is that more code means a smaller fraction of the program's instructions can be in cache at any point, leading to cache misses. If you're calling the same instantiation in a tight loop, then having it do less work is great. But if you're constantly bouncing around between different template instantiations, the cost of going to main memory to load instructions may be far higher than what you save by removing some instructions from inside the function.



This kind of thing can be VERY difficult to predict, though. The way to find the sweet spot in this (and any) type of optimization is to measure. It is also likely to change across platforms, especially in the embedded world.
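As a rough illustration of what "measure" can mean here, a minimal benchmark on a hosted platform might look like the sketch below (the call sites and iteration count are made up; on a real embedded target you would typically read a hardware cycle counter instead):

#include <chrono>
#include <cstdio>

#include "original.hpp"  // run-time branching version
#include "magic.hpp"     // template-specialized version

int main()
{
    using clock = std::chrono::steady_clock;
    constexpr int iterations = 100000;

    // Time the run-time branching version.
    auto t0 = clock::now();
    for (int i = 0; i < iterations; ++i)
        example(true, false, "payload");
    auto t1 = clock::now();

    // Time the compile-time specialized version.
    for (int i = 0; i < iterations; ++i)
        example<true, false>("payload");
    auto t2 = clock::now();

    // NB: in this toy example the iostream output dominates the timing;
    // in real code the measured work would be the function body itself.
    auto us = [](auto d) {
        return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
    };
    std::printf("original: %lld us, template: %lld us\n",
                static_cast<long long>(us(t1 - t0)),
                static_cast<long long>(us(t2 - t1)));
}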






answered Nov 20 '18 at 21:57 by xaxxon, edited Nov 20 '18 at 22:02

  • What makes you think the OP is using a system with data or instruction cache? He says he's using a microcontroller. Which can be anything from an antique 8051 to Cortex A or PowerPC. I don't see how this answers the question.

    – Lundin
    Nov 21 '18 at 11:58











  • I can now confirm that there is a bit of cache, small as it might be, so it may be a problem. The answer is also useful for other people with a similar question. In the end, the concept was abandoned.

    – Darth Hunterix
    Nov 21 '18 at 21:55










