Pandas dataframe find first and last element given condition and calculate slope











The situation:



I have a pandas dataframe with data about the production of a product. Each product is produced in 3 phases. The phases are not of fixed length: the number of cycles a phase lasts varies. During the production phases, the temperature of the product is measured at each cycle.



Please see the table below:



enter image description here



The problem:



I need to calculate the slope for each cycle of each phase of each product, and add it to the dataframe in a new column called "Slope". The one you can see highlighted in yellow was added by me manually in an Excel file. The real dataset contains hundreds of parameters (not only temperatures), so in reality I need to calculate the slope for many columns. That is why I tried to define a function.



My solution is not working at all:



This is the code I tried, but it does not work. I am trying to get the first and last row for a given product and phase, take the temperature from each of them, and use the difference between the two to calculate the slope.
This is all I could come up with so far (I created another column called "Max_cylce_no", which stores the highest cycle number of each phase):



temp_at_start=-1

def slope(col_name):
    global temp_at_start
    start_cycle_no = 1
    if row["Cycle"]==1:
        temp_at_start =row["Temperature"]
        start_row = df.index(row)

    cycle_numbers = row["Max_cylce_no"]
    last_cycle_row = cycle_numbers + start_row

    last_temp = df.loc[last_cycle_row, "Temperature"]


And the way I would like to apply it:



df.apply(slope("Temperature"), axis=1)


Unfortunately, I get a NameError right away: name 'row' is not defined.
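Note on the error: the NameError most likely comes from the call itself rather than from apply. slope("Temperature") is executed immediately, before df.apply ever runs, and inside the function row is neither a parameter nor a global. The sketch below reproduces this on made-up data and shows one way a row-wise function can be handed to DataFrame.apply instead. The helper scaled_value and all values are hypothetical and only serve the illustration.

```python
import pandas as pd

# Made-up values; the column names follow the question.
df = pd.DataFrame({"Cycle": [1, 2, 3], "Temperature": [20.0, 23.0, 26.0]})

def slope(col_name):
    # `row` is neither a parameter nor a global, so this lookup fails.
    return row[col_name]

try:
    # slope("Temperature") runs right here, before df.apply is even called.
    df.apply(slope("Temperature"), axis=1)
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)  # name 'row' is not defined

def scaled_value(row, col_name):
    # Hypothetical row-wise function: apply passes each row as the first argument.
    return row[col_name] * 2

# Pass the function itself (no parentheses); extra arguments go through `args`.
doubled = df.apply(scaled_value, axis=1, args=("Temperature",))
print(doubled)
```

A row-wise apply still only sees one row at a time, so it is not a natural fit for a calculation that needs the first and last row of a group.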



Could you please help me and point me in the right direction on how to solve this problem? It is giving me a really hard time. :(



Thank you in advance!










  • Providing images as a source of data is not really helpful if we want to try our solutions. Can you provide the data as text?
    – Yuca
    Nov 10 at 19:26















python pandas dataframe






asked Nov 10 at 19:19









hunsnowboarder

1 Answer
accepted










I believe you need GroupBy.transform with a function that subtracts the first value from the last and divides by the length of the group:



f = lambda x: (x.iloc[-1] - x.iloc[0]) / len(x)
df['new'] = df.groupby(['Product_no','Phase_no'])['Temperature'].transform(f)
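To make the behaviour concrete, here is a minimal sketch on made-up data. The column names Product_no, Phase_no, Cycle and Temperature follow the question and the answer above. The values, the second parameter "Pressure", the helper name first_last_slope and the "_slope" suffix are assumptions for illustration only. Since the real dataset is said to have many parameter columns, the same transform is simply repeated per column.

```python
import pandas as pd

# Toy data: column names follow the question, all values are made up.
df = pd.DataFrame({
    "Product_no":  [1, 1, 1, 1, 1, 2, 2, 2],
    "Phase_no":    [1, 1, 1, 2, 2, 1, 1, 1],
    "Cycle":       [1, 2, 3, 1, 2, 1, 2, 3],
    "Temperature": [20.0, 23.0, 26.0, 30.0, 36.0, 10.0, 12.0, 13.0],
    "Pressure":    [1.0, 1.5, 4.0, 2.0, 2.0, 5.0, 4.0, 2.0],  # hypothetical parameter
})

def first_last_slope(x):
    """(last value - first value) / number of rows in the group."""
    return (x.iloc[-1] - x.iloc[0]) / len(x)

group_cols = ["Product_no", "Phase_no"]
value_cols = ["Temperature", "Pressure"]  # extend with further parameter columns

# One new column per parameter; transform broadcasts the group value to every row.
for col in value_cols:
    df[col + "_slope"] = df.groupby(group_cols)[col].transform(first_last_slope)

print(df)
```

transform returns a result aligned with the original rows, so every row of a (product, phase) group receives that group's value. The sketch assumes the rows are already ordered by cycle within each group, because iloc[0] and iloc[-1] depend on row order. The divisor len(x) is taken from the answer as given. If the slope is meant as the change per step between the first and last cycle, len(x) - 1 would be the divisor instead, and single-row groups would then need separate handling.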





  • Nice one, I believe this is the output OP required.
    – pygo
    Nov 10 at 19:49

  • You are amazing! Thank you so much! It works like a charm!
    – hunsnowboarder
    Nov 10 at 19:59

  • @hunsnowboarder - You are welcome!
    – jezrael
    Nov 10 at 19:59











answered Nov 10 at 19:27









jezrael
