Hatrack River Forum   

Topic: Was I being honest?
King of Men
Member
Member # 6684

I am a little worried.

For the past half-year or so, I have been working on a particular method of data analysis, intended to produce very large signal samples of good purity in a particular decay channel. The goal was to reach 800k events of 97% purity.

Now, as our full data set is very large, I have been testing things on the portion of it that is 'off-resonance', about one-tenth of the whole. From this it is of course easy to estimate the size of the signal in the full sample.

About two months ago, I had the analysis method working reasonably well, and I wanted to estimate the amount of data there would be in the full sample. Now this turned out to be a little trickier than my comments above would indicate, because it wasn't so easy to find, in the welter of information at the BaBar site, just how much of our data was on-resonance, how much was off-resonance, and how much I had in fact run over. But in the end, I did get an estimate - a hair under 800k at 97% purity. I told my professor this, and he was well pleased.

That was two months ago; since then, I have made some improvements, some of which should reduce bias, others increase yield. I have now run one-half the full data sample, and gotten out 350k events at 97% purity. Simple math indicates that the full sample will yield 700k.
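The "simple math" here can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual analysis code: the function name `extrapolate_yield` is made up, the scaling is assumed to be purely proportional to the fraction of data processed, and the uncertainty is a bare Poisson (counting) error that ignores purity and any systematic effects.

```python
import math


def extrapolate_yield(observed_events, fraction_processed):
    """Scale an observed event count up to the full sample.

    Hypothetical helper: assumes the yield is proportional to the
    fraction of data processed and uses a Poisson-only error.
    """
    if not 0.0 < fraction_processed <= 1.0:
        raise ValueError("fraction_processed must be in (0, 1]")
    scale = 1.0 / fraction_processed
    estimate = observed_events * scale
    # Counting error on the observed sample, scaled by the same factor.
    stat_error = math.sqrt(observed_events) * scale
    return estimate, stat_error


# Numbers from the post: 350k events from one-half of the data, goal of 800k.
estimate, stat_error = extrapolate_yield(350_000, 0.5)
goal = 800_000
shortfall = (goal - estimate) / goal
print(f"{estimate:.0f} +/- {stat_error:.0f} events, {shortfall:.1%} below goal")
```

Under those assumptions the counting error is tiny compared with the 100k gap, so the difference from the earlier estimate would have to come from something other than statistical fluctuation.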

Now, this is not a major difference. It certainly does not make or break the analysis. And there are some explanations for it: maybe the on-resonance data is a little dirtier than the off-resonance, so that the estimate was a little skewed. Or maybe my change to eliminate bias did exactly what it was supposed to, and the estimates were a little off because the method overestimated the purity. It is even possible that I entered a certain multiplier into the program that calculated the total expected yield, and forgot to change it when the data sample changed. (And yes, we do need an actual program for that, as we need estimates under several different conditions, and it's just easier to automate it.)
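One way to guard against a stale multiplier of that kind is to derive it from the sample sizes each time instead of typing it in. The sketch below is hypothetical throughout: the `Sample` class, the `expected_full_yield` function, and every number in it are placeholders for illustration, not the real program or real BaBar figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """A data sample described by its size (hypothetical structure)."""
    name: str
    luminosity: float  # integrated luminosity, in any consistent unit


def expected_full_yield(observed_events, processed, full):
    """Project the full-sample yield from the events seen in a subsample.

    The multiplier is computed from the two samples, so it changes
    automatically whenever the processed sample changes.
    """
    if processed.luminosity <= 0:
        raise ValueError("processed sample must have positive luminosity")
    if processed.luminosity > full.luminosity:
        raise ValueError("processed sample cannot exceed the full sample")
    multiplier = full.luminosity / processed.luminosity
    return observed_events * multiplier


# Placeholder values only: a test sample one-tenth the size of the whole.
test_sample = Sample("test subsample", 1.0)
full_sample = Sample("full data set", 10.0)
projected = expected_full_yield(1_000, test_sample, full_sample)
print(projected)
```

The point of the design is only that the scale factor has a single source, the sample descriptions, so rerunning on a different subsample cannot silently reuse the old factor.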

But I am a little uneasy in my mind. Did I really use the best possible honesty in making those estimates? Or did I, perhaps, tinker about, looking in various places in our website until I had a number agreeing with our goals for the analysis? At this distance in time, I cannot tell. I do recall frowning, at one point, and saying "That's odd, we expected a bit more. Let me run those numbers again." Perhaps I should have run them a third time, and not stopped when I had what we expected.

It is nothing earth-shattering. At the absolute worst, I am guilty of a little carelessness; at best, I have done nothing except remove a source of bias in my calculation, which happened to be skewing my numbers upwards. But it gives me a little insight into how easy it is to fool yourself, especially if you are working with methods on the edge of statistical significance. (Not the case here, and just as well, too!)

I am a little worried. But I shall do better in the future.

Posts: 10645 | Registered: Jul 2004
Kwea
Member
Member # 2199

You didn't lie, and you said that this was a possible outcome, even based on the previous data, right? And you let your boss/prof know about it, and haven't hidden anything from him, right?

Sounds like you are honest, not that that surprises me [Big Grin].

It is good that you are wondering about it though, and that you want to make sure you do better next time.

Kwea

Posts: 15082 | Registered: Jul 2001
Bob_Scopatz
Member
Member # 1227

Disclosure shall set you free. Seriously, if you took the new information to your boss, you've met the primary obligation of honesty in data analysis.

The fact that you've learned methods that are better for estimating the unbiased truth probably means that you could publish a paper on it all. It is something valuable.

And if, as you said, the difference isn't devastating to the project, I would imagine that your boss would welcome the news. If not that, at least knowing at the earliest possible moment would be better than finding out later on.

By the way, I have no freakin' idea what you're talking about.

Posts: 22497 | Registered: Sep 2000
King of Men
Member
Member # 6684

Well, yes, of course I told him! Not to do so would convert that little voice in the back of my head saying "Are you sure you did that right?" into a raging tempest. He was a little disappointed, naturally, but as he pointed out: "We've done this analysis once with 15k events at 95% purity. 700k or 800k, we are going to see a huge improvement." To be sure, I not only mentioned the possible causes for the discrepancy that I outlined above, I also pointed out some things I am doing to squeeze out the last drop of efficiency, which may put us over the 800k mark again.

I am certainly not in trouble over this, even at the informal level of my boss being displeased. I just wanted to get it off my chest, and see if putting it down in writing clarified my thinking.

Posts: 10645 | Registered: Jul 2004
aspectre
Member
Member # 2222

More importantly, not expressing the qualifications/uncertainties in one's own mind to those who will be reading/using the results would make one a poor scientist.

No respectable scientist expects perfection or impeccable genius in another scientist. What is expected is a willingness to work hard to solve problems, and honesty all the way to the "I may not have taken all factors into consideration" attitude of sharing any and all doubts about one's own work. The latter is an expression of confidence in other scientists' abilities.

The mutually shared trust of "Since I can't think my way around this, maybe you can. Or at least help point me in the right direction." is the second most important factor in doing science.

[ May 08, 2005, 06:44 AM: Message edited by: aspectre ]

Posts: 8501 | Registered: Jul 2001
Orson Scott Card
Administrator
Member # 209

This is what integrity and rigor look like in the real world. Thanks for posting.
Posts: 2005 | Registered: Jul 1999
   


Copyright © 2008 Hatrack River Enterprises Inc. All rights reserved.
Reproduction in whole or in part without permission is prohibited.

