## Using linear estimates for non-linear processes.

Anthropology, History, Psychology, Sociology and other related areas.


Some would say that using linear estimates of non-linear processes or data is always a problem. The slope of a line is constant; the slope of a non-linear curve varies; so, the argument goes, there will always be error.

These people are wrong.

The proper question is under what conditions can we use linear estimates of non-linear functions without introducing problematic errors.

So, I will lay out the logic. I could just paste a link to someone who does it in strict mathematical language, but past attempts on this site suggest that doesn't sink in for some readers. Instead, I will do as much as I can here without math: verbally, just as I would with an undergraduate student who asked me about this but did not have the math to dig it out on their own.

Suppose I am a highly rigorous biosociologist. That is an oxymoron, but let's pretend. I am inspired by three things: herd immunity, disease infection rates in a population, and "things that go viral" on the internet.

Disease infection rates in a population are non-linear. The rate of new infections increases until about 50% of the population is infected, then decreases as infection approaches 100%.

In terms of herd immunity, unvaccinated people gain protection from infection once a sufficiently large share of the population is vaccinated (around 90%, though the threshold depends on the disease).

Viral content ramps up suddenly, then falls off.

I decide to see if viral content follows a pattern based on these ideas. I propose that content slowly spreads until it hits 10%, then rapidly spreads to cover 90%, then tails off for the last 10% of the spread. I am not ready to model the tails, but the middle 80% of the process I think I have nailed down. I've looked at actual infection curve data, I've simulated the thing. Time to test!

As this is about estimation, assume that I define these numbers sufficiently well to be able to measure them in a reliable and valid way.

Now you can take out your pencil and paper.

Draw an X,Y plot. On the vertical axis, plot "% of spread of idea", from zero to 100.

On the horizontal, time. Zero to infinity, and we can use the unit of days, but let's just call the units (t) for time.

Will a linear estimate of such a function have problems?

Plot three points.
A = 10% at time t1.
B = 50% at time t2.
C = 90% at time t3.

First, draw a line through A, B, and C. This is a linear estimate of the curve. We could write it as an equation with a constant slope. Call this equation L; it describes the line.

We know for a fact that this line is a bad one for our data in many ways. Seems like it would be useless, right?

Let's call the non-linear equation, which looks like an infection in population curve, E.

Next, draw the curve E which we expect is the real one. That curve will:
Intersect L at points A, B, and C.
Be "under" the line L between points A and B.
Be "over" the line L between points B and C.
Be symmetrical around point B, so that the area between L and E from A to B equals the area between L and E from B to C.

I am sure we could come up with such an equation, more than one in fact, but the details do not matter.
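To make the construction concrete, here is a small sketch in Python. The logistic shape for E, and the specific times t1, t2, t3, are my own illustrative assumptions; any curve meeting the stated constraints would do.

```python
import numpy as np

# Illustrative sketch only: assume a logistic shape for E and pick
# arbitrary times for points A, B, C. Any curve meeting the stated
# constraints (through A, B, C; symmetric around B) would serve.
t1, t2, t3 = 1.0, 2.0, 3.0  # hypothetical times for A (10%), B (50%), C (90%)

# Solving 100 / (1 + exp(-k*(t1 - t2))) = 10 gives k = ln(9)/(t2 - t1).
k = np.log(9.0) / (t2 - t1)

def E(t):
    """The assumed non-linear spread curve."""
    return 100.0 / (1.0 + np.exp(-k * (t - t2)))

def L(t):
    """The straight line through A and C (and, by symmetry, through B)."""
    return 10.0 + (90.0 - 10.0) / (t3 - t1) * (t - t1)

# L and E agree exactly at A, B, and C...
for t in (t1, t2, t3):
    assert abs(E(t) - L(t)) < 1e-9

# ...and E is under L between A and B, over L between B and C.
assert E(1.5) < L(1.5) and E(2.5) > L(2.5)
```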

Clearly, if we use L to estimate E, it will not be a perfect fit at all points. There will be error.

Linear regression relies on its estimators being BLUE: Best Linear Unbiased Estimators. Are they that for E?

Well, we have error: the gaps between the curves. So "Best", which here means minimal error, is violated.

Is the error biased? Yes, but with exceptions. L overestimates E between A and B, and underestimates E between B and C.

But over the whole range from A to C, the error is UNbiased. And in fact, the error in using L to estimate E for data at points A and C only is... zero.

So, assuming E, we can use L to fit the data at 10% and 90% ONLY in a way that introduces zero error from the estimation technique.

Why might we do this? If gathering data is costly in time, money, or labor we might not be able to get 10,000 data points, or even 100, but we could get 2.

If we do have more data points, can we still use L? What if we have 10,000 data points between A and C?

If we use only part of the data between A and C (say, only the points between A and B), we will have biased errors in the estimate. Bad. But so long as we use all the data between A and C, the estimator is unbiased. We will still have error (the area between the curves), but since the overestimation equals the underestimation over AC, no bias is introduced by the estimation. Maybe that error is not large enough to be a problem. But gosh darn it, we don't want that error there at all! Are we doomed? Do we have to abandon linear estimation techniques?
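A quick numerical check of that claim, again assuming an illustrative logistic shape for E (my assumption, one of many curves satisfying the constraints): the error of L is one-sided on the half-range AB but cancels over the full, symmetric range AC.

```python
import numpy as np

# Same illustrative assumptions: logistic E and line L through
# A = (1, 10), B = (2, 50), C = (3, 90).
t1, t2, t3 = 1.0, 2.0, 3.0
k = np.log(9.0) / (t2 - t1)
E = lambda t: 100.0 / (1.0 + np.exp(-k * (t - t2)))
L = lambda t: 10.0 + 40.0 * (t - t1)

t_ab = np.linspace(t1, t2, 5001)   # data from the A-B half-range only
t_ac = np.linspace(t1, t3, 5001)   # data over the whole range A to C

err_ab = L(t_ab) - E(t_ab)         # one-sided: L overestimates E here
err_ac = L(t_ac) - E(t_ac)         # over- and under-estimates cancel

assert err_ab.min() > -1e-9        # every error on AB has the same sign
assert err_ab.mean() > 1.0         # so the mean error is clearly biased
assert abs(err_ac.mean()) < 1e-6   # but over the full range the bias vanishes
```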

Nope.

Let TF be a transformation function, such that for all values of E, TF results in a point on L.

Run our data, which we expect to fit E, through TF. Call the results EL (E transformed to a line).

Do regression using EL.

To the extent that TF works, we have turned data that perfectly fits E into data that perfectly fits L. We have lost no information in the transformation. Data that perfectly fits E will perfectly fit L; data that varies from E will vary from L.

So using L to estimate E via TF results in BLUE estimators in the EL model.

Now, things could be more complicated. If E is not symmetrical around B, then we can't assume unbiased error as easily for AC. Our transformation function might be more complicated. But in any case, if you can come up with TF for non-linear data, you can use linear regression on the transformed data without introducing error in the estimation.
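For the logistic shape assumed in my sketches (again, my illustration, not the only possibility), TF is the classic logit transform: it maps E exactly onto a straight line, after which plain linear regression recovers the curve's parameters.

```python
import numpy as np

# Assume data that perfectly fits a logistic E with midpoint t2 and
# steepness k (hypothetical values); TF is then the logit transform.
t2, k = 2.0, np.log(9.0)
t = np.linspace(1.05, 2.95, 50)
y = 100.0 / (1.0 + np.exp(-k * (t - t2)))   # "perfectly E" data

def TF(y):
    """Logit transform: maps the logistic curve onto the line k*(t - t2)."""
    p = y / 100.0
    return np.log(p / (1.0 - p))

yl = TF(y)                                  # EL: E transformed to a line
slope, intercept = np.polyfit(t, yl, 1)     # plain linear regression on EL

assert abs(slope - k) < 1e-9                # recovers the steepness k
assert abs(-intercept / slope - t2) < 1e-9  # recovers the midpoint t2
```

If the data only approximately fit E, the same regression still applies; the residuals then live on the transformed scale, which is exactly the point neuro raises later about error structure.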

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Using linear estimates for non-linear processes.

Ursa Minimus,

Yup, you're right! We definitely can do non-linear regression using non-linear methods. I'm hoping that you realize that the overall regression is non-linear once you introduce non-linear transforms, right?

So out of curiosity, how do you normally do non-linear regressions? You don't guess transformation functions manually as a hack to keep using the same old linear software, do you? The problem with that approach is that it decouples the non-linear parameters from the rest of the problem, restricting the attainable fits to the subset consistent with whatever non-linear form you happened to guess. In other words, guessing transforms makes your fits worse. In addition to being less accurate, it's also slower and less convenient.

As a side note, this approach also debases your statistical analysis. Dunno if you're the type to do statistical analysis, but if you do, you're screwing yourself by not considering the consequences of non-linear transforms on the error distributions that linear regression software assumes.

Basically you're doing more work for less accuracy and less statistical validity, all just to keep chugging away on LINEST-type software. That's why I'm such a huge critic of this junk.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009

### Re: Using linear estimates for non-linear processes.

Natural ChemE » October 8th, 2015, 6:09 am wrote:Ursa Minimus,

Yup, you're right! We definitely can do non-linear regression using non-linear methods.

...

That's why I'm such a huge critic of this junk.

I have shown how linear estimation may be used to model non-linear data without introducing error. I could provide more detail, and more math. History has shown me that such is wasted on some on this site. Thus my decision to go this route, so that people in the future might gain some benefit, even if they don't have a lot of math.

Criticisms of linear estimations as "junk" because they oversimplify and introduce error are... in error.

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Using linear estimates for non-linear processes.

Ursa Minimus » October 8th, 2015, 11:33 am wrote:I have shown how linear estimation may be used to model non-linear data without introducing error. I could provide more detail, and more math. History has shown me that such is wasted on some on this site. Thus my decision to go this route, so that people in the future might gain some benefit, even if they don't have a lot of math.

Criticisms of linear estimations as "junk" because they oversimplify and introduce error are... in error.

So, say a correlation is perfectly fit by $f(x)=x^{2.25}$. But, you can't figure that out because you refuse to use non-linear analysis explicitly; all you see is that $x$ looks like a curve when you plot it. In fact, it looks a lot like $f(x)=x^2$, so you decide to use $\hat x{\equiv}x^2$ as your transform, then you proceed with linear regressions using $\hat x$. How do you not see reporting $2$ instead of $2.25$ as an error? Don't you care that $x$'s contribution is then messed up, causing the rest of your fit parameters to also be wrong?
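To make this concrete with a sketch (my own toy numbers, noiseless $y=x^{2.25}$): fit once through the guessed transform $\hat x = x^2$, and once by directly searching for the exponent, a bare-bones stand-in for what any non-linear regression routine does.

```python
import numpy as np

# Toy data for the example above: y = x^2.25 exactly, no noise.
x = np.linspace(0.5, 10.0, 100)
y = x ** 2.25

# (1) The guessed transform x_hat = x^2, followed by a linear fit
#     y = b * x_hat through the origin (least-squares slope).
b = np.sum(y * x**2) / np.sum(x**4)
guess_resid = np.sum((y - b * x**2) ** 2)

# (2) A direct non-linear fit: brute-force search for the exponent p
#     minimizing the sum of squared errors.
ps = np.linspace(1.5, 3.0, 30001)
sse = [np.sum((y - x ** q) ** 2) for q in ps]
p = ps[int(np.argmin(sse))]
direct_resid = min(sse)

assert abs(p - 2.25) < 1e-3        # the direct fit recovers 2.25
assert guess_resid > direct_resid  # the guessed x^2 transform fits worse
```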

What about your statistical analysis? Do you understand that, when you transform variables in non-linear ways, you also transform their sampling error in non-linear ways?

What really gets me is that better analysis is really easy and wouldn't require any more work on your part. So, why be lazy about it? Just have software do the non-linear regression for you. It'll be easier, more accurate, find more significant results, and avoid the logical fallacies that fall out of your messed up statistical analyses.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009

### Re: Using linear estimates for non-linear processes.

Just to be clear, I'm using a completely trivial example like $x^{2.25}$ vs. $x^{2}$ as a gentle intro to the problems with your analysis methods. In general, when you start working with many variables in multiple regressions, the problems grow exponentially.

There's a reason that high $R^2$ values are suspect in social science: because, if you do linear analysis correctly, you shouldn't even be able to arrive at good correlations. It's just a messed up approach used in the absence of mathematical understanding.

And, again: it's okay if you don't know enough math to do non-linear stuff. Heck, I doubt you can even do the linear stuff by hand, right? To you it's all just software either way, so why stick with the old-fashioned software that gives you much worse results? It's especially bad when the statistical analysis is itself flawed: you can't see how bad your results are if you don't know how bad your analysis is.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009

### Re: Using linear estimates for non-linear processes.

Natural ChemE » October 8th, 2015, 10:42 am wrote:So, say a correlation is perfectly fit by $f(x)=x^{2.25}$.

...

What really gets me is that better analysis is really easy and wouldn't require any more work on your part.

I am no longer treating your comments as worth my time. I am tired of your insults.

Have a nice day.

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Using linear estimates for non-linear processes.

Alrighties. Well, as a final note, if you ever want to try a better approach, I'd suggest looking into Risk Solver Platform. It's super easy and will fix your problem for you really quickly.

It'll save you time, make your job easier, provide better results, and avoid many of the statistical mistakes in your current approach. It's made for business execs, so you really can pick it up pretty quickly and easily without having to take a math class or anything.

And I'm being serious man. This isn't about a pissing contest on the internet. I truly believe that Risk Solver Platform, or something like it, will make your life better. Software like this is relatively new, so I get that it isn't well-established yet. But it'll track your distributions and such rigorously, perform optimizations, etc. It's a pleasure to use, and it really would make your research better. It integrates right into Excel, though it also has COM APIs if you prefer to make your own programs.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009

### Re: Using linear estimates for non-linear processes.

Let's just say that if we are not using lots of equations, we are really not dealing with stats in detail. So the OP in this thread is akin to using an axe, not a scalpel. But we should remember that the original definition of "hacker" is "one who builds furniture with an axe". So in the right hands, hands with lots of skill, an axe can do some pretty fine things.

If anyone is actually thinking of using linear regression for non-linear data, or time series, or anything like that, they should first do a search on problems using linear regression for TOPIC. This will easily pull up relevant mathematical treatments, problems with using linear regression for the topic, how to fix some of those problems, and how some of those problem fixes create other problems, and perhaps even how some problems cannot be fixed. Anyone with some math skills can do this, and copy and paste the reasoning. The statistical reasoning. For the general case.

However, you will note that I specified a very specific case. A case that relies on methodological control. And if you do the search above, you are very unlikely to find anything that addresses methods as they affect testing a hypothesis. If you know math very well, but know little about research in the social sciences, you are unlikely to think of how methods and statistics interact in the testing of hypotheses in the social sciences.

Further, if you are unaware that explaining 30-40% of the variance of a dependent variable is likely to be considered a STRONG test of the theory being tested, and may well extend the level of prediction in the field (depending on the area), then you might not have appropriate standards in place for judging the quality of such a model within the field of knowledge the researcher inhabits.

In non-technical terms:

Theory: what we think is the case.
Methods: how we gather data and set it up to test the case.
Statistics: how we decide what we can say about the case, and to what degree of accuracy.

Problems with statistical estimation are methodological problems to the extent that they interfere with our ability to test the case.

Now, look at L and E again. Note that I specified a method where, if we only use data at three time points, both L and E predict the same results: the line and the curve intersect at those three points. If I use 100 cases of "going viral", I get 100 data points at each of A, B, and C. And that PARTICULAR data is expected, by both L and E, to have a strong linear relationship.

However, methodologically, by doing that I have stripped out information from all other time points in the process, the process that is assumed in E. And I can't do ANYTHING to speak to the rate of change of the slope of the curve over time. That information is no longer present in the data. So if I do use non-linear techniques, which are designed to capture such information, I will find NOTHING relating to that information. There is nothing to find in that limited data.

But what can I say? Can I speak to how well L fits E in the general case? No. But can I say that if L produces poor fit, E is not very likely to be the case? Yes.

IOW, I can do a test of the theory, and reject a null hypothesis. Failure of L gives me good reason to believe that E would fail as well.

So, why does so much social science research rely on linear regression? Part of the reason is that it is taught in the first stats class in many fields (often with an econometrics textbook), and some at weaker programs, qualitative research programs, and MA/MS programs take no more stats than that. So they know linear regression, and can follow it. They have one tool, and use it for whatever they can. With higher or lower attention to the issues in doing so. But they can do a search, and figure things out from an existing base of knowledge.

But, here's the thing. Most every single stats topic AFTER that starts the same way. Book, monograph, paper, class lectures... the same way. From my classes on categorical data analysis to time series analysis, the first lecture started:

"If you use linear regression for this, this is how it goes wrong. If you try to correct, here is what happens. So to avoid all that, we use TECHNIQUE for this type of analysis."

Those at top schools get this, as they tend to require multiple stats classes and not just one. If you look at the higher level journals, you will see that people get this.

And point of fact, I get it too. I don't try to use linear regression for everything, or even "when I can". I let what I need to test, and the nature of the data I have, determine the technique I use.

For example, social scientists are often called on to analyze data that has already been collected. In this case, we are limited in how we can exert methodological control. This is common in evaluation research.

I was recently contacted by a health researcher, who asked me how to do some case selection on data on a health intervention. That data was mostly biomedical measures like blood pressure. I suggested she construct some dummy variables (0,1) and then use those to sort cases. NOT use them to do analysis, but just sort cases. When she did what I said, she ended up with zero cases in her models. The reason is that she did not set the dummy variables = 0 at the start, so she ended up with some 1s and missing data, but no 0s. That was just an error in practical sequencing of setting up the dummies, which I only saw when she sent me the actual dataset.

After further conversation, it became apparent that she wanted to use linear regression to predict how long people stayed in the intervention program. A good assessment goal. But there are problems. Blood pressure over time is highly intercorrelated, which is bad for linear regression. Could she use change scores instead of BP numbers? Maybe, I would expect that changes in BP (+10, -10) are less correlated over time than BP (170, 160).
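A toy simulation of that expectation (entirely made-up numbers, just to illustrate the intuition): blood pressure levels that persist from visit to visit stay highly correlated, while the visit-to-visit change scores do not.

```python
import numpy as np

# Hypothetical BP trajectories: each visit's level carries over from the
# last plus a small random change (illustrative, not real data).
rng = np.random.default_rng(0)
n = 500
bp_t1 = rng.normal(150, 15, n)       # baseline systolic BP
bp_t2 = bp_t1 + rng.normal(0, 5, n)  # visit 2: level carries over
bp_t3 = bp_t2 + rng.normal(0, 5, n)  # visit 3: same again

r_levels = np.corrcoef(bp_t2, bp_t3)[0, 1]   # raw levels across visits
d12, d23 = bp_t2 - bp_t1, bp_t3 - bp_t2      # change scores (+10, -10, ...)
r_changes = np.corrcoef(d12, d23)[0, 1]      # changes across visits

assert r_levels > 0.9          # levels: strongly intercorrelated
assert abs(r_changes) < 0.15   # changes: nearly uncorrelated here
```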

But I did not even suggest that. I suggested logistic regression (https://www3.nd.edu/~rwilliam/stats2/l81.pdf). By using this, a dependent variable of "still in the program" can be constructed. Those who stayed from time 1 to time 2 are coded 1; those who dropped are coded 0. We can then determine the likelihood of staying in the program (compared to dropping out) due to other factors at time 1 included as independent variables in the model.
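A minimal sketch of that setup with made-up data (the covariate, coefficients, and sample size are all invented for illustration), fitting the logistic model by plain gradient ascent rather than any particular stats package:

```python
import numpy as np

# Made-up evaluation data: did a participant stay in the program from
# time 1 to time 2 (1) or drop out (0), given a time-1 factor x?
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(0.0, 1.0, n)                      # hypothetical time-1 factor
b0_true, b1_true = 0.5, 1.5                      # invented "true" effects
p_stay = 1.0 / (1.0 + np.exp(-(b0_true + b1_true * x)))
stayed = (rng.random(n) < p_stay).astype(float)  # 1 = stayed, 0 = dropped

# Fit logistic regression by maximizing the log-likelihood with
# plain gradient ascent (a bare-bones stand-in for real software).
X = np.column_stack([np.ones(n), x])
b = np.zeros(2)
for _ in range(8000):
    mu = 1.0 / (1.0 + np.exp(-(X @ b)))
    b += 0.1 * (X.T @ (stayed - mu)) / n         # mean log-likelihood gradient

assert abs(b[0] - b0_true) < 0.3  # intercept recovered, up to sampling noise
assert abs(b[1] - b1_true) < 0.3  # slope recovered likewise
```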

In this case the health researcher had actually used logistic regression before, but for whatever reason did not think to use it in this situation. Given her problems coding dummy variables, I assume part of the reason was a lack of experience in research methods in general and in working with stats programs in particular, not a gap in her statistical knowledge in the mathematical sense. If she had designed the evaluation from the start, she might have used logistic regression from the start. But facing variables that were not dichotomous in her data, and a general tendency to AVOID destroying information in the data during analysis by dropping to a lower level of measurement, she got locked in.

Sometimes less is more.

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Using linear estimates for non-linear processes.

A simple case of where statistical problems limit the utility of linear regression, but not in all ways:

In statistics, multicollinearity (also collinearity) is a phenomenon in which two or more predictor variables in a multiple regression model are highly correlated, meaning that one can be linearly predicted from the others with a substantial degree of accuracy. In this situation the coefficient estimates of the multiple regression may change erratically in response to small changes in the model or the data. Multicollinearity does not reduce the predictive power or reliability of the model as a whole, at least within the sample data set; it only affects calculations regarding individual predictors. That is, a multiple regression model with correlated predictors can indicate how well the entire bundle of predictors predicts the outcome variable, but it may not give valid results about any individual predictor, or about which predictors are redundant with respect to others.

https://en.wikipedia.org/wiki/Multicollinearity
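A small numerical illustration of the quoted passage (synthetic data; the near-duplicate predictor is deliberate): the variance inflation is enormous and the individual coefficients jump around under a tiny perturbation, while the whole-model fit barely moves.

```python
import numpy as np

# Synthetic data with two nearly collinear predictors.
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(0.0, 1.0, n)
x2 = x1 + rng.normal(0.0, 0.01, n)   # almost an exact copy of x1
y = x1 + x2 + rng.normal(0.0, 1.0, n)

X = np.column_stack([np.ones(n), x1, x2])

def fit(y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean())**2)
    return beta, r2

beta_a, r2_a = fit(y)
beta_b, r2_b = fit(y + rng.normal(0.0, 0.05, n))   # tiny perturbation of y

# Variance inflation factor for x1 given x2: huge under collinearity.
r = np.corrcoef(x1, x2)[0, 1]
vif = 1.0 / (1.0 - r**2)

coef_shift = np.abs(beta_a[1:] - beta_b[1:]).max()
assert vif > 1000                      # predictors nearly redundant
assert abs(r2_a - r2_b) < 0.01         # whole-model fit is stable...
assert coef_shift > abs(r2_a - r2_b)   # ...individual coefficients are not
```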

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Using linear estimates for non-linear processes.

BioWizard » Fri Oct 09, 2015 12:50 pm wrote:mtb, and that's why I said in the other thread (and will repeat here):

If we can't agree on what constitutes domain-specific knowledge and what doesn't, then this thread will remain sterile. Or more precisely, whether someone without domain-specific knowledge can evaluate/contribute constructively to the non-domain specific components of the methodology of someone with domain-specific knowledge (of course the guidance of the domain-specific expert remains critical, which was the point of my little story - by the way).

Method is not limited to math, correct? What else is involved in social science? What else is involved in biochemistry?

mtbturtle
Banned User

Posts: 10229
Joined: 16 Dec 2005

### Re: Using linear estimates for non-linear processes.

I believe that NaturalC didn't exactly get what Ursa's point is.

1) I have recently had a strong argument with reviewers because, in the face of cellular physiological data (ion channel and synaptic behaviors and drug actions), we observed clear correlations between a factor and an observable (membrane potential vs. drug concentration, synaptic activity vs. cell membrane potential), and such correlations were reasonably linear over a certain range. So we reported the angular coefficient (slope) as a relevant factor. The reviewer was very disturbed by our using linear correlation, but:
(a) the existence itself of a correlation is best detected by Pearson's correlation coefficient, which is actually based on linear analysis of variance, but is the most efficient unbiased approach in the absence of any indication about the real nature of the correlation;
(b) although the relations certainly are not linear except over a limited range (we observe saturation and strong departures from linearity outside the most interesting range of observation), any non-linear approach would require either previous knowledge of the analytical relation or arbitrary assumptions about its shape;
(c) the slope that one obtains from linear regression has the valuable nature of a "gain" factor in the transduction (at least over the range where the linear relation is reasonably tenable), whereas non-parametric analyses would not be able to give any such information, and non-linear analysis might be affected by strong errors if the assumed relation is not appropriate.
In other words, keeping things simple avoids too many assumptions and can yield important useful information that cannot be obtained in other ways.

2) Problems of statistical analysis of the data, and the effects of non-linearity on errors, are not so terrible as NaturalC seems to suggest.
If a relation is not linear, the structure of data variability will presumably be heterogeneous over the curve; however, given many observations over the whole range of interest, the probability distribution of the errors can be estimated at each point and used in subsequent statistical analyses.
In particular, it seems to me that the transformation suggested by Ursa is simply a variable multiplier, as if the data were observed through a deforming lens that widens the data in some regions and compresses them in others. Under these conditions the transformation is expected to multiply the errors by the same factor as well (and to increase the variances by the same factor squared, at each point of the curve). So, if the data are well behaved (normally distributed errors), one simply has to weight the data appropriately in computing error functions. Even if the data are badly behaved, the transformation does not introduce into the statistical analysis those terrible complications NaturalC seems to be referring to: errors will simply be multiplied by the factors used to linearize the function.
In other words, the linearization of the non-linear relation proposed by Ursa is simply a LINEAR TRANSFORMATION (each Y point at a certain X value multiplied by the fixed coefficient associated with that X value), and subsequent statistical analyses are made only very slightly more complex by this.

3) Consider Natural's example y = x^2.25.
If I got it right, Ursa's transformation would not be from x^2.25 to x^2.
Let's consider a numerical example: let x move from 0 to 10; y will move from 0 to 177.8. You simply take the slope (= 12.5604; the intercept is 0 in this case). Next, you draw your fit line (y' = 12.5604·x); then you compute the ratio data/fit for each x to get your "transformation". This "transformation" (obviously equal to x^1.25 / 12.56) nicely gives back a "non-linear fit" to your data from the simple "linear fit". The same "transformation" will be applied to the errors as appropriate (e.g., if errors in Y are thought to be mostly measurement errors, you consider the departures of the measured Y's from the "non-linear fit"; if errors mostly arise from data variability but you have reason to believe that X determines the location of Y but not its dispersion, you do the same; if you have reason to believe that X influences not only the location of Y but also its dispersion, e.g. the fractional error tends to be constant, then you weight the errors accordingly).
The simple conclusion is that analysing errors is always a problem; if you know the structure of the errors, then linear or non-linear fitting does not make any difference. If you have to guess at the structure of the errors, then you risk claiming bullshit in any case.

Sorry if I bored everybody.
I've been away for a while and I had to compensate :°)

neuro
Forum Moderator

Posts: 2631
Joined: 25 Jun 2010
Location: italy

### Re: Using linear estimates for non-linear processes.

Hey neuro. Out of curiosity, over what concentration range is your relationship linear or able to fit your model? And what happens outside that range?

BioWizard

Posts: 12761
Joined: 24 Mar 2005
Location: United States
Blog: View Blog (3)

### Re: Using linear estimates for non-linear processes.

neuro » October 23rd, 2015, 2:15 am wrote:In other words, keeping things simple avoids too many assumptions and can yield important useful information that cannot be obtained in other ways.

...

In other words, the linearization of the non-linear relation proposed by Ursa is simply a LINEAR TRANSFORMATION (each Y point at a certain X value multiplied by the fixed coefficient associated with that X value), and subsequent statistical analyses are made only very slightly more complex by this.

...

The simple conclusion is that analysing errors is always a problem; if you know the structure of the errors, then linear or non-linear fitting does not make any difference. If you have to guess at the structure of the errors, then you risk claiming bullshit in any case.

Neuro,

Well said, and good examples, from my quick skim.

There are sometimes good reasons to deviate from the "textbook", when the focus is answering a question with the data. To do that well, one must know the textbook, and understand the implications of violating the textbook when conducting and interpreting the analysis that results. In my experience, what leads people to do this is usually having data that they can NOT treat as they would like, so they have to be creative. But sometimes the data cries out for such a method, as in your example 1) where it just made sense to go that route. Looking at a scatter plot never hurts early on, so long as you don't just stay at the "eyeball" stage of analysis.

As a practical matter, this will almost always create problems with the review process, as in your example 1). But if you have your reasons down, in detail, most reviewers will take the time to consider your choices in detail. And if your reasoning is sound, they should treat it as such.

Not that they always will of course. ;)

Ursa Minimus
Member

Posts: 605
Joined: 05 Feb 2012
Location: Northwoods, USA

### Re: Using linear estimates for non-linear processes.

Ursa Minimus » October 23rd, 2015, 1:52 pm wrote:most reviewers will take the time to consider your choices in detail. And if your reasoning is sound, they should treat it as such.

Not that they always will of course. ;)

They did this time, though the battle was tough!

BioWizard wrote:Hey neuro. Out of curiosity, over what concentration range is your relationship linear or able to fit your model? And what happens outside that range?

well, I would not get too much into the details.
The point was: we use a drug (TEA) that blocks potassium channels. We show, both theoretically and experimentally, that a given concentration of TEA (within a certain range) produces an almost fixed (SMALL) shift of membrane potential over the range -45 to -15 mV (which is the range of membrane potential we were interested in, for our specific synaptic system), and that at each value of (presynaptic) membrane potential synaptic activity can be considered as LOCALLY linear with small membrane potential changes.
The nonlinear aspects were:
1) dose-effect curves for TEA obviously are not linear, but - as you certainly know to be the case - for concentrations well below saturation the relation is reasonably linear, so that a [TEA concentration to membrane potential change] conversion factor (slope, gain) could be estimated (on this limited range of TEA concentrations).
2) the voltage to synaptic activity relation is not linear. However, the slope (linear regression) of synaptic activity on TEA concentration (i.e over a SMALL change in membrane potential) can be estimated as a gain factor of the system. Since our system is a sensory system (semicircular canal of the inner ear labyrinth), one can see how such gain (membrane potential --> synaptic activity) changes depending on the momentary value of presynaptic potential (which is determined by the phase in the mechanical stimulation of the labyrinth)

Sorry, I realize this is quite technical and possibly quite unclear.
The point is we were interested in defining the "transfer function" of the sensory system, and estimating gain is an important part of it, although such gain often turns out to be variable...

neuro
Forum Moderator

Posts: 2631
Joined: 25 Jun 2010
Location: italy

### Re: Using linear estimates for non-linear processes.

neuro » 23 Oct 2015 10:13 am wrote:well, I would not get too much into the details.

...

Sorry, I realize this is quite technical and possibly quite unclear.
The point is we were interested in defining the "transfer function" of the sensory system, and estimating gain is an important part of it, although such gain often turns out to be variable...

Neuro, I thought it was sufficiently clear and not too technical. Actually, it was exactly the kind of information I was looking for.

neuro wrote:The point was: we use a drug (TEA) that blocks potassium channels. We show, both theoretically and experimentally, that a given concentration of TEA (within a certain range) produces an almost fixed (SMALL) shift of membrane potential over the range -45 to -15 mV (which is the range of membrane potential we were interested in, for our specific synaptic system), and that at each value of (presynaptic) membrane potential synaptic activity can be considered as LOCALLY linear with small membrane potential changes.
The nonlinear aspects were:
1) dose-effect curves for TEA obviously are not linear, but - as you certainly know to be the case - for concentrations well below saturation the relation is reasonably linear, so that a [TEA concentration to membrane potential change] conversion factor (slope, gain) could be estimated (on this limited range of TEA concentrations).

The approximations you mention are certainly frequently used for in vitro biochemical assays with purified protein at pre-saturation conditions and with direct biochemical readout of protein activity. When you're letting those interactions occur inside whole cells, however, and modeling a global "phenotypic" or "functional" readout that is the contribution of a large number of cellular components, the situation isn't nearly as easily approximated. Granted, you can still fit a linear regression over a tight range in many cases. But I would say it's almost atypical to model cell responses to drugs using linear regression (which could be why it didn't immediately sit well with the reviewer). I'm more used to seeing non-linear models for those kinds of responses, and I see that they almost always fit my cell-response data better (I like to run assays over at least a 3-log concentration range, and typically 6-log).

You probably already know that TEA can bind to and interact with a lot of other stuff, in rather complex ways.

neuro wrote:2) the voltage to synaptic activity relation is not linear. However, the slope (linear regression) of synaptic activity on TEA concentration (i.e over a SMALL change in membrane potential) can be estimated as a gain factor of the system. Since our system is a sensory system (semicircular canal of the inner ear labyrinth), one can see how such gain (membrane potential --> synaptic activity) changes depending on the momentary value of presynaptic potential (which is determined by the phase in the mechanical stimulation of the labyrinth)

As you said, these approximations are probably OK over what you called "a certain range". And clearly you were able to successfully argue that point with the reviewers, so congrats on that. However, I'm sure you're aware of the limitations this places on the explanatory and predictive power of your model as far as drug-target interactions go, and of the room it leaves for future improvements. As long as that's recognized, I don't think the reviewers should have a fit over it.

Congrats again on getting the paper accepted :]

BioWizard

Posts: 12761
Joined: 24 Mar 2005
Location: United States
Blog: View Blog (3)

### Re: Using linear estimates for non-linear processes.

Bio, the point is our interest was not in the effect of the drug.
We used TEA at low concentrations to change the electrical properties of our cells (which we could not do by patch clamping them, due to the structure of the preparation) in such a way that we could relate small changes in membrane potential with changes in synaptic activity.

We did not wish to define a dose-effect curve for TEA (we should have explored a much larger range, and we would have produced effects that we would not have been able to measure, in addition to obvious saturation).

Actually, the quite interesting point was that the block by TEA was virtually independent of voltage (i.e. the same fraction of channels was blocked by a fixed concentration of TEA at all values of membrane potential, from -70 to -10 mV) and curiously enough such fixed fractional block would produce an almost invariant shift in membrane potential over the range -40 to -15 mV (the range we were interested in). This is an unexpected result, at first sight, because the shift in voltage produced by a change in conductance is expected to vary, according to the varying driving force for the permeant ion (K+ in our case); but the overall bioelectrical properties of these hair cells generated such a behavior (which was also predicted by modeling the set of conductances expressed by these cells).

neuro
Forum Moderator

Posts: 2631
Joined: 25 Jun 2010
Location: italy

### Re: Using linear estimates for non-linear processes.

neuro » 23 Oct 2015 12:38 pm wrote:Actually, the quite interesting point was that the block by TEA was virtually independent of voltage (i.e. the same fraction of channels was blocked by a fixed concentration of TEA at all values of membrane potential, from -70 to -10 mV) and curiously enough such fixed fractional block would produce an almost invariant shift in membrane potential over the range -40 to -15 mV (the range we were interested in). This is an unexpected result, at first sight, because the shift in voltage produced by a change in conductance is expected to vary, according to the varying driving force for the permeant ion (K+ in our case); but the overall bioelectrical properties of these hair cells generated such a behavior (which was also predicted by modeling the set of conductances expressed by these cells).

I don't know if it's because my head is in the review I'm writing, or something else, but I'm not currently able to fully process what this paragraph is saying. How did you know that the fraction of blocked channels is the same at all membrane potentials?

BioWizard

Posts: 12761
Joined: 24 Mar 2005
Location: United States
Blog: View Blog (3)

### Re: Using linear estimates for non-linear processes.

neuro,

Overall I'd comment that you're basically constructing a primitive non-linear regression by building on linear regression. It suffers from being disjoint, but it does still enjoy some of the benefits of non-linear analysis because it is a non-linear analysis.

Regarding error analysis, I'd point out that non-linear transforms really mess with error distributions. For example, if the transform is as simple as $\hat{x} \equiv x^2$ for $x \in [0, 10]$, then sampling error in $\hat{x}$ at $x = 0$ is relatively small, right? And then much larger at $x = 10$? But using common tools and methods, what exactly would we say the error in $\hat{x}$ is at $x = 0$? Some might argue it's zero, since we're multiplying the error in $x$ at $x = 0$ by $0$, but this is one of those cases where assuming distributions follow their mean is qualitatively wrong (as opposed to merely quantitatively wrong, as in this example). The problem is exacerbated because many researchers who do use this sort of approach also scale their variables to lie between $0$ and $1$, forcing all of their work into the region where the problem is sharpest. And all of this gets still worse because we're straining our assumption that error is normally distributed, which we know to be a rough approximation even in optimistic cases, by continuously extending the system as though it were exact, allowing the inaccuracy of that assumption to do ever more damage.
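To make that concrete, here's a quick Monte Carlo sketch (noise level and sample count are made up; the "naive" figure is the standard linearized propagation, $2x\sigma$):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.1  # assumed Gaussian measurement noise in x

def empirical_sd(x, n=100_000):
    """Monte Carlo spread of x^2 when x carries Gaussian noise."""
    return ((x + rng.normal(0.0, sigma, n)) ** 2).std()

naive_at_0 = 2 * 0.0 * sigma    # linearized propagation: d(x^2)/dx * sigma
naive_at_10 = 2 * 10.0 * sigma

print(naive_at_0, empirical_sd(0.0))    # naive says 0; the real spread isn't
print(naive_at_10, empirical_sd(10.0))  # here linearization works fine
```

At $x = 0$ the linearized error is exactly zero while the sampled spread is not (it follows a scaled chi-square, not a normal), which is the qualitative failure described above; at $x = 10$ the two agree closely.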

Regarding correlations,
neuro » October 23rd, 2015, 3:15 am wrote:(a) the existence itself of a correlation is best detected by Pearson's correlation coefficient
I'd point out that the Pearson correlation coefficient can be zero even when patterns are obvious:
There's a lot of information that this sort of analysis is blind to. Linear regression pretty much misses periodic behavior, e.g. where the clock hand is over time, since linear models are fundamentally unable to capture that sort of thing. And when regression techniques fundamentally can't pick up on periodic behavior, it really hampers systems analysis.
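A two-line demonstration with toy data: a perfect, deterministic parabola has (essentially) zero Pearson correlation.

```python
import numpy as np

x = np.linspace(-1, 1, 201)
y = x ** 2  # an obvious, perfectly deterministic pattern

r = np.corrcoef(x, y)[0, 1]
print(r)  # ~0: Pearson's r only measures *linear* association
```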

Regarding structure,
neuro » October 23rd, 2015, 3:15 am wrote:(b) although the relations certainly are not linear except for a limited range (we observe saturation and strong departures from linearity out of the most interesting range of observation) any nonlinear approach would require either a previous knowledge of the analytical relation or arbitrary assumptions about its shape;
I'd encourage you to see linear patterns as arbitrary too. The unreasonable effectiveness of math comes from maximizing entropy against the system's constraints; the rest is implementation details.

It's okay to use linear regressions in some contexts; in fact, they can even be unnecessarily general in some cases, e.g. estimating a constant's value. Just, there's no good reason to be limited to linear regression in the general case, nor are common implementations well-done.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009
 BioWizard liked this post

### Re: Using linear estimates for non-linear processes.

This article might be a neat read:
In Point #4, the author links his own post, which starts out by explaining some reasons that linear regression is so bad despite fixes:
And also on improvements to linear regression:
Oh! And here's a great example of reductionism, where folks are noting that a particular technique, elastic net regularization, is reducible to another. This basically means that we don't need to worry about the less general one anymore despite it having been quite common in practice.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009

### Re: Using linear estimates for non-linear processes.

This reminds me of the uses and misuses of logic :-)

wolfhnd
Resident Member

Posts: 4520
Joined: 21 Jun 2005
Blog: View Blog (3)

### Re: Using linear estimates for non-linear processes.

wolfhnd » October 24th, 2015, 5:35 pm wrote:This reminds me of the uses and misuses of logic :-)

Hah definitely!

It is pretty funny to see how disjoint classical academia was before computers. It was just too much work to be rigorous, so we had all of these approximations that we don't need anymore.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009

### Re: Using linear estimates for non-linear processes.

Neuro, thank you for the story of your tests and results. I do not, of course, understand all the technicalities but I do understand enough to know where this is going - or hoping to go - and here is one person who will be right out front awaiting good news. The last message I had was "there is nothing that can be done". He forgot to add "yet". Good luck.
vivian maxine
Resident Member

Posts: 2837
Joined: 01 Aug 2014
 BioWizard, Natural ChemE liked this post

### Re: Using linear estimates for non-linear processes.

I'll just add a note, prompted by Natural ChemE's observation about patterns not necessarily showing correlation.
My impression is that a pattern is a pattern, and a correlation is a correlation: two distinct things.
In particular, a correlation is not the same as the existence of some function that can map one variable onto the other.
All the examples of the first two rows of your picture can be reduced to null correlation coefficient by simply replotting (rotating the Cartesian plane) along the 2 eigenvectors of the x,y covariance matrix.
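That replotting can be checked numerically (made-up data, not anyone's measurements): rotating a scatter into the eigenbasis of its covariance matrix drives the sample correlation to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=5_000)
y = 2 * x + rng.normal(size=5_000)  # a strongly correlated pair

xy = np.stack([x, y])
# Rotate into the eigenbasis of the x,y covariance matrix (the "replotting"):
_, eigvecs = np.linalg.eigh(np.cov(xy))
rotated = eigvecs.T @ (xy - xy.mean(axis=1, keepdims=True))

r_before = np.corrcoef(x, y)[0, 1]
r_after = np.corrcoef(rotated[0], rotated[1])[0, 1]
print(round(r_before, 3), round(r_after, 3))  # large before, ~0 after
```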

My simple point is: does one "measurable" tend to change as another "measurable" changes? The general answer to this question - which may be trivial in mathematics (orthogonal or not?) but often is crucial in physics and biology - is simply suggested by the correlation coefficient, which also suggests to what extent one "measurable" changes as another "measurable" changes (relative to other possible sources of change and/or error).
The even more relevant question in trying to understand physiological processes is: what is the "gain" of the system, i.e. what is the ABSOLUTE change in one "measurable" that is accounted for by a change in another "measurable"? The simplest answer to this question is the linear regression slope. Then, if one wishes to understand better, they certainly must look at departures from linearity and at any dependence of the error spread on one or the other "measurable", try to see whether any reasonable causal model could account for such findings, and finally build an analytical model - if possible - which in most cases will not be a linear one.
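As a toy illustration of gain-as-slope (an entirely invented sigmoidal response sampled over a narrow operating range, not our actual data):

```python
import numpy as np

v = np.linspace(-40, -15, 26)  # membrane potential, mV (the range of interest)
activity = 100 / (1 + np.exp(-(v + 30) / 8))  # invented saturating response

slope, intercept = np.polyfit(v, activity, 1)
print(round(slope, 2))  # local "gain": response units per mV over this range
```

The fitted slope summarizes the system's sensitivity over that range; the departures from linearity at the edges are exactly what a fuller analytical model would then address.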

Bio: we measure the current through the K channels at various values of membrane potential in the absence or in the presence of a certain concentration of TEA. At each value of membrane potential, the fractional decrease in current (in the presence of a fixed concentration of TEA) is fixed: this implies that the effect on the kinetics of the channel (blocking, inactivation, change in opening/closing kinetics) presumably are voltage-independent. [The confusing point may be that the effect of TEA on conductances could only be measured in the isolated hair cells (patch clamp) whereas we measured the functional effect of TEA on synaptic activity in the intact preparation].

neuro
Forum Moderator

Posts: 2631
Joined: 25 Jun 2010
Location: italy

### Re: Using linear estimates for non-linear processes.

neuro » 27 Oct 2015 11:04 am wrote:I'll just add a note, induced by NaturalChemE's observation about patterns not necessarily showing correlation.
My impression is that a pattern is a pattern, and a correlation is a correlation: two distinct aspects.
In particular, a correlation is not the existence of any function that can map one variable onto the other.
All the examples of the first two rows of your picture can be reduced to null correlation coefficient by simply replotting (rotating the Cartesian plane) along the 2 eigenvectors of the x,y covariance matrix.

I was going to say earlier (and therefore I agree with you) that detecting correlation between variables can be considered as a pre-step for modeling the variables as functions of one another. If you detect a correlation using a linear method, then you can at least say that one exists. If you can't detect a correlation, then either one doesn't exist, or the linear method can't detect it. In your case, you didn't need to worry about the second scenario.

After that...

neuro wrote:if one wishes to understand better, they certainly must look at departures from linearity, possible dependence of the error spread on one or the other "measurable", and try and see whether any reasonable causal model could account for such findings, and finally build an analytical model - if possible - which in most cases will not be a linear one.

Agreed.

neuro wrote:Bio: we measure the current through the K channels at various values of membrane potential in the absence or in the presence of a certain concentration of TEA. At each value of membrane potential, the fractional decrease in current (in the presence of a fixed concentration of TEA) is fixed: this implies that the effect on the kinetics of the channel (blocking, inactivation, change in opening/closing kinetics) presumably are voltage-independent. [The confusing point may be that the effect of TEA on conductances could only be measured in the isolated hair cells (patch clamp) whereas we measured the functional effect of TEA on synaptic activity in the intact preparation].

Actually, what I've wondered is whether the fraction of blocked channels changes with voltage in a way such that the "voltage decrease" doesn't. Is there a way to determine fractional block independent of potential?

Thanks neuro.

BioWizard

Posts: 12761
Joined: 24 Mar 2005
Location: United States
Blog: View Blog (3)

### Re: Using linear estimates for non-linear processes.

neuro » October 27th, 2015, 11:04 am wrote:In particular, a correlation is not the existence of any function that can map one variable onto the other

I think you mean a "linear correlation" as opposed to just a "correlation".

Truncating the "linear" qualifier is common in fields where linear regressions are the norm. Unfortunately this shorthand can lead readers to over-interpret things, like forgetting that the Pearson product-moment correlation coefficient only looks for straight lines.

I definitely agree that checking for linear correlations can be a good first step in analysis. Linearity is very easy for us to work with - it's something our brains handle well - so it's a very low-cost pattern when we can find it.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009

### Re: Using linear estimates for non-linear processes.

To (maybe) bring it all home, here's a published report about this from my own field, with real-life examples.

http://www.ncbi.nlm.nih.gov/pubmed/23226811

Simplification traps.

Abstract

When experiments are analyzed with simple functions, one gets simple results. A trap springs when experiments show deviations from the expected simplicity. When kinetic experiments do not follow exponential curves, they simply are not of the first or pseudofirst order. They can and have to be calculated on the base of plausible reaction schemes. When dose-response curves are analyzed with logistic functions ("4-parameter fit") and give Hill coefficients different from one, this is an experimental result stating that more than one molecule is involved in eliciting the response. If one ignores that result, one usually finds forgiving referees, but one will lose real money when one tries to develop such an unspecific compound into a drug.

The online version of this article (doi:10.1007/s12154-011-0069-3) contains supplementary material, which is available to authorized users.
KEYWORDS:

4PL; Dose–response curves; Logistic function; Multiexponential fits; Numerical methods; Systematic deviations

One of the examples demonstrates how linear regressions can easily fail at dose-response fitting and at constructing molecular-binding models, even when they produce very high R2 values. This can lead to real problems with significant consequences, for example when people incorrectly conclude that there is a single binding site when there are, in fact, two or more. A real-life example is given, where the case was settled with X-ray crystallography: http://www.ncbi.nlm.nih.gov/pubmed/21531720.
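For the curious, a minimal version of that kind of fit, using scipy's curve_fit on simulated data (my own sketch, not code from the paper): with a Hill coefficient of 2 baked in, the 4PL fit recovers it, while a straight line through the same points could report a decent R2 and say nothing about the number of sites.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(x, bottom, top, ec50, hill):
    """4-parameter logistic (4PL) dose-response curve."""
    return bottom + (top - bottom) / (1 + (ec50 / x) ** hill)

dose = np.logspace(-3, 3, 25)               # a 6-log concentration range
resp = four_pl(dose, 0.0, 100.0, 1.0, 2.0)  # simulated: two cooperating sites

params, _ = curve_fit(four_pl, dose, resp, p0=[0.0, 100.0, 1.0, 1.0])
print(round(params[3], 2))  # recovered Hill coefficient, ~2
```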

This should demonstrate NCE's original point. Which isn't that linear regressions are necessarily bad, but that they can mislead you if you never try anything else. And when people don't follow up on their models, wrong conclusions make their way into new models, degrading them as they go.

BioWizard

Posts: 12761
Joined: 24 Mar 2005
Location: United States
Blog: View Blog (3)
 Natural ChemE liked this post

### Re: Using linear estimates for non-linear processes.

BioWizard,

In all seriousness, I'm going to print out a bunch of copies of that article on heavy paper (like resume paper), roll them up, and start bonking people on the head with it. I've got a whole slew of it in my office closet here now.

Because vivian maxine was really, really right.

I think that doctors feel the guilt of their mistakes and the glory of their successes more closely than other scientists. They see their patients live or die based on their work. By contrast, Isaac Newton probably saved vastly more lives than any doctor, but I suspect that he never felt close to that fact, given the great distance between "better physics" and "lives saved". But regardless of how we feel emotionally, that's the reality of the situation.

Bio, that paper you posted was right about forgiving journal referees. In my profession we've watched chemical plants go up - for things like producing life-saving medicines or pushing back ecological disaster from global warming - only to fail, to the tune of billions of dollars. And the story is sickeningly regular: "not our fault because we used accepted methods". As if, in academia, we're actually so critical as to throw out anyone who's wrong. Melodramatic stories of academic politics aside, being correct isn't a prerequisite for publication.

In the end, human lives hang in the balance over whether or not the stuff we do is correct. So fuck the journals and textbooks; if it doesn't work, in our real world, it's wrong. And lay folks trust us - leading to deaths when we're wrong.

As a disclaimer, I recently watched a major project fail due to these sorts of problems. People will die due to it - I'm guessing primarily of lung cancer, isn't that fun? So, please forgive my bitterness, but I feel that rigor is too often misrepresented as a pedantic concern.

And, yeah, I'm serious about bonking people with that journal article. Might have to come up with some kinda cute reward if they actually read it. Lemme read it myself and see what other ones I can find later.

PS - Just to avoid misleading, neuro, I'm not criticizing your use of a linear regression; I haven't read into it, but I trust that you know what you're doing plenty well enough to have made a correct call. My disgust is with the larger acceptance of inaccuracy in literature and such, not with using a tool like linear regression when it's appropriate since that's actually a good thing.
Natural ChemE
Forum Moderator

Posts: 2754
Joined: 28 Dec 2009

### Re: Using linear estimates for non-linear processes.

BioWizard » October 27th, 2015, 5:45 pm wrote:Actually, what I've wondered is whether the fraction of blocked channels changes with voltage in a way such that the "voltage decrease" doesn't. Is there a way to determine fractional block independent of potential?

I am sorry. I know I was not clear, the thing is a bit complicated and I tried not to get into the details.
However, it might be interesting for people to know how it is possible to study ion channels.

We studied the behavior of K ion channels in hair cells isolated from the labyrinth of the frog (cells aimed at detecting angular accelerations of the head). These channels open when the membrane is depolarized. The current through the channel, at any value of membrane potential, depends on the fraction of channels which are open (--> total conductance) at that value of membrane potential and on the "driving force" for K ions at that same potential (the difference between the membrane potential and the potential that would exactly counterbalance the K concentration ratio on the two sides of the membrane, generally about -90 mV, internal negative).
Applying TEA at a fixed concentration produces a fixed fractional decrease in conductance (not current) at all membrane potential values, which is interpreted as a "voltage-independent" block.
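In code form, the relation above reads as follows (a sketch with illustrative numbers; the conductance units are arbitrary and the 20% block is just an example figure):

```python
E_K = -90.0  # approximate K equilibrium potential, mV

def k_current(g_total, v_m, blocked_fraction=0.0):
    """Ohmic K current: (unblocked conductance) x (driving force)."""
    return g_total * (1 - blocked_fraction) * (v_m - E_K)

# A fixed fractional block scales the current equally at every potential:
for v in (-70.0, -40.0, -10.0):
    print(v, k_current(1.0, v), k_current(1.0, v, blocked_fraction=0.2))
```

Note that the current itself still varies with voltage through the driving force, which is why a voltage-independent block producing a near-constant voltage shift was, at first sight, unexpected.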

If the hair cell is subjected to the mechanical displacement of the cilia, an ion channel opens that produces an inward current - in the intact preparation - and therefore depolarizes the cell. As a consequence, some K channels open, and tend to counteract such depolarization (potassium current tends to displace membrane potential toward -90 mV). If you block, say, 20% of the K channels, this would be expected to result in a lower hyperpolarizing power of the channels, and therefore in more effective depolarization by the mechanical stimulus; but if the cell is more depolarized, more K channels open... You make a bioelectrical model and perform some computations. Then you take the isolated hair cell, inject some current to simulate mechanical activation and see what happens to membrane potential, and you see by both approaches that having blocked 20% of K channels leads to a slight shift in membrane potential (a few mV) which is reasonably fixed as long as you drive the cell - by your injecting current - to a membrane potential in the range -40 to -15 mV, which is the reasonable range of membrane potentials produced by mechanical stimulation of the cells in the intact labyrinth.

The final step is measuring synaptic activity in the intact preparation. We do not know the membrane potential of the hair cells in the intact preparation during mechanical stimulation (rotation of the labyrinth); plenty of data from the literature strongly suggests it should lie in the range -40 to -15 mV.
From the above experiments in the isolated hair cell we derive that applying a certain concentration of TEA should shift the membrane potential by, say, 3 mV at the peak of stimulation (provided that the membrane potential of the hair cell is in the range -40 to -15 mV at that moment, which we cannot directly verify, but is by all means quite reasonable).
Thus we can estimate the "gain" of the system, i.e. how many more quanta of neurotransmitter are secreted per second by the hair cell due to a 3 mV (or a 1.5 mV, or a 6 mV) depolarization. The trick is that, since we could not directly modify the membrane potential of the hair cell in the intact preparation, we instead changed it by a known amount by using a specific concentration of TEA.

neuro
Forum Moderator

Posts: 2631
Joined: 25 Jun 2010
Location: italy

### Re: Using linear estimates for non-linear processes.

Thanks for the elaboration neuro. The difficulty I'm having is with this particular step:

neuro wrote:Applying TEA at a fixed concentration produces a fixed fractional decrease in conductance (not current) at all membrane potential values, which is interpreted as a "voltage-independent" block.

How are you verifying the fixed ratio of open to closed channels? Are you measuring both current and membrane potential? Or just membrane potential? I know that it significantly simplifies your model if the fraction of open/closed channels is fixed at all potentials, since you won't have to deal with nonlinear stuff during a depolarization/hyperpolarization step. I'm just wondering exactly how you established it.

How are you verifying that the change in conductance is specific to K+ channel blockage? The cells are bathed in this substance, and TEA is a dirty little molecule. Would interfering with non-voltage-gated channels give you similar results regarding the fixed fractional decrease?

These are the reasons I asked if there was some way to independently quantify the fraction of open/closed voltage-dependent K+ channels (specifically). Maybe some kind of fluorescent reporter or such.

I'm asking these questions out of curiosity, and I'm sure you already thought about all of them and know the answers. I've always been fascinated by single-cell electrophysiology and don't mind learning more (as your time permits, of course). I also understand that this may not have been necessary for your study, which you seem to have controlled very well.

BioWizard

Posts: 12761
Joined: 24 Mar 2005
Location: United States
Blog: View Blog (3)

### Re: Using linear estimates for non-linear processes.

Bio, in general you study ion channels by two procedures:
1) inject current into a cell, while measuring its membrane potential, and modulate the current in such a way that the membrane potential remains at the value you decide (voltage clamp): under this condition, the current across all membrane ion channels exactly equals the current you inject into the cell (otherwise the membrane potential would change), so you actually measure such cellular currents
2) inject current into a cell according to a precise paradigm (no current, or a current square wave, or a sinusoid, or whatever) and measure the ensuing changes in membrane potential (current clamp).

To study a specific kind of ion channels, you usually design a voltage clamp protocol:
1) a "holding" membrane potential is imposed on the cell: a certain fraction of the channels will typically be open at that potential (possibly none, e.g. no "delayed rectifier K channels" are open at membrane potentials below -70 mV);
2) from that holding potential you impose momentary steps (square waves of an appropriate duration) to a series of different "command" potentials.

Each specific type of voltage-dependent ion channel will give rise to a current in response to each command pulse: such current will have a rising and possibly a falling phase, or maintain a steady value, and these kinetic properties will enable you to recognize the contribution of the specific type of channel you are interested in.

The pharmacological action of a drug can then be tested to examine whether it interferes, and to what extent, with each channel type's probability of being open at any given membrane potential value (and possibly with the kinetics of channel opening and closing). For example, TEA is indeed a dirty drug, but it blocks the different kinds of K channels (IKV, the purely voltage-dependent component of the outward rectifier K current, or IKD; IKCa, the voltage- and calcium-dependent component of IKD; IA, the rapidly activating and rapidly inactivating voltage-dependent K current...) at very different concentrations. At low concentrations, it rather specifically blocks IKCa channels. This was a fortunate coincidence for us, because it enabled us to develop the experimental protocol I discussed above.

neuro
Forum Moderator

Posts: 2631
Joined: 25 Jun 2010
Location: italy
