lsqfitics


A wrapper of lsqfit for computing various information criteria (particularly those listed in arXiv:2208.14983 [stat.ME]) using vegas.

Example: estimating the ground state energy from a correlation function

Recall that we can spectrally decompose a correlation function as

\[C(t) = \sum_n A_n e^{-E_n t}\]

We will show how to use the dcut parameter for data subset selection. Let us generate some fake correlator data with spectrum $E_n = 1 + n/5$ (note that real correlators will feature some amount of autocorrelation).

import numpy as np
import gvar as gv
import lsqfitics

t = np.arange(40)
overlaps = np.random.normal(10, 2, 10)  # random overlap factors A_n
corr_true = np.sum([A * np.exp(-(1 + n/5) * t) for n, A in enumerate(overlaps)], axis=0)

# 1000 noisy samples per timeslice; the noise grows exponentially with t
correlator = np.array([corr_true[ti] + np.random.normal(0, corr_true[ti] * np.exp(0.1 * ti), 1000) for ti in t])
correlator = gv.dataset.avg_data(correlator.transpose())

Forming the effective mass $m_\mathrm{eff}(t) = \log [C(t)/C(t+1)]$, which approaches $E_0$ at late times, we see that there is some contamination from excited states at early times.

[Figure: effective mass of the fake correlator data]
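As a quick sketch, the effective mass can be computed directly from the averaged correlator, with gvar propagating the uncertainties through the ratio:

m_eff = gv.log(correlator[:-1] / correlator[1:])  # effective mass with propagated errors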

Suppose we wish to fit the correlator to a single state, i.e.

\[f(t) = A e^{-E_0 t}\]

If we are to fit our fake correlator data, it will not suffice to simply fit all the data we have: the early-time data are clearly not modeled by a single exponential. Instead we must choose some subset of the correlator data spanning $t \in [t_\mathrm{min}, t_\mathrm{max})$ to fit. Fortunately our information criteria (excluding the logGBF) allow us to compare fits to different data subsets so long as we specify the number of data points excluded.

We will address this data subset selection problem through the model average. We consider 16 different models, in which we fix $t_\mathrm{max}=20$ but allow $t_\mathrm{min}=2, 3, \dots, 17$ to vary. Unlike the previous example involving the Taylor expansion, we now must account for the number of data points excluded (dcut) when computing our model average.

def corr_fcn(x, p):
    # sum of exponentials, one per state
    return np.sum([A * np.exp(-E * x) for A, E in zip(p['A'], p['E'])], axis=0)

def make_fit(tmin, tmax, nstates):
    prior = {}
    prior['A'] = [gv.gvar(10, 10) for _ in range(nstates)]
    prior['E'] = [gv.gvar(1 + n/5, 0.5) for n in range(nstates)]

    fit = lsqfitics.nonlinear_fit(
        data=(t[tmin:tmax], correlator[tmin:tmax]),
        fcn=corr_fcn,
        dcut=len(t) - (tmax - tmin),  # new: the number of excluded data points
        prior=prior,
    )
    return fit

# different model choices
tmin = range(2, 18)

# fits to different models
fits = [make_fit(ti, 20, 1) for ti in tmin]

# E0 for each model
y = [f.p['E'][0] for f in fits] 

# statistics
chi2_reduced = [f.chi2/f.dof for f in fits]

# weights (using the BAIC)
weights = lsqfitics.calculate_weights(fits, ic='BAIC')
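For comparison, Bayes-factor weights can be formed by hand from each fit's logGBF attribute (inherited from lsqfit) via a softmax; as cautioned above, these cannot account for the excluded data points:

# Bayes-factor weights, for comparison only (not dcut-aware)
logGBF = np.array([f.logGBF for f in fits])
weights_gbf = np.exp(logGBF - logGBF.max())  # subtract the max for numerical stability
weights_gbf /= weights_gbf.sum()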

Computing the model average with lsqfitics.calculate_average(y, weights), we find that $E_0 = 1.006(19)$. The contribution of each model to the model average is shown in the plot below.
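In code, the final step is a single call (the printed value is the result quoted above):

E0_avg = lsqfitics.calculate_average(y, weights)
print(E0_avg)  # 1.006(19)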

[Figure: $E_0$ stability and model weights versus $t_\mathrm{min}$]
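A plot along these lines can be made with matplotlib; this sketch assumes calculate_weights returns the weights in the same order as fits:

import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.errorbar(list(tmin), [yi.mean for yi in y], [yi.sdev for yi in y], fmt='o')
ax1.set_ylabel(r'$E_0$')
ax2.bar(list(tmin), weights)
ax2.set_ylabel('weight')
ax2.set_xlabel(r'$t_\mathrm{min}$')
plt.show()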
