Skip to content

Commit cd980c2

Browse files
author
John Hoffman
committed
updated readme
1 parent c09f41b commit cd980c2

File tree

2 files changed

+94
-182
lines changed

2 files changed

+94
-182
lines changed

README.rst

Lines changed: 47 additions & 91 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ Description
2121

2222
.. image:: plots/templates_and_periodograms.png
2323

24-
The Fast Template Periodogram extends the Generalized Lomb-Scargle
24+
The Fast Template Periodogram extends the Lomb-Scargle
2525
periodogram ([Barning1963]_, [Vanicek1971]_, [Lomb1976]_, [Scargle1982]_, [ZechmeisterKurster2009]_) for arbitrary (periodic) signal shapes. It
2626
naturally handles non-uniformly sampled time-series data.
2727

@@ -38,19 +38,6 @@ Uses the the `non-equispaced Fast Fourier Transform <https://www-user.tu-chemnit
3838
The ``ftperiodogram`` library is complete with API documentation and consistency
3939
checks using ``py.test``.
4040

41-
.. [ZechmeisterKurster2009] `Paper <http://adsabs.harvard.edu/abs/2009A%26A...496..577Z>`_
42-
43-
.. [Lomb1976] `Least-squares frequency analysis of unequally spaced data <http://adsabs.harvard.edu/abs/1976Ap%26SS..39..447L>`_
44-
45-
.. [Scargle1982] `Studies in astronomical time series analysis. II - Statistical aspects of spectral analysis of unevenly spaced data <http://adsabs.harvard.edu/abs/1982ApJ...263..835S>`_
46-
47-
.. [Barning1963] `The numerical analysis of the light-curve of 12 Lacertae <http://adsabs.harvard.edu/abs/1963BAN....17...22B>`_
48-
49-
.. [Vanicek1971] `Further Development and Properties of the Spectral Analysis by Least-Squares <http://adsabs.harvard.edu/abs/1971Ap%26SS..12...10V>`_
50-
51-
.. [VanderPlas2017] `'Understanding the Lomb-Scargle Periodogram <https://arxiv.org/abs/1703.09824>`_
52-
53-
5441

5542
Installing
5643
----------
@@ -61,7 +48,7 @@ As long as you have ``scipy`` and ``numpy`` installed, you should be able to run
6148
``pip install ftperiodogram`` and everything should work fine.
6249

6350
If this doesn't work, consult the instructions in ``CONDA_INSTALL.md`` for installing ``ftperiodogram`` and its dependencies with with
64-
```conda`` <https://www.continuum.io/downloads>`_.
51+
`conda <https://www.continuum.io/downloads>`_.
6552

6653
Examples
6754
--------
@@ -78,8 +65,7 @@ inside the ``notebooks/`` directory::
7865
Updates
7966
-------
8067

81-
* See the `issues <https://github.com/PrincetonUniversity/FastTemplatePeriodogram/issues>`_
82-
section for known bugs! You can also submit bugs through this interface.
68+
* See the `issues <https://github.com/PrincetonUniversity/FastTemplatePeriodogram/issues>`_section for known bugs! You can also submit bugs through this interface.
8369

8470

8571
More information
@@ -101,8 +87,6 @@ multiband model that locked the phases, amplitudes and
10187
offsets of all bands together. They found that template fitting was significantly more accurate for estimating periods of RR Lyrae stars, but the computational resources
10288
needed for these fits were enormous (~30 minutes per object per CPU core).
10389

104-
.. [Sesar2017] https://arxiv.org/abs/1611.08596
105-
10690
How does the fast template periodogram improve things?
10791
------------------------------------------------------
10892

@@ -112,104 +96,76 @@ better way to perform these fits. Details will be presented in a paper
11296
(Hoffman *et al.* 2017, *in prep*), but the important part is you can reduce
11397
the non-linearity of the problem to the following:
11498

115-
* Finding the zeros of an order ``6H-1`` complex polynomial at each trial frequency
116-
* This is done via the ``numpy.polynomial`` library, which performs singular-value decomposition on the polynomial "companion matrix", and scales as ``O(H^3)``.
117-
* Computing the coefficients of these polynomials for all trial frequencies simultaneously by leveraging the non-equispaced fast Fourier transform, a process that scales as ``O(HN_f log(HN_f))``.
99+
- Finding the zeros of an order ``6H-1`` complex polynomial at each trial frequency
100+
- This is done via the ``numpy.polynomial`` library, which performs singular-value decomposition on the polynomial "companion matrix", and scales as ``O(H^3)``.
101+
- Computing the coefficients of these polynomials for all trial frequencies simultaneously by leveraging the non-equispaced fast Fourier transform, a process that scales as ``O(HN_f log(HN_f))``.
118102

119103
This provides two advantages:
120104

121-
:Computational speed and scaling:
105+
:Improved computational speed and scaling:
122106
.. image:: plots/timing_vs_ndata_const_freq.png
107+
Speed comparison for a test case using a constant
108+
number of trial frequencies but varying the number
109+
of observations.
123110

111+
:Numerically stable and accurate:
112+
.. image:: plots/correlation_with_nonlinopt.png
113+
Accuracy comparison between the fast template periodogram
114+
and a ``gatspy``-like method that uses the ``scipy.optimize.minimize``
115+
function to find the optimal phase shift parameter. The minimization
116+
method is given 10 random starting values and the best result is kept.
117+
Though in most cases the truly optimal solution is found, in many cases
118+
a sub-optimal solution is chosen instead (i.e. only a locally optimal
119+
solution was chosen).
124120

125-
* The non-equispaced fast Fourier transform (NFFT)
126-
* Polynomial zero-finding
127-
128-
The FTP is a non-linear extension of the GLS. The nonlinearity
129-
of the problem can be reduced to finding the zeros of
130-
a complex, order `6H-1` polynomial at each trial frequency.
131-
132-
Templates are assumed to be well-approximated by a short truncated Fourier series
133-
of length `H`. Using this representation, the optimal parameters
134-
(amplitude, phase, offset) of the template fit at a given trial frequency
135-
can then be found *exactly* after finding the roots of
136-
a polynomial at each trial frequency.
137-
138-
The coefficients of these polynomials involve sums that can be efficiently
139-
evaluated with non-equispaced fast Fourier transforms. These sums
140-
can be computed in `O(HN_f log(HN_f))` time.
141-
142-
In its current state, the root-finding procedure is the rate limiting step.
143-
This unfortunately means that for now the fast template periodogram scales as
144-
`N_f*(H^3)`. We are working to reduce the computation time so that the entire
145-
procedure scales as `HN_f log(HN_f)` for reasonable values of `H` (`< 10`).
146121

147-
However, even for small cases where `H=6` and `N_obs=10`, this procedure is
148-
about an order of magnitude faster than the `gatspy` template modeler.
122+
How is this different than the multi-harmonic periodogram?
123+
----------------------------------------------------------
149124

150-
151-
### How is this different than the multi-harmonic periodogram?
152-
153-
The multi-harmonic periodogram ([Bretthorst 1988](https://link.springer.com/book/10.1007%2F978-1-4684-9399-3), [Schwarzenberg-Czerny (1996)](http://iopscience.iop.org/article/10.1086/309985/meta)) is another
125+
The multi-harmonic periodogram ([Bretthorst1988]_,[SchwarzenbergCzerny1996]_) is another
154126
extension of Lomb-Scargle that fits a truncated Fourier series to the data
155-
at each trial frequency. This is nice if you have a strong non-sinusoidal signal
156-
and a large dataset. This algorithm can also be made to scale as
157-
`HN_f logHN_f` ([Palmer 2009](http://iopscience.iop.org/article/10.1088/0004-637X/695/1/496/meta)).
127+
at each trial frequency. This algorithm can also be made to scale as
128+
``HN_f logHN_f`` [Palmer2009]_.
158129

159130
However, the multi-harmonic periodogram is fundamentally different than template fitting.
160131
In template fitting, the relative amplitudes and phases of the Fourier series are *fixed*.
161-
In a multi-harmonic periodogram, the relative amplitudes and phases of the Fourier series are *free parameters*. These extra free parameters mean that
132+
In a multi-harmonic periodogram, the relative amplitudes and phases of the Fourier series are *free parameters*.
162133

163-
1. you need a larger number of observations `N_obs` to reach the same signal to noise, and
164-
2. you are more likely to detect a multiple of the true frequency.
134+
The multiharmonic periodogram is more flexible than the template periodogram, but less
135+
sensitive to a given signal. If you're hoping to find a non-sinusoidal signal with an
136+
unknown shape, it might make more sense to use a multi-harmonic periodogram.]
165137

166-
For a discussion of number (2) and possible remedies with Tikhonov regularization, and for an illuminating review
167-
of periodograms in general, see [Vanderplas et al. (2015)](http://adsabs.harvard.edu/abs/2015ApJ...812...18V) and
168-
[Vanderplas (2017)](https://arxiv.org/abs/1703.09824).
138+
For more discussion of the multiharmonic periodogram and related extensions, see [VanderPlas_etal_2015]_ and [VanderPlas2017]_.
169139

170-
### Timing
140+
TODO
141+
----
142+
143+
* Multi-band extensions
144+
* Speed improvements
171145

172-
![timing](plots/timing_vs_ndata.png "Timing compared to non-linear optimization (10 initial guesses)")
173146

174-
The Fast Template Periodogram seems to do better than Gatspy-like template fitting
175-
for virtually all reasonable cases (reasonable meaning a small-ish
176-
number of harmonics are needed to accurately approximate the template,
177-
and small-ish meaning less than about 10).
147+
References
148+
----------
178149

179-
It may be surprising that FTP appears to scale as `NH`, instead of
180-
`NH log NH`, but that's because the NFFT is not the limiting factor (yet).
181-
Most of the computation time is spent calculating polynomial coefficients,
182-
and this computation scales as roughly `NH^3`.
183150

184-
![timingnh](plots/timing_vs_nharm.png "Timing vs harmonics")
151+
.. [ZechmeisterKurster2009] `Paper <http://adsabs.harvard.edu/abs/2009A%26A...496..577Z>`_
185152
186-
The FTP scales sub-linearly to linearly with the number of harmonics `H`
187-
for `H < 10`, and for larger number of harmonics scales as `H^3` (since
188-
zeros are found via singular value decomposition of the polynomial companion matrix).
189-
This limits the set of templates to those that are sufficiently approximated by a small
190-
number of Fourier terms.
153+
.. [Lomb1976] `Least-squares frequency analysis of unequally spaced data <http://adsabs.harvard.edu/abs/1976Ap%26SS..39..447L>`_
191154
192-
### Accuracy
155+
.. [Scargle1982] `Studies in astronomical time series analysis. II - Statistical aspects of spectral analysis of unevenly spaced data <http://adsabs.harvard.edu/abs/1982ApJ...263..835S>`_
193156
194-
Compared with the Gatspy template modeler, the FTP provides improved accuracy as well as speed.
195-
For many values of `p(freq)`, the FTP correlates strongly with results obtained from
196-
non-linear optimization. However, since the problem is not convex, the solution recovered from
197-
non-linear optimization techniques may only represent a *local* minima. FTP, on the other
198-
hand, solves for all local minima simultaneously, from which the globally optimal solution can be
199-
found easily.
157+
.. [Barning1963] `The numerical analysis of the light-curve of 12 Lacertae <http://adsabs.harvard.edu/abs/1963BAN....17...22B>`_
200158
201-
![corrwithgats](plots/correlation_with_nonlinopt.png "Correlation with non-linear optimization")
159+
.. [Vanicek1971] `Further Development and Properties of the Spectral Analysis by Least-Squares <http://adsabs.harvard.edu/abs/1971Ap%26SS..12...10V>`_
202160
161+
.. [VanderPlas2017] `Understanding the Lomb-Scargle Periodogram <https://arxiv.org/abs/1703.09824>`_
203162
204-
The FTP requires that templates be *approximated* by a truncated Fourier expansion. The figure
205-
below compares the template periodograms for a single template approximated by different numbers
206-
of harmonics:
163+
.. [Sesar2017] https://arxiv.org/abs/1611.08596
207164
208-
![accuracy](plots/correlation_with_large_H.png "How many harmonics do we need?")
165+
.. [Bretthorst1988] https://link.springer.com/book/10.1007%2F978-1-4684-9399-3
209166
167+
.. [SchwarzenbergCzerny1996] http://iopscience.iop.org/article/10.1086/309985/meta
210168
211-
TODO
212-
----
169+
.. [Palmer2009] http://iopscience.iop.org/article/10.1088/0004-637X/695/1/496/meta
213170
214-
* Multi-band extensions
215-
* Speed improvements
171+
.. [VanderPlas_etal_2015] http://adsabs.harvard.edu/abs/2015ApJ...812...18V

0 commit comments

Comments
 (0)