
Bioinformatics and Biology Insights

Open Access: Full open access to this and thousands of other papers at http://www.la-press.com.

Methodology

Periodicity Detection Method for Small-Sample Time Series Datasets

Daisuke Tominaga
Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Aomi 2-4-7,
Koto, Tokyo, 135-0064, Japan. Corresponding author email: tominaga@cbrc.jp

Abstract: Time series of gene expression often exhibit periodic behavior under the influence of multiple signal pathways, and are
represented by a model that incorporates multiple harmonics and noise. Most of these data, which are observed using DNA microarrays,
consist of few sampling points in time, but most periodicity detection methods require a relatively large number of sampling points.
We have previously developed a detection algorithm based on the discrete Fourier transform and Akaike’s information criterion. Here
we demonstrate the performance of the algorithm for small-sample time series data through a comparison with conventional and newly
proposed periodicity detection methods based on a statistical analysis of the power of harmonics.
We show that this method has higher sensitivity for data consisting of multiple harmonics, and is more robust against noise than other methods.
Although “combinatorial explosion” occurs for large datasets, the computational time is not a problem for small-sample datasets.
The MATLAB/GNU Octave script of the algorithm is available on the author’s web site: http://www.cbrc.jp/%7Etominaga/piccolo/.

Keywords: periodicity detection, gene expression time series, information criterion, discrete Fourier transform, circadian rhythm

Bioinformatics and Biology Insights 2010:4 127–136

doi: 10.4137/BBI.S5983

This article is available from http://www.la-press.com.

© the author(s), publisher and licensee Libertas Academica Ltd.

This is an open access article. Unrestricted non-commercial use is permitted provided the original work is properly cited.


Introduction

Life phenomena are observed as changes in time, and many of these phenomena, such as circadian rhythm and the cell cycle, exhibit periodic behavior. These phenomena are common to many species and are thought to be expressions of essential mechanisms of life. In addition, irregular periodicity is caused by abnormal stimuli or disorder of these mechanisms. Thus, a periodicity detection technique for time series observation data is important in many areas of biology and medicine.

Generally, the observation of life phenomena incurs certain costs and thus the number of sampling points in time is often small, as in the case of DNA microarray data.1 For this reason, a reliable method of periodicity detection is needed for small datasets.

Time series data on life phenomena can be represented by a mathematical model consisting of noise and various simple formulae, such as polynomial functions or harmonics (sinusoidal functions).2 A model of time series data of periodic phenomena should contain harmonics. If these harmonics are judged significantly large by a statistical test, the phenomena can be considered periodic.

Generally, life phenomena are the result of complex interactions of biological networks (gene regulatory networks, metabolic pathways, signal transduction networks, etc.); thus, time series data on constituents of these networks can contain multiple harmonics with different periods.

Periodicity detection techniques which are widely used can be classified into two categories: 1) model fitting in the time domain, and 2) statistical significance tests on power spectra.

The first category includes methods based on direct curve fitting to the observed data. When the data can be modeled by n harmonics, the number of parameters that are optimized by the fitting method is 3n + 1,3 which is too many parameters for small datasets.4

Methods in the second category are widely used in many areas of science. The basic method is to calculate the power spectra by using the discrete Fourier transform (DFT) or the autocovariance matrix, and then to test the significance of each spectrum of a harmonic by outlier detection methods.3 A simple method uses quantiles of spectra to detect outliers. Both parametric tests (such as Dixon’s Q test or Fisher’s G test) and non-parametric tests (eg, the quantile/box-plot) are used to detect outliers. These methods frequently do not detect the periodicity of interest if the significance of the period is close to that of other periods, even if the significance is high. Thus, these methods are inadequate for data with multiple periodicities. In addition, a certain number of spectrum elements are needed to make the outlier tests meaningful and robust to noise. Thus, these methods are not optimal for small sampled time series data.

Other advanced algorithms, such as wavelet-based methods5,6 and model fitting using directional statistics,7 have been proposed; however, few applications of these methods have been reported to date, and their utility for the analysis of small sampled biological datasets therefore remains an open question. A clustering method1 and an AR (autoregression) model based periodicity detection method8 have been developed specifically for small sampled data. The first classifies genes by expression time series but does not detect period or periodicity. The second can find a period and its P-value for each time series, but does not detect multiple harmonics, ie, it does not find ‘the second significant period’.

Our previously proposed method, called the ‘piccolo’,9 consists of the DFT and Bayesian Information Criterion (BIC),2 and is not based on outlier detection. The algorithm is an exhaustive search to find the best combination of Fourier coefficients in terms of the information criterion.4 The combinatorial search does not require a long computational time for most DNA microarray time series datasets found on the web, such as the datasets in the Gene Expression Omnibus10 and ArrayExpress.11

We improve the periodicity detection performance of the piccolo method by introducing Akaike’s Information Criterion (AIC)4 instead of BIC, and demonstrate its performance through a comparison with two conventional methods, one newly developed method and the old version of our method (BIC version of the piccolo) on two simulation datasets and twelve microarray datasets. The piccolo algorithm (new AIC version) is shown to be highly sensitive and robust against noise on simulated short time series data which consist of multiple (two) harmonic signals and noise. In addition, the present method can achieve high detection rates of a period of interest for DNA microarray datasets, thus satisfying the expectations for biological data.

Methods

We choose two widely used conventional methods, one recently proposed method and the older version of the piccolo method for comparison with the improved new piccolo method. The methods selected for the comparison, except the old version of the piccolo, are based on statistical tests on the logarithms of power spectra.

The two conventional methods are the quantile method and Dixon’s Q test. The other method is a non-parametric test for significance of the logarithms of power spectra, recently proposed by Ahdesmäki et al.12

Dixon’s Q test requires the assumption that the distribution of samples is normal. The other four methods, namely, the quantile method, Ahdesmäki’s method, and the old and new piccolo methods, do not require this assumption. In the piccolo method (both old and new), the error distribution at each data point (sampling points at various times in the time series data) is assumed to be normal and its variance is assumed to be the same as that of the data.4

According to results of statistical tests for normality of powers and logarithms of powers of each time series in all datasets (Tables 1 and 2), no conclusion can be reached regarding the distribution of powers and logarithms of powers. Note that the power of a harmonic is calculated as the product of its Fourier coefficient, which is calculated by DFT, and the complex conjugate of the coefficient; therefore, the power of a harmonic is a real value. Logarithms of powers are perhaps more suitable than the values of powers themselves for the quantile method and Dixon’s Q test, considering histograms of logarithms of powers for each dataset (Fig. 1). Accordingly, the quantile method and Dixon’s Q test method are applied to logarithms of powers.

Quantile method

Outlier detection using an interquartile range (IQR) is a basic and widely used technique in many scientific fields because it has been found empirically to be useful for outlier elimination.13,14

In the quantile method, the DFT is applied to the data, and the power of each harmonic is calculated as the product of its Fourier coefficient and its complex conjugate. Then, quantile points of the logarithm of the powers and the IQR are calculated. All logarithms of powers are compared with the outlier bound, which is the sum of the third quartile point (75 percentile point) and the IQR multiplied by 1.5 (for normally distributed samples, this corresponds to a one-sided critical value of 0.9541). If a logarithm of a power is larger than the outlier bound, the harmonic corresponding to the power is significant, and thus the given time series data is considered periodic. The periodicity of the time series is that of the significant harmonics.

The quantile method requires a sample size (data length) of 8 or more for power spectra. Power spectra (real numbers) are calculated from Fourier coefficients (complex numbers), which have symmetry; thus, the number of unique samples of the spectra is half the data length. The unique sample size is (n-1)/2 for an odd data length n. If the number of samples is less than 4, the quantile method cannot detect any outliers because the bound is larger than the largest sample. Thus, this method cannot be used for time series data with data length of 7 or less.

Dixon’s Q test

Dixon’s Q test15,16 is a widely used outlier detection algorithm, in which the sample distribution is assumed

Table 1. P-values and standard deviations (sd) of the normality test for the distribution of powers and logarithms of powers
of time series data in simulation datasets for the one-harmonic and two-harmonic conditions.

Condition        N     T    Int.   Power               Log of power

One harmonic     500   12   4      0.755 (sd: 0.204)   0.862 (sd: 0.175)
Two harmonics    500   12   4      0.771 (sd: 0.218)   0.855 (sd: 0.165)
Notes: Signal to noise ratio (RSN) is 0.1. P-values are calculated using the Kolmogorov-Smirnov test.
Abbreviations: N, number of time series data in each dataset; T, length of each time series in the dataset; Int., interval between each two samplings (h)
in each time series.
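
The simulated datasets summarized in Table 1 (500 series of 12 points sampled every 4 hours, built from log-normal noise plus a 24-hour harmonic and, in the two-harmonic condition, a 16-hour harmonic) could be generated along the following lines. This is only a sketch under stated assumptions, not the author's script: the names `simulate_series` and `simulate_dataset` are hypothetical, the amplitudes are fixed so that signal variance divided by noise variance equals RSN (the paper draws amplitudes from log-normal random numbers), and the signal variance is split equally between the two harmonics.

```python
import math
import random


def simulate_series(rsn, two_harmonics=False, n_points=12, interval_h=4.0, rng=None):
    """Return one series: log-normal noise plus one or two harmonics."""
    rng = rng or random.Random()
    # Variance of exp(Z) with Z ~ N(0, 1); used to scale the signal (assumption).
    noise_var = (math.e - 1.0) * math.e
    # A cosine of amplitude A has variance A**2 / 2, so choose A to hit the target RSN.
    # With two harmonics the signal variance is shared equally (assumption).
    n_harmonics = 2 if two_harmonics else 1
    amplitude = math.sqrt(2.0 * rsn * noise_var / n_harmonics)
    a = amplitude
    b = amplitude if two_harmonics else 0.0
    # Phases drawn uniformly from [0, 48], as stated in the text.
    c = rng.uniform(0.0, 48.0)
    d = rng.uniform(0.0, 48.0)
    series = []
    for k in range(n_points):
        t = k * interval_h
        noise = rng.lognormvariate(0.0, 1.0)
        signal = (a * math.cos(2.0 * math.pi / 24.0 * t + c)
                  + b * math.cos(2.0 * math.pi / 16.0 * t + d))
        series.append(noise + signal)
    return series


def simulate_dataset(rsn, two_harmonics=False, n_series=500, seed=0):
    """Return a list of n_series simulated series for one RSN value."""
    rng = random.Random(seed)
    return [simulate_series(rsn, two_harmonics, rng=rng) for _ in range(n_series)]
```

For example, `simulate_dataset(0.1, two_harmonics=True)` would give a dataset comparable in shape to the two-harmonic row of Table 1; the numerical values will differ from the author's data because the random numbers and the amplitude rule are not the same.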


Table 2. P-values and their standard deviations (sd) for the normality test of the distribution of powers and logarithms of
powers of time series data in datasets taken from the Gene Expression Omnibus database.

Dataset      N       T    Int.   Power                Log of power

GDS1629      6346    8    6      0.898 (sd: 0.127)    0.911 (sd: 0.112)
GDS2110      14904   6    4      0.936 (sd: 0.0712)   0.936 (sd: 0.0712)
GDS2232      29109   12   4      0.845 (sd: 0.172)    0.828 (sd: 0.182)
GSE3424      22759   6    4      0.922 (sd: 0.0751)   0.936 (sd: 0.0719)
GDS404       6484    12   4      0.871 (sd: 0.153)    0.865 (sd: 0.159)
GSE6542-1    11699   6    4      0.936 (sd: 0.0712)   0.936 (sd: 0.0712)
GSE6542-2    11699   6    4      0.936 (sd: 0.0720)   0.936 (sd: 0.0718)
GSE6542-3    11699   6    4      0.945 (sd: 0.0682)   0.944 (sd: 0.0681)
GSE6542-4    11699   6    4      0.931 (sd: 0.0735)   0.932 (sd: 0.0735)
GSE6542-5    11699   12   4      0.877 (sd: 0.145)    0.875 (sd: 0.147)
GSE6542-6    11699   6    4      0.937 (sd: 0.0714)   0.936 (sd: 0.0715)
Note: P-values are calculated using the Kolmogorov-Smirnov test.
Abbreviations: N, number of time series data in each dataset; T, length of each time series in the dataset; Int., interval between each two samplings (h)
in each time series.
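
The normality results in Table 2 matter because Dixon’s Q test assumes normally distributed samples. As a rough illustration of the statistic itself, the sketch below computes the simplest form of Dixon’s Q for the largest value in a set of logarithms of powers (the gap to its nearest neighbour divided by the full range). The function names are hypothetical, and the critical value is deliberately left as a caller-supplied parameter `critical_q`: it must be looked up in a published table for the sample size and confidence level, and none is hard-coded here.

```python
def dixon_q_upper(values):
    """Dixon's Q statistic for the largest value: gap to its neighbour over the range."""
    xs = sorted(values)
    if len(xs) < 3:
        raise ValueError("Dixon's Q needs at least three values")
    value_range = xs[-1] - xs[0]
    if value_range == 0:
        # All values identical: no value stands out.
        return 0.0
    return (xs[-1] - xs[-2]) / value_range


def is_upper_outlier(values, critical_q):
    """True if the largest value exceeds the tabulated critical Q (supplied by the caller)."""
    return dixon_q_upper(values) > critical_q
```

Applied to the logarithms of powers of one time series, a `True` result would mean that the harmonic with the largest power is treated as an outlier, ie, as a significant period.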

[Figure 1: twelve histogram panels, one per dataset: GDS1629, GDS2110, GDS2232, GDS404, GSE3424, GSE6542_1, GSE6542_2, GSE6542_3, GSE6542_4, GSE6542_5, GSE6542_6, GSE6542_7.]

Figure 1. Histograms of logarithms of powers for all twelve DNA microarray datasets used for the performance comparison in the Result section. The x axes are bins of the histograms. Each bin is a range of natural logarithms of powers of each time series in each dataset. The y axes are frequencies of logarithms of powers in each range. The sum of all frequencies is the same as the number of probes in each dataset.
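
The logarithms of powers histogrammed in Figure 1, and the outlier bound of the quantile method (third quartile plus 1.5 times the IQR), can be sketched as follows. This is a minimal Python illustration rather than the author's MATLAB/GNU Octave script. The function names are hypothetical, the zero-frequency term is assumed to be excluded so that the number of unique samples is half the data length, the quartiles use Python's "inclusive" interpolation (the paper does not state its quartile convention), and powers are clamped to a tiny positive value so that the logarithm is always defined.

```python
import cmath
import math
import statistics


def dft(x):
    """Plain discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def log_powers(x):
    """Natural-log powers of the unique harmonics k = 1 .. n // 2."""
    n = len(x)
    coeffs = dft(x)
    result = []
    for k in range(1, n // 2 + 1):
        # Power = coefficient times its complex conjugate (a real value).
        power = (coeffs[k] * coeffs[k].conjugate()).real
        result.append(math.log(max(power, 1e-300)))  # clamp is an assumption
    return result


def quantile_outliers(values):
    """Indices of values above Q3 + 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    bound = q3 + 1.5 * (q3 - q1)
    return [i for i, v in enumerate(values) if v > bound]
```

With this quartile convention, three samples can never exceed the bound, which is consistent with the statement that the quantile method needs at least four unique spectrum samples, ie, a data length of 8 or more.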


to be normal. This test, which ignores redundant information from half of a two-sided power spectrum, is used to detect outliers from a set of logarithms of power spectra. The criterion of the test is a critical value of 0.95, one-sided.17

Ahdesmäki’s method

Ahdesmäki’s method12 uses the kernel density estimation18 of the distribution of the square root of the targeted harmonic’s power (proportional to the logarithm of power). The distribution is approximated by shuffling the order of samples in the time series data and calculating the power of the targeted harmonic by least-square fitting. We use Yi Cao’s ‘gkde’ kernel density estimation method*1 to calculate the approximate probability density function (PDF) of powers of the harmonics, and we use the built-in function ‘ols’ in GNU octave version 3.2.3*2 to calculate the power of the harmonic.

The criterion of the test is a critical value of 0.95, one-sided.

*1 http://www.mathworks.com/matlabcentral/fileexchange/19160
*2 http://www.gnu.org/software/octave/doc/interpreter/Linear-Least-Squares.html

The piccolo method

The ‘piccolo’ algorithm9 is an exhaustive search for the optimal combination of Fourier coefficients calculated by DFT from a given time series data. The algorithm searches all possible subsets of conjugate pairs of Fourier coefficients, but the search range for the size of subsets is limited to keep the information criterion value (AIC, BIC, etc.) reliable.4

Our previously presented version of the method incorporates BIC (Bayesian Information Criterion) as the information criterion. Here we introduce AIC (Akaike’s Information Criterion) instead of BIC to improve detection performance. The previous version is called ‘piccolo/B’ in this paper. The ‘piccolo’ denotes the new AIC version.

The optimal subset is defined such that the AIC value calculated from the subset and given data is minimal. AIC is used as the information criterion under the assumptions that the error distribution of the datum at each time point is normal and that its variance is the same as the variance among the time series data.4

The model is a subset of the set of the Fourier coefficients obtained by DFT from given time series data. The number of Fourier coefficients is n when n is the number of samples in the time series; however, half of these coefficients are complex conjugates of the other half. A Fourier coefficient must always be selected with its conjugate. This allows the inverse DFT of the model to be real numbers, which is necessary to calculate the AIC. Thus, the number of model parameters is the number of the coefficient pairs. When the data length is even, the coefficient corresponding to the Nyquist frequency is a pure real number and its complex conjugate does not appear in the set of Fourier coefficients in the model. This coefficient does not form a pair when it is chosen.

The AIC value is calculated using the following equation:4

$$\mathrm{AIC} = n \log(2\pi) + n \log(\sigma^2) + n + 2p, \qquad (1)$$

where n is the number of samples (data length of the time series), σ² is the variance of errors between the given time series data and the time series calculated from the model by inverse DFT, and p is the number of parameters (pairs of Fourier coefficients) in the model.

In the piccolo method, the Fourier coefficients in the subset that minimizes the AIC value are taken to be significant constituents to represent the given data. Accordingly, periods corresponding to these Fourier coefficients are considered significant, and the given time series data is judged to be periodic with periods corresponding to these Fourier coefficients. Thus, multiple periods can be found simultaneously even if their powers are close to each other.

Result

Fourteen datasets are used to compare the five methods for periodicity detection, comprising two simulated datasets and twelve DNA microarray datasets taken from an online database.

Robustness against noise

Data

We tested the robustness against noise of the five periodicity detection methods, namely the quantile method, Dixon’s Q test, Ahdesmäki’s method, the piccolo/B and the piccolo method, using simulation


data consisting of one or two harmonic signals and log-normal noise.

Considering that the distribution of DNA microarray data is log-normal,19 each datum is created as a sum of a log-normally distributed random number and the value of one harmonic (the one-harmonic condition), or two harmonics (the two-harmonic condition). For both conditions, 15 datasets are generated by changing the signal-to-noise ratio as follows:

$$\mathrm{logN}(0,1) + A_i \cos\left(\frac{2\pi}{24} t + C_i\right) + B_i \cos\left(\frac{2\pi}{16} t + D_i\right)$$

where logN(0,1) is log-normally distributed random noise whose mean is 0 and variance is 1, i (i = 1, …, 500) is the suffix for time series, Ai and Bi are the amplitudes of harmonic signals whose periods are 24 hours and 16 hours respectively (Bi = 0 for the one-harmonic condition), and Ci and Di are the phases of each signal. Values of Ai and Bi are determined by log-normal random numbers according to the signal-noise ratio (described later). Values of Ci and Di are determined by uniformly distributed random numbers within the range of [0,48]. t is the time. Values of time are discrete and their intervals are fixed to 4 hours. The number of time points is 12. Each dataset consists of 500 time series data (each time series data consists of twelve sampling points).

The signal-to-noise ratio (RSN), which is defined as the ratio of the variance of the signal to that of the noise, is set at various values of RSN = (0.001, 0.002, 0.005, …, 50.0) under each condition. Thus, all generated time series data in all datasets contain a circadian rhythm.

To consider whether or not Dixon’s Q test is appropriate, the normality of distribution of the spectra and the logarithms of spectra is tested. The P-values calculated by the Kolmogorov-Smirnov test for datasets of RSN = 0.1 under both conditions are shown in Table 1. For both spectra and logarithms of spectra, the null hypothesis (the distribution is normal) cannot be rejected at the 90% confidence level. Thus, Dixon’s Q test cannot be considered inappropriate.

Detection performance

The numbers of detected time series from the datasets are compared to evaluate the robustness of the methods against noise.

In the two-harmonic condition, simulation data consist of two signals (16 and 24 hour harmonics) and noise; however, the three detection methods other than piccolo and piccolo/B can hardly detect plural signals simultaneously in principle. Therefore we tested the five methods on detection of a 24-hour signal.

Plots of the number of detected time series data on each dataset are shown in Figure 2. For both the one-harmonic and two-harmonic conditions, the piccolo method achieved a high detection rate, especially for noisy (low RSN) data.

[Figure 2: two log/log panels. x axis: RSN (S/N ratio); y axis: # of detected on 500 data; curves: piccolo, piccolo/B, Ahdesmaeki, Quantile, Q test.]

Figure 2. Log/log plots of the signal-to-noise ratio versus the number of detected time series data out of 500. The time series data consist of log-normal random noise and a harmonic (above), and log-normal random noise and two harmonics (below). Since all simulated data contain a periodic signal to be detected, the possible maximum number of detections is 500. The RSN is defined as the variance of the signal divided by the variance of the noise. Therefore a smaller RSN value of a simulated time series means that it is noisier data.

The detection performance was relatively lower at RSN = 1.0 in the one-harmonic condition except for the piccolo method. In this dataset, the variance of the signal and noise is the same; thus, the signal and


noise are difficult to distinguish, especially for small sampled time series data.

The number of detected time series in the two-harmonic condition is lower than that in the one-harmonic condition for RSN > 0.01. The difference between the one-harmonic condition and two-harmonic condition is smaller for the piccolo method than for the other methods.

Detection of circadian rhythm

Data

The five detection methods are applied to experimentally observed DNA microarray data taken from the Gene Expression Omnibus online database by NCBI, NIH,10 to detect genes (probes) which have 24-hour periodicity, or ‘circadian rhythm’.

The P-values obtained by the Kolmogorov-Smirnov test for the normality of the distribution of powers and logarithms of powers are shown in Table 2. P-values are calculated for time series data in datasets, and means and standard deviations of the P-values are calculated and listed in the table. For both powers and logarithms of powers, the null hypothesis (the distribution is normal) cannot be rejected at the 95% confidence level. Although the sample sizes are small (6 to 12), it can be said that Dixon’s Q test cannot be considered inappropriate.

It is not defined whether or not the time series in the datasets are circadian; however, some of them are labeled with the GO term20 ‘circadian rhythm’. Here, detection performance is evaluated in terms of the total number of detected probes and the number of detected probes labeled ‘circadian rhythm’ for each dataset. The quantile method cannot be used on datasets in which the data length of each time series is 7 or less.

Biological description of datasets

All twelve DNA microarray datasets are time series observations intended to analyze circadian rhythm. GDS1629 is a set of forty five samples of an immortalized suprachiasmatic nucleus cell line of normal rat for 42 hours, every 6 hours (eight time points). The dataset contains five or six samples for each time point. We only use one of them, whose sample ID is the largest. GDS2110 is a set of six samples of normal Macaca mulatta adult female adrenal glands for 20 hours, every 4 hours (six time points). GDS2232 is a set of twenty four samples of normal mouse adrenal glands for 44 hours, every 4 hours (twelve time points). The dataset contains two samples for each time point. We only use one of them, which appears earlier in the published data file. GDS404 is a set of thirteen samples of normal mouse aortae for 44 hours, every 4 hours (twelve time points). The dataset contains two samples for the first time point. We only use one of them, which appears earlier in the published data file. GSE3424 is a set of eight samples of normal Arabidopsis thaliana for 20 hours, every 4 hours (six time points). The dataset contains two samples for two time points (0-hour and 12-hour). We only use one of them, which appears earlier in the published data file. GSE6542 is a set of forty eight samples of three mutants of Drosophila melanogaster in two experimental conditions (seven conditions in total). We divide it into seven sub-datasets here. Six sub-datasets consist of six time points and one consists of twelve points. All these datasets are normalized by publishers for further analysis.

Data of duplicate probes for the same gene and data of probes which contain a numerically invalid value are ignored for this performance comparison.

Detection performance

The detection results are shown in Table 3. For both the total number of detected probes and the number of detected probes labeled circadian, the piccolo method is superior to the other four methods, including the previous version of the piccolo (piccolo/B), for all datasets. Ratios of S in Table 3, which is the number of probes detected by the piccolo method but not by the other four methods, to the number of total probes in each dataset are 0.333 (GDS1629) to 0.776 (GSE6542_3). This means that using the piccolo method we find that 77.6% of all probes in GSE6542_3 are under the influence of circadian oscillation mechanisms, but the other four methods cannot detect these probes.

On the other hand, ratios of the numbers of probes detected by one or more of the other four methods but not detected by the piccolo method to the number of total probes in each dataset are in the range of 0.0 (GSE6542_2, GSE6542_4, GSE6542_6) to 0.0418 (GDS404), or less than 5% (data not shown).


Table 3. Results of detection of circadian oscillation on the twelve DNA microarray datasets. Numbers before and after a
slash are the number of detected probes and detected circadian annotated probes respectively. The annotated probes are
labeled with the GO term ‘circadian rhythm’ in the chip definition files of the microarrays.
Dataset      C    Quantile    Q test      Ahdesmäki   Piccolo/B    Piccolo       S
GDS1629      22   146 / 1     60 / 0      121 / 0     163 / 1      2231 / 7      1981
GDS2110      26   –           667 / 0     457 / 2     0 / 0        10658 / 16    9745
GDS2232      37   4005 / 1    3053 / 3    5343 / 4    9118 / 11    23057 / 28    10892
GSE3424      33   –           2853 / 1    1436 / 2    0 / 0        19233 / 29    15837
GDS404       12   714 / 2     497 / 2     655 / 3     752 / 3      4044 / 6      2670
GSE6542-1    28   –           529 / 0     401 / 1     0 / 0        8554 / 23     7829
GSE6542-2    28   –           413 / 0     339 / 2     0 / 0        7706 / 23     7118
GSE6542-3    28   –           720 / 0     340 / 1     0 / 0        9939 / 23     9099
GSE6542-4    28   –           495 / 0     361 / 0     0 / 0        8924 / 23     8235
GSE6542-5    28   799 / 3     623 / 0     656 / 5     1046 / 4     6038 / 19     4335
GSE6542-6    28   –           534 / 0     378 / 1     0 / 0        8513 / 20     7800
GSE6542-7    28   –           501 / 0     403 / 0     0 / 0        8236 / 19     7517
Notes: C is the number of circadian probes in the chip used for each dataset (duplicate probes for each gene and probes containing invalid numerical data
are omitted). S is the number of probes detected only by the piccolo method but not by other four methods.

[Figure 3: twelve time series panels, one per dataset: GDS1629, GDS2110, GDS2232, GDS404, GSE3424, GSE6542_1, GSE6542_2, GSE6542_3, GSE6542_4, GSE6542_5, GSE6542_6, GSE6542_7.]

Figure 3. Plots of time series data which are detected only by the piccolo method and not by the other four methods. For each dataset, the time series of the probe with the largest ratio between the maximum power and the second largest power is plotted. Ranges of sampling time points differ between datasets. Datasets and their time ranges are: top row (left to right): GDS1629 (44 h), GDS2110 (20 h), GDS2232 (44 h); second row: GDS404 (44 h), GSE3424 (20 h), GSE6542_1 (20 h); third row: GSE6542_2 (20 h), GSE6542_3 (20 h), GSE6542_4 (20 h); bottom row: GSE6542_5 (44 h), GSE6542_6 (20 h), GSE6542_7 (20 h).


10000
piccolo
Discussion
Ahdesmaeki Five methods for periodicity detection, namely, two
1.5 IQR
simple methods (the quantile method and Dixon’s Q
Computatinal time [s]

1000 Q test

test), one recently proposed method (Ahdesmäki’s

100 method) and two methods by the authors (piccolo
and piccolo/B) are compared for small sampled (short
10 length) time series of two simulated datasets which
consist of twelve time points and twelve sets of experimentally observed DNA microarray data, which consist of 6, 8, or 12 time points for observation of the circadian rhythm.

Dixon’s Q test requires the assumption that the samples are normally distributed. P-values for the normality of the distribution of the spectra and of the logarithms of the spectra of each time series in the given datasets were calculated by the Kolmogorov-Smirnov test. The null hypothesis (the distribution is normal) was not rejected for the logarithms of spectra of all datasets.

The piccolo method selects significant harmonics to model the data. Harmonics included in the best model, that is, the model that minimizes the AIC, are regarded as significant. A harmonic whose power is not the maximum is detected as significant more frequently by the piccolo method than by the other, outlier-based methods. These smaller-power harmonics are selected according to the AIC and are therefore considered to be statistically significant. The high detection sensitivity of the piccolo method is shown by the results of analyses using both simulations and experimentally observed data. These results are consistent with the expectations that most genes in a living cell are involved in one or more gene regulatory networks and that these networks are interconnected. The oscillation of the core circadian clock genes is expected to spread over the whole gene networks.

S in Table 3 shows that many genes exhibiting periodicity in the form of circadian rhythm can be detected only by the piccolo method and not by the other four methods. This finding can be attributed to the magnitude of circadian periodicity, which is thought to depend on the ‘distance’ in the whole interconnected gene regulatory networks from central circadian clock systems. Many genes further from the central clock systems could have lower-magnitude circadian periodicity and cannot be detected by methods other than the piccolo method.

Figure 4. Plot of the computational time needed to perform detection on 500 time series. The x axis is the length of the time series (the number of time points). The y axis is the elapsed CPU time in seconds needed to perform detection, on a logarithmic scale. The 500 time series are generated from normally distributed random numbers. The CPU time of the piccolo/B method (the previous version of the piccolo method, not shown here) is very similar to that of the piccolo method, which incorporates the AIC.

The time series of a probe detected only by the piccolo method is plotted in each panel in Figure 3 (one probe is chosen for each dataset).

Computational cost
We measured the increase in computational time required to perform detection on 500 time series when the data length of each time series is increased from 6 to 40. The dataset consists of normally distributed random numbers with a mean of 0 and a variance of 1. The results are shown in Figure 4. In the performance evaluation, all detection programs were run on GNU Octave version 3.2.3*3 on Mac OS X 10.6.3, on a computer equipped with two 3 GHz Dual-Core Intel Xeon processors and 8 GB of 667-MHz DDR2 memory.
The computational time of the quantile method, Dixon’s Q test and Ahdesmäki’s method increases linearly with increasing data length. This increase is exponential in the case of the piccolo method. The CPU time of the piccolo/B method is almost the same as that of the piccolo method and is not shown here.
The curves fitted to the data, ax + b for Ahdesmäki’s method and exp(ax + b) for the piccolo method, intersect at x = 18.8 (x is the data length). The piccolo method is faster than Ahdesmäki’s method for small datasets with a data length of less than 19.

*3 http://octave.sourceforge.net/
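The exponential growth in CPU time is what one would expect if the search scores every subset of candidate harmonics, since the number of subsets doubles with each additional harmonic. The following Python sketch illustrates this under stated assumptions; it is not the author's Octave implementation. The function names, the least-squares form of the AIC (n ln(RSS/n) + 2k, up to an additive constant) and the `max_harmonics` cap are all illustrative choices. The cap is an assumption added here to keep the fit away from saturation on very short series.

```python
import itertools

import numpy as np


def design_matrix(n, harmonics):
    """Constant column plus one cos/sin pair per harmonic index."""
    t = np.arange(n)
    cols = [np.ones(n)]
    for k in harmonics:
        cols.append(np.cos(2 * np.pi * k * t / n))
        cols.append(np.sin(2 * np.pi * k * t / n))
    return np.column_stack(cols)


def aic_for_subset(y, harmonics):
    """Gaussian least-squares AIC, up to an additive constant."""
    n = len(y)
    X = design_matrix(n, harmonics)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    n_params = X.shape[1] + 1  # regression coefficients plus noise variance
    return n * np.log(max(rss, np.finfo(float).tiny) / n) + 2 * n_params


def select_harmonics(y, max_harmonics=3):
    """Score every subset of up to max_harmonics harmonics; keep the AIC minimum.

    max_harmonics is a hypothetical safeguard, not part of the published method.
    """
    n = len(y)
    candidates = range(1, (n - 1) // 2 + 1)  # skip the DC and Nyquist terms
    best_aic, best_subset = aic_for_subset(y, ()), ()
    for r in range(1, min(max_harmonics, len(candidates)) + 1):
        for subset in itertools.combinations(candidates, r):
            score = aic_for_subset(y, subset)
            if score < best_aic:
                best_aic, best_subset = score, subset
    return best_subset


def count_models(n):
    """Number of harmonic subsets an uncapped exhaustive search would score."""
    return 2 ** ((n - 1) // 2)


# Synthetic twelve-point series with harmonics 1 and 3 plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(12)
y = 3 * np.cos(2 * np.pi * 1 * t / 12) + 2 * np.sin(2 * np.pi * 3 * t / 12)
y = y + rng.normal(scale=0.1, size=12)
selected = select_harmonics(y)
```

Under these assumptions, `count_models` returns 4 subsets for a data length of 6, 32 for a length of 12 and 524288 for a length of 40. A branch and bound strategy would aim to prune most of these subsets without scoring them.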


A comparison of the five methods using simulation data shows that the piccolo method is the most robust against noise. The detection performance of the methods, except the piccolo method, was worse for the two-harmonic data than for the one-harmonic data. The piccolo method exhibited more consistent performance between datasets than the other methods. This suggests that the piccolo method has high detection performance for data with multiple periodicities.
The computational cost of the piccolo method represents a potential problem for large datasets. In future work, we will attempt to reduce the computational cost by introducing the branch and bound method to the exhaustive search for the combination of Fourier coefficients.

Acknowledgement
We wish to thank Drs. Wataru Fujibuchi and Sachiyo Aburatani of the CBRC, AIST, for fruitful discussions.

Disclosure
This manuscript has been read and approved by the author. This paper is unique and is not under consideration by any other publication and has not been published elsewhere. The author and peer reviewers of this paper report no conflicts of interest. The author confirms that they have permission to reproduce any copyrighted material.

References
1. Ernst J, Bar-Joseph Z. STEM: a tool for the analysis of short time series gene expression data. BMC Bioinformatics. 2006;7:191.
2. McQuarrie ADR, Tsai CL. Regression and Time Series Model Selection. World Scientific; 1998.
3. Artis M, Hoffmann M, Nachane D, Toro J. The detection of hidden periodicities: A comparison of alternative methods. EUI Working Paper ECO. 2004;10.
4. Sakamoto Y, Ishiguro K, Kitagawa G. Akaike Information Criterion Statistics. Springer Verlag; 1986.
5. Benedetto JJ, Pfander GE. Periodic wavelet transforms and periodicity detection. SIAM Journal on Applied Mathematics. 2002;62(4):1329–68.
6. Janer L, Bonet JB, Lleida-Solano E. Pitch detection and voiced/unvoiced decision algorithm based on wavelet transform. Proceedings of The Fourth International Conference on Spoken Language Processing. 1996;2(FrP2P1):1209–12.
7. Okamura H, Semba Y. A novel statistical method for validating the periodicity of vertebral growth band formation in elasmobranch fishes. Canadian Journal of Fisheries and Aquatic Sciences. 2009;66(5):771–80.
8. Yang R, Su Z. Analyzing circadian expression data by harmonic regression based on autoregressive spectral estimation. Bioinformatics. 2010;26:i168–74.
9. Tominaga D, Horimoto K. Judgment algorithm for periodicity of time series data based on Bayesian information criterion. Journal of Bioinformatics and Computational Biology. 2008;6(4):747–57.
10. Barrett T, Suzek TO, Troup DB, et al. NCBI GEO: mining millions of expression profiles-database and tools. Nucleic Acids Research. 2005;33:D562–6.
11. Parkinson H, Kapushesky M, Kolesnikov N, et al. ArrayExpress update-from an archive of functional genomics experiments to the atlas of gene expression. Nucleic Acids Research. 2009;37:D868–72.
12. Ahdesmäki M, Lähdesmäki H, Pearson R, et al. Robust detection of periodic time series measured from biological systems. BMC Bioinformatics. 2005;6:117.
13. Hogg RV, McKean JW, Craig AT. Introduction to Mathematical Statistics. 6th ed. Pearson Prentice Hall; 2005.
14. Rousseeuw PJ, Leroy AM. Robust Regression and Outlier Detection. Wiley-Interscience; 2003.
15. Dixon WJ. Analysis of extreme values. Annals of Mathematical Statistics. 1950;21:488–506.
16. Dixon WJ. Ratios involving extreme values. Annals of Mathematical Statistics. 1951;22:68–78.
17. Rorabacher DB. Statistical treatment for rejection of deviant values: critical values of Dixon’s “Q” parameter and related subrange ratios at the 95% confidence level. Analytical Chemistry. 1991;63(2):139–46.
18. Silverman BW. Density Estimation for Statistics and Data Analysis. Chapman and Hall/CRC; 1986.
19. Konishi T. Three-parameter lognormal distribution ubiquitously found in cDNA microarray data and its application to parametric data treatment. BMC Bioinformatics. 2004;5:5.
20. The Gene Ontology Consortium. Gene ontology: tool for the unification of biology. Nature Genetics. 2000;25(1):25–9.