Miscellaneous statistical functions

This page collects miscellaneous statistical functions, mostly for circular data, that do not fit anywhere else.

ephysiopy.common.statscalcs.circ_r(alpha, w=None, d=0, axis=0)

Computes the mean resultant vector length for circular data.

Args:
    alpha (array or list): Sample of angles in radians.
    w (array or list, optional): Counts in the case of binned data. Must be the same length as alpha. Default is None.
    d (array or list, optional): Spacing of bin centres, in radians, for binned data. If supplied, a correction factor is applied to correct for bias in the estimate of r. Default is 0.
    axis (int, optional): The dimension along which to compute. Default is 0.

Returns: r (float): The mean resultant vector length.
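
The quantity described above can be sketched with the standard library alone. For angles alpha_j with optional counts w_j, the mean resultant length is r = |sum(w_j * exp(i * alpha_j))| / sum(w_j); for binned data with spacing d, the usual bias correction multiplies r by d / (2 * sin(d / 2)). The function name `resultant_length` is hypothetical, the sketch handles 1-D input only (no `axis`), and it is not claimed to match ephysiopy's implementation.

```python
import cmath
import math


def resultant_length(alpha, w=None, d=0.0):
    """Mean resultant vector length of angles in radians (1-D sketch)."""
    if w is None:
        w = [1.0] * len(alpha)  # unbinned data: every angle counts once
    if len(w) != len(alpha):
        raise ValueError("w must have the same length as alpha")
    # Weighted sum of unit vectors, one per angle.
    total = sum(wj * cmath.exp(1j * aj) for aj, wj in zip(alpha, w))
    r = abs(total) / sum(w)
    if d != 0:
        # Bias correction for binned data with bin spacing d (radians).
        r *= d / (2.0 * math.sin(d / 2.0))
    return r
```

Identical angles give r = 1, and two opposite angles with equal counts give r = 0.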

ephysiopy.common.statscalcs.z_normalize(scores: np.ndarray) -> np.ndarray

Z-normalize an array of scores.
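
As a rough illustration, z-normalization subtracts the mean and divides by the standard deviation. The sketch below uses only the standard library and assumes the population standard deviation (the NumPy default, ddof=0); which convention `z_normalize` uses is not stated here. The name `z_scores` is hypothetical.

```python
import statistics


def z_scores(scores):
    """Return (x - mean) / std for each score, using the population std."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)  # assumption: population std (ddof=0)
    if std == 0:
        raise ValueError("scores have zero variance; z-scores are undefined")
    return [(x - mean) / std for x in scores]
```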

ephysiopy.common.statscalcs.box_cox_normalize(scores: np.ndarray, lam: float) -> np.ndarray

Box-Cox normalize an array of scores.

Args:
    scores (np.ndarray): The scores to normalize.
    lam (float): The lambda parameter for Box-Cox transformation.

Returns: np.ndarray: The normalized scores.
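
For reference, the textbook Box-Cox transform maps each positive score x to (x**lam - 1) / lam, or to log(x) when lam is 0. The sketch below shows only that transform; whether `box_cox_normalize` applies any further standardization afterwards is not stated here, and the name `box_cox` is hypothetical.

```python
import math


def box_cox(scores, lam):
    """Textbook Box-Cox transform of strictly positive scores."""
    if any(x <= 0 for x in scores):
        raise ValueError("Box-Cox requires strictly positive scores")
    if lam == 0:
        # Limit of (x**lam - 1) / lam as lam approaches 0.
        return [math.log(x) for x in scores]
    return [(x ** lam - 1.0) / lam for x in scores]
```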

ephysiopy.common.statscalcs.mean_resultant_vector(angles)

Calculate the mean resultant length and direction for angles.

Args: angles (np.array): Sample of angles in radians.

Returns:
    r (float): The mean resultant vector length.
    th (float): The mean resultant vector direction.

Notes: Taken from Directional Statistics by Mardia & Jupp, 2000
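
A minimal sketch of why the direction is taken from the mean unit vector and not from the arithmetic mean of the angles: 350 and 10 degrees average to 180 arithmetically, but their mean direction is 0. The name `resultant_vector` is hypothetical and the code is not ephysiopy's implementation.

```python
import cmath
import math


def resultant_vector(angles):
    """Return (r, th): length and direction of the mean unit vector."""
    mean_vec = sum(cmath.exp(1j * a) for a in angles) / len(angles)
    return abs(mean_vec), cmath.phase(mean_vec)  # th lies in [-pi, pi]


# Two angles straddling zero: r is cos(10 degrees), th is 0 up to rounding.
r, th = resultant_vector([math.radians(350), math.radians(10)])
```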

ephysiopy.common.statscalcs.rayleigh_test(angles: np.ndarray) -> float

Perform the Rayleigh test for uniformity of circular data.

Args: angles (array_like): Vector of angular values in radians.

Returns:
    Z (float): The Rayleigh test statistic.
    p_value (float): The p-value for the test.
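
The statistic can be sketched as Z = n * r**2, where r is the mean resultant length. The p-value below uses a commonly quoted closed-form approximation; it is an assumption that may differ from the formula `rayleigh_test` uses, and the name `rayleigh` is hypothetical.

```python
import cmath
import math


def rayleigh(angles):
    """Return (Z, p) for the Rayleigh test; angles in radians."""
    n = len(angles)
    R = abs(sum(cmath.exp(1j * a) for a in angles))  # resultant length, n * r
    Z = R ** 2 / n
    # Closed-form approximation to the p-value (assumption, see lead-in).
    p = math.exp(math.sqrt(1 + 4 * n + 4 * (n ** 2 - R ** 2)) - (1 + 2 * n))
    return Z, p
```

Evenly spread angles give Z near 0 and p near 1; tightly clustered angles give a large Z and a small p.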

ephysiopy.common.statscalcs.V_test(angles, test_direction)

The V test checks whether the observed angles tend to cluster around a given angle, which indicates a lack of randomness in the distribution. It is also known as the modified Rayleigh test.

Args:
    angles (array_like): Vector of angular values in degrees.
    test_direction (int): A single angular value in degrees.

Notes: For grouped data the length of the mean vector must be adjusted, and for axial data all angles must be doubled.
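
A hedged sketch of the usual V test computation for ungrouped data: project the resultant vector onto the hypothesised direction, scale by sqrt(2 / n), and compare against a standard normal distribution. The name `v_test` and its return tuple are hypothetical; the page does not state what `V_test` returns.

```python
import cmath
import math


def v_test(angles_deg, test_direction_deg):
    """Return (V, u, p) for the V test; inputs in degrees."""
    n = len(angles_deg)
    total = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg)
    R = abs(total)                 # resultant length (n * r)
    mean_dir = cmath.phase(total)  # sample mean direction, radians
    # Component of the resultant along the hypothesised direction.
    V = R * math.cos(mean_dir - math.radians(test_direction_deg))
    u = V * math.sqrt(2.0 / n)
    # One-sided p-value from the standard normal approximation.
    p = 0.5 * math.erfc(u / math.sqrt(2.0))
    return V, u, p
```

Angles clustered around the test direction give a small p; the same angles tested against the opposite direction give a p close to 1.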

ephysiopy.common.statscalcs.duplicates_as_complex(x, already_sorted=False)

Finds duplicates in x

Args:
    x (array_like): The list to find duplicates in.
    already_sorted (bool, optional): Whether x is already sorted. Default False.

Returns: x (array_like): A complex array in which the real part is the value and the imaginary part counts how many times that value has already occurred (0 for the first occurrence).

Examples:
    >>> x = [9.9, 9.9, 12.3, 15.2, 15.2, 15.2]
    >>> ret = duplicates_as_complex(x)
    >>> print(ret)
    [9.9+0j, 9.9+1j, 12.3+0j, 15.2+0j, 15.2+1j, 15.2+2j]
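
One way to reproduce the behaviour in the example is to sort the values and keep a counter that restarts whenever the value changes. This is a pure-Python sketch with a hypothetical name (`running_duplicates`), not the library's implementation.

```python
def running_duplicates(x, already_sorted=False):
    """Tag each value with the number of earlier occurrences of it."""
    values = list(x) if already_sorted else sorted(x)
    out = []
    seen = 0
    for i, v in enumerate(values):
        # Restart the counter whenever the value changes.
        seen = seen + 1 if i > 0 and v == values[i - 1] else 0
        out.append(complex(v, seen))
    return out
```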

ephysiopy.common.statscalcs.watsonsU2(a, b)

Tests whether two samples from circular observations differ significantly from each other with regard to mean direction or angular variance.

Args: a, b (array_like): The two samples to be tested

Returns: U2 (float): The test statistic

Notes: Both samples must come from a continuous distribution. For grouped data, the class interval should not exceed 5 degrees. Taken from '100 Statistical Tests', G. J. Kanji, 2006, Sage Publications.
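
The two-sample statistic is commonly written in terms of the difference d_k between the two empirical cumulative distributions at each pooled observation: U2 = (n1 * n2 / N**2) * (sum(d_k**2) - sum(d_k)**2 / N). The sketch below follows that textbook form and assumes no ties between samples (consistent with the continuity requirement). It may differ in detail from the formulation in Kanji that `watsonsU2` follows, and the name `watson_u2_two_sample` is hypothetical.

```python
def watson_u2_two_sample(a, b):
    """Two-sample Watson U2 statistic (textbook form, ties not handled)."""
    n1, n2 = len(a), len(b)
    N = n1 + n2
    # Pool and sort, remembering which sample each angle came from.
    pooled = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    i = j = 0  # running counts of observations seen from a and from b
    sum_d = sum_d2 = 0.0
    for _, source in pooled:
        if source == 0:
            i += 1
        else:
            j += 1
        d = i / n1 - j / n2  # difference of the two empirical CDFs
        sum_d += d
        sum_d2 += d * d
    return (n1 * n2 / N ** 2) * (sum_d2 - sum_d ** 2 / N)
```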

ephysiopy.common.statscalcs.watsonsU2n(angles)

Tests whether the given distribution fits a random sample of angular values.

Args: angles (array_like): The angular samples.

Returns: U2n (float): The test statistic.

Notes: This test is suitable for both the unimodal and the multimodal case. It can be used as a test for randomness. Taken from '100 Statistical Tests', G. J. Kanji, 2006, Sage Publications.
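
When the hypothesised distribution is the circular uniform (the randomness case), the one-sample statistic can be sketched as follows: convert each angle to u = angle / (2 * pi), sort, and compute U2 = sum((u_i - (2i - 1) / (2n))**2) - n * (mean(u) - 0.5)**2 + 1 / (12n). The sketch assumes angles in radians and a uniform null; the name `watson_u2_one_sample` is hypothetical and the code is not taken from ephysiopy.

```python
import math


def watson_u2_one_sample(angles):
    """One-sample Watson U2 against the circular uniform distribution."""
    n = len(angles)
    two_pi = 2 * math.pi
    # Uniform CDF value of each angle, sorted ascending.
    u = sorted((a % two_pi) / two_pi for a in angles)
    u_bar = sum(u) / n
    spread = sum(
        (ui - (2 * k - 1) / (2 * n)) ** 2 for k, ui in enumerate(u, start=1)
    )
    return spread - n * (u_bar - 0.5) ** 2 + 1.0 / (12 * n)
```

Perfectly even spacing gives the minimum value 1 / (12n); clustered angles give larger values.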

ephysiopy.common.statscalcs.watsonWilliams(a, b)

The Watson-Williams F test checks whether a set of mean directions are equal, assuming that each group follows a von Mises distribution and that the concentrations are unknown but equal.

Args: a, b (array_like): The directional samples

Returns: F_stat (float): The F-statistic
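
For two samples, the statistic is usually written F = K * (N - 2) * (R1 + R2 - R) / (N - R1 - R2), where R1 and R2 are the resultant lengths of each sample, R is the resultant length of the pooled sample, and K = 1 + 3 / (8 * kappa) is a correction based on an estimate of the concentration. The sketch below assumes angles in radians and uses a standard piecewise approximation for kappa; both helper names are hypothetical and the details may differ from `watsonWilliams`.

```python
import cmath


def _kappa(r):
    """Approximate von Mises concentration from a mean resultant length."""
    if r < 0.53:
        return 2 * r + r ** 3 + 5 * r ** 5 / 6
    if r < 0.85:
        return -0.4 + 1.39 * r + 0.43 / (1 - r)
    return 1 / (r ** 3 - 4 * r ** 2 + 3 * r)


def watson_williams_f(a, b):
    """Two-sample Watson-Williams F statistic; angles in radians."""
    def resultant(angles):
        return abs(sum(cmath.exp(1j * x) for x in angles))

    N = len(a) + len(b)
    R1, R2 = resultant(a), resultant(b)
    R = resultant(list(a) + list(b))  # resultant of the pooled sample
    r_w = (R1 + R2) / N               # weighted mean resultant length
    K = 1 + 3 / (8 * _kappa(r_w))     # correction factor
    return K * (N - 2) * (R1 + R2 - R) / (N - R1 - R2)
```

The sketch is undefined when every angle in both samples is identical, because the denominator is then zero.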

ephysiopy.common.statscalcs.CircStatsResults(rho: float = np.nan, p: float = np.nan, rho_boot: float = np.nan, p_shuffled: float = np.nan, ci: float = np.nan) dataclass

Dataclass to hold results from circular statistics

ephysiopy.common.statscalcs.RegressionResults(name: str, phase: np.ndarray, regressor: np.ndarray, stats: CircStatsResults) dataclass

ephysiopy.common.statscalcs.ccc(t, p)

Calculates correlation between two random circular variables

Parameters:
    t (ndarray): The first variable.
    p (ndarray): The second variable.

Returns:
    float: The correlation between the two variables.
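
Assuming the coefficient is the T-linear association measure described in Fisher (1993), which this page cites further down, it can be computed from eight trigonometric sums without looping over pairs. The sketch below is a standard-library version with a hypothetical name (`circ_corr`); it is an illustration of that formula, not a copy of `ccc`.

```python
import math


def circ_corr(t, p):
    """T-linear circular-circular correlation of paired angles (radians)."""
    n = len(t)
    A = sum(math.cos(a) * math.cos(b) for a, b in zip(t, p))
    B = sum(math.sin(a) * math.sin(b) for a, b in zip(t, p))
    C = sum(math.cos(a) * math.sin(b) for a, b in zip(t, p))
    D = sum(math.sin(a) * math.cos(b) for a, b in zip(t, p))
    E = sum(math.cos(2 * a) for a in t)
    F = sum(math.sin(2 * a) for a in t)
    G = sum(math.cos(2 * b) for b in p)
    H = sum(math.sin(2 * b) for b in p)
    denom = math.sqrt((n ** 2 - E ** 2 - F ** 2) * (n ** 2 - G ** 2 - H ** 2))
    return 4 * (A * B - C * D) / denom
```

The coefficient is 1 when p equals t plus a constant offset and -1 when p equals a constant minus t.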

ephysiopy.common.statscalcs.ccc_jack(t, p)

Calculates jackknife estimates of the correlation between two circular random variables.

Parameters:
    t (ndarray): The first variable.
    p (ndarray): The second variable.

Returns:
    ndarray: The jackknife estimates of the correlation between the two variables.
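
The jackknife idea can be sketched generically: recompute the statistic n times, each time leaving out one observation pair. In the sketch below, `stat` is a hypothetical callable standing in for the circular correlation, and `jackknife_estimates` is a hypothetical name. Whether `ccc_jack` returns these raw leave-one-out values or pseudo-values derived from them is not stated here.

```python
def jackknife_estimates(stat, t, p):
    """Leave-one-out values of stat(t, p), one per observation pair."""
    t, p = list(t), list(p)
    if len(t) != len(p):
        raise ValueError("t and p must be paired (equal length)")
    return [
        stat(t[:i] + t[i + 1:], p[:i] + p[i + 1:])  # drop the i-th pair
        for i in range(len(t))
    ]
```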

ephysiopy.common.statscalcs.circCircCorrTLinear(theta, phi, regressor=1000, alpha=0.05, hyp=0, conf=True)

An almost direct port of AJ's MATLAB function for computing the correlation between two circular random variables.

Returns the correlation value (rho), the p-value, the bootstrapped correlation values, and the shuffled p-values and correlation values.

Parameters:
    theta (ndarray): The first circular variable to correlate (in radians).
    phi (ndarray): The second circular variable to correlate (in radians).
    regressor (int, optional): Number of permutations used to calculate the p-value by randomisation and to estimate confidence intervals by bootstrap. Leave empty to calculate the p-value analytically (NB confidence intervals will not be calculated). Default is 1000.
    alpha (float, optional): Hypothesis test level, e.g. 0.05 or 0.01. Default is 0.05.
    hyp (int, optional): Hypothesis to test: -1, 0 or 1 (negatively correlated, correlated in either direction, positively correlated). Default is 0.
    conf (bool, optional): Whether to calculate confidence intervals via jackknife or bootstrap. Default is True.

References

Fisher (1993), Statistical Analysis of Circular Data, Cambridge University Press, ISBN: 0 521 56890 0

ephysiopy.common.statscalcs.shuffledPVal(theta, phi, rho, regressor, hyp)

Calculates shuffled p-values for correlation

Parameters:
    theta (ndarray): The first circular variable to correlate (in radians).
    phi (ndarray): The second circular variable to correlate (in radians).

Returns:
    float: The shuffled p-value for the correlation between the two variables.
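
A shuffled (permutation) p-value can be sketched as follows: repeatedly shuffle one variable to break the pairing, recompute the statistic, and count how often the shuffled value is at least as extreme as the observed one in the direction given by `hyp`. In this sketch `stat` is a hypothetical callable standing in for the circular correlation, `n_perm` plays the role described for `regressor`, and the add-one convention in the final line is an assumption; none of this is claimed to match `shuffledPVal` exactly.

```python
import random


def shuffled_p_value(theta, phi, rho, stat, n_perm=1000, hyp=0, seed=None):
    """Permutation p-value for an observed statistic rho = stat(theta, phi)."""
    rng = random.Random(seed)
    shuffled = list(phi)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the theta-phi pairing
        rho_null = stat(theta, shuffled)
        if hyp == 1:       # positively correlated
            hits += rho_null >= rho
        elif hyp == -1:    # negatively correlated
            hits += rho_null <= rho
        else:              # correlated in either direction
            hits += abs(rho_null) >= abs(rho)
    # Add-one convention so the estimate is never exactly zero (assumption).
    return (hits + 1) / (n_perm + 1)
```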

ephysiopy.common.statscalcs.circRegress(x, t)

Finds approximation to circular-linear regression for phase precession.

Parameters:
    x (ndarray): The linear variable.
    t (ndarray): The phase variable (in radians).

Notes

Neither x nor t can contain NaNs, and the two must be paired (of equal length).