Module VITAE.utils
Functions
def reset_random_seeds(seed)
def get_embedding(z, dimred='umap', **kwargs)
-
Get low-dimensional embeddings for visualizations.
Parameters
z
:np.array
- [N, d] The latent variables.
dimred
:str
, optional- 'pca', 'tsne', or 'umap'.
**kwargs
:- Extra key-value arguments for dimension reduction algorithms.
Returns
embed
:np.array
- [N, 2] The latent variables after dimension reduction.
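To illustrate what the 'pca' option reduces to, here is a minimal sketch that projects [N, d] latent variables onto their top two principal components with NumPy alone. The helper name `pca_embedding` is hypothetical and is not part of VITAE.utils; the library's own implementation is not shown in this reference and may differ (for example by delegating to another package).

```python
import numpy as np

def pca_embedding(z, n_components=2):
    """Hypothetical helper: project [N, d] latent variables onto top principal components."""
    z = np.asarray(z, dtype=float)
    # Center each latent dimension before computing principal axes.
    centered = z - z.mean(axis=0, keepdims=True)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
z = rng.normal(size=(100, 8))   # stand-in for [N, d] latent variables
embed = pca_embedding(z)        # [N, 2] array suitable for plotting
print(embed.shape)
```

The resulting array has the [N, 2] shape that the plotting functions in this module expect for embed_z.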
def get_igraph(z, random_state=0)
-
Get igraph for running Leidenalg clustering.
Parameters
z
:np.array
- [N, d] The latent variables.
random_state
:int
, optional- The random state.
Returns
g
:igraph
- The igraph object of connectivities.
def leidenalg_igraph(g, res, random_state=0)
-
Leidenalg clustering on an igraph object.
Parameters
g
:igraph
- The igraph object of connectivities.
res
:float
- The resolution parameter for Leidenalg clustering.
random_state
:int
, optional- The random state.
Returns
labels
:np.array
- [N, ] The clustered labels.
def plot_clusters(embed_z, labels, plot_labels=False, path=None)
-
Plot the clustering results.
Parameters
embed_z
:np.array
- [N, 2] The latent variables after dimension reduction.
labels
:np.array
- [N, ] The clustered labels.
plot_labels
:boolean
, optional- Whether to plot text of labels or not.
path
:str
, optional- The path to save the figure.
def plot_marker_gene(expression, gene_name: str, embed_z, path=None)
-
Plot the marker gene.
Parameters
expression
:np.array
- [N, ] The expression of the marker gene.
gene_name
:str
- The name of the marker gene.
embed_z
:np.array
- [N, 2] The latent variables after dimension reduction.
path
:str
, optional- The path to save the figure.
def plot_uncertainty(uncertainty, embed_z, path=None)
-
Plot the uncertainty for all selected cells.
Parameters
uncertainty
:np.array
- [N, ] The uncertainty of all cells.
embed_z
:np.array
- [N, 2] The latent variables after dimension reduction.
path
:str
, optional- The path to save the figure.
def DE_test(Y, X, gene_names, i_test, alpha: float = 0.05)
-
Differential gene expression test.
Parameters
Y
:numpy.array
- [n, ] The expression matrix.
X
:numpy.array
- [n, 1+1+s] The constant term, the pseudotime and the covariates.
gene_names
:numpy.array
- [n, ] The names of all genes.
i_test
:numpy.array
- The indices of covariates to be tested.
alpha
:float
, optional- The cutoff of p-values.
Returns
res_df
:pandas.DataFrame
- The test results of expressed genes with two columns, the estimated coefficients and the adjusted p-values.
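The reference above does not say which multiple-testing correction produces the adjusted p-values. As an illustration only, the sketch below assumes the Benjamini-Hochberg procedure, a common choice for differential expression tests, and then applies the alpha cutoff. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used only for illustration; DE_test may use another method.
    """
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

p_values = [0.01, 0.04, 0.03, 0.20]   # made-up raw p-values, one per gene
adjusted = benjamini_hochberg(p_values)
alpha = 0.05
# Genes whose adjusted p-value falls below the cutoff.
significant = [i for i, p in enumerate(adjusted) if p < alpha]
print(adjusted, significant)
```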
def load_data(path, file_name, return_dict=False)
-
Load HDF5 data.
Parameters
path
:str
- The path of the h5 files.
file_name
:str
- The dataset name.
return_dict
:boolean
, optional- Whether to return the dict of the dataset or not.
Returns
data
:dict
- The dict containing count, grouping, etc. of the dataset.
dd
:anndata.AnnData
- The AnnData object of the dataset.
def compute_kernel(x, y, kernel='rbf', **kwargs)
-
Computes RBF kernel between x and y.
Parameters
x
:Tensor
- Tensor with shape [batch_size, z_dim]
y
:Tensor
- Tensor with shape [batch_size, z_dim]
Returns
The computed RBF kernel between x and y
def squared_distance(x, y)
-
Compute the pairwise Euclidean distance.
Parameters
x
:Tensor
- Tensor with shape [batch_size, z_dim]
y
:Tensor
- Tensor with shape [batch_size, z_dim]
Returns
The pairwise Euclidean distance between x and y.
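The function name suggests that the returned values are squared distances, although the description does not say so explicitly. Under that assumption, a pairwise computation can be sketched with NumPy broadcasting as follows. The helper name `pairwise_squared_distance` is hypothetical, and NumPy arrays stand in for the Tensor inputs.

```python
import numpy as np

def pairwise_squared_distance(x, y):
    """Return an [n, m] matrix whose (i, j) entry is ||x[i] - y[j]||^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Broadcast [n, 1, d] against [1, m, d], then sum over the feature axis.
    diff = x[:, None, :] - y[None, :, :]
    return np.sum(diff ** 2, axis=-1)

x = np.array([[0.0, 0.0], [1.0, 0.0]])
y = np.array([[0.0, 0.0], [0.0, 2.0]])
d2 = pairwise_squared_distance(x, y)
print(d2)
```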
def compute_mmd(x, y, kernel, **kwargs)
-
Computes Maximum Mean Discrepancy (MMD) between x and y.
Parameters
x
:Tensor
- Tensor with shape [batch_size, z_dim]
y
:Tensor
- Tensor with shape [batch_size, z_dim]
kernel
:str
- The kernel type used in MMD. It can be 'rbf', 'multi-scale-rbf' or 'raphy'.
**kwargs
:dict
- The parameters used in kernel function.
Returns
The computed MMD between x and y
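As a rough illustration of the quantity being computed, the sketch below evaluates the standard biased estimate of squared MMD with a single-bandwidth RBF kernel: mean k(x, x) + mean k(y, y) - 2 mean k(x, y). The names `rbf_kernel`, `mmd_squared`, and the `sigma` parameter are assumptions made for this example; the keyword arguments and exact kernel forms that compute_mmd accepts are not documented here.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    """RBF kernel matrix exp(-||x_i - y_j||^2 / (2 sigma^2)); sigma is an assumed parameter."""
    diff = x[:, None, :] - y[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))

def mmd_squared(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    return (rbf_kernel(x, x, sigma).mean()
            + rbf_kernel(y, y, sigma).mean()
            - 2.0 * rbf_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
a = rng.normal(size=(200, 4))            # sample from N(0, I)
b = rng.normal(size=(200, 4))            # second sample from the same distribution
c = rng.normal(loc=3.0, size=(200, 4))   # sample from a shifted distribution

same = mmd_squared(a, b)      # small: both samples share a distribution
shifted = mmd_squared(a, c)   # larger: the distributions differ
print(same, shifted)
```

The estimate is zero when both arguments are the same sample and grows as the two distributions move apart, which is why it can serve as a penalty for aligning latent distributions.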
def sample_z(args)
-
Samples from a standard Normal distribution with shape [size, z_dim] and applies the re-parametrization trick. This amounts to sampling from the latent-space distribution N(mu, var) computed in the _encoder function.
Parameters
args
:list
- List of [mu, log_var] computed in the _encoder function.
Returns
The computed Tensor of samples with shape [size, z_dim].
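The re-parametrization trick described above can be sketched in NumPy as z = mu + exp(0.5 * log_var) * eps with eps drawn from a standard Normal. The helper name `sample_z_sketch` and the explicit `rng` argument are assumptions for this example; the library function takes a single list argument and operates on Tensors.

```python
import numpy as np

def sample_z_sketch(mu, log_var, rng):
    """Draw z ~ N(mu, exp(log_var)) via the re-parametrization trick."""
    # Noise comes from a fixed standard Normal, independent of mu and log_var.
    eps = rng.standard_normal(mu.shape)
    # exp(0.5 * log_var) is the standard deviation.
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.full((10000, 2), 3.0)                 # mean 3 in every latent dimension
log_var = np.full((10000, 2), np.log(0.25))   # variance 0.25, i.e. std 0.5
z = sample_z_sketch(mu, log_var, rng)
print(z.shape, z.mean(), z.std())
```

Because the randomness is isolated in eps, the sample is a deterministic, differentiable function of mu and log_var, which is what allows gradients to flow back to the encoder in an automatic-differentiation framework.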
Classes
class Early_Stopping (warmup=0, patience=10, tolerance=0.001, relative=False, is_minimize=True)
-
The early-stopping monitor.
class Early_Stopping():
    '''
    The early-stopping monitor.
    '''
    def __init__(self, warmup=0, patience=10, tolerance=1e-3,
            relative=False, is_minimize=True):
        self.warmup = warmup
        self.patience = patience
        self.tolerance = tolerance
        self.is_minimize = is_minimize
        self.relative = relative

        self.step = -1
        self.best_step = -1
        self.best_metric = np.inf

        if not self.is_minimize:
            self.factor = -1.0
        else:
            self.factor = 1.0

    def __call__(self, metric):
        self.step += 1

        if self.step < self.warmup:
            return False
        elif (self.best_metric==np.inf) or \
                (self.relative and (self.best_metric-metric)/self.best_metric > self.tolerance) or \
                ((not self.relative) and self.factor*metric<self.factor*self.best_metric-self.tolerance):
            self.best_metric = metric
            self.best_step = self.step
            return False
        elif self.step - self.best_step>self.patience:
            print('Best Epoch: %d. Best Metric: %f.'%(self.best_step, self.best_metric))
            return True
        else:
            return False
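To show how such a monitor is typically driven from a training loop, the sketch below uses a simplified stand-in, `PatienceMonitor`, that mirrors only the absolute-tolerance, minimizing branch of the class above. The stand-in class and the loss values are hypothetical; in real use one would call an Early_Stopping instance in the same way.

```python
import math

class PatienceMonitor:
    """Simplified stand-in: absolute tolerance, minimization only."""
    def __init__(self, warmup=0, patience=10, tolerance=1e-3):
        self.warmup = warmup
        self.patience = patience
        self.tolerance = tolerance
        self.step = -1
        self.best_step = -1
        self.best_metric = math.inf

    def __call__(self, metric):
        self.step += 1
        if self.step < self.warmup:
            return False
        # An improvement must beat the best metric by more than the tolerance.
        if metric < self.best_metric - self.tolerance:
            self.best_metric = metric
            self.best_step = self.step
            return False
        # Stop once more than `patience` steps have passed without improvement.
        return self.step - self.best_step > self.patience

losses = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7]   # made-up validation losses
monitor = PatienceMonitor(patience=2)
stopped_at = None
for epoch, loss in enumerate(losses):
    if monitor(loss):
        stopped_at = epoch
        break

print(stopped_at, monitor.best_step, monitor.best_metric)
```

With patience=2, the loop stops at the third consecutive epoch that fails to improve on the best loss, and the monitor retains the step and value of the best epoch.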