Outlier Detection for Continuous Random Variables

Let's start with an example that demonstrates outlier detection using Hotelling's T² and SPE/DmodX for continuous random variables. I will use the wine dataset from sklearn, which contains 178 samples with 13 features and 3 wine classes.

Install the pca library first:

    pip install pca

Then load the dataset into a data frame:

    # Load other libraries
    from sklearn.datasets import load_wine
    import pandas as pd

    # Load dataset
    data = load_wine()

    # Make dataframe
    df = pd.DataFrame(index=data.target, data=data.data, columns=data.feature_names)
    print(df)
    #    alcohol  malic_acid  ash  ...

The data frame shows that the value range differs heavily per feature, so a normalization step is important. Normalization is built into the pca library and is enabled with `normalize=True`. During initialization we can specify the outlier detection methods separately: `ht2` for Hotelling's T² and `spe` for the SPE/DmodX method.

    # Import library
    from pca import pca

    # Initialize pca to also detect outliers.
    model = pca(normalize=True, detect_outliers=['ht2', 'spe'], n_std=2)

    # Fit and transform
    results = model.fit_transform(df)

After running the fit function, the pca library scores each sample on whether it is an outlier. For each sample, multiple statistics are collected in a data frame. The first four columns (`y_proba`, `p_raw`, `y_score`, and `y_bool`) hold the results of Hotelling's T² method. The last two columns (`y_bool_spe` and `y_score_spe`) are based on the SPE/DmodX method.

Hotelling's T² computes chi-square tests and P-values across the top `n_components`, which allows outliers to be ranked from strong to weak using `y_proba`. Note that the search space for outliers spans the dimensions PC1 to PC5, as the highest variance (and thus the outliers) is expected to be seen in the first few components.
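
To make the ranking idea more concrete, here is a minimal sketch of a textbook-style Hotelling's T² computed from per-sample scores on the retained components, followed by a chi-square P-value. It is not the pca library's implementation, which may combine the per-component tests differently. The helper names `chi2_sf` and `hotelling_t2` are hypothetical, and the toy numbers merely stand in for real PC scores, which would be uncorrelated across components.

```python
import math


def chi2_sf(x, df):
    """Survival function (1 - CDF) of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting points of the regularized upper incomplete gamma:
    # Q(1/2, z) = erfc(sqrt(z)) for odd df, Q(1, z) = exp(-z) for even df.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def hotelling_t2(scores):
    """Return (T2, P-value) lists for rows of scores on the retained components.

    Assumes the columns behave like uncorrelated PC scores, so T2 is the sum
    of squared, variance-scaled deviations over the components.
    """
    n, k = len(scores), len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(k)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in scores) / (n - 1)
        for j in range(k)
    ]
    t2 = [
        sum((row[j] - means[j]) ** 2 / variances[j] for j in range(k))
        for row in scores
    ]
    p_values = [chi2_sf(value, k) for value in t2]
    return t2, p_values


# Toy scores on two components (illustrative only); the last row is far away.
scores = [
    [0.5, -0.2], [-0.4, 0.3], [0.3, 0.1], [-0.6, -0.1],
    [0.2, 0.4], [-0.1, -0.3], [0.4, -0.4], [6.0, 3.0],
]

t2, p_values = hotelling_t2(scores)

# Rank samples from strongest to weakest outlier (smallest P-value first).
ranking = sorted(range(len(scores)), key=lambda i: p_values[i])
for i in ranking:
    print(i, round(t2[i], 3), p_values[i])
```

Sorting by P-value here plays the role that `y_proba` plays in the pca output: the smaller the value, the stronger the evidence that the sample is an outlier.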