Abstract:
Recent improvements in sensor technology have resulted in huge amounts of measured process data, along with an increasing need for compression prior to storage. Hence, efficient process data compression and reconstruction techniques are gaining importance in tasks such as process monitoring, system identification, and fault detection, both to save storage space and to facilitate data transmission between a data collecting node and a data processing node. The main purpose of this thesis is to achieve the highest degree of compression and de-noising while preserving the key features of the original data upon retrieval and decompression. With this aim, the most appropriate dimensionality reduction technique is selected from among Piecewise Aggregate Approximation (PAA), One Dimensional and Two Dimensional Discrete Cosine Transform (1D-DCT and 2D-DCT), and One Dimensional and Two Dimensional Discrete Wavelet Transform (1D-DWT and 2D-DWT) by adjusting the threshold parameter in filtering. The data sets used are PortSimHigh, PortSimLow, SELDI-TOF MS, and TEP. The techniques are evaluated in terms of compression ratio, reconstruction error norm, % relative global error, and % relative maximum error for different α-% thresholding levels. It is concluded that high compression levels cannot be achieved with thresholding percentile values below 90% in either the DCT or the DWT methods, whereas at higher threshold levels the quality of reconstruction deteriorates in return for better compression. Furthermore, it is revealed that the efficacy of the compression methods strongly depends on the data characteristics. DCT is suitable for smooth data sets with random trends, whereas DWT is preferred for noisy data sets with high peak content. 2D-DCT and 2D-DWT are favored for multivariable data sets with highly correlated columns.
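The trade-off between compression and reconstruction quality under α-% thresholding can be illustrated with a minimal 1D-DCT sketch. It is not the implementation used in this work, and it rests on several assumptions: an orthonormal DCT-II; a threshold that zeroes the α percent of coefficients with the smallest magnitude; a compression ratio taken as the number of samples divided by the number of retained coefficients; and a % relative global error taken as 100 times the ratio of the 2-norm of the residual to the 2-norm of the signal. The function names and the synthetic test signal are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a 1-D sequence (O(n^2), for illustration only)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n)
                 * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n))
        out.append(s)
    return out

def threshold(coeffs, alpha):
    """Zero the alpha percent of coefficients with the smallest magnitude
    (assumed reading of the alpha-% thresholding level)."""
    n_drop = int(len(coeffs) * alpha / 100.0)
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]))
    kept = list(coeffs)
    for k in order[:n_drop]:
        kept[k] = 0.0
    return kept

def compress(signal, alpha):
    """Return (reconstruction, compression ratio, % relative global error).
    Both metric definitions are assumptions made for this sketch."""
    kept = threshold(dct(signal), alpha)
    recon = idct(kept)
    n_kept = sum(1 for c in kept if c != 0.0)
    ratio = len(signal) / max(n_kept, 1)
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(signal, recon)))
    rel = 100.0 * err / math.sqrt(sum(a * a for a in signal))
    return recon, ratio, rel

# Synthetic smooth signal (a sine plus a linear trend), not one of the thesis data sets.
signal = [math.sin(2 * math.pi * t / 64) + 0.5 * t / 64 for t in range(64)]
for alpha in (50, 90, 98):
    _, ratio, rel_err = compress(signal, alpha)
    print(f"alpha={alpha}%  ratio={ratio:.2f}  rel. global error={rel_err:.3f}%")
```

Because the transform is orthonormal, discarding more of the smallest coefficients can only increase the 2-norm error, so in this sketch the error never decreases as α increases, while the ratio increases. A DWT variant would follow the same pattern with a wavelet decomposition in place of `dct` and `idct`.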