Soft Set Model for Mining Amino Acid Associations in Peptide Sequences of Mycobacterium Tuberculosis Complex (MTBC)
Keywords:
Association Rule; Confidence; Data Mining; Soft Set; Support
Abstract
A large amount of molecular data is available in online biological databases for analysis. These data contain information that can be used in the biomedical industry. A major difficulty in analyzing them is the uncertainty in the relationships among their various fields. Several association rule mining algorithms exist, but they do not fully address the uncertainty in molecular data. Some of this uncertainty arises from ignoring parameters, because objects and their patterns depend on those parameters. The degree of relationship among the amino acids present in molecular sequences depends on parameters such as length range and species. In this paper, a soft set approach is proposed for mining amino acid associations in peptide sequences of the Mycobacterium tuberculosis complex (MTBC). The soft set is employed to model the degree of relationship of amino acids with parameters such as length range and species. Association rules are generated and used to compute the secondary structures and physicochemical properties of MTBC peptide sequences. The patterns obtained can be used as signatures that provide better insight into the molecular processes of the disease.
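To make the keywords concrete, the following is a minimal sketch of how a soft set can yield support and confidence for amino acid association rules. It is not the paper's method: the peptides are invented toy strings rather than MTBC data, the function names (`soft_set`, `support`, `confidence`) are hypothetical, and it assumes the common formulation in which each amino acid acts as a parameter mapped to the set of peptides that contain it. Restricting the input to peptides of one length range or one species would give parameter-specific rules of the kind the abstract describes.

```python
# Toy peptides, invented for illustration only (not MTBC data).
peptides = {
    "p1": "MKVLA",
    "p2": "MKAG",
    "p3": "KVLG",
    "p4": "MAVK",
}


def soft_set(peptides):
    """Map each amino acid (the parameter) to the ids of peptides containing it."""
    mapping = {}
    for pid, seq in peptides.items():
        for residue in set(seq):
            mapping.setdefault(residue, set()).add(pid)
    return mapping


def support(F, items):
    """Count the peptides that contain every amino acid in `items`."""
    sets = [F.get(a, set()) for a in items]
    return len(set.intersection(*sets)) if sets else 0


def confidence(F, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent; 0.0 if the antecedent never occurs."""
    denom = support(F, antecedent)
    if denom == 0:
        return 0.0
    return support(F, set(antecedent) | set(consequent)) / denom


F = soft_set(peptides)
rule_support = support(F, {"M", "A"})          # peptides containing both M and A
rule_confidence = confidence(F, {"M"}, {"A"})  # fraction of M-peptides that also contain A
```

In this toy data every peptide containing M also contains A, so the rule M -> A has confidence 1.0, whereas V -> L holds in only two of the three peptides containing V.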