Please use this identifier to cite or link to this item: http://localhost/handle/Hannan/137349
Full metadata record
DC Field | Value | Language
dc.contributor.author | Feng Wang | en_US
dc.contributor.author | Tianhua Xu | en_US
dc.contributor.author | Tao Tang | en_US
dc.contributor.author | MengChu Zhou | en_US
dc.contributor.author | Haifeng Wang | en_US
dc.date.accessioned | 2013 | en_US
dc.date.accessioned | 2020-04-06T07:05:57Z | -
dc.date.available | 2020-04-06T07:05:57Z | -
dc.date.issued | 2017 | en_US
dc.identifier.other | 10.1109/TITS.2016.2521866 | en_US
dc.identifier.uri | http://localhost/handle/Hannan/137349 | -
dc.description.abstract | A vast amount of text data is recorded in the forms of repair verbatim in railway maintenance sectors. Efficient text mining of such maintenance data plays an important role in detecting anomalies and improving fault diagnosis efficiency. However, unstructured verbatim, high-dimensional data, and imbalanced fault class distribution pose challenges for feature selections and fault diagnosis. We propose a bilevel feature extraction-based text mining that integrates features extracted at both syntax and semantic levels with the aim to improve the fault classification performance. We first perform an improved χ² statistics-based feature selection at the syntax level to overcome the learning difficulty caused by an imbalanced data set. Then, we perform a prior latent Dirichlet allocation-based feature selection at the semantic level to reduce the data set into a low-dimensional topic space. Finally, we fuse fault features derived from both syntax and semantic levels via serial fusion. The proposed method uses fault features at different levels and enhances the precision of fault diagnosis for all fault classes, particularly minority ones. Its performance has been validated by using a railway maintenance data set collected from 2008 to 2014 by a railway corporation. It outperforms traditional approaches. | en_US
dc.format.extent | 49, | en_US
dc.format.extent | 58 | en_US
dc.publisher | IEEE | en_US
dc.relation.haspart | 7453147.pdf | en_US
dc.title | Bilevel Feature Extraction-Based Text Mining for Fault Diagnosis of Railway Systems | en_US
dc.type | Article | en_US
dc.journal.volume | 18 | en_US
dc.journal.issue | 1 | en_US
Appears in Collections: 2017

Files in This Item:
File | Size | Format
7453147.pdf | 2.17 MB | Adobe PDF