Status : Verified
Personal Name Vesagas, Terence John C.
Resource Title Building filipiniana indexes towards the development of a standard filipiniana subject heading list
Date Issued May 2018
Abstract This research was conducted to explore the creation of Filipiniana indexes as part of an effort to push for the eventual creation of a standardized Filipiniana Subject Heading List. As there is no existing subject heading system or index based on the Filipiniana, this research aimed to advance the formation of a localized system by creating indexes for the Filipiniana collection. These indexes provide the first step towards subject analysis by supplying key terms and contexts gained through the processing of the collection. With the growth of technology, methods such as automatic indexing have become a point of interest, because automatic indexing uses computational power to perform large-scale tasks with ease. Thus, big data analysis tools were used as the primary instrument to process the raw data and produce the final outputs. The UP Filipiniana collection was chosen for this research as it is the largest collection of Filipiniana materials in the country.

The raw dataset consists of 71,891 book titles from the Filipiniana collection. The dataset was run through an automatic analysis composed of three parts: term frequency analysis, clustering analysis, and collocation analysis. The results describe the subjects covered by the UP Filipiniana collection. These descriptions include a list of the most frequent terms within the collection, clusters of related terms, and the most frequent collocated terms in the collection. These outputs were then processed into a title index and a permuted index, based on the most prominent terms and clusters. The final outputs, after being compared and analyzed, showed the viability of using big data analysis methods to create indexes for the eventual creation of a standardized Filipiniana Subject Heading List. Packages in R were utilized as they were designed and optimized for the methods of analysis required in automatic indexing. This resear
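
As a rough illustration of the kind of processing the abstract describes, the following minimal sketch computes term frequencies, adjacent-word collocations, and a permuted (keyword-in-context style) index over a handful of titles. It is not the thesis's method: the thesis reports using R packages, whereas this sketch uses plain Python, and the sample titles, the stopword list, and all function names here are hypothetical. Collocation is approximated as adjacent word pairs after stopword removal, and the clustering step is not covered.

```python
from collections import Counter

# Hypothetical sample titles; not taken from the thesis dataset.
TITLES = [
    "History of the Philippine Revolution",
    "Philippine Folk Literature",
    "Essays on Philippine History",
]

# Small illustrative stopword list (an assumption, not the thesis's list).
STOPWORDS = {"of", "the", "on", "a", "an", "and", "in"}


def content_words(title):
    """Lowercase a title, split on whitespace, and drop stopwords."""
    return [w for w in title.lower().split() if w not in STOPWORDS]


def term_frequencies(titles):
    """Count how often each non-stopword term occurs across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(content_words(title))
    return counts


def collocations(titles):
    """Count adjacent word pairs within each title, after stopword removal."""
    counts = Counter()
    for title in titles:
        words = content_words(title)
        counts.update(zip(words, words[1:]))
    return counts


def permuted_index(titles):
    """Return (keyword, rotated title) pairs sorted by keyword.

    Each title is rotated so that the keyword comes first; a "/" marks
    where the original beginning of the title was moved to the end.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOPWORDS:
                continue
            tail = ["/"] + words[:i] if i else []
            entries.append((word.lower(), " ".join(words[i:] + tail)))
    return sorted(entries)


if __name__ == "__main__":
    print(term_frequencies(TITLES).most_common(3))
    print(collocations(TITLES).most_common(3))
    for keyword, entry in permuted_index(TITLES):
        print(f"{keyword:<12} {entry}")
```

A permuted index built this way lists every title once under each of its significant words, which is what lets the most prominent terms act as entry points into the collection.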
Degree Course Bachelor of Library and Information Science
Language English
Keyword Thesis; Index; Indexes; Subject analysis; Big data analysis; Text mining; Clustering; Collation; UP Filipiniana; Filipiniana
Material Type Thesis/Dissertation
Preliminary Pages
127.84 Kb
Category : F - Regular work, i.e., it has no patentable invention or creation, the author does not wish for personal publication, there is no confidential information.
 
Access Permission : Open Access