Hashtag Clustering using Genetic Algorithm by Nileshkumar Gambhava

Material type: Film
Publication details: Ahmedabad Nirma Institute of Technology 2018
Description: 98p Ph. D. Thesis with Synopsis and CD
DDC classification:
  • TT000079 GAM
Holdings
Item type | Current library | Collection | Call number | Status | Date due | Barcode | Item holds
Thesis | NIMA Knowledge Centre | Reference | TT000079 PAT | Not For Loan | | TT000079 |
CD/DVD | NIMA Knowledge Centre | Reference | TT000079 PAT | Not For Loan | | TT000079-1 |
Synopsis | NIMA Knowledge Centre | Reference | TT000079 PAT | Not For Loan | | TT000079-2 |
Total holds: 0

Guided by: Dr. K. Kotecha With Synopsis and CD
11EXTPHDE78

ABSTRACT:
It is hard to believe that within a decade, social media platforms have revolutionized
the world: from the way we interact with our friends, family, and
acquaintances to the ease of doing business and their important role in politics and social
issues. Twitter is one of the most influential microblogging platforms in this
revolutionary era of social media. Communication through social media networks has
provided the fastest and richest platform for information spreading and opinion
sharing. Intelligent exploration of this information can generate valuable records, as
they represent the essence of real-world societal aspects. Tweets, i.e. short messages
posted by users to interact with the social world, can be used for trend prediction,
timeline generation, community detection, etc. Extracting useful information from
tweets is challenging because of the huge volume of short, unstructured, noisy tweets.
Hashtags, i.e. words or phrases preceded by a hash sign (#), are more relevant
for extracting information. Grouping similar hashtags may play a vital role in extracting
information from the cluttered world of social media, for several reasons.
First, users use different hashtags for the same topic, e.g. #deepnet, #deeplearning,
#dl, etc. for deep learning. Second, users use multiple similar hashtags in a tweet to
place the tweet in a broader domain or in multiple similar domains, e.g. #machinelearning
#ai #neuralnets. Third, a hashtag can have multiple meanings: #AI may
indicate "Artificial Intelligence", "Adobe Illustrator", or "Area of Interest".
The problem of grouping similar hashtags is none other than the classical clustering
problem. Hashtag clustering is one of the important techniques for extracting
information by categorizing tweets into different clusters. Hashtag clustering is a
challenging task for two major reasons. First, the number of clusters is not known
in advance; second, domain-related information is not available. Genetic Algorithm
(GA) is an adaptive heuristic search algorithm that mimics the evolutionary process of
natural selection and follows the principle of survival of the fittest. We propose a
model for hashtag clustering using GA that addresses the mentioned issues.
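
The abstract does not state the chromosome encoding, fitness function, or genetic operators used in the thesis. Purely as an illustration of how a GA can cluster hashtags without fixing the number of clusters in advance, the following is a minimal sketch under assumptions of our own: hashtag similarity is taken to be co-occurrence within a tweet, a chromosome assigns one cluster label to each hashtag, and the names `cooccurrence`, `fitness`, and `evolve` are hypothetical. It is not the model proposed in the thesis.

```python
import random
import re
from collections import Counter
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")


def cooccurrence(tweets):
    """Count how often each pair of hashtags appears in the same tweet."""
    pairs = Counter()
    for text in tweets:
        tags = sorted({t.lower() for t in HASHTAG.findall(text)})
        pairs.update(combinations(tags, 2))
    return pairs


def fitness(labels, tags, pairs):
    """Assumed fitness: reward co-occurring hashtags that share a cluster,
    penalize co-occurring hashtags that are split across clusters."""
    index = {tag: i for i, tag in enumerate(tags)}
    score = 0
    for (a, b), n in pairs.items():
        score += n if labels[index[a]] == labels[index[b]] else -n
    return score


def evolve(tags, pairs, pop_size=30, generations=60, mutation_rate=0.1, seed=0):
    """Evolve label assignments; returns (clusters, best fitness)."""
    rng = random.Random(seed)
    n = len(tags)
    # Labels range over 0..n-1, so the number of clusters actually used
    # is left to the search rather than fixed up front.
    population = [[rng.randrange(n) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, tags, pairs), reverse=True)
        survivors = population[: pop_size // 2]  # fitter half survives unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(genes) for genes in zip(p1, p2)]
            # Mutation: occasionally reassign a hashtag to a random label.
            for i in range(n):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n)
            children.append(child)
        population = survivors + children
    best = max(population, key=lambda c: fitness(c, tags, pairs))
    clusters = {}
    for tag, label in zip(tags, best):
        clusters.setdefault(label, set()).add(tag)
    return list(clusters.values()), fitness(best, tags, pairs)
```

To try it, build `pairs = cooccurrence(tweets)` from a list of tweet strings, take `tags = sorted({t for pair in pairs for t in pair})`, and call `evolve(tags, pairs)`. A random initial population is used here for brevity; the thesis instead proposes its own heuristic for initial population generation, which the abstract does not detail.
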
To the best of our knowledge, this is the first attempt to cluster hashtags using
GA. We have proposed a novel heuristic for initial population generation that generates
candidate solutions from different regions of the search space. We have outlined a GA
framework to cluster hashtags and experimented with different sets of genetic
parameters. We have tested our model on a large set of tweets downloaded from
76 popular Indian media Twitter accounts. The results obtained by our model are
compared with those of a crowdsourcing method, as there is no other source available to validate
the quality of the results. Our results are superior to the crowdsourcing results.
Also, the users' validation of the resultant clusters proves the accuracy of the
proposed model.
We demonstrated the applicability of the hashtag clusters generated using our
model to event timeline generation. We propose a novel formula,
named prominenceRank of a tweet, to select high-impact tweets for generating
an event timeline using hashtag clusters. The proposed formula is evaluated using
heuristics to generate the timeline for three major events found in the tweets dataset. The
timeline generated shows the efficiency of our approach in terms of
substantial diversity, relevance, and effectiveness of the proposed prominenceRank-based
heuristic.

© 2025 by NIMA Knowledge Centre, Ahmedabad.