ALIS: Algorithmic Library for Scalability documentation


Link Analysis


As the internet gained popularity and ever more websites and web pages were created, the need for efficient search engines that could find relevant information became apparent. Several search engines were developed, but Google’s PageRank algorithm had the greatest impact because it gave fast results and was the first to defeat spammers. This section looks at Google’s PageRank algorithm and its efficient computation, the tactics spammers use to steer search results toward their target pages, and the methods used to counter them.

  • PageRank
  • PageRank for Large Data
  • PageRank using MapReduce
  • Topic-Sensitive PageRank
  • Hubs and Authorities


By AIM PhD in DS 2024
© Copyright 2022, AIM PhD in DS 2024.