    My Viral Magazine

    Semantic Keyword Clustering in Python

    By Bilal Admin, May 6, 2021, 4 Mins Read

    I have already shared some clustering approaches that use a TF-IDF vectorizer to cluster keywords in Python. This works well for grouping keywords that share identical text strings; however, it cannot group keywords by meaning and semantic relationships.
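To make that limitation concrete, here is a minimal sketch. It uses plain token overlap (Jaccard similarity) as a simplified stand-in for TF-IDF, not the vectorizer itself, and the keywords are made up for illustration. Two keywords with the same meaning but no shared words score zero, while a pair that merely shares words scores high.

```python
def token_jaccard(a: str, b: str) -> float:
    """Share of distinct lowercase tokens that two keywords have in common."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical keywords, chosen only for illustration.
same_words = token_jaccard("cheap flights", "cheap flights london")
same_meaning = token_jaccard("cheap flights", "low cost airfare")

print(f"shared words:   {same_words:.2f}")
print(f"shared meaning: {same_meaning:.2f}")
```

Any purely string-based measure has the same blind spot, which is why the approach in this post looks at search results instead.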

    One way to tackle semantics is to build, for example, word2vec models and cluster keywords with Word Mover’s Distance. The downside: you have to put some effort into building such models. For this reason, I would like to show you a more accessible solution that you can download and run.

    Use the Google SERP results to discover semantic relationships

    Google uses NLP models to deliver the best search results for the user. Yes, it’s a black box; however, we can use it to our advantage. Rather than building our own models, we use this black box to cluster keywords by their semantics in Python. Here is how the program logic works:

    1. The starting point is a list of keywords for a topic
    2. For each keyword, we scrape the SERP results
    3. A graph is created from the relationship between keywords and ranking pages: if the same pages rank for different keywords, those keywords are considered related. This is the principle behind the semantic keyword clusters
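The three steps can be sketched with the standard library alone. This is an illustration under assumptions, not the script from this post: the SERP data is made up, `cluster_by_shared_urls` is a hypothetical helper, and it groups keywords by simple connected components rather than the community detection the actual script uses.

```python
from collections import defaultdict
from typing import Dict, List, Set


def cluster_by_shared_urls(serp: Dict[str, List[str]]) -> List[Set[str]]:
    """Group keywords that are linked, directly or indirectly, by a shared ranking URL."""
    # Invert the mapping: which keywords does each URL rank for?
    keywords_by_url = defaultdict(set)
    for keyword, urls in serp.items():
        for url in urls:
            keywords_by_url[url].add(keyword)

    # Two keywords are neighbours when at least one URL ranks for both.
    neighbours = {keyword: set() for keyword in serp}
    for keywords in keywords_by_url.values():
        for keyword in keywords:
            neighbours[keyword] |= keywords - {keyword}

    # Collect connected components with an iterative depth-first search.
    seen, clusters = set(), []
    for start in serp:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Made-up SERP data: keyword -> URLs ranking for it.
serp = {
    "cheap flights": ["a.example/flights", "b.example/deals"],
    "low cost airfare": ["b.example/deals", "c.example/air"],
    "python tutorial": ["d.example/python"],
}
clusters = cluster_by_shared_urls(serp)
print(clusters)
```

Connected components merge everything that is linked by any chain of shared pages, so on real data they tend to produce a few very large groups. Community detection splits such groups further, which is presumably why the script relies on it.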

    Let’s put everything together in Python

    The Python Script covers these functionalities:

    • Download the SERPs for a given list of keywords using Google’s Custom Search Engine. The results are saved to an SQLite database. You need to set up a Custom Search API for this. Once that is done, you can use the free quota of 100 requests per day; the paid plan costs $5 per 1,000 requests if you have larger keyword sets and need results right away. If you have time, you can stay within the free quota thanks to SQLite: the SERP results are appended to the table on every run (take a new set of 100 keywords the next day, once the free quota is available again). In the Python script, you have to set these variables:
    • CSV_FILE="keywords.csv" => store your keywords here
    • LANGUAGE = "en"
    • COUNTRY = "en"
    • API_KEY="xxxxxxx"
    • CSE_ID="xxxxxxx"

      Running getSearchResult(CSV_FILE,LANGUAGE,COUNTRY,API_KEY,CSE_ID,DATABASE,SERP_TABLE) will write the SERP results to the database
    • The clustering is done with networkx and the community detection module. The data is fetched from the SQLite database; the clustering is started with getCluster(DATABASE,SERP_TABLE,CLUSTER_TABLE,TIMESTAMP)
    • The clustering results can be found in the SQLite table, which is named “keyword_clusters” by default unless you change the name.
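The append-per-run idea can be illustrated with a small sketch. The column names `requestTimestamp`, `searchTerms` and `link` appear in the script's queries; everything else here (the three-column schema, the table name `serp_results`, and the helpers `save_serp_rows` and `latest_run`) is an assumption made for the example, not the script's actual storage code.

```python
import sqlite3
from datetime import datetime


def save_serp_rows(connection, table, rows, timestamp=None):
    """Append (searchTerms, link) rows for one run, all stamped with the same timestamp."""
    timestamp = timestamp or datetime.now().isoformat(timespec="seconds")
    # Assumed minimal schema; the real table may hold more columns.
    connection.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(requestTimestamp TEXT, searchTerms TEXT, link TEXT)"
    )
    connection.executemany(
        f"INSERT INTO {table} VALUES (?, ?, ?)",
        [(timestamp, term, link) for term, link in rows],
    )
    connection.commit()
    return timestamp


def latest_run(connection, table):
    """Return only the rows of the most recent run."""
    query = (
        f"SELECT searchTerms, link FROM {table} "
        f"WHERE requestTimestamp = (SELECT MAX(requestTimestamp) FROM {table})"
    )
    return connection.execute(query).fetchall()


# In-memory database and made-up rows, for illustration only.
connection = sqlite3.connect(":memory:")
save_serp_rows(connection, "serp_results",
               [("cheap flights", "https://a.example/")], "2021-05-05T09:00:00")
save_serp_rows(connection, "serp_results",
               [("low cost airfare", "https://b.example/")], "2021-05-06T09:00:00")

total = connection.execute("SELECT COUNT(*) FROM serp_results").fetchone()[0]
newest = latest_run(connection, "serp_results")
print(total, newest)
```

Because every run is appended under its own timestamp, earlier days stay in the table and a later step can choose between the latest run and a specific one.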

    You can get the full code below:

    # Semantic Keyword Clustering by Pemavor.com
    # Author: Stefan Neefischer (stefan.neefischer@gmail.com)
    from googleapiclient.discovery import build
    import pandas as pd
    import Levenshtein
    from datetime import datetime
    from fuzzywuzzy import fuzz
    from urllib.parse import urlparse
    from tld import get_tld
    import langid
    import json
    import numpy as np
    import networkx as nx
    import community
    import sqlite3
    import math
    import io
    from collections import defaultdict



    def cluster_return(searchTerm,partition):
        return partition[searchTerm]



    def language_detection(str_lan):
        lan=langid.classify(str_lan)
        return lan[0]



    def extract_domain(url, remove_http=True):
        uri = urlparse(url)
        if remove_http:
            domain_name = f"{uri.netloc}"
        else:
            # keep the scheme (http/https) in front of the host name
            domain_name = f"{uri.scheme}://{uri.netloc}"
        return domain_name



    def extract_mainDomain(url):
        res = get_tld(url, as_object=True)
        return res.fld



    def fuzzy_ratio(str1,str2):
        return fuzz.ratio(str1,str2)
    def fuzzy_token_set_ratio(str1,str2):
        return fuzz.token_set_ratio(str1,str2)



    def google_search(search_term, api_key, cse_id,hl,gl, **kwargs):
        try:
            service = build("customsearch", "v1", developerKey=api_key,cache_discovery=False)
            res = service.cse().list(q=search_term,hl=hl,gl=gl,fields='queries(request(totalResults,searchTerms,hl,gl)),items(title,displayLink,link,snippet)',num=10, cx=cse_id, **kwargs).execute()
            return res
        except Exception as e:
            print(e)
            return(e)



    def google_search_default_language(search_term, api_key, cse_id,gl, **kwargs):
        try:
            service = build("customsearch", "v1", developerKey=api_key,cache_discovery=False)
            res = service.cse().list(q=search_term,gl=gl,fields='queries(request(totalResults,searchTerms,hl,gl)),items(title,displayLink,link,snippet)',num=10, cx=cse_id, **kwargs).execute()
            return res
        except Exception as e:
            print(e)
            return(e)



    def getCluster(DATABASE,SERP_TABLE,CLUSTER_TABLE,TIMESTAMP="max"):
        dateTimeObj = datetime.now()
        connection = sqlite3.connect(DATABASE)
        if TIMESTAMP=="max":
            df = pd.read_sql(f'select * from {SERP_TABLE} where requestTimestamp=(select max(requestTimestamp) from {SERP_TABLE})', connection)
        else:
            df = pd.read_sql(f'select * from {SERP_TABLE} where requestTimestamp="{TIMESTAMP}"', connection)
        G = nx.Graph()
        # add graph nodes from dataframe column
        G.add_nodes_from(df['searchTerms'])
        # add edges between graph nodes:
        for index, row in df.iterrows():
            df_link=df[df["link"]==row["link"]]
            for index1, row1 in df_link.iterrows():
                G.add_edge(row["searchTerms"], row1['searchTerms'])

        # compute the best partition for community (clusters)
        partition = community.best_partition(G)
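`community.best_partition` returns a dictionary that maps each graph node, here a keyword, to a community id; this is also what `cluster_return` reads from. A minimal sketch of turning such a dictionary into keyword groups could look like this. `group_by_cluster` and the sample partition are hypothetical and not part of the original script.

```python
from collections import defaultdict


def group_by_cluster(partition):
    """Invert a keyword -> cluster id mapping into cluster id -> sorted keyword list."""
    clusters = defaultdict(list)
    for keyword, cluster_id in partition.items():
        clusters[cluster_id].append(keyword)
    return {cluster_id: sorted(keywords) for cluster_id, keywords in clusters.items()}


# Made-up partition in the shape best_partition produces.
partition = {"cheap flights": 0, "low cost airfare": 0, "python tutorial": 1}
clusters = group_by_cluster(partition)

for cluster_id, keywords in clusters.items():
    print(cluster_id, keywords)
```

Grouped this way, each cluster can be written out as one row per keyword or reviewed by hand before it is used for page planning.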