How I Built a Django Job Scraper that Saves to Google Sheets

Last month, I was stuck in the usual routine: checking job boards by hand, copying listings into spreadsheets, and still missing the best opportunities. After too many hours of this boring work, a thought came up: why not automate the whole process?

So I started planning a Django project that could scrape job listings from sites like LinkedIn and Indeed and automate the whole workflow. After trying several approaches on sites like Indeed, though, I got stuck: most big job boards have bot protections that block scrapers. I even tried Selenium, but it didn’t work reliably. In the end, I used BeautifulSoup4 and the requests library to extract the data.

I scraped data from the TimesJobs website and saved it both in a Google Sheet and in a Django SQLite database.

The Problem That Drove Me Crazy

Every morning, I would open 5–6 different job boards, search for the same keywords, scroll through hundreds of listings, and manually copy the good ones into my tracking spreadsheet. By the time I was done, I was already mentally exhausted—before even starting to write cover letters.

The worst part? I kept missing jobs that were posted while I was sleeping or busy with other tasks. Some great opportunities would disappear before I even got a chance to see them.

I knew there had to be a better way.

What I Built (And Why It Actually Works)

My solution is pretty straightforward: a Python script, built with Django, that automatically scrapes job listings from multiple sources and saves everything into a Google Sheet and a SQLite database. But here’s what makes it actually useful:

  • Smart filtering: It only grabs jobs that match my specific criteria
  • Duplicate detection: No more seeing the same job posted across different boards
  • Scheduled updates: The sheet refreshes automatically every few hours
  • Clean formatting: Everything is organized and readable

The Tech Stack (Nothing Too Fancy)

I kept things simple because, honestly, I wanted something I could maintain without pulling my hair out:

  • Python Django for the scraping logic
  • Beautiful Soup and Requests for web scraping
  • Google Sheets API for the integration
  • Schedule library for automation

Lessons I Learned the Hard Way

Rate limiting is real: I got blocked from a few sites in the first week because I was being too aggressive with requests. Had to add delays and retry logic.
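
For context, here’s a minimal sketch of the kind of fetch helper I ended up with: a polite delay between attempts and a couple of retries. The function name and the exact delay/retry numbers are just my choices, not anything the sites require.

import time
import requests

def fetch(url, retries=3, delay=5):
    """Fetch a page politely: pause between attempts and retry on failure."""
    headers = {"User-Agent": "Mozilla/5.0"}
    for attempt in range(1, retries + 1):
        try:
            response = requests.get(url, headers=headers, timeout=10)
            if response.status_code == 200:
                return response
            print(f"Got status {response.status_code}, retry {attempt}/{retries}")
        except requests.RequestException as exc:
            print(f"Request failed: {exc} (retry {attempt}/{retries})")
        time.sleep(delay * attempt)  # back off a little more each time
    return None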

Websites change their structure: What worked perfectly in January broke in February when one site redesigned its job listing pages. Now I build in more flexibility from the start.
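
In practice, “flexibility” mostly means never assuming a selector will hit. A tiny helper like this (a sketch; the h2 in the example is illustrative) turns a missing tag into an empty string instead of an AttributeError:

def safe_text(parent, name, **attrs):
    """Return the stripped text of a child tag, or '' if the markup changed."""
    tag = parent.find(name, **attrs)
    return tag.get_text(strip=True) if tag else ""

# e.g. title = safe_text(card, "h2") instead of card.find("h2").text.strip()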

Google Sheets API has quotas: You can’t just hammer their API endlessly. I learned to batch my updates and cache data locally.
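
Concretely, instead of calling append_row() once per job, I now collect the new rows and push them in a single request. Here’s a minimal sketch using gspread’s append_rows, assuming the same four-column layout and the get_google_sheet() helper shown in step 4 below (update_sheet_batched is just a name I’m using here):

def update_sheet_batched(job_data):
    sheet = get_google_sheet()
    existing_links = {row[3] for row in sheet.get_all_values()[1:]}  # skip the header row
    new_rows = [
        [job["title"], job["company"], job["location"], job["link"]]
        for job in job_data
        if job["link"] not in existing_links
    ]
    if new_rows:
        sheet.append_rows(new_rows)  # one API call instead of one per job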

Job descriptions are messy: The amount of inconsistent HTML and weird formatting in job posts is honestly astounding. Cleaning this data took way more time than I expected.
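
Most of that cleanup boiled down to stripping tags and collapsing whitespace. A small sketch of the kind of helper I mean:

import re
from bs4 import BeautifulSoup

def clean_description(raw_html):
    """Strip HTML tags and collapse runs of whitespace into single spaces."""
    text = BeautifulSoup(raw_html, "html.parser").get_text(separator=" ")
    return re.sub(r"\s+", " ", text).strip()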

Want to Build Your Own?

Here’s a step-by-step guide to building a Django project that scrapes job listings using BeautifulSoup4 and requests, and saves the data in both Google Sheets and your Django models:

1. Set Up Your Django Project

  • Install Django:
pip install django
  • Start a new project:
django-admin startproject jobscraper
cd jobscraper
python manage.py startapp jobs
  • Add your app to INSTALLED_APPS in settings.py.
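
That last step is just one extra entry in the existing list (assuming you kept the app name jobs from the startapp command above):

# settings.py
INSTALLED_APPS = [
    # ... Django's default apps ...
    "jobs",
]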

2. Create Your Job Model

Define a model to store job listings in jobs/models.py:

from django.db import models


class Job(models.Model):
    title = models.CharField(max_length=255)
    company = models.CharField(max_length=255, blank=True, null=True)
    location = models.CharField(max_length=255, blank=True, null=True)
    experience = models.CharField(max_length=100, blank=True, null=True)
    salary = models.CharField(max_length=100, blank=True, null=True)
    posted = models.CharField(max_length=100, blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    skills = models.TextField(blank=True, null=True)  # store as comma-separated string
    link = models.URLField(unique=True)  # prevent duplicates
    created_at = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        return f"{self.title} at {self.company}"
  • Run migrations:
python manage.py makemigrations
python manage.py migrate

3. Scrape Job Listings with BeautifulSoup4 and Requests

  • Install libraries:
pip install beautifulsoup4 requests gspread oauth2client
  • Example scraping script (jobs/scraper.py):
import requests
from bs4 import BeautifulSoup

from .models import Job


def scrape_jobs():
    url = "https://www.timesjobs.com/candidate/job-search.html?searchType=personalizedSearch&from=submit&txtKeywords=Python+developer&txtLocation=India"
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
    soup = BeautifulSoup(response.text, "html.parser")
    container = soup.find("ul", class_="new-joblist")
    if not container:
        print("No job list found!")
        return []
    cards = container.find_all("li", class_="clearfix job-bx wht-shd-bx")
    print(f"Found {len(cards)} jobs")
    jobs = []
    for card in cards:
        # parse_job_card() extracts the fields from one listing card (see the sketch below)
        job_data = parse_job_card(card)
        # Save the job only if this link hasn't been stored yet
        if not Job.objects.filter(link=job_data["link"]).exists():
            Job.objects.create(
                title=job_data["title"],
                company=job_data["company"],
                location=job_data["location"],
                experience=job_data["experience"],
                salary=job_data["salary"],
                posted=job_data["posted"],
                description=job_data["description"],
                skills=", ".join(job_data["skills"]),  # convert list to string
                link=job_data["link"],
            )
        jobs.append(job_data)
    return jobs
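
The parse_job_card() helper isn’t shown above because its selectors depend on TimesJobs’ current markup, which changes over time. Treat this as an illustrative sketch (the class names are assumptions) and adjust it against the live HTML:

def parse_job_card(card):
    """Pull the fields out of one job card. Selectors here are illustrative."""
    def text_or_blank(tag):
        return tag.get_text(strip=True) if tag else ""

    header = card.find("h2")
    link_tag = header.find("a") if header else None
    skills_tag = card.find("span", class_="srp-skills")
    return {
        "title": text_or_blank(header),
        "company": text_or_blank(card.find("h3", class_="joblist-comp-name")),
        "location": text_or_blank(card.find("ul", class_="top-jd-dtl")),
        "experience": "",
        "salary": "",
        "posted": text_or_blank(card.find("span", class_="sim-posted")),
        "description": text_or_blank(card.find("ul", class_="list-job-dtl")),
        "skills": [s.strip() for s in text_or_blank(skills_tag).split(",") if s.strip()],
        "link": link_tag["href"] if link_tag and link_tag.has_attr("href") else "",
    }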

4. Save Data to Google Sheets

Log in with your Google account and open the Google Cloud console. Create a new project, enable the Google Sheets API and the Google Drive API, create a service account, and generate a JSON key file for it. Then share your Google Sheet with the service account’s email address as an editor.
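
The helpers below also expect two values in settings.py: the path to that JSON key file and the name of your sheet. Something like this (the file name and sheet name are placeholders; use your own):

# settings.py (assumes Django 3.1+, where BASE_DIR is a pathlib.Path)
GOOGLE_SHEET_CREDENTIALS = str(BASE_DIR / "credentials.json")  # service account key file
GOOGLE_SHEET_NAME = "Job Listings"  # the sheet you shared with the service account

With those in place, connect and append the scraped jobs: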

import gspread
from oauth2client.service_account import ServiceAccountCredentials
from django.conf import settings


def get_google_sheet():
    scope = ["https://spreadsheets.google.com/feeds",
             "https://www.googleapis.com/auth/drive"]
    creds = ServiceAccountCredentials.from_json_keyfile_name(
        settings.GOOGLE_SHEET_CREDENTIALS, scope
    )
    client = gspread.authorize(creds)
    sheet = client.open(settings.GOOGLE_SHEET_NAME).sheet1
    return sheet


def update_sheet(job_data):
    sheet = get_google_sheet()
    existing = sheet.get_all_values()
    # Links live in the fourth column; skip the header row when collecting them
    existing_links = {row[3] for row in existing[1:]} if len(existing) > 1 else set()
    # Add a header row if the sheet is empty
    if not existing:
        sheet.append_row(["Title", "Company", "Location", "Link"])
    for job in job_data:
        if job["link"] not in existing_links:  # avoid duplicates
            sheet.append_row([job["title"], job["company"], job["location"], job["link"]])

5. Automate It

You can run the scraper periodically using Django management commands or a cron job.
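
For example, here is a small management command that cron can call. This is a sketch: I’m assuming scrape_jobs() and update_sheet() both live in jobs/scraper.py, so adjust the import to wherever you put them, and remember the empty __init__.py files in the management/ and commands/ folders.

# jobs/management/commands/scrape_jobs.py
from django.core.management.base import BaseCommand

from jobs.scraper import scrape_jobs, update_sheet  # adjust to your module layout

class Command(BaseCommand):
    help = "Scrape TimesJobs listings and push new ones to Google Sheets"

    def handle(self, *args, **options):
        jobs = scrape_jobs()
        update_sheet(jobs)
        self.stdout.write(self.style.SUCCESS(f"Processed {len(jobs)} listings"))

A crontab entry such as 0 */3 * * * cd /path/to/jobscraper && python manage.py scrape_jobs (the path is a placeholder) then covers the “every few hours” part.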

See the full code

I’ve shared the full code for download here:

Final Thoughts

Building this scraper turned out to be one of those projects that felt much more complicated at the start than it actually was. The hardest part was simply taking the first step.

If you’re spending hours manually tracking job postings, I’d strongly recommend automating the process. Your future self will thank you—and you’ll have more energy to focus on what truly matters: writing strong applications and preparing for interviews.

Have you automated any part of your job search? I’d love to hear about your experiences in the comments below.
