Applicant and Job Matching Application

10 min read
Web Scraping, Data Science, Python, AWS, Automation

Role: Project Leader, responsible for web scraping development.
Team: Communicated with interns (responsible for Data Science and Data Analysis).
Duration: Total project duration of 3 months for proposal, research, and development.

Introduction

Motivation

The goal of this project was to make the staffing operations department more efficient and to speed up job matching.

Challenges in Recruiting

  1. Good positions are open, but there are no suitable candidates.
  2. Good candidates are available, but there are no suitable positions.

To solve these problems, recruiters have to scout candidates registered on external job sites or ask recruitment agencies to introduce candidates.

Proposed System

Recruiters need a large number of candidates and positions quickly. To achieve this, I proposed a system that would enable recruiters to gather candidates and positions from external websites efficiently.

Requirements Analysis and System Specifications

The Target User

Target User: Internal recruiters

Requirements

  • A list of 100 candidates for each job.
  • A list of 10 suitable jobs for each candidate.
  • The recruiter enters search criteria to narrow the results by skills, location, income, and job title.
  • The data is provided to the recruiter in Excel.

Research

Research Scraping

Web scraping may violate a site's terms and conditions if precautions are not taken.

  • Do not use the acquired data for anything other than data analysis.
  • Ensure that the site from which the data is obtained is not overloaded.
  • Find out whether the site prohibits scraping.
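
One part of the last precaution can be checked in code. The sketch below uses Python's standard `urllib.robotparser` to read a robots.txt file and ask whether a URL may be fetched. The robots.txt content, the `job-matcher-bot` user agent, and the example.com URLs are hypothetical placeholders, not values from the project. A robots.txt check does not replace reading the site's terms and conditions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real check would read the target site's own file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)
print(is_allowed(parser, "job-matcher-bot", "https://example.com/jobs/123"))
print(is_allowed(parser, "job-matcher-bot", "https://example.com/private/profile"))
print(parser.crawl_delay("job-matcher-bot"))  # delay in seconds requested by the site, if any
```

The crawl delay, when a site declares one, is a reasonable lower bound for the spacing between requests.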

Survey Process

  1. Determine where to scrape (based on the recruiters' recommendations)
    • Target sites: LinkedIn and Indeed
  2. Examine the corresponding HTML
    • Use the browser's developer tools to inspect the HTML structure of the website.
  3. Test the feasibility of web scraping
    • Validate the feasibility of web scraping in the local environment using sample criteria.
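
Steps 2 and 3 can be tried offline before touching a live site. The sketch below parses a saved HTML snippet and pulls out the text of elements with a given class. The markup, class names (`job-title`, `job-location`), and values are invented for illustration and do not reflect any real site's structure. To stay dependency-free it uses the standard library's `html.parser` as a stand-in for Beautiful Soup, which the project itself used.

```python
from html.parser import HTMLParser

# Hypothetical markup; real class names come from inspecting the page in developer tools.
SAMPLE_HTML = """
<ul>
  <li class="job-card"><h2 class="job-title">Data Analyst</h2><span class="job-location">Tokyo</span></li>
  <li class="job-card"><h2 class="job-title">Backend Engineer</h2><span class="job-location">Osaka</span></li>
</ul>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element that carries a given CSS class.

    Simplification: capture stops at the first closing tag, so the target
    elements must not contain nested tags.
    """

    def __init__(self, target_class: str):
        super().__init__()
        self.target_class = target_class
        self.results = []
        self._capturing = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes:
            self._capturing = True
            self._buffer = []

    def handle_data(self, data):
        if self._capturing:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._capturing:
            self.results.append("".join(self._buffer).strip())
            self._capturing = False


def extract_by_class(html: str, css_class: str) -> list:
    """Return the text of all elements in html that have css_class."""
    extractor = ClassTextExtractor(css_class)
    extractor.feed(html)
    return extractor.results


print(extract_by_class(SAMPLE_HTML, "job-title"))
print(extract_by_class(SAMPLE_HTML, "job-location"))
```

If the expected fields come back from a saved sample, the same selectors are worth trying against a small number of live pages.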

Design Approach

System Design

This program determines the matching rates between candidates and positions based on the attributes of candidates and positions on the target website.

Generate Position List for Candidates

  1. Recruiters enter the candidate's attributes into a Google Form.
  2. The system scrapes positions from the target websites and stores them in the database.
  3. The algorithm generates a list of the best-matching positions based on matching rate.
  4. The matching rate is determined from each position's attributes, such as job type, salary, location, and experience.
  5. The recruiter can pick the positions to which they would like to refer the candidate.

Generate Candidate List for Positions

  1. Recruiters enter the position's attributes into a Google Form.
  2. The system scrapes candidates from the target websites and stores them in the database.
  3. The algorithm generates a list of the best-matching candidates based on matching rate.
  4. The matching rate is determined from each candidate's attributes, such as job type, salary, location, and experience.
  5. The recruiter can pick the candidates to whom they would like to refer the position.
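
The post does not spell out how the matching rate was calculated, so the following is only a minimal sketch of one way to do it: a weighted checklist over the attributes named above. The weights, field names, and comparison rules (exact match for job type and location, thresholds for salary and experience) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    job_type: str
    location: str
    desired_salary: int
    years_experience: int


@dataclass
class Position:
    title: str
    job_type: str
    location: str
    salary: int
    required_experience: int


# Hypothetical weights in percentage points; they sum to 100.
WEIGHTS = {"job_type": 40, "location": 20, "salary": 20, "experience": 20}


def matching_rate(candidate: Candidate, position: Position) -> int:
    """Return a 0-100 score: the summed weights of the attributes that match."""
    checks = {
        "job_type": candidate.job_type.casefold() == position.job_type.casefold(),
        "location": candidate.location.casefold() == position.location.casefold(),
        "salary": position.salary >= candidate.desired_salary,
        "experience": candidate.years_experience >= position.required_experience,
    }
    return sum(WEIGHTS[name] for name, passed in checks.items() if passed)


def rank_positions(candidate: Candidate, positions: list, top_n: int = 10) -> list:
    """Return up to top_n (rate, position) pairs, highest matching rate first."""
    scored = [(matching_rate(candidate, p), p) for p in positions]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]
```

Ranking candidates for a position works the same way with the roles swapped, using a larger `top_n` such as 100 to match the requirement of 100 candidates per job.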

UI/UX

  • Produce the list of candidates and positions in Excel, which recruiters are already familiar with.
  • Reduce the amount of information recruiters have to enter.
  • Simplify input by entering data via Google Forms.
  • Order search results by matching rate so recruiters can contact highest-rated candidates.
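
The ordering and export steps can be sketched as below. The project delivered Excel files; this example swaps in CSV via the standard `csv` module so that it runs without extra packages (Excel opens CSV files directly). The column names and sample rows are hypothetical.

```python
import csv
import io

# Assumed column layout for the exported sheet.
FIELDNAMES = ["name", "matching_rate", "location"]


def write_ranked_csv(rows: list, out, top_n: int = 100) -> int:
    """Write the top_n rows, highest matching rate first, and return the row count.

    rows is a list of dicts; keys outside FIELDNAMES are ignored.
    """
    ranked = sorted(rows, key=lambda row: row["matching_rate"], reverse=True)[:top_n]
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(ranked)
    return len(ranked)


# Hypothetical sample data.
rows = [
    {"name": "Candidate A", "matching_rate": 60, "location": "Tokyo"},
    {"name": "Candidate B", "matching_rate": 100, "location": "Osaka"},
]
buffer = io.StringIO()
write_ranked_csv(rows, buffer)
print(buffer.getvalue())
```

Writing to a real file instead of `io.StringIO` only requires passing a file opened with `newline=""`, as the `csv` module documentation recommends.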

Implementation, Testing, and Maintenance

Tools & Technologies

  • Programming Languages: Python
  • Libraries: Beautiful Soup, Requests, pandas
  • Environment: AWS Lightsail
  • Repository: GitHub
  • IDE: Visual Studio Code

Test Plan and Test Activities

Scraping Test

Problem: Identified as a bot

  • The target website identified the program as a malicious bot and prevented scraping.

Improvement:

  • Lowered the web scraping rate to avoid overloading the destination site.
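
A minimal sketch of request spacing, assuming a fixed minimum delay plus random jitter between consecutive requests. The delay values and the `Throttle` class are illustrative and not the project's actual settings. The sleep and clock functions are injectable so the behavior can be checked without real waiting.

```python
import random
import time


class Throttle:
    """Enforce a minimum gap, plus random jitter, between consecutive calls to wait()."""

    def __init__(self, min_delay: float = 2.0, jitter: float = 1.0,
                 sleep=time.sleep, clock=time.monotonic):
        self.min_delay = min_delay
        self.jitter = jitter
        self._sleep = sleep
        self._clock = clock
        self._last = None  # time of the previous call; None before the first one

    def wait(self) -> float:
        """Sleep just long enough to respect the gap; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            target_gap = self.min_delay + random.uniform(0, self.jitter)
            remaining = target_gap - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept
```

In a scraping loop, calling `throttle.wait()` immediately before each HTTP request keeps the request rate below one per `min_delay` seconds. The first call returns immediately because there is no previous request to space against.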

UI/UX Adjustment

Problem: Non-intuitive UI

  • Recruiters reported that the UI was not intuitive.

Improvement:

  • Created a workflow document and user manual so recruiters could make full use of the tool (advice from the CEO).
  • Highlighted the required fields (advice from recruiters).

Result:

  • Usage increased by 60% after the manual and UI improvements.

Application Maintenance

Minor updates are required when the target website structure changes and causes a failure. The program is subjected to a weekly verification test.
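
One way such a verification test might look is sketched below: it checks whether the CSS classes the scraper depends on still appear in a fetched page and reports any that are missing. The class names and sample markup are hypothetical, and the project's real weekly check may have worked differently.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Gather every CSS class name that appears in a document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def missing_classes(html: str, required: list) -> list:
    """Return the required class names that are absent from html, sorted."""
    collector = ClassCollector()
    collector.feed(html)
    return sorted(set(required) - collector.classes)


# Hypothetical page where the site has dropped the "job-location" class.
page = '<div class="job-card"><h2 class="job-title">Data Analyst</h2></div>'
print(missing_classes(page, ["job-card", "job-title", "job-location"]))
```

An empty result means the structure the scraper relies on is still in place. Any names returned point to the selectors that need updating.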

*I only post content that respects the Terms of Business.

Key Learnings

  1. Ethical Web Scraping: Respecting terms of service and implementing rate limiting is crucial for sustainable scraping operations.

  2. User Experience Matters: Even powerful backend systems need intuitive interfaces. The 60% usage increase after UI improvements demonstrates the importance of UX.

  3. Documentation is Critical: Creating workflow documents and user manuals significantly improved adoption and usage rates.

  4. Iterative Improvement: Continuous testing and feedback from users led to valuable improvements in both functionality and usability.

  5. Maintenance Requirements: Web scraping applications require ongoing maintenance due to frequent changes in target website structures.

Challenges and Solutions

Challenge: Bot Detection

Solution: Implemented rate limiting and request spacing to mimic human behavior patterns.

Challenge: Website Structure Changes

Solution: Established weekly verification tests and modular scraping code for easier updates.

Challenge: User Adoption

Solution: Created comprehensive documentation and simplified the user interface based on recruiter feedback.

Future Enhancements

  • Machine learning-based matching algorithm improvements
  • Real-time scraping capabilities
  • Integration with more job sites
  • Automated candidate outreach features
  • Advanced filtering and search capabilities