dsresumatch.evaluate_keywords

Functions

load_baseline_keywords()

Load baseline keywords from the JSON file.

evaluate_keywords(cleaned_text[, keywords, ...])

Evaluate the quality of a resume by comparing its content against a set of predefined or user-supplied keywords.

Module Contents

dsresumatch.evaluate_keywords.load_baseline_keywords()

Load baseline keywords from the JSON file.

This function reads a JSON file containing baseline keywords organized by category. It flattens the categories into a single list of keywords and converts each keyword to lowercase for uniformity. The resulting list can be used to evaluate resumes against a standard set of keywords relevant to data science.

Returns:

A list of baseline keywords in lowercase, extracted from the JSON file.

Return type:

list of str

Raises:
  • FileNotFoundError – If the JSON file containing the baseline keywords cannot be found.

  • json.JSONDecodeError – If the JSON file is not properly formatted.
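The flattening step described above can be illustrated with a minimal sketch. It assumes the JSON maps each category name to a list of keywords; the actual file layout is not shown in this reference, and `SAMPLE_JSON` and `flatten_keywords` are hypothetical names used only for illustration.

```python
import json

# Hypothetical JSON layout (an assumption): category name -> list of keywords.
SAMPLE_JSON = """
{
  "programming": ["Python", "SQL"],
  "soft_skills": ["Teamwork", "Communication"]
}
"""


def flatten_keywords(categories):
    """Flatten {category: [keywords]} into a single lowercase list."""
    return [kw.lower() for kws in categories.values() for kw in kws]


# json.loads raises json.JSONDecodeError if the text is not valid JSON.
baseline = flatten_keywords(json.loads(SAMPLE_JSON))
print(baseline)  # ['python', 'sql', 'teamwork', 'communication']
```

Reading from an in-memory string keeps the sketch self-contained; the real function reads from a file and can therefore also raise FileNotFoundError.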

dsresumatch.evaluate_keywords.evaluate_keywords(cleaned_text, keywords=None, use_only_supplied_keywords=False)

Evaluate the quality of a resume by comparing its content against a set of predefined or user-supplied keywords.

This function assesses whether the resume contains relevant keywords that match the criteria for a “good data science resume.” Users can provide their own keywords or combine them with a default set of predefined keywords.

Parameters:
  • cleaned_text (str) – The cleaned text content of the resume.

  • keywords (list of str, optional) – A list of keywords to compare against the resume content. If not provided, only the baseline keywords are used. If use_only_supplied_keywords is True and no keywords are supplied, there is nothing to compare against and the function returns an empty list.

  • use_only_supplied_keywords (bool, optional) – A flag to determine whether to use only the supplied keywords or to combine them with a default set of predefined keywords. Defaults to False.

Returns:

A list of keywords (from either the baseline or provided keywords) that do not appear in the cleaned_text.

Return type:

list of str

Examples

>>> evaluate_keywords("software development project management agile methodologies", ["software", "agile", "teamwork"], use_only_supplied_keywords=True)
['teamwork']
>>> evaluate_keywords("data analysis machine learning statistical modeling", use_only_supplied_keywords=False)
['teamwork', 'communication']
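
The keyword-selection rules documented for the parameters can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `missing_keywords` is a hypothetical stand-in, the baseline is passed in explicitly instead of being loaded from the JSON file, and matching is done by case-insensitive substring search, which may differ from the real matching strategy.

```python
def missing_keywords(cleaned_text, keywords=None, baseline=(),
                     use_only_supplied_keywords=False):
    """Return the candidate keywords that do not occur in cleaned_text.

    Hypothetical sketch: `baseline` replaces the list that the real
    function would obtain from load_baseline_keywords().
    """
    supplied = [kw.lower() for kw in (keywords or [])]
    if use_only_supplied_keywords:
        candidates = supplied
    else:
        candidates = list(baseline) + supplied
    # Drop duplicates while preserving first-seen order.
    candidates = list(dict.fromkeys(candidates))
    text = cleaned_text.lower()
    # Assumption: simple substring matching.
    return [kw for kw in candidates if kw not in text]


resume = "software development project management agile methodologies"

# Only the supplied keywords are checked.
print(missing_keywords(resume, ["software", "agile", "teamwork"],
                       use_only_supplied_keywords=True))  # ['teamwork']

# No supplied keywords and the baseline excluded: nothing to check.
print(missing_keywords(resume, use_only_supplied_keywords=True))  # []

# Baseline combined with supplied keywords (the default behaviour).
print(missing_keywords(resume, ["agile"], baseline=["python", "sql"]))  # ['python', 'sql']
```

Substring matching is the simplest choice, but it also matches inside longer words (for example, "sql" inside "nosql"); a token-based comparison would avoid that at the cost of handling multi-word keywords separately.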